

\section{\label{sec:ProgramDesign}Program Design}

\subsection{\label{sec:architecture} OOPSE Architecture}

The core of OOPSE is divided into two main object libraries:
\texttt{libBASS} and \texttt{libmdtools}. \texttt{libBASS} is the
library developed around the parsing engine, and \texttt{libmdtools}
is the software library developed around the simulation engine.


\subsection{\label{sec:programLang} Programming Languages }

\subsection{\label{sec:parallelization} Parallelization of OOPSE}

Although processor power is doubling roughly every 18 months according
to the famous Moore's Law\cite{moore}, it is still unreasonable to
simulate systems of more than 1000 atoms on a single processor. To
facilitate the study of larger systems, or of smaller systems on long
time scales, in a reasonable period of time, parallel methods were
developed that allow multiple CPUs to share the simulation
workload. Three general categories of parallel decomposition methods
have been developed: atomic, spatial, and force decomposition.

The algorithmically simplest of the three methods is atomic
decomposition, in which the $N$ particles in a simulation are split
among $P$ processors for the duration of the simulation. Computational
cost scales as an optimal $O(N/P)$ for atomic decomposition.
Unfortunately, all processors must communicate positions and forces
with all other processors, so communication scales as an unfavorable
$O(N)$, independent of the number of processors. This communication
bottleneck led to the development of spatial and force decomposition
methods, in which communication among processors scales much more
favorably. Spatial or domain decomposition divides the physical
spatial domain into 3D boxes, and each processor is responsible for
calculating the forces on and positions of the particles located in
its box. Particles are reassigned to different processors as they move
through simulation space. To calculate forces on a given particle, a
processor needs only the positions of particles within some cutoff
radius that are located on nearby processors, rather than the
positions of particles on all processors. Both communication between
processors and computation scale as $O(N/P)$ in the spatial method.
However, spatial decomposition adds algorithmic complexity to the
simulation code and is not very efficient for small $N$, since the
overall communication scales as $(N/P)^{2/3}$ in three dimensions,
reflecting the surface-to-volume ratio of each box.
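
As a rough illustration of these scalings, consider the hypothetical
values $N = 10^{4}$ and $P = 16$, which are chosen only for
convenience and are not taken from any benchmark. All prefactors are
ignored, which is a strong assumption, so only the trends are
meaningful. For atomic decomposition, the ratio of communication to
computation is
\[
  \frac{N}{N/P} = P = 16,
\]
which grows linearly with the number of processors. For spatial
decomposition, the boundary term relative to the per-processor
workload is
\[
  \frac{(N/P)^{2/3}}{N/P} = (N/P)^{-1/3} = 625^{-1/3} \approx 0.12,
\]
which grows as $N/P$ shrinks. This is consistent with the poor
efficiency of the spatial method when only a few particles are
assigned to each processor.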

Force decomposition assigns particles to processors based on a block
decomposition of the force matrix. Processors are arranged in an
optimally square grid, forming row and column processor groups. Forces
on the particles in a given row are calculated from the particles
located in that processor's column assignment. Force decomposition is
less complex to implement than the spatial method, but it still scales
computationally as $O(N/P)$ and scales as $O(N/\sqrt{P})$ in
communication cost. Plimpton also found that force decomposition
scales more favorably than spatial decomposition for up to 10,000
atoms and competes favorably with spatial methods for up to 100,000
atoms.
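
To put the communication estimate for force decomposition in
perspective, consider a rough comparison that ignores all prefactors
(a strong assumption) and uses the hypothetical values $N = 10^{4}$
and $P = 16$, chosen only for illustration. The ratio of communication
to computation is
\[
  \frac{N/\sqrt{P}}{N/P} = \sqrt{P} = 4,
\]
and the communication cost itself is $N/\sqrt{P} = 2500$, compared
with the $O(N)$ cost of $10^{4}$ for atomic decomposition. Under these
assumptions, the relative communication overhead of force
decomposition grows only as $\sqrt{P}$ as processors are added.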

\subsection{\label{sec:memory}Memory Allocation in Analysis}

\subsection{\label{sec:documentation}Documentation}

\subsection{\label{openSource}Open Source and Distribution License}

% Revision: 902
% Committed: Tue Jan 6 21:19:39 2004 UTC by mmeineke
% Log message: started adding some changes from before break