\section{\label{sec:ProgramDesign}Program Design} |
\subsection{\label{sec:architecture} OOPSE Architecture} |
The core of OOPSE is divided into two main object libraries:
\texttt{libBASS} and \texttt{libmdtools}. \texttt{libBASS} is the
library developed around the parsing engine, and \texttt{libmdtools}
is the software library developed around the simulation engine.
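
To make this division of labor concrete, the sketch below shows how a
hypothetical driver program might tie the two libraries together. The
names \texttt{ParsedSystem}, \texttt{parseMetaFile}, and
\texttt{Simulation} are illustrative placeholders rather than the
actual OOPSE interfaces; the point is only that the parsing layer
(the role of \texttt{libBASS}) produces a system description that the
simulation layer (the role of \texttt{libmdtools}) then integrates.

\begin{verbatim}
// Illustrative sketch only: the names below are invented
// placeholders, not the real libBASS / libmdtools interfaces.
#include <cstdio>
#include <string>
#include <vector>

struct Atom { double pos[3], vel[3], mass; };

// Stand-in for the parsing layer: read a meta-data script and
// build an in-memory description of the system.
struct ParsedSystem {
  std::vector<Atom> atoms;
  double dt    = 1.0;    // integration time step
  int    steps = 1000;   // number of steps requested
};

ParsedSystem parseMetaFile(const std::string& fileName) {
  std::printf("parsing %s\n", fileName.c_str());
  return ParsedSystem{};   // a real parser would fill this in
}

// Stand-in for the simulation layer: advance the parsed system.
class Simulation {
public:
  explicit Simulation(const ParsedSystem& sys) : sys_(sys) {}
  void run() {
    for (int i = 0; i < sys_.steps; ++i) {
      // integrate one step of size sys_.dt
    }
  }
private:
  ParsedSystem sys_;
};

int main(int argc, char* argv[]) {
  ParsedSystem sys = parseMetaFile(argc > 1 ? argv[1] : "sample.bass");
  Simulation md(sys);
  md.run();
  return 0;
}
\end{verbatim}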
\subsection{\label{sec:programLang} Programming Languages } |
\subsection{\label{sec:parallelization} Parallelization of OOPSE} |
Although processor power is doubling roughly every 18 months
according to Moore's Law\cite{moore}, it is still unreasonable to
simulate systems of more than a thousand atoms on a single
processor. To facilitate the study of larger systems, or of smaller
systems on long time scales, in a reasonable amount of time, parallel
methods were developed that allow multiple CPUs to share the
simulation workload. Three general categories of parallel
decomposition method have been developed: atomic, spatial, and force
decomposition.

Algorithmically, the simplest of the three methods is atomic
decomposition, in which the $N$ particles in a simulation are divided
evenly among the $P$ processors for the duration of the
simulation. The computational cost scales as an optimal $O(N/P)$ for
atomic decomposition. Unfortunately, every processor must communicate
positions and forces with every other processor, so communication
scales as an unfavorable $O(N)$, independent of the number of
processors. This communication bottleneck led to the development of
spatial and force decomposition methods, in which communication among
processors scales much more favorably.
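
As a minimal sketch of this communication pattern (assuming MPI as
the message-passing layer, which is a common but not the only
choice), the fragment below divides the particles evenly among the
processors, lets each processor compute forces only for its own
block, and then gathers every position onto every processor before
the next step. The all-to-all exchange of positions is the $O(N)$
bottleneck discussed above.

\begin{verbatim}
// Atomic (replicated-data) decomposition with MPI; sketch only.
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  int rank, nprocs;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

  const int N = 12000;                  // total number of particles
  // Split N particles as evenly as possible among the processors.
  std::vector<int> counts(nprocs), displs(nprocs);
  for (int p = 0, offset = 0; p < nprocs; ++p) {
    counts[p] = 3 * (N / nprocs + (p < N % nprocs ? 1 : 0));
    displs[p] = offset;
    offset   += counts[p];
  }
  const int nLocal = counts[rank] / 3;  // particles owned here

  std::vector<double> allPos(3 * N, 0.0);      // all positions
  std::vector<double> myPos(3 * nLocal, 0.0);  // owned positions
  std::vector<double> myForce(3 * nLocal, 0.0);

  for (int step = 0; step < 100; ++step) {
    // O(N) communication: every rank receives every position.
    MPI_Allgatherv(myPos.data(), 3 * nLocal, MPI_DOUBLE,
                   allPos.data(), counts.data(), displs.data(),
                   MPI_DOUBLE, MPI_COMM_WORLD);

    // O(N/P) computation: forces on the local block from allPos.
    for (int i = 0; i < nLocal; ++i) {
      // ... accumulate pair forces into myForce ...
    }
    // Exploiting Newton's third law would additionally require
    // communicating partial forces between processors.
    // ... integrate the local block and update myPos ...
  }

  MPI_Finalize();
  return 0;
}
\end{verbatim}

Every processor receives all $3N$ coordinates at every step, which is
why the communication cost does not improve as processors are added.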

Spatial (or domain) decomposition divides the physical simulation
domain into three-dimensional boxes, and each processor is
responsible for calculating the forces and positions of the particles
located in its box. Particles are reassigned to different processors
as they move through the simulation space. To calculate the forces on
a given particle, a processor needs the positions of particles within
some cutoff radius on nearby processors, rather than the positions of
particles on all processors. Both communication between processors
and computation scale as $O(N/P)$ in the spatial method. However,
spatial decomposition adds algorithmic complexity to the simulation
code and is not very efficient for small $N$, since the overall
communication scales as the surface-to-volume ratio $(N/P)^{2/3}$ in
three dimensions.
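
The bookkeeping this entails is sketched below; the box dimensions
and processor grid are arbitrary illustrative values, and a
production domain decomposition would also handle periodic
boundaries, ghost-particle exchange within the cutoff radius, and
particle migration between processors.

\begin{verbatim}
// Spatial (domain) decomposition bookkeeping; sketch only.
#include <cmath>
#include <cstdio>

// A px * py * pz grid of processors tiles the simulation box.
const int    px = 2, py = 2, pz = 2;
const double Lx = 40.0, Ly = 40.0, Lz = 40.0;  // box edge lengths

// Map a particle position to the rank of the owning processor.
int owningRank(double x, double y, double z) {
  int ix = static_cast<int>(std::floor(x / Lx * px));
  int iy = static_cast<int>(std::floor(y / Ly * py));
  int iz = static_cast<int>(std::floor(z / Lz * pz));
  if (ix == px) ix = px - 1;   // clamp particles exactly on the
  if (iy == py) iy = py - 1;   // upper boundary of the box
  if (iz == pz) iz = pz - 1;
  return (ix * py + iy) * pz + iz;
}

int main() {
  // A particle at (5, 25, 33) maps to grid cell (0,1,1), rank 3.
  std::printf("owner = %d\n", owningRank(5.0, 25.0, 33.0));
  return 0;
}
\end{verbatim}

Particles that cross a box face must be handed off to the new owning
processor, and positions within the cutoff radius of a face must be
exchanged with neighboring processors each step; this surface term is
the origin of the $(N/P)^{2/3}$ scaling quoted above.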

Force decomposition assigns particles to processors based on a block
decomposition of the force matrix. The processors are arranged in an
(optimally square) grid, forming row and column processor groups.
Forces on the particles in a given processor's row assignment are
calculated using the particles in that processor's column assignment.
Force decomposition is less complex to implement than the spatial
method, still scales computationally as $O(N/P)$, and scales as
$O(N/\sqrt{P})$ in communication cost. Plimpton found that force
decomposition scales more favorably than spatial decomposition for up
to 10,000 atoms, and that it competes favorably with spatial methods
for up to 100,000 atoms.
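
As a rough, round-number illustration of these scalings (not measured
data): for $N = 100{,}000$ atoms on $P = 64$ processors, the
per-processor communication volumes suggested above are of order
$N = 10^{5}$ positions per step for atomic decomposition,
$N/\sqrt{P} \approx 1.2 \times 10^{4}$ for force decomposition, and
$N/P \approx 1.6 \times 10^{3}$ for spatial decomposition; constant
prefactors, message latencies, and the added algorithmic complexity
of the spatial method determine where the actual crossover between
the methods occurs.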
\subsection{\label{sec:memory}Memory Allocation in Analysis} |
\subsection{\label{sec:documentation}Documentation} |
\subsection{\label{openSource}Open Source and Distribution License} |