
\section{\label{sec:ProgramDesign}Program Design}

\subsection{\label{sec:architecture} OOPSE Architecture}


\subsection{\label{sec:programLang} Programming Languages }

\subsection{\label{sec:parallelization} Parallelization of OOPSE}

Although processor power doubles roughly every 18 months according to
Moore's Law\cite{moore}, it is still unreasonable to simulate systems
of more than 1000 atoms on a single processor. To make the study of
larger systems, or of smaller systems on long time scales, feasible in
a reasonable period of time, parallel methods were developed that
allow multiple CPUs to share the simulation workload. Three general
categories of parallel decomposition methods have been developed:
atomic, spatial, and force decomposition.

The algorithmically simplest of the three methods is atomic
decomposition, in which the $N$ particles in a simulation are split
among $P$ processors for the duration of the simulation. Computational
cost scales as an optimal $O(N/P)$ for atomic decomposition.
Unfortunately, all processors must communicate positions and forces
with all other processors, so communication scales as an unfavorable
$O(N)$, independent of the number of processors. This communication
bottleneck led to the development of spatial and force decomposition
methods, in which communication among processors scales much more
favorably. Spatial or domain decomposition divides the physical
spatial domain into 3D boxes, with each processor responsible for
calculating the forces and positions of the particles located in its
box. Particles are reassigned to different processors as they move
through simulation space. To calculate forces on a given particle, a
processor must know the positions of particles within some cutoff
radius located on nearby processors, rather than the positions of
particles on all processors. Both communication between processors and
computation scale as $O(N/P)$ in the spatial method. However, spatial
decomposition adds algorithmic complexity to the simulation code and
is not very efficient for small $N$: the communicated data scale with
the surface of each box, as $(N/P)^{2/3}$ in three dimensions, so the
surface-to-volume ratio is unfavorable when each box holds few
particles.

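As a rough illustration of the $(N/P)^{2/3}$ surface scaling, consider
the following sketch. It rests on simplifying assumptions that are not
part of the discussion above: a uniform number density $\rho$, cubic
boxes of edge length $\ell$, and a cutoff radius $r_c$, with the region
a processor must import approximated as a cubic halo of thickness
$r_c$. These three symbols are introduced here only for this estimate.
Each box then holds $N/P = \rho \ell^3$ particles, and the number of
particles it must import from neighboring processors is approximately
\[
N_{\mathrm{import}} \approx \rho \left[ (\ell + 2 r_c)^3 - \ell^3 \right]
  = \rho \left( 6 \ell^2 r_c + 12 \ell r_c^2 + 8 r_c^3 \right).
\]
For $\ell \gg r_c$ the first term dominates, giving
$N_{\mathrm{import}} \approx 6 \rho^{1/3} r_c (N/P)^{2/3}$. The ratio
of imported to owned particles, $N_{\mathrm{import}} / (N/P) \approx 6
r_c / \ell$, grows as the boxes shrink, which is one way to see why the
spatial method loses efficiency when $N/P$ is small.
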
Force decomposition assigns particles to processors based on a block
decomposition of the force matrix. Processors are arranged in an
optimally square grid, forming row and column processor groups. Each
processor computes the forces exerted on the particles in its row
group by the particles in its column group. Force decomposition is
less complex to implement than the spatial method but still scales
computationally as $O(N/P)$, and it scales as $O(N/\sqrt{P})$ in
communication cost. Plimpton found that force decomposition scales
more favorably than spatial decomposition up to 10,000 atoms and
competes favorably with spatial methods for up to 100,000 atoms.

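A back-of-the-envelope comparison may make the communication scalings
more concrete. The values $N = 10{,}000$ and $P = 64$ are chosen here
purely for illustration, and only the leading-order scaling is kept
(prefactors and message latency are ignored). With the processors
arranged in a $\sqrt{P} \times \sqrt{P} = 8 \times 8$ grid, each
processor owns an $(N/\sqrt{P}) \times (N/\sqrt{P})$ block of the
$N \times N$ force matrix and needs the positions of only the
particles in its row and column groups:
\[
\mbox{atomic: } O(N) \sim 10{,}000
\qquad
\mbox{force: } O(N/\sqrt{P}) \sim 10{,}000 / 8 = 1{,}250 .
\]
Under these assumptions, force decomposition exchanges roughly a
factor of $\sqrt{P} = 8$ fewer positions per processor than atomic
decomposition, and this advantage grows as $\sqrt{P}$ when more
processors are added.
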
\subsection{\label{sec:memory}Memory Allocation in Analysis}

\subsection{\label{sec:documentation}Documentation}

\subsection{\label{openSource}Open Source and Distribution License}
