\section{\label{sec:ProgramDesign}Program Design}

\subsection{\label{sec:architecture}OOPSE Architecture}

The core of OOPSE is divided into two main object libraries:
\texttt{libBASS} and \texttt{libmdtools}. \texttt{libBASS} is the
library developed around the parsing engine, and \texttt{libmdtools}
is the software library developed around the simulation engine.

\subsection{\label{sec:programLang}Programming Languages}

\subsection{\label{sec:parallelization}Parallelization of OOPSE}

Although processor power is doubling roughly every 18 months according
to the famous Moore's Law\cite{moore}, it is still unreasonable to
simulate systems of more than 1000 atoms on a single processor. To
facilitate the study of larger systems, or of smaller systems on long
time scales, in a reasonable period of time, parallel methods were
developed that allow multiple CPUs to share the simulation
workload. Three general categories of parallel decomposition methods
have been developed: atomic, spatial, and force decomposition.

Algorithmically, the simplest of the three methods is atomic
decomposition, in which the $N$ particles in a simulation are split
among $P$ processors for the duration of the simulation. Computational
cost scales as an optimal $O(N/P)$ for atomic decomposition.
Unfortunately, all processors must communicate positions and forces
with all other processors, so communication scales as an unfavorable
$O(N)$, independent of the number of processors. This communication
bottleneck led to the development of spatial and force decomposition
methods, in which communication among processors scales much more
favorably. Spatial, or domain, decomposition divides the physical
spatial domain into 3D boxes, and each processor is responsible for
calculating the forces and positions of the particles located in its
box. Particles are reassigned to different processors as they move
through simulation space. To calculate the forces on a given particle,
a processor must know only the positions of particles within some
cutoff radius located on nearby processors, rather than the positions
of particles on all processors. Both communication between processors
and computation scale as $O(N/P)$ in the spatial method. However,
spatial decomposition adds algorithmic complexity to the simulation
code and is not very efficient for small $N$, since the overall
communication scales as the surface-to-volume ratio $(N/P)^{2/3}$ in
three dimensions.
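
The $(N/P)^{2/3}$ scaling can be motivated by a rough geometric
estimate. As a sketch only, assume a uniform number density $\rho$, a
cubic box of side $\ell$ on each processor, and a cutoff radius
$r_c \ll \ell$; these symbols are introduced here for illustration
and are not used elsewhere in this section. Each box then holds
$N/P = \rho \ell^3$ particles, while only the particles lying within
$r_c$ of one of its six faces must be exchanged with neighboring
processors:
\[
  n_{\mathrm{comm}} \approx 6 \ell^2 r_c \rho
                    = 6 r_c \rho^{1/3} \left( \frac{N}{P} \right)^{2/3} .
\]
This estimate neglects edge and corner regions. The communicated
fraction, $n_{\mathrm{comm}} / (N/P) \approx 6 r_c / \ell$, grows as
the boxes shrink, which is consistent with the loss of efficiency when
each processor holds few particles.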

Force decomposition assigns particles to processors based on a block
decomposition of the force matrix. Processors are split into an
optimally square grid, forming row and column processor groups. Forces
on the particles in a given row are calculated from the particles
located in that processor's column assignment. Force decomposition is
less complex to implement than the spatial method but still scales
computationally as $O(N/P)$, and it scales as $O(N/\sqrt{P})$ in
communication cost. Plimpton also found that force decomposition
scales more favorably than spatial decomposition up to 10,000 atoms
and favorably competes with spatial methods for up to 100,000 atoms.
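
As an illustrative example with arbitrarily chosen values (not taken
from any benchmark), consider $N = 10{,}000$ particles on $P = 16$
processors arranged as a $4 \times 4$ grid. Ignoring constant
prefactors and message latency, the number of particle positions each
processor must receive per step is roughly
\[
  N = 10{,}000 \quad \mbox{(atomic)}, \qquad
  \frac{N}{\sqrt{P}} = \frac{10{,}000}{4} = 2{,}500 \quad \mbox{(force)},
\]
while the computational work per processor is proportional to
$N/P = 625$ in both cases. Under these assumptions, force
decomposition reduces the communication volume by a factor of
$\sqrt{P} = 4$ without changing the computational scaling.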

\subsection{\label{sec:memory}Memory Allocation in Analysis}

\subsection{\label{sec:documentation}Documentation}

\subsection{\label{openSource}Open Source and Distribution License}