1 |
|
\documentclass[11pt]{article} |
2 |
|
\usepackage{amsmath} |
3 |
< |
\usepackage{amssymb} |
3 |
> |
\usepackage{amssymb} |
4 |
|
\usepackage{endfloat} |
5 |
|
\usepackage{listings} |
6 |
|
\usepackage{berkeley} |
22 |
|
xleftmargin=0.5in, xrightmargin=0.5in,captionpos=b, % |
23 |
|
abovecaptionskip=0.5cm, belowcaptionskip=0.5cm} |
24 |
|
\renewcommand{\lstlistingname}{Scheme} |
25 |
< |
\title{{\sc oopse}: An Open Source Object-Oriented Parallel Simulation |
25 |
> |
\title{{\sc oopse}: An Object-Oriented Parallel Simulation |
26 |
|
Engine for Molecular Dynamics} |
27 |
|
|
28 |
|
\author{Matthew A. Meineke, Charles F. Vardeman II, Teng Lin,\\ |
43 |
|
simulations are carried out using the force-based decomposition |
44 |
|
method. Simulations are specified using a very simple C-based |
45 |
|
meta-data language. A number of advanced integrators are included, |
46 |
< |
and the base integrator for orientational dynamics provides |
47 |
< |
substantial improvements over older quaternion-base schemes. All |
48 |
< |
source code is available under a very permissive (BSD-style) Open |
49 |
< |
Source license. |
46 |
> |
and the basic integrator for orientational dynamics provides |
47 |
> |
substantial improvements over older quaternion-based schemes. |
48 |
|
\end{abstract} |
49 |
|
|
50 |
|
\section{\label{sec:intro}Introduction} |
51 |
|
|
52 |
|
There are a number of excellent molecular dynamics packages available |
53 |
|
to the chemical physics |
54 |
< |
community.\cite{Brooks83,MacKerell98,pearlman:1995,Gromacs,Gromacs3,DL_POLY,Tinker,Paradyn} |
54 |
> |
community.\cite{Brooks83,MacKerell98,pearlman:1995,Gromacs,Gromacs3,DL_POLY,Tinker,Paradyn,namd,macromodel} |
55 |
|
All of these packages are stable, polished programs which solve many |
56 |
|
problems of interest. Most are now capable of performing molecular |
57 |
|
dynamics simulations on parallel computers. Some have source code |
62 |
|
programs referenced can handle transition metal force fields like the |
63 |
|
Embedded Atom Method ({\sc eam}). The direction our research program |
64 |
|
has taken us now involves the use of atoms with orientational degrees |
65 |
< |
of freedom and transition metals. Since these simulation methods may |
66 |
< |
be of some use to other researchers, we have decided to release our |
67 |
< |
program to the scientific community with a permissive open source |
68 |
< |
license. |
65 |
> |
of freedom as well as transition metals. Since these simulation |
66 |
> |
methods may be of some use to other researchers, we have decided to |
67 |
> |
release our program (and all related source code) to the scientific |
68 |
> |
community. |
69 |
|
|
70 |
|
This paper communicates the algorithmic details of our program, which |
71 |
< |
we have been calling the Open source Object-oriented Parallel |
72 |
< |
Simulation Engine (i.e. {\sc oopse}). We have structured this paper |
73 |
< |
to first discuss the underlying concepts in this simulation package |
71 |
> |
we have been calling the Object-Oriented Parallel Simulation Engine |
72 |
> |
(i.e. {\sc oopse}). We have structured this paper to first discuss |
73 |
> |
the underlying concepts in this simulation package |
74 |
|
(Sec. \ref{oopseSec:IOfiles}). The empirical energy functions |
75 |
|
implemented are discussed in Sec.~\ref{oopseSec:empiricalEnergy}. |
76 |
|
Sec.~\ref{oopseSec:mechanics} describes the various Molecular Dynamics |
281 |
|
$\psi$.\cite{Goldstein01} In order to avoid numerical instabilities |
282 |
|
inherent in using the Euler angles, the four parameter ``quaternion'' |
283 |
|
scheme is often used. The elements of $\mathsf{A}$ can be expressed as |
284 |
< |
arithmetic operations involving the four quaternions ($q_0, q_1, q_2,$ |
285 |
< |
and $q_3$).\cite{allen87:csl} Use of quaternions also leads to |
284 |
> |
arithmetic operations involving the four quaternions ($q_w, q_x, q_y,$ |
285 |
> |
and $q_z$).\cite{allen87:csl} Use of quaternions also leads to |
286 |
|
performance enhancements, particularly for very small |
287 |
|
systems.\cite{Evans77} |
288 |
|
|
389 |
|
It is important to note the fundamental units in all files which are |
390 |
|
read and written by {\sc oopse}. Energies are in $\mbox{kcal |
391 |
|
mol}^{-1}$, distances are in $\mbox{\AA}$, times are in $\mbox{fs}$, |
392 |
< |
translational velocities are in $\mbox{\AA fs}^{-1}$, and masses are |
392 |
> |
translational velocities are in $\mbox{\AA~fs}^{-1}$, and masses are |
393 |
|
in $\mbox{amu}$. Orientational degrees of freedom are described using |
394 |
|
quaternions (unitless, but $q_w^2 + q_x^2 + q_y^2 + q_z^2 = 1$), |
395 |
|
body-fixed angular momenta ($\mbox{amu \AA}^{2} \mbox{radians |
409 |
|
|
410 |
|
{\bf keyword} & {\bf units} & {\bf use} & {\bf remarks} \\ \hline |
411 |
|
|
412 |
< |
{\tt forceField} & string & Sets the force field. & Possible force fields are "DUFF", "LJ", and "EAM". \\ |
412 |
> |
{\tt forceField} & string & Sets the force field. & Possible force fields are DUFF, LJ, and EAM. \\ |
413 |
|
{\tt nComponents} & integer & Sets the number of components. & Needs to appear before the first {\tt Component} block. \\ |
414 |
|
{\tt initialConfig} & string & Sets the file containing the initial configuration. & Can point to any file containing the configuration in the correct order. \\ |
415 |
|
{\tt minimizer}& string & Chooses a minimizer & Possible minimizers |
416 |
< |
are "SD" and "CG". Either {\tt ensemble} or {\tt minimizer} must be specified. \\ |
416 |
> |
are SD and CG. Either {\tt ensemble} or {\tt minimizer} must be specified. \\ |
417 |
|
{\tt ensemble} & string & Sets the ensemble. & Possible ensembles are |
418 |
< |
"NVE", "NVT", "NPTi", "NPTf", and "NPTxyz". Either {\tt ensemble} |
418 |
> |
NVE, NVT, NPTi, NPTf, and NPTxyz. Either {\tt ensemble} |
419 |
|
or {\tt minimizer} must be specified. \\ |
420 |
|
{\tt dt} & fs & Sets the time step. & Selection of {\tt dt} should be |
421 |
< |
small enough to sample the fastest motion of the simulation. (required |
422 |
< |
for molecular dynamics simulations)\\ |
421 |
> |
small enough to sample the fastest motion of the simulation. ({\tt |
422 |
> |
dt} is required for molecular dynamics simulations)\\ |
423 |
|
{\tt runTime} & fs & Sets the time at which the simulation should |
424 |
|
end. & This is an absolute time, and will end the simulation when the |
425 |
< |
current time meets or exceeds the {\tt runTime}. (required for |
426 |
< |
molecular dynamics simulations)\\ |
425 |
> |
current time meets or exceeds the {\tt runTime}. ({\tt runTime} is |
426 |
> |
required for molecular dynamics simulations)\\ |
427 |
|
|
428 |
|
\end{tabularx} |
429 |
|
\end{center} |
448 |
|
the root name of the initial meta-data file but with an {\tt .eor} |
449 |
|
extension. \\ |
450 |
|
{\tt useInitialTime} & logical & Sets whether the initial time is taken from the {\tt .in} file. & Useful when recovering a simulation from a crashed processor. Default is false. \\ |
451 |
< |
{\tt sampleTime} & fs & Sets the frequency at which the {\tt .dump} file is written. & Default sets the frequency to the {\tt runTime}. \\ |
452 |
< |
{\tt statusTime} & fs & Sets the frequency at which the {\tt .stat} file is written. & Defaults set the frequency to the {\tt sampleTime}. \\ |
453 |
< |
{\tt cutoffRadius} & $\mbox{\AA}$ & Manually sets the cutoffRadius & Defaults to |
454 |
< |
$15\mbox{\AA}$ for systems containing charges or dipoles or to $2.5 |
455 |
< |
\sigma_{L}$, where $\sigma_{L}$ is the largest LJ $\sigma$ in the |
451 |
> |
{\tt sampleTime} & fs & Sets the frequency at which the {\tt .dump} |
452 |
> |
file is written. & The default is equal to the {\tt runTime}. \\ |
453 |
> |
{\tt statusTime} & fs & Sets the frequency at which the {\tt .stat} |
454 |
> |
file is written. & The default is equal to the {\tt sampleTime}. \\ |
455 |
> |
{\tt cutoffRadius} & $\mbox{\AA}$ & Manually sets the cutoff radius & |
456 |
> |
Defaults to $15\mbox{\AA}$ for systems containing charges or dipoles or to $2.5 |
457 |
> |
\sigma_{L}$, where $\sigma_{L}$ is the largest LJ $\sigma$ in the |
458 |
|
simulation. \\ |
459 |
|
{\tt switchingRadius} & $\mbox{\AA}$ & Manually sets the inner radius for the switching function. & Defaults to 95~\% of the {\tt cutoffRadius}. \\ |
460 |
< |
{\tt useReactionField} & logical & Turns the reaction field correction on/off. & Default is "false". \\ |
460 |
> |
{\tt useReactionField} & logical & Turns the reaction field correction on/off. & Default is false. \\ |
461 |
|
{\tt dielectric} & unitless & Sets the dielectric constant for reaction field. & If {\tt useReactionField} is true, then {\tt dielectric} must be set. \\ |
462 |
|
{\tt usePeriodicBoundaryConditions} & & & \\ |
463 |
< |
& logical & Turns periodic boundary conditions on/off. & Default is "true". \\ |
463 |
> |
& logical & Turns periodic boundary conditions on/off. & Default is true. \\ |
464 |
|
{\tt seed } & integer & Sets the seed value for the random number |
465 |
|
generator. & The seed needs to be at least 9 digits long. The default |
466 |
|
is to take the seed from the CPU clock. \\ |
467 |
|
{\tt forceFieldVariant} & string & Sets the name of the variant of the |
468 |
< |
force field. ({\sc eam} has three variants: {\tt u3}, {\tt u6}, and |
468 |
> |
force field. & {\sc eam} has three variants: {\tt u3}, {\tt u6}, and |
469 |
|
{\tt VC}. |
470 |
|
|
471 |
|
\end{tabularx} |
565 |
|
\item The ordering of the atoms for each molecule follows the order |
566 |
|
declared in the molecule's declaration within the model file. |
567 |
|
\item Only atoms which are not members of a {\tt rigidBody} are |
568 |
< |
included |
568 |
> |
included. |
569 |
|
\item Rigid Body coordinates for a molecule are listed immediately |
570 |
|
after the the other atoms in a molecule. Some molecules may be |
571 |
|
entirely rigid, in which case, only the rigid body coordinates are |
667 |
|
\begin{equation} |
668 |
|
V = V_{\mathrm{short-range}} + V_{\mathrm{long-range}}. |
669 |
|
\end{equation} |
670 |
< |
The short-ranged portion includes explicit bonds, bends and torsions, |
671 |
< |
which have been defined in the meta-data file for the molecules which |
672 |
< |
present in the simulation. The functional forms and parameters for |
673 |
< |
these interactions are defined by the force field which is chosen. |
670 |
> |
The short-ranged portion includes the explicit bonds, bends, and |
671 |
> |
torsions which have been defined in the meta-data file for the |
672 |
> |
molecules which are present in the simulation. The functional forms and |
673 |
> |
parameters for these interactions are defined by the force field which |
674 |
> |
is chosen. |
675 |
|
|
676 |
< |
Calculating long-range (non-bonded) potential involves a sum over all |
677 |
< |
pairs of atoms (except for those atoms which are involved in a bond, |
678 |
< |
bend, or torsion with each other). If done poorly, calculating the |
679 |
< |
the long-range interactions for $N$ atoms would involve $N^2$ |
680 |
< |
evaluations of atomic distance. To reduce the number of distance |
676 |
> |
Calculating the long-range (non-bonded) potential involves a sum over |
677 |
> |
all pairs of atoms (except for those atoms which are involved in a |
678 |
> |
bond, bend, or torsion with each other). If done poorly, calculating |
679 |
> |
the the long-range interactions for $N$ atoms would involve $N(N-1)/2$ |
680 |
> |
evaluations of atomic distances. To reduce the number of distance |
681 |
|
evaluations between pairs of atoms, {\sc oopse} uses a switched cutoff |
682 |
|
with Verlet neighbor lists.\cite{allen87:csl} It is well known that |
683 |
|
neutral groups which contain charges will exhibit pathological forces |
692 |
|
where $r_{ab}$ is the distance between the centers of mass of the two |
693 |
|
cutoff groups ($a$ and $b$). |
694 |
|
|
695 |
< |
The sums over $a$ and $b$ are over the cutoffGroups that are present |
695 |
> |
The sums over $a$ and $b$ are over the cutoff groups that are present |
696 |
|
in the simulation. Atoms which are not explicitly defined as members |
697 |
|
of a {\tt cutoffGroup} are treated as a group consisting of only one |
698 |
|
atom. The switching function, $s(r)$ is the standard cubic switching |
718 |
|
atoms, the cutoff radius has a default value of $2.5\sigma_{ii}$, |
719 |
|
where $\sigma_{ii}$ is the largest Lennard-Jones length parameter |
720 |
|
present in the simulation. In simulations containing charged or |
721 |
< |
dipolar atoms, the default cutoff Radius is $15 \mbox{\AA}$. |
721 |
> |
dipolar atoms, the default cutoff radius is $15 \mbox{\AA}$. |
722 |
|
|
723 |
|
The {\tt switchingRadius} is set to a default value of 95\% of the |
724 |
|
{\tt cutoffRadius}. In the special case of a simulation containing |
727 |
|
potential to remove discontinuities in the potential at the cutoff. |
728 |
|
Both radii may be specified in the meta-data file. |
729 |
|
|
730 |
< |
Force fields can easily be added to {\sc oopse}, although it comes |
731 |
< |
with a few simple examples (Lennard-Jones, {\sc duff}, {\sc water}, |
732 |
< |
and {\sc eam}) which are explained in the following sections. |
730 |
> |
Force fields can be added to {\sc oopse}, although it comes with a few |
731 |
> |
simple examples (Lennard-Jones, {\sc duff}, {\sc water}, and {\sc |
732 |
> |
eam}) which are explained in the following sections. |
733 |
|
|
734 |
|
\subsection{\label{sec:LJPot}The Lennard Jones Force Field} |
735 |
|
|
1081 |
|
\subsection{\label{oopseSec:WATER}The {\sc water} Force Field} |
1082 |
|
|
1083 |
|
In addition to the {\sc duff} force field's solvent description, a |
1084 |
< |
separate {\sc water} force field has been included for simulating many |
1085 |
< |
of the common rigid-body water models. In addition to the simple or |
1086 |
< |
dipolar models (SSD, SSD1, SSD/E, SSD/RF, and DPD water), the common |
1087 |
< |
charge-based models were included (SPC, SPC/E, TIP3P, TIP4P, and |
1084 |
> |
separate {\sc water} force field has been included for simulating most |
1085 |
> |
of the common rigid-body water models. This force field includes the |
1086 |
> |
simple and point-dipolar models (SSD, SSD1, SSD/E, SSD/RF, and DPD |
1087 |
> |
water), as well as the common charge-based models (SPC, SPC/E, TIP3P, |
1088 |
> |
TIP4P, and |
1089 |
|
TIP5P).\cite{liu96:new_model,Ichiye03,fennell04,Marrink01,Berendsen81,Berendsen87,Jorgensen83,Mahoney00} |
1090 |
|
In order to handle these models, charge-charge interactions were |
1091 |
|
included in the force-loop: |
1093 |
|
V_{\text{charge}}(r_{ij}) = \sum_{ij}\frac{q_iq_je^2}{r_{ij}}, |
1094 |
|
\end{equation} |
1095 |
|
where $q$ represents the charge on particle $i$ or $j$, and $e$ is the |
1096 |
< |
charge of an electron in Coulombs. The charge-charge interaction |
1097 |
< |
support is rudimentary in the current version of {\sc oopse}. As with |
1098 |
< |
the other pair interactions, charges can be simulated with a pure |
1099 |
< |
cutoff or a reaction field. The various methods for performing the |
1100 |
< |
Ewald summation have not yet been included. Also, the charge-dipole |
1101 |
< |
and charge-quadrupole (for interactions between SSD type water and |
1102 |
< |
charges) are not yet available, so it is currently inadvisable to mix |
1101 |
< |
dipolar and charge based molecules in the same system. |
1096 |
> |
charge of an electron in Coulombs. As with the other pair |
1097 |
> |
interactions, charges can be simulated with a pure cutoff or a |
1098 |
> |
reaction field. The {\sc water} force field can be easily expanded |
1099 |
> |
through modification of the {\sc water} force field file ({\tt |
1100 |
> |
WATER.frc}). By adding atom types and inserting the appropriate |
1101 |
> |
parameters, it is possible to extend the force field to handle rigid |
1102 |
> |
molecules other than water. |
1103 |
|
|
1103 |
– |
The {\sc water} force field can be easily expanded through |
1104 |
– |
modification of the {\sc water} force field file ({\tt WATER.frc}). By |
1105 |
– |
adding atom types and inserting the appropriate parameters, it is |
1106 |
– |
possible to extend the force field to handle rigid molecules other |
1107 |
– |
than water. |
1108 |
– |
|
1104 |
|
\subsection{\label{oopseSec:eam}Embedded Atom Method} |
1105 |
|
|
1106 |
|
{\sc oopse} implements a potential that describes bonding in |
1247 |
|
\section{\label{oopseSec:mechanics}Mechanics} |
1248 |
|
|
1249 |
|
\subsection{\label{oopseSec:integrate}Integrating the Equations of Motion: the |
1250 |
< |
DLM method} |
1250 |
> |
{\sc dlm} method} |
1251 |
|
|
1252 |
|
The default method for integrating the equations of motion in {\sc |
1253 |
|
oopse} is a velocity-Verlet version of the symplectic splitting method |
1254 |
|
proposed by Dullweber, Leimkuhler and McLachlan |
1255 |
< |
(DLM).\cite{Dullweber1997} When there are no directional atoms or |
1255 |
> |
({\sc dlm}).\cite{Dullweber1997} When there are no directional atoms or |
1256 |
|
rigid bodies present in the simulation, this integrator becomes the |
1257 |
|
standard velocity-Verlet integrator which is known to sample the |
1258 |
|
microcanonical (NVE) ensemble.\cite{Frenkel1996} |
1259 |
|
|
1260 |
|
Previous integration methods for orientational motion have problems |
1261 |
< |
that are avoided in the DLM method. Direct propagation of the Euler |
1261 |
> |
that are avoided in the {\sc dlm} method. Direct propagation of the Euler |
1262 |
|
angles has a known $1/\sin\theta$ divergence in the equations of |
1263 |
|
motion for $\phi$ and $\psi$,\cite{allen87:csl} leading to numerical |
1264 |
|
instabilities any time one of the directional atoms or rigid bodies |
1366 |
|
Here $\hat{\bf u}$ is a unit vector pointing along the principal axis |
1367 |
|
of the particle in the space-fixed frame. |
1368 |
|
|
1369 |
< |
The DLM method uses a Trotter factorization of the orientational |
1369 |
> |
The {\sc dlm} method uses a Trotter factorization of the orientational |
1370 |
|
propagator. This has three effects: |
1371 |
|
\begin{enumerate} |
1372 |
|
\item the integrator is area-preserving in phase space (i.e. it is |
1460 |
|
+ \frac{h}{2} {\bf \tau}^b(t + h) . |
1461 |
|
\end{align*} |
1462 |
|
|
1463 |
< |
The matrix rotations used in the DLM method end up being more costly |
1464 |
< |
computationally than the simpler arithmetic quaternion |
1463 |
> |
The matrix rotations used in the {\sc dlm} method end up being more |
1464 |
> |
costly computationally than the simpler arithmetic quaternion |
1465 |
|
propagation. With the same time step, a 1024-molecule water simulation |
1466 |
< |
shows an 12\% increase in computation time (averaged over several |
1467 |
< |
different time steps) using the DLM method in place of |
1468 |
< |
quaternions. This cost is more than justified when comparing the |
1469 |
< |
energy conservation of the two methods. Figure ~\ref{quatdlm} provides |
1470 |
< |
a comparative analysis of the {\sc dlm} method versus the simple |
1476 |
< |
quaternion method that was originally implemented. |
1466 |
> |
incurs an average 12\% increase in computation time using the {\sc |
1467 |
> |
dlm} method in place of quaternions. This cost is more than justified |
1468 |
> |
when comparing the energy conservation achieved by the two |
1469 |
> |
methods. Figure ~\ref{quatdlm} provides a comparative analysis of the |
1470 |
> |
{\sc dlm} method versus the traditional quaternion scheme. |
1471 |
|
|
1472 |
|
\begin{figure} |
1473 |
|
\centering |
1474 |
|
\includegraphics[width=\linewidth]{quatvsdlm.eps} |
1475 |
|
\caption[Energy conservation analysis of the {\sc dlm} and quaternion |
1476 |
< |
integration methods]{The logarithm of absolute value of the slope of |
1477 |
< |
the energy drift (\delta E$_1$) and the standard deviation of the |
1478 |
< |
energy fluctuations (\delta E$_0$) as a function of chosen time |
1479 |
< |
step. All simulations were of a 1024-molecule simulation of SSD water |
1480 |
< |
at 298 K starting from the same initial configuration. Note that the |
1481 |
< |
{\sc dlm} method provides a greater-than order-of-magnitude |
1482 |
< |
improvement in energy conservation and relative energy fluctuations |
1483 |
< |
over the quaternion method at all the tested time steps. The energy |
1484 |
< |
drift is quite steep for the larger time steps in both methods, and |
1485 |
< |
results in discontinuous behavior as the systems compound their |
1486 |
< |
anomolous energy accumulation.} |
1476 |
> |
integration methods]{Analysis of the energy conservation of the {\sc |
1477 |
> |
dlm} and quaternion integration methods. $\delta \mathrm{E}_1$ is the |
1478 |
> |
linear drift in energy over time and $\delta \mathrm{E}_0$ is the |
1479 |
> |
standard deviation of energy fluctuations around this drift. All |
1480 |
> |
simulations were of a 1024-molecule simulation of SSD water at 298 K |
1481 |
> |
starting from the same initial configuration. Note that the {\sc dlm} |
1482 |
> |
method provides more than an order of magnitude improvement in both |
1483 |
> |
the energy drift and the size of the energy fluctuations when compared |
1484 |
> |
with the quaternion method at any given time step. At time steps |
1485 |
> |
larger than 4 fs, the quaternion scheme resulted in rapidly rising |
1486 |
> |
energies which eventually lead to simulation failure. Using the {\sc |
1487 |
> |
dlm} method, time steps up to 8 fs can be taken before this behavior |
1488 |
> |
is evident.} |
1489 |
|
\label{quatdlm} |
1490 |
|
\end{figure} |
1491 |
|
|
1492 |
< |
In Fig.~\ref{quatdlm}, \delta E$_1$ is a measure of the linear energy |
1493 |
< |
drift in units of kcal/mol per particle over a nanosecond of |
1494 |
< |
simulation time, and \delta E$_0$ is the standard deviation of the |
1495 |
< |
energy fluctuations in units of kcal/mol per particle. In the top |
1496 |
< |
plot, it is apparent that the energy drift is reduced by a significant |
1497 |
< |
amount (2 to 3 orders-of-magnitude improvement at every tested time |
1498 |
< |
step) by chosing the {\sc dlm} method over the simple non-symplectic |
1499 |
< |
quaternion integration method. When the energy drift becomes very |
1500 |
< |
small ($log_{10}[|\delta\text{E}_1|] < -3$), it is more difficult to |
1501 |
< |
calculate a slope, resulting in the larger displayed error bars. In |
1502 |
< |
addition to this improvement in energy drift, the fluctuation is the |
1507 |
< |
total energy are also dampened out by 1 to 2 orders-of-magnitude by |
1508 |
< |
utilizing the {\sc dlm} integration method. |
1492 |
> |
In Fig.~\ref{quatdlm}, $\delta \mbox{E}_1$ is a measure of the linear |
1493 |
> |
energy drift in units of $\mbox{kcal mol}^{-1}$ per particle over a |
1494 |
> |
nanosecond of simulation time, and $\delta \mbox{E}_0$ is the standard |
1495 |
> |
deviation of the energy fluctuations in units of $\mbox{kcal |
1496 |
> |
mol}^{-1}$ per particle. In the top plot, it is apparent that the |
1497 |
> |
energy drift is reduced by a significant amount (2 to 3 orders of |
1498 |
> |
magnitude improvement at all tested time steps) by chosing the {\sc |
1499 |
> |
dlm} method over the simple non-symplectic quaternion integration |
1500 |
> |
method. In addition to this improvement in energy drift, the |
1501 |
> |
fluctuations in the total energy are also dampened by 1 to 2 orders of |
1502 |
> |
magnitude by utilizing the {\sc dlm} method. |
1503 |
|
|
1504 |
< |
It was stated previously that the {\sc dlm} method was the more |
1505 |
< |
computationally expensive of the two implimented integration |
1506 |
< |
methodologies. In order to incorporate this information into the |
1507 |
< |
energy analysis a plot of energy drift versus computational cost was |
1508 |
< |
generated (Fig.~\ref{cpuCost}). This figure provides an estimate of |
1509 |
< |
the CPU time required under the two integration schemes for 1 |
1510 |
< |
nanosecond of simulation time for the model 1024-molecule system. The |
1511 |
< |
plot is read by chosing a desired energy drift value and determining |
1512 |
< |
where both the curves cross. If a \delta E$_1$ of 1E-3 kcal/mol per |
1513 |
< |
particle is desired, a nanosecond of simulation time will require ~19 |
1514 |
< |
hours of CPU time with the {\sc dlm} integrator, while the same small |
1515 |
< |
drift value will require ~154 hours of CPU time. This demonstrates the |
1516 |
< |
computational advantage of the integration scheme utilized in {\sc |
1517 |
< |
oopse}. |
1504 |
> |
Although the {\sc dlm} method is more computationally expensive than |
1505 |
> |
the traditional quaternion scheme for propagating a single time step, |
1506 |
> |
consideration of the computational cost for a long simulation with a |
1507 |
> |
particular level of energy conservation is in order. A plot of energy |
1508 |
> |
drift versus computational cost was generated |
1509 |
> |
(Fig.~\ref{cpuCost}). This figure provides an estimate of the CPU time |
1510 |
> |
required under the two integration schemes for 1 nanosecond of |
1511 |
> |
simulation time for the model 1024-molecule system. By chosing a |
1512 |
> |
desired energy drift value it is possible to determine the CPU time |
1513 |
> |
required for both methods. If a $\delta \mbox{E}_1$ of $1 \times |
1514 |
> |
10^{-3} \mbox{kcal mol}^{-1}$ per particle is desired, a nanosecond of |
1515 |
> |
simulation time will require ~19 hours of CPU time with the {\sc dlm} |
1516 |
> |
integrator, while the quaternion scheme will require ~154 hours of CPU |
1517 |
> |
time. This demonstrates the computational advantage of the integration |
1518 |
> |
scheme utilized in {\sc oopse}. |
1519 |
|
|
1520 |
|
\begin{figure} |
1521 |
|
\centering |
1522 |
|
\includegraphics[width=\linewidth]{compCost.eps} |
1523 |
|
\caption[Energy drift as a function of required simulation run |
1524 |
< |
time]{The logarithm of absolute value of the slope of the energy drift |
1525 |
< |
(\delta E$_1$) as a function of simulation run time. Simulations were |
1526 |
< |
performed on a single 2.5 GHz Pentium IV processor. Simulation time |
1527 |
< |
comparisons can be made by tracing horizontally from one curve to the |
1528 |
< |
other. For example, a simulation that takes ~24 hours using the {\sc |
1529 |
< |
dlm} method will take roughly 210 hours using the simple quaternion |
1530 |
< |
method if the same degree of energy conservation is desired.} |
1524 |
> |
time]{Energy drift as a function of required simulation run time. |
1525 |
> |
$\delta \mathrm{E}_1$ is the linear drift in energy over time. |
1526 |
> |
Simulations were performed on a single 2.5 GHz Pentium 4 |
1527 |
> |
processor. Simulation time comparisons can be made by tracing |
1528 |
> |
horizontally from one curve to the other. For example, a simulation |
1529 |
> |
that takes ~24 hours using the {\sc dlm} method will take roughly 210 |
1530 |
> |
hours using the simple quaternion method if the same degree of energy |
1531 |
> |
conservation is desired.} |
1532 |
|
\label{cpuCost} |
1533 |
|
\end{figure} |
1534 |
|
|
1638 |
|
\end{equation} |
1639 |
|
Here, $f$ is the total number of degrees of freedom in the system, |
1640 |
|
\begin{equation} |
1641 |
< |
f = 3 N + 3 N_{\mathrm{orient}} - N_{\mathrm{constraints}}, |
1641 |
> |
f = 3 N + 2 N_{\mathrm{linear}} + 3 N_{\mathrm{non-linear}} - N_{\mathrm{constraints}}, |
1642 |
|
\end{equation} |
1643 |
|
and $K$ is the total kinetic energy, |
1644 |
|
\begin{equation} |
1645 |
|
K = \sum_{i=1}^{N} \frac{1}{2} m_i {\bf v}_i^T \cdot {\bf v}_i + |
1646 |
< |
\sum_{i=1}^{N_{\mathrm{orient}}} \frac{1}{2} {\bf j}_i^T \cdot |
1646 |
> |
\sum_{i=1}^{N_{\mathrm{linear}}+N_{\mathrm{non-linear}}} \frac{1}{2} {\bf j}_i^T \cdot |
1647 |
|
\overleftrightarrow{\mathsf{I}}_i^{-1} \cdot {\bf j}_i. |
1648 |
|
\end{equation} |
1649 |
+ |
$N_{\mathrm{linear}}$ is the number of linear rotors (i.e. with two |
1650 |
+ |
non-zero moments of inertia), and $N_{\mathrm{non-linear}}$ is the |
1651 |
+ |
number of non-linear rotors (i.e. with three non-zero moments of |
1652 |
+ |
inertia). |
1653 |
|
|
1654 |
|
In eq.(\ref{eq:nosehooverext}), $\tau_T$ is the time constant for |
1655 |
|
relaxation of the temperature to the target value. To set values for |
1687 |
|
Here $\mathrm{rotate}(h * {\bf j} |
1688 |
|
\overleftrightarrow{\mathsf{I}}^{-1})$ is the same symplectic Trotter |
1689 |
|
factorization of the three rotation operations that was discussed in |
1690 |
< |
the section on the DLM integrator. Note that this operation modifies |
1690 |
> |
the section on the {\sc dlm} integrator. Note that this operation modifies |
1691 |
|
both the rotation matrix $\mathsf{A}$ and the angular momentum ${\bf |
1692 |
|
j}$. {\tt moveA} propagates velocities by a half time step, and |
1693 |
|
positional degrees of freedom by a full time step. The new positions |
1694 |
|
(and orientations) are then used to calculate a new set of forces and |
1695 |
|
torques in exactly the same way they are calculated in the {\tt |
1696 |
< |
doForces} portion of the DLM integrator. |
1696 |
> |
doForces} portion of the {\sc dlm} integrator. |
1697 |
|
|
1698 |
|
Once the forces and torques have been obtained at the new time step, |
1699 |
|
the temperature, velocities, and the extended system variable can be |
1798 |
|
where ${\bf r}_{ab}$ is the distance between the centers of mass, and |
1799 |
|
\begin{equation} |
1800 |
|
{\bf f}_{ab} = s(r_{ab}) \sum_{i \in a} \sum_{j \in b} {\bf f}_{ij} + |
1801 |
< |
s\prime(r_{ab}) \frac{{\bf r}_{ab}}{|r_{ab}|} \sum_{i \in a} \sum_{j |
1801 |
> |
s^{\prime}(r_{ab}) \frac{{\bf r}_{ab}}{|r_{ab}|} \sum_{i \in a} \sum_{j |
1802 |
|
\in b} V_{ij}({\bf r}_{ij}). |
1803 |
|
\end{equation} |
1804 |
|
|
1805 |
|
The instantaneous pressure is then simply obtained from the trace of |
1806 |
|
the pressure tensor, |
1807 |
|
\begin{equation} |
1808 |
< |
P(t) = \frac{1}{3} \mathrm{Tr} \left( \overleftrightarrow{\mathsf{P}}(t). |
1809 |
< |
\right) |
1808 |
> |
P(t) = \frac{1}{3} \mathrm{Tr} \left( \overleftrightarrow{\mathsf{P}}(t) |
1809 |
> |
\right). |
1810 |
|
\end{equation} |
1811 |
|
|
1812 |
|
In eq.(\ref{eq:melchionna1}), $\tau_B$ is the time constant for |
1816 |
|
file. The units for {\tt tauBarostat} are fs, and the units for the |
1817 |
|
{\tt targetPressure} are atmospheres. Like in the NVT integrator, the |
1818 |
|
integration of the equations of motion is carried out in a |
1819 |
< |
velocity-Verlet style 2 part algorithm with only the following differences: |
1819 |
> |
velocity-Verlet style two part algorithm with only the following |
1820 |
> |
differences: |
1821 |
|
|
1822 |
|
{\tt moveA:} |
1823 |
|
\begin{align*} |
1850 |
|
the box by |
1851 |
|
\begin{equation} |
1852 |
|
\mathcal{V}(t + h) \leftarrow e^{ - 3 h \eta(t + h /2)} \times |
1853 |
< |
\mathcal{V}(t) |
1853 |
> |
\mathcal{V}(t). |
1854 |
|
\end{equation} |
1855 |
|
|
1856 |
|
The {\tt doForces} step for the NPTi integrator is exactly the same as |
1857 |
< |
in both the DLM and NVT integrators. Once the forces and torques have |
1857 |
> |
in both the {\sc dlm} and NVT integrators. Once the forces and torques have |
1858 |
|
been obtained at the new time step, the velocities can be advanced to |
1859 |
|
the same time value. |
1860 |
|
|
2079 |
|
subtract the total constraint forces from the rest of the system after |
2080 |
|
the force calculation at each time step. |
2081 |
|
|
2082 |
< |
After the force calculation, the total force on molecule $\alpha$, |
2082 |
> |
After the force calculation, the total force on molecule $\alpha$ is: |
2083 |
|
\begin{equation} |
2084 |
|
G_{\alpha} = \sum_i F_{\alpha i}, |
2085 |
|
\label{oopseEq:zc1} |
2100 |
|
v_{\alpha i} = v_{\alpha i} - |
2101 |
|
\frac{\sum_i m_{\alpha i} v_{\alpha i}}{\sum_i m_{\alpha i}}, |
2102 |
|
\end{equation} |
2103 |
< |
where $v_{\alpha i}$ is the velocity of atom $i$ in the z direction. |
2103 |
> |
where $v_{\alpha i}$ is the velocity of atom $i$ in the $z$ direction. |
2104 |
|
Lastly, all of the accumulated constraint forces must be subtracted |
2105 |
|
from the rest of the unconstrained system to keep the system center of |
2106 |
|
mass of the entire system from drifting. |
2138 |
|
|
2139 |
|
The user may also specify the use of a constant velocity method |
2140 |
|
(steered molecular dynamics) to move the molecules to their desired |
2141 |
< |
initial positions. |
2141 |
> |
initial positions. Based on concepts from atomic force microscopy, |
2142 |
> |
{\sc smd} has been used to study many processes which occur via rare |
2143 |
> |
events on the time scale of a few hundreds of picoseconds. For |
2144 |
> |
example,{\sc smd} has been used to observe the dissociation of |
2145 |
> |
Streptavidin-biotin Complex.\cite{smd} |
2146 |
|
|
2147 |
|
To use of the $z$-constraint method in an {\sc oopse} simulation, the |
2148 |
|
molecules must be specified using the {\tt nZconstraints} keyword in |
2150 |
|
of the $z$-constraint method are listed in table~\ref{table:zconParams}. |
2151 |
|
|
2152 |
|
\begin{table} |
2153 |
< |
\caption{The Global Keywords: Z-Constraint Parameters} |
2153 |
> |
\caption{Meta-data Keywords: Z-Constraint Parameters} |
2154 |
|
\label{table:zconParams} |
2155 |
|
\begin{center} |
2156 |
|
% Note when adding or removing columns, the \hsize numbers must add up to the total number |
2163 |
|
|
2164 |
|
{\bf keyword} & {\bf units} & {\bf use} & {\bf remarks} \\ \hline |
2165 |
|
|
2166 |
< |
{\tt nZconstraints} & integer & The number of zconstraint molecules& If using zconstraint method, {\tt nZconstraints} must be set \\ |
2167 |
< |
{\tt zconsTime} & fs & Sets the frequency at which the {\tt .fz} file is written & \\ |
2168 |
< |
{\tt zconsForcePolicy} & string & The strategy of subtracting |
2169 |
< |
zconstraint force from the unconstrained molecules & Possible |
2170 |
< |
strategies are {\tt BYMASS} and {\tt BYNUMBER}. Default |
2171 |
< |
strategy is set to {\tt BYMASS}\\ |
2172 |
< |
{\tt zconsGap} & $\mbox{\AA}$ & Set the distance between two adjacent |
2173 |
< |
constraint positions& Used mainly in moving molecules through a simulation \\ |
2174 |
< |
{\tt zconsFixtime} & fs & Sets how long the zconstraint molecule is |
2175 |
< |
fixed & {\tt zconsFixtime} must be set if {\tt zconsGap} is set\\ |
2176 |
< |
{\tt zconsUsingSMD} &logical & Flag for using Steered Molecular |
2177 |
< |
Dynamics or Harmonic Forces to move the molecule & Harmonic Forces are |
2178 |
< |
used by default\\ |
2166 |
> |
{\tt nZconstraints} & integer & The number of $z$-constrained |
2167 |
> |
molecules & If using the $z$-constraint method, {\tt nZconstraints} |
2168 |
> |
must be set \\ |
2169 |
> |
{\tt zconsTime} & fs & Sets the frequency at which the {\tt .fz} file |
2170 |
> |
is written & \\ |
2171 |
> |
{\tt zconsForcePolicy} & string & The strategy for subtracting |
2172 |
> |
the $z$-constraint force from the {\it unconstrained} molecules & Possible |
2173 |
> |
strategies are {\tt BYMASS} and {\tt BYNUMBER}. The default |
2174 |
> |
strategy is {\tt BYMASS}\\ |
2175 |
> |
{\tt zconsGap} & $\mbox{\AA}$ & Sets the distance between two adjacent |
2176 |
> |
constraint positions&Used mainly to move molecules through a |
2177 |
> |
simulation to estimate potentials of mean force. \\ |
2178 |
> |
{\tt zconsFixtime} & fs & Sets the length of time the $z$-constraint |
2179 |
> |
molecule is held fixed & {\tt zconsFixtime} must be set if {\tt |
2180 |
> |
zconsGap} is set\\ |
2181 |
> |
{\tt zconsUsingSMD} & logical & Flag for using Steered Molecular |
2182 |
> |
Dynamics to move the molecules to the correct constrained positions & |
2183 |
> |
Harmonic Forces are used by default\\ |
2184 |
|
|
2185 |
|
\end{tabularx} |
2186 |
|
\end{center} |
2197 |
|
stable fixed points on the surface. We have included two simple |
2198 |
|
minimization algorithms: steepest descent, ({\sc sd}) and conjugate |
2199 |
|
gradient ({\sc cg}) to help users find reasonable local minima from |
2200 |
< |
their initial configurations. |
2200 |
> |
their initial configurations. Since {\sc oopse} handles atoms and |
2201 |
> |
rigid bodies which have orientational coordinates as well as |
2202 |
> |
translational coordinates, there is some subtlety to the choice of |
2203 |
> |
parameters for minimization algorithms. |
2204 |
|
|
2192 |
– |
Since {\sc oopse} handles atoms and rigid bodies which have |
2193 |
– |
orientational coordinates as well as translational coordinates, there |
2194 |
– |
is some subtlety to the choice of parameters for minimization |
2195 |
– |
algorithms. |
2196 |
– |
|
2205 |
|
Given a coordinate set $x_{k}$ and a search direction $d_{k}$, a line |
2206 |
|
search algorithm is performed along $d_{k}$ to produce |
2207 |
< |
$x_{k+1}=x_{k}+$ $\lambda _{k}d_{k}$. |
2208 |
< |
|
2201 |
< |
In the steepest descent ({\sc sd}) algorithm,% |
2207 |
> |
$x_{k+1}=x_{k}+$ $\lambda _{k}d_{k}$. In the steepest descent ({\sc |
2208 |
> |
sd}) algorithm,% |
2209 |
|
\begin{equation} |
2210 |
< |
d_{k}=-\nabla V(x_{k}) |
2210 |
> |
d_{k}=-\nabla V(x_{k}). |
2211 |
|
\end{equation} |
2212 |
|
The gradient and the direction of next step are always orthogonal. |
2213 |
|
This may cause oscillatory behavior in narrow valleys. To overcome |
2214 |
|
this problem, the Fletcher-Reeves variant~\cite{FletcherReeves} of the |
2215 |
|
conjugate gradient ({\sc cg}) algorithm is used to generate $d_{k+1}$ |
2216 |
|
via simple recursion: |
2217 |
< |
\begin{align} |
2218 |
< |
d_{k+1} & =-\nabla V(x_{k+1})+\gamma_{k}d_{k}\\ |
2219 |
< |
\gamma_{k} & =\frac{\nabla V(x_{k+1})^{T}\nabla V(x_{k+1})}{\nabla |
2220 |
< |
V(x_{k})^{T}\nabla V(x_{k})}% |
2221 |
< |
\end{align} |
2217 |
> |
\begin{equation} |
2218 |
> |
d_{k+1} =-\nabla V(x_{k+1})+\gamma_{k}d_{k} |
2219 |
> |
\end{equation} |
2220 |
> |
where |
2221 |
> |
\begin{equation} |
2222 |
> |
\gamma_{k} =\frac{\nabla V(x_{k+1})^{T}\nabla V(x_{k+1})}{\nabla |
2223 |
> |
V(x_{k})^{T}\nabla V(x_{k})}. |
2224 |
> |
\end{equation} |
2225 |
|
|
2226 |
|
The Polak-Ribiere variant~\cite{PolakRibiere} of the conjugate |
2227 |
|
gradient ($\gamma_{k}$) is defined as% |
2229 |
|
\gamma_{k}=\frac{[\nabla V(x_{k+1})-\nabla V(x)]^{T}\nabla V(x_{k+1})}{\nabla |
2230 |
|
V(x_{k})^{T}\nabla V(x_{k})}% |
2231 |
|
\end{equation} |
2232 |
+ |
It is widely agreed that the Polak-Ribiere variant gives better |
2233 |
+ |
convergence than the Fletcher-Reeves variant, so the conjugate |
2234 |
+ |
gradient approach implemented in {\sc oopse} is the Polak-Ribiere |
2235 |
+ |
variant. |
2236 |
|
|
2237 |
|
The conjugate gradient method assumes that the conformation is close |
2238 |
|
enough to a local minimum that the potential energy surface is very |
2246 |
|
minimizer are given in Table~\ref{table:minimizeParams} |
2247 |
|
|
2248 |
|
\begin{table} |
2249 |
< |
\caption{The Global Keywords: Energy Minimizer Parameters} |
2249 |
> |
\caption{Meta-data Keywords: Energy Minimizer Parameters} |
2250 |
|
\label{table:minimizeParams} |
2251 |
|
\begin{center} |
2252 |
|
% Note when adding or removing columns, the \hsize numbers must add up to the total number |
2262 |
|
{\tt minimizer} & string & selects the minimization method to be used |
2263 |
|
& either {\tt CG} (conjugate gradient) or {\tt SD} (steepest |
2264 |
|
descent) \\ |
2265 |
< |
{\tt minimizerMaxIter} & steps & Sets the maximum iteration number in the energy minimization & Default value is 200\\ |
2266 |
< |
{\tt minimizerWriteFreq} & steps & Sets the frequency at which the {\tt .dump} and {\tt .stat} files are writtern during energy minimization & \\ |
2267 |
< |
{\tt minimizerStepSize} & $\mbox{\AA}$ & Set the step size of line search & Default value is 0.01\\ |
2268 |
< |
{\tt minimizerFTol} & $\mbox{kcal mol}^{-1}$ & Sets energy tolerance & Default value is $10^{-8}$\\ |
2269 |
< |
{\tt minimizerGTol} & $\mbox{kcal mol}^{-1}\mbox{\AA}^{-1}$ & Sets gradient tolerance & Default value is $10^{-8}$\\ |
2270 |
< |
{\tt minimizerLSTol} & $\mbox{kcal mol}^{-1}$ & Sets line search tolerance & Default value is $10^{-8}$\\ |
2271 |
< |
{\tt minimizerLSMaxIter} & steps & Sets the maximum iteration of line searching & Default value is 50\\ |
2265 |
> |
{\tt minimizerMaxIter} & steps & Sets the maximum number of iterations |
2266 |
> |
for the energy minimization & The default value is 200\\ |
2267 |
> |
{\tt minimizerWriteFreq} & steps & Sets the frequency with which the {\tt .dump} and {\tt .stat} files are writtern during energy minimization & \\ |
2268 |
> |
{\tt minimizerStepSize} & $\mbox{\AA}$ & Sets the step size for the |
2269 |
> |
line search & The default value is 0.01\\ |
2270 |
> |
{\tt minimizerFTol} & $\mbox{kcal mol}^{-1}$ & Sets the energy tolerance |
2271 |
> |
for stopping the minimziation. & The default value is $10^{-8}$\\ |
2272 |
> |
{\tt minimizerGTol} & $\mbox{kcal mol}^{-1}\mbox{\AA}^{-1}$ & Sets the |
2273 |
> |
gradient tolerance for stopping the minimization. & The default value |
2274 |
> |
is $10^{-8}$\\ |
2275 |
> |
{\tt minimizerLSTol} & $\mbox{kcal mol}^{-1}$ & Sets line search |
2276 |
> |
tolerance for terminating each step of the minimization. & The default |
2277 |
> |
value is $10^{-8}$\\ |
2278 |
> |
{\tt minimizerLSMaxIter} & steps & Sets the maximum number of |
2279 |
> |
iterations for each line search & The default value is 50\\ |
2280 |
|
|
2281 |
|
\end{tabularx} |
2282 |
|
\end{center} |
2285 |
|
\section{\label{oopseSec:parallelization} Parallel Simulation Implementation} |
2286 |
|
|
2287 |
|
Although processor power is continually improving, it is still |
2288 |
< |
unreasonable to simulate systems of more then a 1000 atoms on a single |
2288 |
> |
unreasonable to simulate systems of more than 10,000 atoms on a single |
2289 |
|
processor. To facilitate study of larger system sizes or smaller |
2290 |
|
systems for longer time scales, parallel methods were developed to |
2291 |
|
allow multiple CPU's to share the simulation workload. Three general |
2296 |
|
Algorithmically simplest of the three methods is atomic decomposition, |
2297 |
|
where $N$ particles in a simulation are split among $P$ processors for |
2298 |
|
the duration of the simulation. Computational cost scales as an |
2299 |
< |
optimal $\mathcal{O}(N/P)$ for atomic decomposition. Unfortunately all |
2299 |
> |
optimal $\mathcal{O}(N/P)$ for atomic decomposition. Unfortunately, all |
2300 |
|
processors must communicate positions and forces with all other |
2301 |
|
processors at every force evaluation, leading the communication costs |
2302 |
|
to scale as an unfavorable $\mathcal{O}(N)$, \emph{independent of the |
2318 |
|
three dimensions. |
2319 |
|
|
2320 |
|
The parallelization method used in {\sc oopse} is the force |
2321 |
< |
decomposition method. Force decomposition assigns particles to |
2322 |
< |
processors based on a block decomposition of the force |
2321 |
> |
decomposition method.\cite{hendrickson:95} Force decomposition assigns |
2322 |
> |
particles to processors based on a block decomposition of the force |
2323 |
|
matrix. Processors are split into an optimally square grid forming row |
2324 |
|
and column processor groups. Forces are calculated on particles in a |
2325 |
|
given row by particles located in that processor's column |
2326 |
< |
assignment. Force decomposition is less complex to implement than the |
2327 |
< |
spatial method but still scales computationally as $\mathcal{O}(N/P)$ |
2328 |
< |
and scales as $\mathcal{O}(N/\sqrt{P})$ in communication |
2329 |
< |
cost. Plimpton has also found that force decompositions scale more |
2330 |
< |
favorably than spatial decompositions for systems up to 10,000 atoms |
2331 |
< |
and favorably compete with spatial methods up to 100,000 |
2332 |
< |
atoms.\cite{plimpton95} |
2333 |
< |
|
2326 |
> |
assignment. One deviation from the algorithm described by Hendrickson |
2327 |
> |
{\it et al.} is the use of column ordering based on the row indexes |
2328 |
> |
preventing the need for a transpose operation necessitating a second |
2329 |
> |
communication step when gathering the final force components. Force |
2330 |
> |
decomposition is less complex to implement than the spatial method but |
2331 |
> |
still scales computationally as $\mathcal{O}(N/P)$ and scales as |
2332 |
> |
$\mathcal{O}(N/\sqrt{P})$ in communication cost. Plimpton has also |
2333 |
> |
found that force decompositions scale more favorably than spatial |
2334 |
> |
decompositions for systems up to 10,000 atoms and favorably compete |
2335 |
> |
with spatial methods up to 100,000 atoms.\cite{plimpton95} |
2336 |
> |
|
2337 |
|
\section{\label{oopseSec:conclusion}Conclusion} |
2338 |
|
|
2339 |
< |
We have presented a new open source parallel simulation program {\sc |
2339 |
> |
We have presented a new parallel simulation program called {\sc |
2340 |
|
oopse}. This program offers some novel capabilities, but mostly makes |
2341 |
|
available a library of modern object-oriented code for the scientific |
2342 |
|
community to use freely. Notably, {\sc oopse} can handle symplectic |
2365 |
|
the Notre Dame Bunch-of-Boxes (B.o.B) computer cluster under NSF grant |
2366 |
|
DMR-0079647. |
2367 |
|
|
2368 |
< |
\bibliographystyle{achemso} |
2368 |
> |
\bibliographystyle{jcc} |
2369 |
|
\bibliography{oopsePaper} |
2370 |
|
|
2371 |
|
\end{document} |