ViewVC Help
View File | Revision Log | Show Annotations | View Changeset | Root Listing
root/group/trunk/oopsePaper/oopsePaper.tex
(Generate patch)

Comparing trunk/oopsePaper/oopsePaper.tex (file contents):
Revision 1428 by gezelter, Wed Jul 28 19:46:08 2004 UTC vs.
Revision 1439 by gezelter, Thu Jul 29 20:06:07 2004 UTC

# Line 1 | Line 1
1   \documentclass[11pt]{article}
2   \usepackage{amsmath}
3 < \usepackage{amssymb}
3 > \usepackage{amssymb}
4   \usepackage{endfloat}
5   \usepackage{listings}
6   \usepackage{berkeley}
# Line 22 | Line 22
22          xleftmargin=0.5in, xrightmargin=0.5in,captionpos=b, %
23          abovecaptionskip=0.5cm, belowcaptionskip=0.5cm}
24   \renewcommand{\lstlistingname}{Scheme}
25 < \title{{\sc oopse}: An Open Source Object-Oriented Parallel Simulation
25 > \title{{\sc oopse}: An Object-Oriented Parallel Simulation
26   Engine for Molecular Dynamics}
27  
28   \author{Matthew A. Meineke, Charles F. Vardeman II, Teng Lin,\\
# Line 43 | Line 43 | meta-data language.  A number of advanced integrators
43   simulations are carried out using the force-based decomposition
44   method.  Simulations are specified using a very simple C-based
45   meta-data language.  A number of advanced integrators are included,
46 < and the base integrator for orientational dynamics provides
47 < substantial improvements over older quaternion-base schemes.  All
48 < source code is available under a very permissive (BSD-style) Open
49 < Source license.
46 > and the basic integrator for orientational dynamics provides
47 > substantial improvements over older quaternion-based schemes.
48   \end{abstract}
49  
50   \section{\label{sec:intro}Introduction}
51  
52   There are a number of excellent molecular dynamics packages available
53   to the chemical physics
54 < community.\cite{Brooks83,MacKerell98,pearlman:1995,Gromacs,Gromacs3,DL_POLY,Tinker,Paradyn}
54 > community.\cite{Brooks83,MacKerell98,pearlman:1995,Gromacs,Gromacs3,DL_POLY,Tinker,Paradyn,namd,macromodel}
55   All of these packages are stable, polished programs which solve many
56   problems of interest.  Most are now capable of performing molecular
57   dynamics simulations on parallel computers.  Some have source code
# Line 64 | Line 62 | has taken us now involves the use of atoms with orient
62   programs referenced can handle transition metal force fields like the
63   Embedded Atom Method ({\sc eam}).  The direction our research program
64   has taken us now involves the use of atoms with orientational degrees
65 < of freedom and transition metals.  Since these simulation methods may
66 < be of some use to other researchers, we have decided to release our
67 < program to the scientific community with a permissive open source
68 < license.
65 > of freedom as well as transition metals.  Since these simulation
66 > methods may be of some use to other researchers, we have decided to
67 > release our program (and all related source code) to the scientific
68 > community.
69  
70   This paper communicates the algorithmic details of our program, which
71 < we have been calling the Open source Object-oriented Parallel
72 < Simulation Engine (i.e. {\sc oopse}).  We have structured this paper
73 < to first discuss the underlying concepts in this simulation package
71 > we have been calling the Object-Oriented Parallel Simulation Engine
72 > (i.e. {\sc oopse}).  We have structured this paper to first discuss
73 > the underlying concepts in this simulation package
74   (Sec. \ref{oopseSec:IOfiles}).  The empirical energy functions
75   implemented are discussed in Sec.~\ref{oopseSec:empiricalEnergy}.
76   Sec.~\ref{oopseSec:mechanics} describes the various Molecular Dynamics
# Line 283 | Line 281 | scheme is often used. The elements of $\mathsf{A}$ can
281   $\psi$.\cite{Goldstein01} In order to avoid numerical instabilities
282   inherent in using the Euler angles, the four parameter ``quaternion''
283   scheme is often used. The elements of $\mathsf{A}$ can be expressed as
284 < arithmetic operations involving the four quaternions ($q_0, q_1, q_2,$
285 < and $q_3$).\cite{allen87:csl} Use of quaternions also leads to
284 > arithmetic operations involving the four quaternions ($q_w, q_x, q_y,$
285 > and $q_z$).\cite{allen87:csl} Use of quaternions also leads to
286   performance enhancements, particularly for very small
287   systems.\cite{Evans77}
288  
# Line 391 | Line 389 | mol}^{-1}$, distances are in $\mbox{\AA}$, times are i
389   It is important to note the fundamental units in all files which are
390   read and written by {\sc oopse}.  Energies are in $\mbox{kcal
391   mol}^{-1}$, distances are in $\mbox{\AA}$, times are in $\mbox{fs}$,
392 < translational velocities are in $\mbox{\AA fs}^{-1}$, and masses are
392 > translational velocities are in $\mbox{\AA~fs}^{-1}$, and masses are
393   in $\mbox{amu}$.  Orientational degrees of freedom are described using
394   quaternions (unitless, but $q_w^2 + q_x^2 + q_y^2 + q_z^2 = 1$),
395   body-fixed angular momenta ($\mbox{amu \AA}^{2} \mbox{radians
# Line 411 | Line 409 | fs}^{-1}$), and body-fixed moments of inertia ($\mbox{
409  
410   {\bf keyword} & {\bf units} & {\bf use} & {\bf remarks} \\ \hline
411  
412 < {\tt forceField} & string & Sets the force field. & Possible force fields are "DUFF", "LJ", and "EAM". \\
412 > {\tt forceField} & string & Sets the force field. & Possible force fields are DUFF, LJ, and EAM. \\
413   {\tt nComponents} & integer & Sets the number of components. & Needs to appear before the first {\tt Component} block. \\
414   {\tt initialConfig} & string & Sets the file containing the initial configuration. & Can point to any file containing the configuration in the correct order. \\
415   {\tt minimizer}& string & Chooses a minimizer & Possible minimizers
416 < are "SD" and "CG". Either {\tt ensemble} or {\tt minimizer} must be specified. \\
416 > are SD and CG. Either {\tt ensemble} or {\tt minimizer} must be specified. \\
417   {\tt ensemble} & string & Sets the ensemble. & Possible ensembles are
418 < "NVE", "NVT", "NPTi", "NPTf", and "NPTxyz".  Either {\tt ensemble}
418 > NVE, NVT, NPTi, NPTf, and NPTxyz.  Either {\tt ensemble}
419   or {\tt minimizer} must be specified. \\
420   {\tt dt} & fs & Sets the time step. & Selection of {\tt dt} should be
421 < small enough to sample the fastest motion of the simulation. (required
422 < for molecular dynamics simulations)\\
421 > small enough to sample the fastest motion of the simulation. ({\tt
422 > dt} is required for molecular dynamics simulations)\\
423   {\tt runTime} & fs & Sets the time at which the simulation should
424   end. & This is an absolute time, and will end the simulation when the
425 < current time meets or exceeds the {\tt runTime}. (required for
426 < molecular dynamics simulations)\\
425 > current time meets or exceeds the {\tt runTime}. ({\tt runTime} is
426 > required for molecular dynamics simulations)\\
427  
428   \end{tabularx}
429   \end{center}
# Line 450 | Line 448 | extension. \\
448   the root name of the initial meta-data file but with an {\tt .eor}
449   extension. \\
450   {\tt useInitialTime} & logical & Sets whether the initial time is taken from the {\tt .in} file. & Useful when recovering a simulation from a crashed processor. Default is false. \\
451 < {\tt sampleTime} & fs & Sets the frequency at which the {\tt .dump} file is written. & Default sets the frequency to the {\tt runTime}. \\
452 < {\tt statusTime} & fs & Sets the frequency at which the {\tt .stat} file is written. & Defaults set the frequency to the {\tt sampleTime}. \\
453 < {\tt cutoffRadius} & $\mbox{\AA}$ & Manually sets the cutoffRadius & Defaults to
454 < $15\mbox{\AA}$ for systems containing charges or dipoles or to $2.5
455 < \sigma_{L}$, where $\sigma_{L}$ is the largest LJ $\sigma$ in the
451 > {\tt sampleTime} & fs & Sets the frequency at which the {\tt .dump}
452 > file is written. & The default is equal to the {\tt runTime}. \\
453 > {\tt statusTime} & fs & Sets the frequency at which the {\tt .stat}
454 > file is written. & The default is equal to the {\tt sampleTime}. \\
455 > {\tt cutoffRadius} & $\mbox{\AA}$ & Manually sets the cutoff radius &
456 > Defaults to $15\mbox{\AA}$ for systems containing charges or dipoles or to $2.5
457 > \sigma_{L}$, where $\sigma_{L}$ is the largest LJ $\sigma$ in the
458   simulation. \\
459   {\tt switchingRadius} & $\mbox{\AA}$  & Manually sets the inner radius for the switching function. & Defaults to 95~\% of the {\tt cutoffRadius}. \\
460 < {\tt useReactionField} & logical & Turns the reaction field correction on/off. & Default is "false". \\
460 > {\tt useReactionField} & logical & Turns the reaction field correction on/off. & Default is false. \\
461   {\tt dielectric} & unitless & Sets the dielectric constant for reaction field. & If {\tt useReactionField} is true, then {\tt dielectric} must be set. \\
462   {\tt usePeriodicBoundaryConditions} & & & \\
463 <        & logical & Turns periodic boundary conditions on/off. & Default is "true". \\
463 >        & logical & Turns periodic boundary conditions on/off. & Default is true. \\
464   {\tt seed } & integer & Sets the seed value for the random number
465   generator. & The seed needs to be at least 9 digits long. The default
466   is to take the seed from the CPU clock. \\
467   {\tt forceFieldVariant} & string & Sets the name of the variant of the
468 < force field.  ({\sc eam} has three variants: {\tt u3}, {\tt u6}, and
468 > force field.  & {\sc eam} has three variants: {\tt u3}, {\tt u6}, and
469   {\tt VC}.
470  
471   \end{tabularx}
# Line 565 | Line 565 | declared in the molecule's declaration within the mode
565   \item The ordering of the atoms for each molecule follows the order
566   declared in the molecule's declaration within the model file.
567   \item Only atoms which are not members of a {\tt rigidBody} are
568 < included
568 > included.
569   \item Rigid Body coordinates for a molecule are listed immediately
570   after the the other atoms in a molecule.  Some molecules may be
571   entirely rigid, in which case, only the rigid body coordinates are
# Line 667 | Line 667 | V = V_{\mathrm{short-range}} + V_{\mathrm{long-range}}
667   \begin{equation}
668   V = V_{\mathrm{short-range}} + V_{\mathrm{long-range}}.
669   \end{equation}
670 < The short-ranged portion includes explicit bonds, bends and torsions,
671 < which have been defined in the meta-data file for the molecules which
672 < present in the simulation.  The functional forms and parameters for
673 < these interactions are defined by the force field which is chosen.
670 > The short-ranged portion includes the explicit bonds, bends, and
671 > torsions which have been defined in the meta-data file for the
672 > molecules which are present in the simulation.  The functional forms and
673 > parameters for these interactions are defined by the force field which
674 > is chosen.
675  
676 < Calculating long-range (non-bonded) potential involves a sum over all
677 < pairs of atoms (except for those atoms which are involved in a bond,
678 < bend, or torsion with each other).  If done poorly, calculating the
679 < the long-range interactions for $N$ atoms would involve $N^2$
680 < evaluations of atomic distance.  To reduce the number of distance
676 > Calculating the long-range (non-bonded) potential involves a sum over
677 > all pairs of atoms (except for those atoms which are involved in a
678 > bond, bend, or torsion with each other).  If done poorly, calculating
679 > the the long-range interactions for $N$ atoms would involve $N(N-1)/2$
680 > evaluations of atomic distances.  To reduce the number of distance
681   evaluations between pairs of atoms, {\sc oopse} uses a switched cutoff
682   with Verlet neighbor lists.\cite{allen87:csl} It is well known that
683   neutral groups which contain charges will exhibit pathological forces
# Line 691 | Line 692 | cutoff groups ($a$ and $b$).
692   where $r_{ab}$ is the distance between the centers of mass of the two
693   cutoff groups ($a$ and $b$).
694  
695 < The sums over $a$ and $b$ are over the cutoffGroups that are present
695 > The sums over $a$ and $b$ are over the cutoff groups that are present
696   in the simulation.  Atoms which are not explicitly defined as members
697   of a {\tt cutoffGroup} are treated as a group consisting of only one
698   atom.  The switching function, $s(r)$ is the standard cubic switching
# Line 717 | Line 718 | present in the simulation.  In simulations containing
718   atoms, the cutoff radius has a default value of $2.5\sigma_{ii}$,
719   where $\sigma_{ii}$ is the largest Lennard-Jones length parameter
720   present in the simulation.  In simulations containing charged or
721 < dipolar atoms, the default cutoff Radius is $15 \mbox{\AA}$.  
721 > dipolar atoms, the default cutoff radius is $15 \mbox{\AA}$.  
722  
723   The {\tt switchingRadius} is set to a default value of 95\% of the
724   {\tt cutoffRadius}.  In the special case of a simulation containing
# Line 726 | Line 727 | Both radii may be specified in the meta-data file.
727   potential to remove discontinuities in the potential at the cutoff.
728   Both radii may be specified in the meta-data file.
729  
730 < Force fields can easily be added to {\sc oopse}, although it comes
731 < with a few simple examples (Lennard-Jones, {\sc duff}, {\sc water},
732 < and {\sc eam}) which are explained in the following sections.
730 > Force fields can be added to {\sc oopse}, although it comes with a few
731 > simple examples (Lennard-Jones, {\sc duff}, {\sc water}, and {\sc
732 > eam}) which are explained in the following sections.
733  
734   \subsection{\label{sec:LJPot}The Lennard Jones Force Field}
735  
# Line 1077 | Line 1078 | models can be found in reference~\citen{fennell04}.
1078   and the drawbacks and benefits of the different density corrected SSD
1079   models can be found in reference~\citen{fennell04}.
1080  
1081 + \subsection{\label{oopseSec:WATER}The {\sc water} Force Field}
1082 +
1083 + In addition to the {\sc duff} force field's solvent description, a
1084 + separate {\sc water} force field has been included for simulating most
1085 + of the common rigid-body water models. This force field includes the
1086 + simple and point-dipolar models (SSD, SSD1, SSD/E, SSD/RF, and DPD
1087 + water), as well as the common charge-based models (SPC, SPC/E, TIP3P,
1088 + TIP4P, and
1089 + TIP5P).\cite{liu96:new_model,Ichiye03,fennell04,Marrink01,Berendsen81,Berendsen87,Jorgensen83,Mahoney00}
1090 + In order to handle these models, charge-charge interactions were
1091 + included in the force-loop:
1092 + \begin{equation}
1093 + V_{\text{charge}}(r_{ij}) = \sum_{ij}\frac{q_iq_je^2}{r_{ij}},
1094 + \end{equation}
1095 + where $q$ represents the charge on particle $i$ or $j$, and $e$ is the
1096 + charge of an electron in Coulombs.  As with the other pair
1097 + interactions, charges can be simulated with a pure cutoff or a
1098 + reaction field.  The {\sc water} force field can be easily expanded
1099 + through modification of the {\sc water} force field file ({\tt
1100 + WATER.frc}). By adding atom types and inserting the appropriate
1101 + parameters, it is possible to extend the force field to handle rigid
1102 + molecules other than water.
1103 +
1104   \subsection{\label{oopseSec:eam}Embedded Atom Method}
1105  
1106   {\sc oopse} implements a potential that describes bonding in
# Line 1223 | Line 1247 | the inter-atomic forces.
1247   \section{\label{oopseSec:mechanics}Mechanics}
1248  
1249   \subsection{\label{oopseSec:integrate}Integrating the Equations of Motion: the
1250 < DLM method}
1250 > {\sc dlm} method}
1251  
1252   The default method for integrating the equations of motion in {\sc
1253   oopse} is a velocity-Verlet version of the symplectic splitting method
1254   proposed by Dullweber, Leimkuhler and McLachlan
1255 < (DLM).\cite{Dullweber1997} When there are no directional atoms or
1255 > ({\sc dlm}).\cite{Dullweber1997} When there are no directional atoms or
1256   rigid bodies present in the simulation, this integrator becomes the
1257   standard velocity-Verlet integrator which is known to sample the
1258   microcanonical (NVE) ensemble.\cite{Frenkel1996}
1259  
1260   Previous integration methods for orientational motion have problems
1261 < that are avoided in the DLM method.  Direct propagation of the Euler
1261 > that are avoided in the {\sc dlm} method.  Direct propagation of the Euler
1262   angles has a known $1/\sin\theta$ divergence in the equations of
1263   motion for $\phi$ and $\psi$,\cite{allen87:csl} leading to numerical
1264   instabilities any time one of the directional atoms or rigid bodies
# Line 1342 | Line 1366 | of the particle in the space-fixed frame.
1366   Here $\hat{\bf u}$ is a unit vector pointing along the principal axis
1367   of the particle in the space-fixed frame.
1368  
1369 < The DLM method uses a Trotter factorization of the orientational
1369 > The {\sc dlm} method uses a Trotter factorization of the orientational
1370   propagator.  This has three effects:
1371   \begin{enumerate}
1372   \item the integrator is area-preserving in phase space (i.e. it is
# Line 1436 | Line 1460 | advanced to the same time value.
1460          + \frac{h}{2} {\bf \tau}^b(t + h) .
1461   \end{align*}
1462  
1463 < The matrix rotations used in the DLM method end up being more costly
1464 < computationally than the simpler arithmetic quaternion
1465 < propagation. With the same time step, a 1000-molecule water simulation
1466 < shows an average 7\% increase in computation time using the DLM method
1467 < in place of quaternions. This cost is more than justified when
1468 < comparing the energy conservation of the two methods as illustrated in
1469 < Fig.~\ref{timestep}.
1463 > The matrix rotations used in the {\sc dlm} method end up being more
1464 > costly computationally than the simpler arithmetic quaternion
1465 > propagation. With the same time step, a 1024-molecule water simulation
1466 > incurs an average 12\% increase in computation time using the {\sc
1467 > dlm} method in place of quaternions. This cost is more than justified
1468 > when comparing the energy conservation achieved by the two
1469 > methods. Figure ~\ref{quatdlm} provides a comparative analysis of the
1470 > {\sc dlm} method versus the traditional quaternion scheme.
1471  
1472   \begin{figure}
1473   \centering
1474 < \includegraphics[width=\linewidth]{timeStep.eps}
1475 < \caption[Energy conservation for quaternion versus DLM dynamics]{Energy conservation using quaternion based integration versus
1476 < the method proposed by Dullweber \emph{et al.} with increasing time
1477 < step. For each time step, the dotted line is total energy using the
1478 < DLM integrator, and the solid line comes from the quaternion
1479 < integrator. The larger time step plots are shifted up from the true
1480 < energy baseline for clarity.}
1481 < \label{timestep}
1474 > \includegraphics[width=\linewidth]{quatvsdlm.eps}
1475 > \caption[Energy conservation analysis of the {\sc dlm} and quaternion
1476 > integration methods]{Analysis of the energy conservation of the {\sc
1477 > dlm} and quaternion integration methods.  $\delta \mathrm{E}_1$ is the
1478 > linear drift in energy over time and $\delta \mathrm{E}_0$ is the
1479 > standard deviation of energy fluctuations around this drift.  All
1480 > simulations were of a 1024-molecule simulation of SSD water at 298 K
1481 > starting from the same initial configuration. Note that the {\sc dlm}
1482 > method provides more than an order of magnitude improvement in both
1483 > the energy drift and the size of the energy fluctuations when compared
1484 > with the quaternion method at any given time step.  At time steps
1485 > larger than 4 fs, the quaternion scheme resulted in rapidly rising
1486 > energies which eventually lead to simulation failure.  Using the {\sc
1487 > dlm} method, time steps up to 8 fs can be taken before this behavior
1488 > is evident.}
1489 > \label{quatdlm}
1490   \end{figure}
1491  
1492 < In Fig.~\ref{timestep}, the resulting energy drift at various time
1493 < steps for both the DLM and quaternion integration schemes is
1494 < compared. All of the 1000 molecule water simulations started with the
1495 < same configuration, and the only difference was the method for
1496 < handling rotational motion. At time steps of 0.1 and 0.5 fs, both
1497 < methods for propagating molecule rotation conserve energy fairly well,
1498 < with the quaternion method showing a slight energy drift over time in
1499 < the 0.5 fs time step simulation. At time steps of 1 and 2 fs, the
1500 < energy conservation benefits of the DLM method are clearly
1501 < demonstrated. Thus, while maintaining the same degree of energy
1502 < conservation, one can take considerably longer time steps, leading to
1503 < an overall reduction in computation time.
1492 > In Fig.~\ref{quatdlm}, $\delta \mbox{E}_1$ is a measure of the linear
1493 > energy drift in units of $\mbox{kcal mol}^{-1}$ per particle over a
1494 > nanosecond of simulation time, and $\delta \mbox{E}_0$ is the standard
1495 > deviation of the energy fluctuations in units of $\mbox{kcal
1496 > mol}^{-1}$ per particle. In the top plot, it is apparent that the
1497 > energy drift is reduced by a significant amount (2 to 3 orders of
1498 > magnitude improvement at all tested time steps) by chosing the {\sc
1499 > dlm} method over the simple non-symplectic quaternion integration
1500 > method.  In addition to this improvement in energy drift, the
1501 > fluctuations in the total energy are also dampened by 1 to 2 orders of
1502 > magnitude by utilizing the {\sc dlm} method.
1503 >
1504 > Although the {\sc dlm} method is more computationally expensive than
1505 > the traditional quaternion scheme for propagating a single time step,
1506 > consideration of the computational cost for a long simulation with a
1507 > particular level of energy conservation is in order.  A plot of energy
1508 > drift versus computational cost was generated
1509 > (Fig.~\ref{cpuCost}). This figure provides an estimate of the CPU time
1510 > required under the two integration schemes for 1 nanosecond of
1511 > simulation time for the model 1024-molecule system.  By chosing a
1512 > desired energy drift value it is possible to determine the CPU time
1513 > required for both methods. If a $\delta \mbox{E}_1$ of $1 \times
1514 > 10^{-3} \mbox{kcal mol}^{-1}$ per particle is desired, a nanosecond of
1515 > simulation time will require ~19 hours of CPU time with the {\sc dlm}
1516 > integrator, while the quaternion scheme will require ~154 hours of CPU
1517 > time. This demonstrates the computational advantage of the integration
1518 > scheme utilized in {\sc oopse}.
1519  
1520 + \begin{figure}
1521 + \centering
1522 + \includegraphics[width=\linewidth]{compCost.eps}
1523 + \caption[Energy drift as a function of required simulation run
1524 + time]{Energy drift as a function of required simulation run time.
1525 + $\delta \mathrm{E}_1$ is the linear drift in energy over time.
1526 + Simulations were performed on a single 2.5 GHz Pentium 4
1527 + processor. Simulation time comparisons can be made by tracing
1528 + horizontally from one curve to the other. For example, a simulation
1529 + that takes ~24 hours using the {\sc dlm} method will take roughly 210
1530 + hours using the simple quaternion method if the same degree of energy
1531 + conservation is desired.}
1532 + \label{cpuCost}
1533 + \end{figure}
1534 +
1535   There is only one specific keyword relevant to the default integrator,
1536   and that is the time step for integrating the equations of motion.
1537  
# Line 1575 | Line 1638 | Here, $f$ is the total number of degrees of freedom in
1638   \end{equation}
1639   Here, $f$ is the total number of degrees of freedom in the system,
1640   \begin{equation}
1641 < f = 3 N + 3 N_{\mathrm{orient}} - N_{\mathrm{constraints}},
1641 > f = 3 N + 2 N_{\mathrm{linear}} + 3 N_{\mathrm{non-linear}} - N_{\mathrm{constraints}},
1642   \end{equation}
1643   and $K$ is the total kinetic energy,
1644   \begin{equation}
1645   K = \sum_{i=1}^{N} \frac{1}{2} m_i {\bf v}_i^T \cdot {\bf v}_i +
1646 < \sum_{i=1}^{N_{\mathrm{orient}}}  \frac{1}{2} {\bf j}_i^T \cdot
1646 > \sum_{i=1}^{N_{\mathrm{linear}}+N_{\mathrm{non-linear}}}  \frac{1}{2} {\bf j}_i^T \cdot
1647   \overleftrightarrow{\mathsf{I}}_i^{-1} \cdot {\bf j}_i.
1648   \end{equation}
1649 + $N_{\mathrm{linear}}$ is the number of linear rotors (i.e. with two
1650 + non-zero moments of inertia), and $N_{\mathrm{non-linear}}$ is the
1651 + number of non-linear rotors (i.e. with three non-zero moments of
1652 + inertia).  
1653  
1654   In eq.(\ref{eq:nosehooverext}), $\tau_T$ is the time constant for
1655   relaxation of the temperature to the target value.  To set values for
# Line 1620 | Line 1687 | factorization of the three rotation operations that wa
1687   Here $\mathrm{rotate}(h * {\bf j}
1688   \overleftrightarrow{\mathsf{I}}^{-1})$ is the same symplectic Trotter
1689   factorization of the three rotation operations that was discussed in
1690 < the section on the DLM integrator.  Note that this operation modifies
1690 > the section on the {\sc dlm} integrator.  Note that this operation modifies
1691   both the rotation matrix $\mathsf{A}$ and the angular momentum ${\bf
1692   j}$.  {\tt moveA} propagates velocities by a half time step, and
1693   positional degrees of freedom by a full time step.  The new positions
1694   (and orientations) are then used to calculate a new set of forces and
1695   torques in exactly the same way they are calculated in the {\tt
1696 < doForces} portion of the DLM integrator.
1696 > doForces} portion of the {\sc dlm} integrator.
1697  
1698   Once the forces and torques have been obtained at the new time step,
1699   the temperature, velocities, and the extended system variable can be
# Line 1731 | Line 1798 | where ${\bf r}_{ab}$ is the distance between the cente
1798   where ${\bf r}_{ab}$ is the distance between the centers of mass, and
1799   \begin{equation}
1800   {\bf f}_{ab} = s(r_{ab}) \sum_{i \in a} \sum_{j \in b} {\bf f}_{ij} +
1801 < s\prime(r_{ab}) \frac{{\bf r}_{ab}}{|r_{ab}|} \sum_{i \in a} \sum_{j
1801 > s^{\prime}(r_{ab}) \frac{{\bf r}_{ab}}{|r_{ab}|} \sum_{i \in a} \sum_{j
1802   \in b} V_{ij}({\bf r}_{ij}).
1803   \end{equation}
1804  
1805   The instantaneous pressure is then simply obtained from the trace of
1806   the pressure tensor,
1807   \begin{equation}
1808 < P(t) = \frac{1}{3} \mathrm{Tr} \left( \overleftrightarrow{\mathsf{P}}(t).
1809 < \right)
1808 > P(t) = \frac{1}{3} \mathrm{Tr} \left( \overleftrightarrow{\mathsf{P}}(t)
1809 > \right).
1810   \end{equation}
1811  
1812   In eq.(\ref{eq:melchionna1}), $\tau_B$ is the time constant for
# Line 1749 | Line 1816 | integration of the equations of motion is carried out
1816   file.  The units for {\tt tauBarostat} are fs, and the units for the
1817   {\tt targetPressure} are atmospheres.  Like in the NVT integrator, the
1818   integration of the equations of motion is carried out in a
1819 < velocity-Verlet style 2 part algorithm with only the following differences:
1819 > velocity-Verlet style two part algorithm with only the following
1820 > differences:
1821  
1822   {\tt moveA:}
1823   \begin{align*}
# Line 1782 | Line 1850 | the box by
1850   the box by
1851   \begin{equation}
1852   \mathcal{V}(t + h) \leftarrow e^{ - 3 h \eta(t + h /2)} \times
1853 < \mathcal{V}(t)
1853 > \mathcal{V}(t).
1854   \end{equation}
1855  
1856   The {\tt doForces} step for the NPTi integrator is exactly the same as
1857 < in both the DLM and NVT integrators.  Once the forces and torques have
1857 > in both the {\sc dlm} and NVT integrators.  Once the forces and torques have
1858   been obtained at the new time step, the velocities can be advanced to
1859   the same time value.
1860  
# Line 2011 | Line 2079 | the force calculation at each time step.
2079   subtract the total constraint forces from the rest of the system after
2080   the force calculation at each time step.
2081  
2082 < After the force calculation, the total force on molecule $\alpha$,
2082 > After the force calculation, the total force on molecule $\alpha$ is:
2083   \begin{equation}
2084   G_{\alpha} = \sum_i F_{\alpha i},
2085   \label{oopseEq:zc1}
# Line 2032 | Line 2100 | v_{\alpha i} = v_{\alpha i} -
2100   v_{\alpha i} = v_{\alpha i} -
2101          \frac{\sum_i m_{\alpha i} v_{\alpha i}}{\sum_i m_{\alpha i}},
2102   \end{equation}
2103 < where $v_{\alpha i}$ is the velocity of atom $i$ in the z direction.
2103 > where $v_{\alpha i}$ is the velocity of atom $i$ in the $z$ direction.
2104   Lastly, all of the accumulated constraint forces must be subtracted
2105   from the rest of the unconstrained system to keep the system center of
2106   mass of the entire system from drifting.
# Line 2070 | Line 2138 | The user may also specify the use of a constant veloci
2138  
2139   The user may also specify the use of a constant velocity method
2140   (steered molecular dynamics) to move the molecules to their desired
2141 < initial positions.
2141 > initial positions. Based on concepts from atomic force microscopy,
2142 > {\sc smd} has been used to study many processes which occur via rare
2143 > events on the time scale of a few hundreds of picoseconds.  For
2144 > example,{\sc smd} has been used to observe the dissociation of
2145 > Streptavidin-biotin Complex.\cite{smd}  
2146  
2147   To use of the $z$-constraint method in an {\sc oopse} simulation, the
2148   molecules must be specified using the {\tt nZconstraints} keyword in
# Line 2078 | Line 2150 | of the $z$-constraint method are listed in table~\ref{
2150   of the $z$-constraint method are listed in table~\ref{table:zconParams}.
2151  
2152   \begin{table}
2153 < \caption{The Global Keywords: Z-Constraint Parameters}
2153 > \caption{Meta-data Keywords: Z-Constraint Parameters}
2154   \label{table:zconParams}
2155   \begin{center}
2156   % Note when adding or removing columns, the \hsize numbers must add up to the total number
# Line 2091 | Line 2163 | of the $z$-constraint method are listed in table~\ref{
2163  
2164   {\bf keyword} & {\bf units} & {\bf use} & {\bf remarks} \\ \hline
2165  
2166 < {\tt nZconstraints} & integer &  The number of zconstraint molecules& If using zconstraint method, {\tt nZconstraints} must be set \\
2167 < {\tt zconsTime} & fs & Sets the frequency at which the {\tt .fz} file is written &  \\
2168 < {\tt zconsForcePolicy} & string & The strategy of subtracting
2169 < zconstraint force from the unconstrained molecules & Possible
2170 < strategies are {\tt BYMASS} and {\tt BYNUMBER}. Default
2171 < strategy is set to {\tt BYMASS}\\
2172 < {\tt zconsGap} & $\mbox{\AA}$ & Set the distance between two adjacent
2173 < constraint positions& Used mainly in moving molecules through a simulation \\
2174 < {\tt zconsFixtime} & fs & Sets how long the zconstraint molecule is
2175 < fixed & {\tt zconsFixtime} must be set if {\tt zconsGap} is set\\
2176 < {\tt zconsUsingSMD} &logical & Flag for using Steered Molecular
2177 < Dynamics or Harmonic Forces to move the molecule  & Harmonic Forces are
2178 < used by default\\
2166 > {\tt nZconstraints} & integer &  The number of $z$-constrained
2167 > molecules & If using the $z$-constraint method, {\tt nZconstraints}
2168 > must be set \\
2169 > {\tt zconsTime} & fs & Sets the frequency at which the {\tt .fz} file
2170 > is written &  \\
2171 > {\tt zconsForcePolicy} & string & The strategy for subtracting
2172 > the $z$-constraint force from the {\it unconstrained} molecules & Possible
2173 > strategies are {\tt BYMASS} and {\tt BYNUMBER}. The default
2174 > strategy is {\tt BYMASS}\\
2175 > {\tt zconsGap} & $\mbox{\AA}$ & Sets the distance between two adjacent
2176 > constraint positions&Used mainly to move molecules through a
2177 > simulation to estimate potentials of mean force. \\
2178 > {\tt zconsFixtime} & fs & Sets the length of time the $z$-constraint
2179 > molecule is held fixed & {\tt zconsFixtime} must be set if {\tt
2180 > zconsGap} is set\\
2181 > {\tt zconsUsingSMD} & logical & Flag for using Steered Molecular
2182 > Dynamics to move the molecules to the correct constrained positions  &
2183 > Harmonic Forces are used by default\\
2184  
2185   \end{tabularx}
2186   \end{center}
# Line 2120 | Line 2197 | gradient ({\sc cg}) to help users find reasonable loca
2197   stable fixed points on the surface.  We have included two simple
2198   minimization algorithms: steepest descent, ({\sc sd}) and conjugate
2199   gradient ({\sc cg}) to help users find reasonable local minima from
2200 < their initial configurations.
2200 > their initial configurations. Since {\sc oopse} handles atoms and
2201 > rigid bodies which have orientational coordinates as well as
2202 > translational coordinates, there is some subtlety to the choice of
2203 > parameters for minimization algorithms.
2204  
2125 Since {\sc oopse} handles atoms and rigid bodies which have
2126 orientational coordinates as well as translational coordinates, there
2127 is some subtlety to the choice of parameters for minimization
2128 algorithms.
2129
2205   Given a coordinate set $x_{k}$ and a search direction $d_{k}$, a line
2206   search algorithm is performed along $d_{k}$ to produce
2207 < $x_{k+1}=x_{k}+$ $\lambda _{k}d_{k}$.
2208 <
2134 < In the steepest descent ({\sc sd}) algorithm,%
2207 > $x_{k+1}=x_{k}+$ $\lambda _{k}d_{k}$. In the steepest descent ({\sc
2208 > sd}) algorithm,%
2209   \begin{equation}
2210 < d_{k}=-\nabla V(x_{k})
2210 > d_{k}=-\nabla V(x_{k}).
2211   \end{equation}
2212   The gradient and the direction of next step are always orthogonal.
2213   This may cause oscillatory behavior in narrow valleys.  To overcome
2214   this problem, the Fletcher-Reeves variant~\cite{FletcherReeves} of the
2215   conjugate gradient ({\sc cg}) algorithm is used to generate $d_{k+1}$
2216   via simple recursion:
2217 < \begin{align}
2218 < d_{k+1}  &  =-\nabla V(x_{k+1})+\gamma_{k}d_{k}\\
2219 < \gamma_{k}  &  =\frac{\nabla V(x_{k+1})^{T}\nabla V(x_{k+1})}{\nabla
2220 < V(x_{k})^{T}\nabla V(x_{k})}%
2221 < \end{align}
2217 > \begin{equation}
2218 > d_{k+1}  =-\nabla V(x_{k+1})+\gamma_{k}d_{k}
2219 > \end{equation}
2220 > where
2221 > \begin{equation}
2222 > \gamma_{k}  =\frac{\nabla V(x_{k+1})^{T}\nabla V(x_{k+1})}{\nabla
2223 > V(x_{k})^{T}\nabla V(x_{k})}.
2224 > \end{equation}
2225  
2226   The Polak-Ribiere variant~\cite{PolakRibiere} of the conjugate
2227   gradient ($\gamma_{k}$) is defined as%
# Line 2152 | Line 2229 | V(x_{k})^{T}\nabla V(x_{k})}%
2229   \gamma_{k}=\frac{[\nabla V(x_{k+1})-\nabla V(x)]^{T}\nabla V(x_{k+1})}{\nabla
2230   V(x_{k})^{T}\nabla V(x_{k})}%
2231   \end{equation}
2232 + It is widely agreed that the Polak-Ribiere variant gives better
2233 + convergence than the Fletcher-Reeves variant, so the conjugate
2234 + gradient approach implemented in {\sc oopse} is the Polak-Ribiere
2235 + variant.
2236  
2237   The conjugate gradient method assumes that the conformation is close
2238   enough to a local minimum that the potential energy surface is very
# Line 2165 | Line 2246 | minimizer are given in Table~\ref{table:minimizeParams
2246   minimizer are given in Table~\ref{table:minimizeParams}
2247  
2248   \begin{table}
2249 < \caption{The Global Keywords: Energy Minimizer Parameters}
2249 > \caption{Meta-data Keywords: Energy Minimizer Parameters}
2250   \label{table:minimizeParams}
2251   \begin{center}
2252   % Note when adding or removing columns, the \hsize numbers must add up to the total number
# Line 2181 | Line 2262 | descent) \\
2262   {\tt minimizer} & string &  selects the minimization method to be used
2263   & either {\tt CG} (conjugate gradient) or {\tt SD} (steepest
2264   descent) \\
2265 < {\tt minimizerMaxIter} & steps & Sets the maximum iteration number in the energy minimization & Default value is 200\\
2266 < {\tt minimizerWriteFreq} & steps & Sets the frequency at which the {\tt .dump} and {\tt .stat} files are writtern during energy minimization & \\
2267 < {\tt minimizerStepSize} & $\mbox{\AA}$ &  Set the step size of line search & Default value is 0.01\\
2268 < {\tt minimizerFTol} & $\mbox{kcal mol}^{-1}$  & Sets energy tolerance  & Default value is $10^{-8}$\\
2269 < {\tt minimizerGTol} & $\mbox{kcal mol}^{-1}\mbox{\AA}^{-1}$ & Sets gradient tolerance & Default value is $10^{-8}$\\
2270 < {\tt minimizerLSTol} &  $\mbox{kcal mol}^{-1}$ & Sets line search tolerance & Default value is $10^{-8}$\\
2271 < {\tt minimizerLSMaxIter} & steps &  Sets the maximum iteration of line searching & Default value is 50\\
2265 > {\tt minimizerMaxIter} & steps & Sets the maximum number of iterations
2266 > for the energy minimization & The default value is 200\\
2267 > {\tt minimizerWriteFreq} & steps & Sets the frequency with which the {\tt .dump} and {\tt .stat} files are writtern during energy minimization & \\
2268 > {\tt minimizerStepSize} & $\mbox{\AA}$ &  Sets the step size for the
2269 > line search & The default value is 0.01\\
2270 > {\tt minimizerFTol} & $\mbox{kcal mol}^{-1}$  & Sets the energy tolerance
2271 > for stopping the minimziation. & The default value is $10^{-8}$\\
2272 > {\tt minimizerGTol} & $\mbox{kcal mol}^{-1}\mbox{\AA}^{-1}$ & Sets the
2273 > gradient tolerance for stopping the minimization. & The default value
2274 > is  $10^{-8}$\\
2275 > {\tt minimizerLSTol} &  $\mbox{kcal mol}^{-1}$ & Sets line search
2276 > tolerance for terminating each step of the minimization. & The default
2277 > value is $10^{-8}$\\
2278 > {\tt minimizerLSMaxIter} & steps &  Sets the maximum number of
2279 > iterations for each line search & The default value is 50\\
2280  
2281   \end{tabularx}
2282   \end{center}
# Line 2196 | Line 2285 | Although processor power is continually improving, it
2285   \section{\label{oopseSec:parallelization} Parallel Simulation Implementation}
2286  
2287   Although processor power is continually improving, it is still
2288 < unreasonable to simulate systems of more then a 1000 atoms on a single
2288 > unreasonable to simulate systems of more than 10,000 atoms on a single
2289   processor. To facilitate study of larger system sizes or smaller
2290   systems for longer time scales, parallel methods were developed to
2291   allow multiple CPU's to share the simulation workload. Three general
# Line 2207 | Line 2296 | the duration of the simulation. Computational cost sca
2296   Algorithmically simplest of the three methods is atomic decomposition,
2297   where $N$ particles in a simulation are split among $P$ processors for
2298   the duration of the simulation. Computational cost scales as an
2299 < optimal $\mathcal{O}(N/P)$ for atomic decomposition. Unfortunately all
2299 > optimal $\mathcal{O}(N/P)$ for atomic decomposition. Unfortunately, all
2300   processors must communicate positions and forces with all other
2301   processors at every force evaluation, leading the communication costs
2302   to scale as an unfavorable $\mathcal{O}(N)$, \emph{independent of the
# Line 2229 | Line 2318 | The parallelization method used in {\sc oopse} is the
2318   three dimensions.
2319  
2320   The parallelization method used in {\sc oopse} is the force
2321 < decomposition method.  Force decomposition assigns particles to
2322 < processors based on a block decomposition of the force
2321 > decomposition method.\cite{hendrickson:95} Force decomposition assigns
2322 > particles to processors based on a block decomposition of the force
2323   matrix. Processors are split into an optimally square grid forming row
2324   and column processor groups. Forces are calculated on particles in a
2325   given row by particles located in that processor's column
2326 < assignment. Force decomposition is less complex to implement than the
2327 < spatial method but still scales computationally as $\mathcal{O}(N/P)$
2328 < and scales as $\mathcal{O}(N/\sqrt{P})$ in communication
2329 < cost. Plimpton has also found that force decompositions scale more
2330 < favorably than spatial decompositions for systems up to 10,000 atoms
2331 < and favorably compete with spatial methods up to 100,000
2332 < atoms.\cite{plimpton95}
2333 <
2326 > assignment. One deviation from the algorithm described by Hendrickson
2327 > {\it et al.} is the use of column ordering based on the row indexes
2328 > preventing the need for a transpose operation necessitating a second
2329 > communication step when gathering the final force components.  Force
2330 > decomposition is less complex to implement than the spatial method but
2331 > still scales computationally as $\mathcal{O}(N/P)$ and scales as
2332 > $\mathcal{O}(N/\sqrt{P})$ in communication cost. Plimpton has also
2333 > found that force decompositions scale more favorably than spatial
2334 > decompositions for systems up to 10,000 atoms and favorably compete
2335 > with spatial methods up to 100,000 atoms.\cite{plimpton95}
2336 >
2337   \section{\label{oopseSec:conclusion}Conclusion}
2338  
2339 < We have presented a new open source parallel simulation program {\sc
2339 > We have presented a new parallel simulation program called {\sc
2340   oopse}. This program offers some novel capabilities, but mostly makes
2341   available a library of modern object-oriented code for the scientific
2342   community to use freely.  Notably, {\sc oopse} can handle symplectic
# Line 2273 | Line 2365 | DMR-0079647.
2365   the Notre Dame Bunch-of-Boxes (B.o.B) computer cluster under NSF grant
2366   DMR-0079647.
2367  
2368 < \bibliographystyle{achemso}
2368 > \bibliographystyle{jcc}
2369   \bibliography{oopsePaper}
2370  
2371   \end{document}

Diff Legend

Removed lines
+ Added lines
< Changed lines
> Changed lines