ViewVC Help
View File | Revision Log | Show Annotations | View Changeset | Root Listing
root/group/trunk/oopsePaper/oopsePaper.tex
(Generate patch)

Comparing trunk/oopsePaper/oopsePaper.tex (file contents):
Revision 1434 by chrisfen, Thu Jul 29 18:01:05 2004 UTC vs.
Revision 1439 by gezelter, Thu Jul 29 20:06:07 2004 UTC

# Line 1 | Line 1
1   \documentclass[11pt]{article}
2   \usepackage{amsmath}
3 < \usepackage{amssymb}
3 > \usepackage{amssymb}
4   \usepackage{endfloat}
5   \usepackage{listings}
6   \usepackage{berkeley}
# Line 22 | Line 22
22          xleftmargin=0.5in, xrightmargin=0.5in,captionpos=b, %
23          abovecaptionskip=0.5cm, belowcaptionskip=0.5cm}
24   \renewcommand{\lstlistingname}{Scheme}
25 < \title{{\sc oopse}: An Open Source Object-Oriented Parallel Simulation
25 > \title{{\sc oopse}: An Object-Oriented Parallel Simulation
26   Engine for Molecular Dynamics}
27  
28   \author{Matthew A. Meineke, Charles F. Vardeman II, Teng Lin,\\
# Line 43 | Line 43 | meta-data language.  A number of advanced integrators
43   simulations are carried out using the force-based decomposition
44   method.  Simulations are specified using a very simple C-based
45   meta-data language.  A number of advanced integrators are included,
46 < and the base integrator for orientational dynamics provides
47 < substantial improvements over older quaternion-base schemes.  All
48 < source code is available under a very permissive (BSD-style) Open
49 < Source license.
46 > and the basic integrator for orientational dynamics provides
47 > substantial improvements over older quaternion-based schemes.
48   \end{abstract}
49  
50   \section{\label{sec:intro}Introduction}
51  
52   There are a number of excellent molecular dynamics packages available
53   to the chemical physics
54 < community.\cite{Brooks83,MacKerell98,pearlman:1995,Gromacs,Gromacs3,DL_POLY,Tinker,Paradyn}
54 > community.\cite{Brooks83,MacKerell98,pearlman:1995,Gromacs,Gromacs3,DL_POLY,Tinker,Paradyn,namd,macromodel}
55   All of these packages are stable, polished programs which solve many
56   problems of interest.  Most are now capable of performing molecular
57   dynamics simulations on parallel computers.  Some have source code
# Line 64 | Line 62 | has taken us now involves the use of atoms with orient
62   programs referenced can handle transition metal force fields like the
63   Embedded Atom Method ({\sc eam}).  The direction our research program
64   has taken us now involves the use of atoms with orientational degrees
65 < of freedom and transition metals.  Since these simulation methods may
66 < be of some use to other researchers, we have decided to release our
67 < program to the scientific community with a permissive open source
68 < license.
65 > of freedom as well as transition metals.  Since these simulation
66 > methods may be of some use to other researchers, we have decided to
67 > release our program (and all related source code) to the scientific
68 > community.
69  
70   This paper communicates the algorithmic details of our program, which
71 < we have been calling the Open source Object-oriented Parallel
72 < Simulation Engine (i.e. {\sc oopse}).  We have structured this paper
73 < to first discuss the underlying concepts in this simulation package
71 > we have been calling the Object-Oriented Parallel Simulation Engine
72 > (i.e. {\sc oopse}).  We have structured this paper to first discuss
73 > the underlying concepts in this simulation package
74   (Sec. \ref{oopseSec:IOfiles}).  The empirical energy functions
75   implemented are discussed in Sec.~\ref{oopseSec:empiricalEnergy}.
76   Sec.~\ref{oopseSec:mechanics} describes the various Molecular Dynamics
# Line 283 | Line 281 | scheme is often used. The elements of $\mathsf{A}$ can
281   $\psi$.\cite{Goldstein01} In order to avoid numerical instabilities
282   inherent in using the Euler angles, the four parameter ``quaternion''
283   scheme is often used. The elements of $\mathsf{A}$ can be expressed as
284 < arithmetic operations involving the four quaternions ($q_0, q_1, q_2,$
285 < and $q_3$).\cite{allen87:csl} Use of quaternions also leads to
284 > arithmetic operations involving the four quaternions ($q_w, q_x, q_y,$
285 > and $q_z$).\cite{allen87:csl} Use of quaternions also leads to
286   performance enhancements, particularly for very small
287   systems.\cite{Evans77}
288  
# Line 391 | Line 389 | mol}^{-1}$, distances are in $\mbox{\AA}$, times are i
389   It is important to note the fundamental units in all files which are
390   read and written by {\sc oopse}.  Energies are in $\mbox{kcal
391   mol}^{-1}$, distances are in $\mbox{\AA}$, times are in $\mbox{fs}$,
392 < translational velocities are in $\mbox{\AA fs}^{-1}$, and masses are
392 > translational velocities are in $\mbox{\AA~fs}^{-1}$, and masses are
393   in $\mbox{amu}$.  Orientational degrees of freedom are described using
394   quaternions (unitless, but $q_w^2 + q_x^2 + q_y^2 + q_z^2 = 1$),
395   body-fixed angular momenta ($\mbox{amu \AA}^{2} \mbox{radians
# Line 411 | Line 409 | fs}^{-1}$), and body-fixed moments of inertia ($\mbox{
409  
410   {\bf keyword} & {\bf units} & {\bf use} & {\bf remarks} \\ \hline
411  
412 < {\tt forceField} & string & Sets the force field. & Possible force fields are "DUFF", "LJ", and "EAM". \\
412 > {\tt forceField} & string & Sets the force field. & Possible force fields are DUFF, LJ, and EAM. \\
413   {\tt nComponents} & integer & Sets the number of components. & Needs to appear before the first {\tt Component} block. \\
414   {\tt initialConfig} & string & Sets the file containing the initial configuration. & Can point to any file containing the configuration in the correct order. \\
415   {\tt minimizer}& string & Chooses a minimizer & Possible minimizers
416 < are "SD" and "CG". Either {\tt ensemble} or {\tt minimizer} must be specified. \\
416 > are SD and CG. Either {\tt ensemble} or {\tt minimizer} must be specified. \\
417   {\tt ensemble} & string & Sets the ensemble. & Possible ensembles are
418 < "NVE", "NVT", "NPTi", "NPTf", and "NPTxyz".  Either {\tt ensemble}
418 > NVE, NVT, NPTi, NPTf, and NPTxyz.  Either {\tt ensemble}
419   or {\tt minimizer} must be specified. \\
420   {\tt dt} & fs & Sets the time step. & Selection of {\tt dt} should be
421 < small enough to sample the fastest motion of the simulation. (required
422 < for molecular dynamics simulations)\\
421 > small enough to sample the fastest motion of the simulation. ({\tt
422 > dt} is required for molecular dynamics simulations)\\
423   {\tt runTime} & fs & Sets the time at which the simulation should
424   end. & This is an absolute time, and will end the simulation when the
425 < current time meets or exceeds the {\tt runTime}. (required for
426 < molecular dynamics simulations)\\
425 > current time meets or exceeds the {\tt runTime}. ({\tt runTime} is
426 > required for molecular dynamics simulations)\\
427  
428   \end{tabularx}
429   \end{center}
# Line 450 | Line 448 | extension. \\
448   the root name of the initial meta-data file but with an {\tt .eor}
449   extension. \\
450   {\tt useInitialTime} & logical & Sets whether the initial time is taken from the {\tt .in} file. & Useful when recovering a simulation from a crashed processor. Default is false. \\
451 < {\tt sampleTime} & fs & Sets the frequency at which the {\tt .dump} file is written. & Default sets the frequency to the {\tt runTime}. \\
452 < {\tt statusTime} & fs & Sets the frequency at which the {\tt .stat} file is written. & Defaults set the frequency to the {\tt sampleTime}. \\
453 < {\tt cutoffRadius} & $\mbox{\AA}$ & Manually sets the cutoffRadius & Defaults to
454 < $15\mbox{\AA}$ for systems containing charges or dipoles or to $2.5
455 < \sigma_{L}$, where $\sigma_{L}$ is the largest LJ $\sigma$ in the
451 > {\tt sampleTime} & fs & Sets the frequency at which the {\tt .dump}
452 > file is written. & The default is equal to the {\tt runTime}. \\
453 > {\tt statusTime} & fs & Sets the frequency at which the {\tt .stat}
454 > file is written. & The default is equal to the {\tt sampleTime}. \\
455 > {\tt cutoffRadius} & $\mbox{\AA}$ & Manually sets the cutoff radius &
456 > Defaults to $15\mbox{\AA}$ for systems containing charges or dipoles or to $2.5
457 > \sigma_{L}$, where $\sigma_{L}$ is the largest LJ $\sigma$ in the
458   simulation. \\
459   {\tt switchingRadius} & $\mbox{\AA}$  & Manually sets the inner radius for the switching function. & Defaults to 95~\% of the {\tt cutoffRadius}. \\
460 < {\tt useReactionField} & logical & Turns the reaction field correction on/off. & Default is "false". \\
460 > {\tt useReactionField} & logical & Turns the reaction field correction on/off. & Default is false. \\
461   {\tt dielectric} & unitless & Sets the dielectric constant for reaction field. & If {\tt useReactionField} is true, then {\tt dielectric} must be set. \\
462   {\tt usePeriodicBoundaryConditions} & & & \\
463 <        & logical & Turns periodic boundary conditions on/off. & Default is "true". \\
463 >        & logical & Turns periodic boundary conditions on/off. & Default is true. \\
464   {\tt seed } & integer & Sets the seed value for the random number
465   generator. & The seed needs to be at least 9 digits long. The default
466   is to take the seed from the CPU clock. \\
467   {\tt forceFieldVariant} & string & Sets the name of the variant of the
468 < force field.  ({\sc eam} has three variants: {\tt u3}, {\tt u6}, and
468 > force field.  & {\sc eam} has three variants: {\tt u3}, {\tt u6}, and
469   {\tt VC}.
470  
471   \end{tabularx}
# Line 565 | Line 565 | declared in the molecule's declaration within the mode
565   \item The ordering of the atoms for each molecule follows the order
566   declared in the molecule's declaration within the model file.
567   \item Only atoms which are not members of a {\tt rigidBody} are
568 < included
568 > included.
569   \item Rigid Body coordinates for a molecule are listed immediately
570   after the the other atoms in a molecule.  Some molecules may be
571   entirely rigid, in which case, only the rigid body coordinates are
# Line 667 | Line 667 | V = V_{\mathrm{short-range}} + V_{\mathrm{long-range}}
667   \begin{equation}
668   V = V_{\mathrm{short-range}} + V_{\mathrm{long-range}}.
669   \end{equation}
670 < The short-ranged portion includes explicit bonds, bends and torsions,
671 < which have been defined in the meta-data file for the molecules which
672 < present in the simulation.  The functional forms and parameters for
673 < these interactions are defined by the force field which is chosen.
670 > The short-ranged portion includes the explicit bonds, bends, and
671 > torsions which have been defined in the meta-data file for the
672 > molecules which are present in the simulation.  The functional forms and
673 > parameters for these interactions are defined by the force field which
674 > is chosen.
675  
676 < Calculating long-range (non-bonded) potential involves a sum over all
677 < pairs of atoms (except for those atoms which are involved in a bond,
678 < bend, or torsion with each other).  If done poorly, calculating the
679 < the long-range interactions for $N$ atoms would involve $N^2$
680 < evaluations of atomic distance.  To reduce the number of distance
676 > Calculating the long-range (non-bonded) potential involves a sum over
677 > all pairs of atoms (except for those atoms which are involved in a
678 > bond, bend, or torsion with each other).  If done poorly, calculating
679 > the the long-range interactions for $N$ atoms would involve $N(N-1)/2$
680 > evaluations of atomic distances.  To reduce the number of distance
681   evaluations between pairs of atoms, {\sc oopse} uses a switched cutoff
682   with Verlet neighbor lists.\cite{allen87:csl} It is well known that
683   neutral groups which contain charges will exhibit pathological forces
# Line 691 | Line 692 | cutoff groups ($a$ and $b$).
692   where $r_{ab}$ is the distance between the centers of mass of the two
693   cutoff groups ($a$ and $b$).
694  
695 < The sums over $a$ and $b$ are over the cutoffGroups that are present
695 > The sums over $a$ and $b$ are over the cutoff groups that are present
696   in the simulation.  Atoms which are not explicitly defined as members
697   of a {\tt cutoffGroup} are treated as a group consisting of only one
698   atom.  The switching function, $s(r)$ is the standard cubic switching
# Line 717 | Line 718 | present in the simulation.  In simulations containing
718   atoms, the cutoff radius has a default value of $2.5\sigma_{ii}$,
719   where $\sigma_{ii}$ is the largest Lennard-Jones length parameter
720   present in the simulation.  In simulations containing charged or
721 < dipolar atoms, the default cutoff Radius is $15 \mbox{\AA}$.  
721 > dipolar atoms, the default cutoff radius is $15 \mbox{\AA}$.  
722  
723   The {\tt switchingRadius} is set to a default value of 95\% of the
724   {\tt cutoffRadius}.  In the special case of a simulation containing
# Line 726 | Line 727 | Both radii may be specified in the meta-data file.
727   potential to remove discontinuities in the potential at the cutoff.
728   Both radii may be specified in the meta-data file.
729  
730 < Force fields can easily be added to {\sc oopse}, although it comes
731 < with a few simple examples (Lennard-Jones, {\sc duff}, {\sc water},
732 < and {\sc eam}) which are explained in the following sections.
730 > Force fields can be added to {\sc oopse}, although it comes with a few
731 > simple examples (Lennard-Jones, {\sc duff}, {\sc water}, and {\sc
732 > eam}) which are explained in the following sections.
733  
734   \subsection{\label{sec:LJPot}The Lennard Jones Force Field}
735  
# Line 1080 | Line 1081 | In addition to the {\sc duff} force field's solvent de
1081   \subsection{\label{oopseSec:WATER}The {\sc water} Force Field}
1082  
1083   In addition to the {\sc duff} force field's solvent description, a
1084 < separate {\sc water} force field has been included for simulating many
1085 < of the common rigid-body water models. In addition to the simple or
1086 < dipolar models (SSD, SSD1, SSD/E, SSD/RF, and DPD water), the common
1087 < charge-based models were included (SPC, SPC/E, TIP3P, TIP4P, and
1084 > separate {\sc water} force field has been included for simulating most
1085 > of the common rigid-body water models. This force field includes the
1086 > simple and point-dipolar models (SSD, SSD1, SSD/E, SSD/RF, and DPD
1087 > water), as well as the common charge-based models (SPC, SPC/E, TIP3P,
1088 > TIP4P, and
1089   TIP5P).\cite{liu96:new_model,Ichiye03,fennell04,Marrink01,Berendsen81,Berendsen87,Jorgensen83,Mahoney00}
1090   In order to handle these models, charge-charge interactions were
1091   included in the force-loop:
# Line 1091 | Line 1093 | where $q$ represents the charge on particle $i$ or $j$
1093   V_{\text{charge}}(r_{ij}) = \sum_{ij}\frac{q_iq_je^2}{r_{ij}},
1094   \end{equation}
1095   where $q$ represents the charge on particle $i$ or $j$, and $e$ is the
1096 < charge of an electron in Coulombs. The charge-charge interaction
1097 < support is rudimentary in the current version of {\sc oopse}. As with
1098 < the other pair interactions, charges can be simulated with a pure
1099 < cutoff or a reaction field. The various methods for performing the
1100 < Ewald summation have not yet been included. Also, the charge-dipole
1101 < and charge-quadrupole (for interactions between SSD type water and
1102 < charges) are not yet available, so it is currently inadvisable to mix
1101 < dipolar and charge based molecules in the same system.
1096 > charge of an electron in Coulombs.  As with the other pair
1097 > interactions, charges can be simulated with a pure cutoff or a
1098 > reaction field.  The {\sc water} force field can be easily expanded
1099 > through modification of the {\sc water} force field file ({\tt
1100 > WATER.frc}). By adding atom types and inserting the appropriate
1101 > parameters, it is possible to extend the force field to handle rigid
1102 > molecules other than water.
1103  
1103 The {\sc water} force field can be easily expanded through
1104 modification of the {\sc water} force field file ({\tt WATER.frc}). By
1105 adding atom types and inserting the appropriate parameters, it is
1106 possible to extend the force field to handle rigid molecules other
1107 than water.
1108
1104   \subsection{\label{oopseSec:eam}Embedded Atom Method}
1105  
1106   {\sc oopse} implements a potential that describes bonding in
# Line 1252 | Line 1247 | the inter-atomic forces.
1247   \section{\label{oopseSec:mechanics}Mechanics}
1248  
1249   \subsection{\label{oopseSec:integrate}Integrating the Equations of Motion: the
1250 < DLM method}
1250 > {\sc dlm} method}
1251  
1252   The default method for integrating the equations of motion in {\sc
1253   oopse} is a velocity-Verlet version of the symplectic splitting method
1254   proposed by Dullweber, Leimkuhler and McLachlan
1255 < (DLM).\cite{Dullweber1997} When there are no directional atoms or
1255 > ({\sc dlm}).\cite{Dullweber1997} When there are no directional atoms or
1256   rigid bodies present in the simulation, this integrator becomes the
1257   standard velocity-Verlet integrator which is known to sample the
1258   microcanonical (NVE) ensemble.\cite{Frenkel1996}
1259  
1260   Previous integration methods for orientational motion have problems
1261 < that are avoided in the DLM method.  Direct propagation of the Euler
1261 > that are avoided in the {\sc dlm} method.  Direct propagation of the Euler
1262   angles has a known $1/\sin\theta$ divergence in the equations of
1263   motion for $\phi$ and $\psi$,\cite{allen87:csl} leading to numerical
1264   instabilities any time one of the directional atoms or rigid bodies
# Line 1371 | Line 1366 | of the particle in the space-fixed frame.
1366   Here $\hat{\bf u}$ is a unit vector pointing along the principal axis
1367   of the particle in the space-fixed frame.
1368  
1369 < The DLM method uses a Trotter factorization of the orientational
1369 > The {\sc dlm} method uses a Trotter factorization of the orientational
1370   propagator.  This has three effects:
1371   \begin{enumerate}
1372   \item the integrator is area-preserving in phase space (i.e. it is
# Line 1465 | Line 1460 | advanced to the same time value.
1460          + \frac{h}{2} {\bf \tau}^b(t + h) .
1461   \end{align*}
1462  
1463 < The matrix rotations used in the DLM method end up being more costly
1464 < computationally than the simpler arithmetic quaternion
1463 > The matrix rotations used in the {\sc dlm} method end up being more
1464 > costly computationally than the simpler arithmetic quaternion
1465   propagation. With the same time step, a 1024-molecule water simulation
1466 < shows an 12\% increase in computation time (averaged over several
1467 < different time steps) using the DLM method in place of
1468 < quaternions. This cost is more than justified when comparing the
1469 < energy conservation of the two methods. Figure ~\ref{quatdlm} provides
1470 < a comparative analysis of the {\sc dlm} method versus the simple
1476 < quaternion method that was originally implemented.
1466 > incurs an average 12\% increase in computation time using the {\sc
1467 > dlm} method in place of quaternions. This cost is more than justified
1468 > when comparing the energy conservation achieved by the two
1469 > methods. Figure ~\ref{quatdlm} provides a comparative analysis of the
1470 > {\sc dlm} method versus the traditional quaternion scheme.
1471  
1472   \begin{figure}
1473   \centering
1474   \includegraphics[width=\linewidth]{quatvsdlm.eps}
1475   \caption[Energy conservation analysis of the {\sc dlm} and quaternion
1476 < integration methods]{The logarithm of absolute value of the slope of
1477 < the energy drift (\delta E$_1$) and the standard deviation of the
1478 < energy fluctuations (\delta E$_0$) as a function of chosen time
1479 < step. All simulations were of a 1024-molecule simulation of SSD water
1480 < at 298 K starting from the same initial configuration. Note that the
1481 < {\sc dlm} method provides a greater-than order-of-magnitude
1482 < improvement in energy conservation and relative energy fluctuations
1483 < over the quaternion method at all the tested time steps. The energy
1484 < drift is quite steep for the larger time steps in both methods, and
1485 < results in discontinuous behavior as the systems compound their
1486 < anomolous energy accumulation.}
1476 > integration methods]{Analysis of the energy conservation of the {\sc
1477 > dlm} and quaternion integration methods.  $\delta \mathrm{E}_1$ is the
1478 > linear drift in energy over time and $\delta \mathrm{E}_0$ is the
1479 > standard deviation of energy fluctuations around this drift.  All
1480 > simulations were of a 1024-molecule simulation of SSD water at 298 K
1481 > starting from the same initial configuration. Note that the {\sc dlm}
1482 > method provides more than an order of magnitude improvement in both
1483 > the energy drift and the size of the energy fluctuations when compared
1484 > with the quaternion method at any given time step.  At time steps
1485 > larger than 4 fs, the quaternion scheme resulted in rapidly rising
1486 > energies which eventually lead to simulation failure.  Using the {\sc
1487 > dlm} method, time steps up to 8 fs can be taken before this behavior
1488 > is evident.}
1489   \label{quatdlm}
1490   \end{figure}
1491  
1492 < In Fig.~\ref{quatdlm}, \delta E$_1$ is a measure of the linear energy
1493 < drift in units of kcal/mol per particle over a nanosecond of
1494 < simulation time, and \delta E$_0$ is the standard deviation of the
1495 < energy fluctuations in units of kcal/mol per particle. In the top
1496 < plot, it is apparent that the energy drift is reduced by a significant
1497 < amount (2 to 3 orders-of-magnitude improvement at every tested time
1498 < step) by chosing the {\sc dlm} method over the simple non-symplectic
1499 < quaternion integration method. When the energy drift becomes very
1500 < small ($log_{10}[|\delta\text{E}_1|] < -3$), it is more difficult to
1501 < calculate a slope, resulting in the larger displayed error bars. In
1502 < addition to this improvement in energy drift, the fluctuation is the
1507 < total energy are also dampened out by 1 to 2 orders-of-magnitude by
1508 < utilizing the {\sc dlm} integration method.
1492 > In Fig.~\ref{quatdlm}, $\delta \mbox{E}_1$ is a measure of the linear
1493 > energy drift in units of $\mbox{kcal mol}^{-1}$ per particle over a
1494 > nanosecond of simulation time, and $\delta \mbox{E}_0$ is the standard
1495 > deviation of the energy fluctuations in units of $\mbox{kcal
1496 > mol}^{-1}$ per particle. In the top plot, it is apparent that the
1497 > energy drift is reduced by a significant amount (2 to 3 orders of
1498 > magnitude improvement at all tested time steps) by chosing the {\sc
1499 > dlm} method over the simple non-symplectic quaternion integration
1500 > method.  In addition to this improvement in energy drift, the
1501 > fluctuations in the total energy are also dampened by 1 to 2 orders of
1502 > magnitude by utilizing the {\sc dlm} method.
1503  
1504 < It was stated previously that the {\sc dlm} method was the more
1505 < computationally expensive of the two implimented integration
1506 < methodologies. In order to incorporate this information into the
1507 < energy analysis a plot of energy drift versus computational cost was
1508 < generated (Fig.~\ref{cpuCost}). This figure provides an estimate of
1509 < the CPU time required under the two integration schemes for 1
1510 < nanosecond of simulation time for the model 1024-molecule system. The
1511 < plot is read by chosing a desired energy drift value and determining
1512 < where both the curves cross. If a \delta E$_1$ of 1E-3 kcal/mol per
1513 < particle is desired, a nanosecond of simulation time will require ~19
1514 < hours of CPU time with the {\sc dlm} integrator, while the same small
1515 < drift value will require ~154 hours of CPU time. This demonstrates the
1516 < computational advantage of the integration scheme utilized in {\sc
1517 < oopse}.
1504 > Although the {\sc dlm} method is more computationally expensive than
1505 > the traditional quaternion scheme for propagating a single time step,
1506 > consideration of the computational cost for a long simulation with a
1507 > particular level of energy conservation is in order.  A plot of energy
1508 > drift versus computational cost was generated
1509 > (Fig.~\ref{cpuCost}). This figure provides an estimate of the CPU time
1510 > required under the two integration schemes for 1 nanosecond of
1511 > simulation time for the model 1024-molecule system.  By chosing a
1512 > desired energy drift value it is possible to determine the CPU time
1513 > required for both methods. If a $\delta \mbox{E}_1$ of $1 \times
1514 > 10^{-3} \mbox{kcal mol}^{-1}$ per particle is desired, a nanosecond of
1515 > simulation time will require ~19 hours of CPU time with the {\sc dlm}
1516 > integrator, while the quaternion scheme will require ~154 hours of CPU
1517 > time. This demonstrates the computational advantage of the integration
1518 > scheme utilized in {\sc oopse}.
1519  
1520   \begin{figure}
1521   \centering
1522   \includegraphics[width=\linewidth]{compCost.eps}
1523   \caption[Energy drift as a function of required simulation run
1524 < time]{The logarithm of absolute value of the slope of the energy drift
1525 < (\delta E$_1$) as a function of simulation run time. Simulations were
1526 < performed on a single 2.5 GHz Pentium IV processor. Simulation time
1527 < comparisons can be made by tracing horizontally from one curve to the
1528 < other. For example, a simulation that takes ~24 hours using the {\sc
1529 < dlm} method will take roughly 210 hours using the simple quaternion
1530 < method if the same degree of energy conservation is desired.}
1524 > time]{Energy drift as a function of required simulation run time.
1525 > $\delta \mathrm{E}_1$ is the linear drift in energy over time.
1526 > Simulations were performed on a single 2.5 GHz Pentium 4
1527 > processor. Simulation time comparisons can be made by tracing
1528 > horizontally from one curve to the other. For example, a simulation
1529 > that takes ~24 hours using the {\sc dlm} method will take roughly 210
1530 > hours using the simple quaternion method if the same degree of energy
1531 > conservation is desired.}
1532   \label{cpuCost}
1533   \end{figure}
1534  
# Line 1642 | Line 1638 | Here, $f$ is the total number of degrees of freedom in
1638   \end{equation}
1639   Here, $f$ is the total number of degrees of freedom in the system,
1640   \begin{equation}
1641 < f = 3 N + 3 N_{\mathrm{orient}} - N_{\mathrm{constraints}},
1641 > f = 3 N + 2 N_{\mathrm{linear}} + 3 N_{\mathrm{non-linear}} - N_{\mathrm{constraints}},
1642   \end{equation}
1643   and $K$ is the total kinetic energy,
1644   \begin{equation}
1645   K = \sum_{i=1}^{N} \frac{1}{2} m_i {\bf v}_i^T \cdot {\bf v}_i +
1646 < \sum_{i=1}^{N_{\mathrm{orient}}}  \frac{1}{2} {\bf j}_i^T \cdot
1646 > \sum_{i=1}^{N_{\mathrm{linear}}+N_{\mathrm{non-linear}}}  \frac{1}{2} {\bf j}_i^T \cdot
1647   \overleftrightarrow{\mathsf{I}}_i^{-1} \cdot {\bf j}_i.
1648   \end{equation}
1649 + $N_{\mathrm{linear}}$ is the number of linear rotors (i.e. with two
1650 + non-zero moments of inertia), and $N_{\mathrm{non-linear}}$ is the
1651 + number of non-linear rotors (i.e. with three non-zero moments of
1652 + inertia).  
1653  
1654   In eq.(\ref{eq:nosehooverext}), $\tau_T$ is the time constant for
1655   relaxation of the temperature to the target value.  To set values for
# Line 1687 | Line 1687 | factorization of the three rotation operations that wa
1687   Here $\mathrm{rotate}(h * {\bf j}
1688   \overleftrightarrow{\mathsf{I}}^{-1})$ is the same symplectic Trotter
1689   factorization of the three rotation operations that was discussed in
1690 < the section on the DLM integrator.  Note that this operation modifies
1690 > the section on the {\sc dlm} integrator.  Note that this operation modifies
1691   both the rotation matrix $\mathsf{A}$ and the angular momentum ${\bf
1692   j}$.  {\tt moveA} propagates velocities by a half time step, and
1693   positional degrees of freedom by a full time step.  The new positions
1694   (and orientations) are then used to calculate a new set of forces and
1695   torques in exactly the same way they are calculated in the {\tt
1696 < doForces} portion of the DLM integrator.
1696 > doForces} portion of the {\sc dlm} integrator.
1697  
1698   Once the forces and torques have been obtained at the new time step,
1699   the temperature, velocities, and the extended system variable can be
# Line 1798 | Line 1798 | where ${\bf r}_{ab}$ is the distance between the cente
1798   where ${\bf r}_{ab}$ is the distance between the centers of mass, and
1799   \begin{equation}
1800   {\bf f}_{ab} = s(r_{ab}) \sum_{i \in a} \sum_{j \in b} {\bf f}_{ij} +
1801 < s\prime(r_{ab}) \frac{{\bf r}_{ab}}{|r_{ab}|} \sum_{i \in a} \sum_{j
1801 > s^{\prime}(r_{ab}) \frac{{\bf r}_{ab}}{|r_{ab}|} \sum_{i \in a} \sum_{j
1802   \in b} V_{ij}({\bf r}_{ij}).
1803   \end{equation}
1804  
1805   The instantaneous pressure is then simply obtained from the trace of
1806   the pressure tensor,
1807   \begin{equation}
1808 < P(t) = \frac{1}{3} \mathrm{Tr} \left( \overleftrightarrow{\mathsf{P}}(t).
1809 < \right)
1808 > P(t) = \frac{1}{3} \mathrm{Tr} \left( \overleftrightarrow{\mathsf{P}}(t)
1809 > \right).
1810   \end{equation}
1811  
1812   In eq.(\ref{eq:melchionna1}), $\tau_B$ is the time constant for
# Line 1816 | Line 1816 | integration of the equations of motion is carried out
1816   file.  The units for {\tt tauBarostat} are fs, and the units for the
1817   {\tt targetPressure} are atmospheres.  Like in the NVT integrator, the
1818   integration of the equations of motion is carried out in a
1819 < velocity-Verlet style 2 part algorithm with only the following differences:
1819 > velocity-Verlet style two part algorithm with only the following
1820 > differences:
1821  
1822   {\tt moveA:}
1823   \begin{align*}
# Line 1849 | Line 1850 | the box by
1850   the box by
1851   \begin{equation}
1852   \mathcal{V}(t + h) \leftarrow e^{ - 3 h \eta(t + h /2)} \times
1853 < \mathcal{V}(t)
1853 > \mathcal{V}(t).
1854   \end{equation}
1855  
1856   The {\tt doForces} step for the NPTi integrator is exactly the same as
1857 < in both the DLM and NVT integrators.  Once the forces and torques have
1857 > in both the {\sc dlm} and NVT integrators.  Once the forces and torques have
1858   been obtained at the new time step, the velocities can be advanced to
1859   the same time value.
1860  
# Line 2078 | Line 2079 | the force calculation at each time step.
2079   subtract the total constraint forces from the rest of the system after
2080   the force calculation at each time step.
2081  
2082 < After the force calculation, the total force on molecule $\alpha$,
2082 > After the force calculation, the total force on molecule $\alpha$ is:
2083   \begin{equation}
2084   G_{\alpha} = \sum_i F_{\alpha i},
2085   \label{oopseEq:zc1}
# Line 2099 | Line 2100 | v_{\alpha i} = v_{\alpha i} -
2100   v_{\alpha i} = v_{\alpha i} -
2101          \frac{\sum_i m_{\alpha i} v_{\alpha i}}{\sum_i m_{\alpha i}},
2102   \end{equation}
2103 < where $v_{\alpha i}$ is the velocity of atom $i$ in the z direction.
2103 > where $v_{\alpha i}$ is the velocity of atom $i$ in the $z$ direction.
2104   Lastly, all of the accumulated constraint forces must be subtracted
2105   from the rest of the unconstrained system to keep the system center of
2106   mass of the entire system from drifting.
# Line 2137 | Line 2138 | The user may also specify the use of a constant veloci
2138  
2139   The user may also specify the use of a constant velocity method
2140   (steered molecular dynamics) to move the molecules to their desired
2141 < initial positions.
2141 > initial positions. Based on concepts from atomic force microscopy,
2142 > {\sc smd} has been used to study many processes which occur via rare
2143 > events on the time scale of a few hundreds of picoseconds.  For
2144 > example,{\sc smd} has been used to observe the dissociation of
2145 > Streptavidin-biotin Complex.\cite{smd}  
2146  
2147   To use of the $z$-constraint method in an {\sc oopse} simulation, the
2148   molecules must be specified using the {\tt nZconstraints} keyword in
# Line 2145 | Line 2150 | of the $z$-constraint method are listed in table~\ref{
2150   of the $z$-constraint method are listed in table~\ref{table:zconParams}.
2151  
2152   \begin{table}
2153 < \caption{The Global Keywords: Z-Constraint Parameters}
2153 > \caption{Meta-data Keywords: Z-Constraint Parameters}
2154   \label{table:zconParams}
2155   \begin{center}
2156   % Note when adding or removing columns, the \hsize numbers must add up to the total number
# Line 2158 | Line 2163 | of the $z$-constraint method are listed in table~\ref{
2163  
2164   {\bf keyword} & {\bf units} & {\bf use} & {\bf remarks} \\ \hline
2165  
2166 < {\tt nZconstraints} & integer &  The number of zconstraint molecules& If using zconstraint method, {\tt nZconstraints} must be set \\
2167 < {\tt zconsTime} & fs & Sets the frequency at which the {\tt .fz} file is written &  \\
2168 < {\tt zconsForcePolicy} & string & The strategy of subtracting
2169 < zconstraint force from the unconstrained molecules & Possible
2170 < strategies are {\tt BYMASS} and {\tt BYNUMBER}. Default
2171 < strategy is set to {\tt BYMASS}\\
2172 < {\tt zconsGap} & $\mbox{\AA}$ & Set the distance between two adjacent
2173 < constraint positions& Used mainly in moving molecules through a simulation \\
2174 < {\tt zconsFixtime} & fs & Sets how long the zconstraint molecule is
2175 < fixed & {\tt zconsFixtime} must be set if {\tt zconsGap} is set\\
2176 < {\tt zconsUsingSMD} &logical & Flag for using Steered Molecular
2177 < Dynamics or Harmonic Forces to move the molecule  & Harmonic Forces are
2178 < used by default\\
2166 > {\tt nZconstraints} & integer &  The number of $z$-constrained
2167 > molecules & If using the $z$-constraint method, {\tt nZconstraints}
2168 > must be set \\
2169 > {\tt zconsTime} & fs & Sets the frequency at which the {\tt .fz} file
2170 > is written &  \\
2171 > {\tt zconsForcePolicy} & string & The strategy for subtracting
2172 > the $z$-constraint force from the {\it unconstrained} molecules & Possible
2173 > strategies are {\tt BYMASS} and {\tt BYNUMBER}. The default
2174 > strategy is {\tt BYMASS}\\
2175 > {\tt zconsGap} & $\mbox{\AA}$ & Sets the distance between two adjacent
2176 > constraint positions&Used mainly to move molecules through a
2177 > simulation to estimate potentials of mean force. \\
2178 > {\tt zconsFixtime} & fs & Sets the length of time the $z$-constraint
2179 > molecule is held fixed & {\tt zconsFixtime} must be set if {\tt
2180 > zconsGap} is set\\
2181 > {\tt zconsUsingSMD} & logical & Flag for using Steered Molecular
2182 > Dynamics to move the molecules to the correct constrained positions  &
2183 > Harmonic Forces are used by default\\
2184  
2185   \end{tabularx}
2186   \end{center}
# Line 2187 | Line 2197 | gradient ({\sc cg}) to help users find reasonable loca
2197   stable fixed points on the surface.  We have included two simple
2198   minimization algorithms: steepest descent, ({\sc sd}) and conjugate
2199   gradient ({\sc cg}) to help users find reasonable local minima from
2200 < their initial configurations.
2200 > their initial configurations. Since {\sc oopse} handles atoms and
2201 > rigid bodies which have orientational coordinates as well as
2202 > translational coordinates, there is some subtlety to the choice of
2203 > parameters for minimization algorithms.
2204  
2192 Since {\sc oopse} handles atoms and rigid bodies which have
2193 orientational coordinates as well as translational coordinates, there
2194 is some subtlety to the choice of parameters for minimization
2195 algorithms.
2196
2205   Given a coordinate set $x_{k}$ and a search direction $d_{k}$, a line
2206   search algorithm is performed along $d_{k}$ to produce
2207 < $x_{k+1}=x_{k}+$ $\lambda _{k}d_{k}$.
2208 <
2201 < In the steepest descent ({\sc sd}) algorithm,%
2207 > $x_{k+1}=x_{k}+$ $\lambda _{k}d_{k}$. In the steepest descent ({\sc
2208 > sd}) algorithm,%
2209   \begin{equation}
2210 < d_{k}=-\nabla V(x_{k})
2210 > d_{k}=-\nabla V(x_{k}).
2211   \end{equation}
2212   The gradient and the direction of next step are always orthogonal.
2213   This may cause oscillatory behavior in narrow valleys.  To overcome
2214   this problem, the Fletcher-Reeves variant~\cite{FletcherReeves} of the
2215   conjugate gradient ({\sc cg}) algorithm is used to generate $d_{k+1}$
2216   via simple recursion:
2217 < \begin{align}
2218 < d_{k+1}  &  =-\nabla V(x_{k+1})+\gamma_{k}d_{k}\\
2219 < \gamma_{k}  &  =\frac{\nabla V(x_{k+1})^{T}\nabla V(x_{k+1})}{\nabla
2220 < V(x_{k})^{T}\nabla V(x_{k})}%
2221 < \end{align}
2217 > \begin{equation}
2218 > d_{k+1}  =-\nabla V(x_{k+1})+\gamma_{k}d_{k}
2219 > \end{equation}
2220 > where
2221 > \begin{equation}
2222 > \gamma_{k}  =\frac{\nabla V(x_{k+1})^{T}\nabla V(x_{k+1})}{\nabla
2223 > V(x_{k})^{T}\nabla V(x_{k})}.
2224 > \end{equation}
2225  
2226   The Polak-Ribiere variant~\cite{PolakRibiere} of the conjugate
2227   gradient ($\gamma_{k}$) is defined as%
# Line 2219 | Line 2229 | V(x_{k})^{T}\nabla V(x_{k})}%
2229   \gamma_{k}=\frac{[\nabla V(x_{k+1})-\nabla V(x)]^{T}\nabla V(x_{k+1})}{\nabla
2230   V(x_{k})^{T}\nabla V(x_{k})}%
2231   \end{equation}
2232 + It is widely agreed that the Polak-Ribiere variant gives better
2233 + convergence than the Fletcher-Reeves variant, so the conjugate
2234 + gradient approach implemented in {\sc oopse} is the Polak-Ribiere
2235 + variant.
2236  
2237   The conjugate gradient method assumes that the conformation is close
2238   enough to a local minimum that the potential energy surface is very
# Line 2232 | Line 2246 | minimizer are given in Table~\ref{table:minimizeParams
2246   minimizer are given in Table~\ref{table:minimizeParams}
2247  
2248   \begin{table}
2249 < \caption{The Global Keywords: Energy Minimizer Parameters}
2249 > \caption{Meta-data Keywords: Energy Minimizer Parameters}
2250   \label{table:minimizeParams}
2251   \begin{center}
2252   % Note when adding or removing columns, the \hsize numbers must add up to the total number
# Line 2248 | Line 2262 | descent) \\
2262   {\tt minimizer} & string &  selects the minimization method to be used
2263   & either {\tt CG} (conjugate gradient) or {\tt SD} (steepest
2264   descent) \\
2265 < {\tt minimizerMaxIter} & steps & Sets the maximum iteration number in the energy minimization & Default value is 200\\
2266 < {\tt minimizerWriteFreq} & steps & Sets the frequency at which the {\tt .dump} and {\tt .stat} files are writtern during energy minimization & \\
2267 < {\tt minimizerStepSize} & $\mbox{\AA}$ &  Set the step size of line search & Default value is 0.01\\
2268 < {\tt minimizerFTol} & $\mbox{kcal mol}^{-1}$  & Sets energy tolerance  & Default value is $10^{-8}$\\
2269 < {\tt minimizerGTol} & $\mbox{kcal mol}^{-1}\mbox{\AA}^{-1}$ & Sets gradient tolerance & Default value is $10^{-8}$\\
2270 < {\tt minimizerLSTol} &  $\mbox{kcal mol}^{-1}$ & Sets line search tolerance & Default value is $10^{-8}$\\
2271 < {\tt minimizerLSMaxIter} & steps &  Sets the maximum iteration of line searching & Default value is 50\\
2265 > {\tt minimizerMaxIter} & steps & Sets the maximum number of iterations
2266 > for the energy minimization & The default value is 200\\
2267 > {\tt minimizerWriteFreq} & steps & Sets the frequency with which the {\tt .dump} and {\tt .stat} files are writtern during energy minimization & \\
2268 > {\tt minimizerStepSize} & $\mbox{\AA}$ &  Sets the step size for the
2269 > line search & The default value is 0.01\\
2270 > {\tt minimizerFTol} & $\mbox{kcal mol}^{-1}$  & Sets the energy tolerance
2271 > for stopping the minimziation. & The default value is $10^{-8}$\\
2272 > {\tt minimizerGTol} & $\mbox{kcal mol}^{-1}\mbox{\AA}^{-1}$ & Sets the
2273 > gradient tolerance for stopping the minimization. & The default value
2274 > is  $10^{-8}$\\
2275 > {\tt minimizerLSTol} &  $\mbox{kcal mol}^{-1}$ & Sets line search
2276 > tolerance for terminating each step of the minimization. & The default
2277 > value is $10^{-8}$\\
2278 > {\tt minimizerLSMaxIter} & steps &  Sets the maximum number of
2279 > iterations for each line search & The default value is 50\\
2280  
2281   \end{tabularx}
2282   \end{center}
# Line 2263 | Line 2285 | Although processor power is continually improving, it
2285   \section{\label{oopseSec:parallelization} Parallel Simulation Implementation}
2286  
2287   Although processor power is continually improving, it is still
2288 < unreasonable to simulate systems of more then a 1000 atoms on a single
2288 > unreasonable to simulate systems of more than 10,000 atoms on a single
2289   processor. To facilitate study of larger system sizes or smaller
2290   systems for longer time scales, parallel methods were developed to
2291   allow multiple CPU's to share the simulation workload. Three general
# Line 2274 | Line 2296 | the duration of the simulation. Computational cost sca
2296   Algorithmically simplest of the three methods is atomic decomposition,
2297   where $N$ particles in a simulation are split among $P$ processors for
2298   the duration of the simulation. Computational cost scales as an
2299 < optimal $\mathcal{O}(N/P)$ for atomic decomposition. Unfortunately all
2299 > optimal $\mathcal{O}(N/P)$ for atomic decomposition. Unfortunately, all
2300   processors must communicate positions and forces with all other
2301   processors at every force evaluation, leading the communication costs
2302   to scale as an unfavorable $\mathcal{O}(N)$, \emph{independent of the
# Line 2296 | Line 2318 | The parallelization method used in {\sc oopse} is the
2318   three dimensions.
2319  
2320   The parallelization method used in {\sc oopse} is the force
2321 < decomposition method.  Force decomposition assigns particles to
2322 < processors based on a block decomposition of the force
2321 > decomposition method.\cite{hendrickson:95} Force decomposition assigns
2322 > particles to processors based on a block decomposition of the force
2323   matrix. Processors are split into an optimally square grid forming row
2324   and column processor groups. Forces are calculated on particles in a
2325   given row by particles located in that processor's column
2326 < assignment. Force decomposition is less complex to implement than the
2327 < spatial method but still scales computationally as $\mathcal{O}(N/P)$
2328 < and scales as $\mathcal{O}(N/\sqrt{P})$ in communication
2329 < cost. Plimpton has also found that force decompositions scale more
2330 < favorably than spatial decompositions for systems up to 10,000 atoms
2331 < and favorably compete with spatial methods up to 100,000
2332 < atoms.\cite{plimpton95}
2333 <
2326 > assignment. One deviation from the algorithm described by Hendrickson
2327 > {\it et al.} is the use of column ordering based on the row indexes
2328 > preventing the need for a transpose operation necessitating a second
2329 > communication step when gathering the final force components.  Force
2330 > decomposition is less complex to implement than the spatial method but
2331 > still scales computationally as $\mathcal{O}(N/P)$ and scales as
2332 > $\mathcal{O}(N/\sqrt{P})$ in communication cost. Plimpton has also
2333 > found that force decompositions scale more favorably than spatial
2334 > decompositions for systems up to 10,000 atoms and favorably compete
2335 > with spatial methods up to 100,000 atoms.\cite{plimpton95}
2336 >
2337   \section{\label{oopseSec:conclusion}Conclusion}
2338  
2339 < We have presented a new open source parallel simulation program {\sc
2339 > We have presented a new parallel simulation program called {\sc
2340   oopse}. This program offers some novel capabilities, but mostly makes
2341   available a library of modern object-oriented code for the scientific
2342   community to use freely.  Notably, {\sc oopse} can handle symplectic
# Line 2340 | Line 2365 | DMR-0079647.
2365   the Notre Dame Bunch-of-Boxes (B.o.B) computer cluster under NSF grant
2366   DMR-0079647.
2367  
2368 < \bibliographystyle{achemso}
2368 > \bibliographystyle{jcc}
2369   \bibliography{oopsePaper}
2370  
2371   \end{document}

Diff Legend

Removed lines
+ Added lines
< Changed lines
> Changed lines