[ViewVC] Diff of: group/trunk/oopsePaper/oopsePaper.tex

Comparing trunk/oopsePaper/oopsePaper.tex (file contents):
Revision 1434 by chrisfen, Thu Jul 29 18:01:05 2004 UTC vs.
Revision 1439 by gezelter, Thu Jul 29 20:06:07 2004 UTC

#	Line 1 \| Line 1
1		\documentclass[11pt]{article}
2		\usepackage{amsmath}
3	<	\usepackage{amssymb}
3	>	\usepackage{amssymb}
4		\usepackage{endfloat}
5		\usepackage{listings}
6		\usepackage{berkeley}
#	Line 22 \| Line 22
22		xleftmargin=0.5in, xrightmargin=0.5in,captionpos=b, %
23		abovecaptionskip=0.5cm, belowcaptionskip=0.5cm}
24		\renewcommand{\lstlistingname}{Scheme}
25	<	\title{{\sc oopse}: An Open Source Object-Oriented Parallel Simulation
25	>	\title{{\sc oopse}: An Object-Oriented Parallel Simulation
26		Engine for Molecular Dynamics}
27
28		\author{Matthew A. Meineke, Charles F. Vardeman II, Teng Lin,\\
#	Line 43 \| Line 43 \| meta-data language. A number of advanced integrators
43		simulations are carried out using the force-based decomposition
44		method. Simulations are specified using a very simple C-based
45		meta-data language. A number of advanced integrators are included,
46	<	and the base integrator for orientational dynamics provides
47	<	substantial improvements over older quaternion-base schemes. All
48	<	source code is available under a very permissive (BSD-style) Open
49	<	Source license.
46	>	and the basic integrator for orientational dynamics provides
47	>	substantial improvements over older quaternion-based schemes.
48		\end{abstract}
49
50		\section{\label{sec:intro}Introduction}
51
52		There are a number of excellent molecular dynamics packages available
53		to the chemical physics
54	<	community.\cite{Brooks83,MacKerell98,pearlman:1995,Gromacs,Gromacs3,DL_POLY,Tinker,Paradyn}
54	>	community.\cite{Brooks83,MacKerell98,pearlman:1995,Gromacs,Gromacs3,DL_POLY,Tinker,Paradyn,namd,macromodel}
55		All of these packages are stable, polished programs which solve many
56		problems of interest. Most are now capable of performing molecular
57		dynamics simulations on parallel computers. Some have source code
#	Line 64 \| Line 62 \| has taken us now involves the use of atoms with orient
62		programs referenced can handle transition metal force fields like the
63		Embedded Atom Method ({\sc eam}). The direction our research program
64		has taken us now involves the use of atoms with orientational degrees
65	<	of freedom and transition metals. Since these simulation methods may
66	<	be of some use to other researchers, we have decided to release our
67	<	program to the scientific community with a permissive open source
68	<	license.
65	>	of freedom as well as transition metals. Since these simulation
66	>	methods may be of some use to other researchers, we have decided to
67	>	release our program (and all related source code) to the scientific
68	>	community.
69
70		This paper communicates the algorithmic details of our program, which
71	<	we have been calling the Open source Object-oriented Parallel
72	<	Simulation Engine (i.e. {\sc oopse}). We have structured this paper
73	<	to first discuss the underlying concepts in this simulation package
71	>	we have been calling the Object-Oriented Parallel Simulation Engine
72	>	(i.e. {\sc oopse}). We have structured this paper to first discuss
73	>	the underlying concepts in this simulation package
74		(Sec. \ref{oopseSec:IOfiles}). The empirical energy functions
75		implemented are discussed in Sec.~\ref{oopseSec:empiricalEnergy}.
76		Sec.~\ref{oopseSec:mechanics} describes the various Molecular Dynamics
#	Line 283 \| Line 281 \| scheme is often used. The elements of $\mathsf{A}$ can
281		$\psi$.\cite{Goldstein01} In order to avoid numerical instabilities
282		inherent in using the Euler angles, the four parameter ``quaternion''
283		scheme is often used. The elements of $\mathsf{A}$ can be expressed as
284	<	arithmetic operations involving the four quaternions ($q_0, q_1, q_2,$
285	<	and $q_3$).\cite{allen87:csl} Use of quaternions also leads to
284	>	arithmetic operations involving the four quaternions ($q_w, q_x, q_y,$
285	>	and $q_z$).\cite{allen87:csl} Use of quaternions also leads to
286		performance enhancements, particularly for very small
287		systems.\cite{Evans77}
288
#	Line 391 \| Line 389 \| mol}^{-1}$, distances are in $\mbox{\AA}$, times are i
389		It is important to note the fundamental units in all files which are
390		read and written by {\sc oopse}. Energies are in $\mbox{kcal
391		mol}^{-1}$, distances are in $\mbox{\AA}$, times are in $\mbox{fs}$,
392	<	translational velocities are in $\mbox{\AA fs}^{-1}$, and masses are
392	>	translational velocities are in $\mbox{\AA~fs}^{-1}$, and masses are
393		in $\mbox{amu}$. Orientational degrees of freedom are described using
394		quaternions (unitless, but $q_w^2 + q_x^2 + q_y^2 + q_z^2 = 1$),
395		body-fixed angular momenta ($\mbox{amu \AA}^{2} \mbox{radians
#	Line 411 \| Line 409 \| fs}^{-1}$), and body-fixed moments of inertia ($\mbox{
409
410		{\bf keyword} & {\bf units} & {\bf use} & {\bf remarks} \\ \hline
411
412	<	{\tt forceField} & string & Sets the force field. & Possible force fields are "DUFF", "LJ", and "EAM". \\
412	>	{\tt forceField} & string & Sets the force field. & Possible force fields are DUFF, LJ, and EAM. \\
413		{\tt nComponents} & integer & Sets the number of components. & Needs to appear before the first {\tt Component} block. \\
414		{\tt initialConfig} & string & Sets the file containing the initial configuration. & Can point to any file containing the configuration in the correct order. \\
415		{\tt minimizer}& string & Chooses a minimizer & Possible minimizers
416	<	are "SD" and "CG". Either {\tt ensemble} or {\tt minimizer} must be specified. \\
416	>	are SD and CG. Either {\tt ensemble} or {\tt minimizer} must be specified. \\
417		{\tt ensemble} & string & Sets the ensemble. & Possible ensembles are
418	<	"NVE", "NVT", "NPTi", "NPTf", and "NPTxyz". Either {\tt ensemble}
418	>	NVE, NVT, NPTi, NPTf, and NPTxyz. Either {\tt ensemble}
419		or {\tt minimizer} must be specified. \\
420		{\tt dt} & fs & Sets the time step. & Selection of {\tt dt} should be
421	<	small enough to sample the fastest motion of the simulation. (required
422	<	for molecular dynamics simulations)\\
421	>	small enough to sample the fastest motion of the simulation. ({\tt
422	>	dt} is required for molecular dynamics simulations)\\
423		{\tt runTime} & fs & Sets the time at which the simulation should
424		end. & This is an absolute time, and will end the simulation when the
425	<	current time meets or exceeds the {\tt runTime}. (required for
426	<	molecular dynamics simulations)\\
425	>	current time meets or exceeds the {\tt runTime}. ({\tt runTime} is
426	>	required for molecular dynamics simulations)\\
427
428		\end{tabularx}
429		\end{center}
#	Line 450 \| Line 448 \| extension. \\
448		the root name of the initial meta-data file but with an {\tt .eor}
449		extension. \\
450		{\tt useInitialTime} & logical & Sets whether the initial time is taken from the {\tt .in} file. & Useful when recovering a simulation from a crashed processor. Default is false. \\
451	<	{\tt sampleTime} & fs & Sets the frequency at which the {\tt .dump} file is written. & Default sets the frequency to the {\tt runTime}. \\
452	<	{\tt statusTime} & fs & Sets the frequency at which the {\tt .stat} file is written. & Defaults set the frequency to the {\tt sampleTime}. \\
453	<	{\tt cutoffRadius} & $\mbox{\AA}$ & Manually sets the cutoffRadius & Defaults to
454	<	$15\mbox{\AA}$ for systems containing charges or dipoles or to $2.5
455	<	\sigma_{L}$, where $\sigma_{L}$ is the largest LJ $\sigma$ in the
451	>	{\tt sampleTime} & fs & Sets the frequency at which the {\tt .dump}
452	>	file is written. & The default is equal to the {\tt runTime}. \\
453	>	{\tt statusTime} & fs & Sets the frequency at which the {\tt .stat}
454	>	file is written. & The default is equal to the {\tt sampleTime}. \\
455	>	{\tt cutoffRadius} & $\mbox{\AA}$ & Manually sets the cutoff radius &
456	>	Defaults to $15\mbox{\AA}$ for systems containing charges or dipoles or to $2.5
457	>	\sigma_{L}$, where $\sigma_{L}$ is the largest LJ $\sigma$ in the
458		simulation. \\
459		{\tt switchingRadius} & $\mbox{\AA}$ & Manually sets the inner radius for the switching function. & Defaults to 95~\% of the {\tt cutoffRadius}. \\
460	<	{\tt useReactionField} & logical & Turns the reaction field correction on/off. & Default is "false". \\
460	>	{\tt useReactionField} & logical & Turns the reaction field correction on/off. & Default is false. \\
461		{\tt dielectric} & unitless & Sets the dielectric constant for reaction field. & If {\tt useReactionField} is true, then {\tt dielectric} must be set. \\
462		{\tt usePeriodicBoundaryConditions} & & & \\
463	<	& logical & Turns periodic boundary conditions on/off. & Default is "true". \\
463	>	& logical & Turns periodic boundary conditions on/off. & Default is true. \\
464		{\tt seed } & integer & Sets the seed value for the random number
465		generator. & The seed needs to be at least 9 digits long. The default
466		is to take the seed from the CPU clock. \\
467		{\tt forceFieldVariant} & string & Sets the name of the variant of the
468	<	force field. ({\sc eam} has three variants: {\tt u3}, {\tt u6}, and
468	>	force field. & {\sc eam} has three variants: {\tt u3}, {\tt u6}, and
469		{\tt VC}.
470
471		\end{tabularx}
#	Line 565 \| Line 565 \| declared in the molecule's declaration within the mode
565		\item The ordering of the atoms for each molecule follows the order
566		declared in the molecule's declaration within the model file.
567		\item Only atoms which are not members of a {\tt rigidBody} are
568	<	included
568	>	included.
569		\item Rigid Body coordinates for a molecule are listed immediately
570		after the the other atoms in a molecule. Some molecules may be
571		entirely rigid, in which case, only the rigid body coordinates are
#	Line 667 \| Line 667 \| V = V_{\mathrm{short-range}} + V_{\mathrm{long-range}}
667		\begin{equation}
668		V = V_{\mathrm{short-range}} + V_{\mathrm{long-range}}.
669		\end{equation}
670	<	The short-ranged portion includes explicit bonds, bends and torsions,
671	<	which have been defined in the meta-data file for the molecules which
672	<	present in the simulation. The functional forms and parameters for
673	<	these interactions are defined by the force field which is chosen.
670	>	The short-ranged portion includes the explicit bonds, bends, and
671	>	torsions which have been defined in the meta-data file for the
672	>	molecules which are present in the simulation. The functional forms and
673	>	parameters for these interactions are defined by the force field which
674	>	is chosen.
675
676	<	Calculating long-range (non-bonded) potential involves a sum over all
677	<	pairs of atoms (except for those atoms which are involved in a bond,
678	<	bend, or torsion with each other). If done poorly, calculating the
679	<	the long-range interactions for $N$ atoms would involve $N^2$
680	<	evaluations of atomic distance. To reduce the number of distance
676	>	Calculating the long-range (non-bonded) potential involves a sum over
677	>	all pairs of atoms (except for those atoms which are involved in a
678	>	bond, bend, or torsion with each other). If done poorly, calculating
679	>	the the long-range interactions for $N$ atoms would involve $N(N-1)/2$
680	>	evaluations of atomic distances. To reduce the number of distance
681		evaluations between pairs of atoms, {\sc oopse} uses a switched cutoff
682		with Verlet neighbor lists.\cite{allen87:csl} It is well known that
683		neutral groups which contain charges will exhibit pathological forces
#	Line 691 \| Line 692 \| cutoff groups ($a$ and $b$).
692		where $r_{ab}$ is the distance between the centers of mass of the two
693		cutoff groups ($a$ and $b$).
694
695	<	The sums over $a$ and $b$ are over the cutoffGroups that are present
695	>	The sums over $a$ and $b$ are over the cutoff groups that are present
696		in the simulation. Atoms which are not explicitly defined as members
697		of a {\tt cutoffGroup} are treated as a group consisting of only one
698		atom. The switching function, $s(r)$ is the standard cubic switching
#	Line 717 \| Line 718 \| present in the simulation. In simulations containing
718		atoms, the cutoff radius has a default value of $2.5\sigma_{ii}$,
719		where $\sigma_{ii}$ is the largest Lennard-Jones length parameter
720		present in the simulation. In simulations containing charged or
721	<	dipolar atoms, the default cutoff Radius is $15 \mbox{\AA}$.
721	>	dipolar atoms, the default cutoff radius is $15 \mbox{\AA}$.
722
723		The {\tt switchingRadius} is set to a default value of 95\% of the
724		{\tt cutoffRadius}. In the special case of a simulation containing
#	Line 726 \| Line 727 \| Both radii may be specified in the meta-data file.
727		potential to remove discontinuities in the potential at the cutoff.
728		Both radii may be specified in the meta-data file.
729
730	<	Force fields can easily be added to {\sc oopse}, although it comes
731	<	with a few simple examples (Lennard-Jones, {\sc duff}, {\sc water},
732	<	and {\sc eam}) which are explained in the following sections.
730	>	Force fields can be added to {\sc oopse}, although it comes with a few
731	>	simple examples (Lennard-Jones, {\sc duff}, {\sc water}, and {\sc
732	>	eam}) which are explained in the following sections.
733
734		\subsection{\label{sec:LJPot}The Lennard Jones Force Field}
735
#	Line 1080 \| Line 1081 \| In addition to the {\sc duff} force field's solvent de
1081		\subsection{\label{oopseSec:WATER}The {\sc water} Force Field}
1082
1083		In addition to the {\sc duff} force field's solvent description, a
1084	<	separate {\sc water} force field has been included for simulating many
1085	<	of the common rigid-body water models. In addition to the simple or
1086	<	dipolar models (SSD, SSD1, SSD/E, SSD/RF, and DPD water), the common
1087	<	charge-based models were included (SPC, SPC/E, TIP3P, TIP4P, and
1084	>	separate {\sc water} force field has been included for simulating most
1085	>	of the common rigid-body water models. This force field includes the
1086	>	simple and point-dipolar models (SSD, SSD1, SSD/E, SSD/RF, and DPD
1087	>	water), as well as the common charge-based models (SPC, SPC/E, TIP3P,
1088	>	TIP4P, and
1089		TIP5P).\cite{liu96:new_model,Ichiye03,fennell04,Marrink01,Berendsen81,Berendsen87,Jorgensen83,Mahoney00}
1090		In order to handle these models, charge-charge interactions were
1091		included in the force-loop:
#	Line 1091 \| Line 1093 \| where $q$ represents the charge on particle $i$ or $j$
1093		V_{\text{charge}}(r_{ij}) = \sum_{ij}\frac{q_iq_je^2}{r_{ij}},
1094		\end{equation}
1095		where $q$ represents the charge on particle $i$ or $j$, and $e$ is the
1096	<	charge of an electron in Coulombs. The charge-charge interaction
1097	<	support is rudimentary in the current version of {\sc oopse}. As with
1098	<	the other pair interactions, charges can be simulated with a pure
1099	<	cutoff or a reaction field. The various methods for performing the
1100	<	Ewald summation have not yet been included. Also, the charge-dipole
1101	<	and charge-quadrupole (for interactions between SSD type water and
1102	<	charges) are not yet available, so it is currently inadvisable to mix
1101	<	dipolar and charge based molecules in the same system.
1096	>	charge of an electron in Coulombs. As with the other pair
1097	>	interactions, charges can be simulated with a pure cutoff or a
1098	>	reaction field. The {\sc water} force field can be easily expanded
1099	>	through modification of the {\sc water} force field file ({\tt
1100	>	WATER.frc}). By adding atom types and inserting the appropriate
1101	>	parameters, it is possible to extend the force field to handle rigid
1102	>	molecules other than water.
1103
1103	–	The {\sc water} force field can be easily expanded through
1104	–	modification of the {\sc water} force field file ({\tt WATER.frc}). By
1105	–	adding atom types and inserting the appropriate parameters, it is
1106	–	possible to extend the force field to handle rigid molecules other
1107	–	than water.
1108	–
1104		\subsection{\label{oopseSec:eam}Embedded Atom Method}
1105
1106		{\sc oopse} implements a potential that describes bonding in
#	Line 1252 \| Line 1247 \| the inter-atomic forces.
1247		\section{\label{oopseSec:mechanics}Mechanics}
1248
1249		\subsection{\label{oopseSec:integrate}Integrating the Equations of Motion: the
1250	<	DLM method}
1250	>	{\sc dlm} method}
1251
1252		The default method for integrating the equations of motion in {\sc
1253		oopse} is a velocity-Verlet version of the symplectic splitting method
1254		proposed by Dullweber, Leimkuhler and McLachlan
1255	<	(DLM).\cite{Dullweber1997} When there are no directional atoms or
1255	>	({\sc dlm}).\cite{Dullweber1997} When there are no directional atoms or
1256		rigid bodies present in the simulation, this integrator becomes the
1257		standard velocity-Verlet integrator which is known to sample the
1258		microcanonical (NVE) ensemble.\cite{Frenkel1996}
1259
1260		Previous integration methods for orientational motion have problems
1261	<	that are avoided in the DLM method. Direct propagation of the Euler
1261	>	that are avoided in the {\sc dlm} method. Direct propagation of the Euler
1262		angles has a known $1/\sin\theta$ divergence in the equations of
1263		motion for $\phi$ and $\psi$,\cite{allen87:csl} leading to numerical
1264		instabilities any time one of the directional atoms or rigid bodies
#	Line 1371 \| Line 1366 \| of the particle in the space-fixed frame.
1366		Here $\hat{\bf u}$ is a unit vector pointing along the principal axis
1367		of the particle in the space-fixed frame.
1368
1369	<	The DLM method uses a Trotter factorization of the orientational
1369	>	The {\sc dlm} method uses a Trotter factorization of the orientational
1370		propagator. This has three effects:
1371		\begin{enumerate}
1372		\item the integrator is area-preserving in phase space (i.e. it is
#	Line 1465 \| Line 1460 \| advanced to the same time value.
1460		+ \frac{h}{2} {\bf \tau}^b(t + h) .
1461		\end{align*}
1462
1463	<	The matrix rotations used in the DLM method end up being more costly
1464	<	computationally than the simpler arithmetic quaternion
1463	>	The matrix rotations used in the {\sc dlm} method end up being more
1464	>	costly computationally than the simpler arithmetic quaternion
1465		propagation. With the same time step, a 1024-molecule water simulation
1466	<	shows an 12\% increase in computation time (averaged over several
1467	<	different time steps) using the DLM method in place of
1468	<	quaternions. This cost is more than justified when comparing the
1469	<	energy conservation of the two methods. Figure ~\ref{quatdlm} provides
1470	<	a comparative analysis of the {\sc dlm} method versus the simple
1476	<	quaternion method that was originally implemented.
1466	>	incurs an average 12\% increase in computation time using the {\sc
1467	>	dlm} method in place of quaternions. This cost is more than justified
1468	>	when comparing the energy conservation achieved by the two
1469	>	methods. Figure ~\ref{quatdlm} provides a comparative analysis of the
1470	>	{\sc dlm} method versus the traditional quaternion scheme.
1471
1472		\begin{figure}
1473		\centering
1474		\includegraphics[width=\linewidth]{quatvsdlm.eps}
1475		\caption[Energy conservation analysis of the {\sc dlm} and quaternion
1476	<	integration methods]{The logarithm of absolute value of the slope of
1477	<	the energy drift (\delta E$_1$) and the standard deviation of the
1478	<	energy fluctuations (\delta E$_0$) as a function of chosen time
1479	<	step. All simulations were of a 1024-molecule simulation of SSD water
1480	<	at 298 K starting from the same initial configuration. Note that the
1481	<	{\sc dlm} method provides a greater-than order-of-magnitude
1482	<	improvement in energy conservation and relative energy fluctuations
1483	<	over the quaternion method at all the tested time steps. The energy
1484	<	drift is quite steep for the larger time steps in both methods, and
1485	<	results in discontinuous behavior as the systems compound their
1486	<	anomolous energy accumulation.}
1476	>	integration methods]{Analysis of the energy conservation of the {\sc
1477	>	dlm} and quaternion integration methods. $\delta \mathrm{E}_1$ is the
1478	>	linear drift in energy over time and $\delta \mathrm{E}_0$ is the
1479	>	standard deviation of energy fluctuations around this drift. All
1480	>	simulations were of a 1024-molecule simulation of SSD water at 298 K
1481	>	starting from the same initial configuration. Note that the {\sc dlm}
1482	>	method provides more than an order of magnitude improvement in both
1483	>	the energy drift and the size of the energy fluctuations when compared
1484	>	with the quaternion method at any given time step. At time steps
1485	>	larger than 4 fs, the quaternion scheme resulted in rapidly rising
1486	>	energies which eventually lead to simulation failure. Using the {\sc
1487	>	dlm} method, time steps up to 8 fs can be taken before this behavior
1488	>	is evident.}
1489		\label{quatdlm}
1490		\end{figure}
1491
1492	<	In Fig.~\ref{quatdlm}, \delta E$_1$ is a measure of the linear energy
1493	<	drift in units of kcal/mol per particle over a nanosecond of
1494	<	simulation time, and \delta E$_0$ is the standard deviation of the
1495	<	energy fluctuations in units of kcal/mol per particle. In the top
1496	<	plot, it is apparent that the energy drift is reduced by a significant
1497	<	amount (2 to 3 orders-of-magnitude improvement at every tested time
1498	<	step) by chosing the {\sc dlm} method over the simple non-symplectic
1499	<	quaternion integration method. When the energy drift becomes very
1500	<	small ($log_{10}[\|\delta\text{E}_1\|] < -3$), it is more difficult to
1501	<	calculate a slope, resulting in the larger displayed error bars. In
1502	<	addition to this improvement in energy drift, the fluctuation is the
1507	<	total energy are also dampened out by 1 to 2 orders-of-magnitude by
1508	<	utilizing the {\sc dlm} integration method.
1492	>	In Fig.~\ref{quatdlm}, $\delta \mbox{E}_1$ is a measure of the linear
1493	>	energy drift in units of $\mbox{kcal mol}^{-1}$ per particle over a
1494	>	nanosecond of simulation time, and $\delta \mbox{E}_0$ is the standard
1495	>	deviation of the energy fluctuations in units of $\mbox{kcal
1496	>	mol}^{-1}$ per particle. In the top plot, it is apparent that the
1497	>	energy drift is reduced by a significant amount (2 to 3 orders of
1498	>	magnitude improvement at all tested time steps) by chosing the {\sc
1499	>	dlm} method over the simple non-symplectic quaternion integration
1500	>	method. In addition to this improvement in energy drift, the
1501	>	fluctuations in the total energy are also dampened by 1 to 2 orders of
1502	>	magnitude by utilizing the {\sc dlm} method.
1503
1504	<	It was stated previously that the {\sc dlm} method was the more
1505	<	computationally expensive of the two implimented integration
1506	<	methodologies. In order to incorporate this information into the
1507	<	energy analysis a plot of energy drift versus computational cost was
1508	<	generated (Fig.~\ref{cpuCost}). This figure provides an estimate of
1509	<	the CPU time required under the two integration schemes for 1
1510	<	nanosecond of simulation time for the model 1024-molecule system. The
1511	<	plot is read by chosing a desired energy drift value and determining
1512	<	where both the curves cross. If a \delta E$_1$ of 1E-3 kcal/mol per
1513	<	particle is desired, a nanosecond of simulation time will require ~19
1514	<	hours of CPU time with the {\sc dlm} integrator, while the same small
1515	<	drift value will require ~154 hours of CPU time. This demonstrates the
1516	<	computational advantage of the integration scheme utilized in {\sc
1517	<	oopse}.
1504	>	Although the {\sc dlm} method is more computationally expensive than
1505	>	the traditional quaternion scheme for propagating a single time step,
1506	>	consideration of the computational cost for a long simulation with a
1507	>	particular level of energy conservation is in order. A plot of energy
1508	>	drift versus computational cost was generated
1509	>	(Fig.~\ref{cpuCost}). This figure provides an estimate of the CPU time
1510	>	required under the two integration schemes for 1 nanosecond of
1511	>	simulation time for the model 1024-molecule system. By chosing a
1512	>	desired energy drift value it is possible to determine the CPU time
1513	>	required for both methods. If a $\delta \mbox{E}_1$ of $1 \times
1514	>	10^{-3} \mbox{kcal mol}^{-1}$ per particle is desired, a nanosecond of
1515	>	simulation time will require ~19 hours of CPU time with the {\sc dlm}
1516	>	integrator, while the quaternion scheme will require ~154 hours of CPU
1517	>	time. This demonstrates the computational advantage of the integration
1518	>	scheme utilized in {\sc oopse}.
1519
1520		\begin{figure}
1521		\centering
1522		\includegraphics[width=\linewidth]{compCost.eps}
1523		\caption[Energy drift as a function of required simulation run
1524	<	time]{The logarithm of absolute value of the slope of the energy drift
1525	<	(\delta E$_1$) as a function of simulation run time. Simulations were
1526	<	performed on a single 2.5 GHz Pentium IV processor. Simulation time
1527	<	comparisons can be made by tracing horizontally from one curve to the
1528	<	other. For example, a simulation that takes ~24 hours using the {\sc
1529	<	dlm} method will take roughly 210 hours using the simple quaternion
1530	<	method if the same degree of energy conservation is desired.}
1524	>	time]{Energy drift as a function of required simulation run time.
1525	>	$\delta \mathrm{E}_1$ is the linear drift in energy over time.
1526	>	Simulations were performed on a single 2.5 GHz Pentium 4
1527	>	processor. Simulation time comparisons can be made by tracing
1528	>	horizontally from one curve to the other. For example, a simulation
1529	>	that takes ~24 hours using the {\sc dlm} method will take roughly 210
1530	>	hours using the simple quaternion method if the same degree of energy
1531	>	conservation is desired.}
1532		\label{cpuCost}
1533		\end{figure}
1534
#	Line 1642 \| Line 1638 \| Here, $f$ is the total number of degrees of freedom in
1638		\end{equation}
1639		Here, $f$ is the total number of degrees of freedom in the system,
1640		\begin{equation}
1641	<	f = 3 N + 3 N_{\mathrm{orient}} - N_{\mathrm{constraints}},
1641	>	f = 3 N + 2 N_{\mathrm{linear}} + 3 N_{\mathrm{non-linear}} - N_{\mathrm{constraints}},
1642		\end{equation}
1643		and $K$ is the total kinetic energy,
1644		\begin{equation}
1645		K = \sum_{i=1}^{N} \frac{1}{2} m_i {\bf v}_i^T \cdot {\bf v}_i +
1646	<	\sum_{i=1}^{N_{\mathrm{orient}}} \frac{1}{2} {\bf j}_i^T \cdot
1646	>	\sum_{i=1}^{N_{\mathrm{linear}}+N_{\mathrm{non-linear}}} \frac{1}{2} {\bf j}_i^T \cdot
1647		\overleftrightarrow{\mathsf{I}}_i^{-1} \cdot {\bf j}_i.
1648		\end{equation}
1649	+	$N_{\mathrm{linear}}$ is the number of linear rotors (i.e. with two
1650	+	non-zero moments of inertia), and $N_{\mathrm{non-linear}}$ is the
1651	+	number of non-linear rotors (i.e. with three non-zero moments of
1652	+	inertia).
1653
1654		In eq.(\ref{eq:nosehooverext}), $\tau_T$ is the time constant for
1655		relaxation of the temperature to the target value. To set values for
#	Line 1687 \| Line 1687 \| factorization of the three rotation operations that wa
1687		Here $\mathrm{rotate}(h * {\bf j}
1688		\overleftrightarrow{\mathsf{I}}^{-1})$ is the same symplectic Trotter
1689		factorization of the three rotation operations that was discussed in
1690	<	the section on the DLM integrator. Note that this operation modifies
1690	>	the section on the {\sc dlm} integrator. Note that this operation modifies
1691		both the rotation matrix $\mathsf{A}$ and the angular momentum ${\bf
1692		j}$. {\tt moveA} propagates velocities by a half time step, and
1693		positional degrees of freedom by a full time step. The new positions
1694		(and orientations) are then used to calculate a new set of forces and
1695		torques in exactly the same way they are calculated in the {\tt
1696	<	doForces} portion of the DLM integrator.
1696	>	doForces} portion of the {\sc dlm} integrator.
1697
1698		Once the forces and torques have been obtained at the new time step,
1699		the temperature, velocities, and the extended system variable can be
#	Line 1798 \| Line 1798 \| where ${\bf r}_{ab}$ is the distance between the cente
1798		where ${\bf r}_{ab}$ is the distance between the centers of mass, and
1799		\begin{equation}
1800		{\bf f}_{ab} = s(r_{ab}) \sum_{i \in a} \sum_{j \in b} {\bf f}_{ij} +
1801	<	s\prime(r_{ab}) \frac{{\bf r}_{ab}}{\|r_{ab}\|} \sum_{i \in a} \sum_{j
1801	>	s^{\prime}(r_{ab}) \frac{{\bf r}_{ab}}{\|r_{ab}\|} \sum_{i \in a} \sum_{j
1802		\in b} V_{ij}({\bf r}_{ij}).
1803		\end{equation}
1804
1805		The instantaneous pressure is then simply obtained from the trace of
1806		the pressure tensor,
1807		\begin{equation}
1808	<	P(t) = \frac{1}{3} \mathrm{Tr} \left( \overleftrightarrow{\mathsf{P}}(t).
1809	<	\right)
1808	>	P(t) = \frac{1}{3} \mathrm{Tr} \left( \overleftrightarrow{\mathsf{P}}(t)
1809	>	\right).
1810		\end{equation}
1811
1812		In eq.(\ref{eq:melchionna1}), $\tau_B$ is the time constant for
#	Line 1816 \| Line 1816 \| integration of the equations of motion is carried out
1816		file. The units for {\tt tauBarostat} are fs, and the units for the
1817		{\tt targetPressure} are atmospheres. Like in the NVT integrator, the
1818		integration of the equations of motion is carried out in a
1819	<	velocity-Verlet style 2 part algorithm with only the following differences:
1819	>	velocity-Verlet style two part algorithm with only the following
1820	>	differences:
1821
1822		{\tt moveA:}
1823		\begin{align*}
#	Line 1849 \| Line 1850 \| the box by
1850		the box by
1851		\begin{equation}
1852		\mathcal{V}(t + h) \leftarrow e^{ - 3 h \eta(t + h /2)} \times
1853	<	\mathcal{V}(t)
1853	>	\mathcal{V}(t).
1854		\end{equation}
1855
1856		The {\tt doForces} step for the NPTi integrator is exactly the same as
1857	<	in both the DLM and NVT integrators. Once the forces and torques have
1857	>	in both the {\sc dlm} and NVT integrators. Once the forces and torques have
1858		been obtained at the new time step, the velocities can be advanced to
1859		the same time value.
1860
#	Line 2078 \| Line 2079 \| the force calculation at each time step.
2079		subtract the total constraint forces from the rest of the system after
2080		the force calculation at each time step.
2081
2082	<	After the force calculation, the total force on molecule $\alpha$,
2082	>	After the force calculation, the total force on molecule $\alpha$ is:
2083		\begin{equation}
2084		G_{\alpha} = \sum_i F_{\alpha i},
2085		\label{oopseEq:zc1}
#	Line 2099 \| Line 2100 \| v_{\alpha i} = v_{\alpha i} -
2100		v_{\alpha i} = v_{\alpha i} -
2101		\frac{\sum_i m_{\alpha i} v_{\alpha i}}{\sum_i m_{\alpha i}},
2102		\end{equation}
2103	<	where $v_{\alpha i}$ is the velocity of atom $i$ in the z direction.
2103	>	where $v_{\alpha i}$ is the velocity of atom $i$ in the $z$ direction.
2104		Lastly, all of the accumulated constraint forces must be subtracted
2105		from the rest of the unconstrained system to keep the system center of
2106		mass of the entire system from drifting.
#	Line 2137 \| Line 2138 \| The user may also specify the use of a constant veloci
2138
2139		The user may also specify the use of a constant velocity method
2140		(steered molecular dynamics) to move the molecules to their desired
2141	<	initial positions.
2141	>	initial positions. Based on concepts from atomic force microscopy,
2142	>	{\sc smd} has been used to study many processes which occur via rare
2143	>	events on the time scale of a few hundreds of picoseconds. For
2144	>	example,{\sc smd} has been used to observe the dissociation of
2145	>	Streptavidin-biotin Complex.\cite{smd}
2146
2147		To use of the $z$-constraint method in an {\sc oopse} simulation, the
2148		molecules must be specified using the {\tt nZconstraints} keyword in
#	Line 2145 \| Line 2150 \| of the $z$-constraint method are listed in table~\ref{
2150		of the $z$-constraint method are listed in table~\ref{table:zconParams}.
2151
2152		\begin{table}
2153	<	\caption{The Global Keywords: Z-Constraint Parameters}
2153	>	\caption{Meta-data Keywords: Z-Constraint Parameters}
2154		\label{table:zconParams}
2155		\begin{center}
2156		% Note when adding or removing columns, the \hsize numbers must add up to the total number
#	Line 2158 \| Line 2163 \| of the $z$-constraint method are listed in table~\ref{
2163
2164		{\bf keyword} & {\bf units} & {\bf use} & {\bf remarks} \\ \hline
2165
2166	<	{\tt nZconstraints} & integer & The number of zconstraint molecules& If using zconstraint method, {\tt nZconstraints} must be set \\
2167	<	{\tt zconsTime} & fs & Sets the frequency at which the {\tt .fz} file is written & \\
2168	<	{\tt zconsForcePolicy} & string & The strategy of subtracting
2169	<	zconstraint force from the unconstrained molecules & Possible
2170	<	strategies are {\tt BYMASS} and {\tt BYNUMBER}. Default
2171	<	strategy is set to {\tt BYMASS}\\
2172	<	{\tt zconsGap} & $\mbox{\AA}$ & Set the distance between two adjacent
2173	<	constraint positions& Used mainly in moving molecules through a simulation \\
2174	<	{\tt zconsFixtime} & fs & Sets how long the zconstraint molecule is
2175	<	fixed & {\tt zconsFixtime} must be set if {\tt zconsGap} is set\\
2176	<	{\tt zconsUsingSMD} &logical & Flag for using Steered Molecular
2177	<	Dynamics or Harmonic Forces to move the molecule & Harmonic Forces are
2178	<	used by default\\
2166	>	{\tt nZconstraints} & integer & The number of $z$-constrained
2167	>	molecules & If using the $z$-constraint method, {\tt nZconstraints}
2168	>	must be set \\
2169	>	{\tt zconsTime} & fs & Sets the frequency at which the {\tt .fz} file
2170	>	is written & \\
2171	>	{\tt zconsForcePolicy} & string & The strategy for subtracting
2172	>	the $z$-constraint force from the {\it unconstrained} molecules & Possible
2173	>	strategies are {\tt BYMASS} and {\tt BYNUMBER}. The default
2174	>	strategy is {\tt BYMASS}\\
2175	>	{\tt zconsGap} & $\mbox{\AA}$ & Sets the distance between two adjacent
2176	>	constraint positions&Used mainly to move molecules through a
2177	>	simulation to estimate potentials of mean force. \\
2178	>	{\tt zconsFixtime} & fs & Sets the length of time the $z$-constraint
2179	>	molecule is held fixed & {\tt zconsFixtime} must be set if {\tt
2180	>	zconsGap} is set\\
2181	>	{\tt zconsUsingSMD} & logical & Flag for using Steered Molecular
2182	>	Dynamics to move the molecules to the correct constrained positions &
2183	>	Harmonic Forces are used by default\\
2184
2185		\end{tabularx}
2186		\end{center}
#	Line 2187 \| Line 2197 \| gradient ({\sc cg}) to help users find reasonable loca
2197		stable fixed points on the surface. We have included two simple
2198		minimization algorithms: steepest descent, ({\sc sd}) and conjugate
2199		gradient ({\sc cg}) to help users find reasonable local minima from
2200	<	their initial configurations.
2200	>	their initial configurations. Since {\sc oopse} handles atoms and
2201	>	rigid bodies which have orientational coordinates as well as
2202	>	translational coordinates, there is some subtlety to the choice of
2203	>	parameters for minimization algorithms.
2204
2192	–	Since {\sc oopse} handles atoms and rigid bodies which have
2193	–	orientational coordinates as well as translational coordinates, there
2194	–	is some subtlety to the choice of parameters for minimization
2195	–	algorithms.
2196	–
2205		Given a coordinate set $x_{k}$ and a search direction $d_{k}$, a line
2206		search algorithm is performed along $d_{k}$ to produce
2207	<	$x_{k+1}=x_{k}+$ $\lambda _{k}d_{k}$.
2208	<
2201	<	In the steepest descent ({\sc sd}) algorithm,%
2207	>	$x_{k+1}=x_{k}+$ $\lambda _{k}d_{k}$. In the steepest descent ({\sc
2208	>	sd}) algorithm,%
2209		\begin{equation}
2210	<	d_{k}=-\nabla V(x_{k})
2210	>	d_{k}=-\nabla V(x_{k}).
2211		\end{equation}
2212		The gradient and the direction of next step are always orthogonal.
2213		This may cause oscillatory behavior in narrow valleys. To overcome
2214		this problem, the Fletcher-Reeves variant~\cite{FletcherReeves} of the
2215		conjugate gradient ({\sc cg}) algorithm is used to generate $d_{k+1}$
2216		via simple recursion:
2217	<	\begin{align}
2218	<	d_{k+1} & =-\nabla V(x_{k+1})+\gamma_{k}d_{k}\\
2219	<	\gamma_{k} & =\frac{\nabla V(x_{k+1})^{T}\nabla V(x_{k+1})}{\nabla
2220	<	V(x_{k})^{T}\nabla V(x_{k})}%
2221	<	\end{align}
2217	>	\begin{equation}
2218	>	d_{k+1} =-\nabla V(x_{k+1})+\gamma_{k}d_{k}
2219	>	\end{equation}
2220	>	where
2221	>	\begin{equation}
2222	>	\gamma_{k} =\frac{\nabla V(x_{k+1})^{T}\nabla V(x_{k+1})}{\nabla
2223	>	V(x_{k})^{T}\nabla V(x_{k})}.
2224	>	\end{equation}
2225
2226		The Polak-Ribiere variant~\cite{PolakRibiere} of the conjugate
2227		gradient ($\gamma_{k}$) is defined as%
#	Line 2219 \| Line 2229 \| V(x_{k})^{T}\nabla V(x_{k})}%
2229		\gamma_{k}=\frac{[\nabla V(x_{k+1})-\nabla V(x)]^{T}\nabla V(x_{k+1})}{\nabla
2230		V(x_{k})^{T}\nabla V(x_{k})}%
2231		\end{equation}
2232	+	It is widely agreed that the Polak-Ribiere variant gives better
2233	+	convergence than the Fletcher-Reeves variant, so the conjugate
2234	+	gradient approach implemented in {\sc oopse} is the Polak-Ribiere
2235	+	variant.
2236
2237		The conjugate gradient method assumes that the conformation is close
2238		enough to a local minimum that the potential energy surface is very
#	Line 2232 \| Line 2246 \| minimizer are given in Table~\ref{table:minimizeParams
2246		minimizer are given in Table~\ref{table:minimizeParams}
2247
2248		\begin{table}
2249	<	\caption{The Global Keywords: Energy Minimizer Parameters}
2249	>	\caption{Meta-data Keywords: Energy Minimizer Parameters}
2250		\label{table:minimizeParams}
2251		\begin{center}
2252		% Note when adding or removing columns, the \hsize numbers must add up to the total number
#	Line 2248 \| Line 2262 \| descent) \\
2262		{\tt minimizer} & string & selects the minimization method to be used
2263		& either {\tt CG} (conjugate gradient) or {\tt SD} (steepest
2264		descent) \\
2265	<	{\tt minimizerMaxIter} & steps & Sets the maximum iteration number in the energy minimization & Default value is 200\\
2266	<	{\tt minimizerWriteFreq} & steps & Sets the frequency at which the {\tt .dump} and {\tt .stat} files are writtern during energy minimization & \\
2267	<	{\tt minimizerStepSize} & $\mbox{\AA}$ & Set the step size of line search & Default value is 0.01\\
2268	<	{\tt minimizerFTol} & $\mbox{kcal mol}^{-1}$ & Sets energy tolerance & Default value is $10^{-8}$\\
2269	<	{\tt minimizerGTol} & $\mbox{kcal mol}^{-1}\mbox{\AA}^{-1}$ & Sets gradient tolerance & Default value is $10^{-8}$\\
2270	<	{\tt minimizerLSTol} & $\mbox{kcal mol}^{-1}$ & Sets line search tolerance & Default value is $10^{-8}$\\
2271	<	{\tt minimizerLSMaxIter} & steps & Sets the maximum iteration of line searching & Default value is 50\\
2265	>	{\tt minimizerMaxIter} & steps & Sets the maximum number of iterations
2266	>	for the energy minimization & The default value is 200\\
2267	>	{\tt minimizerWriteFreq} & steps & Sets the frequency with which the {\tt .dump} and {\tt .stat} files are writtern during energy minimization & \\
2268	>	{\tt minimizerStepSize} & $\mbox{\AA}$ & Sets the step size for the
2269	>	line search & The default value is 0.01\\
2270	>	{\tt minimizerFTol} & $\mbox{kcal mol}^{-1}$ & Sets the energy tolerance
2271	>	for stopping the minimziation. & The default value is $10^{-8}$\\
2272	>	{\tt minimizerGTol} & $\mbox{kcal mol}^{-1}\mbox{\AA}^{-1}$ & Sets the
2273	>	gradient tolerance for stopping the minimization. & The default value
2274	>	is $10^{-8}$\\
2275	>	{\tt minimizerLSTol} & $\mbox{kcal mol}^{-1}$ & Sets line search
2276	>	tolerance for terminating each step of the minimization. & The default
2277	>	value is $10^{-8}$\\
2278	>	{\tt minimizerLSMaxIter} & steps & Sets the maximum number of
2279	>	iterations for each line search & The default value is 50\\
2280
2281		\end{tabularx}
2282		\end{center}
#	Line 2263 \| Line 2285 \| Although processor power is continually improving, it
2285		\section{\label{oopseSec:parallelization} Parallel Simulation Implementation}
2286
2287		Although processor power is continually improving, it is still
2288	<	unreasonable to simulate systems of more then a 1000 atoms on a single
2288	>	unreasonable to simulate systems of more than 10,000 atoms on a single
2289		processor. To facilitate study of larger system sizes or smaller
2290		systems for longer time scales, parallel methods were developed to
2291		allow multiple CPU's to share the simulation workload. Three general
#	Line 2274 \| Line 2296 \| the duration of the simulation. Computational cost sca
2296		Algorithmically simplest of the three methods is atomic decomposition,
2297		where $N$ particles in a simulation are split among $P$ processors for
2298		the duration of the simulation. Computational cost scales as an
2299	<	optimal $\mathcal{O}(N/P)$ for atomic decomposition. Unfortunately all
2299	>	optimal $\mathcal{O}(N/P)$ for atomic decomposition. Unfortunately, all
2300		processors must communicate positions and forces with all other
2301		processors at every force evaluation, leading the communication costs
2302		to scale as an unfavorable $\mathcal{O}(N)$, \emph{independent of the
#	Line 2296 \| Line 2318 \| The parallelization method used in {\sc oopse} is the
2318		three dimensions.
2319
2320		The parallelization method used in {\sc oopse} is the force
2321	<	decomposition method. Force decomposition assigns particles to
2322	<	processors based on a block decomposition of the force
2321	>	decomposition method.\cite{hendrickson:95} Force decomposition assigns
2322	>	particles to processors based on a block decomposition of the force
2323		matrix. Processors are split into an optimally square grid forming row
2324		and column processor groups. Forces are calculated on particles in a
2325		given row by particles located in that processor's column
2326	<	assignment. Force decomposition is less complex to implement than the
2327	<	spatial method but still scales computationally as $\mathcal{O}(N/P)$
2328	<	and scales as $\mathcal{O}(N/\sqrt{P})$ in communication
2329	<	cost. Plimpton has also found that force decompositions scale more
2330	<	favorably than spatial decompositions for systems up to 10,000 atoms
2331	<	and favorably compete with spatial methods up to 100,000
2332	<	atoms.\cite{plimpton95}
2333	<
2326	>	assignment. One deviation from the algorithm described by Hendrickson
2327	>	{\it et al.} is the use of column ordering based on the row indexes
2328	>	preventing the need for a transpose operation necessitating a second
2329	>	communication step when gathering the final force components. Force
2330	>	decomposition is less complex to implement than the spatial method but
2331	>	still scales computationally as $\mathcal{O}(N/P)$ and scales as
2332	>	$\mathcal{O}(N/\sqrt{P})$ in communication cost. Plimpton has also
2333	>	found that force decompositions scale more favorably than spatial
2334	>	decompositions for systems up to 10,000 atoms and favorably compete
2335	>	with spatial methods up to 100,000 atoms.\cite{plimpton95}
2336	>
2337		\section{\label{oopseSec:conclusion}Conclusion}
2338
2339	<	We have presented a new open source parallel simulation program {\sc
2339	>	We have presented a new parallel simulation program called {\sc
2340		oopse}. This program offers some novel capabilities, but mostly makes
2341		available a library of modern object-oriented code for the scientific
2342		community to use freely. Notably, {\sc oopse} can handle symplectic
#	Line 2340 \| Line 2365 \| DMR-0079647.
2365		the Notre Dame Bunch-of-Boxes (B.o.B) computer cluster under NSF grant
2366		DMR-0079647.
2367
2368	<	\bibliographystyle{achemso}
2368	>	\bibliographystyle{jcc}
2369		\bibliography{oopsePaper}
2370
2371		\end{document}

Diff Legend

-–
+Removed lines
-+
+Added lines
-<
+Changed lines
->
+Changed lines

Comparing trunk/oopsePaper/oopsePaper.tex (file contents): Revision 1434 by chrisfen, Thu Jul 29 18:01:05 2004 UTC vs. Revision 1439 by gezelter, Thu Jul 29 20:06:07 2004 UTC

Diff Legend

Comparing trunk/oopsePaper/oopsePaper.tex (file contents):
Revision 1434 by chrisfen, Thu Jul 29 18:01:05 2004 UTC vs.
Revision 1439 by gezelter, Thu Jul 29 20:06:07 2004 UTC