[ViewVC] Diff of: group/trunk/oopsePaper/oopsePaper.tex

Comparing trunk/oopsePaper/oopsePaper.tex (file contents):
Revision 1428 by gezelter, Wed Jul 28 19:46:08 2004 UTC vs.
Revision 3389 by gezelter, Tue Apr 29 17:50:49 2008 UTC

#	Line 1 \| Line 1
1		\documentclass[11pt]{article}
2		\usepackage{amsmath}
3	<	\usepackage{amssymb}
3	>	\usepackage{amssymb}
4		\usepackage{endfloat}
5		\usepackage{listings}
6		\usepackage{berkeley}
#	Line 22 \| Line 22
22		xleftmargin=0.5in, xrightmargin=0.5in,captionpos=b, %
23		abovecaptionskip=0.5cm, belowcaptionskip=0.5cm}
24		\renewcommand{\lstlistingname}{Scheme}
25	<	\title{{\sc oopse}: An Open Source Object-Oriented Parallel Simulation
25	>	\title{{\sc oopse}: An Object-Oriented Parallel Simulation
26		Engine for Molecular Dynamics}
27
28		\author{Matthew A. Meineke, Charles F. Vardeman II, Teng Lin,\\
#	Line 31 \| Line 31 \| Notre Dame, Indiana 46556}
31		University of Notre Dame\\
32		Notre Dame, Indiana 46556}
33
34	<	\date{\today}
34	>	\date{September 20, 2004}
35		\maketitle
36
37		\begin{abstract}
#	Line 43 \| Line 43 \| meta-data language. A number of advanced integrators
43		simulations are carried out using the force-based decomposition
44		method. Simulations are specified using a very simple C-based
45		meta-data language. A number of advanced integrators are included,
46	<	and the base integrator for orientational dynamics provides
47	<	substantial improvements over older quaternion-base schemes. All
48	<	source code is available under a very permissive (BSD-style) Open
49	<	Source license.
46	>	and the basic integrator for orientational dynamics provides
47	>	substantial improvements over older quaternion-based schemes.
48		\end{abstract}
49
50		\section{\label{sec:intro}Introduction}
51
52		There are a number of excellent molecular dynamics packages available
53		to the chemical physics
54	<	community.\cite{Brooks83,MacKerell98,pearlman:1995,Gromacs,Gromacs3,DL_POLY,Tinker,Paradyn}
54	>	community.\cite{Brooks83,MacKerell98,pearlman:1995,Gromacs,Gromacs3,DL_POLY,Tinker,Paradyn,namd,macromodel}
55		All of these packages are stable, polished programs which solve many
56		problems of interest. Most are now capable of performing molecular
57		dynamics simulations on parallel computers. Some have source code
#	Line 64 \| Line 62 \| has taken us now involves the use of atoms with orient
62		programs referenced can handle transition metal force fields like the
63		Embedded Atom Method ({\sc eam}). The direction our research program
64		has taken us now involves the use of atoms with orientational degrees
65	<	of freedom and transition metals. Since these simulation methods may
66	<	be of some use to other researchers, we have decided to release our
67	<	program to the scientific community with a permissive open source
68	<	license.
65	>	of freedom as well as transition metals. Since these simulation
66	>	methods may be of some use to other researchers, we have decided to
67	>	release our program (and all related source code) to the scientific
68	>	community.
69
70		This paper communicates the algorithmic details of our program, which
71	<	we have been calling the Open source Object-oriented Parallel
72	<	Simulation Engine (i.e. {\sc oopse}). We have structured this paper
73	<	to first discuss the underlying concepts in this simulation package
71	>	we have been calling the Object-Oriented Parallel Simulation Engine
72	>	(i.e. {\sc oopse}). We have structured this paper to first discuss
73	>	the underlying concepts in this simulation package
74		(Sec. \ref{oopseSec:IOfiles}). The empirical energy functions
75		implemented are discussed in Sec.~\ref{oopseSec:empiricalEnergy}.
76		Sec.~\ref{oopseSec:mechanics} describes the various Molecular Dynamics
#	Line 283 \| Line 281 \| scheme is often used. The elements of $\mathsf{A}$ can
281		$\psi$.\cite{Goldstein01} In order to avoid numerical instabilities
282		inherent in using the Euler angles, the four parameter ``quaternion''
283		scheme is often used. The elements of $\mathsf{A}$ can be expressed as
284	<	arithmetic operations involving the four quaternions ($q_0, q_1, q_2,$
285	<	and $q_3$).\cite{allen87:csl} Use of quaternions also leads to
284	>	arithmetic operations involving the four quaternions ($q_w, q_x, q_y,$
285	>	and $q_z$).\cite{allen87:csl} Use of quaternions also leads to
286		performance enhancements, particularly for very small
287		systems.\cite{Evans77}
288
#	Line 391 \| Line 389 \| mol}^{-1}$, distances are in $\mbox{\AA}$, times are i
389		It is important to note the fundamental units in all files which are
390		read and written by {\sc oopse}. Energies are in $\mbox{kcal
391		mol}^{-1}$, distances are in $\mbox{\AA}$, times are in $\mbox{fs}$,
392	<	translational velocities are in $\mbox{\AA fs}^{-1}$, and masses are
392	>	translational velocities are in $\mbox{\AA~fs}^{-1}$, and masses are
393		in $\mbox{amu}$. Orientational degrees of freedom are described using
394		quaternions (unitless, but $q_w^2 + q_x^2 + q_y^2 + q_z^2 = 1$),
395		body-fixed angular momenta ($\mbox{amu \AA}^{2} \mbox{radians
#	Line 411 \| Line 409 \| fs}^{-1}$), and body-fixed moments of inertia ($\mbox{
409
410		{\bf keyword} & {\bf units} & {\bf use} & {\bf remarks} \\ \hline
411
412	<	{\tt forceField} & string & Sets the force field. & Possible force fields are "DUFF", "LJ", and "EAM". \\
412	>	{\tt forceField} & string & Sets the force field. & Possible force fields are DUFF, LJ, and EAM. \\
413		{\tt nComponents} & integer & Sets the number of components. & Needs to appear before the first {\tt Component} block. \\
414		{\tt initialConfig} & string & Sets the file containing the initial configuration. & Can point to any file containing the configuration in the correct order. \\
415		{\tt minimizer}& string & Chooses a minimizer & Possible minimizers
416	<	are "SD" and "CG". Either {\tt ensemble} or {\tt minimizer} must be specified. \\
416	>	are SD and CG. Either {\tt ensemble} or {\tt minimizer} must be specified. \\
417		{\tt ensemble} & string & Sets the ensemble. & Possible ensembles are
418	<	"NVE", "NVT", "NPTi", "NPTf", and "NPTxyz". Either {\tt ensemble}
418	>	NVE, NVT, NPTi, NPTf, and NPTxyz. Either {\tt ensemble}
419		or {\tt minimizer} must be specified. \\
420		{\tt dt} & fs & Sets the time step. & Selection of {\tt dt} should be
421	<	small enough to sample the fastest motion of the simulation. (required
422	<	for molecular dynamics simulations)\\
421	>	small enough to sample the fastest motion of the simulation. ({\tt
422	>	dt} is required for molecular dynamics simulations)\\
423		{\tt runTime} & fs & Sets the time at which the simulation should
424		end. & This is an absolute time, and will end the simulation when the
425	<	current time meets or exceeds the {\tt runTime}. (required for
426	<	molecular dynamics simulations)\\
425	>	current time meets or exceeds the {\tt runTime}. ({\tt runTime} is
426	>	required for molecular dynamics simulations)\\
427
428		\end{tabularx}
429		\end{center}
#	Line 450 \| Line 448 \| extension. \\
448		the root name of the initial meta-data file but with an {\tt .eor}
449		extension. \\
450		{\tt useInitialTime} & logical & Sets whether the initial time is taken from the {\tt .in} file. & Useful when recovering a simulation from a crashed processor. Default is false. \\
451	<	{\tt sampleTime} & fs & Sets the frequency at which the {\tt .dump} file is written. & Default sets the frequency to the {\tt runTime}. \\
452	<	{\tt statusTime} & fs & Sets the frequency at which the {\tt .stat} file is written. & Defaults set the frequency to the {\tt sampleTime}. \\
453	<	{\tt cutoffRadius} & $\mbox{\AA}$ & Manually sets the cutoffRadius & Defaults to
454	<	$15\mbox{\AA}$ for systems containing charges or dipoles or to $2.5
455	<	\sigma_{L}$, where $\sigma_{L}$ is the largest LJ $\sigma$ in the
451	>	{\tt sampleTime} & fs & Sets the frequency at which the {\tt .dump}
452	>	file is written. & The default is equal to the {\tt runTime}. \\
453	>	{\tt statusTime} & fs & Sets the frequency at which the {\tt .stat}
454	>	file is written. & The default is equal to the {\tt sampleTime}. \\
455	>	{\tt cutoffRadius} & $\mbox{\AA}$ & Manually sets the cutoff radius &
456	>	Defaults to $15\mbox{\AA}$ for systems containing charges or dipoles or to $2.5
457	>	\sigma_{L}$, where $\sigma_{L}$ is the largest LJ $\sigma$ in the
458		simulation. \\
459		{\tt switchingRadius} & $\mbox{\AA}$ & Manually sets the inner radius for the switching function. & Defaults to 95~\% of the {\tt cutoffRadius}. \\
460	<	{\tt useReactionField} & logical & Turns the reaction field correction on/off. & Default is "false". \\
460	>	{\tt useReactionField} & logical & Turns the reaction field correction on/off. & Default is false. \\
461		{\tt dielectric} & unitless & Sets the dielectric constant for reaction field. & If {\tt useReactionField} is true, then {\tt dielectric} must be set. \\
462		{\tt usePeriodicBoundaryConditions} & & & \\
463	<	& logical & Turns periodic boundary conditions on/off. & Default is "true". \\
463	>	& logical & Turns periodic boundary conditions on/off. & Default is true. \\
464		{\tt seed } & integer & Sets the seed value for the random number
465		generator. & The seed needs to be at least 9 digits long. The default
466		is to take the seed from the CPU clock. \\
467		{\tt forceFieldVariant} & string & Sets the name of the variant of the
468	<	force field. ({\sc eam} has three variants: {\tt u3}, {\tt u6}, and
468	>	force field. & {\sc eam} has three variants: {\tt u3}, {\tt u6}, and
469		{\tt VC}.
470
471		\end{tabularx}
#	Line 565 \| Line 565 \| declared in the molecule's declaration within the mode
565		\item The ordering of the atoms for each molecule follows the order
566		declared in the molecule's declaration within the model file.
567		\item Only atoms which are not members of a {\tt rigidBody} are
568	<	included
568	>	included.
569		\item Rigid Body coordinates for a molecule are listed immediately
570		after the the other atoms in a molecule. Some molecules may be
571		entirely rigid, in which case, only the rigid body coordinates are
#	Line 667 \| Line 667 \| V = V_{\mathrm{short-range}} + V_{\mathrm{long-range}}
667		\begin{equation}
668		V = V_{\mathrm{short-range}} + V_{\mathrm{long-range}}.
669		\end{equation}
670	<	The short-ranged portion includes explicit bonds, bends and torsions,
671	<	which have been defined in the meta-data file for the molecules which
672	<	present in the simulation. The functional forms and parameters for
673	<	these interactions are defined by the force field which is chosen.
670	>	The short-ranged portion includes the explicit bonds, bends, and
671	>	torsions which have been defined in the meta-data file for the
672	>	molecules which are present in the simulation. The functional forms and
673	>	parameters for these interactions are defined by the force field which
674	>	is chosen.
675
676	<	Calculating long-range (non-bonded) potential involves a sum over all
677	<	pairs of atoms (except for those atoms which are involved in a bond,
678	<	bend, or torsion with each other). If done poorly, calculating the
679	<	the long-range interactions for $N$ atoms would involve $N^2$
680	<	evaluations of atomic distance. To reduce the number of distance
676	>	Calculating the long-range (non-bonded) potential involves a sum over
677	>	all pairs of atoms (except for those atoms which are involved in a
678	>	bond, bend, or torsion with each other). If done poorly, calculating
679	>	the the long-range interactions for $N$ atoms would involve $N(N-1)/2$
680	>	evaluations of atomic distances. To reduce the number of distance
681		evaluations between pairs of atoms, {\sc oopse} uses a switched cutoff
682		with Verlet neighbor lists.\cite{allen87:csl} It is well known that
683		neutral groups which contain charges will exhibit pathological forces
#	Line 691 \| Line 692 \| cutoff groups ($a$ and $b$).
692		where $r_{ab}$ is the distance between the centers of mass of the two
693		cutoff groups ($a$ and $b$).
694
695	<	The sums over $a$ and $b$ are over the cutoffGroups that are present
695	>	The sums over $a$ and $b$ are over the cutoff groups that are present
696		in the simulation. Atoms which are not explicitly defined as members
697		of a {\tt cutoffGroup} are treated as a group consisting of only one
698		atom. The switching function, $s(r)$ is the standard cubic switching
#	Line 717 \| Line 718 \| present in the simulation. In simulations containing
718		atoms, the cutoff radius has a default value of $2.5\sigma_{ii}$,
719		where $\sigma_{ii}$ is the largest Lennard-Jones length parameter
720		present in the simulation. In simulations containing charged or
721	<	dipolar atoms, the default cutoff Radius is $15 \mbox{\AA}$.
721	>	dipolar atoms, the default cutoff radius is $15 \mbox{\AA}$.
722
723		The {\tt switchingRadius} is set to a default value of 95\% of the
724		{\tt cutoffRadius}. In the special case of a simulation containing
#	Line 726 \| Line 727 \| Both radii may be specified in the meta-data file.
727		potential to remove discontinuities in the potential at the cutoff.
728		Both radii may be specified in the meta-data file.
729
730	<	Force fields can easily be added to {\sc oopse}, although it comes
731	<	with a few simple examples (Lennard-Jones, {\sc duff}, {\sc water},
732	<	and {\sc eam}) which are explained in the following sections.
730	>	Force fields can be added to {\sc oopse}, although it comes with a few
731	>	simple examples (Lennard-Jones, {\sc duff}, {\sc water}, and {\sc
732	>	eam}) which are explained in the following sections.
733
734		\subsection{\label{sec:LJPot}The Lennard Jones Force Field}
735
#	Line 1076 \| Line 1077 \| models can be found in reference~\citen{fennell04}.
1077		force field file ({\tt DUFF.frc}). A table of the parameter values
1078		and the drawbacks and benefits of the different density corrected SSD
1079		models can be found in reference~\citen{fennell04}.
1080	+
1081	+	\subsection{\label{oopseSec:WATER}The {\sc water} Force Field}
1082	+
1083	+	In addition to the {\sc duff} force field's solvent description, a
1084	+	separate {\sc water} force field has been included for simulating most
1085	+	of the common rigid-body water models. This force field includes the
1086	+	simple and point-dipolar models (SSD, SSD1, SSD/E, SSD/RF, and DPD
1087	+	water), as well as the common charge-based models (SPC, SPC/E, TIP3P,
1088	+	TIP4P, and
1089	+	TIP5P).\cite{liu96:new_model,Ichiye03,fennell04,Marrink01,Berendsen81,Berendsen87,Jorgensen83,Mahoney00}
1090	+	In order to handle these models, charge-charge interactions were
1091	+	included in the force-loop:
1092	+	\begin{equation}
1093	+	V_{\text{charge}}(r_{ij}) = \sum_{ij}\frac{q_iq_je^2}{r_{ij}},
1094	+	\end{equation}
1095	+	where $q$ represents the charge on particle $i$ or $j$, and $e$ is the
1096	+	charge of an electron in Coulombs. As with the other pair
1097	+	interactions, charges can be simulated with a pure cutoff or a
1098	+	reaction field. The {\sc water} force field can be easily expanded
1099	+	through modification of the {\sc water} force field file ({\tt
1100	+	WATER.frc}). By adding atom types and inserting the appropriate
1101	+	parameters, it is possible to extend the force field to handle rigid
1102	+	molecules other than water.
1103
1104		\subsection{\label{oopseSec:eam}Embedded Atom Method}
1105
#	Line 1136 \| Line 1160 \| atoms.
1160		$i$ is simply the linear sum of density contributions of all the other
1161		atoms.
1162
1163	<	The {\sc eam} force field illustrates an additional feature of {\sc
1164	<	oopse}. Foiles, Baskes and Daw fit {\sc eam} potentials for Cu, Ag,
1165	<	Au, Ni, Pd, Pt and alloys of these metals.\cite{FBD86} These fits are
1166	<	included in {\sc oopse} as the {\tt u3} variant of the {\sc eam} force
1167	<	field. Voter and Chen reparamaterized a set of {\sc eam} functions
1168	<	which do a better job of predicting melting points.\cite{Voter:87}
1169	<	These functions are included in {\sc oopse} as the {\tt VC} variant of
1170	<	the {\sc eam} force field. An additional set of functions (the
1163	>	The {\sc eam} force field has a number of variants in the literature.
1164	>	Foiles, Baskes and Daw fit {\sc eam} potentials for Cu, Ag, Au, Ni,
1165	>	Pd, Pt and alloys of these metals.\cite{FBD86} These fits are included
1166	>	in {\sc oopse} as the {\tt u3} variant of the {\sc eam} force field.
1167	>	Voter and Chen reparamaterized a set of {\sc eam} functions which do a
1168	>	better job of predicting melting points.\cite{Voter:87} These
1169	>	functions are included in {\sc oopse} as the {\tt VC} variant of the
1170	>	{\sc eam} force field. An additional set of functions (the
1171		``Universal 6'' functions) are included in {\sc oopse} as the {\tt u6}
1172	<	variant of {\sc eam}. For example, to specify the Voter-Chen variant
1173	<	of the {\sc eam} force field, the user would add the {\tt
1174	<	forceFieldVariant = "VC";} line to the meta-data file.
1172	>	variant of {\sc eam}. In general, to specify one variant of a force
1173	>	field, the user would need a single line in the meta-data file to
1174	>	select the {\tt forceFieldVariant}. For example, to specify the
1175	>	Voter-Chen variant of the {\sc eam} force field, the user would add
1176	>	the {\tt forceFieldVariant = "VC";} line to the meta-data file.
1177
1178		The potential files used by the {\sc eam} force field are in the
1179		standard {\tt funcfl} format, which is the format utilized by a number
#	Line 1223 \| Line 1249 \| the inter-atomic forces.
1249		\section{\label{oopseSec:mechanics}Mechanics}
1250
1251		\subsection{\label{oopseSec:integrate}Integrating the Equations of Motion: the
1252	<	DLM method}
1252	>	{\sc dlm} method}
1253
1254		The default method for integrating the equations of motion in {\sc
1255		oopse} is a velocity-Verlet version of the symplectic splitting method
1256		proposed by Dullweber, Leimkuhler and McLachlan
1257	<	(DLM).\cite{Dullweber1997} When there are no directional atoms or
1257	>	({\sc dlm}).\cite{Dullweber1997} When there are no directional atoms or
1258		rigid bodies present in the simulation, this integrator becomes the
1259		standard velocity-Verlet integrator which is known to sample the
1260		microcanonical (NVE) ensemble.\cite{Frenkel1996}
1261
1262		Previous integration methods for orientational motion have problems
1263	<	that are avoided in the DLM method. Direct propagation of the Euler
1263	>	that are avoided in the {\sc dlm} method. Direct propagation of the Euler
1264		angles has a known $1/\sin\theta$ divergence in the equations of
1265		motion for $\phi$ and $\psi$,\cite{allen87:csl} leading to numerical
1266		instabilities any time one of the directional atoms or rigid bodies
#	Line 1342 \| Line 1368 \| of the particle in the space-fixed frame.
1368		Here $\hat{\bf u}$ is a unit vector pointing along the principal axis
1369		of the particle in the space-fixed frame.
1370
1371	<	The DLM method uses a Trotter factorization of the orientational
1371	>	The {\sc dlm} method uses a Trotter factorization of the orientational
1372		propagator. This has three effects:
1373		\begin{enumerate}
1374		\item the integrator is area-preserving in phase space (i.e. it is
#	Line 1436 \| Line 1462 \| advanced to the same time value.
1462		+ \frac{h}{2} {\bf \tau}^b(t + h) .
1463		\end{align*}
1464
1465	<	The matrix rotations used in the DLM method end up being more costly
1466	<	computationally than the simpler arithmetic quaternion
1467	<	propagation. With the same time step, a 1000-molecule water simulation
1468	<	shows an average 7\% increase in computation time using the DLM method
1469	<	in place of quaternions. This cost is more than justified when
1470	<	comparing the energy conservation of the two methods as illustrated in
1471	<	Fig.~\ref{timestep}.
1465	>	The matrix rotations used in the {\sc dlm} method end up being more
1466	>	costly computationally than the simpler arithmetic quaternion
1467	>	propagation. With the same time step, a 1024-molecule water simulation
1468	>	incurs an average 12\% increase in computation time using the {\sc
1469	>	dlm} method in place of quaternions. This cost is more than justified
1470	>	when comparing the energy conservation achieved by the two
1471	>	methods. Figure ~\ref{quatdlm} provides a comparative analysis of the
1472	>	{\sc dlm} method versus the traditional quaternion scheme.
1473
1474		\begin{figure}
1475		\centering
1476	<	\includegraphics[width=\linewidth]{timeStep.eps}
1477	<	\caption[Energy conservation for quaternion versus DLM dynamics]{Energy conservation using quaternion based integration versus
1478	<	the method proposed by Dullweber \emph{et al.} with increasing time
1479	<	step. For each time step, the dotted line is total energy using the
1480	<	DLM integrator, and the solid line comes from the quaternion
1481	<	integrator. The larger time step plots are shifted up from the true
1482	<	energy baseline for clarity.}
1483	<	\label{timestep}
1476	>	\includegraphics[width=\linewidth]{quatvsdlm.eps}
1477	>	\caption[Energy conservation analysis of the {\sc dlm} and quaternion
1478	>	integration methods]{Analysis of the energy conservation of the {\sc
1479	>	dlm} and quaternion integration methods. $\delta \mathrm{E}_1$ is the
1480	>	linear drift in energy over time and $\delta \mathrm{E}_0$ is the
1481	>	standard deviation of energy fluctuations around this drift. All
1482	>	simulations were of a 1024-molecule simulation of SSD water at 298 K
1483	>	starting from the same initial configuration. Note that the {\sc dlm}
1484	>	method provides more than an order of magnitude improvement in both
1485	>	the energy drift and the size of the energy fluctuations when compared
1486	>	with the quaternion method at any given time step. At time steps
1487	>	larger than 4 fs, the quaternion scheme resulted in rapidly rising
1488	>	energies which eventually lead to simulation failure. Using the {\sc
1489	>	dlm} method, time steps up to 8 fs can be taken before this behavior
1490	>	is evident.}
1491	>	\label{quatdlm}
1492		\end{figure}
1493
1494	<	In Fig.~\ref{timestep}, the resulting energy drift at various time
1495	<	steps for both the DLM and quaternion integration schemes is
1496	<	compared. All of the 1000 molecule water simulations started with the
1497	<	same configuration, and the only difference was the method for
1498	<	handling rotational motion. At time steps of 0.1 and 0.5 fs, both
1499	<	methods for propagating molecule rotation conserve energy fairly well,
1500	<	with the quaternion method showing a slight energy drift over time in
1501	<	the 0.5 fs time step simulation. At time steps of 1 and 2 fs, the
1502	<	energy conservation benefits of the DLM method are clearly
1503	<	demonstrated. Thus, while maintaining the same degree of energy
1504	<	conservation, one can take considerably longer time steps, leading to
1470	<	an overall reduction in computation time.
1494	>	In Fig.~\ref{quatdlm}, $\delta \mbox{E}_1$ is a measure of the linear
1495	>	energy drift in units of $\mbox{kcal mol}^{-1}$ per particle over a
1496	>	nanosecond of simulation time, and $\delta \mbox{E}_0$ is the standard
1497	>	deviation of the energy fluctuations in units of $\mbox{kcal
1498	>	mol}^{-1}$ per particle. In the top plot, it is apparent that the
1499	>	energy drift is reduced by a significant amount (2 to 3 orders of
1500	>	magnitude improvement at all tested time steps) by chosing the {\sc
1501	>	dlm} method over the simple non-symplectic quaternion integration
1502	>	method. In addition to this improvement in energy drift, the
1503	>	fluctuations in the total energy are also dampened by 1 to 2 orders of
1504	>	magnitude by utilizing the {\sc dlm} method.
1505
1506	+	Although the {\sc dlm} method is more computationally expensive than
1507	+	the traditional quaternion scheme for propagating a single time step,
1508	+	consideration of the computational cost for a long simulation with a
1509	+	particular level of energy conservation is in order. A plot of energy
1510	+	drift versus computational cost was generated
1511	+	(Fig.~\ref{cpuCost}). This figure provides an estimate of the CPU time
1512	+	required under the two integration schemes for 1 nanosecond of
1513	+	simulation time for the model 1024-molecule system. By chosing a
1514	+	desired energy drift value it is possible to determine the CPU time
1515	+	required for both methods. If a $\delta \mbox{E}_1$ of $1 \times
1516	+	10^{-3} \mbox{kcal mol}^{-1}$ per particle is desired, a nanosecond of
1517	+	simulation time will require ~19 hours of CPU time with the {\sc dlm}
1518	+	integrator, while the quaternion scheme will require ~154 hours of CPU
1519	+	time. This demonstrates the computational advantage of the integration
1520	+	scheme utilized in {\sc oopse}.
1521	+
1522	+	\begin{figure}
1523	+	\centering
1524	+	\includegraphics[width=\linewidth]{compCost.eps}
1525	+	\caption[Energy drift as a function of required simulation run
1526	+	time]{Energy drift as a function of required simulation run time.
1527	+	$\delta \mathrm{E}_1$ is the linear drift in energy over time.
1528	+	Simulations were performed on a single 2.5 GHz Pentium 4
1529	+	processor. Simulation time comparisons can be made by tracing
1530	+	horizontally from one curve to the other. For example, a simulation
1531	+	that takes ~24 hours using the {\sc dlm} method will take roughly 210
1532	+	hours using the simple quaternion method if the same degree of energy
1533	+	conservation is desired.}
1534	+	\label{cpuCost}
1535	+	\end{figure}
1536	+
1537		There is only one specific keyword relevant to the default integrator,
1538		and that is the time step for integrating the equations of motion.
1539
#	Line 1575 \| Line 1640 \| Here, $f$ is the total number of degrees of freedom in
1640		\end{equation}
1641		Here, $f$ is the total number of degrees of freedom in the system,
1642		\begin{equation}
1643	<	f = 3 N + 3 N_{\mathrm{orient}} - N_{\mathrm{constraints}},
1643	>	f = 3 N + 2 N_{\mathrm{linear}} + 3 N_{\mathrm{non-linear}} - N_{\mathrm{constraints}},
1644		\end{equation}
1645		and $K$ is the total kinetic energy,
1646		\begin{equation}
1647		K = \sum_{i=1}^{N} \frac{1}{2} m_i {\bf v}_i^T \cdot {\bf v}_i +
1648	<	\sum_{i=1}^{N_{\mathrm{orient}}} \frac{1}{2} {\bf j}_i^T \cdot
1648	>	\sum_{i=1}^{N_{\mathrm{linear}}+N_{\mathrm{non-linear}}} \frac{1}{2} {\bf j}_i^T \cdot
1649		\overleftrightarrow{\mathsf{I}}_i^{-1} \cdot {\bf j}_i.
1650		\end{equation}
1651	+	$N_{\mathrm{linear}}$ is the number of linear rotors (i.e. with two
1652	+	non-zero moments of inertia), and $N_{\mathrm{non-linear}}$ is the
1653	+	number of non-linear rotors (i.e. with three non-zero moments of
1654	+	inertia).
1655
1656		In eq.(\ref{eq:nosehooverext}), $\tau_T$ is the time constant for
1657		relaxation of the temperature to the target value. To set values for
#	Line 1620 \| Line 1689 \| factorization of the three rotation operations that wa
1689		Here $\mathrm{rotate}(h * {\bf j}
1690		\overleftrightarrow{\mathsf{I}}^{-1})$ is the same symplectic Trotter
1691		factorization of the three rotation operations that was discussed in
1692	<	the section on the DLM integrator. Note that this operation modifies
1692	>	the section on the {\sc dlm} integrator. Note that this operation modifies
1693		both the rotation matrix $\mathsf{A}$ and the angular momentum ${\bf
1694		j}$. {\tt moveA} propagates velocities by a half time step, and
1695		positional degrees of freedom by a full time step. The new positions
1696		(and orientations) are then used to calculate a new set of forces and
1697		torques in exactly the same way they are calculated in the {\tt
1698	<	doForces} portion of the DLM integrator.
1698	>	doForces} portion of the {\sc dlm} integrator.
1699
1700		Once the forces and torques have been obtained at the new time step,
1701		the temperature, velocities, and the extended system variable can be
#	Line 1731 \| Line 1800 \| where ${\bf r}_{ab}$ is the distance between the cente
1800		where ${\bf r}_{ab}$ is the distance between the centers of mass, and
1801		\begin{equation}
1802		{\bf f}_{ab} = s(r_{ab}) \sum_{i \in a} \sum_{j \in b} {\bf f}_{ij} +
1803	<	s\prime(r_{ab}) \frac{{\bf r}_{ab}}{\|r_{ab}\|} \sum_{i \in a} \sum_{j
1803	>	s^{\prime}(r_{ab}) \frac{{\bf r}_{ab}}{\|r_{ab}\|} \sum_{i \in a} \sum_{j
1804		\in b} V_{ij}({\bf r}_{ij}).
1805		\end{equation}
1806
1807		The instantaneous pressure is then simply obtained from the trace of
1808		the pressure tensor,
1809		\begin{equation}
1810	<	P(t) = \frac{1}{3} \mathrm{Tr} \left( \overleftrightarrow{\mathsf{P}}(t).
1811	<	\right)
1810	>	P(t) = \frac{1}{3} \mathrm{Tr} \left( \overleftrightarrow{\mathsf{P}}(t)
1811	>	\right).
1812		\end{equation}
1813
1814		In eq.(\ref{eq:melchionna1}), $\tau_B$ is the time constant for
#	Line 1749 \| Line 1818 \| integration of the equations of motion is carried out
1818		file. The units for {\tt tauBarostat} are fs, and the units for the
1819		{\tt targetPressure} are atmospheres. Like in the NVT integrator, the
1820		integration of the equations of motion is carried out in a
1821	<	velocity-Verlet style 2 part algorithm with only the following differences:
1821	>	velocity-Verlet style two part algorithm with only the following
1822	>	differences:
1823
1824		{\tt moveA:}
1825		\begin{align*}
#	Line 1782 \| Line 1852 \| the box by
1852		the box by
1853		\begin{equation}
1854		\mathcal{V}(t + h) \leftarrow e^{ - 3 h \eta(t + h /2)} \times
1855	<	\mathcal{V}(t)
1855	>	\mathcal{V}(t).
1856		\end{equation}
1857
1858		The {\tt doForces} step for the NPTi integrator is exactly the same as
1859	<	in both the DLM and NVT integrators. Once the forces and torques have
1859	>	in both the {\sc dlm} and NVT integrators. Once the forces and torques have
1860		been obtained at the new time step, the velocities can be advanced to
1861		the same time value.
1862
#	Line 2011 \| Line 2081 \| the force calculation at each time step.
2081		subtract the total constraint forces from the rest of the system after
2082		the force calculation at each time step.
2083
2084	<	After the force calculation, the total force on molecule $\alpha$,
2084	>	After the force calculation, the total force on molecule $\alpha$ is:
2085		\begin{equation}
2086		G_{\alpha} = \sum_i F_{\alpha i},
2087		\label{oopseEq:zc1}
#	Line 2032 \| Line 2102 \| v_{\alpha i} = v_{\alpha i} -
2102		v_{\alpha i} = v_{\alpha i} -
2103		\frac{\sum_i m_{\alpha i} v_{\alpha i}}{\sum_i m_{\alpha i}},
2104		\end{equation}
2105	<	where $v_{\alpha i}$ is the velocity of atom $i$ in the z direction.
2105	>	where $v_{\alpha i}$ is the velocity of atom $i$ in the $z$ direction.
2106		Lastly, all of the accumulated constraint forces must be subtracted
2107		from the rest of the unconstrained system to keep the system center of
2108		mass of the entire system from drifting.
#	Line 2070 \| Line 2140 \| The user may also specify the use of a constant veloci
2140
2141		The user may also specify the use of a constant velocity method
2142		(steered molecular dynamics) to move the molecules to their desired
2143	<	initial positions.
2143	>	initial positions. Based on concepts from atomic force microscopy,
2144	>	{\sc smd} has been used to study many processes which occur via rare
2145	>	events on the time scale of a few hundreds of picoseconds. For
2146	>	example,{\sc smd} has been used to observe the dissociation of
2147	>	Streptavidin-biotin Complex.\cite{smd}
2148
2149		To use of the $z$-constraint method in an {\sc oopse} simulation, the
2150		molecules must be specified using the {\tt nZconstraints} keyword in
#	Line 2078 \| Line 2152 \| of the $z$-constraint method are listed in table~\ref{
2152		of the $z$-constraint method are listed in table~\ref{table:zconParams}.
2153
2154		\begin{table}
2155	<	\caption{The Global Keywords: Z-Constraint Parameters}
2155	>	\caption{Meta-data Keywords: Z-Constraint Parameters}
2156		\label{table:zconParams}
2157		\begin{center}
2158		% Note when adding or removing columns, the \hsize numbers must add up to the total number
#	Line 2091 \| Line 2165 \| of the $z$-constraint method are listed in table~\ref{
2165
2166		{\bf keyword} & {\bf units} & {\bf use} & {\bf remarks} \\ \hline
2167
2168	<	{\tt nZconstraints} & integer & The number of zconstraint molecules& If using zconstraint method, {\tt nZconstraints} must be set \\
2169	<	{\tt zconsTime} & fs & Sets the frequency at which the {\tt .fz} file is written & \\
2170	<	{\tt zconsForcePolicy} & string & The strategy of subtracting
2171	<	zconstraint force from the unconstrained molecules & Possible
2172	<	strategies are {\tt BYMASS} and {\tt BYNUMBER}. Default
2173	<	strategy is set to {\tt BYMASS}\\
2174	<	{\tt zconsGap} & $\mbox{\AA}$ & Set the distance between two adjacent
2175	<	constraint positions& Used mainly in moving molecules through a simulation \\
2176	<	{\tt zconsFixtime} & fs & Sets how long the zconstraint molecule is
2177	<	fixed & {\tt zconsFixtime} must be set if {\tt zconsGap} is set\\
2178	<	{\tt zconsUsingSMD} &logical & Flag for using Steered Molecular
2179	<	Dynamics or Harmonic Forces to move the molecule & Harmonic Forces are
2180	<	used by default\\
2168	>	{\tt nZconstraints} & integer & The number of $z$-constrained
2169	>	molecules & If using the $z$-constraint method, {\tt nZconstraints}
2170	>	must be set \\
2171	>	{\tt zconsTime} & fs & Sets the frequency at which the {\tt .fz} file
2172	>	is written & \\
2173	>	{\tt zconsForcePolicy} & string & The strategy for subtracting
2174	>	the $z$-constraint force from the {\it unconstrained} molecules & Possible
2175	>	strategies are {\tt BYMASS} and {\tt BYNUMBER}. The default
2176	>	strategy is {\tt BYMASS}\\
2177	>	{\tt zconsGap} & $\mbox{\AA}$ & Sets the distance between two adjacent
2178	>	constraint positions&Used mainly to move molecules through a
2179	>	simulation to estimate potentials of mean force. \\
2180	>	{\tt zconsFixtime} & fs & Sets the length of time the $z$-constraint
2181	>	molecule is held fixed & {\tt zconsFixtime} must be set if {\tt
2182	>	zconsGap} is set\\
2183	>	{\tt zconsUsingSMD} & logical & Flag for using Steered Molecular
2184	>	Dynamics to move the molecules to the correct constrained positions &
2185	>	Harmonic Forces are used by default\\
2186
2187		\end{tabularx}
2188		\end{center}
#	Line 2120 \| Line 2199 \| gradient ({\sc cg}) to help users find reasonable loca
2199		stable fixed points on the surface. We have included two simple
2200		minimization algorithms: steepest descent, ({\sc sd}) and conjugate
2201		gradient ({\sc cg}) to help users find reasonable local minima from
2202	<	their initial configurations.
2202	>	their initial configurations. Since {\sc oopse} handles atoms and
2203	>	rigid bodies which have orientational coordinates as well as
2204	>	translational coordinates, there is some subtlety to the choice of
2205	>	parameters for minimization algorithms.
2206
2125	–	Since {\sc oopse} handles atoms and rigid bodies which have
2126	–	orientational coordinates as well as translational coordinates, there
2127	–	is some subtlety to the choice of parameters for minimization
2128	–	algorithms.
2129	–
2207		Given a coordinate set $x_{k}$ and a search direction $d_{k}$, a line
2208		search algorithm is performed along $d_{k}$ to produce
2209	<	$x_{k+1}=x_{k}+$ $\lambda _{k}d_{k}$.
2210	<
2134	<	In the steepest descent ({\sc sd}) algorithm,%
2209	>	$x_{k+1}=x_{k}+$ $\lambda _{k}d_{k}$. In the steepest descent ({\sc
2210	>	sd}) algorithm,%
2211		\begin{equation}
2212	<	d_{k}=-\nabla V(x_{k})
2212	>	d_{k}=-\nabla V(x_{k}).
2213		\end{equation}
2214		The gradient and the direction of next step are always orthogonal.
2215		This may cause oscillatory behavior in narrow valleys. To overcome
2216		this problem, the Fletcher-Reeves variant~\cite{FletcherReeves} of the
2217		conjugate gradient ({\sc cg}) algorithm is used to generate $d_{k+1}$
2218		via simple recursion:
2219	<	\begin{align}
2220	<	d_{k+1} & =-\nabla V(x_{k+1})+\gamma_{k}d_{k}\\
2221	<	\gamma_{k} & =\frac{\nabla V(x_{k+1})^{T}\nabla V(x_{k+1})}{\nabla
2222	<	V(x_{k})^{T}\nabla V(x_{k})}%
2223	<	\end{align}
2219	>	\begin{equation}
2220	>	d_{k+1} =-\nabla V(x_{k+1})+\gamma_{k}d_{k}
2221	>	\end{equation}
2222	>	where
2223	>	\begin{equation}
2224	>	\gamma_{k} =\frac{\nabla V(x_{k+1})^{T}\nabla V(x_{k+1})}{\nabla
2225	>	V(x_{k})^{T}\nabla V(x_{k})}.
2226	>	\end{equation}
2227
2228		The Polak-Ribiere variant~\cite{PolakRibiere} of the conjugate
2229		gradient ($\gamma_{k}$) is defined as%
#	Line 2152 \| Line 2231 \| V(x_{k})^{T}\nabla V(x_{k})}%
2231		\gamma_{k}=\frac{[\nabla V(x_{k+1})-\nabla V(x)]^{T}\nabla V(x_{k+1})}{\nabla
2232		V(x_{k})^{T}\nabla V(x_{k})}%
2233		\end{equation}
2234	+	It is widely agreed that the Polak-Ribiere variant gives better
2235	+	convergence than the Fletcher-Reeves variant, so the conjugate
2236	+	gradient approach implemented in {\sc oopse} is the Polak-Ribiere
2237	+	variant.
2238
2239		The conjugate gradient method assumes that the conformation is close
2240		enough to a local minimum that the potential energy surface is very
#	Line 2165 \| Line 2248 \| minimizer are given in Table~\ref{table:minimizeParams
2248		minimizer are given in Table~\ref{table:minimizeParams}
2249
2250		\begin{table}
2251	<	\caption{The Global Keywords: Energy Minimizer Parameters}
2251	>	\caption{Meta-data Keywords: Energy Minimizer Parameters}
2252		\label{table:minimizeParams}
2253		\begin{center}
2254		% Note when adding or removing columns, the \hsize numbers must add up to the total number
#	Line 2181 \| Line 2264 \| descent) \\
2264		{\tt minimizer} & string & selects the minimization method to be used
2265		& either {\tt CG} (conjugate gradient) or {\tt SD} (steepest
2266		descent) \\
2267	<	{\tt minimizerMaxIter} & steps & Sets the maximum iteration number in the energy minimization & Default value is 200\\
2268	<	{\tt minimizerWriteFreq} & steps & Sets the frequency at which the {\tt .dump} and {\tt .stat} files are writtern during energy minimization & \\
2269	<	{\tt minimizerStepSize} & $\mbox{\AA}$ & Set the step size of line search & Default value is 0.01\\
2270	<	{\tt minimizerFTol} & $\mbox{kcal mol}^{-1}$ & Sets energy tolerance & Default value is $10^{-8}$\\
2271	<	{\tt minimizerGTol} & $\mbox{kcal mol}^{-1}\mbox{\AA}^{-1}$ & Sets gradient tolerance & Default value is $10^{-8}$\\
2272	<	{\tt minimizerLSTol} & $\mbox{kcal mol}^{-1}$ & Sets line search tolerance & Default value is $10^{-8}$\\
2273	<	{\tt minimizerLSMaxIter} & steps & Sets the maximum iteration of line searching & Default value is 50\\
2267	>	{\tt minimizerMaxIter} & steps & Sets the maximum number of iterations
2268	>	for the energy minimization & The default value is 200\\
2269	>	{\tt minimizerWriteFreq} & steps & Sets the frequency with which the {\tt .dump} and {\tt .stat} files are writtern during energy minimization & \\
2270	>	{\tt minimizerStepSize} & $\mbox{\AA}$ & Sets the step size for the
2271	>	line search & The default value is 0.01\\
2272	>	{\tt minimizerFTol} & $\mbox{kcal mol}^{-1}$ & Sets the energy tolerance
2273	>	for stopping the minimziation. & The default value is $10^{-8}$\\
2274	>	{\tt minimizerGTol} & $\mbox{kcal mol}^{-1}\mbox{\AA}^{-1}$ & Sets the
2275	>	gradient tolerance for stopping the minimization. & The default value
2276	>	is $10^{-8}$\\
2277	>	{\tt minimizerLSTol} & $\mbox{kcal mol}^{-1}$ & Sets line search
2278	>	tolerance for terminating each step of the minimization. & The default
2279	>	value is $10^{-8}$\\
2280	>	{\tt minimizerLSMaxIter} & steps & Sets the maximum number of
2281	>	iterations for each line search & The default value is 50\\
2282
2283		\end{tabularx}
2284		\end{center}
#	Line 2196 \| Line 2287 \| Although processor power is continually improving, it
2287		\section{\label{oopseSec:parallelization} Parallel Simulation Implementation}
2288
2289		Although processor power is continually improving, it is still
2290	<	unreasonable to simulate systems of more then a 1000 atoms on a single
2290	>	unreasonable to simulate systems of more than 10,000 atoms on a single
2291		processor. To facilitate study of larger system sizes or smaller
2292		systems for longer time scales, parallel methods were developed to
2293		allow multiple CPU's to share the simulation workload. Three general
#	Line 2207 \| Line 2298 \| the duration of the simulation. Computational cost sca
2298		Algorithmically simplest of the three methods is atomic decomposition,
2299		where $N$ particles in a simulation are split among $P$ processors for
2300		the duration of the simulation. Computational cost scales as an
2301	<	optimal $\mathcal{O}(N/P)$ for atomic decomposition. Unfortunately all
2301	>	optimal $\mathcal{O}(N/P)$ for atomic decomposition. Unfortunately, all
2302		processors must communicate positions and forces with all other
2303		processors at every force evaluation, leading the communication costs
2304		to scale as an unfavorable $\mathcal{O}(N)$, \emph{independent of the
#	Line 2229 \| Line 2320 \| The parallelization method used in {\sc oopse} is the
2320		three dimensions.
2321
2322		The parallelization method used in {\sc oopse} is the force
2323	<	decomposition method. Force decomposition assigns particles to
2324	<	processors based on a block decomposition of the force
2323	>	decomposition method.\cite{hendrickson:95} Force decomposition assigns
2324	>	particles to processors based on a block decomposition of the force
2325		matrix. Processors are split into an optimally square grid forming row
2326		and column processor groups. Forces are calculated on particles in a
2327		given row by particles located in that processor's column
2328	<	assignment. Force decomposition is less complex to implement than the
2329	<	spatial method but still scales computationally as $\mathcal{O}(N/P)$
2330	<	and scales as $\mathcal{O}(N/\sqrt{P})$ in communication
2331	<	cost. Plimpton has also found that force decompositions scale more
2332	<	favorably than spatial decompositions for systems up to 10,000 atoms
2333	<	and favorably compete with spatial methods up to 100,000
2334	<	atoms.\cite{plimpton95}
2335	<
2328	>	assignment. One deviation from the algorithm described by Hendrickson
2329	>	{\it et al.} is the use of column ordering based on the row indexes
2330	>	preventing the need for a transpose operation necessitating a second
2331	>	communication step when gathering the final force components. Force
2332	>	decomposition is less complex to implement than the spatial method but
2333	>	still scales computationally as $\mathcal{O}(N/P)$ and scales as
2334	>	$\mathcal{O}(N/\sqrt{P})$ in communication cost. Plimpton has also
2335	>	found that force decompositions scale more favorably than spatial
2336	>	decompositions for systems up to 10,000 atoms and favorably compete
2337	>	with spatial methods up to 100,000 atoms.\cite{plimpton95}
2338	>
2339		\section{\label{oopseSec:conclusion}Conclusion}
2340
2341	<	We have presented a new open source parallel simulation program {\sc
2341	>	We have presented a new parallel simulation program called {\sc
2342		oopse}. This program offers some novel capabilities, but mostly makes
2343		available a library of modern object-oriented code for the scientific
2344		community to use freely. Notably, {\sc oopse} can handle symplectic
#	Line 2273 \| Line 2367 \| DMR-0079647.
2367		the Notre Dame Bunch-of-Boxes (B.o.B) computer cluster under NSF grant
2368		DMR-0079647.
2369
2370	<	\bibliographystyle{achemso}
2370	>	\bibliographystyle{jcc}
2371		\bibliography{oopsePaper}
2372
2373		\end{document}

Diff Legend

-–
+Removed lines
-+
+Added lines
-<
+Changed lines
->
+Changed lines

Comparing trunk/oopsePaper/oopsePaper.tex (file contents): Revision 1428 by gezelter, Wed Jul 28 19:46:08 2004 UTC vs. Revision 3389 by gezelter, Tue Apr 29 17:50:49 2008 UTC

Diff Legend

Comparing trunk/oopsePaper/oopsePaper.tex (file contents):
Revision 1428 by gezelter, Wed Jul 28 19:46:08 2004 UTC vs.
Revision 3389 by gezelter, Tue Apr 29 17:50:49 2008 UTC