Pré-visualização parcial do texto
Baixe ab initio total energy calculations e outras Notas de estudo em PDF para Engenharia de Produção, somente na Docsity!
Iterative minimization techniques for ab initio total-energy calculations: molecular dynamics and conjugate gradients M. €. Payne Cavendish Laboratory, Madingley Road, Cambridge, CB3 OHE, United Kingdom M. P. Teter and D.C. Allan Applied Process Research, Corning Incorporated, Corning, New York 14837 T. A. Arias and 3. D. Joannopoulos Depariment of Physics, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139 This article describes recent technical developments that have made the total-energy pseudopotentisl the most powerful ab inítio quantum-meckanica! modeling method presentiy available. In addition to preseni- ing technical details of the pseudopotential method, the articte aims to heighten awareness of the capabili- ties of the method in order to stimulate its application to as wide a range of problems in as many scientific disciplines as possible, CONTENTS 1. Introduction II. Total-Energy Pseudopotentiat Caleutations A. B. E. F [cR H Overview of approximations Electron-electron interactions 1. Exchange and correlation 2. Density-functional theory a. The Kobn-Sham energy functional b. Kohn-Sham equations e. Local-density approximation Periodie supercelis 1. Bloch's theorem 2. k-point sampling 3. Plane-wave basis sets 4. Plane-wave representation of Kohn-Shami equa- tions 5. Nonperiodio systems Eleciron-ion interactions 1. Pseudopotential approximation a. Norm conservation b. Generation procedure e. Convergence properties d. Plane-wave basis sets 2. Structure factor 3, Total jonic potential Ton-ion interactions Computationat procedure with conventional matrix diagonalization Drawbacks of conventional procedure Alternative methods HI. The Molecuiar-Dyramics Melhod A. B Eigenvalue solution by successive “improvement” of a trial wave function Molecular-dynamics procedure 1. Molecular-dynamios Lagrangian 2. Constraints Molecular-dynamics equatioas of motion 1. Unconstrained equations of motion 2. Constraincd equations of motion 3. Partially constrained equetions of motion Integration of equations of motion 1. The Verlet algorithm 2. Stability of the Verlet algorithm Reviews of Modern Physics, Vol. 84, No. 4, October 1992 V. E. Orthogonalization of electronic wave functions F. Damping G. Self-consisteney H. Kobn-Sham eigenstates IL Computational procedure with molecular dynamics 7. Instabilities and fuctuations in the Kohn-Sham en- ergy K. Computational cost of molecular dynamics . Caleulation of charge density Construction of Hamiltonian matrix . Aocelerations of the coefficients . Integration of equations of motion . Orthogonalization . Comparison to conventional matrix diagonatiza- tons 7. Loca! pseudopotentinls Improvements in Algorithms A. Improved integration 1. Analytic integration of second-order equations of motion 2. Analytic integration of first-order equations of motion B. Orthogonalization of wave functions nupenn 1, The Gram-Schmidt scheme 2. Convergence to Kohn-Sham eigenstates €. Comparison between algorithms D. Remaining difficulties - Direct Minimization of the Kokn-Sham Energy Func- tional A. Minimization of a function 1. The method of steepest descents 2. The conjugate-gradients technique B. Application of the conjugate-gradients method 1. The update of a single band . Constraims . Preconditioning . Conjugate directions . Search for the encrgy minimum . Calculational procedure 7. Computational cost €: Speed of convergence 1. Large energy cutof 2. Long supercells 3. A real system auswn Copyright 16:1992 The American Physical Society 1069 1070 1070 1070 1072 1072 1072 1072 1074 1074 1075 1076 1076 1076 1077 1077 1045 1046 M. C. Payne etat: Ab initio iterative minimization techniques YL Choice of ínitial Wave Functions for the Electronic States 1078 A. Convergence to the ground siate 1078 1. Spanning the ground state 1078 2. Symmetry conservation 1079 B. Rate of convergence 1079. €. Specific initial configurations 1079 1. Reduced basis sets 1079 2. Symmetrized combinations of plane waves 1079 3. Random initial configurations 1080 4. Random initial velocíties 1080 VFL Relaxation of the Ionic System togo A, The Car-Parrinello Lagrangian 1081 B. Equations of motion 1081 C. The Hellmano-Feyoman theorem 1081 1. Errors in Hellmano-Feynmem forces 1082 2. Consequences of the Hellmani-Feynman theorem 1082 D. Pulay forces 1082 E. Pulay stresses 1083 F. Local energy minimization 1083 1. Local energy minimization with the molecular- dynamics method 1083 2. Local energy minimization with the conjugate- gradients method 1083 G. Global energy minimization 1083 1. Simulated annealing 1084 2. Monte Carlo methods 1084 3. Location of low-energy configurations 1085 VII: Dynamical Simulations 1085 A. Limitations to dynamical simvulations 1085 B. Accuracy of dynamical trajectories 1085 €C. Error cancellation in dynamical simulations 1086 D. Car-Parrinello dynamics; constraints 1087 E. Conjugate-gradients dynamics 1088 F. Comparison of Car-Parrinello and conjugate- gradient dynamics 1089 G. Phonon frequencies 1089 1X. Nonlocul Pseudopotentiuls 1090 A. Kieinman-Bylander potentials 1091 1. Enhanced projections 1091 2. Computational cost 1091 B. Vanderbilt potentials 1091 €. Real-space projection; nonuniqueness of spherical harmonic projection 1092 D. Potentials for real-space projection 1093 X. Parallel Computers 1094 XL Concluding Remarks 1095 Acknowledgments 1096 References 1096 À. INTRODUCTION There is littie doubt that most of low-energy physics, chemistry, and biology can be explained by the quantum mechanies of elecrrons and ions. The limits of applicabil- ity and even the interpretation of the predictions of modern quantum theory are lively areas of debate amongst philosophers. Questions such as “How do we interpret the probabilistic nature of wave functions?” “What constitutes a measurement?,” “How much can we ever know about the state of a system?,” and “Can quan- tum mechanics describe consciousness?” are of funda- mental importance. Despite the fact that these questions Rev. Mod. Phys., Vol. 64, No. 4, October 1992 are still debated, it is clear that whether or not a more complete description of the world is possible, those things that modern quantum theory does predict are pre- dicted with incredible accuracy. One outstanding exam- ple of this accuracy is the calculation of the gyromagnet- ic ratio of the electron, which agrees with the experimen- tal result to the limit of the measurement, some 10 significant figures. Quantum theory has also proven correct and provided fundamental understanding for a wide variety of phenomena, including the energy levels of atoms, the covalent bond, and the distinction between metals and insulators. Further, in every instance of its application to date, the equations of quantum mechanics have yet to be shown to fail. There is, therefore, every reason to believe that an understanding of a great num- ber of phenomena can be achieved by continuing to solve these equations. As we shall soon see, the ability of quantum mechanics to predict the total energy of a system of electrons and nuclei, enables one to reap a tremendous benefit from a quantum-mechanical calculation. Tn fact this entire arti- cle is dedicated to just this one type of quantum- mechanical calculation, the foundation for which is quite strong. The quantum-mechanical rules, or Hamiltonians, for calculating the total energy of simple one-atom sys- tems have provided some of the most precise tests of the theory, and the rules for calculating the energies of more complicated systems are simple, straightforward exten- sions of these atomic Hamiltonians. It is therefore em- inently reasonable to expect quantum mechanics to pre- dict accurately the total energies of aggrepates of atoms as well. So far, this expectation has been confirmed time and time again by experiment. A few moments thought shows that nearly all physical properties are related to total energies or to differences between total energies. For instance, the equilibrium lat- tice constant of a crystal is the lattice constant that mini- mizes the total energy; surfaces and defects of solids adopt the structures that minimize their corresponding total energies. If total energies can be calculated, any physical property that can be related to a total energy or to a difference between total energies can be determined computationally. For example, to predict the equilibri- um lattice constant of a crystal, a series of total-energy calculations are performed to determine the total energy as a function of the lattice constant. As shown in Fig. 1, the results are then plotted on a graph of energy versus lattice constant, and a smooth curve is constructed through the points. The theoretical value for the equilib- rium lattice constant is the value of the lattice constant at the minimum of this curve. Total-energy techniques also have been successfully used to predict with accuracy equilibrium lattice constants, bulk moduli, phonons, piezoelectric constants, and phase-transition pressures and temperatures (for reviews see Cohen, 1984; Joanno- poulos, 1985; Pickett, 1989). Part of our aim in this article is to introduce the use- fulness of quantum total-energy techniques to a broad 1048 M. C. Payne eta!.: Ab initio iterative minimization techniques and all have bencfited from the availability of increasing- ly powerful computers. A decade ago most ab initio methods were capable only of modeling systems of a few atoms, and hence applicability to real-world systems at that point was extremely limited. Most methods can now model systems that contain some tens of atoms and are used to study a small but significant range of interesting problems. Of all the methods, one, the total-energy pseu- dopotential method, stands alone. A decade ago this method was also capable of modeling only few-atom sys- tems. Now, however, this method can model thousand- atom systems, and it is already clear that this number will increase by at least a factor of 10 within the next five years. Pushing back the limits of quantum-mechanical modeling to this extent, the total-cnergy pseudopotential method opens up a wide range of interesting problems to quantum-mechanical calculation, and the future should bring the application of quantum mechanics to many new fields of science. The increase in the number of atoms that can bc handied is directly duc to an increase in com- putational efficiency of the ab initio pseudopotential method, which also means that there is an increasing class of problems for which it is more cost effective to use quantum-mechanical modeling than experiment to deter- mine the physical parameters of systems. One purpose of this article is to heighten awareness amongst scientists in a range of scientific disciplines that the world is quantum mechanical and that there now exists an ab initio method that allows the quantum mechanics to be solved and in- corporated into everyday science. There is an economy of scale to ab initio total-energy calculations because so many physical properties are re- lated to total energies. While just one piece of theoretical “apparatus” is needed to calculate alí the physical prop- erties that are related to total energies, completely different sets of experimental apparatus are required to measure each class of physical property of a material. This represents an enormous advantage of quantum- mechanical modeling over experimental measurements. Comparing the decreasing cost of computers with the cost of a large number of different pieces of experimental apparatus needed to carry out the same functions, one secs that the cost cffectivencss of quantum-mechanical modeling methods over physical experimentation will continue to increase with time. This, then, is the time for researchers in a wide range of scientific disciplines to consider very seriously whether quantum-mechanical modeling may be applied in a cost-eflective way to fheir own field of research, even if the field of research is far removed from what is usually assumed to be the quantum-mechanical world. Total-encrgy pseudopotential calculations do require significant amounts of computer time, even for systems containing a few atoms in the unit cell, and the computa- tional time required to perform the calculations always increases with the number of atoms in the unit cell. Thus, for large systems containing hundreds of atoms in the unit cell, it is essential to use Lhe most efficient nu- Rev. Mod. Phys, Vol. 64, No. 4, October 1992 merical algorithms. In the following pages we shall dis- cuss in detail the latest methods fór doing this. Among the methods we shall discuss, two have revolutionized the field of ab inítio total-energy calculation, each increasing the number cf atoms in a calculable system by more than an order of magnitude over previously existing tech- niques. As we shall discuss in detail, the molecular- dynamics method developed by Car and Parrinelto (1985) transformed the way in which we viewed quantum- mechanical calculations and hence total-energy pseudo- potential catculations; instead of finding a coupled self- consistent solution to a descretized partial differential equation through matrix techniques, Car and Parrinello minimized a single function through simulated annealing. “The Car-Parrinello method can be used to perform calcu- lations for systems containing on the order of one hun- dred atoms in the unit cell. However, severe difficulties are encountered in certain cases when one attempts to use this method to perform calculations on much larger systems, Recently, conjugate-gradients methods have been developed (Teter et al. 1989; Gillan, 1989) that overcome the difficulties encountered with the molecular-dynamics technique. Conjugate-gradients methods have again transformed total-energy pseudopo- tential calculations by replacing simulated annealing minimization with a direct, completely self-consistent second-order search for the minimum. Using these methods, one can perform calculations for systems con- taining many hundreds, and soon thousands, of atoms. The molecular-dynamics and conjugate-gradients methods allow pseudopotential calculations to be per- formed for much larger systems than was possible using conventional matrix diagonalization methods. They also allow, for the first time, tractable ab ínitio quantum- mechanical simulations to be performed for systems at nonzero temperatures. While these capabilities offer the obvious advantage of permitting more complex systems to be studied, there is yet another benefit to be gained by using these new computational methods. By increasing the efficiency of the total-energy pseudopotential tech- nique, they have greatly extended the range of applica- tion of this technique by allowing, for the first time, the inclusion of noble and transition-metal atoms and first- row clements such as oxygen in large pscudopotential calculations. Until recently it was widely believed that computations including such elements would be com- pletely intractable with pseudopotentials in a plane-wave representation. Recent work (Allan and Teter, 1987; Bar-Yam et «í., 1989; Rappe et al., 1990; Vanderbilt, 1990; Trouillier and Martins, 1991) has shown that pseu- dopotential calculations can be performed for systems containing these atoms by employing a substantial but mansgeable number of plane waves in the basis set. With their increased efficiency, molecular-dynamies and conjugate-gradients methods can handle very large plane-wave basis sets and therefore permit large total. energy pseudopotential calculations to be performed with these new pseudopotentials, thus opening the way for M.C. Payne etal.: Ab ínitio iterative minimization techniques 1049 study of a larger variety of chemical systems than was previously possible. This article provides a detailed description of the total-energy pseudopotential method, the molecular- dynamics method, and conjugate-gradient minimization. The article also discusses related techniques and develop- ments.in a number of areas that have played a role in in- creasing the computational efficiency of these methods. Tt is hopcd that the information presented here is sufficiently detailed and at the leading edge of the work being done to be useful to scientists working both in and outside the ficld of ab initio quantum-mechanical calcula- tions. H. TOTAL-ENERGY PSEUDOPOTENTIAL CALCULATIONS This section describes the total-energy psendopotential method. An extremely useful review of the pseudopoten- tial method can be found in articles by Thm et al. (1979) and Denteneer and van Haeringen (1985). These articles are essential reading for anyone intending to implement codes for total-energy pseudopotential calculations. Totalenergy calculations can only be performed if a large number of simplifications and approximations are used. These simplifications and approximations are de- scribed in the following sections. A. Overview of approximations Prediction of the electronic and geometric structure of a solid requires calculation of the quantum-mechanical total energy of the system and subsequent minimization of that energy with respect to the electronic and nuclear coordinates. Because of the large difference in mass be- tween the electrons and nuclei and the fact that the forces on the particles are the same, the electrons respond essentially instantaneously to the motion of the nuclei. Thus the nuclei can be treated adiabatically, Jead- ing to a separation of electronic and nuclear coordinates in the many-body wave function—the so-called Born- Oppenhcimer approximation. This “adiabatic principle” reduces the many-body problem to the solution of the dy- namics of the electrons in some frozen-in configuration of the nuclei, Even with this simplification, the many-body problem remains formidable. Further simplifications, however, can be introduced that allow total-energy calculations to be performed accurately and efficiently. These include density-functional theory to model the electron-electron interactions, pseudopotentia! theory to model the electron-ion interactions, supercells to model systems with aperiodic geometries, and iterative minimization techniques to relax the electronic coordinates. Very briefly, the essential concepts are the following: ti) Density-functional' theory (Hohenberg and Kohn, Fev. Mod. Phys. Vol. 84, No. 4, October 1992 1964; Kohn and Sham, 1965) allows one, in principle, to map exactly the problem of a strongly interacting elec- tron gas (in the presence of nuclei) onto that of a single particle moving in an effective nonlocal potential. Al- though this potential is not known precisely, local ap- proximations to it work remarkably well. At present, we have no a priori arguments to explain why these approxi- mations work. Density-functional theory was revitalized in recent years only because theorists performed total- energy caleulations using these potentials and showed that they reproduced a variety of ground-state properties wilhin a few percent of experiment. Thus the acceptance of local approximations to density-functional theory has only emerged, a posteriori, after many successful investi- gations of many types of materials and systems. General- ly, total-energy differences between related structures can be believed to within a few percent and structural param- eters to at least within a tenth of an À. Cohesive ener- gies, however, can be in error by more than 10%. (ii) Pseudopotential theory (Phillips, 1958; Heine and Cohen, 1970) allows one to replace the strong electron ion potential. with a much weaker potential —a pseudopotential —that describes all the salient features of a valence electron moving through the solid, including relativistic effects. Thus the original solid is now re- placed by pseudo valence electrons and pseudo-jon cores. These pseudoelectrons experience exactly the same po- tential outside the core region as the original electrons but have a much weaker potential inside the core region. The-fact that the potential is weaker is crucial, however, because it makes the solution of the Schrôdinger equation much simpler by allowing expansion of the wave func- tions in a relatively small set of plane waves. Use of plane waves as basis functions makes the accurate and systematic study of complex, low-symmetry configura- tions of atoms much more tractable. (iii) The superccll approximation allows one to deal with aperiodic configurations of atoms within the frame- work of Bloch's theorem (see Ashcroft and Mermin, 1976). One simply constructs a large unit cell containing the configuration in question and repeats it periodically throughout space. By studying the properties of the sys- tem for larger and larger unit cells, one can gauge the im- portance of the induced periodicity and systematically filter it out. This approach has been successfully tested against “exact” Koster-Slater Green's-function methods (see Baraff and Schinter, 1979), which are only tractable for very-high-symmetry configurations. (iv) Finally, new iterative diagonalization approaches (Car and Parrinello, 1985; Payne et al. 1986; Williams and Soler, 1987; Gillan, 1989; Stich et aí. 1989; Teter et al., 1989) can be used to minimize the total-energy functional. These are much more efficient than the tradi- tional diagonalization methods. These new methods al- low expedient calculation of ionic forces and total ener- gies and significantly raise the level of modern total- energy calculations. These methods are the subject of Secs. III, IV, and V. M. G. Payne eta/.: Ab initio iterativo minimization techniques 1051 Patrioe? [Ri dr . (2.4) The exchange-correlation potential, Vyc, is given formal- Iy by the functional derivative 8Eveln(r)) Sa(r) The Kohn-Sham equations represent a mapping of the interacting many-electron system onto a system of nonin- teracting electrons moving in an effective potential due to all the other electrons. If the exchange-corretation ener- gy functional were known exactly, then taking the func- tional derivative with respect to the density would pro- duce an exchange-correlation potential that included the effects of exchange and correlation exactly. The Kohn-Sham equations must be solved self- consistently so that the occupied electronic states pgen- erate a charge density that próduces the electronic poten- tial that was used to construot the equations. The sum of the single-particle Kohn-Sham eigenvalues does not give the total electronic energy because this overcounts the effects of the electron-electron interaction in the Hartree energy and in the exchange-correlation energy. The Kohn-Sham eigenvalues are not, strictly speaking, the en- ergies of the single-particle electron states, but rather the derivatives of the total energy with respect to the occupa- tion numbers of these states (Tariak, 1978). Nevertheless, the highest ocenpied eigenvalue in an atomic or molecu- lar calculation is nearly the unrelaxed ionization energy for that system (Perdew et ai., 1982). The Kohn-Sham eguations are a set of eigenequations, and the terms within the brackets in Eq. (2.3) can be re- garded as a Hamiltonian. The bulk of the work involved in a total-energy pseudopotential calculation is the solu- tion of this eigenvalue problem once an approximate ex- pression for the exchange-correlation energy is given. Vrctr (2.5) c. Locat-density approximation The Hohenberg-Kohn theorem provides some motiva- tion for using approximate methods to describe the exchange-correlation energy as a function of the electron density. The simplest method of describing the exchange-correlation energy of an electronic system is to use the local-density approximation (LDA; Kohn and Sham, 1965), and this approximation is almost universal- ly used in total-energy pseudopotential calculations. n the local-density approximation the exchange-correlation energy of an electronic system is constructed by assum- ing that the exchange-correlation energy per electron at a point r.in the electron gas, Exctr), is equal to the exchange-correlation energy per electron in a homogene- ous electron gas that has the same density as the electron gas at point r. Thus EsclitoI= [exelDnirdr (2.6a) and Rev. Mod. Phys., Vol. 64, No. 4, October 1892 SExelntr)] din(rexetr)] ôn (1) dar) (2.6b) with exelr)= ela (o). (2.60) The local-density approximation assumes that the exchange-correlation energy functional is purely local. Several parametrizations exist for the exchange- correlation energy of a homogeneous electron gas CWigner, 1938; Kohn and Sham, 1965; Hedin and Lundqvist, 1971; Vosko et aí., 1980; Perdew and Zunger, 1981), aH of which lead to total-energy results that are very similar." These parametrizations use interpolation formulas to link exact results for the exchange- correlation energy of high-density electron gases and cal- culations of the exchange-correlation energy of inter- mediate and low-density electron gases. The local-density approximation, in principle, ignores corrections to the exchange-correlation energy at a point 1 due to nearby inhomogeneities in the electron density. Considering the inexact nature of the approximation, it is remarkable that calculations performed using the LDA have been so successful. Recent work has shown that this success can be partially attributed to the fact that the local-density approximation gives the correct sum rule for the exchange-correlation hole (Harris and Jones, 1974; Gunnarsson and Lundqvist, 1976; Langreth and Perdew, 1977. A number of attempts to improve the LDA, for instance by using gradient expansions, have not. shown any improvement over results obtained using the simple LDA. One of the reasons why these “improve- ments” to the LDA do so poorly is that they do not obey the sum rule for the exchange-correlation hole. Methods that do enforce the sum rule appear to offer a consistent improvement over the LDA (Langreth and Mehi, 1981, 1983). The LDA appears to give a single well-defined global minimum for the energy of a non-spin-polarized system of electrons in a fixed ionic potential. Therefore any en- ergy minimization scheme will locate the global energy minimum of the electronic system. For magnetic materi- als, however, one would expect to have more than one lo- cal minimum in the electronic energy. If the energy functional for the electronic system had many local mini- ma, it would be extremely costly to perform totalenergy calculations because the global energy minimum could only be located by sampling the energy functional over a large region of phase space. C. Periodic supercelis In the preceding section it was demonstrated that cer- tain observables of the many-body problem can be mapped into equivalent observables in an effcctive single-particle problem. However, there still remains the formidable task of handling an infinite number of nonin- teracting electrons moving in the static potential of an 1052 M. C. Payne etaf.: Ab initio iterative minimization techniques infiuite number of nuclei or ions. Two difficulties must be overcome: a wave function must be calculated for each of the infinite number of electrons in the system, and, since each electronic wave function extends over the en- tire solid, the basis set required to expand each wave function is infinite. Both problems can be surmounted by performing caleulations on periodic systems and applying Bioch's theorem to the electronic wave functions. 1. Bloch's theorem Bloch's theorem states that in a periodic solid each electronic wave function can be written as the product of a celi-periodic part and a wavelike part (see Ashcroft and Mermin, 1976), tlr)=expliker]fitr). 2.7) The cell-períodic part of the wave function can be ex- panded using a basis set consisting of a discrete set of plane waves whose wavc vectors are reciprocal lattice vectors of the crystal, SAD=D e aexpliGer], (2.8) G where the reciprocal iattice vectors G are defined by G:1=2mm for alt | where I is a lattice vector of the crys- tal and m is an integer. Therefore each electronic wave function can be written as a sum of plane waves, ADS csgexpli(k+G)r]. (2.9) SG 2. k-point sampling Electronic states are allowed only at a set of k points determined by the boundary conditions that apply to the bulk solid. The density of allowed k points is proportion- al to the volume of the solid. The intinite number of elec- trons in the solid are accounted for by an infinite number of k points, and only a finite number of clectronic states are occupied at each k point, The Bloch theorem changes the problem of calculating an infinite number of electronic wave functions to one of calculating a finite number of electronic wave functions at an infinite num- ber of k points. The occupied states at each k point con- tribute to the electronic potential in the bulk solid so that, in principle, an infinite number of calculations are needed to compute this potential. However, the electron- ie wave functions at k points that are very close together will be almost identical, Hence it is possible to represent the electronic wave functions over a region of k space by the wave functions at a single k point. In this case the electronic states at only a finite number of E points are required to calculate the electronic potential and hence determine the total enerigy of the solid. Methods have been devised for obtaining very accurate approximations to the electronic potential and the contri- Fev. Mod. Phys., Vol. 84, No. 4, October 1992 bution to the total energy from a filled electronic band by caleulating the electronic states at special sets of k points in the Brillouin zone (Chadi and Cohen, 1973; Joanno- poulos and Cohen, 1973; Monkhorst and Pack, 1976; Evarestov and Smirnov, 1983). Using these methods, one can obtain an accurate approximation for the electronic potential and the total energy of an insulator or a semi- conductor by calculating the electronic states at a very small number of k points. The electronic potential and total energy are more difficult to calculate if the system is metallic because a dense set of k points is required to define the Fermi surface precisely. The magnitude of any error in the total energy due to inadequacy of the k-point sampling can always be re- duced by using a denser set of k points. The computed total energy will converge as the density of k points in- creases, and the error due to the k-point sampling then approaches zero. In principle, a converged electronic po- tential and total energy can always be obtained provided that the computational time is available to calculate the electronic wave functions at a sufficiently dense set of k points. The computational cost of performing a very dense sampling of k space can be significantly reduced by using the k-p total-energy method (Robertson and Payne, 1990, 1991). In this technique solutions on the dense set of k points are generated from the solutions on a much coarser grid of k points using k-p perturbation theory. 3. Plane-wave basis sets Bloch's theorem states that the electronic wave func- tions at each k point can be expanded in terms of a discrete plane-wave basis set. In principle, an infinite plane-wave basis set is required to expand the electronic wave functions. However, the coefficients c; +. for the plane waves with small kinetic energy (2/2m)lk+G are typically more important than those with large kinet- ic energy. Thus the plane-wave basis set can be truncat- ed to include only plane waves that have kinetic energies less than some particular cutoff energy. If a continuum of plane-wave basis states were required to expand each electronic wave function, the basis set would be infinitely large no matter how small the cutof energy. Application of the Bloch theorem allows the electronic wave func- tions to be expanded in terms of a discrete set of plane waves. Introduction of an energy cutoff to the discrete plane-wave basis set produces a finite basis set. The truncation of the plane-wave basis set at a finite cutoff energy will lead to an error in the computed total energy. However, it is possible to reduce the magnitude of the error by increasing the value of the cutoff energy. In principle, the cutoff energy should be increased until the calculated total energy has converged, but it will be shown later that it is possible to perform calculations at lower cutoff energies. Onc of thc difficulties associated with the use of plane- wave hasis sets is that the number of basis states changes discontinuously with cutoff energy. In general these 1054 M. C. Payne eta!: Ab initio iterative minimization techniques FIG. 4. Schematic illustration of & supercell geometry for a molecule. Same convention as in Fig. 2. crystal slab and a vacuum region. The supercelk is re- peated over all space, so the total encrgy of an array of crystal slabs is calculated. To ensure that the results of the caleulation accurately represent an isolated surface, the vacuum regions must be wide enough so that faces of adjacent crystal slabs do not interact across the vacuum region, and the crystal slab must be thick enough so that the two surfaces of each crystal slab do not interact through the bulk crystal. Finally, even molecules can be studied in this fashion Toannopoulos et al, 1991), as illustrated in Fig. 4. Again, the supercell needs to be large enough so that the interactions between the molecules are negligible. D. Electron-ion imeractions 4. Pseudopotential approximation Although Bloch's theorem states that the electronic wave functions can be expanded using a discrete set of plane waves, a plane-wave basis set is usually very poorly suited to expanding electronic wave functions because a very large number of plane waves are needed to expand the tightly bound core orbitals and to follow the rapid os- cillations of the wave functions of the valence electrons in the core region. An extremely large plane-wave basis set would be required to perform an all-electron calculation, and a vast amount of computational time would be re- quired to calculate the electronic wave functions. The pseudopotential approximation (Phillips, 1958; Heine and Cohen, 1970; Yin and Cohen, 1982a) allows the elec- tronic wave functions to be expanded using a much smaller number of plane-wave basis states. Jt is well known that most physical properties of solids are dependent on the valence electrons to a much greater extent than on the core electrons. The pseudopotential approximation exploits this by removing the core elec- trons and by replacing them and the strong ionic poten- tial by a weaker pseudopotential that acts on a set of Rev. Mod. Phys., Vol. 64, No, 4, October 1892 pseudo wave functions rather than the true valence wave functions. An ionic potential, valence wave function and the corresponding pseudopotential and pseudo wave function are illustrated schematically in Fig. S. The valence wave functions oscillate rapidly in the region oc- cupied by the core electrons due to the strong ionic po- tential im this region. These oscillations maintain the orthogonalkity between the core wave functions and the valence wave functions, which is required by the ex- clusíon principle. The pseudopotential is constructed, ideally, so that its scattering properties or phase shifts for the pseudo wave functions are identical to the scattering properties of the ion and the core electrons for the valence wave functions, but in such a way that the pseu- do wave functions have no radial nodes in the core re- gion. In the core region, the total phase shift produced by the ion and the core electrons will be greater by q, for each node that the valence functions had in the core re- gion, than the phase shift produced by the ion and the valence electrons. Outside the core region the two poten- tials are identical, and the scattering from the two poten- tials is indistinguishable. The phase shift produced by the ion core is different for each angular momentum component of the valence wave function, and so the scattering from the pseudopotential must be angular momentum dependent. The most general form for a pseudopotential is =5 im)VXiml, Im Va (2.11) where lim) are the spherical harmonics and V, is the pseudopotential for angular momentum !. Acting on the electronic wave function with this operator decomposes the wave function into spherical harmonics, each of which is then multiplied by the relevant pseudopotential Fe FIG. 5. Schematic illustration of all-electron (solid lines) and pseudoelectron (dashed lines) potentials and their correspond- ing wave functions. The radius at which all-electron and pseu- doelectron values match is designated r.. M.C. Payne etal.: Ab initio iterative minimization techniques 1055 A pseudopotential that uses the same potential for all the angular momentum components of the wave function is called a local pseudopotential. A local pseudopotential is a function only of the distance from the nucleus. It is possible to produce arbitrary, predetermined phase shifts for each angular momentum state with a local potential, but there are limits to the amount that the phase shifts can be adjusted for the different angular momentum states, while maintaining the crucial smoothness and weakness of the pseudopotential. Without a smooth, weak pseudopotential it becomes difficult to expand the wave functions using a reasonable number of plane-wave basis states. a. Norm conservation In total-energy calculations, the exchange-correlation energy of the electronic system is a function of the elec- tron density. If the exchange-correlation energy is to be desired accurately, it is necessary that outside the core regions the pseudo wave functions and real wave func- tions be identical, not just in their spatial dependences but also in their absolute magnitudes, so that the two wave functions generaté identical charge densities. Ad- justment of the pseudopotential to ensure that the in- tegrals of the squared amplitudes of the real and the pseudo wave functions inside the core regions are identi- cal guarantees the equality of the wave function and pseudo wave function outside the core region. One of the first attempts to construct pseudopotentials of this type was by Starkloff and Joannopoulos (Joannopoulos et al. 1977, Starkloff and Joannopoulos 1977). They intro- duced a class of local pseudopotentials that described the valence energies and wave functions of many heavy atoms aceurately. Of course, in general, the scattering from the ion core is best described by a nonlocal pseudopotential that uses a different potential for each angular momentum com- ponent of the wave function. Various groups (Redondo et al., 1977; Hamamn et aí. 1979; Zunger and Cohen, 1979; Kerker, 1980; Bachelet et a!., 1982; Shirley et al, 1989) have now introduced nonlocal pseudopotentials of this type that work extremely well, Moreover, as pointed out by Hamann, Schluter, and Chiang (1979), a match of the pseudo and real wave functions outside the core re- gion aiso assures that the first-order energy dependence of the scattering from the ion core is correct, so that the scattering is accurately described over a wide range of en- ergy. A method for the construction of pseudopotentials that corrects even the higher-order energy dependence of the scattering has recently been introduced by Shirley et al. (1989). Local and nonlocal pseudopotentials of these types are currently termed ab initio or norm con- serving aud are capable of describing the scattering due to the ion in a variety of atomic environments, a property referred to as rransferability. Rev. Mod. Phys., Vol. 64, No. 4, October 1992 b. Generation procedure The typical method for generating an ionic pseudopo- tential for an atom of species cv, is illustrated in Fig. 6 and proceeds as follows. All-electron calculations are performed for an isolated atom in its ground state and some excited states, using a given form for the exchange- correlation density functional. This provides valence electron eigenvalues and valence electron wave functions for the atom. A parametrized form for the ionic pseudo- potential is chosen. The parameters are then adjusted, so that a pseudoatom calculation using the same form for exchange-correlation as in the all-electron atom gives both pseudowave functions that match the valence wave functions ouiside some cutoff radius 7, and pseudoeigen- values that are equal to the valence eigenvalues. The ion- ic pseudopotential obtained in this fashion is then used, without further modification, for any environment of the atom. The electronic density in any new environment of the atom is then determined using both the ionic pseudo- potential obtained in this way and the same form of exchange-correlation functional as employed in the con- struction of the ionic pseudopotential. A generalization of this pseudopotential construction procedure for solu- tions of the atom that are not normalizable has recently been introduced by Hamann (1989). Finally, it should be noted that ionic pseudopotentials are constructed with r, ranging typically from one to two times the value of the core radius. It should also be not- ed that, in general, the smaller the value of r,, the more “transferable”” the potential, (The entire procedure for solving the problem of a solid, given an ionic pseudopo- tential, is outlined in Sec. ILF.) Sole all-electron. sigenvalues & wave functions] Select a parameirized form for the pseudopotential ve) Áre pseudosigenvalues” equal to fheali-eleciron valence eigenvalves ?, generated FIG. 6. Flow chart describing the construction of an ionic pseudopotential for am atom. M. C. Payne etal.: Ab initio iterative minimization techniques 1057 Ng is the total number of ions of species a, and Ois the volume of the unit celi. E. lon-ion interactions Kt is extremely difficult to compute the Coulomb ener- gy of the ionic system using a direct real-space summa- tion because the Coulomb interaction is long ranged. The Coulomb interaction is also long ranged in recipro- cal space, so the problem is not solved by performing the summation in reciprocal space. Ewald (1917a, 1917b, 1921) developed a rapidly convergent method for per- forming Coulomb summations over periodic lattices. Ewald's method is based on the following identity: a (HI RI] 2 ” = x f, expl—|R/+1—RoPp'ldp — Ie? 49? Xexpli(R,—Ry)-G]-I do , p 2.16 where | are lattice vectors, G are reciprocal-lattice vec- tors, and (2 is the volume of the unit cell. This identity erfolm|R;+i—Ro]) Mm jm Eion Ertse? m q where Z, and Z, are the valences of ions and J, respec- tively, and erfe is the complementary error function. An ion does not interact with its own Coulomb charge, so the /=0 term must be omitted from the real-space summation when 1 =J. This is indicated by the Fin the first summation in Eq. (2.17). F. Computational procedure with conventional matrix diagonalization The sequence of steps required to carry out a total- energy pseudopotential caleulation with conventional matrix diagonalization techniques is shown in the flow diagram in Fig. 7. The procedure requires an initial guess for the electronic charge density, from which the Hartree potential and the exchange-correlation potential can be calculated. The Hamiltonian matrices for each of the k points included in the calculation must be con- structed, as in Eq. (2.10), and diagonalized to obtain the Kohn-Sham cigenstates. These eigenstates will normally Rev. Mod. Phys., VOL. 64, No. 4, October 1992 4m Seta au+ ZE ex R+-RA vota E Gpo? provides a method for rewriting the lattice summation for the Coulomb energy due to the interaction between an ion positioned at R, and an array of atoms positioned at the points Rj+. The identity holds for all positive “values of q. At first sight, the infinite Coulomb summation on the left-hand side of Eg. (2.16) has been replaced by two infinite summations, one over lattice vectors and the oth- er over reciprocal-lattice vectors. However, if one chooses an appropriate value of 7 the two summations become rapidly convergent in their respective spaces. Then the real and reciprocal-space summations can be computed with only a few lattice vectors and a few reciprocal-lattice vectors. As mentioned in the preceding section, the contribu- tions to the total energy from the electron-ion, ion-ion, and electron-electron interactions at G=0 cancel exact- ly, and so the G=0 contribution to the Coulomb energy of the ionic system must be removed in order to compute the correct total energy. In the Ewald summations the G=0 contribution to the Coulomb energy has been di- vided between the real-space and the reciprocal-space summations, so that it is not sufficient simply to omit the G=0 term in the reciprocal-space Ewald summation. The G=0 term in the reciprocal-space summation should be omitted and two terms added to the Ewald en- ergy to give the correct total energy. The correct form for the total energy is (Yin and Cohen, 1982b) 1 Ig]? 4 R-R$-G|- El, cotRy RG o 217 r generate a different charge density from the one original- ly used to construct the electronic potentials, and hence a new set of Hamiltonian matríces must be constructed us- ing the new electronic potentials, The eigenstates of the new Hamiltonians are obtained, and the process is re- peated until the solutions are self-consistent. In practice the new electronic potential is taken to be a combination of the electronic potentiais generated by the old and the new eigenstates, since this speeds the convergence to self-consistency. To complete the: total-energy calcula- tion, tests should be performed to-ensure that the total energy is converged both as a function of the number of k points and as a function of the cutoff energy for the plane-wave basis set. Very few total-energy calculations are taken to absolute convergence. For most calcula- tions, the difference in energy between different ionic configurations is more important than the absolute ener- gies of the individual configurations, and the calculations are performed using an energy cutoff and number of k points at which the energy differences have converged rather than an cnergy cutoff and number of k points at which the absolute energies have converged. 1058 M.C. Payne etai.: Ab ínitio iterative minimization techniques Construct Vion given otomic numbers and positions of ions + É Pick a cutoff for the plane-ave boss ser feilk + Gir) | Pick a trial density n(z) Calculate Yy (n) ond Vyc tn) |, Sove Hy= [- teve +VontWyrtic] = E ç by diagonalization of His Catculate new n(L) 15 SOLUTION SELF-CONSISTENT ? Gompute Tool Energy |) FIG. 7. Flow chart describing the computational procedure for the calculation of the total energy of a solid, using conventional matrix diagonalization. Generate New Density nfr) G. Drawbacks of conventional procedure Pseudopotential calculations with a plane-wave basis are not very well suited to conventional matrix diagonali- zation techniques. In a total-energy pseudopotential cal- culation there are typically 100 plane-wave basis states for each atom in the system. The cost of matrix diago- nalization íncreases as the third power of the number of plane-wave basis states, and the memory required to store the Hamiltonian matrix increases as the square of the number of basis states. As a result, conventional ma- trix diagonalization techniques are restricted to the order of 1000 plane-wave basis states, and this in turn restricts the number of atoms in the unit cell to the order of 10. Using conventional matrix diagonalization methods, the Kohn-Sham eigenvalues of ail of the electronic states are calculated, although only those of the lowest occupied states are required to compute the total energy. Further- more, considerable effort is expended to compute the ei- genvalues to machine accuracy, even when the electronic potential is far from self-consistency. H. Alternative methods Tt has been demonstrated that the total-energy pseudo- potential technique gives accurate and reliable values for total energies of solids. However, as described above, the power of the pseudopotential method is severely restrict- ed when using conventional matrix diagonalization tech- niques to solve for the Kohn-Sham eigenstates. In Secs. HI, IV, and V, descriptions are given of alternative methods for performing total-energy pseudopotential cal- culations. These methods are alternative techniques for Rev. Mod. Phys., Vol. 64, No. 4, October 1892 minimizing the Kobn-Sham energy functional and lead to the same self-consistent Kohn-Sham eigenstates and eigenvalues - as conventional matrix diagonalization. However, they are much better suited to performing total-energy pseudopotential calculations because the computational time and memory requirements scale more slowly with the size of the system, allowing calcula- tions on larger and more complex systems than can be studied using conventional matrix diagonalization tech- niques. Wt. THE MOLECULAR-DYNAMICS METHOD Eigenvalue problems may be solved by successively “improving” a trial wave function. A simple illustration of this process is given in Sec. IILA. Although the Car- Parrinello method (Car and Parrineilo, 1985) should be regarded primarily as a scheme for performing ab initio dynamical simulations, the molecular-dynamics treat- ment of the electronic degrees of freedom introduced in the Car-Parrinello method can be used to calculate directly the self-consistent Kohn-Sham eigenstates of a system. In this case the method operates by carrying out a series of iterations that “improve” a set of trial wave functions until they eventually converge to the Kohn- Sham eigenstates. The total energy can be easily comput- ed once the self-consistent Kohn-Sham eigenstates have been determined. In this section we describe the molecular-dynamics treatment of the electronic degrees of freedom and show how it provides a very efficient technique for finding the electronic ground state for a fixed ionic configuration. In Sec. IILA we begin this dis- cussion with a description of a simple scheme for the iterative solution of an eigenvalue problem based on the variational principle. The molecular-dynamics-based method is not as transparent as the example presented here, but it has the common feature of varying trial wave functions until they become eigenstates. A. Eigenvalue solution by successive “improvement” of a trial wave function The variational theorem gives an upper bound for the expectation value of a Hamiltonian H for any arbitrary normalized trial wave function 4. The expectation value is greater than or equal to the energy of the lowest- energy eigenstate of the Hamiltonian. Hence ColHy) ZA, (3.1) where Ap is the energy of the lowest-energy eigenstate of the Hamiltonian H. H 4 is expanded using a set of arbitrary orthonormal basis functions [é], “Det (Eb) Substitution of Eq. (3.2) into (3,3) gives 1060 M. €. Payne etal: Ab initio iterative minimization techniques Ieampndr=s,. (3.6) “Fhese constraints are incorporated in the molecular- dynamics Lagrangian by using the method of Lagrange multipliers. The molecular-dynamies Lagrangian be- comes L=Eutiildo) Eli (Riba) +53 Ay | [eres ay 37 bj The Lagrange multipliers A, ensure that wave functions remain normalized, while the Lagrange multipliers A, (17) ensure that the wave functions remain orthogonal. As described below, the Lagrange multipliers may be thought of as providing additional forces acting on the wave functions, which ensure that the wave functions remain orthonormal. €C. Molecular-dynamics equations of motion The equations of motion for the electronic states are derived from the Lagrange equations of motion, d [9L aL Sá =, (3.8 de [adt | owt which give i) (3.9) nb HH At» 7 where H is the Kohn-Sham Hamiltonian. The force —(H%,) is the gradient of the Kohn-Sham energy functional at the point in the Hilbert space that corresponds to the wave function y,. The Lagrange mul- tipliers add forces A,tb; to the force —(H,). These forces ensure that the electronic wave functions remain orthonormai as they propagate along their molecular- dynamics trajectories. A general discussion of the conse- quences of various orthonormalization schemes is given by Broughton and Khan (1989). 4. Unconstrained equations of motion The constraints of orthonormality play a crucial role in the evolution of the electronic states in the molecular- dynamics method. To illustrate the importance of these constraints, we consider the evolution of the electronic states in the absence of any constraints: ma>—[H —o ly, where H is the Kohn-Sham Hamiltonian, and o is an en- ergy shift that defines the zero of energy. Tf 1, is expanded in the basis set of the eigenstates of Hamiltonian H, = Gnbno (3.10) SID Rev. Mod. Phys. Vol, 64, No. 4, October 1892 and if Eq. (3.11) is substituted into (3.10), the following equation of motion for the coefficient of the cigenstate É, is obtained: HE — a 12) —O in where £, is the eigenvalue corresponding to cigenstate E,- Integration of these equations of motion, under the assumption that the velocities of the coefficients are ini- tially zero, gives the coefficients at time tas =cin(Ocosf[(e,—0)/n]'2t), en>o, (3.139) cilb=e;n(Ocosh(e,—0l/u)2t], enHW+E Agy; 4 equations of (3.14) ensure that the electronic wave functions remain ortho- normal at every instant in time. The molecular-dynamics evolution of the electronic wáve functions under these equations of motion would also conserve the total energy in the electronic degrees of freedom for the system of fixed ions we assume for this section. However, to ensure these properties, the values of the Lagrange multipliers must vary continuously with time, and so implementa- tion of this form of the molecular-dynamics equations re- quires that the Lagrange multipliers be evaluated at infinitely small time separations. To make the calcula- tions tractable, variation of the Lagrange multipliers dur- ing a time step is neglected and the Lagrange multipliers are approximated by a constant value during the time step. In this case the wave functions will not be exactly orthonormal at the end of the time step, and a separate orthonormalization step is needed in the calculation. 3. Partially constrained equations of motion Since a separate orthonormalization step is required at the end of each time step, it is possible to remove the M. C. Payne etal.: Ab initio iterative minimization techniques 1061 constraints of orthogonality from the equation of motion and use a partia!ly constrained equation of motion. The constraints of orthogonality are then imposed after the equations of motion have been integrated, and the Lagrange multipliers for the constraints of normalization Ay are approximated by the expectation values of the en- ergies of the states, À,, where = ChilH;) (3.15) This leads to an equation of motion that has the form ut LE A. (3.16) With this equation of motion, the acceleration of an elec- tronie state is always orthogonal to that state, a necessary requirement to maintain normalization, and the accelera- tion becomes zero when the wave function is an exact cigenstate. D. Integration of equations of motion Once the accelerations of the coefficients have been calculated, the equations of motion for the coefficients of the plane-wave basis states have to be integrated. Car and Parrinello used the Verlet algorithm (Verlet, 1967) to integrate the equations of motion. 1. The Verlet algorithm The Verlet algorithm is derived from the simplest second-order difference equation for the second deriva- tive, It gives the value of the ith electronic state at the next time step, Y; (At), as HAD 20) WA AD FARO), (3.178) where At is the length of the time step, 4;(0) is the value of the state at the present time step, and :;(— At is the value of the state at the last time step. Substitution of Eg. (3.16) into (3.17a) then gives 2 vAs=20 (0) pl do ELE =, WO) (3.17b) The Verlet algorithm introduces an error of order At* into the integration of the equations of motion. A more sophisticated finite-difference algorithm could be used to integrate the equations of motion and hence reduce the error in the integration to a higher order of At. In princi- ple, for a given level of accuracy this would allow a longer time step to be used in the integration of the equa- tions of motion and hence reduce the total computational time by reducing the number of time steps required to perform the calculation. The maximum stable time step, however, is not significantly increased with a higher- order difference scheme. A more sophisticated finite- difference equation would also require the values of the coefficients or the corresponding accelerations from a Rev. Mod. Phys., Vol. 64, No. 4, Octobor 1992 larger number of time steps in order to integrate the equations of motion. It requires a large amount of memory to store the wave-function coefficients and ac- celerations for each time step in a total-energy pseudopo- tential calculation. If the extra coefficients and accelera- tions did not fit into core memory, the computation could become 1/0 bound and the total time required for the calculation may actually increase. 2. Stability of the Verlet algorithm A general performance measure of algorithms of the Verlet type is the rate at which they converge to minimum-energy state. A piven problem normally re- quires a certain amount of real time to converge, and the computational effort is then determined by the size of the time step Át. In what follows it is demonstrated that the largest At allowed for stability is related to the difference between the largest and smallest eigenvalues of the sys- tem. Given the assumption that 1, is near the lowest-energy eigenstate £y, the state 1), is expanded as in Eg. (3.11), SEA E ADE, (3.18) ao where-the 3, represent infinitesimal coefficients. Substi- tution of Eg. (3.18) into (3.17b) gives to first order in 8 A Ball) =28(0) Bl — Ate) [84 Eo]80) (3.19) In the standard stability analysis (see Mathews and Walk- er, 1970), a constant growth factor g is introduced at each time step, so that SlndD=g8 41 — DAL). (3.20) Substitution of Eq. (3.20) into (3.19) then gives 2 g"-2g +1+ “o (Ea Eng =0, (3.21) and the real part of g can become greater than 1 if 2 1/2 dt> Ta (3.22a) tes—Eo) Therefore the largest possible Ar that is allowed for sta- bility must be aro , 3.22b) (Emas E9)!2 é , where Emax is the largest eigenvalue of the problem. For a Hamiltonian representation in a plane-wave basis, the largest eigenvalue is primarily determined by the cutoff kinetic energy of the basis set. Thus'the Verlet algorithm will require the time step to be reduced as the cutoff energy is increased. This problem is addressed again in Sec. IV.A. M. €. Payne eta. Ab initio iterative minimization techniques 1063 H. Kohn-Sham eigenstates, The Kohn-Sham energy functional is minimized by any set of wave functions that are a linear combination of the lowest-energy Kohn-Sham eigenstates. These wave functions will be stationary under the molecular- dynamics eguations of motion and subsequent orthogo- nalization. Therefore, in the molecular-dynamics method, each electronic wave function will, in general, converge to a linear combination of the lowest-energy Kohn-Sham eigenstates. This is not a problem for sys- tems with a gap, but it can be a severe problem for metal- lie systems in which the occupancy of a state depends on its eigenvalue. Some ideas for handling metals in Car- Parrinello dynamical simulations have been proposed by Fernando et al. (1989), Woodward et al. (1989), Benedek et al. (1991), and Pederson and Jackson (1991). The ac- tual Kohn-Sham eigenvalues can be found by diagonali- zation of the matrix whose matrix elements are given by O=ChilHy,). (3.24) A simpler approach, which guarantees convergence to Kohn-Sham eigenstates, is presented later in Sec. IV.B. À. Computational procedure with molecular dynamics The procedure for performing a total-energy pseudo- potential calculation using the molecular-dynamics tech- nique is shown in the flow diagram of Fig. 11. The pro- Construct Vion given atomic numbers and positions of ions Pick a cutoff for the plane-wave basis set Choose initiat wave functions y, Calculate n (rh Colculate Yy tn) and Vyc(n] Construct KOHN-SHAM Hamiltonion H Coteulate pur; == [H-AJ Yi Integrote equatioes of motion Orthogonalize and normalize wave functions ARE WAVE FUNCTIONS STATIONARY 7 ves NO Compute Tofol Energy FIG. Lt. Flow chart describing the computational procedure for the calculation of the total energy of a solid with molecular dynamics. Rev. Mod. Phys., Vol. 64, No. 4, October 1992 cedure requires an initial set of trial wave functions from which the Hartree potential and the exchange-correlation potential can be calculated. The Hamiltonian matrices for each of the k points included in the calculation are constructed, and from these the accelerations of the wave functions are calculated. The equations of motion for the electronic states are integrated, and the wave functions are orthogonalized and normalized. The charge density generated by the new set of wave functions is then calcu- lated. This charge density is used to construct a new set of Hamiltonian matrices, and a further set of wave func- tions is obtained by integration of the equations of motion and orthonormalization of the resultant wave functions. These iterations are repeated until the wave functions are státionary. The wave functions are then linear combinations of the Kohn-Sham eigenstates. The Kohn-Sham energy functional is minimized, and its value gives the total energy of the system. The solution is iden- tical to the solution that would be obtained by using ma- trix diagonalization techniques with the same basis states. The convergence tests described in Sec. IL.F must be performed to ensure that the calculated total energy has converged both as a function of the number of k- points included in the calculation and as a function of the cutoff energy for the plane-wave basis set. J. Instabilities and fluctuations in the Kohn-Sham energy In Sec. HI.D.2 above a criterion was derived that pro- vides an upper bound to the targest stable time step in the Verlet algorithm. There are, however, additional limita- tions to this time step that are much more subtle. These limitations are related to instabilities in the Kohn-Sham energy and can arise when the Hamiltonian is allowed to evolve under the eguations of motion, as required by self-consistency. These instabilities are caused by charge fluctuations and are commonly referred to as charge sloshing. As discussed in Sec. IILG above, the Hartree and exchange-correlation potentials of the Kohn-Sham Ham- iltonian depend on the electronic density and must change after each time step. If these changes are too large, the problem becomes unstable and the time step must be reduced. This instability is indicated schemati- cally in Fig. 12. The trajectories in this figure should be contrasted with the trajectories shown in Fig. 10. The major difficulty lies with the Hartree potential, VulG), in Eg. (2.10). VylG) is proportional to n(G)/|G|?, where n(G) is the Fourier transform of the charge density. Therefore, at small reciprocal-lattice vec- tors, a small change in 4 (G) will produce large changes in the potential. Since these changes are not taken into account during an elemental time step, they can lead to large increases in energy if the time step is too large. This is particularly true for systems with sufficiently small reciprocal-lattice vectors. The smallest (nonzero) reciprocal-lattice vector is inversely proportional to the 1064 M. C. Payne etal.: Ab initio iterative minimization techniques FIG. 12. Illustration of instability in the molecular-dynamies method and Kohn-Sham Hamiltonian as a consequence of a large time step. The convention is the same as in Fig. 10. Note that each iteration drives the coeffcients further from their equilibrium value. tongest length of the supercell. Therefore, as supercells become physically larger, the stability of the problem eventually becomes dominated by “charge sloshing” con- siderations, and the time step must be reduced 10 keep the fluctuations small. A related difficulty, which leads to a similar phenomenon, arises if many states with nearly the same eigenvalue exist in the neighborhood of the Fermi energy. Under these circumstances, macroscopic ascillations in electron density can occur with very little change in total energy. The problem is particularly serious for metals. Since small energy changes may yield large differences in electron density, and hence in forces, close convergence of the electrons to their ground state is necessary in me- tallic systems. Thus the largest stable time step in the Car-Parrincilo algorithm is dominated by either the maximum kinetic energy in the problem (as discussed in Sec. IILD.2) or the need to limit charge sloshing. One of these two con- siderations will place a practical limit on the size and type of problem that can be attacked with the Car- Parrinello algorithm. Despite these limitations, however, there are many problems for which the method has out- standing utility. Finally, it should be noted that, even for an infinitesimal time step, the Kohn-Sham energy will not always decrease monotonically during the evolution of the electronic system to its ground state. Insufficient damping of the wave-function coefficients in the equa- tions of motion, for example, will result in fluctuations of the Kohn-Sham energy during this evolution. This is simply the expected dynamical behavior of an under- damped system and will not prevent the electrons from eventuaily reaching their ground state. K. Computational cost of molecular dynamics The molecular-dynamics method provides a general technique for solving eigenvalue problems. However, Rev. Mod. Phys., Vol. 64, No, 4, October 1992 this method was developed specifically in the context of total-energy pseudopotential calculations, and in this sec- tion the computational cost of using the molecular- dynamics method. to perform total-energy pseudopoten- tial calculations will be calculated. An important feature of any computational method is the rate at which the computational time increases with the size of the system, and so particular attention will be paid to (he operations that increase fastest as the size of the system increases. In a total-cenergy pseudopotential calculation the wave functions 4; are expanded in a plane-wave basis set so that PAN=D Gps gesplilk+G)r]. (3.25) G A feature of total-energy pseudopotential calculations is that the number of plane-wave basis states used in the calculations is always much larger than the number of oceupied bands. The ratio is typically 100:1. This ratio is independent of the size of the system: increasing the size of the unit cell will increase the number of atoms in the unit cell. However, the lengths of the reciprocal- lattice vectors will be reduced, so that there will be more plane waves with energies below the cutoff energy for the plane-wave basis set. The ratio of the number of plane waves to the number of occupied bands is considerably larger when the system contains any transition-metal atoms or first-row atoms. These atoms have strong pseu- dopotentials, and much larger basis sets are needed to ex- pand the electronic wave functions. Only the parts of the total-energy calculation that scale faster than linearly with the size of the system will be considered in the fol- lowing discussion. The parts of the calculation that scale linearly with the size of the system do not have large “prefactors” associated with their computational cost, so the computational time required for these parts of the calculation is negligible. A calculation for a system that has Na occupied bands in which the electronic wave functions are expanded in a basis set containing Ney plane waves will be considered. The operations described below scale linearly with the number of k points so that a calculation for a single k point will be considered. 1. Calculation of charge density The charge density is most cheaply computed in real space because it is simply the square of the magnitude of the wave function. This requires that the wave functions be Fourier transformed from reciprocal space to real space. However, the charge density has components with wave vectors up to twice the cutoff wave vector for the electronic wave functions. Hence, to maintain a faithful representation of the charge density, one must compute it on a Fourier grid twice as dense in each spa- tial direction as the grid required to provide a faithful representation cf the wave function. Furthermore, the plane-wave components of the wave function, which lie within a sphere in reciprocal space, must be placed-in an