PsiAPI Tutorial: Using Psi4 as a Python Module¶
transcribed by D. A. Sirianni
psi4.
in
front, then submit it to the executable psi4
which processes the
Psithon into pure Python and runs it internally. In PsiAPI mode, you
write a pure Python script with import psi4
at the top and commands
are behind the psi4.
namespace, then submit it to the python
interpreter. Both modes are equally powerful. This tutorial covers the
PsiAPI mode.psi4.
namespace, the API
should not be considered completely stable. Most importantly, as we
someday deprecate the last of the global variables, options will be
added to the method calls (e.g.,
energy('scf', molecule=mol, options=opt)
)Unlike in the past, where Psi4 was executable software which could only
be called via input files like input.dat
, it is now interactive,
able to be loaded directly as a Python module. Here, we will explore the
basics of using Psi4 in this new style by reproducing the section A
Psi4 Tutorial
from the Psi4 manual in an interactive Jupyter Notebook.
Note: If the newest version of Psi4 (v.1.1a2dev42 or newer) is in your
path, feel free to execute each cell as you read along by pressing
Shift+Enter
when the cell is selected.
I. Basic Input Structure¶
Psi4 is now a Python module; so, we need to import it into our Python environment:
In [1]:
try:
import os, sys
sys.path.insert(1, os.path.abspath('/scratch/psilocaluser/conda-builds/psi4_1495011512596/work/build/stage//scratch/psilocaluser/conda-builds/psi4_1495011512596/_b_env_placehold_placehold/lib/python3.5/site-packages'))
except ImportError:
pass
import psi4
Psi4 is now able to be controlled directly from Python. By default, Psi4
will print any output to the screen; this can be changed by giving a
file name (with path if not in the current working directory) to the
function psi4.core.set_output_file()
API,
as a string:
In [2]:
psi4.core.set_output_file('output.dat', False)
Additionally, output may be suppressed by instead setting
psi4.core.be_quiet()
API.
II. Running a Basic Hartree-Fock Calculation¶
In our first example, we will consider a Hartree-Fock SCF computation
for the water molecule using a cc-pVDZ basis set. First, we will set the
available memory for Psi4 to use with the psi4.set_memory()
API
function, which takes either a string like '30 GB'
(with units!) or
an integer number of bytes of memory as its argument. Next, our
molecular geometry is passed as a string into psi4.geometry()
API.
We may input this geometry in either Z-matrix or Cartesian format; to
allow the string to break over multiple lines, use Python’s triple-quote
"""string"""
syntax. Finally, we will compute the Hartree-Fock SCF
energy with the cc-pVDZ basis set by passing the method/basis set as a
string ('scf/cc-pvdz'
) into the function psi4.energy()
API:
In [3]:
#! Sample HF/cc-pVDZ H2O Computation
psi4.set_memory('500 MB')
h2o = psi4.geometry("""
O
H 1 0.96
H 1 0.96 2 104.5
""")
psi4.energy('scf/cc-pvdz')
Out[3]:
-76.02663273488399
If everything goes well, the computation should complete and should
report a final restricted Hartree-Fock energy in the output file
output.dat
in a section like this:
Energy converged.
@DF-RHF Final Energy: -76.02663273486682
By default, the energy should be converged to about \(1.0\times 10^{-6}\), so agreement is only expected for about the first 6 digits after the decimal. If the computation does not complete, there is probably a problem with the compilation or installation of the program (see the installation instructions in the main Psi4 manual section Compiling and Installing from Source).
This very simple input is sufficient to run the requested information. Notice we didn’t tell the program some otherwise useful information like the charge of the molecule (0, it’s neutral), the spin multiplicity (1 for a closed-shell molecule with all electrons paired), or the reference wavefunction to use (restricted Hartree-Fock, or RHF, is usually appropriate for a closed-shell molecule). The program correctly guessed all of these options for us. We can change the default behavior through additional keywords.
Let’s consider what we would do for an open-shell molecule, where not
all the electrons are paired. For example, let’s run a computation on
methylene (\(\text{CH}_2\)), whose ground electronic state has two
unpaired electrons (triplet electronic state, or a spin multiplicity
\(2S + 1 = 3\)). In this case, the default spin multiplicity (1) is
not correct, so we need to tell the program the true value (3). Like
many programs, Psi4 can get the charge and multiplicity as the first two
integers in the Z-matrix (note the line with 0 3
at the beginning of
the molecule specification below). In this example, we will also specify
the bond length and bond angle as variables (\(R\) and \(A\)),
whose values are first stored and then inserted into the geometry
specification using Python 3 string
formatting.
In [4]:
#! Sample UHF/6-31G** CH2 Computation
R = 1.075
A = 133.93
ch2 = psi4.geometry("""
0 3
C
H 1 {0}
H 1 {0} 2 {1}
""".format(R, A)
)
psi4.set_options({'reference': 'uhf'})
psi4.energy('scf/6-31g**')
Out[4]:
-38.925334628859886
Executing this cell should yield the final energy as
@DF-UHF Final Energy: -38.92533462887677
Notice the new command, psi4.set_options()
API,
to the input. This function takes a Python
dictionary
as its argument, which is a key-value list which associates a Psi4
keyword
with its user-defined value. For open shell molecules, we have a choice
of unrestricted orbitals (unrestricted Hartree-Fock, or UHF) or
restricted orbitals (restricted open-shell Hartree-Fock, or ROHF).
Usually, UHF is a little easier to converge (although it may be more
susceptible to spin contamination than ROHF).
III. Geometry Optimization and Vibrational Frequency Analysis¶
The above examples were simple single-point energy computations (as
specified by the psi4.energy()
API
function). Of course there are other kinds of computations to perform,
such as geometry optimizations and vibrational frequency computations.
These can be specified by replacing psi4.energy()
API
with psi4.optimize()
API
or psi4.frequency()
API,
respectively.
Let’s take a look at an example of optimizing the H\(_2\)O molecule using Hartree-Fock with a cc-pVDZ basis set.
Now, here comes the real beauty of running Psi4 interactively: above,
when we computed the energy of H\(_2\)O with HF/cc-pVDZ, we
defined the Psi4 molecule object h2o
. Since we’re still in the
Python shell, as long as you executed that block of code, we can reuse
the h2o
molecule object in our optimization without redefining it,
by adding the molecule=h2o
argument to the psi4.optimize()
API
function:
In [5]:
psi4.set_options({'reference': 'rhf'})
psi4.optimize('scf/cc-pvdz', molecule=h2o)
Optimizer: Optimization complete!
Out[5]:
-76.02703272937504
This should perform a series of gradient computations. The gradient
points which way is downhill in energy, and the optimizer then modifies
the geometry to follow the gradient. After a few cycles, the geometry
should converge with a message like Optimization complete!
. As
indicated in the following table (printed by the optimizer at the end of
the computation and grep-able with ~
), the energy decreases with
each step, and the maximum force on each atom also decreases with each
step (in principle, these numbers could increase in some iterations, but
here they do not).
--------------------------------------------------------------------------------------------------------------- ~ Step Total Energy Delta E MAX Force RMS Force MAX Disp RMS Disp ~ --------------------------------------------------------------------------------------------------------------- ~ 1 -76.026632734908 -76.026632734908 0.01523518 0.01245755 0.02742222 0.02277530 ~ 2 -76.027022666011 -0.000389931104 0.00178779 0.00142946 0.01008137 0.00594928 ~ 3 -76.027032729374 -0.000010063363 0.00014019 0.00008488 0.00077463 0.00044738 ~ --------------------------------------------------------------------------------------------------------------- ~
To get harmonic vibrational frequencies, it’s important to keep in mind
that the values of the vibrational frequencies are a function of the
molecular geometry. Therefore, it’s important to obtain the vibrational
frequencies AT THE OPTIMIZED GEOMETRY. Luckily, Psi4 updates the
molecule with optimized geometry as it is being optimized. So, the
optimized geometry for H\(_2\)O is stored inside the h2o
molecule object, which we can access! To compute the frequencies, all we
need to do is to again pass the molecule=h2o
argument to the
psi4.frequency()
API
function:
In [6]:
scf_e, scf_wfn = psi4.frequency('scf/cc-pvdz', molecule=h2o, return_wfn=True, dertype=1)
6 displacements needed.
1 2 3 4 5 6
Executing this cell will prompt Psi4 to compute the Hessian (second derivative matrix) of the electronic energy with respect to nuclear displacements. From this, it can obtain the harmonic vibrational frequencies, given below (roundoff errors of around \(0.1\) cm\(^{-1}\) may exist):
Irrep Harmonic Frequency
(cm-1)
-----------------------------------------------
A1 1775.6478
A1 4113.3795
B2 4212.1814
-----------------------------------------------
Notice the symmetry type of the normal modes is specified (A1, A1, B2). The program also print out the normal modes in terms of Cartesian coordinates of each atom. For example, the normal mode at \(1776\) cm\(^{-1}\) is:
Frequency: 1775.65
Force constant: 0.1193
X Y Z mass
O 0.000 0.000 -0.068 15.994915
H 0.000 0.416 0.536 1.007825
H 0.000 -0.416 0.536 1.007825
where the table shows the displacements in the X, Y, and Z dimensions for each atom along the normal mode coordinate. (This information could be used to animate the vibrational frequency using visualization software.)
Because the vibrational frequencies are available, a thermodynamics
analysis is automatically performed at the end of the computation. You
can see this in the next section of the output file output.dat
. The
vibrational frequencies are sufficient to obtain vibrational
contributions to enthalpy (H), entropy (S), and Gibbs free energy (G).
Similarly, the molecular geometry is used to obtain rotational
constants, which are then used to obtain rotational contributions to H,
S, and G.
Note: Psi4 has several synonyms for the functions called in this
example. For instance, psi4.frequency()
API
will compute molecular vibrational frequencies, and psi4.optimize()
API
will perform a geometry optimization.
IV. Analysis of Intermolecular Interactions¶
Now let’s consider something a little more interesting. Psi4 contains code to analyze the nature of intermolecular interactions between two molecules, via symmetry-adapted perturbation theory (SAPT) (Jeziorski:1994:1887). This kind of analysis gives a lot of insight into the nature of intermolecular interactions, and Psi4 makes these computations easier than ever.
For a SAPT computation, the input needs to provide information on two distinct molecules. This is very easy, we just give a Z-matrix or set of Cartesian coordinates for each molecule, and separate the two with two dashes, like this:
In [7]:
# Example SAPT computation for ethene*ethyne (*i.e.*, ethylene*acetylene).
# Test case 16 from S22 Database
dimer = psi4.geometry("""
0 1
C 0.000000 -0.667578 -2.124659
C 0.000000 0.667578 -2.124659
H 0.923621 -1.232253 -2.126185
H -0.923621 -1.232253 -2.126185
H -0.923621 1.232253 -2.126185
H 0.923621 1.232253 -2.126185
--
0 1
C 0.000000 0.000000 2.900503
C 0.000000 0.000000 1.693240
H 0.000000 0.000000 0.627352
H 0.000000 0.000000 3.963929
units angstrom
""")
Here’s the second half of the input, where we specify the computation options:
In [8]:
psi4.set_options({'scf_type': 'df',
'freeze_core': 'true'})
psi4.energy('sapt0/jun-cc-pvdz', molecule=dimer)
Out[8]:
-0.0022355825227244703
All of the options we have currently set using psi4.set_options()
API
are “global” options (meaning that they are visible to all parts of the
program). Most common Psi4 options can be set like this. If an option
needs to be visible only to one part of the program (e.g., we only
want to increase the energy convergence in the SCF code, but not the
rest of the code), it can be set by with the
psi4.set_module_options()
API
function, psi4.set_module_options('scf', {'e_convergence': '1e-8'})
.
Note: The arguments to the functions we’ve used so far, like
psi4.set_options()
API,
psi4.set_module_options()
API,
psi4.energy()
API,
psi4.optimize()
API,
psi4.frequency()
API,
etc., are case-insensitive.
Back to our SAPT example, we have found that for basic-level SAPT
computations (i.e., SAPT0), a good error cancellation is found
(Hohenstein:2012:WIREs)
with the jun-cc-pVDZ basis (this is the usual aug-cc-pVDZ basis, but
without diffuse functions on hydrogen and without diffuse \(d\)
functions on heavy atoms)
(Papajak:2011:10).
So, we’ll use that as our standard basis set. The SAPT code is designed
to use density fitting techniques, because they introduce minimal errors
while providing much faster computations
(Hohenstein:2010:184111,Hohenstein:2010:014101).
Since we’re using density fitting for the SAPT, we might as well also
use it for the Hartree-Fock computations that are performed as part of
the SAPT. We can specify that by adding 'scf_type': 'df'
to the
dictionary passed to psi4.set_options()
.
Density fitting procedures require the use of auxiliary basis sets that pair with the primary basis set. Fortunately, Psi4 is usually smart enough to figure out what auxiliary basis sets are needed for a given computation. In this case, jun-cc-pVDZ is a standard enough basis set (just a simple truncation of the very popular aug-cc-pVDZ basis set) that Psi4 correctly guesses that we want the jun-cc-pVDZ-JKFIT auxiliary basis for the Hartree-Fock, and the jun-cc-pVDZ-RI basis set for the SAPT procedure.
To speed up the computation a little, we also tell the SAPT procedure to
freeze the core electrons by adding 'freeze_core': 'true'
to the
dictionary passed to psi4.set_options()
. The SAPT procedure is
invoked by psi4.energy('sapt0/jun-cc-pvdz', molecule=dimer)
. Here,
Psi4 knows to automatically run two monomer computations and a dimer
computation and then use these results to perform the SAPT analysis. The
various energy components are printed at the end of the output, in
addition to the total SAPT0 interaction energy. An explanation of the
various energy components can be found in the review by Jeziorski,
Moszynski, and Szalewicz
(Jeziorski:1994:1887),
and this is discussed in more detail in the SAPT
section of the Psi4
manual.
For now, we’ll note that most of the SAPT energy components are
negative; this means those are attractive contributions (the zero of
energy in a SAPT computation is defined as non-interacting monomers).
The exchange contributions are positive (repulsive). In this example,
the most attractive contribution between ethylene and acetylene is an
electrostatic term of \(-2.12\) kcal/mol (Elst10,r
where the 1
indicates the first-order perturbation theory result with respect to the
intermolecular interaction, and the 0 indicates zeroth-order with
respect to the intramolecular electron correlation). The next most
attractive contribution is the Disp20
term (second order
intermolecular dispersion, which looks like MP2 in which one excitation
is placed on each monomer), contributing an attraction of \(-1.21\)
kcal/mol. It is not surprising that the electrostatic contribution is
dominant, because the geometry chosen for this example has the acetylene
perpendicular to the ethylene, with the acetylene hydrogen pointing
directly at the double bond in ethylene; this will be attractive because
the H atoms in acetylene bear a partial positive charge, while the
electron rich double bond in ethylene bears a partial negative charge.
At the same time, the dispersion interaction should be smaller because
the perpendicular geometry does not allow much overlap between the
monomers. Hence, the SAPT analysis helps clarify (and quantify) our
physical understanding about the interaction between the two monomers.
V. Potential Surface Scans and Counterpoise Correction Made Easy¶
Finally, let’s consider an example which highlights the advantages of being able to interact with Psi4 directly with Python.
Suppose you want to do a limited potential energy surface scan, such as
computing the interaction energy between two neon atoms at various
interatomic distances. One simple but unappealing way to do this is to
generate separate geometries for each distance to be studied. Instead,
we can leverage Python loops and string formatting to make our lives
simpler. Additionally, let’s suppose you want to do counterpoise (CP)
correction to compute interaction energies. Counterpoise correction
involves computing the dimer energy and then subtracting out the
energies of the two monomers, each evaluated in the dimer basis. Again,
each of these computations could be run in a separate input file, but
because counterpoise correction is a fairly standard procedure for
intermolecular interactions, Psi4 knows about it and has a built-in
routine to perform counterpoise correction. It only needs to know what
method you want to do the couterpoise correction on (here, let’s
consider CCSD(T)), and it needs to know what’s monomer A and what’s
monomer B. This last issue of specifying the monomers separately was
already dealt with in the previous SAPT example, where we saw that two
dashes in the psi4.geometry()
string can be used to separate
monomers.
So, we’re going to do counterpoise-corrected CCSD(T) energies for Ne\(_2\) at a series of different interatomic distances. And let’s print out a table of the interatomic distances we’ve considered, and the CP-corrected CCSD(T) interaction energies (in kcal/mol) at each geometry:
In [9]:
#! Example potential energy surface scan and CP-correction for Ne2
ne2_geometry = """
Ne
--
Ne 1 {0}
"""
Rvals = [2.5, 3.0, 4.0]
psi4.set_options({'freeze_core': 'true'})
# Initialize a blank dictionary of counterpoise corrected energies
# (Need this for the syntax below to work)
ecp = {}
for R in Rvals:
ne2 = psi4.geometry(ne2_geometry.format(R))
ecp[R] = psi4.energy('ccsd(t)/aug-cc-pvdz', bsse_type='cp', molecule=ne2)
# Prints to screen
print("CP-corrected CCSD(T)/aug-cc-pVDZ Interaction Energies\n\n")
print(" R [Ang] E_int [kcal/mol] ")
print("---------------------------------------------------------")
for R in Rvals:
e = ecp[R] * psi4.constants.hartree2kcalmol
print(" {:3.1f} {:1.6f}".format(R, e))
# Prints to output.dat
psi4.core.print_out("CP-corrected CCSD(T)/aug-cc-pVDZ Interaction Energies\n\n")
psi4.core.print_out(" R [Ang] E_int [kcal/mol] \n")
psi4.core.print_out("---------------------------------------------------------\n")
for R in Rvals:
e = ecp[R] * psi4.constants.hartree2kcalmol
psi4.core.print_out(" {:3.1f} {:1.6f}\n".format(R, e))
CP-corrected CCSD(T)/aug-cc-pVDZ Interaction Energies
R [Ang] E_int [kcal/mol]
---------------------------------------------------------
2.5 0.758605
3.0 0.015968
4.0 -0.016215
First, you can see the geometry string ne2_geometry
has a two dashes
to separate the monomers from each other. Also note we’ve used a
Z-matrix to specify the geometry, and we’ve used a variable (R
) as
the interatomic distance. We have not specified the value of R
in
the ne2_geometry
string like we normally would. That’s because we
are going to vary it during the scan across the potential energy
surface, by using a Python loop over the list of interatomic distances
Rvals
. Before we are able to pass our molecule to Psi4, we need to
do two things. First, we must set the value of the intermolecular
separation in our Z-matrix (by using Python 3 string
formatting)
to the particular value of R. Second, we need to turn the Z-matrix
string into a Psi4 molecule, by passing it to
`psi4.geometry()
<http://psicode.org/psi4manual/master/api/psi4.driver.geometry.html#psi4.driver.geometry>`__.
The argument bsse_type='cp'
tells Psi4 to perform counterpoise (CP)
correction on the dimer to compute the CCSD(T)/aug-cc-pVDZ interaction
energy, which is stored in our ecp
dictionary at each iteration of
our Python loop. Note that we didn’t need to specify ghost atoms, and we
didn’t need to call the monomer and dimer computations separately. Psi4
does it all for us, automatically.
Near the very end of the output file output.dat
, the counterpoise
correction Python function will print a nice summary of the results of
the counterpoise computation (the energies of the dimer, of monomer 1
with the ghost functions of monomer 2, of monomer 2 with the ghost
functions of monomer 1, and the overall counterpoise corrected
interaction energy):
N-Body: Computing complex (1/2) with fragments (2,) in the basis of fragments (1, 2).
...
N-Body: Complex Energy (fragments = (2,), basis = (1, 2): -128.70932405488924)
...
N-Body: Computing complex (2/2) with fragments (1,) in the basis of fragments (1, 2).
...
N-Body: Complex Energy (fragments = (1,), basis = (1, 2): -128.70932405488935)
...
N-Body: Computing complex (1/1) with fragments (1, 2) in the basis of fragments (1, 2).
...
N-Body: Complex Energy (fragments = (1, 2), basis = (1, 2): -257.41867403127321)
...
==> N-Body: Counterpoise Corrected (CP) energies <==
n-Body Total Energy [Eh] I.E. [kcal/mol] Delta [kcal/mol]
1 -257.418648109779 0.000000000000 0.000000000000
2 -257.418674031273 -0.016265984132 -0.016265984132
And that’s it! The only remaining part of the example is a little table
of the different R
values and the CP-corrected CCSD(T) energies,
converted from atomic units (Hartree) to kcal mol\(^{-1}\) by
multiplying by the automatically-defined conversion factor
psi4.constants.hartree2kcalmol
. Psi4 provides several built-in
physical constants and conversion factors, as described in the Psi4
manual section Physical
Constants.
The table can be printed either to the screen, by using standard Python
``print()`
syntax <https://docs.python.org/3/whatsnew/3.0.html#print-is-a-function>`__,
or to the designated output file output.dat
using Psi4’s built-in
function psi4.core.print_out()
API
(C style printing).
As we’ve seen so far, the combination of Psi4 and Python creates a unique, interactive approach to quantum chemistry. The next section will explore this synergistic relationship in greater detail, describing how even very complex tasks can be done very easily with Psi4.
In [ ]: