
Configuration: Preparing PSI4’s Environment

Scratch Files and the ~/.psi4rc File

An important part of user configuration at the end of the installation process is telling PSI4 where to write its temporary (“scratch”) files. Electronic structure packages like PSI4 can create very large temporary disk files, so it is essential that PSI4 writes them to a disk drive physically attached to the computer running the computation. Writing scratch files to a network-mounted disk instead will significantly slow down both the program and the network. By default, PSI4 writes temporary files to /tmp, but this directory is often not large enough for typical computations. Therefore, you need to (a) make sure there is a sufficiently large directory on a locally attached disk drive (100 GB–1 TB or more, depending on the size of the molecules to be studied) and (b) tell PSI4 the path to this directory. The PSI4 installation instructions explain how to set up a resource file, ~/.psi4rc (example: psi4/samples/example_psi4rc_file), for each user that provides this information.
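Before pointing PSI4 at a scratch directory, it can help to confirm that the directory exists, is writable, and has enough free space. The following is a minimal sketch using only the Python standard library; the helper name scratch_is_usable and the 100 GB default threshold are illustrative assumptions, not part of PSI4. It cannot tell whether the disk is locally attached, which still has to be checked by hand.

```python
import os
import shutil


def scratch_is_usable(path, min_free_gb=100.0):
    """Return True if `path` is a writable directory with enough free space.

    Hypothetical helper for illustration; not part of PSI4.
    """
    if not os.path.isdir(path) or not os.access(path, os.W_OK):
        return False
    free_gb = shutil.disk_usage(path).free / 1e9  # bytes -> decimal GB
    return free_gb >= min_free_gb


# Check the default scratch location against the suggested minimum size
print(scratch_is_usable('/tmp', min_free_gb=100.0))
```

If the check fails for /tmp, choose a larger local directory and pass its path to PSI4 as described in this section.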

For convenience, the Python interpreter will execute the contents of the ~/.psi4rc file in the current user’s home area (if present) before performing any tasks in the input file. The primary use of the ~/.psi4rc file is to control the handling of scratch files. PSI4 has a number of utilities that manage input and output (I/O) of quantities to and from the hard disk. Most quantities, such as molecular integrals, are intermediates that are not of interest to the user and can be deleted after the computation finishes, but pertinent details of computations are also written to a checkpoint file and might be useful in subsequent computations. All files are sequentially numbered and are written to /tmp, then deleted at the end of the computation, unless otherwise instructed by the user.

A Python callable handle to the PSI4 I/O management routines is available, and is called psi4_io. To instruct the I/O manager to send all files to another location, say /scratch/user, add the following command to the ~/.psi4rc file (note the trailing “/”):

psi4_io.set_default_path('/scratch/user/')

For batch jobs running through a queue, it might be more convenient to use an environment variable (in this case $MYSCRATCH) to set the scratch directory; the following code will do that:

import os

# Use $MYSCRATCH as the scratch directory if the variable is set
scratch_dir = os.environ.get('MYSCRATCH')
if scratch_dir:
    psi4_io.set_default_path(scratch_dir + '/')

Individual files can be sent to specific locations. For example, file 32 is the checkpoint file, which the user might want to retain in the working directory (i.e., where PSI4 was launched from) for restart purposes. This is accomplished by the commands below:

psi4_io.set_specific_path(32, './')
psi4_io.set_specific_retention(32, True)

To circumvent difficulties with running multiple jobs in the same scratch, the process ID (PID) of the PSI4 instance is incorporated into the full file name; therefore, it is safe to use the same scratch directory for calculations running simultaneously.

To override any of these defaults for selected jobs, simply place the appropriate commands from the snippets above in the input file itself. During execution, the ~/.psi4rc defaults are loaded first, and then the commands in the input file are executed. Executing PSI4 with the psi4 -m (for messy) flag will prevent files from being deleted at the end of the run:

psi4 -m

Alternatively, the scratch directory can be set through the environment variable PSI_SCRATCH (overrides ~/.psi4rc settings).
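The precedence described in this section (PSI_SCRATCH first, then the path set in ~/.psi4rc, then the /tmp default) can be sketched as a small function. This is only an illustration of the stated rules, not PSI4's own implementation; the function name resolve_scratch and its arguments are hypothetical.

```python
import os


def resolve_scratch(rc_path=None, environ=None):
    """Pick a scratch directory following the precedence given in the text.

    Hypothetical helper: `rc_path` stands for the path a user would pass to
    psi4_io.set_default_path() in ~/.psi4rc.
    """
    environ = os.environ if environ is None else environ
    env_path = environ.get('PSI_SCRATCH')
    if env_path:          # PSI_SCRATCH overrides ~/.psi4rc settings
        chosen = env_path
    elif rc_path:         # path configured in ~/.psi4rc
        chosen = rc_path
    else:                 # documented default location
        chosen = '/tmp'
    # Normalize to exactly one trailing "/", as the text recommends
    return chosen.rstrip('/') + '/'


print(resolve_scratch('/scratch/user', environ={}))
# prints: /scratch/user/
print(resolve_scratch('/scratch/user', environ={'PSI_SCRATCH': '/local/scr'}))
# prints: /local/scr/
```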

The ~/.psi4rc file can also be used to define constants that are accessible in input files or to place any Python statements that should be executed with every PSI4 instance.

Threading

Most new modules in PSI4 are designed to run efficiently on SMP architectures by applying several thread models. The de facto standard for PSI4 is to use threaded BLAS/LAPACK (particularly Intel’s excellent MKL package) for most tensor-like operations, OpenMP for more general operations, and Boost Threads for some special-case operations. Note: Using OpenMP alone is a really bad idea. The developers make little to no effort to explicitly parallelize operations that are already easily threaded by MKL or another threaded BLAS. Less than 20% of the threaded code in PSI4 uses OpenMP; the rest is handled by parallel DGEMM and other library routines. From this point forward, it is assumed that you have compiled PSI4 with OpenMP and MKL. (Note that it is possible to use g++ or another compiler and still link against MKL.)

Control of threading in PSI4 can be accomplished at a variety of levels, ranging from global environment variables, to direct control of the thread count in the input file, and even to directives specific to each model. This hierarchy is explained below. Note that each deeper level trumps all previous levels.

(1) OpenMP/MKL Environment Variables

The easiest/least visible way to thread PSI4 is to set the standard OpenMP/MKL environment variables OMP_NUM_THREADS and MKL_NUM_THREADS. For instance, in tcsh:

setenv OMP_NUM_THREADS 4
setenv MKL_NUM_THREADS 4

PSI4 then detects these values via the API routines in <omp.h> and <mkl.h>, and runs all applicable code with 4 threads. These environment variables are typically defined in a .tcshrc or .bashrc.
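When PSI4 is launched from a Python driver script rather than an interactive shell, the same two variables can be set on a copy of the environment. The sketch below assumes nothing about PSI4 beyond the variable names given above; the helper name threaded_env is hypothetical.

```python
import os


def threaded_env(nthreads, base=None):
    """Return a copy of the environment with both thread variables set.

    Hypothetical helper for illustration; the original mapping is not modified.
    """
    if nthreads < 1:
        raise ValueError('nthreads must be a positive integer')
    env = dict(os.environ if base is None else base)
    env['OMP_NUM_THREADS'] = str(nthreads)
    env['MKL_NUM_THREADS'] = str(nthreads)
    return env


env = threaded_env(4)
print(env['OMP_NUM_THREADS'], env['MKL_NUM_THREADS'])  # prints: 4 4
```

The returned dictionary can be passed as the env argument of subprocess.run() when starting the psi4 executable, which leaves the parent shell's settings untouched.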

(2) The -n Command Line Flag

To change the number of threads at runtime, the psi4 -n flag may be used. An example is:

psi4 -i input.dat -o output.dat -n 4

which will run on four threads.
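For scripted runs, the same command line can be assembled programmatically. This is a minimal sketch that uses only the -i, -o, and -n flags documented on this page; the function name psi4_command is a hypothetical helper, and the block only builds the argument list without launching anything.

```python
def psi4_command(input_file='input.dat', output_file='output.dat', nthreads=None):
    """Build an argument list for the psi4 executable (hypothetical helper)."""
    cmd = ['psi4', '-i', input_file, '-o', output_file]
    if nthreads is not None:
        if nthreads < 1:
            raise ValueError('nthreads must be a positive integer')
        cmd += ['-n', str(nthreads)]  # thread count requested at launch
    return cmd


print(' '.join(psi4_command(nthreads=4)))
# prints: psi4 -i input.dat -o output.dat -n 4
```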

(3) Setting Thread Numbers in an Input

For more explicit control, the Process::environment class in PSI4 can override the number of threads set by environment variables. This functionality is accessed via the set_num_threads() Psithon function, which controls both MKL and OpenMP thread numbers. The number of threads may be changed multiple times in a PSI4 input file. An example input for this feature is:

# A bit small-ish, but you get the idea
molecule h2o {
0 1
O
H 1 1.0
H 1 1.0 2 90.0
}

set scf {
basis cc-pvdz
scf_type df
}

# Run from 1 to 4 threads, for instance, to record timings
for nthread in range(1,5):
    set_num_threads(nthread)
    energy('scf')
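The loop above runs the computation at each thread count but does not itself record timings. One way to collect them is a small generic harness, sketched here in plain Python. The function name time_over_threads is hypothetical, and the workload and thread setter below are stand-ins so that the block runs outside PSI4; inside an input file one would presumably pass set_num_threads and a callable wrapping energy('scf') instead.

```python
import time


def time_over_threads(run, set_threads, counts):
    """Call run() once per thread count and return {count: elapsed seconds}."""
    timings = {}
    for n in counts:
        set_threads(n)                 # e.g. set_num_threads in a PSI4 input
        start = time.perf_counter()
        run()                          # e.g. lambda: energy('scf')
        timings[n] = time.perf_counter() - start
    return timings


# Stand-ins: record the requested counts and time a trivial workload
requested = []
timings = time_over_threads(run=lambda: sum(range(100000)),
                            set_threads=requested.append,
                            counts=range(1, 5))
for n, seconds in timings.items():
    print(f'{n} thread(s): {seconds:.6f} s')
```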

(4) Method-Specific Control

Even more control is possible in certain circumstances. For instance, the threaded generation of AO density-fitted integrals involves a memory requirement proportional to the number of threads. This requirement may exceed the total memory of a small-memory node if all threads are involved in the generation of these integrals. For general DF algorithms, the user may specify:

set MODULE_NAME df_ints_num_threads n

to explicitly control the number of threads used for integral formation. Setting this variable to 0 (the default) uses the number of threads specified by the set_num_threads() Psithon method or the default environment variables.
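The four levels described in this section, where each deeper level overrides the earlier ones and a df_ints_num_threads value of 0 means "inherit", can be summarized in a short sketch. This is an illustration of the documented precedence only, not PSI4's internal logic; the function name effective_threads is hypothetical, and the fallback of one thread when nothing is set is an assumption.

```python
def effective_threads(env=None, cli=None, in_input=None, df_ints=0):
    """Resolve thread counts following the four-level hierarchy in the text.

    env      -- value of OMP_NUM_THREADS/MKL_NUM_THREADS, if set   (level 1)
    cli      -- value given with the -n flag, if any               (level 2)
    in_input -- value passed to set_num_threads(), if any          (level 3)
    df_ints  -- df_ints_num_threads option; 0 means inherit        (level 4)

    Returns (general_threads, df_integral_threads).
    """
    general = 1  # assumption: single thread when no level specifies a count
    for level in (env, cli, in_input):  # later (deeper) levels win
        if level is not None:
            general = level
    df = df_ints if df_ints > 0 else general
    return general, df


print(effective_threads(env=8, cli=4))                          # prints: (4, 4)
print(effective_threads(env=8, cli=4, in_input=2, df_ints=1))   # prints: (2, 1)
```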

Command Line Options

PSI4 can be invoked with no command line arguments, because by default it reads input from the file “input.dat” and writes output to “output.dat”. The first three commands below are completely equivalent, while the fourth is, perhaps, the most common usage.

psi4
psi4 -i input.dat -o output.dat
psi4 input.dat output.dat

psi4 descriptive_filename.in descriptive_filename.out
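The equivalence of the first three forms can be illustrated with a small argparse sketch that applies the documented defaults. This is not PSI4's actual argument parser; the function name resolve_io is hypothetical, and the handling of mixed flag and positional forms is an assumption.

```python
import argparse


def resolve_io(argv):
    """Return (input_file, output_file) for a psi4-style argument list."""
    parser = argparse.ArgumentParser(prog='psi4', add_help=False)
    parser.add_argument('-i', '--input')
    parser.add_argument('-o', '--output')
    parser.add_argument('positional', nargs='*')
    args = parser.parse_args(argv)
    names = list(args.positional)
    # Flags take priority, then positional names, then the documented defaults
    infile = args.input or (names.pop(0) if names else 'input.dat')
    outfile = args.output or (names.pop(0) if names else 'output.dat')
    return infile, outfile


print(resolve_io([]))
print(resolve_io(['-i', 'input.dat', '-o', 'output.dat']))
print(resolve_io(['input.dat', 'output.dat']))
# each call prints: ('input.dat', 'output.dat')
```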

Command-line arguments to PSI4 can be accessed through psi4 --help.

-a, --append

Append results to output file. Default: Truncate first

-h, --help

Display the command-line options and usage information.

-i <filename>, --input <filename>

Input file name. Default: input.dat

-o <filename>, --output <filename>

Output file name. Use stdout as <filename> to redirect to the screen. Default: output.dat

-m, --messy

Leave temporary files after the run is completed.

-n <threads>, --nthread <threads>

Number of threads to use (overrides OMP_NUM_THREADS).

--new-plugin <name>

Creates a new directory <name> with files for writing a new plugin. An additional argument specifies a template to use, for example: --new-plugin name +mointegrals. See Sec. Plugins: Adding New Functionality to PSI4 for available templates.

-p <prefix>, --prefix <prefix>

Prefix for psi files. Default: psi

-v, --verbose

Print a lot of information

-d, --debug

Flush the outfile at every fprintf. Default: true iff --with-debug

-V, --version

Print version information.

-w, --wipe

Clean out scratch area.

Environment Variables

These environment variables will influence PSI4’s behavior.

MKL_NUM_THREADS

Number of threads used by operations that call Intel’s threaded BLAS libraries.

OMP_NESTED

Enables nested DGEMM calls inside OpenMP sections in DFMP2 on multi-socket platforms. This is very low-level access to OpenMP functions intended for experienced programmers. Users should leave this variable unset or set to False.

OMP_NUM_THREADS

Number of threads used by modules with OpenMP threading.

PATH

Path searched for executables. To run Kállay’s MRCC program (see MRCC), the dmrcc executable must be in PATH.

PSI_SCRATCH

Directory where scratch files are written. Overrides settings in ~/.psi4rc.

PYTHONPATH

Path in which the Python interpreter looks for modules to import. For PSI4, these are generally plugins (see Plugins: Adding New Functionality to PSI4).
