Welcome to PyGauss!¶
Author | Chris Sewell |
Project Page | https://github.com/chrisjsewell/PyGauss |
Conda Distro | https://conda.binstar.org/cjs14 |
PyPi Distro | https://pypi.python.org/pypi/pygauss |
Communication | pygauss@googlegroups.com |
PyGauss is intended as an interactive tool for supporting the lifecycle of a computational molecular chemistry investigation, from visual and analytical exploration through to documentation and publication.
Initially, PyGauss has been designed for examining one or more Gaussian quantum chemical computations, both geometrically and electronically. However, it is built on top of the cclib/chemview/chemlab suite of packages and the Python scientific stack, and so should be extensible to other types of computational chemical analysis. PyGauss is primarily designed to be used interactively in the IPython Notebook.
As shown in the examples, a molecular optimisation can be assessed individually (much like in GaussView), but also as part of a group. The advantages of this package are then:
- Faster, more efficient analysis
- Extensible analysis
- Reproducible analysis
Contents¶
Quick Start¶
OSX and Linux¶
The recommended way to use pygauss is to download the Anaconda Scientific Python Distribution (64-bit). Once it is installed, a new environment can be created in the terminal and pygauss installed in one line:
conda create -n pg_env -c https://conda.binstar.org/cjs14 pygauss
Windows¶
There is currently no Conda distributable for Windows of pygauss or of chemlab, which has C-extensions that need to be built using a compiler. Therefore chemlab needs to be cloned from GitHub, its extensions built and the dependencies installed, before finally installing pygauss.
conda create -n pg_env python=2.7
conda install -n pg_env -c https://conda.binstar.org/cjs14 cclib
conda install -n pg_env -c https://conda.binstar.org/cjs14 chemview
conda install -n pg_env -c https://conda.binstar.org/cjs14 pyopengl
git clone --recursive https://github.com/chemlab/chemlab.git
cd chemlab
python setup.py build_ext --inplace
conda install -n pg_env <pil, pandas, matplotlib, scikit-learn, ...>
activate pg_env
pip install . # or add to PYTHONPATH
pip install pygauss
Troubleshooting¶
If you encounter difficulties it may be useful to look in working_conda_environments at conda environments known to work.
Testing¶
Pygauss utilises a unit test suite (nose/nose-parameterized) to ensure that computations run and are correct. Continuous integration testing of the source code is provided by Travis CI, and passing the tests is an automated condition of the Conda build. These unit tests can also be run manually from the command line:
nosetests -v --with-doctest
or directly in Python:
pygauss.run_nose(verbose=True)
Example Assessment¶
After installing PyGauss you should be able to open this IPython Notebook from https://github.com/chrisjsewell/PyGauss/blob/master/Example_Assessment.ipynb and run the following...
from IPython.display import display, Image
%matplotlib inline
import pygauss as pg
print 'pygauss version: {}'.format(pg.__version__)
pygauss version: 0.5.0
The test folder has a number of example Gaussian outputs to play around with.
folder = pg.get_test_folder()
len(folder.list_files())
37
Note: the folder object behaves identically whether using a local path or one on a server over ssh (using paramiko):
folder = pg.Folder('/path/to/folder',
ssh_server='login.server.com',
ssh_username='username')
Single Molecule Analysis¶
A molecule can be created containing data about the initial geometry, the optimisation process and the analysis of the final configuration.
mol = pg.molecule.Molecule(folder_obj=folder,
init_fname='CJS1_emim-cl_B_init.com',
opt_fname=['CJS1_emim-cl_B_6-311+g-d-p-_gd3bj_opt-modredundant_difrz.log',
'CJS1_emim-cl_B_6-311+g-d-p-_gd3bj_opt-modredundant_difrz_err.log',
'CJS1_emim-cl_B_6-311+g-d-p-_gd3bj_opt-modredundant_unfrz.log'],
freq_fname='CJS1_emim-cl_B_6-311+g-d-p-_gd3bj_freq_unfrz.log',
nbo_fname='CJS1_emim-cl_B_6-311+g-d-p-_gd3bj_pop-nbo-full-_unfrz.log',
atom_groups={'emim':range(20), 'cl':[20]},
alignto=[3,2,1])
Geometric Analysis¶
Molecules can be viewed statically or interactively.
#mol.show_initial(active=True)
vdw = mol.show_initial(represent='vdw', rotations=[[0,0,90], [-90, 90, 0]])
ball_stick = mol.show_optimisation(represent='ball_stick', rotations=[[0,0,90], [-90, 90, 0]])
display(vdw, ball_stick)


print 'Cl optimised polar coords from aromatic ring : ({0}, {1},{2})'.format(
*[round(i, 2) for i in mol.calc_polar_coords_from_plane(20,3,2,1)])
ax = mol.plot_opt_trajectory(20, [3,2,1])
ax.set_title('Cl optimisation path')
ax.get_figure().set_size_inches(4, 3)
Cl optimised polar coords from aromatic ring : (0.11, -116.42,-170.06)

Energetics and Frequency Analysis¶
print('Optimised? {0}, Conformer? {1}, Energy = {2} a.u.'.format(
mol.is_optimised(), mol.is_conformer(),
round(mol.get_opt_energy(units='hartree'),3)))
ax = mol.plot_opt_energy(units='hartree')
ax.get_figure().set_size_inches(3, 2)
ax = mol.plot_freq_analysis()
ax.get_figure().set_size_inches(4, 2)
Optimised? True, Conformer? True, Energy = -805.105 a.u.


Potential Energy Scan analysis of geometric conformers...
mol2 = pg.molecule.Molecule(folder_obj=folder, alignto=[3,2,1],
pes_fname=['CJS_emim_6311_plus_d3_scan.log',
'CJS_emim_6311_plus_d3_scan_bck.log'])
ax = mol2.plot_pes_scans([1,4,9,10], rotation=[0,0,90], img_pos='local_maxs', zoom=0.5)
ax.set_title('Ethyl chain rotational conformer analysis')
ax.get_figure().set_size_inches(7, 3)

Partial Charge Analysis¶
Using Natural Bond Orbital (NBO) analysis.
print '+ve charge centre polar coords from aromatic ring: ({0} {1},{2})'.format(
*[round(i, 2) for i in mol.calc_nbo_charge_center(3, 2, 1)])
display(mol.show_nbo_charges(represent='ball_stick', axis_length=0.4,
rotations=[[0,0,90], [-90, 90, 0]]))
+ve charge centre polar coords from aromatic ring: (0.02 -51.77,-33.15)

Density of States Analysis¶
print 'Number of Orbitals: {}'.format(mol.get_orbital_count())
homo, lumo = mol.get_orbital_homo_lumo()
homoe, lumoe = mol.get_orbital_energies([homo, lumo])
print 'HOMO at {} eV'.format(homoe)
print 'LUMO at {} eV'.format(lumoe)
Number of Orbitals: 272
HOMO at -4.91492036773 eV
LUMO at -1.85989816817 eV
ax = mol.plot_dos(per_energy=1,
atom_groups=['cl', 'emim'],
group_colors=['blue', 'orange'],
group_labels=['Cl', 'EMIM'], group_fill=False,
lbound=-20, ubound=10, legend_size=12)

Bonding Analysis¶
Using Second Order Perturbation Theory.
print 'H inter-bond energy = {} kJmol-1'.format(
mol.calc_hbond_energy(eunits='kJmol-1', atom_groups=['emim', 'cl']))
print 'Other inter-bond energy = {} kJmol-1'.format(
mol.calc_sopt_energy(eunits='kJmol-1', no_hbonds=True, atom_groups=['emim', 'cl']))
display(mol.show_sopt_bonds(min_energy=1, eunits='kJmol-1',
atom_groups=['emim', 'cl'],
no_hbonds=True,
rotations=[[0, 0, 90]]))
display(mol.show_hbond_analysis(cutoff_energy=5.,alpha=0.6,
atom_groups=['emim', 'cl'],
rotations=[[0, 0, 90], [90, 0, 0]]))
H inter-bond energy = 111.7128 kJmol-1
Other inter-bond energy = 11.00392 kJmol-1


Multiple Computations Analysis¶
Multiple computations, for instance of different starting conformations, can be grouped into an Analysis class and analysed collectively.
analysis = pg.Analysis(folder_obj=folder)
errors = analysis.add_runs(headers=['Cation', 'Anion', 'Initial'],
values=[['emim'], ['cl'],
['B', 'BE', 'BM', 'F', 'FE']],
init_pattern='*{0}-{1}_{2}_init.com',
opt_pattern='*{0}-{1}_{2}_6-311+g-d-p-_gd3bj_opt*unfrz.log',
freq_pattern='*{0}-{1}_{2}_6-311+g-d-p-_gd3bj_freq*.log',
nbo_pattern='*{0}-{1}_{2}_6-311+g-d-p-_gd3bj_pop-nbo-full-*.log',
alignto=[3,2,1], atom_groups={'emim':range(1,20), 'cl':[20]},
ipython_print=True)
Reading data 5 of 5
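The headers, values and pattern arguments above suggest that each pattern is filled in once per combination of values and then matched as a wildcard. A minimal sketch of that expansion follows. It assumes (the source does not confirm this) that the combinations are the Cartesian product of the value lists and that the placeholders are filled with str.format; expand_patterns is a hypothetical helper, not part of pygauss.

```python
from itertools import product


def expand_patterns(values, pattern):
    """Fill `pattern` once per combination of the header values.

    Hypothetical helper: pygauss may build its file patterns differently.
    """
    return [pattern.format(*combo) for combo in product(*values)]


values = [['emim'], ['cl'], ['B', 'BE', 'BM', 'F', 'FE']]
init_patterns = expand_patterns(values, '*{0}-{1}_{2}_init.com')
print(init_patterns[0])  # *emim-cl_B_init.com
```

With one cation, one anion and five initial conformations this yields five patterns, in line with the five runs read above.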
Molecular Comparison¶
fig, caption = analysis.plot_mol_images(mtype='optimised', max_cols=3,
info_columns=['Cation', 'Anion', 'Initial'],
rotations=[[0,0,90]])
print caption
Figure: (A) emim, cl, B, (B) emim, cl, BE, (C) emim, cl, BM, (D) emim, cl, F, (E) emim, cl, FE

Data Comparison¶
fig, caption = analysis.plot_mol_graphs(gtype='dos', max_cols=3,
info_columns=['Cation', 'Anion', 'Initial'],
atom_groups=['cl'], group_colors=['blue'],
group_labels=['Cl'], group_fill=True,
lbound=-20, ubound=10, legend_size=8)
print caption
Figure: (A) emim, cl, B, (B) emim, cl, BE, (C) emim, cl, BM, (D) emim, cl, F, (E) emim, cl, FE

The methods mentioned for individual molecules can be applied to all or a subset of these computations.
analysis.add_mol_property_subset('Opt', 'is_optimised', rows=[2,3])
analysis.add_mol_property('Energy (au)', 'get_opt_energy', units='hartree')
analysis.add_mol_property('Cation chain, $\\psi$', 'calc_dihedral_angle', [1, 4, 9, 10])
analysis.add_mol_property('Cation Charge', 'calc_nbo_charge', 'emim')
analysis.add_mol_property('Anion Charge', 'calc_nbo_charge', 'cl')
analysis.add_mol_property(['Anion-Cation, $r$', 'Anion-Cation, $\\theta$', 'Anion-Cation, $\\phi$'],
'calc_polar_coords_from_plane', 3, 2, 1, 20)
analysis.add_mol_property('Anion-Cation h-bond', 'calc_hbond_energy',
eunits='kJmol-1', atom_groups=['emim', 'cl'])
analysis.get_table(row_index=['Anion', 'Cation', 'Initial'],
column_index=['Cation', 'Anion', 'Anion-Cation'])
Anion | Cation | Initial | Opt | Energy (au) | Cation chain, ψ | Cation Charge | Anion Charge | Anion-Cation, r | Anion-Cation, θ | Anion-Cation, φ | Anion-Cation h-bond |
---|---|---|---|---|---|---|---|---|---|---|---|
cl | emim | B | NaN | -805.105 | 80.794 | 0.888 | -0.888 | 0.420 | -123.392 | 172.515 | 111.713 |
cl | emim | BE | NaN | -805.105 | 80.622 | 0.887 | -0.887 | 0.420 | -123.449 | 172.806 | 112.382 |
cl | emim | BM | True | -805.104 | 73.103 | 0.874 | -0.874 | 0.420 | 124.121 | -166.774 | 130.624 |
cl | emim | F | True | -805.118 | 147.026 | 0.840 | -0.840 | 0.420 | 10.393 | 0.728 | 202.004 |
cl | emim | FE | NaN | -805.117 | 85.310 | 0.851 | -0.851 | 0.417 | -13.254 | -4.873 | 177.360 |
There is also an option (requiring pdflatex and ghostscript+imagemagick) to output the tables as a LaTeX formatted image.
analysis.get_table(row_index=['Anion', 'Cation', 'Initial'],
column_index=['Cation', 'Anion', 'Anion-Cation'],
as_image=True, font_size=12)

Multi-Variate Analysis¶
RadViz is a way of visualizing multi-variate data.
ax = analysis.plot_radviz_comparison('Anion', columns=range(4, 10))
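For readers unfamiliar with RadViz, the standard construction places one anchor per variable on the unit circle and positions each sample at the average of the anchors, weighted by its min-max scaled values. The sketch below illustrates that construction only; whether plot_radviz_comparison follows exactly this scheme is an assumption, and radviz_point is a hypothetical name.

```python
import math


def radviz_point(row, minima, maxima):
    """Project one sample onto the RadViz plane.

    row, minima, maxima: equal-length sequences holding the sample values and
    the per-variable minimum and maximum used for min-max scaling.
    """
    m = len(row)
    weights = [float(v - lo) / (hi - lo) for v, lo, hi in zip(row, minima, maxima)]
    total = sum(weights)
    if total == 0:
        return (0.0, 0.0)  # every variable at its minimum: centre of the circle
    x = sum(w * math.cos(2 * math.pi * j / m) for j, w in enumerate(weights)) / total
    y = sum(w * math.sin(2 * math.pi * j / m) for j, w in enumerate(weights)) / total
    return (x, y)
```

A sample dominated by one variable is pulled towards that variable's anchor, while a sample with equal scaled values sits at the centre.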

The KMeans algorithm clusters data by trying to separate samples into n groups of equal variance.
pg.utils.imgplot_kmean_groups(
analysis, 'Anion', 'cl', 4, range(4, 10),
output=['Initial'], mtype='optimised',
rotations=[[0, 0, 90], [-90, 90, 0]],
axis_length=0.3)

Figure: (A) BM

Figure: (A) FE

Figure: (A) B, (B) BE

Figure: (A) F
Documentation (MS Word)¶
After analysing the computations, it would be reasonable to want to document some of our findings. This can be achieved by outputting individual figure or table images via the folder object.
file_path = folder.save_ipyimg(vdw, 'image_of_molecule')
Image(file_path)

But you may also want to produce a fuller record of your analysis, and this is where python-docx steps in. Building on this package, the pygauss MSDocument class can produce a full document of your analysis.
import matplotlib.pyplot as plt
d = pg.MSDocument()
d.add_heading('A Pygauss Example Assessment', level=0)
d.add_docstring("""
# Introduction
We have looked at the following aspects
of [EMIM]^{+}[Cl]^{-} (C_{6}H_{11}ClN_{2});
- Geometric conformers
- Electronic structure
# Geometric Conformers
""")
fig, caption = analysis.plot_mol_images(max_cols=2,
rotations=[[90,0,0], [0,0,90]],
info_columns=['Anion', 'Cation', 'Initial'])
d.add_mpl(fig, dpi=96, height=9, caption=caption)
plt.close()
d.add_paragraph()
df = analysis.get_table(
columns=['Anion Charge', 'Cation Charge'],
row_index=['Anion', 'Cation', 'Initial'])
d.add_dataframe(df, incl_indx=True, style='Medium Shading 1 Accent 1',
caption='Analysis of Conformer Charge')
d.add_docstring("""
# Molecular Orbital Analysis
## Density of States
It is **important** to *emphasise* that the
computations have only been run in the gas phase.
""")
fig, caption = analysis.plot_mol_graphs(gtype='dos', max_cols=3,
info_columns=['Cation', 'Anion', 'Initial'],
atom_groups=['cl'], group_colors=['blue'],
group_labels=['Cl'], group_fill=True,
lbound=-20, ubound=10, legend_size=8)
d.add_mpl(fig, dpi=96, height=9, caption=caption)
plt.close()
d.save('exmpl_assess.docx')
Which gives us the following:


MORE TO COME!!
Project Status¶
Distribution¶
Conda | |
PyPi | |
Documents |
Development¶
GitHub | |
Unit Testing | |
Testing Coverage | |
Documents |
What's New¶
v0.5.1 - MSDocument Improvements¶
Improved the pygauss.docs.MSDocument class:
- Improved pygauss.docs.MSDocument.add_markdown() to handle more markdown syntax
- Added pygauss.docs.MSDocument.add_docstring() to add multiple paragraphs (with markdown)
- Improved pygauss.docs.MSDocument.add_dataframe() and pygauss.docs.MSDocument.add_mpl(): there is now a caption option with built-in figure/table number count
v0.5.0 - DoS Analysis & Multiprocessed I/O¶
- Density of States (DoS) and Partial Density of States (PDoS) analysis capabilities added.
- Addition of pygauss.analysis.Analysis.plot_mol_graphs() for data comparison
- pygauss.analysis.Analysis.add_runs() method now utilises the multiprocessing module for faster data input*
- Improvement of API documentation
Back-compatibility note: replaced plot_optimisation_E and get_optimisation_E with pygauss.molecule.Molecule.get_opt_energy() and pygauss.molecule.Molecule.plot_opt_energy(), to adhere to standard naming conventions.
*multiprocessing not currently available in Windows or over SSH
v0.4.4 - Output to MS Word Documents¶
- added the pygauss.docs module for outputting analysis to documents
- initially utilising the docx.document.Document within pygauss.docs.MSDocument
v0.4.3 - Continuous Integration Testing¶
Addition of continuous integration testing using [Travis](https://travis-ci.org/) and testing coverage analysis using [Coveralls](https://coveralls.io/).
v0.4.2 - Addition of Documentation¶
addition of Sphinx documentation
v0.4.0 - Major Update¶
update includes:
- refactoring of data io
- improvement of second order perturbation theory analysis
- image output to table
- addition of unit test suite
- improvement of method documentation
breaks some back compatibility
v0.3.0 - File Input Over SSH¶
main update is the ability to set up an ssh connection to a server, using the paramiko library, and parse analysis files over it. Also added is the ability to use wildcards (*) in input file names.
some minor back compatibility breaks
v0.2.2 - Table Image Improvements¶
Improvements to Table to Image functionality on OSX
- added some fixes
- re-organised test modules
v0.2.1 - Latex Table Images¶
addition of functionality to output analysis tables as latex images for input into projects!
v0.2 - Initial working distribution¶
Working distribution of pygauss to be converted to first conda package
v0.1 - First Version¶
the first version
What's To Come¶
Natural Bonding Orbital Visualisation¶
Visualise natural bonding orbitals, in particular HOMO and LUMO
Molecular Image Labelling¶
with atom numbers, atom types, bond lengths, etc...
User API¶
pygauss.file_io module¶
Created on Mon May 18 21:01:25 2015
@author: chris sewell
-
class
pygauss.file_io.
Folder
(path, server=None, username=None, passwrd=None)[source]¶ Bases:
object
an object intended to act as an entry point to a folder path
Parameters: -
list_files
(pattern=None, one_file=False)[source]¶ list files in folder
Parameters: pattern (str) – a pattern the file must match that can include * wildcards
-
save_ipyimg
(img, img_name)[source]¶ a function for outputting an IPython Image to a file
Parameters: - img (IPython.display.Image) – an IPython image
- img_name (str) – the desired name of the file
-
save_mplfig
(fig, fig_name, dpi=256, format='png')[source]¶ a function for outputting a matplotlib figure to a file
Parameters: - fig (matplotlib.figure.Figure) – a Matplotlib figure
- fig_name (str) – the desired name of the file
-
pygauss.molecule module¶
Created on Fri May 01 21:24:31 2015
@author: chris
-
class
pygauss.molecule.
Molecule
(folderpath='', init_fname=False, opt_fname=False, freq_fname=False, nbo_fname=False, pes_fname=False, fail_silently=False, atom_groups={}, alignto=[], server=None, username=None, passwrd=None, folder_obj=None)[source]¶ Bases:
object
a class to analyse gaussian input/output of a single molecular geometry
Parameters: - folderpath (str) – the folder path
- init_fname (str) – the initial geometry (.com) file
- opt_fname (str or list of str) – the optimisation log file
- freq_fname (str) – the frequency analysis log file
- nbo_fname (str) – the population analysis logfile
- pes_fname (str) – the potential energy scan logfile
- fail_silently (bool) – whether to suppress, rather than raise, an error if a file read fails (if True, get_init_read_errors can be used to see the errors)
- atom_groups ({str:[int, ...]}) – groups of atoms that can be selected as a subset
- alignto ([int, int, int]) – the atom numbers to align the geometry to
Notes
any of the file names can have wildcards (e.g. ‘filename*.log’) in them, as long as this resolves to a single path in the directory
NB: nbo population analysis must be run with the GFInput flag to ensure data is output to the log file
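The wildcard rule in the notes above (a pattern must resolve to exactly one file) can be sketched with the standard library fnmatch module. This only illustrates the rule; it is an assumption that pygauss matches file names in a comparable way, and resolve_single is a hypothetical helper.

```python
import fnmatch


def resolve_single(file_names, pattern):
    """Return the one name matching `pattern`, or raise if it is not unique."""
    matches = fnmatch.filter(file_names, pattern)
    if len(matches) != 1:
        raise ValueError('{0!r} matched {1} files, expected exactly 1'.format(
            pattern, len(matches)))
    return matches[0]


names = ['CJS1_emim-cl_B_init.com', 'CJS1_emim-cl_BE_init.com']
print(resolve_single(names, '*emim-cl_B_init.com'))  # CJS1_emim-cl_B_init.com
```

A looser pattern such as '*_init.com' matches both example names and so raises an error in this sketch.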
-
calc_bond_angle
(indxs, optimisation=True, mol=None)[source]¶ Returns the angle in degrees between three points
-
calc_dihedral_angle
(indxs, optimisation=True, mol=None)[source]¶ Returns the angle in degrees between four points
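The two angle calculations above can be sketched from plain Cartesian coordinates. This is an illustration of the geometry only: the functions take points rather than atom indexes, and the sign convention of the dihedral is an assumption, since the source does not state which one pygauss uses.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def bond_angle(p1, p2, p3):
    """Angle in degrees at p2 between the directions to p1 and p3."""
    u, v = _sub(p1, p2), _sub(p3, p2)
    cos_angle = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    # clamp to [-1, 1] to guard against rounding just outside the acos domain
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def dihedral_angle(p1, p2, p3, p4):
    """Signed angle in degrees between planes (p1, p2, p3) and (p2, p3, p4)."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    b2_len = math.sqrt(_dot(b2, b2))
    # the atan2 form keeps the sign and stays stable near 0 and 180 degrees
    return math.degrees(math.atan2(b2_len * _dot(b1, n2), _dot(n1, n2)))
```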
-
calc_min_dist
(idx_list1, idx_list2, optimisation=True, units='nm', ignore_missing=True)[source]¶ indexes start at 1
-
calc_nbo_charge_center
(p1, p2, p3, positive=True, units='nm', atoms=[])[source]¶ returns the distance r and angles theta, phi of the positive/negative charge center to the circumcenter of the plane formed by [p1, p2, p3]
The plane formed will have:
- x-axis along p1, y-axis anticlockwise towards p2, z-axis normal to the plane
theta (azimuth) is the in-plane angle from the x-axis towards the y-axis.
phi (inclination) is the out-of-plane angle from the x-axis towards the z-axis.
-
calc_opt_trajectory
(atom, plane=[])[source]¶ calculate the trajectory of an atom as it is optimised, relative to a plane of three atoms
-
calc_polar_coords_from_plane
(p1, p2, p3, c, optimisation=True, units='nm')[source]¶ returns the distance r and angles theta, phi of atom c to the circumcenter of the plane formed by [p1, p2, p3]
The plane formed will have:
- x-axis along p1, y-axis anticlockwise towards p2, z-axis normal to the plane
theta (azimuth) is the in-plane angle from the x-axis towards the y-axis.
phi (inclination) is the out-of-plane angle from the x-axis towards the z-axis.
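One way to read the coordinate system described above is sketched below, working on Cartesian points rather than atom indexes. It is an interpretation, not the library's code: in particular, taking phi as atan2 of the z and x components is an assumption drawn from the wording "from the x-axis towards the z-axis", and all function names are hypothetical.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def _unit(a):
    length = math.sqrt(_dot(a, a))
    return [x / length for x in a]


def circumcenter(p1, p2, p3):
    """Centre of the circle through three non-collinear 3D points."""
    a, b = _sub(p1, p3), _sub(p2, p3)
    n = _cross(a, b)
    t = _sub([_dot(a, a) * x for x in b], [_dot(b, b) * x for x in a])
    offset = _cross(t, n)
    scale = 2.0 * _dot(n, n)
    return [p3[i] + offset[i] / scale for i in range(3)]


def polar_coords_from_plane(p1, p2, p3, c):
    """Return (r, theta, phi) of point c, with both angles in degrees."""
    origin = circumcenter(p1, p2, p3)
    x_axis = _unit(_sub(p1, origin))
    z_axis = _unit(_cross(x_axis, _sub(p2, origin)))
    y_axis = _cross(z_axis, x_axis)  # anticlockwise from x, on the p2 side
    v = _sub(c, origin)
    vx, vy, vz = _dot(v, x_axis), _dot(v, y_axis), _dot(v, z_axis)
    r = math.sqrt(_dot(v, v))
    theta = math.degrees(math.atan2(vy, vx))  # in-plane, x towards y
    phi = math.degrees(math.atan2(vz, vx))    # out-of-plane, x towards z (assumed)
    return r, theta, phi
```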
-
calc_sopt_energy
(atom_groups=[], eunits='kJmol-1', no_hbonds=False)[source]¶ calculate total energy of interactions between “filled” (donor) Lewis-type Natural Bonding Orbitals (NBOs) and “empty” (acceptor) non-Lewis NBOs, using Second Order Perturbation Theory
Parameters: Returns: analysis – a table of interactions
Return type:
-
combine_molecules
(other_mol, self_atoms=False, other_atoms=False, self_rotation=[0, 0, 0], other_rotation=[0, 0, 0], self_transpose=[0, 0, 0], other_transpose=[0, 0, 0], self_opt=True, other_opt=True, charge=None, multiplicity=None, out_name=False, descript='', overwrite=False, active=False, represent='ball_stick', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, ipyimg=True, folder_obj=None)[source]¶ transpose in nanometers
-
get_freq_analysis
()[source]¶ return frequency analysis
Returns: data – frequency data Return type: pandas.DataFrame
-
get_hbond_analysis
(min_energy=0.0, atom_groups=[], eunits='kJmol-1')[source]¶ EXPERIMENTAL! hydrogen bond analysis (DH---A), using Second Order Bond Perturbation Theory
Parameters: Returns: analysis – a table of interactions
Return type: Notes
uses a strict definition of a hydrogen bond as: interactions between “filled” (donor) Lewis-type Lone Pair (LP) NBOs and “empty” (acceptor) non-Lewis Bonding (BD) NBOs
-
get_init_read_errors
()[source]¶ get read errors, recorded if fail_silently was set to True on initialise
-
get_opt_energy
(units='eV', final=True)[source]¶ return the SCF optimisation energy
Parameters: Returns: energy – dependent on final
Return type: float or list of floats
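The unit names that appear in this documentation ('hartree', 'eV' and 'kJmol-1') are related by fixed conversion factors. A small sketch of such a conversion is given below, using rounded CODATA 2018 values. Which unit names get_opt_energy actually accepts beyond those shown in the examples is not stated in the source, and convert_energy is a hypothetical helper.

```python
# Value of one hartree in each unit (CODATA 2018, rounded)
HARTREE_TO = {
    'hartree': 1.0,
    'eV': 27.211386,
    'kJmol-1': 2625.4996,
}


def convert_energy(value, from_units, to_units):
    """Convert an energy between the unit names used in this documentation."""
    return value / HARTREE_TO[from_units] * HARTREE_TO[to_units]


print(round(convert_energy(1.0, 'hartree', 'eV'), 3))  # 27.211
```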
-
get_orbital_energies
(orbitals, eunits='eV')[source]¶ the orbital energies for listed orbitals
Parameters: - orbitals (int or iterable of ints) – the orbital(s) to return energies for (starting at 1)
- eunits (str) – the units of energy
Returns: moenergies – energy for each orbital
Return type:
-
get_sopt_analysis
(eunits='kJmol-1', atom_groups=[], charge_info=False)[source]¶ interactions between “filled” (donor) Lewis-type Natural Bonding Orbitals (NBOs) and “empty” (acceptor) non-Lewis NBOs, using Second Order Perturbation Theory (SOPT)
Parameters: Returns: analysis – a table of interactions
Return type:
-
plot_dos
(eunits='eV', per_energy=1.0, lbound=None, ubound=None, atom_groups=[], group_colors=[], group_labels=[], group_fill=False, legend_size=10, ax=None)[source]¶ plot Density of States
- eunits : str
- unit of energy
- per_energy : float
- energy interval to group states by
- lbound : float
- lower bound energy
- ubound: float
- upper bound energy
- atom_groups : list of lists or strings
- atom groups to highlight
- group_colors : list of str
- highlight colour for each atom group format adheres to matplotlib.colors
- group_labels : list of str
- label for each atom group
- group_fill : bool
- whether to fill colour for groups
- legend_size : int
- the font size (in pts) for the legend
- ax : matplotlib.Axes
- an existing axes to plot the data on
Returns: plot – plotted optimisation data Return type: matplotlib.axes.Axes
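Read literally, "energy interval to group states by" describes a histogram of orbital energies. The sketch below counts states per interval between the two bounds. It is a simplified reading only: plot_dos may broaden, normalise or weight the states differently, and density_of_states is a hypothetical name.

```python
def density_of_states(energies, per_energy=1.0, lbound=-20.0, ubound=10.0):
    """Count states in consecutive energy bins of width `per_energy`.

    Returns (bin_starts, counts); energies outside [lbound, ubound) are ignored.
    """
    n_bins = int(round((ubound - lbound) / per_energy))
    counts = [0] * n_bins
    for energy in energies:
        # floor division so that energies below lbound give a negative index
        index = int((energy - lbound) // per_energy)
        if 0 <= index < n_bins:
            counts[index] += 1
    bin_starts = [lbound + i * per_energy for i in range(n_bins)]
    return bin_starts, counts
```

A partial density of states for an atom group could be sketched the same way by weighting each state with that group's contribution instead of adding 1.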
-
plot_freq_analysis
(color='blue', alpha=1, marker_size=20, ax=None)[source]¶ plot frequency analysis
Returns: data – plotted frequency data Return type: matplotlib.axes.Axes
-
plot_opt_energy
(units='eV', linecolor='blue', ax=None)[source]¶ plot SCF optimisation energy
Returns: data – plotted optimisation data Return type: matplotlib.axes.Axes
-
plot_opt_trajectory
(atom, plane=[], ax_lims=None, ax_labels=False)[source]¶ plot the trajectory of an atom as it is optimised, relative to a plane of three atoms
-
plot_pes_scans
(fixed_atoms, eunits='kJmol-1', img_pos='', rotation=[0.0, 0.0, 0.0], zoom=1, order=1)[source]¶ plot Potential Energy Scan
Parameters: - img_pos (<’‘,’local_mins’,’local_maxs’,’global_min’,’global_max’>) – position image(s) of molecule conformation(s) on plot
- rotation ([float, float, float]) – rotation of molecule image(s)
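The 'local_mins' and 'local_maxs' options of img_pos imply locating turning points along the scan. A minimal sketch of one way to find them in a list of scan energies follows. How pygauss detects them (and how it treats end points or plateaus) is not described in the source, so local_extrema is purely illustrative.

```python
def local_extrema(energies, kind='local_maxs'):
    """Indexes of interior points that are strict local maxima or minima."""
    found = []
    for i in range(1, len(energies) - 1):
        prev, here, nxt = energies[i - 1], energies[i], energies[i + 1]
        if kind == 'local_maxs' and here > prev and here > nxt:
            found.append(i)
        elif kind == 'local_mins' and here < prev and here < nxt:
            found.append(i)
    return found
```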
-
show_active_orbital
(orbital, iso_value=0.03, alpha=0.5, bond_color=(255, 0, 0), antibond_color=(0, 255, 0), gbonds=True)[source]¶ get interactive representation of orbital
Parameters: - orbital (int) – the orbital to show (in range 1 to number of orbitals)
- iso_value (float) – The value for which the function should be constant.
- alpha – alpha value of iso-surface
- bond_color – color of bonding orbital surface in RGB format
- antibond_color – color of anti-bonding orbital surface in RGB format
- gbonds (bool) – guess bonds between atoms (via distance)
-
show_hbond_analysis
(min_energy=0.0, atom_groups=[], cutoff_energy=0.0, eunits='kJmol-1', bondwidth=5, gbonds=True, active=False, represent='ball_stick', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], relative=False, minval=-1, maxval=1, alpha=0.5, transparent=True, ipyimg=True)[source]¶ EXPERIMENTAL! hydrogen bond analysis DH—A
For a hydrogen bond to occur there must be both a hydrogen donor and an acceptor present. The donor in a hydrogen bond is the atom to which the hydrogen atom participating in the hydrogen bond is covalently bonded, and is usually a strongly electronegative atom such as N, O, or F. The hydrogen acceptor is the neighboring electronegative ion or molecule, and must possess a lone electron pair in order to form a hydrogen bond.
Since the hydrogen donor is strongly electronegative, it pulls the covalently bonded electron pair closer to its nucleus, and away from the hydrogen atom. The hydrogen atom is then left with a partial positive charge, creating a dipole-dipole attraction between the hydrogen atom bonded to the donor, and the lone electron pair on the acceptor.
-
show_highlight_atoms
(atomlists, transparent=False, alpha=0.7, gbonds=True, active=False, optimised=True, represent='vdw', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], ipyimg=True)[source]¶ show optimised geometry of molecule with certain atoms highlighted
-
show_initial
(gbonds=True, active=False, represent='vdw', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], ipyimg=True)[source]¶ show initial geometry (before optimisation) of molecule coloured by atom type
-
show_nbo_charges
(gbonds=True, active=False, relative=False, minval=-1, maxval=1, represent='vdw', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], ipyimg=True)[source]¶ show optimised geometry coloured by charge from nbo analysis
-
show_optimisation
(opt_step=False, gbonds=True, active=False, represent='vdw', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], ipyimg=True)[source]¶ show optimised geometry of molecule coloured by atom type
-
show_sopt_bonds
(min_energy=20.0, cutoff_energy=0.0, atom_groups=[], bondwidth=5, eunits='kJmol-1', no_hbonds=False, gbonds=True, active=False, represent='ball_stick', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], relative=False, minval=-1, maxval=1, alpha=0.5, transparent=True, ipyimg=True)[source]¶ visualisation of interactions between “filled” (donor) Lewis-type Natural Bonding Orbitals (NBOs) and “empty” (acceptor) non-Lewis NBOs, using Second Order Perturbation Theory
-
yield_orbital_images
(orbitals, iso_value=0.02, extents=(2, 2, 2), transparent=True, alpha=0.5, wireframe=True, bond_color=(255, 0, 0), antibond_color=(0, 255, 0), resolution=100, gbonds=True, represent='ball_stick', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], ipyimg=True)[source]¶ yield orbital images
Parameters: - orbitals (int or list of ints) – the orbitals to show (in range 1 to number of orbitals)
- iso_value (float) – The value for which the function should be constant.
- extents ((float, float, float)) – +/- x,y,z to extend the molecule geometry when constructing the surface
- transparent (bool) – whether iso-surface should be transparent (based on alpha value)
- alpha – alpha value of iso-surface
- wireframe – whether iso-surface should be wireframe (or solid)
- bond_color – color of bonding orbital surface in RGB format
- antibond_color – color of anti-bonding orbital surface in RGB format
- resolution (int) – The number of grid points to use for the surface. A high value will give better quality but lower performance.
- gbonds (bool) – guess bonds between atoms (via distance)
- represent (str) – representation of molecule (‘none’, ‘wire’, ‘vdw’ or ‘ball_stick’)
- zoom (float) – zoom level of images
- width (int) – width of original images
- height (int) – height of original images (although width takes precedence)
- axis_length (float) – length of x,y,z axes in negative and positive directions
- lines ([start_coord, end_coord, start_color, end_color, width, dashed]) – lines to add to image
- ipyimg (bool) – whether to return an IPython image, PIL image otherwise
Returns: mol – an image of the molecule in the format specified by ipyimg
Return type: IPython.display.Image or PIL.Image
pygauss.analysis module¶
-
class
pygauss.analysis.
Analysis
(folderpath='', server=None, username=None, passwrd=None, folder_obj=None, headers=[])[source]¶ Bases:
object
a class to analyse multiple computations
Parameters: - folderpath (str) – the folder directory storing the files to be analysed
- server (str) – the name of the server storing the files to be analysed
- username (str) – the username to connect to the server
- passwrd (str) – server password, if not present it will be asked for during initialisation
- headers (list) – the variable categories for each computation
-
add_basic_properties
(props=['basis', 'nbasis', 'optimised', 'conformer'])[source]¶ adds columns giving info of basic run properties
-
add_mol_property
(name, method, *args, **kwargs)[source]¶ compute molecule property for all rows and create a data column
Parameters:
-
add_mol_property_subset
(name, method, rows=[], filters={}, args=[], kwargs={}, relative_to_rows=[])[source]¶ compute molecule property for a subset of rows and create/add-to data column
Parameters: - name (str or list of strings) – name for output column (multiple if method outputs more than one value)
- method (str) – what molecule method to call
- rows (list) – what molecule rows to calculate the property for
- filters (dict) – filter for selecting molecules to calculate the property for
- args (list) – the arguments to pass to the molecule method
- kwargs (dict) – the keyword arguments to pass to the molecule method
- relative_to_rows (list of ints) – compute values relative to the summated value(s) of molecule at the rows listed
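Passing the method as a string, as in the parameters above, suggests name-based dispatch on each molecule object. The sketch below shows that idea with a stand-in class. It assumes 0-based row numbers and NaN for rows outside the subset; both are assumptions, and FakeMolecule and compute_property are hypothetical names.

```python
class FakeMolecule(object):
    """Stand-in for a molecule object; for illustration only."""

    def __init__(self, energy):
        self._energy = energy

    def get_opt_energy(self, units='eV'):
        # a real molecule would convert units; this stub returns the stored value
        return self._energy


def compute_property(molecules, method, rows=None, args=(), kwargs=None):
    """Call `method` by name on the selected rows; other rows get NaN."""
    kwargs = kwargs or {}
    column = []
    for row, mol in enumerate(molecules):
        if rows is None or row in rows:
            column.append(getattr(mol, method)(*args, **kwargs))
        else:
            column.append(float('nan'))
    return column


mols = [FakeMolecule(-805.105), FakeMolecule(-805.104), FakeMolecule(-805.118)]
energies = compute_property(mols, 'get_opt_energy', rows=[1, 2],
                            kwargs={'units': 'hartree'})
```

Under these assumptions, rows left out of the subset hold NaN, which is how an untouched cell appears in a pandas table.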
-
add_run
(identifiers={}, init_fname=None, opt_fname=None, freq_fname=None, nbo_fname=None, alignto=[], atom_groups={}, add_if_error=False, folder_obj=None)[source]¶ add single Gaussian run input/outputs
-
add_runs
(headers=[], values=[], init_pattern=None, opt_pattern=None, freq_pattern=None, nbo_pattern=None, add_if_error=False, alignto=[], atom_groups={}, ipython_print=False, folder_obj=None)[source]¶ add multiple Gaussian run inputs/outputs
-
calc_kmean_groups
(category_column, category_name, groups, columns=[], rows=[], filters={})[source]¶ calculate the kmeans grouping of rows
The KMeans algorithm clusters data by trying to separate samples in n groups of equal variance, minimizing a criterion known as the inertia or within-cluster sum-of-squares. This algorithm requires the number of clusters to be specified. It scales well to large number of samples and has been used across a large range of application areas in many different fields.
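The inertia-minimising procedure described above is usually carried out with Lloyd's algorithm, which alternates an assignment step and a centroid update step. The pure-Python sketch below shows those two steps. It uses a deliberately naive initialisation (the first k points) and is not how calc_kmean_groups is implemented; the source does not give its implementation.

```python
def _sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: returns (labels, centroids) after fixed iterations."""
    centroids = [list(p) for p in points[:k]]  # naive initialisation
    labels = []
    for _ in range(iterations):
        # assignment step: each point joins its nearest centroid
        labels = [min(range(k), key=lambda c: _sq_dist(p, centroids[c]))
                  for p in points]
        # update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / float(len(members))
                                for col in zip(*members)]
    return labels, centroids
```

In practice the columns would normally be scaled before clustering, since variables with large numeric ranges otherwise dominate the distance.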
-
folder
¶ The folder for gaussian runs
-
get_basic_property
(prop, *args, **kwargs)[source]¶ returns a series of a basic run property or nan if it is not available
Parameters: prop (str) – can be ‘basis’, ‘nbasis’, ‘optimised’, ‘opt_error’ or ‘conformer’
-
get_table
(rows=[], columns=[], filters={}, precision=4, head=False, mol=False, row_index=[], column_index=[], as_image=False, na_rep='-', font_size=None, width=None, height=None, unconfined=False)[source]¶ return pandas table of requested data in requested format
- rows : integer or list of integers
- select row ids
- columns : string/integer or list of strings/integers
- select column names/positions
- filters : dict
- filter for rows with certain value(s) in specific columns
- precision : int
- decimal precision of displayed values
- head : int
- return only first n rows
- mol : bool
- include column containing the molecule objects
- row_index : string or list of strings
- columns to use as new index
- column_index : list of strings
- strings to place into higher-order column indexes
- as_image : bool
- output the table as an image (uses pygauss.utils.df_to_img)
- na_rep : str
- how to represent empty (nan) cells (if outputting image)
- width, height, unconfined : int, int, bool
- args for IPy Image
Returns: df – a table of data Return type: pandas.DataFrame
-
plot_mol_graphs
(gtype='energy', share_plot=False, max_cols=1, padding=(1, 1), tick_rotation=0, rows=[], filters={}, sort_columns=[], info_columns=[], info_incl_id=False, start_letter='A', grid=True, sharex=True, sharey=True, legend_size=10, color_scheme='jet', eunits='eV', per_energy=1.0, lbound=None, ubound=None, atom_groups=[], group_colors=[], group_labels=[], group_fill=False)[source]¶ get a set of data plots for each molecule
Parameters: - gtype (str) – the type of plot: energy = optimisation energies, freq = frequency analysis, dos = Density of States
- share_plot (bool) – whether to plot all data on the same or separate axes
- max_cols (int) – maximum columns on plots (share_plot=False only)
- padding (tuple) – padding between images (horizontally, vertically)
- tick_rotation (int) – rotation of x-axis labels
- rows (int or list) – index for the row of each molecule to plot (all plotted if empty)
- filters (dict) – {columns:values} to filter by
- sort_columns (list of str) – columns to sort by
- info_columns (list of str) – columns to use as info in caption
- info_incl_id (bool) – include molecule id number in labels
- start_letter (str) – starting (capital) letter for labelling subplots (share_plot=False only)
- grid (bool) – whether to include a grid in the axes
- sharex (bool) – whether to align x-axes (share_plot=False only)
- sharey (bool) – whether to align y-axes (share_plot=False only)
- legend_size (int) – the font size (in pts) for the legend
- color_scheme (str) – the scheme to use for each molecule (share_plot=True only) according to http://matplotlib.org/examples/color/colormaps_reference.html
- eunits (str) – the units of energy to use
- per_energy (float) – energy interval to group states by (DoS only)
- lbound (float) – lower bound energy (DoS only)
- ubound (float) – upper bound energy (DoS only)
- atom_groups (list of lists or strings) – atom groups to highlight (DoS only)
- group_colors (list of str) – highlight colour for each atom group (DoS only) format adheres to matplotlib.colors
- group_labels (list of str) – label for each atom group (DoS only)
- group_fill (bool) – whether to fill colour for groups (DoS only)
Returns: - data (matplotlib.figure.Figure) – plotted frequency data
- caption (str) – A caption describing each subplot, given info_columns
-
plot_mol_images
(mtype='optimised', max_cols=1, padding=(1, 1), sort_columns=[], info_columns=[], info_incl_id=False, label_size=20, start_letter='A', rows=[], filters={}, align_to=[], rotations=[[0.0, 0.0, 0.0]], gbonds=True, represent='ball_stick', zoom=1.0, width=500, height=500, axis_length=0, relative=False, minval=-1, maxval=1, highlight=[], frame_on=False, eunits='kJmol-1', sopt_min_energy=20.0, sopt_cutoff_energy=0.0, atom_groups=[], alpha=0.5, transparent=False, hbondwidth=5, no_hbonds=False)[source]¶ show molecules in matplotlib table of axes
Parameters: - mtype – ‘initial’, ‘optimised’, ‘nbo’, ‘highlight’, ‘highlight-initial’, ‘sopt’ or ‘hbond’
- max_cols (int) – maximum columns in plot
- padding (tuple) – padding between images (horizontally, vertically)
- sort_columns (list of str) – columns to sort by
- info_columns (list of str) – columns to use as info in caption
- info_incl_id (bool) – include molecule id number in caption
- label_size (int) – subplot label size (pts)
- start_letter (str) – starting (capital) letter for labelling subplots
- rows (int or list) – index for the row of each molecule to plot (all plotted if empty)
- filters (dict) – {columns:values} to filter by
- align_to ([int, int, int]) – align geometries to the plane containing these atoms
- rotations (list of [float, float, float]) – for each rotation set [x,y,z] an image will be produced
- gbonds (bool) – guess bonds between atoms (via distance)
- represent (str) – representation of molecule (‘none’, ‘wire’, ‘vdw’ or ‘ball_stick’)
- zoom (float) – zoom level of images
- width (int) – width of original images
- height (int) – height of original images (although width takes precedence)
- axis_length (float) – length of x,y,z axes in negative and positive directions
- relative (bool) – coloring of nbo atoms scaled to min/max values in atom set (for nbo mtype)
- minval (float) – coloring of nbo atoms scaled to absolute min (for nbo mtype)
- maxval (float) – coloring of nbo atoms scaled to absolute max (for nbo mtype)
- highlight (list of lists) – atom indexes to highlight (for highlight mtype)
- eunits (str) – the units of energy to return (for sopt/hbond mtype)
- sopt_min_energy (float) – minimum energy to show (for sopt/hbond mtype)
- sopt_cutoff_energy (float) – energy below which bonds will be dashed (for sopt mtype)
- alpha (float) – alpha color value of geometry (for sopt/hbond mtypes)
- transparent (bool) – whether atoms should be transparent (for sopt/hbond mtypes)
- hbondwidth (float) – width of lines depicting interaction (for hbond mtypes)
- atom_groups ([list or str, list or str]) – restrict interactions to between two lists (or identifiers) of atom indexes (for sopt/hbond mtypes)
- no_hbonds (bool) – whether to ignore H-Bonds in the calculation (for sopt only)
- frame_on (bool) – whether to show frame around each image
Returns: - fig (matplotlib.figure.Figure) – A figure containing subplots for each molecule image
- caption (str) – A caption describing each subplot, given info_columns
-
plot_radviz_comparison
(category_column, columns=[], rows=[], filters={}, point_size=30, **kwargs)[source]¶ return plot axis of radviz graph
RadViz is a way of visualizing multi-variate data. It is based on a simple spring tension minimization algorithm. Basically you set up a bunch of points in a plane. In our case they are equally spaced on a unit circle. Each point represents a single attribute. You then pretend that each sample in the data set is attached to each of these points by a spring, the stiffness of which is proportional to the numerical value of that attribute (they are normalized to unit interval). The point in the plane, where our sample settles to (where the forces acting on our sample are at an equilibrium) is where a dot representing our sample will be drawn. Depending on which class that sample belongs it will be colored differently.
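The spring picture above has a closed-form equilibrium: each sample settles at the stiffness-weighted mean of the anchor points. The sketch below computes those positions with NumPy only. The name radviz_points is hypothetical, and placing all-zero samples at the origin is a choice made for this example rather than documented behaviour.

```python
import numpy as np


def radviz_points(data):
    """Return the 2D RadViz position of each row of data."""
    data = np.asarray(data, dtype=float)
    n_attr = data.shape[1]
    # normalise every attribute (column) to the unit interval
    lo, hi = data.min(axis=0), data.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    weights = (data - lo) / span
    # one anchor per attribute, equally spaced on the unit circle
    angles = 2.0 * np.pi * np.arange(n_attr) / n_attr
    anchors = np.column_stack([np.cos(angles), np.sin(angles)])
    # force balance sum_i k_i (a_i - p) = 0  =>  p = sum_i k_i a_i / sum_i k_i
    total = weights.sum(axis=1, keepdims=True)
    total[total == 0] = 1.0  # samples with no pull stay at the origin
    return weights @ anchors / total


positions = radviz_points([[0, 0, 0], [1, 0, 0], [1, 1, 1], [0, 1, 0]])
```

A sample dominated by one attribute lands near that attribute's anchor, while a sample with equal values in every attribute sits at the centre of the circle.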
-
remove_rows
(rows)[source]¶ remove one or more rows of molecules
Parameters: rows (int or list of ints) – the rows to remove
-
yield_mol_images
(rows=[], filters={}, mtype='optimised', sort_columns=[], align_to=[], rotations=[[0.0, 0.0, 0.0]], gbonds=True, represent='ball_stick', zoom=1.0, width=300, height=300, axis_length=0, relative=False, minval=-1, maxval=1, highlight=[], active=False, sopt_min_energy=20.0, sopt_cutoff_energy=0.0, atom_groups=[], alpha=0.5, transparent=False, hbondwidth=5, eunits='kJmol-1', no_hbonds=False, ipyimg=True)[source]¶ yields molecules
Parameters: - mtype – ‘initial’, ‘optimised’, ‘nbo’, ‘highlight’, ‘highlight-initial’, ‘sopt’ or ‘hbond’
- info_columns (list of str) – columns to use as info in caption
- max_cols (int) – maximum columns in plot
- label_size (int) – subplot label size (pts)
- start_letter (str) – starting (capital) letter for labelling subplots
- save_fname (str) – name of file, if you wish to save the plot to file
- rows (int or list) – index for the row of each molecule to plot (all plotted if empty)
- filters (dict) – {columns:values} to filter by
- sort_columns (list of str) – columns to sort by
- align_to ([int, int, int]) – align geometries to the plane containing these atoms
- rotations (list of [float, float, float]) – for each rotation set [x,y,z] an image will be produced
- gbonds (bool) – guess bonds between atoms (via distance)
- represent (str) – representation of molecule (‘none’, ‘wire’, ‘vdw’ or ‘ball_stick’)
- zoom (float) – zoom level of images
- width (int) – width of original images
- height (int) – height of original images (although width takes precedence)
- axis_length (float) – length of x,y,z axes in negative and positive directions
- relative (bool) – coloring of nbo atoms scaled to min/max values in atom set (for nbo mtype)
- minval (float) – coloring of nbo atoms scaled to absolute min (for nbo mtype)
- maxval (float) – coloring of nbo atoms scaled to absolute max (for nbo mtype)
- highlight (list of lists) – atom indexes to highlight (for highlight mtype)
- eunits (str) – the units of energy to return (for sopt/hbond mtype)
- sopt_min_energy (float) – minimum energy to show (for sopt/hbond mtype)
- sopt_cutoff_energy (float) – energy below which bonds will be dashed (for sopt mtype)
- alpha (float) – alpha color value of geometry (for highlight/sopt/hbond mtypes)
- transparent (bool) – whether atoms should be transparent (for highlight/sopt/hbond mtypes)
- hbondwidth (float) – width of lines depicting interaction (for hbond mtypes)
- atom_groups ([list or str, list or str]) – restrict interactions to between two lists (or identifiers) of atom indexes (for sopt/hbond mtypes)
- no_hbonds (bool) – whether to ignore H-Bonds in the calculation
- ipyimg (bool) – whether to return an IPython image, PIL image otherwise
Yields: - indx (int) – the row index of the molecule
- mol (IPython.display.Image or PIL.Image) – an image of the molecule in the format specified by ipyimg
pygauss.docs module¶
Created on Tue Jun 16 15:52:53 2015
@author: chris sewell
-
class
pygauss.docs.
MSDocument
(docx=None)[source]¶ Bases:
object
a class to output a Microsoft Word Document
inherited api details for
docx.document.Document
can be found at https://python-docx.readthedocs.org/en/latest/api/document.html
The class keeps an internal count of the calls to add_picture and add_table, for use in caption numbering.
Parameters: docx (str or file-like object) – can be either a path to a .docx file (a string) or a file-like object. If docx is missing or None, the built-in default document “template” is loaded.
-
__dir__
()[source]¶ required to have
docx.document.Document
methods in IPython
tab completion
-
__getattr__
(name)[source]¶ required to get
docx.document.Document
methods
-
add_dataframe
(df, incl_indx=True, autofit=True, sig_figures=5, style='Medium List 1 Accent 1', caption=None)[source]¶ add dataframe as a table to the document
Returns: pic – a table added to the document
-
add_docstring
(docstring, style='Body Text', markdown=True)[source]¶ adds a docstring to the document
this function splits the text into paragraphs (delimited by a separating blank line), removes new-line characters within each paragraph and adds the paragraphs to the document, allowing for markdown-style text as described in
pygauss.docs.MSDocument.add_markdown()
Returns: paras – a list of paragraphs added to the document
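The paragraph splitting described for add_docstring can be sketched in a few lines of standard-library Python. This shows the idea only and may differ from what PyGauss does internally; split_paragraphs is a hypothetical helper name.

```python
import re


def split_paragraphs(docstring):
    """Split on blank lines, then collapse each block onto a single line."""
    blocks = re.split(r"\n\s*\n", docstring.strip())
    # str.split() with no argument drops new-lines and repeated spaces
    return [" ".join(block.split()) for block in blocks if block.strip()]


paragraphs = split_paragraphs(
    "First line\n    continues here.\n\nSecond\nparagraph."
)
```

Each returned string would then correspond to one paragraph added to the document.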
-
add_markdown
(text='', style='Body Text', markup_dict=None, para=None)[source]¶ adds a paragraph to the document, allowing for paragraph/font styling akin to a stripped down version of markdown text:
paragraph level:
    # Header (level denoted by number of #'s)
    - bullet list
    1. numbered list
font level:
    **bold**
    *italic*
    _{subscript}
    ^{superscript}
    ~~strikethrough~~
    $mathML$
Parameters: - text (str) – the text to add
- style (str) – the style to apply (overridden by paragraph level markdown)
- markup_dict (dict) – if set, overrides the built-in font level markup; given as {font_attribute:(start_chars, end_chars)}
- para (docx.text.paragraph.Paragraph) – a pre-existing paragraph to add the text to; if set, paragraph level markdown is ignored
Returns: para – a paragraph added to the document
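To make the {font_attribute:(start_chars, end_chars)} form of markup_dict more concrete, here is a small regular-expression sketch that splits a string into plain and styled runs. The attribute names in MARKUP and the split_runs helper are assumptions for illustration; they are not taken from the PyGauss source, and the $mathML$ marker is left out.

```python
import re

# hypothetical attribute names mapped to (start_chars, end_chars)
MARKUP = {
    "bold": ("**", "**"),
    "italic": ("*", "*"),
    "subscript": ("_{", "}"),
    "superscript": ("^{", "}"),
    "strike": ("~~", "~~"),
}


def split_runs(text, markup=MARKUP):
    """Split text into (fragment, attribute or None) runs."""
    # try longer start markers first so '**' is not read as two '*'
    order = sorted(markup.items(), key=lambda item: -len(item[1][0]))
    alternatives = [
        "(?P<{0}>{1}(?P<{0}_body>.+?){2})".format(
            name, re.escape(start), re.escape(end))
        for name, (start, end) in order
    ]
    pattern = re.compile("|".join(alternatives))
    runs, pos = [], 0
    for match in pattern.finditer(text):
        if match.start() > pos:
            runs.append((text[pos:match.start()], None))
        # find which attribute's group took part in this match
        name = next(n for n in markup if match.group(n) is not None)
        runs.append((match.group(name + "_body"), name))
        pos = match.end()
    if pos < len(text):
        runs.append((text[pos:], None))
    return runs


runs = split_runs("H_{2}O is **very** *wet*")
```

Each run could then be written to the paragraph with the matching font attribute switched on.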
-
add_mpl
(fig, dpi=None, width=None, height=None, pad_inches=0.2, caption=None)[source]¶ add matplotlib figure to the document
Returns: pic – an inline picture added to the document
-
add_picture
(image_path_or_stream, width=None, height=None)[source]¶ Return a new picture shape added in its own paragraph at the end of the document. The picture contains the image at image_path_or_stream, scaled based on width and height. If neither width nor height is specified, the picture appears at its native size. If only one is specified, it is used to compute a scaling factor that is then applied to the unspecified dimension, preserving the aspect ratio of the image. The native size of the picture is calculated using the dots-per-inch (dpi) value specified in the image file, defaulting to 72 dpi if no value is specified, as is often the case.
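The sizing rules above can be written out as a short calculation. This is only an arithmetic sketch of the described behaviour, with sizes in inches for readability; scaled_size is a hypothetical helper and not part of python-docx or PyGauss.

```python
def scaled_size(px_width, px_height, dpi=None, width=None, height=None):
    """Return the (width, height) in inches that the rules above imply."""
    dpi = float(dpi or 72)  # fall back to 72 dpi if the file gives no value
    native_w, native_h = px_width / dpi, px_height / dpi
    if width is None and height is None:
        return native_w, native_h  # native size
    if height is None:
        # only width given: scale height by the same factor
        return width, native_h * (width / native_w)
    if width is None:
        # only height given: scale width by the same factor
        return native_w * (height / native_h), height
    return width, height  # both given: used as they are


native = scaled_size(720, 360)
half_width = scaled_size(720, 360, width=5.0)
print_quality = scaled_size(600, 300, dpi=300, height=2.0)
```

With only one dimension supplied the aspect ratio is preserved, which is usually what is wanted for figures.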
-
pygauss.isosurface module¶
Created on Mon May 25 15:23:49 2015
@author: chris
based on the add_isosurface function from chemview
-
pygauss.isosurface.
calc_normals
(verts, faces)[source]¶ from; https://sites.google.com/site/dlampetest/python/calculating-normals-of-a-triangle-mesh-using-numpy
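For orientation, per-vertex normals of a triangle mesh are commonly obtained by summing the normals of the faces that share each vertex and normalising the result. The sketch below does this with NumPy, using area-weighted face normals; it is an assumption about the general technique and may differ in detail from calc_normals. The name vertex_normals is hypothetical.

```python
import numpy as np


def vertex_normals(verts, faces):
    """Unit normal per vertex from (n_verts, 3) verts and (n_faces, 3) faces."""
    verts = np.asarray(verts, dtype=float)
    faces = np.asarray(faces, dtype=int)
    tris = verts[faces]  # shape (n_faces, 3, 3)
    # face normal = cross product of two edges (length is twice the area)
    face_n = np.cross(tris[:, 1] - tris[:, 0], tris[:, 2] - tris[:, 0])
    normals = np.zeros_like(verts)
    for corner in range(3):
        # np.add.at accumulates correctly when a vertex index repeats
        np.add.at(normals, faces[:, corner], face_n)
    lengths = np.linalg.norm(normals, axis=1, keepdims=True)
    lengths[lengths == 0] = 1.0  # leave unused vertices as zero vectors
    return normals / lengths


square = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]]
normals = vertex_normals(square, [[0, 1, 2], [0, 2, 3]])
```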
-
pygauss.isosurface.
get_isosurface
(coordinates, function, isolevel=0.03, color=(255, 0, 0, 255), extents=(5, 5, 5), resolution=100)[source]¶ Add an isosurface to the current scene.
Parameters: - coordinates (numpy.array) – the coordinates of the system
- function (func) – A function that takes x, y, z coordinates as input and is broadcastable using numpy. Typically simple functions that involve standard arithmetic operations and functions, such as x**2 + y**2 + z**2 or np.exp(x**2 + y**2 + z**2), will work. If not sure, you can first pass the function through numpy.vectorize. Example: mv.add_isosurface(np.vectorize(f))
- isolevel (float) – The value for which the function should be constant.
- color ((int, int, int, int)) – The color given in RGBA format
- extents ((float, float, float)) – +/- x,y,z to extend the molecule geometry when constructing the surface
- resolution (int) – The number of grid points to use for the surface. A high value will give better quality but lower performance.
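The requirement that function be broadcastable can be checked by evaluating it on a coordinate grid. The sketch below builds such a grid around a set of coordinates and shows numpy.vectorize wrapping a scalar-only function. How get_isosurface actually builds its grid is not shown in this page, so sample_on_grid, the per-axis use of resolution and the two example functions are assumptions.

```python
import math

import numpy as np


def sample_on_grid(function, coordinates, extents=(5, 5, 5), resolution=20):
    """Evaluate function(x, y, z) on a box padded by extents around coordinates."""
    coordinates = np.asarray(coordinates, dtype=float)
    pad = np.asarray(extents, dtype=float)
    lo = coordinates.min(axis=0) - pad
    hi = coordinates.max(axis=0) + pad
    axes = [np.linspace(lo[i], hi[i], resolution) for i in range(3)]
    x, y, z = np.meshgrid(*axes, indexing="ij")
    return function(x, y, z)  # needs a function that accepts arrays


def gaussian(x, y, z):
    # built from numpy operations, so it broadcasts as it is
    return np.exp(-(x**2 + y**2 + z**2))


def half_gaussian(x, y, z):
    # uses math.exp and an if-test, so it only accepts scalars
    return math.exp(-(x * x + y * y + z * z)) if x > 0 else 0.0


atoms = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
values = sample_on_grid(gaussian, atoms)
wrapped = sample_on_grid(np.vectorize(half_gaussian, otypes=[float]), atoms)
```

Note that numpy.vectorize is a convenience loop rather than a speed-up, so a function written with numpy operations is preferable for high resolutions.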
pygauss.utils module¶
Created on Thu Apr 30 01:08:30 2015
@author: chris
-
pygauss.utils.
circumcenter
(pts)[source]¶ Computes the circumcenter and circumradius of M, N-dimensional points (1 <= M <= N + 1 and N >= 1). The points are given by the rows of an (M)x(N) dimensional matrix pts.
Returns a tuple (center, radius) where center is a column vector of length N and radius is a scalar.
In the case of four points in 3D, pts is a 4x3 matrix arranged as:
    pts = [ x0 y0 z0 ]
          [ x1 y1 z1 ]
          [ x2 y2 z2 ]
          [ x3 y3 z3 ]
with return value ([ cx cy cz ], R)
Uses an extension of the method described here: http://www.ics.uci.edu/~eppstein/junkyard/circumcenter.html
-
pygauss.utils.
circumcenter_barycoords
(pts)[source]¶ Computes the barycentric coordinates of the circumcenter of M, N-dimensional points (1 <= M <= N + 1 and N >= 1). The points are given by the rows of an (M)x(N) dimensional matrix pts.
Uses an extension of the method described here: http://www.ics.uci.edu/~eppstein/junkyard/circumcenter.html
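The two circumcenter helpers above are closely related: once the barycentric coordinates are known, the center is the weighted sum of the points. The NumPy sketch below solves the equidistance conditions directly, which is one standard way to do it and not necessarily the method used in pygauss.utils. It assumes the points are affinely independent and returns the center as a 1D array; the function names are hypothetical.

```python
import numpy as np


def circumcenter_weights(pts):
    """Barycentric coordinates of the circumcenter of the rows of pts."""
    pts = np.asarray(pts, dtype=float)
    if len(pts) == 1:
        return np.array([1.0])
    edges = pts[1:] - pts[0]  # vectors from the first point
    gram = edges @ edges.T
    # |c - p_i|^2 = |c - p_0|^2 with c = p_0 + edges.T @ lam
    # reduces to 2 * gram @ lam = diag(gram)
    lam = np.linalg.solve(2.0 * gram, np.diag(gram))
    return np.concatenate([[1.0 - lam.sum()], lam])


def circumsphere(pts):
    """Return (center, radius) of the sphere through the rows of pts."""
    pts = np.asarray(pts, dtype=float)
    center = circumcenter_weights(pts) @ pts
    return center, float(np.linalg.norm(pts[0] - center))


tri_weights = circumcenter_weights([[0, 0], [2, 0], [0, 2]])
center, radius = circumsphere([[0, 0, 0], [2, 0, 0], [0, 2, 0], [0, 0, 2]])
```

For the right-angled triangle in the example the circumcenter is the midpoint of the hypotenuse, so the weight of the right-angle vertex is zero.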
-
pygauss.utils.
df_to_img
(df, na_rep='-', other_temp=None, font_size=None, width=None, height=None, unconfined=False)[source]¶ converts a pandas DataFrame to an IPython image
- na_rep : str
- how to represent empty (nan) cells
- other_temp : str
- a latex template to use for the table other than the default
The function uses pandas to convert the dataframe to a latex table, applies a template, converts it to a PDF, converts that to an image, and finally returns the image.
To use this function you will need the pdflatex executable from a tex distribution and the convert executable from imagemagick, which in turn requires ghostscript; http://www.ghostscript.com/download/gsdnld.html http://www.imagemagick.org/script/binary-releases.php
NB: on Windows some issues were found with convert being an already existing application. To overcome this, change its filename and use the im_name variable.
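Before calling df_to_img it can help to confirm that the external tools are on the PATH. The standard-library sketch below does that and also spells out the two external commands of the latex to PDF to image chain. It is an illustration only: the helper names, the gs executable name (the usual Ghostscript name on Linux and OSX) and the exact command-line options are assumptions, not taken from the PyGauss source.

```python
import os
import shutil


def missing_executables(names=("pdflatex", "convert", "gs")):
    """Return the names that cannot be found on the PATH."""
    return [name for name in names if shutil.which(name) is None]


def pipeline_commands(tex_path, convert_name="convert", density=300):
    """Build (but do not run) the commands for tex -> pdf -> png."""
    out_dir = os.path.dirname(tex_path) or "."
    stem = os.path.splitext(tex_path)[0]
    return [
        ["pdflatex", "-interaction=nonstopmode",
         "-output-directory", out_dir, tex_path],
        # convert_name can be changed if the executable was renamed on Windows
        [convert_name, "-density", str(density), stem + ".pdf", stem + ".png"],
    ]


commands = pipeline_commands("table.tex", convert_name="imconvert")
```

Each command list could be passed to subprocess.check_call once missing_executables() returns an empty list.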
-
pygauss.utils.
imgplot_kmean_groups
(analysis, category, cat_name, groups, columns, filters={}, output=[], max_cols=2, **kwargs)[source]¶