Welcome to PyGauss’s documentation!¶
Author | Chris Sewell |
Project Page | https://github.com/chrisjsewell/PyGauss |
Conda Distro | https://conda.binstar.org/cjs14 |
PyPi Distro | https://pypi.python.org/pypi/pygauss |
PyGauss is designed to be an API for examining one or more input/output files from a Gaussian quantum chemical computation, providing functionality to assess molecular geometry and electronic distribution both visually and quantitatively.
It is built on top of the cclib/chemview/chemlab suite of packages and python scientific stack and is primarily designed to be used interactively in the IPython Notebook . As shown in the examples, a molecular optimisation can be assesed individually (much like in gaussview), but also as part of a group. The advantages of this package are then:
- Faster, more efficient analysis
- Reproducible analysis
- Extensible analysis
Contents¶
Installation¶
OSX and Linux¶
The recommended was to use pygauss is to download the Anaconda Scientific Python Distribution (64-bit). Once downloaded a new environment can be created in terminal and pygauss installed:
conda create -n pg_env python=2.7
conda install -c https://conda.binstar.org/cjs14 -n pg_env pygauss
Windows¶
There is currently no pygauss Conda distributable for Windows or for chemlab, which has C-extensions that need to be built using a compiler. Therefore chemlab will need to be cloned from GitHub, its extensions built, dependancies installed and finally install pygauss.
conda create -n pg_env python=2.7
conda install -n pg_env -c https://conda.binstar.org/cjs14 cclib
conda install -n pg_env -c https://conda.binstar.org/cjs14 chemview
conda install -n pg_env -c https://conda.binstar.org/cjs14 pyopengl
git clone --recursive https://github.com/chemlab/chemlab.git
cd chemlab
python setup.py build_ext --inplace
conda install -n pg_env <pil, pandas, matplotlib, scikit-learn, ...>
activate pg_env
pip install . # or add to PYTHONPATH
pip install pygauss
Troubleshooting¶
If you encounter difficulties it may be useful to look in working_conda_environments at conda environments known to work.
Whats New¶
v0.4.2 - Addition of Documentation¶
addition of Sphinx documentation
v0.4.0 - Major Update¶
update includes:
- refactoring of data io
- improvement of second order perturbation theory analysis
- image output to table
- addition of unit test suite
- improvement of method documentation
breaks some back compatibility
v0.3.0 - File Input Over SSH¶
main update is the ability setup an ssh connection to a server, using the paramiko library, and parse analysis files over it. Also the ability to use wildcards (*) in input file names.
some minor back compatibility breaks
v0.2.2 - Table Image Improvements¶
Improvements to Table to Image functionality on OSx
- added some fixes
- re-organised test modules
v0.2.1 - Latex Table Images¶
addition of functionality to output analysis tables as latex images for input into projects!
v0.2 - Initial working distribution¶
Working distribution of pygauss to be converted to first conda package
v0.1 - First Version¶
the first version
Example Assessment¶
You should be able to open an IPython Notebook and perform the the following:
from IPython.display import display
%matplotlib inline
import pygauss as pg
print 'pygauss version: {}'.format(pg.__version__)
pygauss version: 0.4.0
and access the test folder with a number of example Gaussian outputs.
folder = pg.get_test_folder()
len(folder.list_files())
33
Note: the folder object will act identical whether using a local path or one on a server over ssh (using paramiko):
folder = pg.Folder('/path/to/folder',
ssh_server='login.server.com',
ssh_username='username')
Single Molecule Analysis¶
A molecule can be created containg data about the inital geometry, optimisation process and analysis of the final configuration. Molecules can be viewed statically or interactively (not currently supported by Firefox).
mol = pg.molecule.Molecule(folder_obj=folder,
init_fname='CJS1_emim-cl_B_init.com',
opt_fname=['CJS1_emim-cl_B_6-311+g-d-p-_gd3bj_opt-modredundant_difrz.log',
'CJS1_emim-cl_B_6-311+g-d-p-_gd3bj_opt-modredundant_difrz_err.log',
'CJS1_emim-cl_B_6-311+g-d-p-_gd3bj_opt-modredundant_unfrz.log'],
freq_fname='CJS1_emim-cl_B_6-311+g-d-p-_gd3bj_freq_unfrz.log',
nbo_fname='CJS1_emim-cl_B_6-311+g-d-p-_gd3bj_pop-nbo-full-_unfrz.log',
atom_groups={'emim':range(20), 'cl':[20]},
alignto=[3,2,1])
#mol.show_initial(active=True)
display(mol.show_initial(represent='vdw', rotations=[[0,0,90], [-90, 90, 0]]))
display(mol.show_optimisation(represent='ball_stick', rotations=[[0,0,90], [-90, 90, 0]]))


Basic analysis of optimisation...
print('Optimised? {0}, Conformer? {1}, Energy = {2} a.u.'.format(
mol.is_optimised(), mol.is_conformer(),
round(mol.get_optimisation_E(units='hartree'),3)))
ax = mol.plot_optimisation_E(units='hartree')
ax.get_figure().set_size_inches(3, 2)
ax = mol.plot_freq_analysis()
ax.get_figure().set_size_inches(4, 2)
Optimised? True, Conformer? True, Energy = -805.105 a.u.


Geometric analysis...
print 'Cl optimised polar coords from aromatic ring : ({0}, {1},{2})'.format(
*[round(i, 2) for i in mol.calc_polar_coords_from_plane(20,3,2,1)])
ax = mol.plot_opt_trajectory(20, [3,2,1])
ax.set_title('Cl optimisation path')
ax.get_figure().set_size_inches(4, 3)
Cl optimised polar coords from aromatic ring : (0.11, -116.42,-170.06)

Potential Energy Scan analysis of geometric conformers...
mol2 = pg.molecule.Molecule(folder_obj=folder, alignto=[3,2,1],
pes_fname=['CJS_emim_6311_plus_d3_scan.log',
'CJS_emim_6311_plus_d3_scan_bck.log'])
ax = mol2.plot_pes_scans([1,4,9,10], rotation=[0,0,90], img_pos='local_maxs', zoom=0.5)
ax.set_title('Ethyl chain rotational conformer analysis')
ax.get_figure().set_size_inches(7, 3)

Natural Bond Orbital and Second Order Perturbation Theory analysis...
print '+ve charge centre polar coords from aromatic ring: ({0} {1},{2})'.format(
*[round(i, 2) for i in mol.calc_nbo_charge_center(3, 2, 1)])
display(mol.show_nbo_charges(represent='ball_stick', axis_length=0.4,
rotations=[[0,0,90], [-90, 90, 0]]))
+ve charge centre polar coords from aromatic ring: (0.02 -51.77,-33.15)

print 'H inter-bond energy = {} kJmol-1'.format(
mol.calc_hbond_energy(eunits='kJmol-1', atom_groups=['emim', 'cl']))
print 'Other inter-bond energy = {} kJmol-1'.format(
mol.calc_sopt_energy(eunits='kJmol-1', no_hbonds=True, atom_groups=['emim', 'cl']))
display(mol.show_sopt_bonds(min_energy=1, eunits='kJmol-1',
atom_groups=['emim', 'cl'],
no_hbonds=True,
rotations=[[0, 0, 90]]))
display(mol.show_hbond_analysis(cutoff_energy=5.,alpha=0.6,
atom_groups=['emim', 'cl'],
rotations=[[0, 0, 90], [90, 0, 0]]))
H inter-bond energy = 111.7128 kJmol-1
Other inter-bond energy = 11.00392 kJmol-1


Multiple Computations Analysis¶
Multiple computations, for instance of different starting conformations, can be grouped into an Analysis class.
analysis = pg.Analysis(folder_obj=folder)
errors = analysis.add_runs(headers=['Cation', 'Anion', 'Initial'],
values=[['emim'], ['cl'],
['B', 'BE', 'BM', 'F', 'FE']],
init_pattern='*{0}-{1}_{2}_init.com',
opt_pattern='*{0}-{1}_{2}_6-311+g-d-p-_gd3bj_opt*unfrz.log',
freq_pattern='*{0}-{1}_{2}_6-311+g-d-p-_gd3bj_freq*.log',
nbo_pattern='*{0}-{1}_{2}_6-311+g-d-p-_gd3bj_pop-nbo-full-*.log',
alignto=[3,2,1], atom_groups={'emim':range(20), 'cl':[20]})
fig, caption = analysis.plot_mol_images(mtype='initial', max_cols=3,
info_columns=['Cation', 'Anion', 'Initial'],
rotations=[[0,0,90]])
print caption
Figure: (A) emim, cl, B, (B) emim, cl, BE, (C) emim, cl, BM, (D) emim, cl, F, (E) emim, cl, FE

The methods mentioned for indivdiual molecules can then be applied to all or a subset of these computations.
analysis.add_mol_property_subset('Opt', 'is_optimised', rows=[2,3])
analysis.add_mol_property('Energy (au)', 'get_optimisation_E', units='hartree')
analysis.add_mol_property('Cation chain, $\\psi$', 'calc_dihedral_angle', [1, 4, 9, 10])
analysis.add_mol_property('Cation Charge', 'calc_nbo_charge', 'emim')
analysis.add_mol_property('Anion Charge', 'calc_nbo_charge', 'cl')
analysis.add_mol_property(['Anion-Cation, $r$', 'Anion-Cation, $\\theta$', 'Anion-Cation, $\\phi$'],
'calc_polar_coords_from_plane', 3, 2, 1, 20)
analysis.add_mol_property('Anion-Cation h-bond', 'calc_hbond_energy',
eunits='kJmol-1', atom_groups=['emim', 'cl'])
tbl = analysis.get_table(row_index=['Anion', 'Cation', 'Initial'],
column_index=['Cation', 'Anion', 'Anion-Cation'])
NEW FEATURE: there is now an option (requiring pdflatex and ghostscript+imagemagik) to output the tables as a latex formatted image.
analysis.get_table(row_index=['Anion', 'Cation', 'Initial'],
column_index=['Cation', 'Anion', 'Anion-Cation'],
as_image=True, font_size=12)

RadViz is a way of visualizing multi-variate data.
ax = analysis.plot_radviz_comparison('Anion', columns=range(4, 10))

The KMeans algorithm clusters data by trying to separate samples into n groups of equal variance.
pg.utils.imgplot_kmean_groups(
analysis, 'Anion', 'cl', 4, range(4, 10),
output=['Initial'], mtype='optimised',
rotations=[[0, 0, 90], [-90, 90, 0]],
axis_length=0.3)

Figure: (A) B, (B) BE

Figure: (A) BM

Figure: (A) FE

Figure: (A) F
MORE TO COME!!
User API¶
pygauss.file_io module¶
Created on Mon May 18 21:01:25 2015
@author: chris sewell
-
class
pygauss.file_io.
Folder
(path, server=None, username=None, passwrd=None)[source]¶ Bases:
object
an object intended to act as an entry point to a folder path
Parameters: -
list_files
(pattern=None, one_file=False)[source]¶ list files in folder
- pattern : str
- a pattern the file must match that can include * wildcards
-
save_ipyimg
(img, img_name)[source]¶ a function for outputing an IPython Image to a file
- img : IPython.display.Image
- an IPyton image
- img_name : str
- the desired name of the file
-
pygauss.molecule module¶
Created on Fri May 01 21:24:31 2015
@author: chris
-
class
pygauss.molecule.
Molecule
(folderpath='', init_fname=False, opt_fname=False, freq_fname=False, nbo_fname=False, pes_fname=False, fail_silently=False, atom_groups={}, alignto=[], server=None, username=None, passwrd=None, folder_obj=None)[source]¶ Bases:
object
a class to analyse gaussian input/output of a single molecular geometry
Parameters: - folderpath (str) – the folder path
- init_fname (str) – the intial geometry (.com) file
- opt_fname (str or list of str) – the optimisation log file
- freq_fname (str) – the frequency analysis log file
- nbo_fname (str) – the population analysis logfile
- pes_fname (str) – the potential energy scan logfile
- fail_silently (bool) – whether to raise an error if a file read fails (if True can use get_init_read_errors to see errors)
- atom_groups ({str:[int, ...]}) – groups of atoms that can be selected as a subset
- alignto ([int, int, int]) – the atom numbers to align the geometry to
- of the file names can have wildcards (e.g. 'filename*.log) in them, (any) –
- long as this resolves to a single path in the directory (as) –
- NB (nbo population analysis must be run with the GFInput flag to ensure) –
- is output to the log file (data) –
-
calc_bond_angle
(indxs, optimisation=True, mol=None)[source]¶ Returns the angle in degrees between three points
-
calc_dihedral_angle
(indxs, optimisation=True, mol=None)[source]¶ Returns the angle in degrees between four points
-
calc_min_dist
(idx_list1, idx_list2, optimisation=True, units='nm', ignore_missing=True)[source]¶ indexes start at 1
-
calc_nbo_charge_center
(p1, p2, p3, positive=True, units='nm', atoms=[])[source]¶ returns the distance r amd angles theta, phi of the positive/negative charge center to the circumcenter of the plane formed by [p1, p2, p3]
- the plane formed will have;
- x-axis along p1, y-axis anticlock-wise towards p2, z-axis normal to the plane
theta (azimuth) is the in-plane angle from the x-axis towards the y-axis phi (inclination) is the out-of-plane angle from the x-axis towards the z-axis
-
calc_opt_trajectory
(atom, plane=[])[source]¶ calculate the trajectory of an atom as it is optimised, relative to a plane of three atoms
-
calc_polar_coords_from_plane
(p1, p2, p3, c, optimisation=True, units='nm')[source]¶ returns the distance r and angles theta, phi of atom c to the circumcenter of the plane formed by [p1, p2, p3]
- the plane formed will have;
- x-axis along p1, y-axis anticlock-wise towards p2, z-axis normal to the plane
theta (azimuth) is the in-plane angle from the x-axis towards the y-axis phi (inclination) is the out-of-plane angle from the x-axis towards the z-axis
-
calc_sopt_energy
(atom_groups=[], eunits='kJmol-1', no_hbonds=False)[source]¶ calculate total energy of interactions between “filled” (donor) Lewis-type Natural Bonding Orbitals (NBOs) and “empty” (acceptor) non-Lewis NBOs, using Second Order Perturbation Theory
Parameters: Returns: analysis – a table of interactions
Return type: pandas.DataFrame
-
combine_molecules
(other_mol, self_atoms=False, other_atoms=False, self_rotation=[0, 0, 0], other_rotation=[0, 0, 0], self_transpose=[0, 0, 0], other_transpose=[0, 0, 0], self_opt=True, other_opt=True, charge=None, multiplicity=None, out_name=False, descript='', overwrite=False, active=False, represent='ball_stick', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, ipyimg=True, folder_obj=None)[source]¶ transpose in nanometers
-
get_freq_analysis
()[source]¶ return frequency analysis
Returns: data – frequency data Return type: pd.DataFrame
-
get_hbond_analysis
(min_energy=0.0, atom_groups=[], eunits='kJmol-1')[source]¶ EXPERIMENTAL! hydrogen bond analysis (DH—A), using Second Order Bond Perturbation Theiry
Parameters: Returns: - analysis (pandas.DataFrame) – a table of interactions
- uses a strict definition of a hydrogen bond as
- interactions between “filled” (donor) Lewis-type Lone Pair (LP) NBOs
- and “empty” (acceptor) non-Lewis Bonding (BD) NBOs
-
get_init_read_errors
()[source]¶ get read errors, recorded if fail_silently was set to True on initialise
-
get_optimisation_E
(units='eV', final=True)[source]¶ return the SCF optimisation energy
Parameters: Returns: out – dependant on final
Return type: float or list of floats
-
get_orbital_energies
(orbitals, eunits='eV')[source]¶ the orbital energies for listed orbitals
Parameters: - orbitals (int or iterable of ints) – the orbital(s) to return energies for (starting at 1)
- eunits (str) – the units of energy
Returns: moenergies – energy for each orbital
Return type: np.array
-
get_sopt_analysis
(eunits='kJmol-1', atom_groups=[], charge_info=False)[source]¶ interactions between “filled” (donor) Lewis-type Natural Bonding Orbitals (NBOs) and “empty” (acceptor) non-Lewis NBOs, using Second Order Perturbation Theory (SOPT)
Parameters: Returns: analysis – a table of interactions
Return type: pandas.DataFrame
-
plot_freq_analysis
()[source]¶ plot frequency analysis
Returns: data – plotted frequency data Return type: matplotlib.axes._subplots.AxesSubplot
-
plot_opt_trajectory
(atom, plane=[], ax_lims=None, ax_labels=False)[source]¶ plot the trajectory of an atom as it is optimised, relative to a plane of three atoms
-
plot_pes_scans
(fixed_atoms, eunits='kJmol-1', img_pos='', rotation=[0.0, 0.0, 0.0], zoom=1, order=1)[source]¶ plot Potential Energy Scan
- img_pos : <’‘,’local_mins’,’local_maxs’,’global_min’,’global_max’>
- position image(s) of molecule conformation(s) on plot
- rotation : [float, float, float]
- rotation of molecule image(s)
-
show_active_orbital
(orbital, iso_value=0.03, alpha=0.5, bond_color=(255, 0, 0), antibond_color=(0, 255, 0), gbonds=True)[source]¶ get interactive representation of orbital
Parameters: - orbital (int) – the orbital to show (in range 1 to number of orbitals)
- iso_value (float) – The value for which the function should be constant.
- alpha – alpha value of iso-surface
- bond_color – color of bonding orbital surface in RGB format
- antibond_color – color of anti-bonding orbital surface in RGB format
- gbonds (bool) – guess bonds between atoms (via distance)
-
show_hbond_analysis
(min_energy=0.0, atom_groups=[], cutoff_energy=0.0, eunits='kJmol-1', bondwidth=5, gbonds=True, active=False, represent='ball_stick', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], relative=False, minval=-1, maxval=1, alpha=0.5, transparent=True, ipyimg=True)[source]¶ EXPERIMENTAL! hydrogen bond analysis DH—A
For a hydrogen bond to occur there must be both a hydrogen donor and an acceptor present. The donor in a hydrogen bond is the atom to which the hydrogen atom participating in the hydrogen bond is covalently bonded, and is usually a strongly electronegative atom such as N, O, or F. The hydrogen acceptor is the neighboring electronegative ion or molecule, and must posses a lone electron pair in order to form a hydrogen bond.
Since the hydrogen donor is strongly electronegative, it pulls the covalently bonded electron pair closer to its nucleus, and away from the hydrogen atom. The hydrogen atom is then left with a partial positive charge, creating a dipole-dipole attraction between the hydrogen atom bonded to the donor, and the lone electron pair on the acceptor.
-
show_highlight_atoms
(atomlists, transparent=False, alpha=0.7, gbonds=True, active=False, optimised=True, represent='vdw', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], ipyimg=True)[source]¶
-
show_initial
(gbonds=True, active=False, represent='vdw', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], ipyimg=True)[source]¶ show initial geometry (before optimisation) of molecule
-
show_nbo_charges
(gbonds=True, active=False, relative=False, minval=-1, maxval=1, represent='vdw', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], ipyimg=True)[source]¶
-
show_optimisation
(opt_step=False, gbonds=True, active=False, represent='vdw', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], ipyimg=True)[source]¶ show optimised geometry of molecule
-
show_sopt_bonds
(min_energy=20.0, cutoff_energy=0.0, atom_groups=[], bondwidth=5, eunits='kJmol-1', no_hbonds=False, gbonds=True, active=False, represent='ball_stick', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], relative=False, minval=-1, maxval=1, alpha=0.5, transparent=True, ipyimg=True)[source]¶ visualisation of interactions between “filled” (donor) Lewis-type Natural Bonding Orbitals (NBOs) and “empty” (acceptor) non-Lewis NBOs, using Second Order Perturbation Theory
-
yield_orbital_images
(orbitals, iso_value=0.02, extents=(2, 2, 2), transparent=True, alpha=0.5, wireframe=True, bond_color=(255, 0, 0), antibond_color=(0, 255, 0), resolution=100, gbonds=True, represent='ball_stick', rotations=[[0.0, 0.0, 0.0]], zoom=1.0, width=300, height=300, axis_length=0, lines=[], ipyimg=True)[source]¶ yield orbital images
Parameters: - orbitals (int or list of ints) – the orbitals to show (in range 1 to number of orbitals)
- iso_value (float) – The value for which the function should be constant.
- extents ((float, float, float)) – +/- x,y,z to extend the molecule geometrty when constructing the surface
- transparent=True – whether iso-surface should be transparent (based on alpha value)
- alpha – alpha value of iso-surface
- wireframe – whether iso-surface should be wireframe (or solid)
- bond_color – color of bonding orbital surface in RGB format
- antibond_color – color of anti-bonding orbital surface in RGB format
- resolution (int) – The number of grid point to use for the surface. An high value will give better quality but lower performance.
- gbonds (bool) – guess bonds between atoms (via distance)
- represent (str) – representation of molecule (‘none’, ‘wire’, ‘vdw’ or ‘ball_stick’)
- zoom (float) – zoom level of images
- width (int) – width of original images
- height (int) – height of original images (although width takes precedent)
- axis_length (float) – length of x,y,z axes in negative and positive directions
- lines ([start_coord, end_coord, start_color, end_color, width, dashed]) – lines to add to image
- ipyimg (bool) – whether to return an IPython image, PIL image otherwise
Returns: mol – an image of the molecule in the format specified by ipyimg
Return type: IPython.display.Image or PIL.Image
pygauss.analysis module¶
-
class
pygauss.analysis.
Analysis
(folderpath='', server=None, username=None, passwrd=None, folder_obj=None, headers=[])[source]¶ Bases:
object
a class to analyse multiple computations
Parameters: - folderpath (str) – the folder directory storing the files to be analysed
- server (str) – the name of the server storing the files to be analysed
- username (str) – the username to connect to the server
- passwrd (str) – server password, if not present it will be asked for during initialisation
- headers (list) – the variable categories for each computation
-
add_basic_properties
(props=['basis', 'nbasis', 'optimised', 'conformer'])[source]¶ adds columns giving info of basic run properties
-
add_mol_property
(name, method, *args, **kwargs)[source]¶ compute molecule property for all rows and create a data column
Parameters:
-
add_mol_property_subset
(name, method, rows=[], filters={}, args=[], kwargs={}, relative_to_rows=[])[source]¶ compute molecule property for a subset of rows and create/add-to data column
Parameters: - name (str or list of strings) – name for output column (multiple if method outputs more than one value)
- method (str) – what molecule method to call
- rows (list) – what molecule rows to calculate the property for
- filters (dict) – filter for selecting molecules to calculate the property for
- args (list) – the arguments to pass to the molecule method
- kwargs (dict) – the keyword arguments to pass to the molecule method
- relative_to_rows (list of ints) – compute values relative to the summated value(s) of molecule at the rows listed
-
add_run
(identifiers={}, init_fname=None, opt_fname=None, freq_fname=None, nbo_fname=None, alignto=[], atom_groups={}, add_if_error=False, folder_obj=None)[source]¶ add single Gaussian run input/outputs
-
add_runs
(headers=[], values=[], init_pattern=None, opt_pattern=None, freq_pattern=None, nbo_pattern=None, add_if_error=False, alignto=[], atom_groups={}, ipython_print=False)[source]¶ add multiple Gaussian run inputs/outputs
-
calc_kmean_groups
(category_column, category_name, groups, columns=[], rows=[], filters={})[source]¶ calculate the kmeans grouping of rows
The KMeans algorithm clusters data by trying to separate samples in n groups of equal variance, minimizing a criterion known as the inertia or within-cluster sum-of-squares. This algorithm requires the number of clusters to be specified. It scales well to large number of samples and has been used across a large range of application areas in many different fields.
-
folder
¶ The folder for gaussian runs
-
get_basic_property
(prop, *args, **kwargs)[source]¶ returns a series of a basic run property or nan if it is not available
Parameters: prop (str) – can be ‘basis’, ‘nbasis’, ‘optimised’, ‘opt_error’ or ‘conformer’
-
get_freq_analysis
(info_columns=[], rows=[], filters={})[source]¶ return frequency analysis
Parameters: - info_columns (list of str) – columns to use as info in caption
- rows (int or list) – index for the row of each molecule to plot (all plotted if empty)
- filters (dict) – {columns:values} to filter by
Returns: data – frequency data
Return type: pd.DataFrame
-
get_table
(rows=[], columns=[], filters={}, precision=4, head=False, mol=False, row_index=[], column_index=[], as_image=False, na_rep='-', font_size=None, width=None, height=None, unconfined=False)[source]¶ return pandas table of requested data in requested format
- rows : integer or list of integers
- select row ids
- columns : string/integer or list of strings/integers
- select column names/positions
- filters : dict
- filter for rows with certain value(s) in specific columns
- precision : int
- decimal precision of displayed values
- head : int
- return only first n rows
- mol : bool
- include column containing the molecule objects
- row_index : string or list of strings
- columns to use as new index
- column_index : list of strings
- srings to place in to higher order column indexs
- as_image : bool
- output the table as an image (used pygauss.utils.df_to_img)
- na_rep : str
- how to represent empty (nan) cells (if outputting image)
- width, height, unconfined : int, int, bool
- args for IPy Image
-
plot_freq_analysis
(info_columns=[], rows=[], filters={}, share_plot=True, include_row=False)[source]¶ plot frequency analysis
Parameters: - info_columns (list of str) – columns to use as info in caption
- rows (int or list) – index for the row of each molecule to plot (all plotted if empty)
- filters (dict) – {columns:values} to filter by
- share_plot (bool) – whether to share a single plot or have multiple ones
- include_row (bool) – include row number in legend labels
Returns: data – plotted frequency data
Return type: matplotlib.figure.Figure
-
plot_mol_images
(mtype='optimised', info_columns=[], info_incl_id=False, max_cols=1, label_size=20, start_letter='A', save_fname=None, rows=[], filters={}, align_to=[], rotations=[[0.0, 0.0, 0.0]], gbonds=True, represent='ball_stick', zoom=1.0, width=500, height=500, axis_length=0, relative=False, minval=-1, maxval=1, highlight=[], frame_on=False, eunits='kJmol-1', sopt_min_energy=20.0, sopt_cutoff_energy=0.0, atom_groups=[], alpha=0.5, transparent=False, hbondwidth=5, no_hbonds=False)[source]¶ show molecules in matplotlib table of axes
Parameters: - mtype – ‘initial’, ‘optimised’, ‘nbo’, ‘highlight’, ‘sopt’ or ‘hbond’
- info_columns (list of str) – columns to use as info in caption
- info_incl_id (bool) – include molecule id number in caption
- max_cols (int) – maximum columns in plot
- label_size (int) – subplot label size (pts)
- start_letter (str) – starting (capital) letter for labelling subplots
- save_fname (str) – name of file, if you wish to save the plot to file
- rows (int or list) – index for the row of each molecule to plot (all plotted if empty)
- filters (dict) – {columns:values} to filter by
- align_to ([int, int, int]) – align geometries to the plane containing these atoms
- rotations (list of [float, float, float]) – for each rotation set [x,y,z] an image will be produced
- gbonds (bool) – guess bonds between atoms (via distance)
- represent (str) – representation of molecule (‘none’, ‘wire’, ‘vdw’ or ‘ball_stick’)
- zoom (float) – zoom level of images
- width (int) – width of original images
- height (int) – height of original images (although width takes precedent)
- axis_length (float) – length of x,y,z axes in negative and positive directions
- relative (bool) – coloring of nbo atoms scaled to min/max values in atom set (for nbo mtype)
- minval (float) – coloring of nbo atoms scaled to absolute min (for nbo mtype)
- maxval (float) – coloring of nbo atoms scaled to absolute max (for nbo mtype)
- highlight (list of lists) – atom indxes to highlight (for highlight mtype)
- eunits (str) – the units of energy to return (for sopt/hbond mtype)
- sopt_min_energy (float) – minimum energy to show (for sopt/hbond mtype)
- sopt_cutoff_energy (float) – energy below which bonds will be dashed (for sopt mtype)
- alpha (float) – alpha color value of geometry (for sopt/hbond mtypes)
- transparent (bool) – whether atoms should be transparent (for sopt/hbond mtypes)
- hbondwidth (float) – width of lines depicting interaction (for hbond mtypes)
- atom_groups ([list or str, list or str]) – restrict interactions to between two lists (or identifiers) of atom indexes (for sopt/hbond mtypes)
- no_hbonds (bool) – whether to ignore H-Bonds in the calculation (for sopt only)
- frame_on (bool) – whether to show frame around each image
Returns: - fig (matplotlib.figure.Figure) – A figure containing subplots for each molecule image
- caption (str) – A caption describing each subplot, given info_columns
-
plot_radviz_comparison
(category_column, columns=[], rows=[], filters={}, point_size=30, **kwargs)[source]¶ return plot axis of radviz graph
RadViz is a way of visualizing multi-variate data. It is based on a simple spring tension minimization algorithm. Basically you set up a bunch of points in a plane. In our case they are equally spaced on a unit circle. Each point represents a single attribute. You then pretend that each sample in the data set is attached to each of these points by a spring, the stiffness of which is proportional to the numerical value of that attribute (they are normalized to unit interval). The point in the plane, where our sample settles to (where the forces acting on our sample are at an equilibrium) is where a dot representing our sample will be drawn. Depending on which class that sample belongs it will be colored differently.
-
remove_rows
(rows)[source]¶ remove one or more rows of molecules
- rows : int or list of ints:
- the rows to remove
-
yield_mol_images
(rows=[], filters={}, mtype='optimised', align_to=[], rotations=[[0.0, 0.0, 0.0]], gbonds=True, represent='ball_stick', zoom=1.0, width=300, height=300, axis_length=0, relative=False, minval=-1, maxval=1, highlight=[], active=False, sopt_min_energy=20.0, sopt_cutoff_energy=0.0, atom_groups=[], alpha=0.5, transparent=False, hbondwidth=5, eunits='kJmol-1', no_hbonds=False, ipyimg=True)[source]¶ yields molecules
Parameters: - mtype – ‘initial’, ‘optimised’, ‘nbo’, ‘highlight’, ‘sopt’ or ‘hbond’
- info_columns (list of str) – columns to use as info in caption
- max_cols (int) – maximum columns in plot
- label_size (int) – subplot label size (pts)
- start_letter (str) – starting (capital) letter for labelling subplots
- save_fname (str) – name of file, if you wish to save the plot to file
- rows (int or list) – index for the row of each molecule to plot (all plotted if empty)
- filters (dict) – {columns:values} to filter by
- align_to ([int, int, int]) – align geometries to the plane containing these atoms
- rotations (list of [float, float, float]) – for each rotation set [x,y,z] an image will be produced
- gbonds (bool) – guess bonds between atoms (via distance)
- represent (str) – representation of molecule (‘none’, ‘wire’, ‘vdw’ or ‘ball_stick’)
- zoom (float) – zoom level of images
- width (int) – width of original images
- height (int) – height of original images (although width takes precedent)
- axis_length (float) – length of x,y,z axes in negative and positive directions
- relative (bool) – coloring of nbo atoms scaled to min/max values in atom set (for nbo mtype)
- minval (float) – coloring of nbo atoms scaled to absolute min (for nbo mtype)
- maxval (float) – coloring of nbo atoms scaled to absolute max (for nbo mtype)
- highlight (list of lists) – atom indxes to highlight (for highlight mtype)
- eunits (str) – the units of energy to return (for sopt/hbond mtype)
- sopt_min_energy (float) – minimum energy to show (for sopt/hbond mtype)
- sopt_cutoff_energy (float) – energy below which bonds will be dashed (for sopt mtype)
- alpha (float) – alpha color value of geometry (for highlight/sopt/hbond mtypes)
- transparent (bool) – whether atoms should be transparent (for highlight/sopt/hbond mtypes)
- hbondwidth (float) – width of lines depicting interaction (for hbond mtypes)
- atom_groups ([list or str, list or str]) – restrict interactions to between two lists (or identifiers) of atom indexes (for sopt/hbond mtypes)
- no_hbonds (bool) – whether to ignore H-Bonds in the calculation
- ipyimg (bool) – whether to return an IPython image, PIL image otherwise
- Yields –
- ------- –
- indx (int) – the row index of the molecule
- mol (IPython.display.Image or PIL.Image) – an image of the molecule in the format specified by ipyimg
pygauss.isosurface module¶
Created on Mon May 25 15:23:49 2015
@author: chris based on add_isosurface function from chemview
-
pygauss.isosurface.
calc_normals
(verts, faces)[source]¶ from; https://sites.google.com/site/dlampetest/python/calculating-normals-of-a-triangle-mesh-using-numpy
-
pygauss.isosurface.
get_isosurface
(coordinates, function, isolevel=0.03, color=(255, 0, 0, 255), extents=(5, 5, 5), resolution=100)[source]¶ Add an isosurface to the current scene.
Parameters: - coordinates (numpy.array) – the coordinates of the system
- function (func) – A function that takes x, y, z coordinates as input and is broadcastable
using numpy. Typically simple functions that involve standard
arithmetic operations and functions such as
x**2 + y**2 + z**2
ornp.exp(x**2 + y**2 + z**2)
will work. If not sure, you can first pass the function throughnumpy.vectorize
. Example:mv.add_isosurface(np.vectorize(f))
- isolevel (float) – The value for which the function should be constant.
- color ((int, int, int, int)) – The color given in RGBA format
- extents ((float, float, float)) – +/- x,y,z to extend the molecule geometrty when constructing the surface
- resolution (int) – The number of grid point to use for the surface. An high value will give better quality but lower performance.
pygauss.utils module¶
Created on Thu Apr 30 01:08:30 2015
@author: chris
-
pygauss.utils.
circumcenter
(pts)[source]¶ Computes the circumcenter and circumradius of M, N-dimensional points (1 <= M <= N + 1 and N >= 1). The points are given by the rows of an (M)x(N) dimensional maatrix pts.
Returns a tuple (center, radius) where center is a column vector of length N and radius is a scalar.
In the case of four points in 3D, pts is a 4x3 matrix arranged as:
- pts = [ x0 y0 z0 ]
- [ x1 y1 z1 ] [ x2 y2 z2 ] [ x3 y3 z3 ]
with return value ([ cx cy cz ], R)
Uses an extension of the method described here: http://www.ics.uci.edu/~eppstein/junkyard/circumcenter.html
-
pygauss.utils.
circumcenter_barycoords
(pts)[source]¶ Computes the barycentric coordinates of the circumcenter M, N-dimensional points (1 <= M <= N + 1 and N >= 1). The points are given by the rows of an (M)x(N) dimensional matrix pts.
Uses an extension of the method described here: http://www.ics.uci.edu/~eppstein/junkyard/circumcenter.html
-
pygauss.utils.
df_to_img
(df, na_rep='-', other_temp=None, font_size=None, width=None, height=None, unconfined=False)[source]¶ converts a pandas Dataframe to an IPython image
- na_rep : str
- how to represent empty (nan) cells
- other_temp : str
- a latex template to use for the table other than the default
The function uses pandas to convert the dataframe to a latex table, applies a template, converts to a PDF, converts to an image, and finally return the image
to use this function you will need the pdflatex executable from tex distribution, the convert executable from imagemagick, which also requires ghostscript; http://www.ghostscript.com/download/gsdnld.html http://www.imagemagick.org/script/binary-releases.php
NB: on Windows some issues were found with convert being an already existing application. To overcome this change its filename and use the im_name variable.
-
pygauss.utils.
imgplot_kmean_groups
(analysis, category, cat_name, groups, columns, filters={}, output=[], max_cols=2, **kwargs)[source]¶