Classes

atom3D Class

class molSimplify.Classes.atom3D.atom3D(Sym: str = 'C', xyz: List[float] | None = None, name: str | None = None, partialcharge: float | None = None, Tfactor=0, greek='', occup=1.0, loc='', line='')[source]

Bases: object

atom3D class. Base class in molSimplify for representing an element.

Parameters:
  • Sym (str, optional) – Symbol for atom3D instantiation. Element symbol. Default is ‘C’.

  • xyz (list, optional) – List of coordinates for new atom. Default is [0.0, 0.0, 0.0].

  • name (str, optional) – Unique identifier for atom 3D instantiation. Default is False.

  • partialcharge (int, optional) – Charge assigned to atom when added to mol. Default is None.

coords()[source]

Get coordinates of a given atom.

Returns:

coords – List of coordinates in X, Y, Z format.

Return type:

list

distance(atom2) float[source]

Get distance from one atom3D class to another.

Parameters:

atom2 (atom3D) – atom3D class of the atom to measure distance from.

Returns:

dist – Distance in angstroms.

Return type:

float

distancev(atom2)[source]

Get distance vector from one atom3D class to another.

Parameters:

atom2 (atom3D) – atom3D class of the atom to measure distance from.

Returns:

dist_list – List of distances in vector form: [dx, dy, dz] with units of Angstroms.

Return type:

list

ismetal(transition_metals_only=True) bool[source]

Identify whether an atom is a metal.

Parameters:

transition_metals_only (bool, optional) – Identify only transition metals. Default is true.

Returns:

metal – Bool for whether or not an atom is a metal.

Return type:

bool

mutate(newType='C')[source]

Mutate an element to another element in the atom3D.

Parameters:

newType (str, optional) – Element name for new element. Default is ‘C’.

setEDIA(score)[source]

Sets the EDIA score of an individual atom3D.

Parameters:

score (float) – Desired EDIA score of atom

setcoords(xyz)[source]

Set coordinates of an atom3D class to a new location.

Parameters:

xyz (list) – List of coordinates, has length 3: [X, Y, Z]

symbol() str[source]

Return symbol of atom3D.

Returns:

symbol – Element symbol for atom3D class.

Return type:

str

translate(dxyz)[source]

Move the atom3D by a displacement vector.

Parameters:

dxyz (list) – Displacement vector of length 3: [dx, dy, dz].

mol3D Class

class molSimplify.Classes.mol3D.mol3D(name='ABC', loc='', use_atom_specific_cutoffs=False)[source]

Bases: object

Holds information about a molecule, used to do manipulations. Reads information from structure file (XYZ, mol2) or is directly built from molsimplify. Please be cautious with periodic systems.

Example instantiation of an octahedral iron-ammonia complex from an XYZ file:

>>> complex_mol = mol3D()
>>> complex_mol.readfromxyz('fe_nh3_6.xyz')  
ACM(idx1, idx2, idx3, angle)[source]

Performs angular movement on mol3D class. A submolecule is rotated about idx2. Operates directly on class.

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom 1.

  • idx3 (int) – Index of anchor atom 2.

  • angle (float) – New bond angle in degrees.

ACM_axis(idx1, idx2, axis, angle)[source]

Performs angular movement about an axis on mol3D class. A submolecule is rotated about idx2.Operates directly on class.

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom.

  • axis (list) – Direction vector of axis. Length 3 (X, Y, Z).

  • angle (float) – New bond angle in degrees.

BCM(idx1, idx2, d)[source]

Performs bond centric manipulation (same as Avogadro, stretching and squeezing bonds). A submolecule is translated along the bond axis connecting it to an anchor atom.

Illustration: H3A-BH3 -> H3A—-BH3 where B = idx1 and A = idx2

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom.

  • d (float) – Bond distance in angstroms.

>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('H', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))
>>> complex_mol.BCM(1, 0, 0.7) # Set distance between atoms 0 and 1 to be 1.5 angstroms. Move atom 1.
>>> complex_mol.coordsvect()
array([[0. , 0. , 0. ],
       [0. , 0. , 0.7]])
BCM_opt(idx1, idx2, d, ff='uff')[source]

Performs bond centric manipulation (same as Avogadro, stretching and squeezing bonds). A submolecule is translated along the bond axis connecting it to an anchor atom. Performs force field optimization after, freezing the moved bond length.

Illustration: H3A-BH3 -> H3A—-BH3 where B = idx1 and A = idx2

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom.

  • d (float) – Bond distance in angstroms.

  • ff (str) – Name of force field to be used from openbabel.

IsOct(init_mol=None, dict_check=False, angle_ref=False, flag_catoms=False, catoms_arr=None, debug=False, flag_lbd=True, BondedOct=True, skip=False, flag_deleteH=True, silent=False, use_atom_specific_cutoffs=True)[source]

Main geometry check method for octahedral structures

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • flag_catoms (bool, optional) – Whether or not to return the catoms arr. Default as False.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User and overwrite this connection atom array by explicit input. Default is Nonetype.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

  • skip (list, optional) – Geometry checks to skip. Default is False.

  • flag_deleteH (bool, optional,) – Flag to delete Hs in ligand comparison. Default is True.

  • silent (bool, optional) – Flag for warning suppression. Default is False.

  • use_atom_specific_cutoffs (bool, optional) – Determine bonding with atom specific cutoffs.

Returns:

  • flag_oct (int) – Good (1) or bad (0) structure.

  • flag_list (list) – Metrics that are preventing a good geometry.

  • dict_oct_info (dict) – Dictionary of measurements of geometry.

IsStructure(init_mol=None, dict_check=False, angle_ref=False, num_coord=5, flag_catoms=False, catoms_arr=None, debug=False, skip=False, flag_deleteH=True)[source]

Main geometry check method for square pyramidal structures

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • num_coord (int, optional) – The metal coordination number.

  • flag_catoms (bool, optional) – Whether or not to return the catoms arr. Default as False.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User and overwrite this connection atom array by explicit input. Default is Nonetype.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • skip (list, optional) – Geometry checks to skip. Default is False.

  • flag_deleteH (bool, optional,) – Flag to delete Hs in ligand comparison. Default is True.

Returns:

  • flag_oct (int) – Good (1) or bad (0) structure.

  • flag_list (list) – Metrics that are preventing a good geometry.

  • dict_oct_info (dict) – Dictionary of measurements of geometry.

Oct_inspection(init_mol=None, catoms_arr=None, dict_check=False, angle_ref=False, flag_lbd=False, dict_check_loose=False, BondedOct=True, debug=False)[source]

Used to track down the changing geo_check metrics in a DFT geometry optimization. Catoms_arr always specified.

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User and overwrite this connection atom array by explicit input. Default is Nonetype.

  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.

  • dict_check_loose (dict, optional) – Dictionary of geo check metrics, if a dictionary other than the default one from globalvars is desired.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

  • debug (bool, optional) – Flag for extra printout. Default is False.

Returns:

  • flag_oct (int) – Good (1) or bad (0) structure.

  • flag_list (list) – Metrics that are preventing a good geometry.

  • dict_oct_info (dict) – Dictionary of measurements of geometry.

  • flag_oct_loose (int) – Good (1) or bad (0) structures with loose cutoffs.

  • flag_list_loose (list) – Metrics that are preventing a good geometry with loose cutoffs.

RCAngle(idx1, idx2, idx3, anglei, anglef, angleint=1.0, writegeo=False, dir_name='rc_angle_geometries')[source]

Generates geometries along a given angle reaction coordinate. In the given molecule, idx1 is rotated about idx2 with respect to idx3. Operates directly on class.

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom 1.

  • idx3 (int) – Index of anchor atom 2.

  • anglei (float) – New initial bond angle in degrees.

  • anglef (float) – New final bond angle in degrees.

  • angleint (float; default is 1.0 degree) – The angle interval in which the angle is changed

  • writegeo (if True, the generated geometries will be written) – to a directory; if False, they will not be written to a directory; default is False

  • dir_name (string; default is 'rc_angle_geometries') – The directory to which generated reaction coordinate geoemtries are written, if writegeo=True.

>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('O', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))
>>> complex_mol.addAtom(atom3D('H', [0, 1, 0]))

Generate reaction coordinate geometries using the given structure by changing the angle between atoms 2, 1, and 0 from 90 degrees to 160 degrees in intervals of 10 degrees >>> complex_mol.RCAngle(2, 1, 0, 90, 160, 10) [mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2)]

Generate reaction coordinates with the given geometry by changing the angle between atoms 2, 1, and 0 from 160 degrees to 90 degrees in intervals of 10 degrees, and the generated geometries will not be written to a directory. >>> complex_mol.RCAngle(2, 1, 0, 160, 90, -10) [mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2)]

RCDistance(idx1, idx2, disti, distf, distint=0.05, writegeo=False, dir_name='rc_distance_geometries')[source]

Generates geometries along a given distance reaction coordinate. In the given molecule, idx1 is moved with respect to idx2. Operates directly on class.

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom 1.

  • disti (float) – New initial bond distance in angstrom.

  • distf (float) – New final bond distance in angstrom.

  • distint (float; default is 0.05 angstrom) – The distance interval in which the distance is changed

  • writegeo (if True, the generated geometries will be written) – to a directory; if False, they will not be written to a directory; default is False

  • dir_name (string; default is 'rc_distance_geometries') – The directory to which generated reaction coordinate geoemtries are written if writegeo=True.

>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('H', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))

Generate reaction coordinate geometries using the given structure by changing the distance between atoms 1 and 0 from 1.0 to 3.0 angstrom (atom 1 is moved) in intervals of 0.5 angstrom >>> complex_mol.RCDistance(1, 0, 1.0, 3.0, 0.5) [mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2)]

Generate reaction coordinates geometries using the given structure by changing the distance between atoms 1 and 0 from 3.0 to 1.0 angstrom (atom 1 is moved) in intervals of 0.2 angstrom, and the generated geometries will not be written to a directory. >>> complex_mol.RCDistance(1, 0, 3.0, 1.0, -0.25) [mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2)]

Structure_inspection(init_mol=None, catoms_arr=None, num_coord=5, dict_check=False, angle_ref=False, flag_lbd=False, dict_check_loose=False, BondedOct=True, debug=False)[source]

Used to track down the changing geo_check metrics in a DFT geometry optimization. Specifically for a square pyramidal structure. Catoms_arr always specified.

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User and overwrite this connection atom array by explicit input. Default is Nonetype.

  • num_coord (int, optional) – The metal coordination number.

  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.

  • dict_check_loose (dict, optional) – Dictionary of geo check metrics, if a dictionary other than the default one from globalvars is desired.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

  • debug (bool, optional) – Flag for extra printout. Default is False.

Returns:

  • flag_oct (int) – Good (1) or bad (0) structure.

  • flag_list (list) – Metrics that are preventing a good geometry.

  • dict_oct_info (dict) – Dictionary of measurements of geometry.

  • flag_oct_loose (int) – Good (1) or bad (0) structures with loose cutoffs.

  • flag_list_loose (list) – Metrics that are preventing a good geometry with loose cutoffs.

addAtom(atom: atom3D, index: int | None = None, auto_populate_BO_dict: bool = True)[source]

Adds an atom to the atoms attribute, which contains a list of atom3D class instances.

Parameters:
  • atom (atom3D) – atom3D class instance of added atom.

  • index (int, optional) – Index of added atom. Default is None.

  • auto_populate_BO_dict (bool, optional) – Populate bond order dictionary with newly added atom. Default is True.

>>> complex_mol = mol3D()
>>> C_atom = atom3D('C', [1, 1, 1])
>>> complex_mol.addAtom(C_atom) # Add carbon atom at cartesian position 1, 1, 1 to mol3D object.
add_bond(idx1: int, idx2: int, bond_type: int) dict[source]

Add a bond of order bond_type between the atom at idx1 and the atom at idx2. Adjusts bo_dict and graph only, not BO_mat nor OBMol.

Parameters:
  • idx1 (int) – Index of first atom.

  • idx2 (int) – Index of second atom.

  • bond_type (int) – The order of the new bond.

Returns:

self.bo_dict – The modified bond order dictionary.

Return type:

dict

alignmol(atom1, atom2)[source]

Aligns two molecules such that the coordinates of two atoms overlap. Second molecule is translated relative to the first. No rotations are performed. Use other functions for rotations. Moves the mol3D class.

Parameters:
  • atom1 (atom3D) – atom3D of reference atom in first molecule.

  • atom2 (atom3D) – atom3D of reference atom in second molecule.

apply_ffopt(constraints=False, ff='uff')[source]

Apply forcefield optimization to a given mol3D class.

Parameters:
  • constraints (int, optional) – Range of atom indices to employ cartesian constraints before ffopt.

  • ff (str, optional) – Force field to be used in openbabel. Default is UFF.

Returns:

energy – Energy of the ffopt in kJ/mol.

Return type:

float

apply_ffopt_list_constraints(list_constraints=False, ff='uff')[source]

Apply forcefield optimization to a given mol3D class. Differs from apply_ffopt in that one can specify constrained atoms as a list.

Parameters:
  • list_constraints (list of int, optional) – List of atom indices to employ cartesian constraints before ffopt.

  • ff (str, optional) – Force field to be used in openbabel. Default is UFF.

Returns:

energy – Energy of the ffopt in kJ/mol.

Return type:

float

aromatic_charge(bo_graph)[source]

Get the charge of aromatic rings based on 4*n+2 rule.

Parameters:

bo_graph (numpy.array) – bond order matrix

assign_graph_from_net(path_to_net, return_graph=False)[source]

Uses a .net file to assign a graph (and return if needed)

Parameters:
  • path_to_net (str) – path to .net file containing the molecular graph

  • return_graph (bool) – Return the graph in addition to assigning it to self. Default is False.

Returns:

graph – a numpy array containing the unattributed molecular graph

Return type:

np.array

calcCharges(charge=0, method='QEq')[source]

Compute the partial charges of a molecule using openbabel.

Parameters:
  • charge (int) – Net charge assigned to a molecule

  • method (str) – Method to calculate partial charge. Default is ‘QEq’.

centermass()[source]

Computes coordinates of center of mass of molecule.

Returns:

center_of_mass – Coordinates of center of mass. List of length 3: (X, Y, Z).

Return type:

list

centersym()[source]

Computes coordinates of center of symmetry of molecule. Identical to centermass, but not weighted by atomic masses.

Returns:

center_of_symmetry – Coordinates of center of symmetry. List of length 3: (X, Y, Z).

Return type:

list

check_angle_linear(catoms_arr=None)[source]

Get the ligand orientation for linear ligands.

Parameters:

catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User and overwrite this connection atom array by explicit input. Default is Nonetype.

Returns:

dict_orientation – Dictionary containing average deviation from linearity (devi_linear_avrg) and max deviation (devi_linear_max).

Return type:

dict

cleanBonds()[source]

Removes all stored openbabel bond order information.

closest_H_2_metal(delta=0)[source]

Get closest hydrogen atom to metal.

Parameters:

delta (float) – Distance tolerance in angstrom.

Returns:

  • flag (bool) – Flag for if a hydrogen exists in the distance tolerance.

  • min_dist_H (float) – Minimum distance for a hydrogen.

  • min_dist_nonH (float) – Minimum distance for a heavy atom.

combine(mol, bond_to_add=[], dirty=False)[source]

Combines two molecules. Each atom in the second molecule is appended to the first while preserving orders. Assumes operation with a given mol3D instance, when handed a second mol3D instance.

Parameters:
  • mol (mol3D) – mol3D class instance containing molecule to be added.

  • bond_to_add (list, optional) – List of tuples (ind1,ind2,order) bonds to add. Default is empty.

  • dirty (bool, optional) – Add atoms without worrying about bond orders. Default is False.

Returns:

cmol – New mol3D class containing the two molecules combined.

Return type:

mol3D

convert2OBMol(force_clean=False, ignoreX=False)[source]

Converts mol3D class instance to OBMol class instance. Stores as OBMol attribute. Necessary for force field optimizations and other openbabel operations.

Parameters:
  • force_clean (bool, optional) – Force no bond info retention. Default is False.

  • ignoreX (bool, optional) – Index of anchor atom. Default is False.

convert2OBMol2(force_clean=False, ignoreX=False)[source]

Converts mol3D class instance to OBMol class instance, but uses mol2 function, so bond orders are not interpreted, but rather read through the mol2. Stores as OBMol attribute. Necessary for force field optimizations and other openbabel operations.

Parameters:
  • force_clean (bool, optional) – Force no bond info retention. Default is False.

  • ignoreX (bool, optional) – Index of anchor atom. Default is False.

convert2mol3D()[source]

Converts OBMol class instance to mol3D class instance. Generally used after openbabel operations, such as FF optimizing a molecule. Updates the mol3D as necessary.

convexhull()[source]

Computes convex hull of molecule.

Returns:

hull – Coordinates of convex hull.

Return type:

array

coords()[source]

Method to obtain string of coordinates in molecule.

Returns:

coord_string – String of molecular coordinates with atom identities in XYZ format.

Return type:

string

coordsvect()[source]

Method to obtain array of coordinates in molecule.

Returns:

list_of_coordinates – Two dimensional numpy array of molecular coordinates. (N by 3) dimension, N is number of atoms.

Return type:

np.array

copymol3D(mol0)[source]

Copies properties and atoms of another existing mol3D object into current mol3D object. Should be performed on a new mol3D class instance. WARNING: NEVER EVER USE mol3D = mol0 to do this. It DOES NOT WORK. ONLY USE ON A FRESH INSTANCE OF MOL3D. Operates on fresh instance.

Parameters:

mol0 (mol3D) – mol3D of molecule to be copied.

count_atoms(exclude=['H', 'h', 'x', 'X'])[source]

Count the number of atoms, excluding certain atoms.

Parameters:

exclude (list) – list of symbols for atoms to exclude.

Returns:

count – the number of heavy atoms

Return type:

integer

count_electrons(charge=0)[source]

Count the number of electrons in a molecule.

Parameters:

charge (int, optional) – Net charge of a molecule. Default is neutral.

Returns:

count – The number of electrons in the system.

Return type:

integer

count_nonH_atoms()[source]

Count the number of heavy atoms.

Returns:

count – the number of heavy atoms

Return type:

integer

count_specific_atoms(atom_types=['x', 'X'])[source]

Count the number of atoms, including only certain atoms.

Parameters:

atom_types (list) – list of symbols for atoms to include.

Returns:

count – the number of heavy atoms

Return type:

integer

createMolecularGraph(oct=True, strict_cutoff=False, catom_list=None)[source]

Create molecular graph of a molecule given X, Y, Z positions. Bond order is not interpreted by this. Updates graph attribute of the mol3D class.

Parameters:
  • oct (bool) – Defines whether a structure is octahedral. Default is True.

  • strict_cutoff (bool, optional) – strict bonding cutoff for fullerene and SACs

  • catom_list (list, optional) – List of indices of bonded atoms.

create_mol_with_inds(inds)[source]

Create molecule with indices.

Parameters:

inds (list) – The indicies of the selected submolecule to SAVE

Returns:

molnew – The new molecule built from the indices.

Return type:

mol3D

deleteHs()[source]

Delete all hydrogens from a molecule. Preserves heavy atom ordering.

deleteatom(atomIdx)[source]

Delete a specific atom from the mol3D class given an index.

Parameters:

atomIdx (int) – Index for the atom3D to remove.

deleteatoms(Alist)[source]

Delete a multiple atoms from the mol3D class given a set of indices. Preserves ordering, starts from largest index.

Parameters:

Alist (list) – List of indices for atom3D instances to remove.

dict_check_processing(dict_check, num_coord=6, debug=False, silent=False)[source]

Process the self.geo_dict to get the flag_oct and flag_list, setting dict_check as the cutoffs.

Parameters:
  • dict_check (dict) – The cutoffs of each geo_check metrics we have.

  • num_coord (int, optional) – Metal coordination number. Default is 6.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • silence (bool, optional) – Flag for warning suppression. Default is False.

Returns:

  • flag_oct (int) – Good (1) or bad (0) structure.

  • flag_list (list) – Metrics that are preventing a good geometry.

distance(mol)[source]

Measure the distance between center of mass of two molecules.

Parameters:

mol (mol3D) – mol3D class instance of second molecule to measure distance to.

Returns:

d_cm – Distance between centers of mass of two molecules.

Return type:

float

draw_svg(filename)[source]

Draw image of molecule and save to SVG.

Parameters:

filename (str) – Name of file to save SVG to.

findAtomsbySymbol(sym: str) List[int][source]

Find all elements with a given symbol in a mol3D class.

Parameters:

sym (str) – Symbol of the atom of interest.

Returns:

atom_list – List of indices of atoms in mol3D with a given symbol.

Return type:

list

findMetal(transition_metals_only: bool = True) List[int][source]

Find metal(s) in a mol3D class.

Parameters:

transition_metals_only (bool, optional) – Only find transition metals. Default is true.

Returns:

metal_list – List of indices of metal atoms in mol3D.

Return type:

list

find_atom(sym='X')[source]

Find atoms with a specific symbol.

Parameters:

sym (str) – element symbol, default as X.

Returns:

inds – a list of atom index with the specified symbol.

Return type:

list

findcloseMetal(atom0)[source]

Find the nearest metal to a given atom3D class. Returns heaviest element if no metal found.

Parameters:

atom_idx (atom3D) – atom3D class for atom of interest.

Returns:

close_metal – index of the nearest metal, or heaviest atom if no metal found.

Return type:

int

findsubMol(atom0, atomN, smart=False)[source]

Finds a submolecule within the molecule given the starting atom and the separating atom. Illustration: H2A-B-C-DH2 will return C-DH2 if C is the starting atom and B is the separating atom. Alternatively, if C is the starting atom and D is the separating atom, returns H2A-B-C.

Parameters:
  • atom0 (int) – Index of starting atom.

  • atomN (int) – Index of the separating atom.

  • smart (bool, optional) – Decision of whether or not to use getBondedAtomsSmart. Default is False.

Returns:

subm – List of indices of atoms in submolecule.

Return type:

list

freezeatom(atomIdx)[source]

Set the freeze attribute to be true for a given atom3D class.

Parameters:

atomIdx (int) – Index for atom to be frozen.

freezeatoms(Alist)[source]

Set the freeze attribute to be true for a given set of atom3D classes, given their indices. Preserves ordering, starts from largest index.

Parameters:

Alist (list) – List of indices for atom3D instances to remove.

classmethod from_smiles(smiles, gen3d: bool = True)[source]
geo_dict_initialization()[source]

Initialization of geometry check dictionaries according to dict_oct_check_st.

geo_maxatomdist(mol2)[source]

Compute the max atom distance between two molecules. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

dist_max – Maximum atom distance between two structures.

Return type:

float

geo_rmsd(mol2)[source]

Compute the RMSD between two molecules. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

rmsd – RMSD between the two structures.

Return type:

float

getAngle(idx0, idx1, idx2)[source]

Get angle between three atoms identified by their indices. Specifically, get angle between vectors formed by atom0->atom1 and atom2->atom1.

Parameters:
  • idx0 (int) – Index of first atom.

  • idx1 (int) – Index of second (middle) atom.

  • idx2 (int) – Index of third atom.

Returns:

angle – Angle in degrees.

Return type:

float

getAtom(idx)[source]

Get atom with a given index.

Parameters:

idx (int) – Index of desired atom.

Returns:

atom – atom3D class for element at given index.

Return type:

atom3D

getAtomCoords(idx)[source]

Get atom coordinates with a given index.

Parameters:

idx (int) – Index of desired atom.

Returns:

coords – List of coordinates (length 3: [X, Y, Z]) for element at given index.

Return type:

list

getAtomTypes()[source]

Get unique elements in a molecule

Returns:

unique_atoms_list – List of unique elements in molecule by symbol.

Return type:

list

getAtoms()[source]

Get all atoms within a molecule.

Returns:

atom_list – List of atom3D classes for all elements in a mol3D.

Return type:

list

getAtomwithSyms(syms=['X'], return_index=False)[source]

Get atoms with a given list of symbols.

Parameters:
  • idx (list) – List of desired atom symbols.

  • return_index (bool) – True or false for returning the atom indices instead of atom3D classes. Returns indices if True.

Returns:

atom_list – List of atom3D classes for elements with given symbols.

Return type:

list

getAtomwithinds(inds)[source]

Get atoms with a given list of indices.

Parameters:

idx (list) – List of indices of desired atoms.

Returns:

atom_list – List of atom3D classes for elements at given indices.

Return type:

list

getBondCutoff(atom: atom3D, ratom: atom3D) float[source]

Get cutoff based on two atoms.

Parameters:
  • atom (atom3D) – atom3D class of first atom.

  • ratom (atom3D) – atom3D class of second atom.

Returns:

distance_max – Cutoff based on atomic radii scaled by a factor.

Return type:

float

getBondedAtoms(idx: int) List[int][source]

Gets atoms bonded to a specific atom. This is determined based on element-specific distance cutoffs, rather than predefined valences. This method is ideal for metals because bond orders are ill-defined. For pure organics, the OBMol class provides better functionality.

Parameters:

idx (int) – Index of reference atom.

Returns:

nats – List of indices of bonded atoms.

Return type:

list

getBondedAtomsBOMatrix(idx)[source]

Get atoms bonded by an atom referenced by index, using the BO matrix.

Parameters:

idx (int) – Index of reference atom.

Returns:

nats – List of indices of bonded atoms.

Return type:

list

getBondedAtomsBOMatrixAug(idx)[source]

Get atoms bonded by an atom referenced by index, using the augmented BO matrix.

Parameters:

idx (int) – Index of reference atom.

Returns:

nats – List of indices of bonded atoms.

Return type:

list

getBondedAtomsByCoordNo(idx, CoordNo=6)[source]

Gets atoms bonded to a specific atom by coordination number.

Parameters:
  • idx (int) – Index of reference atom.

  • CoordNo (int, optional) – Coordination number of reference atom of interest. Default is 6.

Returns:

nats – List of indices of bonded atoms.

Return type:

list

getBondedAtomsByThreshold(idx, threshold=1.15)[source]

Gets atoms bonded to a specific atom. This method uses a threshold for determination of a bond.

Parameters:
  • idx (int) – Index of reference atom.

  • threshold (float, optional) – Scale factor for sum of covalent radii based cutoff. Default is 1.15.

Returns:

nats – List of indices of bonded atoms.

Return type:

list

getBondedAtomsH(idx)[source]

Get bonded atom with a given index, but ONLY count hydrogens.

Parameters:

idx (int) – Index of reference atom.

Returns:

nats – List of indices of bonded hydrogens.

Return type:

list

getBondedAtomsOct(ind, CN=6, debug=False, flag_loose=False, atom_specific_cutoffs=False, strict_cutoff=False)[source]

Gets atoms bonded to an octahedrally coordinated metal. Specifically limitis intruder C and H atoms that would otherwise be considered bonded in the distance cutoffs. Limits bonding to the CN closest atoms (CN = coordination number).

Parameters:
  • ind (int) – Index of reference atom.

  • CN (int, optional) – Coordination number of reference atom of interest. Default is 6.

  • debug (bool, optional) – Produce additional outputs for debugging. Default is False.

  • flag_loose (bool, optional) – Use looser cutoffs to determine bonding. Default is False.

  • atom_specific_cutoffs (bool, optional) – Use atom specific cutoffs to determing bonding. Default is False.

  • strict_cutoff (bool, optional) – strict bonding cutoff for fullerene and SACs

Returns:

nats – List of indices of bonded atoms.

Return type:

list

getBondedAtomsSmart(idx, oct=True, strict_cutoff=False, catom_list=None)[source]

Get the atoms bonded with the atom specified with the given index, using the molecular graph. Creates graph if it does not exist.

Parameters:
  • idx (int) – Index of reference atom.

  • oct (bool, optional) – Flag for turning on octahedral bonding routines.

  • strict_cutoff (bool, optional) – strict bonding cutoff for fullerene and SACs

  • catom_list (list, optional) – List of indices of bonded atoms.

Returns:

nats – List of indices of bonded atoms.

Return type:

list

getBondedAtomsnotH(idx, metal_multiplier=1.35, nonmetal_multiplier=1.15)[source]

Get bonded atom with a given index, but do not count hydrogens.

Parameters:
  • idx (int) – Index of reference atom.

  • metal_multiplier (float, optional) – Multiplier for sum of covalent radii of two elements containing metal.

  • nonmetal_multiplier (float, optional) – Multiplier for sum of covalent radii of two elements not containing metal.

Returns:

nats – List of indices of bonded atoms.

Return type:

list

getClosestAtom(ratom)[source]

Get hydrogens bonded to a specific atom3D class.

Parameters:

ratom (atom3D) – Reference atom3D class.

Returns:

idx – Index of atom closest to reference atom.

Return type:

int

getClosestAtomlist(atom_idx, cdist=3.0)[source]

Get hydrogens bonded to a specific atom3D class.

Parameters:
  • atom_idx (int) – Reference atom index.

  • cdist (float, optional) – Cutoff of neighbor distance in angstroms.

Returns:

neighbor_list – List of neighboring atoms

Return type:

list

getClosestAtomnoHs(ratom)[source]

Get atoms bonded to a specific atom3D class that are not hydrogen.

Parameters:

ratom (atom3D) – Reference atom3D class.

Returns:

idx – Index of atom closest to reference atom.

Return type:

int

getDistToMetal(idx, metalx)[source]

Get distance between two atoms in a molecule, with the second one being a metal.

Parameters:
  • idx (int) – Index of reference atom.

  • metalx (int) – Index of reference metal atom.

Returns:

d – Distance between atoms in angstroms.

Return type:

float

getFarAtom(reference, atomtype=False)[source]

Get atom furthest from a reference atom.

Parameters:
  • reference (int) – Index of a reference atom.

  • atomtype (bool, optional) – An element name for finding the furthest atom of a given element. Default is False.

Returns:

dist – Distance of atom from center of mass in angstroms.

Return type:

float

getHs()[source]

Get all hydrogens in a mol3D class instance.

Returns:

hlist – List of indices of hydrogen atoms.

Return type:

list

getHsbyAtom(ratom)[source]

Get hydrogens bonded to a specific atom3D class.

Parameters:

ratom (atom3D) – Reference atom3D class.

Returns:

hlist – List of indices of hydrogen atoms bound to reference atom3D.

Return type:

list

getHsbyIndex(idx)[source]

Get all hydrogens bonded to a given atom with an index.

Parameters:

idx (index of reference atom.) –

Returns:

hlist – List of indices of hydrogen atoms bound to reference atom.

Return type:

list

getMLBondLengths()[source]

Outputs the metal-ligand bond lengths in the complex.

Returns:

ml_bls – keyed by ID of metal M and valued by dictionary of M-L bond lengths and relative bond lengths

Return type:

dictionary

getNumAtoms()[source]

Get the number of atoms within a molecule.

Returns:

self.natoms – The number of atoms in the mol3D object.

Return type:

int

getOBMol(fst, convtype, ffclean=False, gen3d=True)[source]

Get OBMol object from a file or SMILES string. If you have a mol3D, then use convert2OBMol instead.

Parameters:
  • fst (str) – Name of input file.

  • convtype (str) – Input filetype (xyz,mol,smi).

  • ffclean (bool, optional) – Flag for forcefield cleanup of structure. Default is False.

  • gen3d (bool, optional) – Flag for 3D structure generation using openbabel.OBBuilder

Returns:

OBMol – OBMol class instance to be used with openbabel. Bound as .OBMol attribute.

Return type:

OBMol

get_bo_dict_from_inds(inds)[source]

Recreate bo_dict with correct indices

Parameters:

inds (list) – The indicies of the selected submolecule to SAVE

Returns:

new_bo_dict – The ported over dictionary with new indicies/bonds deleted

Return type:

dict

get_fcs(strict_cutoff=False, catom_list=None)[source]

Get first coordination shell of a transition metal complex.

Parameters:
  • strict_cutoff (bool, optional) – strict bonding cutoff for fullerene and SACs

  • catom_list (list, optional) – List of indices of coordinating atoms.

Returns:

fcs – List of first coordination shell indices.

Return type:

list

get_features(lac=True, force_generate=False, eq_sym=False, use_dist=False, NumB=False, Gval=False, size_normalize=False, alleq=False, strict_cutoff=False, catom_list=None, MRdiag_dict={}, depth=3)[source]

Get geo-based RAC features for this complex (if octahedral)

Parameters:
  • lac (bool, optional) – Use lac for ligand_assign_consistent behavior. Default is True

  • force_generate (bool, optional) – Force the generation of features.

  • eq_sym (bool, optional) – Force equatorial plane to have same chemical symbols if possible.

  • use_dist (bool, optional) – Whether or not CD-RACs used.

  • NumB (bool, optional) – Whether or not the number of bonds RAC features are generated.

  • Gval (bool, optional) – Whether or not the group number RAC features are generated.

  • size_normalize (bool, optional) – Whether or not to normalize by the number of atoms.

  • alleq (bool, optional) – Whether or not all ligands are equatorial.

  • strict_cutoff (bool, optional) – strict bonding cutoff for fullerene and SACs

  • catom_list (list, optional) – List of indices of coordinating atoms.

  • MRdiag_dict (dict, optional) – Keys are ligand identifiers, values are MR diagnostics like E_corr.

  • depth (int, optional) – The depth of the RACs (how many bonds out the RACs go).

Returns:

Dictionary of {‘RACname’:RAC} for all geo-based RACs

Return type:

results, dict

get_first_shell(check_hapticity=True)[source]

Get the first coordination shell of a mol3D object with a single transition metal (read from CSD mol2 file) if check_hapticity is True updates the first shell of multiheptate ligand to be hydrogen set at the geometric mean

Parameters:

check_hapticity (boolean) – whether to update multiheptate ligands to their geometric centroid

Returns:

  • mol 3D object (first coordination shell with metal (can change based on check_hapticity))

  • list (list of hapticity)

get_geometry_type(dict_check=False, angle_ref=False, flag_catoms=False, catoms_arr=None, debug=False, skip=False, transition_metals_only=False)[source]

Get the type of the geometry (linear (2), trigonal planar(3), tetrahedral(4), square planar(4), trigonal bipyramidal(5), square pyramidal(5, one-empty-site), octahedral(6), pentagonal bipyramidal(7))

uses hapticity truncated first coordination shell. Does not require the input of num_coord.

Parameters:
  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • num_coord (int, optional) – Expected coordination number.

  • flag_catoms (bool, optional) – Whether or not to return the catoms arr. Default as False.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User and overwrite this connection atom array by explicit input. Default is Nonetype.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • skip (list, optional) – Geometry checks to skip. Default is False.

  • transition_metals_only (bool, optional) – Flag for considering more than just transition metals as metals. Default is False.

Returns:

results – Measurement of deviations from arrays.

Return type:

dictionary

get_geometry_type_old(dict_check=False, angle_ref=False, num_coord=None, flag_catoms=False, catoms_arr=None, debug=False, skip=False, transition_metals_only=False, num_recursions=[0, 0])[source]

Get the type of the geometry (trigonal planar(3), tetrahedral(4), square planar(4), trigonal bipyramidal(5), square pyramidal(5, one-empty-site), octahedral(6), pentagonal bipyramidal(7))

Parameters:
  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • num_coord (int, optional) – Expected coordination number.

  • flag_catoms (bool, optional) – Whether or not to return the catoms arr. Default as False.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User and overwrite this connection atom array by explicit input. Default is Nonetype.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • skip (list, optional) – Geometry checks to skip. Default is False.

  • transition_metals_only (bool, optional) – Flag for considering more than just transition metals as metals. Default is False.

  • num_recursions (list, optional) – counter to track number of ligands classified as ‘sandwich’ and ‘edge’ in original structure

Returns:

results – Measurement of deviations from arrays.

Return type:

dictionary

get_linear_angle(ind)[source]

Get linear ligand angle.

Parameters:

ind (int) – index for one of the metal-coordinating atoms.

Returns:

  • flag (bool) – True if the ligand is linear.

  • ang (float) – Get angle of linear ligand. 0 if not linear.

get_mol_graph_det(oct=True, useBOMat=False)[source]

Get molecular graph determinant.

Parameters:
  • oct (bool, optional) – Flag for whether the geometry is octahedral. Default is True.

  • useBOMat (bool, optional) – Use bond order matrix in determinant computation.

Returns:

safedet – String containing the molecular graph determinant.

Return type:

str

get_num_coord_metal(debug=False, strict_cutoff=False, catom_list=None)[source]

Get metal coordination based on get bonded atoms. Store this info.

Parameters:
  • debug (bool, optional) – Flag for whether extra output should be printed. Default is False.

  • strict_cutoff (bool, optional) – strict bonding cutoff for fullerene and SACs

  • catom_list (list, optional) – List of indices of coordinating atoms.

get_octetrule_charge(debug=False)[source]

Get the octet-rule charge provided a mol3D object with bo_graph (read from CSD mol2 file) Note that currently this function should only be applied to ligands (organic molecules).

Parameters:

debug (boolean) – whether to have more printouts

get_smiles(canonicalize=False, use_mol2=False) str[source]

Returns the SMILES string representing the mol3D object.

Parameters:
  • canonicalize (bool, optional) – Openbabel canonicalization of smiles. Default is False.

  • use_mol2 (bool, optional) – Use graph in mol2 instead of interpreting it. Default is False.

Returns:

smiles – SMILES from a mol3D object. Watch out for charges.

Return type:

str

get_smilesOBmol_charge()[source]

Get the charge of a mol3D object through adjusted OBmol hydrogen/smiles conversion Note that currently this function should only be applied to ligands (organic molecules).

get_submol_noHs()[source]

Get the heavy atom only submolecule, with no hydrogens.

Returns:

mol_noHs – mol3D class instance with no hydrogens.

Return type:

mol3D

get_symmetry_denticity(return_eq_catoms=False, BondedOct=False)[source]

Get symmetry class of molecule.

Parameters:
  • return_eq_catoms (bool, optional) – Flag for if equatorial atoms should be returned. Default is False.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

Returns:

  • eqsym (bool) – Flag for equatorial symmetry.

  • maxdent (int) – Maximum denticity in molecule.

  • ligdents (list) – List of denticities in molecule.

  • homoleptic (bool) – Flag for whether a geometry is homoleptic.

  • ligsymmetry (str) – Symmetry class for ligand of interest.

  • eq_catoms (list) – List of equatorial connection atoms.

getfarAtomdir(uP)[source]

Get atom furthest from center of mass in a given direction.

Parameters:

uP (list) – List of length 3 [dx, dy, dz] for direction vector.

Returns:

dist – Distance of atom from center of mass in angstroms.

Return type:

float

getfragmentlists()[source]

Get all independent molecules in mol3D.

Returns:

atidxes_total – list of lists for atom indices comprising of each distinct molecule.

Return type:

list

initialize()[source]

Initialize the mol3D to an empty object.

isPristine(unbonded_min_dist=1.3, oct=False)[source]

Checks the organic portions of a transition metal complex and determines if they look good.

Parameters:
  • unbonded_min_dist (float, optional) – Minimum distance for two things that are not bonded (in angstrom). Default is 1.3.

  • oct (bool, optional) – Flag for octahedral complex. Default is False.

Returns:

  • pass (bool) – Whether or not molecule passes the organic checks.

  • fail_list (list) – List of failing criteria, as a set of strings.

is_edge_compound(transition_metals_only: bool = True) Tuple[int, List, List][source]

Check if a structure is edge compound.

Returns:

  • num_edge_lig (int) – Number of edge ligands.

  • info_edge_lig (list) – List of dictionaries with info about edge ligands.

  • edge_lig_atoms (list) – List of dictionaries with the connecting atoms of the edge ligands.

is_linear_ligand(ind)[source]

Check whether a ligand is linear.

Parameters:

ind (int) – index for one of the metal-coordinating atoms.

Returns:

  • flag (bool) – True if the ligand is linear.

  • catoms (list) – Atoms bonded to the index of interest.

is_sandwich_compound(transition_metals_only: bool = True) Tuple[int, List, bool, bool, List][source]

Evaluates whether a compound is a sandwich compound

Returns:

  • num_sandwich_lig (int) – Number of sandwich ligands.

  • info_sandwich_lig (list) – List of dictionaries about the sandwich ligands.

  • aromatic (bool) – Flag about whether the ligand is aromatic.

  • allconnect (bool) – Flag for connected atoms in ring.

  • edge_lig_atoms (list) – List of dictionaries with the connecting atoms of the sandwich ligands.

ligand_comp_org(init_mol, catoms_arr=None, flag_deleteH=True, flag_lbd=True, debug=False, depth=3, BondedOct=False, angle_ref=False)[source]

Get the ligand distortion by comparing each individual ligands in init_mol and opt_mol.

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User and overwrite this connection atom array by explicit input. Default is Nonetype.

  • flag_deleteH (bool, optional,) – Flag to delete Hs in ligand comparison. Default is True.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

  • flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • depth (int, optional) – Depth for truncated molecule. Default is 3.

  • check_whole (bool, optional) – Flag for checking whole ligand.

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

Returns:

dict_lig_distort – Dictionary containing rmsd_max and atom_dist_max.

Return type:

dict

make_formula(latex=True)[source]

Get a chemical formula from the mol3D class instance.

Parameters:

latex (bool, optional) – Flag for if formula is going to go in a latex document. Default is True.

Returns:

formula – Chemical formula

Return type:

str

match_lig_list(init_mol, catoms_arr=None, BondedOct=False, flag_lbd=True, debug=False, depth=3, check_whole=False, angle_ref=False)[source]

Match the ligands of mol and init_mol by calling ligand_breakdown

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User and overwrite this connection atom array by explicit input. Default is Nonetype.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

  • flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • depth (int, optional) – Depth for truncated molecule. Default is 3.

  • check_whole (bool, optional) – Flag for checking whole ligand.

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

Returns:

  • liglist_shifted (list) – List of lists containing all ligands from optimized molecule.

  • liglist_init (list) – List of lists containing all ligands from initial molecule.

  • flag_match (bool) – A flag about whether the ligands of initial and optimized mol are exactly the same. There is a one to one mapping.

maxatomdist(mol2)[source]

Compute the max atom distance between two molecules. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

dist_max – Maximum atom distance between two structures.

Return type:

float

maxatomdist_nonH(mol2)[source]

Compute the max atom distance between two molecules, considering heavy atoms only. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

rmsd – RMSD between the two structures ignoring hydrogens.

Return type:

float

maxdist(mol)[source]

Measure the largest distance between atoms in two molecules.

Parameters:

mol (mol3D) – mol3D class instance of second molecule.

Returns:

maxd – Max distance between atoms of two molecules.

Return type:

float

meanabsdev(mol2)[source]

Compute the mean absolute deviation (MAD) between two molecules. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

MAD – Mean absolute deviation between the two structures.

Return type:

float

mindist(mol)[source]

Measure the smallest distance between atoms in two molecules.

Parameters:

mol (mol3D) – mol3D class instance of second molecule.

Returns:

mind – Min distance between atoms of two molecules.

Return type:

float

mindistmol()[source]

Measure the smallest distance between atoms in a single molecule.

Returns:

mind – Min distance between atoms of two molecules.

Return type:

float

mindistnonH(mol)[source]

Measure the smallest distance between an atom and a non H atom in another molecule.

Parameters:

mol (mol3D) – mol3D class of second molecule.

Returns:

mind – Min distance between atoms of two molecules that are not Hs.

Return type:

float

mindisttopoint(point)[source]

Measure the smallest distance between an atom and a point.

Parameters:

point (list) – List of coordinates of reference point.

Returns:

mind – Min distance between atoms of two molecules.

Return type:

float

mols_symbols()[source]

Store symbols and their frequencies in symbols_dict attributes.

molsize()[source]

Measure the size of the molecule, by quantifying the max distance between atoms and center of mass.

Returns:

maxd – Max distance between an atom and the center of mass.

Return type:

float

numRings(index)[source]

Computes the number of simple rings an atom is in.

Parameters:

index (int) – The index of the atom in question. Zero-indexing, so the first atom has an index of zero.

Returns:

myNumRings – The number of rings the atom is in.

Return type:

int

oct_comp(angle_ref=False, catoms_arr=None, debug=False)[source]

Get the deviation of shape of the catoms from the desired shape, which is defined in angle_ref.

Parameters:
  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User and overwrite this connection atom array by explicit input.

  • debug (bool, optional) – Flag for extra printout. Default is False.

Returns:

  • dict_catoms_shape (dict) – Dictionary of first coordination sphere shape measures.

  • catoms_arr (list) – Connection atom array.

overlapcheck(mol, silence=False)[source]

Measure the smallest distance between an atom and a point.

Parameters:
  • mol (mol3D) – mol3D class instance of second molecule.

  • silence (bool, optional) – Flag for extra output. Default is False.

Returns:

overlap – Flag for whether two molecules are overlapping.

Return type:

bool

populateBOMatrix(bonddict=False)[source]

Populate the bond order matrix using openbabel.

Parameters:

bonddict (bool) – Flag for if the obmol bond dictionary should be saved. Default is False.

Returns:

molBOMat – Numpy array for bond order matrix.

Return type:

np.array

populateBOMatrixAug()[source]

Populate the augmented bond order matrix using openbabel.

Parameters:

bonddict (bool) – Flag for if the obmol bond dictionary should be saved. Default is False.

Returns:

molBOMat – Numpy array for augmented bond order matrix.

Return type:

np.array

print_geo_dict()[source]

Print geometry check info after the check.

printxyz()[source]

Print XYZ info of mol3D class instance to stdout. To write to file (more common), use writexyz() instead.

read_bo_from_mol(molfile)[source]
read_bonder_order(bofile)[source]

Get bond order information from file.

Parameters:

bofile (str) – Path to a bond order file.

read_charge(chargefile)[source]

Get charge information from file.

Parameters:

chargefile (str) – Path to a charge file.

read_smiles(smiles, ff='mmff94', steps=2500)[source]

Read a smiles string and convert it to a mol3D class instance.

Parameters:
  • smiles (str) – SMILES string to be interpreted by openbabel.

  • ff (str, optional) – Forcefield to be used by openbabel. Default is mmff94.

  • steps (int, optional) – Steps to be taken by forcefield. Default is 2500.

readfrommol2(filename, readstring=False, trunc_sym='X')[source]

Read mol2 into a mol3D class instance. Stores the bond orders and atom types (SYBYL).

Parameters:
  • filename (string) – String of path to XYZ file. Path may be local or global. May be read in as a string.

  • readstring (bool) – Flag for deciding whether a string of mol2 file is being passed as the filename

  • trunc_sym (string) – Element symbol at which one would like to truncate the bo graph.

readfromstring(xyzstring)[source]

Read XYZ from string.

Parameters:

xyzstring (string) – String of XYZ file.

readfromtxt(txt)[source]

Read XYZ from textfile.

Parameters:

txt (list) – List of lists that comes as a result of readlines.

readfromxyz(filename: str, ligand_unique_id=False, read_final_optim_step=False)[source]

Read XYZ into a mol3D class instance.

Parameters:
  • filename (string) – String of path to XYZ file. Path may be local or global.

  • ligand_unique_id (string) – Unique identifier for a ligand. In MR diagnostics, we abstract the atom based graph to a ligand based graph. For ligands, they don’t have a natural name, so they are named with a UUID. Hard to attribute MR character to just atoms, so it is attributed ligands instead.

  • read_final_optim_step (boolean) – if there are multiple geometries in the xyz file (after an optimization run) use only the last one

resetBondOBMol()[source]

Repopulates the bond order matrix via openbabel. Interprets bond order matrix.

returnxyz()[source]

Print XYZ info of mol3D class instance to stdout. To write to file (more common), use writexyz() instead.

Returns:

XYZ – String of XYZ information from mol3D class.

Return type:

string

rmsd(mol2)[source]

Compute the RMSD between two molecules. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

rmsd – RMSD between the two structures.

Return type:

float

rmsd_nonH(mol2)[source]

Compute the RMSD between two molecules, considering heavy atoms only. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

rmsd – RMSD between the two structures ignoring hydrogens.

Return type:

float

sanitycheck(silence=False, debug=False)[source]

Sanity check a molecule for overlap within the molecule.

Parameters:
  • silence (bool) – Flag for printing warnings. Default is False.

  • debug (bool) – Flag for extra printout. Default is False.

Returns:

  • overlap (bool) – Flag for whether two structures overlap. True for overlap.

  • mind (float) – Minimum distance between atoms in molecule.

sanitycheckCSD(oct=False, angle1=30, angle2=80, angle3=45, debug=False, metals=None)[source]

Sanity check a CSD molecule.

Parameters:
  • oct (bool, optional) – Flag for octahedral test. Default is False.

  • angle1 (float, optional) – Metal angle cutoff. Default is 30.

  • angle2 (float, optional) – Organic angle cutoff. Default is 80.

  • angle3 (float, optional) – Metal/organic angle cutoff e.g. M-X-X angle. Default is 45.

  • debug (bool, optional) – Extra print out desired. Default is False.

  • metals (Nonetype, optional) – Check for metals. Default is None.

Returns:

  • sane (bool) – Whether or not molecule is a sane molecule

  • error_dict (dict) – Returned if debug, {bondidists and angles breaking constraints:values}

setAtoms(atoms)[source]

Set atoms of a mol3D class to atoms.

Parameters:

atoms (list) – contains atom3D instances that should be in the molecule

setLoc(loc)[source]

Sets the conformation of an amino acid in the chain of a protein.

Parameters:

loc (str) – a one-character string representing the conformation

symvect()[source]

Method to obtain array of symbol vector of molecule.

Returns:

symbol_vector – 1 dimensional numpy array of atom symbols. (N,) dimension, N is number of atoms.

Return type:

np.array

translate(dxyz)[source]

Translate all atoms by a given vector.

Parameters:

dxyz (list) – Vector to translate all molecules, as a list [dx, dy, dz].

typevect()[source]

Method to obtain array of type vector of molecule.

Returns:

type_vector – 1 dimensional numpy array of atom types (by name). (N,) dimension, N is number of atoms.

Return type:

np.array

writegxyz(filename)[source]

Write GAMESS XYZ file.

Parameters:

filename (str) – Path to XYZ file.

writemol2(filename, writestring=False, ignoreX=False, force=False)[source]

Write mol2 file from mol3D object. Partial charges are appended if given. Else, total charge of the complex (given or interpreted by OBMol) is assigned to the metal.

Parameters:
  • filename (str) – Path to mol2 file.

  • writestring (bool, optional) – Flag to write to a string if True or file if False. Default is False.

  • ignoreX (bool, optional) – Flag to delete atom X. Default is False.

  • force (bool, optional) – Flag to dictate if bond orders are written (obmol/assigned) or =1.

writemxyz(mol, filename)[source]

Write standard XYZ file with two molecules

Parameters:
  • mol (mol3D) – mol3D instance of second molecule.

  • filename (str) – Path to XYZ file.

writenumberedxyz(filename)[source]

Write standard XYZ file with numbers instead of symbols.

Parameters:

filename (str) – Path to XYZ file.

writesepxyz(mol, filename)[source]

Write standard XYZ file with two molecules separated.

Parameters:
  • mol (mol3D) – mol3D instance of second molecule.

  • filename (str) – Path to XYZ file.

writexyz(filename, symbsonly=False, ignoreX=False, ordering=False, writestring=False, withgraph=False, specialheader=False)[source]

Write standard XYZ file.

Parameters:
  • filename (str) – Path to XYZ file.

  • symbsonly (bool, optional) – Only write symbols to file. Default is False.

  • ignoreX (bool, optional) – Ignore X element when writing. Default is False.

  • ordering (bool, optional) – If handed a list, will order atoms in a specific order. Default is False.

  • writestring (bool, optional) – Flag to write to a string if True or file if False. Default is False.

  • withgraph (bool, optional) – Flag to write with graph (after XYZ) if True. Default is False. If True, sparse graph written.

  • specialheader (str, optional) – String to write information into header. Default is False. If True, a special string is written.

Mol2D Class

class molSimplify.Classes.mol2D.Mol2D(incoming_graph_data=None, **attr)[source]

Bases: Graph

find_metal(transition_metals_only: bool = True) List[int][source]

Find indices of metal(s) in a Mol2D class.

Parameters:

transition_metals_only (bool, optional) – Only find transition metals. Default is true.

Returns:

metal_list – List of indices of metal atoms in Mol2D.

Return type:

list

Examples

Build Vanadyl acetylacetonate from SMILES:

>>> mol = Mol2D.from_smiles("CC(=[O+]1)C=C(C)O[V-3]12(#[O+])OC(C)=CC(C)=[O+]2")
>>> mol.find_metal()
[7]
classmethod from_mol2_file(filename)[source]
classmethod from_mol_file(filename)[source]
classmethod from_smiles(smiles: str)[source]

Create a Mol2D object from a SMILES string.

Parameters:

smiles (str) – SMILES representation of the molecule.

Returns:

Mol2D object of the molecule

Return type:

Mol2D

Examples

Create a furan molecule from SMILES:

>>> mol = Mol2D.from_smiles("o1cccc1")
>>> mol
Mol2D(O1C4H4)
graph_determinant(return_string: bool = True) str | float[source]

Calculates the molecular graph determinant.

Parameters:

return_string (bool, optional) – Flag to return the determinant as a string. Default is True.

Returns:

graph determinant

Return type:

str | float

Examples

Create a furan molecule from SMILES:

>>> mol = Mol2D.from_smiles("o1cccc1")

and calculate the graph determinant:

>>> mol.graph_determinant()
'-19404698740'
graph_hash() str[source]

Calculates the node attributed graph hash of the molecule.

Returns:

node attributed graph hash

Return type:

str

Examples

Create a furan molecule from SMILES:

>>> mol = Mol2D.from_smiles("o1cccc1")

and calculate the node attributed graph hash:

>>> mol.graph_hash()
'f6090276d7369c0c0a535113ec1d97cf'
graph_hash_edge_attr() str[source]

Calculates the edge attributed graph hash of the molecule.

Returns:

edge attributed graph hash

Return type:

str

Examples

Create a furan molecule from SMILES:

>>> mol = Mol2D.from_smiles("o1cccc1")

and calculate the edge attributed graph hash:

>>> mol.graph_hash_edge_attr()
'a9b6fbc7b5f53613546d5e91a7544ed6'

monomer3D Class

class molSimplify.Classes.monomer3D.monomer3D(three_lc='GLY', chain='undef', id=-1, occup=1.0, loc='')[source]

Bases: object

Holds information about a monomer, used to do manipulations. Reads information from structure file (pdb) or is directly built from molsimplify.

addAtom(atom, index=None)[source]

Adds an atom to the atoms attribute, which contains a list of atom3D class instances.

Parameters:
  • atom (atom3D) – atom3D class instance of added atom.

  • index (int, optional) – Index of added atom. Default is None.

  • auto_populate_BO_dict (bool, optional) – Populate bond order dictionary with newly added atom. Default is True.

centermass()[source]

Computes coordinates of center of mass of monomer. :returns: center_of_mass – Coordinates of center of mass. List of length 3: (X, Y, Z). :rtype: list

centroid()[source]

Computes coordinates of centroid of monomer. :returns: centroid – Coordinates of centroid. List of length 3: (X, Y, Z). :rtype: list

coords()[source]

Method to obtain string of coordinates in monomer.

Returns:

coord_string – String of molecular coordinates with atom identities in XYZ format.

Return type:

string

getGreek(greek)[source]

Finds the Greek lettered carbon(s) or other atom(s) of the user’s choice.

Parameters:

greek (string) – The Greek lettered atom (e.g. alpha carbon) we want. Inputs should be form ‘CA’ or similar.

Returns:

greek_atoms – A list of atom3D class objects that contains the Greek lettered atom(s) we want.

Return type:

list of atom3Ds

getPeptideAtoms()[source]

Makes the atoms involved in peptide bonding attributes.

identify()[source]

States whether the amino acid is (positively/negatively) charged, polar, or hydrophobic.

Returns:

aa_type – Positively charged, Negatively charged, Polar, Hydrophobic

Return type:

string

setBonds()[source]

Sets the bonds between atoms within the monomer.

setLoc(loc)[source]

Sets the conformation of a monomer in the chain of a protein.

Parameters:

loc (str) – a one-character string representing the conformation

setNext(next_aa)[source]

Sets the next monomer in the chain of a protein (if applicable).

Parameters:

next_aa (monomer3D) – the monomer that is next in the protein chain

setPrev(prev_aa)[source]

Sets the previous monomer in the chain of a protein (if applicable).

Parameters:

prev_aa (monomer3D) – the monomer that comes before self in the protein chain

protein3D Class

class molSimplify.Classes.protein3D.protein3D(pdbCode='undef')[source]

Bases: object

Holds information about a protein, used to do manipulations. Reads information from structure file (pdb, cif) or is directly built from molsimplify.

autoChooseConf()[source]

Automatically choose the conformation of a protein3D class instance based first on what the greatest occupancy level is and then the first conformation ihe alphabet with all else equal.

centermass()[source]

Computes coordinates of center of mass of protein.

convexhull()[source]

Computes convex hull of protein.

Returns:

hull – Coordinates of convex hull.

Return type:

array

countAAs()[source]

Return the number of amino acid residues in a protein3D class.

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.countAAs() # This return the number of AAs in the PDB for all the chains.
1121
fetch_pdb(pdbCode)[source]

API query to fetch a pdb and write it as a protein3D class instance

Parameters:

pdbCode (str) – Code for protein, e.g. 1os7

findAA(three_lc='XAA')[source]

Find amino acids with a specific three-letter code.

Parameters:

three_lc (str) – three-letter code, default as XAA.

Returns:

inds – a set of amino acid indices with the specified symbol.

Return type:

set

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7

Return a set of pairs where each pair is a combination of the chain name and the index of the amino acid specified (in this case, ‘MET’) >>> aa_set = pdb_system.findAA(three_lc = ‘MET’) >>> sorted(aa_set) # Sorting for reproducible order in doctest [(‘A’, 268), (‘B’, 268), (‘C’, 268), (‘D’, 268)]

findAtom(sym='X', aa=True)[source]

Find atoms with a specific symbol that are contained in amino acids or heteromolecules.

Parameters:
  • sym (str) – element symbol, default as X.

  • aa (boolean) – True if we want atoms contained in amino acids False if we want atoms contained in heteromolecules

Returns:

inds – a list of atom indices with the specified symbol.

Return type:

list

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.findAtom(sym="S", aa=True) # Returns indices of sulphur atoms present in amino acids
[2166, 4442, 6733, 9041]
>>> pdb_system.findAtom(sym="S", aa=False) # Returns indices of sulphur atoms present in heteromolecules
[9164, 9182, 9200]
findMetal(transition_metals_only=True)[source]

Find metal(s) in a protein3D class.

Parameters:

transition_metals_only (bool, optional) – Only find transition metals. Default is true.

Returns:

metal_list – List of indices of metal atoms in protein3D.

Return type:

list

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7')
fetched: 1os7
>>> pdb_system.findMetal()
[9160, 9178, 9196, 9214]
freezeatom(atomIdx)[source]

Set the freeze attribute to be true for a given atom3D class.

Parameters:

atomIdx (int) – Index for atom to be frozen.

freezeatoms(Alist)[source]

Set the freeze attribute to be true for a given set of atom3D classes, given their indices. Preserves ordering, starts from largest index.

Parameters:

Alist (list) – List of indices for atom3D instances to remove.

getAtom(idx)[source]

Get atom with a given index.

Parameters:

idx (int) – Index of desired atom.

Returns:

atom – atom3D class for element at given index.

Return type:

atom3D

getBoundMols(h_id, aas_only=False)[source]

Get a list of molecules bound to a heteroatom, usually a metal.

Parameters:
  • h_id (int) – The index of the desired (hetero)atom origin

  • aas_only (boolean) – Whether or not to only consider amino acids, defaults False

Returns:

bound_mols – List of monomer3D and/or mol3D instances of molecules bound to hetatm

Return type:

list

getChain(chain_id)[source]

Takes a chain of interest and turns it into its own protein3D class instance.

Parameters:

chain_id (string) – The letter name of the chain of interest

Returns:

p – A protein3D instance consisting of just the chain of interest

Return type:

protein3D

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.getChain('A') 
getIndex(atom)[source]

Get index of a given atom

Parameters:

atom (atom3D) – atom3D class for element at given index.

Returns:

idx – Index of desired atom.

Return type:

int

getMissingAAs()[source]

Get missing amino acid residues of a protein3D class.

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1MH1') # Fetch a PDB
fetched: 1MH1
>>> pdb_system.getMissingAAs()   # This gives a list of monomer3D objects
[monomer3D(VAL, id=182), monomer3D(LYS, id=183), monomer3D(LYS, id=184)]
getMissingAtoms()[source]

Get missing atoms of a protein3D class.

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1MH1') # Fetch a PDB
fetched: 1MH1
>>> missing_atoms = pdb_system.getMissingAtoms()

List atoms in the first set of missing_atoms >>> [atom.sym for atom in list(missing_atoms)[0]] [‘C’, ‘C’, ‘C’, ‘C’, ‘C’, ‘C’, ‘O’]

getMolecule(a_id, aas_only=False)[source]

Finds the molecule that the atom is contained in.

Parameters:
  • a_id (int) – The index of the desired atom whose molecule we want to find

  • aas_only (boolean) – True if we want ito find atoms contained in amino acids only. False if we want atoms contained in all molecules. Default is False.

Returns:

mol – The amino acid residue, nucleotide, or heteromolecule containing the atom

Return type:

monomer3D or mol3D

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7

This returns an molSimplify.Classes.monomer3D object indicating that the atom is part of an amino acid or nucleotide: >>> pdb_system.getMolecule(a_id=2166) monomer3D(MET, id=268)

This returns a mol3D object indicating that the atom is part of a molecule that is not an amino acid or nucleotide >>> pdb_system.getMolecule(a_id=9164) mol3D(S1O3N1C2) >>> pdb_system.getMolecule(a_id=9164).name # This prints the name of the molecule, in this case, it is ‘TAU’ ‘TAU’

readMetaData()[source]

API query to fetch XML data from a pdb and add its useful attributes to a protein3D class.

Parameters:

pdbCode (str) – Code for protein, e.g. 1os7

readfrompdb(text)[source]

Read PDB into a protein3D class instance.

Parameters:

text (str) – String of path to PDB file. Path may be local or global. May also be the text of a PDB file from the internet.

setAAs(aas)[source]

Set monomers of a protein3D class to different monomers.

Parameters:

aas (dictionary) – Keyed by chain and location Valued by monomer3D monomers (amino acids or nucleotides)

setAtoms(atoms)[source]

Set atom indices of a protein3D class to atoms.

Parameters:

atoms (dictionary) – Keyed by atom index Valued by atom3D atom that has that index

setBonds(bonds)[source]

Sets the bonded atoms in the protein.

This is effectively the molecular graph.

Parameters:

bonds (dictionary) – Keyed by atom3D atoms in the protein Valued by a set consisting of bonded atoms

setCentroid()[source]

Computes coordinates of center of mass of protein.

setChains(chains)[source]

Set chains of a protein3D class to different chains.

Parameters:

chains (dictionary) – Keyed by desired chain IDs. Valued by the list of molecules in the chain.

setConf(conf)[source]

Set possible conformations of a protein3D class to a new list.

Parameters:

conf (list) – List of possible conformations for applicable amino acids.

setDataCompleteness(DataCompleteness)[source]

Set DataCompleteness value of protein3D class.

Parameters:

DataCompleteness (float) – The desired new R value.

setEDIAScores()[source]

Sets the EDIA score of a protein3D class.

Parameters:

pdbCode (string) – The 4-character code of the protein3D class.

setHetmols(hetmols)[source]

Set heteromolecules of a protein3D class to different ones.

Parameters:

hetmols (dictionary) – Keyed by chain and location Valued by mol3D heteromolecules

setIndices(a_ids)[source]

Set atom indices of a protein3D class to atoms.

Parameters:

a_ids (dictionary) – Keyed by atom3D atom Valued by its index

setMissingAAs(missing_aas)[source]

Set missing amino acids of a protein3D class to a new list.

Parameters:

missing_aas (list) – List of missing amino acids.

setMissingAtoms(missing_atoms)[source]

Set missing atoms of a protein3D class to a new dictionary.

Parameters:

missing_atoms (dictionary) – Keyed by amino acid residues / nucleotides of origin Valued by missing atoms

setPDBCode(pdbCode)[source]

Sets the 4-letter PDB code of a protein3D class instance

Parameters:

pdbCode (string) – Desired 4-letter PDB code

setR(R)[source]

Set R value of protein3D class.

Parameters:

R (float) – The desired new R value.

setRSRZ(RSRZ)[source]

Set RSRZ score of protein3D class.

Parameters:

RSRZ (float) – The desired new RSRZ score.

setRfree(Rfree)[source]

Set Rfree value of protein3D class.

Parameters:

Rfree (float) – The desired new Rfree value.

setTwinL(TwinL)[source]

Set TwinL score of protein3D class.

Parameters:

TwinL (float) – The desired new TwinL score.

setTwinL2(TwinL2)[source]

Set TwinL squared score of protein3D class.

Parameters:

TwinL2 (float) – The desired new TwinL squared score.

stripAtoms(atoms_stripped)[source]

Removes certain atoms from the protein3D class instance.

Parameters:

atoms_stripped (list) – List of atom3D indices that should be removed

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.stripAtoms([2166, 4442, 6733, 2165]) # This removes the list of atoms with
>>>                                                # indices listedin the code
stripHetMol(hetmol)[source]
Removes all heteroatoms part of the specified heteromolecule from

the protein3D class instance.

Parameters:

hetmol (str) – String representing the name of a heteromolecule whose heteroatoms should be stripped from the protein3D class instance

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('3I40') # Fetch a PDB
fetched: 3I40
>>> pdb_system.stripHetMol('HOH')

globalvars

class molSimplify.Classes.globalvars.globalvars(*args, **kwargs)[source]

Bases: object

Globalvars class. Defines global variables used throughout the code, including periodic table.

add_custom_path(path)[source]

Record custom path in ~/.molSimplify file

Parameters:

path (str) – Path to custom data ~/.molSimplify file.

amass() Dict[str, Tuple[float, int, float, int]][source]

Get the atomic mass dictionary.

Returns:

amassdict – Dictionary containing atomic masses.

Return type:

dict

bbcombs_mononuc()[source]

Get backbone combinations dictionary

Returns:

bbcombs_mononuc – Backbone combination dictionary for different geometries.

Return type:

dict

bondsdict()[source]

Get the bond dictionary.

Returns:

bondsdict – Dictionary containing bond lengths.

Return type:

dict

elementsbynum()[source]

Returns list of elements by number

Returns:

elementsbynum – List of elements by number

Return type:

list

endict()[source]

Returns electronegativity dictionary.

Returns:

endict – Electronegativity dictionary

Return type:

list

geo_check_dictionary()[source]

Returns list of geo check objects dictionary.

Returns:

geo_check_dictionary – Geo check measurement dictionary.

Return type:

dict

getAllAAs()[source]

Gets all amino acids

Returns:

amino_acids – Dictionary of standard amino acids

Return type:

dictionary

get_all_angle_refs()[source]

Get references angle dict.

Returns:

all_angle_refs – Reference angles for various geometries.

Return type:

dict

get_all_geometries()[source]

Get available geometries.

Returns:

all_geometries – All available geometries.

Return type:

list

groups()[source]

Returns dict of elements by groups.

Returns:

groups_dict – Groups dictionary.

Return type:

dict

metalslist(transition_metals_only=True)[source]

Get the metals list.

Returns:

metalslist – List of available metals.

Return type:

list

periods()[source]

Returns dict of elements by periods.

Returns:

periods_dict – Periods dictionary.

Return type:

dict

polarizability() Dict[str, float][source]

Get the polarizability dictionary.

Returns:

poldict – Dictionary containing polarizabilities.

Return type:

dict

testTF()[source]

Tests to see whether keras and tensorflow are available.

Returns:

tf_flag – True if tensorflow and keras are available.

Return type:

bool

testmatplotlib()[source]

Tests to see if matplotlib is available

Returns:

mpl_flag – True if matplotlib is available

Return type:

bool

tribonddict()[source]

Get the triple bond dictionary.

Returns:

tribonddict – Dictionary containing triple bond lengths.

Return type:

dict

vdwrad()[source]

Returns VDW dictionary.

Returns:

vdwrad – Dictionary of VDW radii.

Return type:

list

molSimplify.Classes.globalvars.mybash(cmd)[source]

Function to run a bash command.

Parameters:

cmd (str) – String containing command to be run

Returns:

output – bash output string

Return type:

str

rundiag

class molSimplify.Classes.rundiag.run_diag[source]

Bases: object

Class of run diagnostic information to automated decision making and property prediction

set_ANN(ANN_flag, ANN_reason=False, ANN_dict=False, catalysis_flag=False, catalysis_reason=False)[source]

Set the ANN properties.

Parameters:
  • ANN_flag (bool) – Flag for whether the ANN variables exist.

  • ANN_reason (str, optional) – Reasoning for why ANN failed if failed. Default is False.

  • ANN_dict (dict, optional) – Dictionary with ANN values and uncertainty.

  • catalysis_flag (bool, optional) – Whether or not catalytic properties are set.

  • catalysis_reason (str, optional) – Reasoning for why catalytic ANN failed if failed. Default is False.

set_dict_bl(dict_bl)[source]

Set the ANN properties.

Parameters:

dict_bl (dict, optional) – Dictionary with ANN bond lengths.

set_mol(mol)[source]

Set the ANN molecule.

Parameters:

mol (mol3D) – mol3D class instance for optimized molecule.

set_sanity(sanity, min_distance)[source]

Set the sanity of predictions.

Parameters:
  • sanity (bool) – Flag for whether the ANN variables exist.

  • min_distance (float) – Minimum distance to train

write_report(path)[source]

Write report of molecule with ANN properties.

Parameters:

path (str) – Path for location to write the report.