Classes

atom3D Class

class molSimplify.Classes.atom3D.atom3D(Sym: str = 'C', xyz: List[float] | None = None, name: str | None = None, partialcharge: float | None = None, Tfactor=0, greek='', occup=1.0, loc='', line='')[source]

Bases: object

atom3D class. Base class in molSimplify for representing an element.

Parameters:
  • Sym (str, optional) – Symbol for atom3D instantiation. Element symbol. Default is ‘C’.

  • xyz (list, optional) – List of coordinates for new atom. Default is [0.0, 0.0, 0.0]. Units of angstroms.

  • name (str, optional) – Unique identifier for atom 3D instantiation. Default is False.

  • partialcharge (int, optional) – Charge assigned to atom when added to mol. Default is None.

coords()[source]

Get coordinates of a given atom.

Returns:

coords – List of coordinates in X, Y, Z format. Units of angstroms.

Return type:

list

distance(atom2)[source]

Get distance from one atom3D class to another.

Parameters:

atom2 (atom3D) – atom3D class of the atom to measure distance from.

Returns:

dist – Distance in angstroms.

Return type:

float

distancev(atom2)[source]

Get distance vector from one atom3D class to another.

Parameters:

atom2 (atom3D) – atom3D class of the atom to measure distance from.

Returns:

dist_list – List of distances in vector form: [dx, dy, dz] with units of angstroms.

Return type:

list

ismetal(transition_metals_only=True, include_X=False) bool[source]

Identify whether an atom is a metal.

Parameters:
  • transition_metals_only (bool, optional) – Identify only transition metals. Default is True.

  • include_X (bool, optional) – Whether “X” atoms are considered metals. Default is False.

Returns:

metal – Bool for whether or not an atom is a metal.

Return type:

bool

mutate(newType='C')[source]

Mutate an element to another element in the atom3D.

Parameters:

newType (str, optional) – Element name for new element. Default is ‘C’.

setEDIA(score)[source]

Sets the EDIA score of an individual atom3D.

Parameters:

score (float) – Desired EDIA score of atom

setcoords(xyz)[source]

Set coordinates of an atom3D class to a new location.

Parameters:

xyz (list) – List of coordinates, has length 3: [X, Y, Z] Units of angstroms.

symbol() str[source]

Return symbol of atom3D.

Returns:

symbol – Element symbol for atom3D class.

Return type:

str

translate(dxyz)[source]

Move the atom3D by a displacement vector.

Parameters:

dxyz (list) – Displacement vector of length 3: [dx, dy, dz]. Units of angstroms.

mol3D Class

class molSimplify.Classes.mol3D.mol3D(name='ABC', loc='', use_atom_specific_cutoffs=False)[source]

Bases: object

Holds information about a molecule, used to do manipulations. Reads information from structure file (XYZ, mol2) or is directly built from molsimplify. Please be cautious with periodic systems.

Example instantiation of an octahedral iron-ammonia complex from an XYZ file:

>>> complex_mol = mol3D()
>>> complex_mol.readfromxyz('fe_nh3_6.xyz')  
ACM(idx1, idx2, idx3, angle)[source]

Performs angular movement on a mol3D object. A submolecule is rotated about idx2. Operates directly on object.

Note: Function is sometimes unreliable in non-simple cases.

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom 1.

  • idx3 (int) – Index of anchor atom 2.

  • angle (float) – New bond angle in degrees.

ACM_axis(idx1, idx2, axis, angle)[source]

Performs angular movement about an axis on a mol3D object. A submolecule is rotated about idx2. Operates directly on object.

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom.

  • axis (list) – Direction vector of axis. Length 3 (X, Y, Z).

  • angle (float) – New bond angle in degrees.

BCM(idx1, idx2, d)[source]

Performs bond centric manipulation (same as Avogadro, stretching and squeezing bonds). A submolecule is translated along the bond axis connecting it to an anchor atom.

Illustration: H3A-BH3 -> H3A—-BH3 where B = idx1 and A = idx2

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom.

  • d (float) – Bond distance in angstroms.

>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('H', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))
>>> complex_mol.BCM(1, 0, 0.7) # Set distance between atoms 0 and 1 to be 1.5 angstroms. Move atom 1.
>>> complex_mol.coordsvect()
array([[0. , 0. , 0. ],
       [0. , 0. , 0.7]])
BCM_opt(idx1, idx2, d, ff='uff')[source]

Performs bond centric manipulation (same as Avogadro, stretching and squeezing bonds). A submolecule is translated along the bond axis connecting it to an anchor atom. Performs force field optimization after, freezing the moved bond length.

Illustration: H3A-BH3 -> H3A—-BH3 where B = idx1 and A = idx2

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom.

  • d (float) – Bond distance in angstroms.

  • ff (str) – Name of force field to be used from openbabel.

IsOct(init_mol=None, dict_check=False, angle_ref=False, flag_catoms=False, catoms_arr=None, debug=False, flag_lbd=True, BondedOct=True, skip=False, flag_deleteH=True, silent=False, use_atom_specific_cutoffs=True)[source]

Main geometry check method for octahedral structures.

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • flag_catoms (bool, optional) – Whether or not to return the catoms arr. Default as False.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

  • skip (list, optional) – Geometry checks to skip. Default is False.

  • flag_deleteH (bool, optional,) – Flag to delete Hs in ligand comparison. Default is True.

  • silent (bool, optional) – Flag for warning suppression. Default is False.

  • use_atom_specific_cutoffs (bool, optional) – Determine bonding with atom specific cutoffs.

Returns:

  • flag_oct (int) – Good (1) or bad (0) structure.

  • flag_list (list) – Metrics that are preventing a good geometry.

  • dict_oct_info (dict) – Dictionary of measurements of geometry.

IsStructure(init_mol=None, dict_check=False, angle_ref=False, num_coord=5, flag_catoms=False, catoms_arr=None, debug=False, skip=False, flag_deleteH=True)[source]

Main geometry check method for square pyramidal structures.

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • num_coord (int, optional) – The metal coordination number.

  • flag_catoms (bool, optional) – Whether or not to return the catoms arr. Default as False.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • skip (list, optional) – Geometry checks to skip. Default is False.

  • flag_deleteH (bool, optional,) – Flag to delete Hs in ligand comparison. Default is True.

Returns:

  • flag_oct (int) – Good (1) or bad (0) structure.

  • flag_list (list) – Metrics that are preventing a good geometry.

  • dict_oct_info (dict) – Dictionary of measurements of geometry.

Oct_inspection(init_mol=None, catoms_arr=None, dict_check=False, angle_ref=False, flag_lbd=False, dict_check_loose=False, BondedOct=True, debug=False)[source]

Used to track down the changing geo_check metrics in a DFT geometry optimization. Catoms_arr always specified.

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.

  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.

  • dict_check_loose (dict, optional) – Dictionary of geo check metrics, if a dictionary other than the default one from globalvars is desired.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

  • debug (bool, optional) – Flag for extra printout. Default is False.

Returns:

  • flag_oct (int) – Good (1) or bad (0) structure.

  • flag_list (list) – Metrics that are preventing a good geometry.

  • dict_oct_info (dict) – Dictionary of measurements of geometry.

  • flag_oct_loose (int) – Good (1) or bad (0) structures with loose cutoffs.

  • flag_list_loose (list) – Metrics that are preventing a good geometry with loose cutoffs.

RCAngle(idx1, idx2, idx3, anglei, anglef, angleint=1.0, writegeo=False, dir_name='rc_angle_geometries')[source]

Generates geometries along a given angle reaction coordinate. In the given molecule, idx1 is rotated about idx2 with respect to idx3. Operates directly on object.

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom 1.

  • idx3 (int) – Index of anchor atom 2.

  • anglei (float) – New initial bond angle in degrees.

  • anglef (float) – New final bond angle in degrees.

  • angleint (float; default is 1.0 degree) – The angle interval in which the angle is changed

  • writegeo (if True, the generated geometries will be written) – to a directory; if False, they will not be written to a directory; default is False

  • dir_name (string; default is 'rc_angle_geometries') – The directory to which generated reaction coordinate geoemtries are written, if writegeo=True.

>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('O', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))
>>> complex_mol.addAtom(atom3D('H', [0, 1, 0]))

Generate reaction coordinate geometries using the given structure by changing the angle between atoms 2, 1, and 0 from 90 degrees to 160 degrees in intervals of 10 degrees >>> complex_mol.RCAngle(2, 1, 0, 90, 160, 10) [mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2)]

Generate reaction coordinates with the given geometry by changing the angle between atoms 2, 1, and 0 from 160 degrees to 90 degrees in intervals of 10 degrees, and the generated geometries will not be written to a directory. >>> complex_mol.RCAngle(2, 1, 0, 160, 90, -10) [mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2)]

RCDistance(idx1, idx2, disti, distf, distint=0.05, writegeo=False, dir_name='rc_distance_geometries')[source]

Generates geometries along a given distance reaction coordinate. In the given molecule, idx1 is moved with respect to idx2. Operates directly on object.

Parameters:
  • idx1 (int) – Index of bonded atom containing submolecule to be moved.

  • idx2 (int) – Index of anchor atom 1.

  • disti (float) – New initial bond distance in angstrom.

  • distf (float) – New final bond distance in angstrom.

  • distint (float; default is 0.05 angstrom) – The distance interval in which the distance is changed

  • writegeo (if True, the generated geometries will be written) – to a directory; if False, they will not be written to a directory; default is False

  • dir_name (string; default is 'rc_distance_geometries') – The directory to which generated reaction coordinate geoemtries are written if writegeo=True.

>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('H', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))

Generate reaction coordinate geometries using the given structure by changing the distance between atoms 1 and 0 from 1.0 to 3.0 angstrom (atom 1 is moved) in intervals of 0.5 angstrom >>> complex_mol.RCDistance(1, 0, 1.0, 3.0, 0.5) [mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2)]

Generate reaction coordinates geometries using the given structure by changing the distance between atoms 1 and 0 from 3.0 to 1.0 angstrom (atom 1 is moved) in intervals of 0.2 angstrom, and the generated geometries will not be written to a directory. >>> complex_mol.RCDistance(1, 0, 3.0, 1.0, -0.25) [mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2)]

Structure_inspection(init_mol=None, catoms_arr=None, num_coord=5, dict_check=False, angle_ref=False, flag_lbd=False, dict_check_loose=False, BondedOct=True, debug=False)[source]

Used to track down the changing geo_check metrics in a DFT geometry optimization. Specifically for a square pyramidal structure. Catoms_arr always specified.

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.

  • num_coord (int, optional) – The metal coordination number.

  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.

  • dict_check_loose (dict, optional) – Dictionary of geo check metrics, if a dictionary other than the default one from globalvars is desired.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

  • debug (bool, optional) – Flag for extra printout. Default is False.

Returns:

  • flag_oct (int) – Good (1) or bad (0) structure.

  • flag_list (list) – Metrics that are preventing a good geometry.

  • dict_oct_info (dict) – Dictionary of measurements of geometry.

  • flag_oct_loose (int) – Good (1) or bad (0) structures with loose cutoffs.

  • flag_list_loose (list) – Metrics that are preventing a good geometry with loose cutoffs.

addAtom(atom: atom3D, index: int | None = None, auto_populate_bo_dict: bool = True)[source]

Adds an atom to the atoms attribute, which contains a list of atom3D class instances.

Parameters:
  • atom (atom3D) – atom3D class instance of added atom.

  • index (int, optional) – Index of added atom. Default is None.

  • auto_populate_bo_dict (bool, optional) – Populate bond order dictionary with newly added atom. Default is True.

>>> complex_mol = mol3D()
>>> C_atom = atom3D('C', [1, 1, 1])
>>> complex_mol.addAtom(C_atom) # Add carbon atom at cartesian position 1, 1, 1 to mol3D object.
add_bond(idx1: int, idx2: int, bond_type: int) dict[source]

Add a bond of order bond_type between the atom at idx1 and the atom at idx2. Adjusts bo_dict, bo_mat, and graph only, not OBMol.

Parameters:
  • idx1 (int) – Index of first atom.

  • idx2 (int) – Index of second atom.

  • bond_type (int) – The order of the new bond.

Returns:

self.bo_dict – The modified bond order dictionary.

Return type:

dict

alignmol(atom1, atom2)[source]

Aligns two molecules such that the coordinates of two atoms overlap. Second molecule is translated relative to the first. No rotations are performed. Use other functions for rotations. Moves the mol3D class.

Parameters:
  • atom1 (atom3D) – atom3D of reference atom in first molecule.

  • atom2 (atom3D) – atom3D of reference atom in second molecule.

apply_ffopt(constraints=False, ff='uff')[source]

Apply forcefield optimization to a given mol3D class.

Parameters:
  • constraints (int, optional) – Range of atom indices to employ cartesian constraints before ffopt.

  • ff (str, optional) – Force field to be used in openbabel. Default is UFF.

Returns:

energy – Energy of the ffopt in kJ/mol.

Return type:

float

apply_ffopt_list_constraints(list_constraints=False, ff='uff')[source]

Apply forcefield optimization to a given mol3D class. Differs from apply_ffopt in that one can specify constrained atoms as a list.

Parameters:
  • list_constraints (list of int, optional) – List of atom indices to employ cartesian constraints before ffopt.

  • ff (str, optional) – Force field to be used in openbabel. Default is UFF.

Returns:

energy – Energy of the ffopt in kJ/mol.

Return type:

float

aromatic_charge(bo_graph)[source]

Get the charge of aromatic rings based on 4*n+2 rule.

Parameters:

bo_graph (numpy.array) – Bond order matrix.

Returns:

aromatic_charge – The charge of the aromatic rings.

Return type:

int

assign_graph_from_net(path_to_net, return_graph=False)[source]

Uses a .net file to assign a graph (and return if needed).

Parameters:
  • path_to_net (str) – path to .net file containing the molecular graph.

  • return_graph (bool) – Return the graph in addition to assigning it to self. Default is False.

Returns:

graph – A numpy array containing the unattributed molecular graph.

Return type:

np.array

calcCharges(charge=0, method='QEq')[source]

Compute the partial charges of a molecule using openbabel.

Parameters:
  • charge (int) – Net charge assigned to a molecule

  • method (str) – Method to calculate partial charge. Default is ‘QEq’.

centermass()[source]

Computes coordinates of center of mass of molecule.

Returns:

center_of_mass – Coordinates of center of mass. List of length 3: (X, Y, Z).

Return type:

list

centersym()[source]

Computes coordinates of center of symmetry of molecule. Identical to centermass, but not weighted by atomic masses.

Returns:

center_of_symmetry – Coordinates of center of symmetry. List of length 3: (X, Y, Z).

Return type:

list

check_angle_linear(catoms_arr=None)[source]

Get the ligand orientation for linear ligands.

Parameters:

catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.

Returns:

dict_orientation – Dictionary containing average deviation from linearity (devi_linear_avrg) and max deviation (devi_linear_max).

Return type:

dict

choose_atoms_to_move(ligands, swap_indices, catoms)[source]

Helper function for flip_symmetry to determine atoms to reflect

Parameters:
  • ligands (list) – Indices of atoms in each ligand

  • swap_indices (list) – Indices indicating which ligand of each type to be moved

  • catoms (list) – Indices of coordinating atoms in each ligand

Returns:

atoms_to_move – List of atom indices to be moved

Return type:

list

cleanBonds()[source]

Removes all stored openbabel bond order information.

closest_H_2_metal(delta=0)[source]

Get closest hydrogen atom to metal.

Parameters:

delta (float) – Distance tolerance in angstrom.

Returns:

  • flag (bool) – Flag for if a hydrogen exists in the distance tolerance.

  • min_dist_H (float) – Minimum distance for a hydrogen.

  • min_dist_nonH (float) – Minimum distance for a heavy atom.

combine(mol, bond_to_add=[], dirty=False)[source]

Combines two molecules. Each atom in the second molecule is appended to the first while preserving orders. Assumes operation with a given mol3D instance, when handed a second mol3D instance.

Parameters:
  • mol (mol3D) – mol3D class instance containing molecule to be added.

  • bond_to_add (list, optional) – List of tuples (ind1,ind2,order) bonds to add. Default is empty.

  • dirty (bool, optional) – Add atoms without worrying about bond orders. Default is False.

Returns:

cmol – New mol3D class containing the two molecules combined.

Return type:

mol3D

continuous_shape_measure(ideal_polyhedron)[source]

Return the continuous shape measure for the FCS, defined as: min(sum_i^N (q_i - p_i)^2 / sum_i^N(q_i - q_0)^2) Where q_i, p_i are vertices of the polyhedron and reference, and q_0 is the center of geometry of the real structure. The minimization is over possible pairwise combinations of vertices, and a rotation of the reference polyhedron (which is done with Kabsch). Only works for single-metal center TMCs since the translation is handled by centering on the metal. Scaling is handled by making the average bond lengths the same for the two structures. 0 means perfect matching, maximum is 100.

Parameters:

ideal_polyhedron (np.array of 3-tuples of coordinates) – Reference list of points for an ideal geometry

Returns:

min_cshm – Continuous Shape Measure between the geometry and ideal_polyhedron

Return type:

float

convert2OBMol(force_clean=False, ignoreX=False)[source]

Converts mol3D class instance to OBMol class instance. Stores as OBMol attribute. Necessary for force field optimizations and other openbabel operations.

Parameters:
  • force_clean (bool, optional) – Force no bond info retention. Default is False.

  • ignoreX (bool, optional) – Ignore X element when writing. Default is False.

convert2OBMol2(ignoreX=False)[source]

Converts mol3D class instance to OBMol class instance, but uses mol2 function, so bond orders are not interpreted, but rather read through the mol2. Stores as OBMol attribute. Necessary for force field optimizations and other openbabel operations.

Parameters:

ignoreX (bool, optional) – Ignore X element when writing. Default is False.

convert2mol3D()[source]

Converts OBMol class instance to mol3D class instance. Generally used after openbabel operations, such as FF optimizing a molecule. Updates the mol3D as necessary.

convexhull()[source]

Computes convex hull of molecule.

Returns:

hull – Coordinates of convex hull.

Return type:

array

coords(no_tabs=False)[source]

Method to obtain string of coordinates in molecule.

Parameters:

no_tabs (bool, optional) – Whether or not to use tabs in coordinate columns.

Returns:

coord_string – String of molecular coordinates with atom identities in XYZ format.

Return type:

string

coordsvect()[source]

Method to obtain array of coordinates in molecule.

Returns:

list_of_coordinates – Two dimensional numpy array of molecular coordinates. (N by 3) dimension, N is number of atoms.

Return type:

np.array

copymol3D(mol0)[source]

Copies properties and atoms of another existing mol3D object into current mol3D object. Should be performed on a new mol3D class instance. WARNING: NEVER EVER USE mol3D = mol0 to do this. It DOES NOT WORK. ONLY USE ON A FRESH INSTANCE OF MOL3D. Operates on fresh instance.

Parameters:

mol0 (mol3D) – mol3D of molecule to be copied.

count_atoms(exclude=['H', 'h', 'x', 'X'])[source]

Count the number of atoms, excluding certain atoms.

Parameters:

exclude (list) – list of symbols for atoms to exclude.

Returns:

count – the number of heavy atoms

Return type:

integer

count_electrons(charge=0)[source]

Count the number of electrons in a molecule.

Parameters:

charge (int, optional) – Net charge of a molecule. Default is neutral.

Returns:

count – The number of electrons in the system.

Return type:

integer

count_nonH_atoms()[source]

Count the number of heavy atoms.

Returns:

count – the number of heavy atoms

Return type:

integer

count_specific_atoms(atom_types=['x', 'X'])[source]

Count the number of atoms, including only certain atoms.

Parameters:

atom_types (list) – list of symbols for atoms to include.

Returns:

count – the number of heavy atoms

Return type:

integer

createMolecularGraph(oct=True, strict_cutoff=False, catom_list=None)[source]

Create molecular graph of a molecule given X, Y, Z positions. Bond order is not interpreted by this. Updates graph attribute of the mol3D class.

Parameters:
  • oct (bool) – Defines whether a structure is octahedral. Default is True.

  • strict_cutoff (bool, optional) – Strict bonding cutoff for fullerene and SACs.

  • catom_list (list of int, optional) – List of indices of bonded atoms.

create_mol_with_inds(inds)[source]

Create molecule with indices.

Parameters:

inds (list) – The indices of the selected submolecule to SAVE

Returns:

molnew – The new molecule built from the indices.

Return type:

mol3D

deleteHs()[source]

Delete all hydrogens from a molecule. Preserves heavy atom ordering.

deleteatom(atomIdx)[source]

Delete a specific atom from the mol3D class given an index.

Parameters:

atomIdx (int) – Index for the atom3D to remove.

deleteatoms(Alist)[source]

Delete a multiple atoms from the mol3D class given a set of indices. Preserves ordering, starts from largest index.

Parameters:

Alist (list of int) – List of indices for atom3D instances to remove.

dev_from_ideal_geometry(ideal_polyhedron: ndarray) Tuple[float, float][source]

Return the minimum RMSD between a geometry and an ideal polyhedron (with the same average bond distances). Enumerates all possible indexing of the geometry. As such, only recommended for small systems.

Parameters:

ideal_polyhedron (np.array of 3-tuples of coordinates) – Reference list of points for an ideal geometry

Returns:

  • rmsd (float) – Minimum root mean square distance between the fed geometry and the ideal polyhedron

  • single_dev (float) – Maximum distance between any paired points in the fed geometry and the ideal polyhedron.

dict_check_processing(dict_check, num_coord=6, debug=False, silent=False)[source]

Process the self.geo_dict to get the flag_oct and flag_list, setting dict_check as the cutoffs.

Parameters:
  • dict_check (dict) – The cutoffs of each geo_check metrics we have.

  • num_coord (int, optional) – Metal coordination number. Default is 6.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • silence (bool, optional) – Flag for warning suppression. Default is False.

Returns:

  • flag_oct (int) – Good (1) or bad (0) structure.

  • flag_list (list) – Metrics that are preventing a good geometry.

distance(mol)[source]

Measure the distance between center of mass of two molecules.

Parameters:

mol (mol3D) – mol3D class instance of second molecule to measure distance to.

Returns:

d_cm – Distance between centers of mass of two molecules.

Return type:

float

draw_svg(filename)[source]

Draw image of molecule and save to SVG.

Parameters:

filename (str) – Name of file to save SVG to.

findAtomsbySymbol(sym: str) List[int][source]

Find all elements with a given symbol in a mol3D class.

Parameters:

sym (str) – Symbol of the atom of interest.

Returns:

atom_list – List of indices of atoms in mol3D with a given symbol.

Return type:

list of int

findMetal(transition_metals_only: bool = True, include_X: bool = False) List[int][source]

Find metal(s) in a mol3D class. Also sets the metals instance attribute if it is empty.

Parameters:
  • transition_metals_only (bool, optional) – Only find transition metals. Default is True.

  • include_X (bool, optional) – Whether “X” atoms are considered metals. Default is False.

Returns:

metal_list – List of indices of metal atoms in mol3D.

Return type:

list of int

findcloseMetal(atom0)[source]

Find the nearest metal to a given atom3D class. Returns heaviest element if no metal found.

Parameters:

atom_idx (atom3D) – atom3D class for atom of interest.

Returns:

close_metal – index of the nearest metal, or heaviest atom if no metal found.

Return type:

int

findsubMol(atom0, atomN, smart=False)[source]

Finds a submolecule within the molecule given the starting atom and the separating atom. Illustration: H2A-B-C-DH2 will return C-DH2 if C is the starting atom and B is the separating atom. Alternatively, if C is the starting atom and D is the separating atom, returns H2A-B-C.

Parameters:
  • atom0 (int) – Index of starting atom.

  • atomN (int) – Index of the separating atom.

  • smart (bool, optional) – Decision of whether or not to use getBondedAtomsSmart. Default is False.

Returns:

subm – List of indices of atoms in submolecule.

Return type:

list of int

flip_symmetry(verbose=True, max_allowed_dev=30, target_symmetry=None)[source]

Flip octahedral transition metal complexes (TMCs) to opposite symmetry group. Example: cis to trans, fac to mer, etc.

Parameters:
  • verbose (bool) – Flag for returning warning when TMC exhibits high deviation from closest symmetry. Default=True

  • max_allowed_dev (float) – Maximum allowed deviation before warning is triggered (degrees). Default=30

  • target_symmetry (str) – Target symmetry for complex to be transformed to, only defined for num_unique_ligands == 3 Default=None

Returns:

self – returns self, a mol3D object with flipped symmetry

Return type:

mol3D

freezeatom(atomIdx)[source]

Set the freeze attribute to be true for a given atom3D class.

Parameters:

atomIdx (int) – Index for atom to be frozen.

freezeatoms(Alist)[source]

Set the freeze attribute to be true for a given set of atom3D objects, given their indices. Preserves ordering, starts from largest index.

Parameters:

Alist (list of int) – List of indices for atom3D instances to remove.

classmethod from_smiles(smiles, gen3d: bool = True)[source]

Generate a mol3D object from a SMILES string.

geo_dict_initialization()[source]

Initialization of geometry check dictionaries according to dict_oct_check_st.

geo_maxatomdist(mol2)[source]

Compute the max atom distance between two molecules. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

dist_max – Maximum atom distance between two structures.

Return type:

float

geo_rmsd(mol2)[source]

Compute the RMSD between two molecules. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

rmsd – RMSD between the two structures.

Return type:

float

getAngle(idx0, idx1, idx2)[source]

Get angle between three atoms identified by their indices. Specifically, get angle between vectors formed by atom0->atom1 and atom2->atom1.

Parameters:
  • idx0 (int) – Index of first atom.

  • idx1 (int) – Index of second (middle) atom.

  • idx2 (int) – Index of third atom.

Returns:

angle – Angle in degrees.

Return type:

float

getAtom(idx)[source]

Get atom with a given index.

Parameters:

idx (int) – Index of desired atom.

Returns:

atom – atom3D class for element at given index.

Return type:

atom3D

getAtomCoords(idx)[source]

Get atom coordinates with a given index.

Parameters:

idx (int) – Index of desired atom.

Returns:

coords – List of coordinates (length 3: [X, Y, Z]) for element at given index.

Return type:

list

getAtomTypes()[source]

Get unique elements in a molecule. Now somewhat redundant with get_element_list

Returns:

unique_atoms_list – List of unique elements in molecule by symbol.

Return type:

list

getAtoms()[source]

Get all atoms within a molecule.

Parameters:

None

Returns:

atom_list – List of atom3D objects for all elements in a mol3D.

Return type:

list

getAtomwithSyms(syms=['X'], return_index=False)[source]

Get atoms with a given list of symbols.

Parameters:
  • idx (list) – List of desired atom symbols.

  • return_index (bool) – True or false for returning the atom indices instead of atom3D objects. Returns indices if True.

Returns:

atom_list – List of atom3D objects for elements with given symbols.

Return type:

list

getAtomwithinds(inds)[source]

Get atoms with a given list of indices.

Parameters:

idx (list of int) – List of indices of desired atoms.

Returns:

atom_list – List of atom3D objects for elements at given indices.

Return type:

list

getBondCutoff(atom: atom3D, ratom: atom3D) float[source]

Get cutoff based on two atoms.

Parameters:
  • atom (atom3D) – atom3D class of first atom.

  • ratom (atom3D) – atom3D class of second atom.

Returns:

distance_max – Cutoff based on atomic radii scaled by a factor.

Return type:

float

getBondedAtoms(idx: int) List[int][source]

Gets atoms bonded to a specific atom. This is determined based on element-specific distance cutoffs, rather than predefined valences. This method is ideal for metals because bond orders are ill-defined. For pure organics, the OBMol class provides better functionality.

Parameters:

idx (int) – Index of reference atom.

Returns:

nats – List of indices of bonded atoms.

Return type:

list of int

getBondedAtomsBOMatrix(idx)[source]

Get atoms bonded by an atom referenced by index, using the BO matrix.

Parameters:

idx (int) – Index of reference atom.

Returns:

nats – List of indices of bonded atoms.

Return type:

list of int

getBondedAtomsBOMatrixAug(idx)[source]

Get atoms bonded by an atom referenced by index, using the augmented BO matrix.

Parameters:

idx (int) – Index of reference atom.

Returns:

nats – List of indices of bonded atoms.

Return type:

list of int

getBondedAtomsByCoordNo(idx, CoordNo=6)[source]

Gets atoms bonded to a specific atom by coordination number.

Parameters:
  • idx (int) – Index of reference atom.

  • CoordNo (int, optional) – Coordination number of reference atom of interest. Default is 6.

Returns:

nats – List of indices of bonded atoms.

Return type:

list of int

getBondedAtomsByThreshold(idx, threshold=1.15)[source]

Gets atoms bonded to a specific atom. This method uses a threshold for determination of a bond.

Parameters:
  • idx (int) – Index of reference atom.

  • threshold (float, optional) – Scale factor for sum of covalent radii based cutoff. Default is 1.15.

Returns:

nats – List of indices of bonded atoms.

Return type:

list of int

getBondedAtomsH(idx)[source]

Get bonded atom with a given index, but ONLY count hydrogens.

Parameters:

idx (int) – Index of reference atom.

Returns:

nats – List of indices of bonded hydrogens.

Return type:

list of int

getBondedAtomsOct(ind, CN=6, debug=False, flag_loose=False, atom_specific_cutoffs=False, strict_cutoff=False)[source]

Gets atoms bonded to an octahedrally coordinated metal. Specifically limits intruder C and H atoms that would otherwise be considered bonded in the distance cutoffs. Limits bonding to the CN closest atoms (CN = coordination number).

Parameters:
  • ind (int) – Index of reference atom.

  • CN (int, optional) – Coordination number of reference atom of interest. Default is 6.

  • debug (bool, optional) – Produce additional outputs for debugging. Default is False.

  • flag_loose (bool, optional) – Use looser cutoffs to determine bonding. Default is False.

  • atom_specific_cutoffs (bool, optional) – Use atom specific cutoffs to determing bonding. Default is False.

  • strict_cutoff (bool, optional) – Strict bonding cutoff for fullerene and SACs.

Returns:

nats – List of indices of bonded atoms.

Return type:

list of int

getBondedAtomsSmart(idx, oct=False, strict_cutoff=False, catom_list=None)[source]

Get the atoms bonded with the atom specified with the given index, using the molecular graph. Creates graph if it does not exist.

Parameters:
  • idx (int) – Index of reference atom.

  • oct (bool, optional) – Flag for turning on octahedral bonding routines.

  • strict_cutoff (bool, optional) – Strict bonding cutoff for fullerene and SACs.

  • catom_list (list of int, optional) – List of indices of bonded atoms.

Returns:

nats – List of indices of bonded atoms.

Return type:

list of int

getBondedAtomsnotH(idx, metal_multiplier=1.35, nonmetal_multiplier=1.15)[source]

Get bonded atom with a given index, but do not count hydrogens.

Parameters:
  • idx (int) – Index of reference atom.

  • metal_multiplier (float, optional) – Multiplier for sum of covalent radii of two elements containing metal.

  • nonmetal_multiplier (float, optional) – Multiplier for sum of covalent radii of two elements not containing metal.

Returns:

nats – List of indices of bonded atoms.

Return type:

list of int

getClosestAtom(ratom)[source]

Get hydrogens bonded to a specific atom3D class.

Parameters:

ratom (atom3D) – Reference atom3D class.

Returns:

idx – Index of atom closest to reference atom.

Return type:

int

getClosestAtomlist(atom_idx, cdist=3.0)[source]

Get hydrogens bonded to a specific atom3D class.

Parameters:
  • atom_idx (int) – Reference atom index.

  • cdist (float, optional) – Cutoff of neighbor distance in angstroms.

Returns:

neighbor_list – List of neighboring atoms.

Return type:

list

getClosestAtomnoHs(ratom)[source]

Get atoms bonded to a specific atom3D class that are not hydrogen.

Parameters:

ratom (atom3D) – Reference atom3D class.

Returns:

idx – Index of atom closest to reference atom.

Return type:

int

getFarAtom(reference, atomtype=False)[source]

Get atom furthest from a reference atom.

Parameters:
  • reference (int) – Index of a reference atom.

  • atomtype (bool, optional) – An element name for finding the furthest atom of a given element. Default is False.

Returns:

dist – Distance of atom from center of mass in angstroms.

Return type:

float

getHs()[source]

Get all hydrogens in a mol3D class instance.

Returns:

hlist – List of indices of hydrogen atoms.

Return type:

list of int

getHsbyAtom(ratom)[source]

Get hydrogens bonded to a specific atom3D class.

Parameters:

ratom (atom3D) – Reference atom3D class.

Returns:

nHs – List of indices of hydrogen atoms bound to reference atom3D.

Return type:

list of int

getHsbyIndex(idx)[source]

Get all hydrogens bonded to a given atom with an index.

Parameters:

idx (index of reference atom.) –

Returns:

nHs – List of indices of hydrogen atoms bound to reference atom.

Return type:

list of int

getMLBondLengths()[source]

Outputs the metal-ligand bond lengths in the complex.

Returns:

bls – keyed by ID of metal M and valued by dictionary of M-L bond lengths and relative bond lengths

Return type:

dictionary

getNumAtoms()[source]

Get the number of atoms within a molecule.

Returns:

self.natoms – The number of atoms in the mol3D object.

Return type:

int

getOBMol(fst, convtype, ffclean=False, gen3d=True)[source]

Get OBMol object from a file or SMILES string. If you have a mol3D, then use convert2OBMol instead.

Parameters:
  • fst (str) – Name of input file.

  • convtype (str) – Input filetype (xyz,mol,smi).

  • ffclean (bool, optional) – Flag for forcefield cleanup of structure. Default is False.

  • gen3d (bool, optional) – Flag for 3D structure generation using openbabel.OBBuilder

Returns:

OBMol – OBMol class instance to be used with openbabel. Bound as .OBMol attribute.

Return type:

OBMol

get_bo_dict_from_inds(inds)[source]

Recreate bo_dict with correct indices.

Parameters:

inds (list) – The indices of the selected submolecule to SAVE.

Returns:

new_bo_dict – The ported over dictionary with new indices (bonds deleted).

Return type:

dict

get_coordinate_array()[source]

Get the coordinate array of the molecule. Same as coordsvect.

Parameters:

None

Returns:

coord_array – The coordinates of each atom. Shape is (number of atoms, 3).

Return type:

np.array

get_element_list()[source]

Get the element list of the molecule. Nearly the same as symvect.

Parameters:

None

Returns:

element_list – The element symbol of each atom.

Return type:

list

get_fcs(strict_cutoff=False, catom_list=None, max6=True)[source]

Get first coordination shell of a transition metal complex.

Parameters:
  • strict_cutoff (bool, optional) – Strict bonding cutoff for fullerene and SACs.

  • catom_list (list of int, optional) – List of indices of coordinating atoms.

  • max6 (bool, optional) – If True, will return catoms from oct_comp.

Returns:

fcs – List of first coordination shell indices.

Return type:

list

get_features(lac=True, force_generate=False, eq_sym=False, use_dist=False, NumB=False, Gval=False, size_normalize=False, alleq=False, strict_cutoff=False, catom_list=None, custom_property_dict={}, depth=3, loud=False, two_key=False, non_trivial=False)[source]

Get geo-based RAC features for this transition metal complex (if octahedral).

Parameters:
  • lac (bool, optional) – Use lac for ligand_assign_consistent behavior. Default is True

  • force_generate (bool, optional) – Force the generation of features.

  • eq_sym (bool, optional) – Force equatorial plane to have same chemical symbols if possible.

  • use_dist (bool, optional) – Whether or not CD-RACs used.

  • NumB (bool, optional) – Whether or not the number of bonds RAC features are generated.

  • Gval (bool, optional) – Whether or not the group number RAC features are generated.

  • size_normalize (bool, optional) – Whether or not to normalize by the number of atoms in molecule.

  • alleq (bool, optional) – Whether or not all ligands are equatorial.

  • strict_cutoff (bool, optional) – Strict bonding cutoff for fullerene and SACs.

  • catom_list (list of int, optional) – List of indices of coordinating atoms.

  • custom_property_dict (dict, optional) – Keys are custom property names (str), values are dictionaries mapping atom symbols (str, e.g., “H”, “He”) to the numerical property (float) for that atom. If provided, other property RACs (e.g., Z, S, T) will not be made.

  • depth (int, optional) – The maximum depth of the RACs (how many bonds out the RACs go). For example, if set to 3, depths considered will be 0, 1, 2, and 3.

  • loud (bool) – Whether to generate print statements. Default is False.

  • two_key (bool) – Whether return dictionary should only have two keys, ‘colnames’ and ‘results’, with values that are lists of feature names and values, respectively.

  • non_trivial (bool, optional) – Flag to exclude difference RACs of I, and depth zero difference RACs. These RACs are always zero. By default False.

Returns:

results – Dictionary of {‘RACname’:RAC} for all geo-based RACs

Return type:

dict

get_first_shell(check_hapticity=True)[source]

Get the first coordination shell of a mol3D object with a single transition metal (read from CSD mol2 file) if check_hapticity is True updates the first shell of multiheptate ligand to be hydrogen set at the geometric mean

Parameters:

check_hapticity (boolean) – Whether to update multiheptate ligands to their geometric centroid.

Returns:

  • mol 3D object (First coordination shell with metal (can change based on check_hapticity).)

  • list (List of hapticity.)

get_geometry_type(dict_check=False, angle_ref=False, flag_catoms=False, catoms_arr=None, debug=False, skip=False)[source]

Get the type of the geometry (linear(2), trigonal planar(3), tetrahedral(4), square planar(4), trigonal bipyramidal(5), square pyramidal(5, one-empty-site), octahedral(6), pentagonal bipyramidal(7)).

Uses hapticity truncated first coordination shell.

Parameters:
  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • flag_catoms (bool, optional) – Whether or not to return the catoms arr. Default as False.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • skip (list, optional) – Geometry checks to skip. Default is False.

Returns:

results – Measurement of deviations from arrays.

Return type:

dictionary

get_geometry_type_distance(max_dev=1000000.0, close_dev=0.01, flag_catoms=False, catoms_arr=None, skip=False, transition_metals_only=False, cshm=False) Dict[str, Any][source]

Get the type of the geometry (available options in globalvars all_geometries).

Uses hapticity truncated first coordination shell. Does not require the input of num_coord.

Parameters:
  • max_dev (float, optional) – Maximum RMSD allowed between a structure and an ideal geometry before it is classified as unknown. Default is 1e6.

  • close_dev (float, optional) – Maximum difference in RMSD between two classifications allowed before they are compared by maximum single-atom deviation as well.

  • flag_catoms (bool, optional) – Whether or not to return the catoms arr. Default as False.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.

  • skip (list, optional) – Geometry checks to skip. Default is False.

  • transition_metals_only (bool, optional) – Flag if only transition metals counted as metals. Default is False.

  • cshm (bool, optional) – Whether or not to return continuous shape measures for each geometry.

Returns:

results – Contains the classified geometry and the RMSD from an ideal structure. Summary contains a list of the RMSD, the maximum single-atom deviation, and a continuous shape measure for all considered geometry types.

Return type:

dictionary

get_geometry_type_old(dict_check=False, angle_ref=False, num_coord=None, flag_catoms=False, catoms_arr=None, debug=False, skip=False, transition_metals_only=False, num_recursions=[0, 0])[source]

Get the type of the geometry (trigonal planar(3), tetrahedral(4), square planar(4), trigonal bipyramidal(5), square pyramidal(5, one-empty-site), octahedral(6), pentagonal bipyramidal(7)).

Parameters:
  • dict_check (dict, optional) – The cutoffs of each geo_check metrics we have. Default is False

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • num_coord (int, optional) – Expected coordination number.

  • flag_catoms (bool, optional) – Whether or not to return the catoms arr. Default as False.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • skip (list, optional) – Geometry checks to skip. Default is False.

  • transition_metals_only (bool, optional) – Flag if only transition metals counted as metals. Default is False.

  • num_recursions (list, optional) – Counter to track number of ligands classified as ‘sandwich’ and ‘edge’ in original structure.

Returns:

results – Measurement of deviations from arrays.

Return type:

dictionary

get_graph()[source]

Return the graph attribute of the molecule. It is faster to just use mol.graph though, where mol is the mol3D object.

Parameters:

None

Returns:

self.graph – The graph.

Return type:

np.array

get_graph_hash(attributed_flag=True, oct=False, loud=True)[source]

Calculate the graph hash of a molecule. Note: Not useful for distinguishing betweeen molecules that are just a single atom.

Parameters:
  • attributed_flag (bool) – Whether the graph hash takes into account element symbols. Default is True.

  • oct (bool) – Defines whether a structure is octahedral. Default is False.

  • loud (bool) – Whether to generate print statements. Default is True.

Returns:

gh – The graph hash.

Return type:

str

get_linear_angle(ind)[source]

Get linear ligand angle.

Parameters:

ind (int) – Index for one of the metal-coordinating atoms.

Returns:

  • flag (bool) – True if the ligand is linear.

  • ang (float) – Get angle of linear ligand. 0 if not linear.

get_mol_graph_det(oct=True, use_bo_mat=False)[source]

Get molecular graph determinant.

Parameters:
  • oct (bool, optional) – Flag for whether the geometry is octahedral. Default is True.

  • use_bo_mat (bool, optional) – Use bond order matrix in determinant computation.

Returns:

safedet – String containing the molecular graph determinant.

Return type:

str

get_molecular_mass()[source]

Computes the molecular mass, or weight, of a mol3D.

Parameters:

None

Returns:

mol_mass – The molecular mass. Units of amu.

Return type:

float

get_num_coord_metal(debug=False, strict_cutoff=False, catom_list=None)[source]

Get metal coordination based on get bonded atoms. Store this info.

Parameters:
  • debug (bool, optional) – Flag for whether extra output should be printed. Default is False.

  • strict_cutoff (bool, optional) – Strict bonding cutoff for fullerene and SACs.

  • catom_list (list of int, optional) – List of indices of coordinating atoms.

get_octetrule_charge(debug=False)[source]

Get the octet-rule charge provided a mol3D object with bo_graph (read from CSD mol2 file). Note that currently this function should only be applied to ligands (organic molecules).

Parameters:

debug (boolean) – Whether to have more printouts.

Returns:

  • charge (float) – The overall charge of the molecule.

  • arom_charge (int) – The charge of the aromatic rings.

get_pair_distance(idx1, idx2)[source]

Get distance between two atoms in a molecule.

Parameters:
  • idx (int) – Index of reference atom.

  • idx2 (int) – Index of the second atom.

Returns:

d – Distance between atoms in angstroms.

Return type:

float

get_smiles(canonicalize=False, use_mol2=False) str[source]

Returns the SMILES string representing the mol3D object.

Parameters:
  • canonicalize (bool, optional) – Openbabel canonicalization of smiles. Default is False.

  • use_mol2 (bool, optional) – Use graph in mol2 instead of interpreting it. Default is False.

Returns:

smiles – SMILES from a mol3D object. Watch out for charges.

Return type:

str

get_smilesOBmol_charge()[source]

Get the charge of a mol3D object through adjusted OBmol hydrogen/smiles conversion. Note that currently this function should only be applied to ligands (organic molecules).

get_submol_noHs()[source]

Get the heavy atom only submolecule, with no hydrogens.

Returns:

mol_noHs – mol3D class instance with no hydrogens.

Return type:

mol3D

get_symmetry(verbose=True, max_allowed_dev=30, details=False)[source]

Classify octahedral transition metal complexes (TMCs) according to symmetry.

Parameters:
  • verbose (bool) – Flag for returning warning when TMC exhibits high deviation from closest symmetry. Default=True

  • max_allowed_dev (float) – Maximum allowed deviation before warning is triggered (degrees). Default=30

  • details (bool) – Flag for returning detailed lists of unique ligands and coordinating atoms, intended for use with flip_symmetry(). Default=False

Returns:

  • symmetry_dict (dict) – Dictionary storing assigned symmetry class and deviations from all possible symmetry classes.

  • detailed_dict (dict) – Dictionary storing indices of metal, ligand atoms, and ligand coordinating atoms, only returned if details=True

get_symmetry_denticity(return_eq_catoms=False, BondedOct=False)[source]

Get symmetry class of molecule.

Parameters:
  • return_eq_catoms (bool, optional) – Flag for if equatorial atoms should be returned. Default is False.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

Returns:

  • eqsym (bool) – Flag for equatorial symmetry.

  • maxdent (int) – Maximum denticity in molecule.

  • ligdents (list) – List of denticities in molecule.

  • homoleptic (bool) – Flag for whether a geometry is homoleptic.

  • ligsymmetry (str) – Symmetry class for ligand of interest.

  • eq_catoms (list) – List of equatorial connection atoms.

getfarAtomdir(uP)[source]

Get atom furthest from center of mass in a given direction.

Parameters:

uP (list) – List of length 3 [dx, dy, dz] for direction vector.

Returns:

dist – Distance of atom from center of mass in angstroms.

Return type:

float

getfragmentlists()[source]

Get all independent molecules in mol3D.

Returns:

atidxes_total – list of lists for atom indices comprising of each distinct molecule.

Return type:

list

graph_from_bodict(bo_dict)[source]
initialize()[source]

Initialize the mol3D to an empty object.

isPristine(unbonded_min_dist=1.3, oct=False)[source]

Checks the organic portions of a transition metal complex and determines if they look good.

Parameters:
  • unbonded_min_dist (float, optional) – Minimum distance for two things that are not bonded (in angstrom). Default is 1.3.

  • oct (bool, optional) – Flag for octahedral complex. Default is False.

Returns:

  • pass (bool) – Whether or not molecule passes the organic checks.

  • fail_list (list) – List of failing criteria, as a set of strings.

is_edge_compound(transition_metals_only: bool = True) Tuple[int, List, List][source]

Check if a TMC/mononuclear metal complex structure is an edge compound.

Parameters:

transition_metals_only (bool, optional) – Flag if only transition metals counted as metals. Default is True.

Returns:

  • num_edge_lig (int) – Number of edge ligands.

  • info_edge_lig (list) – List of dictionaries with info about edge ligands.

  • edge_lig_atoms (list) – List of dictionaries with the connecting atoms of the edge ligands.

is_linear_ligand(ind)[source]

Check whether a ligand is linear.

Parameters:

ind (int) – Index for one of the metal-coordinating atoms.

Returns:

  • flag (bool) – True if the ligand is linear.

  • catoms (list) – Atoms bonded to the index of interest.

is_sandwich_compound(transition_metals_only: bool = True) Tuple[int, List, bool, bool, List][source]

Evaluates whether a TMC/mononuclear metal complex compound is a sandwich compound.

Parameters:

transition_metals_only (bool, optional) – Flag if only transition metals counted as metals. Default is True.

Returns:

  • num_sandwich_lig (int) – Number of sandwich ligands.

  • info_sandwich_lig (list) – List of dictionaries about the sandwich ligands.

  • aromatic (bool) – Flag about whether the ligand is aromatic.

  • allconnect (bool) – Flag for connected atoms in ring.

  • sandwich_lig_atoms (list) – List of dictionaries with the connecting atoms of the sandwich ligands.

ligand_comp_org(init_mol, catoms_arr=None, flag_deleteH=True, flag_lbd=True, debug=False, depth=3, BondedOct=False, angle_ref=False)[source]

Get the ligand distortion by comparing each individual ligands in init_mol and opt_mol.

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.

  • flag_deleteH (bool, optional,) – Flag to delete Hs in ligand comparison. Default is True.

  • flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • depth (int, optional) – Depth for truncated molecule. Default is 3.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

Returns:

dict_lig_distort – Dictionary containing rmsd_max and atom_dist_max.

Return type:

dict

make_formula(latex=True)[source]

Get a chemical formula from the mol3D class instance.

Parameters:

latex (bool, optional) – Flag for if formula is going to go in a latex document. Default is True.

Returns:

retstr – Chemical formula

Return type:

str

match_lig_list(init_mol, catoms_arr=None, BondedOct=False, flag_lbd=True, debug=False, depth=3, check_whole=False, angle_ref=False)[source]

Match the ligands of mol and init_mol by calling ligand_breakdown.

Parameters:
  • init_mol (mol3D) – mol3D class instance of the initial geometry.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.

  • BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.

  • flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.

  • debug (bool, optional) – Flag for extra printout. Default is False.

  • depth (int, optional) – Depth for truncated molecule. Default is 3.

  • check_whole (bool, optional) – Flag for checking whole ligand.

  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

Returns:

  • liglist_shifted (list) – List of lists containing all ligands from optimized molecule.

  • liglist_init (list) – List of lists containing all ligands from initial molecule.

  • flag_match (bool) – A flag about whether the ligands of initial and optimized mol are exactly the same. There is a one to one mapping.

maxatomdist(mol2)[source]

Compute the max atom distance between two molecules. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

dist_max – Maximum atom distance between two structures.

Return type:

float

maxatomdist_nonH(mol2)[source]

Compute the max atom distance between two molecules, considering heavy atoms only. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

dist_max – Maximum atom distance between two structures.

Return type:

float

maxdist(mol)[source]

Measure the largest distance between atoms in two molecules.

Parameters:

mol (mol3D) – mol3D class instance of second molecule.

Returns:

maxd – Max distance between atoms of two molecules.

Return type:

float

meanabsdev(mol2)[source]

Compute the mean absolute deviation (MAD) between two molecules. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

dev – Mean absolute deviation between the two structures.

Return type:

float

mindist(mol)[source]

Measure the smallest distance between atoms in two molecules.

Parameters:

mol (mol3D) – mol3D class instance of second molecule.

Returns:

mind – Min distance between atoms of two molecules.

Return type:

float

mindistmol()[source]

Measure the smallest distance between atoms in a single molecule.

Returns:

mind – Min distance between atoms of two molecules.

Return type:

float

mindistnonH(mol)[source]

Measure the smallest distance between an atom and a non H atom in another molecule.

Parameters:

mol (mol3D) – mol3D class of second molecule.

Returns:

mind – Min distance between atoms of two molecules that are not Hs.

Return type:

float

mindisttopoint(point)[source]

Measure the smallest distance between an atom and a point.

Parameters:

point (list) – List of coordinates of reference point.

Returns:

mind – Min distance between atoms of two molecules.

Return type:

float

mol3D_to_networkx(get_symbols: bool = True, get_bond_order: bool = True, get_bond_distance: bool = False)[source]
mols_symbols()[source]

Store symbols and their frequencies in symbols_dict attributes.

molsize()[source]

Measure the size of the molecule, by quantifying the max distance between atoms and center of mass.

Returns:

maxd – Max distance between an atom and the center of mass.

Return type:

float

moments_of_inertia()[source]

Determines the moments of inertia for the object, in the specified coordinates (after centering about the center of mass).

Returns:

I – Moments of inertia tensor.

Return type:

np.array

new_writemol2(ignore_dummy_atoms=True, write_bond_orders=True, return_string=True, output_file=None)[source]

Generate a MOL2-format string or file from atomic coordinates and bonding data.

Parameters:
  • atom_coords (list or np.ndarray of shape (N, 3)) – A list or NumPy array of atomic coordinates. Each element is a 3D coordinate (x, y, z) for a single atom.

  • atom_elements (list of str) – A list of atomic element symbols (e.g., ‘C’, ‘N’, ‘O’, etc.), one for each atom in atom_coords. The list must be the same length as atom_coords.

  • bond_order_dict (dict) – A dictionary mapping tuples of atom indices (i, j) to bond orders. The bond order may be a string like ‘1’, ‘2’, ‘3’, ‘ar’, etc. Example: {(0, 1): ‘1’, (1, 2): ‘2’}

  • ignore_dummy_atoms (bool, optional (default=True)) – If True, atoms with element symbol ‘X’ will be ignored in both atoms and bonds.

  • write_bond_orders (bool, optional (default=True)) – If True, writes the actual bond orders from bond_order_dict. If False, all bonds are assigned order ‘1’.

  • return_string (bool, optional (default=True)) – If True, returns the MOL2 content as a string. If False, writes to output_file.

  • output_file (str or None, optional) – If return_string is False, this must be the path to the file to write.

Returns:

Returns the MOL2-format string if return_string is True, otherwise writes to file and returns None.

Return type:

str or None

Notes

  • Atoms are renumbered starting from 1.

  • Element-based labels (e.g., C1, C2) are assigned using counts per element.

  • Substructures are inferred using connected components in the bond graph.

  • Only bonds where both atoms are not dummy atoms are retained if ignore_dummy_atoms is True.

num_rings(index)[source]

Computes the number of simple rings an atom is in.

Parameters:

index (int) – The index of the atom in question. Zero-indexing, so the first atom has an index of zero.

Returns:

myNumRings – The number of rings the atom is in.

Return type:

int

oct_comp(angle_ref=False, catoms_arr=None, debug=False)[source]

Get the deviation of shape of the catoms from the desired shape, which is defined in angle_ref.

Parameters:
  • angle_ref (bool, optional) – Reference list of list for the expected angles (A-metal-B) of each connection atom.

  • catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input.

  • debug (bool, optional) – Flag for extra printout. Default is False.

Returns:

  • dict_catoms_shape (dict) – Dictionary of first coordination sphere shape measures.

  • catoms_arr (list) – Connection atom array.

overlapcheck(mol, silence=False)[source]

Measure the smallest distance between an atom and a point.

Parameters:
  • mol (mol3D) – mol3D class instance of second molecule.

  • silence (bool, optional) – Flag for extra output. Default is False.

Returns:

overlap – Flag for whether two molecules are overlapping.

Return type:

bool

populateBOMatrix(bonddict=False, set_bo_mat=False)[source]

Populate the bond order matrix using openbabel.

Parameters:
  • bonddict (bool) – Flag for if the obmol bond dictionary should be saved. Default is False.

  • set_bo_mat (bool) – Flag for whether self.bo_mat and self.graph should be set. Default is False.

Returns:

molBOMat – Numpy array for bond order matrix.

Return type:

np.array

populateBOMatrixAug()[source]

Populate the augmented bond order matrix using openbabel.

Parameters:

bonddict (bool) – Flag for if the obmol bond dictionary should be saved. Default is False.

Returns:

molBOMat – Numpy array for augmented bond order matrix.

Return type:

np.array

principal_moments_of_inertia(return_eigvecs=False)[source]

Returns the principal moments of inertia, and optionally the eigenvectors defining the principal axes.

Parameters:

return_eigvecs (bool) – Flag for if the matrices used to diagonalize I should be returned. Default is False.

Returns:

  • pmom (np.array) – 3x1 array of the principal moments of inertia, in the provided Cartesian frame.

  • eigvecs (np.array) – 3x3 array where each column is an eigenvector.

print_geo_dict()[source]

Print geometry check info after the check.

printxyz()[source]

Print XYZ info of mol3D class instance to stdout. To write to file (more common), use writexyz() instead.

read_bo_from_mol(molfile)[source]
read_bond_order(bofile)[source]

Get bond order information from file.

Parameters:

bofile (str) – Path to a bond order file.

read_charge(chargefile)[source]

Get charge information from file.

Parameters:

chargefile (str) – Path to a charge file.

read_smiles(smiles, ff='mmff94', steps=2500)[source]

Read a smiles string and convert it to a mol3D class instance.

Parameters:
  • smiles (str) – SMILES string to be interpreted by openbabel.

  • ff (str, optional) – Forcefield to be used by openbabel. Default is mmff94.

  • steps (int, optional) – Steps to be taken by forcefield. Default is 2500.

readfrommol(filename)[source]

Read mol into a mol3D class instance. Stores the bond orders and atom types.

Parameters:

filename (string) – String of path to MOL file. Path may be local or global.

readfrommol2(filename, readstring=False)[source]

Read mol2 into a mol3D class instance. Stores the bond orders and atom types (SYBYL).

Parameters:
  • filename (string) – String of path to MOL2 file. Path may be local or global. May be read in as a string.

  • readstring (bool) – Flag for deciding whether a string of mol2 file is being passed as the filename.

readfromstring(xyzstring)[source]

Read XYZ from string.

Parameters:

xyzstring (string) – String of XYZ file.

readfromtxt(txt)[source]

Read XYZ from textfile.

Parameters:

txt (list) – List of lists that comes as a result of readlines.

readfromxyz(filename: str, ligand_unique_id=False, read_final_optim_step=False, readstring=False)[source]

Read XYZ into a mol3D class instance.

Parameters:
  • filename (string) – String of path to XYZ file. Path may be local or global.

  • ligand_unique_id (string) – Unique identifier for a ligand. In MR diagnostics, we abstract the atom based graph to a ligand based graph. For ligands, they don’t have a natural name, so they are named with a UUID. Hard to attribute MR character to just atoms, so it is attributed ligands instead.

  • read_final_optim_step (boolean) – if there are multiple geometries in the xyz file (after an optimization run) use only the last one.

  • readstring (boolean) – Flag for deciding whether a string or xyz file is being passed as the filename.

reflect_coords(metal_coords, lig1_catom_coords, lig2_catom_coords, atoms_to_move)[source]

Helper function for flip_symmetry to calculate vectors, projections, and update coordinates.

Parameters:
  • metal_coords (np.array) – Coordinates of metal atom.

  • lig1_catom_coords (np.array) – Coordinates of coordinating atom of first ligand to be flipped.

  • lig2_catom_coords (np.array) – Coordinates of coordinating atom of second ligand to be flipped.

  • atoms_to_move (list) – List of atom indices to be moved.

Returns:

self – returns self, a mol3D object with flipped symmetry.

Return type:

mol3D

resetBondOBMol()[source]

Repopulates the bond order matrix via openbabel. Interprets bond order matrix.

returnxyz(no_tabs=False)[source]

Print XYZ info of mol3D class instance to stdout. To write to file (more common), use writexyz() instead.

Parameters:

no_tabs (bool, optional) – Whether or not to use tabs in coordinate columns.

Returns:

ss – String of XYZ information from mol3D class.

Return type:

string

rmsd(mol2)[source]

Compute the RMSD between two molecules. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

rmsd – RMSD between the two structures.

Return type:

float

rmsd_nonH(mol2)[source]

Compute the RMSD between two molecules, considering heavy atoms only. Does not align molecules. For that, use geometry.kabsch().

Parameters:

mol2 (mol3D) – mol3D instance of second molecule.

Returns:

rmsd – RMSD between the two structures ignoring hydrogens.

Return type:

float

roland_combine(mol, catoms, bond_to_add=[], dirty=False)[source]

Combines two molecules. Each atom in the second molecule is appended to the first while preserving orders. Assumes operation with a given mol3D instance, when handed a second mol3D instance.

Parameters:
  • mol (mol3D) – mol3D class instance containing molecule to be added.

  • catoms (TODO) – TODO

  • bond_to_add (list, optional) – List of tuples (ind1,ind2,order) bonds to add. Default is empty.

  • dirty (bool, optional) – Add atoms without worrying about bond orders. Default is False.

Returns:

cmol – New mol3D class containing the two molecules combined.

Return type:

mol3D

sanitycheck(silence=False, debug=False)[source]

Sanity check a molecule for overlap within the molecule.

Parameters:
  • silence (bool) – Flag for printing warnings. Default is False.

  • debug (bool) – Flag for extra printout. Default is False.

Returns:

  • overlap (bool) – Flag for whether two structures overlap. True for overlap.

  • mind (float) – Minimum distance between atoms in molecule.

sanitycheckCSD(oct=False, angle1=30, angle2=80, angle3=45, debug=False, metals=None)[source]

Sanity check a CSD molecule. Check that the molecule passes basic angle tests in line with CSD pulls.

Parameters:
  • oct (bool, optional) – Flag for octahedral test. Default is False.

  • angle1 (float, optional) – Metal angle cutoff. Default is 30.

  • angle2 (float, optional) – Organic angle cutoff. Default is 80.

  • angle3 (float, optional) – Metal/organic angle cutoff e.g. M-X-X angle. Default is 45.

  • debug (bool, optional) – Extra print out desired. Default is False.

  • metals (Nonetype, optional) – Check for metals. Default is None.

Returns:

  • sane (bool) – Whether or not molecule is a sane molecule.

  • error_dict (dict) – Returned if debug, {bondidists and angles breaking constraints:values}

setLoc(loc)[source]

Sets the conformation of an amino acid in the chain of a protein.

Parameters:

loc (str) – A one-character string representing the conformation.

symvect()[source]

Method to obtain array of symbol vector of molecule.

Returns:

symbol_vector – 1 dimensional numpy array of atom symbols. (N,) dimension, N is number of atoms.

Return type:

np.array

translate(dxyz)[source]

Translate all atoms by a given vector.

Parameters:

dxyz (list) – Vector to translate all molecules, as a list [dx, dy, dz].

typevect()[source]

Method to obtain array of type vector of molecule.

Returns:

type_vector – 1 dimensional numpy array of atom types (by name). (N,) dimension, N is number of atoms.

Return type:

np.array

writegxyz(filename)[source]

Write GAMESS XYZ file.

Parameters:

filename (str) – Path to XYZ file.

writemol(filename)[source]

Write mol file from mol3D object. Not advised if molecule has > 99 atoms. If there is no bond order information available, all bonds will be set as single bonds.

Parameters:

filename (str) – Path to mol file.

writemol2(filename, writestring=False, ignoreX=False, force=False)[source]

Write mol2 file from mol3D object. Partial charges are appended if given. Else, total charge of the complex (given or interpreted by OBMol) is assigned to the metal.

Parameters:
  • filename (str) – Path to mol2 file.

  • writestring (bool, optional) – Flag to write to a string if True or file if False. Default is False.

  • ignoreX (bool, optional) – Flag to delete atom X. Default is False.

  • force (bool, optional) – Flag to dictate if bond orders are written (obmol/assigned) or =1.

writemol2_bodict(ignore_dummy_atoms=True, write_bond_orders=True, return_string=True, output_file=None)[source]

Generate a MOL2-format string or file from atomic coordinates and bonding data.

Parameters:
  • atom_coords (list or np.ndarray of shape (N, 3)) – A list or NumPy array of atomic coordinates. Each element is a 3D coordinate (x, y, z) for a single atom.

  • atom_elements (list of str) – A list of atomic element symbols (e.g., ‘C’, ‘N’, ‘O’, etc.), one for each atom in atom_coords. The list must be the same length as atom_coords.

  • bond_order_dict (dict) – A dictionary mapping tuples of atom indices (i, j) to bond orders. The bond order may be a string like ‘1’, ‘2’, ‘3’, ‘ar’, etc. Example: {(0, 1): ‘1’, (1, 2): ‘2’}

  • ignore_dummy_atoms (bool, optional (default=True)) – If True, atoms with element symbol ‘X’ will be ignored in both atoms and bonds.

  • write_bond_orders (bool, optional (default=True)) – If True, writes the actual bond orders from bond_order_dict. If False, all bonds are assigned order ‘1’.

  • return_string (bool, optional (default=True)) – If True, returns the MOL2 content as a string. If False, writes to output_file.

  • output_file (str or None, optional) – If return_string is False, this must be the path to the file to write.

Returns:

Returns the MOL2-format string if return_string is True, otherwise writes to file and returns None.

Return type:

str or None

Notes

  • Atoms are renumbered starting from 1.

  • Element-based labels (e.g., C1, C2) are assigned using counts per element.

  • Substructures are inferred using connected components in the bond graph.

  • Only bonds where both atoms are not dummy atoms are retained if ignore_dummy_atoms is True.

writemxyz(mol, filename, no_tabs=False)[source]

Write standard XYZ file with two molecules.

Parameters:
  • mol (mol3D) – mol3D instance of second molecule.

  • filename (str) – Path to XYZ file.

  • no_tabs (bool, optional) – Whether or not to use tabs in coordinate columns.

writenumberedxyz(filename)[source]

Write standard XYZ file with numbers instead of symbols.

Parameters:

filename (str) – Path to XYZ file.

writesepxyz(mol, filename)[source]

Write standard XYZ file with two molecules separated.

Parameters:
  • mol (mol3D) – mol3D instance of second molecule.

  • filename (str) – Path to XYZ file.

writexyz(filename, symbsonly=True, ignoreX=False, ordering=False, writestring=False, withgraph=False, specialheader=False, no_tabs=False)[source]

Write standard XYZ file.

Parameters:
  • filename (str) – Path to XYZ file.

  • symbsonly (bool, optional) – Only write symbols to file. Default is True.

  • ignoreX (bool, optional) – Ignore X element when writing. Default is False.

  • ordering (bool, optional) – If handed a list, will order atoms in a specific order. Default is False.

  • writestring (bool, optional) – Flag to write to a string if True or file if False. Default is False.

  • withgraph (bool, optional) – Flag to write with graph (after XYZ) if True. Default is False. If True, sparse graph written. All bonds indicated as single.

  • specialheader (str, optional) – String to write information into header. Default is False. If True, a special string is written.

  • no_tabs (bool, optional) – Whether or not to use tabs in coordinate columns.

Returns:

ss – XYZ contents, if writestring is True.

Return type:

str

Mol2D Class

class molSimplify.Classes.mol2D.Mol2D(incoming_graph_data=None, **attr)[source]

Bases: Graph

denticity_hapticity(catoms)[source]

Get denticity and hapticity from molecular graph and known coordinating atoms Number of coordinating atoms = denticity * hapticity

Parameters:

catoms (list) – List of coordinating atom indices

Returns:

  • denticity (int) – Number of independent coordination paths in graph

  • hapticity (list) – Length of each separate coordination path in graph

find_metal(transition_metals_only: bool = True) List[int][source]

Find indices of metal(s) in a Mol2D class.

Parameters:

transition_metals_only (bool, optional) – Only find transition metals. Default is true.

Returns:

metal_list – List of indices of metal atoms in Mol2D.

Return type:

list

Examples

Build Vanadyl acetylacetonate from SMILES:

>>> mol = Mol2D.from_smiles("CC(=[O+]1)C=C(C)O[V-3]12(#[O+])OC(C)=CC(C)=[O+]2")
>>> mol.find_metal()
[7]
find_simple_paths(source, sink, cutoff=None, constraints=None)[source]

Find simple (i.e., no repeated nodes) path(s) between source and sink nodes in Mol2D class.

Parameters:
  • source (int) – Index of source node

  • sink (int) – Index of sink node

  • cutoff (int) – Depth at which to stop path search Default=None

  • constraints (list) – Nodes which may not be crossed during path Default=None

Returns:

simple_paths – List of lists of simple paths

Return type:

list

classmethod from_mol2_file(filename)[source]
classmethod from_mol3d(mol3d: mol3D)[source]
classmethod from_mol_file(filename)[source]
classmethod from_smiles(smiles: str)[source]

Create a Mol2D object from a SMILES string.

Parameters:

smiles (str) – SMILES representation of the molecule.

Returns:

Mol2D object of the molecule

Return type:

Mol2D

Examples

Create a furan molecule from SMILES:

>>> mol = Mol2D.from_smiles("o1cccc1")
>>> mol
Mol2D(O1C4H4)
graph_determinant(return_string: bool = True) str | float[source]

Calculates the molecular graph determinant.

Parameters:

return_string (bool, optional) – Flag to return the determinant as a string. Default is True.

Returns:

graph determinant

Return type:

str | float

Examples

Create a furan molecule from SMILES:

>>> mol = Mol2D.from_smiles("o1cccc1")

and calculate the graph determinant:

>>> mol.graph_determinant()
'-19404698740'
graph_hash() str[source]

Calculates the node attributed graph hash of the molecule.

Returns:

node attributed graph hash

Return type:

str

Examples

Create a furan molecule from SMILES:

>>> mol = Mol2D.from_smiles("o1cccc1")

and calculate the node attributed graph hash:

>>> mol.graph_hash()
'8366132e88f24330fedbbf24367877f7'
graph_hash_edge_attr() str[source]

Calculates the edge attributed graph hash of the molecule.

Returns:

edge attributed graph hash

Return type:

str

Examples

Create a furan molecule from SMILES:

>>> mol = Mol2D.from_smiles("o1cccc1")

and calculate the edge attributed graph hash:

>>> mol.graph_hash_edge_attr()
'b9aa3fc505879a7a2a9a1789aee922f5'

monomer3D Class

class molSimplify.Classes.monomer3D.monomer3D(three_lc='GLY', chain='undef', id=-1, occup=1.0, loc='')[source]

Bases: object

Holds information about a monomer, used to do manipulations. Reads information from structure file (pdb) or is directly built from molsimplify.

addAtom(atom, index=None)[source]

Adds an atom to the atoms attribute, which contains a list of atom3D class instances.

Parameters:
  • atom (atom3D) – atom3D class instance of added atom.

  • index (int, optional) – Index of added atom. Default is None.

  • auto_populate_BO_dict (bool, optional) – Populate bond order dictionary with newly added atom. Default is True.

centermass()[source]

Computes coordinates of center of mass of monomer. :returns: center_of_mass – Coordinates of center of mass. List of length 3: (X, Y, Z). :rtype: list

centroid()[source]

Computes coordinates of centroid of monomer. :returns: centroid – Coordinates of centroid. List of length 3: (X, Y, Z). :rtype: list

coords()[source]

Method to obtain string of coordinates in monomer.

Returns:

coord_string – String of molecular coordinates with atom identities in XYZ format.

Return type:

string

getGreek(greek)[source]

Finds the Greek lettered carbon(s) or other atom(s) of the user’s choice.

Parameters:

greek (string) – The Greek lettered atom (e.g. alpha carbon) we want. Inputs should be form ‘CA’ or similar.

Returns:

greek_atoms – A list of atom3D class objects that contains the Greek lettered atom(s) we want.

Return type:

list of atom3Ds

getPeptideAtoms()[source]

Makes the atoms involved in peptide bonding attributes.

identify()[source]

States whether the amino acid is (positively/negatively) charged, polar, or hydrophobic.

Returns:

aa_type – Positively charged, Negatively charged, Polar, Hydrophobic

Return type:

string

setBonds()[source]

Sets the bonds between atoms within the monomer.

setLoc(loc)[source]

Sets the conformation of a monomer in the chain of a protein.

Parameters:

loc (str) – a one-character string representing the conformation

setNext(next_aa)[source]

Sets the next monomer in the chain of a protein (if applicable).

Parameters:

next_aa (monomer3D) – the monomer that is next in the protein chain

setPrev(prev_aa)[source]

Sets the previous monomer in the chain of a protein (if applicable).

Parameters:

prev_aa (monomer3D) – the monomer that comes before self in the protein chain

protein3D Class

class molSimplify.Classes.protein3D.protein3D(pdbCode='undef')[source]

Bases: object

Holds information about a protein, used to do manipulations. Reads information from structure file (pdb, cif) or is directly built from molsimplify.

autoChooseConf()[source]

Automatically choose the conformation of a protein3D class instance based first on what the greatest occupancy level is and then the first conformation ihe alphabet with all else equal.

centermass()[source]

Computes coordinates of center of mass of protein.

convexhull()[source]

Computes convex hull of protein.

Returns:

hull – Coordinates of convex hull.

Return type:

array

countAAs()[source]

Return the number of amino acid residues in a protein3D class.

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.countAAs() # This return the number of AAs in the PDB for all the chains.
1121
fetch_pdb(pdbCode)[source]

API query to fetch a pdb and write it as a protein3D class instance

Parameters:

pdbCode (str) – Code for protein, e.g. 1os7

findAA(three_lc='XAA')[source]

Find amino acids with a specific three-letter code.

Parameters:

three_lc (str) – three-letter code, default as XAA.

Returns:

inds – a set of amino acid indices with the specified symbol.

Return type:

set

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7

Return a set of pairs where each pair is a combination of the chain name and the index of the amino acid specified (in this case, ‘MET’) >>> aa_set = pdb_system.findAA(three_lc = ‘MET’) >>> sorted(aa_set) # Sorting for reproducible order in doctest [(‘A’, 268), (‘B’, 268), (‘C’, 268), (‘D’, 268)]

findAtom(sym='X', aa=True)[source]

Find atoms with a specific symbol that are contained in amino acids or heteromolecules.

Parameters:
  • sym (str) – element symbol, default as X.

  • aa (boolean) – True if we want atoms contained in amino acids False if we want atoms contained in heteromolecules

Returns:

inds – a list of atom indices with the specified symbol.

Return type:

list

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.findAtom(sym="S", aa=True) # Returns indices of sulphur atoms present in amino acids
[2166, 4442, 6733, 9041]
>>> pdb_system.findAtom(sym="S", aa=False) # Returns indices of sulphur atoms present in heteromolecules
[9164, 9182, 9200]
findMetal(transition_metals_only=True)[source]

Find metal(s) in a protein3D class.

Parameters:

transition_metals_only (bool, optional) – Only find transition metals. Default is true.

Returns:

metal_list – List of indices of metal atoms in protein3D.

Return type:

list

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7')
fetched: 1os7
>>> pdb_system.findMetal()
[9160, 9178, 9196, 9214]
freezeatom(atomIdx)[source]

Set the freeze attribute to be true for a given atom3D class.

Parameters:

atomIdx (int) – Index for atom to be frozen.

freezeatoms(Alist)[source]

Set the freeze attribute to be true for a given set of atom3D classes, given their indices. Preserves ordering, starts from largest index.

Parameters:

Alist (list) – List of indices for atom3D instances to remove.

getAtom(idx)[source]

Get atom with a given index.

Parameters:

idx (int) – Index of desired atom.

Returns:

atom – atom3D class for element at given index.

Return type:

atom3D

getBoundMols(h_id, aas_only=False)[source]

Get a list of molecules bound to a heteroatom, usually a metal.

Parameters:
  • h_id (int) – The index of the desired (hetero)atom origin

  • aas_only (boolean) – Whether or not to only consider amino acids, defaults False

Returns:

bound_mols – List of monomer3D and/or mol3D instances of molecules bound to hetatm

Return type:

list

getChain(chain_id)[source]

Takes a chain of interest and turns it into its own protein3D class instance.

Parameters:

chain_id (string) – The letter name of the chain of interest

Returns:

p – A protein3D instance consisting of just the chain of interest

Return type:

protein3D

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.getChain('A') 
getIndex(atom)[source]

Get index of a given atom

Parameters:

atom (atom3D) – atom3D class for element at given index.

Returns:

idx – Index of desired atom.

Return type:

int

getMissingAAs()[source]

Get missing amino acid residues of a protein3D class.

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1MH1') # Fetch a PDB
fetched: 1MH1
>>> pdb_system.getMissingAAs()   # This gives a list of monomer3D objects
[monomer3D(VAL, id=182), monomer3D(LYS, id=183), monomer3D(LYS, id=184)]
getMissingAtoms()[source]

Get missing atoms of a protein3D class.

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1MH1') # Fetch a PDB
fetched: 1MH1
>>> missing_atoms = pdb_system.getMissingAtoms()

List atoms in the first set of missing_atoms >>> [atom.sym for atom in list(missing_atoms)[0]] [‘C’, ‘C’, ‘C’, ‘C’, ‘C’, ‘C’, ‘O’]

getMolecule(a_id, aas_only=False)[source]

Finds the molecule that the atom is contained in.

Parameters:
  • a_id (int) – The index of the desired atom whose molecule we want to find

  • aas_only (boolean) – True if we want ito find atoms contained in amino acids only. False if we want atoms contained in all molecules. Default is False.

Returns:

mol – The amino acid residue, nucleotide, or heteromolecule containing the atom

Return type:

monomer3D or mol3D

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7

This returns an molSimplify.Classes.monomer3D object indicating that the atom is part of an amino acid or nucleotide: >>> pdb_system.getMolecule(a_id=2166) monomer3D(MET, id=268)

This returns a mol3D object indicating that the atom is part of a molecule that is not an amino acid or nucleotide >>> pdb_system.getMolecule(a_id=9164) mol3D(S1O3N1C2) >>> pdb_system.getMolecule(a_id=9164).name # This prints the name of the molecule, in this case, it is ‘TAU’ ‘TAU’

readMetaData()[source]

API query to fetch XML data from a pdb and add its useful attributes to a protein3D class.

Parameters:

pdbCode (str) – Code for protein, e.g. 1os7

readfrompdb(text)[source]

Read PDB into a protein3D class instance.

Parameters:

text (str) – String of path to PDB file. Path may be local or global. May also be the text of a PDB file from the internet.

setAAs(aas)[source]

Set monomers of a protein3D class to different monomers.

Parameters:

aas (dictionary) – Keyed by chain and location Valued by monomer3D monomers (amino acids or nucleotides)

setAtoms(atoms)[source]

Set atom indices of a protein3D class to atoms.

Parameters:

atoms (dictionary) – Keyed by atom index Valued by atom3D atom that has that index

setBonds(bonds)[source]

Sets the bonded atoms in the protein.

This is effectively the molecular graph.

Parameters:

bonds (dictionary) – Keyed by atom3D atoms in the protein Valued by a set consisting of bonded atoms

setCentroid()[source]

Computes coordinates of center of mass of protein.

setChains(chains)[source]

Set chains of a protein3D class to different chains.

Parameters:

chains (dictionary) – Keyed by desired chain IDs. Valued by the list of molecules in the chain.

setConf(conf)[source]

Set possible conformations of a protein3D class to a new list.

Parameters:

conf (list) – List of possible conformations for applicable amino acids.

setDataCompleteness(DataCompleteness)[source]

Set DataCompleteness value of protein3D class.

Parameters:

DataCompleteness (float) – The desired new R value.

setEDIAScores()[source]

Sets the EDIA score of a protein3D class.

Parameters:

pdbCode (string) – The 4-character code of the protein3D class.

setHetmols(hetmols)[source]

Set heteromolecules of a protein3D class to different ones.

Parameters:

hetmols (dictionary) – Keyed by chain and location Valued by mol3D heteromolecules

setIndices(a_ids)[source]

Set atom indices of a protein3D class to atoms.

Parameters:

a_ids (dictionary) – Keyed by atom3D atom Valued by its index

setMissingAAs(missing_aas)[source]

Set missing amino acids of a protein3D class to a new list.

Parameters:

missing_aas (list) – List of missing amino acids.

setMissingAtoms(missing_atoms)[source]

Set missing atoms of a protein3D class to a new dictionary.

Parameters:

missing_atoms (dictionary) – Keyed by amino acid residues / nucleotides of origin Valued by missing atoms

setPDBCode(pdbCode)[source]

Sets the 4-letter PDB code of a protein3D class instance

Parameters:

pdbCode (string) – Desired 4-letter PDB code

setR(R)[source]

Set R value of protein3D class.

Parameters:

R (float) – The desired new R value.

setRSRZ(RSRZ)[source]

Set RSRZ score of protein3D class.

Parameters:

RSRZ (float) – The desired new RSRZ score.

setRfree(Rfree)[source]

Set Rfree value of protein3D class.

Parameters:

Rfree (float) – The desired new Rfree value.

setTwinL(TwinL)[source]

Set TwinL score of protein3D class.

Parameters:

TwinL (float) – The desired new TwinL score.

setTwinL2(TwinL2)[source]

Set TwinL squared score of protein3D class.

Parameters:

TwinL2 (float) – The desired new TwinL squared score.

stripAtoms(atoms_stripped)[source]

Removes certain atoms from the protein3D class instance.

Parameters:

atoms_stripped (list) – List of atom3D indices that should be removed

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.stripAtoms([2166, 4442, 6733, 2165]) # This removes the list of atoms with
>>>                                                # indices listedin the code
stripHetMol(hetmol)[source]
Removes all heteroatoms part of the specified heteromolecule from

the protein3D class instance.

Parameters:

hetmol (str) – String representing the name of a heteromolecule whose heteroatoms should be stripped from the protein3D class instance

Examples

>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('3I40') # Fetch a PDB
fetched: 3I40
>>> pdb_system.stripHetMol('HOH')

globalvars

molSimplify.Classes.globalvars.get_version()[source]
class molSimplify.Classes.globalvars.globalvars(*args, **kwargs)[source]

Bases: object

Globalvars class. Defines global variables used throughout the code, including periodic table.

add_custom_path(path)[source]

Record custom path in ~/.molSimplify file

Parameters:

path (str) – Path to custom data ~/.molSimplify file.

amass() Dict[str, Tuple[float, int, float, int]][source]

Get the atomic mass dictionary.

Returns:

amassdict – Dictionary containing atomic masses.

Return type:

dict

bbcombs_mononuc()[source]

Get backbone combinations dictionary

Returns:

bbcombs_mononuc – Backbone combination dictionary for different geometries.

Return type:

dict

bondsdict()[source]

Get the bond dictionary.

Returns:

bondsdict – Dictionary containing bond lengths.

Return type:

dict

elementsbynum()[source]

Returns list of elements by number

Returns:

elementsbynum – List of elements by number

Return type:

list

endict()[source]

Returns electronegativity dictionary.

Returns:

endict – Electronegativity dictionary

Return type:

list

geo_check_dictionary()[source]

Returns list of geo check objects dictionary.

Returns:

geo_check_dictionary – Geo check measurement dictionary.

Return type:

dict

getAllAAs()[source]

Gets all amino acids

Returns:

amino_acids – Dictionary of standard amino acids

Return type:

dictionary

get_all_angle_refs()[source]

Get references angle dict.

Returns:

all_angle_refs – Reference angles for various geometries.

Return type:

dict

get_all_geometries()[source]

Get available geometries.

Returns:

all_geometries – All available geometries.

Return type:

list

get_all_polyhedra()[source]

Get reference polyhedra dict.

Returns:

all_polyhedra – Reference polyhedra for various geometries.

Return type:

dict

groups()[source]

Returns dict of elements by groups.

Returns:

groups_dict – Groups dictionary.

Return type:

dict

metalslist(transition_metals_only=True, include_X=False)[source]

Get the metals list.

Parameters:
  • transition_metals_only (bool, optional) – Identify only transition metals. Default is true.

  • include_X (bool, optional) – Whether “X” atoms are considered metals. Default is False.

Returns:

complete_metals_list – List of available metals.

Return type:

list

periods()[source]

Returns dict of elements by periods.

Returns:

periods_dict – Periods dictionary.

Return type:

dict

polarizability() Dict[str, float][source]

Get the polarizability dictionary.

Returns:

poldict – Dictionary containing polarizabilities.

Return type:

dict

testTF()[source]

Tests to see whether keras and tensorflow are available.

Returns:

tf_flag – True if tensorflow and keras are available.

Return type:

bool

testmatplotlib()[source]

Tests to see if matplotlib is available

Returns:

mpl_flag – True if matplotlib is available

Return type:

bool

tribonddict()[source]

Get the triple bond dictionary.

Returns:

tribonddict – Dictionary containing triple bond lengths.

Return type:

dict

vdwrad()[source]

Returns VDW dictionary.

Returns:

vdwrad – Dictionary of VDW radii.

Return type:

list

molSimplify.Classes.globalvars.mybash(cmd)[source]

Function to run a bash command.

Parameters:

cmd (str) – String containing command to be run

Returns:

output – bash output string

Return type:

str

rundiag

class molSimplify.Classes.rundiag.run_diag[source]

Bases: object

Class of run diagnostic information to automated decision making and property prediction

set_ANN(ANN_flag, ANN_reason=False, ANN_dict=False, catalysis_flag=False, catalysis_reason=False)[source]

Set the ANN properties.

Parameters:
  • ANN_flag (bool) – Flag for whether the ANN variables exist.

  • ANN_reason (str, optional) – Reasoning for why ANN failed if failed. Default is False.

  • ANN_dict (dict, optional) – Dictionary with ANN values and uncertainty.

  • catalysis_flag (bool, optional) – Whether or not catalytic properties are set.

  • catalysis_reason (str, optional) – Reasoning for why catalytic ANN failed if failed. Default is False.

set_dict_bl(dict_bl)[source]

Set the ANN properties.

Parameters:

dict_bl (dict, optional) – Dictionary with ANN bond lengths.

set_mol(mol)[source]

Set the ANN molecule.

Parameters:

mol (mol3D) – mol3D class instance for optimized molecule.

set_sanity(sanity, min_distance)[source]

Set the sanity of predictions.

Parameters:
  • sanity (bool) – Flag for whether the ANN variables exist.

  • min_distance (float) – Minimum distance to train

write_report(path)[source]

Write report of molecule with ANN properties.

Parameters:

path (str) – Path for location to write the report.