Classes
atom3D Class
- class molSimplify.Classes.atom3D.atom3D(Sym: str = 'C', xyz: List[float] | None = None, name: str | None = None, partialcharge: float | None = None, Tfactor=0, greek='', occup=1.0, loc='', line='')[source]
Bases: object
atom3D class. Base class in molSimplify for representing an element.
- Parameters:
Sym (str, optional) – Symbol for atom3D instantiation. Element symbol. Default is ‘C’.
xyz (list, optional) – List of coordinates for new atom. Default is [0.0, 0.0, 0.0]. Units of angstroms.
name (str, optional) – Unique identifier for atom3D instantiation. Default is None.
partialcharge (float, optional) – Charge assigned to atom when added to mol. Default is None.
- coords()[source]
Get coordinates of a given atom.
- Returns:
coords – List of coordinates in X, Y, Z format. Units of angstroms.
- Return type:
- ismetal(transition_metals_only=True, include_X=False) bool[source]
Identify whether an atom is a metal.
- mutate(newType='C')[source]
Mutate an element to another element in the atom3D.
- Parameters:
newType (str, optional) – Element name for new element. Default is ‘C’.
- setEDIA(score)[source]
Sets the EDIA score of an individual atom3D.
- Parameters:
score (float) – Desired EDIA score of atom
- setcoords(xyz)[source]
Set coordinates of an atom3D class to a new location.
- Parameters:
xyz (list) – List of coordinates of length 3: [X, Y, Z]. Units of angstroms.
mol3D Class
- class molSimplify.Classes.mol3D.mol3D(name='ABC', loc='', use_atom_specific_cutoffs=False)[source]
Bases: object
Holds information about a molecule, used to do manipulations. Reads information from a structure file (XYZ, mol2) or is built directly from molSimplify. Please be cautious with periodic systems.
Example instantiation of an octahedral iron-ammonia complex from an XYZ file:
>>> complex_mol = mol3D()
>>> complex_mol.readfromxyz('fe_nh3_6.xyz')
- ACM(idx1, idx2, idx3, angle)[source]
Performs angular movement on a mol3D object. A submolecule is rotated about idx2. Operates directly on object.
Note: Function is sometimes unreliable in non-simple cases.
- ACM_axis(idx1, idx2, axis, angle)[source]
Performs angular movement about an axis on a mol3D object. A submolecule is rotated about idx2. Operates directly on object.
- BCM(idx1, idx2, d)[source]
Performs bond centric manipulation (same as Avogadro, stretching and squeezing bonds). A submolecule is translated along the bond axis connecting it to an anchor atom.
Illustration: H3A-BH3 -> H3A----BH3, where B = idx1 and A = idx2
- Parameters:
>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('H', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))
>>> complex_mol.BCM(1, 0, 0.7)  # Set the distance between atoms 0 and 1 to 0.7 angstroms. Moves atom 1.
>>> complex_mol.coordsvect()
array([[0. , 0. , 0. ],
       [0. , 0. , 0.7]])
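The geometric idea behind a bond centric manipulation can be sketched without molSimplify. The snippet below is an illustration only, not the library's implementation: `stretch_bond` is a hypothetical helper that moves a single point, whereas BCM translates a whole submolecule.

```python
import math


def stretch_bond(moving, anchor, d):
    """Return `moving` translated along the anchor->moving axis so that
    its distance from `anchor` equals d (angstroms)."""
    axis = [m - a for m, a in zip(moving, anchor)]
    length = math.sqrt(sum(c * c for c in axis))
    if length == 0.0:
        raise ValueError("atoms coincide; the bond axis is undefined")
    scale = d / length
    return [a + c * scale for a, c in zip(anchor, axis)]


anchor = [0.0, 0.0, 0.0]  # plays the role of idx2 (stays fixed)
moving = [0.0, 0.0, 1.0]  # plays the role of idx1 (is moved)
new_pos = stretch_bond(moving, anchor, 0.7)
print(new_pos)
```

With the two hydrogen positions from the example above, the moved atom ends up 0.7 angstroms from the anchor along the original bond axis.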
- BCM_opt(idx1, idx2, d, ff='uff')[source]
Performs bond centric manipulation (same as Avogadro, stretching and squeezing bonds). A submolecule is translated along the bond axis connecting it to an anchor atom. Performs force field optimization after, freezing the moved bond length.
Illustration: H3A-BH3 -> H3A----BH3, where B = idx1 and A = idx2
- IsOct(init_mol=None, dict_check=False, angle_ref=False, flag_catoms=False, catoms_arr=None, debug=False, flag_lbd=True, BondedOct=True, skip=False, flag_deleteH=True, silent=False, use_atom_specific_cutoffs=True)[source]
Main geometry check method for octahedral structures.
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
dict_check (dict, optional) – Cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
flag_catoms (bool, optional) – Whether or not to return the catoms_arr. Default is False.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.
debug (bool, optional) – Flag for extra printout. Default is False.
flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assumes the same indexing as the initial geometry. Default is True.
BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is True.
skip (list, optional) – Geometry checks to skip. Default is False.
flag_deleteH (bool, optional) – Flag to delete Hs in ligand comparison. Default is True.
silent (bool, optional) – Flag for warning suppression. Default is False.
use_atom_specific_cutoffs (bool, optional) – Determine bonding with atom specific cutoffs.
- Returns:
flag_oct (int) – Good (1) or bad (0) structure.
flag_list (list) – Metrics that are preventing a good geometry.
dict_oct_info (dict) – Dictionary of measurements of geometry.
- IsStructure(init_mol=None, dict_check=False, angle_ref=False, num_coord=5, flag_catoms=False, catoms_arr=None, debug=False, skip=False, flag_deleteH=True)[source]
Main geometry check method for square pyramidal structures.
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
dict_check (dict, optional) – Cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
num_coord (int, optional) – The metal coordination number.
flag_catoms (bool, optional) – Whether or not to return the catoms_arr. Default is False.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.
debug (bool, optional) – Flag for extra printout. Default is False.
skip (list, optional) – Geometry checks to skip. Default is False.
flag_deleteH (bool, optional) – Flag to delete Hs in ligand comparison. Default is True.
- Returns:
flag_oct (int) – Good (1) or bad (0) structure.
flag_list (list) – Metrics that are preventing a good geometry.
dict_oct_info (dict) – Dictionary of measurements of geometry.
- Oct_inspection(init_mol=None, catoms_arr=None, dict_check=False, angle_ref=False, flag_lbd=False, dict_check_loose=False, BondedOct=True, debug=False)[source]
Used to track down the changing geo_check metrics in a DFT geometry optimization. Catoms_arr always specified.
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.
dict_check (dict, optional) – Cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assumes the same indexing as the initial geometry. Default is False.
dict_check_loose (dict, optional) – Dictionary of geo check metrics, if a dictionary other than the default one from globalvars is desired.
BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is True.
debug (bool, optional) – Flag for extra printout. Default is False.
- Returns:
flag_oct (int) – Good (1) or bad (0) structure.
flag_list (list) – Metrics that are preventing a good geometry.
dict_oct_info (dict) – Dictionary of measurements of geometry.
flag_oct_loose (int) – Good (1) or bad (0) structures with loose cutoffs.
flag_list_loose (list) – Metrics that are preventing a good geometry with loose cutoffs.
- RCAngle(idx1, idx2, idx3, anglei, anglef, angleint=1.0, writegeo=False, dir_name='rc_angle_geometries')[source]
Generates geometries along a given angle reaction coordinate. In the given molecule, idx1 is rotated about idx2 with respect to idx3. Operates directly on object.
- Parameters:
idx1 (int) – Index of bonded atom containing submolecule to be moved.
idx2 (int) – Index of anchor atom 1.
idx3 (int) – Index of anchor atom 2.
anglei (float) – New initial bond angle in degrees.
anglef (float) – New final bond angle in degrees.
angleint (float, optional) – Interval by which the angle is changed, in degrees. Default is 1.0.
writegeo (bool, optional) – If True, the generated geometries are written to a directory; if False, they are not. Default is False.
dir_name (str, optional) – Directory to which the generated reaction coordinate geometries are written if writegeo=True. Default is 'rc_angle_geometries'.
>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('O', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))
>>> complex_mol.addAtom(atom3D('H', [0, 1, 0]))
Generate reaction coordinate geometries using the given structure by changing the angle between atoms 2, 1, and 0 from 90 degrees to 160 degrees in intervals of 10 degrees:
>>> complex_mol.RCAngle(2, 1, 0, 90, 160, 10)
[mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2)]
Generate reaction coordinate geometries using the given structure by changing the angle between atoms 2, 1, and 0 from 160 degrees to 90 degrees in intervals of 10 degrees; the generated geometries are not written to a directory:
>>> complex_mol.RCAngle(2, 1, 0, 160, 90, -10)
[mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2)]
- RCDistance(idx1, idx2, disti, distf, distint=0.05, writegeo=False, dir_name='rc_distance_geometries')[source]
Generates geometries along a given distance reaction coordinate. In the given molecule, idx1 is moved with respect to idx2. Operates directly on object.
- Parameters:
idx1 (int) – Index of bonded atom containing submolecule to be moved.
idx2 (int) – Index of anchor atom 1.
disti (float) – New initial bond distance in angstrom.
distf (float) – New final bond distance in angstrom.
distint (float, optional) – Interval by which the distance is changed, in angstroms. Default is 0.05.
writegeo (bool, optional) – If True, the generated geometries are written to a directory; if False, they are not. Default is False.
dir_name (str, optional) – Directory to which the generated reaction coordinate geometries are written if writegeo=True. Default is 'rc_distance_geometries'.
>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('H', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))
Generate reaction coordinate geometries using the given structure by changing the distance between atoms 1 and 0 from 1.0 to 3.0 angstrom (atom 1 is moved) in intervals of 0.5 angstrom:
>>> complex_mol.RCDistance(1, 0, 1.0, 3.0, 0.5)
[mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2)]
Generate reaction coordinate geometries using the given structure by changing the distance between atoms 1 and 0 from 3.0 to 1.0 angstrom (atom 1 is moved) in intervals of 0.25 angstrom; the generated geometries are not written to a directory:
>>> complex_mol.RCDistance(1, 0, 3.0, 1.0, -0.25)
[mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2)]
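The number of geometries returned in the RCAngle and RCDistance examples is consistent with a scan that includes both endpoints. The following sketch reproduces that count under that assumption. `scan_values` is a hypothetical helper, not part of molSimplify, and the library's own stepping logic may differ.

```python
def scan_values(start, stop, step):
    """Return evenly spaced values from start to stop, both inclusive."""
    if step == 0:
        raise ValueError("step must be nonzero")
    if (stop - start) * step < 0:
        raise ValueError("step points away from stop")
    # Round the step count so floating-point noise does not drop the endpoint.
    n_steps = int(round((stop - start) / step))
    return [round(start + i * step, 10) for i in range(n_steps + 1)]


forward = scan_values(1.0, 3.0, 0.5)     # distances for RCDistance(1, 0, 1.0, 3.0, 0.5)
backward = scan_values(3.0, 1.0, -0.25)  # distances for RCDistance(1, 0, 3.0, 1.0, -0.25)
angles = scan_values(90, 160, 10)        # angles for RCAngle(2, 1, 0, 90, 160, 10)

print(len(forward), forward)
print(len(backward), backward)
print(len(angles), angles)
```

Note that a decreasing scan needs a negative interval, as in the second example of each method.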
- Structure_inspection(init_mol=None, catoms_arr=None, num_coord=5, dict_check=False, angle_ref=False, flag_lbd=False, dict_check_loose=False, BondedOct=True, debug=False)[source]
Used to track down the changing geo_check metrics in a DFT geometry optimization. Specifically for a square pyramidal structure. Catoms_arr always specified.
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.
num_coord (int, optional) – The metal coordination number.
dict_check (dict, optional) – Cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assumes the same indexing as the initial geometry. Default is False.
dict_check_loose (dict, optional) – Dictionary of geo check metrics, if a dictionary other than the default one from globalvars is desired.
BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is True.
debug (bool, optional) – Flag for extra printout. Default is False.
- Returns:
flag_oct (int) – Good (1) or bad (0) structure.
flag_list (list) – Metrics that are preventing a good geometry.
dict_oct_info (dict) – Dictionary of measurements of geometry.
flag_oct_loose (int) – Good (1) or bad (0) structures with loose cutoffs.
flag_list_loose (list) – Metrics that are preventing a good geometry with loose cutoffs.
- addAtom(atom: atom3D, index: int | None = None, auto_populate_bo_dict: bool = True)[source]
Adds an atom to the atoms attribute, which contains a list of atom3D class instances.
- Parameters:
>>> complex_mol = mol3D()
>>> C_atom = atom3D('C', [1, 1, 1])
>>> complex_mol.addAtom(C_atom)  # Add carbon atom at Cartesian position 1, 1, 1 to mol3D object.
- add_bond(idx1: int, idx2: int, bond_type: int) dict[source]
Add a bond of order bond_type between the atom at idx1 and the atom at idx2. Adjusts bo_dict, bo_mat, and graph only, not OBMol.
- alignmol(atom1, atom2)[source]
Aligns two molecules such that the coordinates of two atoms overlap. Second molecule is translated relative to the first. No rotations are performed. Use other functions for rotations. Moves the mol3D class.
- apply_ffopt(constraints=False, ff='uff')[source]
Apply forcefield optimization to a given mol3D class.
- apply_ffopt_list_constraints(list_constraints=False, ff='uff')[source]
Apply forcefield optimization to a given mol3D class. Differs from apply_ffopt in that one can specify constrained atoms as a list.
- aromatic_charge(bo_graph)[source]
Get the charge of aromatic rings based on 4*n+2 rule.
- Parameters:
bo_graph (numpy.array) – Bond order matrix.
- Returns:
aromatic_charge – The charge of the aromatic rings.
- Return type:
- assign_graph_from_net(path_to_net, return_graph=False)[source]
Uses a .net file to assign a graph (and return if needed).
- calcCharges(charge=0, method='QEq')[source]
Compute the partial charges of a molecule using openbabel.
- centermass()[source]
Computes coordinates of center of mass of molecule.
- Returns:
center_of_mass – Coordinates of center of mass. List of length 3: (X, Y, Z).
- Return type:
- centersym()[source]
Computes coordinates of center of symmetry of molecule. Identical to centermass, but not weighted by atomic masses.
- Returns:
center_of_symmetry – Coordinates of center of symmetry. List of length 3: (X, Y, Z).
- Return type:
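The difference between centermass and centersym comes down to whether each atom is weighted by its mass. A minimal sketch of both averages, assuming plain coordinate lists and a small illustrative mass table (neither function below is the molSimplify implementation):

```python
ATOMIC_MASS = {"H": 1.008, "O": 15.999}  # amu; illustrative subset


def center_of_symmetry(coords):
    """Unweighted mean of the atomic positions."""
    n = len(coords)
    return [sum(c[k] for c in coords) / n for k in range(3)]


def center_of_mass(symbols, coords):
    """Mass-weighted mean of the atomic positions."""
    masses = [ATOMIC_MASS[s] for s in symbols]
    total = sum(masses)
    return [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]


symbols = ["O", "H", "H"]
coords = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]

com = center_of_mass(symbols, coords)
cos = center_of_symmetry(coords)
print("center of mass:    ", com)
print("center of symmetry:", cos)
```

Because oxygen is much heavier than hydrogen, the center of mass sits closer to the oxygen atom than the unweighted center does.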
- check_angle_linear(catoms_arr=None)[source]
Get the ligand orientation for linear ligands.
- Parameters:
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.
- Returns:
dict_orientation – Dictionary containing average deviation from linearity (devi_linear_avrg) and max deviation (devi_linear_max).
- Return type:
- choose_atoms_to_move(ligands, swap_indices, catoms)[source]
Helper function for flip_symmetry to determine atoms to reflect
- closest_H_2_metal(delta=0)[source]
Get closest hydrogen atom to metal.
- Parameters:
delta (float) – Distance tolerance in angstrom.
- Returns:
flag (bool) – Flag for if a hydrogen exists in the distance tolerance.
min_dist_H (float) – Minimum distance for a hydrogen.
min_dist_nonH (float) – Minimum distance for a heavy atom.
- combine(mol, bond_to_add=[], dirty=False)[source]
Combines two molecules. Each atom in the second molecule is appended to the first while preserving orders. Assumes operation with a given mol3D instance, when handed a second mol3D instance.
- Parameters:
- Returns:
cmol – New mol3D class containing the two molecules combined.
- Return type:
- continuous_shape_measure(ideal_polyhedron)[source]
Return the continuous shape measure (CShM) for the first coordination shell (FCS), defined as:
min( sum_i^N (q_i - p_i)^2 / sum_i^N (q_i - q_0)^2 )
where q_i and p_i are the vertices of the polyhedron and of the reference, and q_0 is the center of geometry of the real structure. The minimization runs over the possible pairings of vertices and over rotations of the reference polyhedron (done with the Kabsch algorithm). Only works for single-metal-center TMCs, since translation is handled by centering on the metal. Scaling is handled by making the average bond lengths of the two structures equal. A value of 0 means a perfect match; the maximum is 100.
- Parameters:
ideal_polyhedron (np.array of 3-tuples of coordinates) – Reference list of points for an ideal geometry
- Returns:
min_cshm – Continuous Shape Measure between the geometry and ideal_polyhedron
- Return type:
- convert2OBMol(force_clean=False, ignoreX=False)[source]
Converts mol3D class instance to OBMol class instance. Stores as OBMol attribute. Necessary for force field optimizations and other openbabel operations.
- convert2OBMol2(ignoreX=False)[source]
Converts mol3D class instance to OBMol class instance, but uses mol2 function, so bond orders are not interpreted, but rather read through the mol2. Stores as OBMol attribute. Necessary for force field optimizations and other openbabel operations.
- Parameters:
ignoreX (bool, optional) – Ignore X element when writing. Default is False.
- convert2mol3D()[source]
Converts OBMol class instance to mol3D class instance. Generally used after openbabel operations, such as FF optimizing a molecule. Updates the mol3D as necessary.
- convexhull()[source]
Computes convex hull of molecule.
- Returns:
hull – Coordinates of convex hull.
- Return type:
array
- coords(no_tabs=False)[source]
Method to obtain string of coordinates in molecule.
- Parameters:
no_tabs (bool, optional) – Whether or not to use tabs in coordinate columns.
- Returns:
coord_string – String of molecular coordinates with atom identities in XYZ format.
- Return type:
string
- coordsvect()[source]
Method to obtain array of coordinates in molecule.
- Returns:
list_of_coordinates – Two dimensional numpy array of molecular coordinates. (N by 3) dimension, N is number of atoms.
- Return type:
np.array
- copymol3D(mol0)[source]
Copies properties and atoms of another existing mol3D object into current mol3D object. Should be performed on a new mol3D class instance. WARNING: NEVER EVER USE mol3D = mol0 to do this. It DOES NOT WORK. ONLY USE ON A FRESH INSTANCE OF MOL3D. Operates on fresh instance.
- Parameters:
mol0 (mol3D) – mol3D of molecule to be copied.
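One likely reading of the warning above is ordinary Python aliasing: plain assignment binds a second name to the same object instead of creating a copy. The sketch below shows the effect with a hypothetical stand-in class, not mol3D itself, and uses `copy.deepcopy` only to contrast with assignment; it says nothing about how copymol3D works internally.

```python
import copy


class Molecule:
    """Hypothetical stand-in for a container of atoms; not mol3D."""

    def __init__(self):
        self.atoms = []


original = Molecule()
original.atoms.append("Fe")

alias = original                       # second name for the same object
independent = copy.deepcopy(original)  # separate object with its own atom list

original.atoms.append("N")

print(alias is original)   # True: both names refer to one object
print(alias.atoms)         # the alias sees the later change
print(independent.atoms)   # the copy does not
```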
- count_atoms(exclude=['H', 'h', 'x', 'X'])[source]
Count the number of atoms, excluding certain atoms.
- Parameters:
exclude (list) – list of symbols for atoms to exclude.
- Returns:
count – the number of atoms, not counting the excluded symbols
- Return type:
integer
- count_electrons(charge=0)[source]
Count the number of electrons in a molecule.
- Parameters:
charge (int, optional) – Net charge of a molecule. Default is neutral.
- Returns:
count – The number of electrons in the system.
- Return type:
integer
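For a molecule, the electron count is the sum of the atomic numbers minus the net charge. A minimal sketch of that arithmetic, assuming a small hand-written table of atomic numbers (the function below is illustrative and operates on a list of symbols, not on a mol3D):

```python
ATOMIC_NUMBER = {"H": 1, "C": 6, "N": 7, "O": 8, "Fe": 26}  # illustrative subset


def count_electrons(symbols, charge=0):
    """Sum of atomic numbers minus the net charge."""
    return sum(ATOMIC_NUMBER[s] for s in symbols) - charge


water = ["O", "H", "H"]
hexaammine_iron = ["Fe"] + ["N"] * 6 + ["H"] * 18  # Fe(NH3)6

n_water = count_electrons(water)
n_complex = count_electrons(hexaammine_iron, charge=2)
print(n_water, n_complex)
```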
- count_nonH_atoms()[source]
Count the number of heavy atoms.
- Returns:
count – the number of heavy atoms
- Return type:
integer
- count_specific_atoms(atom_types=['x', 'X'])[source]
Count the number of atoms, including only certain atoms.
- Parameters:
atom_types (list) – list of symbols for atoms to include.
- Returns:
count – the number of atoms of the specified types
- Return type:
integer
- createMolecularGraph(oct=True, strict_cutoff=False, catom_list=None)[source]
Create molecular graph of a molecule given X, Y, Z positions. Bond order is not interpreted by this. Updates graph attribute of the mol3D class.
- deleteatom(atomIdx)[source]
Delete a specific atom from the mol3D class given an index.
- Parameters:
atomIdx (int) – Index for the atom3D to remove.
- deleteatoms(Alist)[source]
Delete multiple atoms from the mol3D class given a set of indices. Preserves ordering, starts from largest index.
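Starting from the largest index matters because every deletion shifts the positions of the items after it. The sketch below shows the effect on a plain Python list. Both helpers are hypothetical and only illustrate the ordering issue, not the molSimplify code.

```python
def delete_largest_first(items, indices):
    """Delete the given positions in place, largest index first."""
    for i in sorted(set(indices), reverse=True):
        del items[i]
    return items


def delete_in_given_order(items, indices):
    """Naive version: later indices point at shifted positions."""
    for i in indices:
        del items[i]
    return items


safe = delete_largest_first(["Fe", "N", "H1", "H2", "H3"], [1, 3])
naive = delete_in_given_order(["Fe", "N", "H1", "H2", "H3"], [1, 3])

print(safe)   # N and H2 removed, as intended
print(naive)  # N removed, then the wrong hydrogen
```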
- dev_from_ideal_geometry(ideal_polyhedron: ndarray) Tuple[float, float][source]
Return the minimum RMSD between a geometry and an ideal polyhedron (with the same average bond distances). Enumerates all possible indexing of the geometry. As such, only recommended for small systems.
- Parameters:
ideal_polyhedron (np.array of 3-tuples of coordinates) – Reference list of points for an ideal geometry
- Returns:
rmsd (float) – Minimum root mean square distance between the fed geometry and the ideal polyhedron
single_dev (float) – Maximum distance between any paired points in the fed geometry and the ideal polyhedron.
- dict_check_processing(dict_check, num_coord=6, debug=False, silent=False)[source]
Process the self.geo_dict to get the flag_oct and flag_list, setting dict_check as the cutoffs.
- Parameters:
- Returns:
flag_oct (int) – Good (1) or bad (0) structure.
flag_list (list) – Metrics that are preventing a good geometry.
- draw_svg(filename)[source]
Draw image of molecule and save to SVG.
- Parameters:
filename (str) – Name of file to save SVG to.
- findAtomsbySymbol(sym: str) List[int][source]
Find all elements with a given symbol in a mol3D class.
- findMetal(transition_metals_only: bool = True, include_X: bool = False) List[int][source]
Find metal(s) in a mol3D class. Also sets the metals instance attribute if it is empty.
- findcloseMetal(atom0)[source]
Find the nearest metal to a given atom3D class. Returns heaviest element if no metal found.
- findsubMol(atom0, atomN, smart=False)[source]
Finds a submolecule within the molecule given the starting atom and the separating atom. Illustration: H2A-B-C-DH2 will return C-DH2 if C is the starting atom and B is the separating atom. Alternatively, if C is the starting atom and D is the separating atom, returns H2A-B-C.
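The H2A-B-C-DH2 illustration can be reproduced with a breadth-first search that never steps onto the separating atom. This is a sketch of the idea over a hand-written adjacency dictionary; `find_submolecule` and the atom numbering are assumptions made for illustration, and molSimplify's own traversal may differ.

```python
from collections import deque


def find_submolecule(bonds, start, separator):
    """Return sorted indices reachable from start without visiting separator."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in bonds[current]:
            if neighbor != separator and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return sorted(seen)


# H2A-B-C-DH2 with A=0, B=1, C=2, D=3, hydrogens on A = 4, 5 and on D = 6, 7
bonds = {
    0: [1, 4, 5],
    1: [0, 2],
    2: [1, 3],
    3: [2, 6, 7],
    4: [0],
    5: [0],
    6: [3],
    7: [3],
}

c_side = find_submolecule(bonds, start=2, separator=1)  # C-DH2
a_side = find_submolecule(bonds, start=2, separator=3)  # H2A-B-C
print(c_side)
print(a_side)
```

In a ring, the search can reach the far side of the separating atom by going around it, so this simple version is only meaningful for acyclic connections.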
- flip_symmetry(verbose=True, max_allowed_dev=30, target_symmetry=None)[source]
Flip octahedral transition metal complexes (TMCs) to opposite symmetry group. Example: cis to trans, fac to mer, etc.
- Parameters:
verbose (bool) – Flag for returning warning when TMC exhibits high deviation from closest symmetry. Default=True
max_allowed_dev (float) – Maximum allowed deviation before warning is triggered (degrees). Default=30
target_symmetry (str) – Target symmetry for complex to be transformed to, only defined for num_unique_ligands == 3 Default=None
- Returns:
self – returns self, a mol3D object with flipped symmetry
- Return type:
- freezeatom(atomIdx)[source]
Set the freeze attribute to be true for a given atom3D class.
- Parameters:
atomIdx (int) – Index for atom to be frozen.
- freezeatoms(Alist)[source]
Set the freeze attribute to be true for a given set of atom3D objects, given their indices. Preserves ordering, starts from largest index.
- classmethod from_smiles(smiles, gen3d: bool = True)[source]
Generate a mol3D object from a SMILES string.
- geo_dict_initialization()[source]
Initialization of geometry check dictionaries according to dict_oct_check_st.
- geo_maxatomdist(mol2)[source]
Compute the max atom distance between two molecules. Does not align molecules. For that, use geometry.kabsch().
- geo_rmsd(mol2)[source]
Compute the RMSD between two molecules. Does not align molecules. For that, use geometry.kabsch().
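Without alignment, both metrics reduce to per-atom distances between the two coordinate sets. The sketch below assumes the atoms are paired by index and given as plain lists; the function names are illustrative and are not the molSimplify methods.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between index-paired points; no alignment."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    squared = [math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b)]
    return math.sqrt(sum(squared) / len(squared))


def max_atom_distance(coords_a, coords_b):
    """Largest distance between index-paired points; no alignment."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    return max(math.dist(p, q) for p, q in zip(coords_a, coords_b))


short_h2 = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
long_h2 = [[0.0, 0.0, 0.0], [0.0, 0.0, 3.0]]

print(rmsd(short_h2, long_h2))
print(max_atom_distance(short_h2, long_h2))
```

Since nothing is aligned, a rigid translation or rotation of one structure changes both numbers, which is why the text points to geometry.kabsch() for aligned comparisons.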
- getAngle(idx0, idx1, idx2)[source]
Get angle between three atoms identified by their indices. Specifically, get angle between vectors formed by atom0->atom1 and atom2->atom1.
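The angle described above follows from the dot product of the two vectors that point at the central atom. A minimal sketch on raw coordinates, assuming the result is wanted in degrees (`angle_degrees` is a hypothetical helper, not the library method):

```python
import math


def angle_degrees(p0, p1, p2):
    """Angle at p1 between the vectors p0->p1 and p2->p1, in degrees."""
    v1 = [b - a for a, b in zip(p0, p1)]
    v2 = [b - a for a, b in zip(p2, p1)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(a * a for a in v2))
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError("two of the points coincide; the angle is undefined")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.degrees(math.acos(cos_theta))


oxygen = [0.0, 0.0, 0.0]
hydrogen_a = [0.0, 0.0, 1.0]
hydrogen_b = [0.0, 1.0, 0.0]

theta = angle_degrees(hydrogen_a, oxygen, hydrogen_b)
print(theta)
```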
- getAtomTypes()[source]
Get unique elements in a molecule. Now somewhat redundant with get_element_list
- Returns:
unique_atoms_list – List of unique elements in molecule by symbol.
- Return type:
- getAtoms()[source]
Get all atoms within a molecule.
- Parameters:
None –
- Returns:
atom_list – List of atom3D objects for all elements in a mol3D.
- Return type:
- getBondedAtoms(idx: int) List[int][source]
Gets atoms bonded to a specific atom. This is determined based on element-specific distance cutoffs, rather than predefined valences. This method is ideal for metals because bond orders are ill-defined. For pure organics, the OBMol class provides better functionality.
- getBondedAtomsBOMatrix(idx)[source]
Get atoms bonded to an atom referenced by index, using the BO matrix.
- getBondedAtomsBOMatrixAug(idx)[source]
Get atoms bonded to an atom referenced by index, using the augmented BO matrix.
- getBondedAtomsByCoordNo(idx, CoordNo=6)[source]
Gets atoms bonded to a specific atom by coordination number.
- getBondedAtomsByThreshold(idx, threshold=1.15)[source]
Gets atoms bonded to a specific atom. This method uses a threshold for determination of a bond.
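A common way to apply such a threshold is to call two atoms bonded when their distance is below the threshold times the sum of their covalent radii. The sketch below assumes that rule and uses illustrative radii. It is not confirmed to match the cutoffs or radii molSimplify uses.

```python
import math

COVALENT_RADIUS = {"H": 0.31, "O": 0.66}  # angstroms; illustrative values


def bonded_by_threshold(symbols, coords, idx, threshold=1.15):
    """Indices of atoms closer to atom idx than threshold * (r_idx + r_other)."""
    bonded = []
    for j, (sym, xyz) in enumerate(zip(symbols, coords)):
        if j == idx:
            continue
        cutoff = threshold * (COVALENT_RADIUS[symbols[idx]] + COVALENT_RADIUS[sym])
        if math.dist(coords[idx], xyz) < cutoff:
            bonded.append(j)
    return bonded


symbols = ["O", "H", "H"]
coords = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.96], [0.0, 0.93, -0.24]]

print(bonded_by_threshold(symbols, coords, 0))  # neighbors of the oxygen
print(bonded_by_threshold(symbols, coords, 1))  # neighbors of the first hydrogen
```

Raising the threshold admits longer contacts as bonds, and lowering it makes the bond assignment stricter.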
- getBondedAtomsOct(ind, CN=6, debug=False, flag_loose=False, atom_specific_cutoffs=False, strict_cutoff=False)[source]
Gets atoms bonded to an octahedrally coordinated metal. Specifically limits intruder C and H atoms that would otherwise be considered bonded in the distance cutoffs. Limits bonding to the CN closest atoms (CN = coordination number).
- Parameters:
ind (int) – Index of reference atom.
CN (int, optional) – Coordination number of reference atom of interest. Default is 6.
debug (bool, optional) – Produce additional outputs for debugging. Default is False.
flag_loose (bool, optional) – Use looser cutoffs to determine bonding. Default is False.
atom_specific_cutoffs (bool, optional) – Use atom specific cutoffs to determine bonding. Default is False.
strict_cutoff (bool, optional) – Strict bonding cutoff for fullerene and SACs.
- Returns:
nats – List of indices of bonded atoms.
- Return type:
- getBondedAtomsSmart(idx, oct=False, strict_cutoff=False, catom_list=None)[source]
Get the atoms bonded with the atom specified with the given index, using the molecular graph. Creates graph if it does not exist.
- Parameters:
- Returns:
nats – List of indices of bonded atoms.
- Return type:
- getBondedAtomsnotH(idx, metal_multiplier=1.35, nonmetal_multiplier=1.15)[source]
Get the atoms bonded to the atom with a given index, not counting hydrogens.
- Parameters:
- Returns:
nats – List of indices of bonded atoms.
- Return type:
- getClosestAtomnoHs(ratom)[source]
Get atoms bonded to a specific atom3D class that are not hydrogen.
- getMLBondLengths()[source]
Outputs the metal-ligand bond lengths in the complex.
- Returns:
bls – Dictionary keyed by the ID of metal M, with values that are dictionaries of M-L bond lengths and relative bond lengths.
- Return type:
dictionary
- getNumAtoms()[source]
Get the number of atoms within a molecule.
- Returns:
self.natoms – The number of atoms in the mol3D object.
- Return type:
- getOBMol(fst, convtype, ffclean=False, gen3d=True)[source]
Get OBMol object from a file or SMILES string. If you have a mol3D, then use convert2OBMol instead.
- Parameters:
- Returns:
OBMol – OBMol class instance to be used with openbabel. Bound as .OBMol attribute.
- Return type:
OBMol
- get_coordinate_array()[source]
Get the coordinate array of the molecule. Same as coordsvect.
- Parameters:
None –
- Returns:
coord_array – The coordinates of each atom. Shape is (number of atoms, 3).
- Return type:
np.array
- get_element_list()[source]
Get the element list of the molecule. Nearly the same as symvect.
- Parameters:
None –
- Returns:
element_list – The element symbol of each atom.
- Return type:
- get_fcs(strict_cutoff=False, catom_list=None, max6=True)[source]
Get first coordination shell of a transition metal complex.
- Parameters:
- Returns:
fcs – List of first coordination shell indices.
- Return type:
- get_features(lac=True, force_generate=False, eq_sym=False, use_dist=False, NumB=False, Gval=False, size_normalize=False, alleq=False, strict_cutoff=False, catom_list=None, custom_property_dict={}, depth=3, loud=False, two_key=False, non_trivial=False)[source]
Get geo-based RAC features for this transition metal complex (if octahedral).
- Parameters:
lac (bool, optional) – Use lac for ligand_assign_consistent behavior. Default is True
force_generate (bool, optional) – Force the generation of features.
eq_sym (bool, optional) – Force equatorial plane to have same chemical symbols if possible.
use_dist (bool, optional) – Whether or not CD-RACs used.
NumB (bool, optional) – Whether or not the number of bonds RAC features are generated.
Gval (bool, optional) – Whether or not the group number RAC features are generated.
size_normalize (bool, optional) – Whether or not to normalize by the number of atoms in molecule.
alleq (bool, optional) – Whether or not all ligands are equatorial.
strict_cutoff (bool, optional) – Strict bonding cutoff for fullerene and SACs.
catom_list (list of int, optional) – List of indices of coordinating atoms.
custom_property_dict (dict, optional) – Keys are custom property names (str), values are dictionaries mapping atom symbols (str, e.g., “H”, “He”) to the numerical property (float) for that atom. If provided, other property RACs (e.g., Z, S, T) will not be made.
depth (int, optional) – The maximum depth of the RACs (how many bonds out the RACs go). For example, if set to 3, depths considered will be 0, 1, 2, and 3.
loud (bool) – Whether to generate print statements. Default is False.
two_key (bool) – Whether return dictionary should only have two keys, ‘colnames’ and ‘results’, with values that are lists of feature names and values, respectively.
non_trivial (bool, optional) – Flag to exclude difference RACs of I, and depth zero difference RACs. These RACs are always zero. By default False.
- Returns:
results – Dictionary of {‘RACname’:RAC} for all geo-based RACs
- Return type:
- get_first_shell(check_hapticity=True)[source]
Get the first coordination shell of a mol3D object with a single transition metal (read from a CSD mol2 file). If check_hapticity is True, the first-shell atoms of a multihapto ligand are replaced by a hydrogen placed at their geometric mean.
- Parameters:
check_hapticity (boolean) – Whether to update multihapto ligands to their geometric centroid.
- Returns:
mol3D – First coordination shell with metal (can change based on check_hapticity).
list – List of hapticity.
- get_geometry_type(dict_check=False, angle_ref=False, flag_catoms=False, catoms_arr=None, debug=False, skip=False)[source]
Get the type of the geometry (linear(2), trigonal planar(3), tetrahedral(4), square planar(4), trigonal bipyramidal(5), square pyramidal(5, one-empty-site), octahedral(6), pentagonal bipyramidal(7)).
Uses hapticity truncated first coordination shell.
- Parameters:
dict_check (dict, optional) – Cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
flag_catoms (bool, optional) – Whether or not to return the catoms_arr. Default is False.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.
debug (bool, optional) – Flag for extra printout. Default is False.
skip (list, optional) – Geometry checks to skip. Default is False.
- Returns:
results – Measurement of deviations from arrays.
- Return type:
dictionary
- get_geometry_type_distance(max_dev=1000000.0, close_dev=0.01, flag_catoms=False, catoms_arr=None, skip=False, transition_metals_only=False, cshm=False) Dict[str, Any][source]
Get the type of the geometry (available options in globalvars all_geometries).
Uses hapticity truncated first coordination shell. Does not require the input of num_coord.
- Parameters:
max_dev (float, optional) – Maximum RMSD allowed between a structure and an ideal geometry before it is classified as unknown. Default is 1e6.
close_dev (float, optional) – Maximum difference in RMSD between two classifications allowed before they are compared by maximum single-atom deviation as well.
flag_catoms (bool, optional) – Whether or not to return the catoms_arr. Default is False.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.
skip (list, optional) – Geometry checks to skip. Default is False.
transition_metals_only (bool, optional) – Flag if only transition metals counted as metals. Default is False.
cshm (bool, optional) – Whether or not to return continuous shape measures for each geometry.
- Returns:
results – Contains the classified geometry and the RMSD from an ideal structure. Summary contains a list of the RMSD, the maximum single-atom deviation, and a continuous shape measure for all considered geometry types.
- Return type:
dictionary
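The roles of max_dev and close_dev can be sketched roughly as follows. This is a minimal illustration, not molSimplify's implementation: the function name `classify_geometry`, the input layout (geometry name mapped to an `(rmsd, max_atom_dev)` pair), and the exact tie-breaking rule are all assumptions made for the example.

```python
from typing import Dict, Tuple


def classify_geometry(
    scores: Dict[str, Tuple[float, float]],
    max_dev: float = 1e6,
    close_dev: float = 0.01,
) -> str:
    """Pick a geometry label from per-geometry (rmsd, max_atom_dev) scores.

    Hypothetical helper: input layout and tie-breaking rule are assumptions.
    """
    # Candidate with the lowest RMSD to its ideal geometry.
    best = min(scores, key=lambda g: scores[g][0])
    best_rmsd = scores[best][0]
    if best_rmsd > max_dev:
        return "unknown"
    # Geometries whose RMSD lies within close_dev of the best one are
    # compared by maximum single-atom deviation instead.
    close = [g for g in scores if scores[g][0] - best_rmsd <= close_dev]
    return min(close, key=lambda g: scores[g][1])


# Made-up scores, for illustration only.
example = {
    "octahedral": (0.120, 0.40),
    "trigonal prismatic": (0.125, 0.30),
    "pentagonal pyramidal": (0.900, 1.50),
}
print(classify_geometry(example))
```

With a tighter close_dev the second comparison is skipped and the lowest-RMSD geometry is returned directly.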
- get_geometry_type_old(dict_check=False, angle_ref=False, num_coord=None, flag_catoms=False, catoms_arr=None, debug=False, skip=False, transition_metals_only=False, num_recursions=[0, 0])[source]
Get the type of the geometry (trigonal planar(3), tetrahedral(4), square planar(4), trigonal bipyramidal(5), square pyramidal(5, one-empty-site), octahedral(6), pentagonal bipyramidal(7)).
- Parameters:
dict_check (dict, optional) – The cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
num_coord (int, optional) – Expected coordination number.
flag_catoms (bool, optional) – Whether or not to return the catoms arr. Default as False.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.
debug (bool, optional) – Flag for extra printout. Default is False.
skip (list, optional) – Geometry checks to skip. Default is False.
transition_metals_only (bool, optional) – Flag if only transition metals counted as metals. Default is False.
num_recursions (list, optional) – Counter to track number of ligands classified as ‘sandwich’ and ‘edge’ in original structure.
- Returns:
results – Measurement of deviations from arrays.
- Return type:
dictionary
- get_graph()[source]
Return the graph attribute of the molecule. It is faster to just use mol.graph though, where mol is the mol3D object.
- Parameters:
None –
- Returns:
self.graph – The graph.
- Return type:
np.array
- get_graph_hash(attributed_flag=True, oct=False, loud=True)[source]
Calculate the graph hash of a molecule. Note: Not useful for distinguishing between molecules that are just a single atom.
- Parameters:
- Returns:
gh – The graph hash.
- Return type:
- get_linear_angle(ind)[source]
Get linear ligand angle.
- Parameters:
ind (int) – Index for one of the metal-coordinating atoms.
- Returns:
flag (bool) – True if the ligand is linear.
ang (float) – Get angle of linear ligand. 0 if not linear.
- get_molecular_mass()[source]
Computes the molecular mass, or weight, of a mol3D.
- Parameters:
None –
- Returns:
mol_mass – The molecular mass. Units of amu.
- Return type:
- get_num_coord_metal(debug=False, strict_cutoff=False, catom_list=None)[source]
Get the metal coordination number based on the bonded atoms, and store this information.
- get_octetrule_charge(debug=False)[source]
Get the octet-rule charge provided a mol3D object with bo_graph (read from CSD mol2 file). Note that currently this function should only be applied to ligands (organic molecules).
- Parameters:
debug (boolean) – Whether to have more printouts.
- Returns:
charge (float) – The overall charge of the molecule.
arom_charge (int) – The charge of the aromatic rings.
- get_smiles(canonicalize=False, use_mol2=False) str[source]
Returns the SMILES string representing the mol3D object.
- get_smilesOBmol_charge()[source]
Get the charge of a mol3D object through adjusted OBmol hydrogen/smiles conversion. Note that currently this function should only be applied to ligands (organic molecules).
- get_submol_noHs()[source]
Get the heavy atom only submolecule, with no hydrogens.
- Returns:
mol_noHs – mol3D class instance with no hydrogens.
- Return type:
- get_symmetry(verbose=True, max_allowed_dev=30, details=False)[source]
Classify octahedral transition metal complexes (TMCs) according to symmetry.
- Parameters:
verbose (bool) – Flag for returning warning when TMC exhibits high deviation from closest symmetry. Default=True
max_allowed_dev (float) – Maximum allowed deviation before warning is triggered (degrees). Default=30
details (bool) – Flag for returning detailed lists of unique ligands and coordinating atoms, intended for use with flip_symmetry(). Default=False
- Returns:
symmetry_dict (dict) – Dictionary storing assigned symmetry class and deviations from all possible symmetry classes.
detailed_dict (dict) – Dictionary storing indices of metal, ligand atoms, and ligand coordinating atoms, only returned if details=True
- get_symmetry_denticity(return_eq_catoms=False, BondedOct=False)[source]
Get symmetry class of molecule.
- Parameters:
- Returns:
eqsym (bool) – Flag for equatorial symmetry.
maxdent (int) – Maximum denticity in molecule.
ligdents (list) – List of denticities in molecule.
homoleptic (bool) – Flag for whether a geometry is homoleptic.
ligsymmetry (str) – Symmetry class for ligand of interest.
eq_catoms (list) – List of equatorial connection atoms.
- getfragmentlists()[source]
Get all independent molecules in mol3D.
- Returns:
atidxes_total – List of lists of the atom indices that make up each distinct molecule.
- Return type:
- isPristine(unbonded_min_dist=1.3, oct=False)[source]
Checks the organic portions of a transition metal complex and determines if they look good.
- Parameters:
- Returns:
pass (bool) – Whether or not molecule passes the organic checks.
fail_list (list) – List of failing criteria, as a set of strings.
- is_edge_compound(transition_metals_only: bool = True) Tuple[int, List, List][source]
Check if a TMC/mononuclear metal complex structure is an edge compound.
- Parameters:
transition_metals_only (bool, optional) – Flag if only transition metals counted as metals. Default is True.
- Returns:
num_edge_lig (int) – Number of edge ligands.
info_edge_lig (list) – List of dictionaries with info about edge ligands.
edge_lig_atoms (list) – List of dictionaries with the connecting atoms of the edge ligands.
- is_linear_ligand(ind)[source]
Check whether a ligand is linear.
- Parameters:
ind (int) – Index for one of the metal-coordinating atoms.
- Returns:
flag (bool) – True if the ligand is linear.
catoms (list) – Atoms bonded to the index of interest.
- is_sandwich_compound(transition_metals_only: bool = True) Tuple[int, List, bool, bool, List][source]
Evaluates whether a TMC/mononuclear metal complex compound is a sandwich compound.
- Parameters:
transition_metals_only (bool, optional) – Flag if only transition metals counted as metals. Default is True.
- Returns:
num_sandwich_lig (int) – Number of sandwich ligands.
info_sandwich_lig (list) – List of dictionaries about the sandwich ligands.
aromatic (bool) – Flag about whether the ligand is aromatic.
allconnect (bool) – Flag for connected atoms in ring.
sandwich_lig_atoms (list) – List of dictionaries with the connecting atoms of the sandwich ligands.
- ligand_comp_org(init_mol, catoms_arr=None, flag_deleteH=True, flag_lbd=True, debug=False, depth=3, BondedOct=False, angle_ref=False)[source]
Get the ligand distortion by comparing each individual ligand in init_mol and opt_mol.
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.
flag_deleteH (bool, optional) – Flag to delete Hs in ligand comparison. Default is True.
flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.
debug (bool, optional) – Flag for extra printout. Default is False.
depth (int, optional) – Depth for truncated molecule. Default is 3.
BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
- Returns:
dict_lig_distort – Dictionary containing rmsd_max and atom_dist_max.
- Return type:
- match_lig_list(init_mol, catoms_arr=None, BondedOct=False, flag_lbd=True, debug=False, depth=3, check_whole=False, angle_ref=False)[source]
Match the ligands of mol and init_mol by calling ligand_breakdown.
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input. Default is Nonetype.
BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.
flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.
debug (bool, optional) – Flag for extra printout. Default is False.
depth (int, optional) – Depth for truncated molecule. Default is 3.
check_whole (bool, optional) – Flag for checking whole ligand.
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
- Returns:
liglist_shifted (list) – List of lists containing all ligands from optimized molecule.
liglist_init (list) – List of lists containing all ligands from initial molecule.
flag_match (bool) – A flag about whether the ligands of initial and optimized mol are exactly the same. There is a one to one mapping.
- maxatomdist(mol2)[source]
Compute the max atom distance between two molecules. Does not align molecules. For that, use geometry.kabsch().
- maxatomdist_nonH(mol2)[source]
Compute the max atom distance between two molecules, considering heavy atoms only. Does not align molecules. For that, use geometry.kabsch().
- meanabsdev(mol2)[source]
Compute the mean absolute deviation (MAD) between two molecules. Does not align molecules. For that, use geometry.kabsch().
- mindistmol()[source]
Measure the smallest distance between atoms in a single molecule.
- Returns:
mind – Min distance between atoms in the molecule.
- Return type:
- mindistnonH(mol)[source]
Measure the smallest distance between an atom and a non H atom in another molecule.
- mol3D_to_networkx(get_symbols: bool = True, get_bond_order: bool = True, get_bond_distance: bool = False)[source]
- molsize()[source]
Measure the size of the molecule, by quantifying the max distance between atoms and center of mass.
- Returns:
maxd – Max distance between an atom and the center of mass.
- Return type:
- moments_of_inertia()[source]
Determines the moments of inertia for the object, in the specified coordinates (after centering about the center of mass).
- Returns:
I – Moments of inertia tensor.
- Return type:
np.array
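For reference, the inertia tensor about the center of mass is I_ab = Σ_i m_i (|r_i|² δ_ab − r_ia r_ib). The sketch below builds it in plain Python. It is an illustration only: the function name `inertia_tensor` and the list-based inputs are assumptions, and the library itself returns an np.array.

```python
def inertia_tensor(masses, coords):
    """Return the 3x3 inertia tensor about the center of mass.

    masses: list of floats (amu); coords: list of [x, y, z] (angstroms).
    Hypothetical helper, not part of molSimplify.
    """
    total = sum(masses)
    # Center of mass, used to center the coordinates first.
    com = [sum(m * c[k] for m, c in zip(masses, coords)) / total for k in range(3)]
    tensor = [[0.0] * 3 for _ in range(3)]
    for m, c in zip(masses, coords):
        r = [c[k] - com[k] for k in range(3)]
        r2 = sum(x * x for x in r)
        for a in range(3):
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                tensor[a][b] += m * (r2 * delta - r[a] * r[b])
    return tensor


# Two unit masses placed 2 angstroms apart on the x axis.
tensor = inertia_tensor([1.0, 1.0], [[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
print(tensor)
```

For this linear arrangement the moment about the x axis vanishes, while the two perpendicular moments are equal.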
- new_writemol2(ignore_dummy_atoms=True, write_bond_orders=True, return_string=True, output_file=None)[source]
Generate a MOL2-format string or file from atomic coordinates and bonding data.
- Parameters:
atom_coords (list or np.ndarray of shape (N, 3)) – A list or NumPy array of atomic coordinates. Each element is a 3D coordinate (x, y, z) for a single atom.
atom_elements (list of str) – A list of atomic element symbols (e.g., ‘C’, ‘N’, ‘O’, etc.), one for each atom in atom_coords. The list must be the same length as atom_coords.
bond_order_dict (dict) – A dictionary mapping tuples of atom indices (i, j) to bond orders. The bond order may be a string like ‘1’, ‘2’, ‘3’, ‘ar’, etc. Example: {(0, 1): ‘1’, (1, 2): ‘2’}
ignore_dummy_atoms (bool, optional (default=True)) – If True, atoms with element symbol ‘X’ will be ignored in both atoms and bonds.
write_bond_orders (bool, optional (default=True)) – If True, writes the actual bond orders from bond_order_dict. If False, all bonds are assigned order ‘1’.
return_string (bool, optional (default=True)) – If True, returns the MOL2 content as a string. If False, writes to output_file.
output_file (str or None, optional) – If return_string is False, this must be the path to the file to write.
- Returns:
Returns the MOL2-format string if return_string is True, otherwise writes to file and returns None.
- Return type:
str or None
Notes
Atoms are renumbered starting from 1.
Element-based labels (e.g., C1, C2) are assigned using counts per element.
Substructures are inferred using connected components in the bond graph.
Only bonds where both atoms are not dummy atoms are retained if ignore_dummy_atoms is True.
- oct_comp(angle_ref=False, catoms_arr=None, debug=False)[source]
Get the deviation of shape of the catoms from the desired shape, which is defined in angle_ref.
- Parameters:
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. User can overwrite this connection atom array by explicit input.
debug (bool, optional) – Flag for extra printout. Default is False.
- Returns:
dict_catoms_shape (dict) – Dictionary of first coordination sphere shape measures.
catoms_arr (list) – Connection atom array.
- overlapcheck(mol, silence=False)[source]
Measure the smallest distance between an atom and a point.
- populateBOMatrix(bonddict=False, set_bo_mat=False)[source]
Populate the bond order matrix using openbabel.
- populateBOMatrixAug()[source]
Populate the augmented bond order matrix using openbabel.
- Parameters:
bonddict (bool) – Flag for if the obmol bond dictionary should be saved. Default is False.
- Returns:
molBOMat – Numpy array for augmented bond order matrix.
- Return type:
np.array
- principal_moments_of_inertia(return_eigvecs=False)[source]
Returns the principal moments of inertia, and optionally the eigenvectors defining the principal axes.
- Parameters:
return_eigvecs (bool) – Flag for if the matrices used to diagonalize I should be returned. Default is False.
- Returns:
pmom (np.array) – 3x1 array of the principal moments of inertia, in the provided Cartesian frame.
eigvecs (np.array) – 3x3 array where each column is an eigenvector.
- printxyz()[source]
Print XYZ info of mol3D class instance to stdout. To write to file (more common), use writexyz() instead.
- read_bond_order(bofile)[source]
Get bond order information from file.
- Parameters:
bofile (str) – Path to a bond order file.
- read_charge(chargefile)[source]
Get charge information from file.
- Parameters:
chargefile (str) – Path to a charge file.
- read_smiles(smiles, ff='mmff94', steps=2500)[source]
Read a smiles string and convert it to a mol3D class instance.
- readfrommol(filename)[source]
Read mol into a mol3D class instance. Stores the bond orders and atom types.
- Parameters:
filename (string) – String of path to MOL file. Path may be local or global.
- readfrommol2(filename, readstring=False)[source]
Read mol2 into a mol3D class instance. Stores the bond orders and atom types (SYBYL).
- Parameters:
filename (string) – String of path to MOL2 file. Path may be local or global. May be read in as a string.
readstring (bool) – Flag for deciding whether a string of mol2 file is being passed as the filename.
- readfromstring(xyzstring)[source]
Read XYZ from string.
- Parameters:
xyzstring (string) – String of XYZ file.
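As a reminder of the layout such a string follows, here is a minimal sketch of parsing XYZ text: the first line holds the atom count, the second is a comment, and each following line holds an element symbol and three coordinates. The function `parse_xyz` is a hypothetical stand-alone helper, not how mol3D reads the string internally.

```python
def parse_xyz(xyzstring):
    """Parse XYZ-format text into a list of (symbol, [x, y, z]) tuples."""
    lines = xyzstring.strip().splitlines()
    natoms = int(lines[0].split()[0])
    atoms = []
    # Line 0 is the atom count, line 1 is a comment; atoms start at line 2.
    for line in lines[2 : 2 + natoms]:
        fields = line.split()
        atoms.append((fields[0], [float(v) for v in fields[1:4]]))
    return atoms


water = """3
water molecule
O 0.000 0.000 0.117
H 0.000 0.757 -0.471
H 0.000 -0.757 -0.471
"""
atoms = parse_xyz(water)
print(atoms)
```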
- readfromtxt(txt)[source]
Read XYZ from textfile.
- Parameters:
txt (list) – List of lists that comes as a result of readlines.
- readfromxyz(filename: str, ligand_unique_id=False, read_final_optim_step=False, readstring=False)[source]
Read XYZ into a mol3D class instance.
- Parameters:
filename (string) – String of path to XYZ file. Path may be local or global.
ligand_unique_id (string) – Unique identifier for a ligand. In MR diagnostics, we abstract the atom-based graph to a ligand-based graph. Ligands don’t have a natural name, so they are named with a UUID. It is hard to attribute MR character to individual atoms, so it is attributed to ligands instead.
read_final_optim_step (boolean) – If there are multiple geometries in the XYZ file (after an optimization run), use only the last one.
readstring (boolean) – Flag for deciding whether a string or xyz file is being passed as the filename.
- reflect_coords(metal_coords, lig1_catom_coords, lig2_catom_coords, atoms_to_move)[source]
Helper function for flip_symmetry to calculate vectors, projections, and update coordinates.
- Parameters:
metal_coords (np.array) – Coordinates of metal atom.
lig1_catom_coords (np.array) – Coordinates of coordinating atom of first ligand to be flipped.
lig2_catom_coords (np.array) – Coordinates of coordinating atom of second ligand to be flipped.
atoms_to_move (list) – List of atom indices to be moved.
- Returns:
self – returns self, a mol3D object with flipped symmetry.
- Return type:
- resetBondOBMol()[source]
Repopulates the bond order matrix via openbabel. Interprets bond order matrix.
- returnxyz(no_tabs=False)[source]
Print XYZ info of mol3D class instance to stdout. To write to file (more common), use writexyz() instead.
- Parameters:
no_tabs (bool, optional) – Whether or not to use tabs in coordinate columns.
- Returns:
ss – String of XYZ information from mol3D class.
- Return type:
string
- rmsd(mol2)[source]
Compute the RMSD between two molecules. Does not align molecules. For that, use geometry.kabsch().
- rmsd_nonH(mol2)[source]
Compute the RMSD between two molecules, considering heavy atoms only. Does not align molecules. For that, use geometry.kabsch().
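Because no alignment is performed, the comparison amounts to pairing atoms and averaging squared distances. The sketch below shows that calculation with an optional heavy-atom filter. The function name, the `(symbol, [x, y, z])` input layout, and the assumption that atoms are paired by index are illustrative choices, not confirmed details of molSimplify.

```python
import math


def rmsd(atoms_a, atoms_b, heavy_only=False):
    """RMSD between two structures with atoms paired by index (assumed).

    atoms_a, atoms_b: lists of (symbol, [x, y, z]) in the same order.
    heavy_only: if True, hydrogens are left out of the comparison.
    """
    if len(atoms_a) != len(atoms_b):
        raise ValueError("structures must have the same number of atoms")
    pairs = [
        (xyz_a, xyz_b)
        for (sym_a, xyz_a), (_, xyz_b) in zip(atoms_a, atoms_b)
        if not (heavy_only and sym_a == "H")
    ]
    sq = sum(math.dist(a, b) ** 2 for a, b in pairs)
    return math.sqrt(sq / len(pairs))


ref = [("O", [0.0, 0.0, 0.0]), ("H", [0.0, 0.0, 1.0])]
moved = [("O", [0.0, 0.0, 0.0]), ("H", [0.0, 0.0, 3.0])]
print(rmsd(ref, moved), rmsd(ref, moved, heavy_only=True))
```

Here only the hydrogen moves, so the heavy-atom value is zero while the all-atom value is not.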
- roland_combine(mol, catoms, bond_to_add=[], dirty=False)[source]
Combines two molecules. Each atom in the second molecule is appended to the first while preserving orders. Assumes operation with a given mol3D instance, when handed a second mol3D instance.
- Parameters:
- Returns:
cmol – New mol3D class containing the two molecules combined.
- Return type:
- sanitycheck(silence=False, debug=False)[source]
Sanity check a molecule for overlap within the molecule.
- sanitycheckCSD(oct=False, angle1=30, angle2=80, angle3=45, debug=False, metals=None)[source]
Sanity check a CSD molecule. Check that the molecule passes basic angle tests in line with CSD pulls.
- Parameters:
oct (bool, optional) – Flag for octahedral test. Default is False.
angle1 (float, optional) – Metal angle cutoff. Default is 30.
angle2 (float, optional) – Organic angle cutoff. Default is 80.
angle3 (float, optional) – Metal/organic angle cutoff e.g. M-X-X angle. Default is 45.
debug (bool, optional) – Extra print out desired. Default is False.
metals (Nonetype, optional) – Check for metals. Default is None.
- Returns:
sane (bool) – Whether or not molecule is a sane molecule.
error_dict (dict) – Returned if debug, {bond distances and angles breaking constraints: values}
- setLoc(loc)[source]
Sets the conformation of an amino acid in the chain of a protein.
- Parameters:
loc (str) – A one-character string representing the conformation.
- symvect()[source]
Method to obtain array of symbol vector of molecule.
- Returns:
symbol_vector – 1 dimensional numpy array of atom symbols. (N,) dimension, N is number of atoms.
- Return type:
np.array
- translate(dxyz)[source]
Translate all atoms by a given vector.
- Parameters:
dxyz (list) – Vector by which to translate all atoms, as a list [dx, dy, dz].
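The operation itself is a per-atom vector addition. A minimal sketch on plain coordinate lists follows. The helper `translate_coords` is hypothetical and returns a new list, which is an assumption of this example and not a statement about how mol3D updates its atoms.

```python
def translate_coords(coords, dxyz):
    """Return new coordinates with dxyz added to every atom position."""
    return [[c + d for c, d in zip(xyz, dxyz)] for xyz in coords]


shifted = translate_coords([[0, 0, 0], [1, 2, 3]], [1, 0, -1])
print(shifted)
```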
- typevect()[source]
Method to obtain array of type vector of molecule.
- Returns:
type_vector – 1 dimensional numpy array of atom types (by name). (N,) dimension, N is number of atoms.
- Return type:
np.array
- writemol(filename)[source]
Write mol file from mol3D object. Not advised if molecule has > 99 atoms. If there is no bond order information available, all bonds will be set as single bonds.
- Parameters:
filename (str) – Path to mol file.
- writemol2(filename, writestring=False, ignoreX=False, force=False)[source]
Write mol2 file from mol3D object. Partial charges are appended if given. Else, total charge of the complex (given or interpreted by OBMol) is assigned to the metal.
- Parameters:
- writemol2_bodict(ignore_dummy_atoms=True, write_bond_orders=True, return_string=True, output_file=None)[source]
Generate a MOL2-format string or file from atomic coordinates and bonding data.
- Parameters:
atom_coords (list or np.ndarray of shape (N, 3)) – A list or NumPy array of atomic coordinates. Each element is a 3D coordinate (x, y, z) for a single atom.
atom_elements (list of str) – A list of atomic element symbols (e.g., ‘C’, ‘N’, ‘O’, etc.), one for each atom in atom_coords. The list must be the same length as atom_coords.
bond_order_dict (dict) – A dictionary mapping tuples of atom indices (i, j) to bond orders. The bond order may be a string like ‘1’, ‘2’, ‘3’, ‘ar’, etc. Example: {(0, 1): ‘1’, (1, 2): ‘2’}
ignore_dummy_atoms (bool, optional (default=True)) – If True, atoms with element symbol ‘X’ will be ignored in both atoms and bonds.
write_bond_orders (bool, optional (default=True)) – If True, writes the actual bond orders from bond_order_dict. If False, all bonds are assigned order ‘1’.
return_string (bool, optional (default=True)) – If True, returns the MOL2 content as a string. If False, writes to output_file.
output_file (str or None, optional) – If return_string is False, this must be the path to the file to write.
- Returns:
Returns the MOL2-format string if return_string is True, otherwise writes to file and returns None.
- Return type:
str or None
Notes
Atoms are renumbered starting from 1.
Element-based labels (e.g., C1, C2) are assigned using counts per element.
Substructures are inferred using connected components in the bond graph.
Only bonds where both atoms are not dummy atoms are retained if ignore_dummy_atoms is True.
- writenumberedxyz(filename)[source]
Write standard XYZ file with numbers instead of symbols.
- Parameters:
filename (str) – Path to XYZ file.
- writexyz(filename, symbsonly=True, ignoreX=False, ordering=False, writestring=False, withgraph=False, specialheader=False, no_tabs=False)[source]
Write standard XYZ file.
- Parameters:
filename (str) – Path to XYZ file.
symbsonly (bool, optional) – Only write symbols to file. Default is True.
ignoreX (bool, optional) – Ignore X element when writing. Default is False.
ordering (bool, optional) – If handed a list, will order atoms in a specific order. Default is False.
writestring (bool, optional) – Flag to write to a string if True or file if False. Default is False.
withgraph (bool, optional) – Flag to write with graph (after XYZ) if True. Default is False. If True, sparse graph written. All bonds indicated as single.
specialheader (str, optional) – String to write information into header. Default is False. If True, a special string is written.
no_tabs (bool, optional) – Whether or not to use tabs in coordinate columns.
- Returns:
ss – XYZ contents, if writestring is True.
- Return type:
Mol2D Class
- class molSimplify.Classes.mol2D.Mol2D(incoming_graph_data=None, **attr)[source]
Bases:
Graph
- denticity_hapticity(catoms)[source]
Get denticity and hapticity from the molecular graph and known coordinating atoms. Number of coordinating atoms = denticity * hapticity.
- Parameters:
catoms (list) – List of coordinating atom indices
- Returns:
denticity (int) – Number of independent coordination paths in graph
hapticity (list) – Length of each separate coordination path in graph
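One way to read these definitions is to group the coordinating atoms into sets that are bonded to each other: the number of groups gives the denticity and the group sizes give the hapticities. The sketch below implements that reading on a plain adjacency dictionary. It is an assumption about the approach, and the function and variable names are hypothetical.

```python
def count_denticity_hapticity(adjacency, catoms):
    """Group coordinating atoms into mutually bonded sets.

    adjacency: dict mapping node index -> set of bonded node indices.
    catoms: list of coordinating atom indices.
    Returns (denticity, sorted list of group sizes).
    """
    remaining = set(catoms)
    hapticity = []
    while remaining:
        # Flood-fill one group of coordinating atoms bonded to each other.
        stack = [remaining.pop()]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nbr in adjacency[node]:
                if nbr in remaining:
                    remaining.remove(nbr)
                    stack.append(nbr)
        hapticity.append(size)
    return len(hapticity), sorted(hapticity)


# Node 0 is the metal; atoms 1 and 2 bind side-on as a pair, atom 5 binds alone.
adjacency = {0: {1, 2, 5}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {0, 4}}
result = count_denticity_hapticity(adjacency, [1, 2, 5])
print(result)
```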
- find_metal(transition_metals_only: bool = True) List[int][source]
Find indices of metal(s) in a Mol2D class.
- Parameters:
transition_metals_only (bool, optional) – Only find transition metals. Default is true.
- Returns:
metal_list – List of indices of metal atoms in Mol2D.
- Return type:
Examples
Build Vanadyl acetylacetonate from SMILES:
>>> mol = Mol2D.from_smiles("CC(=[O+]1)C=C(C)O[V-3]12(#[O+])OC(C)=CC(C)=[O+]2")
>>> mol.find_metal()
[7]
- find_simple_paths(source, sink, cutoff=None, constraints=None)[source]
Find simple (i.e., no repeated nodes) path(s) between source and sink nodes in Mol2D class.
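To make the idea of a simple path concrete, the following sketch enumerates them with a depth-first search over a plain adjacency dictionary. It is not the Mol2D implementation: the function name is hypothetical, `cutoff` is assumed to mean the maximum number of edges in a path, and the `constraints` argument is not modeled.

```python
def simple_paths(adjacency, source, sink, cutoff=None):
    """Yield every path from source to sink that visits no node twice.

    cutoff, if given, is taken as the maximum number of edges in a path.
    """
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == sink:
            yield path
            continue
        if cutoff is not None and len(path) - 1 >= cutoff:
            continue
        for nbr in sorted(adjacency[node]):
            if nbr not in path:
                stack.append((nbr, path + [nbr]))


# A five-membered ring, such as the heavy atoms of furan.
ring = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
paths = sorted(simple_paths(ring, 0, 2))
print(paths)
```

In a ring there are two simple paths between any pair of nodes, one in each direction around the ring.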
- classmethod from_smiles(smiles: str)[source]
Create a Mol2D object from a SMILES string.
- Parameters:
smiles (str) – SMILES representation of the molecule.
- Returns:
Mol2D object of the molecule
- Return type:
Examples
Create a furan molecule from SMILES:
>>> mol = Mol2D.from_smiles("o1cccc1")
>>> mol
Mol2D(O1C4H4)
- graph_determinant(return_string: bool = True) str | float[source]
Calculates the molecular graph determinant.
- Parameters:
return_string (bool, optional) – Flag to return the determinant as a string. Default is True.
- Returns:
graph determinant
- Return type:
Examples
Create a furan molecule from SMILES:
>>> mol = Mol2D.from_smiles("o1cccc1")
and calculate the graph determinant:
>>> mol.graph_determinant()
'-19404698740'
- graph_hash() str[source]
Calculates the node attributed graph hash of the molecule.
- Returns:
node attributed graph hash
- Return type:
Examples
Create a furan molecule from SMILES:
>>> mol = Mol2D.from_smiles("o1cccc1")
and calculate the node attributed graph hash:
>>> mol.graph_hash()
'8366132e88f24330fedbbf24367877f7'
- graph_hash_edge_attr() str[source]
Calculates the edge attributed graph hash of the molecule.
- Returns:
edge attributed graph hash
- Return type:
Examples
Create a furan molecule from SMILES:
>>> mol = Mol2D.from_smiles("o1cccc1")
and calculate the edge attributed graph hash:
>>> mol.graph_hash_edge_attr()
'b9aa3fc505879a7a2a9a1789aee922f5'
monomer3D Class
- class molSimplify.Classes.monomer3D.monomer3D(three_lc='GLY', chain='undef', id=-1, occup=1.0, loc='')[source]
Bases:
object
Holds information about a monomer, used to do manipulations. Reads information from structure file (pdb) or is directly built from molsimplify.
- addAtom(atom, index=None)[source]
Adds an atom to the atoms attribute, which contains a list of atom3D class instances.
- centermass()[source]
Computes coordinates of center of mass of monomer.
- Returns:
center_of_mass – Coordinates of center of mass. List of length 3: (X, Y, Z).
- Return type:
list
- centroid()[source]
Computes coordinates of centroid of monomer.
- Returns:
centroid – Coordinates of centroid. List of length 3: (X, Y, Z).
- Return type:
list
- coords()[source]
Method to obtain string of coordinates in monomer.
- Returns:
coord_string – String of molecular coordinates with atom identities in XYZ format.
- Return type:
string
- getGreek(greek)[source]
Finds the Greek lettered carbon(s) or other atom(s) of the user’s choice.
- Parameters:
greek (string) – The Greek lettered atom (e.g. alpha carbon) we want. Inputs should be form ‘CA’ or similar.
- Returns:
greek_atoms – A list of atom3D class objects that contains the Greek lettered atom(s) we want.
- Return type:
list of atom3Ds
- identify()[source]
States whether the amino acid is (positively/negatively) charged, polar, or hydrophobic.
- Returns:
aa_type – Positively charged, Negatively charged, Polar, Hydrophobic
- Return type:
string
- setLoc(loc)[source]
Sets the conformation of a monomer in the chain of a protein.
- Parameters:
loc (str) – a one-character string representing the conformation
protein3D Class
- class molSimplify.Classes.protein3D.protein3D(pdbCode='undef')[source]
Bases:
object
Holds information about a protein, used to do manipulations. Reads information from structure file (pdb, cif) or is directly built from molsimplify.
- autoChooseConf()[source]
Automatically choose the conformation of a protein3D class instance based first on the greatest occupancy level and then, with all else equal, on the first conformation in the alphabet.
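The selection rule can be sketched as a single sort key: highest occupancy first, then alphabetical order of the conformation label. The helper below is hypothetical and works on a plain `{label: occupancy}` dictionary, which is an assumed input layout and not the protein3D data structure.

```python
def choose_conformation(occupancies):
    """Pick a conformation label from a {label: occupancy} mapping.

    Highest occupancy wins; ties go to the alphabetically first label.
    """
    return min(occupancies, key=lambda loc: (-occupancies[loc], loc))


print(choose_conformation({"B": 0.5, "A": 0.5, "C": 0.3}))
print(choose_conformation({"A": 0.4, "B": 0.6}))
```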
- convexhull()[source]
Computes convex hull of protein.
- Returns:
hull – Coordinates of convex hull.
- Return type:
array
- countAAs()[source]
Return the number of amino acid residues in a protein3D class.
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7')  # Fetch a PDB
fetched: 1os7
>>> pdb_system.countAAs()  # Returns the number of AAs in the PDB across all chains.
1121
- fetch_pdb(pdbCode)[source]
API query to fetch a pdb and write it as a protein3D class instance
- Parameters:
pdbCode (str) – Code for protein, e.g. 1os7
- findAA(three_lc='XAA')[source]
Find amino acids with a specific three-letter code.
- Parameters:
three_lc (str) – three-letter code, default as XAA.
- Returns:
inds – a set of amino acid indices with the specified symbol.
- Return type:
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7')  # Fetch a PDB
fetched: 1os7
Return a set of pairs, where each pair combines the chain name and the index of the specified amino acid (in this case, 'MET'):
>>> aa_set = pdb_system.findAA(three_lc='MET')
>>> sorted(aa_set)  # Sorting for reproducible order in doctest
[('A', 268), ('B', 268), ('C', 268), ('D', 268)]
- findAtom(sym='X', aa=True)[source]
Find atoms with a specific symbol that are contained in amino acids or heteromolecules.
- Parameters:
sym (str) – element symbol, default as X.
aa (boolean) – True if we want atoms contained in amino acids; False if we want atoms contained in heteromolecules.
- Returns:
inds – a list of atom indices with the specified symbol.
- Return type:
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7')  # Fetch a PDB
fetched: 1os7
>>> pdb_system.findAtom(sym="S", aa=True)  # Returns indices of sulphur atoms present in amino acids
[2166, 4442, 6733, 9041]
>>> pdb_system.findAtom(sym="S", aa=False)  # Returns indices of sulphur atoms present in heteromolecules
[9164, 9182, 9200]
- findMetal(transition_metals_only=True)[source]
Find metal(s) in a protein3D class.
- Parameters:
transition_metals_only (bool, optional) – Only find transition metals. Default is true.
- Returns:
metal_list – List of indices of metal atoms in protein3D.
- Return type:
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7')
fetched: 1os7
>>> pdb_system.findMetal()
[9160, 9178, 9196, 9214]
- freezeatom(atomIdx)[source]
Set the freeze attribute to be true for a given atom3D class.
- Parameters:
atomIdx (int) – Index for atom to be frozen.
- freezeatoms(Alist)[source]
Set the freeze attribute to be true for a given set of atom3D classes, given their indices. Preserves ordering, starts from largest index.
- Parameters:
Alist (list) – List of indices for atom3D instances to freeze.
- getBoundMols(h_id, aas_only=False)[source]
Get a list of molecules bound to a heteroatom, usually a metal.
- getChain(chain_id)[source]
Takes a chain of interest and turns it into its own protein3D class instance.
- Parameters:
chain_id (string) – The letter name of the chain of interest
- Returns:
p – A protein3D instance consisting of just the chain of interest
- Return type:
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7')  # Fetch a PDB
fetched: 1os7
>>> pdb_system.getChain('A')
- getMissingAAs()[source]
Get missing amino acid residues of a protein3D class.
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1MH1')  # Fetch a PDB
fetched: 1MH1
>>> pdb_system.getMissingAAs()  # This gives a list of monomer3D objects
[monomer3D(VAL, id=182), monomer3D(LYS, id=183), monomer3D(LYS, id=184)]
- getMissingAtoms()[source]
Get missing atoms of a protein3D class.
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1MH1')  # Fetch a PDB
fetched: 1MH1
>>> missing_atoms = pdb_system.getMissingAtoms()
List the atoms in the first set of missing_atoms:
>>> [atom.sym for atom in list(missing_atoms)[0]]
['C', 'C', 'C', 'C', 'C', 'C', 'O']
- getMolecule(a_id, aas_only=False)[source]
Finds the molecule that the atom is contained in.
- Parameters:
a_id (int) – The index of the desired atom whose molecule we want to find
aas_only (boolean) – True if we want to find atoms contained in amino acids only. False if we want atoms contained in all molecules. Default is False.
- Returns:
mol – The amino acid residue, nucleotide, or heteromolecule containing the atom
- Return type:
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7')  # Fetch a PDB
fetched: 1os7
This returns a molSimplify.Classes.monomer3D object, indicating that the atom is part of an amino acid or nucleotide:
>>> pdb_system.getMolecule(a_id=2166)
monomer3D(MET, id=268)
This returns a mol3D object, indicating that the atom is part of a molecule that is not an amino acid or nucleotide:
>>> pdb_system.getMolecule(a_id=9164)
mol3D(S1O3N1C2)
>>> pdb_system.getMolecule(a_id=9164).name  # The name of the molecule; in this case, it is 'TAU'
'TAU'
- readMetaData()[source]
API query to fetch XML data from a pdb and add its useful attributes to a protein3D class.
- Parameters:
pdbCode (str) – Code for protein, e.g. 1os7
- readfrompdb(text)[source]
Read PDB into a protein3D class instance.
- Parameters:
text (str) – String of path to PDB file. Path may be local or global. May also be the text of a PDB file from the internet.
- setAAs(aas)[source]
Set monomers of a protein3D class to different monomers.
- Parameters:
aas (dictionary) – Keyed by chain and location Valued by monomer3D monomers (amino acids or nucleotides)
- setAtoms(atoms)[source]
Set atom indices of a protein3D class to atoms.
- Parameters:
atoms (dictionary) – Keyed by atom index Valued by atom3D atom that has that index
- setBonds(bonds)[source]
Sets the bonded atoms in the protein.
This is effectively the molecular graph.
- Parameters:
bonds (dictionary) – Keyed by atom3D atoms in the protein Valued by a set consisting of bonded atoms
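A dictionary keyed by atoms and valued by sets of bonded atoms is an adjacency mapping. The sketch below shows that shape under stated assumptions: plain strings stand in for atom3D objects, and the `add_bond` helper is hypothetical and not part of molSimplify.

```python
from collections import defaultdict


def add_bond(bonds, a, b):
    """Record an undirected bond between atoms a and b (hypothetical helper)."""
    bonds[a].add(b)
    bonds[b].add(a)


# Strings stand in for atom3D instances (assumption for illustration only).
bonds = defaultdict(set)
add_bond(bonds, "N1", "CA1")
add_bond(bonds, "CA1", "C1")
add_bond(bonds, "C1", "O1")

# The neighbors of an atom are the set stored under its key.
neighbors_of_ca = bonds["CA1"]
```

Storing each bond in both directions keeps neighbor lookups to a single dictionary access.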
- setChains(chains)[source]
Set chains of a protein3D class to different chains.
- Parameters:
chains (dictionary) – Keyed by desired chain IDs. Valued by the list of molecules in the chain.
- setConf(conf)[source]
Set possible conformations of a protein3D class to a new list.
- Parameters:
conf (list) – List of possible conformations for applicable amino acids.
- setDataCompleteness(DataCompleteness)[source]
Set DataCompleteness value of protein3D class.
- Parameters:
DataCompleteness (float) – The desired new data completeness value.
- setEDIAScores()[source]
Sets the EDIA score of a protein3D class.
- Parameters:
pdbCode (string) – The 4-character code of the protein3D class.
- setHetmols(hetmols)[source]
Set heteromolecules of a protein3D class to different ones.
- Parameters:
hetmols (dictionary) – Keyed by chain and location Valued by mol3D heteromolecules
- setIndices(a_ids)[source]
Set the mapping from atoms to their indices for a protein3D class.
- Parameters:
a_ids (dictionary) – Keyed by atom3D atom Valued by its index
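The dictionaries described for setAtoms (index to atom) and setIndices (atom to index) are inverses of each other. A minimal sketch of building one from the other, assuming strings as stand-ins for atom3D objects and made-up indices:

```python
# Hypothetical atoms dictionary: index -> atom (strings stand in for atom3D).
atoms = {0: "N", 1: "CA", 2: "C"}

# Invert it to obtain the atom -> index mapping.
a_ids = {atom: idx for idx, atom in atoms.items()}
```

The inversion only round-trips cleanly when every atom object is distinct, which is why real atom objects rather than element symbols would be used as keys.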
- setMissingAAs(missing_aas)[source]
Set missing amino acids of a protein3D class to a new list.
- Parameters:
missing_aas (list) – List of missing amino acids.
- setMissingAtoms(missing_atoms)[source]
Set missing atoms of a protein3D class to a new dictionary.
- Parameters:
missing_atoms (dictionary) – Keyed by amino acid residues / nucleotides of origin Valued by missing atoms
- setPDBCode(pdbCode)[source]
Sets the 4-character PDB code of a protein3D class instance.
- Parameters:
pdbCode (string) – Desired 4-character PDB code
- setRSRZ(RSRZ)[source]
Set RSRZ score of protein3D class.
- Parameters:
RSRZ (float) – The desired new RSRZ score.
- setRfree(Rfree)[source]
Set Rfree value of protein3D class.
- Parameters:
Rfree (float) – The desired new Rfree value.
- setTwinL(TwinL)[source]
Set TwinL score of protein3D class.
- Parameters:
TwinL (float) – The desired new TwinL score.
- setTwinL2(TwinL2)[source]
Set TwinL squared score of protein3D class.
- Parameters:
TwinL2 (float) – The desired new TwinL squared score.
- stripAtoms(atoms_stripped)[source]
Removes certain atoms from the protein3D class instance.
- Parameters:
atoms_stripped (list) – List of atom3D indices that should be removed
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.stripAtoms([2166, 4442, 6733, 2165]) # This removes the atoms with the listed indices
- stripHetMol(hetmol)[source]
Removes all heteroatoms that are part of the specified heteromolecule from the protein3D class instance.
- Parameters:
hetmol (str) – String representing the name of a heteromolecule whose heteroatoms should be stripped from the protein3D class instance
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('3I40') # Fetch a PDB
fetched: 3I40
>>> pdb_system.stripHetMol('HOH')
globalvars
- class molSimplify.Classes.globalvars.globalvars(*args, **kwargs)[source]
Bases:
object
Globalvars class. Defines global variables used throughout the code, including the periodic table.
- add_custom_path(path)[source]
Record custom path in ~/.molSimplify file
- Parameters:
path (str) – Custom data path to record in the ~/.molSimplify file.
- amass() Dict[str, Tuple[float, int, float, int]][source]
Get the atomic mass dictionary.
- Returns:
amassdict – Dictionary containing atomic masses.
- Return type:
Dict[str, Tuple[float, int, float, int]]
- bbcombs_mononuc()[source]
Get backbone combinations dictionary
- Returns:
bbcombs_mononuc – Backbone combination dictionary for different geometries.
- Return type:
- bondsdict()[source]
Get the bond dictionary.
- Returns:
bondsdict – Dictionary containing bond lengths.
- Return type:
- elementsbynum()[source]
Returns list of elements by number
- Returns:
elementsbynum – List of elements by number
- Return type:
- endict()[source]
Returns electronegativity dictionary.
- Returns:
endict – Electronegativity dictionary
- Return type:
- geo_check_dictionary()[source]
Returns the dictionary of geo check measurements.
- Returns:
geo_check_dictionary – Geo check measurement dictionary.
- Return type:
- getAllAAs()[source]
Gets all amino acids
- Returns:
amino_acids – Dictionary of standard amino acids
- Return type:
dictionary
- get_all_angle_refs()[source]
Get reference angle dict.
- Returns:
all_angle_refs – Reference angles for various geometries.
- Return type:
- get_all_geometries()[source]
Get available geometries.
- Returns:
all_geometries – All available geometries.
- Return type:
- get_all_polyhedra()[source]
Get reference polyhedra dict.
- Returns:
all_polyhedra – Reference polyhedra for various geometries.
- Return type:
- groups()[source]
Returns dict of elements by groups.
- Returns:
groups_dict – Groups dictionary.
- Return type:
- periods()[source]
Returns dict of elements by periods.
- Returns:
periods_dict – Periods dictionary.
- Return type:
- polarizability() Dict[str, float][source]
Get the polarizability dictionary.
- Returns:
poldict – Dictionary containing polarizabilities.
- Return type:
Dict[str, float]
- testTF()[source]
Tests to see whether keras and tensorflow are available.
- Returns:
tf_flag – True if tensorflow and keras are available.
- Return type:
- testmatplotlib()[source]
Tests to see if matplotlib is available
- Returns:
mpl_flag – True if matplotlib is available
- Return type:
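Availability flags such as tf_flag and mpl_flag can be computed without importing the heavy packages themselves. The sketch below uses the standard library's `importlib.util.find_spec` for this; it is one common approach and an assumption here, not necessarily how testTF and testmatplotlib are implemented. The `module_available` helper is hypothetical.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


# Flags analogous to those described above (the names mirror the documented return values).
tf_flag = module_available("tensorflow") and module_available("keras")
mpl_flag = module_available("matplotlib")
```

Checking for a spec avoids the start-up cost of importing tensorflow just to learn whether it is installed, although it does not prove the import would succeed.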
rundiag
- class molSimplify.Classes.rundiag.run_diag[source]
Bases:
object
Class of run diagnostic information for automated decision making and property prediction.
- set_ANN(ANN_flag, ANN_reason=False, ANN_dict=False, catalysis_flag=False, catalysis_reason=False)[source]
Set the ANN properties.
- Parameters:
ANN_flag (bool) – Flag for whether the ANN variables exist.
ANN_reason (str, optional) – Reasoning for why ANN failed if failed. Default is False.
ANN_dict (dict, optional) – Dictionary with ANN values and uncertainty.
catalysis_flag (bool, optional) – Whether or not catalytic properties are set.
catalysis_reason (str, optional) – Reasoning for why catalytic ANN failed if failed. Default is False.
- set_dict_bl(dict_bl)[source]
Set the dictionary of ANN bond lengths.
- Parameters:
dict_bl (dict) – Dictionary with ANN bond lengths.
- set_mol(mol)[source]
Set the ANN molecule.
- Parameters:
mol (mol3D) – mol3D class instance for optimized molecule.