Classes
atom3D Class
- class molSimplify.Classes.atom3D.atom3D(Sym: str = 'C', xyz: List[float] | None = None, name: str | None = None, partialcharge: float | None = None, Tfactor=0, greek='', occup=1.0, loc='', line='')[source]
Bases:
object
atom3D class. Base class in molSimplify for representing an element.
- Parameters:
Sym (str, optional) – Symbol for atom3D instantiation. Element symbol. Default is ‘C’.
xyz (list, optional) – List of coordinates for new atom. Default is [0.0, 0.0, 0.0].
name (str, optional) – Unique identifier for atom3D instantiation. Default is None.
partialcharge (float, optional) – Charge assigned to atom when added to mol. Default is None.
- coords()[source]
Get coordinates of a given atom.
- Returns:
coords – List of coordinates in X, Y, Z format.
- Return type:
- mutate(newType='C')[source]
Mutate an element to another element in the atom3D.
- Parameters:
newType (str, optional) – Element name for new element. Default is ‘C’.
- setEDIA(score)[source]
Sets the EDIA score of an individual atom3D.
- Parameters:
score (float) – Desired EDIA score of atom
- setcoords(xyz)[source]
Set coordinates of an atom3D class to a new location.
- Parameters:
xyz (list) – List of coordinates, has length 3: [X, Y, Z]
mol3D Class
- class molSimplify.Classes.mol3D.mol3D(name='ABC', loc='', use_atom_specific_cutoffs=False)[source]
Bases:
object
Holds information about a molecule, used to do manipulations. Reads information from structure file (XYZ, mol2) or is directly built from molsimplify. Please be cautious with periodic systems.
Example instantiation of an octahedral iron-ammonia complex from an XYZ file:
>>> complex_mol = mol3D()
>>> complex_mol.readfromxyz('fe_nh3_6.xyz')
- ACM(idx1, idx2, idx3, angle)[source]
Performs angular movement on mol3D class. A submolecule is rotated about idx2. Operates directly on class.
- ACM_axis(idx1, idx2, axis, angle)[source]
Performs angular movement about an axis on mol3D class. A submolecule is rotated about idx2. Operates directly on class.
- BCM(idx1, idx2, d)[source]
Performs bond centric manipulation (same as Avogadro, stretching and squeezing bonds). A submolecule is translated along the bond axis connecting it to an anchor atom.
Illustration: H3A-BH3 -> H3A----BH3 where B = idx1 and A = idx2
- Parameters:
>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('H', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))
>>> complex_mol.BCM(1, 0, 0.7)  # Set distance between atoms 0 and 1 to be 0.7 angstroms. Move atom 1.
>>> complex_mol.coordsvect()
array([[0. , 0. , 0. ],
       [0. , 0. , 0.7]])
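The translation that BCM describes can be illustrated without molSimplify. The following is a minimal standalone sketch, not the library's implementation: the helper name `set_bond_length` is hypothetical, and it moves a single atom rather than a whole submolecule.

```python
import math


def set_bond_length(moving, anchor, d):
    """Return new coordinates for `moving` so that it sits `d` away from `anchor`.

    The atom is translated along the existing anchor -> moving axis.
    """
    axis = [m - a for m, a in zip(moving, anchor)]
    length = math.sqrt(sum(c * c for c in axis))
    if length == 0.0:
        raise ValueError("atoms coincide; the bond axis is undefined")
    scale = d / length
    return [a + c * scale for a, c in zip(anchor, axis)]


# Two hydrogens 1.0 angstrom apart along z, as in the two-hydrogen example above.
anchor_xyz = [0.0, 0.0, 0.0]
moving_xyz = [0.0, 0.0, 1.0]
new_xyz = set_bond_length(moving_xyz, anchor_xyz, 0.7)
print(new_xyz)
```

Only the moved atom changes position; the anchor stays fixed, which mirrors the roles of idx1 (moved) and idx2 (anchor) in the method signature.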
- BCM_opt(idx1, idx2, d, ff='uff')[source]
Performs bond centric manipulation (same as Avogadro, stretching and squeezing bonds). A submolecule is translated along the bond axis connecting it to an anchor atom. Performs force field optimization after, freezing the moved bond length.
Illustration: H3A-BH3 -> H3A----BH3 where B = idx1 and A = idx2
- IsOct(init_mol=None, dict_check=False, angle_ref=False, flag_catoms=False, catoms_arr=None, debug=False, flag_lbd=True, BondedOct=True, skip=False, flag_deleteH=True, silent=False, use_atom_specific_cutoffs=True)[source]
Main geometry check method for octahedral structures
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
dict_check (dict, optional) – Cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists of the expected angles (A-metal-B) for each connection atom.
flag_catoms (bool, optional) – Whether or not to return the catoms array. Default is False.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. Users can overwrite this connection atom array by explicit input. Default is None.
debug (bool, optional) – Flag for extra printout. Default is False.
flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.
BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is True.
skip (list, optional) – Geometry checks to skip. Default is False.
flag_deleteH (bool, optional) – Flag to delete Hs in ligand comparison. Default is True.
silent (bool, optional) – Flag for warning suppression. Default is False.
use_atom_specific_cutoffs (bool, optional) – Determine bonding with atom specific cutoffs.
- Returns:
flag_oct (int) – Good (1) or bad (0) structure.
flag_list (list) – Metrics that are preventing a good geometry.
dict_oct_info (dict) – Dictionary of measurements of geometry.
- IsStructure(init_mol=None, dict_check=False, angle_ref=False, num_coord=5, flag_catoms=False, catoms_arr=None, debug=False, skip=False, flag_deleteH=True)[source]
Main geometry check method for square pyramidal structures
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
dict_check (dict, optional) – Cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists of the expected angles (A-metal-B) for each connection atom.
num_coord (int, optional) – The metal coordination number.
flag_catoms (bool, optional) – Whether or not to return the catoms array. Default is False.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. Users can overwrite this connection atom array by explicit input. Default is None.
debug (bool, optional) – Flag for extra printout. Default is False.
skip (list, optional) – Geometry checks to skip. Default is False.
flag_deleteH (bool, optional) – Flag to delete Hs in ligand comparison. Default is True.
- Returns:
flag_oct (int) – Good (1) or bad (0) structure.
flag_list (list) – Metrics that are preventing a good geometry.
dict_oct_info (dict) – Dictionary of measurements of geometry.
- Oct_inspection(init_mol=None, catoms_arr=None, dict_check=False, angle_ref=False, flag_lbd=False, dict_check_loose=False, BondedOct=True, debug=False)[source]
Used to track down the changing geo_check metrics in a DFT geometry optimization. Catoms_arr always specified.
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. Users can overwrite this connection atom array by explicit input. Default is None.
dict_check (dict, optional) – Cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists of the expected angles (A-metal-B) for each connection atom.
flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is False.
dict_check_loose (dict, optional) – Dictionary of geo check metrics, if a dictionary other than the default one from globalvars is desired.
BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is True.
debug (bool, optional) – Flag for extra printout. Default is False.
- Returns:
flag_oct (int) – Good (1) or bad (0) structure.
flag_list (list) – Metrics that are preventing a good geometry.
dict_oct_info (dict) – Dictionary of measurements of geometry.
flag_oct_loose (int) – Good (1) or bad (0) structures with loose cutoffs.
flag_list_loose (list) – Metrics that are preventing a good geometry with loose cutoffs.
- RCAngle(idx1, idx2, idx3, anglei, anglef, angleint=1.0, writegeo=False, dir_name='rc_angle_geometries')[source]
Generates geometries along a given angle reaction coordinate. In the given molecule, idx1 is rotated about idx2 with respect to idx3. Operates directly on class.
- Parameters:
idx1 (int) – Index of bonded atom containing submolecule to be moved.
idx2 (int) – Index of anchor atom 1.
idx3 (int) – Index of anchor atom 2.
anglei (float) – New initial bond angle in degrees.
anglef (float) – New final bond angle in degrees.
angleint (float, optional) – The angle interval in which the angle is changed. Default is 1.0 degree.
writegeo (bool, optional) – If True, the generated geometries are written to a directory; if False, they are not. Default is False.
dir_name (str, optional) – The directory to which generated reaction coordinate geometries are written if writegeo=True. Default is 'rc_angle_geometries'.
>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('O', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))
>>> complex_mol.addAtom(atom3D('H', [0, 1, 0]))
Generate reaction coordinate geometries using the given structure by changing the angle between atoms 2, 1, and 0 from 90 degrees to 160 degrees in intervals of 10 degrees:
>>> complex_mol.RCAngle(2, 1, 0, 90, 160, 10)
[mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2)]
Generate reaction coordinate geometries using the given structure by changing the angle between atoms 2, 1, and 0 from 160 degrees to 90 degrees in intervals of 10 degrees. The generated geometries will not be written to a directory:
>>> complex_mol.RCAngle(2, 1, 0, 160, 90, -10)
[mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2), mol3D(O1H2)]
- RCDistance(idx1, idx2, disti, distf, distint=0.05, writegeo=False, dir_name='rc_distance_geometries')[source]
Generates geometries along a given distance reaction coordinate. In the given molecule, idx1 is moved with respect to idx2. Operates directly on class.
- Parameters:
idx1 (int) – Index of bonded atom containing submolecule to be moved.
idx2 (int) – Index of anchor atom 1.
disti (float) – New initial bond distance in angstrom.
distf (float) – New final bond distance in angstrom.
distint (float, optional) – The distance interval in which the distance is changed. Default is 0.05 angstrom.
writegeo (bool, optional) – If True, the generated geometries are written to a directory; if False, they are not. Default is False.
dir_name (str, optional) – The directory to which generated reaction coordinate geometries are written if writegeo=True. Default is 'rc_distance_geometries'.
>>> complex_mol = mol3D()
>>> complex_mol.addAtom(atom3D('H', [0, 0, 0]))
>>> complex_mol.addAtom(atom3D('H', [0, 0, 1]))
Generate reaction coordinate geometries using the given structure by changing the distance between atoms 1 and 0 from 1.0 to 3.0 angstrom (atom 1 is moved) in intervals of 0.5 angstrom:
>>> complex_mol.RCDistance(1, 0, 1.0, 3.0, 0.5)
[mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2)]
Generate reaction coordinate geometries using the given structure by changing the distance between atoms 1 and 0 from 3.0 to 1.0 angstrom (atom 1 is moved) in intervals of 0.25 angstrom. The generated geometries will not be written to a directory:
>>> complex_mol.RCDistance(1, 0, 3.0, 1.0, -0.25)
[mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2), mol3D(H2)]
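The number of geometries returned in the examples follows from the endpoints and the step. The sketch below reproduces that count in plain Python. It assumes both endpoints are included and that the range is an exact multiple of the step, which is what the documented outputs suggest; `scan_values` is a hypothetical helper, not part of molSimplify.

```python
def scan_values(start, stop, step):
    """List the scan points from start to stop (inclusive) in increments of step."""
    if step == 0:
        raise ValueError("step must be nonzero")
    if (stop - start) * step < 0:
        raise ValueError("step has the wrong sign for this range")
    n_steps = int(round((stop - start) / step))
    return [start + i * step for i in range(n_steps + 1)]


forward = scan_values(1.0, 3.0, 0.5)     # stretching scan
backward = scan_values(3.0, 1.0, -0.25)  # compressing scan needs a negative step
print(len(forward), len(backward))
```

A scan that runs from a larger to a smaller distance needs a negative interval, as in the second RCDistance call.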
- Structure_inspection(init_mol=None, catoms_arr=None, num_coord=5, dict_check=False, angle_ref=False, flag_lbd=False, dict_check_loose=False, BondedOct=True, debug=False)[source]
Used to track down the changing geo_check metrics in a DFT geometry optimization. Specifically for a square pyramidal structure. Catoms_arr always specified.
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. Users can overwrite this connection atom array by explicit input. Default is None.
num_coord (int, optional) – The metal coordination number.
dict_check (dict, optional) – Cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists of the expected angles (A-metal-B) for each connection atom.
flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is False.
dict_check_loose (dict, optional) – Dictionary of geo check metrics, if a dictionary other than the default one from globalvars is desired.
BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is True.
debug (bool, optional) – Flag for extra printout. Default is False.
- Returns:
flag_oct (int) – Good (1) or bad (0) structure.
flag_list (list) – Metrics that are preventing a good geometry.
dict_oct_info (dict) – Dictionary of measurements of geometry.
flag_oct_loose (int) – Good (1) or bad (0) structures with loose cutoffs.
flag_list_loose (list) – Metrics that are preventing a good geometry with loose cutoffs.
- addAtom(atom: atom3D, index: int | None = None, auto_populate_BO_dict: bool = True)[source]
Adds an atom to the atoms attribute, which contains a list of atom3D class instances.
- Parameters:
>>> complex_mol = mol3D() >>> C_atom = atom3D('C', [1, 1, 1]) >>> complex_mol.addAtom(C_atom) # Add carbon atom at cartesian position 1, 1, 1 to mol3D object.
- add_bond(idx1: int, idx2: int, bond_type: int) dict [source]
Add a bond of order bond_type between the atom at idx1 and the atom at idx2. Adjusts bo_dict and graph only, not BO_mat nor OBMol.
- alignmol(atom1, atom2)[source]
Aligns two molecules such that the coordinates of two atoms overlap. Second molecule is translated relative to the first. No rotations are performed. Use other functions for rotations. Moves the mol3D class.
- apply_ffopt(constraints=False, ff='uff')[source]
Apply forcefield optimization to a given mol3D class.
- apply_ffopt_list_constraints(list_constraints=False, ff='uff')[source]
Apply forcefield optimization to a given mol3D class. Differs from apply_ffopt in that one can specify constrained atoms as a list.
- aromatic_charge(bo_graph)[source]
Get the charge of aromatic rings based on 4*n+2 rule.
- Parameters:
bo_graph (numpy.array) – bond order matrix
- assign_graph_from_net(path_to_net, return_graph=False)[source]
Uses a .net file to assign a graph (and return if needed)
- calcCharges(charge=0, method='QEq')[source]
Compute the partial charges of a molecule using openbabel.
- centermass()[source]
Computes coordinates of center of mass of molecule.
- Returns:
center_of_mass – Coordinates of center of mass. List of length 3: (X, Y, Z).
- Return type:
- centersym()[source]
Computes coordinates of center of symmetry of molecule. Identical to centermass, but not weighted by atomic masses.
- Returns:
center_of_symmetry – Coordinates of center of symmetry. List of length 3: (X, Y, Z).
- Return type:
- check_angle_linear(catoms_arr=None)[source]
Get the ligand orientation for linear ligands.
- Parameters:
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. Users can overwrite this connection atom array by explicit input. Default is None.
- Returns:
dict_orientation – Dictionary containing average deviation from linearity (devi_linear_avrg) and max deviation (devi_linear_max).
- Return type:
- closest_H_2_metal(delta=0)[source]
Get closest hydrogen atom to metal.
- Parameters:
delta (float) – Distance tolerance in angstrom.
- Returns:
flag (bool) – Flag for if a hydrogen exists in the distance tolerance.
min_dist_H (float) – Minimum distance for a hydrogen.
min_dist_nonH (float) – Minimum distance for a heavy atom.
- combine(mol, bond_to_add=[], dirty=False)[source]
Combines two molecules. Each atom in the second molecule is appended to the first while preserving orders. Assumes operation with a given mol3D instance, when handed a second mol3D instance.
- Parameters:
- Returns:
cmol – New mol3D class containing the two molecules combined.
- Return type:
- convert2OBMol(force_clean=False, ignoreX=False)[source]
Converts mol3D class instance to OBMol class instance. Stores as OBMol attribute. Necessary for force field optimizations and other openbabel operations.
- convert2OBMol2(force_clean=False, ignoreX=False)[source]
Converts mol3D class instance to OBMol class instance, but uses mol2 function, so bond orders are not interpreted, but rather read through the mol2. Stores as OBMol attribute. Necessary for force field optimizations and other openbabel operations.
- convert2mol3D()[source]
Converts OBMol class instance to mol3D class instance. Generally used after openbabel operations, such as FF optimizing a molecule. Updates the mol3D as necessary.
- convexhull()[source]
Computes convex hull of molecule.
- Returns:
hull – Coordinates of convex hull.
- Return type:
array
- coords()[source]
Method to obtain string of coordinates in molecule.
- Returns:
coord_string – String of molecular coordinates with atom identities in XYZ format.
- Return type:
string
- coordsvect()[source]
Method to obtain array of coordinates in molecule.
- Returns:
list_of_coordinates – Two dimensional numpy array of molecular coordinates. (N by 3) dimension, N is number of atoms.
- Return type:
np.array
- copymol3D(mol0)[source]
Copies properties and atoms of another existing mol3D object into the current mol3D object. Must be called on a fresh mol3D instance. WARNING: never use mol3D = mol0 to do this. Plain assignment does not create a copy.
- Parameters:
mol0 (mol3D) – mol3D of molecule to be copied.
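The warning against plain assignment comes down to ordinary Python name binding. The sketch below shows the difference with a hypothetical stand-in class (`Molecule` is not mol3D), and it uses `copy.deepcopy` only to demonstrate an independent copy, not to show what copymol3D does internally.

```python
import copy


class Molecule:
    """Hypothetical stand-in for a container of atoms; not the mol3D class."""

    def __init__(self):
        self.atoms = []


original = Molecule()
original.atoms.append(["H", [0.0, 0.0, 0.0]])

alias = original                       # a second name for the same object
independent = copy.deepcopy(original)  # a separate object with its own atom list

alias.atoms.append(["H", [0.0, 0.0, 0.74]])

print(len(original.atoms))     # the change made through the alias is visible here
print(len(independent.atoms))  # the deep copy is unaffected
```

With a real mol3D, the documented route is to create a new instance and call copymol3D on it.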
- count_atoms(exclude=['H', 'h', 'x', 'X'])[source]
Count the number of atoms, excluding certain atoms.
- Parameters:
exclude (list) – list of symbols for atoms to exclude.
- Returns:
count – The number of atoms whose symbols are not in exclude.
- Return type:
integer
- count_electrons(charge=0)[source]
Count the number of electrons in a molecule.
- Parameters:
charge (int, optional) – Net charge of a molecule. Default is neutral.
- Returns:
count – The number of electrons in the system.
- Return type:
integer
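As a rough illustration of electron counting, the sketch below sums atomic numbers and subtracts the net charge. This is an assumption about what the count represents, not a reproduction of the library's code; `total_electrons` and the small `ATOMIC_NUMBERS` table are hypothetical.

```python
# Atomic numbers for the few elements used in this example.
ATOMIC_NUMBERS = {"H": 1, "C": 6, "N": 7, "O": 8, "Fe": 26}


def total_electrons(symbols, charge=0):
    """Sum of atomic numbers minus the net charge of the molecule."""
    return sum(ATOMIC_NUMBERS[s] for s in symbols) - charge


# Hexaammine iron complex with a net charge of +2: one Fe, six N, eighteen H.
symbols = ["Fe"] + ["N"] * 6 + ["H"] * 18
n_electrons = total_electrons(symbols, charge=2)
print(n_electrons)
```

A positive charge removes electrons from the count, and a negative charge adds them.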
- count_nonH_atoms()[source]
Count the number of heavy atoms.
- Returns:
count – the number of heavy atoms
- Return type:
integer
- count_specific_atoms(atom_types=['x', 'X'])[source]
Count the number of atoms, including only certain atoms.
- Parameters:
atom_types (list) – list of symbols for atoms to include.
- Returns:
count – The number of atoms whose symbols are in atom_types.
- Return type:
integer
- createMolecularGraph(oct=True, strict_cutoff=False, catom_list=None)[source]
Create molecular graph of a molecule given X, Y, Z positions. Bond order is not interpreted by this. Updates graph attribute of the mol3D class.
- deleteatom(atomIdx)[source]
Delete a specific atom from the mol3D class given an index.
- Parameters:
atomIdx (int) – Index for the atom3D to remove.
- deleteatoms(Alist)[source]
Deletes multiple atoms from the mol3D class given a set of indices. Preserves ordering, starts from largest index.
- Parameters:
Alist (list) – List of indices for atom3D instances to remove.
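Starting from the largest index matters because every deletion shifts the positions of the items after it. The sketch below shows the pattern on a plain list; `delete_indices` is a hypothetical helper, not the mol3D method.

```python
def delete_indices(items, indices):
    """Delete the given positions from a list in place, largest index first."""
    for idx in sorted(set(indices), reverse=True):
        del items[idx]
    return items


atoms = ["Fe", "N1", "H1", "H2", "H3"]
delete_indices(atoms, [2, 4])
print(atoms)

# Deleting in ascending order would remove index 2 first, leaving four items,
# so the later request for index 4 would no longer point at "H3".
```

Deleting from the end keeps every remaining index valid until it is used.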
- dict_check_processing(dict_check, num_coord=6, debug=False, silent=False)[source]
Process the self.geo_dict to get the flag_oct and flag_list, setting dict_check as the cutoffs.
- Parameters:
- Returns:
flag_oct (int) – Good (1) or bad (0) structure.
flag_list (list) – Metrics that are preventing a good geometry.
- draw_svg(filename)[source]
Draw image of molecule and save to SVG.
- Parameters:
filename (str) – Name of file to save SVG to.
- findAtomsbySymbol(sym: str) List[int] [source]
Find all elements with a given symbol in a mol3D class.
- findcloseMetal(atom0)[source]
Find the nearest metal to a given atom3D class. Returns heaviest element if no metal found.
- findsubMol(atom0, atomN, smart=False)[source]
Finds a submolecule within the molecule given the starting atom and the separating atom. Illustration: H2A-B-C-DH2 will return C-DH2 if C is the starting atom and B is the separating atom. Alternatively, if C is the starting atom and D is the separating atom, returns H2A-B-C.
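The illustration above can be read as a graph search that starts at one atom and refuses to pass through the separating atom. The sketch below assumes an explicit bond list and hypothetical atom indices. It does not model the `smart` option or the library's own bond detection.

```python
from collections import deque


def find_submolecule(bonds, start, separator):
    """Return the sorted atoms reachable from start without passing through separator."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbors.get(current, ()):
            if nxt != separator and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(seen)


# H2A-B-C-DH2 with A=0, B=1, C=2, D=3, hydrogens on A = 4, 5, hydrogens on D = 6, 7.
bonds = [(0, 1), (1, 2), (2, 3), (0, 4), (0, 5), (3, 6), (3, 7)]
side_with_d = find_submolecule(bonds, start=2, separator=1)  # C-DH2
side_with_a = find_submolecule(bonds, start=2, separator=3)  # H2A-B-C
print(side_with_d, side_with_a)
```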
- freezeatom(atomIdx)[source]
Set the freeze attribute to be true for a given atom3D class.
- Parameters:
atomIdx (int) – Index for atom to be frozen.
- freezeatoms(Alist)[source]
Set the freeze attribute to be true for a given set of atom3D classes, given their indices. Preserves ordering, starts from largest index.
- Parameters:
Alist (list) – List of indices for atom3D instances to freeze.
- geo_dict_initialization()[source]
Initialization of geometry check dictionaries according to dict_oct_check_st.
- geo_maxatomdist(mol2)[source]
Compute the max atom distance between two molecules. Does not align molecules. For that, use geometry.kabsch().
- geo_rmsd(mol2)[source]
Compute the RMSD between two molecules. Does not align molecules. For that, use geometry.kabsch().
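For two structures with the same atom ordering, both metrics reduce to per-atom distances. The sketch below is a standalone approximation that assumes atoms are paired by index and, like the methods described here, performs no alignment. The function names are hypothetical.

```python
import math


def pair_distances(coords_a, coords_b):
    """Distance between each pair of same-index atoms in two coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("both structures need the same nonzero number of atoms")
    return [
        math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
        for a, b in zip(coords_a, coords_b)
    ]


def rmsd(coords_a, coords_b):
    """Root-mean-square of the per-atom distances."""
    dists = pair_distances(coords_a, coords_b)
    return math.sqrt(sum(d * d for d in dists) / len(dists))


def max_atom_distance(coords_a, coords_b):
    """Largest per-atom distance."""
    return max(pair_distances(coords_a, coords_b))


mol_a = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
mol_b = [[0.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
rmsd_value = rmsd(mol_a, mol_b)
max_value = max_atom_distance(mol_a, mol_b)
print(rmsd_value, max_value)
```

Because nothing is aligned, a rigid translation of one structure changes both numbers, which is why the text points to geometry.kabsch() for alignment.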
- getAngle(idx0, idx1, idx2)[source]
Get angle between three atoms identified by their indices. Specifically, get angle between vectors formed by atom0->atom1 and atom2->atom1.
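The angle described here is the one at the middle atom. The sketch below computes it from raw coordinates using the two vectors named in the description. `angle_deg` is a hypothetical helper that takes positions rather than atom indices.

```python
import math


def angle_deg(p0, p1, p2):
    """Angle in degrees at p1, between the vectors p0 -> p1 and p2 -> p1."""
    v1 = [b - a for a, b in zip(p0, p1)]
    v2 = [b - a for a, b in zip(p2, p1)]
    n1 = math.sqrt(sum(c * c for c in v1))
    n2 = math.sqrt(sum(c * c for c in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("coincident atoms; the angle is undefined")
    cos_theta = sum(a * b for a, b in zip(v1, v2)) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


right_angle = angle_deg([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0])
straight = angle_deg([0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [0.0, 0.0, -2.0])
print(right_angle, straight)
```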
- getAtomTypes()[source]
Get unique elements in a molecule
- Returns:
unique_atoms_list – List of unique elements in molecule by symbol.
- Return type:
- getAtoms()[source]
Get all atoms within a molecule.
- Returns:
atom_list – List of atom3D classes for all elements in a mol3D.
- Return type:
- getBondedAtoms(idx: int) List[int] [source]
Gets atoms bonded to a specific atom. This is determined based on element-specific distance cutoffs, rather than predefined valences. This method is ideal for metals because bond orders are ill-defined. For pure organics, the OBMol class provides better functionality.
- getBondedAtomsBOMatrix(idx)[source]
Get atoms bonded to an atom referenced by index, using the BO matrix.
- getBondedAtomsBOMatrixAug(idx)[source]
Get atoms bonded to an atom referenced by index, using the augmented BO matrix.
- getBondedAtomsByCoordNo(idx, CoordNo=6)[source]
Gets atoms bonded to a specific atom by coordination number.
- getBondedAtomsByThreshold(idx, threshold=1.15)[source]
Gets atoms bonded to a specific atom. This method uses a threshold for determination of a bond.
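One common way to apply such a threshold is to call two atoms bonded when their distance is below the threshold times the sum of their covalent radii. The sketch below follows that convention as an assumption; the source does not state how the library applies the threshold. The radii are approximate literature values, and `bonded_atoms` is a hypothetical helper.

```python
import math

# Approximate covalent radii in angstrom (illustrative values only).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "O": 0.66}


def bonded_atoms(symbols, coords, idx, threshold=1.15):
    """Indices of atoms closer to atom idx than threshold * (sum of covalent radii)."""
    bonded = []
    for j, (sym, xyz) in enumerate(zip(symbols, coords)):
        if j == idx:
            continue
        cutoff = threshold * (COVALENT_RADII[symbols[idx]] + COVALENT_RADII[sym])
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(coords[idx], xyz)))
        if dist < cutoff:
            bonded.append(j)
    return bonded


# A water-like geometry: O at the origin with two H atoms about 0.96 angstrom away.
symbols = ["O", "H", "H"]
coords = [[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]]
around_o = bonded_atoms(symbols, coords, 0)
around_h = bonded_atoms(symbols, coords, 1)
print(around_o, around_h)
```

Raising the threshold makes the test more permissive, so longer contacts start to count as bonds.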
- getBondedAtomsOct(ind, CN=6, debug=False, flag_loose=False, atom_specific_cutoffs=False, strict_cutoff=False)[source]
Gets atoms bonded to an octahedrally coordinated metal. Specifically limits intruder C and H atoms that would otherwise be considered bonded in the distance cutoffs. Limits bonding to the CN closest atoms (CN = coordination number).
- Parameters:
ind (int) – Index of reference atom.
CN (int, optional) – Coordination number of reference atom of interest. Default is 6.
debug (bool, optional) – Produce additional outputs for debugging. Default is False.
flag_loose (bool, optional) – Use looser cutoffs to determine bonding. Default is False.
atom_specific_cutoffs (bool, optional) – Use atom specific cutoffs to determine bonding. Default is False.
strict_cutoff (bool, optional) – strict bonding cutoff for fullerene and SACs
- Returns:
nats – List of indices of bonded atoms.
- Return type:
- getBondedAtomsSmart(idx, oct=True, strict_cutoff=False, catom_list=None)[source]
Get the atoms bonded with the atom specified with the given index, using the molecular graph. Creates graph if it does not exist.
- Parameters:
- Returns:
nats – List of indices of bonded atoms.
- Return type:
- getBondedAtomsnotH(idx, metal_multiplier=1.35, nonmetal_multiplier=1.15)[source]
Get the atoms bonded to the atom with a given index, not counting hydrogens.
- Parameters:
- Returns:
nats – List of indices of bonded atoms.
- Return type:
- getClosestAtomnoHs(ratom)[source]
Get atoms bonded to a specific atom3D class that are not hydrogen.
- getDistToMetal(idx, metalx)[source]
Get distance between two atoms in a molecule, with the second one being a metal.
- getHs()[source]
Get all hydrogens in a mol3D class instance.
- Returns:
hlist – List of indices of hydrogen atoms.
- Return type:
- getHsbyIndex(idx)[source]
Get all hydrogens bonded to a given atom with an index.
- Parameters:
idx – Index of reference atom.
- Returns:
hlist – List of indices of hydrogen atoms bound to reference atom.
- Return type:
- getMLBondLengths()[source]
Outputs the metal-ligand bond lengths in the complex.
- Returns:
ml_bls – Dictionary keyed by ID of metal M, with each value a dictionary of M-L bond lengths and relative bond lengths.
- Return type:
dictionary
- getNumAtoms()[source]
Get the number of atoms within a molecule.
- Returns:
self.natoms – The number of atoms in the mol3D object.
- Return type:
- getOBMol(fst, convtype, ffclean=False, gen3d=True)[source]
Get OBMol object from a file or SMILES string. If you have a mol3D, then use convert2OBMol instead.
- Parameters:
- Returns:
OBMol – OBMol class instance to be used with openbabel. Bound as .OBMol attribute.
- Return type:
OBMol
- get_fcs(strict_cutoff=False, catom_list=None)[source]
Get first coordination shell of a transition metal complex.
- get_features(lac=True, force_generate=False, eq_sym=False, use_dist=False, NumB=False, Gval=False, size_normalize=False, alleq=False, strict_cutoff=False, catom_list=None, MRdiag_dict={}, depth=3)[source]
Get geo-based RAC features for this complex (if octahedral)
- Parameters:
lac (bool, optional) – Use lac for ligand_assign_consistent behavior. Default is True
force_generate (bool, optional) – Force the generation of features.
eq_sym (bool, optional) – Force equatorial plane to have same chemical symbols if possible.
use_dist (bool, optional) – Whether or not CD-RACs used.
NumB (bool, optional) – Whether or not the number of bonds RAC features are generated.
Gval (bool, optional) – Whether or not the group number RAC features are generated.
size_normalize (bool, optional) – Whether or not to normalize by the number of atoms.
alleq (bool, optional) – Whether or not all ligands are equatorial.
strict_cutoff (bool, optional) – strict bonding cutoff for fullerene and SACs
catom_list (list, optional) – List of indices of coordinating atoms.
MRdiag_dict (dict, optional) – Keys are ligand identifiers, values are MR diagnostics like E_corr.
depth (int, optional) – The depth of the RACs (how many bonds out the RACs go).
- Returns:
results – Dictionary of {‘RACname’:RAC} for all geo-based RACs.
- Return type:
dict
- get_first_shell(check_hapticity=True)[source]
Get the first coordination shell of a mol3D object with a single transition metal (read from a CSD mol2 file). If check_hapticity is True, the first-shell atoms of each multihapto ligand are replaced by a hydrogen set at their geometric mean.
- Parameters:
check_hapticity (boolean) – Whether to update multihapto ligands to their geometric centroid.
- Returns:
mol3D object (first coordination shell with metal; can change based on check_hapticity)
list (list of hapticity)
- get_geometry_type(dict_check=False, angle_ref=False, flag_catoms=False, catoms_arr=None, debug=False, skip=False, transition_metals_only=False)[source]
Get the type of the geometry (linear (2), trigonal planar(3), tetrahedral(4), square planar(4), trigonal bipyramidal(5), square pyramidal(5, one-empty-site), octahedral(6), pentagonal bipyramidal(7)). Uses the hapticity-truncated first coordination shell. Does not require the input of num_coord.
- Parameters:
dict_check (dict, optional) – Cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists of the expected angles (A-metal-B) for each connection atom.
flag_catoms (bool, optional) – Whether or not to return the catoms array. Default is False.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. Users can overwrite this connection atom array by explicit input. Default is None.
debug (bool, optional) – Flag for extra printout. Default is False.
skip (list, optional) – Geometry checks to skip. Default is False.
transition_metals_only (bool, optional) – Flag for considering more than just transition metals as metals. Default is False.
- Returns:
results – Measurement of deviations from arrays.
- Return type:
dictionary
- get_geometry_type_old(dict_check=False, angle_ref=False, num_coord=None, flag_catoms=False, catoms_arr=None, debug=False, skip=False, transition_metals_only=False, num_recursions=[0, 0])[source]
Get the type of the geometry (trigonal planar(3), tetrahedral(4), square planar(4), trigonal bipyramidal(5), square pyramidal(5, one-empty-site), octahedral(6), pentagonal bipyramidal(7))
- Parameters:
dict_check (dict, optional) – Cutoffs for each geo_check metric. Default is False.
angle_ref (bool, optional) – Reference list of lists of the expected angles (A-metal-B) for each connection atom.
num_coord (int, optional) – Expected coordination number.
flag_catoms (bool, optional) – Whether or not to return the catoms array. Default is False.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. Users can overwrite this connection atom array by explicit input. Default is None.
debug (bool, optional) – Flag for extra printout. Default is False.
skip (list, optional) – Geometry checks to skip. Default is False.
transition_metals_only (bool, optional) – Flag for considering more than just transition metals as metals. Default is False.
num_recursions (list, optional) – counter to track number of ligands classified as ‘sandwich’ and ‘edge’ in original structure
- Returns:
results – Measurement of deviations from arrays.
- Return type:
dictionary
- get_linear_angle(ind)[source]
Get linear ligand angle.
- Parameters:
ind (int) – index for one of the metal-coordinating atoms.
- Returns:
flag (bool) – True if the ligand is linear.
ang (float) – Angle of the linear ligand. 0 if not linear.
- get_num_coord_metal(debug=False, strict_cutoff=False, catom_list=None)[source]
Get metal coordination based on get bonded atoms. Store this info.
- get_octetrule_charge(debug=False)[source]
Get the octet-rule charge provided a mol3D object with bo_graph (read from CSD mol2 file). Note that currently this function should only be applied to ligands (organic molecules).
- Parameters:
debug (boolean) – whether to have more printouts
- get_smiles(canonicalize=False, use_mol2=False) str [source]
Returns the SMILES string representing the mol3D object.
- get_smilesOBmol_charge()[source]
Get the charge of a mol3D object through adjusted OBMol hydrogen/SMILES conversion. Note that currently this function should only be applied to ligands (organic molecules).
- get_submol_noHs()[source]
Get the heavy atom only submolecule, with no hydrogens.
- Returns:
mol_noHs – mol3D class instance with no hydrogens.
- Return type:
- get_symmetry_denticity(return_eq_catoms=False, BondedOct=False)[source]
Get symmetry class of molecule.
- Parameters:
- Returns:
eqsym (bool) – Flag for equatorial symmetry.
maxdent (int) – Maximum denticity in molecule.
ligdents (list) – List of denticities in molecule.
homoleptic (bool) – Flag for whether a geometry is homoleptic.
ligsymmetry (str) – Symmetry class for ligand of interest.
eq_catoms (list) – List of equatorial connection atoms.
- getfragmentlists()[source]
Get all independent molecules in mol3D.
- Returns:
atidxes_total – List of lists of atom indices, one list for each distinct molecule.
- Return type:
- isPristine(unbonded_min_dist=1.3, oct=False)[source]
Checks the organic portions of a transition metal complex and determines if they look good.
- Parameters:
- Returns:
pass (bool) – Whether or not molecule passes the organic checks.
fail_list (list) – List of failing criteria, as a set of strings.
- is_edge_compound(transition_metals_only: bool = True) Tuple[int, List, List] [source]
Check if a structure is an edge compound.
- Returns:
num_edge_lig (int) – Number of edge ligands.
info_edge_lig (list) – List of dictionaries with info about edge ligands.
edge_lig_atoms (list) – List of dictionaries with the connecting atoms of the edge ligands.
- is_linear_ligand(ind)[source]
Check whether a ligand is linear.
- Parameters:
ind (int) – index for one of the metal-coordinating atoms.
- Returns:
flag (bool) – True if the ligand is linear.
catoms (list) – Atoms bonded to the index of interest.
- is_sandwich_compound(transition_metals_only: bool = True) Tuple[int, List, bool, bool, List] [source]
Evaluates whether a compound is a sandwich compound
- Returns:
num_sandwich_lig (int) – Number of sandwich ligands.
info_sandwich_lig (list) – List of dictionaries about the sandwich ligands.
aromatic (bool) – Flag about whether the ligand is aromatic.
allconnect (bool) – Flag for connected atoms in ring.
edge_lig_atoms (list) – List of dictionaries with the connecting atoms of the sandwich ligands.
- ligand_comp_org(init_mol, catoms_arr=None, flag_deleteH=True, flag_lbd=True, debug=False, depth=3, BondedOct=False, angle_ref=False)[source]
Get the ligand distortion by comparing each individual ligand in init_mol and opt_mol.
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. Users can overwrite this connection atom array by explicit input. Default is None.
flag_deleteH (bool, optional) – Flag to delete Hs in ligand comparison. Default is True.
BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.
flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.
debug (bool, optional) – Flag for extra printout. Default is False.
depth (int, optional) – Depth for truncated molecule. Default is 3.
check_whole (bool, optional) – Flag for checking whole ligand.
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
- Returns:
dict_lig_distort – Dictionary containing rmsd_max and atom_dist_max.
- Return type:
- match_lig_list(init_mol, catoms_arr=None, BondedOct=False, flag_lbd=True, debug=False, depth=3, check_whole=False, angle_ref=False)[source]
Match the ligands of mol and init_mol by calling ligand_breakdown
- Parameters:
init_mol (mol3D) – mol3D class instance of the initial geometry.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. Users can overwrite this connection atom array by explicit input. Default is None.
BondedOct (bool, optional) – Flag for bonding. Only used in Oct_inspection, not in geo_check. Default is False.
flag_lbd (bool, optional) – Flag for using ligand breakdown on the optimized geometry. If False, assuming equivalent index to initial geo. Default is True.
debug (bool, optional) – Flag for extra printout. Default is False.
depth (int, optional) – Depth for truncated molecule. Default is 3.
check_whole (bool, optional) – Flag for checking whole ligand.
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
- Returns:
liglist_shifted (list) – List of lists containing all ligands from optimized molecule.
liglist_init (list) – List of lists containing all ligands from initial molecule.
flag_match (bool) – A flag about whether the ligands of initial and optimized mol are exactly the same. There is a one to one mapping.
- maxatomdist(mol2)[source]
Compute the max atom distance between two molecules. Does not align molecules. For that, use geometry.kabsch().
- maxatomdist_nonH(mol2)[source]
Compute the max atom distance between two molecules, considering heavy atoms only. Does not align molecules. For that, use geometry.kabsch().
- meanabsdev(mol2)[source]
Compute the mean absolute deviation (MAD) between two molecules. Does not align molecules. For that, use geometry.kabsch().
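The unaligned comparison metrics above can be illustrated with a standalone sketch. This is not molSimplify's implementation: the helper names are hypothetical, geometries are plain lists of (x, y, z) tuples, and it assumes atoms are paired by index and that the mean absolute deviation is the mean of the per-atom distances.

```python
import math


def paired_distances(coords_a, coords_b):
    """Distances between atoms paired by index (no alignment is performed)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both geometries need the same number of atoms")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def max_atom_distance(coords_a, coords_b):
    """Largest displacement of any index-paired atom."""
    return max(paired_distances(coords_a, coords_b))


def mean_abs_deviation(coords_a, coords_b):
    """Mean displacement over all index-paired atoms (assumed definition)."""
    distances = paired_distances(coords_a, coords_b)
    return sum(distances) / len(distances)


# Hypothetical three-atom geometries; two atoms move along z.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moved = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.5), (0.0, 1.0, 1.0)]
print(max_atom_distance(ref, moved))
print(mean_abs_deviation(ref, moved))
```

Because no alignment is done, a rigid shift or rotation of one geometry inflates both values, which is why the entries above recommend geometry.kabsch() first.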
- mindistmol()[source]
Measure the smallest distance between atoms in a single molecule.
- Returns:
mind – Minimum distance between any two atoms in the molecule.
- Return type:
- mindistnonH(mol)[source]
Measure the smallest distance between an atom and a non H atom in another molecule.
- molsize()[source]
Measure the size of the molecule, by quantifying the max distance between atoms and center of mass.
- Returns:
maxd – Max distance between an atom and the center of mass.
- Return type:
- oct_comp(angle_ref=False, catoms_arr=None, debug=False)[source]
Get the deviation of shape of the catoms from the desired shape, which is defined in angle_ref.
- Parameters:
angle_ref (bool, optional) – Reference list of lists for the expected angles (A-metal-B) of each connection atom.
catoms_arr (Nonetype, optional) – Uses the catoms of the mol3D by default. Users can overwrite this connection atom array by explicit input.
debug (bool, optional) – Flag for extra printout. Default is False.
- Returns:
dict_catoms_shape (dict) – Dictionary of first coordination sphere shape measures.
catoms_arr (list) – Connection atom array.
- overlapcheck(mol, silence=False)[source]
Measure the smallest distance between an atom and a point.
- populateBOMatrix(bonddict=False)[source]
Populate the bond order matrix using openbabel.
- Parameters:
bonddict (bool) – Flag for if the obmol bond dictionary should be saved. Default is False.
- Returns:
molBOMat – Numpy array for bond order matrix.
- Return type:
np.array
- populateBOMatrixAug()[source]
Populate the augmented bond order matrix using openbabel.
- Parameters:
bonddict (bool) – Flag for if the obmol bond dictionary should be saved. Default is False.
- Returns:
molBOMat – Numpy array for augmented bond order matrix.
- Return type:
np.array
- printxyz()[source]
Print XYZ info of mol3D class instance to stdout. To write to file (more common), use writexyz() instead.
- read_bonder_order(bofile)[source]
Get bond order information from file.
- Parameters:
bofile (str) – Path to a bond order file.
- read_charge(chargefile)[source]
Get charge information from file.
- Parameters:
chargefile (str) – Path to a charge file.
- read_smiles(smiles, ff='mmff94', steps=2500)[source]
Read a smiles string and convert it to a mol3D class instance.
- readfrommol2(filename, readstring=False, trunc_sym='X')[source]
Read mol2 into a mol3D class instance. Stores the bond orders and atom types (SYBYL).
- Parameters:
filename (string) – String of path to mol2 file. Path may be local or global. May be read in as a string.
readstring (bool) – Flag for deciding whether a string of mol2 file is being passed as the filename
trunc_sym (string) – Element symbol at which one would like to truncate the bo graph.
- readfromstring(xyzstring)[source]
Read XYZ from string.
- Parameters:
xyzstring (string) – String of XYZ file.
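For orientation, the standard XYZ layout that such a string follows is an atom count, a comment line, then one `symbol x y z` line per atom. The parser below is a standalone sketch of that layout, not the code behind readfromstring; the function name and the returned data shape are assumptions.

```python
def parse_xyz_string(xyz_string):
    """Parse standard XYZ text into a list of (symbol, [x, y, z]) pairs."""
    lines = xyz_string.strip().splitlines()
    n_atoms = int(lines[0].split()[0])  # first line: number of atoms
    atoms = []
    # Line 2 is a free-form comment; coordinates start on line 3.
    for line in lines[2:2 + n_atoms]:
        sym, x, y, z = line.split()[:4]
        atoms.append((sym, [float(x), float(y), float(z)]))
    if len(atoms) != n_atoms:
        raise ValueError("fewer coordinate lines than the declared atom count")
    return atoms


# Hypothetical water geometry in XYZ format.
water = (
    "3\n"
    "water\n"
    "O 0.000 0.000 0.117\n"
    "H 0.000 0.757 -0.471\n"
    "H 0.000 -0.757 -0.471\n"
)
atoms = parse_xyz_string(water)
print(atoms[0])
```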
- readfromtxt(txt)[source]
Read XYZ from textfile.
- Parameters:
txt (list) – List of lists that comes as a result of readlines.
- readfromxyz(filename: str, ligand_unique_id=False, read_final_optim_step=False)[source]
Read XYZ into a mol3D class instance.
- Parameters:
filename (string) – String of path to XYZ file. Path may be local or global.
ligand_unique_id (string) – Unique identifier for a ligand. In MR diagnostics, the atom-based graph is abstracted to a ligand-based graph. Ligands have no natural name, so they are named with a UUID. MR character is hard to attribute to individual atoms, so it is attributed to ligands instead.
read_final_optim_step (boolean) – If there are multiple geometries in the XYZ file (after an optimization run), use only the last one.
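The idea behind read_final_optim_step can be sketched without molSimplify: an optimization trajectory is a concatenation of XYZ frames, and only the last frame is kept. The function below is a hypothetical standalone helper, not the library's parser, and it assumes every frame starts with its own atom-count line.

```python
def last_xyz_frame(text):
    """Return the final geometry of a multi-frame XYZ text as XYZ text."""
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        if not lines[i].strip():  # tolerate blank lines between frames
            i += 1
            continue
        n_atoms = int(lines[i].split()[0])
        frame = lines[i:i + n_atoms + 2]  # count line + comment + atoms
        if len(frame) < n_atoms + 2:
            raise ValueError("truncated frame at line %d" % (i + 1))
        frames.append(frame)
        i += n_atoms + 2
    if not frames:
        raise ValueError("no XYZ frames found")
    return "\n".join(frames[-1])


# Hypothetical two-step H2 optimization trajectory.
trajectory = (
    "2\nstep 0\nH 0 0 0\nH 0 0 0.9\n"
    "2\nstep 1\nH 0 0 0\nH 0 0 0.74\n"
)
print(last_xyz_frame(trajectory))
```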
- resetBondOBMol()[source]
Repopulates the bond order matrix via openbabel. Interprets bond order matrix.
- returnxyz()[source]
Return XYZ info of mol3D class instance as a string. To write to file (more common), use writexyz() instead.
- Returns:
XYZ – String of XYZ information from mol3D class.
- Return type:
string
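As a rough picture of what an XYZ string contains, the sketch below formats a list of atoms as standard XYZ text. It is a hypothetical helper, not the body of returnxyz; in particular, whether returnxyz includes the atom-count and comment header is not stated here, and the column widths are an arbitrary choice.

```python
def format_xyz(atoms, comment=""):
    """Format (symbol, (x, y, z)) pairs as standard XYZ text."""
    lines = [str(len(atoms)), comment]  # header: atom count, then comment
    for sym, (x, y, z) in atoms:
        lines.append(f"{sym:<2s} {x:14.8f} {y:14.8f} {z:14.8f}")
    return "\n".join(lines) + "\n"


# Hypothetical O-H fragment.
fragment = [("O", (0.0, 0.0, 0.117)), ("H", (0.0, 0.757, -0.471))]
xyz_text = format_xyz(fragment, comment="O-H fragment")
print(xyz_text)
```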
- rmsd(mol2)[source]
Compute the RMSD between two molecules. Does not align molecules. For that, use geometry.kabsch().
- rmsd_nonH(mol2)[source]
Compute the RMSD between two molecules, considering heavy atoms only. Does not align molecules. For that, use geometry.kabsch().
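The two RMSD variants can be illustrated with a standalone sketch. It is not molSimplify's implementation: the function name and the (symbol, (x, y, z)) data layout are assumptions, atoms are paired by index, and no alignment is performed.

```python
import math


def rmsd(atoms_a, atoms_b, skip_hydrogens=False):
    """Root-mean-square deviation over index-paired atoms, without alignment."""
    if len(atoms_a) != len(atoms_b):
        raise ValueError("both geometries need the same number of atoms")
    sq_sum = 0.0
    count = 0
    for (sym_a, xyz_a), (sym_b, xyz_b) in zip(atoms_a, atoms_b):
        if skip_hydrogens and (sym_a == "H" or sym_b == "H"):
            continue  # heavy-atom variant: ignore hydrogen pairs
        sq_sum += sum((p - q) ** 2 for p, q in zip(xyz_a, xyz_b))
        count += 1
    if count == 0:
        raise ValueError("no atoms left to compare")
    return math.sqrt(sq_sum / count)


# Hypothetical geometries in which only one hydrogen moves.
geom_a = [("O", (0.0, 0.0, 0.0)), ("H", (1.0, 0.0, 0.0)), ("H", (0.0, 1.0, 0.0))]
geom_b = [("O", (0.0, 0.0, 0.0)), ("H", (1.0, 0.0, 2.0)), ("H", (0.0, 1.0, 0.0))]
print(rmsd(geom_a, geom_b))
print(rmsd(geom_a, geom_b, skip_hydrogens=True))
```

In this example the heavy-atom value is zero because the only displaced atom is a hydrogen, which shows how the two variants can disagree.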
- sanitycheck(silence=False, debug=False)[source]
Sanity check a molecule for overlap within the molecule.
- sanitycheckCSD(oct=False, angle1=30, angle2=80, angle3=45, debug=False, metals=None)[source]
Sanity check a CSD molecule.
- Parameters:
oct (bool, optional) – Flag for octahedral test. Default is False.
angle1 (float, optional) – Metal angle cutoff. Default is 30.
angle2 (float, optional) – Organic angle cutoff. Default is 80.
angle3 (float, optional) – Metal/organic angle cutoff e.g. M-X-X angle. Default is 45.
debug (bool, optional) – Extra print out desired. Default is False.
metals (Nonetype, optional) – Check for metals. Default is None.
- Returns:
sane (bool) – Whether or not molecule is a sane molecule
error_dict (dict) – Returned if debug, {bond distances and angles breaking constraints: values}.
- setAtoms(atoms)[source]
Set atoms of a mol3D class to atoms.
- Parameters:
atoms (list) – contains atom3D instances that should be in the molecule
- setLoc(loc)[source]
Sets the conformation of an amino acid in the chain of a protein.
- Parameters:
loc (str) – a one-character string representing the conformation
- symvect()[source]
Method to obtain array of symbol vector of molecule.
- Returns:
symbol_vector – 1 dimensional numpy array of atom symbols. (N,) dimension, N is number of atoms.
- Return type:
np.array
- translate(dxyz)[source]
Translate all atoms by a given vector.
- Parameters:
dxyz (list) – Vector by which to translate all atoms, as a list [dx, dy, dz].
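A translation adds the same vector to every atom's coordinates. The sketch below shows this on a plain list of coordinates; it is a hypothetical standalone function that returns a new list, whereas the method documented here acts on the mol3D instance.

```python
def translate_coords(coords, dxyz):
    """Return a copy of coords with every atom shifted by dxyz = [dx, dy, dz]."""
    dx, dy, dz = dxyz
    return [[x + dx, y + dy, z + dz] for x, y, z in coords]


# Hypothetical two-atom geometry shifted by one unit along x and minus one along z.
shifted = translate_coords([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]], [1.0, 0.0, -1.0])
print(shifted)
```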
- typevect()[source]
Method to obtain array of type vector of molecule.
- Returns:
type_vector – 1 dimensional numpy array of atom types (by name). (N,) dimension, N is number of atoms.
- Return type:
np.array
- writemol2(filename, writestring=False, ignoreX=False, force=False)[source]
Write mol2 file from mol3D object. Partial charges are appended if given. Else, total charge of the complex (given or interpreted by OBMol) is assigned to the metal.
- Parameters:
- writenumberedxyz(filename)[source]
Write standard XYZ file with numbers instead of symbols.
- Parameters:
filename (str) – Path to XYZ file.
- writexyz(filename, symbsonly=False, ignoreX=False, ordering=False, writestring=False, withgraph=False, specialheader=False)[source]
Write standard XYZ file.
- Parameters:
filename (str) – Path to XYZ file.
symbsonly (bool, optional) – Only write symbols to file. Default is False.
ignoreX (bool, optional) – Ignore X element when writing. Default is False.
ordering (bool, optional) – If handed a list, will order atoms in a specific order. Default is False.
writestring (bool, optional) – Flag to write to a string if True or file if False. Default is False.
withgraph (bool, optional) – Flag to write with graph (after XYZ) if True. Default is False. If True, sparse graph written.
specialheader (str, optional) – String to write information into header. Default is False. If True, a special string is written.
Mol2D Class
- class molSimplify.Classes.mol2D.Mol2D(incoming_graph_data=None, **attr)[source]
Bases:
Graph
- find_metal(transition_metals_only: bool = True) → List[int] [source]
Find indices of metal(s) in a Mol2D class.
- Parameters:
transition_metals_only (bool, optional) – Only find transition metals. Default is true.
- Returns:
metal_list – List of indices of metal atoms in Mol2D.
- Return type:
Examples
Build Vanadyl acetylacetonate from SMILES:
>>> mol = Mol2D.from_smiles("CC(=[O+]1)C=C(C)O[V-3]12(#[O+])OC(C)=CC(C)=[O+]2")
>>> mol.find_metal()
[7]
- classmethod from_smiles(smiles: str)[source]
Create a Mol2D object from a SMILES string.
- Parameters:
smiles (str) – SMILES representation of the molecule.
- Returns:
Mol2D object of the molecule
- Return type:
Examples
Create a furan molecule from SMILES:
>>> mol = Mol2D.from_smiles("o1cccc1")
>>> mol
Mol2D(O1C4H4)
- graph_determinant(return_string: bool = True) → str | float [source]
Calculates the molecular graph determinant.
- Parameters:
return_string (bool, optional) – Flag to return the determinant as a string. Default is True.
- Returns:
graph determinant
- Return type:
Examples
Create a furan molecule from SMILES:
>>> mol = Mol2D.from_smiles("o1cccc1")
and calculate the graph determinant:
>>> mol.graph_determinant()
'-19404698740'
- graph_hash() → str [source]
Calculates the node attributed graph hash of the molecule.
- Returns:
node attributed graph hash
- Return type:
Examples
Create a furan molecule from SMILES:
>>> mol = Mol2D.from_smiles("o1cccc1")
and calculate the node attributed graph hash:
>>> mol.graph_hash()
'f6090276d7369c0c0a535113ec1d97cf'
- graph_hash_edge_attr() → str [source]
Calculates the edge attributed graph hash of the molecule.
- Returns:
edge attributed graph hash
- Return type:
Examples
Create a furan molecule from SMILES:
>>> mol = Mol2D.from_smiles("o1cccc1")
and calculate the edge attributed graph hash:
>>> mol.graph_hash_edge_attr()
'a9b6fbc7b5f53613546d5e91a7544ed6'
monomer3D Class
- class molSimplify.Classes.monomer3D.monomer3D(three_lc='GLY', chain='undef', id=-1, occup=1.0, loc='')[source]
Bases:
object
Holds information about a monomer, used to do manipulations. Reads information from structure file (pdb) or is directly built from molsimplify.
- addAtom(atom, index=None)[source]
Adds an atom to the atoms attribute, which contains a list of atom3D class instances.
- centermass()[source]
Computes coordinates of center of mass of monomer.
- Returns:
center_of_mass – Coordinates of center of mass. List of length 3: (X, Y, Z).
- Return type:
list
- centroid()[source]
Computes coordinates of centroid of monomer.
- Returns:
centroid – Coordinates of centroid. List of length 3: (X, Y, Z).
- Return type:
list
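The difference between the two quantities is only the weighting: the centroid is the plain average of atomic positions, while the center of mass weights each position by atomic mass. The sketch below is a standalone illustration with hypothetical helper names and rounded masses, not the monomer3D implementation.

```python
def centroid(coords):
    """Unweighted average position of all atoms."""
    n = len(coords)
    return [sum(c[i] for c in coords) / n for i in range(3)]


def center_of_mass(coords, masses):
    """Mass-weighted average position of all atoms."""
    total = sum(masses)
    return [
        sum(m * c[i] for m, c in zip(masses, coords)) / total
        for i in range(3)
    ]


# Hypothetical C-O pair along z, with masses rounded to 12 and 16 for illustration.
coords = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.2)]
masses = [12.0, 16.0]
print(centroid(coords))
print(center_of_mass(coords, masses))
```

The center of mass sits closer to the heavier oxygen than the centroid does.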
- coords()[source]
Method to obtain string of coordinates in monomer.
- Returns:
coord_string – String of molecular coordinates with atom identities in XYZ format.
- Return type:
string
- getGreek(greek)[source]
Finds the Greek lettered carbon(s) or other atom(s) of the user’s choice.
- Parameters:
greek (string) – The Greek lettered atom (e.g. alpha carbon) we want. Inputs should be form ‘CA’ or similar.
- Returns:
greek_atoms – A list of atom3D class objects that contains the Greek lettered atom(s) we want.
- Return type:
list of atom3Ds
- identify()[source]
States whether the amino acid is (positively/negatively) charged, polar, or hydrophobic.
- Returns:
aa_type – Positively charged, Negatively charged, Polar, Hydrophobic
- Return type:
string
- setLoc(loc)[source]
Sets the conformation of a monomer in the chain of a protein.
- Parameters:
loc (str) – a one-character string representing the conformation
protein3D Class
- class molSimplify.Classes.protein3D.protein3D(pdbCode='undef')[source]
Bases:
object
Holds information about a protein, used to do manipulations. Reads information from structure file (pdb, cif) or is directly built from molsimplify.
- autoChooseConf()[source]
Automatically choose the conformation of a protein3D class instance, based first on the greatest occupancy level and then, all else being equal, on the first conformation in the alphabet.
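The selection rule described above (highest occupancy first, alphabetical order as the tie-breaker) can be sketched in a few lines. The function name and the dictionary layout are hypothetical and do not reflect how protein3D stores conformations internally.

```python
def choose_conformation(occupancies):
    """Pick a conformation label from a {label: occupancy} mapping.

    Highest occupancy wins; ties go to the alphabetically first label.
    """
    if not occupancies:
        raise ValueError("no conformations to choose from")
    # Sorting key: descending occupancy, then ascending label.
    return min(occupancies, key=lambda label: (-occupancies[label], label))


print(choose_conformation({"B": 0.5, "A": 0.5, "C": 0.3}))
print(choose_conformation({"A": 0.4, "B": 0.6}))
```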
- convexhull()[source]
Computes convex hull of protein.
- Returns:
hull – Coordinates of convex hull.
- Return type:
array
- countAAs()[source]
Return the number of amino acid residues in a protein3D class.
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.countAAs() # Returns the number of AAs in the PDB for all the chains.
1121
- fetch_pdb(pdbCode)[source]
API query to fetch a pdb and write it as a protein3D class instance
- Parameters:
pdbCode (str) – Code for protein, e.g. 1os7
- findAA(three_lc='XAA')[source]
Find amino acids with a specific three-letter code.
- Parameters:
three_lc (str) – three-letter code, default as XAA.
- Returns:
inds – a set of amino acid indices with the specified symbol.
- Return type:
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
Return a set of pairs where each pair is a combination of the chain name and the index of the amino acid specified (in this case, 'MET'):
>>> aa_set = pdb_system.findAA(three_lc='MET')
>>> sorted(aa_set) # Sorting for reproducible order in doctest
[('A', 268), ('B', 268), ('C', 268), ('D', 268)]
- findAtom(sym='X', aa=True)[source]
Find atoms with a specific symbol that are contained in amino acids or heteromolecules.
- Parameters:
sym (str) – element symbol, default as X.
aa (boolean) – True if we want atoms contained in amino acids. False if we want atoms contained in heteromolecules.
- Returns:
inds – a list of atom indices with the specified symbol.
- Return type:
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.findAtom(sym="S", aa=True) # Returns indices of sulphur atoms present in amino acids
[2166, 4442, 6733, 9041]
>>> pdb_system.findAtom(sym="S", aa=False) # Returns indices of sulphur atoms present in heteromolecules
[9164, 9182, 9200]
- findMetal(transition_metals_only=True)[source]
Find metal(s) in a protein3D class.
- Parameters:
transition_metals_only (bool, optional) – Only find transition metals. Default is true.
- Returns:
metal_list – List of indices of metal atoms in protein3D.
- Return type:
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7')
fetched: 1os7
>>> pdb_system.findMetal()
[9160, 9178, 9196, 9214]
- freezeatom(atomIdx)[source]
Set the freeze attribute to be true for a given atom3D class.
- Parameters:
atomIdx (int) – Index for atom to be frozen.
- freezeatoms(Alist)[source]
Set the freeze attribute to be true for a given set of atom3D classes, given their indices. Preserves ordering, starts from largest index.
- Parameters:
Alist (list) – List of indices for atom3D instances to freeze.
- getBoundMols(h_id, aas_only=False)[source]
Get a list of molecules bound to a heteroatom, usually a metal.
- getChain(chain_id)[source]
Takes a chain of interest and turns it into its own protein3D class instance.
- Parameters:
chain_id (string) – The letter name of the chain of interest
- Returns:
p – A protein3D instance consisting of just the chain of interest
- Return type:
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.getChain('A')
- getMissingAAs()[source]
Get missing amino acid residues of a protein3D class.
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1MH1') # Fetch a PDB
fetched: 1MH1
>>> pdb_system.getMissingAAs() # This gives a list of monomer3D objects
[monomer3D(VAL, id=182), monomer3D(LYS, id=183), monomer3D(LYS, id=184)]
- getMissingAtoms()[source]
Get missing atoms of a protein3D class.
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1MH1') # Fetch a PDB
fetched: 1MH1
>>> missing_atoms = pdb_system.getMissingAtoms()
List atoms in the first set of missing_atoms:
>>> [atom.sym for atom in list(missing_atoms)[0]]
['C', 'C', 'C', 'C', 'C', 'C', 'O']
- getMolecule(a_id, aas_only=False)[source]
Finds the molecule that the atom is contained in.
- Parameters:
a_id (int) – The index of the desired atom whose molecule we want to find
aas_only (boolean) – True if we want to find atoms contained in amino acids only. False if we want atoms contained in all molecules. Default is False.
- Returns:
mol – The amino acid residue, nucleotide, or heteromolecule containing the atom
- Return type:
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
This returns a molSimplify.Classes.monomer3D object, indicating that the atom is part of an amino acid or nucleotide:
>>> pdb_system.getMolecule(a_id=2166)
monomer3D(MET, id=268)
This returns a mol3D object, indicating that the atom is part of a molecule that is not an amino acid or nucleotide:
>>> pdb_system.getMolecule(a_id=9164)
mol3D(S1O3N1C2)
>>> pdb_system.getMolecule(a_id=9164).name # The name of the molecule, in this case 'TAU'
'TAU'
- readMetaData()[source]
API query to fetch XML data from a pdb and add its useful attributes to a protein3D class.
- Parameters:
pdbCode (str) – Code for protein, e.g. 1os7
- readfrompdb(text)[source]
Read PDB into a protein3D class instance.
- Parameters:
text (str) – String of path to PDB file. Path may be local or global. May also be the text of a PDB file from the internet.
- setAAs(aas)[source]
Set monomers of a protein3D class to different monomers.
- Parameters:
aas (dictionary) – Keyed by chain and location. Valued by monomer3D monomers (amino acids or nucleotides).
- setAtoms(atoms)[source]
Set atom indices of a protein3D class to atoms.
- Parameters:
atoms (dictionary) – Keyed by atom index. Valued by the atom3D atom that has that index.
- setBonds(bonds)[source]
Sets the bonded atoms in the protein.
This is effectively the molecular graph.
- Parameters:
bonds (dictionary) – Keyed by atom3D atoms in the protein. Valued by a set consisting of bonded atoms.
- setChains(chains)[source]
Set chains of a protein3D class to different chains.
- Parameters:
chains (dictionary) – Keyed by desired chain IDs. Valued by the list of molecules in the chain.
- setConf(conf)[source]
Set possible conformations of a protein3D class to a new list.
- Parameters:
conf (list) – List of possible conformations for applicable amino acids.
- setDataCompleteness(DataCompleteness)[source]
Set DataCompleteness value of protein3D class.
- Parameters:
DataCompleteness (float) – The desired new DataCompleteness value.
- setEDIAScores()[source]
Sets the EDIA score of a protein3D class.
- Parameters:
pdbCode (string) – The 4-character code of the protein3D class.
- setHetmols(hetmols)[source]
Set heteromolecules of a protein3D class to different ones.
- Parameters:
hetmols (dictionary) – Keyed by chain and location. Valued by mol3D heteromolecules.
- setIndices(a_ids)[source]
Set atom indices of a protein3D class to atoms.
- Parameters:
a_ids (dictionary) – Keyed by atom3D atom. Valued by its index.
- setMissingAAs(missing_aas)[source]
Set missing amino acids of a protein3D class to a new list.
- Parameters:
missing_aas (list) – List of missing amino acids.
- setMissingAtoms(missing_atoms)[source]
Set missing atoms of a protein3D class to a new dictionary.
- Parameters:
missing_atoms (dictionary) – Keyed by amino acid residues / nucleotides of origin. Valued by missing atoms.
- setPDBCode(pdbCode)[source]
Sets the 4-letter PDB code of a protein3D class instance
- Parameters:
pdbCode (string) – Desired 4-letter PDB code
- setRSRZ(RSRZ)[source]
Set RSRZ score of protein3D class.
- Parameters:
RSRZ (float) – The desired new RSRZ score.
- setRfree(Rfree)[source]
Set Rfree value of protein3D class.
- Parameters:
Rfree (float) – The desired new Rfree value.
- setTwinL(TwinL)[source]
Set TwinL score of protein3D class.
- Parameters:
TwinL (float) – The desired new TwinL score.
- setTwinL2(TwinL2)[source]
Set TwinL squared score of protein3D class.
- Parameters:
TwinL2 (float) – The desired new TwinL squared score.
- stripAtoms(atoms_stripped)[source]
Removes certain atoms from the protein3D class instance.
- Parameters:
atoms_stripped (list) – List of atom3D indices that should be removed
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('1os7') # Fetch a PDB
fetched: 1os7
>>> pdb_system.stripAtoms([2166, 4442, 6733, 2165]) # This removes the atoms with the listed indices
- stripHetMol(hetmol)[source]
Removes all heteroatoms that are part of the specified heteromolecule from the protein3D class instance.
- Parameters:
hetmol (str) – String representing the name of a heteromolecule whose heteroatoms should be stripped from the protein3D class instance
Examples
>>> pdb_system = protein3D()
>>> pdb_system.fetch_pdb('3I40') # Fetch a PDB
fetched: 3I40
>>> pdb_system.stripHetMol('HOH')
globalvars
- class molSimplify.Classes.globalvars.globalvars(*args, **kwargs)[source]
Bases:
object
Globalvars class. Defines global variables used throughout the code, including periodic table.
- add_custom_path(path)[source]
Record custom path in ~/.molSimplify file
- Parameters:
path (str) – Path to custom data ~/.molSimplify file.
- amass() → Dict[str, Tuple[float, int, float, int]] [source]
Get the atomic mass dictionary.
- Returns:
amassdict – Dictionary containing atomic masses.
- Return type:
- bbcombs_mononuc()[source]
Get backbone combinations dictionary
- Returns:
bbcombs_mononuc – Backbone combination dictionary for different geometries.
- Return type:
- bondsdict()[source]
Get the bond dictionary.
- Returns:
bondsdict – Dictionary containing bond lengths.
- Return type:
- elementsbynum()[source]
Returns list of elements by number
- Returns:
elementsbynum – List of elements by number
- Return type:
- endict()[source]
Returns electronegativity dictionary.
- Returns:
endict – Electronegativity dictionary
- Return type:
- geo_check_dictionary()[source]
Returns list of geo check objects dictionary.
- Returns:
geo_check_dictionary – Geo check measurement dictionary.
- Return type:
- getAllAAs()[source]
Gets all amino acids
- Returns:
amino_acids – Dictionary of standard amino acids
- Return type:
dictionary
- get_all_angle_refs()[source]
Get the reference angles dictionary.
- Returns:
all_angle_refs – Reference angles for various geometries.
- Return type:
- get_all_geometries()[source]
Get available geometries.
- Returns:
all_geometries – All available geometries.
- Return type:
- groups()[source]
Returns dict of elements by groups.
- Returns:
groups_dict – Groups dictionary.
- Return type:
- metalslist(transition_metals_only=True)[source]
Get the metals list.
- Returns:
metalslist – List of available metals.
- Return type:
- periods()[source]
Returns dict of elements by periods.
- Returns:
periods_dict – Periods dictionary.
- Return type:
- polarizability() → Dict[str, float] [source]
Get the polarizability dictionary.
- Returns:
poldict – Dictionary containing polarizabilities.
- Return type:
- testTF()[source]
Tests to see whether keras and tensorflow are available.
- Returns:
tf_flag – True if tensorflow and keras are available.
- Return type:
- testmatplotlib()[source]
Tests to see if matplotlib is available
- Returns:
mpl_flag – True if matplotlib is available
- Return type:
rundiag
- class molSimplify.Classes.rundiag.run_diag[source]
Bases:
object
Class of run diagnostic information for automated decision making and property prediction.
- set_ANN(ANN_flag, ANN_reason=False, ANN_dict=False, catalysis_flag=False, catalysis_reason=False)[source]
Set the ANN properties.
- Parameters:
ANN_flag (bool) – Flag for whether the ANN variables exist.
ANN_reason (str, optional) – Reasoning for why ANN failed if failed. Default is False.
ANN_dict (dict, optional) – Dictionary with ANN values and uncertainty.
catalysis_flag (bool, optional) – Whether or not catalytic properties are set.
catalysis_reason (str, optional) – Reasoning for why catalytic ANN failed if failed. Default is False.
- set_dict_bl(dict_bl)[source]
Set the ANN bond length dictionary.
- Parameters:
dict_bl (dict, optional) – Dictionary with ANN bond lengths.
- set_mol(mol)[source]
Set the ANN molecule.
- Parameters:
mol (mol3D) – mol3D class instance for optimized molecule.