API Reference¶

molify.ase2networkx(atoms: Atoms, pbc: bool = True, scale: float = 1.2) → Graph[source]¶

Convert an ASE Atoms object to a NetworkX graph.

Determines which atoms are bonded (connectivity). All edges will have bond_order=None unless atoms.info[‘connectivity’] already has bond orders.

Parameters:

atoms (ase.Atoms) – The ASE Atoms object to convert into a graph.
pbc (bool, optional) – Whether to consider periodic boundary conditions when calculating distances (default is True). If False, only connections within the unit cell are considered.
scale (float, optional) – Scaling factor for the covalent radii when determining bond cutoffs (default is 1.2).

Returns:

An undirected NetworkX graph with connectivity information.

Return type:

networkx.Graph

Notes

The graph contains the following information:

Nodes represent atoms with properties:
- position: Cartesian coordinates (numpy.ndarray)
- atomic_number: Element atomic number (int)
- original_index: Index in original Atoms object (int)
- charge: Formal charge (float)
Edges represent bonds with:
- bond_order: Bond order (float or None if unknown)
Graph properties include:
- pbc: Periodic boundary conditions
- cell: Unit cell vectors

Connectivity is determined by:

Using explicit connectivity if present in atoms.info
Otherwise using distance-based cutoffs (edges will have bond_order=None)

To get bond orders, pass the graph to networkx2rdkit().

Examples

>>> from molify import ase2networkx, smiles2atoms
>>> atoms = smiles2atoms(smiles="O")
>>> graph = ase2networkx(atoms)
>>> len(graph.nodes)
3
>>> len(graph.edges)
2

molify.ase2rdkit(atoms: Atoms, suggestions: list[str] | None = None) → Mol[source]¶

Convert an ASE Atoms object to an RDKit molecule.

Convenience function that chains: ase2networkx() → networkx2rdkit(suggestions=…)

Parameters:

atoms (ase.Atoms) – The ASE Atoms object to convert.
suggestions (list[str], optional) – SMILES/SMARTS patterns for bond order determination. Passed directly to networkx2rdkit().

Returns:

The resulting RDKit molecule with bond orders determined.

Return type:

rdkit.Chem.Mol

Examples

>>> from molify import ase2rdkit, smiles2atoms
>>> atoms = smiles2atoms(smiles="C=O")
>>> mol = ase2rdkit(atoms)
>>> mol.GetNumAtoms()
4

molify.compress(atoms: Atoms, density: float, freeze_molecules: bool = False) → Atoms[source]¶

Compress an ASE Atoms object to a target density.

Parameters:

atoms (ase.Atoms) – The Atoms object to compress.
density (float) – The target density in g/cm^3.
freeze_molecules (bool) – If True, freeze the internal degrees of freedom of the molecules during compression, to prevent bond compression.

molify.draw_molecular_graph(graph: Graph, layout: Literal['spring', 'circular', 'kamada_kawai'] = 'kamada_kawai', figsize: tuple[float, float] = (8, 8), node_size: int = 500, font_size: int | None = None, show_bond_orders: bool = True, weight_by_bond_order: bool = True, ax: Axes | None = None) → Figure[source]¶

Draw a molecular graph with bond order visualization.

This function visualizes NetworkX molecular graphs with proper representation of chemical bonds. Single, double, triple, and aromatic bonds are displayed with appropriate line styles.

Parameters:

graph (nx.Graph) – NetworkX graph with node attribute ‘atomic_number’ and edge attribute ‘bond_order’. Can be created from molify conversion functions like ase2networkx() or rdkit2networkx().
layout (Literal["spring", "circular", "kamada_kawai"], optional) –
Layout algorithm to use for node positioning. Options are:
- spring: Force-directed layout
- circular: Nodes arranged in a circle
- kamada_kawai: Force-directed using Kamada-Kawai algorithm (default)
figsize (tuple[float, float], optional) – Figure size as (width, height) in inches. Default is (8, 8).
node_size (int, optional) – Size of the nodes. Default is 500.
font_size (int, optional) – Font size for node labels (atomic numbers). If None, automatically scaled based on node_size (approximately 0.66 * sqrt(node_size)). Default is None.
show_bond_orders (bool, optional) – If True, visualize different bond orders with multiple parallel lines. Default is True.
weight_by_bond_order (bool, optional) – If True and layout is ‘spring’, use bond_order as edge weights in the spring layout algorithm. Higher bond orders pull atoms closer together. This makes double/triple bonds appear shorter than single bonds. Default is True.
ax (Axes, optional) – Matplotlib axes to draw on. If None, creates a new figure.

Returns:

Matplotlib figure containing the molecular graph visualization.

Return type:

Figure

Notes

Node Colors:

Nodes are colored using Jmol color scheme based on atomic numbers
Common colors: Carbon (gray), Hydrogen (white), Oxygen (red), Nitrogen (blue)

Bond Order Visualization:

Single bond (1.0): 1 solid line
Double bond (2.0): 2 parallel solid lines
Triple bond (3.0): 3 parallel solid lines
Aromatic bond (1.5): 1 solid line + 1 dashed line (parallel)
Unknown (None): 1 thin solid line

Examples

>>> from molify import smiles2atoms, ase2networkx, draw_molecular_graph
>>> atoms = smiles2atoms("C=C=O")  # Ketene
>>> graph = ase2networkx(atoms)
>>> fig = draw_molecular_graph(graph, layout='spring')

>>> # For benzene showing aromatic bonds
>>> benzene = smiles2atoms("c1ccccc1")
>>> graph_benzene = ase2networkx(benzene)
>>> fig = draw_molecular_graph(graph_benzene, show_bond_orders=True)

molify.get_centers_of_mass(atoms: Atoms, unwrap: bool = True, **kwargs) → Atoms[source]¶

Compute the center of mass for each molecule in an ASE Atoms object.

Parameters:

atoms (ase.Atoms) – The ASE Atoms object containing the molecular structures.
unwrap (bool, optional) – If True, unwrap the structures before computing the center of mass.
**kwargs (dict) – Additional keyword arguments to pass to the ase2networkx function.

Returns:

An ASE Atoms object containing the centers of mass of each molecule, with the masses of the molecules as the masses attribute. The species will start at 1 and increment for each unique molecule type.

Return type:

ase.Atoms

Example

>>> from molify import smiles2conformers, pack, get_centers_of_mass
>>> ethanol = smiles2conformers("CCO", numConfs=10)
>>> box = pack([ethanol], [10], density=786)
>>> centers = get_centers_of_mass(box)
>>> print(centers.get_positions().shape)
(10, 3)

molify.get_substructures(atoms: Atoms, pattern: str, **kwargs) → list[Atoms][source]¶

Extract all matched substructures from an ASE Atoms object.

This function converts ASE Atoms to RDKit Mol, finds all substructure matches, and returns ASE Atoms objects for each match.

Parameters:

atoms (ase.Atoms) – The structure to search in.
pattern (str) – SMARTS or SMILES pattern to match.
**kwargs – Additional keyword arguments passed to ase2rdkit for molecule conversion (e.g., suggestions for bond detection hints).

Returns:

List of ASE Atoms objects, each containing one matched substructure.

Return type:

list[ase.Atoms]

Examples

>>> from molify import smiles2atoms, get_substructures
>>>
>>> propanol = smiles2atoms("CCCO")
>>> carbons = get_substructures(propanol, "[#6]", suggestions=["CCCO"])
>>> len(carbons)
3

molify.group_matches_by_fragment(mol: Mol, matches: tuple[tuple[int, ...], ...]) → list[list[int]][source]¶

Group matched atom indices by disconnected molecular fragments.

This function takes substructure matches and organizes them by which disconnected fragment each match belongs to. Duplicate atoms within each fragment are removed.

Parameters:

mol (Chem.Mol) – RDKit molecule containing one or more disconnected fragments.
matches (tuple[tuple[int, ...], ...]) – Matches returned from match_substructure, where each tuple contains atom indices for one match.

Returns:

List of atom index lists, one per fragment. Each inner list contains unique atom indices for all matches within that fragment. Fragments with no matches are omitted.

Return type:

list[list[int]]

Examples

>>> from rdkit import Chem
>>> from molify import match_substructure, group_matches_by_fragment
>>>
>>> # Molecule with two disconnected fragments
>>> mol = Chem.MolFromSmiles("CCO.CF")
>>> matches = match_substructure(mol, "[#6]")
>>> matches
((0,), (1,), (3,))
>>>
>>> # Group by fragment
>>> group_matches_by_fragment(mol, matches)
[[0, 1], [3]]

molify.iter_fragments(atoms: Atoms) → list[Atoms][source]¶

Iterate over connected molecular fragments in an ASE Atoms object.

If a ‘connectivity’ field is present in atoms.info, it will be used to determine fragments. Otherwise, ase.build.separate will be used.

Parameters:: atoms (ase.Atoms) – A structure that may contain one or more molecular fragments.
Yields:: ase.Atoms – Each connected component (fragment) in the input structure.

Examples

>>> from molify import smiles2atoms
>>> from rdkit.Chem import CombineMols
>>>
>>> # Create multi-fragment system
>>> ethanol = smiles2atoms("CCO")
>>> methanol = smiles2atoms("CO")
>>> combined = ethanol + methanol
>>>
>>> # Iterate over fragments
>>> fragments = list(iter_fragments(combined))
>>> len(fragments)
2

molify.match_substructure(mol: Mol, smarts_or_smiles: str, hydrogens: Literal['include', 'exclude', 'isolated'] = 'exclude', mapped_only: bool = False) → tuple[tuple[int, ...], ...][source]¶

Find all matches of a substructure pattern in an RDKit molecule.

This function performs substructure matching with advanced features including hydrogen handling and atom mapping support. All matches are returned, including matches across multiple disconnected fragments.

Parameters:

mol (Chem.Mol) – RDKit molecule to search. Use molify.ase2rdkit(atoms, suggestions=[...]) to convert from ASE Atoms with explicit control over bond detection.
smarts_or_smiles (str) – SMARTS pattern or mapped SMILES pattern (e.g., “[C:1][C:2]O”). If atom maps are present and mapped_only=True, only mapped atoms are returned in map number order.
hydrogens ({'exclude', 'include', 'isolated'}, default='exclude') –
Controls hydrogen atom inclusion in matches:
- ’exclude’: Return only heavy atoms (default)
- ’include’: Return heavy atoms followed by their bonded hydrogens
- ’isolated’: Return only hydrogens bonded to matched heavy atoms
mapped_only (bool, default=False) – If True and pattern contains atom maps (e.g., [C:1]), return only the mapped atoms ordered by map number. If False, return all matched atoms.

Returns:

Tuple of matches, where each match is a tuple of atom indices. For patterns with multiple matches, each match is returned as a separate tuple.

Return type:

tuple[tuple[int, …], …]

Raises:

ValueError – If the SMARTS/SMILES pattern is invalid or if atom map labels are used multiple times within the same pattern.

Examples

>>> from molify import smiles2atoms, ase2rdkit, match_substructure
>>>
>>> # Explicit conversion with suggestions for better bond detection
>>> atoms = smiles2atoms("CCO")
>>> mol = ase2rdkit(atoms, suggestions=["CCO"])
>>>
>>> # Find all carbon atoms
>>> match_substructure(mol, "[#6]")
((0,), (1,))
>>>
>>> # Find C-O bond with hydrogens included
>>> match_substructure(mol, "CO", hydrogens="include")
((1, 6, 7, 2, 8),)
>>>
>>> # Use atom mapping to select specific atoms in order
>>> match_substructure(mol, "[C:2][C:1]O", mapped_only=True)
((1, 0),)  # Returns in map order: C:2 then C:1
>>>
>>> # Multiple matches in propanol
>>> propanol = ase2rdkit(smiles2atoms("CCCO"))
>>> match_substructure(propanol, "[#6]")
((0,), (1,), (2,))

molify.networkx2ase(graph: Graph) → Atoms[source]¶

Convert a NetworkX graph to an ASE Atoms object.

Parameters:

graph (networkx.Graph) –

The molecular graph to convert. Node attributes must include:

position (numpy.ndarray): Cartesian coordinates
atomic_number (int): Element atomic number
charge (float, optional): Formal charge

Edge attributes must include:

bond_order (float): Bond order

Optional graph attributes:

pbc (bool): Periodic boundary conditions
cell (numpy.ndarray): Unit cell vectors

Returns:

The resulting Atoms object with:

Atomic positions and numbers
Initial charges if present in graph
Connectivity information stored in atoms.info
PBC and cell if present in graph

Return type:

ase.Atoms

Examples

>>> import networkx as nx
>>> from molify import networkx2ase
>>> graph = nx.Graph()
>>> graph.add_node(0, position=[0,0,0], atomic_number=1, charge=0)
>>> graph.add_node(1, position=[1,0,0], atomic_number=1, charge=0)
>>> graph.add_edge(0, 1, bond_order=1.0)
>>> atoms = networkx2ase(graph)
>>> len(atoms)
2

molify.networkx2rdkit(graph: Graph, suggestions: list[str] | None = None) → Mol[source]¶

Convert a NetworkX graph to an RDKit molecule.

Automatically determines bond orders for edges with bond_order=None.

Parameters:

graph (networkx.Graph) –
The molecular graph to convert
Node attributes:
- atomic_number (int): Element atomic number
- charge (float, optional): Formal charge
Edge attributes:
- bond_order (float or None): Bond order. If None, will be determined automatically.
suggestions (list[str], optional) – Optional SMILES/SMARTS patterns to help determine bond orders. Used when edges have bond_order=None.

Returns:

The resulting RDKit molecule with:

Atoms and bonds from the graph
Formal charges if specified
All bond orders determined
Sanitized molecular structure

Return type:

rdkit.Chem.Mol

Raises:

ValueError – If nodes are missing atomic_number attribute, or if bond order determination fails for any edge.

Examples

>>> import networkx as nx
>>> from molify import networkx2rdkit
>>> graph = nx.Graph()
>>> graph.add_node(0, atomic_number=6, charge=0)
>>> graph.add_node(1, atomic_number=8, charge=0)
>>> graph.add_edge(0, 1, bond_order=2.0)
>>> mol = networkx2rdkit(graph)
>>> mol.GetNumAtoms()
2

>>> # With automatic bond order determination
>>> graph2 = nx.Graph()
>>> graph2.add_node(0, atomic_number=6, charge=0, position=[0, 0, 0])
>>> graph2.add_node(1, atomic_number=8, charge=0, position=[1.2, 0, 0])
>>> graph2.add_edge(0, 1, bond_order=None)  # Will be determined
>>> mol2 = networkx2rdkit(graph2)

molify.pack(data: list[list[Atoms]], counts: list[int], density: float, seed: int = 42, tolerance: float = 2, verbose: bool = False, packmol: str = 'packmol', pbc: bool = True, output_format: Literal['pdb', 'xyz'] = 'pdb', ratio: tuple[float, float, float] = (1.0, 1.0, 1.0)) → Atoms[source]¶

Packs the given molecules into a box with the specified density using PACKMOL.

Parameters:

data (list[list[ase.Atoms]]) – A list of lists of ASE Atoms objects representing the molecules to be packed.
counts (list[int]) – A list of integers representing the number of each type of molecule.
density (float) – The target density of the packed system in kg/m^3.
seed (int, optional) – The random seed for reproducibility, by default 42.
tolerance (float, optional) – The tolerance for the packing algorithm, by default 2.
verbose (bool, optional) – If True, enables logging of the packing process, by default False.
packmol (str, optional) – The path to the packmol executable.
pbc (bool, optional) – Ensure tolerance across periodic boundaries, by default True.
output_format (str, optional) – The file format used for communication with packmol, by default “pdb”. WARNING: Do not use “xyz”. This might cause issues and is only implemented for debugging purposes.
ratio (tuple[float, float, float], optional) – Box aspect ratio (a:b:c). Must be three positive, finite numbers. Defaults to (1.0, 1.0, 1.0) for a cubic box.

Returns:

An ASE Atoms object representing the packed system.

Return type:

ase.Atoms

Example

>>> from molify import pack, smiles2conformers
>>> water = smiles2conformers("O", 1)
>>> ethanol = smiles2conformers("CCO", 1)
>>> density = 1000  # kg/m^3
>>> packed_system = pack([water, ethanol], [7, 5], density)
>>> print(packed_system)
Atoms(symbols='C10H44O12', pbc=True, cell=[8.4, 8.4, 8.4])

molify.rdkit2ase(mol: Mol, seed: int = 42) → Atoms[source]¶

Convert an RDKit molecule to an ASE Atoms object.

Parameters:

mol (rdkit.Chem.Mol) – RDKit molecule to convert
seed (int, optional) – Random seed for conformer generation (default is 42)

Returns:

ASE Atoms object with: - Atomic positions from conformer coordinates - Atomic numbers from molecular structure - Connectivity information in atoms.info - Formal charges if present - SMILES string in atoms.info

Return type:

ase.Atoms

Examples

>>> from rdkit import Chem
>>> from molify import rdkit2ase
>>> mol = Chem.MolFromSmiles('CCO')
>>> atoms = rdkit2ase(mol)
>>> len(atoms)
9

molify.rdkit2networkx(mol: Mol) → Graph[source]¶

Convert an RDKit molecule to a NetworkX graph.

Parameters:

mol (rdkit.Chem.Mol) – RDKit molecule object to be converted

Returns:

Undirected graph representing the molecule where:

Nodes contain:

atomic_number (int): Atomic number
original_index (int): RDKit atom index
charge (int): Formal charge

Edges contain:

bond_order (float): Bond order (1.0, 1.5, 2.0, or 3.0)

Return type:

networkx.Graph

Notes

Bond orders are converted as follows:

SINGLE -> 1.0
DOUBLE -> 2.0
TRIPLE -> 3.0
AROMATIC -> 1.5

Examples

>>> from rdkit import Chem
>>> from molify import rdkit2networkx
>>> mol = Chem.MolFromSmiles('C=O')
>>> graph = rdkit2networkx(mol)
>>> len(graph.nodes)
4
>>> len(graph.edges)
3

molify.smiles2atoms(smiles: str, seed: int = 42) → Atoms[source]¶

Convert a SMILES string to an ASE Atoms object.

Parameters:

smiles (str) – The SMILES string to convert.
seed (int, optional) – Random seed for conformer generation (default is 42).

Returns:

The generated Atoms object (first conformer).

Return type:

ase.Atoms

Notes

This is a convenience wrapper around smiles2conformers that returns just the first conformer.

Examples

>>> from molify import smiles2atoms
>>> import ase
>>> atoms = smiles2atoms("CCO")
>>> isinstance(atoms, ase.Atoms)
True

molify.smiles2conformers(smiles: str, numConfs: int, randomSeed: int = 42, maxAttempts: int = 1000) → list[Atoms][source]¶

Generate multiple molecular conformers from a SMILES string.

Parameters:

smiles (str) – The SMILES string to convert.
numConfs (int) – Number of conformers to generate.
randomSeed (int, optional) – Random seed for conformer generation (default is 42).
maxAttempts (int, optional) – Maximum number of embedding attempts (default is 1000).

Returns:

List of generated conformers as ASE Atoms objects.

Return type:

list[ase.Atoms]

Notes

Special handling is included for PF6- (hexafluorophosphate) which is treated as a special case.

Each Atoms object in the returned list includes: - Atomic positions - Atomic numbers - SMILES string in the info dictionary - Connectivity information in the info dictionary - Formal charges when present

Examples

>>> from molify import smiles2conformers
>>> import ase
>>> frames = smiles2conformers("CCO", numConfs=3)
>>> len(frames)
3
>>> all(isinstance(atoms, ase.Atoms) for atoms in frames)
True

molify.unwrap_structures(atoms, scale=1.2, **kwargs) → Atoms[source]¶

Unwrap molecular structures across periodic boundary conditions (PBC).

This function corrects atomic positions that have been wrapped across periodic boundaries, ensuring that bonded atoms appear as continuous molecular structures. It can handle multiple disconnected molecules within the same unit cell.

The algorithm works by: 1. Building a connectivity graph based on covalent radii 2. Traversing each connected component (molecule) using depth-first search 3. Accumulating periodic image shifts to maintain molecular connectivity 4. Applying the shifts to obtain unwrapped coordinates

Parameters:

atoms (ase.Atoms) – The ASE Atoms object containing wrapped atomic positions
scale (float, optional) – Scale factor for covalent radii cutoffs used in bond detection. Larger values include more distant neighbors as bonded. Default is 1.2.
**kwargs (dict) – Additional keyword arguments to pass to the ase2networkx function.

Returns:

A new ASE Atoms object with unwrapped atomic positions. The original atoms object is not modified.

Return type:

ase.Atoms

Notes

The function preserves the original atoms object and returns a copy
Works with any periodic boundary conditions (1D, 2D, or 3D)
Handles multiple disconnected molecules/fragments
Uses covalent radii scaled by the scale parameter for bond detection

Examples

>>> import numpy as np
>>> from molify import smiles2conformers, pack, unwrap_structures
>>>
>>> # Create a realistic molecular system
>>> water = smiles2conformers("O", numConfs=1)
>>> ethanol = smiles2conformers("CCO", numConfs=1)
>>>
>>> # Pack molecules into a periodic box
>>> packed_system = pack(
...     data=[water, ethanol],
...     counts=[10, 5],
...     density=800
... )
>>>
>>> # Simulate wrapped coordinates (as might occur in MD)
>>> cell = packed_system.get_cell()
>>> positions = packed_system.get_positions()
>>>
>>> # Artificially move the box and wrap it again
>>> positions += np.array([cell[0, 0] * 0.5, 0, 0])  # Shift by half box length
>>> wrapped_atoms = packed_system.copy()
>>> wrapped_atoms.set_positions(positions)
>>> wrapped_atoms.wrap()  # Wrap back into PBC
>>>
>>> # Unwrap the structures to get continuous molecules
>>> unwrapped_atoms = unwrap_structures(wrapped_atoms)

molify.visualize_selected_molecules(mol: Mol, *args, mols_per_row: int = 4, sub_img_size: tuple[int, int] = (200, 200), legends: list[str] | None = None, alpha: float = 0.5)[source]¶

Visualize molecules with optional atom highlighting.

If no atom selections are provided, displays the molecule without highlights. Duplicate molecular structures will only be plotted once.

Parameters:

mol (Chem.Mol) – The RDKit molecule object, which may contain multiple fragments.
*args (list[int] or tuple[int]) – Variable number of lists/tuples containing atom indices to be highlighted. Each selection will be assigned a different color from matplotlib’s tab10 colormap. If no arguments provided, displays the molecule without highlights.
mols_per_row (int, default=4) – Number of molecules per row in the grid.
sub_img_size (tuple[int, int], default=(200, 200)) – Size of each molecule image in pixels.
legends (list[str], optional) – Custom legends for each molecule. If None, default legends will be used.
alpha (float, default=0.5) – Transparency level for the highlighted atoms (0.0 = fully transparent, 1.0 = opaque).

Returns:

A PIL image object of the grid.

Return type:

PIL.Image

Examples

>>> from molify import smiles2atoms, ase2rdkit, match_substructure
>>> from molify import visualize_selected_molecules
>>>
>>> # Create and convert molecule
>>> mol = ase2rdkit(smiles2atoms("Cc1ccccc1"))
>>>
>>> # Find aromatic and aliphatic carbons
>>> aromatic = match_substructure(mol, "c")
>>> aliphatic = match_substructure(mol, "[C;!c]")
>>>
>>> # Flatten matches for visualization
>>> aromatic_flat = [idx for match in aromatic for idx in match]
>>> aliphatic_flat = [idx for match in aliphatic for idx in match]
>>>
>>> # Visualize with highlights
>>> visualize_selected_molecules(mol, aromatic_flat, aliphatic_flat)
<PIL.Image>