API Reference

rdkit2ase.ase2networkx(atoms: Atoms, suggestions: list[str] | None = None, pbc: bool = True, scale: float = 1.2) → Graph

Convert an ASE Atoms object to a NetworkX graph with bonding information.

Parameters:
  • atoms (ase.Atoms) – The ASE Atoms object to convert into a graph.

  • suggestions (list[str], optional) – List of SMILES patterns to suggest bond orders (default is None). If None, bond order is only determined from connectivity. If you provide an empty list, bond order will be determined using rdkit’s bond order determination algorithm. If SMILES patterns are provided, they will be used to suggest bond orders first, and then rdkit’s bond order determination algorithm will be used.

  • pbc (bool, optional) – Whether to consider periodic boundary conditions when calculating distances (default is True). If False, only connections within the unit cell are considered.

  • scale (float, optional) – Scaling factor for the covalent radii when determining bond cutoffs (default is 1.2).

Returns:

An undirected NetworkX graph with atomic properties and bonding information.

Return type:

networkx.Graph

Notes

The graph contains the following information:

  • Nodes represent atoms with properties:
    • position: Cartesian coordinates (numpy.ndarray)

    • atomic_number: Element atomic number (int)

    • original_index: Index in original Atoms object (int)

    • charge: Formal charge (float)

  • Edges represent bonds with:
    • bond_order: Bond order (float or None if unknown)

  • Graph properties include:
    • pbc: Periodic boundary conditions

    • cell: Unit cell vectors

Connectivity is determined by:

  1. Using explicit connectivity if present in atoms.info

  2. Otherwise, using distance-based cutoffs together with any SMILES patterns given in suggestions

  3. Using rdkit’s bond order determination algorithm if no suggestions are provided
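
The distance-based step can be illustrated with a minimal sketch. It assumes that two atoms count as bonded when their distance is below scale times the sum of their covalent radii; the helper name detect_bonds and the small radii table are hypothetical and are not part of rdkit2ase.

```python
import math
from itertools import combinations

# Covalent radii in Å for a few elements, keyed by atomic number.
# Illustrative values only; a real implementation would use a full table.
COVALENT_RADII = {1: 0.31, 6: 0.76, 8: 0.66}


def detect_bonds(numbers, positions, scale=1.2):
    """Return index pairs (i, j) whose distance is below the scaled cutoff.

    Hypothetical helper: the cutoff rule scale * (r_i + r_j) is an assumption.
    """
    bonds = []
    for i, j in combinations(range(len(numbers)), 2):
        cutoff = scale * (COVALENT_RADII[numbers[i]] + COVALENT_RADII[numbers[j]])
        if math.dist(positions[i], positions[j]) < cutoff:
            bonds.append((i, j))
    return bonds


# Water: one oxygen and two hydrogens (coordinates in Å).
numbers = [8, 1, 1]
positions = [(0.0, 0.0, 0.0), (0.9572, 0.0, 0.0), (-0.24, 0.9266, 0.0)]
bonds = detect_bonds(numbers, positions)
print(bonds)  # [(0, 1), (0, 2)]
```

The two O-H pairs fall below the cutoff while the H-H pair does not, which agrees with the three nodes and two edges reported for water in the example for this function. This sketch ignores periodic boundary conditions.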

Examples

>>> from rdkit2ase import ase2networkx, smiles2atoms
>>> atoms = smiles2atoms(smiles="O")
>>> graph = ase2networkx(atoms)
>>> len(graph.nodes)
3
>>> len(graph.edges)
2

rdkit2ase.ase2rdkit(atoms: Atoms, suggestions: list[str] | None = None) → Mol

Convert an ASE Atoms object to an RDKit molecule.

Parameters:
  • atoms (ase.Atoms) – The ASE Atoms object to convert.

  • suggestions (list[str], optional) – List of SMARTS patterns to suggest bond orders (default is None).

Returns:

The resulting RDKit molecule with 3D coordinates.

Return type:

rdkit.Chem.Mol

Notes

This function first converts the Atoms object to a NetworkX graph using ase2networkx, then converts the graph to an RDKit molecule using networkx2rdkit.

Examples

>>> from rdkit2ase import ase2rdkit, smiles2atoms
>>> atoms = smiles2atoms(smiles="C=O")
>>> mol = ase2rdkit(atoms)
>>> mol.GetNumAtoms()
4

rdkit2ase.compress(atoms: Atoms, density: float, freeze_molecules: bool = False) → Atoms

Compress an ASE Atoms object to a target density.

Parameters:
  • atoms (ase.Atoms) – The Atoms object to compress.

  • density (float) – The target density in g/cm^3.

  • freeze_molecules (bool) – If True, freeze the internal degrees of freedom of the molecules during compression, to prevent bond compression.
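
To illustrate what keeping molecules rigid during compression could mean, here is a minimal sketch. It assumes an isotropic rescaling in which each molecule is translated with its centroid while its internal geometry stays fixed. The function rigid_rescale is hypothetical and is not how rdkit2ase is confirmed to implement compress.

```python
import math


def rigid_rescale(molecules, current_density, target_density):
    """Rescale molecule centroids to a new density without changing bond lengths.

    Hypothetical helper. `molecules` is a list of molecules, each a list of
    (x, y, z) tuples. The cell would have to be scaled by the same factor.
    """
    # Volume is inversely proportional to density, and lengths scale with
    # the cube root of the volume ratio.
    factor = (current_density / target_density) ** (1.0 / 3.0)
    rescaled = []
    for mol in molecules:
        # Geometric centroid (a mass-weighted center would work the same way).
        centroid = [sum(p[k] for p in mol) / len(mol) for k in range(3)]
        shift = [(factor - 1.0) * c for c in centroid]
        rescaled.append([tuple(p[k] + shift[k] for k in range(3)) for p in mol])
    return factor, rescaled


molecules = [
    [(1.0, 1.0, 1.0), (2.0, 1.0, 1.0)],
    [(5.0, 5.0, 5.0), (5.0, 6.0, 5.0)],
]
factor, rescaled = rigid_rescale(molecules, current_density=0.5, target_density=1.0)
bond_before = math.dist(*molecules[0])
bond_after = math.dist(*rescaled[0])
```

Because every atom of a molecule receives the same shift, bond_after equals bond_before, whereas scaling all atomic positions directly would shorten the bonds by the same factor as the cell.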

rdkit2ase.get_centers_of_mass(atoms: Atoms, unwrap: bool = True, **kwargs) → Atoms

Compute the center of mass for each molecule in an ASE Atoms object.

Parameters:
  • atoms (ase.Atoms) – The ASE Atoms object containing the molecular structures.

  • unwrap (bool, optional) – If True, unwrap the structures before computing the center of mass.

  • **kwargs (dict) – Additional keyword arguments to pass to the ase2networkx function.

Returns:

An ASE Atoms object containing the centers of mass of each molecule, with the masses of the molecules as the masses attribute. The species will start at 1 and increment for each unique molecule type.

Return type:

ase.Atoms

Example

>>> from rdkit2ase import smiles2conformers, pack, get_centers_of_mass
>>> ethanol = smiles2conformers("CCO", numConfs=10)
>>> box = pack([ethanol], [10], density=786)
>>> centers = get_centers_of_mass(box)
>>> print(centers.get_positions().shape)
(10, 3)
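
The shape of the returned data can be sketched in plain Python. This is an illustration only: it assumes molecules are already separated and unwrapped, and that molecule types are told apart by their composition. The function centers_of_mass and the mass table are hypothetical.

```python
# Standard atomic weights in amu for the elements used below.
MASSES = {"H": 1.008, "C": 12.011, "O": 15.999}


def centers_of_mass(molecules):
    """Return (type_id, total_mass, center_of_mass) for each molecule.

    Hypothetical helper. `molecules` is a list of (symbols, positions) pairs.
    Type ids start at 1 and increase with each new composition encountered.
    """
    type_ids = {}
    result = []
    for symbols, positions in molecules:
        masses = [MASSES[s] for s in symbols]
        total = sum(masses)
        com = tuple(
            sum(m * p[k] for m, p in zip(masses, positions)) / total
            for k in range(3)
        )
        key = tuple(sorted(symbols))  # composition identifies the molecule type
        type_id = type_ids.setdefault(key, len(type_ids) + 1)
        result.append((type_id, total, com))
    return result


molecules = [
    (["H", "H"], [(0.0, 0.0, 0.0), (0.74, 0.0, 0.0)]),
    (["O", "H", "H"], [(0.0, 0.0, 0.0), (0.9572, 0.0, 0.0), (-0.24, 0.9266, 0.0)]),
    (["H", "H"], [(5.0, 0.0, 0.0), (5.74, 0.0, 0.0)]),
]
result = centers_of_mass(molecules)
print([type_id for type_id, _, _ in result])  # [1, 2, 1]
```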
rdkit2ase.get_substructures(atoms: Atoms, **kwargs) → list[Atoms]

Extract all matched substructures from an ASE Atoms object.

Parameters:
  • atoms (ase.Atoms) – The structure to search in.

  • smarts (str, optional) – A SMARTS string to match substructures.

  • smiles (str, optional) – A SMILES string to match substructures.

  • mol (Chem.Mol, optional) – An RDKit Mol object to match substructures.

  • fragment (ase.Atoms, optional) – A specific ASE Atoms object to match against the structure.

  • **kwargs – Additional keyword arguments passed to match_substructure.

Returns:

List of substructure fragments matching the pattern.

Return type:

list of ase.Atoms

rdkit2ase.iter_fragments(atoms: Atoms) → list[Atoms]

Iterate over connected molecular fragments in an ASE Atoms object.

If a ‘connectivity’ field is present in atoms.info, it will be used to determine fragments. Otherwise, ase.build.separate will be used.

Parameters:

atoms (ase.Atoms) – A structure that may contain one or more molecular fragments.

Yields:

ase.Atoms – Each connected component (fragment) in the input structure.
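
Splitting a structure into fragments from a connectivity list amounts to finding connected components. The sketch below assumes the connectivity is a list of (i, j, bond_order) tuples; that layout and the generator name connected_fragments are assumptions made for illustration.

```python
def connected_fragments(n_atoms, connectivity):
    """Yield sorted atom-index lists, one per connected component.

    Hypothetical helper. `connectivity` is assumed to hold
    (i, j, bond_order) tuples.
    """
    neighbors = {i: set() for i in range(n_atoms)}
    for i, j, _bond_order in connectivity:
        neighbors[i].add(j)
        neighbors[j].add(i)

    seen = set()
    for start in range(n_atoms):
        if start in seen:
            continue
        # Depth-first traversal of the component containing `start`.
        seen.add(start)
        stack = [start]
        fragment = []
        while stack:
            atom = stack.pop()
            fragment.append(atom)
            for nb in neighbors[atom]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        yield sorted(fragment)


# Water (atoms 0-2) and a hydrogen molecule (atoms 3-4).
connectivity = [(0, 1, 1.0), (0, 2, 1.0), (3, 4, 1.0)]
fragments = list(connected_fragments(5, connectivity))
print(fragments)  # [[0, 1, 2], [3, 4]]
```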

rdkit2ase.match_substructure(atoms: Atoms, smiles: str | None = None, smarts: str | None = None, mol: Mol | None = None, fragment: Atoms | None = None, **kwargs) → tuple[tuple[int, ...]]

Find all matches of a substructure pattern in a given ASE Atoms object.

Parameters:
  • atoms (ase.Atoms) – The molecule or structure in which to search for substructure matches.

  • smiles (str, optional) – A SMILES string representing the substructure pattern to match.

  • smarts (str, optional) – A SMARTS string representing the substructure pattern to match.

  • mol (Chem.Mol, optional) – An RDKit Mol object representing the substructure pattern to match.

  • fragment (ase.Atoms, optional) – An ASE Atoms object representing the substructure pattern to match. If provided, it will be converted to an RDKit Mol object for matching.

  • **kwargs – Additional keyword arguments passed to ase2rdkit.

Returns:

A tuple of atom index tuples, each corresponding to one match of the pattern.

Return type:

tuple of tuple of int

rdkit2ase.networkx2ase(graph: Graph) → Atoms

Convert a NetworkX graph to an ASE Atoms object.

Parameters:

graph (networkx.Graph) –

The molecular graph to convert. Node attributes must include:
  • position (numpy.ndarray): Cartesian coordinates

  • atomic_number (int): Element atomic number

  • charge (float, optional): Formal charge

Edge attributes must include:
  • bond_order (float): Bond order

Optional graph attributes:
  • pbc (bool): Periodic boundary conditions

  • cell (numpy.ndarray): Unit cell vectors

Returns:

The resulting Atoms object with:
  • Atomic positions and numbers

  • Initial charges if present in graph

  • Connectivity information stored in atoms.info

  • PBC and cell if present in graph

Return type:

ase.Atoms

Examples

>>> import networkx as nx
>>> from rdkit2ase import networkx2ase
>>> graph = nx.Graph()
>>> graph.add_node(0, position=[0,0,0], atomic_number=1, charge=0)
>>> graph.add_node(1, position=[1,0,0], atomic_number=1, charge=0)
>>> graph.add_edge(0, 1, bond_order=1.0)
>>> atoms = networkx2ase(graph)
>>> len(atoms)
2

rdkit2ase.networkx2rdkit(graph: Graph) → Mol

Convert a NetworkX graph to an RDKit molecule.

Parameters:

graph (networkx.Graph) –

The molecular graph to convert

Node attributes:
  • atomic_number (int): Element atomic number

  • charge (float, optional): Formal charge

Edge attributes:
  • bond_order (float): Bond order

Returns:

The resulting RDKit molecule with:
  • Atoms and bonds from the graph

  • Formal charges if specified

  • Sanitized molecular structure

Return type:

rdkit.Chem.Mol

Raises:

ValueError – If nodes are missing the atomic_number attribute, or if edges are missing the bond_order attribute.

Examples

>>> import networkx as nx
>>> from rdkit2ase import networkx2rdkit
>>> graph = nx.Graph()
>>> graph.add_node(0, atomic_number=6, charge=0)
>>> graph.add_node(1, atomic_number=8, charge=0)
>>> graph.add_edge(0, 1, bond_order=2.0)
>>> mol = networkx2rdkit(graph)
>>> mol.GetNumAtoms()
2

rdkit2ase.pack(data: list[list[Atoms]], counts: list[int], density: float, seed: int = 42, tolerance: float = 2, verbose: bool = False, packmol: str = 'packmol', pbc: bool = True, output_format: Literal['pdb', 'xyz'] = 'pdb') → Atoms

Packs the given molecules into a box with the specified density using PACKMOL.

Parameters:
  • data (list[list[ase.Atoms]]) – A list of lists of ASE Atoms objects representing the molecules to be packed.

  • counts (list[int]) – A list of integers representing the number of each type of molecule.

  • density (float) – The target density of the packed system in kg/m^3.

  • seed (int, optional) – The random seed for reproducibility, by default 42.

  • tolerance (float, optional) – The tolerance for the packing algorithm, by default 2.

  • verbose (bool, optional) – If True, enables logging of the packing process, by default False.

  • packmol (str, optional) – The path to the packmol executable, by default “packmol”. When installing packmol via Julia, use “packmol.jl”.

  • pbc (bool, optional) – Ensure tolerance across periodic boundaries, by default True.

  • output_format (str, optional) – The file format used for communication with packmol, by default “pdb”. WARNING: Do not use “xyz”. This might cause issues and is only implemented for debugging purposes.

Returns:

An ASE Atoms object representing the packed system.

Return type:

ase.Atoms

Example

>>> from rdkit2ase import pack, smiles2conformers
>>> water = smiles2conformers("O", 1)
>>> ethanol = smiles2conformers("CCO", 1)
>>> density = 1000  # kg/m^3
>>> packed_system = pack([water, ethanol], [7, 5], density)
>>> print(packed_system)
Atoms(symbols='C10H44O12', pbc=True, cell=[8.4, 8.4, 8.4])
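
The cell size in the output above is consistent with simple arithmetic, assuming a cubic box whose volume is the total mass divided by the target density. The sketch below is a worked check, not the library's implementation; cubic_box_edge is a hypothetical helper.

```python
AMU_TO_KG = 1.66053906660e-27  # kilograms per atomic mass unit
MASSES = {"H": 1.008, "C": 12.011, "O": 15.999}  # atomic weights in amu


def cubic_box_edge(formulas, counts, density):
    """Edge length in Å of a cubic box holding the molecules at `density` (kg/m^3).

    Hypothetical helper. `formulas` maps element symbols to atom counts.
    """
    total_amu = sum(
        count * sum(MASSES[element] * n for element, n in formula.items())
        for formula, count in zip(formulas, counts)
    )
    volume_m3 = total_amu * AMU_TO_KG / density
    volume_angstrom3 = volume_m3 * 1e30  # 1 m^3 = 1e30 Å^3
    return volume_angstrom3 ** (1.0 / 3.0)


water = {"H": 2, "O": 1}
ethanol = {"C": 2, "H": 6, "O": 1}
edge = cubic_box_edge([water, ethanol], [7, 5], density=1000)
print(round(edge, 1))  # 8.4
```

Seven water and five ethanol molecules weigh about 356.45 amu in total, which at 1000 kg/m^3 occupies roughly 592 Å^3, giving an edge of about 8.4 Å.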

rdkit2ase.rdkit2ase(mol: Mol, seed: int = 42) → Atoms

Convert an RDKit molecule to an ASE Atoms object.

Parameters:
  • mol (rdkit.Chem.Mol) – RDKit molecule to convert

  • seed (int, optional) – Random seed for conformer generation (default is 42)

Returns:

ASE Atoms object with:
  • Atomic positions from conformer coordinates

  • Atomic numbers from molecular structure

  • Connectivity information in atoms.info

  • Formal charges if present

  • SMILES string in atoms.info

Return type:

ase.Atoms

Examples

>>> from rdkit import Chem
>>> from rdkit2ase import rdkit2ase
>>> mol = Chem.MolFromSmiles('CCO')
>>> atoms = rdkit2ase(mol)
>>> len(atoms)
9

rdkit2ase.rdkit2networkx(mol: Mol) → Graph

Convert an RDKit molecule to a NetworkX graph.

Parameters:

mol (rdkit.Chem.Mol) – RDKit molecule object to be converted

Returns:

Undirected graph representing the molecule where:

Nodes contain:
  • atomic_number (int): Atomic number

  • original_index (int): RDKit atom index

  • charge (int): Formal charge

Edges contain:
  • bond_order (float): Bond order (1.0, 1.5, 2.0, or 3.0)

Return type:

networkx.Graph

Notes

Bond orders are converted as follows:
  • SINGLE -> 1.0

  • DOUBLE -> 2.0

  • TRIPLE -> 3.0

  • AROMATIC -> 1.5
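
The conversion table above can be written as a simple lookup. The sketch below keys the table by bond type name; raising ValueError for other bond types is an assumption of this sketch, and both function names are hypothetical.

```python
# Bond type names mapped to the numeric bond orders listed above.
BOND_ORDERS = {"SINGLE": 1.0, "DOUBLE": 2.0, "TRIPLE": 3.0, "AROMATIC": 1.5}


def bond_type_to_order(name):
    """Translate a bond type name into a bond order (hypothetical helper)."""
    try:
        return BOND_ORDERS[name]
    except KeyError:
        raise ValueError(f"unsupported bond type: {name!r}") from None


def order_to_bond_type(order):
    """Inverse lookup from a bond order to its bond type name."""
    for name, value in BOND_ORDERS.items():
        if value == order:
            return name
    raise ValueError(f"unsupported bond order: {order!r}")


print(bond_type_to_order("AROMATIC"))  # 1.5
print(order_to_bond_type(2.0))  # DOUBLE
```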

Examples

>>> from rdkit import Chem
>>> from rdkit2ase import rdkit2networkx
>>> mol = Chem.MolFromSmiles('C=O')
>>> graph = rdkit2networkx(mol)
>>> len(graph.nodes)
4
>>> len(graph.edges)
3

rdkit2ase.select_atoms_flat_unique(mol: Mol, smarts_or_smiles: str, hydrogens: Literal['include', 'exclude', 'isolated'] = 'exclude') → list[int]

Selects a unique list of atom indices in a molecule using SMARTS or mapped SMILES. If the pattern contains atom maps (e.g., [C:1]), only the mapped atoms are returned. Otherwise, all atoms in the matched substructure are returned.

Parameters:
  • mol (Chem.Mol) – RDKit molecule, which can contain explicit hydrogens.

  • smarts_or_smiles (str) – SMARTS (e.g., “[F]”) or SMILES with atom maps (e.g., “C1[C:1]OC(=[O:1])O1”).

  • hydrogens ({"include", "exclude", "isolated"}, default "exclude") – How to handle hydrogens in the final returned list:
    • “include”: Include hydrogens attached to matched heavy atoms

    • “exclude”: Exclude all hydrogens from results (default)

    • “isolated”: Return only hydrogens attached to matched heavy atoms

Returns:

A single, flat list of unique integer atom indices matching the criteria.

Return type:

list[int]

Raises:

ValueError – If the SMARTS/SMILES pattern is invalid.
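
The three hydrogens modes can be illustrated on plain index lists. The sketch below is one reading of the descriptions above, not the library's code: the helper apply_hydrogen_mode, the neighbor-map input, and the choice to keep directly matched hydrogens in "include" mode are all assumptions.

```python
def apply_hydrogen_mode(matched, atomic_numbers, neighbors, mode="exclude"):
    """Post-process matched atom indices according to a hydrogens mode.

    Hypothetical helper. `neighbors` maps each atom index to the indices
    of its bonded atoms.
    """
    if mode not in ("include", "exclude", "isolated"):
        raise ValueError(f"unknown hydrogens mode: {mode!r}")
    selected = []
    for idx in matched:
        if atomic_numbers[idx] == 1:
            # Assumption: a directly matched hydrogen is kept only by "include".
            if mode == "include":
                selected.append(idx)
            continue
        if mode in ("include", "exclude"):
            selected.append(idx)
        if mode in ("include", "isolated"):
            selected.extend(n for n in neighbors[idx] if atomic_numbers[n] == 1)
    # Remove duplicates while preserving first-seen order.
    return list(dict.fromkeys(selected))


# Methanol with explicit hydrogens: C(0) O(1) H(2) H(3) H(4) on C, H(5) on O.
atomic_numbers = [6, 8, 1, 1, 1, 1]
neighbors = {0: [1, 2, 3, 4], 1: [0, 5], 2: [0], 3: [0], 4: [0], 5: [1]}

print(apply_hydrogen_mode([0], atomic_numbers, neighbors, "include"))   # [0, 2, 3, 4]
print(apply_hydrogen_mode([0], atomic_numbers, neighbors, "exclude"))   # [0]
print(apply_hydrogen_mode([0], atomic_numbers, neighbors, "isolated"))  # [2, 3, 4]
```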

rdkit2ase.select_atoms_grouped(mol: Mol, smarts_or_smiles: str, hydrogens: Literal['include', 'exclude', 'isolated'] = 'exclude') → list[list[int]]

Selects atom indices using SMARTS or SMILES, grouped by disconnected fragments.

This function identifies all substructure matches and returns a list of atom index lists. Each inner list corresponds to a unique, disconnected molecular fragment that contained at least one match.

If the pattern contains atom maps (e.g., “[C:1]”, “[C:2]”), only the mapped atoms are returned, ordered by their map numbers. Map numbers must be unique within the pattern. Otherwise, all atoms in the matched substructures are returned.

Parameters:
  • mol (rdchem.Mol) – RDKit molecule, which can contain multiple disconnected fragments and explicit hydrogens.

  • smarts_or_smiles (str) – SMARTS pattern (e.g., “[F]”) or SMILES with atom maps (e.g., “CC(=O)N[C:1]([C:2])[C:3](=O)[N:4]C”). When using mapped atoms, map numbers must be unique.

  • hydrogens ({'include', 'exclude', 'isolated'}, default='exclude') –

    How to handle hydrogens in the final returned list for each group:
    • ‘include’: Add hydrogens bonded to selected heavy atoms after each mapped atom.

    • ‘exclude’: Remove all hydrogens from the selection.

    • ‘isolated’: Return only the hydrogens that are bonded to selected heavy atoms.

Returns:

A list of integer lists. Each inner list contains the atom indices for a matched, disconnected fragment. For mapped patterns, atoms are ordered by their map numbers. Fragments with no matches are omitted from the output.

Return type:

list[list[int]]

Raises:

ValueError – If the provided SMARTS/SMILES pattern is invalid or if atom map labels are used multiple times within the same pattern.
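
The ordering rule for mapped patterns can be sketched on plain data. The example assumes one map number per pattern atom (0 meaning unmapped) and one matched molecule index per pattern atom; the function order_by_map_number is hypothetical.

```python
def order_by_map_number(map_numbers, match):
    """Return matched molecule indices, ordered by atom map number.

    Hypothetical helper.
    map_numbers: map number of each pattern atom, 0 if the atom is unmapped.
    match: molecule atom index matched by each pattern atom.
    """
    mapped = [(num, match[i]) for i, num in enumerate(map_numbers) if num > 0]
    numbers = [num for num, _ in mapped]
    if len(set(numbers)) != len(numbers):
        raise ValueError("atom map numbers must be unique within the pattern")
    if not mapped:
        # No atom maps: every atom of the matched substructure is returned.
        return list(match)
    return [idx for _, idx in sorted(mapped)]


# Pattern with three atoms; the second carries map number 2, the third 1.
print(order_by_map_number([0, 2, 1], (7, 3, 9)))  # [9, 3]
# Pattern without atom maps.
print(order_by_map_number([0, 0, 0], (7, 3, 9)))  # [7, 3, 9]
```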

Examples

>>> from rdkit import Chem
>>> from rdkit2ase import select_atoms_grouped
>>>
>>> # Molecule with two disconnected fragments: ethanol and fluoromethane
>>> mol = Chem.MolFromSmiles("CCO.CF")  # Indices: C(0)C(1)O(2) . C(3)F(4)
>>>
>>> # Select all carbon atoms
>>> select_atoms_grouped(mol, "[C]")
[[0, 1], [3]]
>>>
>>> # Select fluorine and its bonded carbon using 'include'
>>> select_atoms_grouped(mol, "[F]", hydrogens="include")
[[3, 4]]

rdkit2ase.smiles2atoms(smiles: str, seed: int = 42) → Atoms

Convert a SMILES string to an ASE Atoms object.

Parameters:
  • smiles (str) – The SMILES string to convert.

  • seed (int, optional) – Random seed for conformer generation (default is 42).

Returns:

The generated Atoms object (first conformer).

Return type:

ase.Atoms

Notes

This is a convenience wrapper around smiles2conformers that returns just the first conformer.

Examples

>>> from rdkit2ase import smiles2atoms
>>> import ase
>>> atoms = smiles2atoms("CCO")
>>> isinstance(atoms, ase.Atoms)
True

rdkit2ase.smiles2conformers(smiles: str, numConfs: int, randomSeed: int = 42, maxAttempts: int = 1000) → list[Atoms]

Generate multiple molecular conformers from a SMILES string.

Parameters:
  • smiles (str) – The SMILES string to convert.

  • numConfs (int) – Number of conformers to generate.

  • randomSeed (int, optional) – Random seed for conformer generation (default is 42).

  • maxAttempts (int, optional) – Maximum number of embedding attempts (default is 1000).

Returns:

List of generated conformers as ASE Atoms objects.

Return type:

list[ase.Atoms]

Notes

PF6- (hexafluorophosphate) is handled as a special case.

Each Atoms object in the returned list includes:
  • Atomic positions

  • Atomic numbers

  • SMILES string in the info dictionary

  • Connectivity information in the info dictionary

  • Formal charges when present

Examples

>>> from rdkit2ase import smiles2conformers
>>> import ase
>>> frames = smiles2conformers("CCO", numConfs=3)
>>> len(frames)
3
>>> all(isinstance(atoms, ase.Atoms) for atoms in frames)
True

rdkit2ase.unwrap_structures(atoms, scale=1.2, **kwargs) → Atoms

Unwrap molecular structures across periodic boundary conditions (PBC).

This function corrects atomic positions that have been wrapped across periodic boundaries, ensuring that bonded atoms appear as continuous molecular structures. It can handle multiple disconnected molecules within the same unit cell.

The algorithm works by:

  1. Building a connectivity graph based on covalent radii

  2. Traversing each connected component (molecule) using depth-first search

  3. Accumulating periodic image shifts to maintain molecular connectivity

  4. Applying the shifts to obtain unwrapped coordinates

Parameters:
  • atoms (ase.Atoms) – The ASE Atoms object containing wrapped atomic positions

  • scale (float, optional) – Scale factor for covalent radii cutoffs used in bond detection. Larger values include more distant neighbors as bonded. Default is 1.2.

  • **kwargs (dict) – Additional keyword arguments to pass to the ase2networkx function.

Returns:

A new ASE Atoms object with unwrapped atomic positions. The original atoms object is not modified.

Return type:

ase.Atoms

Notes

  • The function preserves the original atoms object and returns a copy

  • Works with any periodic boundary conditions (1D, 2D, or 3D)

  • Handles multiple disconnected molecules/fragments

  • Uses covalent radii scaled by the scale parameter for bond detection
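
The traversal described above can be sketched in plain Python. To stay short, the sketch assumes an orthorhombic cell and takes the bond list as given instead of deriving it from covalent radii; the function unwrap_positions is hypothetical and is not the library's implementation.

```python
def unwrap_positions(positions, bonds, cell_lengths):
    """Shift bonded atoms by whole cell vectors so molecules become contiguous.

    Hypothetical helper for an orthorhombic cell with edge lengths
    `cell_lengths`. `bonds` is a list of (i, j) index pairs.
    """
    adjacency = {i: [] for i in range(len(positions))}
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)

    unwrapped = [list(p) for p in positions]  # work on a copy
    visited = set()
    for start in range(len(positions)):
        if start in visited:
            continue
        visited.add(start)
        stack = [start]  # depth-first traversal of one molecule
        while stack:
            i = stack.pop()
            for j in adjacency[i]:
                if j in visited:
                    continue
                # Move j to the periodic image closest to the already
                # unwrapped atom i, so shifts accumulate along the molecule.
                for k, length in enumerate(cell_lengths):
                    delta = unwrapped[j][k] - unwrapped[i][k]
                    unwrapped[j][k] -= length * round(delta / length)
                visited.add(j)
                stack.append(j)
    return unwrapped


# A three-atom chain in a 10 Å box; atoms 1 and 2 were wrapped to the far side.
positions = [(9.0, 5.0, 5.0), (0.2, 5.0, 5.0), (1.4, 5.0, 5.0)]
bonds = [(0, 1), (1, 2)]
unwrapped = unwrap_positions(positions, bonds, (10.0, 10.0, 10.0))
```

After unwrapping, atoms 1 and 2 sit near x = 10.2 and x = 11.4, next to atom 0, and the input positions are left unchanged.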

Examples

>>> import numpy as np
>>> from rdkit2ase import smiles2conformers, pack, unwrap_structures
>>>
>>> # Create a realistic molecular system
>>> water = smiles2conformers("O", numConfs=1)
>>> ethanol = smiles2conformers("CCO", numConfs=1)
>>>
>>> # Pack molecules into a periodic box
>>> packed_system = pack(
...     data=[water, ethanol],
...     counts=[10, 5],
...     density=800
... )
>>>
>>> # Simulate wrapped coordinates (as might occur in MD)
>>> cell = packed_system.get_cell()
>>> positions = packed_system.get_positions()
>>>
>>> # Artificially move the box and wrap it again
>>> positions += np.array([cell[0, 0] * 0.5, 0, 0])  # Shift by half box length
>>> wrapped_atoms = packed_system.copy()
>>> wrapped_atoms.set_positions(positions)
>>> wrapped_atoms.wrap()  # Wrap back into PBC
>>>
>>> # Unwrap the structures to get continuous molecules
>>> unwrapped_atoms = unwrap_structures(wrapped_atoms)

rdkit2ase.visualize_selected_molecules(mol: Mol, *args, mols_per_row: int = 4, sub_img_size: tuple[int, int] = (200, 200), legends: list[str] | None = None, alpha: float = 0.5)

Visualizes molecules with optional atom highlighting. If no atom selections are provided, displays the molecule without highlights. Duplicate molecular structures will only be plotted once.

Parameters:
  • mol (Chem.Mol) – The RDKit molecule object, which may contain multiple fragments.

  • *args (list[int]) – Variable number of lists containing atom indices to be highlighted. Each list will be assigned a different color from matplotlib’s tab10 colormap. If no arguments provided, displays the molecule without highlights.

  • mols_per_row (int, default 4) – Number of molecules per row in the grid.

  • sub_img_size (tuple[int, int], default (200, 200)) – Size of each molecule image.

  • legends (list[str] | None, default None) – Custom legends for each molecule. If None, default legends will be used.

  • alpha (float, default 0.5) – Transparency level for the highlighted atoms (0.0 = fully transparent, 1.0 = opaque).

Returns:

A PIL image object of the grid.

Return type:

PIL.Image