Solute and solvent selections
The solute and solvent are defined by selecting subsets of atoms from the system. These subsets are defined by the AtomSelection
data structures.
To construct a AtomSelection
data structure, one needs to provide, at least, the (1-based) indices of the atoms that belong to the selection, and either the number of atoms of each molecule or the number of molecules in the selection.
Using the PDBTools package
The PDBTools package helps the construction of the solute and solvent data structures, by providing a convenient selection syntax. Additionally, it sets up the names of the atoms of the system in the data structure, which can be used to retrieve atom and and group contributions to MDDFs and coordination numbers.
For example, here we define a protein of a system as the solute:
julia> using ComplexMixtures, PDBTools
julia> atoms = readPDB(ComplexMixtures.Testing.pdbfile);
julia> protein = select(atoms, "protein");
julia> solute = AtomSelection(protein, nmols=1)
AtomSelection
1463 atoms belonging to 1 molecule(s).
Atoms per molecule: 1463
Number of groups: 1463
We need to inform the AtomSelection
function about the number of atoms of each molecule (using natomspermol=3
, for example), or the number of molecules (using nmols=1000
, for example), such that the atoms belonging to each molecule can be determined without ambiguity.
Now, we define the solvent of the system as the water molecules:
julia> water = select(atoms, "water");
julia> solvent = AtomSelection(water, natomspermol=3)
AtomSelection
58014 atoms belonging to 19338 molecule(s).
Atoms per molecule: 3
Number of groups: 3
Using VMD
VMD is a very popular and powerful package for visualization of simulations. It contains a very versatile library to read topologies and trajectory files, and a powerful selection syntax. The PDBTools.jl (v1.0 or greater) package provides a simple wrapper to VMD that allows using the same syntax at it supports.
For example, the solute can be defined with:
using ComplexMixtures, PDBTools
atoms = readPDB("system.pdb")
indices, names = select_with_vmd("./system.pdb", "protein", vmd="/usr/bin/vmd")
protein = atoms[indices]
solute = AtomSelection(protein, nmols=1)
The main advantage here is that all the file types that VMD supports are supported. But VMD needs to be installed and is run in background, and it takes a few seconds to be executed.
The VMDSelect
function also accepts an optional keyword parameter srcload
, which can be used to load custom scripts within vmd
before setting the selection. This allows the definition of tcl
scripts with custom selection macros, for instance. The usage would be:
using PDBTools
indices, names = select_with_vmd(
"file.pdb",
"resname MYRES";
srcload = [ "mymacros1.tcl", "mymacros2.tcl" ]
)
Which corresponds to source
ing each of the macro files in VMD before defining the selection with the custom MYRES
name.
VMD uses 0-based indexing and VMDselect
adjusts that. However, if a selection is performed by index, as with index 1
, VMD will select the second atom, and the output will be [2]
. AtomSelections by type, name, segment, residue name, etc, won't be a problem.
Predefinition of atom groups
Importantly, this should be only a concern for the solvation analysis of systems in which individual molecules are very large. This feature was introduced in version 2.0
of the package to support the study of small molecule distribution in virus structures, of millions of atoms.
By default, the contribution of each type of atom to the coordination number counts is stored, to allow the decomposition of the final MDDFs into any group contribution. However, when a structure, like a virus, has millions of atoms, storing the contribution of each atom becomes prohibitive in terms of memory. Thus, one may need to predefine the groups in which the contributions will be analyzed.
Here, we illustrate this feature by presselecting the acidic and basic residues of a protein:
julia> using ComplexMixtures, PDBTools
julia> atoms = readPDB(ComplexMixtures.Testing.pdbfile);
julia> protein = select(atoms, "protein");
julia> acidic_residues = select(atoms, "protein and acidic");
julia> basic_residues = select(atoms, "protein and basic");
julia> solute = AtomSelection(
protein,
nmols=1,
group_atom_indices = [ index.(acidic_residues), index.(basic_residues) ],
group_names = [ "acidic residues", "basic residues" ]
)
AtomSelection
1463 atoms belonging to 1 molecule(s).
Atoms per molecule: 1463
Number of groups: 2
In this example, then, the solute
AtomSelection
has two groups. The indices of the atoms of the groups are stored in the group_atom_indices
vector and the group names in the group_names
vector. The atom_group
auxiliary function is the most practical way to retrive the indices of the atoms of the group.
julia> atom_group(solute, "acidic residues")
162-element Vector{Int64}:
24
25
26
⋮
1436
1437
With these group selections predefined, the contributions of these groups to the MDDF or coordination numbers can be retrived directly from the result data structure with, for example:
julia> trajectory = Trajectory(trajectory_file, solute, solvent)
julia> result = mddf(trajectory, Options(bulk_range=(8.0, 12.0)));
julia> acidic_residue_contributions = contributions(result, SoluteGroup("acidic residues"))
Reference functions
ComplexMixtures.AtomSelection
— Typestruct AtomSelection
Structure that contains the information about the solute and solvent molecules.
nmols::Int64
natomspermol::Int64
indices::Vector{Int64}
custom_groups::Bool
group_atom_indices::Vector{Vector{Int64}}
group_names::Vector{String}
ComplexMixtures.AtomSelection
— MethodAtomSelection constructors
The AtomSelection
structure carries the information of the molecules that are going to be used to compute the MDDF. The structure can be initialized in different ways:
- Initialize the structure providing a vector of PDBTools.Atom(s).
AtomSelection(
atoms::AbstractVector{<:PDBTools.Atom};
nmols::Int = 0,
natomspermol::Int = 0,
group_atom_indices::Union{Nothing,Vector{Vector{Int}}} = nothing,
group_names::Vector{String} = String[]
)
The indices of the atoms will be retrived from the indices of the atoms as defined in the PDB file, thus the PDB file must correspond to the same system as that of the simulation.
Either the number of molecules (nmols
) or the number of atoms per molecule (natomspermol
) must be provided.
If group_atom_indices
is nothing
or group_names
is empty, the names of the groups will be retrieved from the atom names, and in the coordination numbers of each individual atom will be stored.
Example
julia> using ComplexMixtures, PDBTools
julia> pdbfile = ComplexMixtures.Testing.pdbfile;
julia> atoms = PDBTools.readPDB(pdbfile, "resname TMAO");
julia> atsel = AtomSelection(atoms, natomspermol=14)
AtomSelection
2534 atoms belonging to 181 molecule(s).
Atoms per molecule: 14
Number of groups: 14
julia> atom_group_name(atsel, 1)
"N"
julia> atom_group_name(atsel, 5)
"O1"
julia> length(atom_group_names(atsel))
14
- Lower level: initialize the structure providing the index of atoms and groups.
AtomSelection(
indices::Vector{Int};
nmols::Int = 0,
natomspermol::Int = 0,
group_atom_indices::Union{Nothing,Vector{Vector{Int}}} = nothing,
group_names::Vector{String} = String[]
)
Construct an AtomSelection structure from the most low-level information: the index of atoms and groups.
Either the number of molecules (nmols
) or the number of atoms per molecule (natomspermol
) must be provided.
Groups of atoms can be defined by providing a vector of vectors of atom indices (group_atom_indices
), and a vector of group names (group_names
). If group_atom_indices
is set to nothing
, the coordination numbers of each individual atoms wil be stored.
Examples
julia> using ComplexMixtures
julia> AtomSelection([1,2,3], nmols=1)
AtomSelection
3 atoms belonging to 1 molecule(s).
Atoms per molecule: 3
Number of groups: 3
julia> AtomSelection([1,2,3], natomspermol=1)
AtomSelection
3 atoms belonging to 3 molecule(s).
Atoms per molecule: 1
Number of groups: 1
julia> AtomSelection([1,2,3], natomspermol=1, group_atom_indices=[[1,2],[3]], group_names=["G1", "G2"])
AtomSelection
3 atoms belonging to 3 molecule(s).
Atoms per molecule: 1
Number of groups: 2
ComplexMixtures.SoluteGroup
— TypeSoluteGroup and SolventGroup data structures.
These structures are used to select groups of atoms to extract their contributions from the MDDF results.
Most tipically, the groups are defined from a selection of atoms with the PDBTools package, or by providing directly the indices of teh atoms in the structure.
Alternativelly, if the groups were predefined, the groups can be selected by group index or group name.
The possible constructors are:
SoluteGroup(atoms::Vector{PDBTools.Atom})
SoluteGroup(atom_indices::Vector{Int})
SoluteGroup(atom_names::Vector{String})
SoluteGroup(group_name::String)
SoluteGroup(residue::PDBTools.Residue)
SoluteGroup(atsel::AtomSelection)
above, each constructor can be replaced by SolventGroup. The resulting data structures are used as input parameters for the contributions
function:
contributions(results::Result, group::Union{SoluteGroup, SolventGroup}; type=:mddf)
See the contributions
help entry for additional information.
Examples
Defining solute groups with different input types:
julia> using ComplexMixtures, PDBTools
julia> atoms = PDBTools.readPDB(ComplexMixtures.Testing.pdbfile, "protein");
julia> SoluteGroup(select(atoms, "protein and resname ASP")) # vector of PDBTools.Atom(s)
SoluteGroup defined by:
atom_indices: [ 24, 25, ..., 1056, 1057 ] - 72 atoms
julia> SoluteGroup(1:100) # atom indices (range or vector)
SoluteGroup defined by:
atom_indices: [ 1, 2, ..., 99, 100 ] - 100 atoms
julia> SoluteGroup(["N", "CA", "C", "O"]) # vector of atom names
SoluteGroup defined by:
atom_names: [ N, CA, C, O ] - 4 atom names.
julia> SoluteGroup("acidic residues") # predefined group name
SoluteGroup defined by:
group_name: "acidic residues"
julia> SoluteGroup(1) # predefined group index
SoluteGroup defined by:
group_index: 1
julia> SoluteGroup(collect(eachresidue(atoms))[2]) # PDBTools.Residue(s)
SoluteGroup defined by:
atom_indices: [ 13, 14, ..., 22, 23 ] - 11 atoms
ComplexMixtures.SolventGroup
— TypeSoluteGroup and SolventGroup data structures.
These structures are used to select groups of atoms to extract their contributions from the MDDF results.
Most tipically, the groups are defined from a selection of atoms with the PDBTools package, or by providing directly the indices of teh atoms in the structure.
Alternativelly, if the groups were predefined, the groups can be selected by group index or group name.
The possible constructors are:
SoluteGroup(atoms::Vector{PDBTools.Atom})
SoluteGroup(atom_indices::Vector{Int})
SoluteGroup(atom_names::Vector{String})
SoluteGroup(group_name::String)
SoluteGroup(residue::PDBTools.Residue)
SoluteGroup(atsel::AtomSelection)
above, each constructor can be replaced by SolventGroup. The resulting data structures are used as input parameters for the contributions
function:
contributions(results::Result, group::Union{SoluteGroup, SolventGroup}; type=:mddf)
See the contributions
help entry for additional information.
Examples
Defining solute groups with different input types:
julia> using ComplexMixtures, PDBTools
julia> atoms = PDBTools.readPDB(ComplexMixtures.Testing.pdbfile, "protein");
julia> SoluteGroup(select(atoms, "protein and resname ASP")) # vector of PDBTools.Atom(s)
SoluteGroup defined by:
atom_indices: [ 24, 25, ..., 1056, 1057 ] - 72 atoms
julia> SoluteGroup(1:100) # atom indices (range or vector)
SoluteGroup defined by:
atom_indices: [ 1, 2, ..., 99, 100 ] - 100 atoms
julia> SoluteGroup(["N", "CA", "C", "O"]) # vector of atom names
SoluteGroup defined by:
atom_names: [ N, CA, C, O ] - 4 atom names.
julia> SoluteGroup("acidic residues") # predefined group name
SoluteGroup defined by:
group_name: "acidic residues"
julia> SoluteGroup(1) # predefined group index
SoluteGroup defined by:
group_index: 1
julia> SoluteGroup(collect(eachresidue(atoms))[2]) # PDBTools.Residue(s)
SoluteGroup defined by:
atom_indices: [ 13, 14, ..., 22, 23 ] - 11 atoms
ComplexMixtures.atom_group
— Methodatom_group(atsel::AtomSelection, i::Int)
atom_group(atsel::AtomSelection, groupname::String)
atom_group(atsel::AtomSelection, i::Int)
atom_group(atsel::AtomSelection, groupname::String)
Return the indices of the atoms that belong to a given group.
Example
julia> using ComplexMixtures
julia> atsel = AtomSelection([1,2,3], natomspermol=1, group_atom_indices=[[1,2],[3]], group_names=["G1", "G2"])
AtomSelection
3 atoms belonging to 3 molecule(s).
Atoms per molecule: 1
Number of groups: 2
julia> atom_group(atsel, 1)
2-element Vector{Int64}:
1
2
julia> atom_group(atsel, "G2")
1-element Vector{Int64}:
3
julia> atom_group_name(atsel, 1)
"G1"
ComplexMixtures.atom_group_name
— Methodatom_group_name(atsel::AtomSelection, i::Int)
atom_group_names(atsel::AtomSelection)
Return the name of the group of atoms with index i
. The atom_group_names
function returns a vector with the names of all the groups.
Example
julia> using ComplexMixtures
julia> atsel = AtomSelection([1,2,3], natomspermol=1, group_atom_indices=[[1,2],[3]], group_names=["G1", "G2"])
AtomSelection
3 atoms belonging to 3 molecule(s).
Atoms per molecule: 1
Number of groups: 2
julia> atom_group_name(atsel, 1)
"G1"
julia> atom_group_names(atsel)
2-element Vector{String}:
"G1"
"G2"
ComplexMixtures.atom_group_names
— Methodatom_group_name(atsel::AtomSelection, i::Int)
atom_group_names(atsel::AtomSelection)
Return the name of the group of atoms with index i
. The atom_group_names
function returns a vector with the names of all the groups.
Example
julia> using ComplexMixtures
julia> atsel = AtomSelection([1,2,3], natomspermol=1, group_atom_indices=[[1,2],[3]], group_names=["G1", "G2"])
AtomSelection
3 atoms belonging to 3 molecule(s).
Atoms per molecule: 1
Number of groups: 2
julia> atom_group_name(atsel, 1)
"G1"
julia> atom_group_names(atsel)
2-element Vector{String}:
"G1"
"G2"