Solute and solvent selections
The solute and solvent are defined by selecting subsets of atoms from the system. These subsets are defined by the AtomSelection
data structures.
To construct a AtomSelection
data structure, one needs to provide, at least, the (1-based) indices of the atoms that belong to the selection, and either the number of atoms of each molecule or the number of molecules in the selection.
ComplexMixtures.AtomSelection
— Typestruct AtomSelection
Structure that contains the information about the solute and solvent molecules.
nmols::Int64
natomspermol::Int64
indices::Vector{Int64}
custom_groups::Bool
group_atom_indices::Vector{Vector{Int64}}
group_names::Vector{String}
ComplexMixtures.AtomSelection
— MethodAtomSelection constructors
The AtomSelection
structure carries the information of the molecules that are going to be used to compute the MDDF. The structure can be initialized in different ways:
- Initialize the structure providing a vector of PDBTools.Atom(s).
AtomSelection(
atoms::AbstractVector{<:PDBTools.Atom};
nmols::Integer = 0,
natomspermol::Integer = 0,
group_atom_indices::Union{Nothing,Vector{<:Vector{<:Int}}} = nothing,
group_names::Vector{<:AbstractString} = String[]
)
The indices of the atoms will be retrived from the indices of the atoms as defined in the PDB file, thus the PDB file must correspond to the same system as that of the simulation.
Either the number of molecules (nmols
) or the number of atoms per molecule (natomspermol
) must be provided.
If group_atom_indices
is nothing
or group_names
is empty, the names of the groups will be retrieved from the atom names, and in the coordination numbers of each individual atom will be stored.
Example
julia> using ComplexMixtures, PDBTools
julia> pdbfile = ComplexMixtures.Testing.pdbfile;
julia> atoms = PDBTools.readPDB(pdbfile, "resname TMAO");
julia> atsel = AtomSelection(atoms, natomspermol=14)
AtomSelection
2534 atoms belonging to 181 molecule(s).
Atoms per molecule: 14
Number of groups: 14
julia> atom_group_name(atsel, 1)
"N"
julia> atom_group_name(atsel, 5)
"O1"
julia> length(atom_group_names(atsel))
14
- Lower level: initialize the structure providing the index of atoms and groups.
AtomSelection(
indices::AbstractVector{<:Integer};
nmols::Int = 0,
natomspermol::Int = 0,
group_atom_indices::Union{Nothing,AbstractVector{<:AbstractVector{<:Int}}} = nothing,
group_names::AbstractVector{<:AbstractString} = String[]
)
Construct an AtomSelection structure from the most low-level information: the index of atoms and groups.
Either the number of molecules (nmols
) or the number of atoms per molecule (natomspermol
) must be provided.
Groups of atoms can be defined by providing a vector of vectors of atom indices (group_atom_indices
), and a vector of group names (group_names
). If group_atom_indices
is set to nothing
, the coordination numbers of each individual atoms wil be stored.
Examples
julia> using ComplexMixtures
julia> AtomSelection([1,2,3], nmols=1)
AtomSelection
3 atoms belonging to 1 molecule(s).
Atoms per molecule: 3
Number of groups: 3
julia> AtomSelection([1,2,3], natomspermol=1)
AtomSelection
3 atoms belonging to 3 molecule(s).
Atoms per molecule: 1
Number of groups: 1
julia> AtomSelection([1,2,3], natomspermol=1, group_atom_indices=[[1,2],[3]], group_names=["G1", "G2"])
AtomSelection
3 atoms belonging to 3 molecule(s).
Atoms per molecule: 1
Number of groups: 2
Alternatively, and more practically, atom selections defined by the PDBTools
package, which can interface with the selection syntax of VMD
, can be used, as described in the following sections.
Using the PDBTools package
The PDBTools package helps the construction of the solute and solvent data structures, by providing a convenient selection syntax. Additionally, it sets up the names of the atoms of the system in the data structure, which can be used to retrieve atom and and group contributions to MDDFs and coordination numbers.
For example, here we define a protein of a system as the solute:
julia> using ComplexMixtures, PDBTools
julia> atoms = read_pdb(ComplexMixtures.Testing.pdbfile);
julia> protein = select(atoms, "protein");
julia> solute = AtomSelection(protein, nmols=1)
AtomSelection
1463 atoms belonging to 1 molecule(s).
Atoms per molecule: 1463
Number of groups: 1463
We need to inform the AtomSelection
function about the number of atoms of each molecule (using natomspermol=3
, for example), or the number of molecules (using nmols=1000
, for example), such that the atoms belonging to each molecule can be determined without ambiguity.
Now, we define the solvent of the system as the water molecules:
julia> water = select(atoms, "water");
julia> solvent = AtomSelection(water, natomspermol=3)
AtomSelection
58014 atoms belonging to 19338 molecule(s).
Atoms per molecule: 3
Number of groups: 3
Using VMD
VMD is a very popular and powerful package for visualization of simulations. It contains a very versatile library to read topologies and trajectory files, and a powerful selection syntax. The PDBTools.jl (v1.0 or greater) package provides a simple wrapper to VMD that allows using the same syntax at it supports.
For example, the solute can be defined with:
using ComplexMixtures, PDBTools
atoms = read_pdb("system.pdb")
indices, names = select_with_vmd("./system.pdb", "protein", vmd="/usr/bin/vmd")
protein = atoms[indices]
solute = AtomSelection(protein, nmols=1)
The main advantage here is that all the file types that VMD supports are supported. But VMD needs to be installed and is run in background, and it takes a few seconds to be executed.
The VMDSelect
function also accepts an optional keyword parameter srcload
, which can be used to load custom scripts within vmd
before setting the selection. This allows the definition of tcl
scripts with custom selection macros, for instance. The usage would be:
using PDBTools
indices, names = select_with_vmd(
"file.pdb",
"resname MYRES";
srcload = [ "mymacros1.tcl", "mymacros2.tcl" ]
)
Which corresponds to source
ing each of the macro files in VMD before defining the selection with the custom MYRES
name.
VMD uses 0-based indexing and VMDselect
adjusts that. However, if a selection is performed by index, as with index 1
, VMD will select the second atom, and the output will be [2]
. AtomSelections by type, name, segment, residue name, etc, won't be a problem.
Predefinition of atom groups
Importantly, this should be only a concern for the solvation analysis of systems in which individual molecules are very large. This feature was introduced in version 2.0
of the package to support the study of small molecule distribution in virus structures, of millions of atoms.
By default, the contribution of each type of atom to the coordination number counts is stored, to allow the decomposition of the final MDDFs into any group contribution. However, when a structure, like a virus, has millions of atoms, storing the contribution of each atom becomes prohibitive in terms of memory. Thus, one may need to predefine the groups in which the contributions will be analyzed.
Here, we illustrate this feature by presselecting the acidic and basic residues of a protein:
julia> using ComplexMixtures, PDBTools
julia> atoms = read_pdb(ComplexMixtures.Testing.pdbfile);
julia> protein = select(atoms, "protein");
julia> acidic_residues = select(atoms, "protein and acidic");
julia> basic_residues = select(atoms, "protein and basic");
julia> solute = AtomSelection(
protein,
nmols=1,
group_atom_indices = [ index.(acidic_residues), index.(basic_residues) ],
group_names = [ "acidic residues", "basic residues" ]
)
AtomSelection
1463 atoms belonging to 1 molecule(s).
Atoms per molecule: 1463
Number of groups: 2
In this example, then, the solute
AtomSelection
has two groups. The indices of the atoms of the groups are stored in the group_atom_indices
vector and the group names in the group_names
vector. The atom_group
auxiliary function is the most practical way to retrive the indices of the atoms of the group.
julia> atom_group(solute, "acidic residues")
162-element Vector{Int64}:
24
25
26
⋮
1436
1437
With these group selections predefined, the contributions of these groups to the MDDF or coordination numbers can be retrived directly from the result data structure with, for example:
julia> result = mddf(trajectory_file, solute, solvent, Options(bulk_range=(8.0, 12.0)));
julia> acidic_residue_contributions = contributions(result, SoluteGroup("acidic residues"))
ComplexMixtures.atom_group
— Functionatom_group(atsel::AtomSelection, i::Integer)
atom_group(atsel::AtomSelection, groupname::String)
atom_group(atsel::AtomSelection, i::Int)
atom_group(atsel::AtomSelection, groupname::String)
Return the indices of the atoms that belong to a given group, when custom groups where defined.
Example
julia> using ComplexMixtures
julia> atsel = AtomSelection([1,2,3], natomspermol=1, group_atom_indices=[[1,2],[3]], group_names=["G1", "G2"])
AtomSelection
3 atoms belonging to 3 molecule(s).
Atoms per molecule: 1
Number of groups: 2
julia> atom_group(atsel, 1)
2-element Vector{Int64}:
1
2
julia> atom_group(atsel, "G2")
1-element Vector{Int64}:
3
julia> atom_group_name(atsel, 1)
"G1"
ComplexMixtures.atom_group_name
— Functionatom_group_name(atsel::AtomSelection, i::Integer)
atom_group_names(atsel::AtomSelection)
Return the name of the group of atoms with index i
. The atom_group_names
function returns a vector with the names of all the groups.
Example
julia> using ComplexMixtures
julia> atsel = AtomSelection([1,2,3], natomspermol=1, group_atom_indices=[[1,2],[3]], group_names=["G1", "G2"])
AtomSelection
3 atoms belonging to 3 molecule(s).
Atoms per molecule: 1
Number of groups: 2
julia> atom_group_name(atsel, 1)
"G1"
julia> atom_group_names(atsel)
2-element Vector{String}:
"G1"
"G2"
ComplexMixtures.atom_group_names
— Functionatom_group_name(atsel::AtomSelection, i::Integer)
atom_group_names(atsel::AtomSelection)
Return the name of the group of atoms with index i
. The atom_group_names
function returns a vector with the names of all the groups.
Example
julia> using ComplexMixtures
julia> atsel = AtomSelection([1,2,3], natomspermol=1, group_atom_indices=[[1,2],[3]], group_names=["G1", "G2"])
AtomSelection
3 atoms belonging to 3 molecule(s).
Atoms per molecule: 1
Number of groups: 2
julia> atom_group_name(atsel, 1)
"G1"
julia> atom_group_names(atsel)
2-element Vector{String}:
"G1"
"G2"