Atomic and group contributions

One of the interesting features of minimum-distance distribution functions (MDDFs) is that they can be naturally decomposed into atomic or group contributions. Simply put, if an MDDF has a peak at a hydrogen-bonding distance, it is natural to decompose that peak into the contributions of each type of solute or solvent atom.
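As a minimal sketch of why this decomposition is natural, the toy example below (plain Julia, made-up coordinates, and not the ComplexMixtures implementation) attributes each minimum distance to the solvent atom that realizes it. The names dist, total_count and atom_count are illustrative assumptions, and raw counts are used instead of a normalized MDDF.

```julia
# Toy solute: two atoms (coordinates in Å, made up for this example)
solute = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]

# Toy solvent: three 3-atom molecules, atoms ordered as (OH2, H1, H2)
solvent_names = ["OH2", "H1", "H2"]
solvent = [
    [(3.2, 0.0, 0.0), (4.0, 0.5, 0.0), (4.0, -0.5, 0.0)],   # oxygen is closest to the solute
    [(0.0, 3.5, 0.0), (0.0, 2.6, 0.0), (0.8, 3.9, 0.0)],    # H1 is closest to the solute
    [(-4.0, 0.0, 0.0), (-4.8, 0.5, 0.0), (-3.1, 0.0, 0.0)], # H2 is closest to the solute
]

# Euclidean distance between two points given as tuples
dist(a, b) = sqrt(sum((a .- b) .^ 2))

binwidth = 0.5 # Å
nbins = 10
total_count = zeros(Int, nbins)
atom_count = Dict(name => zeros(Int, nbins) for name in solvent_names)

for molecule in solvent
    # For each solvent atom, the distance to the nearest solute atom;
    # the smallest of those is the minimum distance of the molecule.
    dmin, imin = findmin([minimum(dist(x, s) for s in solute) for x in molecule])
    ibin = floor(Int, dmin / binwidth) + 1
    ibin > nbins && continue
    total_count[ibin] += 1                       # count for the total distribution
    atom_count[solvent_names[imin]][ibin] += 1   # same count, attributed to one atom
end

# Each minimum distance belongs to exactly one atom, so the
# per-atom histograms add up to the total histogram.
println(sum(values(atom_count)) == total_count)
```

Because every minimum distance is realized by exactly one solvent atom (and one solute atom), the per-atom histograms partition the total one, which is the property the contributions function exposes.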

Tip

See also the section on contributions per residue for proteins and other macromolecules: 2D density map per residue.

To obtain the atomic contributions of an atom or group of atoms to the MDDF, the coordination number, or the site count at each distance, the contributions function is provided. For example, in a system composed of a protein and water, we would have defined the solute and solvent using:

using PDBTools, ComplexMixtures
atoms = read_pdb("system.pdb")
protein = select(atoms,"protein")
water = select(atoms,"water")
solute = AtomSelection(protein,nmols=1)
solvent = AtomSelection(water,natomspermol=3)

The MDDF calculation is executed with:

results = mddf("trajectory.dcd", solute, solvent, Options(bulk_range=(8.0, 12.0)))

Atomic contributions in the result data structure

The results data structure contains the decomposition of the MDDF into the contributions of every type of atom of the solute and the solvent. These contributions can be retrieved using the contributions function, with the SoluteGroup and SolventGroup selectors.

ComplexMixtures.contributions - Function
contributions(R::Result, group::Union{SoluteGroup,SolventGroup}; type = :mddf)

Returns the contributions of the atoms of the solute or solvent to the MDDF, coordination number, or MD count.

Arguments

  • R::Result: The result of a calculation.
  • group::Union{SoluteGroup,SolventGroup}: The group of atoms to consider.
  • type::Symbol: The type of contributions to return. Can be :mddf (default), :coordination_number, or :md_count.

Examples

julia> using ComplexMixtures, PDBTools

julia> dir = ComplexMixtures.Testing.data_dir*"/Gromacs";

julia> atoms = readPDB(dir*"/system.pdb");

julia> protein = select(atoms, "protein");

julia> emim = select(atoms, "resname EMI"); 

julia> solute = AtomSelection(protein, nmols = 1)
AtomSelection 
    1231 atoms belonging to 1 molecule(s).
    Atoms per molecule: 1231
    Number of groups: 1231

julia> solvent = AtomSelection(emim, natomspermol = 20)
AtomSelection 
    5080 atoms belonging to 254 molecule(s).
    Atoms per molecule: 20
    Number of groups: 20

julia> results = load(dir*"/protein_EMI.json"); # load pre-calculated results

julia> ca_cb = contributions(results, SoluteGroup(["CA", "CB"])); # contribution of CA and CB atoms to the MDDF

julia> ca_cb = contributions(results, SoluteGroup(["CA", "CB"]); type=:coordination_number); # contribution of CA and CB atoms to the coordination number
ComplexMixtures.SolventGroup - Type

SoluteGroup and SolventGroup data structures.

These structures are used to select groups of atoms to extract their contributions from the MDDF results.

Most typically, the groups are defined from a selection of atoms with the PDBTools package, or by directly providing the indices of the atoms in the structure.

Alternatively, if the groups were predefined, they can be selected by group index or group name.

The possible constructors are:

SoluteGroup(atoms::Vector{<:PDBTools.Atom})
SoluteGroup(atom_indices::AbstractVector{<:Integer})
SoluteGroup(atom_names::AbstractVector{<:AbstractString})
SoluteGroup(group_name::AbstractString)
SoluteGroup(residue::PDBTools.Residue)

Above, each constructor can be replaced by SolventGroup. The resulting data structures are used as input parameters for the contributions function:

contributions(results::Result, group::Union{SoluteGroup, SolventGroup}; type=:mddf)

See the contributions help entry for additional information.

Examples

Defining solute groups with different input types:

julia> using ComplexMixtures, PDBTools

julia> atoms = PDBTools.readPDB(ComplexMixtures.Testing.pdbfile, "protein"); 

julia> SoluteGroup(select(atoms, "protein and resname ASP")) # vector of PDBTools.Atom(s)
SoluteGroup defined by:
    atom_indices: [ 24, 25, ..., 1056, 1057 ] - 72 atoms

julia> SoluteGroup(1:100) # atom indices (range or vector)
SoluteGroup defined by:
    atom_indices: [ 1, 2, ..., 99, 100 ] - 100 atoms

julia> SoluteGroup(["N", "CA", "C", "O"]) # vector of atom names
SoluteGroup defined by:
    atom_names: [ N, CA, C, O ] - 4 atom names.
 
julia> SoluteGroup("acidic residues") # predefined group name
SoluteGroup defined by:
    group_name: "acidic residues"

julia> SoluteGroup(1) # predefined group index
SoluteGroup defined by:
    group_index: 1

julia> SoluteGroup(collect(eachresidue(atoms))[2]) # PDBTools.Residue(s)
SoluteGroup defined by:
    atom_indices: [ 13, 14, ..., 22, 23 ] - 11 atoms

Example: computing the oxygen contributions of water

Here we show the MDDF of water (solvent) relative to a solute. If the water molecules have atom names OH2, H1, and H2, one can retrieve the contributions of the oxygen atom with:

OH2 = contributions(results, SolventGroup(["OH2"]))

or, if OH2 is the first atom in the molecule, with:

OH2 = contributions(results, SolventGroup([1]))

The contributions of the hydrogen atoms can be obtained, similarly, with:

H = contributions(results, SolventGroup(["H1", "H2"]))

or, if H1 and H2 are the second and third atoms in the molecule, with:

H = contributions(results, SolventGroup([2, 3]))

Each of these calls will return a vector of the contributions of these atoms to the total MDDF, with one value per distance bin.
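To make the relation between name-based and index-based selections concrete, here is a small self-contained sketch with toy numbers. The per_atom matrix and the helpers group_by_names and group_by_indices are hypothetical and are not part of ComplexMixtures; they only mimic the idea that a group contribution is the sum of the contributions of its atoms.

```julia
# Hypothetical per-atom contributions for a 3-atom water model (toy numbers):
# one row per distance bin, one column per atom of the molecule.
atom_names = ["OH2", "H1", "H2"]
per_atom = [0.0 0.0 0.0
            0.9 0.3 0.2
            0.5 0.3 0.3
            0.4 0.3 0.3]

# The total distribution is the sum over all atoms, bin by bin
total = vec(sum(per_atom, dims=2))

# A group contribution is the sum of the columns of its atoms,
# selected either by atom name or by atom index within the molecule.
group_by_indices(inds) = vec(sum(per_atom[:, inds], dims=2))
group_by_names(names) = group_by_indices([findfirst(==(n), atom_names) for n in names])

OH2 = group_by_names(["OH2"])
H = group_by_names(["H1", "H2"])

println(H == group_by_indices([2, 3])) # both selections pick the same atoms
println(OH2 .+ H ≈ total)              # oxygen plus hydrogens recover the total
```

With actual results, the analogous sanity check would compare the sum of the oxygen and hydrogen contributions against results.mddf.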

For example, here we plot the total MDDF and the oxygen contributions:

using Plots
plot(results.d, results.mddf, label="Total MDDF", linewidth=2)
plot!(results.d, contributions(results, SolventGroup(["OH2"])), label="OH2", linewidth=2)
plot!(xlabel="Distance / Å", ylabel="MDDF")

Contributions to coordination numbers or site counts

The keyword type defines the return type of the contribution:

  • type=:mddf : the contribution of the group to the MDDF is returned (default).
  • type=:coordination_number : the contribution of the group to the coordination number, that is, the cumulative sum of counts at each distance, is returned.
  • type=:md_count : the contribution of the group to the site count at each distance is returned.

Example of the usage of the type option:

ca_contributions = contributions(results, SoluteGroup(["CA"]); type=:coordination_number)
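The relation between the site counts and the coordination number can be sketched with toy numbers. The values below are made up, assumed to be already averaged over frames, and the two groups are hypothetical; the sketch only illustrates the cumulative-sum relation stated above, not how the package computes it.

```julia
# Toy site counts at each distance bin for two hypothetical groups
d = [0.5, 1.0, 1.5, 2.0, 2.5]            # distances (Å)
count_a = [0.0, 0.0, 0.5, 1.0, 0.25]     # made-up counts of group A
count_b = [0.0, 0.0, 0.0, 0.25, 0.5]     # made-up counts of group B
md_count = count_a .+ count_b            # total site count per bin

# The coordination number at each distance is the cumulative sum of the counts
coordination_number = cumsum(md_count)

# The cumulative sum is linear, so the group contributions to the
# coordination number also add up to the total coordination number.
cn_a = cumsum(count_a)
cn_b = cumsum(count_b)
println(cn_a .+ cn_b == coordination_number)
```

As a consequence, the contribution of a group to the coordination number never decreases with distance, unlike its contribution to the MDDF.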

Using PDBTools

If the solute is a protein, or another complex molecule, selections defined with PDBTools can be used. For example, this will retrieve the contribution of the acidic residues of a protein to the total MDDF:

using PDBTools
atoms = read_pdb("system.pdb")
acidic_residues = select(atoms, "acidic")
acidic_contributions = contributions(results, SoluteGroup(acidic_residues))

It is expected that for a protein most of the atoms do not contribute to the MDDF, and that all values are zero at very short distances, smaller than the radii of the atoms.

A more interesting and general approach is to select atoms of a complex molecule, like a protein, by residue name, residue type, etc. Here we illustrate this by building groups from PDBTools selection strings and passing them to contributions, to obtain the contributions of different types of residues of a protein to the total MDDF.

For example, if we want to split the total MDDF into the contributions of the charged and neutral residues, we could use the following code. Here, solute refers to the protein.

charged_residues = PDBTools.select(atoms,"charged")
charged_contributions = contributions(results, SoluteGroup(charged_residues))

neutral_residues = PDBTools.select(atoms,"neutral")
neutral_contributions = contributions(results, SoluteGroup(neutral_residues))

The charged_contributions and neutral_contributions outputs are vectors containing the contributions of these residues to the total MDDF. The corresponding plot is:

plot(results.d,results.mddf,label="Total MDDF",linewidth=2)
plot!(results.d,charged_contributions,label="Charged residues",linewidth=2)
plot!(results.d,neutral_contributions,label="Neutral residues",linewidth=2)
plot!(xlabel="Distance / Å",ylabel="MDDF")

Resulting in:

Note here how charged residues contribute strongly to the peak at hydrogen-bonding distances, but much less at other distances. The same approach works with any other selection, to obtain the contributions of specific types of residues, atoms, the backbone, the side-chains, etc.