Coordination numbers

Coordination numbers

The coordination number is the numerical count of how many molecules of a solvent is within a certain distance from the solute. The following function provides the functionality of retrieving coordination numbers and atom groups contributions to these numbers, given a Result data structure:

ComplexMixtures.coordination_numberFunction
coordination_number(R::Result) = R.coordination_number
coordination_number(R::Result, s::Union{SoluteGroup,SolventGroup})

Here, the coordination number is the number of molecules of the solvent that are within a certain distance from the solute.

If no group is defined (first call above), the coordination number of the whole solute or solvent is returned.

If a group is defined (second call above), the contributions of this group of atoms to the coordination number are returned, as a function of the distance to the solute:

  • R are the results obtained, that is, a Result data structure,
  • s is the solute or solvent selection (of type ComplexMixtures.AtomSelection)

Examples

Coordination number of a subgroup of atoms of the solute

Here we compute the coordination number of the atoms of the Alanine residues of a protein, relative to the solvent (TMAO), as a function of the distance. For simplicity, the coordination number at 5 Å is shown. We use, in this case, the SoluteGroup type to define the group of atoms, providing select(ats, "resname ALA") as the selection of atoms of the solute. select is a function from the PDBTools package.

julia> using ComplexMixtures, PDBTools

julia> using ComplexMixtures.Testing: data_dir

julia> ats = read_pdb(joinpath(data_dir,"NAMD/structure.pdb"));

julia> solute = AtomSelection(select(ats, "protein"); nmols=1);

julia> solvent = AtomSelection(select(ats, "resname TMAO"); natomspermol=14);

julia> R = mddf(
           joinpath(data_dir,"NAMD/trajectory.dcd"), 
           solute, 
           solvent, 
           Options(bulk_range=(8.0,12.0), silent=true)
        );

julia> i5 = findfirst(>=(5), R.d) # index for distance ≈ 5 Å
251

julia> ala_residues = select(ats, "resname ALA");

julia> coordination_number(R, SoluteGroup(ala_residues))[i5]
0.45

Alternatively to the use of the PDBTools.select function, an array with the indices of the atoms of the protein can be used to define the solute group, as, for example:

julia> ala_indices = findall(at -> resname(at) == "ALA", ats);

julia> coordination_number(R, SoluteGroup(ala_indices))[i5]
0.45

A similar syntax can be used to compute contributions of the solvent atoms to the MDDF.

Coordination numbers if the groups are predefined

If group contributions were precomputed, the name of the group can be used to compute the coordination number:

julia> using ComplexMixtures, PDBTools

julia> using ComplexMixtures.Testing: data_dir

julia> ats = read_pdb(joinpath(data_dir,"NAMD/structure.pdb"));

julia> solute = AtomSelection(
           select(ats, "protein"); 
           nmols=1,
           # predefinition of the group of atoms:
           group_atom_indices = [ findall(at -> resname(at) == "ALA", ats) ],
           group_names = ["alanine residues"]
        )
AtomSelection 
    1463 atoms belonging to 1 molecule(s).
    Atoms per molecule: 1463
    Number of groups: 1

julia> solvent = AtomSelection(select(ats, "resname TMAO"); natomspermol=14);

julia> R = mddf(
           joinpath(data_dir,"NAMD/trajectory.dcd"), 
           solute, 
           solvent, 
           Options(bulk_range=(8.0,12.0), silent=true)
        );

julia> i5 = findfirst(>=(5), R.d) # index for distance ≈ 5 Å
251

julia> coordination_number(R, SoluteGroup("alanine residues"))[i5]
0.45
source

The function can be called without any argument besides the Result data structure, in which case the coordination number of the complete solute will be returned, as a function of the distance.

Alternatively, subgroups of the solute or the solvent can be selected, by providing SoluteGroup or SolventGroup objects as second arguments, initialized with the atoms of the subgroup, as a list of indices or a list of atoms generated by PDBTools. Note that, if a SolventGroup is defined we obtain the contributions of those solvent atoms to the coordination number of the solute (for instance, the sum of the contributions of all solvent atoms is the coordination number of the solute).

Example

In the following example we compute the coordination number of the atoms of residue 50 (which belongs to the solute - a protein) with the solvent atoms of TMAO, as a function of the distance. The plot produced will show side by side the residue contribution to the MDDF and the corresponding coordination number.

# Load necessary packages
using ComplexMixtures, PDBTools 
# Load data structure and previously computed results from a mddf calculation 
pdb = read_pdb("test/data/NAMD/structure.pdb")
R = load("test/data/NAMD/protein_tmao.json")
# Define which is the solute
solute = AtomSelection(PDBTools.select(pdb, "protein"), nmols=1)
# We intend to compute the contributions of residue 50 only
residue50 = PDBTools.select(pdb, "residue 50")
# Compute the group contribution to the MDDF
residue50_contribution = contributions(R, SoluteGroup(residue50))
# Now compute the coordination number
residue50_coordination = coordination_number(R, SoluteGroup(residue50))
#
# Plot with twin y-axis
#
using Plots 
plot(R.d, residue50_contribution,
    xaxis="distance / Å", 
    yaxis="MDDF contribution", 
    linewidth=2, label=nothing, color=1
)
plot!(twinx(),R.d, residue50_coordination, 
    yaxis="Coordination number", 
    linewidth=2, label=nothing, color=2
)
plot!(title="Residue 50", framestyle=:box, subplot=1)

With appropriate input data, this code produces:

Tip

There are some systems for which the normalization of the distributions is not necessary or possible. It is still possible to compute the coordination numbers, by running, instead of mddf, the coordination_number function:

coordination_number(trajectory_file, solute, solvent, options::Options)

This call will return a Result data structure but with all fields requiring normalization with zeros. In summary, this result data structure can be used to compute the coordination numbers, but not the MDDF, RDF, or KB integrals.

Compat

The independent computation of coordination numbers was introduced in version 1.1.

Computing radial distribution functions

The distributions returned by the mddf function (the mddf and rdf vectors), are normalized by the random reference state or using a site count based on the numerical integration of the volume corresponding to each minimum-distance to the solute.

If, however, the solute is defined by a single atom (as the oxygen atom of water, for example), the numerical integration of the volume can be replaced by a simple analytical spherical shell volume, reducing noise. The ComplexMixtures.gr function returns the radial distribution function and the KB integral computed from the results, using this volume estimate:

g, kb = ComplexMixtures.gr(R)

By default, the single-reference count (rdf_count) of the Result structure will be used to compute the radial distribution function. The function can be called with explicit control of all input parameters:

g, kb = ComplexMixtures.gr(r,count,density,binstep)

where:

ParameterDefinitionResult structure output data to provide
rVector of distancesThe d vector
countNumber of site counts at each rThe rdf or mddf vectors
densityBulk densityThe density.solvent_bulk or density.solvent densities.
binstepThe histogram stepThe options.binstep

Example:

...
R = mddf(trajectory_file, solute, solvent, options)
g, kb = ComplexMixtures.gr(R.d,R.rdf_count,R.density.solvent_bulk,R.options.binstep)
ComplexMixtures.grMethod
gr(r::AbstractVector{<:Real}, count::AbstractVector{<:Real}, density::Real, binstep::Real)

Computes the radial distribution function from the count data and the density.

This is exactly a conventional g(r) if a single atom was chosen as the solute and solvent selections.

Returns both the g(r) and the kb(r)

source
ComplexMixtures.grMethod
gr(R::Result) = gr(R.d,R.rdf_count,R.density.solvent_bulk,R.files[1].options.binstep)

If a Result structure is provided without further details, use the rdf count and the bulk solvent density.

source

Overview of the solvent and solute properties

The output to the REPL of the Result structure provides an overview of the properties of the solution. The data can be retrieved into a data structure using the overview function. Examples:

...

julia> results = mddf(trajectory_file, solute, solvent, Options(bulk_range=(8.0, 12.0)))

julia> results
--------------------------------------------------------------------------------
MDDF Overview - ComplexMixtures - Version 2.0.8
--------------------------------------------------------------------------------

Solvent properties:
-------------------

Simulation concentration: 0.49837225882780106 mol L⁻¹
Molar volume: 2006.532230249041 cm³ mol⁻¹

Concentration in bulk: 0.5182380507741433 mol L⁻¹
Molar volume in bulk: 1929.6151614228274 cm³ mol⁻¹

Solute properties:
------------------

Simulation Concentration: 0.002753437894076249 mol L⁻¹
Estimated solute partial molar volume: 13921.98945754469 cm³ mol⁻¹

Bulk range: 8.0 - 12.0 Å
Molar volume of the solute domain: 34753.1382279134 cm³ mol⁻¹

Auto-correlation: false

Trajectory files and weights:

   /home/user/NAMD/trajectory.dcd - w = 1.0

Long range MDDF mean (expected 1.0): 1.0378896753018338 ± 1.0920172247127446
Long range RDF mean (expected 1.0): 1.2147429551790854 ± 1.2081838161780682

--------------------------------------------------------------------------------

In this case, since solute and solvent are equivalent and the system is homogeneous, the molar volumes and concentrations are similar. This is not the case if the molecules are different or if the solute is at infinite dilution (in which case the bulk solvent density might be different from the solvent density in the simulation).

To retrieve the data of the overview structure use, for example:

julia> overview = overview(results);

julia> overview.solute_molar_volume
657.5051512801567