Miscelaneous functions

MolSimToolkit.bulk_coordinationMethod
bulk_coordination(
    simulation::Simulation;
    reference_solute::AbstractVector{<:PDBTools.Atom}, 
    solute::AbstractVector{<:PDBTools.Atom},
    n_atoms_per_molecule_solute::Integer,
    solvent::AbstractVector{<:PDBTools.Atom}, 
    n_atoms_per_molecule_solvent::Integer,
    dmax::Real = 5.0,
    cutoff::Real = 20.0,
    bin_size::Real = 0.1,
    show_progress = true,
)

Computes the coordination number of one type of solvent molecule relative to another solvent molecule, as a function of the distance to a reference solute molecule.

For example, imagine a protein solvated in a mixture of water and TMAO. This function allows to compute the number of water molecules that are within a given distance to the TMAO molecules, as a function of the distance to the protein. That is, it computes the coordination number of water relative to TMAO, as a function of the distance to the protein.

The function returns the the distances and the histogram of the coordination number as a function of the distance.

Compat

This function was introduced in version 1.11.0 of MolSimToolkit.jl.

Arguments

  • simulation::Simulation: The simulation object
  • reference_solute::AbstractVector{PDBTools.Atom}: The atoms of the reference solute molecule (the protein in the example above).
  • solute::AbstractVector{PDBTools.Atom}: The atoms of the solute molecule (the TMAO in the example above).
  • n_atoms_per_molecule_solute::Integer: The number of atoms per molecule of the solute.
  • solvent::AbstractVector{PDBTools.Atom}: The atoms of the solvent molecule (the water in the example above).
  • n_atoms_per_molecule_solvent::Integer: The number of atoms per molecule of the solvent.
  • dmax::Real = 5.0: The maximum distance to the solute to consider a solvent molecule as coordinated.
  • cutoff::Real = 20.0: The maximum distance to the reference molecule for computing the histogram.
  • bin_size::Real = 0.1: The size of the bins for the histogram.
  • show_progress = true: Whether to show a progress bar.

Example

julia> using MolSimToolkit, PDBTools

julia> pdb = readPDB(MolSimToolkit.Testing.namd2_pdb); # protein-tmao-water system

julia> trajectory = MolSimToolkit.Testing.namd2_traj;

julia> simulation = Simulation(pdb, trajectory)
Simulation 
    Atom type: Atom
    PDB file: -
    Trajectory file: /home/leandro/.julia/dev/MolSimToolkit/test/data/namd/protein_in_water_tmao/trajectory.dcd
    Total number of frames: 20
    Frame range: 1:1:20
    Number of frames in range: 20
    Current frame: nothing

julia> r, h = bulk_coordination(
           simulation;
           reference_solute = select(pdb, "protein"),
           solute = select(pdb, "resname TMAO"),
           n_atoms_per_molecule_solute = 14,
           solvent = select(pdb, "water"),
           n_atoms_per_molecule_solvent = 3,
       );

julia> using Plots

julia> plot(r, h, # plots `h` as a function of `r`
           xlabel = "Distance to protein (Å)",
           ylabel = "TMAO-water Coordination number",
           linewidth=2,
           label=:none, framestyle=:box, fontfamily="Computer Modern",
       )
source
MolSimToolkit.intermittent_correlationMethod
intermittent_correlation(
    data::AbstractVector; 
    maxdelta = length(data) ÷ 10, 
    types::Function = x -> true,
)

Calculate the intermittent correlation function of a time series. That is, computes the probability of finding a value of the same type at a step i + delta in the time series, given that it was present in step i.

Returns an OffsetArray with indices 0:maxdelta, where the value at position 0 is 1.0, corresponding to the normalized count of events.

Arguments

  • data::AbstractVector: The time series to be analyzed.
  • maxdelta::Integer: The maximum delta-step to be considered. Defaults to length(data) ÷ 10.
  • types (optional): A function that returns true for the types of data that should be considered. Defaults to all data, i. e. x -> true. For example, to ignore 0 values, use types = x -> x != 0.

Examples

Here we produce a time-series of 10,000 elements, as a sequence of 1's and 0's ([1, 0, 1, 0, ...]), and calculate the intermittent correlation function. The probability of finding the same number (0 or 1) after odd steps is 0, and the probability of finding the same number after even steps is 1.

julia> using MolSimToolkit

julia> data = [ mod(i,2) for i in 1:10^4 ];

julia> intermittent_correlation(data; maxdelta=4)
5-element OffsetArray(::Vector{Float64}, 0:4) with eltype Float64 with indices 0:4:
 1.0
 0.0
 1.0
 0.0
 1.0

julia> intermittent_correlation(data; maxdelta=4, types = x -> x != 0)
5-element OffsetArray(::Vector{Float64}, 0:4) with eltype Float64 with indices 0:4:
 1.0
 0.0
 1.0
 0.0
 1.0

In the second run, we have ignored the 0 values, and the result is the same, because here the correlations of the 1 values are the same as the correlations of the 0 values.

Compat

This function was added in version 1.9.0 of MolSimToolkit. The types argument was added in version 1.10.0.

source
MolSimToolkit.distancesMethod
distances(simulation, indices1::AbstractVector{<:Integer}, indices2::AbstractVector{<:Integer})
distances(simulation, selection1::AbstractVector{<:PDBTools.Atom}, selection2::AbstractVector{<:PDBTools.Atom})

Function that calculates the distance between the centers of mass of two selections in a simulation.

The selections are defined by the indices1 and indices2 vectors, which are the indices of the atoms, or by the selection1 and selection2 vectors, which are vectors of PDBTools.Atom objects.

Use silent=true to suppress the progress bar.

Example

julia> using PDBTools

julia> using MolSimToolkit, MolSimToolkit.Testing

julia> sim = Simulation(Testing.namd_pdb, Testing.namd_traj);

julia> ats = atoms(sim);

julia> i1 = findall(sel"protein and residue 1", ats); # indices

julia> i2 = findall(sel"protein and residue 15", ats); # indices

julia> distances(sim, i1, i2; silent=true)
5-element Vector{Float64}:
 23.433267858947584
 30.13791365033211
 28.48617683945202
 27.92740141686934
 23.235012287435566

julia> distances(sim, 
           filter(sel"protein and residue 1", ats), # selection (PDBTools.Atom)
           filter(sel"protein and residue 15", ats); # selection (PDBTools.Atom) 
           silent=true
       )
5-element Vector{Float64}:
 23.433267858947584
 30.13791365033211
 28.48617683945202
 27.92740141686934
 23.235012287435566
source
PDBTools.center_of_massMethod
center_of_mass(
    indices::AbstractVector{Int};
    simulation::Simulation,
    positions::FramePositions,
    iref::Union{Nothing,Int} = max(1, div(length(indices),2)),
)

Calculate the center of mass of a selection of atoms in a simulation given the positions. The selection is defined by the indices vector, which is the indices of the atoms.

The iref parameter is the index of the reference atom. The center of mass is calculated by first computing the minimum-image of all atoms relative to this atom. By default, it is the atom closest to the middle of the indices vector. If iref is nothing, the center of mass is calculated without wrapping the coordinates.

julia> using PDBTools 

julia> using MolSimToolkit, MolSimToolkit.Testing

julia> simulation = Simulation(Testing.namd_pdb, Testing.namd_traj);

julia> protein_indices = findall(sel"protein", atoms(simulation));

julia> first_frame!(simulation); # move simulation to the first frame

julia> coor = positions(current_frame(simulation));

julia> cm = center_of_mass(protein_indices, simulation, coor)
3-element Point3D{Float64} with indices SOneTo(3):
 -3.7290442807974906
 -1.5339226637687564
  1.960640754560446
Compat

The iref=nothing option was added in version 1.22.0.

source
MolSimToolkit.most_representative_structureMethod
most_representative_structure(simulation::Simulation; atoms = nothing)

Find the most representative structure in a simulation. The most representative structure is the one that minimizes the RMSD with respect to the average structure of the simulation. The average structure is defined iteratively, first by aligning all frames to the first frame, and then by averaging the aligned structures. The structure most similar to the average is then identified and used as the reference structure for the next iteration. The process is repeated until the structure most similar to the average is the same as the previous iteration.

Arguments

  • simulation::Simulation: Simulation object.
  • atoms: Atoms to consider in the calculation:
    • atoms is nothing: the function will consider all alpha-carbons in proteins ("protein and name CA").
    • atoms is an AbstractVector{<:PDBTools.Atom}: the function will consider the atoms in the vector.
    • atoms is an AbstractVector{<:Int}: the function will consider the atoms with the indices in the vector.
    • atoms is a String: the function will consider the atoms selected by the string.

Returns

  • Tuple (Int, Float64), with:
    • Index of the most representative structure.
    • RMSD of the most representative structure with respect to the average structure.

Example

julia> using MolSimToolkit, MolSimToolkit.Testing, PDBTools

julia> simulation = Simulation(Testing.namd_pdb, Testing.namd_traj);

julia> most_representative_structure(simulation) # atoms == nothing (all alpha-carbons in proteins)
(4, 1.1681526249035976)

julia> most_representative_structure(simulation; atoms = "protein and name CA") # atoms is a String
(4, 1.1681526249035976)

julia> calphas = select(atoms(simulation), "name CA");

julia> most_representative_structure(simulation; atoms = calphas) # atoms is an Vector{PDBTools.Atom}
(4, 1.1681526249035976)

julia> ica = PDBTools.index.(calphas)

julia> most_representative_structure(simulation; atoms = ica) # atoms is an vector of indices
(4, 1.1681526249035976)
source