Computing the MDDF

Minimum-Distance Distribution Function

The main function of the ComplexMixtures package actually computes the MDDF between the solute and the solvent chosen.

ComplexMixtures.mddfFunction
mddf(
    trajectory_file::String, 
    solute::AtomSelection,
    solvent::AtomSelection, # optional: if omitted, an auto-correlation will be computed
    options::Options=Options(); # optional: if omitted, default options will be used
    # optional keywords:
    trajectory_format="", 
    frame_weights = Float64[], 
    coordination_number_only = false, 
    low_memory = false, 
    chemfiles = false,
)

Computes the minimum-distance distribution function, atomic contributions, and KB integrals, given the trajectory file name, and the definition of the solute and solvent groups of atoms as AtomSelection data structures. If the solvent parameter is omitted, a self-correlation will be computed.

This is the main function of the ComplexMixtures package.

The options parameter is optional. If not set, the default Options() structure will be used.

Optional execution keywords

  • trajectory_format is a string that defines the format of the trajectory file. If not provided, the format will be guessed from the file extension, and the Chemfiles library will be used to read the trajectory.

  • frame_weights is an array of weights for each frame of the trajectory. If this is provided, the MDDF will be computed as a weighted average of the MDDFs of each frame. This can be used to study the solvation dependency in perturbed ensembles.

  • coordination_number_only is a boolean that, if set to true, will compute only the site-counts and coordination numbers of the solvent molecules around the solute, and not the MDDF, RDF, or KB integrals. This is useful when the normalization of the distribution is not possible or needed, for instance when the bulk solution is not properly defined. The computation is much faster in this case. The call to mddf with this option is equivalent to direct the call to coordination_number.

  • low_memory can be set to true to reduce the memory requirements of the computation. This will parallelize the computation of the minimum distances at a lower level, reducing the memory requirements at the expense of some performance.

  • chemfiles is a boolean that, if set to true, will use try to use the Chemfiles library to read the trajectory file. independently of the file extension. By default, PDB and DCD trajectories are read by specific readers.

Compat

The current call signature was introduced in version 2.9.0. The previous call signature with the previous construction of the Trajectory object, is still available, but it is no longer recommended because it more prone to user-level solute and solvent configuration errors.

Examples

julia> using ComplexMixtures, PDBTools

julia> using ComplexMixtures.Testing: data_dir

julia> atoms = readPDB(joinpath(data_dir,"NAMD/structure.pdb"));

julia> solute = AtomSelection(select(atoms, "protein"), nmols=1);

julia> solvent = AtomSelection(select(atoms, "resname TMAO"), natomspermol=14);

julia> options = Options(lastframe=10, bulk_range=(10.0, 15.0));

julia> trajectory_file = joinpath(data_dir,"NAMD/trajectory.dcd");

julia> results = mddf(trajectory_file, solute, solvent, options);
source

The mddf functions is run with, for example:

results = mddf(trajectory_file, solute, solvent, Options(bulk_range=(10.0, 15.0)))  

The MDDF along with other results, like the corresponding KB integrals, are returned in the results data structure, which is described in the next section.

It is possible to tune several options of the calculation, by setting the Options data structure with user-defined values in advance. The most common parameters to be set by the user are bulk_range and stride.

stride defines if some frames will be skip during the calculation (for speedup). For example, if stride=5, only one in five frames will be considered. Adjust stride with:

options = Options(stride=5, bulk_range=(10.0, 15.0))
results = mddf(trajectory_file, solute, solvent, options)
Note

bulk_range defines the subset of the system, as defined according to a range of distances from the solute, that are to be considered as the bulk solution. Within this range of distances, the user believes that the reference solute molecule does not significantly affect anymore the structure of the solvent.

By default, all molecules above 10 Angstroms from the solute are considered bulk molecules (corresponding to Options(dbulk=10.0)), but it is highly recommended to use a manual definition of bulk_range.

The definition of a range of distances within the system to compute the bulk density is adequate because this system subset is then an open system with a solvent molecule reservoir. The adequate choice of bulk_range can be inspected by the proper convergence of the distribution functions (which must converge to 1.0) and a proper convergence of the KB integrals.

The bulk_range option was introduced in version 2.1.0.

See the Options section for further details and other options to set.

Coordination numbers only

The coordination number is the unnormalized count of how many molecules of the solvent are within a given distance to the solute. Coordination numbers can be computed for systems where the normalization of the distribution functions is not possible (or needed) because of an ill definition of an ideal-gas state. For example, in highly heterogeneous systems, on in systems with only a few molecules of the "solvent", the density of the bulk solution might not be properly defined.

In these cases, nevertheless, coordination numbers can be computed and still provide valuable information about the molecular structure of the system. Coordination number can be computed also from the results obtained from a mddf run, as explained in the corresponding section of the Tools menu.

The coordination_number function, called with the same arguments as the mddf function, can be used to compute coordination numbers without the normalization required for the MDDF, providing (possibly much) faster computations when the normalization is not possible or required:

ComplexMixtures.coordination_numberMethod
coordination_number(
    trajectory_file::String, 
    solute::AtomSelection,
    solvent::AtomSelection, # optional: if omitted, an auto-correlation will be computed
    options::Options=Options(); # optional: if omitted, default options will be used
    # optional keywords:
    trajectory_format="", 
    frame_weights = Float64[], 
    low_memory = false, 
)

Computes the minimum-distance coordination number and atomic contributions, given the trajectory file name, and the definition of the solute and solvent groups of atoms as AtomSelection data structures. If the solvent parameter is omitted, a self-coordination number will be computed.

The output is a Result structure, which contains the data as the result of a call to mddf, except that all counters which require normalization of the distribution will be zero. In summary, this result data structure can be used to compute the coordination numbers, but not the MDDF, RDF, or KB integrals.

The options parameter is optional. If not set, the default Options() structure will be used.

Optional execution keywords

  • trajectory_format is a string that defines the format of the trajectory file. If not provided, the format will be guessed from the file extension, and the Chemfiles library will be used to read the trajectory.

  • frame_weights is an array of weights for each frame of the trajectory. If this is provided, the MDDF will be computed as a weighted average of the MDDFs of each frame. This can be used to study the solvation dependency in perturbed ensembles.

  • low_memory can be set to true to reduce the memory requirements of the computation. This will parallelize the computation of the minimum distances at a lower level, reducing the memory requirements at the expense of some performance.

  • chemfiles is a boolean that, if set to true, will use try to use the Chemfiles library to read the trajectory file. independently of the file extension. By default, PDB and DCD trajectories are read by specific readers.

Compat

The current call signature was introduced in version 2.9.0. The previous call signature with the previous construction of the Trajectory object, is still available, but it is no longer recommended because it more prone to user-level solute and solvent configuration errors.

Examples

julia> using ComplexMixtures, PDBTools

julia> using ComplexMixtures.Testing: data_dir

julia> atoms = readPDB(joinpath(data_dir,"NAMD/structure.pdb"));

julia> solute = AtomSelection(select(atoms, "protein"), nmols=1);

julia> solvent = AtomSelection(select(atoms, "resname TMAO"), natomspermol=14);

julia> options = Options(lastframe=10, bulk_range=(10.0, 15.0));

julia> trajectory_file = joinpath(data_dir,"NAMD/trajectory.dcd");

julia> results = coordination_number(trajectory_file, solute, solvent, options);
source
Note

The mddf, kb, and random count parameters will be filled with zeros when using this options, and are meaningless.