m-value (protein transfer free energy) calculator
m-values are the transfer free-energy difference, here in kcal/mol, between two structures from water to a 1M solution of a cosolvent.
PDBTools.mvalue — Functionmvalue(
sasa_initial::SASA{3,<:AbstractVector{<:Atom}},
sasa_final::SASA{3,<:AbstractVector{<:Atom}},
cosolvent::String;
sel::Union{String,Function}=all,
model::Type{<:MValueModel}=AutonBolen,
backbone::Function = isbackbone,
sidechain::Function = issidechain,
)Calculates the m-value (transfer free energy of a protein in 1M solution, in kcal/mol) using the Tanford transfer model, as implemented by Moeser and Horinek [1] or by Auton and Bolen [2,3].
Positional Arguments
sasa_initial::SASA{3,<:AbstractVector{<:Atom}}: SASA object representing the initial state (e.g., native state).sasa_final::SASA{3,<:AbstractVector{<:Atom}}: SASA object representing the final state (e.g., denatured state).cosolvent::String: The cosolvent to consider. One of: "betaine", "proline", "sarcosine", "sorbitol", "sucrose", "tmao", "urea", "urea-app", "urea-mh" (case insensitive).
Keyword Arguments
sel::Union{String,Function}=all: Selection of atoms to consider in the calculation. Can be a selection string or a function that takes anAtomand returns aBool.model::Type{<:MValueModel}=AutonBolen: The model to use for the calculation. EitherMoeserHorinekorAutonBolen.backbone::Function = PDBTools.isbackbone: Function to identify backbone atoms.sidechain::Function = PDBTools.issidechain: Function to identify side chain atoms.
Returns
A MValue object, with fields:
ntatoms::Int: Number of atoms considered.tot::Float32: Total m-value (kcal/mol/M).bb::Float32: Backbone contribution to the m-value (kcal/mol/M).sc::Float32: Side chain contribution to the m-value (kcal/mol/M).residue_contributions_bb::Vector{Float32}: Backbone contributions of each residue to the m-value.residue_contributions_sc::Vector{Float32}: Side-chain contributions of each residue to the m-value.
Example
using PDBTools
initial_state = read_pdb("native.pdb")
final_state = read_pdb("desnat.pdb")
sasa_initial = sasa_particles(initial_state)
sasa_final = sasa_particles(final_state)
mvalue(sasa_initial, sasa_final, "chain A"; model=AutonBolen, cosolvent="TMAO")References
- https://doi.org/10.1021/acs.jpcb.7b02138
- https://doi.org/10.1016/s0076-6879(07)28023-1
- https://www.pnas.org/doi/10.1073/pnas.0706251104
The m-values can be estimated by the Tanford transfer model, in which each amino acid contributes to the transfer free energy according to the change in its solvent accessible surface area and experimental values of individual amino acid transfer energies, and a backbone contribution (here we implement the Auton/Bolen and Moeser/Horinek models [1, 2, 3]). Typically, these models are used to obtain the effect of cosolvent on the structural stability of proteins, but the current implementation allows the practical use of these functions to compute m-values of more general transformations, as described in the examples.
Protein denaturation
Consider these two states of a model protein, a native and a denatured (straight chain) state, for which we compute the SASA:
using PDBTools
native_state = read_pdb(PDBTools.src_dir*"/tools/mvalue/1MJC_native.pdb", "protein")
sasa_native = sasa_particles(native_state)PDBTools.SASA{3, Vector{Atom{Nothing}}}
Number of particles: 999
Total SASA: 4331.568
Output of dots: false desnat_state = read_pdb(PDBTools.src_dir*"/tools/mvalue/1MJC_straight.pdb", "protein")
sasa_desnat = sasa_particles(desnat_state)PDBTools.SASA{3, Vector{Atom{Nothing}}}
Number of particles: 1007
Total SASA: 9552.38
Output of dots: false The denatured state has a greater surface area than the native state. Thus, cosolvents that bind preferentially to the surface, as urea, should promote a stabilization of the denatured state. This is obtained with:
m = mvalue(sasa_native, sasa_desnat, "urea"; model=MoeserHorinek)PDBTools.MValue{MoeserHorinek} - 69 residues.
Total m-value: -0.9374039 kcal mol⁻¹
Backbone contributions: -0.39141208 kcal mol⁻¹
Side-chain contributions: -0.54599184 kcal mol⁻¹Where the tot, bb and sc fields contain, respectively, the total, backbone and side-chain contributions. The MValue object contains, additionally, the contribution of the side chain and backbone of each amino acid residue type for the m-value, in the residue_contributions_bb and residue_contributions_sc fields.
We can set the beta fields (for example) of the atoms as the residue contributions:
for (ir, r) in enumerate(eachresidue(native_state)) # iterate over residues
# total contribution of residue ir
c_residue = m.residue_contributions_sc[ir] + m.residue_contributions_bb[ir]
for at in r # iterate over atoms in residue
at.beta = c_residue
end
end
write_pdb("contrib.pdb", native_state)And with that get an image (here produced with VMD) of the contributions of the residues to the transfer free energies:

Cofactor binding
The 1BSX protein-data-bank structure contains a nuclear hormone receptor bound to a cofactor:
bsx = wget("1BSX")
collect(eachchain(bsx))6-element Vector{Chain}[
Chain(A-1905 atoms)
Chain(B-92 atoms)
⋮
Chain(E-23 atoms)
Chain(F-23 atoms)
]Chains A and B belong to the receptor and the cofactor. Let us understand the effect of cosolvents on the association of these two chains.
First, we compute the SASA of chains A and B, thus including chain A in bound to the cofactor:
cAB = sasa_particles(select(bsx, "chain A B"))PDBTools.SASA{3, Vector{Atom{Nothing}}}
Number of particles: 1997
Total SASA: 12461.824
Output of dots: false and, then, we compute the SASA of chain A without the cofactor:
cA_free = sasa_particles(select(bsx, "chain A"))PDBTools.SASA{3, Vector{Atom{Nothing}}}
Number of particles: 1905
Total SASA: 11929.926
Output of dots: false Note that the SASA of chain A in the bound state is smaller than that of the free state, as expected:
sasa(cAB, "chain A")11514.509f0We now compute the $m$-value of chain A only, but using the surface areas computed in the two different states:
mvalue(cAB, cA_free, "urea"; sel="chain A", model=MoeserHorinek)PDBTools.MValue{MoeserHorinek} - 243 residues.
Total m-value: -0.03302022 kcal mol⁻¹
Backbone contributions: -0.0064200666 kcal mol⁻¹
Side-chain contributions: -0.026600152 kcal mol⁻¹where the tot field is negative, indicating that exposing the cofactor binding surface is slightly favorable in urea.
The same applies to the cofactor, chain B:
cB_free = sasa_particles(select(bsx, "chain B"))
mvalue(cAB, cB_free, "urea"; sel="chain B", model=MoeserHorinek)PDBTools.MValue{MoeserHorinek} - 12 residues.
Total m-value: -0.11344366 kcal mol⁻¹
Backbone contributions: -0.028283926 kcal mol⁻¹
Side-chain contributions: -0.085159734 kcal mol⁻¹and the exposed surface of the cofactor is also slightly stabilized in urea. Urea, thus tends to destabilize the binding of the cofactor to the receptor.
By contrast, in a cosolvent that tends to promote protein aggregation, we have:
mvalue(cAB, cA_free, "Sucrose"; sel="chain A")PDBTools.MValue{AutonBolen} - 243 residues.
Total m-value: 0.0074142963 kcal mol⁻¹
Backbone contributions: 0.023528593 kcal mol⁻¹
Side-chain contributions: -0.016114296 kcal mol⁻¹and
mvalue(cAB, cB_free, "Sucrose"; sel="chain B")PDBTools.MValue{AutonBolen} - 12 residues.
Total m-value: 0.14418674 kcal mol⁻¹
Backbone contributions: 0.10452811 kcal mol⁻¹
Side-chain contributions: 0.039658625 kcal mol⁻¹and thus Sucrose can stabilize cofactor binding. We remark that the values obtained here are very small, and this is intended to be only an illustrative example.
Alternative SASA calculations
The following functions can be used to compute m-values from the variation of the SASA per residue type, which allow the use of external tools to compute the SASA. This is used mostly for testing purposes. The functions allow the use of SASAs obtained directly from the Auton & Bolen server, or from Gromacs SASA calculations.
PDBTools.mvalue_delta_sasa — Functionmvalue_delta_sasa(; model=MoeserHorinek, cosolvent="urea", atoms:AbstractVector{<:PDBTools.Atom}, sasas, type=1)Calculates the m-value (transfer free energy of a protein in 1M solution) using the Tanford transfer model, as implemented by Moeser and Horinek [1] or by Auton and Bolen [2,3].
Arguments
model: The model to be used. Must beMoeserHorinekorAutonBolen.MoeserHorinekis only implemented forcosolvent="urea", and should be more precise in that case. Other solvents are available forAutonBolen.cosolvent::String: One of "betaine", "proline", "sarcosine", "sorbitol", "sucrose", "tmao", "urea", "urea-app", "urea-mh"atoms::AbstractVector{<:PDBTools.Atom}: Vector containing the atoms of the structure.sasas::Dict{String, Dict{Symbol, Float64}}: A dictionary containing the change in solvent accessible surface area (SASA) upon denaturation for each amino acid type. This data can be obtained from the m-value server or calculated using GROMACS:- The output of the server can be parsed using the
parse_mvalue_server_sasafunction defined in this module. - Compute the SASA with
delta_sasa_per_restype, a SASA calculation utility implemented in PDBTools.jl. - SASA values can be calculated using GROMACS with the
gmx_delta_sasa_per_restypefunction defined in this module.
- The output of the server can be parsed using the
type::Int: Specifies which SASA value to use from the provided data, because the server provides minimum, average, and maximum values, according to different denatured models for the protein. The recommended value is2for comparison with experimental data. Normally, GROMACS calculations will provide a single value, sotype=1should be used in that case.
Returns
A named tuple with the following fields:
tot: Total transfer free energy (kcal/mol).bb: Contribution from the backbone (kcal/mol).sc: Contribution from the side chains (kcal/mol).restype: A dictionary with the transfer free energy contributions per residue type (kcal/mol).
Each entry in the dictionary is a named tuple with bb and sc fields representing the backbone and side chain contributions, respectively.
Example calls
using PDBTools
using PDBTools: mvalue_delta_sasa
using PDBTools: delta_sasa_per_restype, parse_mvalue_server_sasa, gmx_delta_sasa_per_restype
protein = read_pdb("protein.pdb")
# Using SASA values calculated with PDBTools.jl
sasas=delta_sasa_per_restype(native=read_pdb("native.pdb"), desnat=read_pdb("desnat.pdb"))
mvalue_delta_sasa(; model=AutonBolen, cosolvent="TMAO", atoms=protein, sasas=sasas)
# Using SASA values from the m-value server
sasas_from_server=parse_mvalue_server_sasa(server_output)
mvalue_delta_sasa(; model=MoeserHorinek, cosolvent="urea", atoms=protein, sasas=sasas_from_server, type=2)
# Using SASA values calculated with GROMACS
sasas_gmx=gmx_delta_sasa_per_restype(native_pdb="native.pdb", desnat_pdb="desnat.pdb")
mvalue_delta_sasa(; model=AutonBolen, cosolvent="TMAO", atoms=protein, sasas=sasas_gmx)References
- https://doi.org/10.1021/acs.jpcb.7b02138
- https://doi.org/10.1016/s0076-6879(07)28023-1
- https://www.pnas.org/doi/10.1073/pnas.0706251104
PDBTools.delta_sasa_per_restype — Functiondelta_sasa_per_restype(;
native::AbstractVector{<:PDBTools.Atom},
desnat::AbstractVector{<:PDBTools.Atom}
)Calculates the change in solvent accessible surface area (SASA) upon denaturation for each amino acid type using PDBTools. Returns a dictionary that can be directly used as input to the mvalue function.
Arguments
native: Vector of PDBTools.Atom objects for the native structure.desnat: Vector of PDBTools.Atom objects for the denatured structure.
Returns
A dictionary where each key is an amino acid three-letter code (e.g., "ALA", "PHE"), and the value is another dictionary with two keys: :sc for side chain SASA values and :bb for backbone SASA values. Each of these keys maps to a tuple containing a single Float64 value representing the change in SASA upon denaturation in Ų.
Optional arguments
n_dots::Int=500: Sets the precision of the SASA calculation (greater is better).backbone::Function = at -> name(at) in ("N", "CA", "C", "O"): Define what is a backbone atom.sidechain::Function = at -> !(name(at) in ("N", "CA", "C", "O")): Define what is a sidechain atom.ignore_hydrogen::Bool = true: By default, ignore all Hydrogen atoms of the structure.unitcell=nothing: By default, do not use periodic boundary conditions. To use PBCs, define A unitcell by providing either a 3x3 matrix or, for orthorhombic cells, a vector of length 3 of cell sides.
PDBTools.parse_mvalue_server_sasa — Functionparse_mvalue_server_sasa(string::AbstractString)Parses the SASA output from the m-value calculator server (http://best.bio.jhu.edu/mvalue/), into a dictionary that can be directly used as input to the mvalue function.
The input string should contain lines formatted as follows, and correspond to the SASA values for each amino acid type:
sasa_from_server = """
ALA 8 ( 11.1) 79.1 [ 147.1] | ( -13.0) 51.4 [ 115.8]
PHE 3 ( 166.9) 197.1 [ 230.2] | ( 29.4) 56.4 [ 83.4]
LEU 7 ( 475.2) 532.2 [ 589.3] | ( 89.3) 145.3 [ 201.3]
...
LYS 6 ( 171.5) 220.4 [ 269.3] | ( -4.5) 42.0 [ 88.5]
ARG 1 ( 110.2) 124.4 [ 138.6] | ( 17.1) 25.0 [ 33.0]
CYS 0 ( 0.0) 0.0 [ 0.0] | ( 0.0) 0.0 [ 0.0]
"""This data can be found in the output of the server, under the title "Sidechain and Backbone changes in Accessible Surface Area".
The function returns a dictionary where each key is an amino acid three-letter code (e.g., "ALA", "PHE"), and the value is another dictionary with two keys: :sc for side chain SASA values and :bb for backbone SASA values. Each of these keys maps to a tuple containing three Float64 values representing the minimum, average, and maximum SASA values in Ų.
PDBTools.gmx_delta_sasa_per_restype — Functiongmx_delta_sasa_per_restype(; native_pdb::AbstractString, desnat_pdb::AbstractString)Calculates the change in solvent accessible surface area (SASA) upon denaturation for each amino acid type using GROMACS. Returns a dictionary that can be directly used as input to the mvalue function.
This function requires GROMACS (gmx sasa executable) to be installed and accessible from the command line. The path to the gmx executable can be provided with the gmx keyword.
Arguments
native_pdb::AbstractString: Path to the PDB file of the native protein structure.desnat_pdb::AbstractString: Path to the PDB file of the denatured protein structure.
Optional arguments
gmx: the path to thegmxGROMACS exectuable (by default it expectsgmxto be on the path).n_dots::Int: sets the precision of the SASA grid (greater is better).backbone::Function = at -> name(at) in ("N", "CA", "C", "O"): Define what is a backbone atom.sidechain::Function = at -> !(name(at) in ("N", "CA", "C", "O")): Define what is a sidechain atom.ignore_hydrogen::Bool=true: By default, ignore all hydrogen atoms.
Returns
A dictionary where each key is an amino acid three-letter code (e.g., "ALA", "PHE"), and the value is another dictionary with two keys: :sc for side chain SASA values and :bb for backbone SASA values. Each of these keys maps to a tuple containing a single Float64 value representing the change in SASA upon denaturation in Ų.