m-value (protein transfer free energy) calculator

m-values are the transfer free-energy difference, here in kcal/mol, between two structures from water to a 1M solution of a cosolvent.

PDBTools.mvalueFunction
mvalue(
    sasa_initial::SASA{3,<:AbstractVector{<:Atom}},    
    sasa_final::SASA{3,<:AbstractVector{<:Atom}},
    cosolvent::String;
    sel::Union{String,Function}=all,
    model::Type{<:MValueModel}=AutonBolen,
    backbone::Function = isbackbone,
    sidechain::Function = issidechain,
)

Calculates the m-value (transfer free energy of a protein in 1M solution, in kcal/mol) using the Tanford transfer model, as implemented by Moeser and Horinek [1] or by Auton and Bolen [2,3].

Positional Arguments

  • sasa_initial::SASA{3,<:AbstractVector{<:Atom}}: SASA object representing the initial state (e.g., native state).
  • sasa_final::SASA{3,<:AbstractVector{<:Atom}}: SASA object representing the final state (e.g., denatured state).
  • cosolvent::String: The cosolvent to consider. One of: "betaine", "proline", "sarcosine", "sorbitol", "sucrose", "tmao", "urea", "urea-app", "urea-mh" (case insensitive).

Keyword Arguments

  • sel::Union{String,Function}=all: Selection of atoms to consider in the calculation. Can be a selection string or a function that takes an Atom and returns a Bool.
  • model::Type{<:MValueModel}=AutonBolen: The model to use for the calculation. Either MoeserHorinek or AutonBolen.
  • backbone::Function = PDBTools.isbackbone: Function to identify backbone atoms.
  • sidechain::Function = PDBTools.issidechain: Function to identify side chain atoms.

Returns

A MValue object, with fields:

  • ntatoms::Int: Number of atoms considered.
  • tot::Float32: Total m-value (kcal/mol/M).
  • bb::Float32: Backbone contribution to the m-value (kcal/mol/M).
  • sc::Float32: Side chain contribution to the m-value (kcal/mol/M).
  • residue_contributions_bb::Vector{Float32}: Backbone contributions of each residue to the m-value.
  • residue_contributions_sc::Vector{Float32}: Side-chain contributions of each residue to the m-value.

Example

using PDBTools

initial_state = read_pdb("native.pdb")
final_state = read_pdb("desnat.pdb")

sasa_initial = sasa_particles(initial_state)
sasa_final = sasa_particles(final_state)

mvalue(sasa_initial, sasa_final, "chain A"; model=AutonBolen, cosolvent="TMAO")

References

  1. https://doi.org/10.1021/acs.jpcb.7b02138
  2. https://doi.org/10.1016/s0076-6879(07)28023-1
  3. https://www.pnas.org/doi/10.1073/pnas.0706251104
source

The m-values can be estimated by the Tanford transfer model, in which each amino acid contributes to the transfer free energy according to the change in its solvent accessible surface area and experimental values of individual amino acid transfer energies, and a backbone contribution (here we implement the Auton/Bolen and Moeser/Horinek models [1, 2, 3]). Typically, these models are used to obtain the effect of cosolvent on the structural stability of proteins, but the current implementation allows the practical use of these functions to compute m-values of more general transformations, as described in the examples.

Protein denaturation

Consider these two states of a model protein, a native and a denatured (straight chain) state, for which we compute the SASA:

using PDBTools
native_state = read_pdb(PDBTools.src_dir*"/tools/mvalue/1MJC_native.pdb", "protein")
sasa_native = sasa_particles(native_state)
PDBTools.SASA{3, Vector{Atom{Nothing}}}
    Number of particles: 999
    Total SASA: 4331.568
    Output of dots: false 
desnat_state = read_pdb(PDBTools.src_dir*"/tools/mvalue/1MJC_straight.pdb", "protein")
sasa_desnat = sasa_particles(desnat_state)
PDBTools.SASA{3, Vector{Atom{Nothing}}}
    Number of particles: 1007
    Total SASA: 9552.38
    Output of dots: false 

The denatured state has a greater surface area than the native state. Thus, cosolvents that bind preferentially to the surface, as urea, should promote a stabilization of the denatured state. This is obtained with:

m = mvalue(sasa_native, sasa_desnat, "urea"; model=MoeserHorinek)
PDBTools.MValue{MoeserHorinek} - 69 residues.
    Total m-value: -0.9374039 kcal mol⁻¹
    Backbone contributions: -0.39141208 kcal mol⁻¹
    Side-chain contributions: -0.54599184 kcal mol⁻¹

Where the tot, bb and sc fields contain, respectively, the total, backbone and side-chain contributions. The MValue object contains, additionally, the contribution of the side chain and backbone of each amino acid residue type for the m-value, in the residue_contributions_bb and residue_contributions_sc fields.

We can set the beta fields (for example) of the atoms as the residue contributions:

for (ir, r) in enumerate(eachresidue(native_state)) # iterate over residues
    # total contribution of residue ir
    c_residue = m.residue_contributions_sc[ir] + m.residue_contributions_bb[ir]
    for at in r # iterate over atoms in residue
        at.beta = c_residue
    end
end
write_pdb("contrib.pdb", native_state)

And with that get an image (here produced with VMD) of the contributions of the residues to the transfer free energies:

Cofactor binding

The 1BSX protein-data-bank structure contains a nuclear hormone receptor bound to a cofactor:

bsx = wget("1BSX")
collect(eachchain(bsx))
6-element Vector{Chain}[ 
    Chain(A-1905 atoms)
    Chain(B-92 atoms)
    ⋮
    Chain(E-23 atoms)
    Chain(F-23 atoms)
]

Chains A and B belong to the receptor and the cofactor. Let us understand the effect of cosolvents on the association of these two chains.

First, we compute the SASA of chains A and B, thus including chain A in bound to the cofactor:

cAB = sasa_particles(select(bsx, "chain A B"))
PDBTools.SASA{3, Vector{Atom{Nothing}}}
    Number of particles: 1997
    Total SASA: 12461.824
    Output of dots: false 

and, then, we compute the SASA of chain A without the cofactor:

cA_free = sasa_particles(select(bsx, "chain A"))
PDBTools.SASA{3, Vector{Atom{Nothing}}}
    Number of particles: 1905
    Total SASA: 11929.926
    Output of dots: false 

Note that the SASA of chain A in the bound state is smaller than that of the free state, as expected:

sasa(cAB, "chain A")
11514.509f0

We now compute the $m$-value of chain A only, but using the surface areas computed in the two different states:

mvalue(cAB, cA_free, "urea"; sel="chain A", model=MoeserHorinek)
PDBTools.MValue{MoeserHorinek} - 243 residues.
    Total m-value: -0.03302022 kcal mol⁻¹
    Backbone contributions: -0.0064200666 kcal mol⁻¹
    Side-chain contributions: -0.026600152 kcal mol⁻¹

where the tot field is negative, indicating that exposing the cofactor binding surface is slightly favorable in urea.

The same applies to the cofactor, chain B:

cB_free = sasa_particles(select(bsx, "chain B"))
mvalue(cAB, cB_free, "urea"; sel="chain B", model=MoeserHorinek)
PDBTools.MValue{MoeserHorinek} - 12 residues.
    Total m-value: -0.11344366 kcal mol⁻¹
    Backbone contributions: -0.028283926 kcal mol⁻¹
    Side-chain contributions: -0.085159734 kcal mol⁻¹

and the exposed surface of the cofactor is also slightly stabilized in urea. Urea, thus tends to destabilize the binding of the cofactor to the receptor.

By contrast, in a cosolvent that tends to promote protein aggregation, we have:

mvalue(cAB, cA_free, "Sucrose"; sel="chain A")
PDBTools.MValue{AutonBolen} - 243 residues.
    Total m-value: 0.0074142963 kcal mol⁻¹
    Backbone contributions: 0.023528593 kcal mol⁻¹
    Side-chain contributions: -0.016114296 kcal mol⁻¹

and

mvalue(cAB, cB_free, "Sucrose"; sel="chain B")
PDBTools.MValue{AutonBolen} - 12 residues.
    Total m-value: 0.14418674 kcal mol⁻¹
    Backbone contributions: 0.10452811 kcal mol⁻¹
    Side-chain contributions: 0.039658625 kcal mol⁻¹

and thus Sucrose can stabilize cofactor binding. We remark that the values obtained here are very small, and this is intended to be only an illustrative example.

Alternative SASA calculations

The following functions can be used to compute m-values from the variation of the SASA per residue type, which allow the use of external tools to compute the SASA. This is used mostly for testing purposes. The functions allow the use of SASAs obtained directly from the Auton & Bolen server, or from Gromacs SASA calculations.

PDBTools.mvalue_delta_sasaFunction
mvalue_delta_sasa(; model=MoeserHorinek, cosolvent="urea", atoms:AbstractVector{<:PDBTools.Atom}, sasas, type=1)

Calculates the m-value (transfer free energy of a protein in 1M solution) using the Tanford transfer model, as implemented by Moeser and Horinek [1] or by Auton and Bolen [2,3].

Arguments

  • model: The model to be used. Must be MoeserHorinek or AutonBolen. MoeserHorinek is only implemented for cosolvent="urea", and should be more precise in that case. Other solvents are available for AutonBolen.

  • cosolvent::String: One of "betaine", "proline", "sarcosine", "sorbitol", "sucrose", "tmao", "urea", "urea-app", "urea-mh"

  • atoms::AbstractVector{<:PDBTools.Atom}: Vector containing the atoms of the structure.

  • sasas::Dict{String, Dict{Symbol, Float64}}: A dictionary containing the change in solvent accessible surface area (SASA) upon denaturation for each amino acid type. This data can be obtained from the m-value server or calculated using GROMACS:

    • The output of the server can be parsed using the parse_mvalue_server_sasa function defined in this module.
    • Compute the SASA with delta_sasa_per_restype, a SASA calculation utility implemented in PDBTools.jl.
    • SASA values can be calculated using GROMACS with the gmx_delta_sasa_per_restype function defined in this module.
  • type::Int: Specifies which SASA value to use from the provided data, because the server provides minimum, average, and maximum values, according to different denatured models for the protein. The recommended value is 2 for comparison with experimental data. Normally, GROMACS calculations will provide a single value, so type=1 should be used in that case.

Returns

A named tuple with the following fields:

  • tot: Total transfer free energy (kcal/mol).
  • bb: Contribution from the backbone (kcal/mol).
  • sc: Contribution from the side chains (kcal/mol).
  • restype: A dictionary with the transfer free energy contributions per residue type (kcal/mol).

Each entry in the dictionary is a named tuple with bb and sc fields representing the backbone and side chain contributions, respectively.

Example calls

using PDBTools
using PDBTools: mvalue_delta_sasa
using PDBTools: delta_sasa_per_restype, parse_mvalue_server_sasa, gmx_delta_sasa_per_restype
protein = read_pdb("protein.pdb")

# Using SASA values calculated with PDBTools.jl
sasas=delta_sasa_per_restype(native=read_pdb("native.pdb"), desnat=read_pdb("desnat.pdb"))
mvalue_delta_sasa(; model=AutonBolen, cosolvent="TMAO", atoms=protein, sasas=sasas)

# Using SASA values from the m-value server
sasas_from_server=parse_mvalue_server_sasa(server_output)
mvalue_delta_sasa(; model=MoeserHorinek, cosolvent="urea", atoms=protein, sasas=sasas_from_server, type=2)

# Using SASA values calculated with GROMACS
sasas_gmx=gmx_delta_sasa_per_restype(native_pdb="native.pdb", desnat_pdb="desnat.pdb")
mvalue_delta_sasa(; model=AutonBolen, cosolvent="TMAO", atoms=protein, sasas=sasas_gmx)

References

  1. https://doi.org/10.1021/acs.jpcb.7b02138
  2. https://doi.org/10.1016/s0076-6879(07)28023-1
  3. https://www.pnas.org/doi/10.1073/pnas.0706251104
source
PDBTools.delta_sasa_per_restypeFunction
delta_sasa_per_restype(; 
    native::AbstractVector{<:PDBTools.Atom}, 
    desnat::AbstractVector{<:PDBTools.Atom}
)

Calculates the change in solvent accessible surface area (SASA) upon denaturation for each amino acid type using PDBTools. Returns a dictionary that can be directly used as input to the mvalue function.

Arguments

  • native: Vector of PDBTools.Atom objects for the native structure.
  • desnat: Vector of PDBTools.Atom objects for the denatured structure.

Returns

A dictionary where each key is an amino acid three-letter code (e.g., "ALA", "PHE"), and the value is another dictionary with two keys: :sc for side chain SASA values and :bb for backbone SASA values. Each of these keys maps to a tuple containing a single Float64 value representing the change in SASA upon denaturation in Ų.

Optional arguments

  • n_dots::Int=500: Sets the precision of the SASA calculation (greater is better).
  • backbone::Function = at -> name(at) in ("N", "CA", "C", "O"): Define what is a backbone atom.
  • sidechain::Function = at -> !(name(at) in ("N", "CA", "C", "O")): Define what is a sidechain atom.
  • ignore_hydrogen::Bool = true: By default, ignore all Hydrogen atoms of the structure.
  • unitcell=nothing: By default, do not use periodic boundary conditions. To use PBCs, define A unitcell by providing either a 3x3 matrix or, for orthorhombic cells, a vector of length 3 of cell sides.
source
PDBTools.parse_mvalue_server_sasaFunction
parse_mvalue_server_sasa(string::AbstractString)

Parses the SASA output from the m-value calculator server (http://best.bio.jhu.edu/mvalue/), into a dictionary that can be directly used as input to the mvalue function.

The input string should contain lines formatted as follows, and correspond to the SASA values for each amino acid type:

sasa_from_server = """
ALA 	8 	 (    11.1)     79.1 [   147.1] | (   -13.0)     51.4 [   115.8] 
PHE 	3 	 (   166.9)    197.1 [   230.2] | (    29.4)     56.4 [    83.4] 
LEU 	7 	 (   475.2)    532.2 [   589.3] | (    89.3)    145.3 [   201.3] 
...
LYS 	6 	 (   171.5)    220.4 [   269.3] | (    -4.5)     42.0 [    88.5] 
ARG 	1 	 (   110.2)    124.4 [   138.6] | (    17.1)     25.0 [    33.0] 
CYS 	0 	 (     0.0)      0.0 [     0.0] | (     0.0)      0.0 [     0.0] 
"""

This data can be found in the output of the server, under the title "Sidechain and Backbone changes in Accessible Surface Area".

The function returns a dictionary where each key is an amino acid three-letter code (e.g., "ALA", "PHE"), and the value is another dictionary with two keys: :sc for side chain SASA values and :bb for backbone SASA values. Each of these keys maps to a tuple containing three Float64 values representing the minimum, average, and maximum SASA values in Ų.

source
PDBTools.gmx_delta_sasa_per_restypeFunction
gmx_delta_sasa_per_restype(; native_pdb::AbstractString, desnat_pdb::AbstractString)

Calculates the change in solvent accessible surface area (SASA) upon denaturation for each amino acid type using GROMACS. Returns a dictionary that can be directly used as input to the mvalue function.

Note

This function requires GROMACS (gmx sasa executable) to be installed and accessible from the command line. The path to the gmx executable can be provided with the gmx keyword.

Arguments

  • native_pdb::AbstractString: Path to the PDB file of the native protein structure.
  • desnat_pdb::AbstractString: Path to the PDB file of the denatured protein structure.

Optional arguments

  • gmx: the path to the gmx GROMACS exectuable (by default it expects gmx to be on the path).
  • n_dots::Int: sets the precision of the SASA grid (greater is better).
  • backbone::Function = at -> name(at) in ("N", "CA", "C", "O"): Define what is a backbone atom.
  • sidechain::Function = at -> !(name(at) in ("N", "CA", "C", "O")): Define what is a sidechain atom.
  • ignore_hydrogen::Bool=true: By default, ignore all hydrogen atoms.

Returns

A dictionary where each key is an amino acid three-letter code (e.g., "ALA", "PHE"), and the value is another dictionary with two keys: :sc for side chain SASA values and :bb for backbone SASA values. Each of these keys maps to a tuple containing a single Float64 value representing the change in SASA upon denaturation in Ų.

source