Contact and distance maps

The contact_map function computes a contact map for a structure or a pair of structures. These structures are typically proteins, but any structures defined by sequences of residues can be provided as inputs. The heatmap function, from Plots, is overloaded here to provide a convenient way to plot the contact or distance maps.

PDBTools.contact_mapFunction
contact_map(
    atoms1::AbstractVector{<:PDBTools.Atom}
    atoms2::AbstractVector{<:PDBTools.Atom}; # optional for contacts between two structures
    dmax::Real=4.0,
    gap::Int=0, # only available if atoms2 is not provided
    unitcell=nothing,
    discrete::Bool=true,
    positions::Union{Nothing,AbstractVector{<:AbstractVector{<:Real}}}=nothing,
)

Calculate the contact map between residues in a protein* structure (if only atoms1 is provided) or between residues in two different protein* structures (atoms1 and atoms2).

Note

The distance used to define a contact is the minimum distance between any two atoms of the residues of the atoms groups provided, with a threshold distance dmax.

Returns the contact map as a ContactMap object.

*The term "protein" is used here to refer to any structure with residues.

Arguments

  • atoms1::AbstractVector{<:PDBTools.Atom}: Atoms of the first structure.
  • atoms2::AbstractVector{<:PDBTools.Atom}: Atoms of the second structure, if provided.

Optional keyword arguments

  • dmax::Real=4.0: Threshold distance for a contact.
  • gap::Int=0: Gap between residues to calculate contacts.
  • unitcell=nothing: Unit cell dimensions for periodic boundary conditions.
  • discrete::Bool=true: If true, the matrix contains Bool values, where true indicates a contact and false indicates no contact. If false, the matrix contains distances between residues.
  • positions: Positions of the atoms in the structure. If provided, the function uses these positions to calculate the distance between residues.

Examples

Contact map between residues in the same structure

julia> using PDBTools

julia> ats = read_pdb(PDBTools.DIMERPDB);

julia> cA = select(ats, "chain A");

julia> cB = select(ats, "chain B");

julia> map = contact_map(cA, cB) # contact map between chains A and B
ContactMap{Union{Missing, Bool}} of size (243, 12), with threshold 4.0 and gap 0 

julia> # using Plots; heatmap(map); # uncoment to plot the contact map

Contact map between residues in two different structures

julia> using PDBTools

julia> ats = read_pdb(PDBTools.DIMERPDB);

julia> cA = select(ats, "chain A");

julia> cB = select(ats, "chain B");

julia> map = contact_map(cA, cB) # contact map between chains A and B
ContactMap{Union{Missing, Bool}} of size (243, 12), with threshold 4.0 and gap 0 

julia> # using Plots; heatmap(map); # uncoment plot the contact map
source
Plots.heatmapMethod
heatmap(map::PDBTools.ContactMap; kwargs...)

Plot a contact map.

Arguments

  • map::ContactMap: the contact map to plot

All other arguments are default keywords of Plots.heatmap and can be adjusted to customize the plot.

Most typical options to adjust are:

  • xstep: the stride of the x-axis ticks
  • ystep: the stride of the y-axis ticks
  • color: the color palette to use (default: :grayC for distances, :Greys_9 for binary maps)
  • clims: the range of the color scale.
  • colorbar_title: the title of the colorbar. Default: "distance (Å)" for distances, no title for binary maps.

Example

julia> using PDBTools, Plots

julia> ats = read_pdb(PDBTools.DIMERPDB);

julia> cA = select(ats, "chain A");

julia> cB = select(ats, "chain B");

julia> map = contact_map(cA, cB)
ContactMap{Union{Missing, Bool}} of size (243, 12), with threshold 4.0 and gap 0

julia> # plt = heatmap(map) # uncomment to plot

julia> map = contact_map(cA, cB; discrete=false) # distance map
ContactMap{Union{Missing, Float32}} of size (243, 12), with threshold 4.0 and gap 0

julia> # plt = heatmap(map) # uncomment to plot
source
Note

The distance computed in these functions is the minimum distance between the atoms of the residues. If, for example, a contact map computed from the distance between Cα atoms is desired, select the atoms before computing the map:

cA = select("chain A and name CA")
contact_map(cA; dmax=8.0)

Importantly, the maximum distance defining a contact has to be adjusted, as it is, by default, dmax=4.0, which is a reasonable contact measure from minimum distance between atoms, but too short for distances between backbone atoms.

Contact map

A typical usage consists in computing the contact map and plotting it:

using PDBTools
using Plots
ats = read_pdb(PDBTools.DIMERPDB);
cA = select(ats, "chain A");
cB = select(ats, "chain B");
map = contact_map(cA, cB) # contact map between chains A and B
heatmap(map)
Example block output

Distance map

In the example above we opted to plot a discrete contact map, with the default contact distance dmax=4.0. Now we change two parameters: discrete=false and dmax=12.0, to compute a distance map up to a greater distance:

using PDBTools
using Plots
ats = read_pdb(PDBTools.DIMERPDB);
cA = select(ats, "chain A");
cB = select(ats, "chain B");
map = contact_map(cA, cB; discrete=false, dmax=12.0) # contact map between chains A and B
heatmap(map)
Example block output

Single structure

Similarly, we can produce plots for the contact map of a single structure. Here, we showcase the use of the gap parameter, to ignore residues closer in the sequence by less than 4 residues:

using PDBTools
using Plots
ats = read_pdb(PDBTools.DIMERPDB);
cA = select(ats, "chain A");
distance_map = contact_map(cA; gap=4, discrete=false, dmax=12.0) # chain A only
discrete_map = contact_map(cA; gap=4, discrete=true, dmax=12.0) # chain A only
plot(
    heatmap(distance_map; colorbar=nothing, color=:davos),
    heatmap(discrete_map);
    layout=(1,2), size=(800,500)
)
Example block output

Customizing the plot

All heatmap parametes can be customized using the Plots keyword syntax. Above, we illustrated this by removing the color bar and changing the color scale.

Common customization options are:

  • xstep: the stride of the x-axis ticks
  • ystep: the stride of the y-axis ticks
  • color: the color palette to use (default: :grayC for distances, :Greys_9 for binary maps)
  • clims: the range of the color scale.
  • colorbar_title: the title of the colorbar. Default: "distance (Å)" for distances, no title for binary maps.

Indexing

The ContactMap data structure can be indexed to extract the contacts of a specific residue. For example:

using PDBTools
ats = read_pdb(PDBTools.DIMERPDB);
cA = select(ats, "chain A");
cB = select(ats, "chain B");
map = contact_map(cA, cB; discrete=false, dmax=12.0)
map[235,:] # all distances below 12.0 Angs of residue 235 of cA with cB
12-element Vector{Union{Missing, Float32}}:
  7.688603f0
  7.10548f0
  3.8597367f0
  7.667418f0
 10.733246f0
 10.48728f0
  9.114745f0
   missing
   missing
   missing
   missing
   missing

Data structure and auxiliary functions

PDBTools.ContactMapType
ContactMap{Bool|Real}

Data structure to store contact maps between residues in a protein structure. The contact map is a matrix of distances between residues. A contact is defined when the distance between any two atoms of the residues is less than a given threshold dmax.

If the distance between two residues is greater than dmax, the value in the matrix is missing, indicating that there is no contact between the residues. If the distance is less than dmax, the value in the matrix is the distance between the residues.

The gap parameter is used to calculate contacts between residues separated by a given number of residues. For example, if gap=3, the contact map was calculated between residues separated by at least 3 residues in the sequence.

Fields

  • matrix::Matrix{Union{Missing,T}}: Matrix of distances between residues.
  • d::T: Threshold distance for a contact.
  • gap::Int: Gap between residues to calculate contacts.

If the contact map was calculated with discrete=true, the matrix contains Bool values, where true indicates a contact and false indicates no contact. On the other hand, if discrete=false, the matrix contains distances between residues.

source
PDBTools.residue_residue_distanceFunction
residue_residue_distance(
    r1::PDBTools.Residue, 
    r2::PDBTools.Residue; 
    positions::AbstractVector{AbstractVector{T}}=nothing; 
    unitcell=nothing
)

Calculate the minimum distance between two residues in a protein structure. If the positions argument is not provided, the function calculates the distance using the coordinates of the atoms in the residues. If positions is provided, the function uses the coordinates in the positions array.

Arguments

  • r1::PDBTools.Residue: Residue 1
  • r2::PDBTools.Residue: Residue 2
  • positions::AbstractVector{AbstractVector{T}}: Optional alternate positions of the atoms in the structure.
  • unitcell=nothing: Optional unit cell dimensions for periodic boundary conditions.
Note

The index of the atoms in the residues must match the index of the atoms in the positions array.

Example

julia> using PDBTools

julia> ats = read_pdb(PDBTools.DIMERPDB);

julia> residues = collect(eachresidue(ats));

julia> r1 = residues[1]; r10 = residues[10];

julia> println(name(r1), resnum(r1), " and ", name(r10), resnum(r10))
LYS211 and GLU220

julia> d = residue_residue_distance(r1, r10)
16.16511f0
source