Hydrogen-bond analysis

This tool provides a fast and practical way to compute hydrogen bonds in structures (if the structures contain hydrogen atoms).

PDBTools.hydrogen_bondsFunction
hydrogen_bonds(atoms, sel, sel1 => sel2, ... ; kargs...)

Function to find hydrogen bonds in a set of atoms, or among two sets of atoms. The structure must contain Hydrogen atoms.

Arguments

  • atoms: Vector of atoms, or structure component (model, chain, segment, residue) to be analyzed.

and, optionally, the selections or selection pairs for which the hydrogen bonds must be computed:

  • sel::String: Selection string, e. g. "protein".
  • sel1 => sel2::Pair{String,String}: Pair of selection strings, e. g. "resname ARG" => "resname GLU".

The two selections of each pair, if different, must not have overlapping atoms (and error will the thrown). If no selection is provided, the hydrogen bonds of the complete structure will be computed.

Optional keyword arguments

  • unitcell::Union{Nothing,AbstractVecOrMat}=nothing: Unit cell for periodic boundary conditions.
  • donor_acceptor_distance::Real=3.5f0: Maximum distance between donor and acceptor to consider a hydrogen bond.
  • angle_cutoff::Real=30: Maximum angle (in degrees) between donor-hydrogen-acceptor to consider a hydrogen bond.
  • electronegative_elements=("N", "O", "F", "S"): Elements considered electronegative for hydrogen bonding.
  • d_covalent_bond::Real=1.2f0: Maximum distance between donor and hydrogen to consider a covalent bond.
  • parallel::Bool=false: Whether to use parallel computation.

Returns

  • HBonds: A data structure containing the found hydrogen bonds, where each element is a named tuple (D, H, A, r, ang) where the fields correspond to the donor, hydrogen and acceptor atoms, the distance between donor and acceptor atoms, and the angle.

Example

julia> using PDBTools

julia> pdb = read_pdb(PDBTools.test_dir*"/hbonds.pdb", "model 1");

julia> uc = read_unitcell(PDBTools.test_dir*"/hbonds.pdb");

julia> hbs = hydrogen_bonds(pdb, "protein"; unitcell=uc) # Single set of atoms: selection is optional
OrderedCollections.OrderedDict{String, PDBTools.HBonds} with 1 entry:
  "protein => protein" => HBonds(Int32[1, 1, 271, 37, 1020, 237, 56, 76, 1060, 204  …  748, 813, 828, 871, 863, 877, 96…

julia> hbs["protein => protein"] # Summary
HBonds data structure with 63 hydrogen-bonds.
    First hbond: (D-H---A) = (D = 1, H = 2, A = 286, r = 2.6871147f0, ang = 10.643958f0)
    Last hbond: (D-H---A) = (D = 1014, H = 1017, A = 1032, r = 2.5816715f0, ang = 12.714139f0)
    - r is the distance between Donor and Acceptor atoms (D-A)
    - ang is the angle (degrees) between H-D and A-D.

julia> hbs["protein => protein"][1] # first h-bond
(D = 1, H = 2, A = 286, r = 2.6871147f0, ang = 10.643958f0)

julia> hbs = hydrogen_bonds(pdb, "protein", "protein" => "resname SOL"; unitcell=uc) # Multiple selections
OrderedCollections.OrderedDict{String, PDBTools.HBonds} with 2 entries:
  "protein => protein"     => HBonds(Int32[1, 1, 271, 37, 1020, 237, 56, 76, 1060, 204  …  748, 813, 828, 871, 863, 877…
  "protein => resname SOL" => HBonds(Int32[1406, 1583, 1799, 2027, 789, 3503, 1169, 184, 3914, 4304  …  1224, 38768, 12…

julia> hbs["protein => protein"]
HBonds data structure with 63 hydrogen-bonds.
    First hbond: (D-H---A) = (D = 1, H = 2, A = 286, r = 2.6871147f0, ang = 10.643958f0)
    Last hbond: (D-H---A) = (D = 1014, H = 1017, A = 1032, r = 2.5816715f0, ang = 12.714139f0)
    - r is the distance between Donor and Acceptor atoms (D-A)
    - ang is the angle (degrees) between H-D and A-D.

julia> hbs["protein => resname SOL"]
HBonds data structure with 138 hydrogen-bonds.
    First hbond: (D-H---A) = (D = 1406, H = 1407, A = 160, r = 2.9361732f0, ang = 6.771988f0)
    Last hbond: (D-H---A) = (D = 41798, H = 41800, A = 395, r = 2.6894214f0, ang = 10.623453f0)
    - r is the distance between Donor and Acceptor atoms (D-A)
    - ang is the angle (degrees) between H-D and A-D.
Note

This function does not use topology information. It identified polar hydrogens based on distance criteria only, where d_covalent_bond is the criterion for identifying covalent bonds between donor and hydrogen atoms.

source

The HBonds structure

The hydrogen_bonds function returns an OrderedDict{String, HBonds}, where each key corresponds to the selection pair used (e.g., "protein => protein" or "protein => resname SOL").

Each HBonds object is an iterable container where each element is a named tuple with fields:

  • D: Index of the donor atom
  • H: Index of the hydrogen atom
  • A: Index of the acceptor atom
  • r: Distance between donor and acceptor (Å)
  • ang: Angle between H-D and A-D vectors (degrees)

Basic Usage

using PDBTools
pdb = read_pdb(PDBTools.test_dir*"/hbonds.pdb", "model 1")
uc = read_unitcell(PDBTools.test_dir*"/hbonds.pdb")
hbs = hydrogen_bonds(pdb,
    "protein", # alias for "protein" => "protein"
    "protein" => "water",
    unitcell=uc # optional
)
OrderedCollections.OrderedDict{String, PDBTools.HBonds} with 2 entries:
  "protein => protein" => HBonds(Int32[1, 1, 271, 37, 1020, 237, 56, 76, 1060, 204  …  748, 813, 828, 871, 863, 877, 96…
  "protein => water"   => HBonds(Int32[128, 1238], Int32[129, 1240], Int32[1238, 1114], Float32[3.03824, 2.68987], Floa…

Access the hydrogen bonds for a specific selection:

hb_prot = hbs["protein => protein"]
HBonds data structure with 63 hydrogen-bonds.
    First hbond: (D-H---A) = (D = 1, H = 2, A = 286, r = 2.6871147f0, ang = 10.643976f0)
    Last hbond: (D-H---A) = (D = 1014, H = 1017, A = 1032, r = 2.5816743f0, ang = 12.714139f0)
    - r is the distance between Donor and Acceptor atoms (D-A)
    - ang is the angle (degrees) between H-D and A-D.

Iterating over hydrogen bonds

The HBonds structure supports standard Julia iteration. You can iterate directly over all hydrogen bonds:

for bond in hb_prot
    println("Donor: $(bond.D), Hydrogen: $(bond.H), Acceptor: $(bond.A), Distance: $(round(bond.r, digits=2)) Å")
    break  # just show first one for brevity
end
Donor: 1, Hydrogen: 2, Acceptor: 286, Distance: 2.69 Å

Use enumerate to get both index and bond:

for (i, bond) in enumerate(hb_prot)
    if i <= 3
        println("Bond $i: D=$(bond.D) -> A=$(bond.A), r=$(round(bond.r, digits=2)) Å, angle=$(round(bond.ang, digits=1))°")
    end
end
Bond 1: D=1 -> A=286, r=2.69 Å, angle=10.6°
Bond 2: D=1 -> A=267, r=2.65 Å, angle=4.1°
Bond 3: D=271 -> A=19, r=2.78 Å, angle=25.9°

Collect all bonds into a vector:

hbonds_prot = collect(hb_prot)
63-element Vector{Any}:
 (D = 1, H = 2, A = 286, r = 2.6871147f0, ang = 10.643976f0)
 (D = 1, H = 4, A = 267, r = 2.6454165f0, ang = 4.060425f0)
 (D = 271, H = 272, A = 19, r = 2.7765808f0, ang = 25.896185f0)
 ⋮
 (D = 1042, H = 1043, A = 997, r = 2.7836866f0, ang = 8.648177f0)
 (D = 1014, H = 1017, A = 1032, r = 2.5816743f0, ang = 12.714139f0)

Finding specific bonds with findfirst and findall

Use findfirst to locate the first hydrogen bond matching a condition:

idx = findfirst(bond -> bond.r < 2.6, hb_prot)
30

Find first H-bond involving a specific donor atom index

acceptor_index = 55
idx = findfirst(bond -> bond.A == acceptor_index, hb_prot)
6

The corresponding hydrogen bond being:

hb_prot[6]
(D = 237, H = 238, A = 55, r = 2.885242f0, ang = 23.387154f0)

Find all hydrogen bonds with angle less than 15 degrees

Use findall to get indices of all matching hydrogen bonds:

indices = findall(bond -> bond.ang < 15, hb_prot)
println("Found $(length(indices)) H-bonds with angle < 15°")
Found 35 H-bonds with angle < 15°

Find all hydrogen bonds shorter than 2.8 Å

short_hbonds = findall(bond -> bond.r < 2.8, hb_prot)
31-element Vector{Int64}:
  1
  2
  3
  ⋮
 62
 63

These indices can be used to extract the list from a HBonds data structure:

hb_prot[short_hbonds]
HBonds data structure with 31 hydrogen-bonds.
    First hbond: (D-H---A) = (D = 1, H = 2, A = 286, r = 2.6871147f0, ang = 10.643976f0)
    Last hbond: (D-H---A) = (D = 1014, H = 1017, A = 1032, r = 2.5816743f0, ang = 12.714139f0)
    - r is the distance between Donor and Acceptor atoms (D-A)
    - ang is the angle (degrees) between H-D and A-D.

and will be similar to a filter operation, as shown below.

Filtering

Use filter to extract bonds matching criteria:

# Get all strong hydrogen bonds (short distance, small angle)
strong_bonds = filter(bond -> bond.r < 2.8 && bond.ang < 20, hb_prot)
HBonds data structure with 25 hydrogen-bonds.
    First hbond: (D-H---A) = (D = 1, H = 2, A = 286, r = 2.6871147f0, ang = 10.643976f0)
    Last hbond: (D-H---A) = (D = 1014, H = 1017, A = 1032, r = 2.5816743f0, ang = 12.714139f0)
    - r is the distance between Donor and Acceptor atoms (D-A)
    - ang is the angle (degrees) between H-D and A-D.

Use comprehensions for flexible queries:

Extract distances and angles of all hydrogen bonds:

distances = [(bond.r, bond.ang) for bond in hb_prot]
63-element Vector{Tuple{Float32, Float32}}:
 (2.6871147, 10.643976)
 (2.6454165, 4.060425)
 (2.7765808, 25.896185)
 ⋮
 (2.7836866, 8.648177)
 (2.5816743, 12.714139)

Multiple selections

Compute hydrogen bonds between different groups of atoms:

hbs = hydrogen_bonds(pdb, "protein", "protein" => "resname HOH"; unitcell=uc)
OrderedCollections.OrderedDict{String, PDBTools.HBonds} with 2 entries:
  "protein => protein"     => HBonds(Int32[1, 1, 271, 37, 1020, 237, 56, 76, 1060, 204  …  748, 813, 828, 871, 863, 877…
  "protein => resname HOH" => HBonds(Int32[128, 1238], Int32[129, 1240], Int32[1238, 1114], Float32[3.03824, 2.68987], …

Intra-protein hydrogen bonds

hbs["protein => protein"]
HBonds data structure with 63 hydrogen-bonds.
    First hbond: (D-H---A) = (D = 1, H = 2, A = 286, r = 2.6871147f0, ang = 10.643976f0)
    Last hbond: (D-H---A) = (D = 1014, H = 1017, A = 1032, r = 2.5816743f0, ang = 12.714139f0)
    - r is the distance between Donor and Acceptor atoms (D-A)
    - ang is the angle (degrees) between H-D and A-D.

Protein-water hydrogen bonds

hbs["protein => resname HOH"]
HBonds data structure with 2 hydrogen-bonds.
    First hbond: (D-H---A) = (D = 128, H = 129, A = 1238, r = 3.0382407f0, ang = 22.778383f0)
    Last hbond: (D-H---A) = (D = 1238, H = 1240, A = 1114, r = 2.6898682f0, ang = 18.384151f0)
    - r is the distance between Donor and Acceptor atoms (D-A)
    - ang is the angle (degrees) between H-D and A-D.
Note

Different selections in a pair (like "protein" => "resname SOL") must not have overlapping atoms. An error will be thrown if the same atom appears in both selections.