Secondary structure

These functions compute the secondary structure assignment of proteins with the STRIDE and DSSP algorithms, taking vectors of PDBTools.Atom as input.

ProteinSecondaryStructures.stride_run (function)
stride_run(atoms::AbstractVector{<:PDBTools.Atom})

Run STRIDE secondary structure assignment on the provided array of atoms.

Example

julia> using PDBTools

julia> atoms = read_pdb(PDBTools.TESTPDB, "protein");

julia> atoms[1458].name = "O"; # Terminal residue has non-standard atom name

julia> ss = stride_run(atoms)
104-element Vector{SSData}:
 SSData("ALA", "A", 1, "C", 360.0, 64.07)
 SSData("CYS", "A", 2, "T", -36.7, 125.51)
 SSData("ASP", "A", 3, "T", -125.56, -10.38)
 ⋮
 SSData("CYS", "A", 103, "C", -56.48, -168.79)
 SSData("THR", "A", 104, "C", -110.75, 360.0)
ProteinSecondaryStructures.dssp_run (function)
dssp_run(atoms::AbstractVector{<:PDBTools.Atom})

Run DSSP secondary structure assignment on the provided array of atoms.

Example

julia> using PDBTools

julia> atoms = read_pdb(PDBTools.TESTPDB, "protein");

julia> atoms[1458].name = "O"; # Terminal residue has non-standard atom name

julia> ss = dssp_run(atoms)
104-element Vector{SSData}:
 SSData("ALA", "A", 1, " ", 0.0, 64.1)
 SSData("CYS", "A", 2, " ", -36.7, 125.5)
 SSData("ASP", "A", 3, " ", -125.6, -10.4)
 ⋮
 SSData("CYS", "A", 103, " ", -56.5, -168.8)
 SSData("THR", "A", 104, " ", -110.7, 0.0)

The stride_run and dssp_run functions return a vector of SSData objects, as defined in ProteinSecondaryStructures.jl, with one element per residue. Each SSData holds the secondary structure assignment and the backbone dihedral angles of its residue. The secondary structure assignment codes are listed in the ProteinSecondaryStructures.jl documentation.
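
As a rough illustration of working with such a vector, the sketch below tallies the one-letter codes and collects the residues with a given code. It does not call STRIDE or DSSP: `ResidueSS` is a hypothetical stand-in that mirrors the six values printed for each SSData entry, and its field names (`resname`, `chain`, `resnum`, `sscode`, `phi`, `psi`) are assumptions that should be checked against the ProteinSecondaryStructures.jl documentation.

```julia
# Hypothetical stand-in for SSData; the field names are assumptions.
struct ResidueSS
    resname::String
    chain::String
    resnum::Int
    sscode::String
    phi::Float64
    psi::Float64
end

# A few entries copied from the STRIDE output shown on this page.
ss = [
    ResidueSS("ALA", "A", 1, "C", 360.0, 64.07),
    ResidueSS("CYS", "A", 2, "T", -36.7, 125.51),
    ResidueSS("ASP", "A", 3, "T", -125.56, -10.38),
    ResidueSS("CYS", "A", 103, "C", -56.48, -168.79),
    ResidueSS("THR", "A", 104, "C", -110.75, 360.0),
]

# Count how many residues carry each one-letter code.
function count_codes(ss)
    counts = Dict{String,Int}()
    for r in ss
        counts[r.sscode] = get(counts, r.sscode, 0) + 1
    end
    return counts
end

counts = count_codes(ss)

# Residue numbers that were assigned the "T" code.
turn_residues = [r.resnum for r in ss if r.sscode == "T"]
```

With the real output of stride_run or dssp_run, the same loop and comprehension should apply as long as the field names match the assumption above.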

Note

Non-standard residue or atom names may lead to incorrect secondary structure assignments. For accurate results, replace non-standard residue names with their standard three-letter codes, rename non-standard backbone atoms, and verify that all backbone atoms (N, CA, C, O) are present in each residue before running these functions. Errors or warnings will be issued if residue names are not recognized or backbone atoms are missing.
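
The backbone check described in this note can be sketched in plain Julia. This is only an illustration of the logic, not the validation performed by the package: the atoms are hypothetical named tuples rather than PDBTools.Atom objects, the `OT1` name is an invented example of a non-standard terminal oxygen, and residues are keyed by residue number alone, which ignores chains.

```julia
# Hypothetical atom records: (atom name, residue name, residue number).
atoms = [
    (name = "N",   resname = "ALA", resnum = 1),
    (name = "CA",  resname = "ALA", resnum = 1),
    (name = "C",   resname = "ALA", resnum = 1),
    (name = "O",   resname = "ALA", resnum = 1),
    (name = "N",   resname = "THR", resnum = 2),
    (name = "CA",  resname = "THR", resnum = 2),
    (name = "C",   resname = "THR", resnum = 2),
    (name = "OT1", resname = "THR", resnum = 2), # non-standard terminal oxygen name
]

# Return a Dict mapping each residue number to the backbone atom names it lacks.
# Residues with a complete backbone do not appear in the result.
function missing_backbone(atoms; backbone = ("N", "CA", "C", "O"))
    present = Dict{Int,Set{String}}()
    for at in atoms
        # Collect the atom names seen in each residue.
        push!(get!(() -> Set{String}(), present, at.resnum), at.name)
    end
    problems = Dict{Int,Vector{String}}()
    for (resnum, names) in present
        absent = [b for b in backbone if !(b in names)]
        isempty(absent) || (problems[resnum] = absent)
    end
    return problems
end

problems = missing_backbone(atoms)
```

Here residue 2 is reported as lacking `O`, which is the kind of case that the atom rename in the examples on this page addresses.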

Example using STRIDE

The stride_run function runs the STRIDE algorithm on the provided array of atoms.

using PDBTools
atoms = read_pdb(PDBTools.TESTPDB, "protein")
ss = stride_run(atoms)
104-element Vector{SSData}:
 SSData("ALA", "A", 1, "C", 360.0, 64.07)
 SSData("CYS", "A", 2, "T", -36.7, 125.51)
 SSData("ASP", "A", 3, "T", -125.56, -10.38)
 ⋮
 SSData("CYS", "A", 103, "C", -56.48, -168.79)
 SSData("THR", "A", 104, "C", -110.75, 360.0)

To use DSSP instead, call the dssp_run function in the same way.

Utility functions

PDBTools also reexports the ss_composition, ss_name, ss_code, and ss_number functions from ProteinSecondaryStructures.jl, which can be used to analyze the secondary structure assignment results. For example, to compute the secondary structure composition from the ss vector obtained above:

ss_composition(ss)
Dict{String, Int64} with 10 entries:
  "310 helix"   => 0
  "bend"        => 0
  "turn"        => 36
  "beta bridge" => 4
  "kappa helix" => 0
  ⋮             => ⋮
ss_name(ss[1]) # name of the secondary structure of the first residue
"coil"

For further information, refer to the ProteinSecondaryStructures.jl documentation.