Structure object

Warning

This is an experimental feature. Its purpose is to provide an object that carries metadata, the unit cell in particular, to simplify the interface of calculations with periodic boundary conditions.

PDBTools.Structure (Type)
Structure

A Structure is the main data structure in PDBTools.jl representing a molecular structure. It is a wrapper around a vector of Atom objects, with additional metadata stored in a dictionary.

Definition of structure object

A Structure object can be used to store an array of Atom objects along with, for example, the unitcell information or other metadata:

using PDBTools
ats = read_pdb(PDBTools.TESTPBC)
uc = read_unitcell(PDBTools.TESTPBC)
str = Structure(ats; unitcell=uc)
   Structure{Vector{Atom{Nothing}}, Atom{Nothing}}
   number of atoms: 130241, data: atoms, unitcell
   index name resname chain   resnum  residue        x        y        z occup  beta model segname index_pdb
       1    N     ALA     P        1        1  103.848   97.435   20.875  1.00  0.00     1    PROT         1
       2  HT1     ALA     P        1        1  103.498   96.484   21.110  1.00  0.00     1    PROT         2
⋮
  130241   H3    GLYC     D     4302    26788   27.053   48.661   12.369  1.00  0.00     1    GLYC    130241

Indexing and iteration

A Structure object behaves like a regular Vector{Atom} in most contexts, as it implements the AbstractVector interface. For example, the str object above can be indexed and iterated as a regular vector of atoms.

To fetch an atom by its index:

str[begin] # or str[1]
   index name resname chain   resnum  residue        x        y        z occup  beta model segname index_pdb
       1    N     ALA     P        1        1  103.848   97.435   20.875  1.00  0.00     1    PROT         1
str[10]
   index name resname chain   resnum  residue        x        y        z occup  beta model segname index_pdb
      10  HB3     ALA     P        1        1  106.730   95.726   20.423  1.00  0.00     1    PROT        10
str[end]
   index name resname chain   resnum  residue        x        y        z occup  beta model segname index_pdb
  130241   H3    GLYC     D     4302    26788   27.053   48.661   12.369  1.00  0.00     1    GLYC    130241

All iterators that apply to raw Atom vectors also apply to the Structure object. For example, the chains of the structure can be collected with:

collect(eachchain(str))
8-element Vector{Chain}[ 
    Chain(P-3801 atoms)
    Chain(A-29997 atoms)
    ⋮
    Chain(E-106 atoms)
    Chain(D-60228 atoms)
]
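As a further illustration of the vector-like behavior, the sketch below applies generic Julia functions and a plain loop to the same structure. It assumes the `name` and `chain` getter functions exported by PDBTools for `Atom` objects; the variable names (`natoms`, `n_alpha_carbons`, `atoms_per_chain`) are only illustrative.

```julia
using PDBTools

# Build the same Structure used above from the package's test file
ats = read_pdb(PDBTools.TESTPBC)
str = Structure(ats; unitcell=read_unitcell(PDBTools.TESTPBC))

# Generic vector functions rely on the AbstractVector interface
natoms = length(str)

# Higher-order functions iterate over the atoms one by one
n_alpha_carbons = count(at -> name(at) == "CA", str)

# A plain for loop works as well: tally the atoms by chain identifier
atoms_per_chain = Dict{String,Int}()
for at in str
    c = string(chain(at))
    atoms_per_chain[c] = get(atoms_per_chain, c, 0) + 1
end
```

Note that the dictionary groups atoms by chain identifier only, whereas `eachchain` is a dedicated iterator and may split the structure differently, so the two counts are not guaranteed to match.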

Data field assignment and access

The unitcell field, or any other property that has been defined, can be retrieved by accessing the field of the corresponding name:

str.unitcell
3×3 StaticArraysCore.SMatrix{3, 3, Float32, 9} with indices SOneTo(3)×SOneTo(3):
 107.845   -4.71405f-6   -4.71405f-6
   0.0    107.845        -4.71406f-6
   0.0      0.0         107.845
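Because the unit cell is returned as an ordinary 3×3 matrix, it can be passed directly to linear algebra routines. The following is a minimal sketch, not part of the PDBTools API, that computes the cell volume and the lengths of the lattice vectors. It assumes that the lattice vectors are stored as the columns of the matrix; the names `volume` and `sides` are illustrative.

```julia
using PDBTools
using LinearAlgebra: det, norm

ats = read_pdb(PDBTools.TESTPBC)
str = Structure(ats; unitcell=read_unitcell(PDBTools.TESTPBC))

uc = str.unitcell

# The volume of the periodic cell is the absolute value of the determinant
volume = abs(det(uc))

# Lengths of the lattice vectors (assumption: one lattice vector per column)
sides = [norm(v) for v in eachcol(uc)]
```

For the nearly orthorhombic cell shown above, each side length should be close to the corresponding diagonal element of the matrix.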

More fields can be added to the Structure object by simple assignment:

str.filename = "file.pdb"
"file.pdb"

which can then be accessed immediately with:

str.filename
"file.pdb"
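The construction and field assignment steps shown on this page can be combined into a small helper. The function below, `load_structure`, is a hypothetical convenience wrapper and not part of PDBTools; it uses only the calls illustrated above and assumes that the input file contains unit cell information that `read_unitcell` can parse.

```julia
using PDBTools

# Hypothetical helper (not provided by PDBTools): read atoms and unit cell
# from the same PDB file and record the file name as extra metadata.
function load_structure(file::AbstractString)
    ats = read_pdb(file)
    uc = read_unitcell(file)
    s = Structure(ats; unitcell=uc)
    s.filename = file  # extra field added by simple assignment
    return s
end

s = load_structure(PDBTools.TESTPBC)
s.filename  # path of the file the structure was read from
```

Keeping the file name next to the atoms and the unit cell means that downstream functions can receive a single object instead of three separate arguments.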