Structure object

Warning

This is an experimental feature. Its purpose is to provide an object that carries metadata, unit cells in particular, to simplify the interface of calculations with periodic boundary conditions.

PDBTools.Structure (Type)
Structure

A Structure is the main data structure in PDBTools.jl representing a molecular structure. It is a wrapper around a vector of Atom objects, with additional metadata stored in a dictionary.


Definition of structure object

A Structure object can be used to store an array of Atom objects along with, for example, the unitcell information or other metadata:

using PDBTools
ats = read_pdb(PDBTools.TESTPBC)
uc = read_unitcell(PDBTools.TESTPBC)
str = Structure(ats; unitcell=uc)
   Structure{Vector{Atom{Nothing}}, Atom{Nothing}}
   number of atoms: 130241, data: atoms, unitcell
   index name resname chain   resnum  residue        x        y        z occup  beta model segname index_pdb
       1    N     ALA     P        1        1  103.848   97.435   20.875  1.00  0.00     1    PROT         1
       2  HT1     ALA     P        1        1  103.498   96.484   21.110  1.00  0.00     1    PROT         2
⋮
  130241   H3    GLYC     D     4302    26788   27.053   48.661   12.369  1.00  0.00     1    GLYC    130241

Indexing and iteration

A Structure object behaves like a regular Vector{Atom} in most contexts, as it implements the AbstractVector interface. For example, the str object above can be indexed and iterated as a regular vector of atoms.

To fetch an atom by its index, one can do:

str[begin] # or str[1]
   index name resname chain   resnum  residue        x        y        z occup  beta model segname index_pdb
       1    N     ALA     P        1        1  103.848   97.435   20.875  1.00  0.00     1    PROT         1
str[10]
   index name resname chain   resnum  residue        x        y        z occup  beta model segname index_pdb
      10  HB3     ALA     P        1        1  106.730   95.726   20.423  1.00  0.00     1    PROT        10
str[end]
   index name resname chain   resnum  residue        x        y        z occup  beta model segname index_pdb
  130241   H3    GLYC     D     4302    26788   27.053   48.661   12.369  1.00  0.00     1    GLYC    130241

All iterators that apply to plain vectors of Atom objects also apply to a Structure object. For example, let us collect the chains of the structure:

collect(eachchain(str))
8-element Vector{Chain}[ 
    Chain(P-3801 atoms)
    Chain(A-29997 atoms)
    ⋮
    Chain(E-106 atoms)
    Chain(D-60228 atoms)
]
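Because the object implements the AbstractVector interface, generic Base functions such as length and count should also work on it directly. The following is a minimal sketch, assuming the same test file used above and assuming that the atom columns shown in the printout (resname, chain) are accessible as fields of each Atom:

```julia
using PDBTools

# Build the Structure exactly as in the example above
ats = read_pdb(PDBTools.TESTPBC)
uc = read_unitcell(PDBTools.TESTPBC)
str = Structure(ats; unitcell=uc)

# Generic vector functions from Base operate on the Structure
n = length(str)

# Count atoms that belong to alanine residues
# (assumes the `resname` field is available on each atom)
n_ala = count(at -> at.resname == "ALA", str)

# Plain iteration visits one Atom at a time
chain_of_first = str[begin].chain
for at in str
    at.chain == chain_of_first || break  # stop at the first atom of another chain
end
```

Here n should match the number of atoms reported when the structure is printed.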

Data field assignment and access

Additionally, the unitcell field (or any other property that has been defined) can be obtained by directly accessing the field with the corresponding name:

str.unitcell
3×3 StaticArraysCore.SMatrix{3, 3, Float32, 9} with indices SOneTo(3)×SOneTo(3):
 107.845   -4.71405f-6   -4.71405f-6
   0.0    107.845        -4.71406f-6
   0.0      0.0         107.845
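Since the unit cell is returned as an ordinary 3×3 matrix, it can be passed to standard linear algebra routines. As a hedged illustration, the sketch below computes the cell volume from the determinant of the matrix; the variable name vol is chosen here for the example and is not part of PDBTools:

```julia
using PDBTools
using LinearAlgebra: det

ats = read_pdb(PDBTools.TESTPBC)
uc = read_unitcell(PDBTools.TESTPBC)
str = Structure(ats; unitcell=uc)

# The volume of a periodic cell is the absolute value of the
# determinant of the matrix formed by its lattice vectors
vol = abs(det(str.unitcell))
```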

More fields can be added to the Structure object by simple assignment:

str.filename = "file.pdb"
"file.pdb"

which can then be accessed immediately with:

str.filename
"file.pdb"
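To illustrate how a wrapper of this kind can combine vector behavior with freely assignable metadata, here is a self-contained sketch in plain Julia. The type MetaVector and its fields items and data are hypothetical names introduced for this example; this is not the PDBTools implementation, only one possible way to obtain similar behavior by overloading getproperty and setproperty!:

```julia
# Hypothetical wrapper, for illustration only (not the PDBTools implementation)
struct MetaVector{T} <: AbstractVector{T}
    items::Vector{T}
    data::Dict{Symbol,Any}
end

# Keyword arguments become the initial metadata entries
MetaVector(items::Vector{T}; kargs...) where {T} =
    MetaVector{T}(items, Dict{Symbol,Any}(kargs))

# Minimal AbstractVector interface: size, getindex, setindex!
Base.IndexStyle(::Type{<:MetaVector}) = IndexLinear()
Base.size(v::MetaVector) = size(getfield(v, :items))
Base.getindex(v::MetaVector, i::Int) = getfield(v, :items)[i]
Base.setindex!(v::MetaVector, x, i::Int) = (getfield(v, :items)[i] = x)

# Property access is redirected to the metadata dictionary
function Base.getproperty(v::MetaVector, name::Symbol)
    name === :items && return getfield(v, :items)
    return getfield(v, :data)[name]
end

# Assignment stores the value under the given name in the dictionary
function Base.setproperty!(v::MetaVector, name::Symbol, value)
    getfield(v, :data)[name] = value
    return value
end

v = MetaVector([1.0, 2.0, 3.0]; unitcell=[10.0, 10.0, 10.0])
v.filename = "file.pdb"   # new metadata added by simple assignment
total = sum(v)            # generic vector functions work on the wrapper
```

The struct itself stays immutable in this sketch; only the dictionary it holds is mutated, which is what allows new properties to be attached after construction.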