Protein Data Bank
The Protein Data Bank (PDB) is a repository for 3-D structural data of proteins and nucleic acids. These data, typically obtained by X-ray crystallography or NMR spectroscopy, are submitted by biologists and biochemists from around the world, are released into the public domain, and can be accessed for free.

HISTORY

The Protein Data Bank was founded in 1971 by Brookhaven National Laboratory; its management was transferred in 1998 to members of the Research Collaboratory for Structural Bioinformatics (RCSB). The Worldwide Protein Data Bank (wwPDB) consists of organizations that act as deposition, data processing and distribution centers for structural data that are freely and publicly available to the global community. The PDB is a key resource in structural biology and is critical to more recent work in structural genomics. Many derived databases and projects have been developed to integrate and classify the PDB in terms of protein structure, protein function and protein evolution.

Growth

When the PDB was founded it contained just 7 protein structures. Since then the number of structures has grown approximately exponentially, with no sign of slowing. The growth rate of the PDB has been analysed fairly extensively.

CONTENTS

As of 26 September 2006, the database contained 39,051 released atomic coordinate entries (or "structures"), 35,767 of them proteins, the rest being nucleic acids, nucleic acid-protein complexes, and a few other molecules. About 5,000 new structures are released each year. Data are stored in the mmCIF format, developed specifically for the purpose.

Note that the database stores the exact location of all atoms in a large biomolecule (although usually without the hydrogen atoms, whose positions are more of a statistical estimate). If one is interested only in sequence data, i.e. the list of amino acids making up a particular protein or the list of nucleotides making up a particular nucleic acid, the much larger databases from Swiss-Prot and the International Nucleotide Sequence Database Collaboration should be used.

Statistics

As of 17 July 2007, the "PDB Holdings List" at RCSB reported the following statistics: Note that theoretical models are no longer accepted in the PDB. 22,461 structures in the PDB have a structure factor file. 3,138 structures in the PDB have an NMR restraint file. The current breakdown of holdings is updated weekly.

FILE FORMAT

Over the years the PDB file format has undergone many changes and revisions. Its original format was dictated by the width of computer punch cards.
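The punch-card origin means that a legacy PDB record is a fixed-width line in which each field sits in set columns rather than being separated by delimiters. The following is a minimal sketch of reading one ATOM record by column position. The column ranges are taken from the published legacy format description rather than from this article, and the names `AtomRecord`, `parse_atom_line` and `SAMPLE` are hypothetical, chosen only for illustration.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    """A few fields of one legacy-format ATOM/HETATM line (illustrative subset)."""
    serial: int
    name: str
    res_name: str
    chain_id: str
    res_seq: int
    x: float
    y: float
    z: float


def parse_atom_line(line: str) -> AtomRecord:
    """Slice a fixed-width ATOM/HETATM line by column position.

    Columns below are 1-based in the format description; Python slices
    are 0-based and end-exclusive, so columns 31-38 become line[30:38].
    """
    if not line.startswith(("ATOM  ", "HETATM")):
        raise ValueError("not an ATOM/HETATM record")
    return AtomRecord(
        serial=int(line[6:11]),        # columns 7-11
        name=line[12:16].strip(),      # columns 13-16
        res_name=line[17:20].strip(),  # columns 18-20
        chain_id=line[21],             # column 22
        res_seq=int(line[22:26]),      # columns 23-26
        x=float(line[30:38]),          # columns 31-38
        y=float(line[38:46]),          # columns 39-46
        z=float(line[46:54]),          # columns 47-54
    )


# A made-up sample line, assembled field by field so the widths are visible.
SAMPLE = (
    "ATOM  "      # record name, columns 1-6
    + "    1"     # serial, columns 7-11
    + " "         # column 12 (unused)
    + " N  "      # atom name, columns 13-16
    + " "         # alternate location, column 17
    + "MET"       # residue name, columns 18-20
    + " "         # column 21 (unused)
    + "A"         # chain, column 22
    + "   1"      # residue number, columns 23-26
    + "    "      # insertion code and padding, columns 27-30
    + "  27.340"  # x, columns 31-38
    + "  24.430"  # y, columns 39-46
    + "   2.614"  # z, columns 47-54
)

record = parse_atom_line(SAMPLE)
```

Slicing by column rather than splitting on whitespace is deliberate: in fixed-width records adjacent fields can run together with no space between them, so whitespace splitting is not reliable.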
This legacy has caused many problems with the format, and consequently there are 'clean-up' projects; the MMDB, for example, uses ASN.1 (and an XML conversion of this format). The wwPDB members RCSB PDB, MSD-EBI, and PDBj are working together to make the data uniform across the archive. Some believe this to be desirable; others argue that, without a universal repository of information (i.e., a common dictionary), it is not possible to draw comparisons.

Each structure published in the PDB receives a four-character alphanumeric identifier, its PDB ID. This should not be used as an identifier for biomolecules, since several structures of the same molecule (in different environments or conformations) are often contained in the PDB under different PDB IDs.

When a biologist submits structure data for a protein or nucleic acid, wwPDB staff review and annotate the entry. The data are then automatically checked for plausibility. The source code for this validation software has been released for free. The main database accepts only experimentally derived structures, not theoretically predicted ones (see protein structure prediction). Various funding agencies and scientific journals now require scientists to submit their structure data to the PDB.

VIEWING THE DATA

The structural data can be used to visualize the biomolecules with appropriate software, such as VMD, RasMol, PyMOL, Jmol, MDL Chime, QuteMol, a web browser VRML plugin, or web-based software designed to visualize and analyse protein structures, such as STING. A recent desktop software addition is Sirius. The RCSB PDB website also contains resources for education, structural genomics, and related software.
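Before handing an identifier to a viewer or a download script, it can help to check that it has the shape of a PDB ID, which the text describes as four alphanumeric characters. The sketch below enforces only that stated rule; any stricter convention is deliberately left out as an assumption not made here, and the function names `is_pdb_id` and `normalize_pdb_id` are hypothetical.

```python
import re

# Assumption: only the rule given in the text is checked, i.e. exactly
# four ASCII letters or digits. This says nothing about whether an entry
# with that ID actually exists in the archive.
_PDB_ID_PATTERN = re.compile(r"[0-9A-Za-z]{4}")


def is_pdb_id(text: str) -> bool:
    """Return True if text is exactly four ASCII alphanumeric characters."""
    return _PDB_ID_PATTERN.fullmatch(text) is not None


def normalize_pdb_id(text: str) -> str:
    """Strip surrounding whitespace, validate, and return the ID in upper case."""
    candidate = text.strip()
    if not is_pdb_id(candidate):
        raise ValueError(f"not a four-character alphanumeric ID: {text!r}")
    return candidate.upper()
```

Because several PDB IDs can refer to the same molecule, a check like this validates the identifier of one deposited structure only; it cannot be used to decide whether two entries describe the same biomolecule.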