PRO - PRotein Ontology

A number of ontologies describe properties that can be attributed to proteins; for example, protein functions are described by the Gene Ontology (GO), while human diseases are described by the Disease Ontology. What the current set of ontologies lacks, however, is one that describes the protein entities themselves and the relationships among them. We designed the PRotein Ontology (PRO) to fill this gap, to facilitate protein annotation, and to guide new experiments. The components of PRO extend from the classification of proteins on the basis of evolutionary relationships to the representation of the multiple protein forms of a gene (products generated by genetic variation, alternative splicing, proteolytic cleavage, and other post-translational modifications). PRO will allow the specification of relationships between PRO, GO, and other OBO Foundry ontologies (Figure 1). For more detailed information, please see Natale et al., 2007.

Figure 1: PRO protein ontology overview. The figure shows the current (partial) working model and a subset of the possible connections to other ontologies.

Collaborators
PRO publications
PRO brochure
Downloads (now PRO 1.0!)
PRO Wiki

Here we describe the initial development of PRO, illustrated using human and mouse proteins from the TGF-beta signaling pathway. This development consists of three phases:

  • Phase I: Creation of nodes and extraction of data for human and mouse proteins from the PIRSF and UniProtKB databases. We have developed a parser that transforms information from these sources into nodes and relationships in OBO format. The parser was designed to capture experimentally verified biological entities, ignoring any annotation labeled as “by similarity,” “potential,” or “probable.” The parser considers three kinds of entities: isoforms, variants, and cleavage and modification products. Cross-references to other ontologies or knowledgebases (e.g., HUGO, GO, OMIM, RESID) are also extracted.
  • Phase II: Determination of the known orthologous forms between human and mouse, and assignment of the existing annotation to the correct protein form by manual curation. If the UniProtKB record lacks information on protein modification, cleavage, or isoforms, the terms are assigned to the reference sequence form.
  • Phase III: Thorough manual annotation of the records, with assignment of new entities if they are described in the literature (see PRO brochure).
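
The Phase I filtering and conversion step can be illustrated with a minimal sketch. This is not the PRO parser itself: the input records (plain dictionaries with "kind", "name", "note", and "xrefs" keys), the "EX:" identifier prefix, the placeholder cross-reference, and all function names are assumptions made for illustration. Only the three excluded qualifiers, the three entity kinds, and the standard OBO stanza tags ([Term], id, name, is_a, xref) come from the description above and from the OBO flat-file format.

```python
# Qualifiers that mark an annotation as not experimentally verified
# (the three labels named in the Phase I description).
NON_EXPERIMENTAL = ("by similarity", "potential", "probable")

# The three kinds of entities the parser considers (keys are hypothetical).
ENTITY_KINDS = {"isoform", "variant", "cleavage_or_modification"}


def is_experimental(note):
    """Return True if the annotation note carries none of the excluded qualifiers."""
    lowered = note.lower()
    return not any(qualifier in lowered for qualifier in NON_EXPERIMENTAL)


def to_obo_term(term_id, name, parent_id, xrefs=()):
    """Format one entity as an OBO [Term] stanza linked to its parent by is_a."""
    lines = ["[Term]", f"id: {term_id}", f"name: {name}", f"is_a: {parent_id}"]
    lines.extend(f"xref: {xref}" for xref in xrefs)
    return "\n".join(lines)


def build_stanzas(parent_id, features):
    """Keep experimentally supported entities and emit one stanza for each."""
    stanzas = []
    counter = 0
    for feature in features:
        if feature["kind"] not in ENTITY_KINDS:
            continue
        if not is_experimental(feature["note"]):
            continue
        counter += 1
        term_id = f"EX:{counter:07d}"  # hypothetical identifier scheme
        stanzas.append(
            to_obo_term(term_id, feature["name"], parent_id, feature.get("xrefs", ()))
        )
    return stanzas


# Hypothetical input records; real source entries are structured differently.
features = [
    {"kind": "isoform", "name": "example protein isoform 2", "note": "", "xrefs": []},
    {"kind": "cleavage_or_modification", "name": "example protein, phosphorylated form",
     "note": "Phosphoserine (By similarity)", "xrefs": []},
    {"kind": "variant", "name": "example protein variant",
     "note": "found in a disease", "xrefs": ["OMIM:000000"]},  # placeholder xref
]

stanzas = build_stanzas("EX:0000000", features)
print("\n\n".join(stanzas))
```

In this sketch the phosphorylated form is dropped because its note carries the “by similarity” qualifier, while the isoform and the variant each become a [Term] stanza attached to the parent node. A simple substring check is used for brevity; a production parser would more likely read the qualifier from a dedicated field.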

    PIRSF Hierarchy in DAG format
    Describes the relationship between UniProtKB proteins, curated PIRSF families and Pfam domain superfamilies. It is the foundation for the PRO evolutionary component.




     Copyright © 2005 - 2006 Protein Information Resource,  Georgetown University Medical Center

    3300 Whitehaven Street, NW, Suite 1200, Washington, DC 20007, USA