MitoProteome Database


Introduction

MitoProteome is an object-relational database of human mitochondrial protein sequences, built from a comprehensive curation of public databases and from direct experimental evidence. It contains both mitochondrial- and nuclear-encoded protein sequences. The initial release (2004) contained 847 human mitochondrial proteins, 615 of which had been identified by LC/MS studies of human heart mitochondria (ref 1). The most recent release (2009) was extensively revised as follows:

(a): The original list of 847 protein records in the MitoProteome database was examined to find the corresponding Entrez Gene IDs from their GenBank, RefSeq and UniProt entries. Of these, 163 had no corresponding Entrez Gene ID, mainly because these public databases have been extensively updated in the last 6 years and many sequences have been retired. These 163 sequences were searched by BLAST against the Oct. 2008 version of GenBank NR to identify identical sequences with valid Entrez Gene IDs. Where no identical sequence was found, the closest sequence was accepted as a valid replacement if it contained all the peptides identified by LC/MS/MS (ref 1).
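The replacement rule for non-identical matches can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the curators' actual code: the function name `accept_replacement` and the toy sequences are hypothetical, and "contains" is assumed to mean an exact substring match.

```python
def accept_replacement(candidate_seq: str, ms_peptides: list[str]) -> bool:
    """Return True if every LC/MS/MS peptide occurs in the candidate sequence.

    Assumption: "contains" means an exact, case-insensitive substring match.
    """
    # With no peptide evidence there is nothing to confirm, so reject
    # rather than accept vacuously.
    if not ms_peptides:
        return False
    candidate = candidate_seq.upper()
    return all(peptide.upper() in candidate for peptide in ms_peptides)


# Toy data (hypothetical, not taken from the database).
closest_hit = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
all_found = accept_replacement(closest_hit, ["AYIAKQR", "LGLIEVQ"])
one_missing = accept_replacement(closest_hit, ["AYIAKQR", "WWWW"])
print(all_found, one_missing)
```

Rejecting an empty peptide list is a design choice of this sketch; the source does not say how such a case was handled.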

(b): The entire Oct. 2008 versions of the human RefSeq and UniProt databases were searched for mitochondria-related protein sequences, based on their gene descriptions, Gene Ontology categories and KEGG annotations. Entrez Gene IDs for the 13 mitochondrially encoded proteins were added to this list.

Sequences from (a) and (b) were merged and the following selection criteria were applied:

  • The sequence must have a current Entrez Gene ID.
  • The sequence's UniProt subcellular annotation must be listed as "mitochondrial"
    OR
    It must belong to a well-known family of mitochondrial proteins (OXPHOS complexes, TCA cycle enzymes, etc.).
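The two criteria above combine as "Gene ID AND (UniProt annotation OR known family)". A minimal Python sketch of that logic follows. The `Candidate` record, its field names, the family labels and the gene ID values are all hypothetical, and matching the annotation on the substring "mitochondri" is an assumption about how the UniProt text would be tested.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical family labels; the source names OXPHOS complexes and TCA cycle
# enzymes only as examples of well-known mitochondrial protein families.
KNOWN_MITO_FAMILIES = {"OXPHOS complex", "TCA cycle enzyme"}


@dataclass
class Candidate:
    entrez_gene_id: Optional[int]   # None when no current Entrez Gene ID exists
    uniprot_location: str           # UniProt subcellular annotation text
    family: Optional[str] = None    # protein family label, if any


def passes_selection(c: Candidate) -> bool:
    """Apply the criteria: current Gene ID AND (mitochondrial OR known family)."""
    has_gene_id = c.entrez_gene_id is not None
    # Assumption: a substring test stands in for the real annotation check.
    is_mito = "mitochondri" in c.uniprot_location.lower()
    in_family = c.family in KNOWN_MITO_FAMILIES
    return has_gene_id and (is_mito or in_family)


# Made-up records for illustration only.
annotated = Candidate(101, "Mitochondrial matrix")
by_family = Candidate(102, "Cytoplasm", family="TCA cycle enzyme")
retired = Candidate(None, "Mitochondrial inner membrane")
print(passes_selection(annotated), passes_selection(by_family), passes_selection(retired))
```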

A total of 780 protein sequences passed these selection criteria, were assigned a status of "current", and are listed in this database release. An additional 175 protein sequences that had been identified by LC/MS studies (ref 1) but did not have a UniProt "mitochondrial" subcellular location annotation were assigned a status of "Msdoubtful". Finally, a group of 317 protein sequences selected on the basis of mitochondrial keywords in their GO/KEGG annotations was assigned a status of "doubtful".
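The three status labels can be read as a simple decision cascade. The sketch below is one plausible reading, not the documented procedure: the function `assign_status`, its boolean inputs and the order of the checks are assumptions, and the toy records are invented. It also assumes that any candidate failing the criteria and lacking LC/MS evidence entered the list through GO/KEGG keywords.

```python
from collections import Counter


def assign_status(passes_criteria: bool, ms_identified: bool) -> str:
    """Map a candidate to one of the three status labels used in the release.

    Assumed precedence: "current" first, then "Msdoubtful", then "doubtful".
    """
    if passes_criteria:
        return "current"
    if ms_identified:
        # Seen by LC/MS but lacking a UniProt "mitochondrial" annotation.
        return "Msdoubtful"
    # Assumption: what remains was picked up by GO/KEGG keywords alone.
    return "doubtful"


# Invented (passes_criteria, ms_identified) pairs for illustration.
toy_records = [(True, True), (True, False), (False, True), (False, False)]
tally = Counter(assign_status(p, m) for p, m in toy_records)
print(dict(tally))
```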

Each protein is extensively annotated with data extracted from external databases, including:

  • Gene data extracted from Entrez
  • Protein metabolic pathway information extracted from KEGG
  • Disease data extracted from OMIM
  • Interaction data extracted from MINT and DIP
  • Protein family data extracted from Pfam
  • Domain and motif data extracted from InterPro
  • Fingerprint data extracted from PRINTS

Literature References

  1. Steven W. Taylor, Eoin Fahy, Bing Zhang, Gary M. Glenn, Dale E. Warnock, Sandra Wiley, Anne N. Murphy, Sara P. Gaucher, Roderick A. Capaldi, Bradford W. Gibson, and Soumitra S. Ghosh. "Characterization of the human heart mitochondrial proteome." Nature Biotechnology, March 2003, Volume 21, Number 3, pp. 281-286.
  2. Dawn Cotter, Purnima Guda, Eoin Fahy, and Shankar Subramaniam. "MitoProteome: Mitochondrial Protein Sequence Database and Annotation System." Nucleic Acids Research, 2004, 32: D463-D467.

Database References

http://www.ncbi.nlm.nih.gov/Entrez/
http://www.ncbi.nlm.nih.gov/RefSeq/
http://www.uniprot.org


webmaster_at_mitoproteome.org