Graduate Thesis 2009
PIPE: A PROTEIN-PROTEIN INTERACTION PREDICTION ENGINE BASED ON THE RE-OCCURRING SHORT POLYPEPTIDE SEQUENCES BETWEEN KNOWN INTERACTING PROTEIN PAIRS

By Sylvain Pitre

Fall 2009

A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements for the degree of Doctor of Philosophy
Ottawa-Carleton Institute for Computer Science
School of Computer Science
Carleton University

Supervisor: Frank Dehne

ABSTRACT

Identification of protein-protein interaction networks has received considerable attention in the post-genomic era. The biochemical approaches currently available for detecting protein-protein interactions (PPIs) are all time- and labor-intensive. Consequently, there is a growing need for computational tools capable of identifying such interactions effectively.
In this thesis we describe the development and implementation of a novel Protein-Protein Interaction Prediction Engine, termed PIPE. This tool can predict protein-protein interactions for any target protein pair of the yeast Saccharomyces cerevisiae from primary structure alone, without the need for any additional information or predictions about the proteins. PIPE showed a sensitivity of 61% for detecting any yeast protein interaction, with 89% specificity and an overall accuracy of 75%. PIPE was used to identify novel interactions and a novel yeast complex confirmed by tandem affinity purification (TAP tag).
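The three figures quoted above are related through the standard confusion-matrix definitions. The sketch below is illustrative only: the counts are hypothetical and assume a balanced test set (equal numbers of interacting and non-interacting pairs), which is one setting in which 61% sensitivity and 89% specificity yield 75% accuracy. The abstract does not state the actual test-set composition, and the function name `confusion_metrics` is an assumption made for this example.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # fraction of true interactions detected
    specificity = tn / (tn + fp)                 # fraction of non-interactions rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # fraction of all pairs classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical balanced test set: 100 interacting and 100 non-interacting pairs.
# These counts are assumptions chosen to match the percentages in the abstract.
sens, spec, acc = confusion_metrics(tp=61, fn=39, tn=89, fp=11)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

On a balanced set, accuracy is simply the mean of sensitivity and specificity. On an unbalanced set the same two rates would give a different accuracy.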
We also report an improved version, PIPE2, which exhibits a specificity of approximately 99.95% and executes 16,000 times faster than the original method. Importantly, we report an all-to-all sequence-based computational screen of PPIs in yeast, in which we identify 29,589 high-confidence interactions out of approximately 20 million possible pairs. Furthermore, a novel putative protein complex was discovered, composed largely of membrane proteins not amenable to TAP tagging.
The third iteration of the PIPE algorithm (PIPE3) has been adapted and tested to predict interactions in several organisms, including H. sapiens, and interactions involving several viruses. A computational genome-wide scan of S. pombe revealed over 9,000 PPIs, three times the number available from traditional methods. The novel PPIs in S. pombe were used to identify two new complexes, which have been studied in more detail. PIPE3 has also been shown to make cross-organism predictions, that is, to predict PPIs in one organism using PPI knowledge from one or more other organisms. These tests suggest that PIPE3 can be a good predictor for newly sequenced organisms or for those with very few known PPIs.
THESIS DOWNLOAD [ TH_phd_2009_pitre_0010.pdf ]