Graduate Thesis 2009

PIPE: A PROTEIN-PROTEIN INTERACTION PREDICTION ENGINE BASED ON THE RE-OCCURRING SHORT POLYPEPTIDE SEQUENCES BETWEEN KNOWN INTERACTING PROTEIN PAIRS

By
Sylvain Pitre

Fall 2009

A thesis submitted to the Faculty of Graduate Studies and Research
in partial fulfillment of the requirements for the degree of


Doctor of Philosophy

Ottawa-Carleton Institute for Computer Science
School of Computer Science
Carleton University


Supervisor: Frank Dehne

ABSTRACT

Identification of protein-protein interaction networks has received considerable attention in the post-genomic era. The currently available biochemical approaches used to detect protein-protein interactions (PPIs) are all time- and labor-intensive. Consequently, there is a growing need for computational tools capable of effectively identifying such interactions. In this thesis we describe the development and implementation of a novel Protein-Protein Interaction Prediction Engine termed PIPE. This tool is capable of predicting protein-protein interactions for any target protein pair of the yeast Saccharomyces cerevisiae from their primary structure, without the need for any additional information or predictions about the proteins. PIPE showed a sensitivity of 61% for detecting any yeast protein interaction, with 89% specificity and an overall accuracy of 75%. PIPE was used to identify novel interactions and a novel yeast complex confirmed by tandem affinity purification (TAP tag). We also report an improved version, PIPE2, which exhibits a specificity of approximately 99.95% and executes 16,000 times faster than the original method. Importantly, we report an all-to-all sequence-based computational screen of PPIs in yeast in which we identify 29,589 high-confidence interactions out of approximately 20 million possible pairs. Furthermore, a novel putative protein complex was discovered, composed largely of membrane proteins not amenable to TAP tagging. The third iteration of the PIPE algorithm (PIPE3) has been adapted and tested to predict interactions in several organisms, including H. sapiens, and interactions involving several viruses. A computational genome-wide scan of S. pombe revealed over 9,000 PPIs, triple the number available using traditional methods. The novel PPIs in S. pombe were used to identify two new complexes, which have been studied in more detail.
PIPE3 has also been shown to make cross-organism predictions, that is, to predict PPIs in one organism using PPI knowledge from one or more other organisms. These tests suggest that PIPE3 can be a good predictor for newly sequenced organisms or for those with very few known PPIs.
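
The sensitivity, specificity, and accuracy figures quoted above for the original PIPE are related by simple arithmetic. The following is a minimal sketch of that relationship, not code from the thesis. The function names are hypothetical, and the evaluation set is assumed to contain equal numbers of interacting and non-interacting pairs. That assumption is consistent with the reported figures but is not stated in the abstract.

```python
def rates_from_counts(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: interacting pairs predicted as interacting / non-interacting.
    tn/fp: non-interacting pairs predicted as non-interacting / interacting.
    """
    sensitivity = tp / (tp + fn)          # fraction of true interactions found
    specificity = tn / (tn + fp)          # fraction of non-interactions rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def accuracy_from_rates(sensitivity, specificity, positive_fraction=0.5):
    """Overall accuracy as a weighted mean of sensitivity and specificity.

    positive_fraction is the share of interacting pairs in the test set;
    0.5 (a balanced set) is an assumption, not a value from the abstract.
    """
    return (sensitivity * positive_fraction
            + specificity * (1.0 - positive_fraction))


if __name__ == "__main__":
    # Figures reported in the abstract for the original PIPE.
    acc = accuracy_from_rates(0.61, 0.89)
    print(f"accuracy on a balanced test set: {acc:.2f}")

    # Hypothetical counts chosen only to reproduce the same rates.
    print(rates_from_counts(tp=61, fn=39, tn=89, fp=11))
```

Under the balanced-set assumption, accuracy is the plain average of the two rates. With a different share of interacting pairs, the same sensitivity and specificity would give a different overall accuracy.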

THESIS DOWNLOAD

[ TH_phd_2009_pitre_0010.pdf ]