Knowledge discovery from protein-protein interaction networks
thesis
posted on 2023-01-18, 16:48 authored by Wei Zhu
Submission note: A thesis submitted in total fulfilment of the requirements for the degree of Doctor of Philosophy to the Department of Computer Science and Computer Engineering, School of Engineering and Mathematical Sciences, Faculty of Science, Technology and Engineering, La Trobe University, Bundoora.
Recent years have seen novel high-throughput technologies for protein-protein interaction (PPI) measurement create large-scale data on protein interactions in humans and most other species. These data are commonly represented as networks, with nodes representing proteins and edges representing the directed PPIs. A fundamental challenge for bioinformatics is interpreting this wealth of data to elucidate the interaction patterns and biological characteristics of the proteins. Current PPI algorithms are mainly based on the topological structure of PPI networks; that is, similarity or distance measurements between proteins are derived from their interaction patterns rather than from the functional semantics of the proteins. In reality, the interactions among proteins should be weighted according to their functional semantics. Establishing protein similarity or distance measures from functional semantics is therefore the key to discovering the biological patterns and characteristics of proteins more precisely and reasonably. The protein similarity measurement is developed in chapter 1 and further developed to completion in other chapters.
Another problem is that the functional semantics (i.e. functional annotations) of some proteins, and in some cases many proteins, in PPI networks are currently unknown. A model is needed to describe how protein functions propagate within the network structure. To address this problem, we developed five algorithms to semantically predict unknown protein functions, which have been shown to achieve higher efficiency on real datasets.
Currently there are several sources from which PPI networks are derived and numerous databases that store PPI data. PPI data have been observed to contain much negative information (PPIs that may not occur), which can distort the final analysis results or conclusions. How to validate the reliability of PPI data, or how to filter negative information from existing datasets, remains an open and generally unsolved problem in this field. It is nevertheless necessary to manage the dataset by some means, such as integrating multiple datasets or filtering a single dataset. In this thesis, detailed descriptions of data handling are given in all chapters, as this forms the basis of our experiments.
Finally, a review of computational developments regarding miRNA regulation is presented. Studies related to tumorigenesis are currently being explored heavily. In the future, we will mainly focus on basic studies of the correlation between miRNAs and general PPI networks, and on identifying miRNA-regulated signal transduction pathways related to tumorigenesis, in order to explore the internal relationships between miRNA-targeted proteins during their interactions. These studies are expected to contribute to cancer treatment and drug discovery.
History
Center or Department
Faculty of Science, Technology and Engineering. School of Engineering and Mathematical Sciences.
Thesis type
- Ph. D.