Posted on 2023-01-19, 11:13. Authored by Trieu Minh Nhut Le
Submission note: A thesis submitted in total fulfilment of the requirements for the degree of Doctor of Philosophy to the School of Engineering and Mathematical Sciences, Faculty of Science, Technology and Engineering, La Trobe University, Bundoora.
This thesis focuses on answering probabilistic top-k and skyline queries on probabilistic data under the possible worlds semantics. These are two of the most important query types for decision support systems. Almost all existing methods for answering queries on probabilistic data require the user to set a probability threshold. Choosing that threshold is difficult: if it is set too high, important results may be lost; if it is set too low, many low-quality results may be returned. This thesis proposes novel approaches for answering probabilistic top-k and skyline queries that use the dominance principle as a natural and effective way to select query results, returning an acceptable number of answers and capturing all important answers without requiring a threshold. Answering both kinds of query poses three challenges. The first is to define novel probabilistic top-k and skyline queries that use the dominance principle to return only the most interesting results. The second is to develop formulas based on probability theory that calculate the probabilities of the results directly, without enumerating possible worlds, and to develop algorithms that prune the search space effectively. The third is to ensure that all the semantic properties of probabilistic queries are covered. The performance evaluations show that, firstly, the query results are reasonable in size and capture all the important answers. Secondly, the proposed algorithms outperform current algorithms by pruning the search space faster, thereby reducing execution time. Lastly, all the semantic properties of probabilistic queries are covered.
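To make the idea of computing result probabilities without enumerating possible worlds concrete, the following is a minimal illustrative sketch and not the thesis's own formulas or algorithms. It assumes a simple tuple-level uncertainty model in which each tuple exists independently with a given probability and smaller attribute values are preferred. The data set `TUPLES` and the function names are hypothetical. Under those assumptions, a tuple's skyline probability is its own existence probability multiplied by the probability that none of its dominators exist, and a brute-force enumeration of all possible worlds gives the same value.

```python
from itertools import product
from math import prod

# Hypothetical toy data: (name, attribute vector, existence probability).
# Assumption: tuples exist independently and smaller attribute values are better.
TUPLES = [
    ("a", (1, 4), 0.6),
    ("b", (2, 2), 0.5),
    ("c", (3, 3), 0.8),
    ("d", (4, 1), 0.9),
]


def dominates(x, y):
    """x dominates y if x is no worse in every dimension and strictly better in one."""
    return all(xi <= yi for xi, yi in zip(x, y)) and any(xi < yi for xi, yi in zip(x, y))


def skyline_prob_worlds(tuples):
    """Brute force: add up the probability of every possible world
    in which a tuple is present and not dominated."""
    probs = {name: 0.0 for name, _, _ in tuples}
    for present in product([False, True], repeat=len(tuples)):
        # Probability of this world under the independence assumption.
        world_p = prod(p if here else 1 - p for (_, _, p), here in zip(tuples, present))
        world = [t for t, here in zip(tuples, present) if here]
        for name, attrs, _ in world:
            if not any(dominates(other, attrs) for _, other, _ in world):
                probs[name] += world_p
    return probs


def skyline_prob_direct(tuples):
    """Closed form: the tuple exists and none of its dominators exist."""
    return {
        name: p * prod(1 - q for _, other, q in tuples if dominates(other, attrs))
        for name, attrs, p in tuples
    }


if __name__ == "__main__":
    print(skyline_prob_worlds(TUPLES))
    print(skyline_prob_direct(TUPLES))
```

In this toy data only tuple "c" has a dominator ("b"), so both functions give it a skyline probability of approximately 0.8 × (1 − 0.5) = 0.4. The brute-force version visits 2^n worlds, while the direct version needs only pairwise dominance checks, which is the kind of saving the abstract refers to.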
History
Center or Department
Faculty of Science, Technology and Engineering. School of Engineering and Mathematical Sciences.
Thesis type
Ph. D.
Awarding institution
La Trobe University
Year Awarded
2014
Rights Statement
The thesis author retains all proprietary rights (such as copyright and patent rights) over the content of this thesis, and has granted La Trobe University permission to reproduce and communicate this version of the thesis. The author has declared that any third party copyright material contained within the thesis made available here is reproduced and communicated with permission. If you believe that any material has been made available without permission of the copyright owner please contact us with the details.