Ranking queries in uncertain databases

Nguyen, Huynh Thanh Ha

doi:10.26181/21855267.v1

thesis

posted on 2023-01-19, 09:24 authored by Huynh Thanh Ha Nguyen

Submission note: A thesis submitted in total fulfilment of the requirements for the degree of Doctor of Philosophy to the School of Engineering and Mathematical Sciences, College of Science, Health and Engineering, La Trobe University, Bundoora.

In recent years, uncertain (imprecise) data has emerged in many real-world application domains, including sensor networks, moving object tracking, data integration, data cleaning, information extraction and others. A research area has consequently emerged to develop advanced techniques for efficiently managing and exploring such uncertain databases. Ranking queries (also known as top-k queries) are among the most important analytics techniques and are widely used in data exploration, data analytics and decision-making scenarios. Compared with ranking in traditional databases, ranking uncertain data to provide meaningful answers is more difficult because the interplay between scores and probabilities complicates the semantics of the queries. The problem is even more challenging when multiple conflicting ranking criteria are involved. The main contribution of this thesis is the development of several novel ranking approaches that effectively and efficiently retrieve the truly interesting top-k results from uncertain data on multidimensional and partially ordered domains. First, we define a novel approach, called the Dominating Top-k Aggregate query, which overcomes the weaknesses of several existing ranking approaches and provides trustworthy and useful knowledge from uncertain Big Data to support data analytics and decision making. We guarantee the reliability of our ranking results by demonstrating that our method satisfies six fundamental ranking properties constrained by data correlations. Second, as the top-k representative skyline query is another important method for multi-criteria decision-making applications, we are the first to study this query in the context of uncertain data. We handle both discrete and continuous cases. We also personalize the query by employing user preferences regarding the priority of individual ranking criteria.
Finally, we study ranking queries under the extended uncertain data model where the attribute values of data objects are expressed as continuous ranges.

History

Center or Department

College of Science, Health and Engineering. School of Engineering and Mathematical Sciences. Department of Computer Science and Computer Engineering.

Thesis type

Ph. D.

Awarding institution

La Trobe University

Year Awarded

2017

Rights Statement

This thesis contains third party copyright material which has been reproduced here with permission. Any further use requires permission of the copyright owner. The thesis author retains all proprietary rights (such as copyright and patent rights) over all other content of this thesis, and has granted La Trobe University permission to reproduce and communicate this version of the thesis. The author has declared that any third party copyright material contained within the thesis made available here is reproduced and communicated with permission. If you believe that any material has been made available without permission of the copyright owner please contact us with the details.

Data source

arrow migration 2023-01-10 00:15. Ref: latrobe:42476 (9e0739)

Keywords

thesis

Licence

In Copyright
