Anomaly Detection in Graphs for Knowledge Discovery and Data Quality Enhancement

Research output: Other contribution

Abstract

Anomaly detection is the process of discovering unusual or rare patterns in data that are significantly different from the rest of the observations in a dataset. The task matters because discovering unique or unusual phenomena is central to science and industry, and it is also of significant importance to businesses and governments. Although errors and noise in data are frequently regarded as anomalies, an anomaly need not be erroneous, as abnormal data can unveil interesting facts, thus generating knowledge.

In this thesis, we first provide a holistic understanding of the background of anomaly detection, and then conduct an extensive literature review of the state-of-the-art techniques developed for graph-based anomaly detection to identify current research gaps. Next, we provide the foundation of our work by introducing an approach for unsupervised anomaly detection in tabular data. As existing work in this domain prioritizes error detection, our approach goes beyond mere error detection by performing inter-attribute comparisons to identify records that are abnormal but appear to be expected in nature. The aim of this research is to present domain experts with a set of rules and visualizations to describe anomalous records.

Due to extensive connections between real-world objects, graph anomaly detection has received increased interest in recent years. Hence, we next direct our research towards anomaly detection in attributed graphs. While anomaly detection in tabular data aids in identifying anomalous records, the analysis of inter-relationships among records is required to find pairs or sets of abnormal records, which would otherwise be seen as normal when considered in isolation. In this approach, we perform unsupervised detection of abnormal nodes and edges in a graph using their associated attributes, and then provide a visualization of anomalies to aid the decision-making process of non-technical domain experts.

Next, we extend our research towards anomaly detection in Knowledge Graphs (KGs). Using graph structural properties and semantic features, we identify abnormal facts and entities independent of external resources. In this work, we contribute towards quality enhancement of KGs, further assisting in downstream tasks such as Knowledge Graph Completion.

Finally, we propose a taxonomy of anomaly types in KGs. This taxonomy includes a range of anomalies in KGs together with examples, and we discuss possible approaches for correction. The purpose of this research is to promote existing work in the domain of KG quality enhancement. In each of our contributions, we experimentally evaluate how the approaches we propose outperform the respective baselines using real-world datasets and KGs.
Original language: English
Type: Thesis
Media of output: Online
Number of pages: 191
Publication status: Published - Jan 2024
Externally published: Yes

Keywords

  • anomaly detection
  • datasets
  • thesis
