Anomaly Detection and Classification in Knowledge Graphs

Asara Senaratne, Peter Christen, Pouya Omran, Graham Williams

Research output: Working paper/Preprint

Abstract

Anomalies such as redundant, inconsistent, contradictory, and deficient values are unavoidable in a Knowledge Graph (KG), because these graphs are often curated manually or extracted using machine learning and natural language processing techniques. Anomaly detection can therefore improve the quality of KGs. In this paper, we propose SEKA (Seeking Knowledge Graph Anomalies), an unsupervised approach for detecting abnormal triples and entities in KGs. SEKA can help improve the correctness of a KG whilst retaining its coverage. We propose the Corroborative Path Rank Algorithm (CPRA), an efficient adaptation of the Path Rank Algorithm (PRA) customized to detect anomalies in KGs. We also present TAXO (Taxonomy of anomaly types in KGs), a taxonomy of the anomaly types that can occur in a KG. This taxonomy provides a classification of the anomalies discovered by SEKA, together with an extensive discussion of possible data quality issues in a KG. We evaluate both approaches on four real-world KGs (YAGO1, KBpedia, Wikidata, and DSKG) to demonstrate the ability of SEKA and TAXO to outperform the baselines.
Original language: English
Publisher: arXiv
Pages: 1-37
Number of pages: 37
Publication status: Submitted - 6 Dec 2024
Externally published: Yes

Keywords

  • Outlier detection
  • data quality
  • taxonomy
  • corroborative path rank algorithm
  • triples
