Abstract
Anomalies such as redundant, inconsistent, contradictory, and deficient values are unavoidable in knowledge graphs, because such graphs are often curated manually or extracted using machine learning and natural language processing techniques. Anomaly detection is therefore an essential task that contributes to the quality of a knowledge graph. Although approaches to detecting anomalies in knowledge graphs exist, they are either domain dependent, not scalable to large graphs, or require substantial human intervention. In this preliminary research paper, we propose a novel unsupervised feature-based approach to anomaly detection in knowledge graphs. We first characterize triples in a directed edge-labelled knowledge graph using a set of binary features, and then use a one-class Support Vector Machine (SVM) to classify these triples as normal or abnormal. After selecting the features that have the highest consistency with the SVM outcomes, we provide a visualization of the identified anomalies together with the list of anomalous triples, thus helping non-technical domain experts to understand the anomalies present in a knowledge graph. We evaluate our approach on four knowledge graphs: YAGO-1, KBpedia, Wikidata, and DSKG. This evaluation demonstrates that our approach is well suited to identifying anomalies in knowledge graphs in an unsupervised manner, independent of the domain of the knowledge graph being evaluated.
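The first step described in the abstract, characterizing each triple with a set of binary features, can be illustrated with a minimal sketch. The four features below (self-loop, duplicate triple, rare predicate, empty literal), the `binary_features` helper, and the `ex:` example triples are hypothetical choices made for illustration only; they are not the feature library used in the paper.

```python
from collections import Counter
from typing import List, Tuple

# A triple is (subject, predicate, object); literals are written in double quotes.
Triple = Tuple[str, str, str]


def binary_features(triples: List[Triple]) -> List[List[int]]:
    """Map every triple to a 0/1 vector of illustrative (assumed) features."""
    triple_counts = Counter(triples)
    predicate_counts = Counter(p for _, p, _ in triples)
    rows = []
    for s, p, o in triples:
        rows.append([
            int(s == o),                        # subject and object are identical
            int(triple_counts[(s, p, o)] > 1),  # triple occurs more than once
            int(predicate_counts[p] == 1),      # predicate is used only once
            int(o.strip('"') == ""),            # object is an empty literal
        ])
    return rows


# Hypothetical toy graph, not taken from any of the evaluated datasets.
triples = [
    ("ex:Alice", "ex:knows", "ex:Bob"),
    ("ex:Bob", "ex:knows", "ex:Carol"),
    ("ex:Carol", "ex:knows", "ex:Carol"),  # self-loop
    ("ex:Alice", "ex:knows", "ex:Bob"),    # duplicate of the first triple
    ("ex:Bob", "ex:nickname", '""'),       # rare predicate, empty literal
]

features = binary_features(triples)
for triple, row in zip(triples, features):
    print(triple, row)
```

Vectors of this kind could then be passed to a one-class classifier. One readily available option is scikit-learn's `OneClassSVM`, whose `fit_predict` method labels each row as +1 (inlier) or -1 (outlier). Whether the paper used that particular implementation is not stated in the abstract.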
Original language | English |
---|---|
Title of host publication | Proceedings of the 10th International Joint Conference on Knowledge Graphs |
Subtitle of host publication | IJCKG 2021 |
Editors | Oscar Corcho, Thepchai Supnithi, Xiaoyan Zhu, Aidan Hogan, Thanaruk Theeramunkong, Haofen Wang |
Place of Publication | New York |
Publisher | Association for Computing Machinery |
Pages | 161-165 |
Number of pages | 5 |
ISBN (Electronic) | 978-1-4503-9565-6 |
Publication status | Published - 24 Jan 2022 |
Externally published | Yes |
Event | 10th International Joint Conference on Knowledge Graphs - Virtual, Thailand (Duration: 6 Dec 2021 → 8 Dec 2021) |
Conference
Conference | 10th International Joint Conference on Knowledge Graphs |
---|---|
Abbreviated title | IJCKG 2021 |
Country/Territory | Thailand |
City | Virtual |
Period | 6 Dec 2021 → 8 Dec 2021 |
Keywords
- Binary feature library
- Data quality assessment
- Edge-labelled graphs
- One-class classifier
- Visualization