TY - GEN
T1 - SEKA
T2 - 2023 World Wide Web Conference, WWW 2023
AU - Senaratne, Asara
PY - 2023
Y1 - 2023
AB - Knowledge Graphs (KGs) form the backbone of many knowledge-dependent applications such as search engines and digital personal assistants. KGs are generally constructed either manually or automatically, using a variety of extraction techniques applied over multiple data sources. Because these data sources vary in quality, anomalies are likely to be introduced into any KG. Hence, it is unrealistic to expect a perfect archive of knowledge. Given how large KGs can be, manual validation is impractical, necessitating an automated approach for anomaly detection in KGs. To improve KG quality, and to identify interesting and abnormal triples (edges) and entities (nodes) that are worth investigating, we introduce SEKA, a novel unsupervised approach to detect anomalous triples and entities in a KG using both the structural characteristics and the content of the edges and nodes of the graph. While an anomaly can be an interesting or unusual discovery, such as a fraudulent transaction requiring human intervention, anomaly detection can also identify potential errors. We propose a novel approach, named the Corroborative Path Algorithm, to generate a matrix of semantic features, which we then use to train a one-class Support Vector Machine to identify abnormal triples and entities with no dependency on external sources. We evaluate our approach on four real-world KGs, demonstrating the ability of SEKA to detect anomalies and to outperform comparative baselines.
KW - Knowledge graph quality enhancement
KW - one-class classifier
KW - semantic features
KW - unsupervised anomaly detection
UR - http://www.scopus.com/inward/record.url?scp=85159631063&partnerID=8YFLogxK
U2 - 10.1145/3543873.3587536
DO - 10.1145/3543873.3587536
M3 - Conference contribution
AN - SCOPUS:85159631063
T3 - ACM Web Conference 2023 - Companion of the World Wide Web Conference, WWW 2023
SP - 568
EP - 572
BT - ACM Web Conference 2023 - Companion of the World Wide Web Conference, WWW 2023
PB - Association for Computing Machinery, Inc
Y2 - 30 April 2023 through 4 May 2023
ER -