Unsupervised Identification of Abnormal Nodes and Edges in Graphs

Asara Senaratne, Peter Christen, Graham Williams, Pouya G Omran

Research output: Contribution to journal › Article › peer-review


Abstract

Much of today's data are represented as graphs, ranging from social networks to bibliographic citations. Nodes in such graphs correspond to records that generally represent entities, while edges represent relationships between these entities. Both nodes and edges in a graph can have attributes that characterize the entities and their relationships. Relationships are either explicitly known (like friends in a social network), or they are inferred using link prediction (such as inferring that two babies are siblings because they have the same mother). Any graph representing real-world data likely contains nodes and edges that are abnormal, and identifying these can be important for outlier detection in applications ranging from crime and fraud detection to viral marketing. We propose a novel approach to the unsupervised detection of abnormal nodes and edges in graphs. We first characterize nodes and edges using a set of features, and then employ a one-class classifier to identify abnormal nodes and edges. We extract patterns of features from these abnormal nodes and edges, and apply clustering to identify groups of patterns with similar characteristics. We finally visualize these abnormal patterns to show co-occurrences of features and relationships between those features that most influence the abnormality of nodes and edges. We evaluate our approach on datasets from diverse domains, including historical birth certificates, COVID patient records, e-mails, books, and movies. This evaluation demonstrates that our approach is well suited to identifying both abnormal nodes and edges in graphs in an unsupervised way, and that it can outperform several baseline anomaly detection techniques.
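
The first two steps described in the abstract (characterizing nodes with features, then flagging those that deviate from the bulk) can be illustrated with a minimal sketch. It is not the authors' method. The two features (degree and local clustering coefficient), the median/MAD deviation score that stands in for the one-class classifier, the threshold of 3.0, and all function names are assumptions chosen for illustration. Edge features, pattern extraction, clustering, and visualization are omitted.

```python
from statistics import median

def node_features(adj):
    """Return {node: (degree, local clustering coefficient)}.

    Assumes an undirected graph given as {node: set_of_neighbours}
    with mutually comparable node ids (here: integers).
    """
    feats = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        # Count edges that connect two neighbours of this node.
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        cc = 2 * links / (k * (k - 1)) if k > 1 else 0.0
        feats[node] = (float(k), cc)
    return feats

def robust_scores(feats):
    """Score each node by its largest per-feature deviation from the median.

    Deviations are measured in units of the median absolute deviation (MAD).
    This simple rule is a stand-in for a one-class classifier.
    """
    nodes = list(feats)
    dims = len(next(iter(feats.values())))
    scores = {n: 0.0 for n in nodes}
    for d in range(dims):
        col = [feats[n][d] for n in nodes]
        med = median(col)
        # Crude fallback: use 1.0 when the MAD is zero to avoid dividing by zero.
        mad = median(abs(x - med) for x in col) or 1.0
        for n in nodes:
            scores[n] = max(scores[n], abs(feats[n][d] - med) / mad)
    return scores

def abnormal_nodes(adj, threshold=3.0):
    """Return the sorted nodes whose deviation score exceeds the threshold."""
    scores = robust_scores(node_features(adj))
    return sorted(n for n, s in scores.items() if s > threshold)

# Toy graph: a ring of ten nodes (0..9) plus a hub (node 10) linked to nodes 0..7.
adj = {i: set() for i in range(11)}
for i in range(10):
    j = (i + 1) % 10
    adj[i].add(j)
    adj[j].add(i)
for i in range(8):
    adj[i].add(10)
    adj[10].add(i)

print(abnormal_nodes(adj))  # [10]
```

In this toy graph only the hub is flagged, because its degree of 8 lies far from the median degree of 3. A closer analogue of the abstract's pipeline would train a one-class classifier on richer node and edge features and then cluster the feature patterns of the flagged items.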

Original language: English
Article number: 8
Number of pages: 37
Journal: Journal of Data and Information Quality
Volume: 15
Issue number: 1
Early online date: 28 Dec 2022
Publication status: Published - Mar 2023
Externally published: Yes

Keywords

  • abnormality detection
  • agglomerative clustering
  • feature generation
  • one-class classifier
  • outlier detection
  • support vector machine
