Automatic Clustering and Summarisation of Microblogs: A Multi-Subtopic Phrase Reinforcement Algorithm

Mahfouth Alghamdi, Haifeng Shen

    Research output: Contribution to conference › Paper › peer-review

    Abstract

    There has been phenomenal growth in microblogging-based social communication services and subscriptions in recent years. Through these services, users publish a large number of posts within a short period of time, making it extremely hard for readers to keep track of a trending topic. A solution to this issue is text summarisation, which can generate a short summary of a trending topic from multiple posts. Most existing summarisation algorithms were proposed for long documents and do not work well for short microblogging posts. The PR (Phrase Reinforcement) algorithm was designed specifically to summarise microblogs; however, it can only generate a single-post summary that conveys a single topic, potentially overlooking other important information in the posts. In this paper, we contribute the PRICE (Phrase Reinforcement: Iteration, Clustering and Extraction) algorithm, which extends the original PR algorithm with the ability to generate both multi-post and single-post summaries that span multiple subtopics. Experimental evaluation results show that the PRICE algorithm outperforms the original PR algorithm in terms of both ROUGE-1 and Content metrics.
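
    The abstract reports results in terms of ROUGE-1, which scores a generated summary by its unigram overlap with a reference summary. The sketch below is only an illustration of that metric and is not the evaluation code used in the paper. It assumes lowercased whitespace tokenisation and a single reference summary; the function name `rouge_1` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap ROUGE-1 between one candidate and one reference.

    Assumption: tokens are obtained by lowercasing and splitting on
    whitespace; real evaluations may stem words or drop stop words.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    recall = overlap / ref_total if ref_total else 0.0
    precision = overlap / cand_total if cand_total else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"recall": recall, "precision": precision, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("the cat sat", "the cat sat on the mat")
    print(scores)
```

    In this example the candidate shares three unigrams with a six-word reference, so recall is 0.5 and precision is 1.0. A short summary can therefore score high on precision while still missing half of the reference content.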

    Original language: English
    Pages: 86-98
    Number of pages: 13
    Publication status: Published - 1 Jan 2017
    Event: Australasian Conference on Artificial Life and Computational Intelligence
    Duration: 31 Jul 2017 → …

    Keywords

    • Microblogging
    • Phrase reinforcement
    • Text summarization
