Error-Driven Adaptive Language Modeling for Chinese Pinyin-to-Character Conversion

Jin Huang, David Powers

    Research output: Contribution to conference › Paper › peer-review

    1 Citation (Scopus)

    Abstract

    The performance of Chinese Pinyin-to-Character conversion degrades severely when the characteristics of the training data and the conversion data differ. Because natural language is highly variable and uncertain, it is impossible to build a complete, general language model that suits all tasks. Traditional adaptive maximum a posteriori (MAP) models mix task-independent data with task-dependent data using a mixture coefficient, but we can never predict what style of language users will have or what new domains will appear. This paper presents a statistical error-driven adaptive language modeling approach for Chinese Pinyin input systems. The model can be adapted incrementally whenever an error occurs during Pinyin-to-Character conversion. It significantly improves the Pinyin-to-Character conversion rate.
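The general idea in the abstract can be illustrated with a toy sketch. This is not the authors' model: the class name `AdaptiveBigramConverter`, the tiny lexicon, the probabilities, and the value of `lam` are all hypothetical, and plain linear interpolation stands in for the mixture step. It shows a fixed task-independent bigram model mixed with an adaptive component whose counts are updated only when a conversion turns out to be wrong.

```python
from collections import defaultdict


class AdaptiveBigramConverter:
    """Toy Pinyin-to-character converter with an error-driven adaptive part.

    Illustrative only: names, data, and the interpolation scheme are
    assumptions, not taken from the paper.
    """

    def __init__(self, lexicon, base_bigrams, lam=0.5):
        self.lexicon = lexicon        # pinyin syllable -> candidate characters
        self.base = base_bigrams      # (prev_char, char) -> fixed probability
        self.lam = lam                # mixture coefficient (hypothetical value)
        self.adapt_counts = defaultdict(lambda: defaultdict(int))

    def _adapt_prob(self, prev, char):
        # Relative frequency among bigrams learned from user corrections.
        row = self.adapt_counts.get(prev, {})
        total = sum(row.values())
        return row.get(char, 0) / total if total else 0.0

    def prob(self, prev, char):
        # Linear interpolation of the fixed and the adaptive model.
        base = self.base.get((prev, char), 1e-6)
        return (1 - self.lam) * base + self.lam * self._adapt_prob(prev, char)

    def convert(self, syllables):
        # Viterbi search: keep the best-scoring path ending in each character.
        paths = {"<s>": (1.0, [])}
        for syl in syllables:
            new_paths = {}
            for char in self.lexicon[syl]:
                score, seq = max(
                    ((s * self.prob(prev, char), q) for prev, (s, q) in paths.items()),
                    key=lambda item: item[0],
                )
                new_paths[char] = (score, seq + [char])
            paths = new_paths
        return "".join(max(paths.values(), key=lambda item: item[0])[1])

    def correct(self, syllables, truth):
        # Error-driven update: adapt only if the conversion was wrong.
        if self.convert(syllables) == truth:
            return False
        prev = "<s>"
        for char in truth:
            self.adapt_counts[prev][char] += 1
            prev = char
        return True


lexicon = {"ta": ["他", "她"], "shi": ["是", "事"]}
base = {
    ("<s>", "他"): 0.6, ("<s>", "她"): 0.4,
    ("他", "是"): 0.7, ("他", "事"): 0.3,
    ("她", "是"): 0.7, ("她", "事"): 0.3,
}
model = AdaptiveBigramConverter(lexicon, base)
before = model.convert(["ta", "shi"])   # the base model prefers 他是
model.correct(["ta", "shi"], "她是")     # the user fixes the output once
after = model.convert(["ta", "shi"])    # the adapted model now yields 她是
```

Because counts change only on errors, a user whose text already matches the base model never perturbs it, while a single correction is enough to shift this toy example toward the user's preferred form.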

    Original language: English
    Pages: 19-22
    Number of pages: 4
    Publication status: Published - 1 Dec 2011
    Event: International Conference on Asian Language Processing
    Duration: 15 Nov 2011 → …

    Keywords

    • Adaptive Learning
    • Chinese Language Processing
    • Pinyin-to-Character Conversion
    • Statistical Language Modeling
