Markov Models for Written Language Identification and Verification

Research output: A Conference proceeding or a Chapter in Book › Conference contribution › peer-review

Abstract

The paper presents a Markov chain-based method for automatic written language identification. Given a training document in a specific language, each word is represented as a Markov chain of letters. Treating the entire training document as a set of such chains, the initial and transition probabilities are estimated; together they form the Markov model for that language. Given a string in an unknown language, the maximum likelihood decision rule is used to identify the language. Experimental results showed that the proposed method achieved a lower error rate and faster identification than the current n-gram method.
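The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (train_model, log_likelihood, identify), the lowercasing and letter filtering, the add-alpha smoothing for unseen letters, and the toy training strings are all assumptions introduced here, since the abstract does not specify them.

```python
import math
from collections import Counter, defaultdict


def words_as_chains(text):
    """Split text into words; each word becomes a list of lowercase letters."""
    chains = []
    for token in text.lower().split():
        letters = [ch for ch in token if ch.isalpha()]
        if letters:
            chains.append(letters)
    return chains


def train_model(text):
    """Count first letters and letter-to-letter transitions over all words."""
    initial = Counter()
    transitions = defaultdict(Counter)
    for letters in words_as_chains(text):
        initial[letters[0]] += 1
        for prev, curr in zip(letters, letters[1:]):
            transitions[prev][curr] += 1
    return initial, transitions


def log_likelihood(text, model, alphabet_size, alpha=1.0):
    """Log-probability of text under a model, with add-alpha smoothing (assumed)."""
    initial, transitions = model
    total_initial = sum(initial.values())
    score = 0.0
    for letters in words_as_chains(text):
        # Initial probability of the word's first letter.
        score += math.log(
            (initial[letters[0]] + alpha) / (total_initial + alpha * alphabet_size)
        )
        # Transition probabilities between consecutive letters.
        for prev, curr in zip(letters, letters[1:]):
            row = transitions.get(prev, Counter())
            score += math.log(
                (row[curr] + alpha) / (sum(row.values()) + alpha * alphabet_size)
            )
    return score


def identify(text, models, alphabet_size):
    """Maximum likelihood decision: pick the language whose model scores highest."""
    return max(models, key=lambda lang: log_likelihood(text, models[lang], alphabet_size))


# Toy usage with made-up training strings (far too small for real use).
training = {
    "english": "the quick brown fox jumps over the lazy dog and the cat",
    "spanish": "el rapido zorro marron salta sobre el perro perezoso y el gato",
}
models = {lang: train_model(text) for lang, text in training.items()}
# Shared alphabet size, plus one bucket for letters unseen in training.
alphabet_size = len({ch for t in training.values() for ch in t if ch.isalpha()}) + 1
print(identify("the dog", models, alphabet_size))
```

Working in log space avoids numerical underflow on long strings, and using one shared alphabet size for every model keeps the smoothed scores comparable across languages. Both are design choices of this sketch rather than details taken from the paper.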
Original language: English
Title of host publication: Proceedings of the 12th International Conference on Neural Information Processing
Editors: K.Y Yang, L.R Dung
Place of Publication: Taiwan
Publisher: National Chiao Tung University
Pages: 67-70
Number of pages: 4
Publication status: Published - 2005
Event: International Conference on Neural Information Processing, Taiwan, Province of China
Duration: 29 Oct 2005 to 2 Nov 2005

Publication series

Name: Proceedings of the 12th International Conference on Neural Information Processing

Conference

Conference: International Conference on Neural Information Processing
Country/Territory: Taiwan, Province of China
Period: 29/10/05 to 2/11/05
