The paper presents a Markov chain-based method for automatic written language identification. Given a training document in a specific language, each word is represented as a Markov chain of letters. The entire training document is treated as a set of such chains, from which the initial and transition probabilities are estimated; together these form the Markov model for that language. Given a string in an unknown language, the maximum likelihood decision rule is used to identify its language. Experimental results showed that the proposed method achieved a lower error rate and faster identification than the current n-gram method.
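The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`words_of`, `train`, `log_likelihood`, `identify`), the toy training texts, and the add-one smoothing used to avoid zero probabilities are all assumptions made for illustration; the abstract does not say how unseen letters or transitions are handled.

```python
import math
from collections import Counter, defaultdict


def words_of(text):
    """Lowercase the text and split it into purely alphabetic words."""
    cleaned = "".join(c if c.isalpha() else " " for c in text.lower())
    return cleaned.split()


def train(text, alphabet_size):
    """Count initial letters and letter-to-letter transitions over all words."""
    init = Counter()
    trans = defaultdict(Counter)
    for w in words_of(text):
        init[w[0]] += 1
        for a, b in zip(w, w[1:]):
            trans[a][b] += 1
    return init, trans, alphabet_size


def log_likelihood(model, text):
    """Log-probability of the text under one language's Markov model.

    Add-one smoothing is an assumption of this sketch, not taken from the paper.
    """
    init, trans, v = model
    n_words = sum(init.values())
    total = 0.0
    for w in words_of(text):
        total += math.log((init[w[0]] + 1) / (n_words + v))
        for a, b in zip(w, w[1:]):
            row = trans.get(a, Counter())
            total += math.log((row[b] + 1) / (sum(row.values()) + v))
    return total


def identify(models, text):
    """Maximum likelihood decision: pick the language with the highest score."""
    return max(models, key=lambda lang: log_likelihood(models[lang], text))


# Toy training data (hypothetical; real use needs far larger documents).
corpora = {
    "english": "the quick brown fox jumps over the lazy dog "
               "and then the dog sleeps in the house",
    "german": "der schnelle braune fuchs springt ueber den faulen hund "
              "und dann schlaeft der hund im haus",
}
# Shared alphabet size, used only as the smoothing constant.
alphabet = {c for text in corpora.values() for w in words_of(text) for c in w}
models = {lang: train(text, len(alphabet)) for lang, text in corpora.items()}

print(identify(models, "the dog sleeps"))
print(identify(models, "der hund schlaeft"))
```

Working in log space keeps long strings from underflowing, and each language is scored in a single pass over the letters of the input.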
|Name||Proceedings of the 12th international conference on neural information processing|
|Conference||International Conference on Neural Information Processing|
|Country/Territory||Taiwan, Province of China|
|Period||29/10/05 → 2/11/05|