An Algorithm for Estimating Mixture Distribution of High Dimensional Vectors And Its Application to Character Recognition
Fang Sun, Shin'ichiro Omachi, and Hirotomo Aso
Proceedings of The 11th Scandinavian Conference on Image Analysis (SCIA'99), pp.267-274, June 1999

In statistical pattern recognition, high recognition accuracy depends on estimating the distribution of feature vectors precisely. The feature vectors extracted from the objects to be recognized are often assumed to follow a normal distribution, but in practice their distribution is more intricate and variable. Modeling it as a mixture of normal distributions is considered more realistic. Estimating such a mixture precisely, however, requires a great number of training samples, especially when the feature vectors have many dimensions, and in practice there are never enough training samples relative to the number of dimensions. For this reason, mixture normal distribution estimation is rarely used in recognition problems with high dimensional vectors, such as character recognition. This paper proposes an algorithm for estimating a mixture of normal distributions for high dimensional vectors by introducing the Simplified Mahalanobis distance into the maximum likelihood estimates. As a practical application, the estimation algorithm is applied to character recognition: a multi-template dictionary is constructed that takes the distribution of each category into account. The effectiveness of the proposed method is examined in experiments using Japanese characters.
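The abstract does not define the Simplified Mahalanobis distance, so the following Python sketch is only an illustration of the general idea of stabilizing a Mahalanobis-style distance when samples are scarce. It assumes one common approach: keep the k largest eigenvalues of the covariance matrix and replace the remaining, poorly estimated ones with a constant. The function name `truncated_mahalanobis` and the parameters `k` and `floor` are hypothetical and may not match the definition used in the paper.

```python
import numpy as np

def truncated_mahalanobis(x, mean, cov, k, floor):
    """Squared Mahalanobis-style distance (illustrative assumption, not the paper's formula).

    Uses the k leading eigenpairs of `cov` exactly and replaces every
    remaining eigenvalue with the constant `floor`.
    """
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # reorder to descending
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    diff = x - mean
    proj = eigvecs[:, :k].T @ diff           # coordinates along the k leading axes
    leading = np.sum(proj ** 2 / eigvals[:k])

    # Energy of `diff` outside the leading subspace, scaled by the constant.
    residual = max(diff @ diff - proj @ proj, 0.0)
    return leading + residual / floor

# Usage: a 3-dimensional toy covariance with one very small eigenvalue.
cov = np.diag([4.0, 1.0, 0.01])
x = np.array([2.0, 1.0, 0.1])
print(truncated_mahalanobis(x, np.zeros(3), cov, k=2, floor=0.01))
```

With `k` equal to the full dimension the residual term vanishes and the function reduces to the ordinary squared Mahalanobis distance; a smaller `k` avoids inverting eigenvalues that cannot be estimated reliably from a small training set.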