In this paper, we model the class-conditional densities with Gaussian mixture models (GMMs) for plug-in Bayes classification. We propose a method for setting the number of components and the covariance matrices of the class-conditional GMMs. It compromises between the simplicity of model selection based on the Bayesian information criterion (BIC) and the high accuracy of model selection based on the cross-validation (CV) estimate of the correct classification rate. We apply an idea of Friedman [Friedman, J.H. 1989. Regularized discriminant analysis. J. Amer. Statist. Assoc., 84, 165-175] to shrink a predefined covariance matrix toward a parameterization with substantially reduced degrees of freedom (a reduced number of adjustable parameters). Our method differs from Friedman's original method in the meaning of the shrinkage: we operate on matrices computed for a single class, whereas Friedman's method shrinks matrices across different classes. We compare our method with conventional methods for setting GMMs based on BIC and CV. The experimental results show that our method can produce parameterizations of the GMM covariance matrices that are better than those used in the other methods. We observed a significant increase in correct classification rates for our method relative to the other methods, which becomes more pronounced as the training sample size decreases. This implies that our method could be an attractive choice for applications with a small number of training observations.
- Bayesian information criterion
- Gaussian mixture models
- Model selection
- Regularized discriminant analysis
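The within-class shrinkage described above could be sketched as follows. This is a minimal illustration, assuming a convex combination of a class's full sample covariance with a diagonal target of reduced degrees of freedom; the exact target matrix and mixing rule used in the paper may differ.

```python
import numpy as np

def shrink_covariance(S, lam):
    """Shrink a sample covariance matrix S toward its diagonal.

    lam = 0 keeps the full matrix (d*(d+1)/2 free parameters);
    lam = 1 keeps only the diagonal (d free parameters).
    The target and mixing rule here are illustrative assumptions.
    """
    return (1.0 - lam) * S + lam * np.diag(np.diag(S))

# Toy data for a single class: estimate its covariance, then shrink it.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))          # 50 observations, 3 features
S = np.cov(X, rowvar=False)           # within-class sample covariance
S_shrunk = shrink_covariance(S, lam=0.5)
```

Note that, unlike Friedman's original regularized discriminant analysis, the combination here involves only matrices estimated from one class, not a pooled matrix across classes.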