Mixture models
In particular, our result is insensitive to translation or rescaling. Under suitable identifiability assumptions, and when the distribution of the data belongs to our model, hence is of the form (1), we also analyze the performance of our estimators of the parameters w_1, ..., w_K.
In order to establish convergence rates, we relate the Hellinger distance between the distribution of the data and its estimator to a suitable distance between the corresponding parameters. We can also use other results specific to parameter estimation in mixture models, such as the result of Gadat et al. [12] in the context of two-component mixtures with one known component.
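As a rough numerical illustration of this distance (a sketch with arbitrary mixture parameters, using the convention h^2(p,q) = (1/2) ∫ (√p − √q)^2; this is not the estimator studied here), one can approximate the Hellinger distance between two univariate Gaussian mixture densities by a Riemann sum:

```python
import numpy as np
from scipy.stats import norm

def gmm_pdf(x, weights, means, sds):
    """Density of a finite univariate Gaussian mixture at the points x."""
    return sum(w * norm.pdf(x, loc=m, scale=s)
               for w, m, s in zip(weights, means, sds))

def hellinger(p, q, grid):
    """Approximate h(p, q), with h^2 = (1/2) * int (sqrt(p) - sqrt(q))^2."""
    dx = grid[1] - grid[0]
    diff = (np.sqrt(p(grid)) - np.sqrt(q(grid))) ** 2
    return float(np.sqrt(0.5 * np.sum(diff) * dx))

# Two arbitrary two-component mixtures, purely for illustration.
p = lambda x: gmm_pdf(x, [0.3, 0.7], [-2.0, 1.0], [1.0, 0.5])
q = lambda x: gmm_pdf(x, [0.4, 0.6], [-2.0, 1.2], [1.0, 0.5])

grid = np.linspace(-12.0, 12.0, 20001)
print(hellinger(p, q, grid))  # a number in [0, 1]
```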
We also provide the example of a parametric model for which our techniques allow us to establish faster convergence rates, while classical methods based on the likelihood or least squares fail to apply and therefore yield nothing. In many applications, starting with a single mixture model may be restrictive, and a more reasonable approach is to consider candidate models for estimating the number of components of the mixture and to propose suitable models for the emission densities.
To tackle this problem, we design a model selection procedure for which we establish, under suitable assumptions, an oracle-type inequality. We consider several illustrations of this strategy. We also consider a model with a fixed number of components in which each emission density belongs either to the Gaussian or to the Cauchy location-scale family.
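The selection criterion here is specific to our estimators; purely as a familiar point of comparison, the following sketch selects the number of components by penalized likelihood (BIC) with scikit-learn, a deliberately different technique from the one analyzed in this paper:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic data from a two-component mixture, for illustration only.
X = np.concatenate([rng.normal(-2.0, 1.0, 300),
                    rng.normal(1.0, 0.5, 700)]).reshape(-1, 1)

# Fit one candidate model per number of components and keep the best BIC score.
fits = {k: GaussianMixture(n_components=k, random_state=0).fit(X)
        for k in range(1, 6)}
best_k = min(fits, key=lambda k: fits[k].bic(X))
print(best_k, fits[best_k].weights_)
```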
We prove that if we know the number of components, we can consistently estimate the proportions of Gaussian and Cauchy components as well as their location and scale parameters. To our knowledge, this result is the first of its kind. Its proof relies on an upper bound for the expectation of the supremum of an empirical process over a mixture of VC-subgraph classes, which generalizes the result previously established for a single VC-subgraph class. The key argument in the proof is the uniform entropy property of VC-subgraph classes, which still holds for the overall mixture density model with lower-bounded weights.
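As a reminder (the standard definition, restated here rather than quoted from the text), a class F of real-valued functions on a set X is VC-subgraph when its subgraphs form a VC class of sets:

```latex
\mathcal{F} \text{ is VC-subgraph}
\iff
\Bigl\{ \{(x,t)\in\mathcal{X}\times\mathbb{R} : t < f(x)\} \;:\; f\in\mathcal{F} \Bigr\}
\text{ is a VC class of sets,}
```

and the VC-index of F is the VC-index of this collection of subgraphs.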
The paper is organized as follows. We describe our statistical framework in Section 2. In Section 3, we present the construction of the estimator on a single mixture model.
We state the general result for density estimation on a single model and illustrate the performance of the estimator on the specific example of GMMs. The problem of estimating the parameters of the mixture is addressed in a dedicated subsection of Section 3.
Finally, Section 4 is devoted to the model selection criterion and to the properties of the estimator on the selected model. These sections include the main results: density estimation, parametric estimation in regular parametric models, the case of two-component mixtures with one known component, and the lemmas. This assumption is quite restrictive, and we rather consider a collection E_k of candidate models for F_k that may even depend on k.
We say that E_k is simple when it reduces to a single emission model F_k, and composite otherwise. Assumption 1. We assume the following. Assumption 2. Throughout this paper we shall use the following notation. This means that we know that P is a mixture of at most K emission probabilities F_1, ..., F_K.
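Based on the surrounding text, the model (1) referred to earlier is presumably the usual finite mixture (this display is a reconstruction, not the original one):

```latex
P \;=\; \sum_{k=1}^{K} w_k F_k,
\qquad w_k \ge 0 \ \text{ for } 1 \le k \le K,
\qquad \sum_{k=1}^{K} w_k = 1.
```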
Theorem 1. Our assumption that the families of density functions F_k are VC-subgraph is actually weak, since it includes situations where these models consist of unbounded densities, or of densities which are not in L^2, which to our knowledge have never been considered in the literature. A concrete example of such a situation is the following one.
It follows from Proposition (vi) of Baraud et al. [3] that the VC-index of F_k admits an explicit upper bound. This means that the estimator is robust with respect to a possible misspecification of the model and to departures from the assumption that the data are i.i.d. In particular, this includes situations where the dataset contains some outliers or has been contaminated. Inequality (16), stated later, also allows us to consider misspecification of the emission models, for example.
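To make the contamination scenario concrete, the following sketch draws a sample in which each observation comes from a nominal two-component Gaussian mixture with probability 1 − ε and from an arbitrary outlier distribution with probability ε (Huber contamination; all parameters below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n, eps = 1000, 0.05  # sample size and contamination level, arbitrary choices

def sample_nominal(size):
    """Draw from a nominal two-component Gaussian mixture (illustrative parameters)."""
    comp = rng.random(size) < 0.3
    return np.where(comp, rng.normal(-2.0, 1.0, size), rng.normal(1.0, 0.5, size))

# Each observation is replaced by an outlier with probability eps.
is_outlier = rng.random(n) < eps
x = np.where(is_outlier, rng.uniform(50.0, 100.0, n), sample_nominal(n))
print(is_outlier.sum(), "outliers among", n, "observations")
```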
Theorem 2. We illustrate this lemma with the following example. We fix such values of M and s. Let Q_K be the associated class of distributions. The class P_{M,s} is not proven to be VC-subgraph, but it is totally bounded. As a direct consequence of Theorem 3, we obtain a risk bound over the class of distributions associated with mixtures of s-concave densities. We say that p_H is the Gaussian mixture density with mixing distribution H.
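Concretely, and as a reconstruction from the surrounding definitions (assuming H mixes over both the location μ and the scale σ; φ denotes the standard Gaussian density), this reads:

```latex
p_H(x) \;=\; \int \frac{1}{\sigma}\,\varphi\!\Bigl(\frac{x-\mu}{\sigma}\Bigr)\, dH(\mu,\sigma),
\qquad
\varphi(u) \;=\; \frac{1}{\sqrt{2\pi}}\, e^{-u^{2}/2}.
```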
We want to approximate any distribution of this form with finite Gaussian mixtures. To obtain an approximation result, we need to consider mixing measures H that are supported on a compact set. The Hellinger distance being invariant to translation and rescaling, we consider the following class of densities. We denote by G the location-scale Gaussian family of probability density functions, i.e. G = { x ↦ (1/σ) φ((x − μ)/σ) : μ ∈ R, σ > 0 }. Proposition 1. We can also consider larger classes of distributions, with R increasing as n increases, but this would deteriorate the rate.
Our result is still an improvement over Theorem 4. Moreover, our estimator is robust, for instance to contamination. Example 1. Let F be the set of uniform densities U_{a,b}, where U_{a,b} denotes the uniform distribution on an interval (a,b) of positive length. There is a wide literature on identifiability, including the works of Teicher [22], Sapatinas [21] and Allman et al. [1], for example.
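Returning to Example 1 for intuition, a short worked computation (ours, not the paper's) shows how the Hellinger distance behaves on this family: for 0 < a < b,

```latex
h^{2}\bigl(U_{(0,a)},\, U_{(0,b)}\bigr)
\;=\; 1 - \int_{0}^{a} \sqrt{\tfrac{1}{a}\cdot\tfrac{1}{b}}\, dx
\;=\; 1 - \sqrt{a/b},
```

so the distance vanishes as a → b and approaches its maximal value 1 as b/a → ∞.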
Identifiability is a minimum requirement for the parameter estimators to be meaningful, but we can hardly get more than consistency with it. As mentioned in the introduction, we are looking for a lower bound on the Hellinger distance between mixture distributions.
There are still some situations where we do have such a lower bound.

Regular parametric model. Let K be an integer larger than 1. It is always possible to find a countable dense subset of A_k with respect to the Euclidean distance on R^{d_k}. We assume there is a reasonably good connection between the Hellinger distance on the emission models and the Euclidean distances on the parameter spaces, such that a dense subset of A_k would translate into a dense subset of the emission model with respect to the Hellinger distance.
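One illustrative way to formalize such a connection (a plausible shape for the assumption, not necessarily the exact condition imposed below) is a two-sided local comparison with constants 0 < c ≤ C:

```latex
c\,\lVert \alpha - \alpha' \rVert
\;\le\; h\bigl(f_{\alpha}, f_{\alpha'}\bigr)
\;\le\; C\,\lVert \alpha - \alpha' \rVert
\qquad \text{for all } \alpha, \alpha' \in A_k.
```

The upper bound alone already ensures that a countable dense subset of A_k maps to a Hellinger-dense subset of the emission model, while a lower bound of this kind is what allows risk bounds in Hellinger distance to be converted into convergence rates for the parameters.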
Assumption 3. Theorem 4. Theorem 7. The Gaussian mixture model is the most common mixture model, and it is a regular parametric model.