
Mixture Modelling

Based on work by Chris Wallace, David Boulton, Rohan Baxter, David Dowe, Jon Oliver and others.   Lloyd Allison, 24/10/2001.

This document is online at   http://www.csse.monash.edu.au/~lloyd/Archive/2005-06-Mixture/index.shtml   and contains hyper-links to other resources.

Unsupervised classification = clustering = mixture modelling = . . .

Given multivariate data, i.e. S things, each having D attributes, infer:

  • the number of classes (clusters),
  • the relative abundance (mixing proportion) of each class,
  • the parameters of each class's distribution, and
  • perhaps, the class membership of each thing.

For a simple univariate example . . .

Gaussian Mixture Model for Clustering
[Interactive Java applet, © L. Allison (au): plots a two-component Gaussian mixture. The form sets the first class's mean mu1, standard deviation sigma1 and proportion p, the second class's mu2 and sigma2 with proportion q=(1-p), and the vertical plotting range, min to max. Requires Java.]
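In place of the applet, here is a minimal sketch of the same two-component model in Python; the parameter names follow the form above and the numeric values are illustrative only.

  import math

  def gaussian_pdf(x, mu, sigma):
      """Density of the normal distribution N(mu, sigma) at x."""
      return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

  def mixture_pdf(x, mu1, sigma1, mu2, sigma2, p):
      """Two-class mixture: p*N(mu1, sigma1) + q*N(mu2, sigma2), q = 1 - p."""
      q = 1.0 - p
      return p * gaussian_pdf(x, mu1, sigma1) + q * gaussian_pdf(x, mu2, sigma2)

  # e.g. two classes at -1 and +1, mixed 70:30
  for x in (-2.0, -1.0, 0.0, 1.0, 2.0):
      print(x, mixture_pdf(x, mu1=-1.0, sigma1=0.5, mu2=1.0, sigma2=0.5, p=0.7))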


Use the form above to experiment, e.g. try moving the two means closer together or further apart, or changing the mixing proportion p.



Can have mixtures of other component distributions, e.g. multinomial (discrete attributes) or von Mises (angular attributes), of more than two classes, and of more than one attribute.

But the more general the model, the more difficult the search problem.


Bi-variate example

[Figure: protein-structure classes, i.e. a mixture, found by minimum message length inference.]
The two attributes, phi and psi, are the two free dihedral-angles per amino-acid on a protein backbone. The peaks represent the centres of classes found.


Encoding the Hypothesis

The first part of the message states the number of classes, the mixing proportions, and the parameters (e.g. mean and standard deviation) of each class, each to an optimal, finite precision.

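As a rough illustration only (not Wallace's MML formulation, which chooses the stated precision optimally), a uniform-prior sketch of the first-part length; max_classes, prior_range and precision are assumptions of this sketch, not part of the method.

  def hypothesis_bits(n_classes, precision=0.01, prior_range=20.0, max_classes=10):
      """Crude first-part length: uniform priors, fixed stated precision."""
      bits = math.log2(max_classes)                          # the number of classes, 1..max
      bits += (n_classes - 1) * math.log2(1.0 / precision)   # the mixing proportions
      bits += n_classes * 2 * math.log2(prior_range / precision)  # a mu and a sigma per class
      return bits

  print(hypothesis_bits(2))   # cost in bits of stating a two-class hypothesis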

Data given Hypothesis

e.g.   x | 2-classes:
[Figure: a datum x, and its measurement interval, under a two-class mixture density.]
x has a certain measurement accuracy, an interval.
NB. If interval is small, probability density can be approximated as a constant over it, and simply "passes through" the Maths.
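In bits: if the datum's measurement interval has width eps, its probability is approximately f(x)*eps under a density f, so its code length is -log2(f(x)*eps). A tiny sketch, reusing gaussian_pdf from the first example:

  def code_length_bits(density, eps):
      """-log2(probability), with probability ~ density * eps over a small interval."""
      return -math.log2(density * eps)

  # e.g. x = -0.9 measured to the nearest 0.01, under a single class N(-1, 0.5)
  print(code_length_bits(gaussian_pdf(-0.9, -1.0, 0.5), eps=0.01))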


Total assignment (of x)

[Figure: x totally assigned to the right-hand class; the shaded area is its probability there.]
x is most likely in the right-hand class; code length = -log2(shaded)


Membership of the left-hand class is less likely, but still conceivable:

[Figure: x totally assigned to the left-hand class instead; the shaded area is its probability there.]
code length = -log2(shaded)


Fractional assignment, to both classes, loses no probability:

[Figure: fractional assignment of x to both classes; the shaded area is the full, mixed probability.]
code length = -log2(shaded)
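A sketch comparing the three code lengths, total assignment to either class versus fractional assignment, reusing gaussian_pdf from the first example:

  COMPS = [(0.7, -1.0, 0.5), (0.3, 1.0, 0.5)]   # (proportion, mu, sigma), illustrative

  def total_assignment_bits(x, eps, k):
      """Cost of encoding x as if it belonged wholly to class k."""
      p, mu, sigma = COMPS[k]
      return -math.log2(p * gaussian_pdf(x, mu, sigma) * eps)

  def fractional_assignment_bits(x, eps):
      """Cost under the combined, i.e. mixed, distribution: no probability is lost."""
      mix = sum(p * gaussian_pdf(x, mu, s) for p, mu, s in COMPS)
      return -math.log2(mix * eps)

  x, eps = 0.3, 0.01
  print(total_assignment_bits(x, eps, 0))    # x wholly in the left-hand class
  print(total_assignment_bits(x, eps, 1))    # x wholly in the right-hand class
  print(fractional_assignment_bits(x, eps))  # always <= the smaller of the two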


The probability contributions of the two classes, green and shaded:

[Figure: the two classes' separate contributions to the mixed probability of x.]
think about it!


Total assignment leads to biased estimates of the parameters (means, variances) by effectively transferring areas, unequal in general, between the classes: the means move apart and the variances shrink.

[Figure: total assignment transfers the overlapping tail areas between the two classes.]
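A small simulation makes the bias visible; the class parameters are illustrative, chosen so that the two classes overlap substantially:

  import random

  random.seed(1)
  # two overlapping classes: true means -1 and +1, true sigmas 1, mixed 50:50
  data = [random.gauss(-1, 1) if random.random() < 0.5 else random.gauss(1, 1)
          for _ in range(10_000)]

  # total assignment: each x goes wholly to its more probable class, which here
  # (equal sigmas, equal proportions) is simply the class with the nearer mean
  left  = [x for x in data if x < 0]
  right = [x for x in data if x >= 0]

  def mean_var(xs):
      m = sum(xs) / len(xs)
      return m, sum((x - m) ** 2 for x in xs) / len(xs)

  print(mean_var(left))    # mean below -1, variance below 1
  print(mean_var(right))   # mean above +1, variance below 1
  # the estimated means have moved apart and the variances have shrunk, because
  # the overlapping tail areas were transferred between the classes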


Coding Data

There are two ways of viewing/implementing fractional assignment:

  1. Borrowed bits, Wallace (1986).

    e.g. consider a case of 50:50 membership,

    • work out code for rest of data,
    • "borrow" 1st bit from rest to decide which membership choice to make,
    • remove borrowed bit from rest, don't transmit it, receiver can deduce it!

    Generalizes to unequal ratios and >2 classes.

  2. Directly from a code-book based on the combined, i.e. mixed, distribution.
Note that the class memberships are nuisance parameters if we want only the class descriptions and, as is typical of nuisance parameters, their number grows in proportion to |input|. Fractional assignment states no nuisance parameters at all; the sketch below checks that the two views above give the same code length.
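The check, reusing gaussian_pdf, COMPS and fractional_assignment_bits from the sketches above: the bits borrowed to choose a class k are exactly -log2(w_k), where w_k is x's fractional membership of class k, so the net cost is the same whichever class is chosen.

  def borrowed_bits_net(x, eps, k):
      """Net length when x is encoded wholly in class k but the choice of k
      is made with bits borrowed from the rest of the message."""
      joint = [p * gaussian_pdf(x, mu, s) for p, mu, s in COMPS]
      w_k = joint[k] / sum(joint)          # fractional membership of class k
      choice = -math.log2(w_k)             # bits borrowed to make the choice
      body = -math.log2(joint[k] * eps)    # x encoded as a member of class k
      return body - choice                 # the borrowed bits are not transmitted

  x, eps = 0.3, 0.01
  print(borrowed_bits_net(x, eps, 0), borrowed_bits_net(x, eps, 1))
  print(fractional_assignment_bits(x, eps))
  # all three agree: the net cost equals the code-book length from the
  # combined, mixed distribution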


Search

The methods described allow two hypotheses to be compared and the better one chosen, but there remains the search problem - to find the best mixture model. `Snob' does the following:

  • re-estimate each class's parameters from its (fractionally assigned) members,
  • re-estimate each thing's class memberships from the class parameters,
  • try splitting a class in two, merging two classes, or deleting a class,
  • accept any change that shortens the total message length,

and repeats; a sketch of the re-estimation loop follows. This must converge, possibly only to a local optimum, and works well in practice.
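A minimal sketch of the re-estimation loop (an EM-style iteration with fractional assignment, reusing gaussian_pdf and the sampled data from the sketches above; Snob's split, merge and delete moves, and its message-length acceptance test, are omitted):

  def em_step(data, comps):
      """One re-estimation step for a mixture given as [(p, mu, sigma), ...]."""
      n = len(data)
      # fractional memberships of every datum in every class
      W = []
      for x in data:
          joint = [p * gaussian_pdf(x, mu, s) for p, mu, s in comps]
          mix = sum(joint)
          W.append([j / mix for j in joint])
      # re-estimate proportions, means and variances from the weighted data
      new_comps = []
      for k in range(len(comps)):
          wk = sum(W[i][k] for i in range(n))
          mu = sum(W[i][k] * data[i] for i in range(n)) / wk
          var = sum(W[i][k] * (data[i] - mu) ** 2 for i in range(n)) / wk
          new_comps.append((wk / n, mu, math.sqrt(var)))
      return new_comps

  comps = [(0.5, -0.5, 1.0), (0.5, 0.5, 1.0)]   # a crude two-class starting guess
  for _ in range(50):
      comps = em_step(data, comps)
  print(comps)   # close to proportions 0.5, means -1 and +1, sigmas 1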


Conclusion

Fractional assignment never costs more bits than total assignment, avoids biased parameter estimates, and dispenses with the per-datum nuisance parameters; minimum message length gives a single criterion by which the number of classes, their proportions and their parameters can all be inferred.

References

Wallace, C.S. & Boulton, D.M. (1968). An information measure for classification. The Computer Journal, 11(2), 185-194.

Wallace, C.S. (1986). An improved program for classification. Proc. 9th Australian Computer Science Conference (ACSC-9), 357-366.


© L. Allison, School of Computer Science and Software Engineering, Monash University, Australia 3800.