# Mixture Modelling

CSE454 2005: This document is online at http://www.csse.monash.edu.au/~lloyd/tilde/CSC4/CSE454/ and contains hyperlinks to other resources. © Lloyd Allison.

See L. Allison, Models for Machine Learning and Data Mining in Functional Programming, J. Functional Programming (JFP), 15(1), pp.15-32, January 2005, and also [II].
```
estMixture ests dataSet =
let
-- [estimator]->[dataSpace] -> model of dataSpace
-- i.e. [estimator] -> estimator
...
```

`estMixture` takes a list of estimators, one per component of the mixture, and a data set; it returns a mixture model of the data.

```
memberships (Mix mixer components) =
 let  -- memberships|Mixture
  doAll (d:ds) = prepend (doOne d) (doAll ds)  -- all data
  doAll []     = map (\x -> []) components
  doOne datum  = normalise(                    -- one datum
    zipWith (\c -> \m -> (pr mixer c)*(pr m datum)) [0..] components)
    -- pr(c) * pr(datum|c) for class #c = m
 in doAll dataSet
```

Given the components of the mixture, find (fit) the fractional memberships of things (data) in (to) the components.

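The membership calculation can be tried in isolation. The following is a minimal sketch, not the code from the paper: it assumes, hypothetically, that a mixture is given as a plain list of class weights plus one likelihood function per component, instead of the `Mix` model type. The names `membershipsOf` and this `normalise` are illustrative stand-ins.

```haskell
import Data.List (transpose)

-- Scale a list of non-negative numbers so that they sum to 1.
normalise :: [Double] -> [Double]
normalise xs = map (/ total) xs
  where total = sum xs

-- Hypothetical stand-in for a mixture: class weights pr(c) and one
-- likelihood function pr(datum | c) per component.
-- The result has one list per component, holding that component's
-- fractional membership for every datum (the same layout as `memberships`).
membershipsOf :: [Double] -> [a -> Double] -> [a] -> [[Double]]
membershipsOf weights likelihoods dataSet
  | null dataSet = map (const []) weights        -- no data: one empty list per class
  | otherwise    = transpose (map doOne dataSet)
  where
    -- pr(c) * pr(datum | c) for each class c, normalised over the classes
    doOne datum =
      normalise (zipWith (\w lik -> w * lik datum) weights likelihoods)
```

For each datum the memberships over all components sum to 1, so a datum that is equally likely under every component is shared out in proportion to the class weights.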
```
randomMemberships =
 let
  doAll seed [] = map (\_ -> []) ests
  doAll seed (_:ds) =                           -- all data
   let doOne seed [] ans = (seed, normalise ans)
       doOne seed (_:ests) ans =                -- one datum
         doOne (prng seed) ests ((fromIntegral(1+ seed `mod` 10)) : ans)
   in let (seed2, forDatum) = doOne seed ests []
      in prepend forDatum (doAll seed2 ds)
 in doAll 4321 dataSet
```

Allocate initial pseudo-random (prng) fractional memberships to things (data); not very interesting.

```
fit [] [] = []  -- Models|memberships
fit (est:ests) (mem:mems) = (est dataSet mem) : (fit ests mems)

fitMixture mems =
  Mix (freqs2model (map (foldl (+) 0) mems))  -- weights
      (fit ests mems)                         -- components
```

Calculate the mixture weights of the components, and fit the components (using the given estimators) to their weighted members.

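The two halves of this step can be sketched separately. The code below is an illustration under stated assumptions, not the paper's library: it assumes `freqs2model` amounts to plain normalisation of the summed memberships (the original may adjust the counts), and it assumes an estimator is a function from the data and one weight per datum to a fitted component. `Estimator`, `mixtureWeights` and `fitComponents` are hypothetical names.

```haskell
-- Assumed shape of an estimator: the data, one weight per datum,
-- and a fitted component of some type m as the result.
type Estimator d m = [d] -> [Double] -> m

-- Mixture weights: each component's total membership divided by the
-- grand total (assumes plain normalisation, with no MML adjustment).
mixtureWeights :: [[Double]] -> [Double]
mixtureWeights mems = map (/ total) sums
  where
    sums  = map sum mems   -- total membership per component
    total = sum sums

-- Fit each component with its own estimator and its own membership list.
fitComponents :: [Estimator d m] -> [d] -> [[Double]] -> [m]
fitComponents ests dataSet mems =
  zipWith (\est mem -> est dataSet mem) ests mems
```

Every estimator sees the whole data set; only the per-datum weights differ from component to component.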
```
  cycle mx = fitMixture (memberships mx)  -- EM step

  cycles 0 mx = mx
  cycles n mx = cycles (n-1) (cycle mx)   -- n x cycle

 in mixture( cycles ?? (fitMixture randomMemberships) )

-- -----9/2002--9/2003--L.Allison--CSSE--Monash--.au--
```

Fit memberships to components; fit components to the memberships. Iterate a fixed number of times, or until convergence, and so on.

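The two stopping rules mentioned here, a fixed number of cycles or iteration until convergence, can be sketched generically. This is an illustrative sketch only: `iterateN` and `untilConverged` are hypothetical names, and the `score` argument is an assumption standing for whatever quantity is monitored (for example a message length or a negative log-likelihood).

```haskell
-- Apply a step function a fixed number of times.
iterateN :: Int -> (a -> a) -> a -> a
iterateN n step x
  | n <= 0    = x
  | otherwise = iterateN (n - 1) step (step x)

-- Apply a step function until the monitored score changes by less
-- than eps between successive states, or maxIter steps have been taken.
untilConverged :: Int -> Double -> (a -> Double) -> (a -> a) -> a -> a
untilConverged maxIter eps score step = go maxIter
  where
    go n x
      | n <= 0                         = x    -- iteration budget used up
      | abs (score x' - score x) < eps = x'   -- score has settled
      | otherwise                      = go (n - 1) x'
      where x' = step x
```

With an EM step as `step`, `iterateN` corresponds to the `cycles` function above, while `untilConverged` adds the convergence test. The iteration cap guards against a score that never settles.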

# Summary

What are the estimators?
e.g. Multistate -- add up frequencies (+0.5 for MML) and normalise.
e.g. Normal -- usual mean and std dev (latter uses /(n-1) for MML).
e.g. Multivariate -- "product" estimator over attribute estimators, or
e.g. a factor model estimator if you have one, etc.
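The first two estimators can be sketched in weighted form, as a mixture needs them. This is an illustration under assumptions rather than the paper's code: it assumes states are numbered 0..k-1, and it assumes the weighted versions simply replace counts by sums of weights, so that n becomes the total weight. `multistateEst` and `normalEst` are hypothetical names.

```haskell
-- Weighted multistate estimator over states 0..k-1:
-- add up the (fractional) frequencies, add 0.5 to each, and normalise.
multistateEst :: Int -> [Int] -> [Double] -> [Double]
multistateEst k dataSet weights = map (/ total) counts
  where
    counts = [ 0.5 + sum [ w | (x, w) <- zip dataSet weights, x == s ]
             | s <- [0 .. k - 1] ]
    total  = sum counts

-- Weighted normal estimator returning (mean, standard deviation).
-- The variance divides by (n - 1), where n is the total weight
-- (an assumed generalisation of the unweighted case); needs n > 1.
normalEst :: [Double] -> [Double] -> (Double, Double)
normalEst dataSet weights = (mu, sqrt var)
  where
    n   = sum weights
    mu  = sum (zipWith (*) weights dataSet) / n
    var = sum (zipWith (\w x -> w * (x - mu) ^ (2 :: Int)) weights dataSet)
          / (n - 1)
```

With all weights equal to 1 these reduce to the unweighted estimators described above.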

Mixture complexity: search over 1, 2, 3, ... components and choose the number that gives the shortest message length.

© 2005 L. Allison, School of Computer Science and Software Engineering, Monash University, Australia 3168.