
Consider a discrete sample space of M unordered values, e.g.

- throw = {head, tail}
M = 2

- base = {A, C, G, T}
M = 4

- roll = {1, 2, 3, 4, 5, 6}
M = 6 (NB. treated as unordered)

- amino acid = {Glycine, Alanine, Valine, Isoleucine, Leucine, Phenylalanine, Proline, Methionine, Serine, Threonine, Tyrosine, Tryptophan, Asparagine, Glutamine, Cysteine, Aspartic acid, Glutamic acid, Lysine, Arginine, Histidine}
M = 20

This document is online at http://www.csse.monash.edu.au/~lloyd/Archive/2005-04-Fin-state/index.shtml and contains hyper-links to other resources.


The distribution has M-1 parameters T_{1}, T_{2}, ..., T_{M-1}, where T_{i} is the probability of the i-th value, i.e. M-1 degrees of freedom.

Also define
T_{M} = 1 - T_{1} - T_{2} - ... - T_{M-1}
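
As a minimal sketch of the definition above, the full probability vector can be rebuilt from the M-1 free parameters. The function name `full_parameters` and the example values are assumptions made for illustration only.

```python
def full_parameters(free_params):
    """Return [T_1, ..., T_M] given the M-1 free parameters [T_1, ..., T_{M-1}]."""
    last = 1.0 - sum(free_params)  # T_M = 1 - T_1 - ... - T_{M-1}
    if any(t < 0 for t in free_params) or last < -1e-12:
        raise ValueError("parameters must be non-negative and sum to at most 1")
    return list(free_params) + [max(last, 0.0)]


# Hypothetical distribution over the four bases A, C, G, T (M = 4),
# specified by M-1 = 3 free parameters.
bases = full_parameters([0.1, 0.2, 0.3])
```

Only three numbers are supplied for four values; the fourth is forced by the requirement that the probabilities sum to 1.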


From data, the observed frequencies are n_{1}, ..., n_{M};
let N = \sum_{i=1}^{M} n_{i}
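
The counts n_{i} and their total N could be gathered from a data sequence roughly as follows. This is a sketch only: the function name `observed_frequencies` and the example sequence are hypothetical.

```python
from collections import Counter


def observed_frequencies(data, sample_space):
    """Return ([n_1, ..., n_M], N) for the values listed in sample_space."""
    counts = Counter(data)
    unknown = set(counts) - set(sample_space)
    if unknown:
        raise ValueError(f"values outside the sample space: {sorted(unknown)}")
    # Counter returns 0 for values that never occur, so unseen values get n_i = 0.
    n = [counts[value] for value in sample_space]
    return n, sum(n)


# Hypothetical DNA fragment over the sample space {A, C, G, T}.
n, N = observed_frequencies("ACGTAAGC", "ACGT")
```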

Maximum likelihood:
T_{i,ML} = n_{i}/N
What if n_{i} = 0? Then T_{i,ML} = 0, so the model claims that value i can never occur.
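
A short sketch of the maximum likelihood estimator shows the zero-count problem directly. The function name `ml_estimate` and the counts are made up for illustration.

```python
def ml_estimate(counts):
    """Maximum likelihood estimate T_i = n_i / N for each value."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no data: N = 0")
    return [n_i / total for n_i in counts]


# Hypothetical counts for A, C, G, T; C was never observed.
counts = [3, 0, 5, 2]
t_ml = ml_estimate(counts)
# t_ml[1] is 0.0, so any future data containing a C is assigned probability zero.
```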

Minimum Message Length:
T_{i,MML} = (n_{i} + 1/2)/(N + M/2)
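
The MML formula above can be sketched in a few lines. The function name `mml_estimate` and the counts are assumptions for illustration.

```python
def mml_estimate(counts):
    """MML estimate T_i = (n_i + 1/2) / (N + M/2) for each value."""
    m = len(counts)      # M, the number of values in the sample space
    total = sum(counts)  # N, the number of observations
    return [(n_i + 0.5) / (total + m / 2) for n_i in counts]


# Hypothetical counts for A, C, G, T; C was never observed.
t_mml = mml_estimate([3, 0, 5, 2])
```

Every estimate is strictly positive, including the one for the unobserved value, and the estimates still sum to 1.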

MinEKL estimator:
T_{i,MinEKL} = (n_{i} + 1)/(N + M)
(MinEKL = minimum expected Kullback-Leibler divergence)
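
The three estimators share the form (n_{i} + a)/(N + a M), with a = 0 for maximum likelihood, a = 1/2 for MML and a = 1 for MinEKL. The sketch below compares them on made-up counts; the function name `additive_estimate` and the parameter `a` are introduced here for illustration and are not part of the original notation.

```python
def additive_estimate(counts, a):
    """Estimate T_i = (n_i + a) / (N + a*M), where a is added to every count."""
    m = len(counts)
    total = sum(counts)
    return [(n_i + a) / (total + a * m) for n_i in counts]


# Hypothetical counts for A, C, G, T; C was never observed.
counts = [3, 0, 5, 2]
estimates = {
    "ML": additive_estimate(counts, 0.0),
    "MML": additive_estimate(counts, 0.5),
    "MinEKL": additive_estimate(counts, 1.0),
}
for name, t in estimates.items():
    print(name, [round(t_i, 4) for t_i in t])
```

The larger the value added to each count, the further the estimates are pulled away from the raw proportions and towards the uniform distribution 1/M.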


Uses of this distribution:

- discrete sample spaces (as seen) and also

- model of the "class" attribute in supervised classification

- sub-model of a 1st-order Markov model

- proportions of the classes in a mixture model
(unsupervised classification)

- frequency of transitions out of a state in a Probabilistic Finite State Automaton (PFSA, hidden Markov model, HMM), ...
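
To illustrate the Markov-model use in the list above, one possible sketch keeps a separate distribution over the next symbol for each previous symbol and estimates each with the MML formula (n_{i} + 1/2)/(N + M/2). The function name `markov_transition_estimates` and the example sequence are hypothetical; this is not presented as the author's implementation.

```python
def markov_transition_estimates(sequence, alphabet):
    """Estimate P(next | previous) with one M-state distribution per previous symbol."""
    m = len(alphabet)
    # counts[prev][nxt] = number of times nxt directly follows prev
    counts = {prev: {nxt: 0 for nxt in alphabet} for prev in alphabet}
    for prev, nxt in zip(sequence, sequence[1:]):
        counts[prev][nxt] += 1

    estimates = {}
    for prev in alphabet:
        total = sum(counts[prev].values())  # N for this context
        estimates[prev] = {
            nxt: (counts[prev][nxt] + 0.5) / (total + m / 2) for nxt in alphabet
        }
    return estimates


# Hypothetical DNA fragment; the transitions out of A are A->C, A->A and A->G.
est = markov_transition_estimates("ACGTAAGC", "ACGT")
```

Each row `est[prev]` is a distribution over the M possible next symbols, so transitions that were never observed, such as A followed by T here, still receive a small non-zero probability.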


© L. Allison, School of Computer Science and Software Engineering, Monash University, Australia 3800.