Appears in [Stats. and Comp. 9(4) pp269-278, Sept. 1999 (click)] or [click].

MML Markov classification of sequential data

home1 home2
 Bib
 Algorithms
 Bioinfo
 FP
 Logic
 MML
 Prog.Lang
and the
 Book

MML
 Structured
  mixtures
  HMM
  DTree
  DGraph
  Supervised
  Unsupervised

Seminar

General purpose un-supervised classification programs have typically assumed independence between observations in the data they analyse. In this paper we report on an extension to the MML classifier Snob which enables the program to take advantage of some of the extra information implicit in ordered datasets (such as time-series). Specifically the data is modelled as if it were generated from a first order Markov process with as many states as there are classes of observation. The state of such a process at any point in the sequence determines the class from which the corresponding observation is generated. Such a model is commonly referred to as a Hidden Markov Model. The MML calculation for the expected length of a near optimal two-part message stating a specific model of this type and a dataset given this model is presented. Such an estimate enables us to fairly compare models which differ in the number of classes they specify which in turn can guide a robust un-supervised search of the model space. The new program, tSnob, is tested against both `synthetic' data and a large `real world' dataset and is found to make unbiased estimates of model parameters and to conduct an effective search of the extended model space.

doi: 10.1023/A:1008907921792


Also see [MML].

Coding Ockham's Razor, L. Allison, Springer

A Practical Introduction to Denotational Semantics, L. Allison, CUP

Linux
 Ubuntu
free op. sys.
OpenOffice
free office suite
The GIMP
~ free photoshop
Firefox
web browser

© L. Allison   http://www.allisons.org/ll/   (or as otherwise indicated),
Faculty of Information Technology (Clayton), Monash University, Australia 3800 (6/'05 was School of Computer Science and Software Engineering, Fac. Info. Tech., Monash University,
was Department of Computer Science, Fac. Comp. & Info. Tech., '89 was Department of Computer Science, Fac. Sci., '68-'71 was Department of Information Science, Fac. Sci.)
Created with "vi (Linux + Solaris)",  charset=iso-8859-1,  fetched Friday, 29-Mar-2024 06:30:18 AEDT.