Plan
The half-semester course is to follow on the heels of its main pre-requisite,
CSE454.
(CSE454 plan and CSE454's progress in 2002.)
(A plan for CSE454 and an initial plan for CSE455.)
Foundations of inductive inference from (algorithmic) information theory;
intermediate to advanced Minimum Message Length (MML) inference;
details of Fisher information and uncertainty regions;
angular/circular models (von Mises, Wrapped Normal or trigonometric);
Poisson distribution;
MML of specific models such as decision graphs, hidden Markov models
(or HMMs, also known as probabilistic finite state automata, or PFSAs),
linear and polynomial regression, causal models, Bayesian nets,
time series, sequences, segmentation, trends;
probabilistic prediction and Kullback-Leibler distance.
Statistical invariance, statistical consistency.
Data mining.
Additional models may include factor analysis and additional theory may
include the Neyman-Scott problem (1948).
Applications to be considered may include:
models of protein folding and protein structure prediction,
bushfire prediction,
text and image analysis,
DNA alignment and the human genome
project, authorship identification for texts, etc. Further typical
applications may be described.
Might also manage to fit in some or all of:
polygon modelling;
DNA pattern discovery and alignment,
evolutionary trees;
Lempel-Ziv text compression, C.S. Wallace improvement (1989, 1996),
approximate repeats; HMMs (PFSAs) in mixture modelling;
Markov fields, images.
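As a small illustration of the "probabilistic prediction and Kullback-Leibler distance" item in the plan above, the following is a minimal sketch, not taken from the course materials. It assumes discrete distributions given as lists of probabilities; the function name kl_distance_bits and the example numbers are hypothetical. Measured in bits, the Kullback-Leibler distance can be read as the expected extra message length paid when data from the true distribution are encoded with a code built for the predicted distribution.

```python
import math


def kl_distance_bits(p, q):
    """Kullback-Leibler distance D(p || q), in bits, for discrete distributions.

    p is the true distribution and q the predicted one; both are sequences
    of probabilities over the same outcomes (an assumption of this sketch).
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same outcomes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # outcomes that never occur contribute nothing
        if q_i == 0.0:
            return math.inf  # predicting 0 for a possible outcome costs infinitely many bits
        total += p_i * math.log2(p_i / q_i)
    return total


# Hypothetical example: a true three-outcome distribution and a prediction of it.
true_dist = [0.5, 0.25, 0.25]
predicted = [0.25, 0.25, 0.5]
print(kl_distance_bits(true_dist, predicted))  # 0.25 bits per outcome, on average
```

The distance is zero only when the prediction matches the true distribution, and it is not symmetric in its two arguments, which is why it is called a distance only loosely.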
Progress
The half-semester course followed on the heels of its main pre-requisite,
CSE454,
whose 7th and last lecture was on Tuesday 23rd April 2002.
Consultation
Tuesdays 12-1; Thursdays 12-1.
Reference(s)
C.S. Wallace and D.L. Dowe (1999a), "Minimum Message Length and Kolmogorov complexity", Comp. J., Vol. 42, No. 4, pp. 270-283.
Field of education code(s)
020119 Artificial Intelligence, 010103 Statistics.
Links
Minimum Message Length (MML), Occam's razor, and data collections.
I taught CSC423 Learning and Prediction on MML from 1997 to 2001 (4 points, then 6 points) before it was split into CSE454 (3 points) and CSE455 Learning and Prediction II: MML Data Mining (3 points).
I am co-ordinating CSE50DM: Statistics of data and data mining, part of the Graduate Certificate in Computational Science.
MML talks and CSSE Clayton seminars.
MML Bayes Nets with Decision Trees.
MML Decision Trees.
Ockham's razor.
Pi = 3.141592653589793....
Probabilistic football-tipping competition (and outline of probabilistic prediction), free, with prizes for some secondary students. The world's longest-running and Australia's first.
Snob program for MML mixture modelling and clustering.
Short course slides.
Semester dates.
Some useful links,
TheBreastCancerSite,
TheHungerSite,
TheRainforestSite,
www.GiveWater.org,
"do-goody"/"improving the world".
Copyright
Dr David L. Dowe,
Monash University, Australia,
9th May 2002, etc.
Link to 2006 CSE455 courseware notes.
Copying is not permitted without express permission from David L. Dowe.