MML, MDL, Minimum Encoding Length Inference
Postdoc available: Research Fellow (Postdoctoral Fellowship) in Statistics,
Machine Learning, Mixture Modelling, Latent Factor Analysis and Astrophysics
(deadline: 31 July 2016).
Welcome to
David Dowe's
minimum encoding length inference page.
This page discusses primarily the information-theoretic inference methods of:
Minimum Description Length
(MDL)
(Rissanen, 1978;
Rissanen, 1989;
Rissanen,
1999)
and
the earlier
Minimum Message Length (MML)
(Wallace
and Boulton, Comp. J. 1968; Wallace and Freeman, 1987;
Wallace
and Dowe, 1999a;
Wallace, 2005).
Related earlier works were
Solomonoff,
1964;
Kolmogorov, 1965; and
Chaitin,
1966.
Minimum Message Length (MML) is an information-theoretic Bayesian principle
of inductive inference. It was first published in
C. S. Wallace
and D.M. Boulton,
Comp. J.,
1968.
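The two-part idea behind MML can be illustrated with a toy sketch (this is only an illustrative discrete-hypothesis version, not the Strict MML construction; the hypothesis set, priors and data here are invented for the example): the message length for hypothesis H given data D is -log2 P(H) - log2 P(D|H) bits, and the preferred hypothesis is the one with the shortest total message.

```python
import math

def message_length_bits(prior, likelihood):
    """Two-part code length in bits: assert H, then encode D given H."""
    return -math.log2(prior) - math.log2(likelihood)

def mml_choice(data, hypotheses):
    """hypotheses: dict name -> (prior, per-flip P(heads)); data: 'H'/'T' string.
    Returns (name, total message length in bits) for the shortest message."""
    best = None
    for name, (prior, p_heads) in hypotheses.items():
        lik = 1.0
        for flip in data:
            lik *= p_heads if flip == 'H' else (1.0 - p_heads)
        total = message_length_bits(prior, lik)
        if best is None or total < best[1]:
            best = (name, total)
    return best

# A fair coin vs a heads-biased coin, with a prior favouring "fair".
hyps = {'fair': (0.7, 0.5), 'biased': (0.3, 0.9)}
print(mml_choice('HHHHHHHHHH', hyps))  # a long run of heads -> 'biased' wins
print(mml_choice('HTHTHTHTHT', hyps))  # mixed data -> 'fair' wins
```

Note how the prior term penalises the less plausible hypothesis, so "biased" only wins once the data compress markedly better under it.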
The
first
MML papers
were
C. S. Wallace
and D.M. Boulton,
Comp. J.,
1968;
D.M. Boulton and C. S. Wallace,
1969;
D.M. Boulton and C. S. Wallace,
1970;
D.M. Boulton and C. S. Wallace,
1973;
D.M. Boulton and C. S. Wallace,
1975;
C. S. Wallace and D.M. Boulton,
1975;
and David M. Boulton's 1975 Ph.D. thesis.
[See also
Ray Solomonoff (1926-2009)
85th memorial
conference (Wedn 30 Nov - Fri 2 Dec 2011),
3rd
Call for Papers,
2nd
Call for Papers,
1st
Call for Papers,
Invited
speakers,
accepted
papers.]
Pieces of literature I particularly recommend include:
Wallace, C.S. (2005) [posthumous],
Statistical and Inductive Inference by Minimum Message
Length, Springer (Series: Information Science and Statistics), 2005, XVI,
432 pp., 22 illus., Hardcover, ISBN: 0-387-23795-X
[table of
contents,
chapter headings,
preface
(and
page vi,
also
here)
and
more];
and
D. L. Dowe,
"Foreword
re C. S. Wallace",
Computer Journal,
Vol. 51, No. 5
(Sept. 2008)
[Christopher
Stewart WALLACE (1933-2004) memorial special issue],
pp523-560;
and
C.S. Wallace and
D. L. Dowe,
"Minimum
Message Length and Kolmogorov complexity",
Comp.
J., Vol. 42, No. 4 (1999a),
pp270-283
[which is also Chris Wallace's most cited work co-authored with a still-active
MML researcher]
and also
other articles in that
special issue
of the
Comp. J. on
Kolmogorov complexity;
and
Comley, Joshua W. and D.L. Dowe (2005).
Minimum Message Length, MDL and Generalised Bayesian Networks with Asymmetric Languages,
Chapter 11 (pp265-294) in P. Grünwald, I. J. Myung and M. A. Pitt (eds.),
Advances in Minimum Description Length: Theory and Applications,
M.I.T. Press (MIT Press), April 2005, ISBN 0-262-07262-9
[Final camera ready copy was submitted in October 2003]
[pp265-284
and
pp285-294;
individual page scans p265 through p294]
(and other chapters in that book);
and
Dowe, D.L., S. Gardner and G. Oppy (December 2007).
"Bayes Not Bust! Why Simplicity is no problem for
Bayesians",
British Journal for the Philosophy
of Science
(BJPS):
Abstract,
full text,
.pdf;
doi:10.1093/bjps/axm033 .
[accepted in 2006; an earlier, near-final version of the paper.
See also
here.];
and
D. L. Dowe (2011 [was 2010]),
"MML, hybrid Bayesian network graphical models,
statistical consistency, invariance and uniqueness",
Handbook of the Philosophy of Science
- (HPS Volume 7) Philosophy of Statistics,
Elsevier
[ISBN: 978-0-444-51862-0 {ISBN 10: 0-444-51542-9 / ISBN 13: 978-0-444-51862-0}],
pp901-982.
Below are some frequently asked questions (FAQs):
What are the differences between MML and MDL?
What are the similarities between MML and MDL?
Can you give a comparison between MML and MDL?
``Answers'': Have a look at
Wallace (2005), sec. 10.2;
Comley and Dowe (2005), sec. 11.4.3, pp272-273;
and
Dowe (2010), sec. 6.7;
and
the references within these,
and
also have a look at
Comp.
J., Vol 42, No. 4,
Wallace and Dowe (1999a),
and anything else you can find.
There are also links below to related topics and to people doing related
research. Please e-mail me if you'd like to be included.
MDL, MML and algorithmic information theory links
Computer Journal
special
issue on Kolmogorov complexity
and algorithmic information theory,
Volume 42, Issue 4: 1999.
This
issue
features contributions by many authors - including
articles
by Rissanen,
by Solomonoff,
by
Wallace
and Dowe,
and by
others;
and responses and rejoinders on
MDL
and MML
by Rissanen
and by
Wallace
and Dowe
respectively.
Calendar
of
Machine
Learning,
RUUG,
CSSE,
Monash Univ.
MML talks
and
CSE455
Minimum Message Length
(formerly Learning and Prediction II:
MML Data Mining
[formerly before that
CSC423 Learning and Prediction
course]),
CSSE,
Monash Univ.
Information,
Statistics and Induction in Science
(ISIS)
conference,
Aug. 1996.
"Minimum
Message Length and Kolmogorov complexity",
Comp.
J., Vol 42, No. 4 (1999a),
pp270-283,
by
C. Wallace
and
D. Dowe
[which is the
Computer Journal's
most downloaded ``full text as .pdf'' article -
see, e.g.,
here].
"Minimum Message Length, MDL and Generalised Bayesian Networks with Asymmetric Languages",
by J. W. Comley and D.L. Dowe;
Chapter 11
(pp265-294)
in P. Grünwald, I. J. Myung and M. A. Pitt (eds.),
Advances in Minimum Description Length:
Theory and Applications,
M.I.T. Press, April 2005, ISBN 0-262-07262-9.
{This is about Generalised Bayesian nets (including the special case of hybrid
Bayesian nets), generalising MML Bayesian networks; and it deals with
a mix of both continuous and discrete variables.
(See also
Comley and Dowe
(2003),
.pdf.)}
Related links and, below, People
Artificial Intelligence
(AI) on the Web:
Machine
Learning
and
Artificial Intelligence Resources.
Bayesian Net
Repository - see also
MML Bayes Nets:
Comley & Dowe (2005)
and
(2003,
.pdf).
Bayesian Statisticians
worldwide
(was
here
and was previously
here),
a link re
Rev. Thomas Bayes,
a link to
a small drawing
of Rev. Thomas Bayes and (his posthumously published)
"An Essay towards solving
a Problem in the Doctrine of Chances".
MML uses Bayesian information theory.
Bayesianism,
compression
and intelligence
[Dowe and Hajek
(1997,
1998)].
Boosting Research Site: boosting.org.
Clustering,
mixture modelling
(or finite
mixture modeling)
and/or
unsupervised learning
using MML:
Snob
- see, e.g.,
Wallace and Dowe (2000)
(was here).
Compression
and intelligence
[Dowe and Hajek
(1997,
1998)],
and other
compression
pointers.
Computational complexity links.
Computational Learning Theory
(CoLT)
page:
theorists
and
resources,
and
Thomas Zeugmann's
COLTBIB.
CSE455,
4th Year Hons course entitled
Minimum Message Length,
Monash University.
Data collections
and
Vlad's
KD and
data mining page.
Kernel machines WWW site:
www.kernel-machines.org;
an
Alex Smola
SVM
talk;
and
Support
Vector Machine mailing list.
Machine Learning
and CBR people
and
Mach. Learning Resources
(maintained by
David Aha).
Online Mach.
Learn. Resources.
Nit -
a nit
is a unit of information
(1 nit = log2(e) bits,
1 bit = ln(2) nits)
going back at least
as far as Boulton and Wallace (1970).
See sec. 11.4.1,
p271
of
Comley and Dowe (2005) (MIT Press,
pp265-294)
for a very concise history.
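The nit/bit conversion given above is a one-liner either way; as a minimal sketch (function names are my own):

```python
import math

# Conversions per the definition above:
# 1 nit = log2(e) bits ~= 1.4427 bits; 1 bit = ln(2) nits ~= 0.6931 nits.

def nits_to_bits(x):
    return x * math.log2(math.e)   # equivalently x / math.log(2)

def bits_to_nits(x):
    return x * math.log(2)

print(nits_to_bits(1.0))   # ~1.4427
print(bits_to_nits(1.0))   # ~0.6931
```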
Occam's razor links:
Minimum Encoding Length Inference is an operational form of
Ockham's razor.
Probabilistic
prediction,
probabilistic prediction competition
and
Gaussian prediction competition
for Australian football,
and
history.
Statistics:
University of Florida
Department of Statistics's
Statistics page;
and
AskDrMath's
Prob/Stat
and
Statistics.
Support
Vector Machine mailing list;
and
Kernel machines WWW site,
www.kernel-machines.org.
David Albrecht's
SVM site,
another,
and
another;
and
Tan and Dowe (2004, .pdf) on MML SVMs.
People interested in MDL, MML, algorithmic probability and algorithmic
information theory
[A,
B,
C,
D,
E,
F,
G,
H,
I,
J,
K,
L,
M,
N,
O,
P,
Q,
R,
S,
T,
U,
V,
W,
X,
Y,
Z]
A
Yudi Agusta.
Akaike Information Criterion (AIC):
To
compare
MML and AIC, see, e.g.,
here.
David Albrecht
(and his SVM site).
Lloyd Allison's
MDL
(stochastic
complexity)
and
MML
pages.
Christoph Arndt's
information theory notes
and
"Information measures"
(in English)
book.
Azat Arslanov
B
Andrew Barron
Rohan Baxter
Rev. Thomas Bayes
(dec. 1761),
a small drawing
of Rev. Thomas Bayes,
a German article
on him;
and
Bayesian
Statisticians worldwide.
MML uses Bayesian information theory.
Bayesian Nets using
Minimum message length
(MML).
Adrian Bickerstaffe.
Olivier Bousquet's
Kolmogorov
complexity resources,
Kolmogorov complexity mail list
and
list archive.
David M. Boulton
Matthew Brand
Peter G. Bryant
C
Cristian S. Calude
CDMS.
Greg Chaitin
Bertrand Clarke
John G. Cleary
Clustering,
mixture modelling
(or finite
mixture modeling)
and/or
unsupervised learning
using MML:
Snob
- see, e.g.,
Wallace and Dowe (2000).
Josh Comley
and
(with D. L. Dowe)
"Minimum Message Length, MDL and Generalised Bayesian Networks with Asymmetric
Languages",
Chapter 11
(pp265-294)
in P. Grünwald, I. J. Myung and M. A. Pitt (eds.),
Advances in Minimum Description Length:
Theory and Applications,
M.I.T. Press, April 2005, ISBN 0-262-07262-9.
{This is about Generalised Bayesian nets (or generalised MML directed
graphical models), generalising MML Bayesian networks; and it deals with
a mix of both continuous and discrete variables.
(See also
Comley and Dowe
(2003),
.pdf.)}
Consistency:
The Wallace-Freeman (1987) version of MML has been shown to be statistically
consistent on the Neyman-Scott problem - see
Dowe & Wallace (1997).
We have conjectured (or asked the question)
[Dowe, Baxter, Oliver & Wallace
(1998, sec. 6,
p93),
Edwards & Dowe (1998, sec. 5.3, p104),
Wallace & Dowe
(1999a, p282),
Wallace & Dowe (2000, sec. 5,
p78),
Comley & Dowe
(2005, sec. 11.3.1,
p269), and
Dowe, Gardner & Oppy (2007) (sec. 8 (Conclusion), currently(?) p52)]
whether or not only MML
and closely related Bayesian methods can, in general, infer fully-specified
models with both statistical consistency and
invariance.
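The Neyman-Scott inconsistency referred to above can be seen in a small simulation (a sketch with made-up simulated data, not the MML analysis itself): with many pairs (x1, x2) drawn from N(mu_i, sigma^2), each pair bringing its own nuisance mean mu_i, the maximum-likelihood estimate of sigma^2 converges to sigma^2/2 rather than sigma^2.

```python
import random

random.seed(0)
sigma2 = 1.0          # true variance
J = 100_000           # number of pairs, each with its own unknown mean
ss = 0.0
for _ in range(J):
    mu = random.uniform(-10, 10)          # arbitrary per-pair nuisance mean
    x1 = random.gauss(mu, sigma2 ** 0.5)
    x2 = random.gauss(mu, sigma2 ** 0.5)
    m = (x1 + x2) / 2                     # MLE of this pair's mean
    ss += (x1 - m) ** 2 + (x2 - m) ** 2
mle_sigma2 = ss / (2 * J)                 # MLE of sigma^2 over all 2J points
print(mle_sigma2)                          # ~0.5, i.e. sigma^2 / 2, not sigma^2
```

The estimate halves because each pair's fitted mean absorbs half the within-pair variability; the number of nuisance parameters grows with the data, which is exactly the regime where MML remains consistent.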
Thomas M. Cover
D
Ian Davidson.
A. Phil Dawid
Ivanoe De Falco
(and colleagues)'s
papers
and
Genetic Programming Approach
to Kolmogorov Complexity page.
Trevor Dix
Byron Dom's
publications.
David Dowe
and
(with
C. S. Wallace):
"Minimum
Message Length and Kolmogorov complexity"
Comp.
J., Vol 42, No. 4 (1999a),
pp270-283
[this article is the Computer Journal's most downloaded ``full text as
.pdf'' - see, e.g.,
here];
and
(with J. W. Comley)
"Minimum Message
Length, MDL and Generalised Bayesian Networks with Asymmetric Languages",
Chapter 11
(pp265-294)
in P. Grünwald, I. J. Myung and M. A. Pitt (eds.),
Advances in Minimum Description Length:
Theory and Applications,
M.I.T. Press, April 2005, ISBN 0-262-07262-9;
and
D. Dowe publications
(or other D. Dowe publications).
Mike Dowman.
E
Bruce Edmonds's
publications and
Complexity Related
Links.
Russell Edwards,
Astronomical Institute
"Anton Pannekoek",
Faculty of Mathematics, Computer Science, Physics and Astronomy
(WINS),
University of Amsterdam.
Mark Ellison
(was here).
Vladimir Estivill-Castro
F
Graham Farr's
publications.
Ronald Aylmer
Fisher
(1890-1962): While R. A. Fisher is generally considered far from
Bayesian, both
Rissanen's
MDL work (e.g., 1996) and
Wallace's
(Bayesian) MML work (e.g., with Freeman, 1987)
use the notion of Fisher information.
Image of R. A. Fisher
(was here).
Leigh Fitzgibbon's
publications.
Simon Fox
Roberto Fraile
(and here).
Pasi Fränti.
Norio Fukuda
G
Peter Gacs
Alex Gammerman
Andres Garcia
Russell Greiner
Peter Grünwald
and
here;
and
(MDL source book)
``Advances in Minimum Description
Length'', April 2005, MIT Press.
H
Alan Hajek
Brian Hanlon
Hongxing He
José Hernàndez i Orallo
(and
in English,
Jose
Hernandez-Orallo).
Phil Hingston
Marcus Hutter
I
Information theory:
Min. Enc. Length Inference, MDL
and MML are based in information theory - as is also
AIC.
See also quantum info. theory.
Invariance: Strict MML and many of its approximations - such as
Wallace & Freeman (JRSS, 1987) and MMLD (or I1D) - are statistically
invariant.
See, e.g.,
Dowe, Gardner
& Oppy (2007) (page ?7? and elsewhere)
and
Comley & Dowe (2005)
(sec. 11.3
[pp268,
269
and
270]).
See also invariance and consistency
conjecture.
Laurence Irlicht
J
Bruno Janvier.
Tao Jiang
Murray Jorgensen
K
Paul Kabaila
Kenichi
Kanatani's
recent
publications.
Aram Karalic
Gary Warren King
Andrei
N. Kolmogorov (1903-1987) sites:
http://www.Kolmogorov.com/Kolmogorov.html,
a c.v.
(in Russian),
http://www.pms.ru/kolmogorov
(in Russian),
exploratorium.edu,
dcs.st-and.ac.uk
and
(by Paul Vitanyi)
a biography.
Petri Kontkanen
L
Aaron Lanterman
(was here).
Thomas Lee's
publications and
technical reports.
Shane Legg
Leonid Levin
Mengxiang Li
Ming Li
Xiaodong Li
M
Dean McKenzie's
publications.
Enes Makalic.
Steve Maybank
(and here).
www.MDL-research.org,
people,
reading,
demonstrations
and
related topics;
and
(MDL source book)
``Advances in Minimum Description
Length'', April 2005, MIT Press.
Paul Ming's
Virtual Turing Machine
(VTM)
and
Virtual Turing Machine 2
(VTM2).
Minimum Description Length
(MDL) and
comparisons
with MML
(on
pp270-283
and elsewhere)
in
Comp.
J., Vol 42, No. 4, 1999.
Mixture modelling
(or finite
mixture modeling),
clustering
and/or
unsupervised learning
using MML:
Snob
- see, e.g.,
Wallace and Dowe (2000).
MML and the
"Turing test".
Suzie Molloy.
Robert J. Munro
Petri Myllymäki
N
Julian R. Neil
Doug Newlands
O
Ockham, William of (c. 1280 or 1285 to
c. 1347 or 1349, apparently 10th April 1349),
Ockham's razor
and town of Ockham.
R. D. Ogden,
Computer Science Department, Southwest Texas State University,
San Marcos, TX 78666, USA; ro01@swt.edu.
Jon Oliver
Graham Oppy
P
Jon Patrick.
Philosophy, Stanford Encyclopaedia of.
Q
Guoqi Qian
Quantum information theory -
Quantum
Computer Technology
(U Melb,
U NSW,
U Qld).
R
Anand Venkataraman
(was Anand (Venkt) Raman).
Fengrong Ren
Jorma Rissanen
(and
publications,
although the links seem dead)
and
some
sample publications.
Ricardo Rocha
Teemu Roos
Juho Rousu.
S
Pritika Sanghi.
J. G. Sanjayan
Jürgen Schmidhuber
Stanley L. Sclove's
journal publications
and
working papers.
Claude Shannon ("father of
information theory")'s
obituary
(and another
obituary),
1916-2001.
Alexander Shen
Tomi Silander
Jatinder Singh.
Snob
(software):
mixture modelling
(to infer a
finite
mixture model)
using MML.
Ray Solomonoff's
biography,
publications
and
obituary (New York Times, Wedn 13/Jan/2010):
online
and
scanned
- and
Solomonoff 85th
memorial conference (Wedn 30 Nov - Fri 2 Dec 2011) :
3rd
Call for Papers,
2nd
Call for Papers,
1st
Call for Papers,
Invited
speakers,
accepted
papers.
David Suter.
T
Peter Jing Tan;
and
P. J. Tan and
D. L. Dowe
(2003),
"MML Inference of
Decision Graphs with Multi-Way Joins and Dynamic Attributes",
Proc. 16th Australian Joint
Conference on Artificial Intelligence (AI'03), Perth, Australia,
3-5 Dec. 2003, Published in Lecture Notes in Artificial Intelligence
(LNAI) 2903, Springer-Verlag,
pp269-281.
Hiroshi Tanaka
Kai Ming Ting
Henry Tirri
Peter Tischer
Phil H. S. Torr
(was here).
John Tromp's Home Page (was
here).
Alan Turing (1912-1954),
developer of
(Universal)
Turing Machines,
among many other things.
Sites
maintained by
Andrew Hodges
and
dcs.st-and.ac.uk;
The Turing archive for the history of
computing
(maintained by Jack
Copeland and Gordon
Aston);
and
image of Alan Turing.
A.M. Turing's (1950)
"Computing
Machinery and Intelligence", Mind, 59, pp433-460 -
which is the paper which introduced the
imitation game
or
"Turing Test".
Turing
Machine simulator:
http://wap03.informatik.fh-wiesbaden.de/weber1/turing/tm.html
and
documentation.
Virtual Turing Machine
(VTM)
and
Virtual Turing Machine 2
(VTM2),
by
Paul Ming.
The "Turing test"
and MML.
Charles Twardy
and
Bayesian Models for Search & Rescue.
U
Unsupervised learning,
mixture modelling
(or finite
mixture modeling)
and/or
clustering
using MML:
Snob
- see, e.g.,
Wallace and Dowe (2000).
Aleksey M. Urmanov.
William Uther
(was here).
V
Farshid Vahid
Manuela Veloso.
Brani Vidakovic.
Murli Viswanathan
(and here)'s
publications.
Paul Vitanyi's
Publications & Areas of Interest,
Kolmogorov complexity
publications,
selected recent papers
and
publications.
Volodya Vovk
V. V. V'yugin
W
Chris Wallace's publications,
including
"Minimum
Message Length and Kolmogorov complexity" (with D. L. Dowe),
Comp.
J., Vol 42, No. 4 (1999a),
pp270-283
[this article is the Computer Journal's most downloaded ``full text as
.pdf'' - see, e.g.,
here];
and including his (posthumous) book:
Wallace, C.S. (2005) [posthumous],
Statistical and Inductive Inference by Minimum Message
Length, Springer (Series: Information Science and Statistics), 2005, XVI,
432 pp., 22 illus., Hardcover, ISBN: 0-387-23795-X
[table of
contents,
chapter headings
(was here)
and
more];
and
also including
Chris Wallace MML publications, 1990-
and also
Chris
Wallace MML publications, 1968-1991.
See also
D. L. Dowe,
"Foreword
re C. S. Wallace",
Computer Journal,
Vol. 51, No. 5
(Sept. 2008)
[Christopher
Stewart WALLACE (1933-2004) memorial special issue],
pp523-560
(and
here).
Ian H. Witten
Y
Kenji Yamanishi
Bin Yu's
publications.
Z
Alexander K. Zvonkin
Miscellaneous, other, links
Chris Wallace (1933-2004)
(developer of MML in
1968)
and
the Computer J's
Christopher
Stewart WALLACE (1933-2004) memorial special issue
[Vol. 51, No. 5
(Sept. 2008)],
Bayesian Nets
using
Minimum message length
(MML),
clustering and mixture modelling,
data repositories,
decision trees using Minimum message length (MML),
Occam's razor
(Ockham's razor),
Snob
(program for MML
clustering and mixture modelling)
to infer
finite mixture models,
(econometric)
time series
using MML,
medical research,
a probabilistic sports prediction
competition
(and further reading on probabilistic
scoring),
chess and game theory research;
Feeding the world
(TheHungerSite),
TheRainforestSite,
"do-goody" stuff: improving the world and saving the planet.
Please e-mail me if
you would like to know more.
This page is
http://www.csse.monash.edu.au/~dld/MELI.html.
Copyright
David L. Dowe,
Monash University, Australia,
31 Jan. 2000, etc.
Copying is not permitted without express permission from
David L. Dowe.