
The project is to form a "theory" (or prelude or semantics or model(!))
of `programming with "statistical models" from AIDMIIMLSI',
using Haskell as the analytical tool.
The project aims are, in order of priority:
 Understand and formally define exactly what "statistical models" (i.e. the products^{[1]} of AIDMIIMLSI^{[2]}) really are from a programming point of view, that is, how do they behave, what can be done to each one, and how can two or more be combined?
 Develop (polymorphic) types and typeclasses to define "statistical models", and a useful set of operators (functions, combinators, methods) to act on them, i.e. a prelude (library).
 Dismember and reconstruct a few representative, important and well-known "statistical models" from AIDMIIMLSI and, while doing so, add generality, and make a broad claim to being realistic.
 Eventually, encode as much as possible of AIDMIIMLSI in a convenient, compact library while adding lightness and generality.
^{[1]} E.g. mixture-modelling (unsupervised classification, clustering), classification (decision) trees (supervised classification, expert systems), or Bayesian/ causal networks/ models, etc.
^{[2]} Definition: AIDMIIMLSI = artificial-intelligence/ data-mining/ inductive-inference/ machine-learning/ statistical-inference/ etc., call it what you will, i.e. somewhere between the intersection and union of these areas.
( ~ Super Thunder Sting Car Ray Bird  Pete & Dud.)
Note that gaining understanding is the primary aim.
That understanding could, for example, be used to
(i) design better components for an existing AIDMIIMLSI platform,
(ii) design a better AIDMIIMLSI platform, or even
(iii) build a serious platform in Haskell,
but these other things are secondary, not primary, aims.
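
The first two aims (how models behave, and how two or more can be combined) can be made concrete with a small Haskell sketch. It is an illustration only: the names Probability, Model, pr, nlPr, FnModel, Mixture, coin and mixed are assumptions chosen for this example and should not be read as the project's actual library.

```haskell
-- A minimal sketch; every name here is hypothetical, chosen for illustration.

type Probability = Double

-- Behaviour: a model over a data space d gives a probability to each datum.
class Model m where
  pr   :: m d -> d -> Probability
  -- Negative log probability, i.e. a message length in nats.
  nlPr :: m d -> d -> Double
  nlPr m x = negate (log (pr m x))

-- The simplest carrier: a model wrapped around a probability function.
newtype FnModel d = FnModel (d -> Probability)

instance Model FnModel where
  pr (FnModel f) = f

-- Combination: a weighted mixture of models is itself a model.
-- The weights are assumed to be non-negative and to sum to 1.
newtype Mixture m d = Mixture [(Probability, m d)]

instance Model m => Model (Mixture m) where
  pr (Mixture parts) x = sum [ w * pr m x | (w, m) <- parts ]

-- A coin over Bool with the given probability of heads (True).
coin :: Probability -> FnModel Bool
coin p = FnModel (\heads -> if heads then p else 1 - p)

-- An equal-weight mixture of a fair coin and a biased coin.
mixed :: Mixture FnModel Bool
mixed = Mixture [(0.5, coin 0.5), (0.5, coin 0.9)]
```

Here pr mixed True combines the two coins' probabilities by their weights, and nlPr comes for free from the class default. The point of the sketch is only that a combined model has the same interface as its parts.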

 This exercise is to artificial-intelligence/ data-mining/ inductive-inference/ machine-learning/ statistical-inference
as
 PreludeList (map, fold, zip, ...) is to list-processing,
  (notes) Imagine yourself in the late 1950s or the early 1960s developing Lisp (McCarthy et al), or APL (Iverson), or similar. NB. PreludeList now has 40+ years of experience behind it!
 parser combinators are to parsing,
 the mathematical semantics of Algol60 (Mosses 1974) is to the programming language Algol60,
  (notes) Understanding of Algol60 compared to other languages.
and
 function is to functional programming
as
 statistical model is to <what?> programming, perhaps 'inductive programming'?
  (notes) The term I.P. was suggested by Charles Twardy (2004).
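
To illustrate the analogy with PreludeList, here is a hedged sketch in which models are ordinary first-class values and a few combinators play roles loosely comparable to zip and fold. The names Model, pr, both, prAll and frequencies are assumptions made up for this example, not the project's API.

```haskell
-- A minimal sketch; all names are hypothetical.

type Probability = Double

-- A model is a first-class value, just as a function is.
newtype Model d = Model { pr :: d -> Probability }

-- Loosely like zip: two models, assumed independent, give a model of pairs.
both :: Model a -> Model b -> Model (a, b)
both m1 m2 = Model (\(x, y) -> pr m1 x * pr m2 y)

-- Loosely like fold: the probability of a data set, assuming the
-- data are independent and identically distributed.
prAll :: Model d -> [d] -> Probability
prAll m = product . map (pr m)

-- An estimator is a function from data to a model; here, plain
-- relative frequencies (no smoothing; an empty data set gives NaN).
frequencies :: Eq d => [d] -> Model d
frequencies xs = Model (\x -> count x / total)
  where
    total   = fromIntegral (length xs)
    count x = fromIntegral (length (filter (== x) xs))
```

For example, both (frequencies [True, True, False, True]) (frequencies "aab") is a model over (Bool, Char) pairs built from two estimated models, in the same way that list functions are built from map and zip.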

other "possible" approaches (approach: why not)
 Create a new language for machine learning:
  A lot of work. Hard to maintain, 'port, etc. Unlikely to improve on Haskell's notion of value and type (although a simplified, specialised subset might be a "goer"). Don't create languages without good reasons.
 Translate some existing AIDMIIMLSI platform directly into Haskell, say:
  Might be useful but does not aid understanding.

the project is not primarily about... (...because)
 Any particular application area or problem instance of AIDMIIMLSI:
  (if it was it might use R); rather, it is about understanding what AIDMIIMLSI could be.
 Software Engineering:
  although it has aspects of a software engineering job on AIDMIIMLSI.
 Stamp (i.e. model) collecting:
  no way, but it is about understanding the machinery that allows stamps to be produced, and generalized.
 Haskell:
  Haskell is "just" a good tool. But it is curious that there is no built-in [class Function], or [class Pair], etc.
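
As a guess at what the remark about [class Function] and [class Pair] has in mind, the sketch below shows one way such classes might be written in Haskell. The class and method names (Function, apply, Pair, first, second) and the Described type are assumptions for illustration; nothing here is taken from the standard Prelude or from the project's code.

```haskell
-- A speculative sketch; every name here is hypothetical.

-- Things that can be applied like functions.
class Function f where
  apply :: f a b -> a -> b

-- Ordinary functions are the obvious instance.
instance Function (->) where
  apply f = f

-- A function carrying a description is also function-like.
data Described a b = Described String (a -> b)

instance Function Described where
  apply (Described _ f) = f

-- Things that can be taken apart like pairs.
class Pair p where
  first  :: p a b -> a
  second :: p a b -> b

instance Pair (,) where
  first  = fst
  second = snd

-- Generic code written once against the classes, not the concrete types.
onFirst :: (Function f, Pair p) => f a c -> p a b -> c
onFirst f p = apply f (first p)
```

With such classes, a function-like model (for instance a regression) could be applied with the same apply as a plain function, which may be the kind of uniformity the remark is pointing at.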

 Seminars at:
Monash U.,
Griffith U. &
U. Sydney
[I.P.] 2005,
York [TSM],
York [II] 2004.
 L. Allison,
Added Distributions for use in Clustering (Mixture Modelling), Function Models, Regression Trees, Segmentation, and mixed Bayesian Networks in Inductive Programming 1.2,
TR 2008/224, FIT Monash U.,
April 2008.
 M. B. Dale, L. Allison, P. E. R. Dale.
Segmentation and clustering as complementary sources of information.
Acta Oecologica, Elsevier, 31(2), pp.193-202,
March-April 2007,
doi:10.1016/j.actao.2006.09.002.
 J. Bardsley.
Generalising Data Description for Machine Learning,
BCS honours project, 2006.
(Explores the use of Template Haskell
to generate helper functions, types and class instance declarations)
 L. Allison.
A Programming Paradigm for Machine Learning, with a Case Study of Bayesian Networks.
ACSC2006, pp.103-111, January 2006.
 L. Allison.
Inductive inference 1.1.2:
Inductive programming and a case study of Bayesian networks.
Faculty of Info. Tech. (#75, Clayton),
Monash University, Australia 3800,
TR 2005/177, pp.18, Oct. 2005.
(Also see TR2004/153.)
 L. Allison.
Models for machine learning and data mining in functional programming,
J. Functional Programming, 15(1), pp.15-32,
January 2005
(online 23 July 2004).
 L. Allison.
Inductive Inference 1.1.
TR 2004/153,
School of Computer Science and Software Engineering,
Monash University, May 2004
(inc. mixed Bayesian networks).
 L. Allison.
Inductive Inference 1.
TR 2003/148,
School of Computer Science and Software Engineering,
Monash University, Dec. 2003,
inc. .hs code.
 A seminar, 21 Oct. 2003, to
[Dept CS & SWE, Melbourne U.].
 L. Allison.
Types and Classes of Machine Learning and Data Mining,
Twenty-Sixth Australasian Computer Science Conference (ACSC2003)
pp.207-215, Adelaide, Australia, 4-7 February 2003,
(some method names have since changed).
Terminology
Terminology is a problem.
Just consider the many uses of the words "model" and "class"
in computing, mathematics and statistics.

