School of Computer Science & Software Engineering,
Monash University, Clayton, Victoria,
The talk concerns 'statistical models' as used in
artificial intelligence, data mining,
inductive inference or machine learning.
Many models are naturally polymorphic.
There are useful operators,
some known and some yet to be discovered *,
for combining models to make new models.
Overfitting is a potential problem in inductive inference but there is a
natural combinational criterion, MML,
to trade off model complexity against fit to data.
I suggest that statistical models, MML and functional programming
make a natural threesome.
Examples are given from "time"-series analysis.
Prog. Lang. & Sys.
(PLASMA) research group,
Dept. of Computer Science,
Wed. 15 Dec. 2004.
[*] (un)known unknowns?
What should this kind of programming be called?
Q: 'Function' is to 'functional programming' as
'statistical model' is to <what>?
Inductive programming ?
model "error" v. model complexity
Overfitting is a problem, unless
... MML model selection!
data + model complexity
Math can be hard with (variable numbers of)
discrete and continuous parameters.
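The trade-off on this slide can be written as the usual two-part MML message length. This is a sketch only: the symbols H (hypothesis, i.e. model) and D (data) are my notation, not the slides'.

```latex
\[
\mathrm{msgLen}(H \,\&\, D)
  \;=\; \underbrace{-\log \Pr(H)}_{\text{model complexity}}
  \;+\; \underbrace{-\log \Pr(D \mid H)}_{\text{fit to data}}
\]
```

The model with the shortest total message is preferred; a more complex model must pay for its longer first part with a correspondingly shorter second part.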
Most important abilities of a (basic) model
class Model mdl where
  pr   :: (mdl dataSpace)
          -> dataSpace -> Probability
  nlPr :: (mdl dataSpace)
          -> dataSpace -> MessageLength
  msg  :: SuperModel (mdl dataSpace)
          => (mdl dataSpace) -> [dataSpace]
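A minimal, self-contained sketch of how such a class might be instantiated. It is a simplified variant, not the talk's library: the type synonyms, the default methods, the `ModelPr` wrapper, the `Base` type and `dataLength` are all assumptions made for illustration, and `msg` is left out.

```haskell
-- Assumed representations; the slides do not define these types.
type Probability   = Double
type MessageLength = Double   -- natural-log units (nits), an assumption

class Model mdl where
  pr   :: mdl dataSpace -> dataSpace -> Probability
  nlPr :: mdl dataSpace -> dataSpace -> MessageLength
  -- Each method defaults to the other, so an instance defines either one.
  nlPr m x = negate (log (pr m x))
  pr   m x = exp (negate (nlPr m x))
  {-# MINIMAL pr | nlPr #-}

-- Hypothetical basic model: an explicit probability function.
newtype ModelPr dataSpace = ModelPr (dataSpace -> Probability)

instance Model ModelPr where
  pr (ModelPr f) = f

-- Hypothetical data space: the four DNA bases.
data Base = A | C | G | T deriving (Eq, Show)

-- Uniform model over the four bases.
uniformBases :: ModelPr Base
uniformBases = ModelPr (const 0.25)

-- Message length of a data set, assuming the items are independent.
dataLength :: Model mdl => mdl d -> [d] -> MessageLength
dataLength m = sum . map (nlPr m)
```

Because the class ranges over the type constructor `mdl`, the same `dataLength` works for a model of any data space, which is the polymorphism the talk relies on.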
A "superclass" for Statistical Models
class (Show sMdl) => SuperModel sMdl where
  prior :: sMdl -> Probability
  msg1  :: sMdl -> MessageLength
  (Mixture mx, SuperModel (mx sMdl))
          => mx sMdl -> sMdl
class FunctionModel fm where
  condModel :: (fm inSpace opSpace)
               -> inSpace -> ModelType opSpace
  condPr    :: (fm inSpace opSpace)
               -> inSpace -> opSpace -> Probability
I.e. given the values of the input (exogenous) attributes (variables),
make a conditional (model) prediction of the output (endogenous) attributes.
class TimeSeries tsm where
  predictors :: (tsm dataSpace)
                -> [dataSpace] -> [ModelType dataSpace]
  prs        :: (tsm dataSpace)
                -> [dataSpace] -> [Probability]
I.e. a time-series model makes a sequence of (model) predictions
for what comes next, given the preceding context.
estMarkov k dataSeries =
scan (d:ds) context = ...
contexts = scan dataSeries
k contexts dataSeries)
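The idea behind an order-k Markov estimator, pairing each item with the k items before it and counting, can be sketched as follows. This is not the talk's `estMarkov`: the names `contexts`, `countTransitions` and `predict` are hypothetical, and add-one smoothing stands in for whatever estimator the original uses.

```haskell
import qualified Data.Map.Strict as Map
import Data.List (foldl')

-- For each position, the (up to) k preceding items, most recent first.
contexts :: Int -> [a] -> [[a]]
contexts k = scan []
  where
    scan _   []       = []
    scan ctx (d : ds) = ctx : scan (take k (d : ctx)) ds

-- Counts of (context, next item) pairs.
type Counts a = Map.Map ([a], a) Int

countTransitions :: Ord a => Int -> [a] -> Counts a
countTransitions k xs =
  foldl' (\m key -> Map.insertWith (+) key 1 m)
         Map.empty
         (zip (contexts k xs) xs)

-- Predictive probability of x after context ctx, with add-one smoothing
-- over an explicitly supplied alphabet (an assumption of this sketch).
predict :: Ord a => [a] -> Counts a -> [a] -> a -> Double
predict alphabet counts ctx x =
  (count x + 1) / (sum (map count alphabet) + fromIntegral (length alphabet))
  where
    count y = fromIntegral (Map.findWithDefault 0 (ctx, y) counts)
```

For example, with `counts = countTransitions 1 "abababab"`, the call `predict "ab" counts "a" 'b'` gives the probability of seeing 'b' straight after 'a'.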
Many useful functions such as
=> fm [ds] ds
-> TimeSeriesType ds
data OrBoth a b = JustL a |
                  JustR b |
                  Both a b   deriving ...
model Ops -> model a
-> model b
-> model (a,b)
-> model (OrBoth a b)
(model a -> model b -> model (a,b))
tsm Ops -> tsm a
-> tsm b
-> tsm (OrBoth a b)
... when it is convenient to make a model (a, b)
from the (model a)
and the (model b).
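e.g. optimal alignment
[Figure: two aligned sequences; columns are labelled with OrBoth values, e.g. Both A G (A aligned with G) and JustL C (C from the first sequence only).]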
2-D dynamic programming algorithm.
dpa2D :: Int ->
tsm (OrBoth a b) -> [a] -> [b]
-> [OrBoth a b]
Note that `a' and `b' can be different types,
discrete, continuous, multivariate, even non-EQ types.
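A much simplified sketch of a 2-D dynamic programming alignment over `OrBoth`. It is not the talk's `dpa2D`: instead of a time-series model over `OrBoth a b` it takes a hypothetical, context-free cost for each alignment column (think of it as a negative log probability), and the name `align` is mine.

```haskell
import Data.Array
import Data.List (minimumBy)
import Data.Ord (comparing)

data OrBoth a b = JustL a | JustR b | Both a b deriving (Eq, Show)

-- Returns the minimum total cost and one alignment that achieves it.
align :: (OrBoth a b -> Double) -> [a] -> [b] -> (Double, [OrBoth a b])
align cost xs ys = table ! (0, 0)
  where
    n  = length xs
    m  = length ys
    xa = listArray (0, n - 1) xs
    ya = listArray (0, m - 1) ys
    -- table ! (i, j) is the best alignment of the suffixes from i and j;
    -- laziness fills in the cells in dependency order.
    table = array ((0, 0), (n, m))
              [ ((i, j), cell i j) | i <- [0 .. n], j <- [0 .. m] ]
    cell i j
      | i == n && j == m = (0, [])
      | otherwise = minimumBy (comparing fst) $
             [ step (Both (xa ! i) (ya ! j)) (i + 1, j + 1) | i < n, j < m ]
          ++ [ step (JustL (xa ! i)) (i + 1, j) | i < n ]
          ++ [ step (JustR (ya ! j)) (i, j + 1) | j < m ]
    step op next =
      let (c, ops) = table ! next
      in  (cost op + c, op : ops)
```

Nothing here requires `Eq` on `a` or `b`; only the supplied cost function inspects the items, which is in the spirit of the note that the two types may differ and need not support equality.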
The time-series type,
tsm, often has a "memory".
Needs time-series models with "state".
[OrBoth a b] <--->
([Ops], [a], [b]),
LHS & RHS equivalent, if lengths match.
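The equivalence between the two representations can be sketched as a pair of conversion functions. The constructors of `Ops` are not shown on the slides, so `L`, `R` and `B` are hypothetical, as are the names `unzipOrBoth` and `zipOrBoth`.

```haskell
data OrBoth a b = JustL a | JustR b | Both a b deriving (Eq, Show)

-- Hypothetical operation alphabet: left only, right only, both.
data Ops = L | R | B deriving (Eq, Show)

-- Split an alignment into its operations and its two sequences.
unzipOrBoth :: [OrBoth a b] -> ([Ops], [a], [b])
unzipOrBoth = foldr step ([], [], [])
  where
    step (JustL x)  (os, xs, ys) = (L : os, x : xs, ys)
    step (JustR y)  (os, xs, ys) = (R : os, xs, y : ys)
    step (Both x y) (os, xs, ys) = (B : os, x : xs, y : ys)

-- Rebuild the alignment; Nothing if the lengths do not match the operations.
zipOrBoth :: [Ops] -> [a] -> [b] -> Maybe [OrBoth a b]
zipOrBoth []       []       []       = Just []
zipOrBoth (L : os) (x : xs) ys       = (JustL x :)  <$> zipOrBoth os xs ys
zipOrBoth (R : os) xs       (y : ys) = (JustR y :)  <$> zipOrBoth os xs ys
zipOrBoth (B : os) (x : xs) (y : ys) = (Both x y :) <$> zipOrBoth os xs ys
zipOrBoth _        _        _        = Nothing
```

The `Maybe` result makes the "if lengths match" condition explicit: the round trip succeeds exactly when the operations consume both sequences completely.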
[Ops] ---> (tsm Ops),
[a] ---> (tsm a),
[b] ---> (tsm b)
---> (new) tsm (OrBoth a b)
---> (new) [OrBoth a b] ... (EM) must converge.
A generalised Lempel Ziv time-series model
data ApproxRepeats a =
  Jump Int |
  Rpt (OrBoth a a)   deriving ...
(model a -> model a -> model (a,a))
OpsAR -> tsm a
OpsOB -> tsm a
(and recall that tsmOrBoth
has a (tsm a) parameter ... yes you can.-)
subject of this talk
denotational semantics of L
... before parser combinators etc. were invented.
(the start of) a combinator library for machine learning,
a combinational denotational semantics of statistical models,
a way of "software engineering" much of AI & machine learning,
some tiny but very general programs for e.g.
clustering, (decision-) classification-trees.
And if you have a good answer to the question
(what should this kind of programming be called?),
please tell me.
L. A λλ ison,
School of Computer Science and Software Engineering,
Monash University, Australia 3800.