
Bayesian Networks

hybrid Bayes nets

The estimation of Bayesian networks makes a good case-study to illustrate inductive programming (IP).

A Bayesian network can be over discrete or continuous attributes (variables), or over discrete and continuous (mixed) attributes.

The main interests here are the types and classes of the networks, and the expressiveness of IP.
The estimator below is given a permutation of the variables, but a search over permutations could also be programmed, without change to the types or classes.
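As a rough illustration of that last point, a brute-force search over permutations can be wrapped around any cost function on variable orderings; in the real estimator the cost would be the network's message length, but `bestPerm` and the stand-in `toyCost` below are invented for this sketch:

```haskell
import Data.List (minimumBy, permutations)
import Data.Ord  (comparing)

-- Hypothetical sketch: given some cost function on variable orderings
-- (in the real estimator, the network's total message length), a search
-- over permutations needs no change to any types or classes.
bestPerm :: ([Int] -> Double) -> Int -> [Int]
bestPerm cost n = minimumBy (comparing cost) (permutations [0 .. n-1])

-- A stand-in cost, purely for demonstration.
toyCost :: [Int] -> Double
toyCost p = fromIntegral (sum (zipWith (*) p [1 ..]))

main :: IO ()
main = print (bestPerm toyCost 3)
```

Exhaustive search is only feasible for small n, of course; a greedy or sampled search over orderings would plug into the same type.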


Network Estimator

estNetwork  perm  estMV  dataSet =
 let
  n = length perm
  search _  []     = []  -- done
  search ps (c:cs) =
   let  -- use parents ps to predict children c:cs
    opFlag  = ints2flags [c] n  -- identify the child c ...
    ipFlags = ints2flags ps  n  -- ... and the parents ps.
    cTree = estCTree (estAndSelect estMV opFlag) -- leaf est
                     (splitSelect ipFlags)  -- allowed tests
                     dataSet dataSet        -- !
   in cTree : (search (c:ps) cs)  -- i.e. list of class'n trees
  trees     = search [] perm       -- network
  msgLen    = sum (map msg1 trees) -- cost
  nlP datum = sum (map (\t -> condNlPr t datum datum) trees) -- neg log Pr
 in MnlPr msgLen nlP (\() -> "Net:" ++ show trees)           -- a Model
-- end --
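The helper ints2flags is not shown in the listing above; a plausible definition (an assumption here, including 0-based attribute indices) converts a list of indices into an n-long list of Boolean flags:

```haskell
-- Assumed definition, not taken from the listing above: turn a list of
-- 0-based attribute indices into an n-long list of Boolean flags.
ints2flags :: [Int] -> Int -> [Bool]
ints2flags is n = map (`elem` is) [0 .. n-1]

main :: IO ()
main = do
  print (ints2flags [2]   4)  -- the child's flag
  print (ints2flags [0,1] 4)  -- the parents' flags
```

With flags of this form, selecting child c=2 from parents {0,1} in a 4-variable space gives opFlag = [False,False,True,False] and ipFlags = [True,True,False,False].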


Each node of a Bayesian network contains a classification-tree. A classification-tree can split (test) on both discrete and continuous attributes, and it can model discrete or continuous variables (or other types) in its leaves.
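A minimal sketch of such a tree (illustrative types only, not the library's actual estCTree machinery): internal nodes test either a discrete or a continuous attribute, and leaves hold an arbitrary leaf model:

```haskell
-- Illustrative sketch, not the library's types: a classification tree
-- whose internal nodes test discrete or continuous attributes.
data Attr = D Int | C Double  -- a discrete or a continuous value

data CTree leaf
  = Leaf leaf                                    -- a leaf model
  | SplitD Int [CTree leaf]                      -- test discrete attr i; one subtree per value
  | SplitC Int Double (CTree leaf) (CTree leaf)  -- test continuous attr i against a threshold

-- Follow the tests down to the leaf model for a datum.
classify :: CTree leaf -> [Attr] -> leaf
classify (Leaf m) _ = m
classify (SplitD i ts) datum =
  case datum !! i of D v -> classify (ts !! v) datum
                     _   -> error "expected a discrete attribute"
classify (SplitC i th lo hi) datum =
  case datum !! i of C x -> classify (if x < th then lo else hi) datum
                     _   -> error "expected a continuous attribute"

main :: IO ()
main = putStrLn (classify t [D 1, C 2.5])
  where t = SplitC 1 3.0 (Leaf "low") (Leaf "high")
```

In the Bayesian network, the tree at a node tests only parent attributes, and its leaf models predict only the child attribute.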

The Bayesian network estimator uses
the estimator for classification trees, estCTree, and
a new type of model, ModelMV...


ModelMV is a new model type for multivariate data-spaces.

data ModelMV dataSpace =
 MnlPrMV ([Bool] -> MessageLength)               -- msg1
         ([Bool] -> dataSpace -> MessageLength)  -- msg2
         ([Bool] -> String)                      -- show
         Int                                     -- width

Certain dimensions can be 'select'-ed by Boolean flags...

type-class 'Project'

Class Project, as in 'projection', allows the Bayes Net estimator to select input- and output-subspaces of the data-space to be 'parents' and 'children'.

class Project t where
  select :: [Bool] -> t -> t
    -- non-selected parts become "identities"
  selAll :: t -> [Bool]
    -- i.e. all "on" flags

An instance of class Project is some 'multi-dimensional' value where certain dimensions can be 'select'-ed.

(In addition, a class Splits2, a variation on the standard class Splits, enables independent and dependent dimensions of the data-space to be selected for the estimator of classification trees, as used in the nodes of a Bayesian network.)

'ModelMV' an instance of 'Project'

instance Project (ModelMV ds) where
 select bs (MnlPrMV m n s w) =
  MnlPrMV (\bs' -> m (zipWith (&&) bs bs'))
          (\bs' -> \datum -> n (zipWith (&&) bs bs') datum)
          (\bs' -> s (zipWith (&&) bs bs'))
          w
 selAll (MnlPrMV _ _ _ w) = replicate w True
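The effect of select can be seen with a self-contained toy instance; CostsMV below is invented for illustration (it is not ModelMV), with masked-off dimensions contributing an identity, here a zero cost:

```haskell
type MessageLength = Double

class Project t where
  select :: [Bool] -> t -> t
  selAll :: t -> [Bool]

-- Invented for illustration: a "model" that is just one cost per dimension.
newtype CostsMV = CostsMV [MessageLength]

instance Project CostsMV where
  select bs (CostsMV cs) = CostsMV (zipWith (\b c -> if b then c else 0) bs cs)
  selAll (CostsMV cs)    = replicate (length cs) True

total :: CostsMV -> MessageLength
total (CostsMV cs) = sum cs

main :: IO ()
main = print (total (select [True, False, True] (CostsMV [1, 2, 3])))
```

Composing two selections masks with the conjunction of the two flag lists, which is exactly what zipWith (&&) achieves in the ModelMV instance.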

generalized Bayesian network


It took one day to implement the Bayesian networks case-study in inductive programming.

As an application, a Bayesian network was learned for search and rescue data -- lost persons -- in one and a half days.

L. Allison. Models for Machine Learning and Data Mining in Functional Programming, J. Functional Programming, 15(1), pp.15-32, January 2005; also see [II].
-- " --. Inductive Inference 1.1. TR 2004/153, School of Comp. Sci. & Software Eng., Monash U., May 2004; also see [II/1.1/BN].

© L. Allison, School of Computer Science and Software Engineering, Monash University, Australia 3800.