[01]
1: Introduction
CSE454
2005
This document is online at
http://www.csse.monash.edu.au/~lloyd/tilde/CSC4/CSE454/
and contains hyperlinks to other resources

[02]
NB. The term 'data space' is often used in machine learning.
[04]
Inference
People often distinguish between ...

[05]
Bayes
If B_{1}, B_{2}, ..., B_{k} is a partition of a set B (of causes), then
P(B_{i}|A) = P(A|B_{i})·P(B_{i}) / [P(A|B_{1})·P(B_{1}) + ... + P(A|B_{k})·P(B_{k})],   i = 1, 2, ..., k
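As a small sketch (not part of the original notes), the theorem can be computed directly over a finite partition; the function name and the two-cause numbers used to exercise it are illustrative only:

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Posterior P(B_i|A) over a partition B_1..B_k,
    given priors P(B_i) and likelihoods P(A|B_i)."""
    # denominator: P(A) = sum_i P(A|B_i) * P(B_i)
    p_a = sum(l * p for l, p in zip(likelihoods, priors))
    return [l * p / p_a for l, p in zip(likelihoods, priors)]

# two equally probable causes with unequal likelihoods
post = bayes_posterior([Fraction(1, 2), Fraction(1, 2)],
                       [Fraction(1, 16), Fraction(4, 81)])
```

Exact rational arithmetic (`Fraction`) keeps the posteriors as exact ratios rather than floating-point approximations.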
[06]
... applied to data D and hypotheses H_{i}:
P(D) = P(D|H_{1})·P(H_{1}) + ... + P(D|H_{k})·P(H_{k})
posterior:  P(H_{i}|D) = P(D|H_{i})·P(H_{i}) / P(D)
posterior odds-ratio:  P(H_{i}|D) / P(H_{j}|D) = [P(D|H_{i})·P(H_{i})] / [P(D|H_{j})·P(H_{j})]
[07]
NB. P(H_{i}) can be ignored in the posterior odds-ratio
if, and only if, P(H_{i}) = P(H_{j}).
Maximum likelihood can cause problems when the priors are unequal.
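To illustrate that last point, a hypothetical example (the numbers are invented, not from the notes): when priors are unequal, the hypothesis with the larger likelihood can still have the smaller posterior.

```python
from fractions import Fraction

# hypothetical unequal priors: H1 is known in advance to be rare
prior = [Fraction(1, 10), Fraction(9, 10)]
like  = [Fraction(3, 4),  Fraction(1, 2)]   # P(D|H1) > P(D|H2)

# maximum likelihood would pick H1, but the posterior odds favour H2
odds = (like[0] * prior[0]) / (like[1] * prior[1])   # 1/6 < 1
```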
[08]
Example
C_{1}, a fair coin: P(H) = P(T) = 1/2. C_{2}, a biased coin: P(H) = 2/3, P(T) = 1/3. One of the coins is thrown 4 times, giving H, T, T, H. Which coin was thrown?

[09]
Prior: P(C_{1}) = P(C_{2}) = 0.5. Likelihood: P(HTTH|C_{1}) = (1/2)^{4} = 1/16 and P(HTTH|C_{2}) = (2/3)·(1/3)·(1/3)·(2/3) = 4/9 · 1/9 = 4/81. Posterior odds-ratio:
P(C_{1}|HTTH) / P(C_{2}|HTTH) = (1/16 · 1/2) / (4/81 · 1/2) = 81/64.

[10]
Now, P(C_{1}|HTTH) + P(C_{2}|HTTH) = 1, and if x/(1-x) = 81/64 then x = 81/145, so
P(C_{1}|HTTH) = 81/145. This case is simple because the model space is discrete, in fact finite (just two hypotheses).
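The coin calculation can be checked with a short Python sketch (not part of the original notes), again using exact rationals:

```python
from fractions import Fraction

half, p2h = Fraction(1, 2), Fraction(2, 3)    # P(H) under C1 and C2
prior = [half, half]                          # P(C1) = P(C2) = 1/2

# likelihood of the observed sequence H, T, T, H under each coin
like = [half ** 4,                            # 1/16
        p2h * (1 - p2h) * (1 - p2h) * p2h]    # 4/81

odds = (like[0] * prior[0]) / (like[1] * prior[1])    # 81/64
p_d  = like[0] * prior[0] + like[1] * prior[1]        # P(HTTH)
post = [like[i] * prior[i] / p_d for i in range(2)]   # 81/145, 64/145
```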
[11]
e.g. prediction
We know P(C_{1}|HTTH) = 81/145 and P(C_{2}|HTTH) = 64/145; the more likely coin is C_{1}. If we assumed the coin really was C_{1}, we would
predict P(H) = 1/2. But the coin might be C_{2}. We should instead predict P(H) = (81/145)·(1/2) + (64/145)·(2/3) = 499/870,
i.e. use a weighted average of the hypotheses.
[12]
Conclusion
We have looked at Bayes' theorem, the posterior odds-ratio, and prediction by taking a weighted average over the hypotheses.
© 2005 L. Allison, School of Computer Science and Software Engineering, Monash University