The two attributes, phi and psi, are the two free dihedral angles
per amino acid on a protein backbone.
The peaks represent the centres of the classes found.
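Each datum x has a certain measurement accuracy, i.e. it is only known to within an interval.
NB. If the interval is small, the probability density
can be approximated as a constant over it, so that Pr(x) is approximately
density times interval width, and the width simply "passes through" the Maths.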
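A minimal sketch of the NB above, assuming (for illustration only) a Normal density for x; angles such as phi and psi would really need a circular distribution. The names normal_pdf, msglen_approx_bits and msglen_exact_bits are hypothetical, not taken from Snob. It compares -log2(eps.f(x)) with the exact probability of the interval:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of an assumed Normal(mu, sigma) class at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def msglen_approx_bits(x, mu, sigma, eps):
    """Length of x stated to accuracy eps, treating the density
    as constant over the interval: Pr(x) ~ eps * f(x)."""
    return -math.log2(eps * normal_pdf(x, mu, sigma))

def msglen_exact_bits(x, mu, sigma, eps):
    """Same length, using the exact probability of [x-eps/2, x+eps/2]."""
    def cdf(v):
        return 0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0))))
    return -math.log2(cdf(x + eps / 2.0) - cdf(x - eps / 2.0))

if __name__ == "__main__":
    x, eps = 0.3, 0.01
    print(msglen_approx_bits(x, 0.0, 1.0, eps), msglen_exact_bits(x, 0.0, 1.0, eps))
    # Difference in length between two hypotheses, for two accuracies:
    for e in (0.01, 0.1):
        print(msglen_approx_bits(x, 0.0, 1.0, e) - msglen_approx_bits(x, 1.0, 2.0, e))
```

Under the approximation, eps adds the same -log2(eps) to the length under every hypothesis, so it cancels when two hypotheses are compared on the same data.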
There are two ways of viewing, or implementing, fractional assignment:
1. Borrowed bits, Wallace (1986).
   E.g. consider a case of 50:50 membership:
   work out the code for the rest of the data,
   "borrow" the 1st bit from the rest to decide which
   membership choice to make,
   remove the borrowed bit from the rest and don't transmit it; the receiver can deduce it!
   This generalizes to unequal ratios and to >2 classes.
2. Directly from a code-book based on the combined, i.e. mixed, distribution.
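The second view can be sketched as follows, assuming 1-D Normal classes given as (weight, mu, sigma) triples and a measurement accuracy eps. All names here are hypothetical and this is an illustration, not Snob's code. It compares coding a datum via its single best class (total assignment) with coding it from the mixed distribution (fractional assignment):

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of an assumed Normal(mu, sigma) class at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def memberships(x, classes):
    """Fractional membership of x in each (weight, mu, sigma) class."""
    joint = [w * normal_pdf(x, mu, s) for w, mu, s in classes]
    total = sum(joint)
    return [j / total for j in joint]

def total_assignment_bits(x, classes, eps):
    """Name the single best class, then code x within that class."""
    return min(-math.log2(w * eps * normal_pdf(x, mu, s)) for w, mu, s in classes)

def fractional_assignment_bits(x, classes, eps):
    """Code x directly from the mixed distribution sum_k w_k f_k(x)."""
    return -math.log2(sum(w * eps * normal_pdf(x, mu, s) for w, mu, s in classes))

if __name__ == "__main__":
    # Two equally weighted classes; x = 0 lies midway, so membership is 50:50.
    classes = [(0.5, -1.0, 1.0), (0.5, 1.0, 1.0)]
    print(memberships(0.0, classes))
    print(total_assignment_bits(0.0, classes, 0.01)
          - fractional_assignment_bits(0.0, classes, 0.01))
```

In the 50:50 case the saving is about one bit, which is the bit that the first view "borrows" from the rest of the message.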
Note that class memberships are nuisance parameters
if we want (only) the class descriptions.
And, as is typical of nuisance parameters,
their number is proportional to |input|.
Fractional assignment = no nuisance.
The methods described allow two hypotheses to be compared and
the better one chosen, but there remains the search problem -
to find the best mixture model. `Snob' does the following:
Have a current working hypothesis,
iterate, re-estimating memberships & class parameters.
Periodically, consider merging two classes, or splitting a class,
accept if total message length is reduced.
Repeat until MsgLen(Hypothesis) + MsgLen(Data|Hypothesis)
stops reducing.
This must converge, possibly only to a local optimum, but it works well in practice.
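The re-estimation loop can be sketched roughly as below. This is not Snob's implementation: it assumes 1-D Normal classes held as (weight, mu, sigma) triples, uses simple membership-weighted estimates rather than MML estimates, tracks only the MsgLen(Data|Hypothesis) part, and omits the merge and split moves. All function names are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of an assumed Normal(mu, sigma) class at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def data_msglen_bits(xs, classes, eps):
    """MsgLen(Data|Hypothesis) in bits, under fractional assignment."""
    return sum(-math.log2(sum(w * eps * normal_pdf(x, mu, s) for w, mu, s in classes))
               for x in xs)

def reestimate(xs, classes, min_sigma=1e-3):
    """One pass: re-estimate memberships, then class parameters."""
    resp = []
    for x in xs:
        joint = [w * normal_pdf(x, mu, s) for w, mu, s in classes]
        total = sum(joint)
        resp.append([j / total for j in joint])
    new_classes = []
    for k in range(len(classes)):
        nk = sum(r[k] for r in resp)  # effective number of members of class k
        mu = sum(r[k] * x for r, x in zip(resp, xs)) / nk
        var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, xs)) / nk
        # min_sigma is an assumed floor to stop a class collapsing onto one point
        new_classes.append((nk / len(xs), mu, max(math.sqrt(var), min_sigma)))
    return new_classes

def search(xs, classes, eps=0.01, tol=1e-6, max_iter=200):
    """Iterate while the tracked message length keeps reducing by at least tol."""
    best = data_msglen_bits(xs, classes, eps)
    history = [best]
    for _ in range(max_iter):
        candidate = reestimate(xs, classes)
        length = data_msglen_bits(xs, candidate, eps)
        if best - length < tol:
            break  # no worthwhile reduction: stop, possibly at a local optimum
        classes, best = candidate, length
        history.append(best)
    return classes, history

if __name__ == "__main__":
    xs = [-2.1, -1.9, -2.0, -1.8, -2.2, 2.0, 2.1, 1.9, 2.2, 1.8]
    fitted, history = search(xs, [(0.5, -1.0, 1.0), (0.5, 1.0, 1.0)])
    print(fitted)
    print(history)
```

A fuller version would add the cost of stating the hypothesis to the tracked length, and would periodically try merging two classes or splitting one, keeping a change only if the total reduces.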
C. S. Wallace.
An improved program for classification.
Proc. Australian Comp. Sci. Conf ACSC9, 8(1),
pp357-366, February 1986
- DLD: First description of the bit-borrowing coding technique,
also see ICCI'90,
later called bits-back by G. E. Hinton & D. van Camp in COLT '93.
C. S. Wallace & P. R. Freeman.
Estimation and Inference by Compact Encoding.
J. R. Stat. Soc. B 49 pp240-265 1987
C. S. Wallace.
Classification by minimum-message-length encoding.
Advances in Computing and Information - ICCI '90,
Springer-Verlag, LNCS 468, pp72-81, May 1990