
CSE454 CSSE Monash, semester 1, 2005

Do not use Excel or other software, particularly not to produce graphs and diagrams. This is not to be perverse; there are far fewer problems with readability when graphs are drawn by hand!

Due: CSSE office, noon, Thursday, week 8, 28 April.

  1. This question will be of some use to you as a study of scientific literature (but I must confess that, when collected, the results will be of use to future students of the unit).
    Find at least three refereed research papers (not press releases, comment or editorials) which report on statistical studies of the health effects of moderate alcohol consumption, appearing in respected scientific journals (e.g. Am. J. Cardiol., Am. J. Clinical Nutrition, Ann. Intern. Med., British Medical Journal (BMJ), J. American Med. Assoc., Nature, Science), during your specified year(s), and for each paper record:
    %A author1 (initials family_name)
    %A author2 (etc.)
    %T the_title
    %J the_journal
    %V the_volume_number
    %N the_issue_number
    %P page1-last
    %M the_month
    %D the_year
    %K key_words
    %X A very short statement of what the health effect is (particularly +ve or -ve)
       and how it applies to beer, wine, red wine, white wine, other alcohol.
       The DOI if available, or another portable and persistent URL.
    Submit your results for Q1 (only) as ASCII text, by email. [5 marks]
    Student      Year(s)
    Lachlan D    2004
    Jason G      1999
    Martin H     2002
    Emma J       2000
    Kevin M      1996
    Jane O       2001
    Mony S       1998
    Erica S      2003
    Michael T    1997

  2. Get the Anderson / Fisher Iris data set from the repository of databases for machine learning [UCI.M.Learn]. Use the four measured attributes (sepal & petal, length & breadth) as data to be clustered by the [Snob] mixture-modelling program.
    • In at most two sides of A4 in total (12-pt), describe the class structure found by Snob.
    • Compare the memberships, in the various Snob classes, of the different species (5th attribute).
      [5 marks]
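
One way to lay out the comparison asked for in question 2 is a contingency table of Snob class against species. The following is only a minimal sketch: it assumes you have already extracted one class label per flower from Snob's output (the format of that output is not assumed here), and the labels shown are made-up toy values, not Snob results. The function names are hypothetical.

```python
from collections import Counter

def cross_tabulate(class_labels, species_labels):
    """Count how many items fall in each (class, species) pair."""
    if len(class_labels) != len(species_labels):
        raise ValueError("need one class label per species label")
    return Counter(zip(class_labels, species_labels))

def format_table(counts):
    """Render counts as plain text: one row per class, one column per species."""
    classes = sorted({c for c, _ in counts})
    species = sorted({s for _, s in counts})
    width = max(len(s) for s in species) + 2
    lines = ["class".ljust(8) + "".join(s.rjust(width) for s in species)]
    for c in classes:
        # A Counter returns 0 for pairs that never occurred.
        row = "".join(str(counts[(c, s)]).rjust(width) for s in species)
        lines.append(str(c).ljust(8) + row)
    return "\n".join(lines)

# Made-up toy labels: NOT real Snob output and NOT the Iris data.
toy_classes = [1, 1, 2, 2, 2, 3]
toy_species = ["setosa", "setosa", "versicolor",
               "virginica", "versicolor", "virginica"]
counts = cross_tabulate(toy_classes, toy_species)
print(format_table(counts))
```

A class whose row is concentrated in a single column corresponds closely to one species; a row spread over several columns indicates a class that mixes species.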

  3. Just before Christmas 2000, two papers appeared in the British Medical Journal, on the topic of dog bites and the full moon. Bhattacharjee et al, Do animals bite more during a full moon? Retrospective observational analysis, BMJ 2000; V321 [pp1559-1561] (23 December) suggested that bites were more frequent near the full moon. The other (from Australia), Chapman and Morrell, Barking mad? Another lunatic hypothesis bites the dust, BMJ 2000; V321, [pp1561-1563] (23 December), held the opposite view that there was no such link. The BMJ is on-line and I have Chapman and Morrell's data at [..../dog/].
    Read the two BMJ papers.
    (a) In one or two sides of A4 in total (12-pt), comment on the strengths and weaknesses of the two papers. Which is the more convincing? Why? (Don't give me "we need more data" -- we can't get any more!-)
    [5 marks]
    (b) Use the [Snob] mixture-modelling program to see if there are any clusters in Chapman and Morrell's data.
    • Produce a file where each dog-bite has one attribute -- the phase of the moon at which it occurred. Use the von Mises[*] distribution in Snob to see if there are one, two or more classes (clusters) and where their means are.
    • Regardless of whether 1, 3, 4, ..., classes are best, what is the best 2-class model?
    • Carry out some (sensible) investigations as to whether including different attribute(s), such as the gender or the age of the victim, makes any difference.
    Write a report of one to two sides on your conclusions, including some representation of any classes that Snob finds. There is no single right answer to this question.
    [5 marks]
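
The phase attribute in question 3(b) is an angle, which is why a circular distribution is called for. As a rough sketch only, the code below turns a date and time into an approximate phase angle. It rests on several assumptions that are not part of the assignment: that each record carries a calendar date, that a mean synodic month of 29.530588853 days is accurate enough, and that the new moon of 6 January 2000, 18:14 UTC is a suitable reference point. True lunations drift from the mean by some hours, and the units Snob expects for angles should be checked against its own documentation. The function name is hypothetical.

```python
import math
from datetime import datetime, timedelta, timezone

# Assumption: mean length of the synodic (new moon to new moon) month, in days.
SYNODIC_MONTH_DAYS = 29.530588853
# Assumption: a reference new moon, 6 January 2000, 18:14 UTC.
REFERENCE_NEW_MOON = datetime(2000, 1, 6, 18, 14, tzinfo=timezone.utc)

def moon_phase_angle(when):
    """Approximate lunar phase as an angle in radians, in [0, 2*pi).

    0 is new moon and pi is full moon. `when` must be a timezone-aware
    datetime. Uses the mean synodic month, so this is an approximation.
    """
    days = (when - REFERENCE_NEW_MOON).total_seconds() / 86400.0
    fraction = (days / SYNODIC_MONTH_DAYS) % 1.0
    return 2.0 * math.pi * fraction

# Usage: half a mean month after the reference new moon is (mean) full moon.
half_month_later = REFERENCE_NEW_MOON + timedelta(days=SYNODIC_MONTH_DAYS / 2)
print(moon_phase_angle(half_month_later))  # close to pi
```

Because phase 0 and phase 2.pi are the same point in the cycle, bites just before and just after a new moon are neighbours, which a Normal distribution on the raw number would not capture.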

[*] "The von Mises distribution M(mu,kappa) has a mean direction mu and concentration parameter kappa. For small kappa it tends to a uniform distribution and for large kappa it tends to a Normal Distribution with variance 1/kappa."
- T. Edgoose, L. Allison & D. L. Dowe. An MML Classification of Protein Sequences that knows about angles and sequences. Pacific Symp. Biocomputing 98, pp585-596, Jan. 1998.
f(x | mu, kappa) = (1 / (2.pi.I0(kappa))) . exp( kappa.cos(x - mu) )
where I0(kappa), the modified Bessel function of the first kind of order 0, supplies the normalization constant 2.pi.I0(kappa).
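
The density can be checked numerically. The following is a minimal sketch using only the Python standard library; it is not taken from Snob, and the function names are hypothetical. I0 is computed from its power series, sum over m of (kappa/2)^(2m) / (m!)^2, which is adequate for moderate kappa.

```python
import math

def bessel_i0(kappa):
    """Modified Bessel function of the first kind, order 0, by power series."""
    total, term, m = 0.0, 1.0, 0
    # Add terms until the next one no longer changes the sum.
    while term > 1e-17 * total:
        total += term
        m += 1
        term *= (kappa / 2.0) ** 2 / (m * m)
    return total

def von_mises_pdf(x, mu, kappa):
    """Density of M(mu, kappa) at angle x (radians)."""
    return math.exp(kappa * math.cos(x - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def integrate_over_circle(mu, kappa, steps=2000):
    """Midpoint-rule integral of the density over one full turn."""
    h = 2.0 * math.pi / steps
    return sum(von_mises_pdf((i + 0.5) * h, mu, kappa) for i in range(steps)) * h

print(integrate_over_circle(mu=1.0, kappa=2.0))   # should be very close to 1
print(von_mises_pdf(0.3, mu=1.0, kappa=0.0))      # uniform case: 1 / (2.pi)
```

With kappa = 0 the density is the constant 1/(2.pi), the uniform distribution on the circle. For large kappa the peak height approaches sqrt(kappa / (2.pi)), which is the peak of a Normal density with variance 1/kappa, in line with the quotation above.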

17/3/2005 © L. Allison, School of Computer Science and Software Engineering, Monash University, Australia 3800.