
Extract from some correspondence with H.T.

From harold at mdx ac uk  Wed Sep 20 23:19:34 1995
To: lloyd at bruce cs monash edu au (Lloyd Allison)
From: harold at mdx ac uk (Prof. Harold Thimbleby x6061)
Subject: Re: Australian Senate Enquiry

>The "47% of the 11000 most repeated searches":
>   1. this could be just 1% of the total searches (I doubt it is),
>      can you clarify?

It was 1%ish of the total. I had about 10^6 searches, so I didn't classify
it all. Somebody with a thesaurus [ ... ] could
help here; or maybe we could develop a methodology to automatically follow
the references and see what proportion refer transitively to stuff that can
be classified.... sounds dubious.

   [ ... ]

>      What was the total # of searches during the collection?
>      Is it 11000 words or 11000 phrases/searches?

There are very few phrases, although the data has them (see below). That is,
few people search using ands/ors etc.

>   2. what is the list of pornographic words ?
>

The list of searches starts off:

1317 sex
 446 erotic
 424 nude
 369 erotica
 272 penthouse
 244 playboy
 237 porn
 224 pornography
 223 porno
 205 isindex=
 187 adult
 150 mpeg
 118 ebola    [ <-- There was an Ebola virus outbreak in Africa mid 1995]
  86 girls
  85 hustler
  84 news
  83 games
  81 music
  77 bondage
  76 robots
  76 netscape
  71 supermodels
  71 gif
  71 gay
  64 pictures
  63 weather
  62 doom
  54 x-rated
  54 alt.sex
  52 xxx
  52 supermodel
  51 SEX
  50 nudity
  50 nudes
  49 genealogy
  49 Sex

The numbers indicate the exact ASCII-match frequencies (i.e., sex and Sex
are different) of expression searches (some are in the form
'pamela+anderson', for example). I am going to organise the data more
helpfully and put it on the Web, probably next week.
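
[ Note: because the counts are exact ASCII matches, the same word appears
  several times in the listing above under different capitalisations. The
  sketch below is only an illustration of how such entries could be merged
  by folding case; it is not the method used to produce the data. The
  function name fold_case is hypothetical, and the four (count, search)
  pairs are copied from the listing. ]

```python
from collections import Counter

# A few (count, search) pairs copied from the listing above.
listing = [(1317, "sex"), (424, "nude"), (51, "SEX"), (49, "Sex")]

def fold_case(pairs):
    """Merge counts for searches that differ only in letter case.

    Hypothetical helper for illustration; not part of the original analysis.
    """
    merged = Counter()
    for count, search in pairs:
        merged[search.lower()] += count
    return merged

merged = fold_case(listing)

# Print in the same "count search" layout as the listing, highest first.
for search, count in merged.most_common():
    print(f"{count:5d} {search}")
```

[ With these four entries, the three spellings of "sex" combine to
  1317 + 51 + 49 = 1417, while "nude" stays at 424. ]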

 [ ... ]

  -------------------------------------------------------------------------
  Prof Harold Thimbleby                                   Computing Science
  +44 (0)181 362 6061 direct                           Middlesex University
  FAX/ansaphone 0181 362 6411                             Bounds Green Road
  University 0181 362 5000                                  LONDON, N11 2NQ
  harold at mdx ac uk
  WWW URL: http://www.cs.mdx.ac.uk/harold