Their reproof adds the
injury of insult to
the shame of failure.
When a program dies
what you need is a moment
of serenity.
The Coy.pm
module brings tranquillity
to your debugging.
The module alters
the behaviour of die and
warn (and croak and carp).
It also provides
transcend and enlighten, two
Zen alternatives.
Like Carp.pm,
Coy reports errors from the
caller's point-of-view.
But it prefaces
the bad news of failure with
a soothing poem.
The easiest way
to ornament errors is
with a "canned" haiku.
Salon magazine[1]
suggested this approach in
1998.
They asked readers to
submit error messages
written as haiku.
The winning entries
are now widely known. The best
of them is perhaps:
Three things are certain:
Death, taxes, and lost data.
Guess which has occurred.
But just as canned fish ...
Inevitably,
constant repetition robs
them of their piquance.
Besides, there are too
many error messages
that need a haiku.
Perl's diagnostics
alone would require just
under 500.
And, of course, there's an
endless supply of user-
defined messages.
But it's not the first
system designed to create
synthetic haiku.
The Internet is
awash with generators
of Japanese verse;
A simple search[2] finds
100,000 links for:
"generate haiku".
Silicon Graphics
rigged a lava-lamp[3] to build
random-word verses:
In contrast, Garret
Kaminaga[4] created
a "haiku grammar".
Its simple rules (see
Figure 1) expand to give
correct syllables.
haiku: five_line seven_line five_line
five_line: one four | one three one | one one three | one two two |
seven_line: one one five_line | two five_line | five_line one one | five_line two
one: red | white | black | sky | dawns | breaks | falls | cranes |
two: drifting | purple | mountains | faces | empty | temple |
three: peasant farms | computer | sashimi | fishing boats | ethernet
four: CD Player | aluminum | yakitori | chrysanthemums
five: resolutional | rolling foothills rise |
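A minimal sketch
of how such rules might expand,
stored in a Perl hash.
The expand helper
is hypothetical, and
not Kaminaga's.

```perl
use strict;
use warnings;

# Rules copied from Figure 1. The word lists there are truncated,
# so this table is only as complete as the figure.
my %grammar = (
    haiku      => ['five_line seven_line five_line'],
    five_line  => ['one four', 'one three one', 'one one three', 'one two two'],
    seven_line => ['one one five_line', 'two five_line',
                   'five_line one one', 'five_line two'],
    one   => [qw(red white black sky dawns breaks falls cranes)],
    two   => [qw(drifting purple mountains faces empty temple)],
    three => ['peasant farms', 'computer', 'sashimi', 'fishing boats', 'ethernet'],
    four  => ['CD Player', 'aluminum', 'yakitori', 'chrysanthemums'],
    five  => ['resolutional', 'rolling foothills rise'],
);

# Symbols whose alternatives are words or phrases, not further rules.
my %word_class = map { $_ => 1 } qw(one two three four five);

# expand() is a hypothetical helper: pick one alternative at random,
# then expand every symbol in it. Word-class picks are returned as-is.
sub expand {
    my ($symbol) = @_;
    my $choices = $grammar{$symbol};
    my $pick    = $choices->[ rand @$choices ];
    return $pick if $word_class{$symbol};
    return join ' ', map { expand($_) } split ' ', $pick;
}

# Expand the top rule symbol by symbol so the line breaks survive.
my @lines = map { expand($_) } split ' ', $grammar{haiku}[0];
print "$_\n" for @lines;
```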
These flow better, but
still betray a tell-tale "stream-
of-consciousness" feel:
But theirs are based on
real English sentence structures
(as in Figure 2).
As a result, they
generate quite plausible
(and lovely) haiku:
haiku: form1 | form2 | form3
form1: article adjective noun
form2: noun preposition article noun
form3: article adjective adjective noun
noun: waterfall | river | breeze | moon | rain | wind | sea | sky | storm
verb: shakes | drifts | has stopped | struggles | whispers | grows | flys
adjective: liquid | gusty | flowing | autumn | hidden | bitter | misty | summer |
Consequently, most
of the haiku they produce
don't scan correctly.
As these samples show,
a haiku generator
must balance two things:
It must use correct
English syntax and it has
to track syllables.
Coy's words are stored in
a hierarchical, cross-linked
vocabulary.
Figure 3 shows an
abbreviated sample
of the database.
$database = {
    duck => {
        category => [ "bird" ],
        sound    => [ "quacks", ],
        act      => {
            swims => {
                location     => "suraquatic",
                direction    => "horizontal",
                synonyms     => [ "paddles" ],
                associations => "sink wet",
            },
        },
    },
    fox => {
        category => [ "animal", "hunter" ],
        sound    => [ "barks" ],
        act      => {
            trots => {
                location     => "terrestrial",
                associations => "smart problem",
            },
        },
    },
    lover => {
        category => [ "human" ],
        sound    => [ "sighs", "laughs" ],
        minimum  => 2,
        maximum  => 2,
        act      => {
            kisses => {
                location     => "terrestrial",
                associations => "connection",
            },
            quarrels => {
                location     => "terrestrial",
                associations => "argument",
            },
        },
    },
};
For each such noun, a
set of categories and
sounds is then given.
The categories
relate the noun to standard
actions (see below).
The sounds are used as
verbs to generate clauses
describing noises.
In addition, a
list of more general verb forms
("act") is specified.
Each such verb, listed
in third person singular,
may take attributes.
These attributes list
constraints on the verb's usage
(such as location).
The entry for "duck" =>
"swims", for instance, locates it
as "suraquatic".
Other attributes
limit the subject count for
particular verbs.
"lover" => "kisses",
for example, is limited
to exactly 2.
(What can I say? It's
a very traditional
style of poetry.)
Nouns and verbs may be
given synonym lists to
cut repetition.
Verbs also list their
associations (see the
following section).
Many common verbs
can be applied across a
general class of nouns.
For example, all
nouns representing fish can
take the verb to swim.
Relationships of
this type can be stored in Coy's
vocabulary.
A noun's entry can
specify categories
to which it belongs.
Such categories
are listed separately
in the database.
Each is formatted
like a noun: with verbs, sounds, and
associations.
When the database
is loaded, categories
are "distributed".
Coy identifies
noun entries specified with
a category.
It then adds to such
entries the category's
list of attributes.
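Distribution might
be sketched like this, assuming
plain nested hashes.
The "bird" entry and
the distribute subroutine
are illustrations.

```perl
use strict;
use warnings;

# Hypothetical miniature of the database: one noun and one category.
# The contents of the "bird" category are invented for this sketch.
my %nouns = (
    duck => {
        category => ['bird'],
        sound    => ['quacks'],
        act      => { swims => { location => 'suraquatic' } },
    },
);
my %categories = (
    bird => {
        sound => ['sings'],
        act   => { flies => { location => 'aerial' } },
    },
);

# distribute() copies each category's sounds and verbs into every noun
# that names the category. A noun's own entries are never overwritten.
sub distribute {
    my ($nouns, $categories) = @_;
    for my $noun (values %$nouns) {
        for my $name (@{ $noun->{category} || [] }) {
            my $cat = $categories->{$name} or next;
            my %seen = map { $_ => 1 } @{ $noun->{sound} || [] };
            push @{ $noun->{sound} },
                grep { !$seen{$_}++ } @{ $cat->{sound} || [] };
            for my $verb (keys %{ $cat->{act} || {} }) {
                $noun->{act}{$verb} ||= { %{ $cat->{act}{$verb} } };
            }
        }
    }
}

distribute(\%nouns, \%categories);
print join(', ', sort keys %{ $nouns{duck}{act} }), "\n";   # flies, swims
```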
This system ensures
that the haiku relates to
the error message.
The message is first
scanned to find significant
words (principally nouns).
These words are found by
deleting "stop words" from the
original text.
The remaining words
become a "filter" for the
vocabulary.
Coy then expands this
filter by augmenting words
with their synonyms.
Each word selected
for the haiku is compared
against the filter.
If the selected
word's associations don't
match, it's rejected.
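The filtering steps
above might look like this in
a few lines of Perl.
Stop words, synonyms,
and both sub names are assumed,
not taken from Coy.

```perl
use strict;
use warnings;

# Both tables are tiny, hypothetical stand-ins for real data.
my %stop_word = map { $_ => 1 } qw(a an the of in to is not no);
my %synonyms  = ( argument => [qw(quarrel dispute)], bad => [qw(wrong)] );

# build_filter() lower-cases the message, drops the stop words, and
# then adds the synonyms of every word that survives.
sub build_filter {
    my ($message) = @_;
    my @words = grep { !$stop_word{$_} } map { lc($_) } ($message =~ /(\w+)/g);
    my %filter;
    for my $word (@words) {
        $filter{$_} = 1 for ($word, @{ $synonyms{$word} || [] });
    }
    return \%filter;
}

# matches() counts how many of a candidate's space-separated
# associations (written as in Figure 3) appear in the filter.
sub matches {
    my ($filter, $associations) = @_;
    my @hits = grep { $filter->{$_} } split ' ', $associations;
    return scalar @hits;
}

my $filter = build_filter('Bad argument');
print 'quarrels: ', (matches($filter, 'argument') ? 'kept' : 'rejected'), "\n";
print 'swims: ',    (matches($filter, 'sink wet') ? 'kept' : 'rejected'), "\n";
```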
This leads to problems
though, if the filter words are
too unusual.
In extreme cases,
they may filter out the whole
vocabulary.
To prevent this, Coy
can turn the word filter off
temporarily.
It does so if the
selection success rate falls
below 5%.
That allows words to
be chosen, so that haiku
creation proceeds.
When the selection
rate rises again, Coy turns
the filter back on.
This balances the
desire for relevance with
the need to progress.
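One way the switch might
work, as a hedged sketch: track the
last twenty attempts.
The window size and
try_word are assumptions; just
5% is Coy's.

```perl
use strict;
use warnings;

# The 5% threshold comes from the text. The 20-attempt window and the
# names below are assumptions of this sketch, not Coy's internals.
my $WINDOW    = 20;
my @recent;           # 1 for each accepted word, 0 for each rejection
my $filter_on = 1;

# try_word() applies the filter only while it is switched on, records
# the outcome, and re-evaluates the switch once the window is full.
sub try_word {
    my ($word, $filter) = @_;
    my $ok = (!$filter_on || $filter->{$word}) ? 1 : 0;
    push @recent, $ok;
    shift @recent while @recent > $WINDOW;
    if (@recent == $WINDOW) {
        my $hits = grep { $_ } @recent;
        $filter_on = ($hits * 100 >= 5 * $WINDOW) ? 1 : 0;   # at least 5%?
    }
    return $ok;
}

# A filter word that matches nothing in this toy vocabulary.
my %filter = (zymurgy => 1);
my @vocab  = qw(duck fox lover);
my @taken  = grep { try_word($_, \%filter) } map { $vocab[$_ % @vocab] } 0 .. 29;
print scalar(@taken), " of 30 candidates accepted\n";
```

Without the switch,
this filter would reject each
and every word.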
Those templates encode
various grammatical
structures for haiku.
The generator
selects one and fills it in
with relevant words.
Figure 4 shows a
few of the grammatical
templates Coy uses.
haiku_fragment: sentence | description | exclamation
sentence: noun verb |
description: |
Note that the grammar
has no terminals. They're drawn
from the database.
Templates are chosen
at random, as often as
needed (see below).
The chosen template
is then filled in with "filtered"
semi-random words.
The noun to be used
is randomly selected,
and constrains the verb.
The verb is chosen
from those specified for that
particular noun.
Any other parts
of the grammar are likewise
constrained by the verb.
These are typically
adverbial phrases of
place or direction.
For instance, suppose
the filtered noun chosen is
the word hummingbird.
Immediately
this constrains the verb to words
like flies, darts, or nests.
If flies were chosen,
that would then constrain the place
to be aerial.
Whereas, if nests were
chosen, the place would have to
be arboreal.
Note that Coy needs no
A.I. techniques to enforce
these sequenced constraints.
The hierarchical
vocabulary structure
itself ensures them.
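The nested lookups
could be sketched as follows, with
invented place names.
Only the three verbs
and their two locations come
from the example.

```perl
use strict;
use warnings;

# Hypothetical entry following the hummingbird example in the text.
my %vocab = (
    hummingbird => {
        act => {
            flies => { location => 'aerial' },
            darts => { location => 'aerial' },
            nests => { location => 'arboreal' },
        },
    },
);

# Place phrases per location: invented for illustration.
my %places = (
    aerial   => ['above a meadow', 'over the temple'],
    arboreal => ['in a willow', 'in the branches of an elm'],
);

# pick() returns a random element of an array reference.
sub pick { my ($list) = @_; return $list->[ rand @$list ] }

# Each choice narrows the next simply by walking down the nested
# hashes: noun -> verb -> location -> place phrase.
sub fragment {
    my ($noun) = @_;
    my $acts  = $vocab{$noun}{act};
    my $verb  = pick([ sort keys %$acts ]);
    my $place = pick($places{ $acts->{$verb}{location} });
    return ($verb, $place);
}

my ($verb, $place) = fragment('hummingbird');
print "A hummingbird $verb $place.\n";
```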
The module must then
adjust the selected words'
grammatical form.
Specifically, the
words used must be inflected
for number and tense.
Lingua::EN::Inflect
is used to supply correct
noun/verb agreement
(specifically, the
exportable PL_N
and PL_V subs).
Currently, tense is
restricted to the present
or continuous.
That's not a problem
though--most haiku are written
in those two tenses.
Verbs are stored in the
vocabulary in the
present tense only.
Lingua::EN::Inflect
can now inflect present tense
to continuous.
This transformation
is provided via the
new PART_PRES subroutine.
(Inflecting present
participles is harder
than it might first seem.
Consider the verbs:
bat, combat, eat, bite, fulfil(l),
lie, sortie, and ski.)
However, there's no
guarantee that the result
scans 5-7-5.
To ensure perfect
metre, each selected word's
syllables are checked.
This occurs whilst the
grammar templates are filled in
(as words are filtered).
The selector tracks
the progressive syllable
count of the words used.
If the count exceeds
17, the selector
can reject a word.
The selection can
also backtrack further, if
that's necessary.
In some cases this
might cause the template itself
to be rejected.
The template-filling
process then repeats until
the full haiku scans.
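Backtracking might be
sketched like this, with hand-assigned
counts of syllables.
For brevity, it
fills a single line's budget,
not all seventeen.
The slots and the fill
subroutine are assumptions,
not the module's code.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical template slots: each holds candidate words paired with
# syllable counts that were assigned by hand for this sketch.
my @slots = (
    [ [ 'A', 1 ],     [ 'Two', 1 ],     [ 'Seven', 2 ] ],
    [ [ 'fox', 1 ],   [ 'lovers', 2 ],  [ 'hummingbirds', 3 ] ],
    [ [ 'sighs', 1 ], [ 'quarrel', 2 ], [ 'hesitates', 3 ] ],
);

# fill() tries the candidates for slot $i in random order. A word that
# would push the running count past the budget is rejected. If no
# candidate leads to a complete fill, undef is returned and the caller
# backtracks to its own next candidate.
sub fill {
    my ($i, $used, $budget) = @_;
    if ($i == @slots) {
        return $used == $budget ? [] : undef;
    }
    for my $cand (shuffle @{ $slots[$i] }) {
        my ($word, $syllables) = @$cand;
        next if $used + $syllables > $budget;
        my $rest = fill($i + 1, $used + $syllables, $budget);
        return [ $word, @$rest ] if $rest;
    }
    return undef;    # nothing fits: reject the earlier word or the template
}

my $line = fill(0, 0, 5);
print "@$line\n" if $line;
```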
That handler passes
the error it receives through
the generator.
It then re-calls die
with the resulting haiku
as its argument.
The same approach is
applied to $SIG{__WARN__},
to catch warnings too.
As a result, all
exceptions thrown by die, warn,
croak, or carp are caught.
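A handler like that
might be installed as follows
(a sketch, not Coy's source).
Here compose_haiku
is a hypothetical
generator stub.

```perl
use strict;
use warnings;

# compose_haiku() is a hypothetical stand-in for the generator. It
# returns a fixed verse so that the sketch runs on its own.
sub compose_haiku {
    my ($message) = @_;
    return join "\n",
        'Three things are certain:',
        'Death, taxes, and lost data.',
        'Guess which has occurred.',
        '';
}

# Re-throw with the verse placed before the original message. Perl
# disables the __DIE__ hook while the handler itself is running, so
# the inner die does not recurse.
$SIG{__DIE__} = sub {
    my ($message) = @_;
    die compose_haiku($message) . "\n" . $message;
};

# Warnings get the same treatment. __WARN__ hooks are not called from
# inside one, so the inner warn prints normally.
$SIG{__WARN__} = sub {
    my ($message) = @_;
    warn compose_haiku($message) . "\n" . $message;
};

# The __DIE__ hook also fires for exceptions raised inside eval.
eval { die "Bad argument\n" };
my $error = $@;
print $error;
```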
The Coy.pm
module also exports two
extra subroutines.
These are transcend and
enlighten, which lend a Zen
overtone to code.
Internally though,
these two subs are just wrappers
around croak and carp.
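Such wrappers might be
as thin as the ones sketched here,
assuming plain Carp.
The package name and
open_config are made up
for illustration.

```perl
package My::Zen;    # hypothetical package name for this sketch
use strict;
use warnings;
use Carp;

# Thin wrappers: croak and carp under more tranquil names. Carp skips
# frames in the package that calls it, so the message points at the
# line where transcend or enlighten was called.
sub transcend { croak @_ }
sub enlighten { carp  @_ }

package main;

# Hypothetical caller that gives up when a file does not exist.
sub open_config {
    my ($path) = @_;
    My::Zen::transcend("Missing file: $path") unless -e $path;
    return $path;
}

eval { open_config('/no/such/file') };
my $error = $@;
print $error;
```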
Given the error
message: die "Bad argument",
Coy generates this:
A pair of lovers
quarrel beside a stream. Four
thrushes fly away.
Note the allusion ...
Haiku are never
repeated. A second die
"Bad argument" gives:
Two old men fighting
under a sycamore tree.
Homer Simpson sighs.
In contrast, for a ...
Bankei weeping by
a lake. Ryonen dying.
Seven howling bears.
Coy cannot always ...
For example, it
also produced this response
to croak "Missing file":
A swallow nesting
in the branches of an elm
tree. A waiting fox.
Sometimes Coy's output ...
A wolf leaps under
a willow. Two old men sit
under the willow.
In other cases, ...
Two young women near
Bill Clinton's office. A cat
waiting by a pond.
That in turn leads to
tell-tale repetition (which
fails the Turing test).
Extending the range
of words Coy.pm can
use is no problem
(though finding the time
and the creativity
required may be).
This leads to haiku
utterly unrelated
to the error text.
Again, there is no
technical difficulty
in adding more links:
Defining enough
associations isn't
hard, just tedious.
Yet again, this needs
no technical solution,
just time and effort.
Of course, such enhanced
templates might require richer
vocabulary.
For example, verb
predicates would need extra
database structure:
Each verb entry would
have to be extended with
links to object nouns.
It is currently
around 92%
accurate (per word).
This means that correct
syllable counts for haiku
can't be guaranteed.
Syllable counts for
single words are correct to
±1.
In a multi-word
haiku these errors cancel
out in most cases.
Thus, the haiku tend
to be correct within one
or two syllables.
As the syllable
counter slowly improves, this
problem will abate.
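A naive counter,
which simply counts vowel groups,
shows how slips arise.
This is an assumed
heuristic, not the method
the module uses.

```perl
use strict;
use warnings;

# count_syllables() is a hypothetical heuristic: count runs of vowels,
# after dropping a final "e" (usually silent) from longer words.
sub count_syllables {
    my ($word) = @_;
    $word = lc $word;
    $word =~ s/[^a-z]//g;
    $word =~ s/e$// if length($word) > 3;
    my $count = () = $word =~ /[aeiouy]+/g;
    return $count || 1;    # every word has at least one syllable
}

printf "%-8s %d\n", $_, count_syllables($_) for qw(stream lovers sycamore poem);
# stream   1
# lovers   2
# sycamore 3
# poem     1   (one short: "poem" has two syllables)
```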
Increasing Coy's range
of vocabulary is
clearly essential.
Both its content and
its cross-linked structure need to
be greatly enhanced.
This, in turn, would make
it possible to extend
the grammar templates.
Some new syntactic
formats would also provide
more variety.
A side-effect might
be haiku that are smoother
and less fragmented.
The problem, of course,
is the effort required
to add this data.
Automation of
data acquisition might
be one solution.
Coy could use data
from some general semantic
lexicon system.
For instance, it might
be possible to adapt
data from WordNet[7].
This reduces the
stress induced in the user
when a program fails.
The haiku are fresh
each time, and related to
the error message.
They conform to the
rules of English grammar and
Japanese metre.
As usual, the
module is available
via the CPAN.