FIT5900 : Sound
In the previous lecture:
- DHTML is a collection of technologies for making web pages dynamic.
- Typically DHTML involves a scripting language and a Document Object Model
on which it acts.
- Style sheets may be used to specify the layout of a page, and these style
descriptions may also be acted upon by a scripting language.
In this lecture:
- What are the different attributes of sound?
- How is sound stored digitally?
- How may sound be employed effectively in multimedia production?
There are many good books on electronic sound and music... most of them
have nothing to do with the WWW! This area is so vast that it is ridiculous
to pretend that a single lecture can do any more than scratch the surface. I
suggest you head to the library and start reading anything on the subject which
takes your interest!
Roads, C. "The Computer Music Tutorial", MIT Press 1996
...is however an excellent general text on the subject of computers and music!
Pitch
- Relative 'highness' or 'lowness' of a sound measured against a standard
(high tone, lower tone)
- High pitched tones (jet engine whine) perceived by the ear are high frequency
oscillations of the air
- Low pitched tones (thunder) are low frequency oscillations of the air
- Audible frequency range is ~20Hz - 20,000Hz
Timbre
- Tone quality or colour
- Used to determine (for example) the difference between a tone played by
a bell and a tone played on a trumpet.
- Timbre is created by the kind and number of overtones.
- The tone heard as pitch is called the fundamental tone or the first harmonic
of a sound.
- Overtones are additional tones of higher pitch than, and superposed over,
the fundamental tone.
- Rich, full sounds (violin, voice) have many overtones, pure, thin sounds
(flute, triangle) have few overtones.
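As a rough illustration of overtones, the Python sketch below builds a tone by adding sine waves at whole-number multiples of a fundamental. The function name `harmonic_tone`, the 440Hz fundamental and the amplitude lists are illustrative assumptions only. Real instruments have overtones whose strengths change over time, and some (e.g. bells) have overtones that are not exact multiples of the fundamental.

```python
import math

def harmonic_tone(fundamental_hz, harmonic_amplitudes, sample_rate=44100, seconds=0.01):
    """Sum sine waves at integer multiples of the fundamental frequency.

    harmonic_amplitudes[0] scales the fundamental (first harmonic),
    harmonic_amplitudes[1] scales the second harmonic, and so on.
    """
    n_samples = round(sample_rate * seconds)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate  # time of this sample in seconds
        value = sum(
            amp * math.sin(2 * math.pi * fundamental_hz * (k + 1) * t)
            for k, amp in enumerate(harmonic_amplitudes)
        )
        samples.append(value)
    return samples

# A "pure, thin" tone: the fundamental only.
thin = harmonic_tone(440.0, [1.0])

# A "richer" tone at the same pitch: weaker overtones added on top.
# (The amplitudes here are made up for illustration.)
rich = harmonic_tone(440.0, [1.0, 0.5, 0.33, 0.25, 0.2])
```

Both tones share the same fundamental, so they would be heard at the same pitch. Only the mix of overtones, and therefore the timbre, differs.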
Duration
- Length of time a sound event occupies
- Short sound event (door slam)
- Long sound event (fog horn)
Loudness
- Perceived intensity of a sound event
- Loud sound event (thunder)
- Soft sound event (termite eating your chair)
- The dynamics of a sound event are the variations of its perceived intensity.
Attack - Decay
- Combination of dynamics & duration
- Attack time - time taken for a sound to reach a (maximum) level of loudness.
- Sustain time - time sound remains at a constant level (often maximal).
- Decay time - time taken for sound to fall from sustained loudness
to zero loudness.
- Attack / Sustain / Decay defines a sound envelope.
More completely specified envelope (ADSR):
Different sounds have different envelopes:
- The envelope of the sounds produced by some instruments may be controlled
by the musician (eg. trumpet, violin, flute)
- A jet engine and a car passing in the street have gradual attacks and decays.
- A pedestrian-crossing click and a hand clap have rapid attacks and decays.
- A bass drum has a rapid attack and a gradual decay.
- Can you think of a sound with a gradual attack and a rapid decay?
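The attack / sustain / decay envelope described above can be sketched as a simple function of time. This is a minimal Python sketch that assumes straight-line ramps. The function name `envelope` and all of the timings (in seconds) are invented for illustration and are not measurements of real sounds.

```python
def envelope(t, attack, sustain, decay, peak=1.0):
    """Loudness (0..peak) at time t for a linear attack/sustain/decay envelope."""
    if t < 0:
        return 0.0
    if t < attack:
        # Rising from silence to the peak level.
        return peak * t / attack
    if t < attack + sustain:
        # Holding at a constant level.
        return peak
    if t < attack + sustain + decay:
        # Falling from the sustained level back to silence.
        return peak * (1 - (t - attack - sustain) / decay)
    return 0.0

# Hypothetical timings: rapid attack and gradual decay (bass drum-like).
drum_like = [envelope(ms / 1000, attack=0.005, sustain=0.0, decay=0.5)
             for ms in range(0, 600, 50)]

# Hypothetical timings: gradual attack and gradual decay (passing car-like).
car_like = [envelope(s, attack=3.0, sustain=1.0, decay=3.0)
            for s in range(0, 8)]
```

Multiplying each sample of a tone by the envelope value at that sample's time shapes the loudness of the tone over its duration.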
- Analogue sound vibrations are continuous pressure waves.
- Digital storage of sound vibrations requires sampling the wave at
regular intervals to record its amplitude.
- Sampling rate: number of samples per second
- Common sampling rates: 11.025kHz, 22.05kHz, 44.1kHz
- Sounds captured at high sampling rates
- High storage costs
- High quality
...and vice versa!
- The number of bits used to encode the amplitude of each sound sample also determines
how accurately the wave will be represented. (To record a high frequency sound,
you need to sample the rapidly oscillating wave often.)
- Common bits per sample: 8, 12, 16
- The dots in fig 3 are individual samples
- Lots of dots per second = high sampling rate
- Lots of different dot heights requires lots of bits per sample. (eg. 8 bits
store 256 different heights)
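The trade-off between quality and storage can be worked through with a little arithmetic. The Python sketch below is illustrative only: the function names are made up, amplitudes are assumed to lie between -1.0 and 1.0, and the sound is assumed to be stored uncompressed with no file header.

```python
def levels(bits_per_sample):
    """Number of different 'dot heights' that fit in the given number of bits."""
    return 2 ** bits_per_sample

def quantize(amplitude, bits_per_sample):
    """Map an amplitude in [-1.0, 1.0] to an integer level in [0, levels - 1]."""
    n = levels(bits_per_sample)
    clamped = max(-1.0, min(1.0, amplitude))
    return round((clamped + 1.0) / 2.0 * (n - 1))

def storage_bytes(sample_rate_hz, bits_per_sample, seconds, channels=1):
    """Bytes needed for uncompressed audio (no header, no compression)."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8

# One minute of mono sound at each of the common sampling rates and bit depths.
for rate in (11025, 22050, 44100):
    for bits in (8, 12, 16):
        size = storage_bytes(rate, bits, seconds=60)
        print(f"{rate} Hz, {bits} bits: {levels(bits)} levels, {size} bytes per minute")

# One minute of stereo sound at 44.1kHz and 16 bits per sample.
cd_like_minute = storage_bytes(44100, 16, seconds=60, channels=2)  # 10584000 bytes
```

Halving either the sampling rate or the bits per sample halves the storage, at the cost of a coarser description of the wave.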
Music, Sound or Noise?
- Noise is sound that bothers you.
- Sound is not necessarily noise that doesn't bother you!
- Music is sound you listen to on a recording or at a performance.
- Musicality concerns the role sound plays rather than being an attribute
of the sound itself.
- Not everyone wants to listen to the same sounds...
- What is music to you in one circumstance is noise to you in another
(and is always noise to your neighbours)
- Not everyone agrees on these definitions. I might not like them myself tomorrow!
Sound, Little Images and Big Images
- Low resolution images (TV screens, Quicktime & MPEG movies) require audio
to enhance / clarify the depicted scenes.
- High quality audio can (under some circumstances) conflict with low resolution images.
(How does this relate to sending messages as discussed throughout
- Low resolution audio can (easily) destroy the impact of hi-resolution images.
(The 'Drive-In' phenomenon... if you actually want to watch
the film, it's better to visit a modern cinema. Unfortunately the quality of
IMAX/OMNIMAX film-making needs improving.)
Sound For Moving Pictures
Literal Sounds: emerge from a sound source to which the sound refers.
- Sound effects
- Source-connected (source on-screen) or source disconnected (source off-screen)
- Source-disconnected literal sounds evoke a visual image of the source
Non-literal Sounds: are not intended to convey a literal meaning, nor
be identified with a source.
Source disconnected non-literal sounds do not evoke a visual image of the source.
Descriptive Non-literal Sounds
Sounds can evoke or
Moods (like colours)
What do these sound like?
- Sounds can be recorded using direct to digital means (PC, mini-disk, DAT,
Digital video tape etc.) or analogue means (tape recorder).
- Use a good microphone
- Eliminate / avoid background noise where possible (eg. record in a studio)
- Record source material at the highest sampling rate where appropriate
- Record source material at the highest possible level without clipping.
- The creation and recording of sounds in a studio as video/film footage plays
is known as foley.
(E.g. Foley artists place footsteps, slam or knock on doors,
drum fingers etc. where this information was impossible to record well in
- Improvise with sound effects...
pop paper bags, tap pencils, break glasses, crumple plastic bags, strike matches,
wobble cardboard... there's no shortage of sound creating paraphernalia!
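The advice above about recording at the highest possible level without clipping can be checked once a recording has been made. The Python sketch below assumes the samples have already been read into a list of numbers scaled between -1.0 and 1.0, where 1.0 is the loudest value the recorder can store. The function names and the 0.9 target are illustrative choices, not a standard.

```python
def peak_level(samples):
    """Largest absolute sample value in the recording (0.0 if it is empty)."""
    return max((abs(s) for s in samples), default=0.0)

def count_clipped(samples, full_scale=1.0):
    """Number of samples sitting at (or beyond) the maximum storable value."""
    return sum(1 for s in samples if abs(s) >= full_scale)

def gain_to_peak(samples, target_peak=0.9):
    """Gain that would bring the loudest sample up (or down) to target_peak."""
    peak = peak_level(samples)
    return target_peak / peak if peak > 0 else 1.0

# A made-up quiet take: no clipping, but the level could be raised.
quiet_take = [0.0, 0.2, -0.45, 0.3]

# A made-up take recorded too hot: some samples are stuck at full scale.
hot_take = [0.5, 1.0, 1.0, -1.0, 0.7]

print(peak_level(quiet_take), count_clipped(quiet_take), gain_to_peak(quiet_take))
print(peak_level(hot_take), count_clipped(hot_take))
```

A take with clipped samples is best re-recorded at a lower level, since the flattened peaks cannot be recovered afterwards.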
Sounds for Interfaces
- Sound attracts attention independent of the user's current visual focus
- Sound can confirm an operation
- Sound can alert a user to a special event (eg. error, completion of computation...)
- Sound cannot be localized to an event on screen
- Sound must be accompanied by a visual cue.
- Sound must be approximately synchronized with a visual event (not delayed).
- Choose sound attributes to suit purpose.
(E.g. a siren is not much use to indicate a normal keystroke when typing. A quiet
'click' is useless as a reactor meltdown warning!)
- Be consistent (always!)
Down-loadable audio: digital sound files are down-loaded in total
by a user before playback using a helper application.
Common digital audio file format extensions:
- AIFF (Apple sound file format)
- AU (Unix / Sun audio file)
- WAV (Windo$e sound file format)
- Quicktime, MPEG, AVI have both audio and video channels.
- MP3 compressed digital audio (lossy) that discards sound data that the ear
is not supposed to be able to hear. This format is taking the recording industry
by storm! Why?
- Include a down-loadable sound file in a web page like this
Streaming audio: digital sound files run in a real-time stream
to the client. The client plays as much of the file as possible as it is received.
Format: RA (RealAudio)
- RealAudio files are played by a free 'Real Audio Player' application on
the client machine.
- To set up RealAudio to play from your web site:
- Turn normal digital audio files into RA files using the free RealAudio
- Create a metafile containing links to one or more audio
- Call the metafile something (soundsBurp.ram)
- Put a link to the metafile in your web page:
- Speak to the sys-admin and ensure the server is configured to handle
files of extension .ram as being of MIME type 'audio/x-pn-realaudio'
- And sadly, just when you were excited: buy the RealAudio server software
- OR happily, install the free personal copy which allows (only) 2 connections
MIDI - Musical Instrument Digital Interface
- A protocol for communication between digital musical instruments.
- A stream of MIDI messages instructs instruments such as synthesizers to:
- Play a note of a certain pitch
- Play a note on a certain instrument
- Stop playing a note
- Alter sound parameters (loudness, envelopes, effects etc.)
...and lots more!
A computer can be set up as a software synthesizer to receive
and play MIDI messages if it stores at least General MIDI instrument
data. MIDI files are MUCH smaller than digital audio files because the sound
waveforms are stored on the client. Only note information needs to be sent over
the Internet, a considerable saving over sending the entire sound waveform!
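To get a feel for how small MIDI data is, the Python sketch below builds the raw bytes of a few MIDI channel messages and compares them with one second of digital audio. The status bytes (0x90 for Note On, 0x80 for Note Off, 0xC0 for Program Change) come from the MIDI protocol; the helper names and the chosen note, velocity and program numbers are illustrative. These are bare messages only. A MIDI file on disk also carries headers and timing information, which this sketch leaves out.

```python
def _check(value, upper, name):
    """Reject values outside the range allowed in a MIDI message."""
    if not 0 <= value <= upper:
        raise ValueError(f"{name} must be between 0 and {upper}")
    return value

def note_on(channel, note, velocity):
    """Play a note of a certain pitch (note) at a certain loudness (velocity)."""
    return bytes([0x90 | _check(channel, 15, "channel"),
                  _check(note, 127, "note"),
                  _check(velocity, 127, "velocity")])

def note_off(channel, note, velocity=0):
    """Stop playing a note."""
    return bytes([0x80 | _check(channel, 15, "channel"),
                  _check(note, 127, "note"),
                  _check(velocity, 127, "velocity")])

def program_change(channel, program):
    """Select which instrument the channel plays."""
    return bytes([0xC0 | _check(channel, 15, "channel"),
                  _check(program, 127, "program")])

# Choose an instrument, then start and stop middle C (note number 60).
messages = program_change(0, 0) + note_on(0, 60, 100) + note_off(0, 60)

# One second of 16 bit mono audio at 44.1kHz, for comparison.
audio_bytes_per_second = 44100 * 2

print(len(messages), "bytes of MIDI versus", audio_bytes_per_second, "bytes of audio")
```

The MIDI bytes are the same size however long the note is held, whereas the digital audio grows with every second of sound.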
Free MP3 files for techno-heads who love thumping BASS
All that's wrong with commercialization is evident here - just look at the RA
A record company making the most of Real Audio and MP3 to publicize their material
This lecture's key point(s):
- Five major attributes of sound: pitch; timbre; loudness; attack/decay
(envelope) and duration.
- Digital sound is an encoding of analogue air-pressure waves at a certain
- Sound may be used to improve a user interface but it may also annoy a
user if used inappropriately.
- Digital sound files may be distributed via the WWW.
Alan Dorin & Jon McCormack 1999,2000