CSE5910 : Multimedia Programming in Java
Lecture : Digital Audio
In the previous lecture:
- Java provides numerous routines for manipulating graphical objects.
- Polygons and Polylines are amongst the most useful general purpose drawing routines.
- Special purpose routines for manipulating fonts and colors also exist, as well as routines for drawing specific shapes.
In this lecture:
- What are the different attributes of sound?
- How is sound stored digitally?
- How may sound be employed effectively in multimedia production?
References: There are many good books on electronic sound and music. The area is so vast that a single lecture can do no more than scratch the surface. I suggest you head to the library and start reading anything on the subject that takes your interest! One excellent general text on computers and music is:
Roads, C. "The Computer Music Tutorial", MIT Press, 1996
Pitch
- Relative 'highness' or 'lowness' of a sound measured against a standard scale
(high tone, lower tone)
- High pitched tones (jet engine whine) perceived by the ear are high frequency oscillations of the air
- Low pitched tones (thunder) are low frequency oscillations of the air
- Audible frequency range is ~20Hz - 20,000Hz
Timbre
- Tone quality or colour
- Lets us distinguish (for example) a tone played by a bell from the same tone played on a trumpet.
- Timbre is created by the kind and number of overtones.
- The tone heard as pitch is called the fundamental tone or the first harmonic of a sound.
- Overtones are additional tones of higher pitch than, and superposed over, the fundamental tone.
- Rich, full sounds (violin, voice) have many overtones; pure, thin sounds (flute, triangle) have few overtones.
[Figure: frequency spectrum of pitch C4; sine wave tone (note the fundamental)]
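The relationship between a fundamental and its overtones can be sketched in code. The following is a minimal illustration, not part of the original lecture: it assumes overtones sit at whole-number multiples of the fundamental (harmonics), and the class name OvertoneSketch, the amplitude lists and the value used for C4 are all chosen for the example.

```java
/** Sketch: build a tone from a fundamental plus harmonic overtones. */
class OvertoneSketch {

    /**
     * Sums sine waves at integer multiples of the fundamental frequency.
     * amplitudes[0] scales the fundamental (first harmonic), amplitudes[1]
     * the second harmonic, and so on. Returns samples in the range -1..1.
     */
    static double[] synthesize(double fundamentalHz, double[] amplitudes,
                               int sampleRate, double seconds) {
        int n = (int) Math.round(sampleRate * seconds);
        double[] out = new double[n];

        // Total amplitude, used to keep the summed wave within -1..1.
        double total = 0.0;
        for (double a : amplitudes) {
            total += Math.abs(a);
        }
        if (total == 0.0) {
            return out; // no harmonics requested: silence
        }

        for (int i = 0; i < n; i++) {
            double t = (double) i / sampleRate;
            double sum = 0.0;
            for (int h = 0; h < amplitudes.length; h++) {
                double harmonicHz = fundamentalHz * (h + 1);
                sum += amplitudes[h] * Math.sin(2.0 * Math.PI * harmonicHz * t);
            }
            out[i] = sum / total;
        }
        return out;
    }

    public static void main(String[] args) {
        double c4 = 261.63; // assumed: C4 in equal temperament with A4 = 440 Hz

        // A "pure, thin" tone: the fundamental only.
        double[] pure = synthesize(c4, new double[] {1.0}, 44100, 1.0);

        // A "richer" tone: the same pitch with three overtones added.
        double[] rich = synthesize(c4, new double[] {1.0, 0.5, 0.33, 0.25}, 44100, 1.0);

        System.out.println(pure.length + " and " + rich.length + " samples generated");
    }
}
```

Both arrays have the same pitch (the same fundamental); only the mix of overtones, and therefore the timbre, differs.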
Duration
- Length of time a sound event occupies
- Short sound event (door slam)
- Long sound event (fog horn)
Loudness (Dynamics)
- Perceived intensity of a sound event
- Loud sound event (thunder)
- Soft sound event (termite eating your chair)
- The dynamics of a sound event are the variations of its perceived intensity.
Attack - Decay
- Combination of dynamics & duration
Simple model:
- Attack time - time taken for a sound to reach a (maximum) level of loudness.
- Sustain time - time sound remains at a constant level (often maximal).
- Decay time - time taken for sound to fall from sustained loudness to zero loudness.
- Attack / Sustain / Decay defines a sound envelope.
A more completely specified envelope (ADSR) has four stages: Attack, Decay, Sustain and Release.
Different sounds have different envelopes:
- The envelope of the sounds produced by some instruments may be controlled by the musician (eg. trumpet, violin, flute)
- A jet engine and a car passing in the street have gradual attacks and decays.
- A pedestrian-crossing click and a hand clap have rapid attacks and decays.
- A bass drum has a rapid attack and a gradual decay.
- Can you think of a sound with a gradual attack and a rapid decay?
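The simple attack / sustain / decay model can be written as a function of time. This is a sketch only: it assumes straight-line ramps for the attack and decay (real envelopes are often curved), and the class and method names are invented for the example.

```java
/** Sketch: the simple attack / sustain / decay envelope with linear ramps. */
class EnvelopeSketch {

    /**
     * Loudness (0..1) at time t, where attack, sustain and decay are the
     * lengths of the three stages. All times are in seconds.
     */
    static double level(double t, double attack, double sustain, double decay) {
        if (t < 0.0) {
            return 0.0;                          // before the sound starts
        }
        if (t < attack) {
            return t / attack;                   // rising towards maximum
        }
        if (t < attack + sustain) {
            return 1.0;                          // held at a constant level
        }
        double intoDecay = t - attack - sustain;
        if (intoDecay < decay) {
            return 1.0 - intoDecay / decay;      // falling back to zero
        }
        return 0.0;                              // the sound has finished
    }

    /** Shapes a buffer of samples in place by multiplying in the envelope. */
    static void apply(double[] samples, int sampleRate,
                      double attack, double sustain, double decay) {
        for (int i = 0; i < samples.length; i++) {
            double t = (double) i / sampleRate;
            samples[i] *= level(t, attack, sustain, decay);
        }
    }
}
```

Under these assumptions a bass drum might be approximated with a very short attack, no sustain and a long decay, e.g. `level(t, 0.005, 0.0, 0.4)`; the numbers are illustrative guesses, not measurements.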
Digital Audio
- Analogue sound vibrations are continuous pressure waves.
- Digital storage of sound vibrations requires sampling the wave at regular intervals to record its amplitude.
- Sampling rate: number of samples per second
- Common sampling rates: 11.025kHz, 22.05kHz, 44.1kHz
- Sounds captured at high sampling rates have high quality but high storage costs
...and vice versa!
- To record a high frequency sound, you need to sample the rapidly oscillating wave often: the sampling rate must be at least twice the highest frequency you want to capture.
- The number of bits used to encode the amplitude of each sample also determines how accurately the wave will be represented.
- Common bits per sample: 8, 12, 16
- The dots in fig 3 are individual samples
- Lots of dots per second = high sampling rate
- Lots of different dot heights requires lots of bits per sample. (eg. 8 bits store 256 different heights)
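The trade-off between quality and storage in the points above can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes uncompressed audio with samples packed tightly, and the class and method names are made up for the example.

```java
/** Sketch: storage cost and amplitude resolution of uncompressed audio. */
class SamplingSketch {

    /** Bytes needed to store the given duration of uncompressed audio. */
    static long storageBytes(int sampleRate, int bitsPerSample,
                             int channels, double seconds) {
        long samples = Math.round(sampleRate * seconds) * channels;
        return samples * bitsPerSample / 8;
    }

    /** Number of different amplitude values ("dot heights") available. */
    static long levels(int bitsPerSample) {
        return 1L << bitsPerSample;
    }

    /**
     * Rounds an amplitude in the range -1..1 to the nearest signed integer
     * step for the given bits per sample (e.g. -127..127 for 8 bits).
     * Amplitudes outside -1..1 are clamped to the nearest limit.
     */
    static int quantize(double amplitude, int bitsPerSample) {
        int max = (1 << (bitsPerSample - 1)) - 1;
        double clamped = Math.max(-1.0, Math.min(1.0, amplitude));
        return (int) Math.round(clamped * max);
    }

    public static void main(String[] args) {
        // One minute of mono, 8-bit, 11.025 kHz sound.
        System.out.println(storageBytes(11025, 8, 1, 60.0) + " bytes");
        // One minute of stereo, 16-bit, 44.1 kHz sound.
        System.out.println(storageBytes(44100, 16, 2, 60.0) + " bytes");
        System.out.println(levels(8) + " levels at 8 bits");
        System.out.println(levels(16) + " levels at 16 bits");
    }
}
```

Under these assumptions, a minute of stereo 16-bit sound at 44.1 kHz needs 16 times the storage of a minute of mono 8-bit sound at 11.025 kHz.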
Music, Sound or Noise?
- Noise is sound that bothers you.
- Sound is not necessarily noise that doesn't bother you!
- Music is sound you listen to on a recording or at a performance.
- Musicality concerns the role sound plays rather than being an attribute of the sound itself.
- Not everyone wants to listen to the same sounds...
- What is music to you in one circumstance is noise to you in another
(and is always noise to your neighbours)
- Not everyone agrees on these definitions. I might not like them myself tomorrow!
Sound, Little Images and Big Images
- Low resolution images (TV screens, QuickTime & MPEG movies) require audio to enhance / clarify the depicted scenes.
- High quality audio can (under some circumstances) conflict with low resolution images.
(How does this relate to sending messages as discussed throughout these lectures?)
- Low resolution audio can (easily) destroy the impact of high resolution images.
Sound For Moving Pictures
Literal Sounds: emerge from a sound source to which the sound refers.
- Dialogue
- Sound effects
- Source-connected (source on-screen) or source-disconnected (source off-screen)
- Source-disconnected literal sounds evoke a visual image of the source
Non-literal Sounds: are not intended to convey a literal meaning, nor be identified with a source.
- Source-disconnected non-literal sounds do not evoke a visual image of the source.
Descriptive Non-literal Sounds
Sounds can evoke or describe:
- Moods (like colours)
- Places
- Times
- Attributes (heavy, rough, cold)
- And more!
What do these sound like?
- Infinity
- Rough
- Sharp
- Soft
- Hot
- Icy
- Heavy
- Squishy
- Solemn
- Potato
Recording Sounds
- Sounds can be recorded using direct-to-digital means (PC, MiniDisc, DAT, digital video tape etc.) or analogue means (tape recorder).
- Use a good microphone
- Eliminate / avoid background noise where possible (eg. record in a studio)
- Record source material at the highest sampling rate where appropriate
- Record source material at the highest possible level without clipping.
- The creation and recording of sounds in a studio as video/film footage plays is known as foley.
(E.g. Foley artists place footsteps, slam or knock on doors, drum fingers etc. where this information was impossible to record well in the field.)
- Improvise with sound effects...
pop paper bags, tap pencils, break glasses, crumple plastic bags, strike matches, wobble cardboard... there's no shortage of sound creating paraphernalia!
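The advice above to record at the highest possible level without clipping can be checked on a recording after the fact. The following is a rough sketch, assuming 16-bit signed samples already loaded into an array and treating any sample at full scale as clipped; the class and method names are invented for the example.

```java
/** Sketch: checking the level of a 16-bit recording for clipping. */
class LevelSketch {

    /** Largest absolute sample value in the recording (0..32768). */
    static int peak(short[] samples) {
        int peak = 0;
        for (short s : samples) {
            peak = Math.max(peak, Math.abs((int) s));
        }
        return peak;
    }

    /**
     * Counts samples sitting at the extreme values a 16-bit sample can
     * hold. This assumes such samples were clipped, which is likely but
     * not certain for any single sample.
     */
    static int countClipped(short[] samples) {
        int clipped = 0;
        for (short s : samples) {
            if (s == Short.MAX_VALUE || s == Short.MIN_VALUE) {
                clipped++;
            }
        }
        return clipped;
    }
}
```

A recording whose peak is far below 32767 was recorded too quietly; one with many clipped samples was recorded too loudly.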
Sounds for Interfaces
- Sound attracts attention independent of the user's current visual focus
- Sound can confirm an operation
- Sound can alert a user to a special event (eg. error, completion of computation...)
- Sound cannot be localized to an event on screen
- Sound must be accompanied by a visual cue.
- Sound must be approximately synchronized with a visual event (not delayed).
- Choose sound attributes to suit purpose.
(E.g. a siren is not much use to indicate a normal keystroke when typing. A quiet 'click' is useless as a reactor meltdown warning!)
- Be consistent (always!)
MIDI - Musical Instrument Digital Interface
- A protocol for communication between digital musical instruments.
- A stream of MIDI messages instructs instruments such as synthesizers to:
- Play a note of a certain pitch
- Play a note on a certain instrument
- Stop playing a note
- Alter sound parameters (loudness, envelopes, effects etc.)
...and lots more!
- A computer can be set up as a software synthesizer to receive and play MIDI messages if it stores at least General MIDI instrument data.
- MIDI files are MUCH smaller than digital audio files because the sound waveforms are stored or synthesized on the client.
- Only note information needs to be sent over the Internet, a considerable saving over sending the entire sound waveform!
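To see why MIDI data is so compact, it helps to look at the bytes of a few messages. The sketch below builds three common channel messages by hand using the status values from the MIDI 1.0 protocol (Note Off 0x80, Note On 0x90, Program Change 0xC0). The class and method names are invented for the example; in a real program the standard javax.sound.midi package provides a ShortMessage class for this job.

```java
/** Sketch: the raw bytes of a few MIDI channel messages. */
class MidiBytesSketch {

    static final int NOTE_OFF = 0x80;
    static final int NOTE_ON = 0x90;
    static final int PROGRAM_CHANGE = 0xC0;

    /** "Play a note of a certain pitch": note 0..127, velocity 0..127. */
    static byte[] noteOn(int channel, int note, int velocity) {
        return new byte[] {
            (byte) (NOTE_ON | (channel & 0x0F)),
            (byte) (note & 0x7F),
            (byte) (velocity & 0x7F)
        };
    }

    /** "Stop playing a note". */
    static byte[] noteOff(int channel, int note) {
        return new byte[] {
            (byte) (NOTE_OFF | (channel & 0x0F)),
            (byte) (note & 0x7F),
            0
        };
    }

    /** "Play on a certain instrument": selects a program 0..127. */
    static byte[] programChange(int channel, int program) {
        return new byte[] {
            (byte) (PROGRAM_CHANGE | (channel & 0x0F)),
            (byte) (program & 0x7F)
        };
    }

    public static void main(String[] args) {
        // Middle C (note number 60) on the first channel, then release it.
        byte[] on = noteOn(0, 60, 100);
        byte[] off = noteOff(0, 60);
        System.out.println("Bytes to start and stop one note: " + (on.length + off.length));
    }
}
```

Starting and stopping a note takes six bytes here, however long the note lasts, whereas each second of 16-bit mono audio at 44.1 kHz takes 88,200 bytes.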
Lecture summary:
- Five major attributes of sound: pitch; timbre; loudness; attack/decay (envelope) and duration.
- Digital sound is an encoding of analogue air-pressure waves at a certain sampling frequency.
- Sound may be used to improve a user interface but it may also annoy a user if used inappropriately.