Understanding and
Crafting the Mix:
The Art of Recording
William Moylan
AMSTERDAM • BOSTON • HEIDELBERG • LONDON
NEW YORK • OXFORD • PARIS • SAN DIEGO
SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
Focal Press is an imprint of Elsevier
30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
Linacre House, Jordan Hill, Oxford OX2 8DP, UK
Copyright © 2007, William Moylan. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form
or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior
written permission of the publisher.
Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford,
UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, e-mail: permissions@elsevier.com.uk. You may
also complete your request online via the Elsevier homepage (www.elsevier.com), by selecting “Customer
Support” and then “Obtaining Permissions.”
Recognizing the importance of preserving what has been written,
Elsevier prints its books on acid-free paper whenever possible.
Library of Congress Cataloging-in-Publication Data
Moylan, William.
Understanding and crafting the mix : the art of recording / William Moylan
p. cm.
Includes bibliographical references and index.
ISBN-13: 978-0-240-80755-3 (pbk. : alk. paper)
ISBN-10: 0-240-80755-3 (pbk. : alk. paper) 1. Sound--Recording and
reproducing. 2. Acoustical engineering. 3. Music theory. 4.
Music--Editing. I. Title.
TK7881.4.M693 2006
621.389’32--dc22
2006032818
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
For information on all Newnes publications,
visit our website at www.books.elsevier.com.
07 08 09 10 10 9 8 7 6 5 4 3 2 1
Printed in the United States of America
Table of Contents
Listing of Exercises ix
Foreword xii
Preface xvi
Acknowledgments xxi
Introduction xxii
Overview of Organization and Materials xxiii
Establishing an Accurate Playback of Recordings xxvii
Part One
Defining the Art of Recording: The Sound Characteristics
and Aesthetic Qualities of Audio Recordings
Chapter 1
The Elements of Sound and Audio Recording
3
The States of Sound 3
Physical Dimensions of Sound 5
Perceived Parameters of Sound 15
Summary 34
Exercises 35
Chapter 2
The Aesthetic and Artistic Elements
of Sound in Audio Recordings
36
The States of Sound and the Aesthetic/Artistic Elements 37
Pitch Levels and Relationships 38
Dynamic Levels and Relationships 42
Rhythmic Patterns and Rates of Activities 45
Sound Sources and Sound Quality 46
Spatial Properties: Stereo and Surround Sound 48
Conclusion 59
Exercises 60
Chapter 3
The Musical Message and the Listener
61
The Musical Message 61
Musical Form and Structure 63
Musical Materials 65
The Relationships of Artistic Elements and Musical Materials 66
Equivalence and the Expression of Musical Ideas 68
Text as Song Lyrics 70
The Listener 73
Conclusion 80
Exercises 81
Part Two
Understanding the Mix: Developing Listening
and Sound Evaluation Skills
Chapter 4
Listening and Evaluating Sound for the
Audio Professional
85
Why Audio Professionals Need to Evaluate Sound 86
Talking About Sound 87
The Listening Process 89
Personal Development for Listening and Sound Evaluation 95
Summary 98
Exercises 99
Chapter 5
A System for Evaluating Sound
100
System Overview 100
Sound Evaluation Sequence 103
Graphing the States and Activity of Sound Components 106
Plotting Sources Against a Time Line 112
Summary 114
Exercises 115
Chapter 6
Evaluating Pitch in Audio and Music Recordings
118
Analytical Systems 119
Realizing a Sense of Pitch 120
Recognizing Pitch Levels 121
Pitch Area and Frequency Band Recognition 124
Melodic Contour 131
Exercises 134
Chapter 7
Evaluating Loudness in Audio and
Music Recordings
138
Reference Levels and the Hierarchy of Dynamics 139
Program Dynamic Contour 146
Musical Balance 149
Performance Intensity versus Musical Balance 151
Exercises 153
Chapter 8
Evaluating Sound Quality
157
Sound Quality in Critical Listening Contexts 158
Sound Quality in Analytical Listening Contexts 159
Sound Quality and Perspective 160
Evaluating the Characteristics of Sound Quality and Timbre 161
Summary 173
Exercises 174
Chapter 9
Evaluating the Spatial Elements of
Reproduced Sound
176
Understanding Space as an Artistic Element 177
Stereo Sound Location 184
Distance Location 188
Environmental Characteristics 195
Space within Space 203
Surround Sound 207
Exercises 215
Chapter 10
Complete Evaluations and
Understanding Observations
224
Pitch Density and Timbral Balance 225
The Overall Texture 230
Relationships of the Individual Sound Sources and the
Overall Texture 233
The Complete Evaluation 239
Using Graphs for Making Evaluations and in Production Work 247
Summary 247
Exercises 249
Part Three
Crafting the Mix: Shaping Music and Sound, and
Controlling the Recording Process
Chapter 11
Bringing Artistic Judgment to the
Recording Process
255
Part Three Overview 255
The Signal Chain 256
Guiding the Creation of Music 259
Summary 260
Chapter 12
The Aesthetics of Recording Production
261
The Artistic Roles of the Recordist 262
The Recording and Reality: Shaping the Recording Aesthetic 263
The Recording Aesthetic in Relation to the Performance Event 266
Altered Realities of Music Performance 271
Summary 274
Chapter 13
Preliminary Stages: Defining the Materials
of the Project
275
Sound Sources as Artistic Resources, and the Choice of Timbres 277
Microphones: The Aesthetic Decisions of Capturing Timbres 280
Equipment Selection: Application of Inherent Sound Quality 295
Monitoring: The Sound Quality of Playback 298
Summary 306
Exercises 307
Chapter 14
Capturing, Shaping and Creating the
Performance
310
Recording and Tracking Sessions: Shifting of Focus
and Perspectives 311
Signal Processing: Shifting of Perspective to Reshape
Sounds and Music 317
The Mix: Composing and Performing the Recording 319
Summary 335
Exercises 335
Chapter 15
The Final Artistic Processes and an
Overview of Music Production Sequences
339
An Overview of Two Sequences for Creating a Music Recording 340
Editing: Rearranging and Suspending Time 344
Mastering: The Final Artistic Decisions 349
The Listener’s Alterations to the Recording 354
Concluding Remarks 355
Exercises 356
CD Contents and Track Descriptions 359
Glossary 369
Bibliography 381
Discography 389
Index 391
Listing of Exercises
These exercises are designed to develop listening, evaluation and produc-
tion skills.
Exercise 1-1
Learning the Sound Quality of the Harmonic Series.
Exercise 2-1
General Musical Balance and Performance Intensity Observations.
Exercise 3-1
Structure Exercise.
Exercise 4-1
Musical Memory Development Exercise.
Exercise 5-1
Exercise in Graphing Clock Time.
Exercise 5-2
Time Judgment Exercise.
Exercise 5-3
Exercise for Plotting the Presence of Sound Sources Against the Time Line.
Exercise 6-1
Pitch Reference Exercise.
Exercise 6-2
Pitch Level Estimation Exercise.
Exercise 6-3
Pitch Area Analysis Exercise.
Exercise 6-4
Melodic Contour Analysis Exercise.
Exercise 7-1
Reference Dynamic Level Exercise.
Exercise 7-2
Program Dynamic Contour Exercise.
Exercise 7-3
Musical Balance Exercise.
Exercise 7-4
Performance Intensity versus Musical Balance Exercise.
Exercise 8-1
Describing Sound Exercise.
Exercise 8-2
Sound Quality Evaluation Exercise.
Exercise 9-1
Stereo Location Exercise.
Exercise 9-2
Distance Location Exercise.
Exercise 9-3
Reflections and Reverberation Exercise.
Exercise 9-4
Environmental Characteristics Spectrum and Spectral Envelope Exercise.
Exercise 9-5
Environmental Characteristics Exercise.
Exercise 9-6
Exercise in Determining the Environmental Characteristics of the Perceived
Performance Environment.
Exercise 9-7
Space Within Space Exercise.
Exercise 9-8
Surround Sound Location Exercises.
a. Audience Perspective
b. Ensemble Perspective
Exercise 10-1
Pitch Density Exercise.
Exercise 10-2
Timbral Balance Exercise.
Exercise 10-3
Shifting Focus and Perspective Exercise.
Exercise 13-1
Loudness Perception Exercise.
Exercise 13-2
Identifying and Comparing Microphone Characteristics.
Exercise 13-3
Comparing Microphone Placements.
Exercises 13-4
Additional Exercises for Microphone Techniques.
Exercises 14-1
Tracking Exercises.
a. Input and Output
b. Microphone Selection and Placement
c. Performance Related Issues
Exercises 14-2
Signal Processing Exercises.
a. Equalization
b. Noise Gates
c. Compressors
d. Delay
e. Reverb
f. Filters
Exercises 14-3
Mixing Exercises.
a. Musical Balance
b. Timbral Balance
c. Stereo Location
d. Distance Location
Exercises 15-1
Exercises Modifying Completed Mixes.
a. Compression
b. Equalization
c. Reverberation
d. Limiting
Foreword
Nowhere within the mystery of creation is the concept of infinity more
closely demonstrated than in the human response to sound.
Sounds barely audible in the quiet solitude of a forest glade contain information
about direction, height, distance and character which unconsciously
provide awareness of our surroundings.
Sounds as you step into a great cathedral, hearing nearby soft footfalls on
ancient flagstones, the singing of a distant choir provides clues of distance
and perspective, and even invites sharing of mood.
Sounds in a concert hall, an office, a bathroom or a recording studio, each
has its own message prompting our response.
Blindfold, the smallest sample provides amazing awareness of our
environment.
The way sound behaves within a space paints a picture. We don’t have
to analyze, measure or evaluate. Created in the image of a communicat-
ing God, we communicate with speech to express what we think and with
music to express what we feel.
Music is as old as man. References to music, song, poetry and musical
instruments go back thousands of years. We see Egyptian bas-reliefs and
know that music played an enormous part in their culture and religion. Near
the end of King David’s reign, the Hebrews had professional Temple choirs
and a 4000 strong orchestra.¹ We read of strolling minstrels and court
musicians of later ages. But what did they sound like?
Every age of man is recorded in writing, painting and sculpture, but there
are no sound recordings.
Today we can choose from vast libraries of music from any part of the world,
which have become the benchmarks of artistic and technical quality for
many thousands of people who may never have even been to a concert.
Technical quality has improved enormously over the years but, surprising-
ly, we can still enjoy the very earliest recordings with all their imperfections
of noise, distortion, limited bandwidth and poor dynamic range. Why, then,
is The Art of Recording important?
As technology advances, realism has become more accurate but it seems
that the artistic qualities of a performance have become more elusive. Nei-
ther accuracy nor artistry can be tabulated in some book of rules. Unlike
hardware design (for example, computers, where precision and speed predominate
and the figures accurately define the performance), in the field
of sound recording, specifications and measurements cannot describe the
sounds that we hear and neither can they predict the effect of those sounds
on us. The recordist must provide the vital link as our interpreter.
In the 1953 Fourth Edition of his Radio Designer’s Handbook, Langford-Smith
states: “It is common practice to regard the ear as the final judge of
fidelity, but this can only give a true judgment when the listener has acute
hearing, a keen ear for distortion and is not in the habit of listening to distorted
music. A listener with a keen ear for distortion can only cultivate this
faculty by making frequent direct comparisons with the original music in
the concert hall.”²
We must cultivate A Point of Reference.
Truly successful recordists and producers have developed very high
degrees of refinement and can perceive qualities (or the lack of them!) in
the sound recording and reproducing chain which seem to defy reason and
sometimes to contradict our current state of knowledge.
In 1977, Geoff Emerick, who with George Martin recorded The Beatles at
Abbey Road and later at Air Studios in London, showed me that he could
hear a difference between two identical channels on a recently delivered
new console. After some hours of listening with him, I agreed that I could
hear a subtle difference. When we measured I found that, out of 48 chan-
nels, three had been incorrectly terminated and displayed a rise of 3 dB
at 54 kHz. The limit of hearing for most humans does not extend beyond
20 kHz and this small resonance, whilst obviously an oversight in the fac-
tory, would not normally have been regarded as important.
One of the significant features of this episode was that Geoff was deeply
“unhappy,” even “distressed” at what he was hearing or perceiving.
Since then I have seen much more evidence that the range beyond 20 kHz
is part of human awareness. Newly introduced designs which transmit fre-
quencies to beyond 100 kHz (with low distortion and noise) surprisingly
sound warmer, sweeter and fuller.
In 1987, when addressing the Institute of Broadcast Sound in London, I carried
out a simple experiment for the first time, with the object of discovering
what effect frequencies above 20 kHz might have on a professionally
aware audience.
A generator capable of switching between a sine and a square wave was
fed through an “ordinary” amplifier to an “ordinary” monitor loudspeaker.
The frequency was set at 3 kHz. The audience confirmed that they could
hear the third harmonic as a superimposed 9 kHz tone or “whistle,” when
the generator was switched to square wave. (A square wave contains predominantly
odd harmonics. When the first of these exceeds the limit of
hearing, the sine and square wave should sound the same.)
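The reasoning in the parenthetical above can be illustrated with a short sketch. It assumes an ideal square wave, whose Fourier series contains only odd harmonics with amplitudes falling as 1/n relative to the fundamental, and it takes 20 kHz as a nominal limit of hearing. The function name and the constant are illustrative choices and do not come from the original experiment.

```python
# Nominal upper limit of human hearing, as cited in the text (an assumption
# for this sketch; individual hearing varies).
HEARING_LIMIT_HZ = 20_000


def square_wave_harmonics(fundamental_hz, limit_hz=HEARING_LIMIT_HZ):
    """List the odd harmonics of an ideal square wave that lie above the
    fundamental and at or below limit_hz.

    Each entry is (harmonic number, frequency in Hz, amplitude relative
    to the fundamental). For an ideal square wave that amplitude is 1/n.
    """
    harmonics = []
    n = 3  # first harmonic above the fundamental in an ideal square wave
    while n * fundamental_hz <= limit_hz:
        harmonics.append((n, n * fundamental_hz, 1.0 / n))
        n += 2  # only odd harmonics are present
    return harmonics


if __name__ == "__main__":
    for f0 in (3_000, 7_000, 15_000):
        found = square_wave_harmonics(f0)
        if found:
            listing = ", ".join(f"{freq} Hz (x{n})" for n, freq, _ in found)
            print(f"{f0} Hz fundamental: harmonics within limit: {listing}")
        else:
            print(f"{f0} Hz fundamental: no harmonics within limit")
```

With a 3 kHz fundamental the sketch reports the 9 kHz third harmonic described in the text. Once the fundamental passes roughly 6.7 kHz, the third harmonic lies above 20 kHz and, under these assumptions, no harmonic remains within the nominal limit.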
The frequency was then progressively raised and older members soon admitted
that they could no longer hear the third harmonic “whistle.” But all could
still hear a difference in quality as the generator was switched between sine
and square. As the frequency continued to be slowly raised, some could still
identify a difference when the fundamental had reached 15 kHz.
This experiment has been repeated many times in different parts of the
world without any real attempt at scientific control. “Ordinary” equipment
provided by my host was used on every occasion. Results have been surprisingly
consistent: some 35 to 45 percent of those present being able to
identify a quality difference when the fundamental was as high as 15 kHz.
There was one exception: at the University of Massachusetts Lowell, some
60 percent of the audience were still identifying a difference when the fun-
damental frequency had exceeded 17 kHz.
At the 83rd Convention of the Audio Engineering Society, Dr. William
Moylan proposed a “Systematic Method for the Aural Analysis of Sound
Sources.”³ Dr. Moylan uses his method at the University of Massachusetts
Lowell Department of Music. I think this points to the success of Dr. Moylan’s
training in aural analysis! This same method has led to this book.
No measurements or formulae can ever replace the recordist, but he must
develop a reliable Point of Reference and learn “The Art of Recording.”
The learning process is endless. We never arrive at an ultimate state of
knowledge. We are, even now, only scratching the surface and this is espe-
cially so as we explore new formats to convey not only technical excellence
but a whole listening experience of music and its environment.
Can there be a perfect recording?
Only if we could arrive at perfect knowledge. How then, should such knowl-
edge be used? Could it change, for example, the world in which we live?
Stephen Hawking, in A Brief History of Time,⁴ seeks a unified theory, drawing
together the general theory of relativity and of quantum mechanics,
which would lead to a complete understanding of the events around us and
of our own existence. He says: “If we find the answer to that, it would be
the ultimate triumph of human reason, for then we would know the mind
of God.”
We need open minds to envision direction, responsible minds to choose
direction and a Point of Reference beyond ourselves to Whom we are ulti-
mately answerable.
Well over one hundred years ago, Lord Rayleigh told us that the ears are
the final arbiter of sound: “Directly or indirectly, all questions connected
with this subject must come for decision to the ear, as the organ of hearing;
and from it there can be no appeal. But we are not, therefore, to infer that
all acoustical investigations are conducted with the unassisted ear. When
once we have discovered the physical phenomena which constitute the
foundation of sound, our explorations are in great measure transferred
to another field lying within the dominion of the principles of Mechanics.
Important laws are in this way arrived at, to which the sensations of the ear
cannot but conform.”⁵
To follow William Moylan’s “important laws” will prepare your ears and your
mind for the true “Art of Recording.” His approach is proven and will lead the
reader’s ears to the refined levels of “keen-ness” and “acuteness” required
today. Dr. Moylan’s book also gives us insight into what turns “Recording”
into “Art,” and provides ways to bring artistic sensibility into our work.
Rupert Neve
Wimberley, TX
September 2001
Endnotes
1. Bible. 1 Chronicles 23:5. “ . . . and four thousand are to praise the LORD
with the musical instruments I have provided for that purpose.”
2. F. Langford-Smith, editor, Radio Designer’s Handbook, 4th edition,
Sydney, New South Wales: Wireless Press for Amalgamated Wireless
Valve Company Pty. Ltd, 1953, Ch. 14, Section 12, (iii).
3. William Moylan, “A Systematic Method for the Aural Analysis of Sound
Sources in Audio Reproduction/Reinforcement, Communications and
Musical Contexts,” paper read at the 83rd Convention of the Audio Engineering
Society, October, 1987.
4. Stephen W. Hawking, A Brief History of Time (Bantam Press, Toronto,
1992), pp. 173–175.
5. Lord Rayleigh, The Theory of Sound, first edition, 1877, New York: Dover
Publications, 1945.
Preface
This second edition carries a reversal of the title. This reflects my goal of
bringing the reader to a better understanding of the sound qualities of
recordings that will bring them to craft recordings that reflect focused artistic
and aesthetic vision; in short, to assist the reader in making the recordings
they want to make, and to help them understand the recordings of
others.
Two important additions for this edition are an accompanying CD of
56 tracks to illustrate many key concepts, to assist in developing certain listening
skills and to aid in establishing an accurate playback system; and a
glossary of terms and concepts to assist the reader in keeping track of new
concepts. A section added to the Introduction discusses “Establishing a
Quality Playback System”; it is extremely important for a reader’s playback
system to be accurate in order to hear and understand the book’s materials
correctly.
Part Three is significantly expanded in this second edition to bring more
focus onto crafting the mix. The accompanying CD, numerous exercises,
and more detailed discussions on aesthetics, techniques and objectives
clarify what is needed to craft a recording. Part Two addresses sound quality
and timbre evaluation in greater depth and uses the CD to clarify important
concepts; surround sound is given fuller coverage and has evolved;
and Chapter 10 has been extensively reworked and expanded to include
a section on “Shifting Focus and Perspective,” more coverage of timbral
balance and pitch density, and a reworking of “Complete Evaluations” to
distinguish between the overall texture and the individual sound source.
From the First Edition
Understanding and Crafting the Mix: The Art of Recording is the product of
my experiences as an educator in sound recording technology (music and
technology), and of my thought processes and observations as a composer
of acoustic music and of music for recordings (recording productions and
electronic music). It includes what I have learned through my creative work
as a recording producer and through my attempts to be a facile and trans-
parent recording engineer. It includes in-depth research into how we hear
music and sound as reproduced through loudspeakers (aural perception,
music cognition), and into the use of the recording medium to enhance
artistic expression, especially in music.
This book has evolved substantially since my initial research in the early
1980s. It has been greatly shaped by years of devising instructional meth-
ods and materials in my courses at the University of Massachusetts Lowell,
and by my interaction with many other audio educators and observations of
other recording programs worldwide. This evolution has been enhanced by
many other people and experiences as well—closely working with numer-
ous, talented production professionals in the audio industry representing
many different types and sizes of facilities in many locations, and many
conversations with individuals and companies engaged in audio product
development, design, and manufacturing.
The concepts and methods of this book have gone through many stages of
development. They will continue to change with new technologies and production
techniques and will continue to be refined as we learn more about
what we hear, and how we perceive sound and understand art.
I continue to be fascinated by how audio recording can be used to add
unique artistic qualities to music, and I look forward to the next new music
recording and the next new development in audio technology.
Purpose
Understanding and Crafting the Mix: The Art of Recording seeks to bring
the reader to understand how recorded sound is different from live sound,
and how those differences can enhance music. It will bring the reader to
explore how those sound characteristics appear in significant recordings by
The Beatles and others. The book also presents a system for the develop-
ment of critical and analytical listening skills necessary to recognize and
understand these sound characteristics.
This leads to the production process itself. The book seeks to move the
reader to consider audio recording as a creative process. Techniques and
technologies are purposely not covered. Instead, the book explores the
recording process as an act of creating art, and helps the reader envision
recording devices as musical instruments. It seeks to develop an artistic
sensitivity that will lead the reader to find and create their own unique
artistic voice in shaping and creating music recordings.
Unchanged from the first edition, Understanding and Crafting the Mix:
The Art of Recording is intended to be used: (1) as a resource book for all
people involved in audio; (2) as a textbook for courses in recording analysis,
critical listening (listening skills-related courses), audio production-
related courses of all types in sound recording technology, music engineer-
ing, music technology, media/communications, or related programs; or (3)
as a self-learning text for the motivated student, beginning professional or
interested amateur. It can also provide a unique look at some of the most
important recordings by The Beatles for all who may be interested. It has
been written to be accessible by people with limited backgrounds in acous-
tics, engineering, physics, math and music.
The intended reader might be an active professional in any one of the
many areas of the audio recording industry, or a student studying for a
career in the industry. The reader might also be learning recording through
self-directed study, or be anyone interested in learning about recording
and recorded sound, perhaps an audiophile.
The portions of the book addressing sound quality evaluation will be directly
applicable to all individuals who work with sound. Audio engineers in
technical areas will benefit from this knowledge, and the related listening
skills, as much as individuals in creative, production positions. Even those
involved in consumer audio and pro audio sales can benefit. All those people
who talk “about” sound can make use of the approach to evaluating
sound that is presented.
Finally, people interested in the music of The Beatles might gain new
insights. Discussions and examinations of their recordings appear through-
out the book. The timelessness of their music, and their use of then experi-
mental techniques and technologies, make these recordings especially
useful and appropriate. The reader may find these examples interesting
points of departure for further study of their music and their creative use
of recording.
As a resource book, Understanding and Crafting the Mix: The Art of Recording
is designed to contribute to the professional development of recordists.
The book seeks to clearly define the sonic dimensions of audio recording,
and brings the reader to approach recording production creatively. It
further seeks to expand the current professionals’ creative thinking, their
skills in critical and analytical listening, and their skill in and sensitivity to
accurate and meaningful communication about sound quality.
Many people actively engaged in the creative and artistic roles of the industry
are hard-pressed to describe their creative thought processes. The actual
materials they are crafting have not previously been well defined. The
current professional has likely had little guidance in identifying the skills
required in audio-recording production; it is likely their current skills were
developed intuitively and with little outside assistance. The recordist may
already have highly developed creative abilities and listening skills, yet be
unaware of the dimensions of those skills. This book will address these
areas, and assist the current and future professionals in discovering new
dimensions within their unique and personal creative voice.
Many excellent books exist on recording techniques and audio technolo-
gies. Articles in many excellent magazines, journals and serials exist that
cover recording devices and techniques, audio technologies and acoustic
concerns. These areas are not addressed here.
No sourcebooks or textbooks currently exist that discuss the aesthetic and
creative aspects of recording music or audio production. Few books exist
that discuss and develop the listening skills required to evaluate recordings
for technical quality, and none to evaluate the artistry of recordings or to
analyze the artistic content of recordings. This book seeks to fill this void.
It may be used as a sourcebook or a textbook in all of these areas, from
beginning through the most advanced levels.
The book may be used in a wide variety of courses, in many college and
university degree programs, and in vocational-type programs in audio and
music recording. Music engineering or music production (sound record-
ing technology) programs; music technology programs; communications,
multimedia, media, film, radio/television, or telecommunications programs;
and music composition programs emphasizing electronic/computer
music composition will all have courses that speak to the artistic aspects
of recording music (and sound).
The book is well suited to developing the student’s music-production skills
and artistry. It is designed to stimulate thought about the recording process
as being a collection of creative resources. These skills and creative ideas
can then be applied to the act of crafting the music recording artistically.
As a textbook for sound evaluation and listening-skill development, Understanding
and Crafting the Mix: The Art of Recording can be used for instruction
at various skill levels. The instructor may determine the levels of detail
and proficiency required of the students in performing listening evaluations
and adjust these studies and exercises accordingly. The development of listening
skills is a lengthy and involved process that will go through many
stages of accomplishment. The book is useable by students at the beginning
of their studies or at the most advanced levels, for graduate students as
well as for undergraduates and students in short courses and programs.
Graphing the activity of the various artistic elements is important for devel-
oping listening and evaluation skills, especially during beginning studies.
It is also valuable for performing in-depth evaluations of recordings that
allow us to study how the artistic aspects of recording have been used by
accomplished recording producers. This process of graphing the activity of
the various artistic elements is also a useful documentation tool. Working
professionals through beginning students will find the process useful.
Finally, it is hoped this book will clarify communication in some small way
and in some small segment of our industry.
All audio professionals are required to communicate “about” sound. The
many artistic and technical people of the industry need to communicate
clearly. We in the industry presently function without this meaningful
exchange of ideas. We often do not communicate well or accurately. In
order for communication to occur, a vocabulary must be present; terms
or descriptions must mean the same thing to the people involved, and the
terms must apply to something specific within the sound. It is my hope
that this book might serve as a meaningful point of departure for an “audio
recording vocabulary” to be devised.
Acknowledgments
My sincere thanks to Catharine Steers, Emma Baxter and Beth Howard
for bringing me through this second edition, as well as the other wonder-
ful editorial staff and production people at Focal Press and Elsevier—who
might be too numerous to mention but who have all made my work easier
and more enjoyable by their professionalism and constant willingness to
help. I am honored to be affiliated with this extraordinary organization and
group of people.
My students in the Sound Recording Technology program at the University
of Massachusetts Lowell have worked through the materials and concepts
of this book in a number of forms and have taught me a great deal. My
special thanks to these serious and gifted young people for their (largely
unknowing) contributions to this project.
Many other people have provided support for this book, or assisted in
shaping my thoughts about these materials. While they are far too many to
mention, they are all very important to me. Only a very few follow.
Thanks to Phil Reese, Erh-Chaun Lai, and Mark Whittaker for working
through some of the graphs and exercises, and to Ben Burrows and Alex
Case for their ideas on some of the book’s concepts.
The enclosed CD is a major addition from the first edition. My gratitude to
Eli Cohn, David Janco, Thomas Yahoub, Daniel Bolton and Sage Atwood
for giving of their talents and providing the CD’s performances; to Phillip
Reese for the long hours and hard work throughout this production, and to
Erh-Chaun Lai for his production assistance; and finally to Adam Ayan,
Scott Sperlich, Bob and Gail Ludwig and all the great people at Gateway
Mastering Studios for their exceptional work and care.
I am deeply indebted to Mr. Rupert Neve for generously giving of his time and
energies to provide his great insight to this book by writing its Foreword.
Finally, a special acknowledgment to Vicki and Zachary for their tolerance,
support, and reminders of what is important; and for making my life very
special.
Introduction
What makes recording music an art?
What makes the music recording a unique medium for artistic
expression?
What is different between a music recording and a live music
performance?
Why has the recordist (recording producer or engineer) become recognized
as an artist?
How does the recording process (recording techniques and technologies)
shape a piece of music?
It is widely recognized that the recording process shapes music. Recording
techniques and technologies change the qualities of acoustic sound and
impart new sound characteristics. These sound qualities are under the con-
trol of an individual that shapes the music recording—the recordist.
The changes in sound quality that are created by recording do not occur
in nature. They are unique to audio recordings and give recorded music
(or music reproduced over loudspeakers) a set of unique sound characteristics.
These characteristics may be very different from a live, unamplified
performance. These sound qualities have become accepted as part of the
experience of listening to recorded and reproduced sound, and of music.
The unique sound qualities of recording contribute to the character of a
recorded piece of music; they become part of the piece of music. How a
piece of music might be shaped has thus been extended to include the
unique sound qualities of music recordings. The person controlling, or cre-
ating, these sound qualities (the recordist) is functioning as a creative art-
ist. This person is a musician of sorts—“conducting” by encouraging and
ensuring quality performances, “performing” recording, mixing and pro-
cessing devices, and “composing” the mix.
Overview of Organization and Materials
The questions above will be answered during the course of this book. The
elements of sound that shape the artistic qualities of recordings have not
previously been defined. The definitions offered herein will allow for the
understanding necessary to answer these questions. This book will demonstrate
how these aspects of sound are shaped in the recording process and
will examine their appearance in well-known recordings by The Beatles.
A system for developing listening skills and understanding the artistry of
music recordings will also be presented. Through this process and its many
related exercises the reader will be brought to gain understanding of and
control over shaping the artistic aspects of audio recordings.
Accordingly, the book is divided into three parts:
Part One defines the artistic dimensions and sound characteristics of
music recordings
Part Two leads to an understanding of how those artistic dimensions
and sound characteristics are commonly used in production practice
and carries the reader through a system for listening-skill development
Part Three explores the process of artistically crafting a music recording
Part One. Defining the Art of Recording: The Sound Characteristics
and Aesthetic Qualities of Audio Recordings
Part One is divided into three chapters. To begin defining the artistic
dimensions of music recordings, sound must be understood. The states of sound
in air, in human perception, and as applied to music are followed in the
sequence of understanding the meaning of sounds. The processes of mov-
ing from one state of sound to another are explored. The anomalies that
occur in the transfer processes are recognized and evaluated.
Sound as a resource for artistic expression is the basis for Chapter 2.
This encompasses the unique sound qualities of recordings, and the poten-
tial of those qualities to be used in artistic expression. This is followed by
an examination of the musical message itself, leading to how musical
materials are perceived by the listener, and how that perception leads to
the understanding of musical messages and to communication.
Part One centers around understanding sound and the listening process
itself. It brings to light the importance of the listener in the communication
of musical materials.
People listen to sound and music at various levels of intellectual involve-
ment. They will at times listen passively and take an undirected “journey”
through the sensual and emotive states of the music. They might seek an
understanding of any literary or extra-musical ideas in the music (with the
presence and influences of any text or other image-inducing associations).
Listening can also take the form of an aesthetic listening experience (appre-
ciating the interrelationships of the abstract musical materials). These and
others are basically recreational, entertainment, therapeutic or self-enrich-
ment activities. The listening process is approached much differently when
it is part of one’s work.
When the listening process is approached as part of the professional
recordist’s responsibilities, it is always an active process that has the lis-
tener consciously engaged. How one listens, and what one listens for, is a
central concern for the recordist. Listening is one of the primary responsi-
bilities of the recordist. Any person wishing to enter the recording industry
must develop their listening skills in one way or another.
Different positions might require very different skills and responsibilities,
but nearly all careers in audio involve listening to, evaluating and/or
describing sound.
Part Two. Understanding the Mix: Developing Listening and
Sound Evaluation Skills
Part Two presents a complete system for evaluating the dimensions of
sound in music and audio recordings. The need for sound evaluation and
the contexts for sound evaluation in music recordings are discussed at the
beginning of this section.
Each element of sound is evaluated separately. How each element is used
to shape music recordings is presented. A method of evaluation has been
specifically devised for each individual element, and reflects actual use of
the elements in music recordings. Recordings by The Beatles are used as
examples of the qualities of sound being discussed. They provide excellent
examples of how the unique sound qualities of recordings have enhanced
the music and have at times contributed in fundamental ways to shaping
musical ideas.
The methods of evaluation for each individual sound quality are accom-
plished in relation to a complete, interrelated system of evaluating sound
in music recordings. A series of listening exercises is presented throughout
the course of the book to guide the reader in developing sound evaluation
and listening skills.
The system progresses from simple concepts and listening processes to
the most complex. It builds on experiences that are most easily learned
and evolves systematically to the most difficult. Listening experiences
that many audio professionals or intermediate-level musicians may have
already acquired (at least intuitively) are incorporated.
The system will develop the reader’s listening skills and will provide the
basis for meaningful and accurate communication on sound content and
quality.
An objective vocabulary and a way to evaluate sound have been devised
to allow precise information about sound to be recognized and communi-
cated. This information will be an actual account of the states or values of
the sound material. Subjective impressions of the sound’s quality (a very
common way musicians and recordists attempt to communicate about
sound) are always avoided. They do not allow communication to be accu-
rate and limit its value. The method avoids any personal impressions about
the sound quality and addresses only the physical dimensions of sound as
they are perceived and as they appear in the music. The method for eval-
uating sound will allow individuals to talk “about” sound in meaningful
ways. It will allow people to exchange precise and conclusive information
about the sound qualities, once they acquire the skills required to recognize
those qualities.
The reader will eventually gain the experience and the knowledge that will
allow them to perform quite complex listening and evaluation tasks, and to
describe sounds to others.
Time, practice, understanding, repetition, and concentrated effort are all
required to develop listening skills. Auditory memory will increase as the
listener becomes more accustomed to the sound material, more aware of
patterns of levels and changes within all of the artistic elements, and more
aware of how to focus attention on the aspects of sound we are conditioned
from birth to ignore. Developing refined listening skills is a long-term
project; the individual’s listening skills will likely continue to develop
throughout their career.
Part Three. Crafting the Mix: Shaping Music and Sound, and
Controlling the Recording Process
Part Three applies the artistic elements to the recording production process.
It presents the concepts and thought processes of recording production,
and relates them to the artistic aspects of recording and the artistic ele-
ments of sound. It will explore the concepts the recordist will work through
during the creative processes of crafting a music recording.
Part Three defines general principles. It will not present specific ways to
record, nor will it address specific pieces of equipment or specific recording
techniques. The principles covered will allow the reader to conceptualize
music recording as a creative process; it will place the reader in the posi-
tion of guiding the artistic product from its beginning as an idea, through its
development during the many stages of the recording sequence, to its final
form. It is intended to separate tools from technique, to bring the reader to
conceptualize the recording project without concern for the ever-changing
flow of devices.
This approach will explore the creative potentials of the audio recording
medium independently from equipment and technology concerns. How the
recording process shapes the characteristics of sounds will be explored,
and ways the reader can establish and exercise control of the artistry of
recording will be presented. The resources of the recording process are
considered for their potential to capture sound qualities, to perform the
music recording, and to generate the relationships of a piece of music.
The reader is encouraged to use their new listening and evaluation skills
to explore the productions of others and to try to emulate similar ideas in
their own productions—to develop their craft and their art.
Two possible recording production sequences are presented and contrast-
ed. Through these two scenarios, the aesthetic and artistic concerns of the
recording production process are evaluated. The artistic roles (or functions)
of the recordist are contrasted in relation to the sequences.
It is a goal of Understanding and Crafting the Mix: The Art of Recording to
bring the reader to explore the creative potentials of the medium’s tools
(equipment) as musical instruments and to develop an artistic sensitivity
through studying and understanding the creative works of others (Part
Two).
Applying the artistic elements of recording to the recording process (Part
Three) and evaluating the artistic elements in the recording process (Part
Two) will often occur simultaneously in production practice. They are,
however, two distinct processes. They are presented separately here for
clarity and for a thorough presentation. The two are interdependent, when
considering the evaluation of sound that occurs during the recording
production process.
Accompanying CD
Accompanying this book is a CD containing 56 tracks. Examples illustrate
many key concepts, and the reader will be referred to the CD during the
course of reading the text. Many examples will assist in developing impor-
tant listening skills, and other tracks will present dimensions of recorded
sound in ways that are more easily recognized than in most commercial
recordings. The reader will often want to listen to the CD while reading
certain sections; the CD is meant to clarify materials and make some of the
more abstract concepts much more tangible.
At the end of the CD are three tracks intended to aid in establishing an accu-
rate playback system. These are at the end of the CD so the reader does not
have to hear them each time the CD is started. Readers are strongly encour-
aged to evaluate their sound systems. Accurate and quality playback of the
enclosed CD, and of the commercial recordings cited throughout the book,
is necessary if the reader is to understand the materials being presented.
Establishing an Accurate Playback of Recordings
A reasonable-quality, accurate playback system located in an acceptable lis-
tening environment is needed to experience and learn the material discussed
in this book. It is important that the recorded examples be heard as intend-
ed—with a minimum of change from the playback system and the room—in
order to learn the materials being discussed. The reader should have access
to such a system in order to benefit from the readings. Headphones are not a
suitable substitute for accurate listening to nearly all of the examples.
High-resolution monitors in an acoustically neutral environment and an
impeccable signal chain are required for many activities performed by pro-
fessional recordists. It is not at all realistic to expect a beginner in the indus-
try, a student or an interested amateur to have such a system. It is, however,
necessary for the reader to hear the accompanying CD and the commercial
recordings cited throughout the book with some semblance of accuracy.
Putting together a quality sound system and establishing an accurate play-
back environment are covered in Chapter 13. Topics presented are:
Components and specifications for a high-quality playback system to
provide unaltered, detailed sound
Loudspeaker placement and listening-room interaction
Listener distance and orientation to loudspeakers
Monitoring levels
If readers are uncertain about having a suitable playback system, they are
encouraged to look over that material now, or certainly before playing any
of the listening examples.
The sound-system evaluation tracks (Tracks 54–56) on the accompanying CD
can help the reader evaluate their current system. It is strongly suggested
that readers do so in order to know the performance of their system, and
to ensure it is properly set up. It is also recommended they acquire a sound
level meter. The meter will not only assist in evaluating their
system, but will also help the reader to work through a num-
ber of exercises in this book and will lead the reader to estab-
lish accurate and healthy monitoring practices.
If a reader cannot establish a reasonable quality of playback
on their own, they are strongly encouraged to find a
suitable location where they can listen to the recordings
that are cited. This is an important part of learning the mate-
rials presented.
Listen . . . to tracks 54-56 for playback system set-up and calibration material; read the track descriptions for assistance.
Summary
Part One provides the necessary background to recognize the qualities of
sound. It defines the sound characteristics and aesthetic qualities of audio
and music recordings. It brings the reader to understand how the recording
process enhances music recordings, and has the potential to shape their
artistic qualities substantially.
Part Two provides a framework and a method to develop high-level listen-
ing skills. It will lead the listener to develop listening and sound evaluation
skills and to learn how to communicate objective and meaningful informa-
tion about sound. Important recordings by The Beatles will be examined to
illustrate the sound characteristics being studied.
Part Three focuses on the aesthetics of recording and mixing. This section
will not present prescriptions for production techniques to achieve cer-
tain results; nor will it provide information on selecting and using specifi c
equipment or technologies. It will lead the reader to discovery and creativ-
ity by developing problem solving and creative thinking. It will bring the
reader to gain control of the craft of recording music and shaping sound,
and to use the recording process in an artistic way.
Understanding and Crafting the Mix: The Art of Recording encourages the
reader to think broadly and deeply about the creative, artistic, and technical
processes involved in planning, executing and evaluating a music record-
ing, and it provides the guidance, the language and the training to enable
them to craft quality recordings. The reader is brought to think of a music
recording as a piece of music, rather than a product of technological pro-
cesses.
The goal of
Understanding and Crafting the Mix: The Art of Recording is
to bring the reader to an awareness of the dimensions of recorded sound,
and to an understanding of how the dimensions shape music recordings,
through the development of listening and evaluation skills. It seeks to lead
readers to control the recording process to create quality recordings, and
to find their own unique artistic voice in crafting music recordings.
Part One
Defining the Art of Recording:
The Sound Characteristics and
Aesthetic Qualities of Audio Recordings
1 The Elements of Sound and
Audio Recording
Audio recording is the recording of sound. It is the act of capturing the
physical dimensions of sound and then reproducing those dimensions
either immediately or from a storage medium (magnetic, vinyl, electronic,
digital), and thereby returning those dimensions to their physical, acous-
tic state. The process moves from physical sound, through the recording/
reproduction chain, and back to physical sound.
The “art” in recording centers on the artistically sensitive application of the
recording process. The recording process is being used to shape or create
sound as an artistic statement (piece of music), or supporting artistic mate-
rial. To be in control of crafting the artistic product, one must be in control
of the recording process, be in control of the ways in which the recording
process modifies sound, and be in control of communicating well-defined
creative ideas.
These areas of control of the artistic process all closely involve a human
interaction with sound. Inconsistencies between the various states of sound
are present throughout the audio-recording process. Many of these incon-
sistencies are the result of the human factor: the ways in which humans
perceive sound and interpret or formulate its meanings. In order for mate-
rial to be under their control, the artist (audio professional) must under-
stand the substance of their material: sound, in all its inconsistencies.
The States of Sound
In audio recording, sound is encountered in three different states. Each of
these three states directly influences the recording process and the creation
(or capturing) of a piece of art. These three states are:
1. Sound as it exists physically (having physical dimensions);
2. Sound as it exists in human perception (psychoacoustic conception):
sound being perceived by humans after being transformed by the ear
and interpreted by the mind (the perceived parameters of sound being
human perceptions of the physical dimensions); and
3. Sound as idea: sound as it exists as an aural representation of an
abstract or a tangible concept, as an emotion or feeling, or representing
a physical object or activity (this is how the mind finds meaning
from its attention to the perceived parameters of sound); sounds as
meaningful events, capable of communication, provide a medium for
artistic expression; sounds hereby communicate, have meaning.
The audio-recording process ends with sound reproduced over loudspeak-
ers, as sound existing in its physical state, in air. Often the audio-recording
process will begin with sound in this physical state, to be captured by a
microphone.
Humans are directly involved in all facets of the audio-recording process
through listening to sound. They evaluate the audio signal at all stages
while the recording is being made (including the
recordist—the person
making the recording—and all others involved in the industry), and what is
heard by the end listener is the reason for making the recording. Humans
translate the physical dimensions of sound into the perceived parameters
of sound through the listening process (aural perception).
This translation process involves the hearing mechanism functioning on
the physical dimensions of sound, and the transmission of neural signals to
the brain. The process is nonlinear and alters the information; the hearing
mechanism does not produce nerve impulses that are exact replicas of the
applied acoustic energy.
Certain aspects of the distortion caused by the translation process are,
in general, consistent between listeners and between hearings; they are
related to the physical workings of the inner ear or the transfer of the per-
ceived sounds to the mind/brain. Other aspects are not consistent between
listeners and between hearings; they relate to the listener’s unique hearing
characteristics and their experience and intelligence.
The final function occurs at the brain. At a certain area of the cortex, the neural
information is processed, identified, consciously perceived, and stored
in short-term memory; the neural signals are transferred to other centers of
the brain for long-term memory. At this point, the knowledge, experience,
attentiveness, and intelligence of the listener become factors in the under-
standing and perception of sound’s artistic elements (or the meanings or
message of the sound). The individual is not always sensitive or attentive to
the material or to the listening activity, and the individual is not always able
to match the sound to their previous experiences or known circumstances.
Figure 1-1 Three states of sound: in air, in perception, as message.
(Diagram labels: Acoustic Energy, Perception, Meaning.)
The physical dimensions (1) are interpreted as perceived parameters of
the sound (2). The perceived parameters of sound (2) provide a resource
of elements that allow for the communication and understanding of the
meaning of sound (and artistic expression) (3).
The audio-recording process communicates ideas, and can express feel-
ings and emotions. Audio might take the forms of music, dialog, motion-
picture action sounds, whale songs, or others. Whatever its form, audio is
sound that has some type of meaning to the listener. The perceived sound
provides a medium of variables that are recognizable and have meaning,
when presented in certain orders or patterns. Sound, as perceived and
understood by the human mind, becomes the resource for creative and
artistic expression. The artist uses the perceived parameters of sound as
the artistic elements of sound, to create and ensure the communication of
meaningful (musical) messages.
The individual states of sound as physical dimensions and as perceived
parameters will be discussed individually, in the next section. The interac-
tion of the perceived parameters of sound will follow the discussion of
the individual parameters. These discussions provide critical information
for understanding the breadth of the “artistic elements of sound” in audio
recording, presented in the next chapter.
Physical Dimensions of Sound
Five physical dimensions of sound are central to the audio-recording process.
These physical dimensions are: the characteristics of the sound waveform
as (1) frequency and (2) amplitude displacements, occurring within
the continuum of (3) time; the fusion of the many frequency and amplitude
anomalies of the single sound to create a global, complex waveform as (4)
timbre; and the interaction of the sound source (timbre) and the environment
in which it exists, which creates alterations to the waveform according to
variables of (5) space.
Figure 1-2 Dimensions of the waveform.
(Axes: amplitude against time in milliseconds, or distance. Labeled features: compression of air molecules, rarefactions, static barometric pressure (0), and one cycle.)
Frequency is the number of similar, cyclical displacements in the medium,
air, per time unit (measured in cycles of the waveform per second, or Hz).
Each similar compression/rarefaction combination creates a single cycle of
the waveform. Amplitude is the amount of displacement of the medium at
any moment, within each cycle of the waveform (measured as the magni-
tude of displacement in relation to a reference level, or decibels).
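As a brief illustration of these two definitions, the following Python sketch computes a frequency from the duration of one cycle and expresses an amplitude in decibels relative to a reference level. The function names and example values are assumptions chosen for illustration. The 20 micropascal reference is the conventional one for sound pressure in air and is not a value given in this text.

```python
import math

def frequency_hz(period_seconds):
    """Cycles of the waveform per second, given the duration of one cycle."""
    return 1.0 / period_seconds

def level_db(amplitude, reference):
    """Amplitude expressed in decibels relative to a reference amplitude."""
    return 20.0 * math.log10(amplitude / reference)

# One cycle lasting 1/440 of a second repeats about 440 times per second.
print(round(frequency_hz(1 / 440), 3))

# Hypothetical example: a sound pressure of 0.2 Pa measured against the
# conventional 20 micropascal reference for sound in air.
print(round(level_db(0.2, 20e-6), 1))
```

Doubling the amplitude raises the level by roughly 6 dB, which is why the decibel figure depends on the reference as much as on the measured displacement.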
Timbre
Timbre is a composite of a multitude of functions of frequency and ampli-
tude displacements; it is the global result of all the amplitude and frequency
components that create the individual sound. Timbre is the overall quality
of a sound. Its primary component parts are the dynamic envelope, spec-
trum, and spectral envelope.
The dynamic envelope of a sound is the contour of the changes in the overall
dynamic level of the sound throughout its existence. Dynamic envelopes
of individual acoustic instruments and voices vary greatly in content
and contours. The dynamic envelope is often thought of as being divided
into a number of component parts. These component parts may or may
not be present in any individual sound. The widely accepted components
of the dynamic envelope are: attack (time), initial decay (time), initial sustain
level, secondary decay (time), primary sustain level, and final decay
(release time).
Figure 1-3 Dynamic envelope.
Dynamic envelope shapes other than those created by the above outline
are common. Many musical instruments have more or fewer parts to their
characteristic dynamic envelope. Further, vocalists and the performers of
many instruments have great control over the sustaining portions of the
envelope, providing internal dynamic changes to sounds. Musical sounds
that do not have some variation of level during the sustain portion of the
envelope are rare; the organ is one such exception.
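One way to picture the component parts of the dynamic envelope is as a set of breakpoints joined by straight lines. The Python sketch below is only an illustration under that assumption: the times and levels are invented, and real instruments rarely follow such straight segments.

```python
# Breakpoints (time in seconds, level from 0.0 to 1.0) for a hypothetical
# envelope. All values are invented for illustration.
BREAKPOINTS = [
    (0.00, 0.0),   # start of the sound
    (0.02, 1.0),   # end of attack
    (0.05, 0.7),   # end of initial decay
    (0.10, 0.7),   # end of initial sustain
    (0.20, 0.5),   # end of secondary decay
    (0.80, 0.5),   # end of primary sustain
    (1.00, 0.0),   # end of final decay
]

def envelope_level(t, breakpoints=BREAKPOINTS):
    """Linearly interpolate the dynamic level at time t (seconds)."""
    if t <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, l0), (t1, l1) in zip(breakpoints, breakpoints[1:]):
        if t <= t1:
            return l0 + (l1 - l0) * (t - t0) / (t1 - t0)
    return breakpoints[-1][1]

for t in (0.01, 0.05, 0.15, 0.5, 0.9):
    print(f"{t:.2f} s -> level {envelope_level(t):.2f}")
```

Removing a pair of breakpoints models an instrument with fewer envelope parts, and adding small changes inside a sustain segment models the internal dynamic changes that performers supply.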
The spectrum of a sound is the composite of all of the frequency components
of the sound. It is composed of the fundamental frequency, harmonics,
and overtones, sometimes including subharmonics and subtones.
The periodic vibration of the waveform produces the sensation of a dominant
frequency. The number of periodic vibrations, or cycles of the waveform,
that repeat its characteristic shape each second is the fundamental frequency. The
fundamental frequency is also that frequency at which the sounding body
resonates along its entire length. The fundamental frequency is often the
most prominent frequency in the spectrum, and will often have the great-
est amplitude of any component of the spectrum.
In all sounds except the pure sine wave, frequencies other than the funda-
mental are present in the spectrum. These frequencies are usually higher
than the fundamental frequency. They may or may not be in a whole-number
relationship to the fundamental. Frequency components of the spectrum
that are whole-number multiples of the fundamental are harmonics; these
frequencies reinforce the prominence of the fundamental frequency (and the
pitched quality of the sound). Those components of the spectrum that are
not proportionally related to the fundamental we will refer to as overtones.
(Figure 1-3 axes: amplitude against time, with the sound divided into prefix and body. Legend: Initial Attack, A = time, 1 = level; Initial Decay, B = time; Initial Sustain, C = time, 2 = level; Secondary Decay, D = time; Primary Sustain, E = time, 3 = level; Final Decay, F = time.)
Traditional musical acoustics studies define overtones as being proportional
to the fundamental, but with a different sequence than harmonics (first overtone
= second harmonic, etc.); this traditional definition is herein replaced by
a differentiation between partials that are proportional to the fundamental
(harmonics) and those that are not (overtones). This distinction will prove
important in the evaluation of timbre and sound quality in later chapters.
All of the individual components of the spectrum are partials. Partials (over-
tones and harmonics) can exist below the fundamental frequency as well as
above; they are accordingly referred to as subharmonics and subtones.
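The distinction between harmonics and overtones can be sketched as a simple test of the ratio between a partial and the fundamental. The Python example below is a hedged illustration only. The function name and the tolerance are hypothetical, and treating a subharmonic as the fundamental divided by a whole number is an assumption of this sketch rather than a definition given in the text.

```python
def classify_partial(frequency, fundamental, tolerance=0.01):
    """Label a partial by its relationship to the fundamental.

    tolerance is a hypothetical allowance: how far the frequency ratio
    may stray from a whole number and still count as whole-number.
    """
    if abs(frequency - fundamental) <= tolerance * fundamental:
        return "fundamental"
    if frequency > fundamental:
        ratio = frequency / fundamental
        is_whole = abs(ratio - round(ratio)) <= tolerance
        return "harmonic" if is_whole else "overtone"
    # Below the fundamental (assumed relationship: fundamental / n).
    ratio = fundamental / frequency
    is_whole = abs(ratio - round(ratio)) <= tolerance
    return "subharmonic" if is_whole else "subtone"

for partial in (110.0, 150.0, 220.0, 440.0, 523.0):
    print(partial, "Hz:", classify_partial(partial, 220.0))
```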
Figure 1-4 Harmonic Series
Harmonic   Frequency (Hz)   Pitch
 1          55              A1
 2         110              A2
 3         165              E2
 4         220              A3
 5         275              C4
 6         330              E4
 7         385              G4
 8         440              A4
 9         495              B4
10         550              C5 (flat)
11         605              D5 (sharp)
12         660              E5
13         715              F5 (sharp)
14         770              G5 (flat)
15         825              A5 (flat)
16         880              A5
17         935              B5 (sharp)
Listen . . . to tracks 1 and 2 for the harmonic series played in individual frequencies and pitches, and as a chord.
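The frequencies of the harmonic series in Figure 1-4 follow directly from whole-number multiplication of the 55 Hz fundamental. The Python sketch below generates the series and estimates how far each harmonic falls from the nearest equal-tempered semitone. The function names are hypothetical, and the A4 = 440 Hz tuning reference and the use of equal temperament for comparison are assumptions of this sketch.

```python
import math

def harmonic_series(fundamental_hz, count):
    """Frequencies of the first `count` harmonics (whole-number multiples)."""
    return [n * fundamental_hz for n in range(1, count + 1)]

def cents_from_equal_temperament(frequency_hz, a4_hz=440.0):
    """Deviation in cents from the nearest equal-tempered semitone.

    Negative values are flat and positive values are sharp. The
    A4 = 440 Hz tuning reference is an assumption of this sketch.
    """
    semitones = 12.0 * math.log2(frequency_hz / a4_hz)
    return 100.0 * (semitones - round(semitones))

for n, f in enumerate(harmonic_series(55.0, 17), start=1):
    print(f"{n:2d}  {f:6.1f} Hz  {cents_from_equal_temperament(f):+6.1f} cents")
```

Harmonics that are octaves of the fundamental show no deviation, while others, such as the seventh, sit noticeably away from the tempered pitch.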
For each individual instrument or voice, certain ranges of frequencies
within the spectrum will be emphasized consistently, no matter the fun-
damental frequency. Instruments and voices will have resonances that
will strengthen those spectral components that fall within these definable
frequency ranges. These areas are called formants, formant regions, or
resonance peaks. Formants remain largely constant, and modify the same
frequency areas no matter the fundamental frequency. Spectral modifications
will be present in all occurrences of the sound source with harmonics
or overtones in the formant regions. Formants can appear as increases in
the amplitudes of partials that appear in certain frequency bands, or as
spectral components in themselves (such as noise transients caused by a
hammer striking a string). They can also be associated with resonances of
the particular mechanism that produced the source sound. Formants are
largely responsible for shaping the characteristic sounds of specific instruments;
they allow us to differentiate between the instruments of different
manufacturers, or even to tell the difference between two instruments of
the same model and maker.
A sound’s spectrum is composed primarily of partials
that create a characteristic pattern, which is recognizable
as being characteristic of a particular instrument or voice.
This pattern of spectrum will transpose (change level but
maintain the same distances between frequencies/partials)
with every new fundamental frequency of the same sound
source and remain mostly unchanged. This consistent pat-
tern will form a similar timbre at different pitch levels. For-
mants establish frequency areas that will be emphasized
for a particular instrument or voice. These areas will not change with varied
fundamental frequency, as they are fixed characteristics (such as resonant
frequencies) of the device that created the sound. Formants may also take
the form of spectral information that is present in all sounds produced by
the instrument or voice.
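The contrast between a spectrum that transposes with the fundamental and a formant that stays fixed can be sketched numerically. In the Python example below, the formant centre, its width, the shape of the resonance, and the 1/n source spectrum are all invented for illustration and do not describe any real instrument.

```python
import math

# Hypothetical formant region: centre frequency and width are invented.
FORMANT_CENTER_HZ = 1000.0
FORMANT_WIDTH_HZ = 200.0

def formant_gain(frequency_hz):
    """Emphasis applied by a fixed resonance; it peaks at the formant centre."""
    offset = (frequency_hz - FORMANT_CENTER_HZ) / FORMANT_WIDTH_HZ
    return 1.0 + 3.0 * math.exp(-0.5 * offset * offset)

def partial_amplitudes(fundamental_hz, count=12):
    """(frequency, amplitude) pairs: a 1/n source spectrum shaped by the formant."""
    partials = []
    for n in range(1, count + 1):
        frequency = n * fundamental_hz
        partials.append((frequency, (1.0 / n) * formant_gain(frequency)))
    return partials

def most_emphasized(fundamental_hz):
    """Frequency of the upper partial (n >= 2) that the formant boosts most."""
    upper = partial_amplitudes(fundamental_hz)[1:]
    return max(upper, key=lambda pair: formant_gain(pair[0]))[0]

for fundamental in (220.0, 330.0):
    print(fundamental, "Hz fundamental: strongest boost at",
          most_emphasized(fundamental), "Hz")
```

The partials move when the fundamental changes, but the most strongly boosted partial stays close to the same frequency area in both cases, which is the behavior the text attributes to formants.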
Figure 1-5 Formant regions of two pitches from a hypothetical instrument. Vertical lines represent the partials of the two pitches, placed at specific frequencies and at specific amplitudes.
The frequencies that comprise the spectrum (fundamental frequency, har-
monics, overtones, subharmonics and subtones) all have different ampli-
tudes that change independently over the sound’s duration. Thus, each par-
tial has a different dynamic envelope. Altogether these dynamic envelopes
of all the partials make up the spectral envelope. The spectral envelope is
the composite of each individual dynamic level and dynamic envelope of
all of the components (partials) of the spectrum.
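Because the spectral envelope is a collection of dynamic envelopes, one per partial, it can be pictured as a table of levels indexed by frequency and time. The Python sketch below builds such a table. The decay rule for each partial is invented for illustration and is not drawn from measurements of any instrument.

```python
import math

def partial_level(n, t):
    """Hypothetical dynamic envelope of partial n at time t (seconds).

    Each partial starts at 1/n and decays exponentially, with higher
    partials decaying faster. Both choices are invented for illustration.
    """
    return (1.0 / n) * math.exp(-2.0 * n * t)

def spectral_envelope(fundamental_hz, partial_count, times):
    """Map each partial's frequency to its list of levels over time."""
    return {
        n * fundamental_hz: [partial_level(n, t) for t in times]
        for n in range(1, partial_count + 1)
    }

times = [0.0, 0.1, 0.2, 0.3, 0.4]
for frequency, levels in spectral_envelope(220.0, 5, times).items():
    print(f"{frequency:6.1f} Hz:", " ".join(f"{level:.3f}" for level in levels))
```

Reading across one row gives the dynamic envelope of a single partial, and reading down one column gives the spectrum at a single moment.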
The component parts of timbre (dynamic envelope, spectrum, and spectral
envelope) display strikingly different characteristics during different parts
of the duration of the sound. The duration of a sound is commonly divided
into two time units: the prefix or onset, and the body. The initial portion of
the sound is the prefix or onset; it is markedly different from the remainder
of the sound, the body. The time length of the prefix is usually determined
by the way a sound is initiated, and is often the same time unit as the initial
attack. The actual time increment of the prefix may be anywhere from a few
microseconds to 20–30 ms.
[Figure 1-5 graphic: amplitude plotted against frequency (Hz, 250 to 3000) for two pitches, A3 (220 Hz) and E4 (330 Hz), marking the resonance areas (frequency bands) and the area of noise burst during the sounds’ onsets.]
Figure 1-6 Spectral envelope.
The prefix is defined as the initial portion of the sound that has markedly
different characteristics of dynamic envelope, spectrum, and spectral envelope
than the remainder of the sound. The body of the sound is usually
much longer in duration than the prefix. (See Figure 1-3.)
Space
The interaction of the sound source (timbre) and the environment in which
it is produced will create alterations to sound. These changes to the sound
source’s sound quality are created by the acoustic space. The nature of
these alterations is directly related to (1) the characteristics of the acoustic
space in which the sound is produced and (2) the location of the sound
source within the environment.
Space-related sound measurements must be performed at a specific physical
location. The measurements are calculated from the point in space where
a receptor (perhaps a microphone or a listener) will capture the composite
sound (the sound source within the acoustic space). The location of the
listener (or other receptor) becomes a reference in the measurement of the
acoustic properties of space.
The aspects of space that influence sound in audio recording are: (1) the
distance of the sound source to the listener, (2) the angle of the sound
source to the listener, (3) the geometry of the environment in which the
sound source is sounding, and (4) the location of the sound source within
the host environment.
[Figure 1-6 graphic: relative amplitude plotted against time in seconds (0.1 to 0.4) and frequency in hertz (1k to 5k).]
Figure 1-7 Paths of reflected sound within an enclosed space.
The environment in which the sound source is sounding is often referred
to as the host environment. Within the host environment, sound will travel
on a direct path to the listener (as direct sound) and sound will bounce off
reflective surfaces before arriving at the listener (as reflected sound).
Reverberant sound is a composite of many reflections of the sound arriving
at the listener (or microphone) in close succession. The many reflections
that comprise the reverberant sound are spaced so closely that the individual
reflections cannot be measured; these many reflections are therefore
considered as a single entity. As time progresses, these reflections become
more closely spaced and of diminishing amplitude,
until they are no longer of consequence. Reverberation time (often referred
to as RT60) is the length of time required for the reflections to reach an
amplitude level 60 dB lower than that of the original sound source.
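As a rough illustration of the 60 dB figure, the sketch below converts a level change in decibels to a linear amplitude ratio and estimates RT60 from a decay rate. It assumes an idealized, constant decay in dB per second, which real rooms only approximate; the function names and the example decay rate are hypothetical and are not taken from the text.

```python
def db_to_amplitude_ratio(level_db):
    """Convert a level change in dB to a linear amplitude ratio."""
    return 10 ** (level_db / 20)


def rt60_from_decay_rate(decay_db_per_second):
    """Seconds needed for the level to fall 60 dB at a constant decay rate."""
    if decay_db_per_second <= 0:
        raise ValueError("decay rate must be positive")
    return 60.0 / decay_db_per_second


# A 60 dB drop leaves one thousandth of the original amplitude.
ratio = db_to_amplitude_ratio(-60)

# Hypothetical room whose reverberant level falls 40 dB every second.
rt60 = rt60_from_decay_rate(40.0)

print(f"amplitude ratio at -60 dB: {ratio:.4f}")
print(f"RT60: {rt60:.2f} s")
```

Under these assumptions, a faster decay rate gives a shorter reverberation time, and the 60 dB threshold corresponds to an amplitude one thousand times smaller than the original.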
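Figure 1-8 Reflected sound.
[Figure 1-7 graphic labels: direct sound; walls, ceiling, and floor.]
[Figure 1-8 graphic: amplitude (dB, 0 to -60) plotted against time for a sound impulse, showing the direct sound, the arrival time gap, the early reflections (early sound field), and the reverberant sound.]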
Early reflections are those reflections that arrive at the ear or microphone
within around 50 ms of the direct sound. As a collection, the reflections
that arrive at the receptor within the first 50 ms after the arrival of the direct
sound comprise the early sound field.
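The 50 ms window can be related to path lengths with a small calculation. The sketch below is only illustrative: it assumes a speed of sound of 343 m/s (a common approximation for air at about 20 °C, not a figure from the text), and the path lengths are hypothetical.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air at about 20 °C
EARLY_WINDOW_MS = 50.0          # early-reflection window cited in the text


def arrival_delay_ms(direct_path_m, reflected_path_m):
    """Delay of a reflection after the direct sound, in milliseconds."""
    extra_distance_m = reflected_path_m - direct_path_m
    return 1000.0 * extra_distance_m / SPEED_OF_SOUND_M_PER_S


def is_early_reflection(delay_ms):
    """True when the reflection arrives within the early window."""
    return 0.0 < delay_ms <= EARLY_WINDOW_MS


direct_m = 5.0                          # hypothetical direct path
reflected_paths_m = [7.0, 12.0, 30.0]   # hypothetical reflected paths

delays = [arrival_delay_ms(direct_m, p) for p in reflected_paths_m]
early = [is_early_reflection(d) for d in delays]

for path, delay, flag in zip(reflected_paths_m, delays, early):
    label = "early reflection" if flag else "later reflection"
    print(f"{path:5.1f} m path: {delay:6.2f} ms after direct sound ({label})")
```

With these assumed values, a reflection must travel roughly 17 m farther than the direct sound before it falls outside the early sound field.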
Varying the distance of the sound source from the receptor (ear or microphone)
alters the sound at the receptor. The sound at the receptor will be
a composite of the direct sound and the reflected sounds (reverberation
and early sound field). The composite sound at the receptor is affected by
the distance of the sound source from the receptor in two ways: (1) low-amplitude
portions of the sound’s spectrum (usually high frequencies)
are lost with increasing distance of the sound source to the receptor and
(2) reflected sound increases in prominence relative to the direct sound as distance
increases. Figure 1-9 illustrates the loss of timbral detail (the subtle aspects
and changes in the content of a sound’s timbre, also called definition of
timbre) with increasing distance as well as the change of the proportion of
direct to reflected sound.
The characteristic changes to the composite sound, caused by the geometry
of the host environment and by the location of the sound source within the
host environment, are also influenced by the changes caused by distance.
Figure 1-9 Changes in sound with distance.
[Figure 1-9 graphic: direct sound amplitude, wave form, timbral detail, and the ratio of direct to reflected sound, shown over time at distances of 10 meters and 100 meters.]
These two dimensions of the relationship of the sound source to its acoustic
space may alter the composite sound in four additional ways: (1) timbre
differences between the direct and reflected sounds; (2) time differences
between the arrivals of the direct sound, the initial reflections, and
the reverberant sound; (3) spacing in time of the early reflections; and (4)
amplitude differences between direct and reflected sounds.
The geometry of the host environment greatly influences the content of
the composite sound. The dimensions and volume of the space, the angles
of boundaries (walls, floors, ceilings), materials of construction, and the
presence of openings (such as windows) and large objects within the space
will all alter the composite sound. Host environments cover the gamut of
all the physical spaces and open areas that create our reality (from a small
room to a large concert hall, from the corridor of a city street to an open
field, etc.).
Unique sequences of reflected sound are created when a sound is produced
within an environment, and these sequences are shaped by the location
of the sound source within the host environment. These unique sequences
contain patterns of reflections that are defined by the spacing of reflections
over time and the amplitudes of the reflections. A “rhythm of reflections”
exists and will form the basis of important observations in later chapters.
By altering the early time field and reverberant sound, the location of the
sound source within the host environment may cause significant alterations
to the composite sound at the receptor (ear or microphone).
Figure 1-10 Patterns of reflections.
The location of the sound source within the host environment may strongly
influence the composite sound. The amount of influence will be directly
related to the proximity of the sound source to the walls, ceiling, floor,
openings (such as windows and doors), and large objects reflecting sound
within the host environment.
[Figure 1-10 graphic: amplitude plotted against time for the direct sound and two reflection patterns, Pattern 1 and Pattern 2.]
In audio production, the spatial properties of host environments
and the location of a sound source within the environment
can be generated artificially. It is common to use
reverberation units and delays to create environmental
cues. These cues may be very realistic representations of
natural spaces or they may be environmental characteristics
that cannot occur in our physical reality.
Figure 1-11 Horizontal and vertical planes.
The angle of the sound source to the receptor is an important influence in
audio recording. The sound source may be at any angle from the receptor
(listener or omnidirectional microphone) and be detected. The sound
source may be present at any location in the sphere surrounding the receptor.
The location is calculated with reference to the 360° vertical and horizontal
planes that encompass the receptor.
The angle of the sound source to the receptor may be calculated against
the horizontal plane (parallel to the floor), the vertical plane (height), or
by combining the two (in a way very similar to positioning locations on a
globe). Defining elevation (vertical plane) and direction (left, right, front,
rear) can determine the precise location of the sound source within our
three-dimensional space by precise increments of degrees.
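The globe analogy can be sketched numerically by converting a horizontal angle, an elevation, and a distance into three-dimensional coordinates. The conventions below are assumptions made for illustration only: 0° on the horizontal plane is straight ahead, positive horizontal angles are to the right, and positive elevation is above the horizontal plane. The function name is hypothetical.

```python
import math


def source_position(horizontal_deg, elevation_deg, distance_m):
    """Return (right, front, up) coordinates in meters for a sound source.

    Assumed conventions: 0 degrees horizontal is straight ahead, positive
    horizontal angles are to the right, positive elevation is upward.
    """
    horizontal = math.radians(horizontal_deg)
    elevation = math.radians(elevation_deg)
    # Distance projected onto the horizontal plane.
    ground_distance = distance_m * math.cos(elevation)
    right = ground_distance * math.sin(horizontal)
    front = ground_distance * math.cos(horizontal)
    up = distance_m * math.sin(elevation)
    return right, front, up


# A source 2 m away, 30 degrees to the right, level with the receptor.
x, y, z = source_position(30.0, 0.0, 2.0)
print(f"right: {x:.2f} m, front: {y:.2f} m, up: {z:.2f} m")

# A source 1 m directly overhead.
overhead = source_position(0.0, 90.0, 1.0)
```

Any other sign convention would work equally well, provided it is applied consistently to both planes.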
Listen . . .
to tracks 34-36
for a realization of Figure 1-10,
where the rhythms of these patterns
of reflections are sounded
separately and together.
[Figure 1-11 graphic: the horizontal plane (front, rear, left, right) and the vertical plane surrounding the listener.]
Figure 1-12 Defining sound source angle from a microphone.
Angles of source locations on the horizontal plane are captured or generated
in audio recording to provide stereo and surround sound. To date, the
vertical plane has received little attention in audio because of playback format
difficulties. Recent surround sound advances have produced formats
that provide these cues in ways that can strikingly enhance programs.
Perceived Parameters of Sound
The five physical dimensions of sound translate into respective perceived
parameters of sound. Sound as it exists in human perception is quite dif-
ferent from sound in its physical state, in air. Our perception of sound is a
result of the physical dimensions being transformed by the ear and inter-
preted by the mind. The perceived parameters of sound are our percep-
tions of the physical dimensions of sound.
This translation process from the physical dimensions to the perceived
parameters is nonlinear, and differs between individuals. The hearing
mechanism does not directly transfer acoustic energy into equivalent nerve
impulses. The human ear is not equally sensitive in all frequency ranges, nor
is it equally sensitive to sound at all amplitude levels. This nonlinearity in
transferring acoustic energy to neural impulses causes sound to be in a dif-
ferent state in our perception than what exists in air. Thus, the physical states
of sound captured by recording equipment will be heard by the recordist in
ways that may be unexpected, without knowledge of these differences.
Complicating this further, there is no reason to believe any two people actu-
ally hear the characteristics of sound in precisely the same way. If it were
possible for all conditions for two sounds to be identically sent to two listen-
ers, the two people likely would hear slightly (or strikingly) different charac-
teristics. We only need notice the different ear shapes around us to recog-
nize no two people will pick up acoustic energy in precisely the same way.
[Figure 1-12 graphic: the vertical plane viewed from the side and the horizontal plane viewed from above, each marked in 30° increments from -180° to 180°.]
Table 1-1 The Physical Dimensions and the Perceived Parameters of Sound
Physical Dimensions              Perceived Parameters
Frequency                        Pitch
Amplitude                        Loudness
Time                             Duration
Timbre (physical components)     Timbre (perceived overall quality)
Space (physical components)      Space (perceived characteristics)
Pitch
Pitch is the perception of the frequency of the waveform. It has been defined
as the perceived position of a sound on a scale from low to high, and as an
attribute of hearing sensation by which sounds may be ordered on a musical
scale. Pitch is a subjective attribute and cannot be measured. We assign
values to pitches, to allow us to understand their relationships to other
pitches. We organize these values into tuning and harmonic systems, which
give rise to melody and harmony. These systems could have been created
differently (and are therefore subjective); indeed, they differ markedly
between cultures and have evolved substantially over the centuries.
A clear pitch sensation is perceived when a sound wave regularly repeats
a wave shape of very similar characteristics. The number of periodic repetitions
of the shape per second allows a fundamental frequency to be perceived,
and an unambiguous pitch sensation results. This is further solidified
by the presence of frequency components that are integer multiples of
the fundamental (harmonics). Thus, the physical dimensions of sound are
transformed into our perception of pitch.
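The idea of harmonics as integer multiples of the fundamental can be made concrete with a short calculation. This is a minimal sketch: the function name is hypothetical, and the 220 Hz fundamental and 3,000 Hz upper limit are chosen only as convenient example values.

```python
def harmonics(fundamental_hz, upper_limit_hz):
    """Integer multiples of the fundamental, up to and including the limit."""
    count = int(upper_limit_hz // fundamental_hz)
    return [n * fundamental_hz for n in range(1, count + 1)]


# Harmonics of a 220 Hz fundamental that fall at or below 3,000 Hz.
series = harmonics(220, 3000)
print(series)
```

The first entry is the fundamental itself; each later entry lies one fundamental's width (here 220 Hz) above the previous one.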
The frequency area most widely accepted as encompassing the hearing
range of the normal human spans the boundaries of 20–20,000 Hz (20 kHz),
though humans are sensitive to (if not actually able to hear) frequencies
well below and above this range.
Most humans cannot identify specific pitch levels. Some people have been
blessed with, or have developed, the ability to recognize specific pitch levels
(in relation to specific tuning systems). These people are said to have
“absolute” or “perfect pitch.” The ability to accurately recognize pitch levels
is not common even among well-trained musicians.
It is commonly within human ability, however, to determine the relative
placement of a pitch within the hearing range. A register is a specific portion
of the range. It is entirely possible to determine, within certain consistent
limits of accuracy, the relative register of a perceived pitch level. This
skill can be developed and accuracy improved significantly.
We are able to consistently perform the estimation of the approximate
level of a pitch, associating pitch level with register. With practice, this
consistency can be accurate to within a minor third (within three semitones).
This skill in the “estimation of pitch level” will be an important part
of the method for evaluating sound presented later.
Humans perceive pitch most accurately as relationships. We perceive pitch
as the relationship between two or more soundings of the same or related
sound sources. We do not perceive pitch as identifiable, discrete increments;
we do not listen to pitch material to define the letter names (increments)
of pitches. Instead, we calculate the distance (or interval) between
pitches by gauging the distance between the perceived levels of the two
(or more) pitches.
The interval between pitches becomes the basis for all judgments that
define and relate the sounds. Thus, melody is the perception of successively
sounded pitches (creating linear intervals), and chords are the perceptions
of simultaneously sounded pitches (creating harmonic intervals).
We often perceive pitch in relation to a reference level (one predominating
pitch that acts as the key or pitch-center of a piece of music), or to a system
of organization to which pitches can be related (a tonal system, such as
major or minor).
Our ability to recognize the interval between two pitches is not consistent
throughout the hearing range. Most listeners have the ability to accurately
judge the size of the semitone (or minor second, the smallest musical interval
of the equal-tempered system) between 60 Hz and 4 kHz. As
pitch material moves below 60 Hz, a typical listener will have increased
difficulty in accurately judging interval size. As pitch material moves above
4 kHz, the typical listener will also experience increased difficulty in accurately
judging interval size.
The smallest interval humans can accurately perceive is not consistent
throughout our hearing range. It changes with the register of the two pitches
creating the interval. The size of the minimum audible interval varies
from about 1/12 of a semitone between 1 and 4 kHz, to about half of a semitone
(a quarter-tone) at approximately 65 Hz. These figures are dependent upon
optimum duration and loudness levels of the pitches; sudden changes of
pitch level are up to 30 times easier to detect than gradual changes. It is
possible for humans to distinguish up to 1,500 individual pitch levels by
spacing out the appropriate minimum-audible intervals throughout the
hearing range.
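To give a sense of scale, interval sizes in semitones can be translated into frequency differences in hertz. The sketch below assumes the equal-tempered system, in which one semitone is a frequency ratio of 2^(1/12); the function names are hypothetical and the results are approximations rather than measured thresholds.

```python
import math


def interval_in_semitones(low_hz, high_hz):
    """Size of the interval between two frequencies, in equal-tempered semitones."""
    return 12 * math.log2(high_hz / low_hz)


def step_in_hz(frequency_hz, semitones):
    """Frequency increase produced by moving up the given number of semitones."""
    return frequency_hz * (2 ** (semitones / 12) - 1)


# Interval between 220 Hz and 330 Hz (close to seven semitones).
fifth = interval_in_semitones(220.0, 330.0)

# Minimum audible intervals quoted in the text, expressed in hertz.
fine_step = step_in_hz(1000.0, 1 / 12)   # 1/12 semitone at 1 kHz
coarse_step = step_in_hz(65.0, 0.5)      # half a semitone at 65 Hz

print(f"220 Hz to 330 Hz: {fifth:.2f} semitones")
print(f"1/12 semitone at 1 kHz: {fine_step:.2f} Hz")
print(f"1/2 semitone at 65 Hz: {coarse_step:.2f} Hz")
```

Under this assumption the two thresholds are only a few hertz wide in absolute terms, even though they differ by a factor of six when measured in semitones.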
With all factors being equal, the perception of harmonic intervals (simulta-
neously sounding pitches) is more accurate than the perception of melodic
intervals (successively sounded pitches). Up to approximately 500 Hz, melod-
ic and harmonic intervals are perceived equally well. Above 1 kHz, humans
begin to be able to judge harmonic intervals with greater accuracy than
melodic intervals; above 3,500 Hz, this difference becomes pronounced.
Loudness
Loudness is the perception of the overall excursion of the waveform (amplitude).
Amplitude can be physically measured as a sound pressure level. In
perception, loudness level cannot be accurately perceived in discrete levels.
Loudness is referred to in relative values, not as having separate and distinct
levels of value. Traditionally, loudness levels have been described by
analogy (“louder than,” “softer than,” etc.) or by relative values (“soft,”
“medium loud,” “very soft,” “extremely loud,” etc.). Humans compare loudness
levels and conceive loudness levels as being “louder than” or “softer
than” the previous, succeeding, or remembered loudness level(s).
A great difference exists between loudness as perceived by humans and
the physical amplitude of the sound wave. This difference can be quite
large at certain frequencies. In order for a sound of 20 Hz to be audible, a
sound pressure level of 75 dB must be present. At 1 kHz, the human ear will
perceive the sound with a minute amount of sound pressure level, and at
10 kHz a sound pressure level of approximately 18 dB is required for audibility.
The unit phon is the measure of perceived loudness established at
1 kHz, based on subjective listening tests.
Figure 1-13 Equal loudness contour.
[Figure 1-13 graphic: sound pressure level (dB, 0 to 120) plotted against frequency (Hz, 20 to 15,000), with contours labeled in phons (10 to 130) and the normal minimum audible range.]
The nonlinear frequency response of the ear and the fatigue of the hear-
ing mechanism over time contribute to further inaccuracies of the human
perception of loudness in three ways:
1. With sounds of long durations and steady loudness level, loudness
will be perceived as increasing with the progression of the sound until
approximately 0.2 seconds of duration. At that time, the gradual fatigue
of the ear (and possibly shifts of attention by the listener) will cause
perceived loudness to diminish.
2. As loudness level of the sound is increased, the ear requires increas-
ingly more time between soundings before it can accurately judge the
loudness level of a succeeding sound. We are unable to judge
the individual loudness levels of a sequence of high-intensity sounds as
accurately as we can judge the individual loudness levels of mid- to low-
intensity sounds; the inner ear itself requires time to reestablish a state
of normalcy, from which it can accurately track the next sound level.
3. As a sound of long duration is being sustained, its perceived loudness
level will gradually diminish. This is especially true for sounds with high
sound pressure levels. The ear gradually becomes desensitized to the
loudness level. The physical masking (covering) of softer sounds and
an inability to accurately judge changes in loudness levels will result
from the fatigue. When the listener is hearing under listening fatigue,
slight changes of loudness may be judged as being large. Listening
fatigue may desensitize the ear’s ability to detect new sounds at fre-
quencies within the frequency band (frequency area) where the high
sound-pressure level was formerly present.
Duration
Humans perceive time as duration. Sound durations are not perceived indi-
vidually. We cannot accurately judge time increments without a reference
time unit. Regular reference time units are found in musical contexts and
rarely in other types of human experiences. Even the human heartbeat is
rarely consistent enough to act as a reliable reference. The underlying met-
ric pulse of a piece of music does, however, allow for accurate duration
perception. This accuracy cannot be achieved in any other context of the
human experience.
In music, the listener remembers the relative duration values of successive
sounds, in a similar process to that of perceiving melodic pitch intervals.
These successive durations create musical rhythm. The listener calculates
the length of time between when a sound starts and when it ends, in rela-
tion to what precedes it, what follows it, what occurs simultaneously with
it, and what is known (what has been remembered). Instead of calculating
an interval of pitch, the listener proceeds to calculate a span of time, as a
durational value.
Metric Grid
A metric grid, or an underlying pulse, is quickly established in the perception
of the listener, as a piece of music unfolds. This creates a reference
pulse against which all durations can be defined. The listener is thereby
able to make rhythmic judgments in a precise and consistent manner. The
equal divisions of the grid allow the listener to compare all durations and
to calculate the pulse-related values of the perceived sounds. Durations are
calculated as being in proportion to the underlying pulse: at the pulse, half
pulses, quarter of the pulse, double the pulse, etc.
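The proportional relationship between durations and the pulse can be illustrated with a little arithmetic. In this sketch the tempo of 120 pulses per minute is a hypothetical metronome marking, and the function names are assumptions made for the example.

```python
from fractions import Fraction


def pulse_seconds(pulses_per_minute):
    """Length of one pulse in seconds at the given tempo."""
    return 60.0 / pulses_per_minute


def duration_seconds(proportion, pulses_per_minute):
    """Clock-time length of a duration expressed as a proportion of the pulse."""
    return float(proportion) * pulse_seconds(pulses_per_minute)


tempo = 120  # hypothetical metronome marking, in pulses per minute

# The pulse, half of the pulse, a quarter of the pulse, and double the pulse.
proportions = [Fraction(1), Fraction(1, 2), Fraction(1, 4), Fraction(2)]
durations = [duration_seconds(p, tempo) for p in proportions]

for proportion, seconds in zip(proportions, durations):
    print(f"{str(proportion):>3} of the pulse = {seconds:.3f} s")
```

The proportions stay the same at any tempo; only the clock-time lengths change, which is consistent with the idea that the listener judges durations against the pulse rather than against absolute time.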
In the absence of the metric grid, durational values cannot be accurately
perceived as proportional ratios. Humans will not be able to perceive slight
differences in duration when a metric grid is not established.
Figure 1-14 Metric grid.
[Figure 1-14 graphic: musical notation and sound attacks aligned with 12 pulses (regularly spaced out at a specific metronome marking), with each duration labeled as a proportion of the pulse.]
The listener is only able to establish a metric grid within certain limits.
Humans will be able to accurately utilize the metric grid between 30 and 260
pulses per minute. Beyond these boundaries, the pulse is not perceived as
the primary underlying division of the grid. We will instead replace the pulse
with a duration of either one-half or twice the value, or the listener might
become confused and unable to make sense of the rhythmic activity.
The metric grid is the dominant factor in our perception of tempo, as well
as musical rhythm. In most instances, the metric grid itself represents the
steady pulsation of the tempo of a piece of music.
Time
The listener’s time perception plays a peripheral role in the perception
of rhythm. Time perception is significant to the perception of the global
qualities of a piece of music, and to the estimation of durations when a
metric grid is not present in the music. The global qualities of aesthetic,
communicative, and extra-musical ideas within a piece of music are largely
dependent on the living experience of music: on the passage of musical
materials across the listener’s time perception of their existence.
Time perception is distinctly different from duration perception. The human
mind makes judgments of elapsed time based on the perceived length of
the present. The length of time humans perceive to be “the present” is
normally two to three seconds, but might be extended to as much as five
seconds and beyond.
The “present” is our window of consciousness, through which we perceive
the world and listen to sound. We are at once experiencing the moment of
our existence, evaluating the immediate past of what has just happened
and anticipating the future (projecting what will follow the present moment,
given our experiences of the recently passed moments, and our knowledge
of previous, similar events).
Human time judgments are imprecise. The speed at which events take place
and the amount of information that takes place within the “present” greatly
influence time judgments. The amount of time perceived to have passed
will change to conform to the number of events experienced within the
present; the listener will estimate the amount of time passed in relation to
the number of experiences during the present, and make time judgments
accordingly.
Time judgments are greatly influenced by the individual listener’s attentiveness
and interest in what is being heard. If the material stimulates thought
within the listener, the event will seem shorter; if the listener finds the listening
activity desirable in some way, the experience will seem to occupy less
time than would an undesirable experience of the same (or even shorter)
length. Expectations, boredom, interest, contemplation, and even pleasure
caused by music can alter the listener’s sense of elapsed time.
The time length of a piece of music (or any time-based art form, such as a
motion picture) is separate and distinct from clock time. A lifetime can pass
in a moment, through the experience of a work of art. A brief moment of
sound might elevate the listener to extend the experience to an infinity of
existence.
Timbre in Perception
The overall quality of a sound, its timbre, is the perception of the mixture
of all of the physical aspects that comprise a sound. Timbre is the global
form, or the overall character of a sound, which we can recognize as being
unique.
This overall form (timbre) is perceived as the states and interactions of its
component parts. The physical dimensions of sound, discussed previously,
are perceived as dynamic envelope, spectral content, and spectral enve-
lope (perceived values, not physical values). The perceived dimensions are
interpreted and shape an overall quality, or conception, of the sound.
Humans remember timbres as entities, as single objects having an overall
quality (that is comprised of many unique characteristics), and sometimes
as having meaning in themselves (as a timbre can bring with it associa-
tions in the mind of the listener). We recognize the sounds of hundreds
of human voices because we remember their timbres. We remember the
timbres of a multitude of sounds from our living experiences. We remem-
ber the timbres of many musical instruments and their different timbres as
they are performed in many different ways.
The global quality that is timbre allows us to remember and recognize specific
timbres as unique and identifiable objects.
Humans have the ability to recognize and remember a large number of tim-
bres. Further, listeners have the ability to scan timbres and relate unknown
sounds to sounds stored in the listener’s long-term memory. The listener
is then able to make meaningful comparisons of the states and values of
the component parts of those timbres. These skills will serve as meaningful
points of departure for the method for evaluating timbre in Part Two.
Sufficient time is required for the mind to process the many characteristics
of a sound in order to recognize and understand its overall image. The time
required to perceive the component parts of timbre varies significantly with
the complexity of the sound and the listener’s previous knowledge of the
sound. For rather simple sounds, the time required for accurate perception
is approximately 60 ms. As the complexity of the sound is increased, the
time needed to perceive the sound’s component parts will also increase. All
sounds lasting less than 50 ms are perceived as noise-like, since a specific
timbre cannot be identified at that short a duration; exceptions occur when
the listener is well acquainted with the sound, and the timbre can be recognized
from this small bit of information.
The partials of the timbre’s spectrum fuse to create the impression of a
single sound. Although many frequencies are present, the tendency of our
perception is to combine them into one overall texture. We fuse partials that
are harmonically related to the fundamental frequency, as well as overtones
that are distantly related to the fundamental, into a single impression.
It is especially important for the recording professional to note the following
related to the fusing of timbres:
• Fusion can also occur between two separate timbres (two individual
  sound sources) if the proper conditions are present.
• Timbres that are attacked simultaneously, or are of a close harmonic
  relationship to each other, are most likely to fuse into the perception of
  a single sound.
• The more complex the individual sound, the more likely that fusion will
  not occur.
• When the listener recognizes one of the timbres, fusion will be far less
  likely to occur.
• Related to recognizing timbre, synthesized sounds are more likely to
  fuse with other sounds than are known sounds of an acoustic origin.
Spatial Characteristics
The perception of the spatial characteristics of sound is the impression of
the physical location of a sound source in an environment, together with the
modifications the environment itself places on the sound source’s timbre.
The perception of space in audio recording (reproduction) is not the same
as the perception of space of an acoustic source in a physical environment.
In an acoustic space, listeners perceive the location of sound in relation
to the three-dimensional space around them: distance, vertical plane, and
horizontal plane. Sound is perceived at any possible angle from the listen-
er, and sound is perceived at a distance from the listener; both of these per-
ceptions involve an evaluation of the characteristics of the sound source’s
host environment.
In audio recording, illusions of space are created. Sound sources are given
spatial characteristics through the recording process and/or through signal
processing. This spatial information is intended to complement the timbre
of the music and/or sound source. The spatial characteristics may simulate
particular known, physical environments or activities, or be intended to
provide spatial cues that have no relation to our reality. In theory, all of the
interactions of the sound with its host environment are captured with, or
can be simulated and applied to, the sound source; upon playback through
two or more loudspeakers, these spatial cues are reproduced.
Playback Environment
Recordings and their sound sources (combined with their spatial charac-
teristics) are heard through two (or more) loudspeakers. The loudspeakers
themselves are placed in, and interact with, a playback environment—such
as a living room or automobile. The playback environment is nearly always
quite unrelated to the spaces on the recording. Thus, the listener ultimate-
ly perceives spatial characteristics applied to the sound source after they
have been altered by:
• the characteristics of loudspeakers,
• the interaction of the loudspeakers and the environment (caused by
  placement of loudspeakers within the playback environment), and
• the playback environment itself.
Listeners perceive the reproduced spatial characteristics of the sound
source within the three-dimensional space of their listening environment
(headphone monitoring is not a workable solution, as will be discussed
later).
To accurately perceive the spatial information of an audio recording, the
listening environment must be acoustically neutral, and the listener must
be carefully positioned within the environment and in relation to the loudspeakers.
The listening environment (including the loudspeakers) should
not place additional spatial cues onto the reproduced sound. These condi-
tions are goals that are never fully met in actual monitoring conditions.
Therefore one must be aware of how the playback system and the listening
room are altering the recording and its sound sources.
Perceived Spatial Relationships and Current Sound Reproduction
We perceive spatial relationships in three primary ways:
1. as the location of the sound source being at an angle to the listener
(above, below, behind, to the left, to the right, in front, etc.),
2. as the location of the sound source being at a distance from the
listener,
3. as an impression of the type, size, and acoustic properties of the host
environment.
These perceptions are transferred into the recording medium, to provide a
realistic illusion of space, with one major exception. The angular location is
severely restricted in audio reproduction, as compared to human percep-
tual abilities. Currently used audio playback formats can only accurately
and consistently reproduce localization cues on the horizontal plane, and
then only slightly beyond the loudspeaker array in stereo recordings. The
three-dimensional space of our reality is simulated (with dubious accuracy)
in the two dimensions of audio recording.
Much research and development is taking place attempting to extend the
localization of sound sources to behind the listener and to the vertical plane,
as well as to provide a more realistic reproduction of distance and environmental
cues. Significant advances are being made. Surround sound is now
widely accepted and provides reproduction around the listener that can be
mostly stable and accurate. Control of sound localization on the vertical
plane and a more complete simulation of environmental characteristics are
not, however, presently feasible, though the technology of the future will
likely address these areas as well.
The following discussions of the perception of the spatial dimensions of
reproduced sound refer to common two-channel stereo and to surround-
sound systems. These concepts will transfer to other systems such as bin-
aural recordings, holophonics, and others, but with different boundaries of
the limits of perception inherent with each particular format.
[Figure 1-15: Area of sound localization in two-channel audio playback. Diagram labels: Left Loudspeaker, Right Loudspeaker, Area of Sound Localization; angles of 60° and 15° are marked on each side.]
Localization of Direction
Our ability to localize sounds is one of the survival mechanisms we have
retained from our ancient past. This ability has been developed through-
out our evolution; we have learned to perceive the direction of even those
sounds that our hearing mechanism has difficulty processing.
Humans use differences in the same sound wave appearing at the two ears
for the accurate localization of direction. Interaural time differences (ITD)
are the result of the sound arriving at each ear at a different time. A sound
that is not precisely in front or in back of the listener will arrive at the ear
closest to the source before it reaches the furthest ear. These time differ-
ences are sometimes referred to as phase differences. The sounds arriving
at each ear are almost identical during the initial moments of the sound,
except the sound at each ear is at a different point in the waveform’s cycle
(and might contain minute spectral differences).
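To give a sense of the magnitudes involved, the interaural time difference can be estimated with a simple spherical-head approximation (Woodworth's formula). This is only a sketch and not a model taken from this text: the head radius of 8.75 cm, the speed of sound of 343 m/s, and the function name are assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, assumed (air at roughly room temperature)
HEAD_RADIUS = 0.0875     # m, assumed average head radius


def itd_seconds(azimuth_deg):
    """Estimate the ITD of a distant source with Woodworth's spherical-head formula.

    azimuth_deg: 0 = straight ahead, 90 = directly to one side.
    The approximation is used here only for angles from 0 to 90 degrees.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between 0 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))


# Print the estimated ITD, in microseconds, for a few source angles.
for angle in (0, 15, 30, 60, 90):
    print(f"{angle:3d} degrees: {itd_seconds(angle) * 1e6:6.1f} microseconds")
```

Under these assumptions the largest difference, for a source directly to one side, is well under one millisecond.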
Interaural amplitude differences (IAD) work in conjunction with ITD in the
localization of the direction of the sound source. IAD are also referred to
as interaural spectral differences. IAD is the result of sound pressure level
differences at high frequencies present at the two ears. The head of the
listener, which blocks certain frequencies from the furthest ear (when the
sound is not centered), causes the interaural spectral differences. This
occurrence has been termed the “shadow effect.” Interaural amplitude dif-
ferences (IAD) will at times consist solely of amplitude differences between
the two ears, with the spectral content of the waveform being the same at
both ears.
The sound wave is almost always different at each ear. The differences
between the sound waves may be time/phase-related, amplitude/spec-
trum-related, and/or spectrum differences. These differences in the wave-
forms are essential to the perception of the direction of the sound source.
In addition, these same cues play a major role in perceiving the character-
istics of host environments.
Up to approximately 800 Hz, humans rely on ITD for localization cues.
Phase differences are used in this range because amplitude appears to be
the same at both ears.
[Figure 1-16: Frequency ranges of localization cues. Axis: Frequency (Hz), 20 Hz to 20 kHz, with 500 Hz, 800 Hz, 2 kHz, and 4 kHz marked. Labeled regions: Interaural Time Differences, IAD, Generally Poor Localization, Increasingly Poor Localization.]
Between 800 Hz and about 2 kHz both phase and amplitude differences are
present between the two ears. IAD and ITD are both used for the perception
of direction in this frequency range, with amplitude differences becoming
significant around 1,250 Hz.
In general, time/phase differences seem to dominate the perception of
direction up to about 4 kHz. Although IAD are present, ITD dominates the
perception of direction between 2 kHz and 4 kHz. Humans have poor local-
ization ability for sounds in this frequency band.
Above 4 kHz, interaural amplitude differences (IAD) are the cues that deter-
mine the perception of location. Localization ability improves at 4 kHz and
is quite accurate throughout the upper registers of our hearing range.
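The frequency ranges described above can be collected into a small lookup. This is a sketch only: the function name and return strings are hypothetical, and the boundaries (800 Hz, 2 kHz, 4 kHz) are the approximate values given in the text, treated here as sharp limits for convenience.

```python
def dominant_localization_cue(frequency_hz):
    """Return the interaural cue the text associates with a frequency.

    The boundaries are approximate in the text; they are hard limits
    here only to keep the sketch simple.
    """
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    if frequency_hz < 800:
        return "ITD"
    if frequency_hz < 2000:
        return "ITD and IAD"
    if frequency_hz < 4000:
        return "ITD dominant, poor localization"
    return "IAD"


# Report the cue for a few sample frequencies.
for f in (100, 1000, 3000, 8000):
    print(f, "Hz:", dominant_localization_cue(f))
```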
Recent studies have revealed the human body also generates physical cues
for localization. The chest, head, shoulders, and outer ears all affect sounds
of various wavelengths in different ways. Our body parts’ different sizes
and their different angles to the hearing canal create a very complex source
of reflected and diffracted sound waves. These waves all lead to important
interaural spectrum differences between the two ears. These differences
are created by a comb-filter effect, comprised of minute cancellations and
reinforcements of frequencies. The brain processes these subtle differenc-
es to aid in identifying a sound source’s direction.
As we have seen, humans do not perceive direction accurately at all fre-
quencies. Below approximately 500 Hz, our perception of the angle of the
sound source becomes increasingly inaccurate, to the point where sounds
seem to have no apparent focused location. An area exists around 3 kHz
where localization is also poor; wavelength similarities between the dis-
tance between the two ears and those of the frequencies around 3 kHz
cause interaural time/phase differences to be unstable and unreliable.
Humans have a well-refined ability to localize sounds in two approximate
frequency areas: 500 Hz–2 kHz, and 4 kHz to the upper threshold of hearing
(whatever that might be). Within these areas, the minimum discernible
angle is approximately one to two degrees, with less accuracy at the sides
and back than in the front. Sounds that have fundamental frequencies out-
side of these frequency areas, but that have considerable spectral content
within these bands, will also be localized quite accurately.
Interaural spectral differences occur throughout the frequency spectrum.
While they may be subtle, it appears they are important for the localiza-
tion of objects in frequency ranges where IAD and ITD are ineffective. The
makeup of these interaural spectral differences will inherently be unique to
individuals, as (obviously) no two people are the same size and shape, and
no two outer ears are alike.
The outer ear is called the pinna. This part of our anatomy (an elaborately
shaped piece of cartilage) plays several important roles in our perception
of direction. The pinna gathers sound and funnels it into the ear canal. As
the ridges of the outer ear reflect sound into the ear, the ridges introduce
small time delays between their reflections and the direct sound that travels
directly to the ear canal. These small time delays vary according to sound-
source location, and are important components of the interaural spectral
differences described above. The pinna and the delays it generates aid us
in differentiating between sounds arriving from the front and those arriving
from the rear.
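The way a small delay shapes the spectrum can be sketched numerically. When a sound is summed with a single delayed copy of itself, cancellations fall at the frequencies where the delay equals an odd number of half periods. The delay values below are hypothetical and chosen only for illustration; they are not measurements of any real outer ear, and a real pinna produces many reflections rather than one.

```python
def comb_notch_frequencies(delay_s, max_hz=20000.0):
    """Frequencies cancelled when a sound is summed with one delayed copy.

    Cancellation occurs where the delay equals an odd number of half
    periods: f = (2k + 1) / (2 * delay), for k = 0, 1, 2, ...
    """
    if delay_s <= 0:
        raise ValueError("delay must be positive")
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2.0 * delay_s)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches


# Hypothetical reflection delays, in microseconds, for illustration only.
for delay_us in (50, 100, 200):
    notches = comb_notch_frequencies(delay_us * 1e-6)
    print(delay_us, "microseconds:", [round(f) for f in notches])
```

A change in delay moves the whole pattern of cancellations, which is consistent with the text's point that the delays vary with sound-source location.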
Resonances also appear to be excited in the outer ear. These also alter
the frequency response of the sound source in predictable ways that vary
between individuals. The brain learns these patterns of spectral changes to
assist in localization. Pinna cues play significant roles in direction perception
even though each individual has a unique ear-shape, and the resultant
spectrum changes are equally unique. Location cues based on spectrum
are thus not universal, but unique to the individual and are learned.
Pinnae serve a critical function in front to back localization. When sound
arrives at the head from the rear, ridge reflections are not generated. The
pinna actually blocks the direct sound from reaching the hearing canal and
its ridges when sounds are generated beyond 130° from the front center.
The pinna allows us to perceive the sound source as being generated to
our rear because of the absence of spectral differences.
It is interesting to note, our distance and location judgments are not as
accurate to the sides and the rear. The absence of this spectral information
generated and collected by the outer ear may well play a role.
We actually move our heads involuntarily to assist in locating sounds—
especially those sounds that are not in front. In moving our heads to remove
location confusion, we bring the source into our front listening field and
thus reintroduce the IAD, ITD and spectral differences of the pinnae. We
also instinctively seek to bring the source into visual view, which elimi-
nates all ambiguity for acoustic sources—but not for phantom images.
Front-back hearing is only partially understood. Little relevant research in
spatial perception of sounds arriving from the rear and the sides is available.
Certainly more will need to be accomplished before we are able to more
thoroughly understand this area. It is increasingly important, however, that
we understand how we perceive sound arrival from the rear and the sides,
and the different qualities of those sounds, if we are to fully understand
and control the differences between surround sound and stereo.
Distance Perception
Distance perception has not been studied thoroughly. The following infor-
mation is well documented, and it is likely that numerous subtleties will be
discovered in the future.
Two impressions lead to the perception of the distance of a sound source
from the listener: (1) the ratio of the amount of direct sound to reverberant
sound, and (2) the primary determinant, the loss of low-amplitude (usually
high-frequency) partials from the sound’s spectrum with increasing distance
(definition of timbre or timbral detail). Both of these functions rely on
the listener’s knowledge of the sound’s timbre for accurate perception of
distance-location. While sound pressure decreases with distance, loudness
itself does not factor into distance-location perception.
Low-energy spectral information is lost with the compressions and rarefactions
of the waveform over distance. Some information is simply
absorbed by the atmosphere due to air friction. This leads to the listener’s
determination of the level of timbral detail (definition of timbre) that is the
major factor in distance perception.
Some timbre-related distance information results from waveform travel
and the speed of sound. As high frequencies travel slightly faster than low
frequencies, the spectrum of the sound is altered with increasing distance.
The partials of complex sounds will become increasingly out of phase with
the fundamental frequency and between themselves, the longer the propa-
gation of the sound.
As the source moves away from the listener, the percentage of direct sound
decreases while the percentage of reflected sound increases. This pertains
to enclosed spaces only. This ratio of direct to reflected sound will play a
significant role in distance perception when the reverberant energy begins
to mask timbral detail. It will also play a more significant role as timbral
detail becomes more and more diminished, or when the sound source is
unknown.
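The changing balance of direct and reflected sound can be sketched with a deliberately simplified room model. The sketch assumes that the direct sound falls off by the inverse-square law while the reverberant level stays roughly constant throughout the room; real rooms only approximate this. The critical distance of 2 m (the distance at which the two are equal) is a hypothetical value, and the function names are illustrative.

```python
import math


def direct_to_reverberant_db(distance_m, critical_distance_m=2.0):
    """Direct-to-reverberant ratio in dB under the simplified model.

    Assumes inverse-square decay of the direct sound and a constant
    reverberant level; both are equal at the critical distance.
    """
    if distance_m <= 0 or critical_distance_m <= 0:
        raise ValueError("distances must be positive")
    return -20.0 * math.log10(distance_m / critical_distance_m)


def direct_percentage(distance_m, critical_distance_m=2.0):
    """Share of the total sound energy that arrives as direct sound."""
    ratio = 10.0 ** (direct_to_reverberant_db(distance_m, critical_distance_m) / 10.0)
    return 100.0 * ratio / (1.0 + ratio)


# Show how the direct share shrinks as the source moves away.
for d in (0.5, 1.0, 2.0, 4.0, 8.0):
    print(f"{d:4.1f} m: {direct_to_reverberant_db(d):6.1f} dB, "
          f"{direct_percentage(d):5.1f}% direct")
```

In this model each doubling of distance lowers the ratio by about 6 dB, so the direct share shrinks steadily as the source moves away.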
The listener must know the timbre of a sound in order to recognize miss-
ing timbral detail. If the sound is unknown or not recognized, the listener
cannot recognize a loss of low-energy components from its spectrum. With
knowledge of the sound source, the listener will be able to calculate how
much low-energy information is missing and thus be able to determine the
general amount of distance between them and the object.
Knowledge of the timbre of the sound source will assist the listener in rec-
ognizing the absence of spectral information and/or perceiving the reit-
erations of the direct sound and the reverberant sound. These perceptions
will provide the listener with the needed information to judge distance. The
previous experiences and listening skills of the listener will play a major
role in the accuracy of judgments made.
Without prior knowledge of the timbre of a sound, perception of distance
location is considerably less accurate, causing judgments to be rough
estimates.
Related to the ratio of direct to reflected sound, the time difference between
the ceasing of the direct sound and the ceasing of the reverberant energy
will increase with distance. Through temporal fusion we perceive the
reverberant sound as being a part of the direct sound. This creates a single
impression of the sound in its environment (referred to as the composite
sound, above). As distance increases, temporal fusion begins to diminish
and the ending of the direct sound and the continuance of the reverberant
energy become more prominent.
Perception of Environmental Characteristics
The perceptions of the characteristics of the host environment and the
placement of the sound source within the host environment are also
dependent upon the ratio of direct to reflected sound and the loss of low-level
spectral components with increasing distance. In addition, the characteristics
of the host environment are perceived through:
1. The time difference between the arrival of the direct sound and the
arrival of the initial reflections,
2. The spacing in time of the early reflections,
3. Amplitude differences between the direct sound and all reflected sound
(the individual initial reflections and the reverberant sound), and
4. Timbre differences between the direct sound, the initial reflections,
and the reverberant sound.
The time delay between the direct and the reflected sounds is directly related
to (1) the distance between the sound source and the listener, (2) the
distance between the sound source and the reflective surfaces (which send
the reflected sound to the listener), and (3) the distance of the reflective
surfaces from the listener. These three physical distances also create the
patterns of time relationships (the rhythms) of the early reflections.
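The relationship among these three distances can be sketched as a short calculation: the reflection arrives later than the direct sound by the extra path length divided by the speed of sound. The speed of sound of 343 m/s, the function name, and the example distances are assumptions made for illustration.

```python
SPEED_OF_SOUND = 343.0  # m/s, assumed (air at roughly room temperature)


def reflection_delay_ms(source_to_listener_m, source_to_surface_m,
                        surface_to_listener_m):
    """Delay of a single reflection relative to the direct sound, in ms."""
    reflected_path = source_to_surface_m + surface_to_listener_m
    extra_path = reflected_path - source_to_listener_m
    if extra_path < 0:
        # A reflected path can never be shorter than the direct path.
        raise ValueError("distances are not geometrically possible")
    return extra_path / SPEED_OF_SOUND * 1000.0


# Hypothetical geometry: listener 3 m from the source, with the reflecting
# surface 4 m from the source and 5 m from the listener.
print(round(reflection_delay_ms(3.0, 4.0, 5.0), 2), "ms")
```

Repeating the calculation for each reflective surface gives the set of arrival times whose spacing forms the pattern of the early reflections.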
Early reflections arrive at the listener within 50 ms of the direct sound.
These early reflections comprise the early sound field. The early sound field
is composed of the first few reflections that reach the listener before the
beginning of the diffused, reverberant sound (see Figure 1-8). Many of the
characteristics of a host environment are disclosed during this initial portion
of the sound. The early sound field contains information that provides clues
as to the size of the environment, the type and angles of the reflective surfaces,
even the construction materials and surface coverings of the space.
We have the capability to learn to accurately judge the size and character-
istics of the host environments of sound sources. This is accomplished by
evaluating the sound qualities of the environment. Humans experience and
remember the sound qualities of a great many natural environments in much
the same way as we recognize and remember timbres. Further, we have the
ability to compare the sound qualities of new environments we encounter
to our memories of environments we have previously experienced.
The listening skill needed to evaluate and recognize environmental characteristics
can be developed to a highly refined level. Some people who
work regularly with acoustical environments develop these listening skills
to a point where they can perceive the dimensions and volume of an environment,
its surface coverings, or even openings within the space (doors,
windows, etc.).
Interaction of the Perceived Parameters
The perception of any parameter of sound is always dependent upon
the current states of the other parameters. Altering any of the perceived
parameters of sound will cause a change in the perceived state of at least
one other parameter.
The parameters of sound interact, causing the perception of the state of
one parameter to be altered by the state of another. Certain occurrences of
these interactions were noted under individual perceived parameters. The
following are additional examples of note and are separated for clarity.
Duration for Pitch Perception
Sufficient duration is required for the ear to perceive pitch. If the duration
is too short, the sound will be perceived as having indefinite pitch, as
being noise-like. The time necessary for the mind to determine the pitch of
a sound is dependent on the frequency of the sound. Sounds lower than
500 Hz and sounds higher than 4 kHz require more time to establish pitch
quality than sounds pitched between 2 kHz and 4 kHz, where pitch perception
is most acute. At the extremes of the hearing range, pitch quality may
require as much as 60 ms to become established.
[Figure 1-17: Minimum durations for pitch perception at select frequencies. Horizontal axis: Fundamental Frequency (Hz), 100 Hz to 5 kHz. Vertical axis: Minimum Duration (msec), 2 to 48.]
The length of time required to establish a perception of pitch will also depend
on the sound’s attack characteristics and its spectral content (its timbre).
Sounds with complex (but mostly harmonic) spectra and sounds with short
attack times will establish a perception of pitch sooner than other sounds.
Loudness and Pitch Perception
Loudness will influence the perception of pitch, as humans will perceive
a change of pitch with a change of loudness (dynamic) level. The level 60
dB SPL (or about the loudness level of normal conversation) is considered
to be a threshold where increases or decreases in loudness affect pitch
perception oppositely. Above 60 dB, for sounds below 2 kHz a substantial
increase in loudness level will cause an apparent lowering of the pitch level;
the sound will appear to go flat, although no actual change of pitch level
has occurred. Similarly, a substantial increase in the dynamic (loudness)
level of a pitch above 2 kHz will cause the sound to appear to go sharp; an
impression of the raising of the pitch level is created, although the actual
pitch level of the sound has remained unaltered. Below 60 dB an increase
in loudness will cause sounds below 2 kHz to be perceived as getting sharp
and sounds above 2 kHz to be perceived as going flat.
Loudness and Time Perception
Loudness level can influence perceived time relationships. When two
sounds begin simultaneously, they will appear to have staggered entrances
if one of the two sounds is significantly louder than the other. The louder
sound will be perceived as having started first.
Perceived loudness level is often distorted by the speed at which informa-
tion is processed. When a large number of sounds occur in a short period
of time, the listener will perceive those sounds as having a higher loudness
level than sounds of the same sound pressure level, but that are distributed
over a longer period of time. This distortion of loudness level is caused by
the amount of information being processed within a specific period of time
(the time period is related to the perceived length of the present).
Loudness Perception Altered by Duration and Timbre
Duration can distort the perception of loudness. Humans tend to average
loudness levels over a time period of about 2/10 of a second. Sounds of
shorter durations will appear to be louder than sounds (of the same inten-
sity) with durations longer than 2/10 of a second.
Timbre can also influence perceived loudness. Sounds with a complex
spectrum will be perceived as being louder than sounds that contain fewer
partials. Similarly, sounds with more complex spectra with a strong pres-
ence of overtones will be perceived as louder than sounds containing
mostly proportionally related partials (harmonics), when both are at the
same sound-pressure level. Following this principle, a change of timbre
during the sustained duration of a sound will result in a perceived change
in loudness.
Pitch Perception and Spectrum
As a product of the interaction of the harmonics and closely related over-
tones of a sound’s spectrum, a timbre can create a perception of pitch
where a fundamental frequency is not physically present. The harmonics
of a sound reinforce its fundamental frequency to enhance the perception
of pitch. This phenomenon is so capable of producing the perception of
the fundamental frequency that a harmonic spectrum can provide the per-
ception of pitch when the fundamental frequency is not physically present
(missing fundamental). A perception of the periodicity of the fundamental
frequency is created by the spectrum of the sound, although that frequency
itself may not be present.
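The periodicity described here can be demonstrated numerically. The sketch below builds a tone from harmonics 2 through 5 of 200 Hz (400, 600, 800, and 1000 Hz), so that no energy is present at 200 Hz itself, and then searches for the repetition period of the waveform with a simple autocorrelation. The sample rate, the choice of harmonics, the function names, and the use of autocorrelation are assumptions made for illustration; this is not offered as a model of how hearing performs the task.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed for this sketch


def harmonic_tone(fundamental_hz, harmonic_numbers, n_samples):
    """Sum of equal-amplitude sine waves at the given harmonics."""
    return [
        sum(math.sin(2 * math.pi * fundamental_hz * h * n / SAMPLE_RATE)
            for h in harmonic_numbers)
        for n in range(n_samples)
    ]


def estimate_periodicity_hz(signal, min_lag, max_lag, window):
    """Return the repetition rate at the lag with the strongest autocorrelation."""
    best_lag, best_value = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        value = sum(signal[n] * signal[n + lag] for n in range(window))
        if value > best_value:
            best_lag, best_value = lag, value
    return SAMPLE_RATE / best_lag


# Harmonics 2 to 5 of 200 Hz; the fundamental itself is left out.
tone = harmonic_tone(200.0, [2, 3, 4, 5], n_samples=500)
print(estimate_periodicity_hz(tone, min_lag=5, max_lag=60, window=400))
```

The waveform repeats every 40 samples, which corresponds to 200 Hz, even though no component at that frequency was included.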
Amplitude, Time, and Location
The amplitude of two reiterated sounds (separated in time) can influence
location perception. The precedence or Haas effect results when two loud-
speakers reproduce the same sound in close succession. The effect works
against the principle that when two loudspeakers reproduce a sound simul-
taneously, and at the same amplitude, the sound appears to be centered
between the two loudspeakers.
When two loudspeakers reproduce the same sound source in close suc-
cession, normal perception would seem to be to localize a sound source at
the earliest sounding loudspeaker, then to shift the image to center when
the second loudspeaker is sounded. The Haas effect functions to continue
the localization of the sound source at the first speaker location, while adding
the second loudspeaker (to reinforce the sound intensity of the first
speaker) without losing the localization of the sound source at the location
of the first loudspeaker. The time difference between the sounding of
the loudspeakers must be at least 3 ms to keep the sound at the leading
speaker, with 5 ms being a more effective minimum; a maximum delay
of approximately 25–30 ms may be used before the delayed signal is per-
ceived as an echo (echoes will be perceived at all frequencies at a delay of
50 ms). If the leading channel is lowered by 10 dB, or the following channel
increased by 10 dB, the sound source will again be centered.
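The delay ranges above can be tried out with a short sketch that builds a leading channel and a delayed following channel from the same signal. The function names and the 44.1 kHz sample rate are assumptions; the limits in `describe_delay` are the approximate values quoted in the text, treated as hard boundaries only to keep the example simple.

```python
def precedence_pair(signal, delay_ms, sample_rate=44100):
    """Return (leading, following) channels; the following one is a delayed copy."""
    if delay_ms < 0:
        raise ValueError("delay must not be negative")
    n = round(delay_ms * sample_rate / 1000.0)
    leading = list(signal) + [0.0] * n      # padded so both channels match in length
    following = [0.0] * n + list(signal)    # same signal, starting n samples later
    return leading, following


def describe_delay(delay_ms):
    """Summarize the expected result using the approximate limits in the text."""
    if delay_ms < 0:
        raise ValueError("delay must not be negative")
    if delay_ms < 3:
        return "below the minimum needed to hold the image at the leading loudspeaker"
    if delay_ms < 5:
        return "image at the leading loudspeaker (marginal)"
    if delay_ms <= 30:
        return "image at the leading loudspeaker"
    if delay_ms < 50:
        return "delayed signal may be perceived as an echo"
    return "echo perceived at all frequencies"


# A 10 ms delay at 44.1 kHz shifts the following channel by 441 samples.
lead, follow = precedence_pair([1.0, 0.5, 0.25], delay_ms=10)
print(len(lead), len(follow), describe_delay(10))
```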
Masking
Masking occurs when a sound (or a portion of a sound) is not perceived
because of the qualities of another sound. The simultaneous sounding of
two or more sounds can cause a sound of lower loudness level, or a sound
of more simple spectral content, to be masked or hidden from the percep-
tion of the listener. The masking of sounds is a common problem for people
beginning their studies or work in audio recording.
When two simultaneous sounds of relatively simple spectral content have
close fundamental frequencies, the sounds will tend to mask each other
and blend into a single, perceived sound. As the two sounds become sepa-
rated in frequency, the masking will become less pronounced until both
sounds are clearly distinguishable.
Sounds of relatively simple spectral content tend to mask sounds that are
at higher frequencies. This masking becomes more pronounced as the
loudness of the lower sound is increased, and is more likely to occur when
a large interval separates the two pitch levels of the sounds. This masking
is especially prominent if the two pitch levels are in a simple harmonic rela-
tionship (especially 2:1, 3:1, and 5:1). A higher pitched sound can mask a
lower pitched sound if the higher sound is significantly higher in loudness
level, and given the same conditions as above; the higher the loudness
level, the broader the range of frequencies a sound can mask.
Masking can occur between successive sounds. With sounds separated in
time by up to 20–30 ms, the second sound may not be perceived if the
initial sound is of sufficient loudness level to draw and retain listener attention,
or to fatigue the ear. In a similar way, a sound may not be perceived if
it is followed by another sound of great intensity within 10 ms.
Audio equipment can produce “white” and other broadband noise that
can mask sounds at all or many frequencies. An entire program might be
masked by the noise of the sound system itself, should the loudness level
of the noise be sufficiently higher than that of the program. This type of
masking problem will first be noticed in the high frequencies, where low
loudness levels exist in the upper components of the sound’s spectrum.
Summary
The three states of sound that concern audio recording are sound as it
exists in air (the physical dimensions of sound), sound as it exists in human
perception (the perceived parameters of sound), and the understanding of
the meaning of a sound (sound as a resource for artistic expression). The
physical dimensions of sound in air are transformed into neural impulses as
the perceived parameters of the sound by the ear and brain. The perceived
parameters of sound become understood as a resource of elements that
allow for the communication and understanding of the meaning of sound
(and artistic expression).
The two physical dimensions of the waveform are frequency and amplitude.
They function in time and form the basis for our understanding of timbre
and spatial properties. As frequency becomes perceived as pitch, amplitude
as loudness, time as duration, timbre as timbral characteristics, and space
as perceived locations and environmental characteristics—the anomalies of
human hearing transform acoustic energy into our perception with marked
changes. What these changes actually are, and how these changes take
place, are of great concern to the recording professional as they work in the
many ways sound is captured, created, modified, and perceived.
Sound, as it is perceived and understood by the human mind, becomes
the resource for creative and artistic expression in sound. The perceived
parameters of sound become the artistic elements of sound in creating
musical material and in communicating other meaningful messages. The
aesthetic and artistic elements of sound in audio recording are presented
in the next chapter.
Exercises
The following exercise should be practiced in short sessions over time, and
should be supplemented with writing out the harmonic series at a variety of
pitch/frequency levels.
Exercise 1-1
Learning the Sound Quality of the Harmonic Series
Tracks 1 and 2 on the accompanying CD provide examples of the harmonic
series. Learning the “sound quality” of the harmonic series will be valuable in
learning to identify the spectral content of sounds.
1. Listen carefully to the harmonic series being constructed a single pitch
at a time. Notice the spacing between the tones and the sequence of
intervals of the series. Work to commit the sequence to memory—both
the names of the intervals and the sound quality of the interval sequence
should be learned.
2. Through 10 partials, practice recalling the sequence of intervals by writ-
ing them.
3. Next, practice playing the sequence on a keyboard, remembering the higher
intervals do not align with the equal-tempered tuning of the keyboard.
4. Continue to listen to the harmonic series on the CD, and move your at-
tention to the quality of the “chord” that is played after the individual
pitches of the harmonic series. Listen carefully to this chord and try to
identify each individual pitch in it.
5. Repeat this process to obtain confidence in spelling the harmonic series
and recognizing its pitch succession and sound quality.
6. Repeat steps 2 through 5 with a series through 17 partials. Practice until
you are comfortable quickly conceptualizing each of the pitches in the
harmonic series when you hear the chord at the end of tracks 1 and 2.
The goal of this exercise is to bring the reader to recognize the “sound quality”
(or timbre) of the “chord” that is created by the harmonic series, in its specific
voicing (or spacing) of intervals and pitches. This knowledge will be used as a
template against which the reader can have a point of departure in calculat-
ing the content of a sound’s spectrum.
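As an aid to writing out the harmonic series at different pitch levels, the frequencies of the partials can be calculated directly, along with how far each falls from the nearest equal-tempered keyboard pitch. The sketch assumes that the fundamental itself sits exactly on an equal-tempered pitch. The 110 Hz fundamental and the function names are chosen only for illustration.

```python
import math


def harmonic_series(fundamental_hz, n_partials):
    """Frequencies of the first n partials of a harmonic series."""
    return [fundamental_hz * k for k in range(1, n_partials + 1)]


def cents_from_equal_temperament(harmonic_number):
    """Deviation of a harmonic from the nearest equal-tempered pitch, in cents.

    Assumes the fundamental lies exactly on an equal-tempered pitch.
    A semitone is 100 cents; an octave is 1200 cents.
    """
    cents_above_fundamental = 1200.0 * math.log2(harmonic_number)
    nearest_tempered = 100.0 * round(cents_above_fundamental / 100.0)
    return cents_above_fundamental - nearest_tempered


# List 17 partials above an assumed 110 Hz fundamental.
for k, freq in enumerate(harmonic_series(110.0, 17), start=1):
    print(f"partial {k:2d}: {freq:7.1f} Hz, "
          f"{cents_from_equal_temperament(k):+6.1f} cents")
```

The octave partials (2, 4, 8, 16) align exactly with the keyboard. The third partial is about 2 cents sharp, and the seventh is about 31 cents flat of the nearest keyboard pitch.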
2 The Aesthetic and Artistic
Elements of Sound in
Audio Recordings
The audio recording process has given creative artists the tools to very
finely shape perceived sound (the perceived parameters of sound) through
a direct control of the physical dimensions of sound. This control of sound is
well beyond that which was available to composers and performers before
the presence of modern recording technology. It has brought new artistic
elements to music, which has led to a redefinition of the musician and to
new characteristics of sound. While our discussion is focused on musical
applications, it should be remembered that all aspects of these artistic ele-
ments of music also function as artistic elements in other areas of audio
(such as broadcast media, film, multimedia, theatre, etc.).
A new creative artist has evolved. This person uses the tools of recording
technology as sound resources for the creation (or recreation) of an artistic
product. This person may be a performer or composer in the traditional
sense, or this person may be one of the new musicians: a producer, or
sound engineer, or any of the host of other related job titles. Throughout
this book, these people are referred to as recordists.
Through its detailed control of sound, the audio-recording medium has
resources for creative expression that are not possible acoustically. Sounds
are created, altered, and combined in ways that are beyond the reality of
live, acoustic performance. New creative ideas and new additions to our
musical language have emerged as a result of recording techniques and
technologies.
Creative ideas are defined by these aesthetic and artistic elements. The
artistic elements are the aspects of sound that comprise or characterize
creative ideas (or entire works of art or pieces of music). Study of the artis-
tic elements will allow us to understand individual musical ideas and the
larger musical event, and to recognize how those ideas and sound events
contribute to the entire piece of music. Discussion will emphasize the artis-
tic elements that are unique to recorded music, especially music created
through the use of modern recording techniques and technologies.
As we have learned from the previous chapter, the artistic elements of sound
are the mind/brain’s interpretation of the perceived parameters of sound.
Sound as it is perceived and understood by the human mind becomes the
resource for creative and artistic expression. The perceived parameters of
sound are utilized as the artistic elements of sound to create and ensure the
communication of meaningful (musical) messages.
The Art of Recording occurs through an understanding that the parameters
of sound are a resource for artistic expression. Recording becomes an art
when it is used to shape the substance of sound and music. These materi-
als that allow for artistic expression will be understood through a study of
their component parts: the artistic elements of sound.
The States of Sound and the Aesthetic/Artistic Elements
After the perception of sound, the recorded material is understood as being
composed of sound elements that are interpreted by the mind/brain, and
thus communicate artistic ideas. The aesthetic/artistic elements are directly
related to specific perceived parameters of sound, just as the perceived
parameters of sound were directly related to specific physical dimensions
of sound.
As will be remembered, sound in audio recording is in three states: physi-
cal dimensions, perceived parameters, and artistic elements.
The artistic elements are used by the recordist to shape music (sound),
resulting in artistic expression. The perceived parameters translate into the
artistic elements:
Table 2-1
The Perceived Parameters and the Aesthetic/Artistic Elements of Sound

Perceived Parameters                  Aesthetic/Artistic Elements
Pitch                                 Pitch Levels and Relationships
Loudness                              Dynamic Levels and Relationships
Duration                              Rhythmic Patterns and Rate of Activity
Timbre (perceived overall quality)    Sound Sources and Sound Quality
Space (perceived characteristics)     Spatial Properties
The audio production process allows for considerable variation and a very
refined control of ALL of the artistic elements of sound. All of the artistic
elements of sound can be accurately and precisely controlled through
many states of variation, in ways that were possible with ONLY pitch on
traditional musical instruments.
Table 2-2
The States of Sound in Audio Recording

Physical Dimensions               Perceived Parameters               Artistic/Aesthetic Elements
(Acoustic State)                  (Psychoacoustic Conception)        (Resources for Artistic Expression)

Frequency                         Pitch                              Pitch Levels and Relationships:
                                                                     melodic lines, chords, register, range,
                                                                     tonal organization, pitch areas, vibrato

Amplitude                         Loudness                           Dynamic Levels and Relationships:
                                                                     program dynamic contour, accents,
                                                                     tremolo, musical balance, RDL

Time                              Duration (time perception)         Rhythmic Patterns and Rates of
                                                                     Activities: tempo, time, patterns of
                                                                     durations

Timbre (comprised of physical     Timbre (perceived as overall       Sound Sources and Sound Quality:
components: dynamic envelope,     quality)                           timbral balance, arranging,
spectrum and spectral envelope)                                      performance intensity, performance
                                                                     techniques

Space (comprised of physical      Space (perception of the sound     Spatial Properties: stereo location,
components created by the         source as it interacts with the    surround location, phantom images,
interaction of the sound source   environment, and perception of     moving sources, distance location,
and the environment, and their    the physical relationship of the   sound-stage dimensions, imaging,
relationship to a microphone)     sound source and the listener)     environmental characteristics,
                                                                     perceived performance environment,
                                                                     space within space

Pitch Levels and Relationships
Pitch-level relationships present most of the significant information in
music. The artistic message of most of today’s music is communicated (to
a large extent) by pitch relationships. Listeners have been trained, by the
music heard throughout their life, to focus on this element to obtain the
most significant musical information. The other artistic elements often
support pitch patterns and relationships.
Pitch is the most precisely controlled artistic element in traditional music.
The use of pitch relationships and pitch levels in music is more sophisti-
cated than the use of the other artistic elements. Complex relationships of
pitch patterns and levels are common in music.
Information about the artistic element of pitch levels and relationships will
be related to:
1. The relative dominance of certain pitch levels,
2. The relative register placement of pitch levels and patterns, or
3. Pitch relationships: patterns of successive intervals, relationships of
those patterns, and relationships of simultaneous intervals.
Traditional Uses of Pitch
The aesthetic/artistic element of pitch levels and relationships is broken
into the component parts: melodic lines, chords, tonal organization,
register, range, pitch density, pitch areas, and tonal speech inflection.
A series of successive, related pitches creates melodic lines. Melodic lines
are perceived as a sequence of intervals that appear in a specific ordering
and that have rhythmic characteristics. The melodic line is often the
primary carrier of the artistic message of a piece of music.
The ordering of intervals, coupled with or independent from rhythm,
creates patterns. Pattern perception is central to how humans perceive
objects and events. These basic principles relate to all of the components
of the artistic elements. Melodic lines are organized by patterns of
intervals (short melodic ideas, riffs, or motives), supported by
corresponding rhythmic patterns. The complexity of the patterns, the ways in
which the patterns are repeated, and the ways in which the patterns are
modified provide the melodic line with its unique character.
Two or more simultaneously sounding pitches create chords. In much of
our music, chords are based on superimposing, or stacking, the intervals
of a third (intervals containing three and four semitones, most commonly).
Chords comprised of three pitches, combining two intervals of a third, are
called triads. Continued stacking of thirds results in seventh, ninth,
eleventh, and thirteenth chords.
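The stacking of thirds described above can be sketched in a few lines of Python. This is only an illustration: the function name `stack_thirds`, the use of MIDI note numbers (60 taken as middle C), and the particular chords chosen are assumptions made for the example, not part of the text.

```python
# Illustrative sketch: chords built by stacking thirds.
# Pitches are MIDI note numbers (an assumption for this example; 60 = middle C).

MINOR_THIRD = 3  # semitones
MAJOR_THIRD = 4  # semitones


def stack_thirds(root, thirds):
    """Return the pitches produced by stacking the given thirds on a root."""
    pitches = [root]
    for interval in thirds:
        if interval not in (MINOR_THIRD, MAJOR_THIRD):
            raise ValueError("each stacked interval must be 3 or 4 semitones")
        pitches.append(pitches[-1] + interval)
    return pitches


# Two stacked thirds give a triad (three pitches).
c_major_triad = stack_thirds(60, [MAJOR_THIRD, MINOR_THIRD])
c_minor_triad = stack_thirds(60, [MINOR_THIRD, MAJOR_THIRD])

# Continued stacking gives seventh and ninth chords.
c_dominant_seventh = stack_thirds(60, [MAJOR_THIRD, MINOR_THIRD, MINOR_THIRD])
c_dominant_ninth = stack_thirds(
    60, [MAJOR_THIRD, MINOR_THIRD, MINOR_THIRD, MAJOR_THIRD]
)
```

Swapping the order of the major and minor third is all that separates the major triad from the minor triad in this sketch; each further third added to the list extends the chord by one pitch.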
The movement from one chord to another, or harmonic progression, is the
most stylized of all the components of the artistic elements. Harmonic
progression is the pattern created by successive chords, as based on the
root (the lowest note when the chord is stacked in thirds) of the triads
(or more complex chords). These patterns of chord progressions have become
established as having general principles that occur consistently in certain
types of music. Certain types of music will have stylized chord progressions
(progressions that occur most frequently); other types of music will have
quite different movement between chords, and perhaps emphasize more complex
chord types. The patterns of the harmonic progression create harmony.
Harmony is one of the primary components that support the melodic line.
The chords in the harmonic progression reinforce pitches of the melody.
The speed and direction of the melodic line is often supported by the speed
at which chords are changed, and the patterns created by the changing
chords: harmonic rhythm.
The expectations of harmonic progression create a sequence of chords,
which will present areas of tension and areas of repose within the musical
composition. The tendencies of harmonic motion do much to shape the
momentum of a piece of music and can greatly enhance the character of
the melodic line and musical message. Performers utilize the psychological
tendencies of harmonic progression, exploiting its directional and dramatic
tendencies. The expectations of harmonic movement and the psychological
characteristics of harmonic progression have become important aspects of
musical expression and musical performance.
The melodic and harmonic pitch materials are related through tonal
organization. Certain pitch materials are emphasized over others, in varying
degrees, in nearly all music. This emphasis creates systems of tonal
organization in which a hierarchy of pitch levels exists. A hierarchy will
most often place one pitch in a predominant role, with all other pitches
having functions of varying importance in supporting the primary pitch. The
primary pitch, or tonal center, becomes a reference level to which all pitch
material is related, and around which pitch patterns are organized.
Many tonal organization systems exist. These systems tend to vary
significantly by culture, with most cultures using several different but related
systems. The major and minor tonal organization systems of Western music
are examples of different but related systems, as are the whole-tone and
pentatonic systems of Eastern Asia. The reader should consult appropriate
music theory texts for more detailed information on tonal organization, as
necessary.
The New Pitch Concerns of Audio Production
Certain components of pitch levels and relationships have become more
prominent in musical contexts (and other areas of audio) because of the
new treatments of pitch relationships in music recordings. The components
of range, register, pitch density, and pitch area can be more closely
controlled in recorded music than in live (unamplified) performance. These
components are more important in recorded music, because they are precisely
controllable by the technology, and they have been controlled to support
and enhance the musical material.
Range is the span of pitches of a sound source (any instrument or voice).
Range is the area of pitches that encompasses the highest note possible (or
present in a certain musical example) to the lowest note possible (or pres-
ent) of a particular sound source.
A register is a portion of a sound source’s range. A register will have a unique
character (such as a unique timbre, or some other determining factor) that
will differentiate it from all other areas of the same range. It is a small area
within the source’s range that is unique in some way. Ranges are often
divided into many registers; registers may encompass a very small group
of successive pitches, up to a considerable portion of the source’s range.
A pitch area is a portion of any range (or of a register) that may or may
not exhibit characteristics that are unique from other areas. Instead, it is a
defined area between an upper and a lower pitch level, in which a specific
activity or sound exists.
Pitch density is the relative amount and register placement of simultaneously
sounding pitch material, throughout the hearing range or within a specific
pitch area. It is the amount and placement of pitch material in the
composite musical texture (the overall sound of the piece of music), and is
defined by its boundaries of highest and lowest sounding pitches.
With pitch density, sound sources are assigned (or perceived as occupying)
a certain pitch area within the entire listening range (or the smaller pitch
range used for a certain piece of music). Thus, certain pitch areas will have
more activity than other pitch areas; certain sound sources will be present
only in certain pitch areas, and other sources present only in other pitch
areas; some sources may share pitch areas and cause more activity to be
present in those portions of the range; some pitch areas may be void of
activity. Many possible variations exist.
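One way to picture this distribution of activity is to tally which sound sources overlap each pitch area. The sketch below is a rough illustration only; the source names, their pitch ranges (given as MIDI note numbers), the pitch-area boundaries, and the function name `sources_per_pitch_area` are all hypothetical.

```python
# Illustrative sketch: which sound sources are active in each pitch area.
# Source names, ranges (MIDI note numbers), and area boundaries are hypothetical.

def sources_per_pitch_area(sources, areas):
    """Map each pitch area to the sources whose range overlaps it.

    sources: dict of name -> (lowest_pitch, highest_pitch)
    areas:   dict of area name -> (lower_bound, upper_bound)
    """
    result = {}
    for area, (area_low, area_high) in areas.items():
        result[area] = [
            name
            for name, (low, high) in sources.items()
            if low <= area_high and high >= area_low  # the two ranges overlap
        ]
    return result


sources = {
    "bass": (28, 48),
    "guitar": (45, 76),
    "vocal": (57, 77),
}
areas = {
    "low": (24, 47),
    "mid": (48, 71),
    "high": (72, 95),
    "very high": (96, 119),
}

density = sources_per_pitch_area(sources, areas)
```

With these made-up ranges, all three sources share the "mid" area, two sources share each of the "low" and "high" areas, and the "very high" area is void of activity.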
Pitch density is a component of pitch-level relationships, and is directly
related to traditional concerns of orchestration and instrumentation, with
many new twists. Pitch density is a much more specific concern in recorded
music because it is controllable in very fine increments. Traditional
orchestration was concerned, basically, with the selection of instruments, and
with the placement of the musical parts (performed by those assigned
instruments and their sound qualities) against one another.
The register placement of sound sources and their interaction with the oth-
er sound sources take on many more dimensions with the controls of the
recording process. Each sound source occupies a pitch area; the acoustic
energy within the pitch area of a timbre’s spectrum is distributed in ways
that are unique to each sound source. The spectrum of each sound source
is the pitch density of an individual sound source. The pitch density of the
overall program (or musical texture) is the composite of all of the simulta-
neous pitch information from all sound sources, and is timbral balance.
Sound sources and musical ideas are often delineated by the pitch area they
occupy within the timbral balance of the overall program. Sound sources
are more easily perceived as being separate entities and individual ideas
when they occupy their own pitch area in the overall program. This area
can be large or quite small, and still be effective.
Sounds that do not have well-defined pitch quality occupy a pitch area.
These types of sounds are noise-like, in that they cannot be perceived as
being at a specific pitch. Such sounds may, however, have unique pitch
characteristics.
Many sounds cannot be recognized as having a specific pitch, yet have a
number of frequencies that dominate their spectrum. Cymbals and drums
easily fall into this category. Cymbals are easily perceived as sounding
higher or lower than one another, yet a specific pitch cannot be assigned
to the sound source.
We perceive these sounds as occupying a pitch area. We perceive a pitch-
type quality based on (1) the register placement of the area of the highest
concentration of pitch information (at the highest amplitude level) present
in the sound, and (2) the relative density (closeness of the spacing of pitch
levels) of the pitch information (spectral components). We are able to iden-
tify the approximate area of pitches in which this concentration of spectral
energy occurs, and are thus able to relate that area to other sounds.
Pitch areas are defined as the range spanned by the lowest and highest
dominant frequencies around the area of the spectral activity. This range
is called the bandwidth of the pitch area. Many sounds will have several
pitch areas of concentrated amounts of spectral energy. In such cases, one
range will dominate and the others will be less prominent. The size of the
bandwidth and the density of spectral information (the number of
frequencies within the bandwidth and the spacing of those frequencies) define
the sound quality of pitch areas.
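The idea of a pitch area's bandwidth can be sketched numerically. The example below is a simplified illustration, not a measurement method from the text: the list of partials, the amplitude threshold used to decide which frequencies count as dominant, and the function name `pitch_area` are all assumptions.

```python
# Illustrative sketch: bandwidth and spectral density of a pitch area.
# The partial list and the dominance threshold are hypothetical.

def pitch_area(partials, threshold):
    """Describe the pitch area spanned by the dominant partials.

    partials:  list of (frequency_hz, amplitude) pairs
    threshold: amplitude at or above which a partial counts as dominant
    """
    dominant = sorted(freq for freq, amp in partials if amp >= threshold)
    if not dominant:
        return None
    low, high = dominant[0], dominant[-1]
    return {
        "low_hz": low,
        "high_hz": high,
        "bandwidth_hz": high - low,
        "partials_in_band": len(dominant),
    }


# A made-up cymbal-like spectrum: no single pitch, but a clear
# concentration of energy in one region.
cymbal = [(900, 0.1), (3200, 0.6), (4100, 0.9),
          (5300, 0.7), (6800, 0.5), (12000, 0.1)]
area = pitch_area(cymbal, threshold=0.5)
```

Here the bandwidth is simply the distance between the lowest and highest dominant frequencies, and the count of partials inside that span stands in for the density of spectral information.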
Dynamic Levels and Relationships
Dynamic levels and relationships have traditionally been used in musi-
cal contexts for expressive or dramatic purposes. Expressive changes in
dynamic levels and the relationships of those changes have most often
been used to support the motion of melodic lines, to enhance the sense of
direction in harmonic motion, or to emphasize a particular musical idea. A
change of dynamic level, in and of itself, can produce a dramatic musical
event and is a common musical occurrence. Changes in dynamic level can
be gradual or sudden, subtle or extreme.
Dynamics have traditionally been described by analogy: louder than, softer
than, very loud (fortissimo), soft (piano), medium loud (mezzo forte), etc.
The artistic element of dynamics in a piece of music is judged in relation to
context. Dynamic levels are gauged in relation to (1) the overall, reference
dynamic level of the piece of music, (2) the sounds occurring simultaneously
with a sound source in question, and (3) the sounds that immediately follow
and precede a particular sound.
The components of dynamic levels and relationships in audio recording
are dynamic contour (with gradual and abrupt changes in dynamic level),
emphasis/de-emphasis accents (abrupt changes in dynamic level), musical
balance (gradual and abrupt changes in dynamic levels), and dynamic
speech inflections.
Traditional Uses of Dynamics
It is common for the most important musical idea/sound source in a piece
of music to be given prominence in one way or another. Making that
sound the loudest is an easy way of achieving this prominence (though not
always the most elegant). Arranging sounds by relating dynamic levels to
the importance of the musical part is very common, and a very natural
association of loudness and the center of one’s attention.
Gradual changes in dynamic levels can be important. The crescendo (grad-
ual increasing in loudness) can be used to support the motion of a melodic
line (for instance), or it might be used on a sustained pitch as a musical
gesture itself. Likewise a diminuendo or decrescendo (a gradual decrease
in loudness) may be used in the same ways.
Rapid, slight alterations or changes in dynamic level for expressive
purposes are often present in live performances. This is called tremolo, and is
used primarily to add interest and substance to a sustained sound. Tremolo
and vibrato are often confused. Vibrato is a rapid, slight variation of the
pitch of a sound; it also is used to enhance the sound quality of the sound
source. At times, performers may not be able to control their sound well
enough to control tremolo and vibrato alterations; in these instances,
tremolo and vibrato may detract from the source’s sound quality, rather than
contribute to it.
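The difference between tremolo and vibrato can be made concrete with a small sketch: one modulates loudness, the other modulates pitch. The sinusoidal shape of the variation, the rates and depths, and the function names below are assumptions chosen for illustration; real performances are far less regular.

```python
import math

# Illustrative sketch: tremolo varies loudness, vibrato varies pitch.
# Rates, depths, and the sinusoidal shape are hypothetical choices.

def tremolo_amplitude(t, base_amplitude=0.8, depth=0.1, rate_hz=6.0):
    """Amplitude at time t (seconds): a slight, rapid variation in level."""
    return base_amplitude * (1.0 + depth * math.sin(2 * math.pi * rate_hz * t))


def vibrato_frequency(t, base_hz=440.0, depth=0.01, rate_hz=6.0):
    """Frequency at time t (seconds): a slight, rapid variation in pitch."""
    return base_hz * (1.0 + depth * math.sin(2 * math.pi * rate_hz * t))
```

With these default values, the tremolo swings the amplitude 10 percent above and below its base level six times per second while the pitch stays fixed; the vibrato swings the frequency 1 percent above and below 440 Hz at the same rate while the level stays fixed.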
To support a musical idea or to create a sense of drama, musical ideas
are often brought to the listener’s attention by dynamic emphasis accents
and attenuation accents. A shift in dynamic level that brings the listener’s
attention to a musical idea is an accent. Accents are most often emphasis
accents, making use of increasing the dynamic level of the sound to achieve
the desired result. Much more difficult to successfully achieve, de-emphasis
(or attenuation) accents draw the listener’s attention to a musical idea, or a
sound source, by a decrease in the dynamic level of the sound. Attenuation
accents are often unsuccessful because the listener has a natural tendency
to move attention away from softer sounds; these accents are most easily
accomplished in sparse musical textures, where little else is going on to
draw the listener’s attention away from the material being accented.
New Concepts of Dynamic Levels and Relationships
Changes in dynamic levels over time comprise dynamic contours. Dynamic
contours can be perceived for individual sounds, individual sound sources,
individual musical ideas composed of a number of sound sources, and
the overall piece of music. Dynamic contours are perceived at many different
perspectives (levels of detail). At their extremes, they exist as the
smallest changes within the spectral envelope of a single sound source, and as
great changes in the overall dynamic level of a recording.
The interaction of the dynamic contours of all sound sources in a piece of
music creates musical balance. Musical balance is the interrelationships of
the dynamic levels of each sound source, to one another and to the entire
musical texture. The dynamic level of a particular sound source in relation to
another sound source is a comparison of two parts of the musical balance.
Dynamic contours and musical balance have been used in supportive roles
in most traditional music. At times dynamic level changes have been used
for their own dramatic impact on the music (as discussed with crescendo
and diminuendo, above), but most often they are used to assist the effec-
tiveness of another artistic element. The mixing process easily alters musi-
cal balance. Recordists exercise great control over this artistic element.
The dynamic levels and relationships of a performance may be significantly
different in the final recording. The recording process has very precise
control over the dynamic levels of a sound source in the musical balance
of the final recording. An instrument may have an audible dynamic level in
the musical balance of a recording that is very different from the dynamic
level at which the instrument was originally performed. The timbre of the
instrument will exhibit the dynamic levels at which it was performed
(perceived performance intensity), but its relative dynamic level in relation to
the other musical parts might be significantly altered by the mix. For
example, an instrument may be recorded playing a passage loudly, and end up
in the final musical balance (mix) at a very soft dynamic level; the timbre
of the instrument will indicate that the passage was performed very loudly,
yet the actual dynamic level will be quite soft in relation to the overall
musical texture, and to the other instruments of the texture.
Many clear examples of this are found in The Beatles’ recording of “Penny
Lane.” Listening carefully to the flutes, piccolo, and piccolo trumpet parts
throughout the song, one will find many instances where the loudness levels
of the performances are not reflected in the actual loudness levels of the
instruments in the recording. Among many instances of conflicting levels
and timbre cues, we hear moderately loud flutes that were performed softly;
loudly played piccolo sounds at a soft level in the mix; and a piccolo trumpet
appearing at a softer level than in the performance. Other instruments and
voices in the song also have inconsistent musical balance and performance
intensity information. The reader is encouraged to take the time now to
perform the musical balance and performance intensity Exercise 2-1
at the end of this chapter.
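The independence of a sound source's level in the mix from the intensity at which it was performed can be sketched with simple decibel arithmetic. Everything specific in the example is hypothetical: the `Track` class, the track names, and the level and gain values (in dB relative to an arbitrary reference). The only relationships relied on are that a gain in decibels adds to a level in decibels, and that a gain of G dB corresponds to an amplitude multiplier of 10^(G/20).

```python
# Illustrative sketch: mix level is independent of performance intensity.
# Track names, levels, and gains are hypothetical; levels are in dB
# relative to an arbitrary reference.

from dataclasses import dataclass


@dataclass
class Track:
    name: str
    performance_intensity: str  # carried by the timbre; unchanged by the fader
    recorded_level_db: float
    fader_gain_db: float = 0.0

    @property
    def mix_level_db(self):
        """Level in the musical balance after the fader gain is applied."""
        return self.recorded_level_db + self.fader_gain_db

    @property
    def amplitude_scale(self):
        """Linear amplitude multiplier equivalent to the fader gain."""
        return 10 ** (self.fader_gain_db / 20)


trumpet = Track("piccolo trumpet", "loud",
                recorded_level_db=-6.0, fader_gain_db=-20.0)
flute = Track("flute", "soft",
              recorded_level_db=-24.0, fader_gain_db=12.0)
```

In this sketch the softly played flute ends up 14 dB above the loudly played trumpet in the mix, while each track's `performance_intensity` label, standing in for the timbral cues of the performance, is untouched by the fader.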
Listen . . . to track 38 for musical balance relationships that are changed
from the original performance. The drum mix has many unnatural relationships
of performance intensity versus musical balance; some are overexaggerated
to provide clarity of the topic.
The dynamic level of a sound source in relation to other sound sources,
or musical balance, is quite different and distinct from the perceived
distance of one sound source to another. Yet, these two occurrences are
often confused and are the source of much common, misleading terminology
used by recordists. Significant differences are present between a
softly generated sound that is close to the listener and a loudly performed
sound that is at a great distance from the listener, even when the two sounds
have precisely the same sound pressure level (SPL) or perceived loudness
level. Loudness levels within the recording process are independently
controllable from the loudness level at which the sound was performed,
and are independently controllable from the distance of the sound source
from the original receptor and from the perceived listening location of the
final recording. Dynamics must not be confused with distance. Dynamic
levels, themselves, do not define distance location.
Rhythmic Patterns and Rates of Activities
Durations of sounds (the length of time in which the sound exists) combine
to create musical rhythm. Rhythm is based on the perception of a steadily
recurring, underlying pulse. The pulse does not need to be strongly audible
to be perceived. The underlying pulse (or metric grid) is easily recognized
by humans as the strongest common proportion of duration (note value)
heard in the music.
The rate of the pulses of the metric grid is the tempo of a piece of music.
Tempo is measured in metronome markings (pulses per minute,
abbreviated “M.M.”), or in some contexts as pulses per quarter note. Tempo, in a
larger sense, can be the rate of activity of any large or small aspect of the
piece of music (or of some other aspect of audio, for example the tempo
of a dialogue).
Durations of sound are perceived proportionally in relation to the pulse of
the metric grid. The human mind will organize the durations into groups of
durations, or rhythmic patterns. In the same ways that we perceive patterns
of pitches, we perceive patterns of durations. Pattern perception is
transferable to all of the components of all of the artistic elements, and is the
traditional way in which we perceive pitch and rhythmic relationships.
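The relationship between tempo, the pulse, and proportional durations reduces to simple arithmetic: at a metronome marking of M pulses per minute, one pulse lasts 60/M seconds. The sketch below illustrates this under assumed names (`pulse_duration_seconds`, `pattern_in_seconds`) and a made-up rhythmic pattern.

```python
# Illustrative sketch: tempo as pulses per minute, durations as
# proportions of the pulse. Function names and the pattern are hypothetical.

def pulse_duration_seconds(tempo_mm):
    """Length of one pulse of the metric grid at a given metronome marking."""
    if tempo_mm <= 0:
        raise ValueError("tempo must be positive")
    return 60.0 / tempo_mm


def pattern_in_seconds(pattern_in_pulses, tempo_mm):
    """Convert durations expressed in pulses to seconds."""
    pulse = pulse_duration_seconds(tempo_mm)
    return [proportion * pulse for proportion in pattern_in_pulses]


# A rhythmic pattern as proportions of the pulse: long, short, short, longer.
pattern = [1.0, 0.5, 0.5, 2.0]
at_120 = pattern_in_seconds(pattern, 120)  # one pulse lasts 0.5 s
at_60 = pattern_in_seconds(pattern, 60)    # one pulse lasts 1.0 s
```

The absolute durations double when the tempo is halved, but the proportions between them, which are what the listener groups into a pattern, stay the same.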
Rhythmic patterns are the durations of or between soundings of any artistic
element. Rhythmic patterns might be created by the pulsing of a single
percussion sound; in this way rhythmic patterns would be created by the
durations between the occurrences of the starts of the same sound
source. Rhythmic patterns comprising the durations of successive, single
pitches (perhaps including some silences) create melody. Rhythmic patterns
of the durations of successive chords (groups of pitches) create harmonic
rhythm. Extending this, in the same way rhythm can be transferred
to ALL artistic elements. As examples, it is possible to have
rhythms of sound location (as has become a common mixing technique
for percussion sounds); it is likewise possible to have timbre melodies, or
rhythms applied to patterns of identifiable timbres (this is often used for
drum solos).
Listen . . . to tracks 45, 46 and 53 for rhythmic patterns of timbre
and location created by the drum mixes.
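A rhythmic pattern formed by the durations between the starts of the same sound source can be sketched as a list of inter-onset durations. The onset times and the function name `inter_onset_durations` below are hypothetical, chosen only to illustrate the idea.

```python
# Illustrative sketch: a rhythmic pattern as the durations between onsets.
# Onset times (in seconds) are hypothetical.

def inter_onset_durations(onsets):
    """Return the durations between successive onsets, in time order."""
    ordered = sorted(onsets)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


snare_onsets = [0.5, 1.5, 2.5, 3.0, 3.5]
snare_pattern = inter_onset_durations(snare_onsets)
```

The same calculation could be applied to the onsets of any other artistic element, such as changes of location or of timbre, to describe rhythms of those elements.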
Sound Sources and Sound Quality
The selection, modification, or creation of sound sources is an important
aesthetic and artistic element of audio recording. The sound quality of
the sound sources (the timbre of the source) plays a central role in the
presentation of musical ideas, and has become an increasingly significant
form of musical expression.
The sound quality of a sound source may cause a musical part to stand out
from others, or to blend into an ensemble. Sound quality alone can convey
tension or repose and give direction to a musical idea. Sound quality can
add dramatic or extra-musical meaning or significance to a musical idea.
Finally, the timbral quality of a sound source can, itself, be a primary
musical idea, capable of conveying a meaningful musical message.
Until recently, composers used the sound quality of a sound source (1) to
assist in delineating and differentiating musical ideas (making them easier
to distinguish from one another), (2) to enhance the expression of a musi-
cal idea by the careful selection of the appropriate musical instrument to
perform a particular musical idea, or (3) to create a composite timbre (or
texture) of the ensemble, thereby forming a characteristic, overall sound
quality: timbral balance (also called tonal balance).
Performers have always used the characteristic timbres of their instruments
or voices to enhance musical interpretation. This activity has been greatly
refined by the resources of recording and sound-reinforcement
technology. Performers now have greater flexibility in shaping the timbre of their
instruments for creative expression. Of equally great importance, after
the performance has been captured, the recording process allows for the
opportunity to return to the performance for further (perhaps extensive)
modifications of sound quality.
The selection of a sound source to represent (present) a particular musical
idea is critical to the successful presentation of the idea. The act of selecting
a sound source is among the most important decisions composers (and
producers) make. The options for selecting sound sources are (1) to choose
a particular instrumentation, (2) to modify the sound quality of an existing
instrument or performance, or (3) to create, or synthesize, a sound source
to meet the specific need of the musical idea.
The selection of instrumentation was once merely a matter of deciding
which generic instrument of those available would perform a certain
musical line. The selection of instrumentation has now become very specific and
much more important. The performance that exists as a music recording
may virtually live forever and be heard by countless people. This is very
different from the typical, live music performance of the past that existed for
only a passing moment and was heard by only those people present.
Today, the selection of instrumentation is often so specific as to be a
selection of a particular performer playing a particular model of an instrument.
Generally, composers and producers are very much aware of the sound
quality they want for a particular musical idea. The performer, the way the
performer can develop a musical idea through their own personal perfor-
mance techniques, and their ability to use sound quality for musical expres-
sion are all considerations in the selection of instrumentation.
Vocalists are commonly sought for the sound quality of their voice and their
abilities to perform in particular singing styles. The vocal line of most songs
is the focal point that carries the weight of musical expression. Vocalists
make great use of performance techniques to enhance and develop their
sound quality, as well as to support the drama and meaning of the text.
Performance techniques vary greatly between instruments, musical styles,
performers, and functions of a musical idea. The most suitable
performance techniques will be those that achieve the desired musical results,
when the sound sources are finally combined. One performance technique
consideration that must be singled out for special attention is the intensity
level of a performance.
As touched on in the discussion above with musical balance, a performance
on a musical instrument will take place at a particular intensity level. This
perceived performance intensity is comprised of loudness, energy exerted,
performance technique, and the expressive qualities of the performance.
Each performance at a different intensity level results in a different charac-
teristic timbre of that instrument, at that loudness level.
The same sound source will thus have different timbres, at different loud-
ness levels (and at different pitch levels), through performance intensity.
Along with the timbre (sound quality) and loudness level, performance
intensity can communicate a sense of drama and an artistically sensitive pre-
sentation of the music to the listener. Through performance intensity, louder
sounds might be more urgent, more intense; softer sounds might be cause
for relaxation of musical motion. The exact reverse is equally possible. The
expressive qualities of music are contained in performance intensity cues.
Modifying a sound source is a common way of creating a desired sound
quality. Instruments, voices, or any other sound may be modified (while
being recorded, or afterwards) to achieve a desired sound quality. Most
often, this takes the form of making detailed modifications to a particular
instrument so it best presents the musical idea. The final sound quality will
still have some (perhaps many, perhaps only a few) characteristic qualities
of the original sound.
The extensive modification of an existing sound source, to the point
where the characteristic qualities of the original sound are lost, is
actually the creation of a sound source. The creation of new sound qualities
(or inventing timbres) has become an important feature in many types or
pieces of music. The recording process easily allows for the creation of new
sound sources, with new sound qualities.
Sound qualities are created by either extensively modifying an existing
sound through sound-sampling technologies, or by synthesizing a wave-
form. Sound-synthesis techniques allow precise control over these two
processes, and are having a widespread impact on recording practice and
musical styles. Many specific technologies and techniques exist for
synthesizing and sampling sounds; all have unique sound qualities and unique
ways of allowing the user to modify or synthesize a sound source.
A new sense of the importance of sound quality to communicate, as well
as to enhance, the musical message has come from this increased empha-
sis on sound quality and timbre. Sound quality has become a central ele-
ment in a number of the primary decisions of recording music, as well as
in the creation of music through the recording process. In making these
basic decisions, sound quality is conceptualized as an object. The sound is
thought of as a complete and individual entity, capable of being pulled out
of time and out of context.
In this way, sound quality is approached as a sound object. This important
concept will be explored in detail later in Chapter 4, Listening and
Evaluating Sound for the Audio Professional.
The entire, composite sound of the music may also be conceptualized as a
single entity, or overall sound quality. It is composed of all the sound
sources and musical ideas. This sound quality of the overall sound, or entire
program, is called timbral balance. It is perceived and recognized as the
sound-quality characteristics of the overall program.
The overall texture of the recording is perceived as an overall character,
made up of the states and activities of all sounds and musical ideas.
Pitch-register placements, rate of activities, dynamic contours, and spatial
properties are all potentially important factors in defining a texture by the states or
activities of its component parts. This will be covered in detail in Chapter 10.
Spatial Properties: Stereo and Surround Sound
The spatial properties of sound have traditionally not been used in musical
contexts. The only exceptions are the location effects of antiphonal
ensembles of certain Renaissance composers and in certain drama-related works
of the nineteenth century, such as the 1837 Requiem by Hector Berlioz (with
its brass ensembles stationed at the corners of the church, performing
against the orchestra and choir on stage).
The spatial properties of sound can play important roles in communicat-
ing the artistic message of recorded music, however. The roles of spatial
properties of sound are many. Spatial properties may be used in supportive
roles to enhance the character or effectiveness of musical ideas (large and
small), to differentiate one sound source from another, to provide dramatic
impact, to alter reality, or to reinforce reality by providing a performance
space for the music. Further, spatial properties may be used as the primary
idea of an artistic gesture. The spatial property of environmental character-
istics even fuses with the timbre of the sound source to add a new dimen-
sion to its sound quality. Many other possibilities certainly exist.
The number and types of roles that spatial location may play in music have
yet to be exhausted or defined. The adoption of surround sound has further
multiplied the possibilities.
All of the components of the spatial properties are under quite precise and
independent control. All of the spatial properties may be in many markedly
different and fully audible states. Further, gradual and continuously vari-
able change between those states is possible and common.
The spatial properties of sound that are of primary concern to recorded
music (sound) are:
1. The stereo location of the sound source on the horizontal plane of the
stereo array,
2. The distance of the sound source from the listener,
3. The perceived characteristics of the sound source’s physical
environment,
4. The surround location of sound sources on the lateral plane 360°
around the listener,
5. Perceived performance environment of the recording.
The perceived elevation of a sound source is not consistently reproducible
in widely used playback systems, and has not yet become a resource for
artistic expression.
Two-Channel Stereo
The spatial qualities of stereo playback are perceived as relationships of
location and distance cues and relationships of sound sources. These create
a perception of a sound stage contained within the perceived performance
environment of the recording.
While surround sound is becoming more prevalent, two-channel sound
reproduction remains the standard of the music recording industry, with
monophonic capabilities still considered for certain Internet, radio broadcast
and television sound applications. The two-channel array of stereo sound
attempts to reproduce all spatial cues through two separate sound locations
(loudspeakers), each with more-or-less independent content (channel).
With the two channels, it is possible to create the illusion of sound
location at a loudspeaker, in between the two loudspeakers, or slightly
outside the boundaries of the loudspeaker array; location is limited to the
area slightly beyond that covered by the stereo array, and to the horizontal
plane. The characteristics of the sound source’s environment and distance
from the listener are created in much more subtle ways by stereo, but can
be stunning nonetheless.
A setting is created by the two-channel playback format for the reproduc-
tion of a recorded or created performance (complete with spatial cues).
This establishes a conceptual and physical environment within which the
recording will be reproduced more-or-less accurately.
The reproduced recording presents an illusion of a live performance. This
performance will be perceived as having existed in reality, in a real physi-
cal space, as the listener will imagine the activity in relation to his or her
own physical reality. The recording will appear to be contained in a single,
perceived physical environment. Within this perceived space is an area that
comprises the sound stage.
Sound Stage and Imaging
The sound stage encompasses the area within which all sound sources are
perceived as being located. It has an apparent physical size of width and
depth. The sound sources of the recording will be grouped by the mind
to occupy a single area, from which the music is being played. It is
possible for different sound sources to occupy significantly different
locations within the sound stage but still be grouped into the illusion of a
single performance.
Imaging is the lateral location and distance placement of the individual
sound sources within the sound stage. Imaging provides depth and width
to the sound stage. The perceived locations and relationships of the sound
sources create imaging, as all sources appear to exist at a certain lateral
and distance location within the stereo array.
Figure 2-1 Sound stage and the perceived performance environment.
[Diagram labels: Perceived Performance Environment, Containment Walls,
Sound Stage]
Stereo Location
The stereo (lateral) location of a sound source is the perceived placement of
the sound source in relation to the stereo array. Sound sources may be per-
ceived at any lateral location within, or slightly beyond, the stereo array.
Phantom images are sound sources that are perceived to be sounding at
locations where a physical sound source does not exist. Imaging relies
on phantom imaging to create lateral localization cues for sound sources.
Through the use of phantom images, sound sources may be perceived at
any physical location within the stereo loudspeaker array,
and up to 15° beyond the loudspeaker array.
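One common way a phantom image is placed between two loudspeakers is by sending the same signal to both channels at different levels. The Python sketch below illustrates this with a constant-power (sine/cosine) pan law. The pan law, the position scale from -1.0 to +1.0, and the function names are assumptions made for illustration; the text does not specify how lateral placement is produced. The sketch covers only positions between the loudspeakers and does not model images heard slightly beyond the array.

```python
import math


def constant_power_pan(position):
    """Return (left_gain, right_gain) for a pan position in [-1.0, 1.0].

    -1.0 is the left loudspeaker, 0.0 is the center of the array,
    +1.0 is the right loudspeaker (an illustrative scale, not from the text).
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and 1.0")
    # Map the position onto a quarter circle:
    # 0 radians = hard left, pi/2 radians = hard right.
    angle = (position + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


def pan_mono_to_stereo(samples, position):
    """Scale each mono sample into a (left, right) pair for the given position."""
    left_gain, right_gain = constant_power_pan(position)
    return [(s * left_gain, s * right_gain) for s in samples]


if __name__ == "__main__":
    for pos in (-1.0, -0.5, 0.0, 0.5, 1.0):
        left, right = constant_power_pan(pos)
        print(f"position {pos:+.1f}: left gain {left:.3f}, right gain {right:.3f}")
```

With this law the squared gains always sum to one, so the source keeps roughly the same perceived loudness as its image moves across the array.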
Stage width (sometimes called stereo spread) is the width of the entire
sound stage. It is the area between the extreme left and right
source images and marks the sound stage boundaries.
Phantom images not only provide the illusion of the location
of a sound source, but also create the illusion of the physical
size (width) of the source. Two types of phantom images
exist: the spread image and the point source.
A point source phantom image occupies a focused, precise
point in the sound stage. The listener can close their eyes
and point to a very precise point of little area where the
source is heard to originate. Point sources exist at a specific
point in space, narrow in width, and precisely located in the
sound stage.
The spread image appears to occupy an area. It is a phantom
image that has a size that extends between two audible boundaries. The
potential size of the spread image varies considerably; it might be slightly
wider than a point source or it may occupy the entire stereo array. The
spread image is defined by its boundaries; it will be perceived to occupy an
area between two points or edges. At times, a spread image may appear
to have a hole in the middle, where it might occupy two more-or-less equal
areas, one on either side of the stereo array.
The perceived lateral location of sound sources can be altered to provide
the illusion of moving sources. Moving sound sources may be either point
sources or spread images. Point sources and narrow spread images that
change location most closely resemble our real-life experiences of moving
objects.
Many interesting examples of phantom images can be found on The Beatles’
album Abbey Road. An apparent example of a spread image with a hole in
the middle is the tambourine in the first chorus of “She Came in Through
the Bathroom Window.” The lead vocal in “You Never Give Me Your Money”
begins the song as a point source. The image soon becomes a spread image
that gradually grows wider, ultimately occupying a significant amount of
the sound stage (this is partly due to the gradual addition and varying of
environmental cues, which will be discussed shortly). In the second section
of the work, the new lead vocal sound gradually moves from the right to the
left side of the sound stage, while maintaining a spread image of moderate
size.
Listen . . . to tracks 42 and 43 for narrow and wide spread images of a
guitar; or to track 48 for a variety of spread image sizes of drum sounds
and point source cymbal bells.
Figure 2-2 Sound stage and imaging, with phantom images of various sizes.
[Diagram labels: Perceived Performance Environment; Listener’s Perceived
Location; Sound Stage containing High Kybd, High Hat, Lead Vocal, Background
Vocals, Acoustic Guitar, Bass Drum, Low Keyboard, Flute, Bass, and
Tambourine; Perceived Depth of Sound Stage; Perceived Width of Sound Stage]
Distance Location
Two important distance cues shape recorded music: (1) the distance of the
listener to the sound stage, and (2) the distance of each sound source from
the listener.
Both of these distances rely on a perception that the entire recording
emanates from a single, global environment. This perceived performance
environment establishes a reference location of the listener, from which all
judgments of distance can be calculated.
The stage-to-listener distance establishes the front edge of the sound stage
with respect to the listener and determines the level of intimacy of the
music/recording. This is the distance between the grouped sources that
make up the sound stage and the perceived position of the audience/listener.
This stage-to-listener distance places the sound stage within the overall
environment of the recording and provides a location for the listener.
The depth of sound stage is the area occupied by the distance of all sound
sources. The boundaries of the depth of the sound stage are the perceived
nearest and the perceived furthest sound sources (with the depths created
by their environments, discussed below). The perceived distances of sound
sources within the sound stage may be extreme; they may
provide the illusion of great depth and a large area, or the
exact opposite.
Stage-to-listener and depth-of-sound-stage distance cues
have different levels of importance in different applications.
Depth-of-sound-stage cues tend to be emphasized over
stage-to-listener distance cues in many multitrack record-
ings; in those recordings, the cues of the distance of the
source from the listener are often exploited for dramatic
effect and/or to support musical ideas. In contrast, stage-
to-listener distance cues are often carefully calculated in
classical and some jazz recordings (especially those utiliz-
ing standardized stereo microphone techniques); in those
recordings the stage-to-listener distance will not change
and has been carefully selected to represent an appropri-
ate vantage point (the ideal seat) from which to hear the music.
Turning again to Abbey Road, the distance cues of the various instruments
of “Golden Slumbers” give the work and its companion “Carry That
Weight” much space between the nearest and the furthest sources. The
orchestral string and brass instruments are at some distance from the
listener and give significant depth to the sound stage, while the piano
establishes the front edge of the sound stage very near the listener.
Remembering that timbral detail is the primary determinant of distance
location will help in accurately hearing these cues.
Environmental Characteristics
It has become important for music recordings to have sound sources
matched with an environment with a suitable sound, and to have a suitable
environment for the sound stage (the perceived performance environment).
Through these, environmental characteristics have the potential to
significantly impact music and the quality of the recording.
Environmental characteristics fuse with the sound source to create a single
sonic impression. The host environment shapes the overall timbre/sound
quality of each sound source; this is also true for the overall program
(shaped by its perceived performance environment). Environmental char-
acteristics contribute greatly to sound quality and also play an important
role in the recording’s sense of space. The characteristics provide a space
for the sound sources to perform in, they supply some distance information
that may be significant, and they contribute to the perceived depth of
the sound stage.
The sound characteristics of the host environments of sound sources and the
complete sound stage are precisely controllable. Each sound source has the
potential to be assigned environmental characteristics that are different from
the other sound sources. The recording process allows the potential for each
sound source to be given a different environment, and for the characteristics
of those environments to be varied as desired. Further, each source may
occupy any distance from the listener within the applied host environment.
Listen . . . to tracks 39-41 for distance changes of a single cello
performance; or to track 48 for a variety of distance locations within a
single drum mix.
The perceived performance environment (or the environment of the sound
stage) is the overall environment where the performance
(recording) is heard as taking place. This environment
binds all the individual spaces together into a single
performance area.
The environment of the sound stage and an individual
environment for each sound source (or groups of sound
sources) often coexist in the same music recording. This
places the individual sound sources with their individual
environments within the overall, perceived performance
environment of the recording. The illusion of space within space is thus
created, with the following potential perceptions:
1. That physical spaces may exist side-by-side,
2. That one physical space may exist within another physical space (where
often a space with the sound qualities of a physically large room may
be perceived to exist within a smaller physical space), and
3. That several sounds may exist at various distances within the same
host environment (space).
Any number of environments and associated stage-depth distance cues
may occur simultaneously, and coexist within the same sound stage. The
environments and associated distances are conceptually bound by the
spatial impression of the perceived performance environment. These out-
er walls of the overall program establish a reference (subliminally, if not
aurally) for the comparison of the sound sources.
Perhaps oddly, the overall space that serves as a reference, and that is per-
ceived by the listener as being the space within which all activities occur,
will often have the sound characteristics of an environment that is
significantly smaller than the spaces it appears to contain. Such cues that
send conflicting messages between our life experiences and the perceived
musical occurrence are readily accepted by the listener and can be used to
great artistic advantage. This is a very common space-within-space
relationship.
Space within space will at times be coupled with distance cues to accentu-
ate the different environments (spaces) of the sound sources, though often
this illusion is created solely by the environmental characteristics of the
different spaces of each sound source.
“Here Comes the Sun,” also from Abbey Road, provides some clear examples
of space within space. Environments clearly exist side by side from
the song’s opening into the first verse. The guitar has an environment all
to itself in the left channel, the electronic keyboard counter-melody and
Moog synthesizer glissando have similar environments distinctly different
from the others, and the right channel voice has a very different third
environment. The parts are held together by the notion that they all exist
within a single performance space (perceived performance environment).
The entry of additional instruments quickly adds numerous additional
environments and enhances the sound stage. As the vocal lines are added,
however, they appear to be within the same environment, though at
distinctly different distances from the listener. The notion of spaces within
spaces is also apparent in the drum parts; the drum set seems to occupy
an area, with its characteristic environment, within which low toms in a
larger space are contained.
Listen . . . to tracks 45-47 for several different space within space
examples.
Surround Sound
Music recordings in surround sound are becoming more important.
Enough activity and interest is present that it is necessary for us to serious-
ly explore this format now, but with some reservation. While some talented
people have been working in this new format and some striking record-
ings have been made, few consistent uses of the unique sound qualities of
surround have emerged. This section will discuss the most prevalent aes-
thetic and artistic elements currently found in surround music recordings,
and will explore some potential applications. Without doubt, recordists
will further define the artistic elements of surround over the coming years;
great changes and advances are likely, as the medium is further explored
in music production.
Listening to a stereo recording, we find ourselves observing a performance.
We are viewing the activity as an outsider. And while we may get consumed
by or immersed in the music, we are outside of the experience of
the performance itself and are looking in. With surround sound, we can find
ourselves enveloped by the music. We can be surrounded by the sound,
and thereby contained within the space of the recording; we are no longer
outside observers, but at least inside observers if not participants (at least
in our perception of the experience). Now the listener can be enveloped by
the sound (and become part of the space of the recording) or they might
be oriented by the production techniques to observe a piece of music as a
360° panorama of sound. This aspect of surround sound has great poten-
tial for making a profound impact on music. Location and environmental
characteristics will be approached differently for surround recordings, and
distance cues will also take on new dimensions.
Surround Location
The sound stage of surround sound is vastly more complicated than ste-
reo. Imaging takes on many strikingly new and different dimensions. With
independent channels surrounding the listener, the potential exists for the
sound stage to be extended enormously. This also places the listener in a
listening position that is strikingly different from stereo.
As discussed, stereo is based on a single sound stage between two speakers,
whereas 5.1 surround (the format used for evaluations herein, discussed
in detail in Chapter 9) provides the opportunity for as many as 26
possible combinations of speakers. This changes phantom image placement,
width, and stability greatly.
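The text does not derive the figure of 26. One reading that reproduces it, offered as an assumption rather than as the author's stated method, is to count every grouping of two or more of the five main channels (L, C, R, Ls, Rs), leaving out the subwoofer channel and single speakers. A minimal Python sketch of that count follows; the names SPEAKERS and speaker_groupings are illustrative.

```python
from itertools import combinations

# Assumption: only the five main channels are counted; the ".1" subwoofer
# channel is left out because it does not contribute to localization here.
SPEAKERS = ["L", "C", "R", "Ls", "Rs"]


def speaker_groupings(speakers, min_size=2):
    """List every grouping of at least min_size speakers."""
    groups = []
    for size in range(min_size, len(speakers) + 1):
        groups.extend(combinations(speakers, size))
    return groups


groupings = speaker_groupings(SPEAKERS)

# Tally the groupings by how many speakers each one uses.
counts = {
    size: sum(1 for g in groupings if len(g) == size)
    for size in range(2, len(SPEAKERS) + 1)
}

if __name__ == "__main__":
    print("total groupings:", len(groupings))
    for size, count in counts.items():
        print(f"  groupings of {size} speakers: {count}")
```

Under this assumption the total is 10 pairs, 10 groups of three, 5 groups of four, and 1 group of all five speakers.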
The phantom images of stereo exist between two loudspeakers, and up
to 15° beyond. Phantom imaging is more complex in surround. It is com-
posed of primary and secondary images.
There are five primary phantom image locations existing between adjacent
pairs of speakers in surround. These images tend to be the most stable and
reliable between systems and playback environments.
Many secondary phantom images are possible as well. These can appear
between speaker pairs that are not adjacent. These images contain incon-
sistencies in spectral information and are less stable. Implied are different
distance locations for these images, as the trajectories between the pairs
of speakers are closer to the listener position. These closer locations do
not materialize in actual practice. The distance locations of these images
are actually pushed away somewhat by the diminished timbral clarity of these
images.
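The split between primary and secondary images can be made concrete by treating the five main speakers as a ring around the listener. The Python sketch below assumes the ring order L, C, R, Rs, Ls (a conventional 5.1 layout, which the text does not spell out) and sorts every speaker pair into adjacent or non-adjacent. The function name classify_pairs is hypothetical.

```python
from itertools import combinations

# Assumed clockwise order of the five main speakers around the listener.
RING = ["L", "C", "R", "Rs", "Ls"]


def classify_pairs(ring):
    """Split all speaker pairs into (adjacent, non_adjacent) lists."""
    n = len(ring)
    adjacent, non_adjacent = [], []
    for i, j in combinations(range(n), 2):
        gap = (j - i) % n
        pair = (ring[i], ring[j])
        # Neighbors on the ring are one step apart in either direction.
        if gap == 1 or gap == n - 1:
            adjacent.append(pair)
        else:
            non_adjacent.append(pair)
    return adjacent, non_adjacent


if __name__ == "__main__":
    primary, secondary = classify_pairs(RING)
    print("adjacent pairs (primary images):", primary)
    print("non-adjacent pairs (secondary images):", secondary)
```

With five speakers in a ring, this yields five adjacent pairs, matching the five primary image locations described above, and five non-adjacent pairs that can carry secondary images.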
When we consider locations caused by various groupings of three or four
loudspeakers, placement options for phantom images get even more
complex.
Phantom images can be of greatly different sizes in surround. They can range
from completely surrounding the listener with a spread image of enormous
size, to a small and precisely defined point source. Point sources and
narrow spread images are common in current music productions, especially
in the front sound field. The center channel has changed imaging on the
traditional front sound stage tremendously. Images are often more defined
in their locations and narrower in size than in two-channel recordings.
Figure 2-3 Phantom images in pairs of surround speakers.
[Diagram labels: Adjacent Pair Phantom Images (Primary); Non Adjacent Pair
Phantom Images (Secondary); speakers L, C, R, Ls, Rs]
Distance in Surround
Distance cues, of course, remain a product of timbral detail, with some reli-
ance on environmental cues. The enhanced presence of ambience causes
surround to more readily draw the listener into making inaccurate judg-
ments of distance cues. Sounds are often perceived as further away than
is accurate, largely because of an awareness of extra or enhanced environ-
ment information. The listener is drawn toward ambience and away from
an awareness of timbral detail.
The depth of sound stage is extended all around the listener as well. The
listener is able to perceive distance in all directions, and these cues can be
present in surround recordings. This provides for creative opportunities not
possible in stereo, but should be approached with reservation.
First, we know listeners accept sound stages of great depth in the front
sound field. They easily imagine they are viewing something with
proportions out of their physical confines. Listeners are not prepared to
perceive
sounds from the side and rear in the same way. When presented with sim-
ilar materials from side and rear locations, listeners can be reluctant to
place sounds in or behind walls (even after they have been observed and
recognized at those locations). The same cues presented in a musically dif-
ferent way to envelop the listener in the space may allow this greater depth
to be perceived.
Second, phantom images from the side and rear are inherently filled with
phase anomalies of the listening room. This can cause a lack of timbral
definition and detail, and distort distance cues. Further, it is also common
for surround systems to give different timbral qualities to any instrument
panned across its different speaker locations. These timbre changes often
translate into distance changes and blur distance location imaging.
Finally, involuntary head movement contributes to our natural localization
process for acoustic sound sources. While this instinct has allowed our spe-
cies to survive and evolve, it may well lead to a sense of apprehension in the
listener. When presented with sounds from behind, the fight or flight instinct
can be triggered, and thus distract the listener or create discomfort.
Environmental Characteristics and Surround Sound
Environmental characteristics can be directed to the listener from every
direction in a very natural manner. This will immerse the listener in the cues
and provide the life-like experience of being present in the space where the
recording was made. Spaciousness can be presented by both two-channel
and five-channel systems to portray a sense of space, but only surround
systems can provide the sensation of being there within the performance.
At present, environmental characteristics are mostly used in ways similar
to stereo recordings. The characteristics of the perceived performance
environment and of individual sound sources are crafted to shape the
musicality of the recordings. One exception is fully or partially immersing
the listener in the ambience of a source’s host environment, while localizing
the source in a specific location elsewhere (usually in the front sound field).
The inherent qualities of environmental characteristics remain unchanged,
except for changes in the direction(s) of the arrival of reflected sound. This
itself is a great difference, as the fusing of environmental characteristics
and direct sound can become challenged, with a variety of results such as
enlarged images, unnatural effects (perhaps pleasing), distracting
reflections, and many more. This is especially apparent when environment cues
are sent to only a few channels and do not surround the listener; this can
lead to many different illusions. The environmental cues of individual sound
sources may be perceived as separated from the direct sound, may be used
to enhance imaging and space within space illusions, and many other alter-
natives exist for this new dimension. This new set of illusions makes the
perceived performance environment’s tendency to bind all of the spaces of
the individual sound sources together even more important. The perceived
performance environment will continue to provide an important context
for the recording and a critical point of reference.
The album Brothers in Arms by Dire Straits provides a number of examples
of these concepts. During the introduction to the second track, “Money for
Nothing,” an arpeggiated synthesizer sound revolves around the listener
slowly four times, between 0:15 and 1:15. The sound stage surrounds the
listener with a slight sense of envelopment: the first synthesizer sound
(with some string traits) fills the left front through left rear of the sound
stage, the vocal and soprano saxophone parts provide important front-center
sources that vary in size and have a small amount of motion, and a
synthesized bass part creates a stable rear-center image. Drums enter at 1:12
to anchor the listener’s attention to the front sound stage and this is
gradually reinforced as the guitar enters quietly at 1:26; the solo guitar at
1:36 clearly establishes the front center as the primary sound-stage location
of the song before being joined there by the drums (1:48) and lead vocal
(2:04), supported by ambience in the surrounds. The listener then is clearly
observing the front sound stage and sources are no longer around their
position.
The third song, “Walk of Life,” presents a sound stage that surrounds the
listener, with the listener sitting within the ensemble (sound stage). The
final song, “Brothers in Arms,” completely envelops the listener during the
storm sounds and musical materials of the introduction, which clear
dramatically for the exposed lead vocal (front center) of the first verse.
Conclusion
With the recording process, it is possible for any of the artistic elements of
sound to be varied in considerable detail. In so doing, all artistic elements
can be shaped for artistic purposes and used to create musical ideas. As all
elements of sound can be varied by roughly equal amounts, it is possible
for any element to play an important role in a piece of music. We commonly
see this practice in today’s music productions.
The artistic elements are used in very traditional roles in certain musical
works and types of recording productions, and in very new ways in other
works. These new ways the artistic elements are used tend to emphasize
aspects of sound that cannot be controlled in acoustic performances. The
aesthetic/artistic elements unique to audio recording (especially sound
quality and spatial properties) are commonly used to support and shape
musical ideas. Musical relationships and sound properties can exist in
audio recordings that differ from those of acoustic music. Knowing and
controlling these elements gives the recordist the opportunity to contribute to
the creative process and the act of making music.
The potentials of the artistic elements to convey the musical message, the
musical message itself, and the characteristics and limitations of the lis-
tener are explored in the following chapter.
Exercises
The following exercise should be practiced until you are comfortable with the
material covered.
Exercise 2-1
General Musical Balance and Performance Intensity Observations
1. Listen carefully to “Penny Lane” by The Beatles. Follow the flutes, piccolo,
and piccolo trumpet parts to observe the conflicting levels/cues cited in
the discussion above.
2. In succeeding hearings, find other instances where musical balance is at
a different loudness than the performance intensity information of the
instruments’ sound qualities.
3. Listen again, while focusing attention on a specific instrument or voice
you know well; follow that sound source carefully throughout the song,
to make some general observations of performance intensity cues.
4. Listen again and note the actual loudness of that instrument/voice in
relation to the other sound sources.
5. Finally, listen again for how these relationships change between major
sections of the song (i.e., between verse and chorus).
3 The Musical Message and the Listener
This chapter will discuss the content of the musical message, and how the
aesthetic and artistic elements function in communicating the message of a
piece of music. The listener’s ability to correctly interpret the sounds of the
musical message largely determines their understanding of the intended
artistic meaning of the music. The factors that limit the listener’s ability to
effectively interpret the artistic elements of sound into the intended
musical message (or meaning of the music) will be explored; the listener as
audience member, and as audio professional, will be contrasted.
Today’s recordist can do much to shape music and is often a key person in
the creative process. This chapter is intended to provide some insight into
what is created and shaped when music is recorded.
The Musical Message
The message of a piece of music is related to the many purposes or func-
tions of music. Each different purpose that music serves requires a different
approach to listening. The approach will bring some aspect into the center of
our attention or the center of our experience. As will be covered more thor-
oughly later, we listen to music with various levels of attention. This strongly
impacts the listening experience. At the extremes, the listener will be focused
and intent on extracting certain specific types of information (active
listening); conversely their attention will be focused on some activity other than
the music (eating, a conversation, a dentist’s drill), with the listener perhaps
not actually conscious of the music (passive listening). How the listener
approaches the act of listening impacts the success of the music reaching
the listener. The intended purpose and message of the music successfully
reaches the listener only when the listener is appropriately receptive.
The purpose of a piece of music and associated characteristics of the musi-
cal message may take the forms of (1) conceptual communication, (2) por-
traying an emotive state, (3) aesthetic experience, and (4) utilitarian func-
tions. The purposes are by no means exclusive; many pieces of music use
different functions simultaneously, or at different points in time.
Music that includes a text, such as a song, will communicate other
concepts. These works may tell a story, deliver the author’s impressions of an
experience, present a social commentary, etc. Music is used as a vehicle to
deliver the tangible ideas of the author/composer. The interplay between
the music and the drama of the text is often an important contributor to the
total experience of these works.
It is difficult for music alone (without words) to communicate specific
concepts, but it is possible. Often written works portray certain subjects
without a text. The listener can associate sounds in music to their experiences
with a subject if a connection can be made. The subjects of such works
are often general in nature, such as Ludwig van Beethoven’s “Pastoral”
Symphony No. 6.
Certain concepts are associated with certain specific musical materials or
types of materials (types of movie music, musical ideas associated with
certain individuals or certain landscapes, etc.). These are exceptional cases
where music alone can communicate specific concepts, with the aid of
associations drawn from the listener’s past experiences. It is easy to imagine a
Western chase scene, or an impending shark attack, when listening to cer-
tain pieces of music—after one has heard the music and seen the action
together enough times.
Music communicates emotions easily. One of the reasons many people
listen to music is for emotional escape, relief, or a journey to another place.
Music may portray a specific mood, incite a specific emotional response
from the listener, or create a more general and hard to define (yet convincing)
feeling or emotive impression. The composer of the music draws from
the past experiences of the listener to shape their emotive reactions to the
material. This is found—at least to some extent—in all music. Works of this
nature may include a text, or not.
Music may be an aesthetic experience. The perception of the relationships
of the musical materials alone, without the associations of concepts or
emotive states, may be the vehicle for the musical message. Music has
the ability to communicate on a level that is separate and distinct from
the verbal (conceptual) or the emotional. Music without words, without
emphasis on the emotional level, can be tremendously successful in communicating
a message of great substance. Music (as all of the arts) can
reach beyond the human experience; ideas that cannot be verbally defined
or represented as an emotional experience can be clearly communicated.
Abstract concepts may be clearly communicated; the human spirit may
reach beyond reality, to loftier ideals. Some people have been so moved
as to compare the aesthetic appreciation of a substantial work of art to the
impressions of religious experience. Works of Johann Sebastian Bach such
as the Brandenburg Concertos or the Cello Suites are excellent examples of
this type of music, from the multitude that encompass hundreds of years of
history and nearly all of the world’s cultures.
Music serves other functions. It is used to reinforce or accompany other
art forms (motion pictures, musical theater, dance, video art), to enhance
the audio and visual media (television, multimedia, radio, advertisements),
and to fill dead air in everyday experiences (supermarkets, elevators, etc.).
In these instances, music is present to support dramatic or conceptual
materials, to take the listener’s attention away from some other sounds or
activities (a dentist’s office), or to make an environment more desirable (a
restaurant, an automobile).
The complexity of the musical materials is often directly related to the func-
tion of the music. When music is the most important aspect of the listener’s
experience, the musical material may be more complex—as the listener will
devote more effort to deciphering the materials. When the music is playing
a supportive role, the materials are often less sophisticated and are directly
related to the primary aspect of the listener’s experience. When music is
being used to cover undesirable noises or to fill a void of silence that would
otherwise be ignored, the musical materials are often very simple, easily
recognized, and easily heard without requiring the listener’s attention.
Musical Form and Structure
Within the human experiences of time and space, nothing exists without
shape or form. Music is no exception. Pieces of music have form as a global
quality, as an overall concept and essence.
Pieces of music can be conceptualized as an overall quality. It is the human
perception of form that provides the impression of a global quality that
crystallizes the entire work into a single entity. Form is the piece of music
as if perceived, in its entirety, in an instant; it is the substance and shape
that is perceived from conceptualizing the whole.
Form is the global shape of a piece of music together with the fundamental
concepts and expressions (emotions) it is communicating. It is the sum that
is shaped from the interactions of its component parts. Form is a single con-
cept and essence of the piece, given shape by structure; it is comprised of
component parts that are the materials of the piece of music. The materials
of the music and their interrelationships provide the structure of the work.
The structure of a work is the architecture of its musical materials. Struc-
ture includes the characteristics of the musical materials coupled with a
hierarchy of the interrelationships of the musical materials, as they function
to shape the work. The artistic elements of sound function to provide the
musical materials with their unique character, as will be later discussed. All
musical materials are related by structure.
The hierarchy of musical materials is the listener’s perception of a general
framework of the materials. The hierarchy provides an interrelationship of
the materials (with their varying levels of importance) to one another, and to
the musical message as a whole. Within the hierarchy of musical structure:
• all musical materials and artistic elements will have a greater importance
  to the musical message than other materials or elements (except
  the least significant);
• all sections or ideas in the music will have greater importance over others
  (except the least significant);
• all musical materials are subparts of other, more significant musical
  materials (except the most significant materials).
Further, the hierarchy of the musical structure organizes musical materials into
patterns, and patterns of patterns. In this way, relationships are established
between the subparts of a work and to the work as a whole. The hierarchy
is such that any time span may contain any number of smaller time-spans,
or be contained within any number of larger time spans; musical material at
any level may be related to material at any other level of the hierarchy.
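The nesting of time spans described above can be pictured as a small tree. The following is only a minimal sketch under stated assumptions: the class name `TimeSpan`, its fields and methods, and the sample song layout are hypothetical illustrations and are not taken from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TimeSpan:
    """A span of musical time that may contain smaller spans (hypothetical model)."""
    label: str
    children: List["TimeSpan"] = field(default_factory=list)

    def depth(self) -> int:
        """Number of hierarchical levels at and below this span."""
        if not self.children:
            return 1
        return 1 + max(child.depth() for child in self.children)

    def leaves(self) -> List[str]:
        """Labels of the smallest spans, in the order they unfold in time."""
        if not self.children:
            return [self.label]
        result: List[str] = []
        for child in self.children:
            result.extend(child.leaves())
        return result


# Illustrative song only: major divisions contain phrases (labels are assumptions).
song = TimeSpan("song", [
    TimeSpan("Verse 1", [TimeSpan("a"), TimeSpan("a'"), TimeSpan("b")]),
    TimeSpan("Chorus", [TimeSpan("c"), TimeSpan("d")]),
    TimeSpan("Verse 2", [TimeSpan("a"), TimeSpan("b'")]),
])

print(song.depth())   # levels: whole song, major divisions, phrases
print(song.leaves())  # phrase-level patterns in time order
```

Each node stands for a pattern, and a node with children is a pattern of patterns; adding a motive level below the phrases would simply deepen the tree by one level.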
Figure 3-1 Form and structure in music.
Form
  Major divisions: A, B, A
Structure
  Major divisions: Verse 1, Chorus, Verse 2
  Tonal centers: I, V, V, I
  Subdivisions at intermediate levels and sub-levels:
    phrases: a a' b c d c d' a' a b'
    motives: melodic, rhythmic
    accompaniment patterns
    harmonic progression
    instrumentation:
      #1: solo voice, guitar, bass, complete trap set
      #2: solo voice, guitar, bass, kybd, background vocals, cymbals
      followed by Ensemble #1, Ensemble #2
Nontraditional artistic elements may function at any structural division and subdivision of the hierarchical
level to create patterns & rhythms:
  Dynamic Contour, Pitch Density, Sound Quality, Stereo Location, Distance Location, Environmental Characteristics
Primary and secondary elements are present throughout the structural hierarchy as Sound Events and
Sound Objects.
Interrelationships of materials take place at all levels of the hierarchy and between artistic elements.
A multitude of possibilities exists for unique musical structures, supporting
very similar musical forms. The innumerable popular songs that have been
written during the past few decades are evidence of this great potential for
variation. Most of the songs have many similarities in their structures, but
they also have many significant differences. The materials that comprise
the works may be very different, but the materials work towards establishing
an overall shape (form) that is quite similar between songs.
Many songs share similar forms. Their overall conceptions are very simi-
lar, although the materials of the music and the interrelationships of those
materials may be strikingly different. Form is an overall design and concep-
tion that may be constructed of a multitude of materials and relationships.
Musical materials can be changed to dramatically alter the structure of a
piece of music without altering its form. Many different structures can lead
to the same overall design and portray the same basic artistic statement
that creates form.
Musical Materials
As music moves and unfolds through time, the mind grasps the musical
message through the act of understanding the meaning and significance
of the progression of sounds. During this progression of sounds, the mind
is drawn to certain artistic elements (that create the characteristics of the
musical materials). We perceive the musical materials as small patterns
(small musical ideas, often called motives or gestures), and group the small
patterns into related larger patterns. The listener remembers the patterns,
together with their associations to larger and smaller patterns (perceiving
the structural hierarchy). In order for the listener to remember patterns,
the listener must recognize some aspect of the organization of a pattern or
some of the materials that comprised a pattern(s).
The use of contrast, repetition, and variation of patterns throughout the
structure creates logic and coherence in the music. Several general ways
in which musical materials are used and developed should make clear the
multitude of possibilities. Materials are contrasted with other materials at
the same and different hierarchical levels (above and below). Materials are
repeated immediately or later in time, at the same or different hierarchi-
cal levels. Materials are varied by adding or deleting portions of an idea,
by altering a portion of an idea, or perhaps by transposing an idea to dif-
ferent artistic elements (such as melodic ideas becoming rhythmic ideas;
harmonic motion becoming dynamic motion).
A balance of similarities and differences within and between the musical
materials is needed for successfully engaging music. A musical work will
not communicate the desired message if this balance is not effectively pre-
sented to, or understood by, the listener.
The listener remembers the context in which the patterns were presented
as well as the patterns themselves. Some patterns will draw the listener’s
attention and be perceived as being more important than other patterns;
these are the primary musical materials. Other patterns will be perceived
as being subordinate; these secondary materials will somehow enhance
the presentation of the primary materials by their presence and activity
in the music. The secondary materials that accompanied the patterns (or
primary materials) are also remembered as individual entities (capable of
being recognized without the primary musical idea) and as being associ-
ated with the particular musical idea (patterns).
The primary materials are traditionally: melody (with related melodic frag-
ments or motives) and (extra-musically) any text, or lyrics of the music.
The secondary materials are traditionally: accompaniment passages, bass
lines, percussion rhythms, harmonic progressions, and tonal centers.
Secondary materials may also be dynamic contour, pitch density, timbre
development, stereo/surround location, distance location, or environmen-
tal characteristics.
The secondary materials usually function to support the primary musical
ideas. It is possible to have any number of equal primary musical ideas. The
potential groupings of primary and secondary musical ideas, in creating a
single structural hierarchy, are limitless. Consider any number of second-
ary ideas (of varying degrees of importance in their support of the primary
musical idea or ideas) that may coexist in a musical texture, with any num-
ber of related or unrelated primary musical ideas.
Musical materials are given their unique characters by the states and values
of the aesthetic/artistic elements of sound. The artistic elements of sound
function to shape and define the musical materials.
The Relationships of Artistic Elements and Musical Materials
Musical ideas are also composed of primary elements and secondary elements.
The primary elements are the aesthetic and artistic elements of
sound that directly contribute to the basic shape or characteristics of a
musical idea. The secondary elements are those aspects of the sound that
assist, enhance, or support the primary elements.
It is possible (and in fact common) to have more than one primary element
and more than one secondary element contributing to the basic character
of a musical idea. Primary elements exhibit changes in states and values
that provide the most significant characteristics of the musical material.
The secondary elements provide support in defining or in providing movement
to the primary elements.
At all levels of the structural hierarchy, musical materials (primary and
secondary musical ideas) are made up of primary and secondary artistic
elements. Therefore, it is possible for a certain element of sound to be a
primary element on one level of the hierarchy, and a secondary element on
another level. This is not an uncommon situation. For example, a change
in dynamic level of a drum roll might have primary significance at the hierarchical
level of the individual sound source; at the same point in time but
at the hierarchical level of the composite sound of the entire ensemble,
changes in dynamics are insignificant to the communication of the musical
message, with pitch changes being of primary importance.
All of the artistic elements of sound have the potential to function as the
primary elements of the musical material. They have the potential to be the
central carriers of the musical idea. Likewise, all of the artistic elements of
sound have the potential to function as secondary elements of the musical
material, and have the potential of functioning in supportive roles in relation
to conveying the musical message. This concept of equivalence will be
thoroughly explored.
In most music, pitch is the central element or the primary carrier of the
musical message. The unique sound qualities of recording usually appear
in the supportive roles of music, much more than as the primary elements.
Most often, current production practice will use the unique artistic elements
of recordings (such as the stereo location) to support or enhance the primary
message (or perhaps to assist in defining an individual sound source).
Rarely do the new sound resources function as the primary element of the
primary musical idea, though this is entirely possible.
The new musical possibilities using all artistic elements can create con-
vincing musical ideas when they are used as the primary carriers of the
musical material. Current practice is likely to continue its gradual change
towards further emphasis of these new artistic elements of sound unique
to recording. It is important to recognize that the potential exists for any
artistic element of sound to be the primary carrier of the musical material.
The potential exists for any of the artistic elements of sound to function in
support of any component of the musical idea. All of the artistic elements
are equally capable of change, and that change can be perceived almost
equally well in all of the artistic elements.
Traditionally, a breakdown of primary and secondary elements of a piece
of music (with associated musical materials identified) would commonly
appear similar to Table 3-1. Pitch is the primary element and is supported
by rhythm and dynamics. The musical materials are differentiated by
sound-quality differences.
In many current recordings, a similar (and equally common) outline might
appear as Table 3-2. While pitch remains a primary element, rhythm and
sound-quality changes are equally important in delivering the musical
message. Sound quality, in particular, has become more important with
recording technology. Pitch, dynamics, and rhythm still play supportive
roles, with spatial properties assisting sound quality in differentiating the
musical ideas.
Table 3-1
Traditional Hierarchy of Artistic Elements

Primary Elements           Secondary Elements
Pitch: melodic line #1     Pitch: harmony
Pitch: melodic line #2     Pitch: accompaniment patterns
                           Dynamics: contour for expression
                           Rhythm: supporting melody
                           Sound quality: instrument selection
Table 3-2
Common Hierarchy in Current Music Productions

Primary Elements                  Secondary Elements
Pitch: melodic line #1            Pitch: harmony
Pitch: melodic line #2            Pitch: accompaniment patterns
Rhythm: recurring patterns        Dynamics: contour changes without changes in
                                  timbre; accents; contour for expression
Sound quality: changing texture   Rhythm: supporting melody
                                  Sound quality: instrument selection; expression
                                  changes without dynamic changes
                                  Spatial properties: diverse host environments
                                  for each instrument; rhythmic pulses in different
                                  stereo locations; sound-stage location of
                                  instruments widely varied
Equivalence and the Expression of Musical Ideas
Throughout Western music history, pitch (in its levels and relationships)
has been the most important artistic element of music. The pitch relation-
ships are utilized to create melodies, harmonies, accompaniment patterns,
and tonal systems. Pitch has functioned as the central element in nearly
all music that has descended from or has been significantly influenced by
the European tradition. Pitch is the primary artistic element in much of the
music we know and is the perceived parameter that contains most of the
information that is significant to the communication of the message of a
piece of music.
Western music might have developed differently. While pitch relationships
have been used as the primary generator of musical materials much more
than other artistic elements, this need not have been so. Indeed, musics of
other cultures use the artistic elements of music in significantly different
ways. Some cultures emphasize other artistic elements, such as rhythm,
and many incorporate very different types of pitch relationships.
It is difficult to justify pitch’s traditional prominence in the expression of
musical ideas. It is true that pitch is the perception of a primary attribute
of the waveform (frequency, with the other attribute being amplitude),
that of all the elements it is the most easily detected in many states and
values, and that pitch is the only artistic element that can be readily perceived
as multiples of itself (the octave repetition of pitch levels, and the perceptions
of real and tonal transposition of pitch patterns). These factors do not, however,
cause pitch to be a more prominent element than the others.
Our ability to perceive pitch is not significantly more refined (if at all) than
our abilities to perceive the other parameters of sound. This is especially
true of those parameters that utilize less precise pitch-related percepts
(timbre, environmental characteristics, texture, and pitch density).
It follows that the artistic elements of sound other than pitch are equally
capable of contributing to the communication of musical ideas.
This capability is being realized in the recordings of today’s creative art-
ists. In fact, this has been going on for quite some time, as the examples
of recordings by The Beatles should indicate. This is occurring without
conscious planning, but rather as a natural exploration of available sound
relationships. Musicians (recordists, performers, composers, and producers)
instinctively find roles for the unique artistic elements in recordings.
The artistic elements that were not available, or that were underutilized,
in traditional music performed in traditional contexts, are functioning in
significant ways in modern music productions.
The concept that all of the artistic elements of sound have an equal potential
to carry the most significant musical information is equivalence.
The states of the various components of the aesthetic and artistic elements
of sound will make up the musical material. As such, they will function
in primary or secondary roles of importance in the communication of the
musical message. It is possible for any artistic element to function in any
of the primary and secondary roles of shaping musical materials, and of
generating the communication of the musical message. These are under
the control of the recordist and have been commonly used to shape the
musicality of music productions for years. Bringing these musical ideas
into acceptance by the listener will require finesse and control of craft by
the recordist.
Equivalence is also a framework for listening. It is a point of departure that
reminds the recordist that any element or aspect of sound can change or
can demand attention. Any change in the sound must be detected by the
recordist and understood, no matter the significance to the music. Any
aspect of sound might require the attention of the recordist at any moment
during the listening experience. Equivalence provides this guidance and
awareness.
Text as Song Lyrics
When a text is present in a piece of music, it is a significant addition to
the musical experience. Through the text, language communicates a con-
cept or describes a drama within the work. Further, the sound resources of
the language will be exploited to enhance the aesthetic experience of the
music. Songs are often relatively short musical pieces that contain a text
(usually a single, rather short text). The song is the most common form of
music today.
The text, or lyrics, of a song is a poem set to music. The text’s elements are
arranged in some sort of structure (as the structural construction of music),
and the concepts of the text will create formal areas that are conceived as a
single entity, as well as an overall idea and meaning of the text.
The lyrics of songs are constructed in many of the same ways as traditional
poetry, written for its own sake and not intended to be set to music. The pri-
mary differences between the traditional poetry and poetry as song lyrics
lie (1) in the repetitions of certain stanzas or phrases of the poem (unaltered
or with slight changes), (2) in the careful crafting of the meters of the text,
the rhythms of the lines, and the timing of the conceptual ideas of the text
often found in song lyrics, and (3) in the sound qualities of the words that
can be chosen to enhance the musical setting.
Literary Meaning
The literary meaning of the text brings the dimensions of verbal communi-
cation of ideas and concepts to the musical experience. Songs have been
written on a multitude of subjects, from common, everyday small occurrences
to the highest of human ideals. The lyrics of a song might present a
story line, or they might be a description of an event or the author’s feelings
about some aspect of the world around them. The text might be a presenta-
tion of the social-political philosophies of the author, or it might be a love
song. The potential subjects for a song are perhaps limitless.
The presentation of the text’s literary meaning is often enhanced by sub-
ordinate phrases of text segments that create new dimensions in the text.
These subordinate ideas provide the turns of phrase or concepts that enrich
the meaning of the text as a whole. The turning of the phrase allows for dif-
ferent interpretations of the meaning of certain ideas, at times different
meanings to different individuals (or groups of people) depending on the
experiences of the audience.
The potential for different interpretations allows for some (or much) ambiguity
and intrigue in the text. The ambiguity may be clarified with a study
of the central concepts of the song lyrics. Reevaluating a well-crafted text
will often allow the listener to find new relationships of ideas or meanings
of materials that enhance the experience of the song for the listener (or
recordist). This is common in songs from many styles of music and lends a
considerable dimension to the musical experience.
The concepts used to enhance the literary meaning of the work may or may
not be directly related to the central ideas of the text. These ancillary concepts
may take many forms and are important in shaping the presentation of the
communicative aspects of poetry. A study of poetry, or of the setting of texts
to music, may be very appropriate for the individual recordist, but is out of
the scope of this writing. Some general observations are instead offered.
Structure and Form of Song Lyrics
The structure of the text exists on many levels, similar to the hierarchy of
musical structure. The conceptual meanings of the text, and the sounds and
rhythms of the text, do not allow for a clear division between the structural
aspects of the text and the form-related aspects of the text. The structure of
the text should address the sound qualities of the text and its organization
of mechanical parts. The form of the text should address the conceptual,
often with a recurring concept or theme, a refrain (as the song’s chorus).
Some cross-over will occur between the two areas: (1) the structure of the
text’s presentation may alter the statement’s meaning, and (2) concepts
can, at times, function as structural subparts. These are the result of the
ways we conceptualize in verbal communication, and the previous experi-
ences and social-cultural conditioning of the individual.
The components of the structure of a text will be major divisions of the
materials of the text, and the subdivisions they contain. The materials that
comprise the components are words, with all of their associated mean-
ings, and the thoughts and feelings they invoke from within the individual.
Words will be related by their sound qualities, rhyme schemes, rhythms
and meters of groups of words, repetitions of words and word sounds,
and by tonal and dynamic vocal inflections. Meanings of the words, repetitions
of words with different associated meanings, phrases created by
the concepts (sentences), and groupings of phrases by subject matter or
concepts are also used.
This format will not necessarily be directly transferable to all text settings,
but these concepts can provide a meaningful point of departure.
Figure 3-2 Form and structure of song lyrics.
Form
  Major divisions: A, B, A
  Subdivisions:
    phrases: a b c d e d f g h i
    concepts
Structure
  Major divisions: Stanza #1, Refrain, Stanza #2
  Sub-grouping by function: beginning of plot, author's impressions, plot continued
  Subdivisions:
    groupings of phrases by: rhythm, rhyme scheme, word usage, sound quality
    lines by: rhythm, rhyme scheme, word usage, sound quality
    words by: repetition, varied meaning, tonal inflection, dynamic inflection
Texts and Music in Combination
The structures of the text and the music interact in the overall perception of
the song. They are perceived as being interrelated. They serve to enhance
each other. The structures may complement one another, or they may serve
as areas of contrast, with the text and the music grouped in overlapping
segments, unfolding over time.
Both complementary and contrasting relationships of the structural ele-
ments of the text and the music exist in most works. The two play off one
another, creating a sense of drama between the text and the music.
The relationships of structures create our impression of form: our con-
ceptualization of grasping the essence of the entire work in an instant of
realization. Within our impression of form as the overall conception of the
work, we conceptualize points of climax and points of repose; we concep-
tualize the characteristics of design and shape of the materials that create
the movement from one important event, or moment, to the next.
We recognize the shape and design of the work as it is represented in our
perception of the significant moments of the work, and in the movement
between the moments as they unfold over time.
The relationships of the musical materials create structure in a piece of
music. Our perception of the design of structure is our conception of form.
The structure of a piece of music may be altered significantly without
altering its form. Even when the primary musical materials and the structure
of a work are significantly altered, two very different interpretations of
the same piece of music will be perceived as being similar when the form
(or overall conception) of both performances is similar.
Contrast, for example, two performances of the song “Every Little Thing”: the
original by The Beatles (Beatles for Sale, 1964) and a cover by Yes (Yes, 1969).
The overall shape of the piece is not dramatically altered, but the structures
of the two performances are quite different. Great differences exist between
the lengths of sections, as well as the treatments of the basic musical materials
and how they are organized. Few people would dispute that both performances
are of the same piece of music. Few people could fail to perceive dramatic
changes in the structure and materials of the two different versions.
The reader is encouraged to perform the exercise in identifying the struc-
ture of a song found at the end of this chapter, to compile a time line similar
to those in Figure 3-3.
Figure 3-3 “Every Little Thing” as performed by The Beatles and by Yes.
[Time-line figure not reproduced. Each version is charted against measure
numbers (measures of 4/4, with some measures marked 6/4 in one version),
showing the major sections (Introduction, Verse, Chorus, Instrumental Verse,
Bridge, Coda), their A and B groupings, and phrase labels (a, a', b, c, d, e, f).]
The Listener
The audience member and the various audio professionals will have very
different levels of listening expertise and usually dramatically different pur-
poses for the listening process.
The recordist will have knowledge of the recording process (that which
appears in Part Three, and more), the states of sound in audio recording, the
materials of music, and of the hearing mechanism (previously discussed).
Further, the recordist will have spent considerable time acquiring the listen-
ing skills for the evaluation of recorded and reproduced music, and sound
that will be covered in Part Two. The recordist is often equally skilled at
evaluating the technical integrity of the audio signal (perceived parameters
of sound), and at evaluating the artistic elements of sound and the materi-
als of the musical message, also to be covered in Part Two.
The lay listener is the audience for a recording. The lay listener will listen
by relying primarily on their previous listening experiences. They will usu-
ally have little or no formal training. The lay listener may be listening for
some meaning in the music and be concerned with the relationship of the
musical (or literary) message to their personal preferences of musical style,
and musical and dramatic meanings. They may be listening for the sensual
aspects of the music or be listening for the aesthetic experience.
The listener will likely be listening for pleasure and be concerned about
enjoyment. People most often listen for entertainment, and perhaps for
escape or enrichment.
While an album might be carefully crafted as a complete experience from
beginning to end, most people will not sit in a position equidistant from
two loudspeakers for about an hour listening to a CD. Most people do not
dedicate time to focus their attention to music listening, or to sitting in one
place while listening (unless they are driving, and then we hope they are
not listening too intently). This is simply not the normal listening practice
of people today.
Whatever the purpose for listening to music, the listener will not listen in
the same way or for the same sound qualities as the recordist. Nor should
they. The audience member should not be expected to listen in the same
way as the recordist. Their purpose for listening is very different. It is neces-
sary, however, for the recordist to deliver the recording in such a way that
it can be understood.
The receiver and the quality of the communication limit the success of any
communication. The receiver (listener) must be able to accurately process
information (recording/music) for communication to occur. Humans are
limited in their abilities to understand the content and/or meaning of what
they perceive. These limitations are primarily the result of the listener’s
experience and knowledge, but are also dependent upon the listener’s
degree of interest in the material, intellect, and physical condition. The same
material (music recording) will yield different information to different listen-
ers (or to the same listener on different hearings), depending on knowledge,
experience, analytic reasoning, social-cultural conditioning, expectations of
context, attentiveness, and the condition of the hearing mechanism.
In crafting recordings for an audience, the recordist might need/wish to
directly consider the listener (audience). An examination of these factors
will provide a realistic assessment of a target audience and perhaps allow
the recordist to reach them more readily. These are the conditions that
shape the listeners of recordings.
Knowledge
The listener’s accumulated information related to what is being heard, as
well as of all subjects related to their existence, plays a substantial role in
the understanding of music and sound.
Knowledge allows the listener to
understand a sound, or a musical passage, by relating the experienced
sound material to a body of known information. When the listener has a
body of known information and/or possible circumstances, the music can
be matched against those possibilities. With a match, listeners can then
comprehend (and potentially reason) the meaning of the material.
Knowledge is the amassed body of learned information, or known truths.
The listener can draw on their knowledge to make evaluations and judg-
ments on what is being heard. Knowledge areas related to the understand-
ing of sound (and music) would include acoustics, psychoacoustics, music
theory, music history and literature, language, audio recording theory and
practice, mathematics, physics, engineering, computer science, communi-
cations, and more. The listener can formally and consciously know these
subjects, or they might be more or less intuitively learned through sensitiv-
ity to life’s experiences.
Experience
The listener’s past life experiences are directly related to knowledge. Sound
is experienced. In its conceptualized state, sound becomes experienced
information. A personal knowledge, or experience, of the sound is the
result of the listening process. Prior listening experiences are a resource
that can be drawn from to recognize certain sound events or relationships.
Sounds are mapped into the memory. The listener is better able to retain
sound events in memory when a sound is the same as, or similar to, a
sound that has been previously experienced. The act of listening is itself an
experience, involving the learning of new information from what is going
on in the listener’s “present.” New information is recognized and understood
by comparing it with what has been previously experienced.
The type and quantity of listening experiences, and the personal knowledge
gained, will vary significantly between individuals. These listening experiences
are significant factors in understanding the messages of music. Different
types of music will communicate different messages and may communicate
the message through different musical styles. Difficulties people experience
in understanding or appreciating different types of music can often be attrib-
uted to a limited experience with a certain type of music. An individual’s lis-
tening experiences may have limited their ability to understand the materials
(language) of the music, or to appreciate what the music is trying to commu-
nicate. Increased knowledge of a type of music, and/or an increased number
of experiences in listening to the type of music, will increase the listener’s
ability to understand or appreciate the type of music.
Listening experiences are greatly influenced by the life environment of the
individual. The social and cultural environment(s) in which the individual
lives, and has experienced, provide opportunities for a certain finite number
of listening experiences. Within any environment, certain experiences
will occur much more frequently than others. Certain types of listening
experiences will be very common, and certain types of listening experi-
ences will never occur or occur only rarely.
Social-cultural conditioning will predispose the listener to a certain set of
available previous experiences. People are conditioned by their environ-
ment (social and cultural) to apply meanings to sounds, and to understand
stylized musical relationships. We learn to listen for certain relationships
in musical materials and the artistic elements of sound. For example, the
music of India uses pitch and rhythm in significantly different ways than
American popular music. Individuals from either culture will not readily
understand the meaning or appreciate the subtleties of the music of the
other culture, upon initial hearings.
The application of meanings to sounds is the basis for language. Sounds
have meaning, and can represent ideas. In this manner, a series of short
sounds as narrowly defined, isolated ideas can combine (in a prescribed
ordering) to create a complex concept. Communication of simple ideas to
complex thoughts is thus accomplished by language. As we well know, dif-
ferent cultures have strikingly different languages. Some languages have
common elements to other languages, and certain languages have ele-
ments that are largely unique.
Social context also plays a significant role in defining language sounds
and meaning. Quite different meanings may be associated with a single
sound, in the same language, by people of the same culture/society. This
most often occurs between different social groups (ethnic origins, religious
beliefs, age groups, etc.), groups of different economic status, and between
geographic locations.
Sounds have meanings associated with their source. A sound produced by
a car horn will invoke in the listener the thought of an automobile, not of
the horn itself. Such referential listening only occurs when the listener has
a certain set of life/listening experiences. Associations between sounds and
their sources are largely dependent upon the listener’s set of life/listening
experiences, as provided by social-cultural environments. One can imagine
living conditions under which an individual might never have experienced
the sound of a car horn (perhaps the nineteenth century). The sound would
not elicit the same response from this person as it would from a modern
urbanite.
The meanings of musical sounds transfer between cultural and social
groups in very similar ways to language sounds.
Social-cultural conditioning creates expectations as to the function of
music. People are conditioned to relate various functions and applications
to certain types of music. Dance, celebration, worship, ceremony, accom-
paniment to visual media, and aesthetic listening are but a few of the func-
tions that music serves in various societies. Each function carries with it
certain expectations for musical style. These expectations are defined differently
in different cultures and societies.
The listener’s life/listening experiences provide the available information
from which they can understand sounds. Social-cultural environment con-
ditions the listener through (1) providing a predominance of certain lis-
tening experiences, (2) providing certain expectations as to the content of
musical materials (the applications of the artistic elements), (3) providing
certain expectations as to the context within which certain types of music
will be heard (in church, in a club, in the street, etc.), (4) providing meanings
of association for certain significant sounds (significant sounds being
perhaps a siren, or a falling tree), and (5) providing associations of group
activity for certain types of music (ceremony, dance, group experiences).
While broadcast media have broadened the number of common elements
between social and cultural groups throughout the world, great diversity
still exists among human cultures and societal groups. Social-cultural conditioning
must remain a significant factor in our realization of the limitations
of the listener. For example, it might be unrealistic to expect the lay
person from China to understand the musical nuance and message of rap
music, just as it might be unrealistic to expect the typical American, suburban
16-year-old to understand the meaning and significance, or to appreciate
the aesthetic qualities, of Tibetan chant.
Expectation
Knowledge, experience, and social-cultural conditioning create expectations
for the listener. The listener will expect to hear certain sounds (or
sequences of sounds) under certain circumstances. They will expect certain
types of sounds to follow what has already been heard. They can expect to
hear materials in certain relationships (melody with certain harmony), and
to hear certain sounds within a given physical environment (one would
not expect to hear a lion sound on a city street). The listener will likewise
expect certain sounds in a given musical context (an operatic vocal tech-
nique would be unexpected in a reggae work) and expect to hear certain
kinds of music in certain social-cultural contexts (the listener will expect to
hear different music in church, movies, dance clubs, etc.).
When listeners are presented with something that is not expected, they
may be surprised if they are able to recognize the material enough to
understand it and its context, or may be confused if they cannot recognize
the sound or relate the sound to its context. An unexpected sound might
intrigue the listener as a unique turn of a musical idea or as a sound slightly
out of context. Conversely, when unexpected sounds are also unfamiliar, the
listener will not be able to understand them, will not receive the message
of the material, and will likely be dissatisfied or frustrated by the listening
experience. Among other possibilities, the listener might dislike the original
context of an unexpected sound, making the new experience unenjoyable.
Expected and unexpected sounds and relationships are balanced within
all musical styles. A musical style is a set of expectations. Certain types of
musical events and relationships are present that provide a musical style
with consistency and a unique character.
Analytical Reasoning
The listener’s knowledge, experience, and analytical reasoning play important
roles in the understanding of musical messages within various musical
styles. Too many unexpected sounds or situations will result in confusion
and frustration on the part of the listener. If expectations are filled in
predictable ways, the listener will become bored with the material. Listeners
perceive logic and coherence of the musical materials through a fulfillment
of expectations in the characteristics and functions of the musical materials,
coupled with enough unexpected activity to maintain interest.
Listeners use analytical reasoning to extract the meaning of musical mate-
rials, when they are unable to identify the material. Analytical reasoning,
in music listening, is the ability to relate immediate listening experience to
knowledge, in a manner capable of deducing meaningful observations and
information. The ability to perform this type of listening activity is depen-
dent upon intellect, the amount of knowledge the listener is able to draw
from, the listener’s previous experience in performing analytic reasoning
exercises, and the listener’s knowledge of the types of information to extract
from the listening experience. This method of listening draws on skills similar
to the critical and analytical listening skills addressed in Part Two.
Active and Passive Listening
The level of attentiveness of the listener plays an important role in the
understanding of the musical message. Listener attentiveness and musical
understanding are related to active and passive listening, and to the listener’s
interest in the music.
The difference between active and passive listening is the listener’s attention
and involvement. Passive listening occurs when the listener is not
focused on the listening process, or on the music itself within the listening
experience. Passive listening might find listeners otherwise occupied and
listening to music as a background activity (reading a book, for example).
Alternatively, listeners might be listening to music for reasons other than
understanding. They may be listening for relaxation purposes. Other types
of passive listening include approaching music for its emotive state, or
feeling, or listening to music for its pulse only, such as an accompaniment
to dancing. In all of these cases, the listener is not listening to the musical
materials themselves, and they might not be aware of the music during
certain periods of time. In passive listening the music itself is not the center
of the listener’s attention.
Music is at the center of the listener’s attention in active listening. Various
levels of detail can be extracted during the active listening process. Among
many possible states, active listening might take the form of listening to
the text and primary melodic lines of a work. It may take the form of follow-
ing the intricacies of motivic development in a Beethoven string quartet, or
of evaluating the characteristics of a sound system. In all cases, active
listening holds the listener’s attention, and the listener is aware of musical
materials or sound quality.
The listener is most likely to be an active listener if they are interested in
the music or have a specific reason to be listening carefully. The listener’s
interest in the music may be determined by mood or energy level at the
time of the listening experience but is most often associated with listening
preferences (and the previous experiences that have shaped those prefer-
ences). These preferences lead to the types of music the listener listens to
most often and what they prefer to hear.
Hearing Mechanism Condition
The final variable between individual listeners is the condition of the hearing
mechanism. Some individuals have impaired hearing; some have
knowledge of their condition and others do not. The hearing of the individ-
ual might vary from the norm because of a defect at birth, from accidental
damage from physical trauma or prolonged exposure to high sound-pres-
sure levels, or from the natural deterioration caused by the aging process.
The recordist cannot anticipate hearing impairment of the listener, nor cre-
ate recordings that can be heard well by those so unfortunate. It is quite
important, however, that the recordist know the condition of their hearing.
Variation of the individual’s hearing characteristics from normal human hear-
ing is of great importance to the recordist. The recordist must have knowledge
of their own hearing and make use of that information in evaluating sound
in their job function. Significant hearing problems may make a person poorly
suited for certain positions in the recording industry. Normally, a recordist
might find they are less sensitive to sound in certain frequency ranges or
that the two ears have different frequency and amplitude sensitivities. This
information will serve the recording professional well in evaluating sound,
as it will allow them to make adjustments in their work by knowing how
their personal perception is different from the existing sound.
Target Audience
The typical listener envisioned for a specific recording project, or piece of
music, is often called a target audience. The target audience for a piece of
music is often determined to help focus a project and to seek a way of predicting
the success of the music in communicating its message. The knowledge,
musical and sociological expectations, and the listening experience
of a typical audience member will define the target audience. The music can
then be shaped to conform to the abilities and expectations of the typical
member, thereby increasing the chance it will successfully communicate its
musical message. The goal is to create a recording/song that this defined
audience will find engaging and that will be commercially successful.
Conclusion
Recording professionals should not expect people to listen to their record-
ings with undivided attention or with the same level of accomplishment
they have attained. At the same time the recordist must not underestimate
or undervalue the listener. Listeners are often passionate about the music
we record and the music they listen to.
Music audiences feel strongly about “their” music. They are often very pos-
sessive about the type of music they enjoy and the performers they follow.
Music can speak deeply to people and bring people to identify with music
on a very fundamental and personal level. Commentaries about their music
can be perceived as reflections on themselves. While the listener may not
know much about music or recording, they know what they like—and usu-
ally are willing to tell you about it.
Similarly, listeners are often quick to identify the quality of productions.
Well-crafted and successful recordings are easily identified by listeners,
not for their quality but because they present the musical material in a way
that communicates well and directly to the listener. Further, sound qualities
of the recordings of one type of music will be different from others, and will
draw the listener, or not. The listener may not recognize technical integrity,
but any signal problems will detract from the recording and the listening
experience. Listeners will not miss this. They may not be able to tell what is
wrong but they will recognize that it is not right.
The recordist is in the position to play a central role in the creation of music.
They may use the recording process to shape music performances and the
music itself. This new role for the recordist has been widely recognized
since the early 1960s. While perhaps it is new when we think back over
hundreds of years of music, recordists are currently very much a part of
the creative process of nearly all recordings. With over 40 years of sophisti-
cated practice in crafting sounds through multitrack production and stereo
reproduction, the recordist as an artist is no longer something new.
The more the recordist understands music and the listener, the more likely
it is that they will be in a position to assist the artist in delivering a performance
and recording of a piece of music that will be successful. The goal of
this chapter was to bring the reader to appreciate some of what is involved
in this feat. It is hoped that the recordist will want the listener to find
enjoyment in what was recorded, for the artist’s sake and for the music.
Exercises
The following exercise should be practiced on a variety of pieces of music until
you are comfortable with the material covered.
Exercise 3-1
Structure Exercise
The purpose of this exercise is to create a time line of a song, divided into
major structural divisions and phrases.
1. Select a recording of a song you know reasonably well and prepare a time
line with measures numbered, up to perhaps 100.
2. Listen to the recording to identify where the major sections fall against
the time line. Try following the time line while tapping the pulse of the
song, or conducting. When a major section begins/ends, make a mark
on the time line.
3. After listening to the song, write down the names of those divisions.
Now try filling in additional information, such as other verse or chorus
beginning/ending points and phrase lengths.
4. Repeat listening to the recording and writing down the information rec-
ognized.
5. The graph is completed when it includes all of the major structural
divisions, the midlevel structural divisions and the smallest uniform
phrase. Incorporating text information is also helpful.
6. Following the time line while listening may prove helpful in initial studies
in identifying structural divisions. You are encouraged to wait until the
music is stopped before writing observations. Clearly separating the lis-
tening and writing activities will assist you in improving listening skills and
in learning to evaluate sound. This will become increasingly important as
this book progresses.
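For readers who keep their notes electronically, the time line built in the steps above could be sketched as a small data structure. This is only an illustration and not part of the exercise as written: the Section class, its field names, the render_time_line helper, and the sample measure numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Section:
    """One structural division marked on the time line (hypothetical layout)."""
    name: str           # e.g. "Verse 1"; written down after listening, per step 3
    start_measure: int  # first measure of the section (1-based)
    end_measure: int    # last measure of the section (inclusive)

    def length(self) -> int:
        """Number of measures the section spans."""
        return self.end_measure - self.start_measure + 1


def render_time_line(sections, total_measures):
    """Return one label per measure; measures not yet identified stay '?'."""
    line = ["?"] * total_measures
    for s in sections:
        for m in range(s.start_measure, s.end_measure + 1):
            line[m - 1] = s.name
    return line


# Invented example values, not taken from any actual recording.
song = [
    Section("Intro", 1, 4),
    Section("Verse 1", 5, 20),
    Section("Chorus 1", 21, 28),
]
time_line = render_time_line(song, total_measures=32)

for s in song:
    print(f"{s.name}: measures {s.start_measure}-{s.end_measure} ({s.length()} measures)")
```

Measures still labeled "?" are the ones to listen for on the next pass, which mirrors the repeated listening called for in step 4.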
Part Two
Understanding the Mix:
Developing Listening and
Sound Evaluation Skills
4 Listening and Evaluating Sound
for the Audio Professional
People in the audio industry need to listen to and evaluate sound. Carefully
evaluating sound, for one reason or another, is an integral part of most posi-
tions in the audio industry. Sound must be evaluated in all areas of audio
production, manufacturing, and support. These areas are very diverse. They
may be equipment performance or microphone placements, music mixes
or the technical quality of the signal, or any one of many other possibilities.
Sound is being evaluated by the audio professional in all these cases and
more.
Saying “sound is central to audio” is obvious to the point of sounding triv-
ial. It is equally ironic that the audio and music community has not devel-
oped a way to clearly communicate meaningful information about sound.
No language or vocabulary exists for qualities of sound. Part Two begins
the creation of a means and vocabulary to communicate about sound.
While this book is focused on the artistic roles of the recording profes-
sional, sound evaluation is important to everyone in the industry who lis-
tens to, evaluates, and talks about sound. Part Two of the book can and
should be used by anyone in need of developing the ability to understand,
evaluate, and communicate about sound. It should be a primary objec-
tive of all people in the audio industry to be more sensitive and reliable in
their evaluations of sound. While the term “recordist” will still be used in
Part Two, it should be interpreted to mean “any audio professional” during
discussions of sound evaluation. The sequence of chapters in this part will
present a system for understanding and evaluating sound that will substantially
develop the reader’s ability when mastered.
It is necessary for all people related to the audio industry to be accurate
and consistent in their evaluations of the quality and content of sound
and audio. As we have seen, the previous experience, knowledge, cultural
conditioning, and expectations of the listener (in this case the audio professional)
have a direct impact on the level of proficiency at which the listener
is able to evaluate sound. With increased experience in evaluating sound
comes increased skill and accuracy.
The act of listening and the process of evaluating sound can be learned and
greatly refined. The following is a presentation of the need for sound evaluation
and the listening process, leading to a discussion about how we talk
about sound, and the development of listening and sound evaluation skills.
Why Audio Professionals Need to Evaluate Sound
Audio professionals need to evaluate sound to define what they hear, to
understand what they hear, and to communicate with one another about
sound. These are important aspects of the job functions for almost all peo-
ple in audio.
Recording engineers and producers, obviously, must have well-developed
listening skills because evaluating sound is one of the most important
things they do in their work. The need for highly refined skills obviously
holds true for composers and performing musicians, especially those
involved in the audio-recording processes. All audio professionals who lis-
ten to sound share a similar need for these skills. The technical people of
the industry, those involved in artistic roles, and those in manufacturing or
facility design, or product sales and many others, all must share observa-
tions and information about sound.
There are other reasons audio professionals need to evaluate sound in addi-
tion to talking about sound in precise and meaningful terms. The recording’s
sound qualities need to be observed, recognized, and understood to per-
form a great many jobs in the industry. Nearly all positions approach sound
evaluation in a somewhat unique way. In fact, there might be as many reasons
(significantly or slightly unique) for evaluating sound as there are job
functions within the multitude of positions in the audio industry.
For the recordist, there are additional benefits to sound evaluation, and
some will be discussed in detail in later chapters. These include ways to (1)
keep track of one’s work so that the audio professional can return to those
thoughts/activities in the future, (2) plan recording projects out of the stu-
dio, (3) understand the work and ideas of others, (4) recreate sounds and
musical styles, and many more.
Nearly all people in audio work directly with some aspect of sound. These
aspects might be vastly different, yet these people must communicate
directly and accurately to share information. In order to share information,
sound must first be evaluated and understood by the listener.
Understanding sound begins with perceiving the sound through active
attention. One can then recognize what is happening in the sound or recognize
the nature of the sound, provided the listener has sufficient knowledge
and experience. The listener must know what to listen for (i.e., the
artistic elements of sound) and where to find that information (perhaps a
particular musical part). This recognition can lead to understanding, given
sufficient information. What is understood can be communicated, with the
presence of a vocabulary to exchange meaningful information that is based
on a common experience.
Talking About Sound
People in the audio industry, as in all industries, work together towards
common goals. In order to achieve those goals, people must communicate
clearly and effectively. A vocabulary for communicating specific, pertinent
information about sound quality does not currently exist. People have
been talking about sound for hundreds of years without a vocabulary to
describe their actual perceptions and experiences. Instead, people have
used imprecise terms to associate other perceptions and experiences to
sound, unsuccessfully and inaccurately.
Describing the characteristics of sound quality through associations with
the other senses (through terminology such as “dark,” “crisp,” or “bright”
sounds) is of little use in communicating precise and meaningful informa-
tion about the sound source. “Bright” to one person may be associated with
a narrow, prominent band of spectral activity around 15 kHz throughout the
sound source’s duration. To another person the term may be associated with
fast transient response in a broader frequency band around 8 kHz, and pres-
ent only for the initial third of the sound’s duration. A third person might
easily provide a different, yet an equally valid, definition of “bright” within
the context of the same sound. The three people would be using different
criteria of evaluation and would be identifying markedly different charac-
teristics of the sound source, yet the three people would be calling three
potentially quite different sounds the same thing: “bright.” This terminology
will not communicate specific information about the sound and will not
be universally understood. It will not have the same meaning to all people.
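As a rough illustration of the point above, the first two listeners’ meanings of “bright” can be written down as explicit physical descriptions and compared. This is a minimal sketch under stated assumptions: the BandDescription class and its field names are hypothetical, and the values simply restate the figures given in the paragraph (15 kHz, narrow, whole duration; 8 kHz, broader, initial third).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BandDescription:
    """A physical description standing in for a vague label (hypothetical)."""
    center_hz: float       # approximate center of the prominent band
    width: str             # "narrow" or "broad", as worded in the text
    start_fraction: float  # where in the sound's duration the activity begins
    end_fraction: float    # where it ends (1.0 = end of the sound)


# Both listeners use the same word, "bright", for these two descriptions.
listener_a = BandDescription(15_000.0, "narrow", 0.0, 1.0)
listener_b = BandDescription(8_000.0, "broad", 0.0, 1 / 3)


def same_description(x: BandDescription, y: BandDescription) -> bool:
    """True only when every physical attribute matches."""
    return x == y


print(same_description(listener_a, listener_b))
```

The shared label hides the fact that no attribute of the two descriptions agrees except the starting point; stating the attributes directly is what makes the difference visible.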
Analogies such as “metallic,” “violin-like,” “buzzing,” or “percussive” might
appear to supply more useful information about the sound than the inter-
sensory approach. This is not so. Analogies are, by nature, imprecise. They
compare a given sound quality to a sound the individuals already know. A
common reference between the individuals attempting to communicate is
often absent. Sounds have many possible states of sound quality.
“Violin-like” to one person may actually be quite different to another person.
One person’s reference experience of a “violin” sound may be an historic
instrument built by Stradivarius and performed by a leading classical artist
at Carnegie Hall. Another person may use the sound of a bluegrass fiddler,
performing on a locally crafted instrument in the open air, as their reference
for defining the sound quality of a “violin.” The sound references are equally
valid for the individuals involved, but the references are far from consistent
and will not generate much common ground for communication. The sound
qualities of the two sounds are strikingly different. The two people will be
referencing different sound characteristics, while using the same term. Cer-
tainly there will be strong similarities between these two instrument sounds,
but there will be great differences in the subtle details of the sound qualities;
it is in these subtle details that quality recordings are created, and where the
skills of the recordist must be drawn. An accurate exchange of important
information will not occur without a clear communication of this detail.
The imprecision of terminology related to sound quality is at its most
extreme when sounds are categorized by mood connotations. Sound
qualities are sometimes described in relation to the emotive response they
invoke in the listener. The communication of sound quality through termi-
nology such as “somber,” for example, will mean very different things to
different people. Such terminology is so imprecise it is useless in commu-
nicating meaningful information about sound.
People can only communicate effectively through the use of common
experiences or knowledge. The sound source itself, as it exists in its physi-
cal dimensions in air, is presently the only common experience between
two or more humans.
As we hear sounds, we form many individualized interpretations and personal
impressions. These individual interpretations and impressions are present
within the human perceptual functions of hearing and evaluating sound.
They cause individualized changes of the meaning and content of the sound.
Therefore, our interpretations and impressions are of little use in communi-
cating about sound. Humans have few listening experiences that are com-
mon between individuals and that are available to function as the reference
necessary for a meaningful exchange of information (communication).
This absence of reference experiences and knowledge makes it necessary
for the sound source itself to be described. Meaningful communication
about sound will not be precise and relevant without such a description.
The states and activities of the physical characteristics of the sound will be
described in our communications about sound. This approach to evaluating
sound requires knowledge of the physical dimensions of sound and how
they are transformed by perception. Meaningful communication between
individuals is possible when the actual, physical dimensions of sound are
described through defining the activities of its component parts.
By describing the states and activities of the physical components of a sound,
people may communicate precise, detailed, and meaningful information.
The information must be communicated clearly and objectively. All of
the listener’s subjective impressions about the sound, and all subjective
descriptions in relation to comparing the sound to other sounds, must be
avoided for meaningful communication to occur.
Subjective information does not transfer to another individual. As people
attempt to exchange their unique, personal impressions, the lack of a com-
mon reference does not allow for the ideas to be accurately exchanged.
Meaningful communication about sound can be accomplished through
describing the values and activities of the physical states of sound. Sounds
will be described by the characteristics that make them unique. Meaningful
information about sound can be communicated through verbally describing
the values and activities of the physical states of sound in a general way.
Information is communicated in a more detailed and precise manner through
graphing the activity, as will be described in the following chapters.
A vocabulary for sound is essential for audio professionals to recognize and
understand their perceptions, as well as to convey to others what they hear.
The Listening Process
Recording engineers and other industry professionals must learn to listen
in very exacting ways. The profile of the listener discussed in Part One
assisted us in identifying how the recordist has different purposes for lis-
tening and needs a much higher skill level. It is necessary for audio pro-
fessionals to be accurate and consistent in the listening process and its
observations. Likely the most difficult job of the recordist is listening and
paying attention.
Listening skills need to be developed for the recordist to function in their
job. They will be focused in their attention and ultimately become system-
atic in how they listen to hear detail in sound. The recordist will not be lis-
tening passively, but rather will be actively engaged in seeking out informa-
tion with each passing sound. They will be concerned about a multitude of
things, from the quality of a performance, to its technical accuracy; from the
quality of a microphone selection, to its appropriate placement; from the
quality of the signal path, to the inherent sound quality of a signal processor.
All of these things and many more might pass through the thoughts of the
recordist frequently and regularly throughout any work session. The listen-
ing experience of the audio professional will be multidimensional in many
ways. All of their work comes back to learning how to listen.
The recordist must acquire a systematic approach to listening that
will involve quickly switching between critical and analytical listening
information. It will involve quickly switching between levels of detail, or
perspective, and focus on various artistic elements and musical materials.
In many ways the recordist’s listening process is like a scanner—always
moving between types of information and between levels of detail.
Critical Listening versus Analytical Listening
Audio professionals evaluate sound in two ways: critical listening and
analytical listening. Critical listening and analytical listening seek different
information from the same sound. Analytical listening evaluates the artistic
elements of sound, and critical listening evaluates the perceived param-
eters. A different understanding of the sound is achieved in each case.
The artistic elements are the functions of the physical dimensions of sound,
applied to the artistic message of the recording. We recognize the physical
dimensions of sound through our perception, as perceived parameters. This
allows understanding of the technical integrity of sound quality to be con-
trasted with musical meaning and relationships of the artistic elements.
The same aspects of sound quality may provide two different sets of infor-
mation. This is entirely dependent upon the way we listen to the sound
material, evaluating the sound for its own content (critical listening) or
evaluating sound for its relationships to context (analytical listening). The
recordist must understand how the components of sound function in rela-
tion to the musical ideas of a piece of music and the message of the piece
itself. These are analytical listening tasks. The audio professional must also
understand how the components of sound function to create the impres-
sion of a single sound quality, and how they function in relation to the tech-
nical quality of the audio signal. These aspects are critical listening tasks.
Analytical listening is the evaluation of the content and the function of
the sound in relation to the musical or communication context in which it
exists. Analytical listening seeks to define the function (or significance) of
the musical material (or sound) to the other musical materials in the struc-
tural hierarchy. This type of listening is a detailed observation of the interre-
lationships of all musical materials, and of any text (lyrics). It will enhance
the recordist’s understanding of the music being recorded, and will allow
the recordist to conceive of the artistic elements as musical materials that
interact with traditional aspects of music.
Critical listening is the evaluation of the characteristics of the sound itself.
It is the evaluation of the quality of the audio signal (technical integrity)
through human perception, and it can be used for the evaluation of sound
quality out of the context of a piece of music. Critical listening is the process
of evaluating the dimensions of the artistic elements of sound as perceived
parameters—out of the context of the music. In critical listening, the states
and values of the artistic elements function as subparts of the perceived
parameters of sound. These aspects of sound are perceived in relation to
their contribution to the characteristics of the sound, or sound quality.
Critical listening seeks to define the perception of the physical dimensions
of sound, as the dimensions appear throughout the recording process.
It is concerned with making evaluations of the characteristics of the
sound itself, without relation to the material surrounding the sound, or to
the meaning of the sound. Critical listening must take place at all levels of
listening perspective (see below), from the overall program to the minutest
aspects of sound.
The Sound Event and Sound Object
The concepts of the sound event and the sound object assist in under-
standing how the musical materials (analytical listening) and sound quality
(critical listening) are shaped by the artistic elements. A sound event is the
shape or design of the musical idea (or abstract sound) as it is experienced
over time. The sound object is the perception of the whole musical idea (or
abstract sound) at an instant, out of time.
The sound event is a complete musical idea (at any hierarchical level) that
is perceived by the states and values of the artistic elements of sound. The
term designates a musical event that is perceived as being extended over
time, and has significance to the meaning of the work. The sound event
is a musical idea perceived by its various dimensions, as shaped by the
artistic elements of sound. It is a perception of how the artistic elements
of sound are used to provide the musical section with its unique character.
The sound event is understood as unfolding and evolving over time, and is
used in analytical listening observations.
Sound object refers to sound material out of its original musical context.
For example, in a discussion of the sound quality of George Harrison’s Gib-
son J-200 on “Here Comes The Sun” compared to its sound on “While My
Guitar Gently Weeps,” the two sound qualities of the instrument would be
thought of as sound objects during that evaluation and comparison pro-
cess. A sound object is a conceptualization of a sound as existing out of
time, and without relationship to another sound (except its possible direct
comparison with another sound object).
The concepts, sound object and sound event, are contrasted at any hierar-
chical level. They allow analytical listening and critical listening evaluations
to be performed, interchangeably and/or simultaneously, on the same
sound materials.
These concepts are able to provide an evaluation of the music’s use of the
artistic elements of sound, in ways that are not necessarily related to the
importance or function of the musical materials. Rather, these concepts seek
to determine information on the artistic elements (or perceived parameters)
themselves, as they exist as singular and unique entities (sound objects),
and as they change over time (sound events).
Perspective and Focus
For sound evaluation purposes, the audio professional must be able to
understand the artistic elements of sound, how those elements relate to the
perceived parameters of sound, and how those two conceptions of sound
are used with perspective and with focus. The concepts of perspective and
focus are central to the listening process and evaluating sound. The audi-
ence will go through this process in a general and intuitive manner. The
audio professional must be thorough and systematized in approaching the
listening process.
In order for the message carried by the artistic elements to be perceived,
the listener (audience or audio professional) must recognize that impor-
tant information is being communicated in a certain artistic element. The
listener must then decipher the information to understand the message,
or recognize the qualities of the sound. The listener will identify the artistic
elements that are conveying the important information by scanning the
sound material at different perspectives, while focusing attention on the
various artistic elements at the various levels of perspective.
Focus is the act of bringing some aspect of sound to the center of one’s
attention. The listener needs to identify the appropriate perceived parameter
of sound that will become the center (focus) of attention in deciphering
the sound information. Further, the listener needs to determine a specific
level of detail on which to focus attention.
The perspective of the listener determines the level of detail at which the
sound material will be perceived. Perspective is the perception of the piece
of music (or of sound quality) at a specific level of the structural hierarchy.
The content of a hierarchy is entirely dependent upon the nature of the
music or program material, at any specific time.
In a musical context, the detail might break down as in Table 4-1. Each lev-
el of detail represents a unique perspective from which the material can
be perceived. Each perspective will allow the listener to observe different
dimensions and activities of the sound material. A perspective might be
thought of as a type of distance of the listener from the sound material;
the nearer the listener to the material, the more detail the listener is able to
perceive. Perspective is the level of detail at which one is listening.
Table 4-1
Example of Hierarchical Levels of Perspective
Level 1    Overall musical texture and form
Level 2    Text (lyrics)
Level 3    Program dynamic contour, timbral balance, sound stage
Level 4    Individual musical parts (melody, harmony, etc.)
Level 5    Groupings of instruments and voices
Level 6    Individual sound sources (instruments and voices)
Level 7    Dynamic relationships of instruments (musical balance)
Level 8    Composite sound of individual sources (timbre and spatial qualities)
Level 9    Pitch, duration, loudness, timbre, and space elements of an
           individual sound of a specific sound source
Level 10   Dynamic contour; definition of important components of timbre and
           space of an individual sound of a specific sound source
Level 11   The spectral content (harmonics and overtones) of the individual
           sound of a specific sound source
Level 12   The spectral envelope (dynamic envelopes of the overtones and
           harmonics) of that specific sound
The listener may approach any perspective to extract analytical listening
information (pertaining to the function of the musical materials and artistic
elements at that level of the structural hierarchy) or to extract critical listening
information (pertaining to defining the characteristics of the sound
itself). Focus, again, is the act of bringing one’s attention to the activity and
information occurring at a specific perspective of the structural hierarchy,
and/or within a particular artistic element or perceived parameter.
Attention to focus and perspective is needed in both critical listening and
analytical listening activities, and should be considered before starting any
listening session. It is important for the recording professional to define the
focus and level of perspective of the listening experience before the sound
material begins, as they can shape the listening experience in strikingly different
ways for different situations. In many listening situations, all parameters
of sound will need to be continually scanned to determine their influence
on the integrity of the audio signal, and all artistic elements will need
to be scanned to determine their importance as carriers and shapers of the
musical message. In other listening situations, the recordist might need
to carefully follow a specific artistic element at a specific level of perspective
throughout the listening experience. Different situations will require
a different approach to listening. It is important that the recording professional
have a clear idea of what needs to be the focus of their attention and
the level of detail required (perspective) before beginning to listen, or of
the need for continually shifting focus and levels of perspective.
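One way to make this preparation concrete is to write the decision down before listening. The Python sketch below is only an illustration and is not part of the evaluation system presented in this book: the names ListeningPlan and make_plan, and the idea of recording a plan as data, are assumptions introduced here. The level labels are abbreviated from the example hierarchy in Table 4-1.

```python
from dataclasses import dataclass

# Abbreviated labels for the twelve example levels of perspective (Table 4-1).
PERSPECTIVE_LEVELS = {
    1: "overall musical texture and form",
    2: "text (lyrics)",
    3: "program dynamic contour, timbral balance, sound stage",
    4: "individual musical parts",
    5: "groupings of instruments and voices",
    6: "individual sound sources",
    7: "dynamic relationships of instruments (musical balance)",
    8: "composite sound of individual sources",
    9: "elements of an individual sound",
    10: "dynamic contour and key components of an individual sound",
    11: "spectral content of an individual sound",
    12: "spectral envelope of an individual sound",
}

LISTENING_MODES = ("analytical", "critical")


@dataclass(frozen=True)
class ListeningPlan:
    """Hypothetical record of what one listening pass will attend to."""
    level: int    # level of perspective (1 = broadest, 12 = finest detail)
    element: str  # artistic element or perceived parameter in focus
    mode: str     # "analytical" or "critical"

    def describe(self) -> str:
        label = PERSPECTIVE_LEVELS[self.level]
        return (f"{self.mode} listening at level {self.level} "
                f"({label}); focus: {self.element}")


def make_plan(level: int, element: str, mode: str) -> ListeningPlan:
    """Validate the choices and return a plan for a single listening pass."""
    if level not in PERSPECTIVE_LEVELS:
        raise ValueError(f"unknown level of perspective: {level}")
    if mode not in LISTENING_MODES:
        raise ValueError(f"mode must be one of {LISTENING_MODES}")
    return ListeningPlan(level, element, mode)


if __name__ == "__main__":
    print(make_plan(7, "loudness", "critical").describe())
```

A listener who needs to shift focus continually could prepare several such plans and move between them, rather than relying on a single fixed one.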
In beginning studies it is very important for the listener to have a clear pur-
pose for each listening experience. This will greatly assist the learning pro-
cess, and will make each listening session more productive and successful.
They should be focused on a specific level of perspective and on a specific
aspect of the sound, and should seek to ignore other aspects of sound. They
will listen to the material repeatedly with a focus on a new aspect of sound
at each repetition. With practice, one will be able to listen to (and recognize
and understand what is happening to) many elements “at once.”
Multidimensional Listening Skills
Equal attention must be given to all aspects of sound as, depending on
the sound material and purpose of the listening, any perceived parameter
of sound or any artistic element may be the correct focus of the listener’s
attention. An incorrect focus will cause important information to go unperceived
and will cause unimportant information to incorrectly skew the
listener’s perception of the material. The recording professional will often
face the possibility that a change might happen in any of the dimensions of
sound, at any point in time, at any level of perspective. It is necessary that
recording professionals hear, recognize, and understand the character of
the sound and any changes that might occur. This awareness needs to be
cultivated, as it is counter to our learned listening tendencies.
Audio professionals must develop their listening skills to be multidimen-
sional. The listening process involves the potential need to listen to many
things simultaneously. Though on one hand impossible, this is in practice
often necessary. To accurately evaluate sound, they must learn to:
1. Shift perspective between all levels of detail,
2. Focus on appropriate elements and parameters at all levels of perspec-
tive (and not allow their attention to be pulled away to activity in anoth-
er element or level of perspective), and
3. Shift between analytical listening (for the qualities and relationships
of musical material) and critical listening (for the characteristics of the
sound itself) to allow the evaluation of sound.
Distractions
It is often difficult for the recordist to keep from being distracted. Maintaining
focus on the purpose and intent of the particular listening experience is
very important. Common distractions are becoming preoccupied with the
music, being drawn to sounds and sound qualities other than those under
evaluation, and being curious about how a sound quality was created (as
opposed to the character of the sound).
Most of us are drawn to a career in recording because of a love of music.
When working on a recording, we can lose our focus by becoming engaged
with the musicality of the material. This focus is similar to listening for enter-
tainment. However, there is a time and place to listen for entertainment.
Most often recordists listen to qualities that are more precise and exacting.
Even when listening within musical contexts, working directly with musi-
cal materials, and thinking about the musicality of the recording, the audio
professional will be working at a level of perspective that is far removed
from the passive music listening experience enjoyed by most people.
While focused on listening to the characteristics of one element of sound,
the sound qualities of another element can draw the listener’s attention. It
is very important that the listener remain mindful of the purpose of the lis-
tening experience. For example, if the listening activity is intended to deter-
mine the musical balance of the snare drum against the toms, one should
not allow oneself to get distracted by the sound quality of the piano.
In evaluating sound, audio professionals must remember that they are
seeking to understand the sound that is present. It is possible for the lis-
tener to become distracted from listening by their own knowledge of the
recording process or by their wanting to learn more about the recording
process. At times people are drawn to thinking about how sound qualities
were created—equipment, recording techniques, etc. Bringing production
concerns into the process of evaluating sound is counterproductive, unless
the recordist is specifically trying to identify equipment choices and production
techniques, but this is a very different matter.
Listening sessions should have a clearly defined function. If the recordist is
listening to determine equipment that may have been used in a recording,
then that is the purpose of the session. If the recordist is listening to under-
stand the sound quality of a certain environmental characteristic, then they
should be listening to the various components of that sound and not be
concerned about identifying the manufacturer or model number or the set-
tings of the device that created the environment.
Personal Development for Listening
and Sound Evaluation
The skills and thought processes required for listening and sound evalu-
ation must be learned. The development of any skill requires regular,
focused, and attentive practice. Patience is required to work through the
many repetitions that will be needed to master all of the skills necessary to
accurately evaluate sound. Each individual will develop at a separate pace,
as with any other learning.
Memory Development
The recordist will evaluate sound more quickly and accurately with the
development of their auditory memory. This will often be accomplished
through their ability to recognize patterns in the various aspects of sound.
The listener must be conscious of the memory of the sound event, and they
must seek to develop their memory to sustain an impression of the sound
long enough to describe, annotate, or graph certain characteristics about
the sound event.
Auditory memory can be developed. As one learns what to listen for, and
as one understands more about sound and how it is used, the listener’s
ability to remember material increases proportionally. This is similar to the
process of learning to perform pieces of music through listening to record-
ings of performances and mimicking the performances. With repetition, this
seemingly impossible task becomes a skill that is much easier to perform.
Listeners often remember more than their confidence allows them to recognize.
The listener must learn to explore their memory and immediately
check their evaluations to confirm the information.
The human mind seeks to organize objects into patterns. Sound events
have states or levels of activity of their component parts that will often tend
to fall into an organized pattern. The listener must become sensitive to the
possibility of patterns forming in all aspects of the sound event, to allow
greater ease in the process of evaluating sound. Recognizing patterns will
assist in understanding sound and sequences of sounds, and will make
remembering them more possible.
Developing memory is very possible and very important. Considering that
sound takes place over time and can exist only through atmospheric changes
over time, it should be understood that sound is a memory. Sound is an
experience that is understood backwards in time. Sound is perceived after it
is past, using memory. Sound does not happen now (at a specific moment)
but rather it happened then. It can start or stop now, but it exists over a
stretch of time (duration).
The reader is encouraged to work through the exercise at the end of this
chapter and to return to that exercise regularly during the course of their
work in listening skill development.
Success and Improvement
With increased experience in evaluating sound comes increased skill and
accuracy. The act of listening and the process of evaluating sound can be
learned and become greatly refined.
The reader will continue to become more accurate and consistent in evalu-
ating sound the more they practice the skills and follow the exercises in
the following chapters. The development of these skills must be viewed as
a long-term undertaking. Some of the skills might seem difficult, or impossible,
during the first attempts. The reader must remember their previous
experiences might not have prepared them for certain tasks. The skills are,
however, very obtainable. Further, the skills are desirable, as the individual
will function at a much higher level of proficiency in the audio industry after
they have obtained these evaluation and listening skills.
The mastery of the skills of sound evaluation is a lifelong process, one that
should be consistently practiced and itself evaluated. New controls of sound
are continually being developed by the audio industry. These new controls
create new challenges to the listening abilities of those in the audio industry.
Discovering Sound
Things are present in recorded music that are subtle and difficult to hear.
Most people have never really experienced a good number of these subtle
dimensions of recordings. When something has never been experienced or
perceived, one does not know it exists. It is possible for people to simply
not hear some aspect of sound, simply because they do not have an aware-
ness of or sensitivity to that dimension. Once that awareness and sensitiv-
ity is developed, those sounds are heard as easily as any other.
In Personal Knowledge, Michael Polanyi conveys the experience of a medical
student attending a course in the X-ray diagnosis of pulmonary diseases.
The student watches dark shadows on a fluorescent screen against a
patient’s chest while listening to the radiologist describe the significance of
those shadows in detailed and specific terms. At first the student is puzzled
and can only see the shadows of the heart and ribs, with some spidery
blotches between them. The student does not see what is being discussed.
It appears to be a figment of the radiologist’s imagination. As a few weeks
progress, and the student continues to look carefully at the X-rays of new
cases and listen to the radiologist, a tentative understanding begins to
dawn on the student. Gradually the student begins to forget about the ribs
and the heart, and starts to see the lungs. With perseverance in maintaining
intellectual involvement, the student ultimately perceives many significant
details, and a rich panorama is revealed. The student has entered a new
world. The student may still see only a fraction of what the seasoned radiologist
sees, but the pictures now make definite sense, as do most of the
comments made by the instructor.
Many readers will likely discover a new world of sound. Dimensions of
sound exist that are out of normal listening experience. We are not aware
of those sounds until we learn what they are, and learn to bring the focus
of our attention to those elements. Only then can we discover them and
begin to understand them.
We have learned to focus our attention on certain aspects of sound. In music,
we have learned that pitch relationships will give us the most important
information. In speech, we know that the sound qualities of words make
up language, and the sound qualities of the speaker will inform us who is
talking. We know dynamics will simply enhance the message of these two
communications, and we listen to them in that way. We have been taught
that where a musical instrument is playing is not important (and therefore
not worth the effort of recognizing the sound characteristics of location and
environment), but what pitches they are playing is important (and worthy
of attention).
The reader will now be asked to perform listening exercises and to evaluate
sound in ways that work against these learned (and perhaps natural) listen-
ing tendencies. This requires conscious effort, focused attention, patience,
and diligence. With the knowledge that the listener is working against natural
tendencies, it will make sense that certain things are difficult. This does
not mean they are impossible; many people accomplish them daily. Nor
does it mean that this way of listening should not take place. This way of lis-
tening is necessary to evaluate and understand many aspects of recorded
sound that are simply not normally at the center of one’s attention. As we
know, the audio professional needs to listen in ways and for things that are
not part of a layperson’s normal listening experiences.
The student took the leap of faith that is necessary in learning. The student
believed in the radiologist and continued to try to understand, initially
perceiving the material as an illusion, not really present, a figment of
the radiologist’s imagination. The student reached a moment of revelation
when suddenly an image was perceived. It was always there. The student
was now able to see it because of increased sensitivity to the possibility of
its existence and an understanding of what that existence might be.
If the reader can commit to a similar leap of faith, they may be rewarded
with the discovery of a remarkable new world of sound.
Summary
Understanding sound must begin with perceiving sound. This requires
active attention, and sufficient knowledge and experience to know what to
listen for. One can then recognize what is happening in the sound or rec-
ognize the nature of the sound. This recognition can lead to understanding.
What is understood can be communicated, given a vocabulary to exchange
meaningful information that is based on a common experience.
A system for evaluating sound has been devised and is presented in fol-
lowing chapters. It will provide a means for evaluating sound in its many
forms and uses, and will provide a vocabulary that can communicate mean-
ingful information about sound. The audio professional needs to evaluate
sound for its aesthetic and artistic elements and its perceived parameters,
as they exist in critical listening and analytical listening applications and at
all levels of perspective. The system for evaluating sound addresses these
concerns, and more.
Exercises
The following exercise should be practiced until you are comfortable with the
material covered.
Exercise 4-1
Musical Memory Development Exercise
1. Select a recording of a song you know reasonably well and prepare a time
line with measures numbered, up to perhaps 100.
2. Before listening to the recording, sit quietly and try to remember as much
detail of the song as you can.
3. Now, write down the song’s meter. In your mind, listen to the piece and
write down where the major sections begin and end. If you cannot come
up with those divisions easily, you might well be able to deduce that in-
formation by thinking about the patterns of phrases in the introduction,
verses, choruses, etc. Write down as much information as you can.
4. Think carefully about what you wrote and identify aspects you are not certain
about—things that need to be determined when you listen to the recording.
5. Now you can listen to the recording, but listen intently for the informa-
tion you have determined you need. Do not follow your graph. Listen with
your eyes closed. Listen to remember what you hear. Do not write while
you are listening and do not correct your graph while you are listening.
6. When the song has stopped, write down what you heard in your one
listening and correct what you previously wrote. Then repeat steps 4 and
5 until you have created a time line and structure of the song—in as few
listening sessions as possible. Check your information one last time while
following your graph. All of the information you wrote should be checked
for accuracy; make corrections to your graph.
7. Do not get discouraged. Keep trying. If overwhelmed, take a break but
return to the exercise in short order.
8. Select another piece of music you know more thoroughly and perform the
exercise again.
This exercise can be performed whenever a time line needs to be created.
If faced with a new song, listen intently to the song once immediately after
sketching a time line. Remember not to write while listening. Listen when it
is time to listen. Write what you have recognized and remembered when the
music has stopped.
People remember more than they believe they do. If you will trust your memory
and use it, your memory will develop. Your confidence will grow as well.
This exercise should be continually modified to incorporate any sound element
you need to evaluate. For example, stereo location could replace structure. The
purpose of this exercise is to improve your memory for the perceived parameters
and the aesthetic and artistic elements of sound—any of them and all of them.
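The time line built in Exercise 4-1 can also be kept as a small data record, which makes it easy to see what still needs to be confirmed on the next listening. The Python sketch below is one possible way to do this and rests on assumptions made for illustration: the names Section and TimeLine, the confirmed flag, and the sample measure numbers are all hypothetical and do not come from the exercise itself.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    name: str
    start: int               # first measure of the section (1-based)
    end: int                 # last measure of the section (inclusive)
    confirmed: bool = False  # True once checked against the recording


@dataclass
class TimeLine:
    meter: str
    total_measures: int
    sections: list = field(default_factory=list)

    def add_section(self, name, start, end, confirmed=False):
        """Record a section remembered (or heard) between two measures."""
        if not (1 <= start <= end <= self.total_measures):
            raise ValueError(f"invalid measure range: {start}-{end}")
        self.sections.append(Section(name, start, end, confirmed))
        self.sections.sort(key=lambda s: s.start)

    def uncertain(self):
        """Names of sections that still need to be checked by listening."""
        return [s.name for s in self.sections if not s.confirmed]

    def gaps(self):
        """Measure ranges (start, end) not yet assigned to any section."""
        found = []
        next_measure = 1
        for s in self.sections:  # sections are kept sorted by start
            if s.start > next_measure:
                found.append((next_measure, s.start - 1))
            next_measure = max(next_measure, s.end + 1)
        if next_measure <= self.total_measures:
            found.append((next_measure, self.total_measures))
        return found


if __name__ == "__main__":
    tl = TimeLine(meter="4/4", total_measures=32)
    tl.add_section("introduction", 1, 4, confirmed=True)
    tl.add_section("verse 1", 5, 12)
    tl.add_section("chorus 1", 17, 24)
    print("Listen for:", tl.uncertain())
    print("Unaccounted measures:", tl.gaps())
```

The lists returned by uncertain() and gaps() correspond to step 4 of the exercise: the things to listen for intently on the next pass, written down only after the music has stopped.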
5 A System for Evaluating Sound
The many different positions of the audio industry and their unique needs
for sound evaluation create a need for a sound evaluation system that can
be readily transferred to a variety of contexts. It must easily yield meaningful
and significant information to people of diverse backgrounds and
job functions. The method must transfer between musical contexts and
abstract, critical listening applications.
The aspects of sound evaluated and shaped by people in audio cannot be
described using our current vocabulary. No way to accurately talk about
sound is available.
A system for evaluating sound will be presented over the next six chapters.
The system can be adapted to be useful to all people in the audio indus-
try, and also for analyzing the musical qualities of recordings. The system
will establish guidelines for talking about sound by describing the physical
dimensions of sound, as they have been perceived. Through this, these
descriptions can be objective and accurate.
Outside scientific measurement and music notation systems, sound has no
written form. Being able to write down the qualities of sound will greatly
assist people in audio in evaluating sound, describing sound, studying
recording techniques and recordings, discussing sound with others, and
keeping records. The system incorporates ways of graphing and notating
sound’s perceived parameters and the artistic elements. This will greatly
aid the listener in achieving these goals, and in understanding, recogniz-
ing, or evaluating sound.
System Overview
The system for sound evaluations was created to supply objective infor-
mation on the listening experience. It seeks to give the listener the tools
to define what is being heard. This will lead to a better understanding of
the unique qualities of recorded/reproduced sound, better communication
between people discussing sound, and enhanced control of the artistic
aspects of making music recordings.
The elements of sound are all evaluated independently, using a variety of
techniques. These isolated evaluations may then be related to evaluations
of other elements, to observe how they interact. The standard X-Y graph
used in so many different scientific contexts has been adapted for many
of these evaluations, especially those that take place against time. Other
evaluations use unique diagrams such as the sound stage.
The system seeks to describe and define the activities of the five physical
dimensions of sound, as they are used in recording production/reproduc-
tion. The system examines the changes of state and value of those dimen-
sions of sound, as they appear in perception and in artistic expression.
Table 5-1 outlines how the various evaluations of the system relate to the
aesthetic and artistic elements or perceived parameters of sound.
Table 5-1 Evaluation Techniques for the Elements of Sound

Element of Sound      Evaluation Graphs and Processes
------------------    ---------------------------------------------------------------
Time                  Time line of song; with structure, phrase, and text indications
                      Sound sources against time line
Pitch                 Melodic contour
                      Pitch area
                      Pitch density
Dynamics              Dynamic contour
                      Musical balance
Sound quality         Performance intensity
                      Sound quality evaluation
                      Timbral balance
Spatial properties    Distance location
                      Stereo location and surround location
                      Sound stage
                      Perceived performance environment
                      Environmental characteristics of sources

The system starts with basic skills and builds on them. Skill in recognizing
musical materials and building a time line leads to the development of
skills in pitch-related perception and dynamics. Interspersed throughout
the system are exercises to build skills preparing the reader to undertake
sound-quality evaluations. Finally, skills in recognizing spatial properties
and environmental characteristics are addressed.
A complete listing of exercises appears after the table of contents. They
are arranged in what is usually the most effective order for skill develop-
ment, and the reader is encouraged to work through the exercises in the
presented sequence. The exercises appear at the end of the chapters that
contain explanations of the material. Some skills will take longer to learn
than others, and the reader should be careful in assessing their progress.
The assistance of someone who is already a skilled listener or teacher will
at times be valuable. Any one exercise should be learned well before pro-
gressing too far ahead, though mastery of skills is not necessary before
moving ahead. Indeed, mastery of some of these skills might take years,
and the reader is encouraged to return to those exercises throughout an
extended period of time. The first exercises of Part One should be reviewed
before continuing with the exercises of Part Two.
Notating or writing down the characteristics of a sound can greatly assist
the listener in understanding the sound. These notations (written repre-
sentations of the sound) can also be used for communicating with others
about the sound, for evaluating the sound, remembering the characteris-
tics of the sound, and even for recreating the sound. While the reader will
not seek to perform a written evaluation of all sounds during their career,
performing a detailed evaluation of a sound will provide information that
might otherwise go unobserved, especially early in one’s development.
Notating sound material in graph form will be used for finely developing
the reader’s perception and sound evaluation skills. It will also provide the
reader with a useful resource to assist in evaluating sound.
Creating the graphs of the following chapters will force readers to place
their focus on a specific element. This will bring them to discover and recognize
the characteristics of the element in greater and greater detail as
their skill develops. The reader will develop listening skills much more
quickly by taking the time and effort to create the graphs than would hap-
pen otherwise. Further, the graphs will aid the listener in comparing one
sound or mix to another, and by such comparisons learn more about the
artists, producers/engineers and the recordings. The graphs also allow the
reader to “record” their observations for further study in the future; it will
be interesting for the reader to observe how these graphs gain in accuracy
after a short period of focused attention.
The system for evaluating sound has much in common with traditional
forms of music-related ear training. Some of the skills learned by musicians
will transfer to this process. An ability to take traditional music dictation
will be beneficial to learning the process of evaluating sound, but
is not required. Traditional listening skills emphasize pitch relationships in
musical contexts. This comprises a very small part of our concerns about
sound in audio. The skills of making time judgments and an awareness of
activities in pitch, dynamics, and timbre will need to be developed much
further than traditional approaches allow.
Many musicians start their studies by mimicking or repeating music on
recordings. Music is often learned by the person listening to recordings
and trying to play back what was heard. Many people have even learned to
play musical instruments almost solely by listening to recordings. Repeated
listening to the same recording is something many people have previously
done, whether to learn something or for enjoyment. This experience will be
important in the many exercises in developing skill in evaluating sound.
It is important that all information extracted from sound evaluations be objec-
tive. Audio professionals need to communicate about the characteristics of
sound. Communications about how the sound makes them feel or whether
or not they like the sound may come from clients or nonindustry people and
need to be interpreted into the audio professional’s work activities, but the
information is not relevant or valuable in the evaluation of sound.
The reader must learn never to use subjective impressions or descriptions
of the sound event in the evaluation process. Such impressions are unique
to each individual and cannot be accurately communicated between indi-
viduals (they mean something different to all people). They do not con-
tribute to an understanding and recognition of the characteristics of the
sound event. Subjective impressions or descriptions do not contribute per-
tinent, meaningful information about the sound, and will not contribute
to understanding the characteristics of sound. They have no place in the
sound evaluation process.
Sound Evaluation Sequence
The sound evaluation process will follow a sequence of activities:
1. Perceiving an element of sound or an activity of material to be analyzed
at a defined perspective,
2. Identifying the material,
3. Defining the material, and
4. Observing the characteristics present in the material, and/or between
the material and its musical context.
The evaluation of sound begins with perceiving the sound event or sound
object that comprises the musical material.
A sound event/object can be any sound, aspect of sound, or sequence(s)
of sounds that can be recognized as forming a single unit. The event/object
may be at any hierarchical level of musical context or of sound quality
analysis—from a distant perspective (such as the shape of the overall piece
of music) to a close perspective with a focus on some nuance (such as
a small change in the spectral content of timbre). The sound event/object
must, however, have a specific and defined perspective. Each sound evaluation
will have a single focus on a specific perspective. This perspective
must be well defined in the listener’s mind.
Next, the listener must recognize and, in some way, identify the sound
object/event. This act is necessary to differentiate it from the material that
precedes it, follows it, or is occurring simultaneously. The sound event/
object will have dimensions within which it exists, and through which it
is defined. It will have points in time where it begins and ends. It will be
perceived within the musical/communications context or in isolation. The
sound event/object will be defined through an understanding of the unique
states and activities of the components of sound (artistic elements or per-
ceived parameters) that comprise the sound.
Whatever the content of the sound object/event, the listener must perceive
it as a single unit. This will be accomplished through identifying and recog-
nizing the boundaries within which the sound event exists.
Third, the listener defines the sound event/object. This definition process
will seek to compile information on the activities or unique qualities of the
materials of the sound event/object that make it separate and distinct from
the materials that precede it, follow it, and/or occur simultaneously
with it. Defining this activity (calculating what is happening to or in
the various artistic elements or perceived parameters) is often the most
difficult task of sound evaluation.
Any number of repeated hearings of the sound event/object will be needed
to define all of the information it contains. This skill will need to be developed
over time and with practice. As the listener’s evaluation and listening
skills improve, the number of hearings required will reduce significantly.
The final step is to make sense of the information accumulated
in defining the sound event/object. In comparing the information of
its components, meaningful observations can be made. The listener will
compare the event with materials recently experienced and with materials
that are well known to the listener. These other sound events/objects are
evaluated for their relationship to the defined sound event/object. The
listener will be looking in these other known sound sources for states of
activity and other attributes that are the same as, similar to, or dissimilar
from those that defined the sound event/object, to assist in making
pertinent observations about it. The process is complete when the listener
has compiled the information necessary to make the needed observations of
the event and its context.
The sound evaluation system is a clear set of routines. It is directly related
to the listening experience. The routines follow the order:
1. Identifying perspective, with suitable alteration of the listener’s sense
of focus,
2. Defining the boundaries of the sound event or sound object,
3. Gathering detailed information on the material and activity, and
4. Making observations from the compiled information.
The perception of the individual sound event/object occurs at a specific
perspective. The listener must consciously decide the level of detail at which
the sound event/object will be evaluated—the perspective. The sound
event/object and its component parts can then be identified and isolated
from all other aspects of sound. The perspective at which the listener has
identified the sound event/object becomes the reference level of the hierarchy,
or framework, for the individual X-Y graph (discussed below).
Next, the sound event/object will be defined by its boundaries of levels of
elements and speed of activity. It is most often defined by (1) when it exists (its
time line), (2) its most significant sound elements (providing the event with
its unique characteristics—the levels of the elements of sound that comprise
the sound event), (3) the highest and lowest levels (boundaries) within those
sound elements (the extremes of levels of activity to be mapped against
the time line, or that do not change over time—levels), and (4) the relation-
ships of how the sound event’s characteristics change over time (amount of
change and rate of change of levels mapped against the time line).
Defining the sound event will include:
1. Determining the time line: beginning and ending points in time of the
sound event/object, and identifying the suitable time increment to allow
the activity of the components of sound that characterize the event to
be clearly presented.
2. Determining which of the elements of sound hold the significant information
that characterizes the sound event/object. These are the components
that supply the information that defines its unique characteristics.
These are the components that must be thoroughly evaluated to
understand the content of the sound event/object.
3. Determining the boundaries of the components/elements of sound
being evaluated. These will be maximum and minimum values or lev-
els found in each of the components of sound.
4. Determining the speed at which the fastest change takes place. This will
assist in defining the most suitable smallest time increment of the time
line.
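
The four determinations listed above can be collected into a small record before any graphing begins. The following is only a minimal sketch under stated assumptions: the class names, field names, and the numeric level scale are hypothetical and are not part of the system described in this chapter. It treats the fastest change as the smallest time increment and counts how many increments the time line would need.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ComponentBoundaries:
    """Maximum and minimum levels found in one component of sound (hypothetical structure)."""
    name: str
    minimum: float
    maximum: float


@dataclass
class SoundEventDefinition:
    """The four determinations made when defining a sound event (illustrative only)."""
    start: float            # beginning point of the event on the time line
    end: float              # ending point of the event on the time line
    time_unit: str          # "measures" or "seconds"
    fastest_change: float   # duration of the fastest change, in time_unit
    components: list = field(default_factory=list)  # ComponentBoundaries entries

    def time_increment(self) -> float:
        # Assumption: the time line is divided no more coarsely than the fastest change.
        return self.fastest_change

    def time_line_divisions(self) -> int:
        # Rounding first guards against floating-point noise before taking the ceiling.
        return math.ceil(round((self.end - self.start) / self.time_increment(), 9))


# Invented example: a 16-measure event whose fastest change lasts half a measure.
event = SoundEventDefinition(
    start=1, end=17, time_unit="measures", fastest_change=0.5,
    components=[ComponentBoundaries("dynamics", minimum=2.0, maximum=8.0)],
)
print(event.time_line_divisions())
```

A record of this kind simply keeps the time line, the significant components, their boundaries, and the fastest change together so that the axes of the graph can be laid out from them.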
The third step compiles detailed information on the material and activity. It
will add detail to the above step. A listing of the sources (items to be ana-
lyzed) will draw the listener into the evaluation process quickly and directly,
and it should become one of the very first steps in collecting detailed information
for the evaluation of sound.
The components of sound will be evaluated to determine their precise lev-
els, often against the time line. This is plotting or notating the activity of
the component parts of the sound. The components of the sound will be
closely evaluated, with as much detail as possible, to determine their pre-
cise levels throughout.
Most often the component being evaluated changes over time and must
have its levels related to a time line. This information will be plotted on a
two-dimensional graph (discussed below); this allows the information to
be written/notated. This process involves all of the skills of taking music
dictation. In fact, this process is a type of music dictation for some new and
some previously ignored aspects of sound.
A written form of the sound will be created through the process of fol-
lowing Steps 1 through 3, above. Graphing the sound event/object makes
it much easier to compile the information that will allow the listener to
recognize and understand the characteristics of sound. When sound has
been notated, it is possible for the listener to check previous observations
for accuracy, to focus on particular portions of the sound event/object, and
for the listener to be able to continue examining information on the sound
out of real time.
The final activity in evaluating sound is examining the compiled information
to make observations about the sound event. The type of observations
made will vary considerably depending on context—such as either a music
mix or a microphone technique.
For example, if the observations are being made concerning the function-
ing of a particular piece of audio equipment, the evaluations will center
on the aspects of sound that the particular piece of equipment acts upon.
Observations might be focused around the effectiveness of the piece of
equipment, the integrity of the audio signal, any differences between the
input and the output signals, and how the device acts on the various dimen-
sions of sound.
The listener/evaluator will formulate questions and will use the informa-
tion compiled in the steps above to answer those questions. Which ques-
tions to ask will be determined by their appropriateness to the purpose
of the evaluation. The answers acquired through this process will be ones
of substance and will be directly related to the sound event/object. The
answers produced by this process will not be subjective impressions
or opinions.
The observations made in this final evaluation process need not be profound
to be significant. Often the simplest, most obvious observations offer
the most significant and important information concerning a sound event.
Graphing the States and Activity of Sound Components
The traditional two-dimensional line graph is quickly understood and easily
designed and used by most people. Therefore, it has been selected as the
basis for notating the various artistic elements and perceived parameters
that create sound events and sound objects.
Figure 5-1 X-Y line graph, with time on the horizontal (X) axis and a vertical (Y) axis.
The line graph will nearly always be used with time as the horizontal (X)
axis. In this way, values of states (levels) of the component parts of the
sound can be plotted with respect to time. This allows the sound to be
observed from beginning to end at a glance, out of real time.
Time Line
The length of the material that can be plotted on a single graph is determined
by the divisions of the time axis, or time line. Events of great length
(and little detail) may be plotted on a single graph, and events of short dura-
tion (and great detail) may be plotted on a single graph. A balance must be
found in selecting the appropriate time increment for the time line. For the
graph to be of greatest benefit, the sound should be easily observed in its
totality (from beginning to end), and the graph should have sufficient detail
to be of use in observing the qualities of the material.
Time increments will be selected for the X-axis that are appropriate for the
sound. Time increments will take one of two forms: (1) units based on the
second (milliseconds, tenths of seconds, groups of seconds, etc.), and (2)
units based on the metric grid (individual or subdivisions of pulses, mea-
sures, or groups of measures).
If the sound material is in a musical context, the metric grid will nearly
always be the appropriate unit for the time axis. Remember, we judge time
increments most accurately with the recurring pulse of the metric grid act-
ing as a reference.
In general, when the sound evaluation utilizes the metric grid, a process
of analytical listening is occurring. Critical listening evaluations most often
use real-time increments and not the metric grid. The difference is one of
context and focus.
If the sound material being evaluated is not in a musical context, increments
based on the second must be used. It will be common to use increments
based on the second in the evaluation of timbre relationships (including
sound quality and environmental characteristics). Envisioning a pulse of
MM:60 (or an integer or a multiple thereof) will provide some reference to
the listener in making time judgments without a metric grid, but this activity
may not always be appropriate. It may distort the listener’s perception
of the material, and the reference may be unstable, as the listener’s attention
will rightly be focused elsewhere.
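
The relationship between the two kinds of time increment is simple arithmetic: a metronome marking of 60 gives one pulse per second, so a pulse at any tempo lasts 60 divided by the tempo, in seconds. The sketch below is offered only as an illustration and assumes a constant tempo and meter; the function names are hypothetical.

```python
def seconds_per_pulse(tempo_mm: float) -> float:
    """Duration of one pulse in seconds at a given metronome marking (pulses per minute)."""
    return 60.0 / tempo_mm


def measure_start_in_seconds(measure: int, pulses_per_measure: int, tempo_mm: float) -> float:
    """Clock time at which a measure begins, counting measures from 1.

    Assumes the tempo and the number of pulses per measure never change.
    """
    return (measure - 1) * pulses_per_measure * seconds_per_pulse(tempo_mm)


# At MM:60 each pulse lasts one second; at MM:120 each pulse lasts half a second.
print(seconds_per_pulse(60))
print(measure_start_in_seconds(measure=5, pulses_per_measure=4, tempo_mm=120))
```

A conversion of this kind can help when a time line built on the metric grid must be compared with one built on the second.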
A stopwatch might assist in evaluating larger time units (to the tenths of
seconds). The ability to judge time relationships can be developed. It is rec-
ommended the reader turn to the time exercises at the end of this chapter.
They will allow the reader to refine their skills in accurately making time
judgments, by learning to recognize the unique sound qualities (timbres)
of various time units and tempo of clock time.
With practice, the listener will develop the ability to make accurate time
judgments of a few milliseconds within the context of known, recognizable
sound sources and materials. This skill will be invaluable in many of the
advanced sound evaluation tasks regularly performed by
audio professionals.
The time unit used in any line graph will be that which is
most appropriate for the sound event or sound object.
The time increment selected must allow the graph to
depict the example accurately. The smallest perceivable
change in the components of sound being analyzed must
be readily apparent, and yet as much material as possible
should be contained on the single graph.
Vertical (Y) Axis
The components of sound to be plotted and the boundaries of levels and
activities of those components are next determined. In the initial two stag-
es of the sound evaluation process, the listener determines those compo-
nents of the sound event that provide it with its unique character. These
components will be the ones most appropriately evaluated by plotting their
activity on the line graph.
The component of the sound event to be evaluated will be placed on the
vertical (Y) axis of the line graph. The second step of the sound evaluation
process (above) is now followed. The listener will now determine the
maximum and minimum levels reached in the sound event, in each of the
components of sound to be graphed. These maximum and minimum levels
will be slightly exceeded when establishing the upper and lower boundaries
of the Y-axis.
Listen . . . to tracks 26-33 for exercise in developing skills in judging (recognizing) small time units.
Exceeding these perceived boundaries allows for errors that may have
been made during initial judgments of the boundaries and allows for
greater visual clarity of the graph. Boundaries should be exceeded by 5 to
15 percent, depending on context of the material and the space available
on the line graph.
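
The margin described above can be worked out numerically. This is a minimal sketch, assuming the 5 to 15 percent is taken as a fraction of the perceived range between the minimum and maximum levels (the text does not specify what the percentage is measured against); the function name and the sample levels are hypothetical.

```python
def axis_boundaries(levels, margin=0.10):
    """Return (lower, upper) Y-axis boundaries that slightly exceed the perceived extremes.

    margin is a fraction of the perceived range; the text suggests 0.05 to 0.15.
    """
    lowest, highest = min(levels), max(levels)
    padding = (highest - lowest) * margin
    return lowest - padding, highest + padding


# Invented levels on an arbitrary numeric scale.
perceived_levels = [20, 40, 60, 100]
lower, upper = axis_boundaries(perceived_levels, margin=0.10)
print(lower, upper)
```

A larger margin leaves more room for correcting an early misjudgment of the extremes, at the cost of vertical space on the graph.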
Next, the minimum changes of activity and levels are determined. Through
Step 3 of the sound evaluation sequence described above, the listener will
determine the smallest increment of level change for the components of
the sound event or sound object.
This smallest increment of levels will serve as the reference in determining
the correct division of the Y-axis. It is necessary for the Y-axis to be divided
to allow the smallest value of the component of sound to be clearly represented,
just as the X (time) axis of the graph was divided previously so the
fastest change of level would be clear.
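
Dividing the vertical axis by the smallest increment can be sketched in the same spirit. The example below assumes levels on an arbitrary numeric scale; the function names are hypothetical, and the result is only the count and position of the divisions, not a drawn graph.

```python
import math


def axis_divisions(lower: float, upper: float, smallest_increment: float) -> int:
    """Number of divisions needed so the smallest level change occupies one division."""
    # Rounding first guards against floating-point noise before taking the ceiling.
    return math.ceil(round((upper - lower) / smallest_increment, 9))


def division_marks(lower: float, upper: float, smallest_increment: float) -> list:
    """Level value at each division mark, starting from the lower boundary."""
    count = axis_divisions(lower, upper, smallest_increment)
    return [lower + i * smallest_increment for i in range(count + 1)]


print(axis_divisions(0, 12, 0.5))
print(division_marks(0, 2, 0.5))
```

If the count of divisions is too large for the space available, that is a sign the balance between clarity and space discussed in this section needs to be reconsidered, perhaps by graphing a shorter event.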
The division of the vertical axis must allow the graph to depict the material
accurately. The smallest significant change in the components of sound
being evaluated must be immediately visible to the reader of the graph,
and yet the vertical axis must not occupy so much space as to distort the
material. The reader of the graph must be able to identify the overall shape
of the activity, as well as the small details of the activity of the component
the graph represents. A balance between limitations of space and clarity of
presentation of the materials will always be sought.
Multitiered Graphs
It is not always desirable for each component of the sound event/object to
have a separate line graph. Many times several components can be included
on the same graph and plotted against the same time line. Multitier
graphs allow several components to be represented against the same time
line, and provide the advantage that all characteristics can be more eas-
ily related to one another. This will lead to an easier understanding of the
sound’s qualities.
In multitier graphs the vertical (Y) axis of the line graph is divided into segments.
Each segment is dedicated to a different component of sound. Each
segment will have its own boundaries and increments.
Plotting a number of components of sound against the same time line not
only makes efficient use of space on the graph, it also allows a number of
the characteristics of the sound (perhaps the entire sound) to be viewed
simultaneously.
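
The structure of a multitier graph, with each tier holding its own boundaries and increments while all tiers share one time line, can be modeled as data. This is only a sketch under stated assumptions: the class names, the numeric scales for stereo and distance location, and the plotted values are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """One segment of the vertical axis, dedicated to a single component of sound."""
    component: str
    lower: float       # lower boundary of this tier
    upper: float       # upper boundary of this tier
    increment: float   # smallest level increment shown on this tier
    points: dict = field(default_factory=dict)  # source name -> list of (time, level)

    def plot(self, source: str, time: float, level: float) -> None:
        if not self.lower <= level <= self.upper:
            raise ValueError(f"{level} is outside the boundaries of tier '{self.component}'")
        self.points.setdefault(source, []).append((time, level))


@dataclass
class MultitierGraph:
    """Several tiers plotted against the same time line."""
    time_start: float
    time_end: float
    time_unit: str
    tiers: list = field(default_factory=list)

    def states_at(self, source: str, time: float) -> dict:
        """Levels of one source in every tier at one point on the shared time line."""
        result = {}
        for tier in self.tiers:
            for plotted_time, level in tier.points.get(source, []):
                if plotted_time == time:
                    result[tier.component] = level
        return result


# Invented scales: stereo location from -1 (left) to 1 (right), distance from 0 to 10.
stereo = Tier("stereo location", lower=-1.0, upper=1.0, increment=0.25)
distance = Tier("distance location", lower=0.0, upper=10.0, increment=1.0)
graph = MultitierGraph(time_start=1, time_end=4, time_unit="measures", tiers=[stereo, distance])
stereo.plot("voice", time=1, level=0.0)
distance.plot("voice", time=1, level=3.0)
print(graph.states_at("voice", 1))
```

Looking up one source across all tiers at a single point in time mirrors the advantage described above: the characteristics of the sound can be related to one another directly.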
The person reading the graph will be able to extract information more
quickly from a multitier graph than from a series of individual graphs. In
addition, when several components of sound appear against the same time
line, the states and activities of the various components of the sound event/
object can be compared in ways that would be difficult (if not impossible)
were these components separated.
Specific multitier graphs will be used for certain evaluations in later chapters.
In those cases, the graphs will always appear in a predetermined format,
and greatly assist evaluations of components such as sound quality, musical
balance versus performance intensity, and environmental characteristics.
Figure 5-2 Multitier graph with multiple sound sources. [Three tiers plotted against a shared time line (in measures): Musical Balance, Stereo Location (L to R), and Distance Location (Proximity, Near, Far). Key: voice, organ, guitar, bass.]
Graphing Multiple Sound Sources
Multiple sound sources within the same component of the sound will also
need to be graphed. It is quite common for more than one aspect of a com-
ponent to be taking place at any one time (such as the sound of harmonics
and overtones within the spectrum of a sound). This activity would require
a separate tier of a multitier graph for each sound source in each com-
ponent of the sound event/object being evaluated. The line graph would
quickly become large and unclear.
As long as the segment of the graph can remain clear, it is possible for any
number of sound sources to appear on any graph. When more than one
sound source appears against the same two axes, the activities of each
sound source must be clearly differentiated from the others. Sound sourc-
es may be differentiated in a number of ways. Each of these ways may
be useful depending on the situation—what is available to the reader, the
nature of the sound, or the context of the material.
The lines that denote each sound source may be labeled. The labeling of
lines is accomplished by placing a number or the name of each sound
source in or near the appropriate line on the graph. This type of differentia-
tion is useful for graphs that contain relatively few sound sources.
Providing a different line configuration for each sound source is sometimes
a suitable way of differentiating a number of sources on the same tier of
an X-Y graph. Combinations of dots and dashes, or the insertion of geometric
shapes into the source-lines may be useful for differentiating sound
sources on the same graph—again for graphs with relatively few sources.
When sound sources are assigned lines of different colors, the graph can
clearly display the largest number of sources. Only the number of easily
recognized colors available then limits the number of sources that can be
placed on the same graph.
The use of different colors has the further advantage of being able to define
groups of sound sources by assigning a color to the group and assigning
a different line configuration (combination of dots and dashes) to the
individual sound source.
Using lines of various thicknesses to differentiate sound sources is not an
option. This approach will obscure the information of the graph. Varying
line thickness will cause the sound to visually appear to occupy an area of
the vertical axis. This is a state that is only accurate for a few select compo-
nents of sound.
The use of color is not always feasible, but it is the preferred method of
placing a number of sound sources on the same graph. Using numbered
lines or using varied and distinct line configurations for each sound source
are the next most flexible and clear methods of differentiating sound sources.
Combinations of color and line configurations will produce the most
organized and most useful graphs. Individual sound sources must always
be easily distinguished on line graphs. Readily identifiable lines that have
been precisely defined (by using a key, as described below) will ensure the
clarity and usefulness of the graph.
The same sound sources may be depicted on a number of tiers of a multitier
graph. In this case, care must be taken to define each sound source
and to depict the sound sources in the same way on each tier (either by the
same number, color, or line configuration). This will allow someone reading
the graph to quickly and accurately determine the states and activities
of all of the sound sources (or aspects of the sound sources) over time. A
key of the sound sources plotted should be created to ensure this clarity.
A key lists all sound sources of the example and presents how they are
represented on the line graph (see Figure 5-2). This listing of sound sources
with their designations must be included in each line graph that contains
more than one sound source.
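
A key of this kind is, in effect, a mapping from each sound source to the one designation used for it on every tier. The sketch below is illustrative only: the function names and the line configurations are hypothetical, and the four sources are those named in the key of Figure 5-2.

```python
def build_key(sources, designations):
    """Pair each sound source with its own distinct designation (number, color, or line style)."""
    chosen = list(designations)[:len(sources)]
    if len(chosen) < len(sources) or len(set(chosen)) != len(chosen):
        raise ValueError("each sound source needs its own distinct designation")
    return dict(zip(sources, chosen))


def missing_from_key(key, plotted_sources):
    """Sources that appear on the graph but have no entry in the key."""
    return sorted(set(plotted_sources) - set(key))


# The line configurations here are invented for illustration.
key = build_key(
    ["voice", "organ", "guitar", "bass"],
    ["solid", "dashed", "dotted", "dash-dot"],
)
print(key)
print(missing_from_key(key, ["voice", "bass", "drums"]))
```

Checking the plotted sources against the key is one way to confirm that every line on the graph has been defined and that the same designation is reused on each tier.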
The listing of sound sources is one of the first activities undertaken in the
entire evaluation process. A listing of the sources (elements to be analyzed)
will draw the listener into the evaluation process quickly and directly, and it
should become one of the very first steps in evaluating sound.
Plotting Sources Against a Time Line
Plotting the individual sources against the time line, without concern for
levels and rates of activity of the component parts of a sound, will allow the
listener to compile preliminary information on the material without getting
overwhelmed by detail. This process is also an excellent first step in getting
acquainted with the activity of writing down material that is being heard
(the taking of dictation). It may become a common initial activity each time
the listener undertakes a detailed evaluation of a sound event. A reliable
ability to place sources against a time line (and, of course, correctly identifying
the time line) will be assumed throughout the remainder of the book.
This process will be repeated, at least conceptually, before almost all future
exercises. This is also an excellent exercise for learning to identify all the
sound sources (instruments and vocals) present in a mix—something that
sounds simple and often proves otherwise.
Listing sound sources is an important first step in many evaluations and
will need to be undertaken as a first step toward plotting sources against
a time line. It is important that all individual sources be identified and listed
separately. These individual sources often act independently and were
usually recorded with some degree of separation, giving the recordist an
independent control of the sound that will be evaluated in many ways.
Lists of sound sources should identify all independent vocal parts sepa-
rately. Groups of background vocals presenting one musical idea should be
listed as a single sound source; similarly, groups of stringed instruments
playing one line or musical idea would also be labeled as “strings.” Instru-
ments should be listed by names. When more than one instrument of the
same type is used, the instruments should be numbered either by order
of appearance or by range, with the highest instrument usually the low-
est number. Sounds should not be listed by descriptive terms (lush guitar,
happy flute, etc.). If the listener is at a loss as to what to call a sound, using
terms such as “unknown 1” would be appropriate until the sound is recog-
nized. Performing a sound quality evaluation (even a general one) would
allow the listener to further define the sound as “unknown 1, with long final
decay.” When the listener does not know the names of instruments, or the
sound sources are highly unusual, listing sound sources must be undertaken
with care, and will take effort. For example, it would be a great undertaking
to list all sound sources from “Tomorrow Never Knows” by The Beatles.
Many of the song’s sound sources would need to be described in terms that
addressed the sound source or the sound quality.
In working through many of the exercises and graphs of the following chap-
ters, the reader will frequently (1) create a listing of the sound sources in
the example, and (2) create a time line of the event. By adding a third step
of plotting the listed sound sources against the time line, they will be able
to focus more intently on the material being graphed. The reader should
practice Exercise 5-3 (at the end of the chapter) and become comfortable
with the process. The recordist must be able to quickly recognize sound
sources and focus on the activity of each.
Figure 5-3 provides an example of sound sources plotted against a time line.
The listener will be able to follow the figure while listening to The Beatles’
“She Said She Said.” Whenever a sound source (an instrument or voice) is
sounding, a line is drawn across that half-measure against the time line. As
an additional activity, complete the graph by plotting the presence of the
high hat part against the time line.
Figure 5-3 Sound sources placed against time line, The Beatles: “She Said She Said.”
[Graph: X-axis, measures 1 through 46 with meter signatures (4/4, 2/4, 3/4) and section labels (Intro, Verse 1, Verse 2, Verse 3, Chorus 1); Y-axis, one row per sound source: High Hat, Crash Cymbal, Low Tom, High Tom, Snare Drum, Bass Drum, Hammond Organ, Backup Vocal, Lead Vocal, Lead Guitar, Rhythm Guitar, Bass Guitar.]
Notating Sounds in Snapshots of Time
Some important components of sound might not change over time. While
they are static, their status may well be a very important characteristic of the
sound event/object. These sound components can be evaluated without a
time line and understood as elements that remain unchanged throughout a
defined time period. Sounds are examined for qualities of their component
parts that remain unchanged over the time period, as if examining a snapshot
of the sound’s existence over the time period.
The location of instruments on a sound stage is one such possibility. While
it is very possible for sound sources to change lateral location or distance
location, sound sources often remain in the same location for extended
periods of time (entire verses or choruses, for example). It is also quite
common for some sound sources to remain in the same location through-
out an entire song.
The graphs used for plotting the components can also be used to show the
static, unchanging states. The X-axis can be used to provide a place for each
sound source that is present, instead of a time line. Many sources can then
be placed against the same vertical axis. In addition, the diagrams of the
perceived performance environment can also be used to show placement
of sound sources and the dimensions of the sound stage, when appropri-
ate (as will be discussed in Chapter 9).
Summary
Graphs will often need to be supplemented by verbal descriptions to com-
plete evaluations of sound events or sound objects. Simply graphing a
sound does not complete the evaluation.
The contents of the graphs need to be reviewed and then described through
observations of the content and activity of the materials. In all instances,
the language and concepts used to define and describe the sound must be
completely objective in nature. All descriptions refer to the actual levels,
and any changes of levels, of the components of the sound. The descrip-
tions lead to overall observations about the qualities of the sound itself and
its relationships to the other events/objects and the entire recording.
The listener must never use subjective impressions or descriptions of the
sound in the evaluation. Such impressions are unique to each individu-
al, cannot be accurately communicated between individuals (they mean
something different to all people), and do not contribute to an understand-
ing and recognition of the characteristics of the sound. Subjective impres-
sions or descriptions do not contribute pertinent, meaningful information
about the sound for the audio professional.
The next five chapters will take the reader through the individual artistic
elements and perceived parameters of recordings, working through this
system for evaluating sound. The sequence of activities outlined above,
and the process of notating (graphing) sound, will be core activities of
those chapters.
In order to benefit the most from these chapters, the reader should take
the time to work carefully through the examples and exercises. The many
nuances of sound contained in recordings will gradually become more
apparent, and the reader will gradually acquire an ability to talk about
sound in ways that have clear meaning to others. Further, the reader will
find they are better able to understand their own recordings, and better
able to craft recording projects creatively.
Exercises
The following exercises should be practiced until you are comfortable with the
material covered.
Exercise 5-1
Exercise in Graphing Clock Time
Find a recording of a very simple and slow rhythm. Set up a metronome or a
click track on a DAW you can use to establish a steady pulse.
1. Use a marking of MM:60 to establish a pulse clearly in your thoughts and
set up a timeline.
2. Use that pulse to place the notes of the rhythm against the timeline.
3. Repeat Steps 1 and 2 with more complicated examples, and at MM:120
and MM:240.
4. Keep working with these steps until you begin to remember the tempos of
MM:60, 120 and 240.
This process will help you learn the “sound” of these tempos that are
directly related to clock time. This skill will be important for understanding
time issues in critical listening, and will also allow you to be able to calcu-
late time units for compressors, delays, reverbs and other devices. It might
be helpful to note that many band marches are at MM:120; if a person can
clearly remember one of these pieces they may well be able to quickly and
accurately establish a tempo of MM:120, or a pulse every 0.5 seconds, which
could be very helpful in many situations.
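The link between a metronome marking and clock time in this exercise is simple arithmetic: a marking of MM:x means x pulses per minute, so each pulse lasts 60/x seconds. The short Python sketch below illustrates the conversion for the three tempos of the exercise. The function name is an illustrative choice and does not come from the text.

```python
def pulse_interval_seconds(mm):
    """Return the time between pulses, in seconds, for a metronome marking."""
    if mm <= 0:
        raise ValueError("metronome marking must be positive")
    return 60.0 / mm


if __name__ == "__main__":
    for marking in (60, 120, 240):
        seconds = pulse_interval_seconds(marking)
        print(f"MM:{marking} = one pulse every {seconds:.2f} s ({seconds * 1000:.0f} ms)")
```

The same division can be used to relate a tempo to the time settings of delays and other devices, for example by halving or doubling the pulse interval.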
Exercise 5-2
Time Judgment Exercise
Using a digital delay unit and a recording of a high-pitch drum (such as the
snare drum or high tom tracks on the enclosed CD):
1. Route a nondelayed signal to one loudspeaker and a delayed signal to the
other loudspeaker.
2. Delay the signal and perform many repetitions of the sound while chang-
ing time-increments that are easily recognized. Always move by the same
time increment, such as 100 ms, during any given listening session.
3. As confidence is obtained in being able to accurately judge certain time
units, move to other time units, both smaller and larger, and repeat the
sequence in Step 2.
4. When control of time relationships is accurate within certain defined limits
and you are confident in this ability, test that accuracy by routing both
the direct and delayed signals to both loudspeakers (or to a single
loudspeaker).
5. Continue to work through many repetitions of time increments in a
systematic manner, comparing the qualities of the time relationships of
each listening to previous and successive material, in a logical sequence
(a suggested pattern/sequence: 150 ms, 125 ms, 100 ms, 75 ms, 125 ms,
150 ms).
6. Continue moving to smaller and smaller time units, until consistency
has been achieved at being able to accurately judge time increments of
3 to 5 ms.
Being able to recognize small time units, such as several milliseconds, will
take considerable practice. It is, however, quite possible to develop skill in
recognizing the sound quality of these time units when they are presented as
delay times of repeated sounds. The different time delays will have distinctly
different sound qualities, once you are able to recognize and remember the
individual qualities. These time delays will be heard as “timbres of time.” Time
units will actually have a characteristic sound, as they transform the sound
quality of the reiterated drum sound. These unique sound qualities of sound
reiterations will transform all other sounds in a similar way.
Tracks 26–33 on the enclosed CD present time delays with a snare-drum
sound. The delay times range from 50 ms down to 2 ms. Listening to those
tracks will provide valuable support to learning this material and working
through the above exercise.
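Step 4 of the exercise sums the direct and delayed signals into the same loudspeaker. For readers working in software rather than with a hardware delay unit, the following is a minimal Python sketch of that summing. It assumes a sample rate of 44,100 Hz (the audio CD rate) and a signal held as a plain list of sample values; the function names are illustrative and are not taken from the text.

```python
def delay_in_samples(delay_ms, sample_rate=44100):
    """Convert a delay time in milliseconds to a whole number of samples."""
    if delay_ms < 0:
        raise ValueError("delay time must not be negative")
    return round(delay_ms * sample_rate / 1000)


def mix_with_delay(signal, delay_ms, sample_rate=44100):
    """Return the signal summed with one delayed copy of itself."""
    offset = delay_in_samples(delay_ms, sample_rate)
    mixed = [0.0] * (len(signal) + offset)
    for index, sample in enumerate(signal):
        mixed[index] += sample           # direct signal
        mixed[index + offset] += sample  # delayed repetition
    return mixed


if __name__ == "__main__":
    # Delay times from the suggested listening sequence.
    for ms in (150, 125, 100, 75):
        print(f"{ms} ms = {delay_in_samples(ms)} samples at 44.1 kHz")
```

Because the delay is rounded to whole samples, very short delay times are only approximated; at 44.1 kHz one sample lasts roughly 0.023 ms.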
Exercise 5-3
Exercise for Plotting the Presence of Sound Sources Against the Time Line
Graph the first few major sections (verse, chorus, etc.) of a piece of popular
music, using the following steps:
1. Compile a list of all the sound sources of the song. Individual percussion
sounds and vocal parts should be listed separately.
2. Create a suitable time line by:
a. determining the pulse (metric grid) of the song;
b. grouping the pulses into measures (weak and strong beats); and
c. plotting those measures on the horizontal (X) axis in increments that
clearly show the material being graphed.
3. Plot the individual sound sources against the time line. Each sound will
have its own location on the vertical axis, making it unnecessary to make
distinctions between the lines of each sound source on the graph. When
an instrument is playing, place a line for that instrument in the appro-
priate location against the time line. If the instrument is playing in the
measure, extend the line through the entire measure. Alternately, you can
change this resolution to make note of instruments appearing every half
measure. If still more detail is sought, a smaller time increment could be
used.
4. After several days return to the graph. Check the time line for accuracy
and listen several times again for sound sources. It is not unusual for
sound sources to appear in the music that were not heard in previous
hearings.
5. Listen several more times to check the entrances and exits of the instru-
ments against the time line.
The reader will use this skill on many exercises in the following chapters. Learn-
ing to do this well now will prove very helpful. Further, identifying sound sourc-
es and what they are doing is a very important part of tracking and mixing.
This exercise will also improve a skill the reader will use in production work.
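The plotting step of Exercise 5-3 can also be tried in text form before drawing a graph by hand. The Python sketch below is one possible way to do so, assuming a resolution of one measure: each sound source gets its own row, and a dash marks every measure in which it sounds. The function name and the entrances and exits in the example are invented for illustration and do not describe any particular recording.

```python
def presence_grid(sources, total_measures):
    """Return one text row per sound source, one character per measure.

    sources maps a source name to a list of (first_measure, last_measure)
    pairs, both inclusive and numbered from 1.
    """
    width = max(len(name) for name in sources)
    rows = []
    for name, spans in sources.items():
        cells = ["."] * total_measures
        for first, last in spans:
            if not 1 <= first <= last <= total_measures:
                raise ValueError(f"span {first}-{last} is outside the time line")
            for measure in range(first, last + 1):
                cells[measure - 1] = "-"  # the source is sounding in this measure
        rows.append(f"{name:<{width}} |{''.join(cells)}|")
    return rows


if __name__ == "__main__":
    # Invented entrances and exits, for illustration only.
    example = {
        "Lead Vocal": [(5, 12)],
        "Snare Drum": [(3, 12)],
        "Bass Guitar": [(1, 4), (7, 12)],
    }
    for row in presence_grid(example, total_measures=12):
        print(row)
```

A half-measure resolution, as used in Figure 5-3, could be had by numbering half measures instead of measures.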
6 Evaluating Pitch in Audio
and Music Recordings
Pitch relationships shape musical materials more than the other artistic
elements. This is obvious when we consider melody as a succession of
pitch relationships, and chords/harmony as simultaneously sounding pitch
relationships; the main melody of a song can make a lasting impression on
the listener.
With the exception of percussion instruments, nearly all musical instruments
were specifically designed to produce many precise variations in
pitch, far fewer variations in timbre, and most have a continuously variable
range of loudness. Most Western music places great emphasis on pitch
information for the communication of the musical message. Pitch is the
central artistic element of most music, with the other artistic elements most
often supporting the activities occurring in pitch relationships.
In evaluating pitch in audio and music recordings, the recordist will work
well beyond traditional concepts of pitch as melody and harmony. An acute
sense of pitch will bring the recording professional to recognize pitch lev-
els, and to identify pitch areas and frequency bands.
Pitch evaluation is used in both analytical and critical listening. In critical
listening perceived pitch is often transferred into frequency calculations.
The analytical listening process relates pitch relationships to the musical
qualities, ideas and message.
The reader will be developing ways of analyzing musical sounds that will
also develop critical listening skills. The same sensations of sound are per-
ceived in each process. How the sound is evaluated (critically versus ana-
lytically) will be the difference.
The pitch analysis concepts presented in this chapter are of particular
importance to sound recordings and for developing the skills of the record-
ing professional. The information gathered will help one to understand the
piece of music (analytical listening), and to evaluate sound quality (critical
listening), depending on how the information is applied. Concepts of pitch
that the reader has had much experience with are built upon in this chapter
to introduce new experiences. Pitch area evaluation is an important first
step in understanding and evaluating sound quality, and forming objective
descriptions of what is happening within an isolated sound.
This chapter also presents the processes for evaluating sounds, writing
down observations during listening sessions, and making a time line. These
are important for many activities in upcoming chapters. The reader is encour-
aged to work through exercises and examples with attention to detail of the
information presented and to establish a thorough method of working.
Analytical Systems
Numerous analytical systems have been devised to explain pitch relation-
ships in music. These systems are made up of evaluation criteria that vary
considerably between systems and are more or less specific to certain types
of music. Generally, then, these systems can only be useful for examining
certain styles or types of music. Any single analytical system may or may
not yield information pertinent to the music being evaluated.
The recordist will need to recognize and apply the appropriate analytical
system to study the pitch relationships of a particular piece of music, if such
an evaluation is expected of them. This is rarely required. Many recordists
do, however, innately sense the relationships that are explained through
musical analysis. Understanding pitch relationships, and especially har-
monic progressions, can often greatly assist the recording professional in
crafting a recording.
Information about the artistic element of pitch levels and relationships will
be related to (1) the relative dominance of certain pitch levels (which relate
to tonal/key centers and chord progressions), (2) the relative register place-
ment of pitch levels and patterns (related to arranging and orchestration,
and will be later discussed as pitch density and timbral balance), or (3) pitch
relationships: patterns of successive intervals (motives and melodies), rela-
tionships of those patterns, and patterns and relationships of simultaneous
intervals (chords and progressions).
Study of the many systems used to analyze pitch relationships is well
beyond the scope of this book. Theories about music attempt to explain
the analytical listening experience, and to extract basic information about
the structure and form of the music. The recordist can use such insights to
control and craft the sound qualities of the recording so they will support
and directly enhance how the recording process can best deliver musical
ideas to the listener.
Realizing a Sense of Pitch
Everyone has an internal pitch reference. This is a sense of pitch level that
is present unconsciously within each individual. Though it is often undevel-
oped, this reference can be brought to the consciousness of the recording
professional. This will, however, require focused attention and concerted
efforts over a period of time. The skill acquired will be well worth the time
and effort involved.
Every individual is different and has a unique internal pitch reference. The
things that make us unique human beings likely also contribute to our
unique sense of pitch/frequency. What goes into making this reference is
our own complex set of experiences, which vary markedly between indi-
viduals. It is often related to the timbre of certain sounds and uses our
exceptional ability of remembering the correct sounds of a particular sound
source. Therefore, the reference itself is often related to an instrument one
has played for a considerable length of time or to the sound and/or feel
of one’s own voice. Some people identify strongly with certain pieces of
music and can use their memory of the pitch level (tonal center) of that
piece of music as a reliable reference.
The process of realizing and then developing one’s unique sense of pitch
can seem perplexing, daunting, or simply impossible. This is something
that may at first seem beyond human capability, simply because it is
beyond our own experience. This is the first of the skills we will work to
acquire that requires the reader to make a leap of faith; to believe something
is possible is the first step toward making it so. One must become confused
and grapple with the confusion in order to learn.
The reader should work through the Pitch Reference Exercise at the end of the
chapter over a period of several weeks. Daily attention with a number of 5-,
10- or 15-minute work sessions will yield results quickly. The reader should
always try to find a quiet location where they will not be distracted.
This exercise will bring the reader to develop a consistent and reliable sense
of relative pitch. This relative pitch may change by 5% to 10% depending on
mood, energy level, distractions, or countless other factors. Still, the core
of the individual’s sense of pitch will be present and can be relied upon
for specific tasks or for general use. For tasks that require precision,
periodic checks of the reference level for accuracy may be necessary, especially
in the beginning. As with any skill, the more it is used the more it will
be refined. Once the skill is acquired, refining and maintaining the skill of
remembering a reference pitch level can become intuitive.
Listen . . . to tracks 4-13 for potential reference frequencies and reference pitches.
Recordists have cultivated this sense of pitch to a reliable reference for
many practical uses. As examples, it is commonly used to identify frequen-
cy levels (such as what is required to immediately determine an appropriate
equalization [EQ] setting). It is also used to keep performers playing in tune
or to keep tuning constant throughout a project, especially for an ensem-
ble without a keyboard. All of the judgments the recording professional
makes related to pitch and frequency will be enhanced with a stable sense
of pitch/frequency gained through understanding one’s own internal pitch
reference.
Recognizing Pitch Levels
The internal pitch reference is an important aid in recognizing frequencies
and pitches throughout the hearing range. This reference will now be used
to help identify the general locations of frequencies and pitches in relation
to carefully devised pitch registers.
Figure 6-1 Pitch registers against the grand staff of music notation.
Critical listening and analytical listening define perceived pitch differently.
Frequency estimation through pitch perception allows for critical-listening
observations, while the same sound will need to be thought of as pitch
relationships to understand analytical observations. The pitch/frequency
registers (Figures 6-1, 6-2 and 6-3) will be used to estimate the relative level
of the pitch material and to allow the information to be directly transferred
between these two contexts.
Figure 6-2 Pitch and frequency ranges of registers.
Register     Pitch Range     Frequency Range
LOW          up to C2        up to 65.41 Hz
LOW-MID      D2 to G3        73.42 to 196 Hz
MID          A3 to A4        220 to 440 Hz
MID-UPPER    B4 to E6        493.88 to 1,318.51 Hz
HIGH         F6 to C8        1,396.91 to 4,186.01 Hz
VERY HIGH    C8 and above    4,186.01 Hz and above
Pitch and frequency estimation are fundamental skills
that must be developed by the audio professional. The
use of pitch/frequency registers will assist the reader in
developing this skill of identifying perceived pitch and
frequency levels. These registers will serve as reference
areas and will provide a basis for a general description
of perceived frequency and pitch levels. The registers will
be used in many of the evaluation processes that follow
and should be committed to memory. They will provide
meaningful reference levels for many listening activities. Further, learning
the frequency equivalents of pitches will also be invaluable to the reader,
both in these studies and in the practice of recording.
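As an aid to memorizing the registers, the ranges of Figure 6-2 can be written down as a small lookup. The Python sketch below is one way to do this. It uses the frequency limits from the figure, treats the deliberate gaps between adjacent registers as boundary zones, and assigns C8 to the HIGH register because that register is checked first. The names `REGISTERS` and `register_of` are illustrative only.

```python
# (name, lowest frequency in Hz, highest frequency in Hz), after Figure 6-2.
REGISTERS = [
    ("LOW", 0.0, 65.41),
    ("LOW-MID", 73.42, 196.0),
    ("MID", 220.0, 440.0),
    ("MID-UPPER", 493.88, 1318.51),
    ("HIGH", 1396.91, 4186.01),
    ("VERY HIGH", 4186.01, float("inf")),
]


def register_of(freq_hz):
    """Name the register containing freq_hz, or the boundary zone it falls in."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    previous = None
    for name, low, high in REGISTERS:
        if freq_hz < low:
            # Below this register but above the previous one: a boundary zone.
            return f"boundary between {previous} and {name}"
        if freq_hz <= high:
            return name
        previous = name
    raise ValueError("frequency could not be classified")


if __name__ == "__main__":
    for hz in (50, 70, 110, 440, 1000, 3000, 10000):
        print(f"{hz} Hz: {register_of(hz)}")
```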
It will be helpful to relate this material to actual sound sources. The ranges
of human singing voices stretch from the low-mid and mid registers (male
voices) to the mid and mid-upper registers (female voices). Most musical
activity occurs in the mid and mid-upper registers. This is where many
instruments sound their fundamental frequencies, and where most melodic
lines and most closely spaced chords are placed in musical practice. Take
a moment to notice the frequency ranges spanned by these registers.
The sibilant sounds of the human voice occur primarily in the high register,
typically around 2 to 3 kHz. Within the very high register, humans have
the ability to hear nearly two and one-half octaves. While this register is
not playable by acoustic instruments or by human voices, much spectral
information is often in this register.
The reader should work through Exercise 6-2 (at the end of the chapter) to
begin developing skill at estimating pitch levels and octave placements.
This is designed to take the reader from making general judgments to identifying
precise pitch levels. The boundaries between registers are purposefully
large to give the reader a sense of moving from the general to the
specific and to acquire the skill with meaningful successes along the way.
This skill is central to many of the evaluations commonly performed by all
people in audio and will be used throughout the remainder of this book. The
listener can easily transfer perceived pitch level into frequency estimation
through using the above registers. The reader should practice transferring
various pitch levels to frequency and the reverse.
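Transferring between pitch and frequency can be checked with a short calculation. The frequencies given for the registers are consistent with twelve-tone equal temperament tuned to A4 = 440 Hz, so the Python sketch below assumes that tuning; it is offered as a study aid, not as a method prescribed by the text. Each semitone multiplies the frequency by the twelfth root of two. The function names and the use of sharps for the black keys are illustrative choices.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_to_frequency(note, octave, a4=440.0):
    """Frequency in Hz of a pitch such as ("C", 4), assuming equal temperament."""
    semitones_from_a4 = NOTE_NAMES.index(note) - NOTE_NAMES.index("A") + 12 * (octave - 4)
    return a4 * 2.0 ** (semitones_from_a4 / 12)


def frequency_to_pitch(freq_hz, a4=440.0):
    """Nearest equal-tempered pitch to freq_hz, as a (note, octave) pair."""
    semitones_from_a4 = round(12 * math.log2(freq_hz / a4))
    semitones_from_c0 = semitones_from_a4 + 9 + 12 * 4  # A4 lies 57 semitones above C0
    octave, index = divmod(semitones_from_c0, 12)
    return NOTE_NAMES[index], octave


if __name__ == "__main__":
    for note, octave in (("C", 2), ("G", 3), ("A", 4), ("E", 6), ("C", 8)):
        print(f"{note}{octave} = {pitch_to_frequency(note, octave):.2f} Hz")
    print(frequency_to_pitch(196.0))
```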
Listen . . . to tracks 14-18 for pitch register boundaries performed on a piano.
Figure 6-3 Pitch/frequency registers in relation to keyboard with pitches, octave designations, and equivalent frequencies.
[Figure: piano keyboard from A0 (27.5 Hz) to C8 (4,186 Hz), with each key labeled by pitch, octave designation, and frequency in hertz; middle C is C4. Register boundaries marked on the keyboard: LOW, up to C2 (65.4 Hz); LOW-MID, D2 (73.4 Hz) to G3 (196 Hz); MID, A3 (220 Hz) to A4 (440 Hz); MID-UPPER, B4 (493.9 Hz) to E6 (1,318.5 Hz); HIGH, F6 (1,397 Hz) to C8 (4,186 Hz); VERY HIGH, above C8. Reference points beyond the keyboard: E0 = 20.6 Hz, C9 = 8,372 Hz, C10 = 16,744 Hz.]
It is possible for the experienced listener to consistently estimate pitch/fre-
quency level to within an interval of a minor third (one-quarter of an octave).
After considerable practice and experience, and gaining a clear sense of an
internal reference pitch, even greater accuracy is possible. Within several
weeks of thoughtful effort, the reader should be consistent within a perfect
fifth (a bit over one-half an octave), and accuracy will continue to increase
at a rapid pace with regular study.
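The interval sizes quoted above follow from dividing the octave into twelve equal semitones: a minor third spans 3 semitones (3/12, or one-quarter of an octave) and a perfect fifth spans 7 semitones (7/12, a bit over one-half). The Python sketch below works this out and also gives the frequencies lying one interval below and above a reference frequency. It assumes equal temperament, and the function names are illustrative.

```python
def octave_fraction(semitones):
    """Size of an interval as a fraction of an octave."""
    return semitones / 12


def interval_bounds(center_hz, semitones):
    """Frequencies one interval below and one interval above center_hz."""
    ratio = 2.0 ** (semitones / 12)
    return center_hz / ratio, center_hz * ratio


if __name__ == "__main__":
    for name, semitones in (("minor third", 3), ("perfect fifth", 7)):
        low, high = interval_bounds(440.0, semitones)
        print(f"{name}: {octave_fraction(semitones):.3f} octave, "
              f"{low:.1f} Hz to {high:.1f} Hz around 440 Hz")
```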
Pitch Area and Frequency Band Recognition
Recordists must often bring their attention to a specific range of frequencies,
or frequency bands, to identify some aspect of sound quality or equipment
performance. This section will present a rough equivalent in musical
contexts that can be used to develop this skill and more.
Many percussion-related sounds occupy a pitch area, not a specific pitch.
These sounds are perceived as existing in an area between two boundar-
ies. The boundaries may, at times, be unstable and changing in pitch and
dynamic level, and secondary pitch area(s) may also be present. These
sounds have noise-like qualities and nonperiodic waveforms in many
respects, but still contain some pitch quality. The sounds have some harmonic
relationships (often inexact) between pitch areas, and a dominant
pitch area provides a sense of a dominant frequency/pitch “area” rather than
a precise pitch level.
These sounds can be evaluated and defined by:
1. the density (amount) of pitch information within the pitch area,
2. the width of the pitch area (the distance between the two boundary
pitches),
3. the presence of secondary pitch areas, and
4. the dynamic relationships of the primary and any secondary pitch
areas.
Some sounds will have several separated pitch areas. The different pitch
areas of the sound will be at different dynamic levels and have different
densities (the amount and closeness of spacing of the pitch information
within the pitch areas). One pitch area will dominate the sound and be the
primary pitch area (similar to a fundamental frequency). The other pitch
areas will be secondary pitch areas. These are components of the sound’s
spectrum. Some areas dominate, and others will be softer or much softer.
Pitch areas are regions of frequency/pitch information and activity that we
recognize as being united by density (amount of information) and loudness
(a reasonably uniform level); the bandwidth of a pitch area is the distance
between the lowest and highest frequencies that form its boundaries.
The reader should focus on the perspective of the spectrum of the
sound source, listening for areas of density, the boundaries of those areas,
and the gaps between them.
Evaluating the pitch areas of several percussion sounds will develop a number
of important listening skills. Frequency and pitch estimation (recognition)
will be refined, and the listener will take the skills of the last section
to a more detailed level of perspective. The focus will now be on identifying
pitch/frequency levels within the spectrum of the source, using sound
sources that allow us to approach this information in a more noticeable
and higher level of perspective. This is a first step toward identifying subtle
aspects of sound quality and spectrum (that will be greatly refined later).
Further, this study will be accomplished without the added tasks of a time
line, as the graph will sum spectral information throughout the sound’s
duration. The reader will also make some rough observations on dynamic
levels. This skill will also be greatly refined later.
It is likely that pitch area analysis will be unlike anything the reader has
done in the past. Few reference points will be available for the listener to
draw upon, other than the pitch estimation skills acquired above. Difficulties
and frustrations are to be expected, along with great satisfaction with
acquiring a very useful skill that will be used and improved throughout one’s
audio career.
To perform an analysis of pitch areas, sounds are plotted on a pitch area
analysis graph. The graph incorporates:
1. The register designations for the Y-axis;
2. A space on the X-axis of the graph that is dedicated to each sound
instead of the passage of time (since this evaluation sums all informa-
tion during the sound’s duration);
3. The pitch areas, which are boxed off by upper and lower boundaries of
their bandwidths, in relation to these two axes;
4. The density of each pitch area designated by a number within each box
that relates to a relative scale from very dense to very sparse; and
5. The relative dynamic level of each pitch area, designated by a second
number; this is especially important for identifying the predominant (primary)
pitch area (dynamics will receive detailed coverage in the next chapter).
Figure 6-4 presents the pitch areas of the prominent bass drum sound
found in The Beatles’ “Come Together” (1 version) at 0:34.
The percussion sound is composed of four pitch areas. The primary pitch
area is the second from the bottom. It is moderately dense and is the
loudest and dominant area. The density of the lowest pitch area is a bit
more dense, but considerably softer. The area between 167 and 265 Hz is
moderately dense and a bit louder than the lowest area, and not as loud as
the primary pitch area. The highest pitch area (approximately 315–395 Hz)
has a rather sparse density and is at the lowest loudness level of all four
pitch areas. It is interesting to note that the lower boundaries of the four
pitch areas are nearly whole number multiples and harmonically related,
but far enough away to create strong noise elements. These types of
relationships are common for drum resonances and head vibrations.
Listen . . . to tracks 19-25 for isolated drum and cymbal sounds that can be used as source material for pitch area evaluations.
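Figure 6-4 Pitch area analysis of the bass drum sound from The Beatles, “Come Together.”
[Graph: Y-axis, pitch/frequency registers LOW to HIGH, with register boundaries at 65, 73, 196, 220, 440, 494, 1319 and 1396 Hz; X-axis, one column for the sound (Bass Drum). Each pitch area is labeled with two numbers. First number, density: 5 Very Dense, 4 Dense, 3 Moderately Dense, 2 Sparse, 1 Very Sparse. Second number, loudness: 5 Very Loud, 4 Loud, 3 Moderately Loud, 2 Soft, 1 Very Soft.
Pitch areas: 315-395 Hz (2,1); 167-265 Hz (3,3); 69-128 Hz (3,4); 39-59 Hz (4,2).]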
The pitch area graph and the objective descriptions of density and dynamic
relationships of the pitch areas provide much useful and universally per-
ceived information about the sound. Meaningful communication about
this sound is possible with this information.
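The bass drum analysis can also be recorded as data, which makes the observations about the primary pitch area and the lower boundaries easy to check. The Python sketch below is one possible representation: the boundary frequencies, densities and loudness values are those read from Figure 6-4, while the class and function names are illustrative assumptions and not part of the evaluation system itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PitchArea:
    low_hz: float
    high_hz: float
    density: int   # 1 (very sparse) to 5 (very dense)
    loudness: int  # 1 (very soft) to 5 (very loud)


# Bass drum from "Come Together", values as read from Figure 6-4.
BASS_DRUM = [
    PitchArea(39, 59, density=4, loudness=2),
    PitchArea(69, 128, density=3, loudness=4),
    PitchArea(167, 265, density=3, loudness=3),
    PitchArea(315, 395, density=2, loudness=1),
]


def primary_area(areas):
    """Return the loudest pitch area, taken here as the primary pitch area."""
    return max(areas, key=lambda area: area.loudness)


def lower_boundary_ratios(areas):
    """Ratio of each lower boundary to the lowest one, in ascending order."""
    lows = sorted(area.low_hz for area in areas)
    return [low / lows[0] for low in lows]


if __name__ == "__main__":
    main_area = primary_area(BASS_DRUM)
    print(f"primary pitch area: {main_area.low_hz}-{main_area.high_hz} Hz")
    print([round(ratio, 2) for ratio in lower_boundary_ratios(BASS_DRUM)])
```

Listing the ratios side by side shows how far each lower boundary sits from an exact whole-number multiple of the lowest one.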
Next, the Pitch Area Analysis Exercise (Exercise 6-3 at the end of the chapter)
should be performed until the material is learned. The reader is encouraged
to evaluate drums and cymbals of a variety of sizes directly from music
recordings, after working through some isolated percussion sounds (such
as the ones found on the enclosed CD).
A number of percussion sounds from the drum solo of The Beatles’ “The
End” appear in Figure 6-5. The reader should examine the dynamic relationships
of the pitch areas and observe their densities. Study the example
carefully, seeking to identify the boundaries of the pitch area, and confirm
the density information presented. Once the pitch areas can be identified,
the reader can observe the dynamic relationships of the pitch areas.
Figure 6-5 Drum sounds from The Beatles, “The End.”
[Graph: Y-axis, pitch/frequency registers LOW to HIGH; X-axis, one column per sound: Tom 1 (8.7 sec.), Tom 2 (9.6 sec.), Tom 3 (10.8 sec.), Kick Drum (30.8 sec.). Each pitch area is labeled with two numbers. First number, density: 5 Very Dense, 4 Dense, 3 Moderately Dense, 2 Sparse, 1 Very Sparse. Second number, loudness: 5 Very Loud, 4 Loud, 3 Moderately Loud, 2 Soft, 1 Very Soft.
Pitch area boundaries (Hz): Tom 1: 79-132, 181-267, 397-500, 630-749. Tom 2: 99-157, 187-315, 386-530, 794-891. Tom 3: 74-105, 135-190, 223-375, 470-841. Kick Drum: 31-50, 62-106, 118-198.]
The reader will be able to apply the skills and concepts of pitch area to
the recognition of frequency bands in critical listening applications. The
information for evaluating the states and changes of frequency levels
is readily deduced from perceived pitch information. The skills gained
through the recognition of pitch areas and frequency bands will later be
directly applied to the evaluation of timbre and sound quality. In fact,
these pitch area analyses are rudimentary timbre analyses.
Figure 6-6 Crash cymbal sounds from The Beatles’ “Come Together” and “Something” (both from 1).
[Graph: Y-axis, pitch/frequency registers MID to VERY HIGH; X-axis, one column per sound. Each pitch area is labeled with two numbers. First number, density: 5 Very Dense, 4 Dense, 3 Moderately Dense, 2 Sparse, 1 Very Sparse. Second number, loudness: 5 Very Loud, 4 Loud, 3 Moderately Loud, 2 Soft, 1 Very Soft.
Crash, “Come Together” (3:47): 561-920 Hz (2,3); 1330-2370 Hz (3,4); 3770-6350 Hz (3,3); 7550-14,200 Hz (2,2).
Crash, “Something” (0:03): 595-891 Hz (2,2); 944-1580 Hz (1,1); 2820-6720 Hz (3,4); 7620-13,400 Hz (2,3).]
A common critical listening situation is to compare two sounds. Figure 6-6
compares the crash cymbal from two different songs. Notice the charac-
teristics of the sounds through careful listening to identify pitch areas and
their densities. From the graph we can see the areas are similar in location
and nearly identical in their densities. Some areas have markedly different
relative dynamic levels.
As skill increases, the listener will gradually come to understand and
recognize slight differences present in the spectra of similar sounds.
Eventually the reader will arrive at the point where evaluations of the hi-hat
and crash cymbal sounds from four different releases of the song “Let It
Be” can be compared (listed in the Discography). The reader is encouraged
to listen carefully to these sounds. The sounds change within each song
and are slightly (to markedly) different in each recording. The reader will
notice many striking differences and perhaps a few of the many subtleties.
These differences will gradually seem more and more significant. Ultimately,
the reader will recognize the sounds as very different and will perceive
the details of their sound qualities.
Pitch Area and Pitch Density
Pitch area evaluation of a single sound leads us to consider the similar
concept of pitch density. Pitch density is the range of pitches spanned by
a musical idea plus the spectrum of the sound source playing it. It is the
pitch area of the musical idea. This connection between pitch area as the
spectrum of a sound and the musical material being presented is important
to many activities in mixing and to understanding recordings.
Figure 6-7 presents four percussion sounds from “Here Comes the Sun.”
Listen to this recording and recognize the pitch areas of these instruments.
In your next hearings, notice how the instruments compare to others in
terms of their pitch areas—or spectral content. You will be listening at the
level of perspective of the individual sound sources, at one level higher
where you can perceive each source as being equal, and at the highest
level of the overall texture of the recording. Notice:
• the ways the percussion sounds blend with others that occupy similar
pitch areas;
• the way sounds are more easily distinguished when they occupy pitch
areas that are different from others;
• that certain pitch registers of the overall texture have more information
than other registers;
• that the amount of activity in certain pitch registers, within the overall
texture, changes from one moment to the next;
• that different “left” to “right” locations of the stereo field have different
pitch areas present and absent.
This will be explored in much greater depth in Chapter 10 and later readings
and exercises. Listen one last time to these four sounds and try to identify
the pitch areas and their boundaries and densities, and compare the pitch
areas within the sounds for their relative loudnesses. This is your beginning
work in analyzing the components of a sound’s timbre and sound quality.
Figure 6-7 Four percussion sounds from The Beatles’ “Here Comes the Sun.”
[Figure: pitch area graph of the Snare Drum (0:25), Bass Drum (1:04), Crash Cymbal (0:28), and Hi Hat (0:32), plotted across the LOW, LOW-MID, MID, MID-UPPER, HIGH, and VERY HIGH registers. Each pitch area is labeled with two numbers and with its boundary frequencies in Hz. First number (density): 5 Very Dense, 4 Dense, 3 Moderately Dense, 2 Sparse, 1 Very Sparse. Second number (loudness): 5 Very Loud, 4 Loud, 3 Moderately Loud, 2 Soft, 1 Very Soft. Register boundaries marked on the vertical axis: C8 (4186 Hz), F6 (1397 Hz), E6 (1319 Hz), B4 (494 Hz), A4 (440 Hz), A3 (220 Hz), G3 (196 Hz), D2 (73 Hz), and C2 (65 Hz).]
Melodic Contour
Graphing melodic contour will assist the recordist in a number of ways. First,
at the early stages of listening-skill development, graphing melodic contour
will help develop skills in placing pitch/frequency changes and levels
against time. These same skills will later be used in the much more detailed
(and initially more difficult) task of evaluating pitch-related information in
timbre/sound quality and environmental characteristics evaluations.
Learning to plot melodic contours against a time line will be productive in
developing the skills of recognizing pitch levels, of perceiving metric units
and rhythm of time, and of mapping pitch contours. These skills directly
transfer into many of the listening functions of the audio professional.
Second, recognizing the contour (or shape) of the melodic line is important
to understanding certain pieces of music or musical ideas. In certain
pieces of music, the contour of a melodic line is perceived instead of the
individual intervals. When the melodic line is performed very rapidly (as is
easily accomplished with technology), the perception of the line fuses into
an outline or shape. The series of intervals that makes up the melodic line
is not perceived; the contour of the line is perceived instead.
The melodic contour graph allows the contour of the musical idea to be
recognized and evaluated for its unique qualities.
Graphing Material against a Time Line
Nearly all exercises in the book graph material against a time line. The
sequence for establishing a time line will now be examined. This should
be studied carefully, as the recording professional is continually engaged
in listening to material to notice changes in sound over time. This process
will define the activities of any artistic element against a time line. This
sequence of activities will be applied to both musical (analytical) and critical
listening contexts. It will be only slightly modified for each exercise in
the following chapters, and the reader should become familiar with the
order of activity:
1. During the first hearing(s) of the material, focus listening activity to
establish the length of the time line. At the same time, notice prominent
activity of the material (in this case melodic contour) and its placement
against the time line.
2. Check the time line for accuracy and make any alterations. Establish
a complete list of sound sources (instruments and voices), and then
place those sound sources against the completed time line.
3. Notice the activity of the material being graphed for boundaries of
levels (here, the highest and lowest pitches of the example, and the
smallest changes between pitch levels) and speed of activity (noting
the fastest changes of levels). The boundary of speed will establish
the smallest time unit required to clearly show the smallest significant
change of the material (melodic contour). The boundaries of levels of
activity will establish the smallest increment of the Y-axis required to
plot the smallest change of the material. This step will establish the
perspective of the graph, which is the graph’s level of detail. The Y-axis
should allow some space above the highest and below the lowest level
for the material to be clearly observed, and for the possibility of adding
new material or corrections in that area of the graph.
4. Begin plotting the activity of the material (melodic contour) on the
graph. First, establish prominent points of reference within the activity;
these reference points might be the highest or lowest levels, the begin-
ning and ending levels, points immediately after or before silences, or
any other points that stand out from the remainder of the activity; place
these levels correctly against the time line. Use the points of reference
to calculate the activity of the preceding and following material.
5. To complete the plotting of the activity of the artistic element, alternate
focus on the contour, speed, and amounts of level changes.
6. The evaluation is complete when the smallest significant detail has
been heard, understood, and added to the graph.
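The boundary-finding of step 3 can be sketched in code. The following Python sketch is only an illustration: it assumes the listener's observations have already been written down as (time in seconds, frequency in Hz) pairs, and the function name `plan_axes` and the 10 percent margin are assumptions made for this example rather than part of the method described above.

```python
def plan_axes(points, margin=0.10):
    """Suggest graph boundaries from (time_sec, freq_hz) observations.

    Returns the length of the time line, the smallest gap between
    successive observations (the finest X-axis unit needed), and a
    Y-axis range padded by `margin` above the highest and below the
    lowest observed level. The margin value is a hypothetical choice.
    """
    if len(points) < 2:
        raise ValueError("need at least two observations")
    ordered = sorted(points)
    times = [t for t, _ in ordered]
    freqs = [f for _, f in ordered]
    # Fastest change of level: smallest gap between successive times.
    # Simultaneous observations (chords) have a gap of zero and are skipped.
    steps = [b - a for a, b in zip(times, times[1:]) if b > a]
    if not steps:
        raise ValueError("observations must span more than one moment")
    return {
        "time_line_length": times[-1] - times[0],
        "smallest_time_unit": min(steps),
        "y_min": min(freqs) * (1 - margin),
        "y_max": max(freqs) * (1 + margin),
    }
```

The result only proposes a starting grid; the listener still checks it against further hearings, as steps 4 through 6 describe.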
Listening and Writing
Many hearings of the example will be involved for each of the steps above.
Each listening should seek specific new information and should confirm
what has already been noticed about the material. Before listening to the
material, the listener must be prepared to extract certain information. Listeners
must have a clear idea of what they will be listening to and/or for,
and work to keep from being distracted. Attention should be focused at
a specific level of perspective and on a specific task. Listeners must both
confirm their previous observations and be receptive to new discoveries
about the example. Previous observations will be checked often, while
seeking new information.
Listening to only small, specific portions of the example may assist certain
observations at certain points in the evaluation process. In these situations,
the listener should intersperse listenings to the entire example with the
rehearings of small sections, to be certain consistency is being maintained
throughout the evaluation and to maintain proper perspective.
The reader should very rarely write observations while listening. Instead,
the reader should concentrate on the material and attempt to memorize
their observations. This activity will develop auditory memory and will ulti-
mately greatly reduce the number of hearings required to evaluate sounds,
and improve skill. When the recording/sound has stopped, the reader
should recall what was heard and only then write. If the material is not
clearly remembered, listen again, perhaps to a shorter segment.
One must first recognize what has been heard before it can be written
down. This process is about recognizing what is heard and making a written
record of the experience. The reader should try to make the sequence
“hear, recognize, remember, write” automatic.
Melodic Contour Graph
The shape of the melodic line is plotted on a melodic contour graph. The
graph is composed of:
1. Pitch area register designations for the Y-axis, using the registers needed
to clearly show the activity of the musical example;
2. A time line on the X-axis, divided into appropriate time units of a metric
grid or real time (depending on the material and context);
3. Each sound source is plotted as a single line against the two axes (the
melodic contour is the actual shape of this line); and
4. If more than one sound source is plotted on the same graph, a key
should be devised and included with the graph to identify the lines
with the sound sources.
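Figure 6-8 Melodic contour graph: The Beatles’ “Wild Honey Pie,” beginning at 0:53.
[Figure: melodic contour graph. The Y-axis spans the LOW-MID, MID, and MID-UPPER registers, with boundary frequencies of 65, 73, 196, 220, 440, 494, and 1319 Hz marked; the X-axis is a metric grid of three measures, with beats 2, 3, and 4 marked in each.]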
Consider the melodic contour of the classical guitar line from The Beatles
(the “White Album”) that concludes “Wild Honey Pie” and provides transitional
material to “The Continuing Story of Bungalow Bill” (Figure 6-8). The
melodic line is performed too quickly to be heard as a melody composed of
intervals. Instead, this type of fast melodic gesture fuses into a shape or a
contour. As an exercise, check Figure 6-8 for accuracy of shape, detail, and
register placement. Determine the tempo of the passage and attempt to
identify some of the graphed pitch/frequency levels. Chords begin and end
the example, and are clearly shown as several lines occurring simultane-
ously against the time line.
The Melodic Contour Analysis Exercise (Exercise 6-4) will refine this skill.
The reader is encouraged to work through several additional recorded
examples to become comfortable with this activity. This graph is an impor-
tant bridge between traditional music dictation (and notation) and the
graphs and processes of this system.
Melodic gestures of this type are very common in music. They appear in
styles ranging from heavy metal guitar solos to eighteenth- and nineteenth-century
keyboard music (especially the works of Chopin, Liszt, and works in
the Rococo style). Transcribing fast melodic passages into melodic contour
graphs will provide the reader with practice in developing the following
skills required of recordists:
• pitch recognition (estimation),
• placing pitch/frequency levels correctly into pitch registers,
• recognizing and calculating pitch changes, and
• placing pitch changes against a time line.
These skills will be used and further developed in evaluating sound quality
and environmental characteristics in later chapters.
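As an aid for checking register placement after listening, the register boundaries can be expressed as a small lookup. The Python sketch below is an assumption-laden illustration: the boundary values (73, 220, 494, 1396, and 4186 Hz) are taken from the axis labels of this chapter's figures, each is treated as the lowest frequency of the register above it, and frequencies between 65 and 73 Hz are simply assigned to LOW. The name `register_of` is hypothetical.

```python
from bisect import bisect_right

REGISTER_NAMES = ["LOW", "LOW-MID", "MID", "MID-UPPER", "HIGH", "VERY HIGH"]

# Assumed lowest frequency (Hz) of each register above LOW, rounded as in
# the figure labels: D2, A3, B4, F6, C8.
LOWER_BOUNDS_HZ = [73, 220, 494, 1396, 4186]


def register_of(freq_hz):
    """Return the name of the pitch register that contains freq_hz."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # Count how many lower bounds the frequency has reached or passed.
    return REGISTER_NAMES[bisect_right(LOWER_BOUNDS_HZ, freq_hz)]
```

For example, `register_of(440)` returns `"MID"`. A lookup like this is meant for checking an estimate afterward, not for replacing the listening itself.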
Exercises
The following exercises should be practiced until you are comfortable with the
material covered.
Exercise 6-1
Pitch Reference Exercise
Development of a personal pitch reference requires daily attention with a num-
ber of 5-, 10-, or 15-minute work sessions. With focused effort this exercise will
yield results quickly. The enclosed CD has some potential reference pitches.
1. First you will need to become aware of your personal pitch reference. You
may most likely be able to accomplish this as follows:
a. Exploring your experiences as a performer can turn up a significant
memory. People who have played an instrument long enough will often
have certain pitches in memory that are related to that instrument (especially
when playing it). Guitarists often know the sound of an “E”;
trumpet players a “B-flat”; violinists an “A”; and so forth. You can make
use of this pitch reference if you have this experience. Think carefully
about the tuning pitch of your instrument or some other pitch level you
feel drawn to, and clearly identify the pitch. Now play the pitch, listen
to it carefully, and work to internalize it as a reliable reference.
b. Paying close attention to your natural speaking voice, speak carefully,
but deliberately, and try to remove any stress from your voice box.
Pay close attention to your inflections and notice when you are causing
the pitch of your voice to go up and down. Finally, bring your
attention to identify the pitch of your voice where it is operating with-
out resistance, and where it is not being forced up or down in pitch.
In this monotone, if properly produced, is the natural pitch of your
voice. This can often serve as a reliable pitch reference.
c. Vocalizing or singing freely to find the pitch level where your voice
causes your chest cavity to resonate. This will require you to pay close
attention to the sensations of your body for a fullness to develop
in the chest cavity. The voice should then be swept through your
comfortable singing range to find the pitch that creates the greatest
fullness. That level of greatest fullness would be “your” pitch, which
might serve as a reference.
d. Other pitch references are possible, such as drums or specific pieces
of music. These are usually unique to the individual. At times these
can be easily identified or refined.
2. If your pitch reference is based on your voice, make a note of the pitch
level you identified. Repeat the process of identifying this level four or
five times over two or three days to validate the level. You will eventually
identify a specific and consistent pitch level.
3. Now, listen to your reference pitch often throughout several days. A few
times per day, stop your normal activity to sing, play, and listen to that
pitch level. Use a piano, pitch pipe, tuning fork, or another instrument to
play your pitch. Sing it frequently and become accustomed to the place-
ment of that pitch in your voice. Work to bring yourself to hear the pitch
in your mind before singing it. With this, you are bringing your sense of
pitch into your consciousness.
4. Next, try to consciously carry your reference pitch with you throughout
the day. Take five minutes at four or five set times throughout the day to
sing your pitch level and check it for accuracy. Before singing, quiet your
thoughts and bring your attention to your voice or to your memory of the
pitch. Do not sing a pitch unless you are confident you have a memory of
the pitch. If you do not have a memory of the pitch, return to Step 3 for
more practice. Sing the pitch in your memory and check it. Don’t allow
yourself to get frustrated by wrong pitches. Everyone will make mistakes
at this stage. By evaluating your mistakes you can learn from them. Keep
a record of the pitch you sang—high or low, size of interval off, etc. Look
for patterns and try to identify what caused the errors.
5. Remember, this exercise will greatly enhance a skill you will use through-
out your career in audio.
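The record kept in Step 4 (high or low, and the size of the interval off) can be computed if the sung pitch and the reference pitch are both known as frequencies, for instance from a tuner. The Python sketch below assumes equal temperament, in which one semitone is a frequency ratio of the twelfth root of 2. The function name `interval_error` and the 0.1-semitone tolerance are hypothetical choices for illustration.

```python
import math


def interval_error(sung_hz, reference_hz, tolerance=0.1):
    """Describe how far a sung pitch is from the reference pitch.

    Returns (direction, semitones): direction is "high", "low", or
    "on pitch"; semitones is the signed size of the error, rounded to
    two decimal places. The tolerance (in semitones) is an assumed
    threshold for counting an attempt as on pitch.
    """
    if sung_hz <= 0 or reference_hz <= 0:
        raise ValueError("frequencies must be positive")
    semitones = 12 * math.log2(sung_hz / reference_hz)
    if abs(semitones) <= tolerance:
        direction = "on pitch"
    elif semitones > 0:
        direction = "high"
    else:
        direction = "low"
    return direction, round(semitones, 2)
```

Logging these pairs over several days makes patterns easier to see, such as consistently singing a little low at a certain time of day.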
Exercise 6-2
Pitch Level Estimation Exercise
The enclosed CD presents the boundaries of the pitch registers played on a
piano. These may be of assistance in learning these boundaries.
1. Working with a keyboard instrument or a tone generator, practice listen-
ing to the boundaries of the registers. Seek to remember the sounds of
those boundaries, and remember the pitch names and frequency levels of
the several pitches that make up those boundaries.
2. At this point it will be helpful to work with another person. While this
other person performs pitches at the boundaries between registers (on a
keyboard or other device or instrument), identify those boundaries. Main-
tain a record of your mistakes so that you can evaluate them and make
adjustments.
3. Once confidence has been established in recognizing the general areas
encompassed by the registers, begin playing individual pitches against the
pitch registers. Identify the pitch register of the sound.
4. Finally, work to identify where pitches are sounding within the registers
(e.g., the upper third of low, or the lower quarter of mid). Throughout
these steps, you must rely solely on your memory of the pitch/frequency
registers in making these judgments.
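A judgment such as "the lower quarter of mid" can be checked afterward with a little arithmetic. Because equal musical intervals correspond to equal frequency ratios, the sketch below measures position on a logarithmic frequency scale. This is a minimal Python illustration; the function name `position_in_register` is hypothetical, and the register limits passed to it are whatever boundaries the reader has adopted (the usage note takes 220 Hz and 440 Hz for the mid register from this chapter's figure labels).

```python
import math


def position_in_register(freq_hz, low_hz, high_hz):
    """Return where freq_hz lies within a register, from 0.0 to 1.0.

    0.0 is the register's lower boundary and 1.0 its upper boundary,
    measured on a logarithmic (pitch-like) frequency scale.
    """
    if not 0 < low_hz < high_hz:
        raise ValueError("need 0 < low_hz < high_hz")
    if not low_hz <= freq_hz <= high_hz:
        raise ValueError("frequency lies outside the register")
    return math.log(freq_hz / low_hz) / math.log(high_hz / low_hz)
```

With the mid register taken as 220 Hz to 440 Hz, a tone near 262 Hz (middle C) gives a value of about 0.25, the top of the lower quarter of mid.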
Exercise 6-3
Pitch Area Analysis Exercise
Identify an instrument you want to evaluate; most people find drums are
easiest in beginning attempts. The enclosed CD has a number of drum and
cymbal sounds that will serve this purpose very well. You might wish to set
your CD player to repeat the track.
1. Determine the most prominent pitch area by defining the lower boundary
first, then the upper boundary of the area (a steep filter can be helpful in
determining these boundaries during beginning studies).
2. Determine any secondary areas of concentrated activity (these will be
identified by either width, density, or dynamic prominence of the pitch
area), by identifying the lowest and then the highest boundary.
3. Repeat the process for all other pitch areas present. The specific frequencies/pitches
of the boundaries are often audible, despite the sound not
having an audible fundamental frequency. These pitches/frequencies
should be identified and noted on the graph.
4. Evaluate the densities of the pitch areas and incorporate that informa-
tion into the graph (this is the general amount of pitch/frequency activity
within the pitch area and is noted on a numbering scale from very dense
to very sparse).
5. Finally, identify the dynamic relationships between the pitch areas within
the sounds. Describe this information as part of the analysis (these are
the general dynamic relationships of the area and are noted on a number
scale identifying the relative loudness of the pitch areas).
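The information gathered in the steps above (boundaries, density, and relative loudness for each pitch area) can be kept as a simple record. The Python sketch below is one possible representation, not a prescribed format: the class name `PitchArea` and its methods are hypothetical, the two five-point scales follow the legends of the pitch area figures in this chapter, and the values in the usage note are invented for illustration.

```python
from dataclasses import dataclass

# Five-point scales as given in the legends of the pitch area figures.
DENSITY = {1: "Very Sparse", 2: "Sparse", 3: "Moderately Dense",
           4: "Dense", 5: "Very Dense"}
LOUDNESS = {1: "Very Soft", 2: "Soft", 3: "Moderately Loud",
            4: "Loud", 5: "Very Loud"}


@dataclass(frozen=True)
class PitchArea:
    """One pitch area of a sound: boundaries in Hz plus two ratings."""
    low_hz: float
    high_hz: float
    density: int
    loudness: int

    def __post_init__(self):
        if not 0 < self.low_hz < self.high_hz:
            raise ValueError("need 0 < low_hz < high_hz")
        if self.density not in DENSITY or self.loudness not in LOUDNESS:
            raise ValueError("ratings must be whole numbers from 1 to 5")

    def describe(self):
        """Return a one-line summary of the pitch area."""
        return (f"{self.low_hz:g}-{self.high_hz:g} Hz: "
                f"{DENSITY[self.density]}, {LOUDNESS[self.loudness]}")

    def overlaps(self, other):
        """True if the two pitch areas share any frequency range."""
        return self.low_hz < other.high_hz and other.low_hz < self.high_hz
```

For instance, `PitchArea(90, 160, 4, 4).describe()` gives `"90-160 Hz: Dense, Loud"` (illustrative values). The `overlaps` check offers a rough way to flag sounds that occupy similar pitch areas.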
Exercise 6-4
Melodic Contour Analysis Exercise
Find a recording with an instrumental melodic line that is performed too
quickly to be heard as individual pitches. It is best for the melodic line to be
at least two measures, or five seconds, in duration.
1. Determine the time line of the example, including the appropriate time
units (clock time or meter) and the length of the time line. Make note of
prominent pitch/frequency levels.
2. Begin plotting the melodic contour against the time line by identifying
prominent pitch levels and placing them at precise locations on the time
line. Check the time line for accuracy of length.
3. Work to establish as many reference pitches as possible, and the high-
est and lowest pitches of the line (these will identify the upper and lower
limits of the Y-axis). Identify the fastest change of pitch level (this will
become the smallest time unit the graph needs to clearly present, and will
determine the appropriate division of the X-axis).
4. Draw the melodic contour graph using the X and Y axes determined in
Step 3.
5. Locate the reference pitches on the graph at the appropriate locations
against the time line.
6. Fill in the remaining pitch information, making certain to check observa-
tions regularly. The graph is completed when the last noticeable pitch
change is incorporated into the graph.
7 Evaluating Loudness in
Audio and Music Recordings
Loudness has traditionally been used in musical contexts to assist in the
expressive qualities of musical ideas. This function of dynamics helps shape
the direction of a musical idea, helps delineate separate musical ideas (usu-
ally in relation to their importance to the musical message), assists in creat-
ing nuance in the expressive qualities of the performance, and/or it may add
drama to the musical moment. In all of these cases, dynamics have func-
tioned in supportive roles in the communication of the musical message.
The recording process has given the recordist more precise control over
dynamics than exists in live acoustic performances. This control has brought
audio recordings to have additional relationships of dynamics and the
potential of placing more musical importance on dynamic relationships.
An example of a new relationship of dynamics is the occurrence of contra-
dictory cues between the loudness level at which a sound was performed
during the initial recording (tracking) and the dynamic level at which the
sound is heard in the final musical texture (the mix). The recordist must be
aware of all relationships of dynamic levels, both those that are naturally
occurring and those caused by the recording/reproduction process, and of
the possibility that dynamic levels and relationships may function on any
hierarchical level of the musical structure.
When most people think of mixing, they think of setting loudness levels of
instruments and voices (sound sources). There is much more to loudness
in recording. The dynamic alterations in mixing are only part of what is
covered in this chapter. Program dynamic contour and overall levels are
important aspects of the recordist’s work and are covered. In Chapter 8
loudness will be considered again as a part of sound quality/timbre evalu-
ation, as we look at dynamic envelope and spectral envelope after shifting
our level of perspective. No matter the level of perspective, a reference
dynamic level (RDL) is required for judging loudness levels.
The recording process places unique critical and analytical listening require-
ments on the recordist in the area of the evaluation of loudness. The record-
ist must be able to focus on changes in dynamics at all levels of perspec-
tive and to quickly switch focus between those levels. The recordist must
also be able to use the skills of identifying loudness relationships, switching
between listening to the sound itself out of context (using critical listening)
and listening to the sound within its musical context (analytical listening).
Much confusion often accompanies beginning attempts to perceive, follow,
and graph dynamic changes. The listener must remain focused on the
act of perceiving and defining changes in loudness levels. In the mix these
are the dynamic levels of instruments/voices and their relationships to one
another. Listeners must not allow themselves to be distracted or misled.
The listener must be conscious, and ever mindful, of not mistaking other,
easily misleading information for changes in dynamic levels. Some aspects
of sound that are often confused with dynamics are distance cues, timbral
complexity, performance intensity, sound-source pitch register, any information
that draws the attention (focus) of the listener (such as a text or the sudden
entrance of an instrument), environmental cues, and speed of musical
information.
It is common for the sounds most prominent in the listener’s focus not to be
the loudest sounds in the musical texture. Loudness itself does not create
or ensure prominence of the material. It comes as a surprise to many
people that the most prominent sound in the listener’s attention is often
NOT the sound with the highest dynamic level.
This chapter seeks to define actual loudness levels in musical contexts
(dynamics), with the exception of the final section. In that section, the actual
loudness of the musical parts as musical balance will be compared to
performance intensity (the loudness of the sound sources when they were
performed in the recording process).
Reference Levels and the Hierarchy of Dynamics
Dynamics have traditionally been described by imprecise terms such as
very loud (fortissimo), soft (piano), medium loud (mezzo forte), etc. These
terms do not provide adequate information to define the loudness level of
the sound source. They merely provide a vocabulary to communicate rela-
tive values.
The artistic element of dynamics in a piece of music is judged in relation to
context. Dynamic levels are gauged in relation to (1) the overall, conceptual
dynamic/intensity level of the piece of music, (2) the sounds occurring
simultaneously with a sound source, and (3) the sounds that immedi-
ately follow and precede a particular sound source. In this way, loudness
is perceived as relationships between sound sources and in relation to a
reference level. Evaluation is more precise, and it is only possible to communicate
meaningful information about dynamic levels, when a reference
level is defined.
Reference Dynamic Level
The impression of an overall or global intensity level of a piece of music will
be the primary reference level for making judgments concerning dynamics.
This level is arrived at through our impression of the intensity level of the
performance of the work as a single idea. It is the perceived performance
intensity of the work as a whole, conceptualizing the entire work as a single
entity out of time. The work’s form and essence have a dimension in perceived
performance intensity. This is the work’s reference dynamic level.
Every work can be conceived as having a single, overall reference dynamic
level (RDL). This is the dynamic level that characterizes a piece of music
when the form of the work is envisioned. The RDL is a single, specific
dynamic level that represents the intensity and expression of the piece.
This specific dynamic level is an understanding of the intensity level of the
performance of a piece of music, as a whole and as realized in an instant
(its form). It is the inherent spirit of the music/recording as a level of exer-
tion, expression, mood, and sometimes message combined into a single
concept. When some people talk about a song’s “groove,” they may actu-
ally be talking about its RDL.
Performers establish this reference dynamic level in their minds before
beginning a performance of any piece of music. Often this occurs intui-
tively. Composers also retain this level in their thoughts (at least subcon-
sciously) throughout the process of writing a piece of music. Recordists
will go through a similar process in production work. Recordists establish
this level as a reference from which they are able to calculate all other
dynamic levels and relationships. In recordings, this level often needs to
be consistent for hours, days, weeks or longer, as sessions for a piece of
music progress at their own pace.
Performance intensity cues are related to timbral changes of the sound
sources. Sound sources will exhibit different timbral characteristics when
performed at different dynamic levels and with different amounts of physi-
cal exertion. The impressions the listener receives, related to the intensity
level of the performance (of all of the musical parts individually and collec-
tively), will be related to actual dynamic relationships of the musical parts.
The perceived performance intensity in the recording and the perceived
dynamic relationships of the musical parts will directly shape the listener’s
impression of the existing RDL of the recording. Dynamic and performance
intensity cues (including expressive qualities of the performance) play
significant roles in determining RDL, as does tempo. The sound sources
that present the primary musical materials (or that are at the center of the
listener’s focus) will often have proportionally more influence than sources
presenting less significant material.
Through the perception of these cues and the influence of tempo, a single,
conceptual dynamic level will be determined. This is the RDL of the per-
formance (recording/piece of music)—the level at which it is envisioned
as existing. This is the reference level that will be used to calculate the
dynamic levels of the individual musical parts in relation to the whole, as
well as the dynamic contour of the overall program.
Every work will have a specific RDL. When a work is divided into separate
major sections, even separate movements (such as a symphony), the work
will have an RDL that allows all sections to be related to the overall con-
cept. Even a 90-minute symphony will have only a single RDL. The RDL
becomes one of the factors that allow the listener to perceive the work as a
single entity, composed of many related parts.
The reference dynamic level can be perceived as existing anywhere from
very loud (fff) to very soft (ppp). The RDL will be established as a precisely
defined dynamic level that will serve as a reference level throughout the
work. This will be discussed further.
The RDL will be used for evaluating/defining:
1. Dynamic relationships of the overall dynamic contour of the program
(piece of music),
2. Dynamic levels of the musical ideas (sound sources) of the work, and
3. Dynamic relationships of the individual dynamic contours of the musi-
cal balance of the work.
Performance Intensity and Dynamic Markings
Timbre changes of performance intensity are important cues in our percep-
tion of dynamic levels of acoustic performances. We apply these same cues
to recorded sounds to imagine a live performance, despite the medium.
In recordings we gauge performance intensity solely by sound quality, as
the visual cues of a live performance are not present—though they are still
imagined. Our reference for performance intensity is our knowledge of the
instrument’s timbre, as it is played at various levels of physical exertion, with
different types of expression, and with various performance techniques.
The listener recognizes the amount of physical exertion required to produce
a certain sound quality on an instrument. This understanding becomes the
perceived performance intensity. Performed sounds that require expend-
ing energy are perceived as moderately loud (mezzo forte, mf), or above.
Performed sounds that appear to be withholding energy are perceived as
moderately soft (mezzo piano, mp), or below. When the listener imagines
a considerable amount of energy (or perhaps an excessive amount) was
required to produce a perceived sound quality, the performance intensity
will be forte (f), or perhaps more. This becomes simpler when remember-
ing to relate dynamic markings and performance intensity cues to energy
expended by the performer.
The threshold between mezzo piano (mp) and mezzo forte (mf) is critical to
this understanding. This is the energy level the performer can theoretically
sustain indefinitely. Conceptually, at this level the performer is neither putting
forth energy nor holding back; no energy is being exerted. Above this
threshold, energy is being consumed by pushing forward, becoming more
assertive—even if only a very small amount. Below the threshold, energy
is being held back, or being withdrawn, if only in a small amount. Moving
further above or below the threshold, the perception becomes a matter of
degree, or magnitude, of how much energy is being expended or withheld.
The difference then between ff and fff is the level of intensity and the expec-
tation of the length of time that level of energy can be sustained. Likewise,
the level of restraint and the likelihood of the length of time that restraint
might be sustained distinguish pp and ppp.
The traditional terms for dynamic levels can continue to be of use with
a well-defined RDL based on performance-intensity information. With a
defined RDL, the comparative terms can have more significant meaning.
The terms will remain imprecise, but they will be more meaningful. Dynamic
levels are more precisely defined when sound sources are compared to
one another and placed on the appropriate graph, making the use of the
traditional terms a mere starting point for evaluation of loudness levels.
The terms retain their meanings from musical contexts, where the
dynamic terms (such as mezzo forte) describe a quality of performance
and an amount of physical exertion and drama on the part of the performer,
as well as a loudness level. When placed on
a graph (see Figure 7-1) these general terms are transformed into areas
where sounds can be precisely defined against the RDL and in relation to
performance intensity.
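Figure 7-1 Dynamic ranges and dynamic contour. [Graph: vertical axis of dynamic areas from ppp to fff with the RDL marked; horizontal axis of time; contours plotted for voice, piano, bass, flute, and guitar, identified in a key.]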
Dynamic Levels as Ranges
Dynamic levels do not exist as a specific level of loudness in musical contexts.
Dynamics are understood as ranges or areas. Dynamic markings
refer to a range of actual loudness levels.
The dynamic marking “mezzo forte” does not refer to a precise level. It
refers to a range of dynamic levels between “mezzo piano” and “forte.”
Many gradations of “mezzo forte” may exist in a certain piece of music.
Many instruments can be performing at different levels of loudness, yet be
accurately described as being in the “mezzo forte” dynamic range. Entire
musical works or performances may take place within the range of a certain
dynamic marking, yet exhibit striking contrasts of loudness levels.
Figure 7-1 presents the vertical axis that will be used for all graphs plotting
dynamics. The dynamic markings are centered within the ranges. A number
of sound sources are plotted on the graph, and the reference dynamic level
is designated on the vertical axis. Each sound source can be compared to
the dynamic levels and contours of the other sources, and to the RDL. The
unique characteristics of each source are readily apparent from the graph.
If limited to describing the sound sources by traditional dynamic-level
designations, some sounds would be a “loud mezzo piano,” some sounds
a “moderate mezzo piano,” and others a “soft mezzo piano.” The graph,
in this instance, circumvents the need for these vague and cumbersome
descriptions.
The most extreme boundaries of dynamic ranges, for nearly all musical
contexts, will extend from “ppp” to “fff.” Musical examples that contain
material beyond these boundaries are rare. The individual graph should
incorporate only those ranges that are needed to accurately present the
material, and should leave some vertical area of the graph unused above and
below the plotted sounds, for clarity of presentation.
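The idea of dynamic markings as areas, with precise levels located inside them, can be sketched in a few lines of Python. This is only an illustrative model: the names `AREAS`, `to_y`, and `to_area` are hypothetical, and giving every area the same height of 1.0 unit is an assumption made for simplicity, not something the text prescribes.

```python
from typing import Tuple

# Dynamic areas from softest to loudest, as on the vertical axis of Figure 7-1.
AREAS = ["ppp", "pp", "p", "mp", "mf", "f", "ff", "fff"]

# ASSUMPTION: every area is given the same height (1.0 unit) on the axis.
MP_MF_THRESHOLD = float(AREAS.index("mf"))  # boundary between mp and mf


def to_y(area: str, fraction: float) -> float:
    """Convert a position inside an area (0.0 = bottom, 1.0 = top) to a Y value."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie between 0.0 and 1.0")
    return AREAS.index(area) + fraction


def to_area(y: float) -> Tuple[str, float]:
    """Convert a Y value back to (area name, fraction within that area)."""
    if not 0.0 <= y <= len(AREAS):
        raise ValueError("level lies outside the ppp-to-fff range")
    index = min(int(y), len(AREAS) - 1)  # the very top edge belongs to fff
    return AREAS[index], y - index


# Three different levels that all fall within the "mezzo piano" range.
loud_mp = to_y("mp", 0.9)
moderate_mp = to_y("mp", 0.5)
soft_mp = to_y("mp", 0.1)
```

In this model a “loud mezzo piano” and a “soft mezzo piano” are simply different numbers inside the same area, which is the distinction the graph makes visible.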
Determining Reference Dynamic Levels
The perceived performance intensity level that serves as our reference for
evaluating the dynamic relationships of the work is, again, the reference
dynamic level. A single RDL will be used (1) at the highest hierarchical level
to calculate the dynamic contour of the entire program (the complete musi-
cal texture) and (2) at the midlevels of the structural hierarchy to calculate
the dynamic contours of the individual sound sources in musical balance.
At the lower levels of the structural hierarchy, the reference level switches
to the performance intensity level of the individual sound. The intensity
level of a specific appearance of the sound source is used as the reference
level to determine the dynamic contours of the individual sound and
its component parts. At these levels of perspective, dynamic contours are
plotted of (1) a typical appearance of the overall sound source (the dynamic
envelope) and (2) the individual components of the spectrum (spectral
envelope). This dynamic contour information seeks to define the sound
quality or timbre of the sound source, and will be explored in the next
chapter, in that context.
The RDL of the piece of music is the reference for determining the dynamic
levels of the sound sources and the composite musical texture. In order to
make these evaluations, the RDL must first be defined.
The RDL is a precise level that can be clearly defined. It is not subjective. All
listeners putting forth the effort to perceive it will arrive at the same level.
The listener will recognize when they have identified the correct level. The
level will cause all other dynamic relationships to be understood, to make
sense. The listener will perceive the piece of music as existing at the precisely
identified dynamic level. The dynamic level is envisioned as a dimension
of the essence of the piece of music.
A listener’s first attempts to define the RDL are usually difficult. The concept
itself eludes many people at first. The reader must remain conscious of trying
to understand this important concept. It will require many hearings of
the recording/piece of music to learn it well enough to try to define something
requiring this depth of understanding. After achieving this level of
understanding, the listener can become more comfortable with formulating
the impression of a single dynamic level that is the dynamic/intensity
level of the piece. The listener will recognize the RDL of a piece once it has
been experienced and understood. It is likely they will then not forget it. Listeners
often experience the RDL even in passive listening for entertainment
and do not realize it; indeed it is often the song’s overall dynamic/intensity
(energy, intensity, motion, mood, and expression combined into its “groove”) that
draws a listener to identify strongly with a song.
It is common for some information to get in the way of formulating this
impression of the RDL. Musical materials, lyrics, tempo, and instrument
timbres all give cues that the listener will be tempted to factor into this
observation. Skilled musicians are even prone to drawing conclusions
based on what they would like to be present in the music, rather than listening
to what is present. Some instruments may well be performing at
intensities that send conflicting cues when considered against the listener’s
ideas of the potential RDL; this conflicting information enriches art, but
makes defining the RDL more difficult. A magic formula does not exist for determining
the RDL. This is one of the significant artistic dimensions of a piece
of music that defies theoretical analysis and instead relies on the sensibilities
of the listener. This does not make the RDL subjective; it means only that the RDL
cannot be predicted by measured calculation, but is determined through the experience
of the music and arrived at through an understanding of the piece. Again,
the RDL will be a precise level that all listeners can agree upon (± 2 percent)
given enough attention to the task.
Figure 7-2 identifies the RDL of “Lucy in the Sky with Diamonds.” Two levels
are shown: one for the original version from Sgt. Pepper’s Lonely Hearts
Club Band and the other for the 1999 Yellow Submarine version. The two
versions have slightly different reference dynamic levels. The different levels
are caused by the different mixes and presentations of materials, among
other factors such as the sound qualities imparted during mastering. The
essence of the piece has been slightly altered by the slightly different sound
qualities of the sound sources and the recordings. The reference dynamic
levels are perceived as clearly within mezzo forte’s moderate expending of
energy. They are both beyond the level midway between the beginning of
mf and the threshold of f (where exertion moves beyond moderate). In fact
both exceed the three-quarters level of the area that comprises mf. After
much listening and contemplation, the levels can be understood as being
in the upper 15 percent of the mf area, with not more than 10 percent of the
area separating the two levels.
Figure 7-2 Reference dynamic levels of two versions of “Lucy in the Sky with Diamonds”: * original version from Sgt. Pepper’s Lonely Hearts Club Band and ** 1999 Yellow Submarine version.
RDL is calculated after getting to know the composition, recording, and/or
performance well. Two different performances of the same piece by the
same performer may each have a different RDL. Each of two different inter-
pretations will almost certainly have a different RDL, if only slightly.
An exercise in determining the reference dynamic level of a recording/piece
of music appears at the end of this chapter, Exercise 7-1.
Program Dynamic Contour
Changes of dynamic level, over time, comprise dynamic contour. As noted,
the dynamic levels and relationships occur at all hierarchical levels. The
broadest level of perspective will allow the dynamic contour of the overall
program to be perceived and plotted. This is the single dynamic level/contour
of the composite sound, the result of combining all sounds. This
program dynamic contour can be envisioned as a monometer, following the
dynamics of the entire program.
Skill in recognizing the dynamic level of the overall program is developed
through plotting program dynamic contour. Recordists use this skill in many
listening evaluations. This high-level graph, or the associated listening skill
alone, will be applied to many analytical and critical listening applications.
Dynamic contour must not be confused with performance intensity, with
distance cues, or with spectral complexity. These aspects of recorded sound
often present cues that contradict actual dynamic (loudness) level or that
alter the perception of the actual dynamic level.
Program dynamic contour information is plotted on a program dynamic
contour graph. The graph incorporates:
1. Dynamic area designations for the Y-axis, distributed to complement
the characteristics of the musical example;
2. The reference dynamic level designated as a precise level on the Y-axis;
3. X-axis of the graph dedicated to a time line, divided into appropriate
increments of the metric grid or of real time (depending on the material
and context); and
4. A single line plotted against the two axes (the dynamic contour of the
composite program is the shape of this line).
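The four components of the graph listed above can be held in a small data structure. The following is a minimal sketch, not a prescribed method: the class name `ProgramDynamicContour`, its fields, and the choice to join plotted points with straight lines are all assumptions made for illustration, and levels are expressed as plain numbers on a Y-axis whose scale the user chooses.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ProgramDynamicContour:
    """One line plotted against a dynamics Y-axis and a time-line X-axis."""

    areas: List[str]   # dynamic areas shown on the Y-axis, soft to loud
    rdl: float         # reference dynamic level as a precise Y value
    time_unit: str     # "measures" or "seconds", depending on the material
    points: List[Tuple[float, float]] = field(default_factory=list)  # (time, level)

    def add_point(self, time: float, level: float) -> None:
        """Add a plotted point, keeping the line ordered along the time line."""
        self.points.append((time, level))
        self.points.sort()

    def level_at(self, time: float) -> float:
        """Read the contour at any time, joining plotted points with straight lines."""
        if not self.points:
            raise ValueError("no points have been plotted")
        times = [t for t, _ in self.points]
        i = bisect_right(times, time)
        if i == 0:                      # before the first plotted point
            return self.points[0][1]
        if i == len(self.points):       # at or after the last plotted point
            return self.points[-1][1]
        (t0, y0), (t1, y1) = self.points[i - 1], self.points[i]
        return y0 + (y1 - y0) * (time - t0) / (t1 - t0)

    def relative_to_rdl(self, time: float) -> float:
        """Positive when the program is above the RDL, negative when below."""
        return self.level_at(time) - self.rdl
```

A contour built this way can be queried at any point on the time line, and `relative_to_rdl` returns zero at those moments where the program settles on its reference dynamic level.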
The program dynamic contour of The Beatles’ “Here Comes the Sun”
is plotted in Figure 7-3. The program dynamic contour graph shows the
changes in overall dynamic level throughout the work. In listening to the
work, the striking changes in the overall dynamic level will be evident. The
wide dynamic range goes through many large and many subtle changes of
loudness level. Pertinent to reference dynamic level, a sense of arrival hap-
pens at the end of the piece. This occurs when the song settles at the RDL
for a moment and then the song is over.
It is common for a piece of music to arrive at its RDL as an important occur-
rence of the work. It might appear in the introduction, the dramatic climax,
the final chorus or the final verse. These are all common places an RDL
might be reached, but any location is possible. It is also possible that the
song’s RDL is never sounded, purposefully leaving the listener unfulfilled
in this regard. It is possible the song’s RDL will be prevalent in a piece
or heard only once as in “Here Comes the Sun.” Many possibilities exist,
including silence, as some songs only fully make sense when the silence at
the end arrives and brings introspection.
Listening to the loudness contour and levels of program dynamic contour
is an important skill for the recordist. It is used extensively in live sound,
film sound, mastering, and at times broadcast, whenever one is concerned
about the level of the overall program and how it changes over time. During
mixdown and tracking, it is also necessary to be aware of the overall
program level and dynamic contour in order to control the quality and level
of the recording.
Musical Balance
The plotting of all the individual sound sources in a musical texture by
their dynamic contours provides the musical balance graph. The graph will
show the actual dynamic level of the sound sources, in relation to the RDL,
as established above. Each sound source will have a separate line on the
graph that will allow the dynamic contours of the sources to be mapped
against a common time line. This graph will clearly show the loudness lev-
els of all sounds and will represent the mix of the work.
The musical balance graph will not include information on performance
intensity, sound quality, or distance. These cues are often confused with
perceived dynamic level, and are purposefully avoided here. This graph
is solely dedicated to evaluating and understanding the dynamic levels,
contours, and relationships of the sound sources. This should be the focus
of the listener here.
The musical balance graph incorporates:
1. Dynamic area designations for the Y-axis, distributed to complement
the characteristics of the musical example;
2. Reference dynamic level designated as a precise level on the Y-axis;
3. X-axis of the graph dedicated to a time line, divided into appropriate
increments of the metric grid;
4. A single line plotted against the two axes for each sound source; and
5. A key relating the names of all sound sources to a unique number,
color, or line composition to identify all source lines on the graph.
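A musical balance graph adds one line per sound source and a key to the structure of the program graph. The sketch below shows one possible representation; the names `balance`, `source_key`, `loudest_source`, and `offsets_from_rdl` are hypothetical, levels are assumed to be sampled once per measure, and every number is invented for illustration rather than taken from any recording.

```python
from typing import Dict, List

# ASSUMPTION: levels are Y values sampled once per measure; all numbers are
# hypothetical and do not come from any recording.
RDL = 4.9
balance: Dict[str, List[float]] = {
    "voice": [4.6, 4.8, 5.0, 4.7],
    "bass": [4.2, 4.2, 4.3, 4.3],
    "guitar": [4.9, 4.4, 4.1, 4.8],
}

# The key relates each sound source to a unique number for its line.
source_key = {name: number for number, name in enumerate(balance, start=1)}


def loudest_source(graph: Dict[str, List[float]], measure: int) -> str:
    """Name the source with the highest actual loudness level in a measure."""
    return max(graph, key=lambda name: graph[name][measure])


def offsets_from_rdl(
    graph: Dict[str, List[float]], rdl: float, measure: int
) -> Dict[str, float]:
    """Level of every source in a measure, expressed relative to the RDL."""
    return {name: round(levels[measure] - rdl, 2) for name, levels in graph.items()}
```

Comparing all sources within one measure, rather than following a single line, mirrors the advice to listen from one level of perspective above the individual instruments.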
Figure 7-4 presents a musical balance graph of some of the sound sources
in The Beatles’ “Lucy in the Sky with Diamonds,” Sgt. Pepper’s Lonely Hearts
Club Band version. These are the actual loudness levels of the instruments.
The listener will notice that the loudest instrument/voice is not always the
most prominent.
Exercise 7-3 at the end of the chapter will lead the listener through the
process of creating a musical balance graph. It is suggested the reader
become well acquainted with this exercise, as it is one of the most impor-
tant production and basic listening skills required of the recordist.
It is necessary to clearly envision the correct perspective to accurately
hear musical balance relationships. To compare the levels of two or more
instruments (or sound sources), one must focus on the perspective one
level higher than that of the individual instruments; at that level the sound
sources can be perceived as being equal, and compared without bias.
When listening at the level of the individual sound source, a single instru-
ment will become the center of one’s attention and it will be emphasized in
one’s mind, thereby causing loudness judgment to be skewed. One does,
however, listen at this perspective of the individual sound source to iden-
tify the level of a sound source in relation to the work’s RDL.
Performance Intensity versus Musical Balance
Performance intensity is the dynamic level at which the sound source was
performing when it was recorded. In many music productions, this dynam-
ic level will be altered in the mixing process of the recording. The perfor-
mance intensity of the sound source and the actual dynamic level of the
sound source in the recording will most often not be identical and will send
conflicting information to the listener.
The dynamic levels of the various sound sources of a recording will often
be at relationships that contradict reality. Sounds of low performance
intensity often appear at higher dynamic levels in recordings than sounds
that were originally recorded at high performance-intensity levels. This is
especially found in vocal lines. This conflicting information may or may not
be desirable, and the recordist should be aware of these relationships.
Important information can be determined by plotting performance intensity
against musical balance for some or all of the sound sources of a recording.
This will often provide significant information on the relationships of sound
sources and the overall dynamic and intensity levels of the work, as well as
the mixing techniques of the recording.
Performance intensity is plotted as the dynamic levels of the original
performance. The listener will judge the intensity of the original performance
through timbre cues. The listener will make judgments based on their
knowledge of the sound qualities of instruments and voices when performed
at various dynamic levels.
The reference for performance intensity is the listener’s
knowledge of the particular instrument’s timbre, as that instrument is
played at various levels of physical exertion and with various performance
techniques. A reference dynamic level is not applicable to this element.
Musical balance is plotted as the dynamic levels of the sound sources, as
the listener perceives their actual loudness levels in the recording itself, as
discussed immediately above.
The performance intensity/musical balance graph incorporates:
1. Dynamic area designations in two tiers for the Y-axis, distributed to
complement the characteristics of the musical example (one tier is
dedicated to musical balance, and one tier is dedicated to performance
intensity);
2. Reference dynamic level on the musical balance tier, designated as a
precise level on the Y-axis (an RDL is not relevant to the performance
intensity tier);
3. X-axis of the graph dedicated to a time line, divided into appropriate
increments of the metric grid;
4. A single line plotted against the two axes for each sound source, on
each tier of the graph (each sound source will appear on both tiers; the
same number, composition, or color line is used for the source on each
tier of the graph); and
5. A key used to clearly relate the sound sources to their respective source
line (the same key applies to both tiers of the graph).
Listen . . . to tracks 37 and 38 for one mix that closely aligns performance intensity and musical balance and a different mix of the same performance that radically alters the musical balance of the original performance.
Figure 7-5 Performance intensity versus musical balance: The Beatles’ “Strawberry Fields Forever.” [Graph: two tiers (performance intensity and musical balance) plotted against measures 1 to 22, with the Introduction, Chorus, and Verse 1 marked and meter changes among 4/4, 2/4, and 3/4; key: voice, electric guitar, maracas, snare, Mellotron.]
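Once both tiers of the graph described above exist, the difference between them can be computed for each sound source. The sketch below is one hypothetical way to do so; the function name `tier_differences` and the sample levels are invented for illustration, and it assumes both tiers use the same Y-axis units and the same time line.

```python
from typing import Dict, List


def tier_differences(
    performance_intensity: Dict[str, List[float]],
    musical_balance: Dict[str, List[float]],
) -> Dict[str, List[float]]:
    """Musical balance minus performance intensity, per source and per measure."""
    differences = {}
    for name, intensity in performance_intensity.items():
        balance = musical_balance[name]  # each source appears on both tiers
        if len(balance) != len(intensity):
            raise ValueError(f"tiers for {name!r} cover different time lines")
        differences[name] = [b - p for b, p in zip(balance, intensity)]
    return differences


# Hypothetical levels for two measures (not taken from Figure 7-5).
intensity_tier = {"voice": [3.5, 3.5], "snare": [5.5, 5.5], "mellotron": [4.0, 4.0]}
balance_tier = {"voice": [5.0, 5.25], "snare": [4.0, 4.0], "mellotron": [4.0, 4.0]}
differences = tier_differences(intensity_tier, balance_tier)
```

A positive difference marks a source that the mix presents louder than it was performed, a negative difference marks one presented softer, and values near zero mark sources whose two tiers agree.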
Figure 7-5 will allow the listener to observe some of the differences
between the recording’s actual loudness levels and the performance inten-
sities (loudness levels) of the sound sources when they were recorded. A
few key sound sources are graphed from The Beatles’ “Strawberry Fields
Forever.” Some sound sources are at very different levels in each tier, and
others show no significant change. Some sources contain subtle changes
of dynamic levels and/or many nuances of performance intensity information,
while the Mellotron exhibits few gradations of dynamics and intensity.
An exercise to develop skills in recognizing and evaluating performance
intensity versus musical balance information appears as Exercise 7-4 at the
end of the chapter. As an additional exercise, the reader might determine
the musical balance and performance intensities of the other sound sourc-
es in “Strawberry Fields Forever” during the measures of Figure 7-5.
The musical balance and the performance intensity graphs will function
at the same level of perspective as the pitch density graph (in Chapter 10).
When evaluated jointly, these three artistic elements will allow the listener
to extract much pertinent information about the mixing and recording pro-
cesses, and the creative concepts of the music.
Exercises
The following exercises should be performed with care. You should work me-
thodically to become comfortable with the material covered.
When performing loudness/dynamic evaluations, and evaluations of many
of the other elements that follow, work to identify clearly what you know for
certain. Ask yourself if the sound is at extremes of the dynamic range, and
work toward focusing in on the correct level. Continue to feel comfortable
that you have knowledge of what the sound level is not, and work at finding
what the level is.
Exercise 7-1
Reference Dynamic Level Exercise
Select a recording you know well for initial attempts at determining the refer-
ence dynamic level of a piece of music. It would be best for the work to be less
than four minutes duration.
1. Before listening to the piece, spend some time thinking about the overall
character of the piece; consider the overall energy level, performance in-
tensity, concept or message, and other important aspects of the song.
2. Listen to the song several times to confirm that the observations in your
memory are reflected in the actual music and recording.
3. Reconsider your observations with each new hearing of the recording.
4. Attempt to determine a precise dynamic level for the RDL. Begin this process
by working from the extreme levels (ppp and fff), asking if the level
exists in those areas.
5. Once the dynamic area has been determined, work to define a precise
level by asking if the RDL is below the 50 percent point of the area, or above.
Continue to work toward a specific level by narrowing the area further.
6. Leave the example and your answer for a period of time (several hours
or several days). Listen to the song again. Reconsider the RDL previously
defined.
If you do not know a piece of music, many hearings will be required before
initial observations can be made.
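Step 5 of the exercise above, asking whether the RDL lies below or above the midpoint and then narrowing further, resembles a halving search. The following sketch models it under stated assumptions: the listener's judgment is stood in for by a hypothetical callback `is_above`, positions are fractions of the chosen dynamic area (0.0 at the bottom, 1.0 at the top), and the stopping `tolerance` is an arbitrary illustrative value.

```python
from typing import Callable, Tuple


def narrow_rdl(
    is_above: Callable[[float], bool], tolerance: float = 0.05
) -> Tuple[float, float]:
    """Halve the candidate region of a dynamic area until it is narrow enough.

    is_above(fraction) stands in for the listener's judgment: it returns True
    when the RDL is heard as lying above that point within the area.
    """
    low, high = 0.0, 1.0
    while high - low > tolerance:
        midpoint = (low + high) / 2
        if is_above(midpoint):
            low = midpoint    # the RDL lies in the upper half of the region
        else:
            high = midpoint   # the RDL lies in the lower half of the region
    return low, high


# A simulated listener whose (hypothetical) RDL sits at 90 percent of the area.
low, high = narrow_rdl(lambda fraction: 0.9 > fraction)
```

Each question halves the remaining region, so only a handful of careful comparisons is needed to move from “somewhere in the area” to a narrow band. The real work remains the listening behind each answer.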
Exercise 7-2
Program Dynamic Contour Exercise
Select a short song for initial attempts at creating a program dynamic con-
tour graph. The entire song should be graphed for overall dynamic contour.
The dynamic level of the entire recording (the composite dynamic level of
all sounds) will be the focus of this exercise. Initial attempts at this exercise
should use a song with large, sudden changes of dynamic level. Repeat the
exercise using a song with changes that are smaller. Calculate a general shape
of the dynamic contour before attempting to grasp all of the subtle details.
1. During the first hearing(s), listen to the example to establish the length of
the time line. At the same time, become acquainted with the character of
the song to begin formulating an idea of its RDL.
2. Check the time line for accuracy and make any alterations. Establish the
RDL of the work by working through the previous exercise.
3. Notice the activity of the program dynamic contour for boundaries of lev-
els of activity and speed of activity. The boundary of speed will establish
the smallest time unit required to accurately plot the smallest significant
change of the element. The boundary of levels of activity will establish the
smallest increment of the Y-axis required to plot the smallest change of
the dynamic contour.
4. Begin plotting the dynamic contour on the graph, continually relating
the perceived dynamic level to the RDL. First, establish prominent points
within the contour. These reference points will be the highest or lowest
levels, the beginning and ending levels, points immediately after silences,
and other points that stand out from the remainder of the activity. Use
the points of reference to judge the activity of the preceding and following
material. Focus on the contour, speed, and amounts of level changes to
complete the plotting of the contour.
5. The evaluation is complete when the smallest significant detail has been
perceived, understood, and added to the graph.
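Step 3 of this exercise, finding the boundaries of speed and of levels of activity, can be read as choosing the resolution of the graph's two axes. The sketch below is one interpretation, not the author's procedure: the function name `plotting_resolution` is hypothetical, and it assumes the listener has already noted the significant changes as (time, level) pairs.

```python
from typing import List, Tuple


def plotting_resolution(observations: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (smallest time unit, smallest Y increment) needed for the observations.

    observations holds (time, level) pairs for each significant change noted.
    """
    ordered = sorted(observations)
    pairs = list(zip(ordered, ordered[1:]))
    time_steps = [t1 - t0 for (t0, _), (t1, _) in pairs if t1 > t0]
    level_steps = [abs(y1 - y0) for (_, y0), (_, y1) in pairs if y1 != y0]
    if not time_steps or not level_steps:
        raise ValueError("need at least two observations that differ in time and level")
    return min(time_steps), min(level_steps)


# Hypothetical observations: time in beats, level in Y-axis units.
noted_changes = [(0, 4.0), (4, 4.5), (6, 4.25), (6.5, 5.0)]
time_unit, level_increment = plotting_resolution(noted_changes)
```

The fastest change sets the smallest division of the time line, and the subtlest change sets the smallest division of the Y-axis; anything coarser would hide a significant detail.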
Exercise 7-3
Musical Balance Exercise
Select a popular song with at least three instruments and voice. Evaluate the
first 32 bars for musical balance. This exercise will graph the dynamic contours
(actual loudness levels) of all sound sources against the song’s RDL.
Musical balance is the relationships of sound sources to one another. Initial
attempts should use pieces of music with only a few sound sources.
The exercise will follow the sequence:
1. During the first hearing(s), establish the length and structural divisions
of the time line. At the same time, notice prominent instrumentation and
activity of their general dynamic levels against the time line.
2. Check the time line for accuracy and make any alterations. Establish a
complete list of sound sources (instruments and voices), and sketch the
presence of the sound sources against the completed time line. A key
should be created, assigning each sound source with its own number,
color, or line format.
3. Determine the reference dynamic level of the sound using the process
previously presented.
4. Notice the activity of the dynamic levels of the sound sources (instru-
ments and voices) for boundaries of levels of activity and speed of activ-
ity. The boundary of speed will establish the smallest time unit required
to accurately plot the smallest significant change of dynamic level. The
boundary of levels of activity will establish the smallest increment of the
Y-axis required to plot the smallest change of dynamics.
5. Begin plotting the dynamic contours of each instrument or voice on the
graph. Keeping the RDL clearly in mind, establish the beginning dynamic
levels of each sound source. Next, determine other prominent points of
reference. Use the points of reference to judge the activity of the preced-
ing and following material. Focus on the contour, speed, and amounts of
level changes to complete the plotting of the dynamic contours.
6. You should periodically shift your focus to compare the dynamic levels of
the sound sources to one another. This will aid in developing the dynamic
contours and will keep you focused on the relationships of dynamic levels
of the various sources. The evaluation is complete when the smallest significant
detail has been incorporated into the graph.
It is important to remain focused on the actual loudness of instruments, mak-
ing certain your attention is not drawn to other aspects of sound.
As you gain experience and confidence in making these evaluations, songs
with more instruments should be examined and longer sections of the works
should be evaluated.
Exercise 7-4
Performance Intensity versus Musical Balance Exercise
Select a multitrack recording of a popular song with at least five sound sources.
Evaluate the first 16 bars for performance intensity and musical balance.
Select five sound sources to graph for this exercise. The graph will have two
tiers: one will graph musical balance (the actual loudness levels in the record-
ing), the other performance intensity (the loudness levels of the instruments
when they were recorded).
1. The musical balance exercise should first be completed as in the previous
section. This will generate the graph’s time line as well.
2. Performance intensity will now be determined for each sound source, for
the performance intensity tier. Sound sources will have the same number,
color, or line format as on the musical balance tier.
3. Begin plotting the performance intensity of each sound source on the
graph. Start by establishing the beginning performance intensity levels
of each sound source. Next, determine other prominent points of ref-
erence. Use the points of reference to judge the activity of the preced-
ing and following material. Focus on the contour, speed, and amounts
of level changes to complete the plotting of these performance intensity
contours. The evaluation is complete when the smallest significant detail
has been incorporated into the graph.
You can now compare the two tiers, and learn significant information on how
the voices and instruments were tracked, and how the recording and mixing
processes altered the sound sources. These provide insights into the musical
and production decisions that went into “crafting the mix” of that recording.
As you gain experience in making these evaluations, all of the sound sources
of songs with many instruments should be examined for longer sections of
songs that have significant changes in the mix.
8 Evaluating Sound Quality
This chapter will present a process that can be used to evaluate sound
quality and timbre. This process can enable anyone in the audio industry to
communicate meaningful information about sound.
By directly describing the physical dimensions of sound, the process can be
easily adapted to evaluate any sound. The reader will learn to describe sound
quality (accounting for the various contexts within which sounds exist), and
to evaluate timbre (sound out of context as an abstract sound object).
The critical listening process and the technical areas of the audio industry
are often juxtaposed with creative applications and analytical listening pro-
cesses. These differences are most apparent in examining the characteris-
tics of sound quality and timbre, and will be articulated here. In addition, the
perception of sound quality at various hierarchical levels will be explored.
Sound quality as an overall shape or character exists at a number of levels
in perspective and types of listening. These concepts are central to the eval-
uation process, as they allow us to understand sound as an object (avail-
able for evaluation out of time) at all levels of perspective.
Communicating information about sound quality is important to all facets
of music production. Nearly all positions in the entire audio industry need
to communicate about the content or quality of sound. Yet a vocabulary for
describing sound quality and a process for objectively evaluating the com-
ponents of sound quality do not exist. Meaningful communication about
sound quality can be accomplished through describing the physical com-
ponents of timbre and sound quality. Sounds will be described by the char-
acteristics that make them unique. These characteristics are the activities
and states that occur in the component parts of the sound source’s timbre.
The component parts have been reduced to the definition of fundamental
frequency, dynamic envelope, spectrum (spectral content), and spectral
envelope. Meaningful information about sound quality can be communi-
cated through describing these characteristics. This can be done in great
detail or in a general way. Information can be gathered and communicated
in a more detailed and precise manner through graphing sound quality.
The following sections will lead the reader to develop skill and language to
objectively describe sound.
Sound Quality in Critical Listening Contexts
Describing sound quality in and out of musical contexts are skills that are
important for recordists. In both contexts, sound quality evaluations occur
at all levels of our perceptual hierarchies—at all perspectives.
Outside musical contexts, we are concerned with understanding the char-
acteristics of sound quality for its own sake and as the integrity of the audio
signal. These are perhaps the most prominent approaches to sound quality
for the recordist.
This first approach to examining sound quality looks at the unique character
of a particular sound. This character of the sound may be what separates
one microphone’s imprint from another, one person’s snare drum
sound from another’s, or even one monitor speaker from others. Through
critical listening skills, sound quality of a particular sound is evaluated for
its unique qualities. The evaluation examines the activities of the compo-
nents of timbre that occur within that sound only. This type of sound quality
evaluation seeks to define the individual sound so it can be understood.
This information can then be put to use in various ways. One might evalu-
ate a sound to examine how a microphone is altering the characteristics of
an instrument to determine if the microphone produces the desired sound.
Perhaps two or more sounds are evaluated and compared—the crash cym-
bal sounds from Chapter 6 are an example of this type of comparison.
Critical listening evaluation is also often concerned with the technical qual-
ity, or integrity, of the recording. The technical quality or the integrity of the
signal are of paramount importance, and are out of musical contexts. It is
usually the goal of recordings to be of the highest technical quality, and
to be void of all degradations of signal quality/integrity and all unwanted
sound (although Internet compression seems to be modifying this goal by
accepting degradations of signal in exchange for speed of transmission).
The technical quality of recordings is degraded by unwanted sounds and
can also be impacted by the numerous ways the dimensions of frequency/
pitch, amplitude/loudness, and time/phase are undesirably altered by the
recording and reproduction equipment and processes (often malfunctions,
mismatches or mis-calibrations). These must all be heard and understood by
the recordist, and are only perceived through sound quality evaluations.
The focus of the listener may need to be at any level of perspective, with
the listener needing to quickly and accurately shift perspectives. The listener
may be evaluating the technical quality of the sound for information related
to frequency response (or spectral content, or spectral envelope), or the
listener may be listening for transient response (or dynamic envelope, or
dynamic contour of a specific frequency area). As examples of extremes of
perspective, the listener may focus on the sound quality of the overall pro-
gram or may be listening at the close perspective of focusing on the sound
quality of a particular characteristic of a single, isolated sound source.
Critical listening involves the evaluation of sound quality to define what
is physically present, to identify the characteristic qualities of the sound
being evaluated, or to identify any undesirable sounds or characteristics
that influence the integrity of the audio signal. This process and
conceptualization is performed without consideration of the function and/or
meaning of the sound and without taking into account the context of the sound.
Sound Quality in Analytical Listening Contexts
Analytical listening involves the evaluation of sound quality to identify its
characteristic qualities in relation to the context of the sound. Analytical
listening will seek to define the sound quality in terms of what is
physically present, but will then relate that information to the musical
context in which the sound material is presented and perceived. It involves
defining sounds and then comparing them to others.
Pitch density is an evaluation of pitch-related characteristics of sound qual-
ity, at the perspective of the individual sound source. This evaluation will
use sound-quality information and relate it to a musical context. Pitch area
analysis (such as the ones performed on percussion sounds in Chapter 6) is
another evaluation of pitch-related characteristics of sound quality, only at
a lower perspective within the characteristics of the individual sound. This
may not necessarily take place within the musical context; when it does not,
it is a critical-listening process. When the information is used to evaluate
how a pitch area combines with other sounds to form the overall texture
(timbral balance) of the recording, this is in musical context and is
analytical listening.
The pitch density and timbral balance evaluations in Chapter 10 are in musi-
cal contexts. The pitch area analyses of the percussion sounds in Chapter 6
are out of musical context.
Both of these studies are simple (or more general) approaches to timbre
and sound quality evaluation. They both define the pitch-component
information that leads to defining timbre, or evaluating sound quality. This
process is related to evaluating the spectral content of a sound. In the same
way, dynamic contour analyses are related to sound quality analysis, at
various structural levels.
The sound quality of the entire program, or of any individual sound source,
may be supplying the most significant musical information in certain pieces
of music. This concept of music composition (that can be explained through
equivalence) is quite prominent in many very different styles of music.
Throughout the twentieth century, a type of writing, sound mass composition,
evolved through the work of composers Edgar Varèse, George Antheil,
Krzysztof Penderecki, Karlheinz Stockhausen (whose photo appears on the
album cover of Sgt. Pepper’s Lonely Hearts Club Band), and Luciano Berio
(to name only a few). Their music of this type places an emphasis on the
dimensions of the overall musical texture (or sound mass) and/or on the
sound quality relationships within the overall musical texture.
The concept of giving musical significance to the sound quality of the
entire program, the relationships of sound qualities, and to pitch density
can be found in a wide variety of popular works from the 1960s on. Many
examples of these ideas exist, although this concept is used in isolated
areas in most works.
The Beatles’ “A Day in the Life” is one such work. The concept of sound
quality and pitch density is what shapes the music and the dramatic motion
of the song’s transition section and its conclusion. The sound mass con-
cept of pitch density is the primary musical element, causing timbre/sound
quality to be the dominant artistic element during those sections.
Sound Quality and Perspective
We will remember that sound quality is our recognition of a sound as a
single, unique entity. The listener conceives sound in this way by its over-
all character that is composed of the component parts: dynamic envelope,
spectrum and spectral envelope. Further, we recognize sound quality in
this way at any number of levels of perspective (perceived detail).
We are able to recognize sound quality most often at the following perspec-
tives for analytical listening:
1. as timbral balance of the overall program (the entire musical texture),
2. as groupings of similar or similarly acting sounds within the overall
program (such as a brass or a string section within an orchestra, or the
rhythm section of a jazz ensemble),
3. as the interrelationships of the pitch densities of individual instruments,
4. as the characteristic overall impression of an individual sound source
(instrument, voice, synthesizer patch, special effect),
5. as specific sounds generated by individual sound sources (individual
expressive vocal sounds, a specified voicing of a guitar chord, a single
sound’s timbre characteristics), or
6. as environmental characteristics—the sound quality of an environment
(explored in the next chapter).
Critical listening also engages all levels of perspective in a similar manner.
The recordist is concerned about the integrity of the signal at the overall
texture, the individual sound source, any groupings of sound sources (such
as a drum mix), an isolated sound, the subtle qualities of spectrum or spec-
tral envelope within a sound, and more.
At these very different levels of perspective, we recognize sound quality
as the concept that makes a sound a single, unique entity composed of
recognizable characteristics. This global quality is evaluated to determine
specific information, to identify the aspects that make each sound unique.
Evaluating the Characteristics of Sound Quality and Timbre
Sound quality evaluations will examine spectral content, spectral enve-
lope, and dynamic contour, in both musical contexts and in critical listening
evaluations. Sound quality evaluation will be approached in this way at all
levels of perspective.
In analytical listening at the highest levels, the individual sound sources
that make up the overall texture can be conceived as individual spectral
components. At the lowest level, individual spectral components are evalu-
ated, and individual evaluations may be performed for each occurrence of
a sound source. Individual sound sources may be analyzed for their con-
tributions to the sound quality of the overall program. In musical contexts,
this evaluation will compare the sound sources to the overall sound quality
through their individual dynamic contours (creating musical balance), their
pitch area (creating pitch density evaluations), and their spatial character-
istics (Chapter 9).
In critical listening, the contributions of the individual sounds to the overall
program can be approached in relation to the same dimensions, but with-
out relation to musical time or context.
Often, the audio professional is concerned with evaluating the individu-
al sound source. Individual sound sources are evaluated for their unique
sound quality as sound objects. The sounds are evaluated to define their
unique characteristics, through an evaluation of the states and activities of
their component parts, out of the musical context.
This is the most widely applied use of the evaluation of sound quality. It
is used for many activities from setting signal processors to evaluating
the performance of audio devices, and from defining the general
characteristics of a sound source (such as the sound quality of a guitar part in a
recording) to a detailed evaluation of a particular guitar sound. Many other
similar examples can readily be found.
Often, a sound quality evaluation is performed on a single, isolated presen-
tation of the sound source. An isolated presentation will have its unique
pitch level, performed dynamic level, method and intensity of articulation,
etc. Among many uses, examining a particular isolated presentation of a
sound source allows for meaningful comparison between different perfor-
mances of the same source, or of a different sound source performing simi-
lar material. In practice, several or even many isolated sounds are often
compared to determine the most desirable sound or to try to identify the
source of a distortion or noise.
Evaluations of sound quality will seek to define and describe the states and
activities of the sound source’s (1) dynamic envelope, (2) spectral content,
and (3) spectral envelope. They will also make use of the listener’s carefully
evaluated perception of (4) pitch definition.
We will focus our study on timbre or sound quality evaluations of single
sounds. This will allow us to explore the most common application of
sound-quality evaluation in perhaps the most straightforward manner.
Defining a Time Line
Most often sound quality and timbre evaluation will take place out of musi-
cal contexts. In these critical listening applications, clock time is used to
evaluate the characteristic changes that occur over time. While we need to
conceive sound quality as the shape of the sound in an instant, or out of
time, sound only exists in time. Sound can only be accurately evaluated as
changes in states or values of the component parts, which occur over time.
Sound quality in critical listening applications approaches the sound as an
isolated, abstract object. The sound is understood as an object that has a
characteristic shape that unfolded over time.
The evaluation may also take place within musical contexts. In these
instances, the time of the metric grid will be used, when it is present. Evalu-
ation of sound quality will be focused on the musical relationships of the
material, and usually takes place at a higher perspective than critical listen-
ing’s timbre evaluations.
Sound quality evaluations of single sounds will nearly always use clock
time in tenths or perhaps hundredths of seconds in the time line. The time
line of the sound is determined first in any sound quality evaluation. A clock,
counter, or stopwatch might be used as a reference. Next, increments within
the time line and suitable reference points within the time line will be
identified. Skill needs to be acquired in performing these tasks. The time
exercises of Chapter 5 will greatly assist this development, and the reader
is encouraged to revisit them. While determining a time line will require a
number of hearings, this number will be reduced with experience and
increased ability.
Listen . . .
to tracks 26-33 again for the exercise in learning the sound quality of
small time units. This will assist you in defining time lines for timbre
and sound quality evaluations, and much more.
Defining the Four Components of Sound Quality Evaluations
The physical dimensions of the sound source (dynamic envelope, spectral
content, and spectral envelope), in their unique states and levels, are used
to define sound quality. The definition of fundamental frequency (pitch
definition) aids in defining information of those values, especially the
loudness level of the fundamental frequency in relation to the remainder of
the sound’s spectrum and the dominance of harmonic partials. These four
components of sound quality evaluations are examined throughout the
duration of the sound material and are plotted against a single time line. It
is important to note that pitch definition and dynamic envelope exist at the
perspective of the overall sound, and that spectrum and spectral envelope
are internal components of the sound and are recognized only at a lower
level of perspective.
Pitch Definition
Pitch definition will become the focus immediately after the time line has
been drafted. This definition of the fundamental frequency is useful in
making preliminary and general observations of a sound. Definition of
fundamental frequency is often somewhat stable during the sustain portion of
a given sound. Changes in pitch quality are most commonly found between
the onset and the body of the sound. Pitch quality is placed on a continuum
that runs from well defined or precisely pitched (as a sine wave) to
completely void of pitch, or nonpitched (as white noise). A dominance of
harmonics will give the sound a more defined pitch quality. With increased
presence of overtones (in either number or loudness level) comes a more
nonpitched character to the sound.
Often the definition of fundamental frequency is verbally described as having
a certain pitch quality for a certain portion of its duration, then another
certain quality for the remainder of its duration. In effect, a contour of pitch
definition is present. The pitch definition tier of the sound quality
characteristics graph is not always required, but this aspect of the sound
should always be addressed, if only during the early stages of the evaluation.
Examining pitch definition supplies many clues of the content of the
spectrum and the spectral envelope. The less pitched the sound, the more
prominent the overtone content; the recordist can then be drawn to determining
what overtones are present and when they occur. With focus, one can trace
how pitch definition changes over the sound’s duration; then one can seek
information on spectral content and spectral envelope using this
information as a point of departure. Pitch definition and dynamic envelope are
also often linked during the onset of a sound or when the spectrum becomes
active or dense.
Dynamic Contour
The dynamic contour of the sound, or the sound’s overall dynamic level as
it changes throughout its duration, is reasonably apparent at first hearings.
Difficulties may arise from confusing loudness changes with changes in
spectral complexity. The listener must remain focused on actual loudness and
not be pulled to other perceived parameters of sound.
A reference dynamic level (RDL) will be required for mapping the dynamic
contour. The RDL will be determined by the intensity level at which the
source was performed. The intensity level itself is the RDL; it will be
transferred to a precise dynamic level (a specific point in a dynamic area such as
“mezzo piano” or “forte”). The same reference dynamic level will be used
in the spectral envelope tier, explained below. In this way, the same refer-
ence level functions on two levels of perspective, just as one reference
dynamic level functions for both program dynamic contour and for musical
balance (in Chapter 7).
The dynamic contour is readily described. This is accomplished by discuss-
ing the shape and speed of the dynamic envelope, and its dynamic levels at
defined points in time. By discussing how loudness changes and by
defining the levels and speed of those changes, the listener is describing this
physical element as others are experiencing it. This is an important compo-
nent of the unique objective character of the sound.
Spectral Content
Readers may hear few or no spectral components at the beginning of their
studies. Harmonics and overtones fuse to the fundamental frequency, and
we have been conditioned to perceive all of this information as part of a
whole (the global sound quality). To a great extent, to evaluate sound qual-
ity we must work against many learned listening techniques and our pre-
vious listening experiences. Much patience and focused attention will be
required. Practice and repetitive listening must be undertaken to acquire the
skills of accurately recognizing spectral components and of accurately track-
ing dynamic contours of the components that make up spectral content.
The harmonic series can be an important tool to assist in identifying spec-
tral components. The listener can learn to envision the harmonic series as a
chord above the fundamental frequency. Once the listener has learned the
sound of this chord, it will be possible to focus attention on the individual
pitches of the harmonic series while listening to a sound. The listener will
then be in a better position to identify any pitches/frequencies of the har-
monic series that might be present. Frequencies/pitches other than har-
monics will also be noticed; the listener will ultimately be able to quickly
calculate where the overtones fall in relation to the envisioned harmonic
series. Frequencies/pitches that lie between harmonics can be identified as
being “between the fourth and fifth harmonic,” for example, and a bit more
attention will bring the listener to recognize the partial’s frequency/pitch
more precisely. In this way, the harmonic series is used as a template, to
which the frequencies/pitches present can be compared and identified. The
harmonic series can be a point of reference that makes evaluating spectral
content much more approachable.
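The use of the harmonic series as a template can also be sketched numerically. The short Python example below is only an illustration and is not part of the listening method itself; the function names and the tolerance value are assumptions made for this sketch. It lists the harmonics above a fundamental and reports whether a given partial coincides with a harmonic or lies between two of them.

```python
def harmonic_series(fundamental_hz, count=8):
    """Return the first `count` harmonics; harmonic 1 is the fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]


def locate_partial(partial_hz, fundamental_hz, tolerance=0.03):
    """Place a partial relative to the harmonic series of `fundamental_hz`.

    Returns ("harmonic", n) when the partial lies within `tolerance`
    (expressed as a fraction of the fundamental) of the n-th harmonic.
    Otherwise returns ("between", lower_n, upper_n) for an overtone that
    falls between two harmonics. The tolerance is an arbitrary choice.
    """
    ratio = partial_hz / fundamental_hz
    nearest = max(1, round(ratio))
    if abs(ratio - nearest) <= tolerance:
        return ("harmonic", nearest)
    lower = int(ratio)  # floor, since the ratio is positive
    return ("between", lower, lower + 1)


# A partial at 450 Hz over a 100 Hz fundamental falls between the
# fourth and fifth harmonics (400 Hz and 500 Hz).
print(locate_partial(450.0, 100.0))
```

A calculation of this kind can confirm what has been heard, but it does not replace learning the sound of the series by ear.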
The reader is encouraged to return to the discussion of the harmonic series
in Chapter 1 to study its content and to spend time learning the sound of
the series. The reader is also encouraged to review the conversion of pitch
levels into frequency levels and the reverse.
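As a reminder of how that conversion works, the following Python sketch converts between pitch levels and frequency levels. It assumes twelve-tone equal temperament, a tuning reference of A4 = 440 Hz, and octave numbering in which C4 is middle C; these are assumptions of the sketch, and the function names are hypothetical. Harmonics are whole-number multiples of the fundamental, so they do not always fall exactly on an equal-tempered pitch; the function reports the difference in cents.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
A4_HZ = 440.0  # assumed tuning reference


def pitch_to_frequency(name, octave):
    """Frequency in Hz of a pitch such as ("F#", 2), in equal temperament."""
    semitones_from_a4 = (
        NOTE_NAMES.index(name) - NOTE_NAMES.index("A") + 12 * (octave - 4)
    )
    return A4_HZ * 2 ** (semitones_from_a4 / 12)


def frequency_to_pitch(frequency_hz):
    """Nearest equal-tempered pitch as (name, octave, cents_offset)."""
    semitones = 12 * math.log2(frequency_hz / A4_HZ)
    nearest = round(semitones)
    cents = 100 * (semitones - nearest)
    index = nearest + NOTE_NAMES.index("A")  # semitones above C4
    return NOTE_NAMES[index % 12], 4 + index // 12, cents


# The lower four harmonics of a fundamental on F#2, named as pitches.
f0 = pitch_to_frequency("F#", 2)
for n in range(1, 5):
    name, octave, cents = frequency_to_pitch(n * f0)
    print(n, round(n * f0, 1), f"{name}{octave}", round(cents, 1))
```

The third harmonic lands about two cents above the equal-tempered pitch, a difference small enough that the pitch name remains a useful label for the partial.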
The reader should feel free to use whatever devices are available to assist
in identifying partials. This will especially prove helpful in initial
studies. A tone generator or keyboard may be used to assist in identifying
frequency levels or pitch levels of prominent harmonics and overtones. A
filter can be used to eliminate all but the fundamental frequency, and
partials can be added one at a time; it can also be used to eliminate the
fundamental frequency so the harmonics and overtones can be heard more
directly. An equalizer can emphasize certain frequency bands to assist the
reader in determining where the spectral components exist. It is important to
remember to perform activities that engage the mind and the ear in searching
for the information: learning to listen at the perspective of the individual
partials, learning the sound of the spectrum, and using one’s knowledge of
the content and sound quality of the harmonic series. Spectral-analysis
software can sometimes assist one in evaluating sounds; this can help one
gain this listening skill, but one must be wary of letting the software
replace this skill. It is important for the reader to be actively seeking
the information in a thoughtful and methodical manner, and to engage the
ears and mind in the process.
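For readers who have no tone generator at hand, a software sketch can serve the same purpose. The Python example below is a minimal illustration, not a description of any particular device: it builds a tone from sine-wave harmonics and then adds the partials one at a time, starting with the fundamental alone. The function names, the amplitude values, and the sample rate are all assumptions chosen for the example.

```python
import math


def additive_tone(fundamental_hz, partial_levels, duration_s=1.0, sample_rate=44100):
    """Sum sine-wave harmonics into a list of samples.

    partial_levels[0] is the amplitude of the fundamental, partial_levels[1]
    the amplitude of the second harmonic, and so on (linear amplitudes).
    Harmonics should stay below half the sample rate to avoid aliasing.
    """
    n_samples = int(duration_s * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        value = sum(
            level * math.sin(2 * math.pi * (k + 1) * fundamental_hz * t)
            for k, level in enumerate(partial_levels)
        )
        samples.append(value)
    return samples


def build_up(fundamental_hz, partial_levels, **kwargs):
    """Yield tones that add one partial at a time, fundamental first."""
    for count in range(1, len(partial_levels) + 1):
        yield additive_tone(fundamental_hz, partial_levels[:count], **kwargs)


# Three versions of a 110 Hz tone: fundamental only, then with the second
# harmonic added, then with the third harmonic added.
tones = list(build_up(110.0, [1.0, 0.5, 0.25], duration_s=0.5, sample_rate=8000))
```

The resulting sample lists would still need to be scaled and written to an audio file or device before they can be heard; that step is left out of the sketch.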
The reader will be able to describe much about a sound by addressing the
spectrum. Defining which harmonics are present and those that are
prominent, as well as indicating overtone content, provides a significant amount
of objective information about a sound.
Spectral Envelope
Describing the entrances and exits of the partials (harmonics and over-
tones) as well as the individual dynamic contours of these partials (the
spectral envelope) will provide additional important information on sound
quality. The spectral envelope will be calculated against the reference
dynamic level that was identified for the overall dynamic contour and
against a common time line. All of the spectral components identified, as
spectral content, will be present in the spectral envelope tier, including the
fundamental frequency.
Listen . . .
to tracks 1 and 2 for harmonic series played in individual frequencies and
pitches, and as a chord. Work to recognize this pattern and spacing of
intervals, and learn the “sound quality” of the chord that comprises the
harmonic series.
The spectral envelope maps the dynamic levels and contours of all partials.
Much significant information on pitch definition and the character of the
sound is contained in the spectral envelope. By describing this activity,
the listener is communicating very pertinent and meaningful information
about the sound that is completely objective, and can be experienced and
understood by others.
If individual spectral components are very difficult to separate from a
fused spectrum, hearing how those components change in loudness over time
is extraordinarily difficult. When one becomes adept at hearing individual
partials (especially the more prominent lower partials), one can begin to
make observations on how those partials change in amplitude over time and
how their level relates to the RDL. The information of how the spectrum
changes over time is very important, and even general observations that stem
from the pitch definition tier can lead one to understand this information.
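Figure 8-1 Sound-quality characteristics graph. [Graph template: tiers for Spectral Envelope, Spectral Content, Dynamic Contour, and Pitch Definition on the Y-axis, with dynamic areas p, mp, mf, and f, plotted against Time on the X-axis.]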
Process of Evaluating Sound Quality
The sound quality characteristics graph will be used to help make sound
quality evaluations. The graph greatly assists in a detailed evaluation of a
sound and allows the reader to record observations of any sound, whether
or not a detailed evaluation is undertaken. Pertinent information can be
written down that can provide a resource for objective descriptions formu-
lated at a later time.
The process of evaluating sound quality should follow this sequence of
events:
1. During the first hearing(s), listen to the example to establish the length
of the time line. At the same time, notice prominent states and activity
of the components (especially dynamic contour and spectral content)
against the time line.
2. Check the time line for accuracy and make any alterations.
3. Notice the activity of the component parts of sound quality for their
boundaries of levels of activity and speed of activity. The speed bound-
aries will establish the smallest time units required in the graph to
accurately present the smallest significant change of the element. The
boundaries of levels will establish the smallest increment of the Y-axis
required to plot the smallest change of each component (dynamic contour,
spectral content, spectral envelope).
4. Pitch definition observations are next. Begin by making general
observations about the pitch quality of the sound. Identify precise points in
time when the sound is at its most pitched and least pitched states, and
assign a relative value. Use these levels as references to complete this
contour of pitch definition. The prominence of the fundamental frequency
and the number and relative loudness levels of harmonics greatly
influence the pitch quality of the sound source. Knowing when a sound
is more pitched than other times provides information that can be used
in determining the spectral content and spectral envelope of the sound.
5. Begin plotting the activity of the dynamic envelope on the graph. First,
determine the RDL of the sound by identifying its performance inten-
sity. Second, establish the beginning and ending dynamic levels of the
sound. The highest or lowest dynamic levels are the next to be deter-
mined; then place them against the time line. Use these levels as points
of reference to judge the activity of the preceding and following mate-
rial. Alternate your focus on the contour, speed, and amounts of level
changes to complete the plotting of the dynamic contour. The evaluation
is complete when the smallest significant detail has been perceived,
understood, and added to the graph. The smallest time increment of the
time line may need to be altered at this stage to allow the
dynamic contour to be clearly presented on the graph.
6. Plot the spectral content on the graph. First, identify the frequency/
pitch levels of the prominent spectral components and the fundamental
frequency. Knowledge of the sound of the harmonic series will prove
valuable here. Map the presence of these frequencies against the time
line, clearly showing their entrances and exits from the spectrum.
Finally, map any changes in the pitch/frequency levels of these partials
against the time line. Certain spectral components may not be present
throughout the duration of the sound. It is not unusual for harmonics
and overtones to enter and exit the spectrum. This evaluation is com-
plete when all of the spectral components that can be perceived by the
listener are added to the graph. Accuracy and detail will increase
markedly with experience and practice on the part of the listener. With time
and acquired skill, this process will yield much significant information
on the sound. Initial attempts may not yield enough information to
accurately define the sound source, but will improve substantially over
time.
7. Plot the dynamic activity of the partials on the spectral-envelope tier
of the graph. Use the same RDL as the dynamic-envelope tier. First,
establish the beginning and ending levels of each of the spectral
components that were identified in Step 6. For each of the spectral
components, determine the highest or lowest dynamic levels and any other
prominent points of reference within the dynamic contours. Use these
points of reference to evaluate the preceding and following material. To
complete plotting the activity, alternate focus on the contour, speed, and
amounts of level changes. The dynamic envelopes of all of the spectral
components are plotted on this tier. The evaluation is complete when the
smallest significant dynamic level change has been incorporated into
the graph.
Many hearings will be involved for each of the steps above. Each listening
should seek specific new information and should confirm what has already
been noticed about the material. Remember to work from what is known to
determine what is unknown.
Before listening to the material, the listener must be prepared to extract
certain information, to confirm their previous observations, and to be
receptive to new discoveries about the sound quality. A clear sense of the
correct perspective and the specific information that is being sought will
make the listening session more successful. The listener should check any
previous observations often, although their listening attention may be
seeking new information.
Listen . . .
to track 3 for the harmonics and overtones of the sustained piano notes.
You will notice changes in the spectrum of the pitches over their long
durations.
The sound quality characteristics graph incorporates:
1. A multitiered Y-axis, distributed to complement the characteristics of
the musical example: one tier with dynamic areas for dynamic envelope
(with notated RDL), one tier with pitch register designations for
spectral content, one tier with dynamic areas for spectral envelope
(with notated RDL), and a fourth tier designating a pitched-to-nonpitched
scale;
2. The X-axis of the graph dedicated to a time line that is divided into an
appropriate increment of clock time (the increment will vary depending
on the material);
3. Each spectral component plotted as a single line, against the two axes;
its pitch characteristics on the spectral content tier, and its dynamic
contour on the spectral envelope tier;
4. Each spectral component’s line should have a different color or
composition, so that each component can be recognized as the same on the
two tiers.
Sample Evaluations
Several sound quality evaluations follow. Two are synthesized sounds and
the other is a highly modified (feedback, etc.) electric guitar sound. These
invented timbres are evaluated to determine their unique characteristics
for critical listening use and to better understand their relationships to oth-
er sounds in the music.
Each evaluation is of one identified sounding of the single sound source. A
specific appearance of each sound source was selected and evaluated. The
appearance was selected because the sounds’ characteristics are not being
masked by other sound sources and because most of the sounds’
characteristics were audible in the specific example. This type of evaluation is
common in recording production; although this type of detail and graphing
is not present, recordists listen for this information and make observations,
though they may not formulate the material in this way.
Figure 8-2 shows the great complexity of the pitch definition of the
opening guitar sound from “It’s All Too Much” (Yellow Submarine, 1999). The
pitch definition changes are reflected in changes to the spectral envelope.
They are closely associated. In observing the spectral content tier, one can
determine the dominance of harmonics and note the overtones present.
The dynamic envelope exhibits many subtle changes in loudness level
throughout the nearly 15-second duration of the sound.
Figures 8-3 and 8-4 are sound quality evaluations of Moog synthesiz-
er sounds from Abbey Road. The different waveforms used for the two
sounds make for differences in spectral content. The simplicity of the early
instrument is reflected in the basic contours of the dynamic envelope and
spectral envelope, and the inclusion of only harmonics in the spectrum of
each sound.
As an exercise, bring your attention to focus on each tier during four sepa-
rate hearings. Try to identify all of the information present in these three
sound quality evaluations. Search the sound quality of the instrument to
find the graphed information.
Figure 8-3 Sound quality evaluation of the Moog synthesizer sound from The Beatles’ “Maxwell’s Silver Hammer,” at 51.1 seconds (Abbey Road). [Graph: tiers for Spectral Envelope, Spectral Content, Dynamic Contour, and Pitch Definition; dynamic areas p, mp, mf, and f; time line from 0 to 1.1 seconds in tenths of a second.]
Figure 8-4 Sound quality evaluation of the Moog synthesizer sound from The Beatles’ “Here Comes the Sun,” at 0:12 (Abbey Road). [Graph: tiers for Spectral Envelope, Spectral Content, Dynamic Contour, and Pitch Definition; time line from 0 to 2.8 seconds in increments of 0.2 seconds.]
A “Describing Sound” exercise appears at the end of this chapter. All peo-
ple in the audio industry talk about sound, and in doing so describe its
qualities. This exercise will get the reader to think about sound for what
is present, not how it makes them feel or how it reminds them of other
senses or experiences. When the four components of sound quality are
described, the content of the sound can be clearly communicated.
The Moog sound from “Maxwell’s Silver Hammer” that appears in Figure
8-3 could be described in this way. Pitch definition begins at nonpitched
at the attack of the sound, and moves to pitched by 0.1 seconds, where
it remains pitched for the remaining second of the sound’s duration. Its
dynamic envelope begins in mp and gradually rises to mid-mf by 0.8 sec-
onds, where it remains before beginning a gradual decay at 0.9 seconds;
all dynamics are relative to the sound’s RDL of about 10% into mf. The spec-
trum of the sound is composed of the lower four harmonics based on a
fundamental frequency of F#2. The spectral envelope demonstrates dra-
matic changes in level of the second harmonic (F#3), as it moves from much
softer to louder than the fundamental, and the third and fourth harmonics
increase loudness by lesser amounts but at the same time and speed as
the second harmonic; the fundamental frequency is at the top 5% of mp at
the beginning of the sound and changes loudness in a slight arc over the
sound’s duration.
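The harmonic content named in this description can be checked with a little arithmetic. The following Python sketch is only an illustration: it assumes equal temperament with A4 = 440 Hz (the actual tuning of the recording is not stated here), and the function names are hypothetical. It lists the first four harmonics of F#2 and the nearest note name for each.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_to_frequency(name, octave, a4=440.0):
    """Equal-tempered frequency in Hz; A4 = 440 Hz is an assumed reference."""
    midi = 12 * (octave + 1) + NOTE_NAMES.index(name)
    return a4 * 2 ** ((midi - 69) / 12)


def nearest_note(frequency, a4=440.0):
    """Name of the equal-tempered note closest to the given frequency."""
    midi = round(69 + 12 * math.log2(frequency / a4))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


def harmonic_series(name, octave, count):
    """Return (harmonic number, frequency in Hz, nearest note) tuples."""
    fundamental = note_to_frequency(name, octave)
    return [(n, n * fundamental, nearest_note(n * fundamental))
            for n in range(1, count + 1)]


for number, frequency, note in harmonic_series("F#", 2, 4):
    print(f"harmonic {number}: {frequency:7.2f} Hz  (nearest note {note})")
```

Under these assumptions the second and fourth harmonics fall on F#3 and F#4, one and two octaves above the fundamental, and the third harmonic lies closest to C#4.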
The reader should next perform the Sound Quality Evaluation Exercise at
the end of this chapter. A very important skill will be gradually acquired with
some focused effort. As skill develops over a period of time, the record-
ist will gain considerable awareness of the content of sound qualities. The
recordist will begin to hear things they previously could not imagine. A
new world of sound will present itself, as will a new way of understanding
that world.
Summary
The ability to evaluate and communicate about sound quality and timbre
is extremely important for the synthesist, sound designer, recording engi-
neer, and producer. It is also required of nearly all positions in the audio
industry.
Sound quality evaluation is used in many activities in recording production.
We evaluate sound when we create an equalizer setting or set a compressor;
we compare sound qualities when we move microphones around to deter-
mine the most appropriate placement, and evaluate sound when selecting
microphones. We evaluate sound to identify the quality of the signal we are
recording, and to verify that quality (on all levels of perspective and in all of
the parameters of sound) throughout the process of crafting a recording.
The sound quality evaluations performed by following the method pre-
sented in this chapter will allow the reader to readily recognize the unique
characters of sounds. Learning to recognize and describe the activities of
the component parts of sound quality will allow the reader to talk about
sound articulately and to share information that is pertinent and readily
understood by others.
The skills gained through the previous chapters are brought together in
the many steps of sound quality (and timbre) evaluation. Pitch and pitch-
area estimation, melodic and dynamic-contour mapping, and judging
time increments are all now used for a more demanding task, and a very
important one. These skills should become highly refined, and be continually
developed through carefully considering how the recording process is
altering, capturing, or creating sound quality. The audio professional will be
continually engaged in evaluating sound. Recognizing and understanding
the characteristics of sound quality are the first steps towards communicating
accurate and relevant information about sound. Learning to hear and
recognize the components of sound quality/timbre will allow the recordist
to control their craft in shaping recordings.
Exercises
Exercise 8-1
Describing Sound Exercise
The purpose of this exercise is to develop an approach to talking about sound
that discusses the sound’s physical components.
1. Select a sound and record it, playing only one pitch. A sound from the
enclosed CD may be used.
2. Write down your observations of the time line: how long does the sound
last?
3. Make general observations of the pitch definition of the sound. Does it
start with a burst of noise (like a piano)? Are there areas where the sound
changes in pitch quality? Is the sound mostly pitched or mostly noise-
like?
4. Next describe the dynamic envelope. How does the dynamic envelope
change during the sound’s duration? What is the speed of the attack and
initial decay? What is the sustain level in relation to the attack?
5. Describe the spectrum of the sound. Is it dominated by harmonics? Where
are overtones present in relation to the fundamental? Is there a different
spectrum during the onset than in the body of the sound?
6. Describe the spectral envelope. Are some partials prominent? Is the fun-
damental louder than the remainder of the spectrum? How does the
spectrum change over time?
Practice talking about sound in this way whenever you are working with an
audio device. Ask yourself: What am I hearing that relates to the actual sound
wave? What are its current qualities, or how are those qualities changing?
This will bring you to be able to quickly evaluate sounds in a meaningful way,
and to be able to explain to others what needs to be done to obtain desired
results or what the wonderful qualities of your drum sounds are, specifically
and understandably.
Practice describing sounds this way without first creating a sound quality
evaluation graph, and then after creating the graph.
Exercise 8-2
Sound Quality Evaluation Exercise
Find a sound that is a complex waveform (containing overtones as well as
harmonics) with a duration of at least five seconds. A single occurrence of the
sound should be identified and evaluated. A sound that is isolated from other
sounds (does not have other sounds occurring simultaneously) will be easiest
to evaluate. Make a recording of the sound to more easily repeat hearings of
the sound.
Perform a sound quality evaluation on the sound by using the following
sequence of activities. See the chapter for more detail, as needed.
1. During the first hearing(s), listen to the example to establish the length of
the time line.
2. Check the time line for accuracy and make any alterations.
3. Notice the activity of the component parts of sound quality for their
boundaries of levels of activity and speed of activity, and establish the
smallest increment of the Y-axis required to plot the smallest change of
each component (dynamic contour, spectral content, spectral envelope).
4. Place pitch-definition observations on the graph.
5. Plot the activity of the dynamic envelope on the graph.
6. Identify the spectral content of the sound and place the partials on the
graph.
7. Plot the dynamic activity of the partials on the spectral-envelope tier of
the graph.
When completed, review your sound quality evaluation graph and compare
the activities of all tiers. Summarize and describe how the physical dimensions
of the sound appear and change throughout the duration of the sound as in
Exercise 8-1, but in greater detail.
9 Evaluating the Spatial Elements
of Reproduced Sound
The spatial characteristics and relationships of sound sources are an integral
part of music and audio productions. Spatial elements are precisely control-
lable in audio recording, and sophisticated ways of using these elements
have developed in audio and music productions. Spatial relationships and
characteristics often present significant qualities in current music recordings,
and are also important considerations in critical listening applications.
The evaluation of the spatial characteristics of stereo recordings covers
three primary areas: (1) localization on a single horizontal plane in front of
the listener, (2) localization in distance from the listener, and (3) the quali-
ties of environmental characteristics. Surround sound recording replaces
localization in front with localization 360° around the listener. The elements
of environmental characteristics and distance illusions further interact and
create other sound characteristics that must be evaluated in both formats.
The recordist must be able to evaluate these characteristics to properly
evaluate recorded/reproduced sound. Many of the skills required to evalu-
ate the spatial characteristics of a recording have been gradually developed
throughout the previous four chapters. These skills of sound quality evalua-
tion, time judgments, pitch estimation, and dynamic contour mapping will
be used again (from a new perspective) to recognize and evaluate the spa-
tial elements of reproduced sound. The further development of these skills
will again require patience and practice.
An accurate evaluation of the spatial elements is only possible under cer-
tain conditions. The listener must be located correctly with respect to the
loudspeakers of the playback system. This is critical to accurately hear
directional cues. The sound system must interact correctly with the listen-
ing environment to complement the reproduced sound. Reproduced sound
can be radically altered by the characteristics of the playback room and
the placement of loudspeakers within the room. Further, the sound system
itself must be capable of reproducing frequency, amplitude, and spatial
cues accurately.
Many of the concepts of the spatial elements have not previously been well
defined. The length of this chapter is the result of the number of important
spatial elements of sounds in recordings, the methods one must use to
perform meaningful evaluations of these elements, and the explanations
required of new concepts.
Understanding Space as an Artistic Element
Spatial characteristics and relationships are used as artistic elements in
music productions. They are used as primary and secondary elements that
help to shape the unique character of musical ideas. Space has the poten-
tial of being the most important artistic element in a musical idea, but most
often serves to support other elements. It may support other elements by
delineating musical materials, by adding new dimensions to the unique
character of the sound source or musical idea, and/or by adding to the
motion or direction of a musical idea. It is also among the primary qualities
of the overall texture of the recording.
Perceived Performance Environment
The listener acquires an impression of the spatial characteristics of the
recording through the sound stage and imaging. They will imagine a
performance space wherein the reproduced sound can exist during the
re-performance of listening to the recording. The listener will perceive
individual sound sources to be at specific locations within a perceived
performance environment.
The recording represents an illusion of a live performance. The listener will
conceive the performance as existing in a real, physical space, because
the human mind will interpret any human activity in relationship to the
known physical experiences of the individual. The recording will appear
to be contained within a single, perceived physical space (the perceived
performance environment), because in human experience we can only be
in one place at one time.
The perceived performance environment will have an audible, character-
istic sound quality that is established in one of two ways. The qualities of
the perceived performance environment may be established by applying
a set of environmental characteristics to the overall program (i.e., processing
the final mix). Most often, the perceived performance environment is
a composite of many perceived environments and environmental cues. In
these instances, the listener formulates an impression of a perceived per-
formance environment through interrelationships of many environmental
characteristics cues. These cues may be (1) common or complementary
between the environments of the individual sound sources, (2) prominent
characteristics of the environments of prominent sound sources (sources
that present the most important musical materials, or the loudest, the
nearest, or the furthest sound sources, as examples), and/or (3) the result
of environmental characteristics found in both of these areas.
Further, the listener will obtain a sense that they are at a specific location within
the perceived performance environment. The listener might be aware of
their relationship to the sidewalls and any objects (balconies, seating, etc.)
in the performance environment, to the wall behind and the ceiling above
their location, and of their relationship to the front wall of the performance
environment.
The listener will also calculate their location with respect to the front of the
sound stage.
Figure 9-1 The sound stage within the perceived performance environment.
[Diagram: the sound stage, at variable perceived distances, within the perceived performance environment.]
Sound Stage and Imaging
Within this perceived performance environment is a two-dimensional area
(horizontal plane and distance) where the performance is occurring—the
sound stage. The sound stage is the location where the sound sources are
perceived to be collectively located, as a single ensemble. The listener will
unconsciously group all sources into a single performance area. The per-
formance (that is the recording) will thus emanate from a single location.
The area of the sound stage may be any size. The size of the sound stage
may appear to be anything from a small, well-defined point (an infinitesimally
small world), to a space occupying an area extending from immediately
in front of the listener to a location (spanning a great distance) well
beyond our sight line (perhaps conceived as being an area beyond the size
of anything possible on Earth) and, within the horizontal plane, filling an
area beyond the stereo array.
The sound stage may be located at any distance from the listener. The
placement of the front edge of the sound stage may be immediately in
front of the listener, or at any conceivable distance from the listener.
The perceived distances of the sound sources from the listener determine
the depth of the sound stage. The sound source that is perceived as near-
est to the listener will mark the front edge of the sound stage. The sound
source that is perceived as being furthest from the listener will define the
back of the sound stage and will also help to establish the rear wall of
the perceived performance environment. The perceived location of the
rear boundary (wall) will be determined by the relationship of the furthest
sound source to its own host environment. The rear wall of the perceived
performance environment may be located immediately behind the furthest
sound source, or some space may exist between the furthest sound source
and the rear wall of the sound stage/perceived performance environment.
All sound sources will occupy their own location in the sound stage. Two
sound sources cannot be conceived as occupying the same physical loca-
tion. Our sensibilities will not allow this to occur. It is possible for different
sound sources to occupy significantly different locations within the sound
stage, anywhere between the two boundaries.
Imaging is the perceived location of the individual sound sources within
the two perceived dimensions of the sound stage (see Figure 9-2). Sources
are located within the sound stage by their angle (on the horizontal plane)
and distance from the listener.
Figure 9-2 Imaging of sound sources within the sound stage.
[Diagram: the listener’s perceived location facing the sound stage within the perceived performance environment. Sources shown: High Kybd, High Hat, Lead Vocal, Background Vocals, Acoustic Guitar, Bass Drum, Low Keyboard, Flute, Bass, Tambourine. Dimensions marked: perceived depth of sound stage and perceived width of sound stage.]
Environments of Individual Sources
The placement of each source within its own environment influences imaging.
The characteristics of the unique performance environments of each
sound source might enrich source width and distance cues, and enhance
the dimensions of the sound stage. In current music productions, it is
common for each instrument (sound source) to be placed in its own host
environment. This host environment of the individual sound source (a per-
ceived physical space) is further imagined to exist within the perceived
performance environment of the recording (a perceived physical space).
This creates an illusion of a space existing within another space.
The environments of the sound sources and the overall program may be
of any size. The acoustical characteristics of just about any space may be
simulated by modern technology. The sound sources may
be processed so that the cues of any acoustical environ-
ment may be added to the individual sound source, to
any group of sound sources, or to the entire program.
Not only is it possible to simulate the acoustical charac-
teristics of known, physical spaces, it is possible to devise
environment programs that simulate open air environ-
ments (under any variety of conditions) and programs that
provide cues that are acoustically impossible within our
known world of physical realities.
Listen . . . to tracks 42-44 for narrow and wide guitar phantom images, and a narrow image broadened by reverberation.
Figure 9-3 Space within space.
Figure 9-3 presents an easily accomplished set of environmental relation-
ships, with individual sound sources appearing in very different and unique
environments:
• Timpani placed in an open-air environment
• A stringed instrument placed in a large concert hall
• A vocalist performing in a small performance hall
• A piano sounding in a small room
• A cymbal appearing to exist in a very unnatural (perhaps otherworldly or outer space), remarkably large environment
These many simulated acoustical environments are perceived as existing
within the overall space of the perceived performance environment. The
spaces of the individual sources are within the space of the perceived per-
formance environment. It is possible for more than one source to be placed
within an environment. Sources contained within the same environment
may have considerably different distance locations, or they may be similar.
The environments of the sound sources and the overall program may be
in any size relationship to one another. The environment of a sound source
may have the characteristics of a physically large space, and the perceived
performance environment may have the characteristics of a much smaller
physical environment. This is a common relationship, and the reverse is
also possible (though more difficult to achieve). The spaces of the individual
sound sources are understood (by the listener) to exist within the
all-encompassing perceived performance environment, no matter the per-
ceived physical dimensions of the spaces involved.
The spaces of the individual sound sources are subor-
dinate spaces that exist within the overall space of the
recording. A further possibility (not commonly used, at
present) exists for subordinate spaces to appear within
other subordinate spaces, within the perceived perfor-
mance environment.
Space within space is a hierarchy of
environments existing within other environments. Its cre-
ative applications have not been fully exploited in current
music production practices.
The characteristics of the perceived performance environment function as
a reference for determining the characteristics of the individual environ-
ments of the individual sound sources. All of the environments of a record-
ing will have common characteristics that are created by the perceived
environmental characteristics of the perceived performance environment
(as discussed above). These characteristics provide a reference for deter-
mining the unique characteristics of the individual performance environ-
ments of the individual sound sources.
The characteristics of the perceived performance environment also func-
tion as a frame of reference for the listener in determining the distance
locations of the individual sound sources within the sound stage.
Distance in Recordings
Distance is perceived as a definition of timbral detail. It is further calculated
in relation to the characteristics of the environment in which the sound
exists, as well as the perceived location of the sound source and the listener
within that environment. The listener will perceive the distance of the sound
source as it is sounding within its unique environment. The listener will then
unconsciously transfer that distance to the perceived performance environ-
ment, combining any perceived distance of the source’s environment from
the listener’s location in the perceived performance environment.
The actual distance location placement of the sound source within the
sound stage is determined by (1) the distance between the sound source
and the perceived location of the listener within the individual source’s
environment combined with (2) the perceived distance of that environment
from the listener’s location in the perceived performance environment. All
this information blends into a single impression of distance. Through this
process, sound sources (with and within their environments) are conceived
at specific distances from the listener. This is all accomplished subliminally.
The placement of sounds at a distance and at an angle from the listener
(imaging) takes place at the perspective level of the perceived performance
environment.
Listen . . . to tracks 45-47 for a number of different space within space relationships.
Directional Location
Sound sources (with their individual environments and conceived distance
locations) will be located at an angle from the listener. Directional location
is used differently in stereo and surround formats.
The stereo location will place sound sources on the sound stage, within
the stereo loudspeaker array, at an angle of direction from the listener. A
sound-source image can be a narrow, precisely defined point
in space, or it can occupy an area between two boundaries. Sources that
occupy an area may be of any reproducible width and may be located at
any reproducible location within the stereo array. Further, under certain
production practices, it is possible for sound sources to appear to occupy
two separate locations or areas within the stereo array.
Figure 9-4 Listener within the sound stage in surround sound.
[Diagram: the sound stage surrounds the listener within the perceived performance environment; loudspeakers L, C, R, Ls, and Rs.]
Surround location of sound sources is more complex. Location of sound
sources may be at any angle from the listener. The listener may actually
be placed within the sound stage (Figure 9-4), or the surround speakers
may be used solely for environment information, allowing the sound stage
to remain in front of the listener (Figure 9-5). Sound-source size (width)
remains variable, but now may be anything from a single point in space
to completely enveloping the listener. Further, sources may readily occupy
several locations simultaneously.
Production practice for surround is currently being defined, and the potentials
of the medium are being explored. How the complex potential of
sound location around the listener ultimately translates into our music listening
experiences will be defined over the upcoming years. The listener
and audio professional must remain receptive to possibilities, as this new
technology shapes music, music listening and the creation of recordings.
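Figure 9-5 Listener enveloped by environmental cues, sound stage in front.
[Diagram: the sound stage in front of the listener within the perceived performance environment, with ambiance (environmental cues) surrounding the listener; loudspeakers L, C, R, Ls, and Rs.]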
Stereo Sound Location
Sound location is evaluated within the stereo array to determine the location
and size of the images of the sound sources. These cues will hold significant
information for understanding the mix of the piece and may contribute
significantly to shaping the musical ideas themselves. Phantom images may
also change locations or size during a piece of music. These changes may be
sudden or gradual. The changes may be prominent or subtle.
The stereo sound-location graph will plot the locations of all sound sources
against the time line of the work. The graph portrays the direction of sourc-
es from the listener and the size of the phantom images.
Left and right loudspeaker locations and the center position are identified on
the graph. The actual boundaries of the vertical axis extend slightly beyond the
loudspeaker locations (up to 15°). Placing the sound source’s location within
the L-R speaker-location boundaries represents source angle from the listen-
er. Precise degree-increments of angle can be incorporated into the graph;
they are often unnecessary, depending on the nature of the recording.
Figure 9-6 Calculation of degree-increments for location and the Y-axis of the stereo sound location graph.
[Diagram: angles from center (10°, 20°, 30°, 40°, 45°, 60°, and 90°, left and right, with 15° marked beyond the L and R loudspeakers) related to a Y-axis labeled Left, Center, and Right with increments at 10°, 20°, and 45°, plotted against Time.]
The stereo sound location graph incorporates:
1. The left and right loudspeaker locations, a designation for the center of
the stereo array, and space slightly beyond the two loudspeaker locations
as the Y-axis; degree divisions may be added to the Y-axis for
greater detail if the music will be better understood by such divisions;
2. The graph will become unclear if too many sound sources (especially
many spread images) are placed on the same tier; the Y-axis may be
broken up into any number of similar tiers (each the same as listed in
Step 1) to clearly present the material on a single graph;
3. The X-axis of the graph is dedicated to a time line that is divided into
an appropriate increment of the metric grid, or is representative of a
major section of the piece (or the entire piece) for sound sources that
do not change locations;
4. A single line is plotted against the two axes for each sound source; the
line will occupy a large, colored/shaded area in the case of the spread
image; and
5. A key will be required to clearly relate the sound sources to the graph.
It should be consistent with keys used for other similar analyses (such
as musical balance and performance intensity) to allow different analy-
ses to be easily compared.
A sound source may occupy a specific point in the horizontal plane of the
sound stage, or it may occupy an area within the sound stage. The graph
will dedicate a source line to each sound source and will plot the stereo
locations of each source against a common time line.
The source line for point source images will be a clearly defined line at
the location of the sound source. The source line for the spread image will
occupy an area of the sound stage and will extend between the boundaries
of the image itself.
Figure 9-7 Stereo sound-location graph.
[Graph: Y-axis labeled Left, Center, and Right with increments at 10°, 20°, and 45° on each side; X-axis time line of measures 1 to 16; one spread image and one point source are plotted.]
It is common for stereo sound-location graphs to be multitier, placing
spread images on separate tiers from point sources or providing a number
of separate tiers for spread images. Figure 9-7 presents a single-tier stereo
sound-location graph, containing two sound sources: one spread image
and one point source. It also incorporates angle increments out from the
center in degrees. The speakers should appear at 30° right and left.
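The graph described above can also be thought of as a small data structure. The following Python sketch is one possible representation, not a prescribed method: the class and function names are hypothetical, the sign convention (negative angles to the left of center) is an assumption, and the Y-axis limit of 45° simply adds the 15° margin to the 30° loudspeaker locations mentioned in the text.

```python
from dataclasses import dataclass

SPEAKER_ANGLE = 30.0  # loudspeakers at 30 degrees left and right of center
AXIS_MARGIN = 15.0    # Y-axis extends up to 15 degrees beyond each loudspeaker
AXIS_LIMIT = SPEAKER_ANGLE + AXIS_MARGIN  # 45 degrees on either side


@dataclass
class SourceLine:
    """One segment of a source line on the stereo sound-location graph."""
    source: str
    start: float       # position on the time line (for example, a measure number)
    end: float
    left_edge: float   # degrees from center; negative = left (assumed convention)
    right_edge: float

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("segment ends before it starts")
        if self.left_edge > self.right_edge:
            raise ValueError("left edge lies to the right of the right edge")
        for angle in (self.left_edge, self.right_edge):
            if abs(angle) > AXIS_LIMIT:
                raise ValueError(f"{angle} degrees is outside the graph's Y-axis")

    @property
    def is_point_source(self) -> bool:
        # A point source has no width: both edges share one angle.
        return self.left_edge == self.right_edge

    @property
    def width(self) -> float:
        return self.right_edge - self.left_edge


def y_position(angle: float) -> float:
    """Map an angle to 0.0 (left limit) through 1.0 (right limit) on the Y-axis."""
    return (angle + AXIS_LIMIT) / (2 * AXIS_LIMIT)


# Hypothetical data: one spread image and one point source over 16 measures.
graph = [
    SourceLine("spread image", 1, 16, -20.0, 10.0),
    SourceLine("point source", 1, 16, 20.0, 20.0),
]
```

A source that changes location or width during the piece would be entered as several consecutive segments, one for each span of the time line in which its image stays the same.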
Figure 9-8 presents the stereo-sound location of a number of the sound
sources from The Beatles’ work, A Day in the Life. The location, size, and
movements of the sound-source images directly contribute to the character
and expression of the related musical materials. As an exercise, listen to
the recording and notice the placement of the percussion sounds. Observe
and define the stereo locations of the percussion sounds as they complement
the placements of the voice, bass, piano, guitar, and maracas to balance
the sound stage.
Figure 9-8 Multitier stereo sound location graph: The Beatles’ A Day in the Life.
[Graph: two tiers, each with a Y-axis marked L, C, and R; X-axis time line of measures 1 to 33 in 4/4 except where marked (a 7/8 marking appears); sections labeled Intro, Verse 1, Verse 2, Verse 3, and Bridge; key: Voice, Bass, Maracas, Guitar, Piano.]
Locations and image size do not often change within sections of a work.
Changes are most likely to occur between sections of a piece, or at rep-
etitions of ideas or sections (where changes in the mix often occur). The
listener should, however, never assume changes do not occur. Gradual or
subtle changes in source locations and size are present in many pieces and
sometimes in pieces where such events are not expected. The changes in
source size and location from Abbey Road works discussed in Chapter 2
will not be noticed unless the listener is willing to focus on this artistic ele-
ment and is prepared to hear these changes.
The reader should work through the Stereo Location Exercise (9-1) at the
end of this chapter to refine this skill.
Distance Location
Distance localization and stereo localization combine to provide imaging
for sound sources. Figure 9-9 is an empty stereo sound stage onto which
sound sources are imagined to be located. Placing sounds on this empty
sound stage will allow the listener to make quick, initial observations regard-
ing stereo imaging. These observations can then lead to the more detailed
evaluations of stereo location and distance location. It is important to note
that these location observations will relate to specific moments in time or to
sections of a work (specific periods of time). While location changes cannot
be written on this figure, it is useful for initial observations and for graphing
sources that do not change. In production work, this figure can be helpful in
constructing and planning mixes, and for keeping track of parts.
Figure 9-9 Empty sound stage.
[Diagram: an empty sound stage within the perceived performance environment.]
Distance is the perceived location of the sound source from the listener. It
is a location where the listener envisions the sound to be placed along the
depth of the sound stage. We hear sounds as occupying a specific location
point of distance from our location. Sounds do not occupy distance areas.
A source environment may provide an area of depth to the image, but the
source will be heard as located at a precise point within that environment.
The environment and source fuse into a single sound impression that will
occupy an area of distance, with the source localized at a specific spot in it.
The perceived source locations nearest to and furthest from the listener
establish the front and rear boundaries of the sound stage. The front edge
of the sound stage may be immediately in front of the listener, or at any
distance. The depth of the sound stage will be heard as a single dimension
of the area that contains all sound sources.
Understanding Distance Location
The reader must approach distance location carefully. Distance cues are
often not accurately perceived. Other artistic elements are often confused
with distance. Further, humans mostly rely on sight to calculate distance
and are not normally called upon to focus on aural distance cues.
Distance is NOT loudness. In nature, distant sounds are often softer than
near sounds. This is not necessarily the case in recording production. Loud-
ness does not directly contribute to distance localization in audio record-
ings. At times loudness and distance cues are associated in recordings, but
this is often not the case—especially in multitrack and synthesized produc-
tions. A “fade out” can be accomplished without causing source distances
to increase. Conversely, a fade out may cause sound sources to be per-
ceived as increasing in distance. The distance increase will be the result of
a diminishing level of timbral detail; it will not be the result of decreasing
dynamic level.
Very often, people will describe a sound as being “out in front,” implying
a closer distance. The sound may actually be louder than other sounds or
may stand out of the musical texture because of the prominence of some
other aspect of its sound quality. Much potential exists for confusing dis-
tance with dynamic levels.
Distance is NOT determined by or the result of the amount of reverberation
placed on a sound source. In nature, distant sounds are often composed of
a high proportion of reverberant energy in relation to direct sound. Rever-
berant energy does play a role in distance localization, but not so promi-
nent a role that it can be used as a primary reference. Reverberant energy
is most important as an attribute of environmental characteristics and in
placing a sound source at a distance within the individual source environment.
Humans perceive distance within environments through time and
amplitude information extracted from processing the many reflections of
the direct sound. The ratio of direct to reflected sound influences distance
location. Thus, reverberation contributes to our localization of distance, but
it is NOT the primary determinant of distance location, in and of itself.
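For readers who also work with measured or simulated impulse responses, the
ratio of direct to reflected sound can be put into rough numerical terms. The
following Python sketch is an illustration only and is not part of the
listening method described here. The function name, the 2.5 ms window used to
separate the direct sound from the reflections, and the toy impulse responses
are all assumptions made for the example.

```python
import math


def direct_to_reverberant_ratio_db(impulse_response, sample_rate, direct_window_ms=2.5):
    """Return the direct-to-reflected energy ratio in dB.

    Assumption: the direct sound is the energy within +/- direct_window_ms
    of the largest peak; everything after that window counts as reflected.
    """
    if not impulse_response:
        raise ValueError("impulse_response must not be empty")
    # Locate the strongest arrival and treat it as the direct sound.
    peak_index = max(range(len(impulse_response)),
                     key=lambda i: abs(impulse_response[i]))
    half_window = int(round(direct_window_ms * sample_rate / 1000.0))
    start = max(0, peak_index - half_window)
    end = min(len(impulse_response), peak_index + half_window + 1)
    direct_energy = sum(x * x for x in impulse_response[start:end])
    reflected_energy = sum(x * x for x in impulse_response[end:])
    if reflected_energy == 0.0:
        return math.inf  # no reflected energy at all
    return 10.0 * math.log10(direct_energy / reflected_energy)


# Toy impulse responses sampled at 1 kHz: a direct spike, then two reflections.
dry_response = [1.0] + [0.0] * 9 + [0.1, 0.0, 0.05]
wet_response = [1.0] + [0.0] * 9 + [0.6, 0.0, 0.5]

dry_ratio_db = direct_to_reverberant_ratio_db(dry_response, 1000)
wet_ratio_db = direct_to_reverberant_ratio_db(wet_response, 1000)
```

In this toy case the response with weaker reflections gives the higher ratio.
A figure of this kind describes only one cue; as stated above, it should not
be read as a measure of perceived distance in and of itself.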
Distance is NOT the perceived distance of the microphone to the sound
source that was present during the recording process. The only exception
to this statement occurs when the initial recording is performed with a
single stereo pair of microphones, and no signal processing is performed
on the overall program. Microphone-to-sound source distance is a contributor
to the timbral characteristics of the sound source. Microphone-to-sound
source distance will determine the amount of definition of the
sound source’s timbre (how much timbral detail is present in the sound)
captured by the recording process. It will also determine the amount of
the sound of the initial recording environment that has become part of the
sound source’s timbre. Generally, the closer the microphone to the sound
source, the greater the definition of timbral components captured during
the recording process. This will provide distance localization information,
in such a way that very close microphone placement will cause the image
to be perceived very close to the listener (if no timbral modifications or
signal processing are performed in the mixing process). The sound quality
may be significantly altered in the mix, significantly altering microphone-to-sound
source distance cues. Microphone-to-sound source distance contributes
to the overall sound quality of the source’s timbre. It contributes
to our localization of distance through definition of timbral detail, but it is
NOT a primary determinant of distance localization.
Distance location IS primarily the result of timbral information and detail.
Timbre differences between the sound source as it is remembered in an
unaltered state and the sound as it exists in the recording primarily determine
distance localization. The listener is aware of how timbres are altered
over various distances. It is through the perception of these changes that
we identify the distance of a source from our listening location. Humans
are unable to estimate the physical distance (meters, feet, etc.) of a sound
source from their location. We perceive distances in relative terms and
compare locations to one another. In our everyday activities we mostly rely
on sight to make distance judgments. Therefore, our ability to focus on the
“sound” of distance has not been encouraged by our real-life experiences,
and will take focused effort to develop.
We rely on timbral definition for most of our distance judgments and use
the ratio of direct-to-reverberant sound to a lesser degree. The extent to
which we rely on either factor depends on the particular context. Distance
localization is a complex process, relying on many variables that are incon-
sistent between environments.
The listener knows the sound qualities of sound sources within the area
immediately around them. The listener has a sense of occupying an area,
encompassing a space immediately around them. This area serves as a reference
from which we judge “near” and “far.” Within this area, sounds have
no changes in timbre. All characteristics of timbral content are present, and
sounds will have more detailed definition the closer they are to the listener.
The overall sound quality of the sound source may be somewhat altered
by the characteristics of the host environment, but the level of detail present
in the timbre causes the listener to perceive the source as being within
their immediate area of proximity. This space that immediately surrounds
the listener is called proximity and is the listener’s own personal space.
Sounds in proximity are perceived as being close, as occurring within the
area the listener occupies. The actual size or area of proximity may be per-
ceived as being rather large or very small, depending on the context of the
material and the perspective of the listener.
Figure 9-10 Continuum for designating distance location.
[Figure: the continuum extends from the listener and “adjacent” through the
areas of Proximity, Near, and Far, with a horizon of detailed distance
perception marked on it. Proximity: extreme level of detail in timbral
components; no alterations of sound source timbre. Near (just beyond the
listener’s immediate vicinity of proximity): slight to moderate changes in
source timbre; moderate level of definition of timbral components. Far:
moderate to considerable changes in timbre; sound sources begin to lack
definition of sound quality, distance relationships begin to lack definable
localization, and sounds are difficult to recognize and localize.]
The listener knows the sound qualities of sound sources at “near” distanc-
es. We conceive “near” as being immediately outside of the area that we
perceive ourselves as occupying. Throughout this “near” area, the listener
is able to localize the sound’s distance with detail and accuracy. Timbres
are very slightly altered in the closest of sounds considered near and are
moderately altered in the furthest of sounds considered near. An area will
exist between these two boundaries where sounds are readily compared
as being closer or farther than other similar sounds. Sounds cease to be
considered near when the listener begins to have difficulty localizing distances
in detail.
“Far” sound sources lack some definition of sound quality. The closest of
far sounds will have moderate alterations to sound quality, with little or
no definition. Few low amplitude partials will be present, and amplitude
and frequency attack transients will be difficult to detect. The furthest of
far sounds will have considerable alterations to sound quality; the sounds
will lack all definition. The furthest far sounds may even be difficult to
recognize. A wide area exists between these two boundaries of “far.” It could
conceivably be quite large, perhaps stretching to infinity. “Far” sounds are
difficult to place in specific distance locations within the “far” area, but
they can be localized quite readily by comparing them with other sounds.
Evaluating Distance Location
The listener will initially focus on distance cues of the sound source at the
perspective of the source within its own host environment. The listener will
intuitively transfer that information to the perspective of the sound stage.
There, the sound source’s degree of timbre definition, the perceived distance
of the sound source within its host environment, and the perceived
distance of the source’s host environment from the perceived location of the
listener blend instinctively into a single perception of distance location. This
process will determine the actual perceived distance of the sound source, at
the perspective of imaging. Fortunately, this all happens quite naturally.
Again, the definition of, or the amount of, timbral detail present will play
the central role in determining perceived distance.
The boundaries for distance location extend from “adjacent” to the listener
to a distance of “infinity.” Adjacent is that point in space that is immediately
next to the space the listener is occupying. It should be conceived literally
as being the next molecule available beside the listener, as a sound may be
localized at that location.
The continuum for distance localization consists of three areas. The areas
represent conceptual distance, not physically measurable distance incre-
ments. Distance is thus judged as a concept of space between the sound
source and the listener. An area of proximity surrounds the listener. This
area serves as a reference for judging near and far distances. Human expe-
rience of the nature of sound is used as a reference to conceptualize the
amount of space (distance) between the source and the listener.
The three areas of the continuum are:
1. An area of “proximity,” the space that the listener perceives as their
own area, is the area immediately surrounding the listener that may be
extended in size to be conceived as the size of a small to moderately
sized room. The listener will perceive the proximity area as being their
own immediate space;
2. A “near” area is the area immediately outside of the space that the listener
perceives themselves as occupying, extending to a horizon where
the listener begins to have difficulty localizing distances in detail; and
3. A “far” area, beginning where perception dictates that space ceases
to be “near,” where detailed examination of the sound is difficult, and
extending to where sounds are almost impossible to recognize. Extreme
far sound sources contain very little definition of sound quality.
These three areas are not of equal physical size. The amount of physical dis-
tance contained in the conceptual area of proximity will be considerably dif-
ferent than the physical distance encompassed by the conceptual area of
far. All three areas of the continuum occupy a similar amount of conceptual
space, but represent significantly different amounts of physical
area. The vertical axis of the distance location graph will
clearly divide the three areas.
The size of the three areas may be adjusted between
appearances of the distance location graph. The amount
of vertical space occupied by the areas may be adjusted
to best suit the material being graphed, with certain areas
being widened in certain contexts and narrowed in others.
The far area may even be omitted in certain graphs. The
area of proximity should always be included (although if necessary it may
be narrowed to occupy less vertical space), to clearly present the conceptual
distance between the perceived location of the listener and the front edge
of the sound stage.
Sound sources will be placed on the graph (1) by evaluating the definition
of the sound quality of each sound source (the amount of detail present in
the timbre of the sound source), and incorporating information on the ratio
of direct-to-reverberant sound and the quality of the reverberant sound as
appropriate, and (2) by directly comparing the sound source to the per-
ceived distance locations of the other sound sources present in the musical
context (using proportions of different locations between three or more
sound source distances, to make more meaningful comparisons).
The individual listener’s knowledge of timbre and environmental characteristics,
and their ability to recognize the sound source, are variables that
may cause the listener to inaccurately estimate distance. For example, a
very close tamboura may sound like a far sound to a person who does not
know the sound of a tamboura. As the life experience of listeners varies,
so does an individual’s ability to conceptualize the distance relationships
of sounds.
During initial studies, distance judgments may be difficult to conceive and
perceive. Distance is, however, a central concern of sound-source imaging
and thus of music production. Skill in this area can be refined and should
become highly developed.
The distance location graph incorporates:
1. Continuum from adjacent through infinity (divided into three areas) as
the Y-axis;
2. The X-axis of the graph dedicated to a time line divided into an appro-
priate increment of the metric grid;
3. A single line plotted against the two axes for each sound source; and
4. A key that is required to clearly relate the sound sources to the graph.
The key should be consistent with keys used for other similar evalu-
ations (such as musical balance or stereo location) to allow different
elements to be easily compared.
Listen . . .
to tracks 39-41
for a single cello performance that
is placed in Proximity, Near and
Far distance locations.
The locations of all sources are plotted as single lines. Sources are precisely
located at a specific distance from the listener. No two sources can be at
the same distance level unless they are at clearly different lateral locations
(placement of phantom image in stereo or surround formats).
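Readers who keep their listening notes in digital form may find it useful to
see how the elements of the distance location graph could be held in a simple
data structure. The Python sketch below is one hypothetical arrangement, not a
prescribed format: the class names, the 0.0 to 1.0 position within an area, and
the choice to give each area one unit of conceptual depth (following the idea
that the three areas occupy similar conceptual space) are all assumptions made
for illustration, as are the example placements.

```python
from dataclasses import dataclass, field

# The three conceptual areas of the continuum, ordered from the listener outward.
AREAS = ("proximity", "near", "far")


@dataclass
class DistancePoint:
    measure: int      # position on the X-axis time line
    area: str         # one of AREAS
    position: float   # 0.0 = front edge of the area, 1.0 = rear edge

    def depth(self):
        """Conceptual (not physical) depth; each area spans one unit."""
        if self.area not in AREAS:
            raise ValueError(f"unknown area: {self.area}")
        if not 0.0 <= self.position <= 1.0:
            raise ValueError("position must be between 0.0 and 1.0")
        return AREAS.index(self.area) + self.position


@dataclass
class DistanceGraph:
    # Key: source name -> the points that make up that source's single line.
    lines: dict = field(default_factory=dict)

    def place(self, source, measure, area, position):
        point = DistancePoint(measure, area, position)
        point.depth()  # validate before storing
        self.lines.setdefault(source, []).append(point)

    def front_and_rear(self):
        """Front and rear boundaries of the sound stage, as conceptual depths."""
        depths = [p.depth() for points in self.lines.values() for p in points]
        return min(depths), max(depths)


# Illustrative placements only; these are not readings from any recording.
graph = DistanceGraph()
graph.place("voice", 1, "proximity", 0.8)
graph.place("piano", 1, "near", 0.1)
front, rear = graph.front_and_rear()
```

The nearest and furthest placements give the front and rear boundaries of the
sound stage. The values are relative judgments and do not stand for meters or
feet.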
Sound sources do not often change distance locations in real time or within
sections of a work. Changes are most likely to occur between sections of
a piece, at entrances or exits of individual sound sources, or at repetitions
of ideas or sections (where changes in the mix often occur). The listener
should, however, never assume changes do not occur. Gradual changes
in distance are present in many pieces. At times a sound will become
unmasked by the exit of another instrument from the mix, and will move
closer to the listener with its increased timbral detail.
The reader is encouraged to work through the Distance Location Exercise
at the end of the chapter.
Figure 9-11 Distance location graph: The Beatles’ “A Day in the Life.”
[Graph: the X-axis is a time line in measures (1 through 33, in 4/4 except
where marked) with the sections Intro, Verse 1, Verse 2, Verse 3, and Bridge
labeled; the Y-axis runs from Adjacent through Proximity, Near, and Far.
Sources plotted: Piano, Maracas, Guitar, Voice, and Bass.]
Figure 9-11 is a distance location graph from The Beatles’ “A Day in the Life.”
While the vocal line has a significant percentage of reverberant sound, its
timbral detail brings it to be located in the rear third of the proximity area.
The other sound sources have widely varied distance locations, giving the
sound stage great depth. The guitar, maracas, and bass are also in the prox-
imity area, and the piano is at the front of the near area. As an additional
exercise, the reader should place the percussion sounds in an appropriate
distance location. Notice the great conceptual distances between the vari-
ous instruments of the drum set, as some sounds are located toward the
rear of the far area.
The fade of “Lucy in the Sky with Diamonds” is graphed in Figure 9-12. The
vocal lines and bass do not change distance locations with diminishing
loudness. The snare drum and organ are quite different, and do change
distance locations as their loudness levels decrease. It is interesting to note
that the speed and amounts of perceived distance-location changes are dif-
ferent for the two sounds, as they decrease in loudness almost in parallel.
Figure 9-12 Distance location graph of the fade out from The Beatles’ “Lucy in the
Sky with Diamonds” (Yellow Submarine, 1999).
[Graph: the X-axis is a time line in measures (108 through 120); the Y-axis runs
from Adjacent through Proximity, Near, and Far. Key: 1 Organ, 2 McCartney
Vocal, 3 Lennon Vocal, 4 Bass, 5 Snare.]
Environmental Characteristics
The characteristics of the sound source’s host environment are important in
shaping four qualities of the recording: (1) the overall quality of the sound
source, (2) the perceived performance environment, (3) space within space,
and (4) the imaging of the sound stage.
The environmental characteristics of the entire program (the perceived
performance environment) shape the illusion of a space in which a per-
formance is occurring. The characteristics of this envisioned performance
environment will greatly influence the conceptual setting for the artistic
message of the work.
Environmental characteristics of both the host environments of the individual
sound sources and the perceived performance environment play significant
roles in music production. These artistic elements have the potential to
provide significant information for enhancing and communicating the musical
message of the piece of music. Currently, they are most often used in
supportive roles. They are coupled with sound quality in defining the unique
characters of individual sounds (enhancing their sound quality and providing
each sound with a sense of depth). Environmental characteristics are used as
a separate element in creating depth of sound stage, in providing a resource
for space within space, in creating the illusion of the perceived performance
environment, and in giving breadth and depth to phantom images.
The recordist needs to be able to recognize the characteristics of the envi-
ronments within which sound sources exist, and the characteristics of the
perceived performance environment. This will lead to an understanding
of how they influence the recording, and will help the recordist more
effectively craft a sound stage within an envisioned environment that best
suits their project.
Evaluating Environmental Characteristics
A composite sound of environmental characteristics occurs with the sound-
ing of a sound source within the environment. The sound source and the
environment interact and fuse to create a composite sound—a new overall
sound quality. To understand the influence of the host environment on the
sound source in making this composite sound, the environment and the
sound source must be evaluated separately.
We perceive environmental characteristics as an overall sound quality that
is composed of a number of component parts. As global sound qualities,
environmental characteristics are conceptually similar to timbre. The evalu-
ation of timbre (sound quality) and environmental characteristics will be
similar, in that both will seek to describe the states and activities of the
physical components of sound. While people might recognize large halls,
small halls, and other spaces as having common environmental characteristics,
each environment is unique. To meaningfully communicate information
about environmental characteristics, the audio professional must
define the environment by its unique sound characteristics. These characteristics
can only be objectively described through discussing the levels
and activities of the component parts of the environment’s sound.
Environmental characteristics appear as alterations to the sound source’s
timbre, created by the interaction of the sound source and the environment.
The evaluation of the characteristics of the environment therefore
defines the changes that have occurred in the sound source’s timbre after
being sounded within the environment. Evaluating environmental charac-
teristics will engage activities that are contrary to our natural tendency to
fuse the environment’s sound with the source’s sound. Care in focusing on
the correct perspective and aspects of sound will be required.
Environmental characteristics are determined by listeners through com-
paring their memory of the sound source’s timbre outside of the host envi-
ronment to the sound source’s timbre within the host environment. The
listener must go through this comparison process carefully, scanning the
composite sound for information and then comparing that information
with their previous experiences with the timbre of the sound source (at
times considering how the source appeared within other environments).
Differences in the spectrum and spectral envelope of the sound source as
remembered by the listener, and as heard in the host environment, form
the basis for determining most environmental characteristics.
If the listener does not recognize the sound source (timbre) or has no prior
knowledge of the sound source, they will be at a disadvantage in calculat-
ing the characteristics of the host environment. Listeners will have no point
of reference in determining how the environment has altered the timbre of
the original sound source. They must rely on their knowledge of what they
presume to be similar sounds to calculate estimations of the characteris-
tics of the environment. This may or may not turn out to be accurate. Loud
sounds of shorter duration can also be sought, as they expose much time
and spectrum information.
In evaluating environmental characteristics, the listener is seeking to define
the characteristics of the environment itself. The listener must make certain
they are NOT identifying characteristics of the sound source and must make
certain they are NOT identifying characteristics of the sound source within the
environment. The characteristics of the environment can be reduced to three
specific component parts. These characteristics are what must be determined
by identifying the differences between the sound quality of the sound source
itself and the sound quality of the sound source within the environment.
The component parts of environmental characteristics are (1) the reflection
envelope, (2) the spectrum, and (3) the spectral envelope. The environmen-
tal characteristics graph (Figure 9-13) allows for the detailed evaluation of
these three components.
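Figure 9-13 Environmental Characteristics Graph.
[Graph: three tiers, Reflections, Spectrum, and Spectral Envelope, plotted
against a shared time axis; dynamic levels are marked p, mp, mf, and f, with
the direct sound and the nominal level indicated as references.]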
Reflection Envelope
The reflection envelope is made up of the amplitudes of the initial reflections
and the reverberant energy of the environment throughout the duration
of the environment’s sound. This envelope is composed of many reiterations
of the sound source. The reiterations vary in dynamic level and in
spacing between one another (time density).
The spacing of the reiterations of the sound source will be dramatically different
over the duration of the sound of the environment (for example, the
spacing of the early reflections will be considerably different from the spacing
of reflections near the end of the reverberant energy). This portion of the
graph can clearly show the time of arrival of the reflected sounds to the listener
location, the density of the arrival times of the reflected sounds, and the
amplitude of those arrivals in relation to the amplitude of the direct sound.
The amplitude of the direct sound is included to serve as a reference level
for the calculation of the dynamic levels of the reflected sound. The recordist
should use their skills at pattern recognition to extract time information
from the sound of the environment. The reflections portion of the graph will
show the following information. The listener should organize their listening
to the time elements of the environmental characteristics to recognize this
information:
Patterns of reflections created by dynamics
Patterns of reflections created by spacings in time
Spacing of reflections in the early time field
Dynamic contour of the entire reflections portion
Density (number and spacings of reflections) of reverberant sound
Dynamic relationships between the direct sound, individual reflections (of the early time field), and the reverberant sound
Dynamic contour shapes within the reverberant sound
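Several of the items listed above (spacing, density, and dynamic relationships
to the direct sound) can be illustrated numerically when reflection arrival
times and amplitudes are already known, for example from a measured impulse
response. The Python sketch below is a rough illustration under that
assumption; the function names, the 25 ms window, and the arrival times are
hypothetical, and the decibel scale stands in for the dynamic levels used on
the graph.

```python
import math


def reflection_spacings(arrival_times_ms):
    """Time gaps between successive reflections, in milliseconds."""
    ordered = sorted(arrival_times_ms)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def reflection_density(arrival_times_ms, window_ms, total_ms):
    """Count reflections in consecutive windows across the environment's sound."""
    window_count = int(-(-total_ms // window_ms))  # ceiling division
    counts = [0] * window_count
    for t in arrival_times_ms:
        if 0 <= t < total_ms:
            counts[int(t // window_ms)] += 1
    return counts


def levels_relative_to_direct(amplitudes, direct_amplitude):
    """Reflection levels in dB, with the direct sound as the 0 dB reference.

    Amplitudes must be greater than zero.
    """
    return [20.0 * math.log10(a / direct_amplitude) for a in amplitudes]


# Hypothetical arrival times (ms after the direct sound) and linear amplitudes.
arrival_times_ms = [12, 19, 31, 40, 44, 47, 49, 50]
amplitudes = [0.5, 0.4, 0.3, 0.25, 0.2, 0.15, 0.1, 0.1]

spacings = reflection_spacings(arrival_times_ms)
density = reflection_density(arrival_times_ms, window_ms=25, total_ms=50)
levels_db = levels_relative_to_direct(amplitudes, direct_amplitude=1.0)
```

In this invented example the gaps shrink and the count per window rises toward
the end, which is the kind of change in time density the reflections portion of
the graph is meant to show.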
An isolated appearance of the sound source in the host environment must
be found for all of the time and reflection-amplitude information to be
accurately evaluated. This is especially true for the decay of the reverberant
energy and the spacing of the early reflections (which are important parts
of the environment’s sound quality). A short (staccato) sound will allow the
reflection information to be most audible, since it will not have to compete
with the sound of the source itself.
An exercise to develop the reader’s skill in this area appears at the end
of this chapter. The Reflections and Reverberation Exercise helps readers
understand and recognize this characteristic of environmental sounds even
if they never wish to perform an evaluation as detailed as the reflection
envelope of the environmental characteristics graph. The exercise will lead
the reader to important observations that can lead to a better understand-
ing and use of environments.
When listening to sounds in actual music recordings, hearing these char-
acteristics is a much greater challenge—and may at times be impossible.
Without the opportunity to hear the environment’s complete presentation,
information related to the reverberant energy might never be completely
audible. The listener should try to find several appearances of a sound
source where it can be heard alone, without other sound sources, and
where it is playing short durations.
Many hearings of the sound in a wide variety of presentations will be nec-
essary to compile an accurate evaluation of all of the time characteristics
of the environment.
Environment Spectrum and Spectral Envelope
The spectrum of the reverberant sound and the initial reflections is a collection
of all frequencies or bandwidths of pitch areas that are emphasized
and de-emphasized by the environment itself. This spectrum will be only
those frequencies that are altered by the environment. The environment
may emphasize or de-emphasize bandwidths of frequencies or specific frequencies.
Often the spectrum of the environment will only contain a small
number (three to seven) of prominent frequencies or pitch areas that are
either emphasized or attenuated (de-emphasized).
These frequencies are determined through a careful evaluation of many
appearances of the sound source in the environment, by listening to the
way the sound source’s timbre is changed by the environment over a wide
range of pitch levels. Some appearances of the sound source will not have
frequency information in certain frequency areas that are emphasized or
de-emphasized by the environment. The listener must scan many pitch lev-
els of the sound source to determine the spectral content and the spectral
envelope of its environment.
The spectral envelope of the environment is how the frequencies that
are emphasized and de-emphasized by the environment (spectrum) vary
in loudness level over the duration of the sound of the environment. The
spectral envelope and spectrum portions of the graph are coordinated to
present different activity of the same sound components (as with sound
quality evaluation).
A nominal level is used as a reference for plotting the dynamic contours
of the spectral components. The nominal level will vary in loudness/amplitude
over the sound’s duration. The nominal level is the dynamic envelope
of the environment, where the sound source’s frequency components are
unaltered. The dynamic envelope of the environment changes over time.
It is the dynamic contour that is outlined by the reflections envelope. This
dynamic envelope is represented as a fixed, steady-state level on the spectral
envelope portion of the environmental characteristics graph.
The nominal level is placed at the dynamic level precisely between mezzo
forte and mezzo piano. Frequencies or pitch areas that are emphasized by
the environment will be plotted as activity above the nominal level. Fre-
quencies or pitch areas that are de-emphasized by the environment will be
plotted as activity below the nominal level.
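The relationship between a changing dynamic envelope and a fixed nominal line
can be illustrated with a short calculation. The Python sketch below subtracts
a hypothetical nominal envelope from the level of each spectral component at
each time step, so that activity appears above or below a steady reference, as
on the spectral envelope tier. The decibel values, the band names, and the 1 dB
tolerance are assumptions made for the example; the graph itself uses dynamic
markings rather than decibels.

```python
def classify_against_nominal(band_levels_db, nominal_db, tolerance_db=1.0):
    """Label each band at each time step relative to the nominal level.

    band_levels_db: {band_name: [level per time step]}
    nominal_db:     [nominal level per time step]
    Returns {band_name: [(deviation_db, label), ...]}.
    """
    result = {}
    for band, levels in band_levels_db.items():
        if len(levels) != len(nominal_db):
            raise ValueError(f"{band}: expected {len(nominal_db)} time steps")
        labelled = []
        for level, nominal in zip(levels, nominal_db):
            # Subtracting the nominal envelope flattens it into a fixed line.
            deviation = level - nominal
            if deviation > tolerance_db:
                label = "emphasized"
            elif deviation < -tolerance_db:
                label = "de-emphasized"
            else:
                label = "unaltered"
            labelled.append((deviation, label))
        result[band] = labelled
    return result


# Hypothetical decaying environment: the nominal level falls over three steps.
nominal_db = [-10.0, -16.0, -22.0]
band_levels_db = {
    "250 Hz area": [-6.0, -13.0, -22.5],
    "4 kHz": [-14.0, -24.0, -34.0],
}
evaluation = classify_against_nominal(band_levels_db, nominal_db)
```

Here the lower band sits above the nominal line at first and then returns to
it, while the upper band falls further below the line as the sound decays.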
An exercise to help develop skill in hearing environmental characteristics
spectrum and spectral envelope appears at the end of this chapter. In a
similar way to the reflections and reverberation exercise, this exercise
will increase the listener’s ability to recognize these important aspects of
environmental characteristics. Undertaking the detailed task of creating an
environmental characteristics graph is not necessary to
develop the skills needed to describe the spectrum and
spectral envelope of an environment. This exercise will,
however, aid the listener in developing the skills to make
such observations accurately and with as much detail as
the audio professional’s position requires.
Listen . . .
to tracks 41, 44, 45, 45 and 47
for individual sounds and entire
mixes with strong environmental
characteristics.
Environmental Characteristics Graph
The environmental characteristics graph allows for a detailed evaluation of
the reflection envelope, the spectrum and the spectral envelope (see Figure
9-13). Creating environmental characteristics graphs will greatly assist in
understanding the nuance of any environment. When created with much
detail, this graph requires great skill that will be acquired over an extended
period of practice and patience. Using this graph for general observations
and beginning studies will also prove very helpful to the beginner and audio
professional alike. Observations can be recorded for future reference and
to assist in learning, understanding, and recognizing this artistic element.
The environmental characteristics graph incorporates:
1. Three tiers as the Y-axis: reflections (a continuum of dynamic level),
spectrum (a continuum of pitch level), and spectral envelope (a continuum
of dynamic level);
2. The reflections portion of the graph, comprising a vertical line at each
point in time that a reflection occurs. The height of the vertical line corresponds
to the amplitude of the reflection. The dynamic level of the
direct sound is indicated on the vertical axis and serves as a reference
for calculating the dynamic levels of the reflections. This portion of
the graph presents information on the dynamic contour of the reflections
of the environment and the spacings, in time, of the reflections
throughout the sound of the environment;
3. The spectrum portion of the graph, comprising the registers established
in Chapter 6. Spectral components are placed against the Y-axis
by pitch/frequency level. A single line is plotted against the two axes
for each spectral component, and it will occupy a large, colored/shaded
area in the case of pitch area and a narrow line in the case of a specific
frequency;
4. The spectral envelope portion of the graph, which depicts the dynamic
contours of the spectral components, using dynamic areas as the Y-axis;
5. The X-axis of the graph, a time line divided into an appropriate time
increment (usually needing to allow millisecond increments to be
observed) to clearly display the smallest change of a duration, dynamics,
or pitch present in the characteristics of the environment; and
6. A key that is required to clearly relate the components of the spectrum
and spectral-envelope tiers of the graph.
The perspective of the environmental characteristics graph will always
be of either the individual sound source or of the perceived performance
environment.
It is not always possible to compile a detailed evaluation of environmental
characteristics. The information of the environment is often concealed by
other sounds in the musical texture. Further, it is not easily separated from
the sound quality of the sound source itself. The ability to recognize envi-
ronmental characteristics involves much focused attention and practice. It
relies (1) on a knowledge of many sound sources, (2) on an ability to evalu-
ate sound quality of the sound source within the host environment, and (3)
on an acquired skill for comparing and contrasting a previous knowledge
of the sound source with the appearance of the sound source within the
environment that is to be defined.
Using the graph for general evaluations of environmental characteris-
tics is often the most feasible approach. This approach will not require as
advanced a skill level as a detailed graph and will provide a good amount
of significant information. These general evaluations are acceptable for
most applications. They provide pertinent information quickly, but without
the subtle details that are diffi cult and time intensive to identify.
General evaluations of environmental characteristics will include (1)
the contour and beginning level of the reverb, (2) the level of the direct
sound, and (3) the most prominent frequencies or frequency bands that
are emphasized or attenuated. If possible, and after practice, they should
include a general description of the spectral envelope and an indication of
the content of the early time field.
The complexity of environmental characteristics can vary widely. Certain
environments will have very few spectral differences from the original sound
source. Some environments will have no reflections present between the
early time field and the reverberant energy, and the reverberant energy will
increase in density through a simple, additive process. Other environments
may be quite sophisticated in the way they were created, with time increments
of the early time field precisely calculated at different time intervals,
with spectral components precisely tuned in patterns of frequencies
(designed to complement the sound source), and with spectral envelope
characteristics reacting accordingly. Natural environments and those that
are created can be remarkably similar or different in character.
It is possible for all perceived environmental characteristics to be changed
in real time, with our current technology. It is also possible for the per-
ceived environment of a sound source to be generated solely by a delay
unit, by a simple reverberation unit, or by any similar process that would
provide easily calculated cues. Although these environments would not be
perceived as natural spaces, the listener would proceed to imagine an envi-
ronment created by the impressions of those simple characteristics. The
listener simply will not perceive a sound as being void of environmental
characteristics. If no environment is present, it will be imagined.
Figure 9-14 presents an environmental characteristics evaluation of McCartney’s
vocal from the opening of “Hey Jude.” The graph shows several alterations
of spectrum and spectral envelope, and the environment’s subtle
time elements. The sparse texture during the beginning of the song allows
the characteristics of all environments to be perceived quite clearly.
Figure 9-14 Environmental characteristics graph of Paul McCartney’s lead vocal in The Beatles’ “Hey Jude.”
[Graph tiers: Spectral Envelope (pp to f, with nominal level), Spectrum, and Reflections (pp to f, with direct sound); X-axis: time in msec, 0 to 80.]
Exercise 9-5 at the end of this chapter provides guidance in learning to evalu-
ate environmental characteristics. The reader will gain much from attempting
this exercise on several different sounds from several different recordings.
Space within Space
The overall environment of the program provides a setting within which
the subordinate environments of the individual sound sources will appear
to exist. This overall environment (or perceived performance environment)
is a constant that equally influences the individual environments of all
sound sources. The perceived performance environment becomes part of
the overall character of the recording/piece of music.
The overall environment is either (1) perceived by the listener as a
composite of the dominant, predominant, and/or common characteristics
of the individual environments of the sound sources, or (2) a set of
environmental characteristics that is superimposed on the entire program.
Works will be perceived to have a single overall environment that is pres-
ent throughout the piece. The listener will imagine a single space (the per-
ceived performance environment) in which the performance (recording)
occurs. When this overall environment is created by adding environmental
characteristics to the entire musical texture, it is possible for the overall
environment to change during the course of a work. In such instances,
abrupt changes (usually at a major division of the form of a piece, such as
between verse and chorus) are most common. The various environments
will be perceived as having occurred within a single overall environment,
even if a single environment is not present.
The perceived performance environment may be a composite of many per-
ceived environments and environmental cues. The listener will perceive the
overall performance environment in this way, if an overall environment has
not been applied. In these instances, the perceived performance environ-
ment is envisioned by the listener as a result of environmental character-
istics (1) that are common or complementary between the environments
of the individual sound sources, (2) that are prominent characteristics of
the environments of prominent sound sources (a source that presents
the most important musical materials, or the loudest, nearest, or furthest
sound sources, as examples), and/or (3) that are created by environmental
characteristics found in both of these areas.
Within this overall environment, the individual environments of sound
sources are perceived to exist. This is the illusion of space within space.
If reverberation has been applied to the overall program to create a per-
ceived performance environment, all of the recording’s sound sources and
their host environments will be altered by those sound characteristics.
Figure 9-15 Perceived performance environment of The Beatles’ “Hey Jude.”
[Graph tiers: Spectral Envelope (pp to ff, with nominal level), Spectrum, and Reflections (pp to mf, with direct sound); X-axis: time in msec, 0 to 500.]
Figure 9-15 is the perceived performance environment of The Beatles’ “Hey
Jude.” The lead vocal, with its fused environmental characteristics of Figure
9-14, appears contained within this overall environment of the recording/performance.
This space-within-space illusion is convincing in bringing
the listener to accept the small space of the lead vocal contained within
a midsized, natural-sounding performance space (perceived performance
environment).
A complete space-within-space evaluation will include the environmen-
tal characteristics of all sound sources and the perceived performance
environment. Relationships between the environments and the perceived
performance environment can then be understood and evaluated, among
other possible observations. This complete evaluation might not be undertaken
often, but this type of attention is often a level of focus in the mastering
process and the final mixdown. The skill to compare environments also
leads to an understanding of how environments can be used to enhance
sound sources and the recording in complementary ways.
Distance location and environmental characteristics are related. The inter-
relationships of distance location and space within space should always be
considered when evaluating a music production. They work in a comple-
mentary way to give depth to the sound stage. The two are closely interac-
tive, and comparing the two will offer the listener many insights into the
creative ideas of the recording.
Figure 9-16 Environmental characteristics evaluation of the first appearance of the piano in The Beatles’ “Hey Jude.”
[Graph tiers: Spectral Envelope (pp to f, with nominal level), Spectrum, and Reflections (pp to f, with direct sound); X-axis: time in msec, 0 to 800.]
Additional environments from The Beatles’ recording “Hey Jude” are pre-
sented in Figures 9-16, 9-17, and 9-18. The graphs allow us to recognize the
very different decay times of the three environments. When comparing the
graphs to Figures 9-14 and 9-15, we can also recognize the uniqueness of
all four host environments and their perceived performance environment.
It is interesting to note that the high and very high areas are attenuated
(in different bands) in the guitar and piano environments, and the spectral
alterations of the tambourine environment are very subtle. The reflections
are sparse in their spacing, to varying degrees, and fairly regular reflections
occur in all three environments.
Figure 9-17 Environmental characteristics evaluation of the first appearance of the guitar in The Beatles’ “Hey Jude.”
[Graph tiers: Spectral Envelope (pp to f, with nominal level), Spectrum, and Reflections (pp to f, with direct sound); X-axis: time in msec, 0 to 400.]
Figure 9-18 Environmental characteristics evaluation of the first appearance of the tambourine in The Beatles’ “Hey Jude.”
[Graph tiers: Spectral Envelope (pp to ff, with nominal level), Spectrum, and Reflections (ppp to f, with direct sound); X-axis: time in msec, 0 to 300.]
Surround Sound
As noted before, surround recording is in its infancy. While we have some
well-conceived recordings on the market, we will surely see profound
developments in how surround sound is used to deliver and enhance
music. Recording practice is still being defined; ways that surround locations
can be used to mix music are still being discovered as a result of
ongoing experimentation.
This section will present a way to document and evaluate the directional
location information of surround recordings. The ways the audio professional
documents and evaluates surround recordings will need to be adapted to
reflect future developments in production practice.
Format Considerations
Much debate and deliberation have occurred regarding surround formats.
Many different channel and loudspeaker combinations and placements
have been proposed for surround, too many to accurately count, let alone
cover here. These include formats from four channels to seven or eight,
most with subwoofers, and some with bipolar surround speakers. Some
formats have all speakers at ear level, others have the surround speakers
higher, and one format uses a wonderfully effective sixth overhead channel
(providing subtle but convincing ambience and some impressive vertical
cues). While there was once much debate about which format would
become the standard, the five-channel system with a subwoofer for
low-frequency effects has emerged from this fray of formats.
Widespread consumer adoption of the 5.1 cinema format has led to its
adoption for music as well. Using the specifications of International
Telecommunication Union (ITU) Recommendation ITU-R BS.775 (see Figure 9-19),
the format has proven stable, and was created after a great deal of thought
and experimentation. Further, it can accomplish almost all of what is reasonable
to expect of surround (albeit with the sad loss of the overhead
channel). While the wide spread angle of the surround speakers makes rear
phantom imaging unstable and can quickly pull images forward, the equidistant
placement of all five speakers has advantages for dynamic balance
and time-based considerations, and provides convincing ambience.
Another positive aspect of this format is that it is compatible with current two-
channel playback concerns. The 60° angle between the left and right speak-
ers provides for accurate listening to stereo recordings (see Figure 9-20). It
is the recommended listening relationship for accurate stereo reproduction
and can therefore also be used for evaluating two-channel recordings.
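The geometry of this layout can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the left and right speakers at -30° and +30° and the equal distances come from the layout described here, while the ±110° surround angle, the 2-metre radius, and the names `SPEAKER_AZIMUTHS`, `speaker_position`, and `angle_between` are hypothetical choices made for the example.

```python
import math

# Azimuths in degrees: 0 = straight ahead, negative = listener's left.
# L/R at -30/+30 follow the layout in the text; the +/-110 degree
# surround angle is an assumption made for this sketch.
SPEAKER_AZIMUTHS = {"C": 0.0, "L": -30.0, "R": 30.0, "Ls": -110.0, "Rs": 110.0}


def speaker_position(azimuth_deg, radius_m=2.0):
    """Return (x, y) in metres; x points right, y points to the front."""
    a = math.radians(azimuth_deg)
    return (radius_m * math.sin(a), radius_m * math.cos(a))


def angle_between(name_a, name_b):
    """Smallest angle in degrees between two speakers, seen by the listener."""
    diff = abs(SPEAKER_AZIMUTHS[name_a] - SPEAKER_AZIMUTHS[name_b]) % 360.0
    return min(diff, 360.0 - diff)


# All five speakers sit on one circle, so each is the same distance away.
layout = {name: speaker_position(az) for name, az in SPEAKER_AZIMUTHS.items()}
```

With these values, `angle_between("L", "R")` gives the 60° stereo listening angle, and the wider gap between `"Ls"` and `"Rs"` illustrates why rear phantom images are less stable than front ones.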
Figure 9-19 ITU-recommended speaker layout for surround sound.
[Diagram: C, L, R, Ls, and Rs loudspeakers at equal distances from the listener; L at -30° and R at +30°; subwoofer placed for best response.]
Figure 9-20 Two-channel sound reproduction with surround-sound loudspeaker placement.
[Diagram: C, L, R, Ls, and Rs loudspeakers, with 60° angles marked between the listener and the L and R loudspeakers.]
This format in the ITU specification (also defined by the Audio Engineering
Society) was used in making the evaluations of surround recordings that
appear in this book.
The audio industry will surely see profound developments in how sur-
round sound is used to deliver and enhance music. Use of surround sound
in music production is still being defined. Here we will look at the sound-stage
dimensions of location and distance, and at environmental characteristics
in terms of their potential and current use. We have no certain way to
predict how artistic expression will bring surround production into matu-
rity and can only examine what is currently before us.
Evaluating Location in Surround Sound
Sound location is evaluated in surround to identify the location and size of the
phantom images of sound sources. These cues hold significant information
for many surround productions, and may contribute significantly to shaping
the musical ideas themselves. These are the cues that separate surround
recordings from two-channel (stereo) recordings. Phantom images may
change locations or size at any time during a piece of music. These changes
may be sudden or they can be gradual, and may be prominent or subtle.
Figure 9-21 Surround sound location graph.
[Graph: Y-axis locations, top to bottom: [RC], Ls, L, C, R, Rs, [RC].]
The surround sound location graph will allow the reader to plot the locations
of all sound sources against the time line of the work. The graph can
portray any lateral direction of sources from the listener and the size of the
phantom images, 360° around the listener.
“Left,” “right,” “center,” “left surround,” “right surround,” and “rear center”
locations are identified on the graph. The sound source’s location is represented
by placing a mark on the graph at the location of the phantom
image. The angle of the sound source from the listener can be determined
from the centerline out 180° up or down, and degree-increments of angle
may or may not be incorporated into the graph, as desired. The rear-center
location is placed at the very top and at the very bottom of the graph. This
allows sound movement and spread images across the rear sound field
to be graphed, although sometimes not as clearly as we might wish. As
Figure 9-22 shows, the movement of sound sources across the rear and
the locations of spread images would move off the top and bottom of the
graph to wrap the source to the other rear-center location.
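For readers who prepare these graphs with software, the wrap at rear center can be sketched as follows. This is a hedged illustration, not part of the graphing method itself: the sign convention (0° at center, negative angles toward the left and the top of the graph, ±180° at rear center) and the function names `normalize` and `spread_segments` are assumptions made for the example.

```python
def normalize(azimuth_deg):
    """Fold any angle into -180 <= angle < 180 (0 = center, negative = left)."""
    return (azimuth_deg + 180.0) % 360.0 - 180.0


def spread_segments(start_deg, width_deg):
    """Split a spread image into the segments drawn on the Y-axis.

    The image begins at start_deg and extends width_deg toward the right
    (0 <= width_deg < 360 is assumed). An image that crosses rear center
    is returned as two segments: one running off the bottom of the graph
    (+180) and one re-entering at the top (-180).
    """
    start = normalize(start_deg)
    end = start + width_deg
    if end <= 180.0:
        return [(start, end)]
    return [(start, 180.0), (-180.0, end - 360.0)]
```

For example, `spread_segments(150, 60)` describes an image spanning the rear from right to left; it yields one segment from 150° to 180° and a second from -180° to -150°, which is the wrapped appearance described for Figure 9-22.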
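Figure 9-22 Rear-center sound source movement and spread images on surround location graph.
[Graph: Y-axis locations, top to bottom: [RC], Ls, L, C, R, Rs, [RC]; X-axis: time; key: single spread image, point source.]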
The surround sound location graph incorporates:
1. “Left,” “right,” “center,” “left surround,” “right surround,” and “rear center”
as the Y-axis;
2. The X-axis of the graph, dedicated to a time line that is devised to fol-
low an appropriate increment of the metric grid or is representative of
a major section of the piece (or the entire piece) for sound sources that
do not change locations;
3. A single line plotted against the two axes for each sound source and
occupying a large, colored/shaded area in the case of the spread image;
and
4. A key that is required to clearly relate the sound sources to the graph.
The key should be consistent with keys used for other similar analyses
(such as musical balance and performance intensity) to allow different
analyses to be easily compared.
A sound source may occupy a specific point in the horizontal plane of the
sound stage, or it may occupy an area within the sound stage, as was earlier
found in stereo sound location. Similarly, the surround location graph
will dedicate a source line to each source against a common timeline. Point
sources and spread images will be graphed in the same way as on the
stereo location graph.
A surround mix might have a secondary sound stage behind the listener
to augment the front sound stage, and have little activity at the direct sides
of the listener. This is found in a number of surround mixes. In such a case,
the Y-axis presented in Figure 9-23 would more clearly show the content
of that mix. Note the listeners’ heads are present to help the reader orient
himself or herself to the graph; they would not appear on a graph of a mix.
The location of the secondary sound stage might take some getting used
to, but in time will make considerable sense; a clear stereo sound stage
can be seen in front and behind the listener location. The sounds at the
side will be cumbersome to plot; Figure 9-24 shows a spread image at the
right side and a point source moving from the left front to the left rear to
demonstrate how such sounds would appear. It must be noted this format
might not be the best choice if many sounds of this type were present, but
in recordings that emphasize the front sound stage and create a rear sound
stage, this Y-axis would more clearly show the sound stage than the format
of Figure 9-21.
Figure 9-23 Alternative Y-axis for the surround sound location graph.
[Graph: Y-axis locations, top to bottom: Ls, RC, Rs, L, C, R; X-axis: time; key: single spread image, point source.]
Locations and image size do not often change within sections of a song/
piece of music. Changes are most likely to occur between sections of a
piece, or at repetitions of ideas or sections. These are places where changes
in the mix often make the best musical sense and most often occur. The
listener should, however, never assume changes did not occur. Gradual
changes in source locations and size are present in many pieces, and some-
times in pieces where such events are not expected.
Figure 9-24 Surround sound imaging of a hypothetical mix.
[Key: Lead Vocal; Lead Vocal Ambiance; Bell 1; Bell 2; Bell 3; Bell 4; Ambiance of all Bells; Keyboard; Acoustic Guitar; Bass; point source location.]
When sound sources do not change locations in a section, a stationary
graph without a time line can appropriately be used. Figure 9-24 presents a
surround location image that can be used to notate the locations of sound
sources in such instances. Sizes of images and placements are clearly
shown. It is also possible for the listener to sketch the sound stage—the
area where the performance appears to emanate. This allows the listener to
recognize when the surround speakers are being used for ambience only.
This figure also has the advantage of allowing the listener to notate secondary
sound-source phantom images created by nonadjacent pairs of loudspeakers
and by groups of speakers. These images cannot be incorporated
into the surround sound location graphs described above. The graph can
also show separations between a sound-source phantom image and its
ambiance (environmental characteristics) found in many surround produc-
tions. It can also show image location areas, containing breadth and depth.
These can envelop the listener or surround the listener.
Figure 9-24 depicts the surround imaging of a hypothetical mix containing
a number of sources separated from their environmental cues.
Figure 9-25 Surround sound location graph of “Money for Nothing” by Dire Straits.
[Graph: Y-axis locations, top to bottom: (RC), Ls, L, C, R, Rs, (RC); X-axis: measures 1 to 40, with elapsed times from 0:02 to 1:14; key: Vocal, Arpeggio Synth, Soprano Sax, Synth Bass.]
Figure 9-25 presents the surround-sound locations of a number of sounds
from the introduction to “Money for Nothing” by Dire Straits. The graph
clearly shows an arpeggiated synthesizer sound revolving around the lis-
tener; the sound rotates around the room four times, between 0:15 and
1:15. The vocal line and the soprano saxophone part provide important
front-center sources that vary in size and have a small amount of motion,
and a synthesized bass part creates a stable rear-center image.
Figure 9-26 presents the synthesized strings sound from the same section. The
sound extends the sound stage to surround the listener with a slight sense of
envelopment, filling the left front through left rear of the sound stage.
The reader should try to identify the remaining two sounds that perform
in this section and place them on one of these two graphs. Next, graph the
following section (the second half of the introduction) and the beginning of
the first verse. You will notice when the drums enter at 1:12 the front sound
stage becomes active and widens; this is reinforced as the electric guitar
enters quietly at 1:26. The solo guitar at 1:36 and the entrance of the lead
vocal (2:04) cause additional shifts in the sound stage and the perspec-
tive of the listener. Identifying these characteristics will assist the reader in
understanding the mix.
Figure 9-26 Surround sound imaging of the synthesized strings sound during the introduction of “Money for Nothing” by Dire Straits.
[Key: Synth Strings; Synth Strings + Ambiance.]
The reader should work through the Surround Sound Location Exercise at
the end of this chapter to refine their skill in localizing sources in surround
playback. The exercise should be performed on different pieces of music,
and should use the two Surround Sound Location Graph formats and the
Surround Sound Imaging figure.
Exercises
Exercise 9-1
Stereo Location Exercise
Find a work that displays significant changes in stereo location of sound
sources. Plot the location of the sound source that displays the greatest
amount of change, and at least one spread image and one point source. Do
this throughout the first two major sections of the piece.
The process of determining stereo-sound location will follow this sequence:
1. During the initial hearing(s), listen to the example to establish the length
of the time line. At the same time, notice the presence of prominent
instrumentation, with placements and activity of their stereo location,
against the time line.
2. Check the time line for accuracy and make any alterations. Establish the
sound sources (instruments and voices) that will be evaluated and sketch
the presence of the sound sources against the completed time line.
3. Notice the locations and size of the sound sources for boundaries of
size, location, and any speed of changing locations or size of image. The
boundaries of source locations will establish the smallest increment of
the Y-axis required. The perspective of the graph will always be of either
the individual sound source or of the overall sound stage.
4. Begin plotting the stereo location of the selected sound sources on the
graph. The locations of spread images are placed within boundaries; the
boundaries may be difficult to locate during initial hearings, but they can
be defined with precision; the listener should continue to focus on the
source until it is defined. The locations of point-source images are plotted
as single lines. These sources are often easiest to precisely locate and are
the sources most likely to change locations in real time.
5. Continually compare the locations and sizes of the sound sources to one
another. This will aid in defining the source locations and will keep the
listener focused on the spatial relationships of the various sound sources.
The evaluation is complete when the smallest significant detail has been
incorporated into the graph.
As you gain experience in making these evaluations, songs with more
instruments should be examined and longer sections of the works should be
evaluated.
Exercise 9-2
Distance Location Exercise
Select a recording with at least five sound sources that exhibit significantly different
distance cues. Plot the distance locations of those sources throughout
the first three major sections of the work.
The process of determining distance location will follow this sequence:
1. During the initial hearing(s), establish the length of the time line. Notice
the selected sound sources and any prominent placements and activity of
distance location, especially as they relate to the time line.
2. Check the time line for accuracy and make any alterations. Clearly identify
the sound sources (instruments and voices), and sketch the presence of
the sound sources against the completed time line.
3. Make initial evaluations of distance locations. Notice the locations of the
sound sources to establish boundaries of the sound stage (the location
of the front and rear of the sound stage). Notice any changing distance
locations and calculate any speed of changing locations. The placement
of instruments against the time line, more than the boundary of speed of
changing location (which are quite rarely used), will most often establish
the smallest time unit required in the graph to accurately show the smallest
significant change of location. The amount of activity in each area
will establish the amount of Y-axis space required. The perspective of the
graph will always be of either the individual sound source or the overall
sound stage.
4. Begin plotting the distance of the selected sound sources. Sound sources
will be placed on the graph by (1) evaluating the timbre definition of
each sound source by focusing on the amount of detail present, while
being aware of the amount and characteristics of the reverberant sound;
(2) transferring this evaluation into a distance of the source from the
listening location and penciling in the sound on the distance location
continuum for reference; (3) reconsidering the definition of the timbre (is
the source in the listener’s own space, or proximity? Is it near or far?), and
then placing the sound in relation to the sound stage; (4) precisely locat-
ing the distance location of the sound source by comparing the sound
source’s location to the locations of other sound sources.
5. Once several sounds are accurately placed on the distance continuum,
identifying additional distance locations is most readily accomplished by
directly comparing the sound source to the perceived distance locations
of the other sound sources present in the music. Use proportions of dif-
ferences between the locations of three or more sound source distances
to make for more meaningful comparisons. Is sound “c” twice or one-half
the distance from sound “a” as “a” is from sound “b”? How does this
compare to the relationship of sounds “d” and “c”? Sounds “c” and “b”?
Continually compare the distance locations of the sound sources to one
another. The evaluation is complete when the smallest significant detail
has been incorporated into the graph.
Remain focused on the distance location of the sound sources, making cer-
tain your attention is not drawn to other aspects of sound.
As you gain experience in making these evaluations, you should examine songs
with more instruments and evaluate longer sections of the works.
Exercise 9-3
Reflections and Reverberation Exercise
Find a snare drum sound that was recorded without environmental cues, such
as Track 23 on the enclosed CD. The sound will need to be repeated many
times, over a period of 10 or more minutes; place the track on repeat play if
possible. Route the sound through an appropriate reverb unit.
1. Make the reverb unit emphasize the items listed below one at a time, and
make radical (perhaps unmusical) settings of these parameters to learn
their characteristic sound qualities. Listen carefully to individual snare
drum hits while adjusting the device.
2. Seek to create a pronounced early time field from the unit. Establish
a clear set of two reflections and create a setting that will repeat this
pattern. A recurring pattern of reflections is the result. Listen carefully,
and alter the speed of the reflections and spacings of the reflections and
patterns.
3. Repeat this sequence with a clear set of 3, 4, and then 5 reflections, establishing
recurring patterns while gradually increasing the number of reflections
and the complexity of the pattern. Listen carefully to create and
recognize:
a. Patterns of reflections created by dynamics
b. Patterns of reflections created by spacings in time
c. Spacing of reflections in the early time field
d. Dynamic contour of the entire reflections portion
e. Density (number and spacings of reflections) of reverberant sound
f. Dynamic relationships between the direct sound, individual reflections
(of the early time field), and the reverberant sound
g. Dynamic contour shapes within the reverberant sound
You will begin to notice and recognize that certain spacings in time have a
certain consistent and unique sound quality. A “sound of time” can be under-
stood and recognized for delay times and reverberation rates. With patience
and practice, this skill can become highly refined, as many room designers
will attest.
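If a reverb unit with adjustable early reflections is not at hand, the recurring patterns of steps 2 and 3 can be planned on paper or in a short script before they are dialed in. The sketch below is one possible way to do this, under stated assumptions: the function name `reflection_pattern`, the spacings, the gain values, and the decay factor are all hypothetical and are not settings taken from any particular device.

```python
def reflection_pattern(spacings_ms, gains, repeats, decay_per_repeat=0.7):
    """Return a list of (time_ms, gain) taps, beginning with the direct sound.

    spacings_ms: time gap before each reflection in one pass of the pattern
    gains: level of each reflection relative to the direct sound (1.0)
    repeats: how many times the pattern recurs
    decay_per_repeat: level multiplier applied to each later repetition
    """
    if len(spacings_ms) != len(gains):
        raise ValueError("need exactly one gain per reflection")
    taps = [(0.0, 1.0)]  # direct sound at time zero, full level
    time_ms = 0.0
    scale = 1.0
    for _ in range(repeats):
        for spacing, gain in zip(spacings_ms, gains):
            time_ms += spacing
            taps.append((time_ms, gain * scale))
        scale *= decay_per_repeat
    return taps


# A pattern of two reflections (20 ms, then 35 ms later), repeated three times.
taps = reflection_pattern([20.0, 35.0], [0.5, 0.25], repeats=3, decay_per_repeat=0.5)
```

Changing only the spacings while holding the gains constant, and then the reverse, mirrors the listening comparison between patterns created by spacings in time and patterns created by dynamics.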
Exercise 9-4
Environmental Characteristics Spectrum and Spectral Envelope Exercise
Find a high-quality acoustic instrument sound and loop it in a DAW or
otherwise where it can be controlled. Route the sound through an appropri-
ate reverb unit.
1. Establish a reverb setting with three or more seconds of decay and with a
high proportion of reverb signal (or only reverb signal).
2. Alter the frequency response, equalization, or any similar frequency-processing
control on the reverb to emphasize and de-emphasize (attenuate)
several specific frequencies or frequency bands.
3. Play single pitches with short durations on the keyboard. Listen carefully
to how changes of settings alter the sound quality of the instrument. Keep
track of the settings played. Repeat this process while moving through the
entire frequency range(s) the unit will alter and listening (and learning)
carefully.
4. In a separate process, listen carefully to pitches played throughout the
instrument’s range, played through an unchanging reverb setting. Notice
how the qualities of some pitches are altered differently than others.
Changes in the environment’s spectrum and spectral envelope will occur
only if the particular pitches performed have spectral energy at the fre-
quencies being altered.
Repeat this process several hours later, and again several days later. During these sessions,
try to anticipate what the modification will sound like before you listen to
it. Check your memory and your recognition of many different spectrum changes.
Keep returning to this exercise to become comfortable with the material.
Exercise 9-5
Environmental Characteristics Exercise
Return to the work or works evaluated in the distance location exercise. Carefully
select three of the five sound sources previously evaluated for distance
and perform environmental characteristics evaluations on those sounds as
outlined below.
As an alternative, look for a suitable surround sound recording with few
sound sources. Identify three to five sources that have separate locations for
their direct sound and environmental characteristics. This will greatly assist
you in comparing the direct sound and the environment, and in isolating
environmental characteristics.
While these evaluations are most easily accomplished for short-duration per-
cussive sounds, environmental characteristics evaluation is possible for any
sound source as long as the reverberant energy of the environment is exposed
(not accompanied by or masked by other sound sources) after the sound
source has ceased sounding.
The process of determining environmental characteristics will follow this
sequence:
1. During initial hearings of the entire work, listen to each sound source to
identify a location where the sound is isolated throughout the duration of
the environment. Nearly always the graph’s time increments on the time
line will need to show milliseconds. Estimate the length of the time line
for that presentation of each sound source.
2. Check the time line for accuracy and make any alterations. Work in a
detailed manner to establish a complete evaluation of the reflections of
the sound. First, sketch the presence of the most prominent reflections
against the completed time line; then, establish the precise time placement
and the dynamic levels of these prominent reflections against the
time line. Use the prominent reflections as references to fill in the remaining
reflections in the early time field. After the early time field is plotted,
complete the reflections portion of the graph by plotting the dynamic
envelope and spacing of reflections (density) of the reverberant energy.
3. Notice the locations and size of any emphasized or de-emphasized pitch
areas or frequencies. Scan the entire piece of music, listening to how the
sound source is altered by the environmental characteristics by listening
to many different pitch levels. Throughout these hearings, keep track of
pitch areas or specific frequencies that appear to be emphasized or
de-emphasized. With a running list of observations, regularly identified pitch
areas/frequencies will begin to emerge. Further hearings will allow you to
more accurately identify these frequencies and pitch areas (that make up
the spectrum of the environmental characteristics), and to place the presence
of these frequencies or pitch areas against the time line.
4. You will now plot the dynamic contours of the components of the spec-
trum against the time line. This process is the same as the process of plot-
ting the spectral envelope of sound quality evaluations. Each component
of the spectrum is plotted as a single line, and these components are
listed in a key, so their dynamic contours may be related to the spectral-
envelope tier of the graph.
5. Continually compare the dynamic levels and contours of the spectral
components to one another. This will aid in remembering the nominal
dynamic level (where the amplitudes of the spectral components of the
sound source are unaltered by the environment), will aid in keeping the
dynamic levels and contours consistent between spectral components,
and will keep you focused on the relationships of the sound source and
its host environment. The evaluation is complete when the smallest significant
detail has been incorporated into each tier of the graph.
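While working through step 2, the reflections can also be kept as a simple list of data alongside the graph. The following is a minimal sketch in Python and is not part of the evaluation method itself. The names `Reflection`, `prominent`, and `mean_spacing_ms`, and the choice to express each level relative to the direct sound, are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Reflection:
    time_ms: float  # arrival time after the direct sound, in milliseconds
    level: float    # hypothetical scale: level relative to the direct sound (1.0 = equal)


def prominent(reflections, threshold):
    """Return the reflections at or above a chosen level, ordered in time.

    These are the reference points sketched first against the time line.
    """
    chosen = [r for r in reflections if r.level >= threshold]
    return sorted(chosen, key=lambda r: r.time_ms)


def mean_spacing_ms(reflections, start_ms, end_ms):
    """Average gap between successive reflections inside a time window.

    A smaller value means more closely spaced reflections (greater density).
    Returns None when fewer than two reflections fall inside the window.
    """
    times = sorted(r.time_ms for r in reflections if start_ms <= r.time_ms < end_ms)
    if len(times) < 2:
        return None
    return (times[-1] - times[0]) / (len(times) - 1)
```

For example, three reflections noted at 10, 20, and 40 milliseconds give a mean spacing of 15 milliseconds over a 0 to 50 millisecond window. The values entered are still those estimated by ear; the sketch only organizes them.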
This evaluation can be detailed and time intensive. It is not proposed that
these detailed evaluations be undertaken in normal, daily activities of audio
professionals. As a learning tool, this study will be very successful at bringing
you to hear, understand, recognize, and remember these important aspects
of sound. You are encouraged to return to this exercise. Once speed and ac-
curacy improve, you should undertake evaluations of more complex environ-
ments and sounds that are partially masked.
Exercise 9-6
Exercise in Determining the Environmental Characteristics of the Perceived Performance
Environment
This exercise will seek to define the environmental characteristics of a recording’s
perceived performance environment. A multitrack recording should be
selected that contains no more than three or four sound sources, a sound
stage that clearly separates the images, and an overall sound that appears to
envelop the sound stage.
1. Identify the sound sources and the different environments of the piece.
2. Perform general environmental characteristics evaluations of the envi-
ronments. These initial evaluations should be general in nature, seeking
prominent characteristics rather than detail.
3. Compare the environments for similarities of time, amplitude, and frequency
information to identify common traits between the individual environments.
(1) When traits are common to all sounds, an applied, overall
environment is present. The traits will be present in all environments
equally. If the common traits are not applied to all sources equally, a
single environment has not been applied to the entire program, and
you must look at other factors as well. (2) Next, identify the predominant
traits of the environments of musically significant sound sources. They
also directly contribute to the characteristics of the overall environment.
4. Listen to the work again to identify an overall environment of the pro-
gram. An applied overall environment will be most easily detected by its
detail in spectral changes of the reverberant sound, and in the clarity of
the initial reflections of the early time field. The characteristics of these
environments will be perceived by listening for detail at a close perspective
of slight changes to the predominant characteristics of the environment.
Overall environments that are an illusion (created by the composite and
predominant characteristics of the individual sound sources) will have
characteristics that are not readily apparent. The characteristics of these
environments will be perceived by listening at the more distant and gen-
eral perspective of the dominant characteristics of the environment.
5. Compile a detailed environmental characteristics evaluation of the perceived
performance environment. The evaluation is complete when the
smallest significant detail has been incorporated into each tier of the
graph.
Repeat this exercise on other recordings until you have evaluated a recording
with an applied overall environment and a recording with a perceived perfor-
mance environment that is the perceived result of the environmental charac-
teristics of the individual sound sources.
Once skill and confidence are improving, repeat this exercise on recordings
that have more activity and less pronounced characteristics in the perceived
performance environment.
Exercise 9-7
Space Within Space Exercise
Select a multitrack recording containing a small number (five or six) of sound
sources. A recording with a sparse texture (few instruments sounding simulta-
neously) and pronounced environments on the individual sound sources will
be easiest to evaluate during initial studies.
The process for determining space within space follows this sequence:
1. Identify the various environments of the piece. Some sound sources may
share environments with other sound sources (at the same or different
distances), and some sources may change environments several times in
the piece.
2. Perform general environmental characteristics evaluations of the envi-
ronments. These initial evaluations should be general in nature, seeking
prominent characteristics rather than detail.
3. Compare the environments for similarities of time, amplitude, and fre-
quency information. This observation will determine common traits be-
tween the individual environments of the sources. These common traits
will signal a possible applied, overall environment if they are present in all
environments equally. If the common traits are not applied to all sources
equally, other factors are in play as well. Identify the predominant traits
of the environments of musically significant sound sources. They also
directly contribute to the characteristics of the overall environment.
4. Listen to the work again to identify the characteristics of the overall
environment of the program (the perceived performance environment).
Compile a detailed environmental characteristics evaluation of the
perceived performance environment.
5. Begin the master listing of environments with this environmental charac-
teristics evaluation of the perceived performance environment.
6. Perform detailed environmental characteristics evaluations of the indi-
vidual host environments of each sound source. The characteristics of
the overall environment may or may not be present in these evaluations,
depending on the nature of the overall environment and the nature of the
individual sound sources’ environments. The evaluation of each source is
complete when the smallest significant detail has been incorporated into
each tier of the graph.
7. Number each environment and enter the evaluation into the master listing
of environments. Note on the master listing the sound source or sources
that are present within the environment.
Once skill and confidence are improving, repeat this exercise on more sound
sources in recordings that have more activity and less pronounced
environmental characteristics.
Exercises 9-8
Surround Sound Location Exercises
Two approaches can be used for surround sound location (A and B). You
should work through both approaches, as one will be more suitable to any
sound material than the other. Determining which approach is most suitable
will be a valuable undertaking in itself.
Exercise 9-8A
Find a surround recording with sources located around the array, but with the
listener at the audience perspective. Perform an evaluation of the locations of four
or five sources for the first two major sections of the work.
The process of determining surround sound location will follow this
sequence:
1. During the initial hearing(s), listen to the example to establish the length
of the time line. At the same time, notice the presence of prominent
instrumentation, with placements and activity of their surround location,
against the time line.
2. Check the time line for accuracy and make any alterations. Establish a
complete list of sound sources (instruments and voices), and sketch the
presence of the sound sources against the completed time line.
3. Notice the locations and size of the sound sources (instruments and voic-
es) for boundaries of size, location, and any speed of changing locations
or size of image. The placement of instruments against the time line will
most often establish the smallest time unit required in the graph to accurately
exhibit the smallest significant change of location. The boundaries
of the sound sources’ locations will establish the smallest increment of
the Y-axis required. The perspective of the graph will always be of either
the individual sound source or of the complete array.
4. Begin plotting the surround location of each source on the graph. The
locations of spread images are placed within boundaries. The boundaries
may be difficult to locate during initial hearings, but they can be defined
with precision. Continue to focus on the source until it is defined. The
locations of point-source images are plotted as single lines. These sources
are easiest to precisely locate and are most likely to change locations in
real time.
5. Continually compare the locations and sizes of the sound sources to
one another. This will aid in defining the source locations and will keep
you focused on the spatial relationships of the various sound sources.
The evaluation is complete when the smallest significant detail has been
incorporated into the graph.
Locating sound sources originating from behind normally causes a listener to
move their head. You should consciously keep your head still and focus on the
direction and size of the image.
Exercise 9-8B
This exercise should be repeated on other recordings. Find a recording with
the listener located within the ensemble or the performance.
When sound sources do not change locations or when sources appear to
envelop you, a stationary sound-location figure will be substituted for the
surround location graph. Figure 14-2 will be used for these evaluations; it
represents a fixed period of time and will also allow you to localize phantom
images generated by nonadjacent pairs of loudspeakers (as described in the
chapter). Identify and graph several point sources and spread-image sound
sources in the song. A separate graph will be used whenever source size or
locations change. Graph several different sections of the work on separate
graphs to note and understand the characteristics and changes of imaging
that occur in the song.
10 Complete Evaluations and
Understanding Observations
The evaluations of artistic elements from previous chapters will be drawn
together and compared here. Observations will be made from examining
those evaluations and comparing the artistic elements to the musical
materials.
Our evaluations have primarily been on three different levels of perspective:
(1) the characteristics of an individual sound, (2) the relationships of individ-
ual sound sources, and (3) the overall musical texture, or overall program.
The highest level of perspective brings the listener to focus on the compos-
ite sound of a recording, or its overall texture. At this level, all sounds are
summed into a single impression. Much recreational listening occurs at this
level of perspective, with shifts of focus moving to text and melody—and
other aspects attractive to the listener, such as beat or pulse or the emotion
and message of the song—in a random manner and at undirected times.
The overall texture is very important for the recordist as well, as its dimen-
sions can profoundly shape the music/recording. These dimensions are the
piece of music/recording’s form, perceived performance environment, sound
stage, reference dynamic level, program dynamic contour, and its timbral
balance. The overall dimensions that provide the recording and music with
its unique character must be readily recognizable and understood by the
recordist, listening at the upper levels of perspective. All but one of these
dimensions, timbral balance, have been explored in previous chapters.
This chapter will examine relationships of the various dimensions of the
overall texture and explore timbral balance and pitch density. The informa-
tion offered by comparing evaluations of individual sources is explored next.
An examination of how the artistic elements shape a recording will lead to
a summary of the system for evaluating recorded/reproduced sound.
Pitch Density and Timbral Balance
The process of evaluating pitch density is directly related to pitch area anal-
ysis from Chapter 6. Pitch density is the amount and placement of pitch-
related information of a single sound source within the overall pitch range
of the musical texture. It is composed of the pitch area of the source’s musi-
cal materials fused with its sound quality (timbre).
Timbral balance is the “spectrum” of the overall texture that is created by the combination of the
pitch densities of all of the sound sources in the music. The two are inter-
related, as each source’s frequency content contributes to create the overall
frequency content of the recording.
Pitch Density
The concept of pitch density allows each musical idea/sound source to be
perceived as having its own pitch “placement” or “location” in the musical
texture. The range of pitch that spans our hearing (and the musical texture)
can be conceived as a space. Within this space, sound sources might be
understood as being placed and/or layered according to the frequency/pitch
area they occupy. The overall pitch range can be perceived as being divided
into areas. Some of these areas will be occupied by the sound sources, or
left empty; some areas might contain much pitch/frequency material, some
little. The size of pitch areas and their placement are unique to each piece of
music. Further, they may remain stable throughout a song, or may change
at any time.
With this approach, the concepts of pitch density and timbral balance are
often applied to the processes of mixing musical ideas and sounds in
recording production. This is similar to the traditional concept of arranging
and orchestration, where instruments are selected and combined based
on their sound qualities, and the musical materials they are presenting.
The recording medium and its various formats provide new twists to this
traditional approach to combining sounds.
Pitch density is (1) the pitch area occupied by the musical material of the
sound source with (2) a density of pitch/frequency information provided by
the sound quality (timbre) of the sound source.
Musical materials create a single concept or pattern. The materials will
come together in our memory and perception, as a single idea comprised
of a group of pitches. The material will be heard to occupy a specific pitch
area. The pitch area of the musical material is defined by its boundaries: its
highest and lowest pitches. Within the boundaries a bandwidth of sorts is
established. The number of different pitch levels that comprise the musical
material and their spacing creates a density within the pitch area.
The sound quality/timbre of the sound source will also influence the bandwidth
of the pitch area and the pitch density of the musical idea. The primary
pitch area of the sound source’s spectrum will often be made up of the fun-
damental frequency of an instrument or voice, perhaps with the addition
of a few prominent lower partials. A primary pitch area might also contain
environment information (delayed and reverberated sound). These cues
may add density to the sound without introducing new pitch information.
Distortion sounds and other processing effects may also provide additional
spectral information and added density to the primary pitch area.
The primary pitch area of the sound source is often rather narrow, spanning only
slightly more than the fundamental frequency alone. This is especially
true when an instrument or voice is being performed at a moderate to low
dynamic level. Strong secondary pitch areas in the sound-source timbre can
be present. These provide additional pitch information that will widen the
pitch area of the source (raising the upper boundary of the pitch area) and/or
that can add to its density. Formant regions will often add a consistency of
pitch-density information between a sound source’s pitch density events.
As the dynamic level of a sound source increases, lower partials will often
become more prominent, and the width of the pitch area will tend to widen.
In this way the spectrum of the sound source and its performance intensity
impact pitch area.
Using this information, the highest boundary of the pitch area is determined
by perceiving the spectrum of the sound source. The upper boundary of the
pitch area “bandwidth” will be located at the pitch/frequency level where
significant harmonics and overtones cease to be present. Typically there is
approximately a 3-to-1 loudness ratio between the lowest boundary of the
pitch area and the upper boundary.
The pitch area of each musical idea must be determined by (1) defining the
length of the idea. It is then possible to define (2) the lowest boundary of the
pitch area, and (3) the highest boundary of the area. Once the bandwidth of
the pitch area has been established (4) the amount of spectral information
present can be observed, completing the concept of pitch density.
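The four steps above can be pictured as filling in a small record for each musical idea. The following Python sketch is only an illustration under stated assumptions; it does not replace the listening process. The names `PitchDensity` and `pitch_density` are hypothetical, frequencies in hertz stand in for pitch levels, and the 3-to-1 loudness ratio mentioned above is interpreted here as a cutoff of one third of the level heard at the lowest boundary.

```python
from dataclasses import dataclass
from math import log2


@dataclass
class PitchDensity:
    """One pitch density event: a single musical idea from one sound source."""
    start_measure: int  # (1) where the idea begins
    end_measure: int    #     and where it ends
    low_hz: float       # (2) lowest boundary: lowest pitch of the idea
    high_hz: float      # (3) highest boundary: last significant partial
    components: int     # (4) amount of pitch/spectral information present

    @property
    def bandwidth_octaves(self) -> float:
        """Width of the pitch area, expressed in octaves."""
        return log2(self.high_hz / self.low_hz)


def pitch_density(start, end, fundamentals_hz, partials, threshold=1 / 3):
    """Build a PitchDensity from estimated pitches and partials.

    fundamentals_hz: pitches performed in the musical idea.
    partials: (frequency_hz, relative_level) pairs, with level 1.0 assumed
        to be the level at the lowest boundary (an assumption, not a rule).
    """
    significant = [hz for hz, level in partials if level >= threshold]
    low = min(fundamentals_hz)
    high = max(list(fundamentals_hz) + significant)
    count = len(set(fundamentals_hz) | set(significant))
    return PitchDensity(start, end, low, high, count)
```

As a usage example, an idea spanning measures 1 through 4 with pitches at 220 and 440 Hz, and partials at 440, 880, and 1760 Hz with relative levels of 0.8, 0.5, and 0.2, yields a pitch area from 220 to 880 Hz (two octaves) containing three significant components. The partial at 1760 Hz falls below the assumed cutoff and is left out.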
Most listeners can easily determine the length of the idea by simply ask-
ing, “When does this idea end, and when does the next idea performed by
this instrument/voice begin?” This usually takes place at the perspective of
the individual sound source. At times an instrument such as a piano may
present several musical ideas simultaneously, in which case each would
be understood and processed separately. At times, a group of instruments
such as a brass section may present a single musical idea and be heard as
a single unit; these would be grouped together.
Pitch information can be determined by examining the melodic activity
(and/or harmonic activity for instruments like guitar and keyboards) of the
musical idea to determine the shape and the highest and lowest pitch levels;
this will establish the lowest boundary of pitch density. Adding detail per-
taining to the sound qualities of the sources performing the idea provides
the upper boundary and density information. The boundaries of the pitch
areas of musical ideas may or may not change over time.
The pitch area of a sound source’s timbre as it presents a musical idea is
a composite impression that establishes a frequency band we call pitch
density. Pitch density contains all of the appearances of the sound source
performing all of the pitch material within the time period of the musical
idea. It is the sum of all of the pitch levels and the significant sound-quality
information of its sound source(s). The relative density of the idea is
determined by the amount of pitch information generated by the musi-
cal idea and the spectral information of the sound source(s). As a song
unfolds, a source’s pitch area will change as it presents different musical
materials and/or changes timbre by differing performance intensity, loud-
ness, expression, etc. This gives the different pitch density events different
characters and characteristics that directly contribute to the music.
This process for determining pitch density is repeated for each individual
musical idea. All sounds can be plotted on a single graph to allow the over-
all texture to be observed. This graph will represent the timbral balance of
the recording/music.
Timbral Balance
Timbral balance is the combination of all of the pitch densities of all of the
recording’s sounds. It is a dimension of the overall texture that conceptually
represents its “spectrum.” Timbral balance is the distribution and density
of pitch/frequency information in the recording/music.
Evaluating timbral balance is a sound quality evaluation of the overall tex-
ture. Here the individual sound sources that make up the overall texture
can be conceived as individual spectral components. Individual sound
sources are analyzed for their contributions to the sound quality of the
overall program in terms of their timbres and the musical materials they
present (pitch density).
All sound sources are plotted on the timbral balance graph. The graph may
take two forms, with or without a time line. The graph may simply plot each
sound source’s pitch area against one another (as the pitch area graph, in
Chapter 6), and be a rather general representation of the overall texture.
Most helpful in beginning studies is when the sound sources are plotted
individually against the work’s time line, allowing the graph to visually rep-
resent changes in timbral balance as the work unfolds.
In either form, the timbral balance graph contains:
Each sound source is represented by an individual box denoting its
pitch area,
The Y-axis is divided into the register designations first presented in
Chapter 6,
The density of the pitch areas can be denoted on the graph through
shadings of the boxes of the sound sources or by other descriptions.
The pitch densities of all of the sound sources may be compared to one
another and to the overall pitch range of the musical texture. Thus,
the recordist is better able to understand and control the frequency content
of the recording by observing the contribution of each individual sound
source’s pitch material to the overall musical texture and to the mix.
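The comparison the timbral balance graph makes possible can be sketched as a lookup of which sources occupy which registers at a given moment. The Python below is a minimal illustration only. The `Box` record, the function `sources_by_register`, and especially the register boundaries in `REGISTERS` are hypothetical placeholders; the register designations from Chapter 6 should be substituted for them.

```python
from dataclasses import dataclass

# Hypothetical register bands (name, lower Hz, upper Hz). These are NOT the
# book's register designations; replace them with those from Chapter 6.
REGISTERS = [
    ("low", 20.0, 160.0),
    ("low-mid", 160.0, 640.0),
    ("mid", 640.0, 2560.0),
    ("high", 2560.0, 20000.0),
]


@dataclass
class Box:
    """One box on the timbral balance graph: a source's pitch area over time."""
    source: str
    start_measure: int
    end_measure: int  # inclusive
    low_hz: float
    high_hz: float


def sources_by_register(boxes, measure):
    """Map each register to the sources whose pitch area overlaps it at a measure.

    An empty list marks a register left unoccupied at that point in the work.
    """
    active = [b for b in boxes if b.start_measure <= measure <= b.end_measure]
    return {
        name: sorted(b.source for b in active if b.low_hz < hi and b.high_hz > lo)
        for name, lo, hi in REGISTERS
    }
```

With two invented sources, a bass occupying 60 to 300 Hz in measures 1 through 8 and a vocal occupying 200 to 3000 Hz in measures 5 through 8, the lookup at measure 2 shows only the bass in the two lower bands and nothing above them. At measure 6 the vocal fills the upper bands and shares the "low-mid" band with the bass, which is the kind of overlap the graph is meant to expose.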
The timbral balance of the beginning sections of The Beatles’ “Lucy in the
Sky with Diamonds” (1999 Yellow Submarine version) appears in Figure 10-1.
The work uses density and the registral placement of sound sources and
musical ideas to add definition to the musical materials and sections of the
music. Timbral balance itself helps create directed motion in the music. The
musical ideas are precisely placed in the texture, allowing for clarity of the
musical ideas. The expansion and contraction of bandwidth of the overall
pitch range and textural density of the musical ideas (and sound sources)
add an extra dimension to the work and support its musical ideas.
While dynamic levels of sources can impact perceived timbral balance,
dynamic-level information is not contained in the timbral balance graph.
Information on the dynamic levels of the sound sources can be found in
the musical balance graph, however. Viewed in this way, musical balance
conceptually represents the “spectral envelope” of the overall texture. By
comparing the two graphs, the reader can understand more about how the
recording made use of pitch and dynamic information to shape its overall
sound quality, or timbre.
Similarly, stereo location (or surround sound location) can impact perceived
timbral balance. Timbral balance is distributed across the sound stage. The
spectral information of the overall texture is balanced by location as well as
by the distribution of pitch/frequency information by register. Paying close
attention to this type of interaction allows us to recognize how a sound
source can emerge from the timbral balance of a mix by moving it to a
new location; sounds can be blended or given clarity by their placement in
location and by their spectral content. By comparing the timbral balance
graph to the stereo (or surround) location graph, we can recognize how the
spectrum of the recording is distributed by location.
Exercises for determining pitch density of a single source and for deter-
mining a recording’s timbral balance appear at the end of this chapter. The
reader is encouraged to spend enough time with each exercise to feel com-
fortable with these concepts that are so important to the mixing process.
The Overall Texture
The overall texture of the recording is perceived as an overall character,
made up of the states and activities of all sounds and musical ideas. Pitch-
register placements, rate of activities, dynamic contours, and spatial properties
are all potentially important factors in defining an overall texture.
The characteristics of the overall texture provide many fundamental quali-
ties of the music and recording. These greatly shape the music and its
sound qualities, and communicate most immediately to the listener. The
framework for the music and the context of the message of the recording
are crafted at this level of perspective.
The characteristics of the overall texture are:
Perceived performance environment
Sound stage
Reference dynamic level
Program dynamic contour
Timbral balance
Form
The perceived performance environment creates a world within which the
recording exists. This adds a dimension to the music recording that can
substantially add to the interpretation of the music. The level of intimacy of
the recording can become related to the level of intimacy of the message
of the music. This element will largely be defined by how the sound stage is
placed in the perceived performance environment, the listener’s perceived
distance from the sound stage, and the size and depth of the sound stage.
The reference dynamic level represents the intensity and expressive charac-
ter of the music and recording. What the music is trying to say is translated
into emotion, energy, expression, and a sense of purpose. These are reflected
in recordings and performances, and are understood as a perceived performance
intensity that is used as a reference dynamic level. Understanding
this underlying characteristic of the music allows the recordist to calculate
the relationships of materials to the inherent spirit of the song/composition.
The program dynamic contour allows us to understand the overall dynamic
motion of the entire piece of music. Actual dynamic level will impact the
recording process in many ways, and it also shapes the listener’s experience.
While program dynamic contour depicts how the work unfolds
dynamically over time, this contour is often closely matched to the drama
of the music. The tension and relaxation, the points of climax and repose,
movement from one major idea to another, and more are contained in the
contour of this sum of all dynamic information.
Timbral balance provides information on the spectrum of the entire musi-
cal texture. The movement of musical ideas through the “vertical space” of
pitch provides a sense of place for musical materials and adds an important
dimension to the character of the overall texture. The number of sources,
their densities and distribution throughout the pitch registers provide a
density to the texture that can also shape the direction, motion and sound
quality of the overall texture. The sound quality of the overall texture is
largely shaped by timbral balance. Timbral balance can be envisioned as
providing spectral information—materials that are harmonically related,
and those that are not, all add their unique qualities to the production as
they change over time or provide a continual presence that is part of the
overall sound of the recording/music.
Finally, the form of music is created by all of these characteristics, plus the
text and musical ideas. This essence of the song is what reaches deeply into
the listener. It resonates within the listener when the song is understood.
Form is the overall concept of the piece, understood as a multidimensional
but single idea.
Figure 10-2 presents a time line and the structure of “Lucy in the Sky with
Diamonds.” This is an outline of the structural materials that contribute
to the form of the piece. The shape of the music and interrelationships of
parts, as well as elements of the text, can be layered into this graph and
made available for evaluation.
Figure 10-2 Time line and structure: The Beatles’ “Lucy in the Sky with Diamonds.”
[Figure not reproduced: a measure-numbered time line (measures 1 through 117) marking the Intro, Verses 1 through 3, the Bridge and Chorus sections, and the Coda with fade, with meter changes between 3/4 and 4/4.]
Matching the text against the structure of the piece will allow the reader
to notice recurring sections of text/music combinations—verses and cho-
ruses. The musical materials enhance nuances of the meaning of the text as
they were captured and enhanced by the recording process.
Dynamic relationships between the various sections of “Lucy in the Sky
with Diamonds” are present. These are clearly observed in the program
dynamic contour graph (Figure 10-3). The graph represents the overall
shape and dynamic motion of the song. The reader can experience how
that motion relates to the song’s text and sense of drama by observing the
graph while listening to the recording.
The reference dynamic level of the song is also identified in Figure 10-3, as
a high mezzo forte.
The timbral balance graph of the song in Figure 10-1 allows the reader to
identify and understand how the song emphasizes one pitch area for a
time, then more evenly distributes pitch density information—moving from
one texture to another between sections. Evaluations of the perceived per-
formance environment and sound stage will provide the remaining charac-
teristics of the overall texture, allowing all to be compared and considered.
Through this process, important and fundamental characteristics of the
song can then be more readily understood and communicated to others.
Relationships of the Individual Sound Sources
and the Overall Texture
The mix of a piece of music/recording defines the relationships of individual
sound sources to the overall texture. In the mixing process, the sound stage
is crafted by giving all sound sources a distance location and an image size
in stereo/surround location. Musical balance relationships are made during
the mix, and relationships of musical balance with performance intensity
are established. The sound quality of all of the sound sources is finalized at
this stage also, as instruments receive any final signal processing to alter
the amplitude, time, and frequency elements of their timbre, and environmental
characteristics are added.
These elements crafted in the mix exist at the perspective of the individual
sound source. Many important relationships exist at this level. This focus is
common and important for the recordist, but is not common in recreational
listening. The many ways sound relationships are shaped during mixing
bring the recordist to often focus on this level of the individual sound source
and also at the next higher level of perspective, where sounds can be com-
pared by giving equal importance and attention to all sources. Learning to
evaluate these elements, and to hear and recognize how these elements
interact to craft the mix, is one of the most important listening skills to be
developed for the recordist.
Figures 7-4 and 10-4 present two musical balance graphs of the beginning
sections of “Lucy in the Sky with Diamonds.” These are evaluations of two
separate versions of the song. Mixing decisions brought certain sounds to
be at different dynamic levels in each version. Listening with a focus on
several different sound sources, while comparing the two graphs to what is
heard, will provide the listener with insight into these very different mixes.
The performance-intensity tier of Figure 10-4 should also be examined and
then compared to the final dynamic levels. With this graph the reader is
able to identify how the mixing process transformed performed dynamics.
This gives much insight into the performance of the tracks and their sound
qualities, and the musical balance decisions that followed.
The sound stage provides each recording with many of its unique quali-
ties. In examining the structure of sound stage, the listener will learn many
things about a recording. Important among the many qualities are:
Distribution of sources in stereo or surround location
Size of images (lateral and depth)
Clearly defined sound-source locations, or a highly blended texture of locations (wall of sound)
Depth of sound stage
Distribution of distance locations
Location of the nearest sound source
Sound-stage dimensions that draw the listener’s attention or are absorbed into the concept of the piece
Changes in sound-stage dimensions or source locations
Figure 10-5 presents the stereo location of sound sources in the 1999
Yellow Submarine version of “Lucy in the Sky with Diamonds.” The phantom-
image locations and sizes add definition to the sound sources and musical
materials, and the width of several images changes between sections.
Comparing the graphs for timbral balance (Figure 10-1) and stereo location
(10-5), it is possible to examine how pitch information (“vertical space”) is
distributed along the stereo sound field (lateral space). As the song begins,
each source is in its own pitch area and stereo location. As sounds with simi-
lar pitch areas enter, some are given their own location to give the parts clar-
ity. This allows them to be distinguished from others of similar pitch areas;
notice, for example, the high hat open and closed, relative to the Lowrey
organ in measures 12–14. Parts in the same pitch areas that are intended to
blend (or fuse) are at the same or similar stereo location; this is especially
evident with Lennon’s lead vocal and its doubling in measures 20–22.
Comparing this stereo location graph (Figure 10-5) to surround placements
in Figures 10-6 and 10-7 allows some interesting observations. The lead
vocal, Lowrey organ, and tamboura have been graphed into the first verse
in the surround version, and both graphs need to be used to clearly define
their dimensions. The image sizes and locations in a surround mix are quite
different from the two-channel versions, and offer a very different experi-
ence of the song. Comparing these location graphs will allow the reader
insight into what makes each mix different.
Figure 10-6 Surround sound location graph: The Beatles’ “Lucy in the Sky with Diamonds” (1999 Yellow Submarine version). Key: Vocals, Lowrey Organ, Tamboura. Time line: measures 1–23 (Intro, A); the location axis includes the surround channels (Ls, Rs).
Figure 10-7 Surround image placements of Lowrey organ, tamboura, and vocal images: The Beatles’ “Lucy in the Sky with Diamonds” (1999 Yellow Submarine version). Key: Lowrey Organ, Tamboura, Vocals, location and density of ambience.
Chapter 10
238
The reader is encouraged also to evaluate the stereo location of the sources
in the original Sgt. Pepper’s Lonely Hearts Club Band version of the song.
Then compare the three location evaluations for similarities and differenc-
es of image location and sizes, and note when and how images change
sizes or locations. In the end, consider how the different sizes and locations
of the images impact the musicality of the song and how this aspect of
imaging contributes to presenting the musical ideas.
Figure 10-8 is a blank distance location graph for the beginning sections
of “Lucy in the Sky with Diamonds.” The graph can be used to plot the
distance location of sources from any of the three versions of the song—or
to compare one or more selected sources from all three versions. In listen-
ing to all three versions of the song, the reader will notice some striking
differences in distance location. Between the two-channel versions, sound-
source changes in distance locations are more pronounced in the Yellow
Submarine version, and during the first verse the bass and lead vocal are
closer than in the Sgt. Pepper version. Consider how distance cues enhance
certain musical ideas and sound sources, and provide clarity or blending of
images in various instances.
Figure 10-8 Distance location graph for The Beatles’ “Lucy in the Sky with Diamonds.” Blank graph: the vertical axis shows the distance areas Proximity, Near, and Far; the time line covers measures 1–45 (Intro, Verse 1, Bridge, Chorus), with meter changes between 3/4 and 4/4.
Together the sound qualities of all of the sound sources jointly shape the
timbral balance of the song. Sound quality also contributes fundamen-
tally to the performance intensity of the sound source, and might con-
tribute to shaping the song’s reference dynamic level. The environmental
characteristics of the source also contribute to its overall sound quality.
They fuse with the source’s sound quality to add new dimensions and
additional sound-quality cues to the resulting composite sound. Lastly, the
sound quality of each source provides important distance information, as
the amount of timbral detail primarily determines distance location.
This timbral detail carries over into clarity of sound-source timbres in the
mix. Sound sources can have well-defined timbres with extreme detail and
clarity. Conversely, their sound qualities can be well blended with details
absorbed into an overall quality. Both extremes are desirable in different
musical situations. Both would place the sound at different distances, and
each would cause the sound source to be heard differently in the same
mix—each has the potential to be more or less prominent in a musical
texture than the other.
How sound sources “sound” is an important aspect of recordings. Models
of instruments and specific performers have their own characteristic sound
qualities. Sound qualities are matched to musical materials and the desired
expressive qualities to create a close bonding of sound quality and musi-
cal material. For instance, the opening of “Lucy in the Sky with Diamonds”
would sound very different on a Hammond organ than on the Lowrey used
in the recording. A listener would recognize something incorrect about the
source, even before they might identify the sound quality as being differ-
ent. That musical idea is forever wedded to that original sound quality.
The Complete Evaluation
Great insight into productions can be found through an evaluation of all of
the artistic elements in a particular recording/piece of music. This process
can allow the listener to explore the inner workings of the sound relationships
of a recorded piece of music in great depth. The listener would benefit
from performing this exercise on a number of pieces, over the course of a
long period of time. Not only will the listening and evaluation skills of the
listener be refined; if recordings are thoughtfully chosen, these evaluations
will also provide many insights into the unique production styles of certain
engineers and producers, as well as an understanding of the artists’ work.
Complete evaluations can also bring attention and training to shifting focus
while listening. Individual evaluations of individual artistic elements maintain
the listener’s focus at a specific level of perspective; this has been our
listening practice thus far. Now comparing graphs of various elements and
graphs at various levels of perspective will bring the listener practice in
shifting focus with a purpose. This training in shifting focus is excellent
preparation for the continual shifting of focus required in producing record-
ings. It will also lead the listener to be able to make faster evaluations of
recordings and judgments of the various qualities of recorded sound.
The project of performing a complete evaluation of a piece will be lengthy.
It will take the beginner many hours of concentrated listening. The demands
of this project are, however, readily justified by the value of the information
and experience gained. This project will develop and refine critical and
analytical listening skills in all areas.
For greatest benefit, an entire song should be evaluated. The listener
should be mostly concerned with evaluating all of the artistic elements. An
evaluation of the traditional musical materials and the text might be help-
ful as well, but is not necessary for most purposes. After an evaluation of
the artistic elements individually, the listener should evaluate how these
aspects relate to one another, and how they enhance one another.
Elements To Be Evaluated
This complete analysis of an entire recording is strongly encouraged. Many
aspects of a recording will only become evident when evaluations of sev-
eral artistic elements are compared with one another, and compared to the
traditional musical materials. The use of the artistic elements in communicating
the musical message of the work will become much more apparent
when their interrelationships are recognized.
The listener will compile a large set of data, in performing the many evalu-
ations spanning Chapter 6 through the pitch density and timbral balance
evaluations of Chapter 10. These many evaluations will represent many dif-
ferent perspectives and areas of focus. Some of this information will be
pertinent to understanding the musical message of the work, and some of
the information will pertain to its elements of sound (such as how stereo
location is used in presenting the various sound sources).
In addition, some of the information will be pertinent to appreciating the
technical qualities of the recording. When evaluating recordings, we should
also bring our attention to the quality of the signal, the presence of noises
and distortions, performance issues and other matters related to the integ-
rity of the signal and the quality of the production. If desired, one can listen
to seek information on “how” certain recording results were achieved.
All of this information will contribute to the audio professional’s complete
understanding of the piece of music, of how the piece made use of the
recording medium, and of the sound qualities of the recording itself.
The sequence of evaluations that is usually most efficient in evaluating an
entire work (depending upon the individual work, it may vary slightly) is:
List all of the sound sources of the recording
Create a time line of the entire work
Plot each sound source’s presence against the time line
Define unknown sound sources and synthesized sounds through sound quality evaluations
Designate major divisions in the musical structure against the time line (verse, chorus, etc.)
Mark recurring phrases or musical materials, similarly, against the time line; an in-depth study of traditional musical materials would be appropriate at this stage, if of interest
Evaluate the text for its own characteristics and its relationships to the structure of the traditional musical materials, as appropriate
Evaluate the pitch areas of unpitched sounds
Determine reference dynamic level
Perform a program dynamic contour evaluation
Create a musical balance graph
Create a performance intensity graph
Graph the song’s stereo location or surround location
Evaluate the work for distance location
Perform environmental characteristics evaluations of all host environments of sound sources and the perceived performance environment
Create a timbral balance graph of the work
Study these evaluations to make observations on their interrelationships and to identify the unique characteristics of the recording
Examine the recording for technical issues, integrity of the signal, performance and recording technique flaws, and other issues with the quality of the recording
Identify a stream of shifting focus on various elements and perspectives after compiling a complete evaluation graph from the previous graphs
Observe how the artistic elements work jointly with the musical materials and the text to create the recording and the song/music
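For readers who keep their evaluation notes on a computer, the first three steps of this sequence (listing the sources, creating a time line, and plotting each source's presence against it) might be sketched as below. This is only an illustrative sketch, not part of the method itself: the `SourcePresence` class, the `presence_chart` function, the source names, and the measure numbers are all hypothetical, and the time line is assumed to be counted in whole measures.

```python
from dataclasses import dataclass, field


@dataclass
class SourcePresence:
    """One sound source and the measures in which it sounds (hypothetical structure)."""
    name: str
    # Inclusive (first_measure, last_measure) spans; assumed whole measures.
    spans: list[tuple[int, int]] = field(default_factory=list)

    def is_present(self, measure: int) -> bool:
        return any(first <= measure <= last for first, last in self.spans)


def presence_chart(sources, total_measures):
    """Return a text chart: one row per source, one column per measure."""
    width = max(len(source.name) for source in sources)
    rows = []
    for source in sources:
        cells = "".join(
            "#" if source.is_present(m) else "."
            for m in range(1, total_measures + 1)
        )
        rows.append(f"{source.name.ljust(width)} {cells}")
    return "\n".join(rows)


# Invented example data, not taken from any recording discussed in the text.
sources = [
    SourcePresence("keyboard", [(1, 8)]),
    SourcePresence("lead vocal", [(5, 8)]),
]
chart = presence_chart(sources, 8)
print(chart)
```

Each row of the chart corresponds to one sound source, so entrances and exits can be read directly against the measure columns before the more detailed graphs are begun.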
Observing the Roles and Interaction of Elements
As we have learned, all of the elements of the recording have roles in
delivering the musical ideas and message of the recording. Further, all of
the elements interact and support the activity of the other elements (and
potentially can detract from other elements). In examining these aspects,
we learn much about the recording and how it presents the music.
The following materials may be coupled on the same graph (on separate
tiers), or on similar graphs. They are all at the same perspective (at the level
of the sound source):
Performance intensity
Musical balance
Distance location
Stereo location or surround location
Pitch density (timbral balance showing individual sound sources)
Pitch density is timbral balance considered at the perspective of the indi-
vidual sound source. Often when the perspective is not of the individual
source, groups of instruments present a single musical idea; in this case
they function as a single source. Instruments capable of playing several
parts simultaneously (such as a keyboard or drum set) are divided into sev-
eral separate pitch areas—each functioning as a separate sound source. In
this way the timbral-balance graph’s information can be directly compared
to the four elements listed above.
These five artistic elements will be interrelated in nearly all recording productions.
Observing their interrelationships will allow the listener to extract
significant information about the music and the mix. The reader can explore
this material by reviewing the graphs of “Lucy in the Sky with Diamonds”
found throughout this chapter.
In making these observations, the listener will continually formulate questions
about the recording and seek to find meaningful answers. The questions
of how artistic elements (and all musical materials) relate to one
another will center on:
Patterns of activity within any artistic element (patterns of activity are
sequences of levels within the artistic element, and rhythmic patterns
created by the relationships of those levels),
Levels of any artistic element (how high-pitched, what loudness levels,
etc.) and the areas they span,
Speed or rate of patterns or changes within the elements,
Interrelationships of patterns between artistic elements (Do the same
or similar patterns exist in more than one element?).
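The last question, whether the same or similar patterns exist in more than one element, is answered in the text by listening and by comparing the shapes of graphed lines. As a rough numerical aid only, two contours that have been read off graphs could be compared as sketched below. Everything here is an assumption for illustration: the function name `contour_similarity`, the coding of graph levels as numbers (one value per measure), and the use of a Pearson correlation, which is a swapped-in technique and not something the evaluation method prescribes.

```python
from math import sqrt


def contour_similarity(a, b):
    """Pearson correlation of two equal-length contours (one value per measure).

    Returns a value from -1.0 (opposite shapes) to 1.0 (same shape).
    A contour that never changes has no shape to compare, so 0.0 is returned.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("contours must have the same length (at least 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = sum((x - mean_a) ** 2 for x in a)
    spread_b = sum((y - mean_b) ** 2 for y in b)
    if spread_a == 0 or spread_b == 0:
        return 0.0
    return covariance / sqrt(spread_a * spread_b)


# Invented level codes for six measures; not taken from any figure in the text.
musical_balance = [3, 3, 4, 5, 5, 4]
distance_location = [2, 2, 3, 4, 4, 3]
similarity = contour_similarity(musical_balance, distance_location)
print(round(similarity, 3))
```

A value near 1 suggests the two elements rise and fall together over the passage. It says nothing about whether that relationship is musically significant, which remains a listening judgment.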
Music is constructed as similarities and differences of values and patterns of
musical materials. This is also the way humans perceive music. People per-
ceive patterns within music (its materials and the artistic elements). Listen-
ers will perceive the qualities (levels and characteristics) of the elements of
the music and will relate the various aspects of the music to one another.
At the same time, the listener should compare what they are hearing with
what was previously heard, and to their previous experiences. Meaning
and significance will be found in this information by looking for similarities
and differences between the materials.
The listener should ask “What is similar?” between two musical ideas (or
artistic elements); “What is different?”; “How are they related?” These will
be answered through observing the information that was collected during
the many evaluations. The shapes of the lines on the various graphs may
show patterns. The vertical axes of the graphs may show the extremes of
the states of the materials and all of their other values.
The listener’s ability to formulate meaningful questions for these evaluations
will be developed over time and practice. They will be asking: “What
makes this piece of music unique?”; “What makes this recording unique?”;
“How is this recording constructed?”; “How is this piece of music constructed?”;
“What makes this recording effective?”; “What is important and how
is it presented?”; “How does [insert any element] contribute to delivering
the music or musical idea(s)?” Many other, much more detailed questions
will be formulated during the course of the evaluation. The listener should
finally ask, “Which of these relationships are significant to the communication
of the musical message; which are not?”
The use of the artistic elements in the recording can also be considered
in their relationships to the traditional musical elements and materials.
This brings an understanding of the importance of each musical idea, as
related to the piece as a whole. Through these observations, the recordist
will obtain an understanding of the significance of the artistic elements to
communicating the message (or meaning) of the music. The recordist can
then understand and work to control how the recording process enhances
music, and how it contributes to musical ideas and the overall character of
the piece. The complete evaluation graph will assist all this, and more.
Stream of Shifting Focus and Perspective
A final step in a complete evaluation of a recording is to identify a stream
of shifting focus. This will help one to recognize the various elements and
perspectives that shape the recording in significant ways, as the work progresses.
This will be shaped at several perspectives and will be different
from any one moment in time to another.
Figure 10-9 presents a “complete evaluation graph” to assist in compil-
ing this information. The elements at the perspective of the overall texture
appear in the top tier; the three elements at the very top of this tier are
qualities that do not change. They are separated from the three qualities of
the overall texture that can change over the course of the song by a thin
line. A heavy line separates the top tier from the bottom tier, where the elements
at the perspective of the individual sound source appear. Notations
drawing attention to important activities in each element are made on
the graph. Notes about important characteristics and qualities of elements
should also appear here. This allows for elements and perspectives to be
easily compared. The time line allows the reader to follow the elements as
they unfold over the song.
This study will bring the listener’s attention to important aspects of the
music and the recording. If followed throughout a work it will provide an
experience of hearing the song move from one idea to another, from one
element to another, from the importance of one level of perspective to
another. As an example, one might hear how distance shapes a certain
sound in an important way at a certain moment, and then quickly shift to
an important quality of timbral balance that was created by this change in
distance, moving again to hearing and understanding how the program
dynamic contour was then altered, then recognizing that a certain sound
source became less prominent in the mix because of this activity and upon
investigation finding that its loudness did not change but its spectral content
was altered. The reader will be able to use this graph to recognize and
follow the most significant events and activity at any one time, and also to
recognize and follow how all other elements contribute at the same and at
other levels of perspective.
This stream of information brings a new understanding to the music and
the mix, to how the recording process shaped the music and how the artis-
tic elements work jointly with the musical materials to successfully deliver
a quality music recording. The role of shifting focus and shifting perspective
will be better understood. Of equal importance, the reader will be
able to practice the process of shifting perspectives and focus that takes
place continually in recording production. This skill can be difficult to learn
to execute in a controlled and deliberate manner if practiced only during
actual production work.
Figure 10-9 Complete evaluation graph. Rows, top to bottom: Form; Perceived Performance Environment; RDL; Program Dynamic Contour; Sound Stage Dimensions; Timbral Balance; Pitch Area/Pitch Density; Sound Quality; Environments; Distance; Stereo Location; Performance Intensity; Musical Balance; Structure. The time line is numbered 1–10.
Figure 10-10 is a complete evaluation graph of the opening of The Beatles’
“Here Comes the Sun.” This summary of elements provides an overview of
important activities, and from this we can achieve a greater understanding
of the recording. The graph clearly shows that changes in instrumentation
and elements of the mix coincide with the changing structure of the song.
During these 15 measures, important changes take place in sound-stage
dimensions, timbral balance and pitch density, distance location, stereo
location and musical balance. These all contribute to a changing context for
the song that adds to its character. Attention should be drawn to these at
strategic moments. A few of these are:
The sound stage moving from fully left to the center during measure 8, followed by a sudden broadening of the sound stage in measure 9.
The different “proximity” distance locations of the doubling guitars in the first four measures, the dominance of the “near” area in measures 9 through 13, and the dominance of the “proximity” area in measures 14 and 15.
Timbral balance is focused on the “low-mid” through “mid-upper” ranges from the beginning through measure 13; the density of activity varies significantly but the bandwidth of pitch registers being used does not change markedly (except for measure 8’s solo Moog glissando). This is especially noticeable in measures 9 through 13; at measure 14, the pitch areas covered suddenly expand to encompass activity from the “low” register into “high.”
The reader would benefit from performing several or all of these graphs to
notice the subtleties of activity. Following all elements into the next section
would provide much more interesting information; as the song unfolds,
more changes in the mix enhance the song profoundly. The reader might
wish to refer back to Figure 6-7 for pitch areas of several percussion sounds
and to Figure 7-3 for the entire program dynamic contour of “Here Comes
the Sun.”
This complete evaluation process will greatly assist the recordist in under-
standing how the artistic elements can be crafted during the recording pro-
cess to enhance, shape, or create musical materials and relationships.
The recording production styles of others can be studied and learned. By
understanding the sound qualities of a recording and being able to recog-
nize what comprises those sound qualities, the sound of another recording
or type of music can be emulated by the recordist in their own work as
desired.
Using Graphs for Making Evaluations
and in Production Work
Graphing the artistic elements may be time consuming and at times tedious
and perhaps frustrating. It is important, however, for developing listening
skills and evaluation skills, especially during beginning studies. It is also
valuable for in-depth looks at recordings, providing insights into the artistic
aspects of the recordist’s own recordings and the recordings of others.
This process of graphing the qualities of the various artistic elements is
also a useful documentation tool. Graphs can be used to keep track of how
a mix is being structured or how the overall texture is being crafted. Many
of the graphs or diagrams can even be used to plan a mix. For example,
the imaging diagram can be used to plan the distribution of instruments on
the sound stage and consider distance assignments before beginning the
process of mixing sound—perhaps even before selecting a microphone to
begin tracking. Working professionals through beginning students will find
these useful in a variety of applications.
Detailed graphs of artistic elements are not proposed for regular use in pro-
duction projects. The graphs are not intended for the production process
itself, though they might be of some use in certain planning and record keep-
ing matters. Audio professionals must be able to recognize and understand
the concepts of the recording production, and hear many of the general rela-
tionships, quickly and without the aid of the graphs. The graphs are intended
to develop these skills, and to provide a means for more detailed and in-
depth evaluations that would take place outside of the production process.
Recordists who have developed a sophisticated auditory memory will also
find these graphing systems of evaluation to be useful for notating their
production ideas, and for documenting recording production practices.
These acts will allow them to remember and evaluate their production
practices more effectively, allowing them more control of their craft.
Summary
The pitch area that comprises the focused frequency content of individual
sound sources is joined with the source’s musical material to establish pitch
density. This takes place at the perspective of the individual sound source
and their interaction with other sources. At the highest level of perspective,
overall texture, we observe timbral balance, as the frequency content of the
recording is established by the frequency content of each sound source. In
this way, each sound source and the musical materials they present can be
envisioned as a “partial” in the “spectrum” of the overall texture’s timbral
balance. With timbral balance addressed, all of the dimensions of the over-
all texture have been explored.
Chapter 10 concludes Part Two with a discussion of how the individual ele-
ments contribute to shaping the recording. At the perspective of the indi-
vidual sound source, musical materials are presented and enhanced by the
artistic elements. Musical balance, distance location, image size and place-
ment, sound quality and other elements contribute to shaping the music
and the recording. These relationships are realized in production through
using pitch, dynamic and spatial relationships in planning (composing) and
executing (performing) the mix. These individual elements become part of
the fabric of the overall texture, at the highest level of perspective.
There are six characteristics or dimensions of the overall texture. The three
dimensions of form, perceived performance environment and reference
dynamic level are qualities that are single, global concepts that remain fixed
throughout the work—they are not time dependent. The three qualities of
sound stage, program dynamic contour and timbral balance are variables;
they may change from moment to moment as the song is experienced,
and are the product of all of the sound sources coming together to create
an overall state within the elements of spatial relationships, dynamics and
frequency content, respectively. These three qualities are time dependent
and create an overall shape of the recording.
As we have seen, the recording’s artistic elements and the song’s musi-
cal materials and text are fundamentally linked. The recording’s artistic
elements contribute to the delivery of musical ideas and can help shape
them; they present the musical ideas in important ways, and at times can
add important dimensions to the music. The artistic elements act in concert
with the musical materials and text to create the music and the recording.
Part Three will explore concepts that will assist the reader in putting these
ideas into practice in their own recordings.
Throughout Part Two a systematic method for evaluating the artistic ele-
ments of recordings and their impact on the recording’s delivery of musi-
cal ideas has been explored. This method embraces each element sepa-
rately and works toward their interaction. Materials and exercises have
been ordered carefully. All of the exercises presented in the text are listed
at the beginning of the book after the table of contents. The exercises are
ordered to systematically develop the reader’s sound evaluation and listening
skills. Working through the exercises in this order will be most effective
for most people. Readers with much experience and well-developed skills
will still find at least a few exercises that are sufficiently advanced to test
and improve their skills. Learning anything new requires effort and a will-
ingness to reach into the unknown.
Very sophisticated listening skills are required in audio and music. Devel-
oping such skills from the beginning will take practice, patience, and per-
severance. At times the listener will be told to listen to things they have
never before experienced. They will have no reason to believe such sound
characteristics even exist, let alone that they can be heard, recognized, and
understood. Faith will be required; a willingness to be open to possibilities
and leap blindly into an activity, searching the sound materials for what
they have been told exists. Using the processes that were learned and the
graphs that were created throughout Part Two will make this process easier
and more productive.
Following the system will give the listener a refined ability in critical and
analytical listening. The listener will learn to communicate effectively about
sound, and will be able to apply this new language to many situations.
Among other things, Part Three will explore how these new listening skills
and knowledge of sound can provide the recordist with the ability to craft
quality and artistically inspired recordings.
Exercises
Exercise 10-1
Pitch Density Exercise
Select a recording with an instrument performing alone, or with very few other
instruments, over the first several sections of the work. The instrument’s musical
material should show some noticeable and considerable change in pitch
levels. This exercise will graph the pitch density of this single instrument over
this period.
Pitch density will be graphed to show the musical material the instrument is
performing and the prominent aspects of its spectrum.
The process for determining the pitch density for a single source follows
this sequence:
1. During the initial hearing(s), establish the length of the time line. Notice
entrances and exits of the instrument against the time line.
2. Check the time line for accuracy and make any alterations. At the same
time, work to identify the musical ideas that the instrument is presenting
and note their presence against the time line.
3. Transcribe the instrument’s pitch material onto the graph, so its melodic
contour is represented as a single line on the graph. This might represent
the fundamental frequency of the sound-source timbre.
4. Next notice how the single melodic line falls into phrases that generate
distinct, individual musical ideas. Mark the beginning and ending points
of those ideas, and modify the melodic contour to show a more general
outline of the line, eliminating small and fast variations of pitch level.
5. Turn focus to the spectrum of the instrument’s first sound. Determine the
bandwidth of the pitch area by identifying where the spectrum becomes
about one-third the loudness of the lowest frequency. This will determine
the upper boundary of the pitch area.
6. Scan the musical material to determine changes in performance intensity.
These changes will bring about changes in the instrument’s spectrum that
often change the bandwidth of this pitch area. Note these changes on
your graph.
7. Map out the upper boundary of the pitch area by listening to the spectral
information against the lowest frequency.
8. Once these boundaries are finalized, make observations on the density of
spectral activity (amount of frequency and overtone information) within
the defined pitch area.
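As a rough numerical analogue of step 5, the short Python sketch below estimates the upper boundary of a pitch area from a list of partials. It is an illustration only; the exercise itself is meant to be done by ear. The partial frequencies and amplitudes are invented, the function name pitch_area is hypothetical, and relative amplitude is assumed to stand in for loudness, which is a perceptual quality that a simple ratio only approximates.

```python
# Illustrative sketch only: the data and names below are assumptions,
# and amplitude is used as a crude stand-in for perceived loudness.

def pitch_area(partials, ratio=1 / 3):
    """Return (lower, upper) frequency boundaries of a pitch area.

    partials: list of (frequency_hz, amplitude) pairs.
    The lower boundary is the lowest frequency present. The upper boundary
    is the highest partial whose amplitude is still at least `ratio` times
    the amplitude of that lowest partial.
    """
    ordered = sorted(partials)                 # ascending by frequency
    lowest_freq, lowest_amp = ordered[0]
    threshold = lowest_amp * ratio
    upper = lowest_freq
    for freq, amp in ordered:
        if amp >= threshold:
            upper = freq
    return lowest_freq, upper


# Invented spectrum of one sustained note (fundamental at 220 Hz).
note = [(220.0, 1.0), (440.0, 0.8), (660.0, 0.5), (880.0, 0.3), (1100.0, 0.1)]
low, high = pitch_area(note)
print(f"pitch area: {low:.0f} Hz to {high:.0f} Hz")
```

Running the same function on a spectrum with stronger upper partials returns a wider pitch area, which mirrors the point of step 6 that a change in performance intensity often changes the bandwidth.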
Exercise 10-2
Timbral Balance Exercise
Select a recording with four to six sound sources, to graph the pitch density
over the first three or four major sections of the work. The recording should
contain timbral balance changes within and between sections.
The musical texture must (1) be scanned to determine the musical ideas pres-
ent. The musical ideas might be a primary melodic (vocal) line, a secondary
vocal, a bass accompaniment line, a block-chord keyboard accompaniment,
and any number of different rhythmic patterns in the percussion parts. They
should then (2) be identified by instrument or voice performing the material,
and listed in a key.
Each idea will (3) then have its pitch areas defined as a composite of its pitch
material and the prominent aspects of the sound quality of the instrument(s)
or voice(s) that produced the idea.
The process of determining the timbral balance graph will use the same skills
developed in Exercise 10-1. This graph will plot the pitch density of all of the
musical ideas of all of the sound sources to clearly represent the complete
frequency content of the recording.
The reader should perform a pitch-density evaluation of all of the sound
sources in the example, singling out each source for careful evaluation. Once
these are completed, this pitch-density information of all sound sources will
be combined into a single timbral balance graph.
Exercise 10-3
Shifting Focus and Perspective Exercise
Select a short song with only a few instruments to evaluate completely. Create
all of the graphs for the overall texture and for the individual sound source,
as outlined above.
Create a complete evaluation graph. Transfer the important events and char-
acteristics of each element onto the graph, in its designated place. When an
element changes, note where this change occurred and its nature.
Once the graph is completed, listen to the work again following this graph.
Highlight the most important changes that occurred. Next, observe the
changes that you heard as not being as significant as others, and note how
they might lend a supportive role to the other elements.
Practice changing focus and perspective in a controlled and deliberate way by
listening for specific material and changes while following this graph. Next,
listen to the recording with your knowledge of the important elements and
changes of perspective, and when important shifts occur, try to follow these
changes without the aid of the graph.
Part Three
Crafting the Mix:
Shaping Music and Sound, and
Controlling the Recording Process
11 Bringing Artistic Judgment to
the Recording Process
Artistic judgment will be brought to the recording process by the recordist.
How the recordist makes use of the artistic elements shapes the recording
in significant ways. The recordist may shape the sound in artistically sensitive
ways, or not. Skilled recordists are artists, and are able to make these
decisions intuitively, if not consciously.
The recordist can bring artistry to the recording by learning the recording
process well enough to use it creatively, by learning the sound qualities
of the instruments of the recording studio (recording equipment and tech-
nologies), and learning all of the possible ways the recording processes
and devices can transform sound. These will give the recordist the tool set
needed to control how the recording process shapes sound. The recordist
will then be able to make decisions on how to shape the recording. Those
are the decisions that directly contribute to the music and give the recording
a characteristic sound.
Part Three Overview
Part Three will explore how the recording process can be used creatively.
It will bring the reader to consider the sound qualities of recording devices
and talk about how to evaluate sound during production—and before and
after. The recording process will be considered from beginning to end, in
broad terms.
Part Three will define basic concepts, aesthetic and artistic considerations,
and some philosophies on recording. This creates a broad framework for
the recordist, a way to orient any recording project toward fundamental
principles. This big-picture approach may at times seem to state the obvi-
ous, but it is intended to bring the reader (recordist) to keep a perspective
on the overall direction of a project and of basic concerns. All too easily we
are overwhelmed by details and lose track of overall quality and direction.
Our effectiveness is diminished and our artistic vision blurred.
One of the goals of this section is to stimulate thought, to get the reader
(beginning recordist) to consider how to use the recording process creatively,
and to bring the reader to develop a thoughtful and fluid approach
to working in the studio. Ultimately this should lead the reader to develop
their own personal way of working, and over time perhaps even their own
unique production “sound.”
While Part Three explores certain aspects of production in some detail, it
does NOT present step-by-step sequences of instructions. It is designed to
move the reader to think about what occurs and to consider the process as
a creative endeavor.
Recording equipment is discussed in general terms. The concepts deliv-
ered are relevant to all devices and technologies, and should remain so
with changing technologies. It is significant to recognize that this discussion is
not technology dependent, and should remain valid when applied to any
technology or device—and as technology inevitably changes.
Part Three needs to be supplemented by readings on recording technolo-
gies and on equipment use and production techniques. These areas must
be learned for the reader to be able to control the recording process well.
This information is outside the scope of this writing and can be found in
many excellent books (many of which are listed in the bibliography).
The Signal Chain
The overall concept of the recording process involves the flow of signal
through a chain of recording/reproduction stages or devices. The recording-
and-reproduction signal chain may take many forms, depending on the
nature and complexity of the recording project and the technologies being
used.
The stages of the signal flow are interrelated. They interact with one another,
and their sequence can shift depending on operator convenience or the
desired sound quality of the signal (such as a sound source being recorded).
While signal flow is sequential, it can be altered for the individual project
or because of personal working preferences (developed over time and
with experience). Altering the sequence of the signal chain also can alter
sound quality; consider the result of placing a noise gate on a signal before
it is sent through an equalizer, as opposed to afterwards.
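The gate and equalizer example can be sketched numerically. The short Python illustration below is a simplification under stated assumptions: the "equalizer" is reduced to a flat boost (a real equalizer changes level by frequency band), the gate is a bare threshold with no attack or release, and the sample values, threshold, and gain are invented for the purpose.

```python
# Hypothetical illustration: function names, sample values, threshold,
# and gain are all assumptions chosen to make the ordering effect visible.

def noise_gate(samples, threshold):
    """Mute any sample whose absolute level is below the threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


def simple_boost(samples, gain):
    """Stand-in for an equalizer boost: one gain applied to every sample."""
    return [s * gain for s in samples]


signal = [0.06, 0.30, 0.08, 0.50]   # two quiet samples, two louder ones
THRESHOLD = 0.10
GAIN = 2.0

# Gate first: the quiet samples are muted before the boost can raise them.
gate_then_eq = simple_boost(noise_gate(signal, THRESHOLD), GAIN)

# Boost first: the quiet samples are lifted above the threshold and survive.
eq_then_gate = noise_gate(simple_boost(signal, GAIN), THRESHOLD)

print("gate -> EQ:", gate_then_eq)
print("EQ -> gate:", eq_then_gate)
```

With these invented numbers the two quiet samples are silenced when the gate comes first and pass through when the boost comes first, so the same two devices yield different results depending on their order.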
The flow of the signal through the chain, and the order of equipment and
events, will often be consistent with Figure 11-1 and Figure 11-2. The two
figures show the general signal flow from tracking through mixdown. Figure
11-3 presents a typical signal chain of a digital audio workstation (DAW)
with related input/output and monitoring systems, demonstrating that much of
the signal flow occurs in software within the DAW.
The activities and associated devices of any recording-and-reproduction
signal chain generally appear in the sequence outlined below. Some “devic-
es” (real or virtual) are used continually (or intermittently) throughout the
recording process, such as the mixing console and the monitoring system.
Other activities, such as editing, may occur at several different stages of the
production process and within the signal chain, or might not occur at all.
Figure 11-1 Traditional signal chain for tracking sessions. [Diagram labels:
Tracking Sessions; Creating and Capturing the Sound; Sound Generators
(Synthesizers and Computers); Preprocessing; Record Levels; Tracking and
Overdubs; Digital or Analog Multitrack Recorder or Digital Audio Workstation;
Monitoring; Computer Control (MIDI, SMPTE, and Proprietary Automation Systems)]
Tracking
• Microphones or sound generators
• Mixing console (preprocessing, record levels, and routing)
• Digital multitrack recorder or analog multitrack recorder (perhaps with
noise-reduction processors), digital audio workstation, or other digital
or computer-based storage
Mixdown
• Digital multitrack recorder or analog multitrack recorder (perhaps with
noise-reduction processors), digital audio workstation, or other digital
or computer-based storage
• Mixing console (routing and mixing)
• Signal processors
• System control methods (automation systems, MIDI, SMPTE, and
others)
• Mixdown and master recorder (analog, digital, or computer-based)
• Editing (razor blade or computer-based)
• Monitoring
Figure 11-2 Traditional signal chain for mixdown sessions. [Diagram labels:
Mixdown Sessions; Multitrack Storage; Sound Generators (Synthesizers and
Computers); Signal Processing; Mixdown; Mixdown and Mastering; Monitoring;
Computer Control (MIDI, SMPTE, and Proprietary Automation Systems)]
Many subcategories of these real and virtual devices exist, and many of these
devices can be used for multiple purposes. In keeping with the approach
of remembering broad concepts, the reader is encouraged to think about
equipment functions in basic terms: specifically, what the device does (for
example, mixing consoles exist to mix and route signals—anything else is
an add-on); how the device accomplishes its task (it combines several sig-
nals into one with variable proportions and it sends signals from one place
to another). Such general observations can keep the process simple, when
it starts to get complicated.
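The two basic functions named above, mixing and routing, can be reduced to a few lines of Python. This is a conceptual sketch rather than a model of any real console: the function names, source names, level settings, and destination names are all hypothetical, and signals are represented as short lists of sample values.

```python
# Conceptual sketch: all names and values are invented for illustration.

def mix(sources, levels):
    """Combine several equal-length signals into one.

    Each source is scaled by its own level (the "variable proportions")
    and the scaled signals are summed sample by sample.
    """
    names = list(sources)
    length = len(sources[names[0]])
    return [sum(levels[n] * sources[n][i] for n in names) for i in range(length)]


def route(signal, destinations):
    """Send a copy of one signal to each named destination."""
    return {name: list(signal) for name in destinations}


sources = {
    "vocal": [1.0, 0.0, -1.0],
    "bass": [0.5, 0.5, 0.5],
}
levels = {"vocal": 0.5, "bass": 1.0}   # fader positions as simple multipliers

mixed = mix(sources, levels)
buses = route(mixed, ["recorder", "monitors"])

print(mixed)
print(sorted(buses))
```

Everything else a console offers can be thought of as an addition to these two operations, which is the kind of basic framing the paragraph above recommends.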
Evolving technologies can tend to blur traditional processes and at times
refine them. The common DAW brings many of the above components of
the traditional signal chain and workflow into new dimensions (Figure 11-3).
Sound enters and leaves the DAW through an I/O interface (also referred to
as a computer audio interface). All facets of the signal chain and production
process can be handled through software on the computer (DAW). Steps
that were once sequential in analog can be simultaneous or reordered in
a DAW. This can lead to inefficiencies or poor results, but can also lead to
increased efficiency and effective work practices. Maintaining a clear vision
of the signal chain can keep the process simple, focused and productive.
Figure 11-3 Signal chain of a DAW-based system. [Diagram labels: Creating and
Capturing the Sound; Analog Audio; MIDI; Computer Audio (I/O) Interface;
Digital Audio and MIDI; Digital Audio Workstation; Monitoring]
The sonic imprint of devices in the signal chain is an important factor
that will be covered later. The reader/recordist will learn to evaluate these
imprints and learn to apply them appropriately. How a device shapes sound,
both by how it functions and because of its own inherent sound quality, is a
central concern. Choosing devices is an important artistic decision as well
as a practical and technical one. It requires artistically sensitive judgment
on the part of the recordist, as the recording devices and processes shape
the aesthetic and artistic elements of sound.
Guiding the Creation of Music
The recordist can have many roles in making a recording. Among these
roles is creating a working environment where the magic of making art
can happen. How the recordist might participate in making art will be cov-
ered in the next chapter. The recordist usually sets the tone for recording
projects and can control many things, including a project’s pace and how
people interact. The recordist is usually largely responsible for guiding
recording projects, whether or not the others involved acknowledge this.
Recordists are often responsible for keeping the creative process moving
effectively, efficiently, and invisibly, giving guidance to the artists or giving
them enough space and support so they are free to be creative.
It has often been said that the recordist should first be a psychologist. While
this statement may be somewhat extreme, the recordist needs to be sensi-
tive to interpersonal relations. How they work with performers and others
will shape the project as much as their actual recording duties.
The ways people normally treat one another in everyday living, and espe-
cially in standard business environments, are often nonproductive (at best)
in the recording studio. Recordists should consider how they speak with
the artists about the project, and how they interact with the artists socially
and during the creative processes. The type of image the recordist presents
to the artist will influence the artist’s comfort level and ability to work.
The recordist will try to keep artists relaxed in the studio environment and
focused on the project.
Recordists strive to get creative people to do their best work, while attempting
to perform their own tasks at the highest standard. The process of creating
art (a music recording) is an emotional roller coaster of ecstasy over what
has just been discovered and anguish over not having an equally brilliant
answer to “what comes next?” Time and financial constraints further stress
artists (clients).
Musicians/creative people are exposed and vulnerable in the recording
process. The recordist must be certain to do nothing, and to allow nothing
to happen within the session environment or through anyone else, that would
make artists feel unprotected or, worse, threatened. Musicians must have
the freedom to be creative around the recording studio, without feeling
that their every move is watched or evaluated. At times they will need to
feel they are alone.
The expressive nature of performing music will often involve taking chanc-
es, stretching performance abilities to their limits or beyond, and making
mistakes. These necessary activities can potentially embarrass confident
(let alone less than confident) performers if they are critically judged at
this vulnerable time. Performers need to be confident to perform well. The
recordist is attempting to get the performers to exceed the height of their
ability. Nothing should be allowed to happen that would diminish the confidence
level of the performers or take away from the trust that the
performers must have in the recordist.
While evaluations have their place in the recording process, judgments—
especially those of a negative nature—rarely can be used constructively.
Summary
The broad strokes of Part Three will give the reader/recordist a method for
applying artistic judgment to the recording process. It will lead them to
envision the project as a whole before it begins, and to retain that vision
while executing the many technical tasks and artistic decisions required to
craft a music recording.
The reader will gain insight into how recording devices and techniques can
be evaluated and applied to crafting a mix in an artistically expressive way.
12 The Aesthetics of
Recording Production
Two central issues define the differences between various approaches to
the aesthetics of recording production: (1) the relationship of the recording
to the live listening experience and (2) the relationship of the recordist to
the creative and artistic decisions of the recording.
The recording process can capture reality, or it can create (through sound
relationships) the illusion of a different world. In most situations recordists
find themselves moving within the vast area that separates these two
extremes, where they enhance the natural characteristics and relationships
of sounds. The recordist will determine an appropriate recording aesthetic
suitable for the individual project, based on its planned material and over-
all concept and qualities.
The recordist will use the recording process to support the production
aesthetic of the project. The artistic elements of sound will be captured
and shaped in relation to the live listening experience and to the musi-
cal message. Recording techniques and technologies may also be used to
shape the performance, especially through editing, mixing, miking, pro-
cessing, and overdubbing techniques.
The recordist will play a more active role in shaping the materials of the
music (or the presentation of those materials) in certain projects more than
in others. This is often a result of the function of the recordist, in relation to
the other people involved in the recording project, in the artistic decision-
making process.
For each project individually, the recordist will define the proper recording
aesthetic and their own role in the creative process of crafting the
recording.
The Artistic Roles of the Recordist
The recordist must have a clear idea of their role in the creative process for
each project. The project may include the composer of the music, one or
many performers, a conductor of an ensemble, and/or a specific recording
producer. The recordist must know their responsibility to the final artistic
product and the roles and responsibilities of the others involved.
Of the many possibilities, the recordist may be functioning to capture the
music as closely dictated by the composer. They may be functioning to
capture, as realistically as possible, the performance of an ensemble, as
precisely directed by the conductor; they may be functioning to capture
the interactions and individual nuances of a group of performers, without
altering the performance through the recording process; or the recordist
may be functioning to precisely execute a recording producer’s instructions
(often in ways that transform performances). In all of these cases and
many others, the recordist is allowing the artistic vision and decisions of
others to be most accurately represented in the recording. The recordist’s
role then is to facilitate and realize the artistic ideas of others, and not to
directly impose their ideas onto the project.
The recordist’s role sometimes might be to offer suggestions to the cre-
ative artists or even to take an active role in the artistic decision-making
processes. The role of the recordist might be active in shaping a perfor-
mance of an existing work, or in creating a new piece of music. The record-
ist might be active in determining the sound qualities of the instruments
of the recording, or in determining the sound sources themselves. Vastly
different levels of participation in the artistic process are often required
from one project to the next.
The process of writing a piece of music for a recording is often a collaborative
effort. It may take place with many people (composer, performers, producer,
recordist) or just a few (performer/composer and recordist/composer).
In many ways, the recordist functions as a creative artist and can serve the
traditional roles of a composer, a conductor, and a performer. The recordist
also shapes sounds in nontraditional ways. Recordists have unique con-
trols over sound and live performances that allow for an additional musi-
cal voice. It is possible to compose with the equipment (instruments) of
the recording studio, to shape sounds or performances through the use of
recording and mixing techniques, or to create a new musical environment
for someone else’s musical ideas and performances.
As will be explored below, the recording studio can be thought of as a
musical instrument or a collection of musical instruments. In this way, the
recordist may conduct all of the available sound sources (for example,
bringing sounds into and out of the musical texture through mixing); may
“perform” the musical ideas through the recording process; may alter or
reshape the sounds of the sources, or “interpret” the musical ideas, in
ways that are not possible acoustically; and may create (compose) new
musical ideas or sounds.
The Recording and Reality:
Shaping the Recording Aesthetic
The recordist has many potential roles in shaping the recording aesthetic.
The role of the recordist might be to capture a live event as accurately as
possible in relation to the dimensions of that real-life experience, or the
recordist might seek to alter the artistic elements of sound to enhance the
quality of that real-life experience. The recordist may even seek to create a
new reality or set of conditions for the existence and relationships of sounds.
Reality is simulated, enhanced, or created through the recording process.
The relationship of the recording to the live listening experience is central
to the aesthetic quality of the recording. A recording may differ from the
live listening experience by (1) the use of the artistic elements of sound in
ways that cannot happen in nature, and (2) the presentation of impossible
human performances and compilations of perfect performances.
The aesthetic and artistic elements that most influence the life-like qualities
of the recording are environmental characteristics and the dimensions of
the sound stage, and the relationships of musical balance to the timbres of
sound sources.
Sound Stage and Environments
Sound exists in space. Humans conceptualize sound, especially in the con-
text of music performances, in relation to the spaces in which the sound
is heard to exist. The recording process must provide the illusion of space
to convince listeners that the sound has been reproduced in a way that
is associated with their reality. The recording will provide the illusion of a
performance space or a physical environment for the performance—this is
the perceived performance environment.
This perceived performance environment is an illusion of a space wherein
the recording can be imagined as existing during its re-performance (play-
back). The realistic nature of the performance of the recording will play a
central role in establishing the relationship of the recording to the live lis-
tening experience. The listener will subconsciously scan the recording to
establish environmental characteristics, an imaginary stage (sound stage),
and a perceived performance environment. This information allows the
listener to complete the process of establishing a reality for the listening
experience of the recorded music performance.
These three important characteristics need to be deliberately shaped or
captured to precisely determine this aspect of the recording’s aesthetic. If
not, the recording will appear deficient in some way.
The imaginary environments will be either the captured reality of the origi-
nal performance space, an altered or enhanced reality of the original perfor-
mance space, or new realities that are created for the performance through
signal processing. The listener will make fundamental judgments about
the material of the recording based on the qualities of the environment(s)
simulated in the recording and will match the musical material against the
appropriateness of the environmental cues. An environment’s size and
sound qualities can have a profound influence on an instrument’s presentation
of musical materials and its overall character.
The listener will imagine the location of the sound sources relative to one
another and to the overall environment. The listener will envision the sound
stage of the recording. In so doing, the entire ensemble will be placed at a
certain distance from the listener, and each individual sound source will be
placed at a distance and at an angle from its perceived location. The relationship
of these cues to the potentials of live performances will also define the
aesthetic of the recording in relation to the possibilities of our physical exis-
tence. The recordist must give focused consideration to the makeup of the
sound stage. Crafting the sound stage is the primary opportunity to shape
the aesthetic of the recording and to provide the musical
materials and recording with their spatial dimensions.
The sound stage of the recording might place the sound
sources in locations that purposefully resemble those of
a live performance. In certain recording techniques, the
integrity of this imaging is a primary concern. Certain
stereo microphone techniques are designed to accurately
capture the depth of the sound stage and the lateral loca-
tion of the sound sources. Other techniques accurately
capture the microphone-to-stage distance and stage width.
Multitrack recordings can also deliberately create a sound
stage that recreates the live-performance relationships of
the performers.
The recordist often alters the sound stage to enhance the
musical material. One or several additional microphones
may be used to accent certain members of an ensemble.
The highlighted instruments are given a distance from the
listener, width of image, or a specific location that provides
them with more prominence in the musical texture. This can be performed
subtly, so as not to dramatically alter the natural qualities of the recording,
or it can be quite pronounced, depending on the aesthetic of the recording.
Sound stages are created for multitrack recordings and for recordings
made with only (or mostly) synthesized sounds. These recordings were
created outside a common environment and with minimal naturally occurring
spatial cues captured with the sound sources. The recording is given
spatial cues by the recordist during mixing and signal processing. If not,
the listener’s imagination will generate these relationships. The recordist
will control the sound stage for the recording by crafting the characteristics
of environment(s), distance, and stereo/surround location.
Listen . . .
to tracks 52 and 53, then 50 and 51
for the sound stage dimensions of one mix that simulates the relationships
of a live sound stage and a second mix of the same musical balance that
significantly alters the sound stage to unnatural proportions and
relationships. Finally compare those mixes to two stereo microphone techniques.
The recordist can provide the sound sources with life-like environments
and place them in natural physical relationships to one another, or they
can purposefully create environmental, distance, and localization cues that
would be impossible in nature.
These concepts have been reinterpreted in two common approaches to mixing
for surround sound. The traditional use of the listener’s front field can
still be the focal point of the surround mix. Left/right/center channels can
present the primary materials and be enhanced by placing ambience and
special effects in the rear field. This replicates the observed performance of
traditional music listening experiences. Surround might also approach the
sound field as a 360° environment, where instruments and mix elements
can appear anywhere in the surround-sound field. Listeners are now surrounded
by sound sources and may perceive themselves to be within the
music, perhaps even right in the middle of the performance ensemble. This
has the potential to be a strikingly new listening experience.
The sound-stage diagrams of Chapter 9, and Figures 14-1 and 14-2 from
Chapter 14 will help the recordist to craft sound-stage relationships. They
will prove useful in many ways, including evaluating the relationships of
sound sources and keeping track of the locations of sources.
The perceived performance environment plays a large role in determining
the overall sound quality of the recording, and its illusion/reproduction of
the size of the “space” of the recording. The listener’s position in relation
to the sound stage (the stage-to-listener distance) plays a critical role in
the level of intimacy of the recording. The dimensions of the sound stage
can provide great breadth and depth to the recording or can pull a group
of instruments (sound sources) closely together; it provides opportunity to
craft the recording in significant ways, and allows for individual sounds to
be altered for width, location and distance.
Musical Balance and Sound Quality
The interrelationships of musical balance and the differences of sound
quality of sound sources played at different dynamic levels (performance
intensity) are integral parts of live performances, and are easily altered by
the recording process.
Recordings that attempt to capture the aesthetics of the live performance
will seek to capture the musical balance of the performers as they (or the
conductor) intended. The changes in the sound quality of the instruments
will be precisely aligned with changes of dynamic levels in the musical bal-
ance of the ensemble and to changes in musical expression. It is important to
maintain these relationships to keep the character of the live performance.
Recordings that seek to enhance the characteristics of the live performance
may contain slight changes in musical balance that were not the
result of the performers, but were rather the result of the recording or mixing
process. These alterations will be heard as changes in dynamic levels that
are not supported by changes in the sound qualities of the instrument(s).
This enhancement might take place in only a few instruments, or it may be
used extensively throughout the entire ensemble. This enhancement technique
may be quite subtle and difficult to detect, or it may be prominent. A
soloist with an orchestra is a common example of when this might occur.
Alterations in dynamic levels, and thus musical balance, that are not aligned
with changes in performance intensities have become integral parts of
music written for recordings. Multitrack mixes frequently exhibit changes in
musical balance that were not caused by the performers. These changes in
dynamic level, then, are inconsistent with the sound qualities of the instruments
in the final recording. This enhances the potential of each element
(dynamics and performance intensity/sound quality) to be used individu-
ally in shaping or enhancing the musical material. Further, the expressive
qualities of sound quality/performance intensity can be incorporated into a
mix without the impact of a louder or softer dynamic level than desired.
The relationship between the musical balance and the
timbre of sound sources in many multitrack recordings
creates a wealth of contradictions between reality and
what is heard. The aesthetic of this type of recording
leans toward redefining reality with each new project and
is a stark contrast to the aesthetic of trying to capture the
reality of the live performance.
The recordist’s approach to any project should include
a conscious decision on a level of realism. How will the
final sound relate to real-life experiences, and how will the
characteristics of sound be shaped? What is the listener
intended to believe, and how can this be achieved?
The Recording Aesthetic in Relation
to the Performance Event
The recording process will shape music performances in such a way that
the sound qualities and relationships of live performances may be altered.
How the process alters the live listening experience is central to the
aesthetics of the recording.
Listen . . .
to tracks 37 and 38
for the sound quality of the perfor-
mance intensity of the instruments
when they were recorded and how
these coincide with the dynamics in
the two mixes.
Production-Transparent Recordings
The recording medium is often called upon to be transparent. In these con-
texts, it is the function of the recording to capture the sound as accurately as
possible, to capture the live performance without alteration. This type of aes-
thetic is common for archival recordings that function to document events.
These production-transparent recordings may or may not be sensitive to
the performance environment. At times, these recordings attempt to capture
the sound of the music performance without considering the artistic dimen-
sion of the relationship of the music (and musicians) and the performance
space (and audience). In other instances, these recordings seek to negate
any influence of the performance space on the sound of the recording.
Because these are recordings of live performances, the recordist is not
involved with compiling the performance. The performance takes place in
real time, and it will not be possible to back up and fix a certain section or
idea. The recordist is primarily concerned with the technical aspects of the
sound of the recording (critical listening) and the sound qualities of the
overall program (at the highest level of perspective).
A limited number of microphones are often used in making this type of
recording. Usually two microphones are used in some appropriate stereo-
microphone technique, placed fairly close to the ensemble. The microphones
generally are sent directly to a two-track (or surround) master, with little or
no signal processing. The recordist will exercise little real-time control over
the quality of the sound and over the shaping of the performance.
The recording medium can also be transparent in documenting a perfor-
mance, while placing the music in a complementary relationship with the
host environment of the performance. Specific pieces of music are best
suited to certain environments and are most accurately perceived from
certain listening distances. The artistic message of a specific piece of music
will be most effectively communicated in a certain environment and with
the listener at an ideal distance from the ensemble.
Spatially Enhanced Production-Transparent Recordings
Spatially enhanced production-transparent recordings can ensure pieces
of music will be perceived as having been performed in an ideal environ-
ment, with the listener located at an ideal distance from the ensemble,
when listening to the recording. This approach locates the listener at the
ideal seat, and can be accomplished without altering the performance itself
while maintaining the transparency of the recording process.
The recordist (often with input from a conductor or producer) will determine
the type and amount of influence the acoustic performance environment
will have on the final recording. Microphone selection, choice
of stereo microphone array, and array placement within the performance
environment are the primary determinants of the envi-
ronment sound that is captured from the performance
environment. Artificial reverberation units or other time
processors may sensitively enhance the characteristics
of the environment. The distance of the listener from the
ensemble is determined primarily through microphone
placement and through time processing.
This recording aesthetic attempts to present the music in
the most suitable setting possible for that particular work
and to simulate the listening experience in the concert
hall. The recordist seeks to ensure that the sounds will be
in the same spatial relationships as the live performance,
that the recording process will not alter the balance of the
musical parts, and that the quality of each sound source
will be captured in a consistent manner. In these live
acoustic recordings, the recording may seek to reproduce
the sound of the performance space—surround sound sys-
tems can be used for great realism in this approach.
This aesthetic can have the recordist more involved with the decision-
making process in some projects than in others. This aesthetic may be used
for many types of music, and may be used for live concert recording as
well as session recording. While it is common in orchestral and other art
music formats, it is equally appropriate for jazz or any other music
recordings where the performers are refined in their sensitivity to and control of
their relationships to the whole ensemble. In session recordings, some (or
much) editing may be a part of this aesthetic. A consistency of sound qual-
ity and spatial relationships between all portions of the work will nearly
always be sought; this is a stark contrast to many multitrack productions.
Enhanced Performances
The recording medium may enhance the performance in widely varying
degrees. This aesthetic may be a slight extension of the concept of a trans-
parent live recording, with the recording process slightly enhancing certain
musical ideas, or this aesthetic may set another extreme of being a life-like
session recording that was recorded out of real time.
This aesthetic simulates a natural listening experience, by capturing or cre-
ating many of the inherent characteristics of a live, unaltered performance.
The timbre and dynamic relationships, spatial cues, and editing techniques
all serve to create the impression that the recording did indeed take place
within reality—as an actual, live performance.
Listen . . .
to tracks 50, 48 and 49 for a stereo microphone technique that does little
to alter the character and sound qualities/relationships of the performance,
a mix that establishes sounds at unnatural relationships, and a mix that adds
a stereo microphone technique to the unnatural relationships in the mix of
track 48.
When this aesthetic is an extension of the concept of a transparent live
recording, sounds are placed in the sound stage in the same relative
positions as the instruments were in during the recording. The width and
depth of the sound stage and image sizes are realistic, and the record-
ing will usually have a single environmental characteristic applied to the
overall program (a single soloist might be present with a slightly differ-
ent environment). Dynamic changes are nearly always aligned with timbre
(performance intensity) changes, though some microphone highlighting
might create a limited number of dynamic changes without timbre chang-
es. The recording process is used to slightly enhance certain musical ideas
from the live performance.
This aesthetic may be used for controlled live performances (those that
have been rehearsed with the recordist) or in recording sessions, for a wide
variety of musical styles. Minimal miking will usually be used, often an
overall stereo array with a small number of accent microphones (or stereo
pairs). Accent microphones allow this aesthetic to be adaptable to stage
recordings of large classical ensembles or for musical theater and opera.
The recording is usually mixed directly to a two-track (or surround), with
mixing decisions taking place during the rehearsals or during the record-
ing session(s). Recording submixes to a multitrack recorder or DAW is also
common, but many of the decisions related to the sound of the recording
are still accomplished during the recording session or rehearsals.
Recording sessions will often be composed of many takes of large and
small sections of the work. As the ensemble balance is largely controlled
by the performers, and the parts are not singled out (making re-recording
of individual parts unavailable), ensemble problems of accuracy and sound
quality often cause a lengthy recording session and a large set of session
takes.
The master ends up being a collection of a few to many takes. The edit-
ing of these takes becomes an integral part of the recording process. The
best takes are selected based on musical and technical qualities. These are
then edited together (cut and paste) to compile a perfect performance of
the work. The master or final version represents the final performance. The
goal of this approach is usually to craft the best possible (perfect) perfor-
mance, interpretation, and presentation of the music.
The aesthetic of slightly enhancing the reality of the performance may also
be found in session recordings that simulate natural sound relationships.
Although recorded out of real time, the recordings will seek to simulate
the experience of live music. Some emphasis of certain musical materials
(and/or artistic elements) over others will be unavoidable in the record-
ing process and will diminish the naturalness of the relationships of the
sounds. Some recordings may simulate reality only generally, but still have
a goal of providing the illusion of a naturally occurring performance—even
with the complete control of the multitrack recording (miking, processing,
and mixing) process.
Created Performances
Music written for the recording medium may have qualities that are
significantly different from live acoustic music. It may be constructed in
different ways, and it may contain additional artistic elements. Music written to
be recorded, and especially music written during or through the recording
process, is often composed and/or performed in layers—its performances
created by compiling separately played parts.
The musical materials are often written and recorded one part at a time, or
a small group of parts at a time. The recordings use close miking techniques
that ensure a separation of parts (and thus allow for precise control of
the individual sound source) or will physically isolate the performers/
performances from one another. The parts are continuously compiled on
a multitrack recorder, with each new musical line added to the musical
texture. Players often perform their parts many times; any number of ver-
sions may be recorded before the desired result is achieved. The recordist
(sometimes with the aid of a producer) may be responsible for listening
for performance mistakes, listening for the most interesting and success-
ful performance, keeping track of which portion of which musical part was
performed most accurately, on which take, etc., in addition to maintaining
impeccable sound quality.
The final piece may be a composite of any number of performances, and
it may be a controlled integration of many different musical ideas and per-
sonalities. The performances may or may not have taken place at the same
studio, or during the same day (or year), and the performers may or may
not have met and discussed their musical intentions.
The recording medium can create the illusion of a performance that con-
tains characteristics that cannot exist naturally. This aesthetic has become
common since the early/mid-1960s. In this “new” aesthetic, the record-
ing medium’s unique sound qualities and creative potential are used. It
becomes a musical ensemble with its own set of resources for shaping a
performance or creating a musical composition.
Music written to be recorded may exploit environmental characteristics,
musical balance and sound quality contradictions, sound stage depth and
width, sound source imaging, or its other unique elements to create, define,
or enhance its musical materials.
This aesthetic might purposefully create relationships that cannot exist in
nature: a whisper of a vocalist might be significantly louder than a cymbal
crash. This aesthetic will use the unique qualities of recorded sound in the
communication of the work’s musical message.
Recordings of this aesthetic might seek to create a new reality for each work
or project. Unique relationships of sound are calculated and incorporated
into the music. Recordists (engineers and producers) develop personal
styles of the ways they shape aspects of balance, imaging, sound stage,
and environment, while continuing to explore the expressive potentials of
recording and the medium’s relationships with reality.
Much of today’s popular music falls within either the aesthetic described
above, or within the aesthetic of using the recording medium to enhance
the illusion of a live performance. Many of the artistic considerations of the
recording process are very apparent in these two aesthetics.
Altered Realities of Music Performance
The reality of the music performance event itself is also altered by the aes-
thetics of the recording production. The recording provides an illusion of
a live performance, and the content and qualities of the perceived music
performance may vary from a slight improvement of our listening realities,
to being a live performance that exists in ways that are impossible in our
known world.
Recording allows a music performance to be an object that can be precisely
polished by the artists, that can exist as an almost indestructible physical
object and be held in one’s hands, and that can be owned by any number of
members of the general public.
The reality of a live music performance as an experience witnessed in a
fleeting instant in time, and as retained only in the memories of those who
experienced the event, is significantly altered by the existence of
recordings. A recording is a permanent performance of the piece of music, one
that can potentially live well beyond the artist’s lifetime.
Additional pressures, ideals and aesthetics are placed on the artists
responsible for any individual recording, as opposed to a live performance.
A recording may transform the live listening experience: (1) by creating
humanly impossible performances, (2) by providing performance condi-
tions that are inconsistent with reality, (3) by presenting error-free and pre-
cisely crafted performances, and (4) by providing a permanent record of a
music performance.
A recording is a permanent performance of a piece of music. It is a period
of time that has been created or captured, and that may be preserved for-
ever. The performance can be revisited (and observed at any level of detail)
at any time, and any number of times, and by anybody.
Recordings can often become definitive performances of a piece of music.
The definitive performance may be thought of as being either that of a
certain artist or of the particular piece of music. An artist’s performance/
recording of a work might be what is widely accepted as the definitive
performance (or reference) of how the work exists in its most suitable form,
in relation to performance technique or to the communication of the
musical message. A specific recording of a work can also serve as a definitive
reference of how a work exists in its most suitable state, in relation to
recording practice or to musical considerations.
Recordings not only are a means of creating an art form, they also preserve
the artistic ideas of music performance and expression that do not rely on
the recording process. Recordings may permanently preserve the music
performances of an artist. They may provide historical documentation or
archival functions by preserving the music performances of particular art-
ists, ensembles, events, and more—even nature and its sonic landscapes.
The great contradiction of producing a recording that is a permanent record
is that the recording often becomes dated. Artists develop and grow. Their
musical abilities, levels of understanding, artistic sensibilities, and their
musical ideas change. The permanent performance that was previously
created (perhaps only a few weeks before) may no longer be representa-
tive of the artists’ abilities or aesthetic opinions. It can be a snapshot of a
point in time.
The recording will often represent the artists’ and recordist’s idea of a
perfect performance of the work. Theoretically, a perfect performance of
any piece of music can be produced through the recording process. The
definition of the perfect performance may vary considerably between
performers, but the concept of the recording itself will be similar. It will be a
presentation of what the performers and producer believe to be the most
appropriate interpretation of the piece of music, under the most appropri-
ate performance conditions (instruments used, performance space, etc.).
A perfect performance will combine the artists’ desired interpretation of
the music (and an absence of performance inaccuracies), with an illusion
of the drama of a live concert, as if experienced at the ideal listening
location of an ideal performance environment for the ensemble and
piece of music. Practical considerations of the recording process might
compromise the actual quality of the recording, but the goal of the record-
ing remains constant.
The recording may present musical ideas, and sound qualities and relation-
ships, that are impossible to create in live performance. Musical materials
may be presented in ways that are beyond the potentials of human
execution. These might be rapid passages performed precisely and flawlessly,
dynamics and sound-quality expressions that change levels quickly and in
contradictory ways, or the use of a single human voice to perform many dif-
ferent parts. These are only a few of the possibilities. Humanly impossible
performance techniques and relationships are easily created in recording.
The reality of what is humanly possible in a music performance is often
inconsistent with the music performance of the recording. The relationships
of sound sources, the characteristics of sound sources, the interrelation-
ships of musical ideas and artistic elements, and the perceived physical per-
formance of the musical ideas may be such that they could not take place
in nature. The music performance of the recording may be such that it could
not be accomplished without recording techniques and technologies. It may
be impossible to recreate the music or the performance live, on stage—a sit-
uation often cited as one of the reasons why The Beatles stopped touring.
Recordings have greatly influenced our expectations of live music
performance. The listening audience of a recording becomes accustomed to a
recording as being the perfect performance of a piece of music. The audi-
ence may learn the subtleties of a recording quite intimately.
A particular performance of a piece of music is created or captured in a
music recording. When an audience member owns a copy of the record-
ing, or when a recording has received much exposure through media, the
audience may have listened to a recording many, many times. The
recording becomes the definitive performance of the music, for some audience
members. The audience will carry this knowledge of the music recording
into a live music concert, and may impose unrealistic expectations onto the
performers and the event.
The artists’ new and different interpretations of the music, the absence of
certain sound qualities from recording production, the inconsistencies of
human performance, or other factors, might create differences between
the live performance and the known recording of a piece of music. A poten-
tial exists for audiences to become less involved with the drama and excite-
ment of the live performance of music. Audiences may attend concerts to
publicly hear live performers and the music performances they have come
to know well as recordings, heard privately many times. Audience mem-
bers do not always accept the reality that the live performance was not the
same as the studio-produced recording. A potential exists for the audience
member to be dissatisfied that the definitive performance they know well
was not reproduced for them at the live event.
An audience may place unrealistic expectations on the performers.
Performers may be expected to perform flawlessly, or with the same version
and interpretation of the work as their released recording. An audience
may expect the performing artist to fill the role of reproducing one
particular performance of the work. By reacting to this audience, the per-
former may be restricted from allowing their interpretation of the music
to evolve and change according to their growing experience, and may be
restricted from creating a more exciting performance. The subtleties of
artistic expression that are possible only through the artist and audience
interaction, along with other unique qualities of live music performance,
may be lost from an event and diminish the musical experience.
It is unrealistic to expect to hear a precise reenactment of a recording in
a live concert environment. Many music recordings have been produced
in such a way that a live performance of all musical parts, sounds, and
relationships is impossible. These are potential negative outgrowths of the
audience’s familiarity with certain recordings and the new listening habits
afforded by readily available music performances.
The general public hears much more recorded music than live music.
They are often prone to judging live performances with the expectations
of a perfect, recorded performance. Human inaccuracies are sometimes
not easily tolerated, and new musical and expressive interpretations of a
known piece of music may be heard as simply wrong. Further, people own
their own personal copies of performances. A tendency to personalize or
to become attached to those performances is common. When the perfor-
mance is changed, something personal (“their music”) has been altered.
Summary
The recording aesthetic is determined by the relationship of the recording
to the live listening experience. The recording aesthetic is arrived at through
a careful consideration of the musical material, the function of the recording
(type of music recording, sound track, and advertisement) and the desired
final character of the recording. The recordist’s role is defined by their
contributions (or lack thereof) to the process of making the creative decisions of
the recording, whether making decisions or executing the ideas of others.
The recordist’s overall control of the many qualities of the final music
recording is highly variable. The recordist is responsible for the overall characteristics
of the recording and may be in control of (and responsible for) its most min-
ute details, depending on the recording techniques being used. The recordist
might have precise control over shaping or creating a performance, or they
may be engaged in capturing the global aspects of a live performance.
The amount and the types of control used in the recording process will
determine the degree of influence the recordist has on the final content of
the music recording. The recording medium may be used to greatly
influence the sounds being captured by the microphones, or the recording
medium may shape sound much more subtly. The recording process will
be used differently, depending on the particular project.
The aesthetics of recording production vary with the individual, with the musi-
cal material, and with the artistic message and objectives of a certain project.
An aesthetic position or approach may be appropriate for a certain context, or
it may not. An approach might enhance the artist’s conception of the music,
or it may not. An approach may be consistent with other considerations of
the project or the music, or it may not. Perhaps consistency is not desired.
The aesthetic approach to recording production creates a conceptual con-
text of the artistic aspects of recording. The intangible aspects of the art can
then be appreciated within this context. The recordist must clearly define
the aesthetic position of the recording, in order to successfully control and
shape it.
13 Preliminary Stages: Defining
the Materials of the Project
The preliminary stages of the project will define the musical ideas of the
project, many qualities about the music and the recording, and move
toward making the proper production decisions to best capture or realize
those ideas. Artistic, technical, logistical and conceptual concerns will all be
considered and addressed.
A recording project will start with identifying the music to be recorded. In an
album project, this will often include how the songs in the album relate to
one another, and perhaps some thoughts of song sequencing. Some songs
will fit the project, and others might not. A song might emerge as a central
theme or an anchor to the project. An overall concept for the album might
begin to emerge; certainly an overall quality will take shape during these
preliminary stages. This overall quality is often not articulated between
those involved, but it is certainly present within their basic understanding
of what the project is trying to communicate or achieve musically.
This overall quality will become focused while making decisions on what
is trying to be accomplished musically—how to best deliver the story of
the text and music. While the composer, artists, producer, and others are
involved in the project and make musical decisions, the recordist needs to
place those musical ideas into the appropriate context of the recording. The
recording process will impact the music and its message, as discussed in
the previous chapter. How this will occur, and the aesthetic of the record-
ing, will be determined at this beginning stage.
The recording process will shape the overall quality of the project and
songs by establishing its overall texture. As discussed above, this is com-
posed of:
Form
Perceived performance environment
Pitch density
Program dynamic contour
Reference dynamic level
Sound stage (lateral sizes and locations, and distance locations of sources)
Listener-to-sound-stage distance and relationship (level of intimacy)
Environments of sound sources
Musical balance of sources
Sound quality
At the preliminary stages, these elements will begin to take shape.
Decisions will be made that will influence and determine how these elements
will appear in the final recording. Some of these decisions will not be able
to be reversed after tracking has begun.
The recordist will need to consider the project from a variety of perspec-
tives during these preliminary stages. The highest level of perspective will
often be a suitable starting point; the overall concept and overall shape of
the composition, how the song will progress dynamically as it unfolds, its
reference dynamic (intensity) level, etc., as listed above. These consider-
ations will take shape as they are built from the materials and relationships
that exist at lower levels of perspective.
The sound qualities and the relationships of groups of sound sources and
of individual sound sources will shape these overall qualities. For instance,
as the recording is being planned, thoughts of how the lead vocal should
“sound” will emerge. The recordist will formulate a sound in their imagina-
tion that they are trying to obtain; this sound will have the dimensions of
timbre/sound quality, which should include thoughts of environmental char-
acteristics, and it will have performance intensity, a sense of the required
level of intimacy (distance location) to best communicate the message of
the text, a breadth (size) and location of the voice on the sound stage. These
decisions will directly impact the song’s sound stage, pitch density, reference
dynamic level, perceived performance environment, etc. As other sources
or groups of sources are defined, the overall qualities are further refined.
The sound qualities of all of the sound sources can be captured or crafted in
many ways. As explored below, synthesis techniques and microphone
selection and techniques contribute greatly to this process. The inherent sound
qualities of all devices and processes of the signal chain also contribute to
crafting the sound quality of individual sources, and the overall recording,
and need to be understood and recognized.
Finally, the playback system and the listening environment can alter how
the sound qualities of the recording are heard. Inaccurate recordings can
easily result. At this preliminary stage of the project, it is important to
recognize the characteristics of one’s monitor system. Accurate sound
reproduction is needed in order to control the recording process, ensure the
quality of the signal, and to accurately recognize and understand the sound
qualities and relationships that are being crafted.
The preliminary, preproduction stage will consider, and work to define:
The music and its desired qualities,
The recording aesthetic to best achieve those qualities,
How the recording process will contribute to shaping the music,
How the recording process will unfold, and
The equipment and technologies most suitable to the desired sound qualities of the project.
Sound Sources as Artistic Resources,
and the Choice of Timbres
The music of a project and the sound sources (voices and instruments) to
perform the music are usually determined for the recordist. Clients usually
dictate what is recorded and who performs. Many subtle decisions on the
selection of or qualities of sound sources are, however, often made during
the preliminary stages of the recording process (and during the produc-
tion process itself). Sound sources are selected, created, or shaped. In all
instances, their selection is an important decision in shaping the sound of
a recording.
Sound sources deliver the materials of the music production. They are vehi-
cles for presenting musical ideas. They must be selected carefully and with
attention to their anticipated roles in the production. Sound sources are often
coupled with the musical ideas themselves; the musical idea and the sound
quality of the source are often melded into a single artistic impression.
In selecting sound sources, decisions are being made as to the most
appropriate timbre to present the musical materials. In doing so, the sound
source is considered in relation to:
Suitability of its sound quality to the musical ideas,
Potential to deliver the required creative expression,
Pitch-area information for anticipated placement in relation to timbral balance, and
How it might appear in musical balance, on the sound stage, and with environmental characteristics.
In making these projections and evaluating sources, the recordist will listen
at a variety of perspectives. Focus will shift from overall qualities of the
source, to its dynamic envelope, spectral content, and spectral envelope, to
perhaps the subtle characteristics of spectrum changes that might appear
when the sound is combined with other sounds, as examples. Sound qual-
ity evaluations performed quickly to make general observations will assist
this process greatly.
Performers as Sound Sources
Individual performers may be selected because of their unique sound qual-
ities. Individual performers, themselves, are unique sound sources. This is
especially true of vocalists, who are sought for their unique singing voice
and styles, as well as for their speaking voices.
Accomplished instrumental performers that have developed their own
style(s) of playing or that are skilled in performance techniques are also
sought for their unique sound qualities. Individual performers often bring
their own creative ideas and special performance talents to a project, and
considerably aid the defining of the sound qualities of the sound sources.
The act of selecting particular performers for a recording is important for
defining the sound quality of the sound source down to the minutest detail.
Since the recording is a permanent performance of the piece of music, the
selection of the performers for this performance is often an important con-
sideration in determining the sound qualities of the sound sources.
Creating Sound Sources and Sound Qualities
The sound manipulation and generation techniques of sound synthesis
allow sound sources to be created with the design of new timbres. New
sources can be invented with new sound qualities. Many approaches to
sound synthesis are available; among these are:
Analog synthesis techniques
Additive and FM digital synthesis techniques
Sampling-based synthesis techniques
Many hybrid (analog + digital + sampling) synthesis techniques (such
as waveshaping, hyper integrated, phase distortion, wavetable, physi-
cal modeling, granular synthesis techniques, etc.)
DAW virtual instruments and plug-ins
Musique concrète techniques
Recording and performing techniques on sound samplers (sampled
live or with commercially available sound libraries)
The creation of sound sources allows the recordist great freedom in
shaping sound qualities. The recordist will be functioning as a sound designer,
whose goal is to create a sound (with a sound quality) that will most
effectively present the musical materials and ideas of the music. Sound sources
that precisely suit the contexts of the sound and the meaning of the music
may be crafted or created by the recordist.
While an examination of the sound synthesis process is out of the scope
of this writing, it is important for the recordist to be aware of the many
creative options afforded by sound synthesis. The study of sound synthesis
from the perspective of building timbres will greatly assist the recordist
Preliminary Stages: Defining the Materials of the Project
in understanding the components of sound, and how the components of
sound may be used as artistic elements. This process will also help develop
skill in recognizing the subtle qualities of spectrum, spectral envelope, and
dynamic envelope.
It is important to note that, by inventing sound sources, the recordist will
be presenting the audience with unfamiliar “instruments.” The sources (new
instruments) may be performing significant musical material. The reality
of the performance has been altered out of the direct experience of the
listener. The recordist will create a new reality of sound relationships or might
emphasize known sound relationships to reestablish known experiences.
These relationships need to be accomplished in such a way as to support
the musical materials and ideas of the recording.
Environmental Characteristics
The human realities of sound relationships are most closely associated
with acoustical environments. The listener will process the characteristics
of the environment within which the sound source is sounding, and the
location of the source within its acoustical environments, to imagine the
reality of the performance. The acoustical environment itself will also func-
tion as a sound quality, shaping the sound source.
Fusion occurs, combining the source’s sound quality and environmental
characteristics into a single impression. The selection of environment
therefore may be as important for sound quality concerns as it is for
shaping the spatial aspects of the recording.
The recordist needs to be sensitive to the dimensions of environmental
characteristics, as previously covered. How applied environmental charac-
teristics transform the source’s timbre will be evaluated by careful attention
to spectral components, and also to the composite sound quality, by shifts
of perspective. Environmental characteristics can readily mask, distort
or enhance desirable qualities of the original sound; careful attention to
detail is required to create a suitable environment and add an appropriate
amount of the environment to obtain the desired sound.
Nonmusical Sources
Nonmusical concepts often find a place in a music project. As sound
sources, speech and special effects require special consideration.
With speech as a sound source, a particular voice is selected to comple-
ment the meaning of the text and to complement the other sounds in the
musical texture. The voice is carefully selected for the appropriateness of
its sound quality, and thus its dramatic or theatrical impact, to the meaning
of the text to be recited.
Special effects are sound sources that are used to elicit associated respons-
es or thoughts from the listener. The associated thoughts generated by the
special effects are not directly related to the context of the particular piece
of music. Special effects pull the listener out of the context of the piece of
music, to perceive external concepts or ideas. A horn sound occurring in a
piece of music is an effect, used to elicit the mental image of an automo-
bile. The same sound used as part of the musical material, used to comple-
ment the musical ideas of the work, would not be a special effect, but rather
a musical sound source.
Microphones: The Aesthetic Decisions of Capturing Timbres
The sound qualities of instruments and voices are shaped, and can be
significantly transformed, when captured by a microphone. The selection of
a particular microphone, placing the microphone at a particular location
(within the particular recording environment), and how these complement
the characteristics of the particular sound source will determine the final
sound quality of a recorded sound.
A specific microphone will be selected to make a certain recording of a
sound source, because of the ways its performance characteristics
complement the sound characteristics of the sound source. This interaction of
the characteristics of the microphone and of the sound source allows the
recordist to obtain the desired sound quality of the recorded sound.
The recordist needs a clear idea of the sound quality sought. With this, the
recordist will determine which microphone is most appropriate by
comparing the characteristics of the sound quality of the sound source and
the performance characteristics of the microphone, to their vision of the
final sound quality sought. Perhaps no other decision shapes sound quality
more than microphone selection and placement.
A sound source’s sound quality (dynamic envelope, spectrum and spectral
envelope), distance location (level of timbral detail), and perhaps
environmental characteristics (that might be captured from the performance
space) are all determined in this process. In future processes, some aspects
can be altered (such as adding a compressor to alter the sound’s dynamic
envelope), some elements accentuated (such as additional reverberation),
and some elements deleted or attenuated (such as filtering out high
frequencies). Some qualities cannot be added later and must be captured
during the initial recording. For instance, a high degree of timbral detail
must be captured by appropriate microphone selection and placement
during the initial recording if the source is to be at a very close distance to the
listener in the final mix, as this detail cannot be created later. The distance
cues of the final sound stage are significantly determined by the timbral
detail captured by the microphones used.
Microphone placement can be as critical as microphone selection in shap-
ing sound. Where a microphone is placed will be carefully calculated against
the performance characteristics of the microphone and how the instrument/
voice produces its sound. Often placement options are impacted by the
recording environment and unwanted sound qualities, or by isolating the
microphone from sound sources the recordist does not wish to record.
No single microphone will be the best microphone for every sound source or
for the same source for every piece of music. The microphones selected for
recording the same sound source may vary widely depending on the above
circumstances, the desired sound, and what microphones are available.
Performance Characteristics
All microphones can be evaluated by their performance characteristics.
These characteristics are information on how the microphone will consis-
tently respond to sound. Thus, through these characteristics, the recordist
can anticipate how the microphone will transform the sound quality of the
sound source while it is being recorded.
As the microphone alters the sound source, it has the potential to contrib-
ute positively in shaping the artistic elements. If the recordist is in con-
trol of the process of selecting the appropriate microphone for the sound
source and conditions of the recording, the selection and applications of
microphones can be a resource for artistic expression. The artistic elements
of the recording can be captured (recorded) in the desired form, and the
microphone and its placement will become part of the artistic decision-
making process.
A number of microphone performance characteristics are most prominent
in shaping the sound quality. These characteristics are of central concern in
determining the artistic results of selecting these microphones for certain
sound sources. These microphone performance characteristics are:
Frequency response
Directional sensitivity
Transient response
Distance sensitivity
Subtle transformations of the source’s sound qualities take place during
this process. The listener should compare the sound of the instrument in
the recording room to the sound of the signal in the control room to hear
these differences. Focusing on changes in dynamic envelope, spectrum,
spectral envelope, and the proportions of direct-to-reverberant sound will
reveal many of these microphone performance characteristics. The recordist will
ultimately learn the “sound” of specific microphones, and learn how to use
them to their best advantage.
Frequency Response
Frequency response is a measure of how the microphone responds to the
same sound level at different frequencies. Amplitude differences at various
frequencies can appear at the microphone’s output and define the
sensitivity of the microphone to frequency.
The frequency response of a microphone often has frequency bands that
the microphone accentuates or attenuates. The matching of a sound source
with similar frequency characteristics may or may not provide the recordist
with the desired sound. The microphone may cause accentuation of
certain characteristics of the sound source, and perhaps a microphone with
frequency characteristics somewhat opposite to those of the sound source
will be a more appropriate choice. Again, this decision is dependent upon the
final, desired sound. Figure 13-1 presents a frequency response for a
hypothetical microphone that slightly emphasizes the frequency band from
approximately 2 kHz to 5.5 kHz, and attenuates frequencies below 100 Hz and
above 12 kHz.
Microphone frequency response adds new formant regions to the sound
source. The frequencies emphasized and/or attenuated by the microphone
act on all sounds equally, regardless of pitch level. Microphone frequency
response directly shapes, contributes to, and captures the spectrum of the
sound source.
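As a rough illustration of why a microphone's response behaves like a fixed
formant region, the sketch below applies one response curve to the harmonics
of two different pitches. The breakpoints loosely follow the hypothetical
microphone described above (a slight lift from about 2 kHz to 5.5 kHz, with
roll-off below 100 Hz and above 12 kHz). The exact decibel values, the
function names, and the log-frequency interpolation are assumptions made for
illustration, not measured data.

```python
import math

# Assumed breakpoints (Hz, dB) loosely modeled on the hypothetical microphone;
# the decibel values are illustrative, not taken from a real data sheet.
RESPONSE_POINTS = [
    (20, -9.0), (100, 0.0), (2000, 0.0), (3000, 3.0),
    (5500, 0.0), (12000, 0.0), (20000, -6.0),
]

def mic_gain_db(freq_hz):
    """Interpolate the response curve linearly on a log-frequency axis."""
    if freq_hz <= RESPONSE_POINTS[0][0]:
        return RESPONSE_POINTS[0][1]
    if freq_hz >= RESPONSE_POINTS[-1][0]:
        return RESPONSE_POINTS[-1][1]
    for (f_lo, g_lo), (f_hi, g_hi) in zip(RESPONSE_POINTS, RESPONSE_POINTS[1:]):
        if f_lo <= freq_hz <= f_hi:
            t = (math.log10(freq_hz) - math.log10(f_lo)) / (
                math.log10(f_hi) - math.log10(f_lo))
            return g_lo + t * (g_hi - g_lo)

def captured_harmonics(fundamental_hz, count=16):
    """Return (frequency, gain in dB) for the first `count` harmonics."""
    return [(n * fundamental_hz, round(mic_gain_db(n * fundamental_hz), 2))
            for n in range(1, count + 1)]

# The same frequency region is lifted whichever pitch is played.
for pitch in (220.0, 880.0):
    print(pitch, captured_harmonics(pitch))
```

A harmonic at 2640 Hz receives the same gain whether it is the twelfth
harmonic of 220 Hz or the third harmonic of 880 Hz, which is the sense in
which the response acts on all sounds regardless of pitch level.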
Figure 13-1 Frequency response of a hypothetical microphone. [Axes: frequency in hertz, 20 Hz to 20 kHz; relative response in decibels, -9 to +9 dB.]
Directional Sensitivity
A microphone does not respond equally to sounds that arrive at different
angles to its diaphragm. The directional sensitivity of a microphone is its
sensitivity to sounds arriving at various angles to the diaphragm. The polar
pattern of a microphone depicts the sensitivity of a microphone to sounds at
various frequencies in front, in back, and to the sides, and the actual pattern is
spherical around the microphone (Figure 13-2). Directional response
measures the microphone’s sensitivity to sounds arriving from angles, but
calculates this sensitivity at only a few frequencies (Figure 13-3).
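One common textbook idealization, not taken from this chapter, describes a
first-order polar pattern as sensitivity = a + (1 - a) cos(theta), where theta
is the angle of arrival and a is the omnidirectional share of the pattern.
The sketch below uses that assumed model, with assumed values of a, to
tabulate sensitivity in decibels relative to on-axis. Real microphones
deviate from it, especially at high frequencies.

```python
import math

# Assumed omnidirectional share `a` for idealized first-order patterns.
PATTERNS = {"omnidirectional": 1.0, "cardioid": 0.5, "bidirectional": 0.0}

def sensitivity(a, angle_deg):
    """Relative sensitivity of the idealized pattern a + (1 - a) * cos(theta)."""
    return a + (1.0 - a) * math.cos(math.radians(angle_deg))

def sensitivity_db(a, angle_deg, floor_db=-60.0):
    """Level relative to on-axis (0 degrees), limited to floor_db at nulls."""
    magnitude = abs(sensitivity(a, angle_deg))
    if magnitude < 10 ** (floor_db / 20.0):
        return floor_db
    return 20.0 * math.log10(magnitude)

for name, a in PATTERNS.items():
    row = [round(sensitivity_db(a, angle), 1) for angle in (0, 90, 180)]
    print(f"{name:16s} 0/90/180 degrees: {row} dB")
```

In this model the cardioid is about 6 dB down at the sides and has its null
at the rear, while the bidirectional pattern has its nulls at the sides.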
Figure 13-2 Polar pattern spheres and the microphone axis. [Side view and view from above; labels: on-axis, off-axis (angle of sound arrival), 90º, floor.]
Figure 13-3 Polar pattern sensitivity at various frequencies. [Polar plot marked every 30º; curves for 125 Hz, 1000 Hz, 4000 Hz, 8000 Hz, and 12,500 Hz.]
Sounds directly in front of the microphone diaphragm are considered to be
on-axis. Sounds deviating from this 0º point on the polar curve are
considered off-axis and are plotted in relation to the on-axis reference level. The
frequency response of most microphones will vary markedly to sounds at
different angles. Even microphones that show no pronounced frequency
areas of accentuation or attenuation (flat frequency response) on-axis will
show an altered frequency response at the sides and the back of the polar
pattern (Figures 13-3 and 13-4).
Figure 13-4 Frequency response of a hypothetical cardioid microphone on-axis (0º) and off-axis 180º and 90º. [Axes: frequency in hertz, 20 Hz to 20 kHz; response in decibels, 0 to -30 dB.]
These variations in frequency response at different angles are commonly
called off-axis coloration. This coloration alters the sound source’s spectrum
just as on-axis frequency response accentuates and attenuates specific
frequency bands. These changes are more pronounced at or towards the
attenuated angles of the microphone patterns. In the intermediate angles between
directly on-axis and the dead areas of directional patterns, a slight change
in the angle/direction of the microphone can make a substantial difference
in the frequency response of the captured sound source. Frequencies above
4 kHz are usually most dramatically altered by slight angle changes.
The amount of off-axis coloration is an important measure of the micro-
phone’s suitability to a variety of situations. This is especially pronounced
in stereo microphone-array recording techniques. Instruments at the edges
of the array’s pick-up pattern and the reverberant sound of the hall will
arrive at the array mostly from angles that are off-axis. The sound qualities
of those instruments and of the reverberant energy may be altered
significantly by the off-axis coloration. Off-axis coloration has the potential to
have a profound impact on the sound qualities of the recording.
Transient Response
Microphones do not immediately track the waveform of a sound source. A
certain amount of time is required before the applied energy is transferred
into movement of the microphone’s diaphragm. Also, a certain amount of
acoustic energy must be applied to initiate this movement, and some of this
energy (which is the sound wave of the sound source) can be dissipated in
the action of getting the diaphragm moving. Diaphragm mass inhibits
transient response, making condenser and ribbon microphones more accurate
than even the highest quality dynamic microphones.
Thus, microphones have different response times before they will begin to
accurately track the waveform of the sound source. This transient response
time distorts the initial, transient portions of the sound’s timbre. Slow
transient response is most noticeable when the microphone is applied to a
sound source that has a fast initial attack time (in the dynamic envelope)
and a large amount of spectral energy during the onset.
Figure 13-5 is a comparison of the transient response of typical condens-
er and dynamic microphones. It shows the condenser microphone traces
the initial transient of the sound more accurately than the dynamic micro-
phone. The rest of the signal is reproduced at about the same accuracy by
either microphone, and is lower in frequency and amplitude. In this way,
transient response is often related to frequency response.
Figure 13-5 Transient response of typical condenser and dynamic microphones. [One trace per microphone; scale markings: 10 µV, 50 µs.]
Transient response is a microphone performance characteristic that is not
included in the manufacturer specification information that accompanies
promotional literature and owner’s manuals. Further, it is not calculated
by a standard of measurement. It exists as an alteration of the original
waveform’s initial attack time in both the dynamic envelope and the
spectral envelope.
Only the developed hearing of the recordist can be used to judge this
microphone characteristic. The recordist must become aware of differences
in timbre that are present in the early time field of the captured (recorded)
sound source, as compared with the live sound source. These differences
are identified through the critical-listening process and are vitally
important to the recordist. Recognizing and understanding how the transient
response is altering the source’s sound quality will allow the recordist to be
in control of capturing and shaping the sound as desired.
Two microphones with identical frequency-response curves may have very
different sound characteristics caused by different transient response times.
Distance Sensitivity
All microphones will respond differently to the same sound source, at the
same distance and angle. The ability of each microphone to capture the
detail of a source’s timbre will be different. Microphones will have
different sensitivities in relation to distance. The distance sensitivity of a
microphone is often influenced by the polar response of the microphone and/or
its transduction principle (condenser, moving coil, ribbon, etc.).
Microphones with directional patterns are often able to capture the timbral
detail of a sound source at a greater distance than an omnidirectional
microphone. This is primarily the result of the ratio of direct to indirect sound,
and a masking of timbre detail, but it may also be attributed to the
transduction principle, depending on the particular circumstances. Similarly,
condenser microphones will have a tendency towards greater distance sensitivity
than dynamic microphones, due to their more sensitive transfer of energy. Many
times a microphone with a small-sized diaphragm will have greater distance
sensitivity than a microphone with a larger diaphragm, all other factors being equal.
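The role of the direct-to-indirect ratio can be sketched with the idealized
first-order pattern a + (1 - a) cos(theta) in a perfectly diffuse reverberant
field. Both are assumptions of this example rather than claims of the text.
Under them, a pattern picks up the fraction a² + (1 - a)²/3 of the
reverberant energy that an omnidirectional microphone would, so it can be
placed farther from an on-axis source, by one over the square root of that
fraction, for the same direct-to-reverberant ratio. This multiplier is often
called the distance factor. It covers only the directional part of what the
text calls distance sensitivity, not the transduction principle.

```python
import math

def random_energy_efficiency(a):
    """Diffuse-field energy pickup relative to an omnidirectional microphone
    for the idealized pattern a + (1 - a) * cos(theta)."""
    return a ** 2 + (1.0 - a) ** 2 / 3.0

def distance_factor(a):
    """How many times farther than an omni the pattern can be placed for the
    same direct-to-reverberant ratio (on-axis source, diffuse reverberation)."""
    return 1.0 / math.sqrt(random_energy_efficiency(a))

# Assumed pattern coefficients; real microphones only approximate them.
for name, a in [("omnidirectional", 1.0), ("cardioid", 0.5),
                ("hypercardioid", 0.25), ("bidirectional", 0.0)]:
    print(f"{name:16s} distance factor ~ {distance_factor(a):.2f}")
```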
The concept of the distance sensitivity (sometimes called “reach”) of a
microphone is an important one. Recordists must also judge this micro-
phone characteristic through acute listening and experience. They must
become aware of differences in timbre detail that are present between the
miked sound source and the live sound source. Distance sensitivity is a
microphone performance characteristic that is not included in the
manufacturer-specification information, although it might be measured
scientifically if an appropriate scale were devised. Distance sensitivity is a
characteristic that must be learned from experience and must be anticipated for
the individual environment and recording conditions.
Microphone Placement
Many variables must be considered during the process of selecting and
placing microphones. The primary variables for microphone selection were
presented above. The variables of microphone placement will directly
influence the selection of a microphone, even after an initial selection has been
made. The placement of the microphone in relation to the sound source
and the performance environment will greatly influence the sound quality
of the recording. At times, this influence may be as great as the selection
of the microphone itself.
The recordist must consider the following when deciding on placing the
microphone in relation to the sound source:
How sound is produced by and how it radiates from the sound source
Distance relationships between the microphone, the sound source,
and the reflective surfaces of the recording environment (performance
space)
Distance of the microphone from the sound source to be recorded and
other sound sources in the performance space
Height and lateral position of the microphone in relation to the sound
source
Angle of the microphone’s axis to the sound source
Performance characteristics of the microphone selected, as altered by
the above four considerations
Microphone placement interacts with microphone performance
characteristics to create the recorded sound quality of the source. The distance of
the microphone will be largely determined by the distance sensitivity of the
microphone. The angle of the microphone will be largely determined by the
frequency response of the microphone in relation to its polar pattern, the
characteristics of how the sound source projects its sound, and the
characteristics of the environment. The height of the microphone is also a
result of the directional characteristics of the sound source (as instruments
radiate different spectral information in different directions), the
environment, and the microphone’s frequency response and distance sensitivity.
Controlling the Sound of the Performance Space
The sound characteristics of the studio or performance space can be captured
in recording the sound source. The recordist may or may not wish to include
the sound of this space as part of the sound quality of the sound source. In
either instance, the recordist must be in control of the indirect sound of the
environment arriving at the microphone from reflective surfaces.
Listen . . . to tracks 39-41 for differences in the sound quality of the three
recordings of the same performance; these differences are all due to
microphone selection (performance) and placement considerations.
The recordist may seek to record the sound of the sound source within its
performance environment. For this to happen, there must be a control of
the balance of direct and indirect sound. Through the selection and place-
ment of a microphone with a suitable polar pattern and distance sensitiv-
ity, the desired characteristics of the environment may be captured, in the
desired amount, along with the sound of the sound source. The distance
cues of the initial reflections in the early time field and the amount of timbre
detail will be evaluated and weighed against the ratio of direct to indirect
sound. Distance and angle of the microphone will be adjusted to achieve
the desired sound. When this technique is used, individual sources must be
well separated or recorded separately in order to be isolated.
The recordist’s objective may be to capture the sound source without the
cues of the environment. The sound source may be physically isolated (with
portable baffles, called gobos, or in isolation booths) from other, unwanted
sounds from the environment and other sources, or the leakage of
unwanted sounds to the recording microphone might be minimized with
microphone pattern selection and microphone placement. This is accomplished
through close miking techniques and direct inputs. It allows the flexibility
of being in complete control of the sound source, with environmental
characteristics later applied to the sound through signal processing.
Capturing the sound of the environment will be controlled by the relation-
ships of the microphone to the sound source and any other sound sources
that may be occurring simultaneously, including the sound of the environ-
ment itself.
Reflective Surfaces
The reflective surfaces of the environment (or of any object in the
environment) can cause the sound at the microphone to be unusable. Interference
problems may be created when the sound from a reflective surface and the
direct sound reach the microphone at comparable amplitudes. The slight
time delay between the two sounds will cause certain frequencies to be
out-of-phase (with cancellation of those frequencies) and certain
frequencies to be in-phase (with reinforcement of those frequencies).
The frequencies that will be accentuated and attenuated can be
determined by the difference of the distance between the reflective surface and
the microphone (D1), and the distance between the sound source and the
microphone (D2), in relationship to the speed of sound in air. The amount of
reinforcement and cancellation of certain frequencies that will occur when the two
signals are combined will be determined by their amplitudes. Constructive
and destructive interference are most pronounced when the two signals
are of equal amplitudes. As the difference in amplitude values between the
two signals becomes larger, the effect becomes less noticeable.
Figure 13-6 Distances between microphone, sound source and reflective surfaces.
The constructive and destructive interference of the combined signals
results in a frequency response with emphasized and attenuated
frequencies (peaks and dips). These peaks and dips of the frequency response curve
of the sound’s spectrum have been compared, in analogy, to the tines of a
comb. Thus, the term comb filter has been applied when a signal is
combined with itself, with a slight time delay between the two signals.
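The spacing and depth of these peaks and dips can be estimated with a short
calculation. This is a minimal sketch that assumes a single reflection, a
speed of sound of about 343 m/s, and a known extra path length (how much
farther the reflected sound travels than the direct sound). The function
names and the example values are illustrative.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C

def comb_filter_frequencies(extra_path_m, max_hz=20000.0):
    """Return (peaks, dips) in Hz for a signal summed with one delayed copy."""
    delay_s = extra_path_m / SPEED_OF_SOUND_M_S
    peaks, dips = [], []
    k = 0
    while True:
        dip = (2 * k + 1) / (2.0 * delay_s)   # half-cycle offsets cancel
        peak = (k + 1) / delay_s              # whole-cycle offsets reinforce
        if dip > max_hz:
            break
        dips.append(dip)
        if peak <= max_hz:
            peaks.append(peak)
        k += 1
    return peaks, dips

def peak_and_dip_db(reflection_ratio):
    """Peak and dip levels in dB for a reflection at `reflection_ratio`
    times the direct sound's amplitude (0 < ratio <= 1)."""
    peak_db = 20.0 * math.log10(1.0 + reflection_ratio)
    dip_db = (float("-inf") if reflection_ratio >= 1.0
              else 20.0 * math.log10(1.0 - reflection_ratio))
    return peak_db, dip_db

peaks, dips = comb_filter_frequencies(extra_path_m=0.343)  # about 1 ms delay
print("first dips (Hz):", [round(f) for f in dips[:3]])
print("first peaks (Hz):", [round(f) for f in peaks[:3]])
for ratio in (1.0, 0.5, 0.1):
    peak_db, dip_db = peak_and_dip_db(ratio)
    print(f"ratio {ratio}: peak {peak_db:.1f} dB, dip {dip_db:.1f} dB")
```

With equal amplitudes the dips are complete cancellations, and as the
reflection becomes weaker both the peaks and the dips shrink toward 0 dB,
which matches the observation that the effect becomes less noticeable.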
Microphone Positioning
The distance of the microphone to the sound source will alter the sound
quality of the recorded sound source. It will also be altered by the
distance of the microphone to other sound sources in the environment, along
with the height, lateral position, and angle of the microphone. These are
variables that may be used as creative elements in shaping sound
quality (and the musical material), in the same way as microphone selection.
These positioning variables will most significantly influence the following
aspects of sound:
Environmental characteristics (and secondary distance) cue: ratio of
direct to reflected sound
Environmental characteristics cue: early time-field information created
by initial reflections
Distance cue (definition or amount of timbral detail)
Capturing of the blend and shaping the overall quality of the source’s
timbre
The distance from the microphone to the sound source is the primary
factor that controls the ratio of direct to reflected (indirect) sound. As the
microphone is placed closer to the sound source, the proportion of direct
sound increases in relation to reflected sound.
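A rough numerical picture of this relationship can be sketched under two
simplifying assumptions that are not part of the text: the direct sound falls
by about 6 dB for each doubling of distance, and the reflected (reverberant)
level stays roughly constant throughout the room. The distance at which the
two levels match is a hypothetical value chosen for illustration.

```python
import math

def direct_to_reflected_db(mic_distance_m, equal_level_distance_m=2.0):
    """Direct-to-reflected ratio in dB under two simplifying assumptions:
    direct sound follows the inverse-distance law, and reflected sound has
    the same level everywhere in the room. `equal_level_distance_m` is the
    (hypothetical) distance at which the two levels match."""
    return 20.0 * math.log10(equal_level_distance_m / mic_distance_m)

for distance in (0.25, 0.5, 1.0, 2.0, 4.0):
    ratio = direct_to_reflected_db(distance)
    print(f"{distance:4.2f} m: direct is {ratio:+.1f} dB relative to reflected")
```

Each halving of the microphone distance raises the direct sound by about 6 dB
relative to the reflected sound in this model; a real room with nearby
reflective surfaces will depart from it.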
Reflective surfaces that are located near the sound source will cause
inconsistencies with this general rule. If at all possible, the distance from the
sound source to any reflective surface should be at least three times the
distance from the microphone to the sound source. The height and angle
of the microphone in relation to the sound source, coupled with its polar
pattern, may allow the microphone to keep from picking up the reflected
sound off surfaces close to the sound source.
Distance cues can be established by microphone placement when an audible
amount of sound from the recording environment is present in the source’s
sound quality. While this is not usually the case with close-miked sound
sources, many sound sources in multitrack projects are recorded from a
distance that will capture certain information from the recording environment.
All distant miking techniques will capture a significant amount of the
performance environment’s sound.
Recordings made with a moderate distance between the microphone
and the sound source will likely have prominent reflection information in
the early time field. The floor and objects immediately around the sound
source will create reflections that arrive at the microphone at a similar time
to the direct sound. While only a few reflections may be present in the final
sound, and the reflections may be of significantly lower amplitude than the
direct sound, they will impart important environment information. These
reflections can provide environmental cues that could distort a reverb
program applied to the sound.
Microphone placement location and performance characteristics will play
vital roles in establishing distance cues, through defining the amount of
timbre detail present in the recorded sound source. This is important to
remember, especially if placing a source in “proximity” is planned. This timbral detail
must be captured in the recording process, as it cannot be created later.
Generally, the recorded sound source will have greater timbral detail the
closer the microphone to the sound source and/or the more sensitive the
microphone in terms of distance sensitivity and transient response. This
provides a distance cue that may place the listener of the recorded sound
near the reproduced image of the recording, sometimes unnaturally near
the source. This level of timbral detail directly establishes the level of inti-
macy of the recording. The depth of sound stage will also be greatly deter-
mined through these cues.
Distance cues are often contradictory when a high degree of timbre detail
and a large amount of reverberant energy are present simultaneously. This
unrealistic sound may be desired for artistic purposes, and has been used
to great effect.
The sound quality of the sound source may potentially be altered by close
microphone placements. As above, these alterations can be used to cre-
ative advantage, or they can interfere with obtaining the desired sound
quality. The sound quality may exhibit the following alterations when the
source is close miked:
An increase of definition over known, naturally occurring degrees of
timbre detail
Changed spectral content of the sound source
An unnatural blend of the source’s timbre, caused when the source has
not had enough physical space to develop into its characteristic sound
quality
Noises from the performer or from the instrument
A cardioid or bidirectional microphone placed within two feet of a sound
source may have an altered frequency response, and thereby alter the
source’s spectral content. This altered frequency response of certain
microphones is the proximity effect. The response in the low-frequency range
rises relative to response in higher frequencies. Figure 13-7 demonstrates
the low-frequency boost of a sound source at 1 inch (dotted line) compared
to the source at 24 inches.
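The size of this low-frequency rise can be approximated with an idealized
model that is not part of the text: a point source radiating spherical waves,
and a microphone treated as a mix of pressure and pressure-gradient
components. For an on-axis source the model gives a boost of
|1 + g / (j k r)|, where k = 2πf / c, r is the distance, and g is the assumed
gradient share of the pattern (1 for bidirectional, 0.5 for cardioid). Real
microphones have finite dimensions and built-in low-frequency equalization,
so measured curves will differ from these figures.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air
INCH_M = 0.0254

def proximity_boost_db(freq_hz, distance_m, gradient_share=0.5):
    """Low-frequency rise in dB relative to a distant (plane-wave) source
    for an idealized microphone; gradient_share is 0.5 for a cardioid and
    1.0 for a bidirectional pattern in this simplified model."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND_M_S
    near_field_term = gradient_share / (k * distance_m)
    return 10.0 * math.log10(1.0 + near_field_term ** 2)

for freq in (50, 100, 200, 500, 1000, 5000):
    near = proximity_boost_db(freq, 1 * INCH_M)
    far = proximity_boost_db(freq, 24 * INCH_M)
    print(f"{freq:5d} Hz: {near:5.1f} dB at 1 inch, {far:4.1f} dB at 24 inches")
```

The model reproduces the general shape described here: a large rise in the
low frequencies at 1 inch, and a much smaller one at 24 inches.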
Figure 13-7 Altered frequency response of a sound source at 1 inch from microphone caused by the proximity effect. [Axes: frequency in hertz, 50 Hz to 10 kHz; relative response in decibels, -10 to +10 dB; curves for 1 inch and 24 inches.]
Performers and their instruments can create undesirable noises, causing
difficulties in obtaining a desirable sound.
Blend
The sound source and the way it radiates sound must be considered in
relation to the sound’s environment. All sound sources radiate sound in a
unique way. A polar pattern similar to a microphone’s is created as sources
radiate different frequency response curves in different directions.
Sound sources require physical space for the sound quality to develop and
coalesce. The sound quality of instruments and voices is a combination of
all of the sounds they produce. When a microphone is placed in a physical
location near a source, it can be within this critical distance necessary for the
sound to develop into its characteristic single sound wave. The sound source
might not have the opportunity to blend into its unique, overall sound
quality. The microphone might capture only a portion of the sound. This
resulting sound quality can be very different from how the sound source exists
in acoustic environments. It is important that the recordist be aware of this
space and in control of recording the desired blend of the source’s sound.
Generally sounds need several feet of space for their sound quality to form.
This distance is too far for a microphone placement that would isolate the
source from other sources performing at the same time. Close microphone
techniques change the sound quality of sources, sometimes markedly. This
can be used for great creative advantage, or to the detriment of the proj-
ect. Often several close microphones are used on a single instrument to
“manufacture” the blend of an instrument, and to maintain the isolation
of sound sources. Various parts of the source’s sound are blended by the
recordist instead of relying on the sound source’s natural blend. Pianos
are commonly recorded in this way. This concept can be applied to many
sound sources such as the enclosed CD’s cello and guitar tracks.
Stereo and Surround Microphone Techniques
Stereo and surround microphone arrays are composed of two or more
microphones (or diaphragm assemblies) in a systematic arrangement. They
are designed to record sound in such a way that upon playback (through
two channels or surround) a certain sense of the spatial relationships of the
sound sources present during the recorded performance is reproduced.
These techniques have the potential to accurately capture the sound quali-
ties of the live performance and of the hall, the performed balance of the
instruments of the ensemble, and the spatial relationships of the ensemble
(location and distance cues), with minimal alteration but with subtle and
unique qualities.
Many microphone techniques have been developed. Among the most com-
monly used stereo techniques are:
X-Y coincident techniques
Middle-side technique (M-S)
Blumlein X-Y (crossed figure-eights)
Near-coincident techniques (NOS and ORTF)
Spaced omnidirectional microphones
Spaced bidirectional microphones
Binaural system (artificial head)
Sound field and other specialized microphones and systems
Among common surround microphone techniques are:
Sound field microphone (coincident array)
Schoeps Spherical Array (near-coincident array)
Four coincident cardioids, coincident array
Four spaced cardioids, surround ambience microphone array
Sound Performance Lab Array, five multipatterned spaced microphones
Frontal arrays, plus ambience microphones
Stereo and surround microphone techniques are often used in recording
large ensembles, in large acoustic spaces, and from a rather distant place-
ment. The techniques are very powerful in their accuracy and flexibility, and
may be applied to either a single sound source or to any sized ensemble.
They may be used from a rather distant placement (perhaps 15 or more
meters, depending on the pertinent variables), to within about a meter of
the sound source. Close placements of stereo arrays are commonly applied
to drum sets, for example, sometimes supplemented with accent micro-
phones, sometimes not.
Stereo and surround microphone techniques can be thought of as a
preprocessing of the recorded sound. Preprocessing is the altering of the
sound source’s sound quality before it reaches the routing and mixing stages
of the recording chain. Microphone techniques will capture spatial
information and add it to the recording.
Microphone arrays can significantly alter the sound source in a number of
ways. All stereo and surround microphone techniques have their own unique
characteristics and their own inherent strengths and weaknesses. The inherent
sound qualities of the microphone arrays can be used to great advantage if
the recordist understands and is in control of their sound qualities.
The following artistic elements are created or captured, with varying char-
acters and realism, by the stereo and surround techniques listed above. By
evaluating the sound qualities of each of the various arrays, their unique
qualities can be learned.
Perceived listener-to-sound-stage distance
Amount and sound quality of the environmental characteristics of the
performance space
Perceived depth of the ensemble, sound source, and/or sound stage
Perceived width of the sound source or the sound stage
Definition and stability of phantom images and distance imaging
Musical balance of the sound sources in the ensemble
Sound qualities of the entire ensemble or of specific sources within the
ensemble
The microphone placement variables are also factors in microphone tech-
niques, as they determine where the array is to be placed. Placement of the
array will considerably impact the quality of sound, as will the microphones
selected for the microphone array, as they impart their own unique sound
characteristics, as described above. Matching specific microphones (type of
transducer or model/manufacturer) to the stereo or surround microphone
array is also important in ensuring the effectiveness of the array’s capturing
the desired sound qualities of the performance.
Listen . . . to tracks 50 and 51 for unaltered stereo microphone technique
recordings.
The aesthetic approach to the project will guide how the microphone tech-
nique is selected and used. Techniques can provide a clear documentation
of a performance, can enhance a performance, and can impart significant
and unique characteristics to a recording.
The recordist will often envision an ideal seat for the performance when
determining the placement of a microphone array. This type of placement
of the array will seek the sound qualities that are most desirable for the
particular ensemble, performing a specific piece of music in the
performance space. The recordist will seek to balance the hall sound with that
of the ensemble, capture an appropriate amount of timbre definition from
the ensemble, retain all performed dynamic relationships, and establish
desirable and stable spatial relationships in the sound stage. The balance of
the total energy of the direct sound and that of the reverberant sound will
be carefully considered in determining the placement of the microphone
array. The point where the two are equal has been identified as the critical
distance. The recordist will focus on this ratio of the reflected to direct
sound to identify a desired balance.
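As a rough planning aid, the critical distance can be estimated from the room volume and reverberation time. The sketch below is not part of the original text; it assumes the common Sabine-based rule of thumb d_c ≈ 0.057 × sqrt(Q × V / RT60), with distance in meters, volume in cubic meters, and reverberation time in seconds. The function name, the directivity factor Q, and the example hall are illustrative assumptions only.

```python
import math

def critical_distance_m(volume_m3, rt60_s, directivity_q=1.0):
    """Approximate distance (m) at which direct and reverberant energy are equal.

    Rule-of-thumb estimate (an assumption, not a measurement):
    d_c ~= 0.057 * sqrt(Q * V / RT60)
    """
    if volume_m3 <= 0 or rt60_s <= 0 or directivity_q <= 0:
        raise ValueError("volume, RT60 and Q must be positive")
    return 0.057 * math.sqrt(directivity_q * volume_m3 / rt60_s)

# Hypothetical hall: 12,000 cubic meters with a 2.0 s reverberation time
hall_estimate = critical_distance_m(12000.0, 2.0)
print(f"Estimated critical distance: {hall_estimate:.1f} m")
```

An estimate like this only suggests a starting point; the final array placement is still chosen by listening for the desired balance of direct and reflected sound.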
Accent microphones are often used to supplement stereo and surround
microphone techniques. These are microphones that are dedicated to cap-
turing a single sound source, or a small group of sound sources, within
the total ensemble being recorded by the array. The accent
microphones are placed much closer to the sound sources
than the array and may cause the recordist to consider
some of the close miking variables discussed above.
Accent microphones are most often used to complement
the array. They assist the overall array by bringing more
dynamic presence and timbre defi nition to certain sound
sources in the ensemble, and they allow the recordist
some control over the musical balance of the ensemble.
Accent microphones also create noticeable time differenc-
es between the arrival of the sound source(s) at the stereo
array, and the arrival of the sound source(s) at the accent
microphone.
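The size of this time difference follows directly from the extra distance the sound travels to reach the array. The following is a minimal sketch under stated assumptions: a speed of sound of 343 m/s (dry air at roughly 20 °C), a hypothetical function name, and invented example distances.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for dry air at about 20 degrees C

def arrival_delay_ms(source_to_array_m, source_to_accent_m):
    """Time (ms) by which the array signal arrives after the accent microphone signal."""
    extra_path_m = source_to_array_m - source_to_accent_m
    return 1000.0 * extra_path_m / SPEED_OF_SOUND_M_S

# Hypothetical setup: array 6 m from a soloist, accent microphone 0.5 m away
delay = arrival_delay_ms(6.0, 0.5)
print(f"Array arrival lags the accent microphone by {delay:.1f} ms")
```

Some recordists delay the accent microphone by a comparable amount so its signal aligns with the array; whether to do so is an aesthetic decision rather than a rule.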
Adding accent microphones will diminish the realistic
sound qualities of the microphone technique. The sound relationships of
the performance will be altered by the dynamics, sound quality, and dis-
tance cues added by the accent microphone(s). The microphone array’s
ability to accurately capture the performance will be diminished by using
accent microphones to alter the sound present in the performance hall.
Listen . . . to tracks 50 and 37 for an accent microphone added to a stereo
microphone technique: track 50 is an ORTF recording of drum set, track 37
adds an accent microphone to the ORTF recording of track 50.
Equipment Selection: Application of Inherent Sound Quality
All recording/reproduction devices impart a unique sonic imprint on the
sound. As was discussed with microphones above, individual recorders,
mixing consoles, signal processors, analog-to-digital converters, and any
other devices in the signal chain all have unique sound qualities, which are
the result of their performance characteristics. This extends to all software
that acts upon sound such as functions within DAWs, plug-ins, etc. They
all modify the original signal; some do this almost imperceptibly, and oth-
ers quite profoundly. Any individual device will be evaluated for how it
transforms the frequency, amplitude, and time components of the original
signal. The modifications of the original signal caused by the basic perfor-
mance characteristics of a device (real or virtual) create its inherent sound
quality. Further, different technologies have distinct sound characteristics,
or the potential to display or produce certain characteristics.
In defining the materials of the project, the specific pieces of recording
equipment to realize the desired sound qualities of the individual sound
sources and the project’s overall sound need to be determined. The record-
ist can approach this problem directly by evaluating and understanding the
inherent sound characteristics of the available individual devices, and the
inherent sound characteristics of the technologies of those devices, against
the unique needs of the individual project. A compatible match will be
sought for the device, technology, and sound source to arrive at a desired
sound quality.
Just as with evaluating microphone characteristics, the recordist’s focus
must shift between critical and analytical listening purposes at a variety
of perspectives in order to evaluate the sound qualities of devices. Careful
evaluation out of context of a recording will allow traits to be evaluated
most easily. This process can be applied to evaluating the sonic differences
between any technologies, devices, or software available for the project.
Technology Selection: Analog versus Digital
The debate over the superiority of analog and digital devices and their
inherent sound qualities, and other characteristics, continues to linger.
Decades of debate and technological advancements have shown both for-
mats can produce stunning recordings, with some unique qualities and
with many similar qualities, both positive and negative.
Digital recording, processing, and editing equipment is not necessar-
ily “better” than analog equipment, nor is the opposite true. Both have
great potential for artistic expression, and both can generate recordings of
impeccable quality. No technology is inherently better-suited than anoth-
er for generating, capturing, shaping, mixing, processing, combining, or
recording sound. Some devices may have functional features that are more
attractive to an individual recordist, and some people develop personal
sound preferences. These are, however, matters of taste. Technologies are
simply different in how they sound, how they retain the original signal
characteristics, and how they alter the sound source.
Analog technology has certain inherent sound characteristics. Digital tech-
nology has certain inherent sound characteristics. The characteristics of one
technology may or may not be appropriate for a particular project. Inherent
sound qualities are inherent sound deficiencies if they work against the
sound quality the recordist is trying to obtain. Inherent sound qualities are
desirable if they produce the sound quality the recordist is looking for.
It is difficult to make generalizations as to the characteristics of analog ver-
sus digital technology. The sound qualities of both technologies vary widely
depending on the particular unit and the integrity of the audio signal within
the particular devices. An 8-bit digital system is significantly less accurate
and flexible than a 32-bit system. A consumer-model analog system is sig-
nificantly noisier and less accurate than a professional recorder with high-
end noise reduction.
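The difference between word lengths can be put in rough numbers. The sketch below is an illustration added here, not the author's method: it assumes ideal linear (fixed-point) PCM with a full-scale sine wave, for which the theoretical signal-to-quantization-noise ratio is about 6.02 dB per bit plus 1.76 dB. Many 32-bit systems use floating-point formats, which behave differently, and real converters fall short of these ideal figures.

```python
import math

def ideal_pcm_dynamic_range_db(bits):
    """Theoretical signal-to-quantization-noise ratio of ideal linear PCM.

    Assumes a full-scale sine wave and uniform quantization:
    SNR = 20*log10(2**bits) + 10*log10(1.5), roughly 6.02*bits + 1.76 dB.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    return 20.0 * math.log10(2.0 ** bits) + 10.0 * math.log10(1.5)

for bits in (8, 16, 24, 32):
    print(f"{bits}-bit: {ideal_pcm_dynamic_range_db(bits):.1f} dB")
```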
Differences often exist between the two technologies in:
Accurately tracking the shape of the waveform (especially the initial
transients of the sound wave, in both technologies)
Processing all frequencies equally well (especially frequency response
linearity in analog or quantization issues in digital)
Storing the waveform without distortion from the medium (especially
tape noise floor or A/D and D/A conversion accuracy)
Altering the waveform in precise increments and precisely repeating
these functions (a measure of signal processors)
Performing repeated playing, successive generations of copying, and
long-term storage with minimal signal degradation (a measure of
recording formats)
Noise and distortion added to the signal
Many other, more subtle, differences exist, especially between specific
devices of each technology.
Selecting Devices and Models
No recording device is inherently “better” than another similar device.
While certain devices might be more flexible than others, and certain
devices are certainly of higher technical quality than others, the primary
artistic concern for selecting a piece of equipment is its suitability to the
particular needs of the project, at a given point in time.
The advantages of any particular device in one application may be a dis-
advantage in another. The sound quality of one device may be appropriate
to one musical context, and not to another. The measure of the device will
be in how its inherent sound qualities can be used to obtain the desired
sound qualities of the recording. Pieces of recording equipment should be
evaluated for their sound qualities, and their potential usefulness in com-
municating the artistic message in the piece of music. This evaluation is
performed through a critical-listening process similar to that used to evalu-
ate microphone performance characteristics.
Recording equipment (including computers) and software are tools. The
tools may be applied to any task, with consistent results. The recordist
needs to decide if the particular tool (device or software) is the appropriate
one to craft the sound quality that is required of the particular project.
Musicians often carry with them a number of musical instruments. They
will use a different model of the same instrument (perhaps made by a dif-
ferent manufacturer) to obtain a different sound quality of their perfor-
mance, depending on what is required by the musical material. The record-
ist should recognize this is similar to their situation.
In selecting recording equipment, the recordist is, in essence, selecting a
musical instrument. The sound quality of sound sources, or of the entire
recording, may be markedly transformed by the piece of equipment, while
the sound is under the control of the recordist. This is the way a traditional
musical instrument is applied by a traditional performer.
Recordists will develop sound-quality preferences and working preferenc-
es for particular pieces of equipment, and perhaps for a particular technol-
ogy. Developing such preferences may or may not be artistically healthy.
The recordist may become inclined to consider a certain technology to be
“better” than another simply because it is the one they are most familiar
with, not because it is the one that is most appropriate for the project. Per-
sonal preferences (or personal experiences) might become confused with
the actual quality or usefulness of a device or a technology.
In contrast, one’s own “sound” can result by developing one’s own produc-
tion preferences and sound quality preferences. Equipment selection will
contribute much to this when a recordist has strong preferences to use cer-
tain devices; how the recordist shapes the artistic elements will also play
a significant role. A person’s own sound is developed over time, and after
considerable experience. How a recordist handles the mix process and the
final dimensions of the recording ultimately define their “sound.”
In summary, audio devices have inherent sound qualities that are deter-
mined by technology and the device’s unique performance characteristics.
The recordist will be using these devices to shape the sound of the music
recording. Learning the inherent sound qualities of as many devices as
possible will increase the creative tools of the recordist, and provide them
with more options in obtaining the sound they want. These devices are the
musical instruments that are used to craft the mix, the recording.
Monitoring: The Sound Quality of Playback
During these preliminary stages, the recordist must decide on how the record-
ing process will be monitored. Monitoring most often takes place in a record-
ing control room—whether a commercial facility, a home studio, a closet at a
remote location, or something else. All of the sounds and relationships of the
project are presented to the recordist through the playback system.
The monitor system is much more than a pair of loudspeakers. In terms of
hardware, it includes power amplifier(s), loudspeaker drivers, crossover
networks, loudspeaker enclosures and even connector cables. The monitor
system also encompasses the listening room itself; the placement of the
loudspeakers in the room, and the interactions of the room and the loud-
speakers become part of the sound quality of the monitor system.
The monitor system has the potential to transform all sound qualities and
relationships in the recording. The system must not considerably alter the
original signal, or at the very least the recordist must know how the sound
is being altered. The monitor system needs to accurately reproduce the
spatial qualities and the frequency, amplitude and time information of the
recording.
If recordists are to be in control of the recording process, they must be able
to evaluate the recording itself, not the recording modified by the monitor
system. The recording will only have the same qualities in another (neutral)
listening environment if the sound quality is not originally altered through
the recordist’s system.
The most desirable monitoring system is transparent. The playback system
reproduces sound while interacting with the control room acoustics with-
out altering the quality of sound. In actual practice this is nearly impos-
sible. Still, sound alterations caused by monitor systems can be minimized.
High-quality audio systems can be assembled and appropriately located in
suitable environments to provide excellent sound reproduction.
The following considerations must be factored into establishing an accu-
rate monitor system:
High-quality playback system
Loudspeaker and control room interaction
Effective listening zone
Sound fields: direct/near field versus room monitoring
Monitoring levels
High-Quality Playback System
The playback system for accurate monitoring will be of high quality and
carefully engineered. It is designed to provide unaltered, detailed sound,
while working within the acoustic characteristics of a control room or simi-
lar environment. It is inherently different from systems intended for home
use (even high-end systems intended for the audiophile market).
Most consumer playback systems seek to blend sound in pleasing ways and
alter the sound with characteristics consumers (or certain segments of the
general public) might enjoy; they often seek to smooth out imperfections
in recordings and remove timbral detail that may be startling to listeners.
The purpose and function of a pro audio playback system is very different.
It seeks to provide the recordist with extreme clarity and detail of sound,
minimizing any added fusion of sound elements. While it is common for
some recordists to use one or several sets of consumer type loudspeakers
(sometimes in and often outside the studio) to obtain an idea of how their
project might sound over home-quality playback systems, listening during
the recording process requires great attention to subtleties of sound that are
often only apparent over speakers designed to deliver great sonic detail.
All components in a high-quality playback system must perform at a high
level and have qualities that complement the other components of the sys-
tem. From the input of the power amplifier(s) to the output of the loud-
speaker drivers, the signal should undergo minimal alteration, distortion
and added noise.
Loudspeaker systems comprise multiple drivers: at a minimum,
a low-frequency woofer and a high-frequency tweeter, and often addition-
al mid-range driver(s) and/or a subwoofer. These drivers are housed in a
loudspeaker enclosure that will impart sound qualities onto the reproduced
sound, as well as impact the efficiency of the system. The subwoofer is the
exception, as it will have its own enclosure and will reproduce the lowest
frequencies of all channels. The drivers are fed a signal in the frequency
range of their optimal performance by a crossover network.
A crossover network is a series of filters that produce the required fre-
quency-limited signals for each driver. The network may be inserted into
the signal chain after the power amplifier; this is a passive network that
filters the input signal from the power amplifier into appropriate frequency
bands at its outputs. Active crossover networks are inserted into the signal
chain before the power amplifier; since the total frequency range is divid-
ed before the amplifier, a separate amplifier channel is required for each
driver. A biamplified system contains a two-driver loudspeaker, two power
amplifiers (or a two-channel amplifier) and an active crossover network.
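To illustrate how a crossover divides the frequency range, the sketch below computes the level each driver would receive from an idealized two-way filter pair. It assumes Butterworth-style magnitude responses, a second-order slope, and a 2 kHz crossover frequency; these choices, along with the function name, are hypothetical and do not describe any particular loudspeaker.

```python
import math

def butterworth_gain_db(freq_hz, crossover_hz, order=2, band="low"):
    """Magnitude (dB) of an ideal Butterworth low-pass or high-pass filter.

    band="low" feeds the woofer; band="high" feeds the tweeter.
    """
    if freq_hz <= 0 or crossover_hz <= 0:
        raise ValueError("frequencies must be positive")
    ratio = freq_hz / crossover_hz
    if band == "high":
        ratio = 1.0 / ratio  # the high-pass response mirrors the low-pass response
    magnitude = 1.0 / math.sqrt(1.0 + ratio ** (2 * order))
    return 20.0 * math.log10(magnitude)

# Hypothetical two-way system crossed over at 2 kHz
for f in (500, 1000, 2000, 4000, 8000):
    woofer = butterworth_gain_db(f, 2000.0, band="low")
    tweeter = butterworth_gain_db(f, 2000.0, band="high")
    print(f"{f:>5} Hz  woofer {woofer:6.1f} dB  tweeter {tweeter:6.1f} dB")
```

Both bands are about 3 dB down at the crossover frequency in this idealized case, which is one reason the region around the crossover deserves careful listening.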
The qualities of the wires (cables) that connect devices may alter properties
of the signals they carry, a topic of some passionate debate. This seems
especially noticeable with the cables between the power amplifier and the
loudspeaker.
Detailed information on performance specifications of all of these devices,
and the delicate science of matching components, is well outside the scope
of this writing. The reader is encouraged to explore this material, and also
to listen carefully to many different playback systems to learn their many
characteristics—preferably when using the same or similar recordings, and
placing the system in the same location(s) in the same room.
A high-quality playback system should meet the following criteria:
Efficient transfer of power between components
Exhibit no more than ±1.5 dB deviation at all frequencies from 40 Hz up to 17 kHz
Even lateral loudspeaker dispersion at all frequencies to cover the
effective listening zone equally with the same frequency response
Meet quality performance levels in dynamic range, power supply and
output noise, frequency response, slew rate, damping factor, total har-
monic distortion (THD), intermodulation distortion (IMD), transient
intermodulation distortion (TIM), time coherence
Noise floor of the system in relation to the acoustic noise floor of the
listening room
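The frequency-response criterion in the list above lends itself to a simple check once a system has been measured. The following sketch assumes the measurement is available as deviations in dB from a reference level at a handful of frequencies; the function name and the sample data are invented for illustration.

```python
def out_of_tolerance(response_db, low_hz=40.0, high_hz=17000.0, tolerance_db=1.5):
    """Return the in-band frequencies whose deviation exceeds the tolerance."""
    return [freq for freq, deviation in sorted(response_db.items())
            if low_hz <= freq <= high_hz and abs(deviation) > tolerance_db]

# Hypothetical measurement: frequency (Hz) -> deviation (dB) from reference level
measured = {30: -4.0, 40: -1.2, 100: 0.5, 1000: 0.0,
            10000: 0.8, 16000: -1.9, 20000: -3.0}

failures = out_of_tolerance(measured)
print("Frequencies outside the ±1.5 dB window:", failures)
```

Deviations outside the 40 Hz to 17 kHz band are ignored here, matching the range stated in the criterion.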
The system should produce sound with a high level of clarity and detail,
strong time/phase coherence throughout the listening range (especially
around the crossover frequencies), accurate tracking of dynamic changes
(especially for high-pitched sounds with fast attacks), and a flat frequency
response (especially in the lower octaves and above 10 kHz). The moni-
tors should supply stable imaging and not draw the listener’s attention
to the loudspeaker locations. It is important for the recordist to be able to
identify if alterations to sound quality are a product of the components of
the playback system, or if they are being created by the interaction of the
loudspeakers and the listening room.
Loudspeaker and Control Room Interaction
The control room itself can alter the sound that is heard. The acoustics of
the control room can cause radical changes in the frequency response and
time information of the sound. Ideally, any listening room would have a
constant acoustical absorption over the operating range of the loudspeak-
ers, and would appropriately diffuse the sound from the loudspeakers to
create a desirable blend of direct and reflected sound at the mixing/listen-
ing position.
Nonparallel walls; nonparallel floor and ceiling; acoustical treatments to
absorb, reflect and diffuse sound where needed; careful selection, place-
ment and installation of loudspeakers; and a sufficient volume (dimensions
of the room) will minimize the influence of the control room on the sound
coming from the loudspeakers.
The room should absorb and reflect all frequencies equally well, and should
produce very short decay times that are at substantially lower levels than
the direct sound from the loudspeakers. The room should not produce res-
onance frequencies and should not produce reflections that arrive at the
listening position at a similar amplitude as, or within a small time window
(2–5 ms) of, the direct sound.
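It can help to translate this time window into physical distance. The sketch below converts a delay into the extra path length a reflection must travel relative to the direct sound; it assumes a speed of sound of 343 m/s, and the function name is hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for dry air at about 20 degrees C

def extra_path_length_m(delay_ms):
    """Extra distance (m) a reflection travels to arrive delay_ms after the direct sound."""
    if delay_ms < 0:
        raise ValueError("delay must not be negative")
    return SPEED_OF_SOUND_M_S * delay_ms / 1000.0

for delay_ms in (2.0, 5.0):
    print(f"{delay_ms:.0f} ms corresponds to about {extra_path_length_m(delay_ms):.2f} m of extra path")
```

Under these assumptions, surfaces that add up to roughly two meters to the path between loudspeaker and listener (a console top or a nearby sidewall, for example) are the ones most likely to produce reflections in this window.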
As needed, rooms may be tuned for uniform amplitudes of all frequencies
by room equalizers, and tuned for the control of reflections (time) with
diffusers, traps, and sound absorption materials. The ideal control room
would include very specific acoustical treatments in such a way that mini-
mal room equalization is needed.
Studio designers have widely divergent opinions on the most desirable
acoustical properties of control rooms. Conflicting information and opin-
ions are common. The objective of all designers is very similar, however.
It is to establish a listening environment where sound can be accurately
reproduced.
Figure 13-8 Loudspeakers as part of the control room and the effective listening zone. [Diagram labels: control room walls; 60º angles; equal distances; points equidistant from each speaker; possible effective listening zone; accurate imaging; exaggerated imaging; diminished imaging.]
Loudspeaker placement and performance specifications are factored into
the design of control rooms. One common design approach dictates that
the loudspeaker should be a part of the wall, mounted within the wall
itself so the front of the loudspeaker is flush with the face of the wall. This
negates the usual boost of low frequencies that results when a loudspeaker
is placed near walls, ceilings, and floors (and especially in corners).
Loudspeakers can also be freestanding within the room in a direct field to
the listener. Direct field loudspeaker placement away from sidewalls by
4 feet and from the front wall by at least 3 feet will minimize the boost of low
frequencies (that occurs when the omnidirectional low frequencies reflect
off the wall surfaces and combine with the direct sound from the speaker).
The loudspeakers should be aligned on the same vertical plane, usually
at ear height for a seated person or slightly higher. It is important that
the meter bridge of the console, or any other object such as a computer
monitor, not be in the path between the loudspeakers and the mix position.
Strong reflections off the console, tabletop, etc. also must be minimized.
The loudspeakers should be aligned symmetrically with the sidewalls.
The effective listening zone is an area in the control room where the repro-
duced sound can be accurately perceived. In a control room, the mix posi-
tion and/or the producer’s seat are located in the effective listening zone.
In most control rooms the size of the effective listening zone is quite small.
It is an area that is equidistant from the two loudspeakers in stereo and all
loudspeakers in surround. The area is located at roughly the same distance
from each speaker as the speakers are from each other. Angling loudspeak-
ers correctly provides optimal imaging in the effective listening zone, given
complementary room acoustics. The reader should review stereo and sur-
round loudspeaker positioning described earlier.
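The geometry described here is an equilateral triangle, so the listening position follows from the loudspeaker spacing alone. The sketch below is illustrative only: it assumes an exact equilateral layout in stereo, and the function name and the 4-foot example spacing are hypothetical.

```python
import math

def listening_position(speaker_spacing):
    """Return (depth, toe_in_degrees) for an equilateral stereo layout.

    depth: distance from the line joining the loudspeakers to the listener,
           in the same unit as speaker_spacing.
    toe_in_degrees: angle each loudspeaker turns inward from straight ahead
                    to aim at the listener.
    """
    if speaker_spacing <= 0:
        raise ValueError("spacing must be positive")
    depth = speaker_spacing * math.sqrt(3.0) / 2.0
    toe_in_degrees = math.degrees(math.atan2(speaker_spacing / 2.0, depth))
    return depth, toe_in_degrees

# Hypothetical near-field setup with loudspeakers 4 feet apart
depth_ft, toe_in = listening_position(4.0)
print(f"Sit about {depth_ft:.2f} ft back from the speaker line; toe-in about {toe_in:.0f} degrees")
```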
For accurate spatial perception, it is necessary for the effective listening
zone to be carefully evaluated. The listener must be seated in the proper
location, and the volume level must be the same at each speaker when
identical signals are applied to each channel. Moving the listening location
closer to the speaker array exaggerates imaging and moving away from
the loudspeakers diminishes imaging relationships. Rooms are built and/or
speakers are placed so few strong reflections arrive at this area. The control
room should be virtually transparent (add or subtract no characteristics to
the sound) in the listening zone.
Near-Field, Direct-Field and Room Monitoring
Loudspeakers can be incorporated into the structure of the control room,
as described above. These room monitors utilize the acoustics of the
room to their advantage, to complement their performance. Loudspeak-
ers are also found freestanding and on desktops in control rooms.
Near-field monitoring seeks to eliminate the influence of the control room on
the performance of the loudspeakers, and direct-field monitoring seeks to
minimize this influence. Near-field monitors are typically two-way speak-
ers with dome-shaped tweeters and 6-inch woofers, and are designed to
be placed 3 to 4 feet from the listener. Direct-field monitors are typically
a bit larger (8-inch woofers are common) and are usually placed 4 to 5
feet from the mix seat (monitoring location) but can be somewhat farther
away depending on the listening room. Near-field monitoring, and the use
of freestanding loudspeakers at close distances (direct-field monitoring),
have become more common as small project studios have begun to domi-
nate the market. Quality, accurate and reasonably priced monitor systems
have made detailed and accurate reproduction widely accessible
to homes and small studios.
For direct- and near-field monitoring, speakers can be located on stands
in front of a DAW’s computer screen, seated on a meter bridge, or placed
elsewhere as appropriate. The speakers will be 3 to 5 feet apart, and the
listener will be at the same distance from the speakers (Figure 13-9). Sound
is heard near the speakers, before the acoustics of the room can alter it.
The goal is for the reflected sound in the control room to have little or no
impact on what is heard in direct- and near-field monitoring because of the
listener’s close position to the loudspeakers. A room with highly reverber-
ant characteristics, such as highly reflective parallel walls and other sur-
faces, may still alter sound in direct- and near-field monitoring.
Figure 13-9 Freestanding direct-field and near-field loudspeaker relationships to the listening environment and the listener location. [Diagram labels: 60º angles; front wall, 3 feet (minimum); side walls, 4 feet (minimum) each; 5 feet (approximate maximum).]
Since the control room has very little influence on the sound quality of
near- and direct-field monitoring, this approach is often the exclusive moni-
toring system of control rooms that have poor room acoustics. In this case,
high-quality monitors specifically designed for professional direct/near-
field monitoring are used. These speakers are preferred over the large stu-
dio monitors by some recordists, especially during long sessions when the
more intense energy of (some) large speakers can fatigue the ear.
Often problems with recordings can be more easily identified using high-
quality near-field monitors rather than monitors that interact with the room.
This is especially true of rooms with decay times above 0.25 seconds. Near-
field monitors can be used for quality control to more readily identify sub-
tle issues and noises in recordings and to facilitate editing, because subtle
details are often perceived as clearer when room sounds are not perceived.
The ambience of the room does not pull detail from the sound and the lis-
tener is able to more easily focus attention on a nearer source.
The size of the effective listening zone for direct-field and near-field moni-
toring is very small. The listener must be precisely centered between the
two loudspeakers, and the listener cannot be farther from the two
loudspeakers than they are from one another, or the effects of the room
will immediately come into play. The listener should be located at or very
near the same distance from each loudspeaker as the two loudspeakers
are from each other.
It is not unusual for recordists to periodically switch back and forth between
room and near-field monitors. This allows the recordist to evaluate the
sound qualities and relationships of the recording from different listening
locations and with loudspeakers that have different sound qualities. The
recordist will be attempting to create a consistency between the sounds
of the two monitor systems inasmuch as this is possible. The speakers are
usually switchable at the mixing position, with each pair having a dedicated
amplifi er(s), set to match monitoring level while switching speakers.
Bookshelf-type consumer loudspeakers are sometimes used in the studio
in a near-field setting. This provides a reference to a consumer playback
environment, and can represent the sound of typical, moderately priced
speaker systems. They often tend to have a narrow frequency response,
de-emphasizing high- and low-frequency areas, and diminished detail.
Monitor Levels
The sound pressure level (SPL) at which the recordist listens will influence
the sound of the project. Humans do not hear frequency equally well at all
amplitudes, as discussed in Chapter 1. A recording created at a monitor level
of 100 dB (SPL) will sound considerably different when it is played back
at the common home-listening monitoring level of 75 dB or lower. The bass
line that was present during the mixdown session will not be as prominent
in playback at the lower level. Similarly, the tracks that were recorded at
105 dB during a tracking session (because someone in the control room
wanted to “feel” the sound) will have quite different spectral content when
they are heard the next day at 85 dB, during the mixdown session.
The recordist will need to develop consistency of listening levels. Ideally,
monitoring levels should be reasonably consistent throughout a project—
tracking, mixing, editing, and mastering. Calibrating your monitor system
for 85 dB SPL at the monitoring location to match 0 VU reference will assist
this greatly.
The most desirable range for monitoring is 85 to 90 dB SPL; a nominal lis-
tening level of around 85 dB with peaks reaching to 90 dB (or very slightly
beyond) is a fairly loud home-listening level, but it is a level that can be
sustained throughout a work day. Recordings made while monitored in
this range will exhibit minimal changes in frequency responses (5 dB at
the extremes of the hearing range) when played back as low as 60 dB SPL.
Monitoring at this level will also do much to minimize listening fatigue dur-
ing prolonged listening periods (recording and mixing sessions).
The possibility of hearing damage is a very real job hazard for recordists.
If the recordist primarily listens in the 85–90 dB range, they will have accu-
rate hearing much longer than if they consistently monitor at a level 10–15
dB higher—they may very likely also have a longer career.
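
To give a rough sense of why an extra 10–15 dB matters, the sketch below applies one widely used occupational guideline, the NIOSH recommended exposure limit (85 dBA for 8 hours, halving the allowed time for every 3 dB increase). This guideline is not cited in the text and is brought in here only as an illustration; it assumes A-weighted levels, which the text does not specify, and the function name is hypothetical.

```python
def niosh_allowed_hours(level_dba, criterion_db=85.0, exchange_db=3.0,
                        base_hours=8.0):
    """Approximate daily exposure time under the NIOSH guideline.

    Assumption: 85 dBA is allowed for 8 hours, and the allowed time is
    halved for every 3 dB above that level.
    """
    return base_hours / 2 ** ((level_dba - criterion_db) / exchange_db)

# Compare the recommended monitoring range with a level 15 dB higher.
for level in (85, 88, 100):
    print(level, "dBA:", niosh_allowed_hours(level), "hours")
```

Under these assumptions, a level 15 dB above 85 dBA reduces the guideline exposure time from a full work day to a fraction of an hour.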
Loudness is our perception of the physical sensation of amplitude. It is pos-
sible to become sensitive to amplitude, and therefore loudness, by atten-
tion to the physical sensation of sound pressure levels. It is unreasonable
to believe the listener might develop an ability to accurately identify a specific
SPL or a specific increment of change of level. It is, however, possible
to learn how a narrow range of SPL “feels,” and thereby learn to recognize
a safe and effective listening range. The reader is strongly encouraged to
work through the loudness perception exercise at the end of this chapter,
and to purchase an inexpensive sound-level meter and check listening lev-
els frequently.
Headphones
Headphones are sometimes used in recording processes for monitoring.
They are required for creating and listening to binaural recordings, which
minimize reliance on acoustic interaural signals. Headphones are often
used for remote recording and for projects that have the performers in the
same space as the recordist. In these instances, headphones may be the
only feasible monitoring option. Other than these situations, monitoring is
most accurately accomplished over loudspeakers, with few exceptions.
Headphones will distort spatial information, especially stereo imaging. The
listener will perceive sound within their head, instead of as occurring in front
of their location. This distorts depth of imaging. The interaural information
of the recording is not accurately perceived during headphone monitoring,
causing potentially pronounced image size and stereo location distortions.
Further, the frequency response of headphones will not match that of loud-
speakers and is inconsistent. Frequency response of headphones will vary
with the pressure of the headphones against the head and the resulting
nearness of the transducer to the ear. In listening to a recording over head-
phones, the recordist must also imagine what the recording will sound like
when played back over loudspeakers. Some recordists have developed this
skill very accurately.
High-quality headphones paired with quality headphone amplifiers can
prove useful. They can provide low distortion, broadband frequency
response, and excellent transient response and deliver exceptional timbral
clarity and detail. Headphone monitoring with such equipment can be very
useful for editing, especially when listening for subtle technical details. At
times the ambience of a control room can make processing artifacts, timing
errors, faint clicks, ticks and edit point issues very difficult to hear; such
sounds are often easier to perceive over headphones, where the speaker
is nearer the ear. Sonic detail and accuracy of spatial information remain
compromised, however.
The sound evaluations that are covered herein can only be accurately per-
formed over quality loudspeakers, in a transparent (or complementary) lis-
tening environment. Rarely is it preferable to monitor over headphones if
both options are available.
Summary
Before sounds are recorded, the recordist makes many decisions that
shape the recording project. Many of these choices will limit how the music
might be shaped later in the recording process and must be approached
with knowledge and sensitivity. Selecting and defining sound-source timbres,
capturing those sought qualities with effective microphone selection
and placement, crafting new timbres with synthesis, selecting recording
devices throughout the signal chain, and selecting monitoring systems and
conditions are all decisions that can profoundly impact any project.
When the entire project is considered at this planning stage, decisions in
many areas can be made that will not limit the recordist as the project
progresses. A clear sense of artistic direction will be established that will
allow the music’s potential to be realized and a high-quality recording to
be made.
These preliminary stages prepare for successful projects of quality that
make efficient and effective use of studio time. They also allow recordists
to be in control of their craft and shape the artistic qualities of music.
Exercises
Exercise 13-1
Loudness Perception Exercise
Humans experience a physical sensation from the amplitude of a sound wave
that is transferred into loudness. Excessive amplitude can cause pain in hu-
mans. Becoming sensitive to the physical sensation of listening at an appropri-
ate loudness level is equally possible, with practice, attention and diligence.
1. Purchase an inexpensive sound-level meter, and keep it in front of your
listening location for your monitor system. Set your monitoring level so
that the average SPL registered is between 80 and 85 dB SPL. It is accept-
able for peaks to hit 90 dB or slightly above and for soft passages to dip
below 80 dB by several dB.
2. Begin your practice by listening to recordings you know well or are study-
ing at this loudness level, checking the meter frequently to verify listening
levels. In the beginning, keep the meter on whenever you are listening.
Become aware of the physical sensation of your hearing created by listen-
ing at that level. You will begin to notice energy impacting the hearing
mechanism (inner ear) with your focused attention, over time. You will
begin to develop an increased sensitivity to the physical sensation of loud-
ness level.
3. Be consistent in listening at this level when working on your own projects.
Check your meter regularly. Over a few weeks (or a bit less or longer) of
being aware of the physical sensation of this loudness level region, you
will develop a memory of the sensation.
4. Listen to a recording you know well on your monitor system by starting
with the monitor level off and without the meter on. Bring up the level
gradually until you believe you have reached this average level of between
80 and 85 dB SPL. Check yourself with your meter after you believe you
have established this level. Repeat this exercise regularly (several times per
day, if possible or a minimum of once per day) until you begin to have
success in recognizing this level.
5. Turn next to listening in other environments you do not know as well,
perhaps your automobile, a home system, or a monitor system in another
location. You will notice you are not as accurate at first, but this accuracy
will increase as you become more aware of focusing on energy impacting
your hearing mechanism. In time this skill will carry over from listening
environment to listening environment, if you continue to try to establish
a correct listening level in a new location, and check your accuracy with
your meter.
Just as with pitch reference, this skill will take time and effort over an extended
period, and the skill will not become completely reliable. It will, however, improve
your listening skill considerably and serve as an important point of reference.
In time you will gain an awareness of when you are listening at an average level
of between 80 and 85 dB SPL. It is also possible that you will come to prefer
an average level toward either the high or low side of this range. Arriving at a
reliable sense of average loudness level will greatly assist one in maintaining a
stable sound quality of monitoring, as well as an accurate listening level.
Most importantly, you will become very aware of being in an environment
where the SPL extends well above 90 dB SPL. That higher level brings increased
pressure on the hearing mechanism that will be very apparent and uncomfortable.
This will trigger your awareness that hearing fatigue will happen quickly,
and a sense of urgency: if this level extends far above 90 dB, you are in
danger of damaging your hearing.
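
Meter readings noted during this exercise can be checked against the target range with a few lines of code. This is a minimal sketch only: the function name is hypothetical, the readings are taken as a simple arithmetic average of the values read from the meter, and the 90 dB peak limit is treated as a hard threshold even though the exercise allows peaks slightly above it.

```python
from statistics import mean

def check_listening_level(readings_db, avg_range=(80.0, 85.0), peak_limit=90.0):
    """Summarize a list of SPL meter readings (in dB SPL).

    Assumption: readings are averaged arithmetically, which is a rough
    stand-in for the 'average SPL registered' on an inexpensive meter.
    """
    avg = mean(readings_db)
    peak = max(readings_db)
    return {
        "average": avg,
        "peak": peak,
        "average_ok": avg_range[0] <= avg <= avg_range[1],
        "peak_ok": peak <= peak_limit,
    }

# Hypothetical readings jotted down during one listening session.
print(check_listening_level([82, 84, 79, 88, 83]))
```

Keeping a short log of guesses from step 4 alongside the meter value gives a simple record of how the skill improves over several weeks.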
Exercise 13-2
Identifying and Comparing Microphone Characteristics
The purpose of this exercise is to learn the special way a microphone will
transform sound. This is most easily accomplished by comparing several dif-
ferent microphones located in a very similar location while recording a single
performance.
1. Set up to record a single performance of a sound source you know well,
and that is easy for you to record accurately, given your setup and record-
ing space. Identify your preferred location to capture the sound of that
instrument or voice.
2. Place the 2, 3 or 4 microphones (on appropriate stands) as near the same
location as possible.
3. Assign each microphone its own track on a multitrack recorder, and ob-
tain suitable record levels.
4. Record the sound source performing pitches in the extremes of its range,
performing loudly and softly, short sounds and sustained sounds, sounds
with a fast attack and with slower attacks (of course this is not possible
on some instruments), and other material you might find helpful.
5. Establish playback levels where each track can be compared to all others
at the same loudness.
6. Listen carefully to the sound quality of each microphone to identify its
unique qualities. Listen to how the microphone captured the spectrum
and the dynamic envelope of the source, and how quickly it responded to
changes. Listen to different “blends” of the source’s timbre captured by
the individual microphones, and how quickly each microphone respond-
ed to changes to the source.
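
Step 5 asks for playback levels at which the tracks can be compared at the same loudness. One rough way to get a starting point is to match the RMS level of each track, as in the sketch below. RMS is only a crude proxy for perceived loudness, the function names are hypothetical, and samples are assumed to be non-silent floating-point values where 1.0 is full scale; final matching should still be done by ear.

```python
import math

def rms_dbfs(samples):
    """RMS level of a track in dB relative to full scale (1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms) if rms > 0 else float("-inf")

def matching_gains_db(tracks):
    """Gain (in dB) to apply to each track so all share the RMS level
    of the quietest one. Attenuating avoids pushing any track into clipping."""
    levels = {name: rms_dbfs(samples) for name, samples in tracks.items()}
    target = min(levels.values())
    return {name: target - level for name, level in levels.items()}

# Hypothetical microphone tracks of the same performance.
tracks = {
    "mic_a": [0.5, -0.5, 0.5, -0.5],
    "mic_b": [0.25, -0.25, 0.25, -0.25],
}
print(matching_gains_db(tracks))
```

Here the louder track is turned down by about 6 dB to meet the quieter one.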
Exercise 13-3
Comparing Microphone Placements
The purpose of this exercise is to learn how microphones have different sound
qualities when placed at different distances. This can be accomplished by
creating a recording of a single performance of the same microphone at a
number of locations. You will need several (2, 3 or 4) of the same microphone
make and model for this.
1. Set up to record a single performance of a sound source you know well,
and that is easy for you to record accurately, given your setup and record-
ing space. Identify your preferred locations to capture the sound of that
instrument or voice.
2. Place the 2, 3 or 4 microphones (on appropriate stands) at those loca-
tions. Take the time to calculate and write down the distances and angles
of the microphones to the source.
3. Assign each microphone its own track on a multitrack recorder, and ob-
tain suitable record levels.
4. Record the sound source performing pitches in the extremes of its range,
performing loudly and softly, short sounds and sustained sounds, sounds
with a fast attack and with slower attacks (of course this is not possible
on some instruments), and other material you might find helpful.
5. Establish playback levels where each track can be compared to all others
at the same loudness.
6. Listen carefully to the sound quality of each microphone to identify its
unique qualities. Listen to how the microphone captured the spectrum
and the dynamic envelope of the source, and how quickly it responded to
changes. Listen to different “blends” of the source’s timbre captured by
the individual microphones, and how quickly each microphone respond-
ed to changes to the source. Finally, note the “reach” of the microphone
at these various locations, and bring your attention to the accuracy of
timbral detail and the ratio of direct to reverberant sound. If you can,
listen for reflections in the room that are causing portions of the source’s
timbre to be emphasized and attenuated.
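
The distances and angles noted in step 2 can be worked out from floor measurements, and they give a rough expectation of how the direct sound should change between placements. The sketch below assumes a small source radiating into a free field (about 6 dB less direct sound per doubling of distance); real instruments and rooms will deviate from this, which is part of what the exercise reveals. The function names and the coordinate convention are assumptions for illustration.

```python
import math

def distance_and_angle(source_xy, mic_xy):
    """Distance (same units as the inputs) and angle in degrees from source to mic.

    Assumed convention: the source faces along +y; 0 degrees is directly
    in front of the source and positive angles are toward +x.
    """
    dx = mic_xy[0] - source_xy[0]
    dy = mic_xy[1] - source_xy[1]
    return math.hypot(dx, dy), math.degrees(math.atan2(dx, dy))

def direct_level_change_db(ref_distance, new_distance):
    """Change in direct-sound level when moving from ref_distance to
    new_distance, under the free-field inverse-distance assumption."""
    return -20 * math.log10(new_distance / ref_distance)

# Hypothetical placements: 0.5 m on axis versus 1.0 m on axis.
print(distance_and_angle((0.0, 0.0), (0.0, 1.0)))
print(direct_level_change_db(0.5, 1.0))
```

Because the reverberant level in a room stays comparatively constant, the drop in direct sound at the farther position shows up as a lower ratio of direct to reverberant sound.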
Exercise 13-4
Additional Exercises for Microphone Techniques
Stereo and surround microphone techniques and their placements can also
be compared and evaluated by substituting arrays for the individual micro-
phones found in the above two exercises. In these evaluations, spatial rela-
tionships of the source to the sound stage and the spaciousness of environ-
mental cues will also be evaluated.
14 Capturing, Shaping and
Creating the Performance
With objectives clearly defined and a vision of the sound and content of the
project, the recording can now take place most effectively.
The recordist will capture the performance through recording or tracking the
musical parts and instrumentation, and will shape sound qualities and rela-
tionships of the performance through equipment selection and signal pro-
cessing. The mixing process will continue the shaping of the performance
and will result in the creation of the performance (that is the recording).
The actual act of recording sound begins here. The preliminary stages that
were discussed in the previous chapter will define the dimensions of the
creative project and of the final recording. In addition to these, certain other
preparations must take place before a successful tracking or recording
session can begin.
Whenever possible, the musicians should be prepared for their performance
at recording and tracking sessions. Unfortunately, it is possible some musi-
cians will get their music or performance instructions when they arrive for
a recording session. This limits their potential to contribute to the musical-
ity of the project and can minimize the quality of their performance.
Ideally, music should be given to performers well in advance of a session.
Solo performers should have the opportunity to learn their parts before
they join the other musicians, and all musicians expected to perform simul-
taneously as a group should be well rehearsed together (as an ensemble).
As much as possible, musicians should be given the opportunity to arrive
at the session knowing what will be expected of them, knowing their parts,
and ready to perform. It is often best when the music to be recorded has
been performed several times in concert—in front of an audience. In the
reality of music recording, this ideal is not always feasible.
Some final instrumentation decisions and many of the supportive aspects
of the music are, however, often decided upon during the preliminary stages
of the recording process (or during the production process itself). The
recordist will learn to anticipate that some musical decisions will be made
during tracking. They will become aware that dimensions of creative projects
can tend to shift (sometimes markedly) as works evolve, especially
between types of clients and types of music. The recordist must keep as
many options open as possible, to keep creative artists from being limited
in exploring their musical ideas. They must not find themselves in the position
of not being able to execute a brilliant musical/production idea (without
hours of undoing or redoing) because of a recording decision made
a few minutes earlier. Allowing for flexibility may be as simple as leaving
open the option of adding more instruments or musical ideas to the piece,
by leaving open tracks on the multitrack tape, or in console layout, cue mix
changes and signal routing, or in mixdown planning.
The recording studio is the musical instrument of the recordist. The record-
ing process is the musical performance of the recordist. In order to use
recording for artistic expression, the recordist must be in complete control
of the devices in the studio, and must understand their potentials in captur-
ing and crafting the artistic elements of sound.
Recording and Tracking Sessions:
Shifting of Focus and Perspectives
Recordings are made with performers recorded individually or in groups,
or with all parts of the music performed at once. “Recording sessions” as
used here are recordings of all the parts being played at once; the entire
musical texture is recorded as a single sound.
Tracking is the recording of the individual instruments or voices (sound
sources) or small groups of instruments or voices, into a multitrack for-
mat (DAW, analog multitrack recorder, etc.). This is done in such a way that
the sounds can be mixed, processed, edited, or otherwise altered at some
future time, and without altering other sound sources. For this isolated
control of the sound source to be possible, it is imperative that the sound
sources be recorded with minimal information from other sound sources
and at the highest safe loudness level the system will allow.
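
In a digital system, one way to judge how close a recorded track comes to the highest safe level is to measure its peak relative to full scale. The following is a minimal sketch under stated assumptions: samples are floating-point values where 1.0 represents digital full scale, and the function names are hypothetical. How much headroom counts as "safe" is a judgment for the recordist and the system in use.

```python
import math

def peak_dbfs(samples):
    """Peak level of a track in dB relative to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

def headroom_db(samples):
    """Remaining level (in dB) before the loudest sample reaches full scale."""
    return -peak_dbfs(samples)

# Hypothetical excerpt of a tracked signal.
take = [0.1, -0.5, 0.25]
print(peak_dbfs(take), headroom_db(take))
```

A track that peaks at half of full scale has roughly 6 dB of headroom remaining.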
Recording and tracking sessions require the recordist to continually shift
focus and perspective, while listening to both the live musicians and the
recorded sound. Further, they will move between analytical and critical-
listening processes. The musical qualities of the performance will be con-
stantly evaluated at the same time as the perceived qualities of the captured
sound. The recordist is responsible for making certain both aspects are of
the highest quality and exist accurately and in an undistorted state in the
storage medium.
In Part Two listening skills were developed for individual aspects of sound.
These skills are used here, during production, in actual practice. The great
difference is Part Two allowed the reader to focus on a single aspect of
sound and to remain at a single level of perspective. This was very helpful
for developing one’s listening abilities. In actual recording practice, those
many skills will be used virtually simultaneously, as the recordist switches
between elements, levels of detail and types of information very quickly
and deliberately.
Focus will have the listener’s attention moving freely but deliberately
between all of the artistic elements of sound and all of the perceived
parameters of sound. The recordist will need to shift focus between artistic
elements (perhaps shifting attention between dynamic levels, pitch informa-
tion, or spatial cues), then immediately shift to a perspective that evaluates
program dynamic contour (of the overall sound). The recordist is required
to continually scan the sound materials to determine the appropriateness
of the sound (and its artistic elements) to the creative objectives of the proj-
ect, and to determine the technical quality of the perceived characteristics
of the sound.
Within the recording and tracking sessions, the recordist is concerned
about the aesthetics of the sound quality that is going to “tape,” and they
are concerned about the technical quality of the signal. Depending on
their function in the particular project, they may not be in the position to
make decisions related to performance quality. They should nonetheless
be aware of the performance and be ready to provide their evaluations
when asked or to anticipate the next activity (repeat a take, or move on to
the next section, as examples). It is imperative that the recordist always be
aware of the technical quality of the recording. It is their responsibility to
ensure that high-quality sound is recorded.
Analytical Listening Concerns
There are many performance-quality aspects of music that will engage
the recordist in analytical listening at this stage. Most of these aspects are
related to:
Intonation
Control of dynamics
Accuracy of rhythm
Tempo
Expression and intensity
Performance technique
The pitch reference, time judgment and musical memory skills gained in Part
Two will now be invaluable. Staying in tune, at a specific reference (such as
A440), is necessary for “Take 1” to be useable against, say, “Take 617.” Tempo
also must not change unintentionally, and dynamics must remain consistent
throughout the session(s). Dynamics can be altered by performance inten-
sity, and these changes in expression should be under control and factored
into dynamic level considerations. How a musical idea is played is often just
as important as the idea itself; the expressiveness of the performance and
its suitability to the project are important aspects of great recordings (such
as the ones we seek to make). While these areas may not always be under
the purview of the recordist, flawed material needs to be identified at all
times. Just how and when issues get pointed out to the artists, producer or
others varies with the project, roles and personalities involved.
The performance technique of musicians is critical in recording. The performer's
sound cannot be covered up by the other players or assisted by the
acoustics of a performance environment, and will be apparent in the final
sound. The ways performers produce sound, or the instruments themselves,
can create the desired musical impact, or they may not. The person responsible
for the project will need to make evaluations and to offer necessary
alternatives. The recordist should know any performance-technique problems
and the natural acoustic sound properties of all of the instruments
(though these are well outside the scope of this writing).
At this stage the recordist should be getting a clear idea of the intended
overall qualities of the final recording, and how each track will contribute
to those overall qualities. Tracking will capture or shape sound significantly,
and will either allow the recordist to achieve the desired sound or it will fall
short. Careful attention must be given to the sound quality of the sound
source for timbral detail, performance intensity, dynamic contour, spatial
concerns, sound quality (for pitch density), and more. Depending upon
what source is being recorded at the time and the final intended sound
quality, some artistic elements might be more important than others; still
all contribute important information and must be considered. All of the
analytical listening skills gained in Part Two will be used in this process.
Critical Listening Concerns
The technical quality of the recording will be reflected in the integrity of the
recording’s signals. The recording process must not be allowed to alter the
perceived parameters of sound, unless there is a particular artistic purpose
or technical function for the alterations. The perceived pitch, amplitude,
time elements, timbre, and spatial qualities of the recorded sound sources
should be accurately captured in the recording, and should be the only
sounds present in the reproduced recording. Any extra sounds are noise,
and any unwanted alterations to the sound are distortions. The recordist
must know the recording process and devices instinctively well for this to
be accomplished fluently. The goal is for impeccable technical sound quality
to be achieved effortlessly, so that attention can be given mostly to the
artistic qualities of the sound and the performance.
Accurate recording and reproduction of the audio signal must be consistently
present for professional-quality music recordings. Skills in evaluating sound
quality, time information, and dynamic contours at all levels of perspective
are important here. Sound will be evaluated for flaws and added sound/noises
by scanning all parameters of sound at all levels of perspective.
Each device in the signal chain has potential to add noise to the signal, and
to alter aspects of the signal in undesirable ways. A myriad of noises (such
as clicks, hiss, hum, digital artifacts, quantization noise, and countless oth-
ers) can be added to the signal by any device. They might also degrade
the quality of the signal by introducing clipping, total harmonic distortion,
phase shifting, drop out, intermodulation distortion, latency, and more.
These can all be caused by misuse or malfunctions of the devices (e.g., a
microphone overdriving a microphone preamp or a dirty potentiometer on
a mixing console).
Interconnections of devices must also be correct to preserve the quality
of signals throughout the signal chain. Gain staging, connection points,
analog-to-digital and digital-to-analog conversions, conversions of digital
formats, and many other issues also have the potential to diminish the
integrity at any point throughout the signal chain. It is the recordist’s respon-
sibility to recognize all these events, their sources, and more to ensure a
high-quality recording.
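
Gain staging can be reasoned about by adding up the gain of each device in the chain and comparing the running level with what each device can pass cleanly. The sketch below illustrates the bookkeeping only. The stage names, gains, and maximum output levels are hypothetical numbers, not values from the text, and the sketch does not model what happens to the signal after a stage has overloaded.

```python
def trace_gain_staging(input_dbu, stages):
    """Follow a signal level through a chain of devices.

    stages: list of (name, gain_db, max_output_dbu) tuples.
    Returns a list of (name, level_after_stage_dbu, overloaded) tuples.
    """
    level = input_dbu
    report = []
    for name, gain_db, max_output_dbu in stages:
        level += gain_db
        report.append((name, level, level > max_output_dbu))
    return report

# Hypothetical chain: microphone signal into a preamp, equalizer, and fader.
chain = [("preamp", 50.0, 24.0), ("equalizer", 6.0, 24.0), ("fader", -10.0, 24.0)]
for name, level, overloaded in trace_gain_staging(-40.0, chain):
    print(name, level, "OVERLOAD" if overloaded else "ok")
```

Raising the preamp gain in this hypothetical chain to 70 dB would flag an overload at the first stage, which no later stage can repair.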
Other critical-listening applications during recording and tracking sessions
relate to microphone issues discussed in Chapter 13:
Isolating sound sources during tracking
Achieving a desirable sound quality through matching an appropriate
microphone with the sound source
Identifying an appropriate placement angle and distance for the
microphone
Eliminating unwanted sounds created by performers or instruments
Sounds will have a certain degree of isolation from one another. If the
sounds are to be altered individually in the mixing stage, they must be iso-
lated. When a group of sounds function as a single unit, it may not be appro-
priate to isolate the sounds from one another, as they are often blended by
the performance space. Problems arise when unwanted sounds leak onto
tracks that contain sound sources that were intended to be isolated from
all other sound sources. This leakage can be the cause of many problems
later on in mixing directly to two-track or surround, or in the multitrack
mixdown process.
Sound quality should be carefully evaluated as it is captured by the micro-
phone. The amount of timbral detail, blend, etc., of the sound sources will
be determined now. These are major decisions that will have a decisive
impact on the sound of the final recording, and are largely determined by
microphone selection and placement. Evaluating the sound out of the
context of the music is helpful in discovering subtle qualities that are often
missed when one is listening analytically.
Many recordists rely on equalization and other processing to obtain a suit-
able sound quality during this initial recording. This use of processing alters
the natural sound quality of the sound source and does not alter the timbre
equally well throughout an instrument’s range; for example, equalization
will add a new formant region to the track/sound source. It can be used
effectively when the processed sound is desired over the source's natural
sound, or when practical considerations limit microphone selection and
placement options. Often it can compromise the recording and should be
approached carefully. Processing—especially equalization—is often used
at this stage to compensate for poor microphone selection or placement.
Related to performance technique, the ways performers produce sound
on their instrument, or the instruments themselves, can create unwanted
sound qualities that must be negated during the tracking process. Often
these aspects of sound quality are very subtle and go unnoticed until the
mixing stage (when it is too late to correct them).
Instruments are capable of making unwanted sounds, as well as musical
ones. The sound of a guitarist’s left hand moving on the fingerboard, the
breath sounds of a vocalist or wind player, and the release of a keyboard
pedal are but a few of the possible nonmusical and (normally) unwanted
noises that may be produced by instruments during the initial recording
and tracking process.
These sounds are easily eliminated during the initial recording and tracking
process through altering microphone placement, through slight modifications
in performance technique, or through minor repairs to the instrument.
The multitrack tape should be as free of all unwanted live-performer sounds
and sound alterations as possible. These sounds will be much more difficult
to remove later in the recording process. They may be comprised of certain
performance peculiarities that (depending on the situation) can only be alle-
viated by signal processing (such as the use of a de-esser on a vocalist).
Anticipating Mixdown During Tracking
The mixdown process should be anticipated while compiling basic tracks.
It is necessary to determine how much control in combining sounds is
needed or desired during the mixing process. Tracking, submixing and iso-
lation decisions can then be made accordingly.
Any mixing of microphones during the tracking process will greatly dimin-
ish the amount of independent control the recordist will have over the
individual sound sources during the final mixdown process. Some mixing
will often occur during the tracking stage, as submixes, to consolidate
instruments and open tracks, or to blend performers that must interact with
one another for musical reasons.
Submixes will be carefully planned at the beginning of a session, and must
be executed with a clear idea of how the sounds will be present in the
final mix. Drums are often condensed into submixes (either mixed live, or
through overdubbing and bouncing). Other mixes that will occur during
the tracking process include the combining of several microphones (and/or
a direct box) on the same instrument(s). The recordist must be planning
ahead to how these sounds will appear in the anticipated final mix, especially
for musical balance, sound qualities, distance cues (definition of
timbral detail), and pitch density.
Preprocessing (such as adding compression while recording) alters the
timbre of the sound source before it reaches the mixing stages of the
recording chain. Preprocessing also diminishes the amount of control the
recordist will have over the sound during the mixing process. At times, it is
desirable to preprocess signals; often it is not.
Desirable preprocessing might include stereo microphone techniques used
on sound sources, effects that are integral parts of the sound quality of
an instrument (distorted guitar), or processors that are used to provide a
specific sound quality (compressed bass). At times preprocessing is used
to eliminate unwanted sounds during tracking (such as noise gated drums).
Once a source has been preprocessed, the alterations to the sound source
cannot be undone. The recordist should be confident that they want the
processed sound before recording it.
Some initial planning of the mixdown sessions will begin during the track-
ing process. Certain events that will need to take place during the mixdown
session will become apparent as the tracking process unfolds. Keeping a
tally of these observations will save considerable time later on and may
help other tracking decisions.
Examples of items that should be noted for the mixing process are:
• Sudden changes in the mix that may be required because of the content of the tracks
• Certain processing techniques that are planned
• Any spatial relationships or environmental characteristics that may be desired for certain tracks
• Track noises or poor performances of certain sections that will need to be eliminated (muted) during the mix
These are just a few examples of the many factors that may become appar-
ent during the tracking process.
Capturing, Shaping and Creating the Performance
Signal Processing: Shifting of Perspective
to Reshape Sounds and Music
Among the most commonly applied audio devices are signal processors.
They are important tools (instruments) for the recordist. These devices
may play a large role in shaping the individual project. Specific devices are
chosen because their individual, inherent sound qualities lend themselves
to the particular project. They each control one of the three basic properties
of the waveform: frequency, amplitude, or time.
Three types of signal processors exist, each functioning on a particu-
lar dimension of the waveform. As learned in Part One, an alteration in
one of the physical dimensions of sound will cause a change in the other
dimensions. Furthermore, alterations of the physical dimensions will cause
changes in timbre (sound quality). The three types of processors not only
cause audible changes in the three characteristics of the waveform, but
may also alter the timbre of the sound source. If considered according
to how processors alter sound, signal processing can be simplified and
approached with clarity.
• Frequency processors
• Amplitude processors
• Time processors
Frequency processors include equalizers and filters. Compressors, limiters,
expanders, noise gates, and de-essers are essentially amplitude processors.
Time processors are primarily delay and reverberation units. Effects devices
are hybrids of one of these three primary categories. Some examples
of these specialized signal processors include flangers, chorusing devices,
distortion, fuzz, pitch shifters, and many more.
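As a rough illustration of the three processor types, the following Python sketch applies one very simple operation from each category to a short list of sample values. The function names and the sample data are hypothetical and chosen only for illustration; actual equalizers, compressors and delay units are far more elaborate than these few lines.

```python
def amplitude_gain(samples, gain):
    """Amplitude processing: scale every sample by a constant factor."""
    return [s * gain for s in samples]


def time_delay(samples, delay_samples):
    """Time processing: shift the signal later by padding it with silence."""
    return [0.0] * delay_samples + list(samples)


def frequency_lowpass(samples, alpha):
    """Frequency processing: a one-pole low-pass filter (0 < alpha <= 1).

    Smaller alpha values remove more high-frequency content.
    """
    out = []
    prev = 0.0
    for s in samples:
        prev = prev + alpha * (s - prev)
        out.append(prev)
    return out


# Hypothetical test signal: a rapidly alternating (high-frequency) pattern.
signal = [1.0, -1.0, 1.0, -1.0]

print(amplitude_gain(signal, 0.5))     # same shape, half the amplitude
print(time_delay(signal, 2))           # same shape, two samples later
print(frequency_lowpass(signal, 0.5))  # the rapid alternation is attenuated
```

Even in this toy form, the low-pass filter also reduces the amplitude of the alternating signal, which reflects the point that a change in one physical dimension brings changes in the others.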
Uses of Processing
Signal processing can be used to shape sound qualities. It is applied to the
sound source to complete the process of carefully crafting sounds. This is
done for the character of the source’s sound quality, and to shape sounds
to complement the functions and meanings of the musical materials and
creative ideas.
In the recording chain, signal processing can occur at a number of times. It
may be incorporated in the tracking as preprocessing, and is most often used
to bridge the tracking and mixdown processes. It can also be added subtly
in mastering. Individual instruments or voices can be directed through any
number of signal processors. Similarly, groups of instruments can receive
the same processing, either in the same way or in differing amounts (such as
any number of instruments, each sending a different amount of signal to a
buss feeding a reverb unit). The entire recording might also be processed,
as is common in the mastering process.
Signal processing often occurs separately between the tracking sessions
and in preparation for the mixdown session(s)—usually without perform-
ing actual re-recordings of the basic tracks. Tracks are evaluated and sig-
nal processors (hardware devices or the software equivalent plug-ins) are
applied to the tracks to determine final sound qualities. Processor settings
are often noted in session documentation for incorporation into the final
mix, and the recordist performs signal processing in real time during the
mixdown. While it is not common to change processor settings throughout
the course of the mix, the ratio of processed signal to unprocessed signal
might be adjusted, especially for reverb.
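The ratio of processed to unprocessed signal can be pictured as a simple weighted blend. The sketch below is a minimal illustration, assuming both signals are lists of samples of equal length; the function name wet_dry_mix and the parameter wet_ratio are hypothetical, and the sample values are invented.

```python
def wet_dry_mix(dry, wet, wet_ratio):
    """Blend an unprocessed (dry) signal with its processed (wet) version.

    wet_ratio = 0.0 returns only the dry signal; 1.0 returns only the wet.
    """
    if not 0.0 <= wet_ratio <= 1.0:
        raise ValueError("wet_ratio must be between 0.0 and 1.0")
    return [(1.0 - wet_ratio) * d + wet_ratio * w for d, w in zip(dry, wet)]


# Hypothetical samples: a dry vocal and the output of a reverb fed by it.
dry_vocal = [1.0, 1.0]
reverb_return = [0.0, 0.5]

# The processor settings stay fixed; only the blend changes between sections.
verse = wet_dry_mix(dry_vocal, reverb_return, 0.25)
chorus = wet_dry_mix(dry_vocal, reverb_return, 0.5)
print(verse, chorus)
```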
Signal-processing sessions on a DAW are simpler. When the desired
settings in one or more plug-ins have been established, they are saved as
part of the project file. Of course it is also possible to send signals out of
a DAW for signal processing, and this is a reasonably common practice,
especially when one might want a specific device (reverb unit or compressor,
for instance) instead of a plug-in.
Listening and Processing
Signal-processing decisions involve critical and analytical listening. Critical-
listening decisions will be made to establish qualities of the sound, for its
own sake. Decisions will also be made concerning how the sound relates to
other sounds and to the entire program (analytical listening). The recordist
will focus on the smallest changes in the source’s timbre, and on any num-
ber of higher levels of perspective. Careful attention to detail is needed, as
well as attention to how small changes in one element or level of perspec-
tive impact the sound in other ways.
The recordist must use the skills of Part Two to focus on the component
parts of the sound qualities of the sound sources being processed. Small,
precise changes in sound quality are possible with signal processing. This
requires the recordist to listen at the lowest levels of perspective, and to
continually shift focus between the various artistic elements (or perceived
parameters) being altered. These changes are often subtle, and can be
unnoticeable to untrained listeners. Often beginning recordists are not able
to detect low levels of processing. This is a skill that must be developed.
Most signal processing involves critical listening. The sound source is con-
sidered for its timbral qualities out of context and as a separate entity. In
this way, the sound can be shaped to the precise sound qualities desired by
the recordist, without the distractions of context.
Knowledge of the physical dimensions of sound and of perception is a
great aid to successful signal processing.
Signal processing alters the electronic (analog or digital) representation
of the sound source. In this state, the sound source exists in its physical
dimensions. The various signal processors are designed to perform specific
alterations to the physical waveform, which will cause changes in the
perceived timbre of the sound source. Signal processors are only useful as
creative tools if the recordist is in control of these changes in the physical
dimensions. Immediately after altering the sound source’s physical dimensions,
the recordist will shift focus to place the sound in the context of the
music, as an artistic element.
After the sound has been reshaped, the listener will use analytical listening
to evaluate the sound. The altered characteristics of the sound source and
the overall sound quality of the source will be evaluated as they relate to
the other sound sources and to their function in the musical context. They
will ask, “Are these changes appropriate for the music, or do they achieve
the desired sound for the musical instrument?” The sound quality shifts
of processing will be evaluated according to their appropriateness to the
musical idea.
The Mix: Composing and Performing the Recording
Mixing is where the piece of music begins to emerge and ultimately comes
together. The mix creates the piece of music, almost in its final form. Here
the individual sound sources that were recorded or synthesized are combined
into a two-channel or a surround-sound recording that will become
the final version of the piece after the mastering process.
It is often helpful to consider the mix process in two stages: one artistic,
composing (crafting) the mix, and one technical, performing (executing)
the mix. While the two can happen almost simultaneously, they require dif-
ferent skills and thought processes.
The process of planning and shaping the mix is very similar to compos-
ing. Sounds are put together in particular ways to best suit the music. The
mix is crafted through shaping the sound stage, through combining sound
sources at certain dynamic levels, through structuring pitch density, and
much more. How these many aspects come together provides the overall
characteristics of the recording, as well as all its sonic details. Consistently
shaping the “sound” of recordings in certain ways leads some recordists to
develop their own personal, audible styles over time.
The actual process of executing the mix is very similar to performing. Mix-
ing often involves controlling sound in real time. Controlling the loudness
levels of tracks, changing routing, muting tracks, altering processing, and
more are routinely performed during mixdown, especially in systems that
do not have automation. Many technical decisions also occur during mix-
ing; many are out of real time. Recordists will develop their own approach
to sequencing those activities and decisions, and ultimately create their
own working methods.
Often, composing the mix and performing the mix happen simultaneously.
The piece is shaped slowly as parts come together, and creative ideas are
refined as new ideas emerge from hearing portions of the work as they
are being completed. Sometimes new ideas take the project in new direc-
tions entirely; sometimes sounds or relationships of parts are “discovered”
during the mixing process. Creative ideas get refined as a project unfolds.
Keeping the technical process and all of the devices and controls from
absorbing the creative energy and disrupting the flow of the project is necessary,
and is not an easy thing to accomplish. It is necessary to learn to
fluently operate all devices and software of the recording chain so one is in
complete control of the recording process. Only with the technical process-
es under control and in the background of the recordist’s mind can atten-
tion truly be directed to crafting the artistic dimensions of the recording.
During the mixdown sessions the separate tracks that were recorded dur-
ing tracking and synthesis will be combined. Many of the recording’s sound
relationships are crafted in the process. To compose the mix, the record-
ist shapes the artistic elements discussed in detail previously. It can be
helpful to group the elements into three groups, or broad areas of focus.
Approaching each separately in planning and executing the mix can help
the recordist clarify their ideas and approaches to a mix. Each of these
significantly shapes the recording, and each will also impact the other two.
One must remember to listen to a mix from many perspectives and attend
to all artistic elements to be certain a desirable sound quality exists in
all aspects of the recording.
These three groups are listed below, with the elements and some other
concerns they contain:
1. Musical balance
a. Loudness
b. Prominence versus loudness
c. Attention, meaning, surprise
2. Pitch and sound quality
a. Pitch density and timbral balance
b. Performance intensity and sound quality
c. Environments of sources and sound quality
d. Dynamic contour by density and/or register versus loudness
3. Spatial qualities and relationships
a. Dimensions of the sound stage
b. Placement and size of sources on the sound stage
c. Listener to sound-stage distance
d. Environments of sources and depth of sound stage
It is important to remember the recordist must have, or be working deliberately
to establish, a clear idea of the final sound qualities that are desired.
With clear objectives, the recordist can work toward meeting those goals.
Successful mixes balance these elements in all of the sound
sources with the message and musicality of the song.
These elements are used to craft the overall characteris-
tics of the song: form, structure, reference dynamic level,
sound stage dimensions, perceived performance environ-
ment, timbral balance, and program dynamic contour.
Before we explore these three groups of elements indi-
vidually, we will examine how the mix interacts with the
music message and its musical materials.
Composing the Mix: Presenting, Shaping and
Enhancing Musical Materials
The song (piece of music) is made in the mix, and the
recordist is participating. The mix makes musical ideas
come together. The song is built during the mixing process
by combining the musical ideas, focusing on shaping the
three groups of elements that will be discussed below.
A successful mix will be constructed with a returning
focus on the materials of the song, and the message of the
music. The musical ideas that were captured in tracking are
now presented in the mix in ways that best deliver the story of the text and
the character of the music.
Referring back to the hierarchy of musical materials in Part One, we remem-
ber that the structure of a song, or piece of music, is created by primary
and secondary musical elements. These take place at different levels of
importance to the overall musical structure, and at different levels of per-
spective for the listener. How the musical ideas are crafted and combined
is in the area of music theory and, while not covered here, this information
is of great value to the recordist. The ways the recording process will com-
bine the sound sources’ materials will present their musical ideas in new
relationships, and can profoundly shape and enhance them. The mix will
greatly influence the musical style of the piece of music.
It is important to establish a clear idea of the final sound qualities that are
desired for the song/recording. Just where to start will vary between indi-
viduals. Some people work from small ideas, adding and building to create
large sections. Some people need to have an idea of where they want to
arrive before they begin crafting the smaller details. Often individuals will
use different approaches for different songs. How one arrives at a vision of
what the song needs to sound like is not important. What is important is
having a strong sense of the desired overall sound qualities of the song, and a
strong sense of how some (or most, or all) of the details of the music will
bring this to reality.
Listen . . .
to tracks 48-53 for the musical balance, pitch and sound quality, and
spatial qualities groups, and the elements they contain, in these six
different mixes of the same drum performance. Note that all of the
materials discussed in this section played directly into these different
sound qualities. Observe these various production aesthetics and the
sound stages and sound qualities that are created. Use any graphs from
Part Two that might be helpful.
The overall sound qualities, as explored previously, are:
• the song’s sense of intimacy with the listener, created by the perceived listener to nearest sound source distance
• the energy level, intensity and expressive qualities of the song that create a reference dynamic level
• the overall dynamic shape of the song (program dynamic contour)
• how the song uses the pitch registers to create a timbral balance or spectrum of the overall recording
• the width and depth of the sound stage (stereo or surround)
• the impression of the song’s performance space, or perceived performance environment
• all of these and the musical structure coming together into a global shape and concept of the song, its form
Some simple and direct questions can sometimes aid in determining or
clarifying these areas:
• How does the story of the text unfold?
• How does the music support this?
• How can the mix support this?
• What relationship do I want the listener to have with the music? (Observing from afar or intimately close? Maintaining a comfortable distance? In a large performance space or small? Focused on the text or feeling the beat? And many more.)
• What special qualities are needed to most effectively present what the song is trying to portray?
• How should each of the overall qualities contribute to communicating the music?
Certainly many other questions are equally valid and potentially important.
The clearer the vision of what is sought in these areas, the smoother and
more effectively the mix will progress.
Mixes create relationships of individual sources or small groups of sources
when they are assembled. These sources (and the musical materials they
are playing) are all given their own or a shared “place” in the mix. This
“place” might be a “space” or “location,” a “level” or “area,” a “character”
or a “set of characteristics” in each artistic element, such as lateral location,
register, dynamic level, etc. In this way, mixing is the act of putting everything
in its place, while giving any final shapes to the sounds.
Just where to place a sound source is a matter of what will best suit the
song, and what will deliver the desired overall sound qualities. Among
important questions to ask are:
• How can this instrument/voice, presenting this musical idea, be placed in the musical balance to contribute most effectively?
• What spatial qualities and relationships will most effectively present this instrument/voice and its musical material?
• What sound qualities are best suited to this instrument/voice presenting this musical idea?
• How can I best present or enhance this sound source and musical idea?
• Should this idea and instrument (bass, for instance) be emphasized, or should it be blended with others (bass, keyboard and bass drum)?
• What special qualities do I want to bring to this instrument to enhance the song?
• How can these instruments be combined to provide the sound qualities and sound stage that is desired?
• What musical materials/instruments contribute most to defining the energy or the character or the message or the sound quality or the expression of the piece, and how should they be treated?
• What musical materials are supporting the primary ideas, and how can they be made to do this most effectively?
The answers to these questions (and others that are similar and others
that are more detailed) will lead the recordist with direction and purpose to
crafting a mix that supports the music and presents it in the most appropri-
ate way.
The recordist will decide when to blend sources together and when to
allow a source to be prominent. Just as importantly, they will decide on
how the characteristics of sound will allow these to happen. We remember
that sounds may have any level of importance to the mix and have many
dimensions. Sounds can be blended with others into groups, or be clearly
delineated from others by unique qualities. All of the elements can give
coherence to the mix by providing groups of sources with similar qualities,
and can provide variety to the mix by providing sources with unique qualities.
Different elements will function differently on the sound sources; for
example, some instruments might be grouped by pitch density, as they perform
in a similar pitch range, but be delineated by a different stereo location.
The combinations and subtle variations are almost limitless and provide
the basis for shaping the artistic qualities of the music. Without a clear idea
of how the finished mix is intended to sound, the recordist will have great
difficulty in making quality decisions.
Stating the seemingly obvious, songs have sections with different musi-
cal materials, and often the sections contain different instrumentation;
songs change from beginning to end in many possible ways, sometimes
markedly and sometimes subtly. Thinking in such simple terms can be of
great assistance in compiling mixes, especially in the beginning. Mixes
have the potential to change throughout a song as well, and this is not
unusual. Several distinct mixes might be used in a song, changing with the
song’s various sections (such as verse or chorus) or within sections. Mixes
can change in an unlimited number of ways, just like the musical materials
of songs. Some of these changes might be very subtle, and some might be
striking. A few potential large-scale changes might be:
• marked changes between sections (such as verse and chorus)
• subtle changes over the course of the song
• marked changes of only a few sound sources (such as lead vocal and drum set) between sections
• sudden entrances of large groups of instruments, with corresponding changes in the mix
• sudden exits of nearly all instruments to reveal only a few sources, with corresponding changes in the mix for the remaining sources
Referring to the three concepts to be explored below, mixes might be
changed in all or one of the following groups of elements:
• changing relationships of musical balance of some or all sound sources
• changing spatial qualities and relationships of some or all sound sources
• changing pitch and sound qualities of some or all sound sources
The mixing process taxes listening skills. Acute attention to all elements of
analytical listening and critical listening is vital. While the perspective is
usually at the individual sound source and up one level to how the sourc-
es sound against one another, strong awareness of the overall texture is
required to shape those characteristics discussed above. Close attention
to the detailed characteristics of sources, to the integrity of the signal and
the technical qualities of the recording is also required. The recordist will
be continually engaged in shifting focus from one element to another, and
from one level of perspective to another. The recordist must continue to
seek greater detail and an awareness of subtle qualities and relationships
of sound sources and materials; the subtle characteristics of a recording
often separate the remarkable mix that presents the music in an engaging
way from one that is acceptable but uninspiring—or worse, from one that
is poor or ineffective.
Crafting Musical Balance
The entire mixdown process is often envisioned as the process of deter-
mining the dynamic-level relationships of the sound sources. As we have
noted, the mixdown process is actually much more. It combines many
complex sound relationships, of which dynamics is only one, but it’s an
important one.
Sound sources in the mix are related to one another by dynamic level, just
as in live performance. The difference is that the levels can be carefully cal-
culated and changed throughout the course of the mixdown process. The
mixing console or mix function of a DAW allows the recordist to perform
the individual dynamic levels of the sound sources and to make any changes
in level in real time. This significantly shapes the mix and the music.
The recordist devises a musical balance of the individual tracks and per-
formers. This is the relationship of the dynamic levels of each instrument to
one another and to the overall musical texture. The individual sound sources
are combined into a single musical texture, each source at its own loudness
level. Small changes in level of a single sound source can be difficult
to detect, especially in the beginning; this is changing the dynamic contour
of a line, and hearing these changes is an important skill to be developed.
Dynamic relationships are more readily perceived at a higher level of per-
spective, one that compares sources to one another in the overall texture;
beginning recordists can more readily develop the skill of hearing these
relationships earlier in their work. Obtaining the ability to recognize and
control loudness changes and relationships is necessary. A slight alteration
of musical balance and/or the dynamic shape of a musical idea can have a
profound impact on the music.
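In signal terms, a musical balance amounts to giving each track its own gain and summing the results. The following Python sketch shows this under simple assumptions: tracks are lists of samples, fader levels are given in decibels relative to unity gain, and the names db_to_gain and mix_tracks, as well as the track data, are hypothetical.

```python
def db_to_gain(level_db):
    """Convert a fader level in decibels to a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)


def mix_tracks(tracks, levels_db):
    """Sum tracks into one texture, each at its own level.

    tracks:    dict mapping a track name to a list of samples
    levels_db: dict mapping the same names to fader levels in dB
    """
    length = max(len(samples) for samples in tracks.values())
    mix = [0.0] * length
    for name, samples in tracks.items():
        gain = db_to_gain(levels_db[name])
        for i, sample in enumerate(samples):
            mix[i] += gain * sample
    return mix


# Invented material: the bass sits 20 dB below the vocal in this balance.
tracks = {"vocal": [1.0, 1.0], "bass": [1.0]}
levels = {"vocal": 0.0, "bass": -20.0}
print(mix_tracks(tracks, levels))
```

Changing a single entry in levels by a decibel or two alters the balance only slightly on paper, which is a reminder of why such changes can be difficult to hear at first.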
As noted in Parts One and Two, changing loudness levels in a mix readily
creates a difference between the actual loudness of the sound source and
the performance intensity at which it was performed during tracking. Sound
quality and loudness are then separated. This difference between musical
balance and performance intensity skews the impression of a live performance
and interaction of the musicians. Potential exists for dramatic and
creative dimensions in altering the realities of sound quality and dynamic-
level relationships, or this can cause a desired live-sounding recording to
be distorted.
With sound-source loudness levels carrying characteristic timbres, a psy-
chological effect exists whereby sounds might be imagined to be at the
loudness level of their performance intensity, when actual loudness is very
different. A much greater imagined loudness can occur when a sound is
recorded moving from very soft to very loud, while a compressor holds
loudness almost constant at a low level. Here the timbre of the sound
source, and our knowledge of the amount of energy and expression placed
into the performance, brings an impression of a higher loudness level,
although the actual loudness (amplitude) is markedly different. This effect
can be used to creative advantage to give prominence to a sound, without
increasing its loudness.
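The way a compressor can hold loudness almost constant while the performance intensity changes can be sketched with the usual static compression curve. This is a minimal illustration only: the threshold, ratio and input levels are invented figures, the function name compressed_level_db is hypothetical, and a hard-knee curve without attack or release behavior is assumed.

```python
def compressed_level_db(input_db, threshold_db, ratio):
    """Static hard-knee compression curve.

    Levels at or below the threshold pass unchanged; above it, every
    `ratio` dB of input increase yields only 1 dB of output increase.
    """
    if input_db <= threshold_db:
        return input_db
    return threshold_db + (input_db - threshold_db) / ratio


# A performance that moves from very soft (-40 dB) to very loud (0 dB).
performance_db = [-40.0, -30.0, -20.0, -10.0, 0.0]
output_db = [compressed_level_db(x, -40.0, 10.0) for x in performance_db]

input_range = max(performance_db) - min(performance_db)
output_range = max(output_db) - min(output_db)
print(input_range, output_range)
```

With these settings the 40 dB change in the performance becomes a 4 dB change in level, while the timbre of the loud playing remains to suggest a much louder sound.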
Perception of actual loudness level is often mistaken for other things. It is
often easy to confuse the prominence of a musical idea or instrument with
loudness. A sound may be prominent because of some special quality (reg-
ister placement, for instance) while actually being at a lower loudness level
than other sounds. A sound can be the most prominent in the listener’s
consciousness, while being at a lower dynamic level than other sounds. The
other artistic elements have equal potential to provide outstanding qualities
to the sound, and to cause the sound to stand out of the musical texture.
In similar ways, loudness is often confused with or distorted by the listener’s
attention to certain aspects of a song, by unexpected events in the
music, and by the meaning of a text or the music. The listener may be
drawn to the text of a song, and the singer might be perceived as the loud-
est musical part; while this can be the case, it often is not. This understand-
ing can sometimes allow the recordist to lower the loudness level of the
vocal without moving the attention of the listener.
When something new or unexpected happens in a song, the listener often
shifts attention to it. This can cause the listener to perceive the event as
louder than it actually is. For instance, one might incorrectly perceive the
high hat sound in the second verse of “Let It Be” (Let It Be version) to be
louder than the voice. In closer listening it is clearly softer. This mispercep-
tion is caused simply because the high hat arrival is attention-getting and a
surprise, since no percussion sounds have preceded it—and also because
the high hat’s environment pulls it across the sound stage and because it is
in a very different pitch area than the lead vocal and piano.
The musical balance graph of Part Two can be used to plan musical bal-
ance relationships, or to take notes during production. While a complete
graph will likely never be created in production practice, beginning recordists
might find writing out actual relationships helpful in understanding
and planning their early mixes. Most importantly, the graph can help one
to learn to bring focus to loudness alone, and not to confuse other percep-
tions as being loudness related.
Pitch and Sound Quality Concerns in Combining Sound Sources
The mix also combines the sound qualities of sound sources. When sounds
are combined in the mixing process, the timbres of the instruments/voices
are blended. This blending of sound-source timbres can bring sounds to fuse
together into a group, or if handled differently the sound sources can retain
all or some of their unique characters even if they occupy a very similar
pitch area. Carefully employing the other elements of sound (such as stereo
location, loudness, etc.) can keep similar timbres from fusing in the mix.
The sound quality of the sound source plays a significant role in the successful
presentation of the musical idea. The instrument’s or voice’s timbre
is shaped to most effectively present the musical material, and the listener
will ultimately come to identify the musical idea by the instrument (or sing-
er) that delivers the musical idea. This process of shaping the sound quality
of the sources began in tracking, with instrument selection, microphone
selection and placement, performance intensity and expressive qualities
of the performance, and perhaps signal processing. Sound sources will
now receive final shaping during mixdown by signal processing (whether
hardware or plug-ins).
This final shaping can be used to enhance the character of the sound, or
to help a sound combine more effectively in the mix. Time, spectrum and
amplitude processing can all be employed for these purposes.
The sound quality of sound sources also contains the dimension of envi-
ronmental characteristics. We recall the sound source and its host environ-
ment fuse into a single impression. In this way, the sound qualities of the
environment become a part of the sound qualities of the sound source.
Shaping of environmental characteristics must therefore be viewed from
the perspective of how they impact the sound quality of the source. This
can have a significant impact on the source’s pitch area.
A source’s sound qualities may remain constant throughout the piece, or
the qualities may make sudden changes or be gradually altered in real time
during the mix. Many possibilities exist for shaping and controlling sound
quality.
When instruments and voices get placed in the mix, they should be evalu-
ated for their frequency content. This is composed of the pitch(es) they per-
form plus the spectrum of their timbre (including environmental character-
istics). In this way, all sounds occupy an “area” in the frequency range of
our hearing. This is a bandwidth, of sorts, where the instrument’s sound and
musical material combine and that they occupy. This is the pitch density of
the musical idea. The pitch densities of all of the sound sources combine to
create the timbral balance, or the “spectrum” of the overall texture.
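One rough way to picture pitch density and timbral balance is to treat each source's pitch area as a frequency band and to count how many sources fall into each register. The Python sketch below does this; the register boundaries, the source bands and the function names are all hypothetical values chosen for illustration, not figures taken from this text.

```python
# Hypothetical register boundaries in Hz, for illustration only.
REGISTERS = {
    "low": (20, 130),
    "low-mid": (130, 520),
    "mid": (520, 2000),
    "high": (2000, 20000),
}


def registers_occupied(band):
    """Return the registers that a (low_hz, high_hz) pitch area overlaps."""
    low, high = band
    return [name for name, (lo, hi) in REGISTERS.items()
            if low < hi and high > lo]


def timbral_balance(sources):
    """Count how many sources have a pitch area reaching into each register."""
    counts = {name: 0 for name in REGISTERS}
    for band in sources.values():
        for name in registers_occupied(band):
            counts[name] += 1
    return counts


# Invented pitch areas (performed pitches plus spectrum) for three sources.
sources = {
    "bass": (40, 400),
    "rhythm guitar": (80, 1200),
    "flute": (600, 4000),
}
print(timbral_balance(sources))
```

A tally like this is far cruder than listening, but it shows where sources crowd into the same registers and where a sound would stand apart.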
It is common for some types of music to emphasize certain pitch regis-
ters over others. For example, the rhythm section of the typical rock band
will have the bulk of its pitch-plus-timbre information (or pitch area) in the
“low,” “low-mid,” and “mid” ranges. The timbral balance is weighted in
these low-frequency ranges, and instruments and voices performing above
these registers occupy a very different pitch area and are easily perceived
even when performing at lower dynamic levels or placed in the same loca-
tions on the sound stage. Thus, when sounds are combined, a source’s
pitch area can be exploited to bring the source to blend with others or to be
more readily perceived.
As this song progresses, the rhythm section might thin out at times or per-
haps stop. This causes a shift in the timbral balance. Such shifts are com-
mon in music. Shifts of timbral balance often happen with changes in the
mix, at climactic points in the music, with the entry and exiting of instru-
ments/voices, between a verse and chorus, and more. Shifts are common;
some are very subtle and some are striking. This emphasis of some pitch
registers over others provides an overall sound quality to the song that
comprises its timbral balance.
Timbral balance can contribute to shaping the program dynamic contour
of the work. The density of pitch areas and register placement can increase
or decrease the overall loudness of the program. In this way, the dynamic
level of the overall program is shaped by the number of sound sources
present and the registers in which they sound, as much as the loudness lev-
els of the sources. This is an important consideration when adding instru-
ments and voices to the mix, or pulling sounds out.
Pitch density may be used in innumerable ways to assist in shaping the
timbral balance and dynamic contour of the music and in defining the
relationships of the individual sound sources. The timbral balance graph of
Chapter 10 can be used (with or without a time line) to plan the mix and to
keep track of sound-source registers. It can be a useful tool in composing
the mix or in evaluating the recordings of others. The graph can be cre-
ated with careful attention to identifying pitch levels and areas, or it can
be sketchier and be composed of quick, general observations. Either might
prove useful, especially for people who have little experience or who are
trying to match the sound of other recordings.
Crafting sound qualities and creating pitch densities and a timbral balance
are new forms of arranging and orchestration. Recordists will be required to
listen in many different ways to make these important decisions. They will
at times focus on the dimensions of the individual sound qualities. When
the sound qualities of the sources are combined to create the sound quali-
ties of groups of instruments and the overall sound quality of the music
(timbral balance), the focus of the recordist will shift perspective between
these various levels while continuing to scan between the components of
each source’s timbre.
Throughout the process of compiling (composing) the mix, the recordist’s
attention will return to timbre and sound quality. This is accomplished
using both analytical and critical listening skills. Timbres will be considered
as separate entities (out of time) and as sound quality in the musical con-
texts of all hierarchical levels.
Creating a Performance Space for the Music and the Recording
Sound sources are given spatial qualities during mixing. These qualities
provide an illusion of a performance space for the recording—an imaginary
place where the music was performed, with dimensions and sound quali-
ties. The mix creates these dimensions and qualities.
Individual sources are shaped in terms of placement on the sound stage
and environmental characteristics. Sounds are placed on the sound stage
at specific locations and are placed in environments.
Spatial properties for the overall program are also crafted in the mix. The
spatial properties created in the mix provide an illusion of a space within
which the performance takes place. As sounds are placed at locations with-
in this perceived performance environment, the dimensions of the sound
stage are defined.
The spatial qualities that are crafted during mixing are:
• Dimensions of the sound stage
• Placement and size of sources on the sound stage (lateral and distance)
• Listener-to-sound-stage distance
• Environments of sources and depth of sound stage
• Perceived performance environment
During the mix, sounds will be placed at specific lateral locations, as phantom
images or at speaker locations. These images will be given a width
anywhere from the breadth of the entire available sound stage to a narrow
point in space. This lateral location can have great impact on separating
sounds in the mix, or blending them. The size and location of sounds can
provide prominence or importance for a sound that would otherwise be
less noticeable.
It is helpful to remember that humans do not localize sounds equally well at
all frequencies. Further, we primarily use interaural time information to
localize sounds below 2 kHz and interaural amplitude cues to localize sounds
above 4 kHz. This requires the recordist to consider the pitch area of
the sound source, and to work with amplitude, time/phase and spectrum
appropriately to accurately create stable images.
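One common way of creating a phantom image with amplitude alone is a constant-power pan law. The sketch below is a minimal Python illustration of that law, assuming a pan position that runs from -1 (hard left) to +1 (hard right); the function name and parameter are hypothetical. It addresses amplitude cues only and does not model the time/phase or spectral cues mentioned above.

```python
import math


def constant_power_pan(position):
    """Return (left_gain, right_gain) for a pan position in [-1, 1].

    -1 is hard left, 0 is center, +1 is hard right. The sine/cosine law
    keeps left_gain**2 + right_gain**2 equal to 1 at every position.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1 and 1")
    angle = (position + 1.0) * math.pi / 4.0  # maps [-1, 1] to [0, pi/2]
    return math.cos(angle), math.sin(angle)


for pos in (-1.0, -0.5, 0.0, 0.5, 1.0):
    left, right = constant_power_pan(pos)
    print(f"pan {pos:+.1f}: L={left:.3f} R={right:.3f}")
```

At center both gains are about 0.707 (roughly 3 dB down), so the combined power stays the same as the image moves across the sound stage.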
Taken as a sum, the lateral placements of all of the sound sources provide
the listener with a sense of width of the sound stage. The listener develops
a sense of “where” the sounds are and the size and location of the stage
where the song is being performed. This shapes an overall quality of the
recording that is very important.
The location and size of the sound stage have become increasingly significant,
as the industry has engaged surround-sound production practice.
This is a major factor that separates different approaches to surround-
sound production, and revolves around how the rear channels are used
for different types of program materials (especially musical materials and
reverberant sound). When sound sources are placed behind the listener, for
instance, the listener might perceive themselves as sitting within the sound stage.
The rear channels might also be used to pull the sides of the sound stage
wider than is possible with two-channel stereo playback. Of course it is also
possible (and not uncommon) for the sound stage to remain in front of the
listener, and for the lateral imaging to be very similar to a traditional ste-
reo recording, with the addition of manufactured ambient sound appearing
behind the listener.
The other dimension of the sound stage is distance. Sounds are placed at
a distance from the listener, and provide the illusion of depth to the record-
ing. As a group, the distances of all of the sound sources provide the front-
to-back dimension of the sound stage.
Distance placement can provide a special quality to a sound source. It can
allow a sound to be clearly apparent in a musical texture or to blend more
with other sounds. The importance of distance cues is often underestimat-
ed. These cues bring musical materials and instruments/voices into a physi-
cal relationship to the listener that can be profoundly effective in helping
the musical message or expressive nature of a line. Distance can bring a
musical idea to an immediacy for the listener, or can provide a sense of
being removed from the source and musical idea; this has the potential
to greatly shape the listener’s sense of the musical idea. A very different
sound stage exists when all sources are at approximately the same dis-
tance, as they appear to be performing in a similar area and have a sense
of connection in space, than when all sound sources are at even slightly dif-
ferent distance relationships, as they appear to extend the sound stage and
bring sources to have differing relationships to the listener. In recordings
where sources are extended from close proximity into the far areas, with
many sounds in between, the sound stage can achieve vast proportions
and bring a substantial new dimension to a mix.
The stage-to-listener distance is critically important in establishing the lev-
el of intimacy of the recording. The song can speak intimately to an indi-
vidual listener when the lead vocal is very close and clearly within their
sense of proximity. The song can have a different character with a moder-
ate distance, which places the listener in the position of observing what is
being said by the song instead of being personally engaged in it by a close
distance relationship.
It is important to remember that distance cues are primarily the result
of timbral detail. A high degree of low-level detail must be present for a
sound to appear very close. Sounds become more distant as this detail is
removed. Placing a sound in the mix can alter its timbral detail, as it blends
with or is masked by other sounds. Being sensitive to pitch density and
lateral location are especially helpful in preserving a sound’s timbral detail.
Reverberation can alter distance. This can be because of the ratio of direct-
to-reverberant sound, but is often because the reverberant sound masks the
timbral detail of the direct sound. The reverberation’s arrival time gap and
the reflections of the early sound field will also provide subtle cues that can
shape the imagined distance of the sound source. When adding reverb to a
signal, it is important to bring attention to how it is impacting distance.
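The direct-to-reverberant ratio mentioned above can be pictured with a very simplified room model. This Python sketch assumes that the direct sound falls about 6 dB for each doubling of distance while the reverberant level stays constant throughout the room; critical_distance_m, the distance at which the two levels are equal, is a hypothetical parameter. The sketch says nothing about the masking of timbral detail, which the text identifies as the more frequent cause of a change in distance.

```python
import math


def direct_to_reverberant_db(distance_m, critical_distance_m):
    """Estimate the direct-to-reverberant ratio in dB.

    Assumes inverse-square falloff of the direct sound and a constant
    reverberant level, so the ratio is 0 dB at the critical distance.
    """
    if distance_m <= 0 or critical_distance_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(critical_distance_m / distance_m)


for d in (0.5, 1.0, 2.0, 4.0, 8.0):
    ratio = direct_to_reverberant_db(d, critical_distance_m=2.0)
    print(f"{d:>4} m: {ratio:+.1f} dB")
```

Under these assumptions each doubling of source distance lowers the ratio by about 6 dB, which is one reason that raising the reverb level on a source tends to push it back on the sound stage.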
The listening perspective of the recordist will alternate between locating
individual sound sources on the sound stage, comparing locations of sourc-
es to one another, observing how all sources create an overall sound stage
(and observing how balanced the stage might be), and recognizing how
the sound qualities of sources might be transformed by placing the sounds
on the sound stage. These observations will be made for both distance and
lateral location cues.
Figure 14-1 Sound-stage diagram for two-channel recordings.
Sound stages can be planned and evaluated using the diagrams of Figures
14-1 and 14-2. These will prove helpful in balancing the sound stage and in
creating variety and interest—as desired—of image locations and distanc-
es. These diagrams are snapshots of time and may represent any time unit
from a moment to a complete song. They allow imaging to be recognized
and understood. The dimensions of the sound stage can be drawn around
the loudspeakers in each figure, and the listener location will be determined
in two-channel recordings. The diagrams should show the front edge of the
sound stage, giving the recordist a reminder of the sense of intimacy of the
recording, and the depth of the sound stage, providing important environment
size and distance information. A significant set of sound relationships
can be planned, crafted, and evaluated with these diagrams.
Figure 14-2 Sound-stage diagram for surround-sound recordings.
Each individual sound source is placed in an environment during the pro-
cess of making the recording. Qualities of the recording environment may
have been captured during tracking, and these may provide all of the envi-
ronmental characteristics that are desired. Very often environmental char-
acteristics are added to a sound source or a group of sound sources during
the mix. If a sound is added into the mix without environmental character-
istics (such as a direct-in electric bass), the listener will imagine an environ-
ment (often a very small one).
Environments have sound qualities. When a source is placed in an environment,
it acquires the spectrum and time/reflection components of the
environment. The environment and sound-source timbres are fused into a
single and new sound quality.
Environmental characteristics can add important dimensions to a sound
source, and shape its sound quality in significant ways. They also can
add to the dimensions of the sound stage. The environment of a sound
source can extend the rear of the sound stage, especially with long, high-
level decays in the reverberation. The perceived size of the environment of
a sound source can shape its musical material and also shape the sound
stage itself.
In surround sound production, it sometimes happens that the environmental
characteristics of a sound source are separated from the source itself.
As such, the direct sound appears in one location and the reverberant
information in another. Since the listener naturally fuses the two sound
qualities, but they are now clearly distinguishable because of different
locations, a new relationship is achieved. This is different from separating
the ambience of the entire mix with a sound stage located elsewhere. Care-
ful crafting of the mix will be required to allow this new relationship to be
acceptable to the listener.
Finally, the recordist must turn attention to shaping the environmental char-
acteristics of the overall program. The sound stage will be heard as existing
within a single performance space, the perceived performance environment.
The dimensions of the perceived performance environment may be applied
during the mastering process. A subtle reverberation program can be applied
to the final mix to provide an environment for the recording. In this case, the
recordist should plan for this change in sound quality while compiling the
mix, paying special attention to timbral balance and distance cues.
Often the environments of several or all individual sound sources will cre-
ate the perceived performance environment. In this case, elements of the
environmental characteristics of some or all sound sources are heard as
important to the character of the work, and become perceived as dimen-
sions of the perceived performance environment. No environment is added
to the final mix, but the listener imagines one. Usually the most important
instruments and voices shape these elements, but unusual environments
of less important sources can have a strong impact. In this way, the shaping
of the environmental characteristics of the sound sources that present
the most important musical materials will have a direct and marked impact
on the listener’s impression of the performance environment within which
the recording itself appears to take place. The recordist must be aware that
how certain individual environments are crafted could impact this overall
characteristic of the recording.
Shaping environmental characteristics requires focused attention at the
lowest levels of perspective, for spectrum, spectral envelope, amplitude,
and time information. Focus will shift to a higher level to evaluate how
the characteristics of the environment alter the source’s sound quality, and
will ultimately move to the highest level to observe the environment of the
overall program.
Crafting spatial properties is as important to the mix as any other element.
They may be used to delineate the sound sources into having their own
unique characteristics, or they may be used to cause a group of sound
sources to blend into a sense of an ensemble. It is possible for a group
of instruments to be grouped in several spatial dimensions (such as an
environment), but to have very different characteristics in another (such as
distance). Common characteristics provide a connection between sources
and unify them in some way. Differences distinguish sound sources and
can add prominence to their presence in the mix, and perhaps call more
attention to their musical material.
Performing the Mix
In performing the mix, the sound qualities and relationships of the record-
ing are realized. How everything comes together depends on technologies
and work methods.
The process of executing the mix is similar to performing, in that sound
is being controlled, often in real time. Many technical decisions also occur
during mixing; many are out of real time. These activities are planned, just
as sound qualities and relationships were planned while “composing” the
mix. Details pertaining to specific recording equipment, technologies and
techniques will directly impact how the mix is planned.
Work methods will vary by individuals, but they will all center on establishing
a logical, efficient and effective sequence of activities. The goal is
to best use the selected equipment and technologies to craft the mix as
planned, and to establish and maintain a high-quality signal. Equipment
and technologies vary greatly, but the process remains the same; the basic
signal chain and the events of the production sequence remain constant
and should remain clearly in mind. Once the recordist has obtained knowl-
edge of technologies and mastery of software and equipment usage, they
can develop their own approach to sequencing production activities and
decisions, and might ultimately create their own working methods. As
these methods are technology and equipment dependent, they are outside
the scope of this writing.
Recording aesthetics and the planned qualities of the recording will shape
the production process. Many possibilities for approaching the mix exist,
and the correct approach for one project is not necessarily correct for
another. For example, multitrack mixes are often ‘performed’ a track or two
at a time, but sometimes any number of tracks are crafted simultaneously;
in direct-to-master recordings all tracks are performed simultaneously. In
another example, sounds might be placed in the mix before sound qualities
receive their final shaping (with final EQ or reverberation added later) in
one project, but frequently sound qualities are finalized well before sources
are placed in the mix and combined with other sounds.
Skill in performing the mix relies on knowledge of the technologies being
used in the signal chain, on mastery of equipment and software usage and
the interface points of the signal chain, and on dexterity with the techniques
of creatively using these devices and technologies to craft the mix.
Summary
The mix is the recording in nearly its final form, and is where the piece of
music comes together. A vision of the sound qualities and relationships of
sounds in the recording must be present at the start of the recording pro-
cess. Preparation for the mix begins with selecting sound sources suitable
for the musical materials, and then capturing or creating the performances
of the sound sources during tracking or recording sessions.
The mix is composed through planning the qualities and relationships
of sounds, and the overall characteristics of the recording. Details about
how recording equipment, technologies and recording techniques will achieve
these qualities and relationships are also planned. During the mix, the
sound qualities, dynamics and spatial qualities of the sound sources and
the recording are then carefully shaped.
The sound sources may be recorded in a multitrack format (in a DAW or to
analog multitrack tape, as examples), for mixing at a later time, or mixed
directly to the recording’s final format (typically two-channel stereo or 5.1
surround). These contrasting methods will be explored in the next chapter.
The mix results in the final version of the piece of music. This recording will
often be transformed a final time in making the master of the recording.
Exercises
The reader will benefit greatly from methodically exploring the following topics.
Each of the topics should become the focus of an exercise that would have
the reader explore:
• How does the device or process transform the waveform?
• What changes to the waveform can I recognize?
• In what ways do these transformations impact the audio signal and the
sound source?
The accompanying CD provides tracks that can be used as sound sources for
many of these exercises, whether it be the sound of track 24 or 25’s cymbal set
to repeat indefinitely, one of the solo cello or solo guitar tracks, the piano pitch
of tracks 10 through 13, or any others that might be of interest or appropriate.
These tracks could be fed from a CD player into the various devices in the
signal chain. Other source material can certainly be substituted.
In all exercises, the reader should begin work with exaggerated settings. These
will likely be far from artistically pleasing, but will cause the changes to be
readily apparent and easier to perceive in first encounters, where subtle
alterations or changes in level might well go undetected. Upon repetitions,
make changes to the source(s) smaller and smaller, to refine your ability to
identify the changes and observe the alterations created to the qualities of
the sound source(s). Any exercise will benefit from repetitions with different
sound sources.
Do not think about the musical result of your exercises (tracks, sound quali-
ties and mixes) at this point; allow yourself the opportunity to simply learn
sound qualities and relationships. Learn these devices and processes, as they
are your tools; when you have control of your tools you can start bringing
your attention to making artistic decisions.
Exercises 14-1
Tracking Exercises
a. Alternate listening to input of console/DAW and playback of recorded
signal (output of sound card, I/O interface or record deck); seek to iden-
tify alterations to the source signal caused by the recording medium and
signal chain, and listen for added noise of any type.
b. Microphone selection and placement; listen carefully to individual micro-
phones for timbral detail and related frequency response and distance
cues, blend of source’s sound, pitch area and sound quality; substitute
microphones, alternate between several placements of the same micro-
phone.
c. Performance-related issues; listen carefully during tracking and in evaluating
recorded tracks for:
• Tuning of instruments, tuning from one take to another;
• Tempo remaining consistent throughout a take, and changes of tempo
between takes;
• Loudness levels of instruments and the recording consistent within
and between takes;
• Musical expression and performance intensity: do they match the
qualities you (or the performers) are trying to pull out of the music?
Do they remain consistent between takes?
Exercises 14-2
Signal Processing Exercises
By focusing critical listening on a single source alone, explore
the results of applying signal processors. Spend time with the manuals
(or Help files) of the devices or plug-ins to learn about and experience their
potential. Again, start with extreme settings to identify the perspective and
artistic element that must be the center of your focus, and then move to more
subtle changes.
a. EQ; listen to changes created to a single drum or cymbal sound; listen to
changes created to an entire guitar track; the EQ will act differently on
different pitch levels by “adding formants.”
b. Noise gates; listen to changes created to a single drum or cymbal sound.
c. Compressors; listen to changes created by applying compression to an
entire cello track (such as track 39) or drum track (such as track 38).
d. Delay; work with various individual drum sounds and the guitar track 42
to identify and learn the effects of adding delay to individual sounds and
to an entire track.
e. Reverb; work with the bass drum and tom drum sounds and the guitar
track 42 to identify and learn the effects of adding various reverb to indi-
vidual sounds and to an entire track.
f. Filters; place high-pass and low-pass filters on piano tracks 3 and 10
through 13 to identify and learn how the sound is changed; repeat this
for the cymbal sounds (tracks 24 and 25) and for cello track 41.
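Before listening to exercises b and c, it can help to see what a gate and a compressor do to signal level on paper. The Python sketch below is a minimal model under simplifying assumptions: a hard-knee compressor and a simple gate acting on levels in dB, with no attack or release behavior. The function names, thresholds, ratio and floor are hypothetical values for illustration and do not describe any particular device or plug-in.

```python
def compressor_gain_db(level_db, threshold_db=-20.0, ratio=4.0):
    """Return the gain change in dB (zero or negative) for an input level.

    Above the threshold, every `ratio` dB of input rise yields 1 dB of
    output rise; below the threshold the signal is left alone.
    """
    if ratio < 1.0:
        raise ValueError("ratio must be at least 1")
    over = level_db - threshold_db
    if over <= 0.0:
        return 0.0
    return over / ratio - over


def gate_gain_db(level_db, threshold_db=-50.0, floor_db=-80.0):
    """Return 0 dB when the level reaches the threshold, else the floor."""
    return 0.0 if level_db >= threshold_db else floor_db


# An exaggerated 20:1 setting makes the level change easy to see (and hear).
for level in (-60.0, -30.0, -20.0, -8.0, 0.0):
    gentle = level + compressor_gain_db(level, ratio=4.0)
    extreme = level + compressor_gain_db(level, ratio=20.0)
    gated = level + gate_gain_db(level)
    print(f"in {level:6.1f} dB  4:1 -> {gentle:6.1f}  20:1 -> {extreme:6.1f}  gate -> {gated:6.1f}")
```

With a -20 dB threshold and a 4:1 ratio, an input at -8 dB is 12 dB over the threshold and comes out 3 dB over it, at -17 dB. Comparing the 4:1 and 20:1 columns mirrors the advice to start with extreme settings and then move toward subtle ones.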
Exercises 14-3
Mixing Exercises
Mixing requires sources to be compared to one another and relies on analyti-
cal listening observations. Bringing one’s attention to the level of perspective
just above the individual sound source, where the sources are of equal impor-
tance and can be compared to one another, can develop this skill.
For these exercises the reader will need to compile or obtain source tracks that
can be mixed. Many DAWs come with tutorials that supply such tracks, but
recording a few friends performing a simplified cover of a well-known song
can be a very rewarding (and educational) way of securing these tracks.
Start your exercises with 2, 3 or 4 tracks and gain confidence with results
before moving on. Again, start with extreme settings to identify where (the
perspective and artistic element) the center of your attention must be, and
then move to more subtle changes. Remember, do not burden yourself think-
ing about the musical result of your mixes at this point; allow yourself the
opportunity to simply learn sound qualities and relationships. These mixes
need not be recorded and shared with others.
The reader might want to use some of the graphs and figures from evaluating
the recordings of others to keep track of their own mixes. They can be
especially helpful in beginning musical balance, stereo location and distance
location exercises.
a. Musical balance; try to align two sources at the same loudness level; then,
noticing how loudness level shifts change relationships, alter the loudness
levels to make one sound much louder than the other; then notice that
sounds can be soft and still be prominent; try to create a musical balance
that is contrary to performance intensity cues (for instance, loud sounds
appearing soft in the mix); work through these issues again with 3, 4 or
more sources.
b. Timbral balance; in the first mix you established for musical balance
above, change the timbral balance by adding EQ to increase a portion of
the spectrum of one of the sounds, until you notice a shift in the timbral
balance of the mix; repeat this to subtract the same portion of the spec-
trum of one of the sounds, until you notice a shift in the timbral balance
of the mix; work through these two concepts again with 3, 4 or more
sources.
c. Stereo location; returning to the first musical balance mix, pan one source
to the far right side of the sound stage and the other far left; after listening
to this, bring them both to the center, then both halfway between center
and left or right, respectively; now do the same for your second musical
balance mix. Now add a number of other sources to the mix and place the
sounds in distinctly different locations to make a wide sound stage, then
move them all near the center to create a narrow sound stage; finally,
listen carefully to the spectrum of the sounds and try to place them in
locations where (1) spectral components are not covered by other instru-
ments, and then (2) where the spectrums of two or more instruments
blend together.
d. Distance location; returning to the first musical balance mix, pan one
source to the far right side of the sound stage and the other far left; after
listening to this, bring them both to the center, consider any changes
to distance location that may have been created by masking of spectral
information in overlapping stereo locations; separate the sources far left
and right again and begin changing the EQ of one of the sources to de-
crease timbral detail and therefore increase the distance of the source
from the listener; return the EQ to a flat setting and add a reverb program
to the source until timbral detail blurs and distance is increased.
15 The Final Artistic Processes
and an Overview of Music
Production Sequences
Multitrack and direct-to-master recording processes can represent the
extremes of the different recording aesthetics. They are very different in
their production sequences and their utilization of recording techniques.
The artistic use of the recording medium, the importance of the perform-
ers and their interactions, and the amount of input the recordist has on the
artistic aspects of the recording may be very different between multitrack
recordings and direct-to-master recordings.
The differences in approach to production (often linked to aesthetic positions
discussed in Chapter 12) are reflected in the sequence of events that
occur in creating music recordings. Production aesthetics and techniques
will determine the amount of control and influence the recordist will have
over the musical relationships in the recording, and the extent to which the
musicians determine the musical relationships (by their performances as
individuals and by their interactions as an ensemble).
The final, creative processes of editing and mastering the music recording
will change somewhat with individual approaches to multitrack recordings.
Those differences will be even greater between multitrack recordings and the
various approaches to direct-to-master recordings. Finally, the relationships
of editing and mastering to the final recording (and “record album”) will be
considered, and lead to the final presentation of the recording and music.
An Overview of Two Sequences
for Creating a Music Recording
Every music recording project is unique. Some general observations about
production techniques and sequences can be made, but details will cre-
ate differences between projects. The sequence of events in the individual
recording production and the use of the recording chain will be adapted to
suit the needs of the individual music recording.
Some projects will require more session preplanning than others. One
project may require more mixdown planning and preparation than anoth-
er. Other projects might have very different requirements from more con-
ventional projects. The order of events will be mostly consistent with the
outline below; some overlapping between the events will be common, as
well as some alterations to the orders of the events—or portions thereof.
Multitrack Recording Sequence
A complete sequence of events for a multitrack recording might be:
1. Session preplanning: conceptualize project and pieces of music to be
recorded (writing the music if necessary), rehearse musicians, define
sound sources and their timbres, select microphones, plan track assign-
ments, determine recording order of the tracks, plan the recording’s
sound stage;
2. Tracking session: record reference tracks (vocals and accompaniment,
etc.), followed by recording the basic tracks (primarily the rhythm
tracks);
3. Editing of basic tracks for out-takes, and to create the basic structure
and length of the piece; reorganize tracks;
4. Overdub sessions: replacing reference tracks with final performances;
adding solo parts and secondary ideas to the basic tracks, refining the
musical material; composing and recording any additional parts to fill
newly discovered requirements of the piece;
5. Processing and mixdown preparation sessions: finalize the sound qualities
of sound sources; edit the source tape/tracks for mixdown (reorganize
tracks, remove unwanted sounds);
6. “Compose” the mix: defining the artistic elements of dynamic levels,
spatial properties and sound quality for each sound source, and by
considering the interrelationships of the mix and the musical materi-
als of the piece; rehearse the mixdown sequence with people who will
assist in the session; remember, different sections can have very differ-
ent mixes;
7. Mixdown session: perform the mix(es), mixing the multitrack down to
two tracks or surround (often occurring during the same session as
Step 6);
8. Compiling the song: assemble and process a final version of the song
by combining the section mixes and by matching levels and applying
any global signal processing.
Direct-to-Master Recordings
Direct-to-two-track recordings (or direct-to-surround or direct-to-mono) are
common in music recordings for film, television, and advertising. These
recordings are mixed to final relationships (master) during the recording
session—and the mix/recording processes can often impose minimal alter-
ations to the ensemble’s sound. This approach is common in recording art
music (such as orchestral, choral, or chamber music). It can also be suitable
for jazz, folk, ethnic musics, popular, rock, or any other music when the
musicians (or the conductor) want to be in control of the musical relation-
ships within their performance, or when the function of the recording is
best served by having all of the musical parts performed at once (often the
case for film scoring or archival recordings, as examples).
The process of making direct-to-master recordings is strikingly different
from the multitrack recording process discussed previously. Nearly all of
the recording process considerations of Chapter 14 are not directly relevant
to this approach.
Further, the act of defining the sound quality, as presented in Chapter 13, is
shifted from the sound source to the perspectives of the overall ensemble,
of groups of instruments within the ensemble, or to the perspective of a
limited number of individual soloists.
A complete sequence of events for a direct-to-master recording might fol-
low the outline below. These events will be discussed in detail and in the
form of a commonly occurring sequence in the following paragraphs. This
sequence, and the details that follow, are guidelines that are altered for the
individual project—sometimes markedly.
1. Session preplanning;
2. Creating the sound quality of the recording;
3. Consultations with the conductor (musicians);
4. Recording session;
5. Selection of takes; and
6. Editing to compile a final version of the recording and music.
1. Session preplanning always begins the production sequence. Once
the music to be recorded is known, the recordist will need to know the
performance level of the musicians (performers) and the location of the
recording session (if it will not take place in the studio). This will allow
suitable microphones to be selected, an appropriate stereo or surround
microphone technique to be identified (if desired), and microphone placements
to be planned. The acoustics of the recording environment will also need
to be evaluated when the recording is to take place in a space unknown to
the recordist.
The recordist may then meet with the primary performers (or the conduc-
tor of the ensemble) to determine how the music can be effectively divided
into sections, or the recordist might make these decisions on their own.
Problems of editing sections together into the master recording of the work,
and issues in stopping and starting the ensemble, will be considered when
making these divisions. The order in which the sections will be recorded
may then be determined. The recordist’s and conductor’s scores, and the
musicians’ parts, should be marked to identify these sections, to make
starting and stopping the ensemble during the recording session clear and
efficient. A discussion between the recordist and the conductor (or
performers) should also clarify the recordist’s artistic role in the project.
Further discussion, and perhaps some recorded rehearsals in the recording
space, should clearly define the sound qualities that will be sought for the
recording project.
2. Crafting the recording’s actual sound quality can be accomplished by
monitoring a final rehearsal of the ensemble in the performance space
where the recording will take place. Microphone selection and placement
will be altered to achieve the desired sound quality of the recording, as
previously discussed. The microphone selection and placement will largely
determine the spatial properties, dynamic level relationships, and the
sound quality of the recording. Balancing of microphones and signal
processing will do the final shaping of the sound quality of the
recording. Any necessary signal processing (environmental characteristics,
time delay, EQ, and dynamic processing being most common) for accent
microphones and the stereo array (or arrays) will be added and tuned at
this stage of the production sequence.
After the final sound quality has been established, portions of the rehearsal
are recorded for later reference and discussion. Any changes in the mix that
may be required for the recording session are determined. In the best of
situations, these changes will be thoroughly rehearsed during this rehearsal
of the ensemble.
3. The recordist and conductor (or musicians) will listen to the reference
recording that was made during the dress rehearsal. Often their discus-
sion will be solely on the subject of sound quality; all musical consider-
ations may be determined between the conductor and the musicians, or
amongst the musicians themselves. Any alterations that must be made to
the recording’s sound quality are determined during this discussion; the
recordist needs to obtain a clear idea of the sound qualities required for
the project.
All requested changes to the sound quality of the recording are worked
into the recording process. The microphone and recording equipment
set-up for the recording session will reflect these changes. Sound quality is
rechecked during the musicians’ warm-up period, before the beginning of
the recording session. The recordist and conductor (musicians) might now
make a final evaluation of the changes that were made to the sound quality
and confirm that the sound quality is correct. The recordist, conductor, and
musicians will briefly clarify the logistics of the recording session (how
stopping, starting, slating, etc., will be handled).
4. The recording session follows. During the recording session, the sections
of the piece are performed in the prearranged order. Many takes may be
performed of each section of the work until two suitable takes (of each sec-
tion) are recorded. Each take of each section is monitored by the recordist,
with a focus on consistency of loudness levels, tempo, intonation, perfor-
mance quality, and the expressive qualities of the performance. An assis-
tant engineer or second engineer may be used to assist in sound evaluation.
This person would focus their attention on the technical and critical-listen-
ing aspects of the sound of the recording. If possible, another assistant will
be used to maintain a record of the content of the session takes, making
notes on the recordist’s observations of each take, and of the observations
of the musicians’ spokesperson (usually the conductor, if one is present).
Any changes in the mix that are required in the recording were
choreographed during the rehearsal session(s). These changes in the mix are
performed, in real time, during the musicians’ performance in the recording
session. The recordist may coordinate the activities of one or more assistant
engineers, who would physically perform the actual changes in the mix. The
recordist would remain focused on the accuracy level of all of these chang-
es, as they are made, as well as their relationships to the performance.
A multitrack recording of the session may be made simultaneously with
the reduction mix to make a safety recording of the session. This will allow
the recordist to perform a remix of the session at some future time, should
this be necessary. All microphones will usually be sent from the console
directly to a DAW or multitrack recorder, often routed from the console’s
direct or patch outputs. No balancing of dynamic levels or extra signal pro-
cessing would normally be performed on these tracks. This would defeat
the purpose of the multitrack backup.
5. The conductor (musicians) will often listen to the session tapes with the
recordist. Often they will listen to only a few takes of each section of the
piece. These takes will have been preselected by the recordist, using the
conductor’s observations (that were written down by the production
assistant) during the recording session as a guide for the selection of takes.
Takes with technical problems will also be identified. The takes that will be
used in the final recording are determined during this conference between
the recordist and the conductor. Both parties will discuss their perception
of the sound qualities of the takes, and the recordist may or may not be
asked to evaluate the musicality or accuracy of the performances of each
take. Any specific aspects of the sound quality that are undesirable will be
identified. The recordist might determine signal processing alterations to
attempt to solve (or minimize) any sound quality problems and may play
some of the possible alterations for the conductor (performers) during this
session. A remix from the safety multitrack might be considered at this
point as a last resort in the event of very poor session results.
6. Any signal processing or mixing alterations that were determined in the
review of takes are performed by the recordist in the mastering session.
These changes may be performed before or after the master recording
has been compiled, depending on the type of alterations that need to be
made. The master recording is created by editing or “splicing” together the
selected takes—at the correct locations and in the correct order. Any global
signal processing will be applied to the overall program after the edited
version has been assembled. In this case, a master recording will be made
by playing the edited version through any signal processing device(s), to
record the actual master of the work. The recordist will arrange for the
conductor (musicians) to hear the master recording, for final observations and
approval. Any final alterations to sound quality (etc.) requested by the
conductor will be performed by the recordist and will complete the project.
Editing: Rearranging and Suspending Time
With analog tape, the recordist can physically hold time in their hands and
move it around. Audio recording transfers sound, which can only occur over
time, into a storage medium where the sound is physically located,
suspended out of time. This is very significant, and this concept is getting
lost as more people have never experienced or worked with analog tape. The
sound can then be changed and reordered by physically altering the storage
medium itself (as in cutting analog tape), or by altering the way the storage
medium reproduces the sound (i.e., replaying a portion of a digital
recording/sound file). The sound may be altered at any time, present or
future, and may be
replayed forwards, backwards, at any speed (even at uneven speeds).
In editing sound, the recordist is able to precisely shape material out of real
time. Editing usually combines or joins several different time segments, each
time segment being composed of a group of any number of sounds. The
time segments may exist as pieces of analog tape or as computer data.
In joining the sound segments, the recordist can significantly alter the
piece of music and its artistic message. These alterations to the music
that are made possible by editing serve many functions. The edit must be
accomplished in an artistically sensitive manner and must be inaudible in
all areas of technical quality.
It may be impossible to perform technically inaudible or artistically sensi-
tive edits under some circumstances and in some locations in the piece of
music. The recordist will identify potential edit points to carefully calculate
an edit before it is made. In analog, this may even involve rehearsing the
edit on a copy of the master tape.
Editing is often used to compile a master of a recording session. In this
process, a few or a good many separate sections of the piece of music
are joined into a single performance. The most appropriate material, or the
most accurate and/or pleasing performances, will be selected for the master.
A single performance is compiled from the many takes of segments of
a piece or of a direct-to-master recording, or a single performance is
compiled by joining the few individual mixes of a multitrack recording.
It is possible to reorder sounds through editing techniques. The major
sections of a piece of music may be rearranged. Entire measures may be
exchanged, or sounds within a measure may be reordered.
In analog recording, the editing process will alter all sounds present. A
reordering of sounds cannot occur unless they are isolated. It will not be
possible to reorder the sounds of instruments in a drum fill without also
moving the sounds that occur simultaneously with the drum sounds. Like-
wise, it is impossible to cut a sound source into numerous time segments
and reorder the sound, unless that sound is isolated from other sounds.
Today’s digital audio workstations make all of these and much more pos-
sible, depending on how tracking was accomplished.
Identifying Edit Points
Edit points (also called splice locations) are calculated by anticipating the
sound that will be created when the two segments are joined. A critical-
listening process of evaluating sound quality is used. Each segment to
be joined will be evaluated for its sound qualities to determine the most
appropriate location of the splice. Beginners often find the edit points in
the music through trial and error. With developed skill, the listener will
readily identify these locations by listening carefully to the sound qualities
of the two segments, remembering what was heard, and comparing the
two sound events. How the edit impacts the musical materials will also be
considered by the recordist.
Audible edits are nearly always unacceptable and may be created by many
factors. Both the critical listening concerns of audio quality and the ana-
lytical listening concerns of the musical materials must be considered in
determining suitable edit points. The sound must be evaluated for any
changes that might be caused by the edit process itself, and for any noises
that may have been added. In calculating the edit, the recordist will scan all
artistic elements, or perceived parameters of sound, at all perspectives, to
determine a usable edit point.
It is not possible to perform an inaudible edit when large differences exist
in any of the elements of sound, between the two time segments. Such a
splice would result in a sudden alteration of a component of sound at the
point where the two segments meet; the sudden change would be audible
and unacceptable. As soon as an edit has been made, it will be checked for
accuracy and to be certain it is inaudible, and that no noises were added in
making the edit.
Under unique circumstances, sudden changes between segments may be
desired, as in creating a master recording where the splice actually joins
very different musical ideas. In these instances, the recordist must make
certain that the editing process does not create noise at the edit point, and
that the sudden changes are presented as a part of the musical materials
(have significance and are handled artistically).
Edits are most easily made at points where loud attacks are performed by
prominent instruments or the entire ensemble, or immediately before or
after (not during) areas of silence.
Sound sources that are sustaining over the edit point or that are present
in each time segment make the edits more difficult. Changes in the sound
source will make the edit point audible.
Among the most common of inconsistencies that are present between two
time segments are differences in loudness levels. Even subtle changes can
be quite audible. Calculating the loudness levels between various takes of
an entire ensemble can be quite difficult, but this skill will be developed through
learning to focus on program dynamic contour. Beginning recordists will
often only notice problems in this area after the edit has been made.
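
As an illustration only, the loudness comparison described above can be approximated numerically by measuring the level of each take close to the intended edit point. The sketch below assumes that RMS level over a short window is an acceptable stand-in for perceived loudness (it is only a rough one), and the function names, window length, and test tones are hypothetical.

```python
import math

def rms_db(samples):
    """RMS level of a block of samples, in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")

def level_mismatch_db(outgoing, incoming, window):
    """Level of the start of the incoming take minus the level of the
    end of the outgoing take, each measured over `window` samples."""
    return rms_db(incoming[:window]) - rms_db(outgoing[-window:])

# Hypothetical material: the same 220 Hz tone, the second take 1 dB quieter.
rate = 48000
take_a = [0.5 * math.sin(2 * math.pi * 220 * n / rate) for n in range(rate)]
gain = 10 ** (-1.0 / 20)
take_b = [gain * s for s in take_a]

mismatch = level_mismatch_db(take_a, take_b, window=4800)
print(f"incoming take is {mismatch:+.2f} dB relative to outgoing take")
```

A measurement of this kind can flag a mismatch before the edit is made, but it does not replace listening to the program dynamic contour across the edit.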
Tape noise is part of analog recording. The amount of noise on the tape
may or may not be consistent throughout a recording. Changes in noise
floor at edit points are very noticeable.
Differences in sound quality of individual instruments and of the overall
ensemble are easily overlooked. The potential exists for sound sources
and an ensemble to undergo significant changes in sound quality from
the beginning of a recording session to the end. Performer fatigue, perfor-
mance intensity, artistic expression, or a change in temperature or humid-
ity in the performance space may cause these changes in sound quality.
Even subtle changes of sound quality can have a marked impact on the
technical quality and musicality of the recording.
Changes in pitch between the two segments can be the most noticeable of
all changes. The recordist must be well aware of any inconsistencies in this
element. Inconsistencies may occur within a particular sound source, or it
may be a change of the reference pitch level (tuning) of the ensemble. Care
must be taken to monitor the tuning of the ensemble and the intonation of
the performers.
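
A change in the ensemble's reference pitch can be expressed in cents (hundredths of an equal-tempered semitone), using the standard relation of 1200 times the base-2 logarithm of the frequency ratio. The short sketch below is an illustration only; the function name and the tuning values of 440 Hz and 442 Hz are hypothetical examples, not figures from a particular session.

```python
import math

def cents_between(f_ref_hz, f_other_hz):
    """Interval from f_ref_hz to f_other_hz in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(f_other_hz / f_ref_hz)

# Hypothetical case: the ensemble tuned to A = 440 Hz in one take and had
# drifted to A = 442 Hz by a later take.
drift = cents_between(440.0, 442.0)
print(f"tuning drift between takes: {drift:.2f} cents")
```

Whether a drift of a given size will be audible at the edit depends on the musical context and on which sources sustain across the edit point.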
No changes in spatial properties should occur at the edit point, unless they
are planned. It is common for spatial properties to be considerably different
between time segments, when they represent different mixes of a multi-
track master. Sudden shifts of distance locations are common, and have
the potential to create few technical problems. Although sudden shifts of
surround or stereo location are equally common between time segments,
musical and technical problems can be created. Among these are unstable
images and phase differences between similar sounds at the edit point.
Sound sources or environments that have a lengthy decay may need to be
carried over across the edit point. This may or may not be possible, depend-
ing on the musical context and the nature of the sounds themselves. Edits
at these points are sometimes possible when other sources mask portions
of the sound, but must be carefully handled to avoid audible changes of
sound quality. These edits may need to be planned before the recording
session, with suitable alterations made to the performances at the session,
for instance starting the performers a bar before a planned edit point to
have reverberation present across the edit.
The musical material must remain in rhythm. It is possible to add or sub-
tract time in making an edit. Rhythm changes are very noticeable in their
effect on the performance, and measures will appear to be extended or
shortened by fractions of a beat.
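
The amount by which an edit extends or shortens a measure can be estimated from the tempo, since one beat lasts 60 divided by the tempo in beats per minute. The following sketch is illustrative; the function name, the 50 ms error, and the tempo of 120 beats per minute are assumed values.

```python
def beat_fraction(offset_seconds, tempo_bpm):
    """Express a timing error at the edit point as a fraction of one beat."""
    beat_seconds = 60.0 / tempo_bpm
    return offset_seconds / beat_seconds

# Hypothetical case: the edit adds 50 ms of time at 120 beats per minute,
# where one beat lasts 0.5 seconds.
error = beat_fraction(0.050, 120.0)
print(f"the measure is extended by {error:.2f} of a beat")
```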
Tempo changes or inconsistencies can occur between takes. The tempo of
the performances will be carefully monitored during the recording process,
but like all of the above it must be reevaluated during editing. Any tempo
differences that are present between segments will make the edit point
very noticeable, and will also make significant changes to the music. An
entire take may be unusable, solely because of tempo inconsistencies.
Editing Techniques and Technologies
Analog and digital recording systems have some different characteristics
specifically related to their technology. The inherent qualities of each
format create advantages or disadvantages depending on the application of
the recording and the specific nature of the recording session. Either
analog or digital recording may be the most appropriate choice, depending on
the individual recording project. Sound is edited very differently in the two
technologies.
Analog Tape Editing
In an analog recording, a physical image of the sound is present as orient-
ed magnetic particles on tape, and the physical characteristics of the image
are directly proportional to the soundwave. In editing an analog tape, the
tape itself is physically cut with a razor blade. Two cut ends of magnetic
tape are joined (usually at 45°) with an adhesive tape.
Splice locations are found by slowly moving the tape across the playback
head of the recorder. By rocking the tape across the head, the recordist is
able to identify the edit point. The edit point is physically located on the
tape at the playback head. The tape is marked, removed from the recorder’s
tape path, placed in an editing block, and cut.
Once an analog tape has been spliced, it is difficult to redo an edit.
Splices are difficult to separate without causing damage to the magnetic tape
(which contains the sound, the music). If the recordist is successful in undoing
the splice without damaging the tape, it is difficult to cut thin time
segments (pieces of tape) off the end of a magnetic tape (should the original
splice be just a bit too far to the left of the desired edit point). It is almost
impossible to add a small piece of magnetic tape onto the beginning of a
tape segment (should the original splice be a bit too far to the right of the
desired edit point). Identifying analog edit points, and the actual cutting
and taping activities required of analog editing, all require significant skill
gained through practice and experience.
Difficult edits are sometimes rehearsed. Copies of portions of the session
tapes are made, and the copies are edited. The recordist gains confidence,
or finds the precise edit points that are usable, on the copies of the tape,
thereby allowing most errors to be made on tape that will not be used in the
final version of the project. Obviously, this is a time-consuming process.
Digital Sound Editing
Digital recording formats are quite different from analog. Sound exists as
digital information, stored as data files. Specialized computers or
specialized software for personal computers are used to edit the waveform. The
digitized waveform can be altered by modifying and/or rearranging its
digital information, but this need not be so. In many systems, edits are simply
“play lists” of select portions of select files at precisely defined starting and
stopping points.
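
The idea of an edit as a play list can be sketched in a few lines of code. This is a simplified illustration, not the data format of any particular editing system: the `Region` type, the take names, and the tiny lists of numbers standing in for audio samples are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    source: str  # name of the take to read from (hypothetical label)
    start: int   # first sample to play (inclusive)
    end: int     # sample at which playback of this region stops (exclusive)

def render(playlist, takes):
    """Play the listed regions in order; the source takes are only read."""
    out = []
    for region in playlist:
        out.extend(takes[region.source][region.start:region.end])
    return out

# Two hypothetical takes, with small integers standing in for audio samples.
takes = {"take_1": list(range(0, 10)), "take_2": list(range(100, 110))}

# The "edit" is only this list of starting and stopping points.
playlist = [Region("take_1", 0, 4), Region("take_2", 4, 8), Region("take_1", 8, 10)]
master = render(playlist, takes)
print(master)
```

Because the play list only points into the takes, changing an edit means changing a starting or stopping point in the list; the recorded material itself is never rewritten.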
A disadvantage of digital editing is that the sound cannot be held in the
recordist’s hand. There is no physical location of the recording and its com-
ponent sounds. All editing is accomplished on a computer and must be con-
ceptualized more abstractly than analog editing practices. Decisions might
rely on the eyes, rather than on listening.
Conversely, the primary advantage of digital editing is that the sound is
not physically present in the recordist’s hands. The sound exists as com-
puter information and may be acted upon in ways that are not restricted
by physical limitations. The following items are the most commonly used
among the many functions of most digital editing systems:
• Precise edit points may be identified and saved for future use, with
great time resolution
• An edit may be heard, changed, reheard, and evaluated by the recordist;
in many systems an edit might never need to be permanent
• Edits can be undone, quickly and easily
• The edit does not alter the original material; the original recording is
not edited; a copy of the original recording is edited, as a computer file
(with no generation loss)
• Overall dynamic levels of the time segments on either side of the edit
may be controlled to match at the edit point
• Edits may be made by cross fading from one segment to the other, or
by suddenly switching from one take to another (called a butt edit)
• Some systems allow the signal to be heard as the recordist moves the
cursor point (simulating the rocking of an analog tape across the
playback head)
• Time, dynamics, and frequency processing are usually available to
address specific types of inconsistent sound quality and relationships
between the two time segments
• Special effects, such as looping and reversing sounds, are common
It should be evident that digital formats allow more flexibility in and control
over the editing process than was available in analog editing.
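
The difference between a butt edit and a cross fade can be shown with a minimal sketch. It assumes two takes of equal length that are already aligned in time, and it uses a linear (equal-gain) fade purely for simplicity; actual editing systems typically offer other fade shapes. The function names and sample values are hypothetical.

```python
def butt_edit(outgoing, incoming, edit_point):
    """Switch abruptly from the outgoing take to the incoming take."""
    return outgoing[:edit_point] + incoming[edit_point:]

def crossfade_edit(outgoing, incoming, edit_point, fade_len):
    """Fade linearly from the outgoing take to the incoming take over
    `fade_len` samples, beginning at `edit_point`."""
    faded = []
    for i in range(fade_len):
        w = i / fade_len  # weight of the incoming take, rising from 0
        n = edit_point + i
        faded.append((1.0 - w) * outgoing[n] + w * incoming[n])
    return outgoing[:edit_point] + faded + incoming[edit_point + fade_len:]

# Hypothetical material: a constant level of 1.0 joined to silence, which
# makes the shape of each transition easy to read.
take_a = [1.0] * 8
take_b = [0.0] * 8

print(butt_edit(take_a, take_b, 2))
print(crossfade_edit(take_a, take_b, 2, 4))
```

The butt edit changes level within a single sample, while the cross fade spreads the change over the length of the fade; which is appropriate depends on the material on either side of the edit.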
Mastering: The Final Artistic Decisions
The final master recording of the piece of music (joined with all other pieces
in the project) is the result of a mastering process. The mastering process
is the last chance to shape the recording before it goes to the duplicating
plant. This is the last creative step in crafting the individual song, as well
as shaping the whole album into one experience. Mastering fills the void
between mixing and replication, where sound can be enhanced one last
time and any problems repaired before the recording is finalized.
Mastering provides the final touch to make the record album sound finished, and
also will seek to ensure the recording will retain its sound quality when
played on a variety of playback systems and formats.
Mastering can leave the individual piece of music sounding much as it did
in the studio, after mixdown and any final processing. It can consist almost
solely of ordering the songs of the album and perhaps adjusting loudness
levels between tracks (songs). In earlier times when the primary final
format for recordings was LP records, these were the primary tasks of the
mastering engineer. Subtle equalization (to compensate for frequency
differences between the inner and outer grooves of the disc), gentle
compression (to ensure the level remained above the noise floor) and limiting (to
protect the cutter head) may often have been used, but the sound quality
of the recording made in the studio was not intended to be altered by the
mastering process.
This approach might still be found but is now quite rare. Currently the
mastering process is much more involved, and will typically transform
a recording significantly from what was made in the studio. While these
transformations may often be subtle to untrained ears, they are
significant to the success of the recording. These transformations can be used to
address problems with the recording and to enhance its sound qualities. As
many recordings are now made in lower-end and home studios that often
have inaccurate monitoring, and many times are made by people with
inadequate experience, the importance of the mastering engineer and their
influence on the final recording has increased substantially. Their monitor
systems must be carefully calibrated and in a quiet and acoustically neutral
room in order to be very accurate.
The mastering engineer will shape sound at the highest levels of perspec-
tive to enhance or correct dimensions of the overall characteristics of the
recording; they will also listen at all other levels of perspective to focus on
how their alterations impact all of the other aspects of sound. They engage
in both artistic and technical tasks, and use both critical and analytical
listening. Where mixing utilized methods for improving the recording by
manipulating the sound characteristics of the individual sound sources
within it, mastering requires techniques for enhancing and correcting com-
pleted mixes. Mastering can turn an ordinary mix into an extraordinary
recording, and it is also possible for a great mix to be ruined by poor mas-
tering decisions.
The sequence of the mastering process follows. This ordering of events will
be largely consistent between projects:
1. Assembling all final versions (mixes) of all of the songs (pieces of music,
or movements of compositions) of the project (album, film sound track,
etc.) into the song tracks of the final release;
2. Establishing the time length of spaces between the songs;
3. Editing to remove noises in the recording and to minimize distortion or
unwanted sounds;
4. Establishing an appropriate timbral balance (also called spectral bal-
ance) for the album, sometimes for an individual song;
5. Adjusting the dynamic levels of individual tracks (songs) as needed;
6. Establishing an appropriate dynamic level for the album, and leveling
the individual dynamic levels of all tracks; and
7. Coding for replication.
Assembly
The process begins with assembling the project. The individual tracks are
sequenced, or placed in an order. This is not a simple matter. Projects
sometimes have an overall concept (an idea that originated with Sgt. Pepper’s
Lonely Hearts Club Band) though this is not common. They will, however,
very often have some type of a theme or direction throughout or a central
song to bind the album into a single experience. Rarely is an album simply
a random collection of songs. The songs will be carefully ordered to pro-
vide a diverse experience that brings changes of tempo, mood, subjects,
materials, intensity, and sound characteristics. The record album will be
compiled to create the most rewarding musical experience from begin-
ning to end. This is the idea that the album is a single experience, and that
individual songs are enhanced because of their relationships to the other
songs on the album.
Important to successfully creating this single experience is the length of the
silences between songs (tracks). The lengths of these silences are based on
the music, not on the clock. No formula or preset amount will provide the
right movement from one song to another. This is, in effect, the rhythm of
the record album. A pace is established between songs that is sometimes
lengthened (to slow the progression), sometimes shortened (to hasten it),
at times eliminated (to have one song lead directly to another); at times
the time between tracks might be quite long to heighten anticipation, and
songs might even overlap in a cross fade for a length of time to form a
segue. Some pieces of music end in silence. This time is used for listener
reflection, for a sense of drama, or to allow the music to reach its own
sense of conclusion. This silence might even represent the song’s reference
dynamic level. The lengths of silences between pieces will be carefully cal-
culated to effectively serve the individual pieces of music and the overall
project. Identifying this time unit requires an understanding of how the
songs relate to one another and to the overall idea of the album.
Editing
The album and all tracks receive final editing. Noises might exist in a song
that must be eliminated; these may be performer noises, poor edits, tech-
nical issues, and more. Song heads and tails are sometimes altered at this
stage; they might be shortened by editing out some material or extended
by copying and pasting material; fadeouts might be created. In classical
music and concert recordings the sound of the hall or concert venue is
needed between tracks; room tone, applause, audience sounds and the like
might be added to fill this space. Clicks, thumps, pops and other noises will
be eliminated. Noise reduction techniques might be employed to reduce or
eliminate narrow or broadband hiss, or hum and buzz (harmonics of hum)
that can originate in the recording process or from storage media.
Adjusting Timbral Balance
Establishing an appropriate timbral balance (this is also called spectral bal-
ance and tonal balance) for the album follows. One of the most important
roles of the mastering engineer is to achieve a desirable timbral balance for
the album (and sometimes for each song individually). This greatly shapes
the overall sound quality of the recording, and can provide the recording
with a unique “sound.” The goal of adjusting timbral balance is to bring out
the most desirable qualities of the recording and to address any spectral
problems that might exist. Often subtle changes in spectrum will have a
great impact on the recording.
Timbral balance is adjusted for a variety of purposes. Individual tracks
(songs) might be treated separately, and their spectral issues handled sep-
arately. Some songs might deliberately have a unique sound quality that
is celebrated and treated differently from other songs on the album; one
song might have a very different spectral problem from another, causing
each song to receive a very different treatment. Often an album will have
an overall sound quality that helps bind it into a single artistic statement;
spectrum will play a decisive role in this, and the timbral balance will be
crafted carefully to shape an appropriate overall quality. We also find that
different types of music (pop, blues, rock, classical, jazz, metal, rap, etc.)
have different timbral balances that contribute to the character of that type
of music; this is especially prominent in how different types of music use
the lower octaves of the spectrum.
This adjustment to the overall spectrum of the recording will also take into
account the need for the recording to play back most effectively on a wide
range of listening systems (home entertainment, automobile, MP3 play-
ers) and playback formats/media (from internet download codecs to high-
resolution release formats such as SACD). The mastering engineer must be
able to project how the project will sound after it has been transformed in
these ways.
Equalization is the primary way timbral balance is altered in mastering. It
should be applied before dynamics processing, because changing EQ will
alter the dynamic level of the signal. Previously set dynamic relationships
and dynamic levels will be altered, and previous dynamic processing such
as a limiter’s peak protection could be undone.
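
The reason for this ordering can be demonstrated with a deliberately crude sketch. The "limiter" below is only a hard clip at a ceiling, and the "EQ" is a rough low-frequency boost made by adding a one-pole low-pass copy of the signal back onto itself; both are stand-ins chosen for brevity rather than models of real mastering processors, and all names and values are hypothetical.

```python
import math

def limit(samples, ceiling):
    """Crude peak limiter: hard-clip every sample to +/- ceiling."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

def low_boost(samples, amount, coeff=0.05):
    """Rough low-frequency boost: add `amount` times a one-pole
    low-pass filtered copy of the signal to the signal itself."""
    out, lp = [], 0.0
    for s in samples:
        lp += coeff * (s - lp)  # one-pole low-pass filter state
        out.append(s + amount * lp)
    return out

def peak(samples):
    return max(abs(s) for s in samples)

# Hypothetical program material: a 60 Hz tone, 0.1 seconds at 48 kHz.
rate = 48000
tone = [0.9 * math.sin(2 * math.pi * 60 * n / rate) for n in range(rate // 10)]
ceiling = 0.5

limit_then_eq = peak(low_boost(limit(tone, ceiling), amount=1.0))
eq_then_limit = peak(limit(low_boost(tone, amount=1.0), ceiling))

print(f"limit, then EQ: peak = {limit_then_eq:.3f}")
print(f"EQ, then limit: peak = {eq_then_limit:.3f}")
```

When the boost follows the limiter, the peak rises well above the ceiling that the limiter had established; when the boost comes first, the limiter still has the last word on peak level.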
Equalization of the mixed program will be a balance of compromises,
and is very different from processing an individual instrument in track-
ing or mixing. Applying EQ to the overall program can cause a change in
the prominence of one sound source over another within the mix. Also,
changes designed to correct the timbral balance of one aspect of the sound
may well cause problems in another. Small adjustments in one frequency
band can result in significant changes in others (for example, adjusting the
amount of 60 Hz can alter the perception of 10 kHz to a remarkable degree);
many times multiple solutions are tried before one is found that will not
create more problems than it fixes.
Large sections of the album are routinely listened to carefully to try to deter-
mine the full impact of even the subtlest change in equalization. Focus will
The Final Artistic Processes and an Overview of Music Production Sequences
shift deliberately between the perspective of the spectra of all of the indi-
vidual sound sources and the overall perspective of timbral balance. Well-
developed listening skills are required to perceive and shape the nuances
of spectral information of a complex, overall program. A similar routine
and skill levels are required of the mastering engineer in making adjust-
ments to the overall dynamics of the album.
Adjusting Dynamic Levels and Relationships
The dynamics of the recording can be altered at two levels of perspec-
tive in the mastering process. These are (1) the dynamic levels within and
dynamic contours of the individual songs, and (2) the dynamic levels and
contours that comprise the program dynamic contour of the entire album.
Adjusting the dynamic levels of individual songs might occur between sec-
tions of a song. In this case the loudness levels of the sections might be
brought closer together to create less dynamic range (as might be neces-
sary to make a wide dynamic range more appropriate for a home environ-
ment), or dynamic range might also be increased by lowering or raising the
loudness of a section to create more contrast in the song. Dynamic levels
and contour changes can also be made within sections, such as a fadeout,
or perhaps to increase the loudness of the dynamic peak of the song.
Changes in the dynamic level of the program can change other aspects of
the recording, sometimes markedly. For example, even subtle compres-
sion on a mix can cause sound stage imaging to change, depending on
the program material. The location and width of images, the perceived dis-
tance of sources and the ambience of the sound stage can all be altered as
compression raises the loudness levels of less prominent sound sources in
the mix above a certain threshold.
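A simplified static compressor curve can suggest why less prominent sources come forward under compression. This is a sketch, not a model of any particular device: it ignores attack and release behavior, and the threshold, ratio, makeup gain, and source levels are all hypothetical values chosen for the illustration.

```python
def compress_db(level_db, threshold_db=-20.0, ratio=4.0, makeup_db=0.0):
    """Static compressor curve: levels above the threshold are reduced by the ratio."""
    if level_db > threshold_db:
        level_db = threshold_db + (level_db - threshold_db) / ratio
    return level_db + makeup_db


# Hypothetical levels (in dB) of a prominent and a less prominent source.
prominent_in, quiet_in = -10.0, -30.0
makeup = 7.5  # chosen so the prominent source returns to its original level

prominent_out = compress_db(prominent_in, makeup_db=makeup)
quiet_out = compress_db(quiet_in, makeup_db=makeup)

print(prominent_out, quiet_out)      # -10.0 -22.5
print(prominent_in - quiet_in)       # 20.0 dB apart before compression
print(prominent_out - quiet_out)     # 12.5 dB apart after compression
```

With these assumed settings the prominent source ends where it started, while the quieter source is 7.5 dB higher; its reverberation and other low-level detail would rise with it, which is consistent with the changes in distance and ambience described above.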
The album as a whole will have a dynamic contour, relating the dynam-
ic contours and levels of the individual tracks to one another. This can be
accomplished by identifying an appropriate reference dynamic level (RDL)
for the album, and leveling the individual dynamic levels of all tracks
accordingly. A reference dynamic level provides a point of reference to
which all other loudness levels can be compared. This level might be any
level that is appropriate for the particular project, such as the highest point
of the loudest track, or perhaps the most poignant moment of the album’s
title track, as examples. With this reference, the other songs can be given
a related loudness level as appropriate to their own unique qualities and
how they fit or contribute to the album. In this way, dynamic relationships
between tracks are established.
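The arithmetic of leveling tracks against a reference dynamic level can be sketched as follows. All of the numbers are hypothetical: the measured levels, the choice of the loudest track as the RDL, and the intended offsets stand in for decisions the mastering engineer would make by ear.

```python
# Hypothetical measured loudness levels of three tracks, in dB.
measured = {"Track A": -14.0, "Track B": -18.5, "Track C": -11.0}

# One possible choice of RDL: the level of the loudest track.
rdl = max(measured.values())

# Hypothetical artistic decisions: where each track should sit relative to the RDL.
intended_offset = {"Track A": -2.0, "Track B": -4.0, "Track C": 0.0}

# Gain change needed for each track to land at its intended level.
adjustments = {
    name: (rdl + intended_offset[name]) - level
    for name, level in measured.items()
}

for name, change in adjustments.items():
    print(f"{name}: {change:+.1f} dB")
```

The point of the sketch is only that every track is placed in relation to a single reference, not leveled to an identical value.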
The songs may have slightly different loudness levels, or quite different
levels depending on the project. The entire project must, however, be rea-
sonably consistent in terms of loudness. This will allow for the dynamic
relationships of the songs and the album to be correctly transferred to any
Chapter 15
format (CD, LP, cassette, DVD-A, MP3, television, satellite radio, etc.). In
some applications, dynamic peaks are limited to allow the project to trans-
fer more easily into other formats (such as media broadcast) or for sound
quality purposes.
The album will also have an overall loudness level that is established in
the mastering process. In recent years this level has been pushed higher
by a good many engineers, and CDs may vary in average loudness level
by as much as 15 dB. Selecting this actual loudness level is important to
the technical quality of the recording as well as artistic product. Distortions
and alterations in sound quality can occur when pushing for more program
loudness (which provides a sound preferred by many consumers) or when
lower levels bring diminished resolution of the waveform.
Coding
Concluding the mastering process, the PQ codes necessary for replication
are encoded into the digital master (which will usually be referred to as a
premaster by the replication plant). This code will establish the locations
of track numbers, to start the CD at specific points to play individual song
tracks. It will also be used to create an index for the disc and provide other
information needed for playback and replication.
The Listener’s Alterations to the Recording
The listener may shape the final sound of the recording. This may be
through a conscious altering of the original characteristics of the recording
or by accident. The listener may alter the sound qualities of a recording to
align with their own personal preferences, changing the sound qualities
that were crafted by the recordist.
Further, the sound reproduction systems of the final listener to the
recording will in all likelihood be significantly different than the system
that was used as a reference during the production of the recording. The
listening environment and equipment used for home playback are almost never
similar to (let alone the same as) those used in determining the final sound of
the recording. The differences between studio and home listening environ-
ments, and studio and consumer sound reproduction systems cause great
changes to be made in the sound qualities of the recording during playback
in home listening environments. This situation has worsened with home
surround systems that need to be set up carefully and more accurately
calibrated in order for the intended sound to be heard.
Further, the listener alters the original sound characteristics of the recording
through a number of activities: adjusting playback equalization and loud-
ness level, their selection of playback equipment, the location of the play-
back system (especially loudspeakers) in their homes (where often visual
aesthetics and the logistics of everyday living win out over sound quality),
and through the playback of the recording in small listening rooms.
Listening in automobiles, through earbuds or headphones, or through
the speakers of a PC all have potential to transform the original recording
into something very different. Radio formats, internet delivery codecs, and
compressed files for iPod and MP3 players provide more degradations to
the sound qualities of the recording.
The music recording will not have the same characteristics when delivered
to the consumer as it had in the recording studio. The recordist will
hope the music recording will not be radically altered as it is delivered to
each individual listener. At the same time the recordist must acknowledge
the reality that such alterations will take place, to varying degrees, much
more often than not. A well-crafted and mastered recording might transfer
reasonably well to a wide variety of formats and systems, by anticipating
some of the most prominent changes in sound quality that will result from
delivery formats and playback systems. Still, the recording will change.
Ultimately the recordist is not in control of how their art is heard.
Concluding Remarks
This book should not end on a fatalistic and negative note of recordists ulti-
mately not controlling how their artwork is heard. Indeed, usually when the
consumers care enough about a recording to alter it to make it their own,
it means they care deeply about the recording, its music and its message.
In a not-so-odd way, it is a compliment that the listener wishes to make the
music even more to their personal taste, although most recordists would
wish this alteration of the sound qualities (which they poured their soul
into) would not happen.
Recording music is a wonderful endeavor. Being part of the making of a
piece of music is often an immense and unique privilege. One can witness
magic, and be part of something that surpasses the sum of all individuals
of the project. One can certainly feel as though they have contributed
to the making of great music, whatever the type of music or the level of
accomplishment of the musicians. One can also be blessed with opportu-
nities to use their skills to help others realize their dreams, or to use their
skills to create music recordings of their own.
Recordists do shape and create art. They are artists in the truest sense of
the term.
The Art of Recording can be realized with:
• An understanding of what makes recording an art
• The listening skills to recognize those things and to make a
  professional quality recording
• The craft to use the recording process and its devices to shape sound
  and music creatively and with artistic sensitivity
Exercises
The reader will benefit greatly from methodically exploring the activities
below. Each of them should become the focus of an exercise that would have
the reader explore:
• How does the device or process transform the waveform?
• What changes to the waveform can I recognize?
• In what ways do these transformations impact the audio signal and the
  recording?
The accompanying CD provides a variety of completed mixes that could be
used for these exercises. These tracks would be fed from a CD player into the
various devices in the signal chain. Other source material can certainly be
substituted, whether commercial recordings or your own mixes.
In all exercises, the reader should begin work with exaggerated settings. These
will in all likelihood be far from artistically pleasing, but will cause the changes
to be readily apparent and easier to perceive in first encounters where subtle
alterations or changes in level might well go undetected. Upon repetitions,
make changes smaller and smaller to refine your ability to identify the changes
and observe the more subtle alterations created to the qualities of the overall
sound and the sound sources within the mixes. Repeating each exercise with
different program material will benefit the reader.
When performing these exercises, do not think about the musical result of
your processing at this point. Allow yourself the opportunity to simply learn
the sound qualities and impacts of your actions. When you have control of
these concepts and processes, you can start bringing your attention to mak-
ing artistic decisions.
Exercise 15-1
Exercises Modifying Completed Mixes
a. Compression; listen to the changes in dynamics and dynamic range cre-
ated by applying compression to a completed mix, such as track 53; bring
your attention to changes in the overall character of the mix, as well as how
the compression alters individual sound sources and the musical balance.
b. Equalization; add one or two bands of EQ to track 52 or 38 (tracks with
all instruments in the same or a similar environment); bring your attention
to changes in the timbral balance of the mix; next carefully follow each
instrument in the mix to perceive how the EQ impacts their individual
sound qualities and how each instrument might have changed in musical
balance, stereo location, or distance location. Next, add the same one or
two bands of EQ to track 53 (where sources have different environments
and a wider and more active sound stage); bring your attention to the
same changes in the timbral balance of the mix; next carefully follow each
instrument in the mix to perceive how the EQ impacts their individual
sound qualities and how each instrument might have changed in musical
balance, stereo location, or distance location. Try these same exercises
with different frequency bands and observe the results; then work toward
perceiving the results of performing these exercises using four or five
narrow and subtle EQ settings.
c. Reverberation; apply reverb to track 53, and bring your attention to how
a new overall, perceived performance environment was created and how
it impacts the mix; apply a distinctly different reverb next, and bring your
attention to the perceived performance environment and its impacts on
the mix.
d. Limiting; apply a limiter to track 53 (or some other mix), and prepare
to raise the level of the program markedly; listen for any changes in dy-
namics and dynamic range created by limiting; bring your attention to
changes in the overall character of the mix.
CD Contents and Track Descriptions
Track 1: Harmonic Series – 10 Harmonics and 17 Harmonics Sine Waves
Harmonic Series built in sine waves on C2. It is played twice, first with 10
harmonics and then with 17 harmonics. Each series is first arpeggiated,
followed by the harmonic series as a chord.
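The frequencies in such a series follow directly from the definition of harmonics as whole-number multiples of the fundamental. The sketch below assumes an equal-tempered C2 of about 65.41 Hz (with A4 = 440 Hz); the exact tuning used on the CD is not stated, and the function name `harmonic_series` is invented for the example.

```python
C2_HZ = 65.41  # assumed: approximate equal-tempered C2 with A4 = 440 Hz


def harmonic_series(fundamental_hz, count):
    """Return the first `count` harmonics: whole-number multiples of the fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]


ten = harmonic_series(C2_HZ, 10)
seventeen = harmonic_series(C2_HZ, 17)

for n, freq in enumerate(seventeen, start=1):
    print(f"harmonic {n:2d}: {freq:7.1f} Hz")
```

Under this assumption the tenth harmonic lies near 654 Hz and the seventeenth near 1112 Hz, so the longer series extends the spectrum by well under one additional octave.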
Track 2: Harmonic Series – 10 Harmonics Performed on Piano
Harmonic Series performed on an acoustic piano, built on C2. It is played
twice, each with 10 harmonics. The first series is without sustain pedal
depressed, the second with sustain pedal. Each series is first arpeggiated,
followed by the harmonic series as a chord.
Track 3: Sustained Piano Notes
Two sustained piano notes of D0. The first without sustain pedal, the second
with sustain pedal depressed. Notice the appearances, entrances and exits
of the different harmonics and overtones present throughout the duration
of the two sounds.
Tracks 4–13: Reference Frequencies and Reference Pitches
These are a few potential reference pitches and frequencies to assist the
reader in learning pitch estimation and in establishing one’s own pitch ref-
erence. The frequencies are sawtooth waves. The pitches are each repeated
three times, and are piano sounds.
Track 4: 60 Hz
Track 5: 100 Hz
Track 6: 250 Hz
Track 7: 1 kHz
Track 8: 2.5 kHz
Track 9: 4 kHz
Track 10: A4 (440 Hz)
Track 11: C4 (261.6 Hz)
Track 12: B-flat4 (466.2 Hz)
Track 13: E2 (82.4 Hz)
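The frequencies listed for the reference pitches are consistent with equal temperament tuned to A4 = 440 Hz. A minimal sketch of that relationship follows; the function name `pitch_to_hz` and its parameters are invented for the example, and octave numbering is assumed to place A4 at 440 Hz, as in the track list.

```python
# Semitone offsets of the natural notes above C within one octave.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def pitch_to_hz(letter, octave, flats=0, a4_hz=440.0):
    """Equal-tempered frequency of a pitch, e.g. ('B', 4, flats=1) for B-flat4."""
    semitone = 12 * (octave + 1) + NOTE_OFFSETS[letter] - flats
    # A4 sits at semitone number 69 in this numbering.
    return a4_hz * 2 ** ((semitone - 69) / 12)


for label, args in [
    ("A4", ("A", 4)),
    ("C4", ("C", 4)),
    ("B-flat4", ("B", 4, 1)),
    ("E2", ("E", 2)),
]:
    print(f"{label}: {pitch_to_hz(*args):.1f} Hz")
```

Each semitone multiplies frequency by the twelfth root of two, so an octave (twelve semitones) doubles it.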
Tracks 14–18: Pitch Register Boundaries
A piano is performing the groups of pitches that form the boundaries or
thresholds between the pitch registers. Each is performed three times.
Track 14: LOW to LOW-MID Boundary
Track 15: LOW-MID to MID Boundary
Track 16: MID to MID-UPPER Boundary
Track 17: MID-UPPER to HIGH Boundary
Track 18: HIGH to VERY HIGH Boundary
Tracks 19–25: Pitch Area Evaluation Source Material
Drum and cymbal sounds to serve as source material for performing pitch
area evaluations and other exercises.
Track 19: Kick Drum
Track 20: Tom 1
Track 21: Tom 2
Track 22: Tom 3
Track 23: Snare Drum
Track 24: Crash Cymbal
Track 25: Ride Cymbal
Tracks 26–33: Time Judgment Exercise
On each track a snare drum sound is delayed, and the delayed sound is
combined with the direct sound. Each track first has the direct sound in
the left speaker and a single delay in the right speaker; after one second of
silence the delay is repeated four times in both speakers. All sounds are of
equal amplitude.
Track 26: 50 ms delay
Figure CD-01 Recording session for drum tracks, showing some of the
microphone placements.
Track 27: 40 ms delay
Track 28: 30 ms delay
Track 29: 20 ms delay
Track 30: 15 ms delay
Track 31: 10 ms delay
Track 32: 5 ms delay
Track 33: 2 ms delay
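For readers who wish to build similar examples from their own material, the delay times above can be expressed as sample offsets. The sketch assumes the CD sample rate of 44.1 kHz; the helper `add_delayed_copy` is a hypothetical name, and the short list of numbers stands in for a real snare drum recording.

```python
SAMPLE_RATE = 44100  # CD sample rate in Hz

delays_ms = [50, 40, 30, 20, 15, 10, 5, 2]

# Delay times converted to the nearest whole number of samples.
delay_samples = {ms: round(ms * SAMPLE_RATE / 1000) for ms in delays_ms}


def add_delayed_copy(signal, offset):
    """Mix a signal with a copy of itself delayed by `offset` samples, at equal amplitude."""
    out = list(signal) + [0.0] * offset
    for i, sample in enumerate(signal):
        out[i + offset] += sample
    return out


print(delay_samples)

# Tiny stand-in signal: a single impulse followed by silence.
print(add_delayed_copy([1.0, 0.0], 1))  # [1.0, 1.0, 0.0]
```

At this sample rate the 50 ms delay corresponds to 2205 samples and the 10 ms delay to 441 samples.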
Tracks 34–36: Rhythms of Reflections
This is a realization of Figure 1-10. A snare drum sound is delayed and
attenuated appropriately to demonstrate the patterns of reflections.
Track 34: Pattern 1
Track 35: Pattern 2
Track 36: Both patterns simultaneously
Track 37: Musical Balance and Performance Intensity Closely Matched
The musical balance of the instruments in the mix closely resembles the
performance intensities of the original performance. This recording was
made using an ORTF stereo microphone technique with the addition of
an accent microphone in the bass drum. A hall reverb program was then
applied to the mix.
Track 38: Musical Balance Changed from Original Performance Intensity
The musical balance of the mix presents instruments at different loudness
levels than their original performances. This mix has many unnatural rela-
tionships of performance intensity versus musical balance; some are delib-
erately over-exaggerated to provide clarity to this concept.
Tracks 39–41: Distance Location
A single cello performance illustrating different distance cues. A single per-
formance was recorded with ten different microphones tracked separately.
Each microphone was at a different distance (varying from several inches
to over twenty feet), height, and angle to the cello. Different combinations
of the microphones provided “proximity,” “near,” and “far” distance
location illusions.
Track 39: Proximity
Track 40: Near
Track 41: Far
Figure CD-02 Close and near microphone placements for cello recording.
Tracks 42–44: Stereo Location
A single guitar performance was recorded using several microphones at
different distances, height, and angles to the instrument. These separate
microphone tracks were used to create the two spread-image sizes of tracks
42 and 43. Reverb was added to track 42 to create track 44; the reader might
wish to add a different reverb to track 42 and compare the results.
Track 42: Narrow spread image of guitar performance
Track 43: Wide spread image of guitar performance
Track 44: Narrow spread image of guitar performance source with a
reverb program added to widen the image.
Figure CD-03 Close microphone placements for guitar recording.
Figure CD-04 Distant microphone placements for guitar recording.
Tracks 45–47: Space within Space
Description: Various sound-source and environment groupings provide
examples of space-within-space considerations. Some material is over-
exaggerated to allow for easier recognition of the concepts.
Track 45: The snare drum and hi-hat appear in the same environment but
at different distances. The bass drum is in a separate environment.
Track 46: Additional different environments existing side by side are pre-
sented here. The bass drum and snare drum have separate environments,
and a third environment is created for the toms.
Track 47: A distinct overall environment is superimposed onto an overall
program to apply a perceived performance environment. The environment
has very pronounced qualities and was applied at a high level to empha-
size this concept. This environment was applied to the mix of Track 53. The
reader may wish to add a different hall or reverb program or other process-
ing to that track to observe other results.
Figure CD-05 Recording session for drum tracks, showing additional
microphone placements.
Tracks 48–53: Sound Stage Dimensions and Production Aesthetics
Description: These tracks were made from a single drum-set performance,
recorded by fifteen close microphones, four stereo microphone techniques
and six microphones capturing room sound. Various mixes and stereo
microphone-technique recordings were made from the same solo drum
set performance for the tracks in this section. They were also used to create
tracks 37, 38, 45, 46 and 47.
Track 48: Mix of close microphones resulting in a wide and deep sound
stage. Images have many different widths and distances and appear in
individualized locations. This is a highly controlled mix with crafted sound
qualities and unnatural relationships of sounds.
Track 49: This is the mix of track 48 with the addition of a pair of cardioid
microphones in an X-Y coincident technique. The microphones were located
approximately 0.75 meters above and 0.25 meters behind the drummer’s
head. The stereo pair adds an overall environment to the drum set
for another space-within-space relationship and it changes the listener-to-
sound stage relationship.
Track 50: ORTF (near coincident) stereo microphone array using Neumann
TLM 103 microphones located approximately 0.5 meters above and 0.25
meters behind the drummer’s head.
Track 51: Spaced omnidirectional microphone technique with two DPA
4006-TL microphones located approximately 2 meters from each side of
the drum set.
Track 52: Tracks recorded with closely placed microphones are mixed to
approximate a live performance. The relationships of performance inten-
sity and musical balance are closely aligned, and sounds are focused in
the center area of the sound stage. A single overall environment has been
subtly added to give a common trait to all sounds.
Track 53: The musical balance (dynamic levels of the mix) of track 52 is used
here, unaltered. The sound stage is widened significantly, and some distance
locations are altered. Instruments are placed in one of four distinct environ-
ments: bass drum and snare drum have separate environments (the snare
drum’s environment has a shorter decay than it had in track 46), all cymbals
are in a single environment, and all toms are in a single environment (the
tom environment has a significantly longer decay than it had in track 46).
The reader might want to use this track (or any of the other drum mix-
es) to experiment with changes to the overall qualities of the recording,
especially perceived performance environment, timbral density, and pro-
gram dynamic contour. Subtle changes in these areas are often elements
of mastering a recording.
Tracks 54–56: Playback System Set-Up and Calibration
The following tracks are presented to assist the reader in evaluating the
quality of their playback system and to help them prepare for accurate lis-
tening to this CD and to the musical examples cited in this book.
The listener’s playback system should not have a loudness button or tone
control engaged. Loudspeakers should be located away from walls and
reflective surfaces, and the listener seated appropriately. Some segments
of this book discuss playback system and listening-location considerations.
Those sections and other more detailed writings should be examined to
establish as accurate a playback system and environment as is reasonable
for one’s situation and means.
Performing listening evaluations with an inaccurate playback system will
lead to inaccurate conclusions. The material will be misperceived and
learned incorrectly. The reader/listener must hear the CD and musical
examples cited in this book accurately in order to learn to recognize these
new materials.
Track 54: Setting an appropriate listening level
Pink noise is played in two 8-second segments. The first segment should
be at a nominal listening level. The second segment is 5 dB higher, and
represents loudness levels that might be reached during a typical
music recording.
Set your loudness level so that the first segment is at a comfortable, though
somewhat loud level. If a sound pressure level meter is available it should
read approximately 85 dB SPL. This will be your nominal listening level, and
should become the average level of the program material you will hear.
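It may help to see what the 5 dB difference between the two segments means in linear terms. The following sketch uses the standard decibel relationships for amplitude and power ratios; the function names are invented for the example, and 85 dB SPL is simply the nominal level suggested above.

```python
def db_to_amplitude_ratio(db):
    """Ratio of sound pressures (or signal amplitudes) for a level difference in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Ratio of acoustic or electrical powers for a level difference in dB."""
    return 10 ** (db / 10)


NOMINAL_SPL = 85.0  # nominal listening level suggested above, in dB SPL
STEP_DB = 5.0       # level difference between the two pink-noise segments

louder_spl = NOMINAL_SPL + STEP_DB
amplitude_ratio = db_to_amplitude_ratio(STEP_DB)
power_ratio = db_to_power_ratio(STEP_DB)

print(f"second segment: about {louder_spl:.0f} dB SPL")
print(f"amplitude ratio: {amplitude_ratio:.2f}")
print(f"power ratio: {power_ratio:.2f}")
```

A 5 dB step therefore corresponds to roughly 1.8 times the sound pressure and a little over three times the power, which is why the second segment places noticeably greater demands on the playback system.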
Do not adjust the level. The second segment will be noticeably louder, but
it should be tolerable and should not cause distortion to your playback
system. If it is not tolerable, lower the level and listen to the first segment
again. If you are noticing a distortion in the sound, your playback system is
in need of attention. Please address this situation before using the playback
system for listening exercises.
Track 55: Evaluating the loudness balance between left and right
loudspeakers
Pink noise is now directed to individual channels. In 5-second segments,
pink noise will appear at the left speaker, the right, then the center (each
speaker equally). This sequence is repeated three times. All sounds are at
the same loudness level and should be played back at your nominal listen-
ing level that was set while listening to track 54.
Use these pink-noise segments to ensure that the loudness relationship
between the two speakers is correct, and that radical room interference or
alterations to sound are noticed. Any reversed loudspeaker polarity should
be apparent. Listening carefully, one should try to determine that there
are no differences in spectrum (sound quality) and amplitude between the
speakers and when they are combined. Any detectable differences between
the left and right channels are cause for concern and need to be traced. An
SPL meter will be very useful for balancing the loudness levels of the two
speakers.
Track 56: Evaluating frequency response
Six 5-second sine tones are presented in the following order:
1 kHz
100 Hz
5 kHz
500 Hz
15 kHz
40 Hz
All sounds are at the same sound pressure level, and you should be play-
ing them back at the same nominal listening level that was set in track 54.
The first four tones should be readily apparent, and near the same loudness
level, depending upon your nominal listening level. If this is not the
case, a serious problem is present in the monitoring system that must be
corrected before using the system to evaluate sound.
The last two tones are at the extremes of the hearing range and are likely
near (or just beyond) the limits of your playback system. These two tones
will appear significantly softer, though they are recorded at the same SPL
as the other four. While this should not cause alarm, readers must be aware
of the limits of their playback systems as well as their own hearing. These
tones provide a first step that will lead to that knowledge.
CREDITS:
Cello: Eli Cohn
Drums: Thomas Yahoub
Guitar: David Janco
Piano: Daniel Bolton and Sage Atwood
Engineers: Phillip Reese and William Moylan
Assistant Engineer: Erh-Chaun Lai
Producer: William Moylan
Mastering Engineer: Adam Ayan
Recorded and mixed in the Sound Recording Technology Studios at the
University of Massachusetts Lowell.
Mastered at Gateway Mastering Studios, Portland, Maine.
Copyright © and Phonorecord
2006 by William Moylan.
All rights reserved.
Glossary
Absolute pitch (perfect pitch) is the ability to recognize specific pitch levels
(in relation to specific tuning systems).
Accent microphones are microphones that are dedicated to capturing a
single sound source, or a small group of sound sources, within the
total ensemble being recorded by a stereo microphone technique, and
are used to supplement the stereo array.
Active listening is the listening process with the listener focused and intent
on extracting certain specific types of information from music or other
sounds.
Amplitude is the amount of displacement of the medium at any moment,
within each cycle of the waveform (measured as the magnitude of dis-
placement in relation to a reference level, or decibels).
Analytical listening techniques are used to evaluate the artistic elements
of sound; sound is evaluated within musical contexts, and these techniques
seek to understand the function of the sound in relation to the
musical or communication context in which it exists; it evaluates sound
over time and uses the concept of sound event.
Artistic elements are characteristics of sound used to communicate artistic
ideas and provide a resource for artistic expression.
Blend is the bringing together of all of the sonic components of an acoustic
sound source during recording.
Body of a sound is the primary portion of the sound that is markedly dif-
ferent in dynamic contour, spectrum and/or spectral envelope from the
initial portion of the sound (up to the first 20–30 ms).
Chord is two or more simultaneously sounding pitches.
Complete evaluation is an examination of all of the artistic elements in a
particular recording/piece of music.
Complete evaluation graph contains elements at the perspective of the
overall texture that appear in the top tier and the elements at the per-
spective of the individual sound source on the lower tier against the
time line of the work; notes about important characteristics and quali-
ties of elements appear on each tier to assist the complete evaluation,
following a stream of shifting importance of materials and elements,
and to improve skill in shifting focus and perspective.
Critical listening techniques are used to evaluate the perceived parameters
of sound; it is evaluating sound for its own content, out of the context
of a piece of music, and out of time; it makes use of the concept of
sound as an abstract idea, or a sound object.
Depth of sound stage is the area created by the distance of all sound sourc-
es from the perceived location of the listener; its boundaries create an
area of distance that extends from the nearest source to the furthest
sound sources (and their environments).
Digital audio workstation (DAW) is a computer-based recording production
system comprised of an I/O interface, containing A/D and D/A conver-
sion, monitor levels and routing, and software for potentially all pro-
duction functions.
Direct sound is sound that travels on a direct path from a sound source to
the listener or microphone.
Directional sensitivity of a microphone is its sensitivity to sounds arriving
at various angles to the diaphragm.
Distance dimensions in recorded music are: (1) the distance of the listener
to the sound stage, and (2) the distance of each sound source from the
listener.
Distance location continuum is a scale for evaluating distance location
based on the listener’s sense of proximity; extending from immediately
adjacent to the listener to infinity, and comprised of the areas
“Proximity,” “Near” and “Far.”
Distance location graph is used for plotting the distance locations of all
sources in a mix; it utilizes the distance location continuum areas
against a time line.
Distance perception is the perception of the distance of a sound source from
the listener, and is defined by: (1) the ratio of the amount of direct sound
to reverberant sound, and (2) the primary determinant, the loss of
low-amplitude (usually high-frequency) partials from the sound’s spectrum
with increasing distance (see definition of timbre or timbral detail).
Distance sensitivity (reach) is the ability of a microphone to accurately cap-
ture the detail of a source’s timbre in relation to its distance from the
sound source.
Duration is the perception of time.
Dynamic contours are changes in dynamic levels over time.
Dynamic envelope of a sound is the contour of the changes in the overall
dynamic level of the sound throughout its existence.
Early reflections are those reflections that arrive at the ear or microphone
within around 50 ms of the direct sound.
Early sound field is comprised of the reflections that arrive at the listener or
microphone within the first 50 ms after the arrival of the direct sound.
Effective listening zone is an area in the control room where the spatial
characteristics of reproduced sound can be accurately perceived.
Environmental characteristics are the sound characteristics of an environ-
ment; an overall sound quality that is comprised of a number of com-
ponent parts; individual sound sources and the complete sound stage
(perceived performance environment) have individual environmental
characteristics.
Environmental characteristics evaluation defines the characteristics of the
environment itself.
Environmental characteristics graph is composed of (1) the reflection
envelope, (2) the spectrum, and (3) the spectral envelope plotted against a
time line; seeks to define the characteristics of the environment itself.
Equivalence is the concept that all of the artistic elements of sound have an
equal potential to carry the most significant musical information; any
artistic element has the potential to be the central carrier of the musical
idea at any moment in time; any element may also be a secondary ele-
ment at any moment in time or at any level of perspective, and that the
importance of any element might shift at any moment.
Focus is the act of bringing some aspect of sound to the center of one’s
attention and at a specific level of detail (perspective).
Form is the piece of music as if perceived, in its entirety, in an instant; it
is the substance and shape of the piece of music, perceived from con-
ceptualizing the whole; it is a global quality, as an overall concept and
essence.
Formants (formant regions) are individual frequencies or certain ranges
of frequencies within the spectrum that are emphasized consistently,
no matter the fundamental frequency.
Frequency is the number of similar, cyclical displacements in the medium,
air, per time unit (measured in cycles of the waveform per second,
or Hz).
Frequency bands are areas of frequency activities defined as a bandwidth
between an upper and lower boundary.
Frequency response is a measure of how a device responds to the same
sound level at different frequencies.
Fundamental frequency is the periodic vibration of a waveform, producing
the sensation of a dominant frequency; it is measured by the number
of periodic vibrations, or cycles of the waveform, that repeat its charac-
teristic shape during a second.
Harmonic progression is the movement from one chord to another, in a
stylized sequence.
Harmonics are frequencies in the spectrum that are whole-number mul-
tiples of the fundamental frequency.
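The whole-number relationship in the entry above can be sketched in a few lines. The function names and the 1% tolerance are hypothetical choices for illustration. Note that some conventions count the fundamental itself as the first harmonic; this sketch lists only the multiples above it.

```python
def harmonics_above(fundamental_hz: float, count: int) -> list[float]:
    """Return the first `count` whole-number multiples above the fundamental."""
    return [fundamental_hz * n for n in range(2, count + 2)]


def is_harmonic(partial_hz: float, fundamental_hz: float, tolerance: float = 0.01) -> bool:
    """True when the partial sits close to a whole-number multiple (2x or higher).

    `tolerance` is an assumed allowance on the frequency ratio, not a
    value taken from the glossary.
    """
    ratio = partial_hz / fundamental_hz
    nearest_multiple = round(ratio)
    return nearest_multiple >= 2 and abs(ratio - nearest_multiple) <= tolerance


print(harmonics_above(110.0, 4))
print(is_harmonic(440.0, 110.0), is_harmonic(300.0, 110.0))
```

A partial that fails this check is not proportionally related to the fundamental, which is how this glossary characterizes overtones.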
Harmony is created by patterns of the harmonic progression.
Hierarchy is the organization of materials by levels of importance.
Host environment is the environment in which an individual or groups of
sound sources are sounding.
Imaging is the lateral location and distance placement of the individual
sound sources within the sound stage, and provides depth and width
to the sound stage.
Inherent sound quality is the unique sonic imprint of any device or process
in the signal chain that is the result of its normal performance charac-
teristics and sound qualities.
Interaural amplitude differences (IAD) are arrivals of the same sound at
each ear at a different sound pressure level (amplitude) and are critical
to determining localization for sounds at high frequencies.
Interaural spectral differences (ISD) result when the head of the listener
blocks certain frequencies from the furthest ear (when the sound is not
centered).
Interaural time differences (ITD) are arrivals of the same sound at each ear
at a different time; a sound that is not precisely in front or in back of
the listener will arrive at the ear closest to the source before it reaches
the furthest ear.
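A very simplified path-difference model gives a feel for the size of interaural time differences. Everything numeric here is an assumption rather than a value from this glossary: an ear spacing of 0.18 m, a speed of sound of 343 m/s, and a straight-line model that ignores the way sound bends around the head. The function name is hypothetical.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: air at roughly 20 degrees Celsius
EAR_SPACING_M = 0.18            # assumption: approximate distance between the ears


def interaural_time_difference_s(azimuth_degrees: float) -> float:
    """Simplified ITD in seconds for a distant source.

    0 degrees is straight ahead, +90 is directly to the right, 180 is
    directly behind. A positive result means the right ear leads; a
    negative result means the left ear leads.
    """
    path_difference_m = EAR_SPACING_M * math.sin(math.radians(azimuth_degrees))
    return path_difference_m / SPEED_OF_SOUND_M_PER_S


for angle in (0, 30, 90, 180):
    microseconds = interaural_time_difference_s(angle) * 1e6
    print(f"{angle:>3} degrees: {microseconds:7.1f} microseconds")
```

Consistent with the entry above, the model gives essentially no time difference for sources directly in front of or behind the listener, and the largest difference for sources directly to one side.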
Interval is the distance between the perceived levels of the two (or more)
pitches, played either in succession (as melody) or simultaneously (as
harmony).
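Although an interval is defined above in terms of perceived pitch levels, it is often estimated from the frequencies of the two pitches. The sketch below assumes twelve-tone equal temperament, where each semitone is a frequency ratio of the twelfth root of two; the function name is hypothetical.

```python
import math


def interval_in_semitones(lower_hz: float, upper_hz: float) -> float:
    """Size of the interval between two frequencies in equal-tempered semitones."""
    if lower_hz <= 0 or upper_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(upper_hz / lower_hz)


# A 2:1 frequency ratio is an octave (12 semitones); a 3:2 ratio is
# close to, but not exactly, an equal-tempered perfect fifth (7 semitones).
print(round(interval_in_semitones(440.0, 880.0), 2))
print(round(interval_in_semitones(440.0, 660.0), 2))
```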
Key is a summary of all of the sound sources plotted on a graph, showing
their individual colors, labeling or formats.
Loudness is the perception of the overall excursion (acoustic energy) of the
waveform (amplitude).
Masking is the covering of the qualities or activities of a sound source by
another sound source.
Mastering is the process that fills the void between mixing and replication,
where sound can be enhanced one last time and any problems repaired
before the recording is finalized; further functions of sequencing, lev-
els, spacing, compression, equalization, etc. might also be performed.
Melodic contour graph places a melodic line on an XY graph, mapping its
activity in pitch/frequency register against a time line; this is an excel-
lent way to acquire skill in hearing pitch levels against pitch register
designations, and in placing sources against a time line; it is an impor-
tant bridge between traditional music dictation (and notation) and the
graphs and processes of this system.
Metric grid is the underlying pulse of a piece of music that is a reference
pulse against which all durations can be defined, thereby allowing
the listener to make rhythmic judgments in a precise and consistent
manner.
Mixing (mixdown, the mix) is where the individual sound sources that
are being recorded or were previously recorded onto a multitrack are
combined into a two-channel or a surround-sound recording that will
become the final version of the piece after the mastering process.
Monitor level is the sound pressure level (SPL) at which the recordist lis-
tens to sound reproduced through the monitor system.
Monitor system is comprised of all factors that impact reproduced sound:
power amplifier(s), loudspeaker drivers, crossover networks, loud-
speaker enclosures, connector cables, and the listening room itself.
Multitier graphs contain a Y axis divided to allow several components to be
represented against the same time line; allows different characteristics
to be graphed against the same time line.
Musical balance is the interrelationships of the dynamic levels of each
sound source, to one another and to the entire musical texture.
Musical balance graph plots the individual sound sources of a musical
texture by their dynamic contours, against a single time line with the
work’s RDL as a reference.
Nominal level is used to establish a reference for plotting the dynamic con-
tours of the spectral components of environmental characteristics; it
represents the changing dynamic envelope of the environment, where
the sound source’s frequency components are unaltered.
Objective descriptions describe the states and activities of the physical
characteristics of sound.
Off-axis is any deviation from a microphone’s 0° center point for sound
arrival.
Off-axis coloration is a change in the timbre of a sound source caused
by change in the frequency response of a microphone to sounds arriv-
ing off-axis.
Onset (prefix) is the initial portion (up to 20–30 ms) of the sound that is
markedly different in dynamic contour, spectrum and/or spectral enve-
lope from the remainder of the sound (the body).
Overall texture is the highest level of perspective, bringing the listener to
focus on the composite sound of a recording or piece of music; dimen-
sions are the piece of music/recording’s form, perceived performance
environment, sound stage, reference dynamic level, program dynamic
contour, and its timbral balance.
Overtones are those frequencies in the spectrum that are not proportion-
ally related to the fundamental frequency.
Partials are all of the frequencies of the spectrum: overtones and harmon-
ics, subharmonics and subtones, and formants and formant regions.
Passive listening is the process of listening while the listener is consciously
focused on some activity other than the music (perhaps eating, a con-
versation, a dentist’s drill, etc.); in fact, the listener might not actually
be conscious of the music or sound.
Perceived parameters (five) of sound are pitch, loudness, duration, timbre
(perceived overall quality), and space (perceived characteristics).
Perceived performance environment is the environment of the sound
stage, and is the overall environment where the performance (record-
ing) is heard as taking place; its environmental characteristics shape
the entire recording and bind all the individual sound sources and their
spaces together into a single performance area; a characteristic of the
overall texture.
Perfect pitch (absolute pitch) is the ability to recognize specific pitch levels
(in relation to specific tuning systems).
Performance intensity (perceived) is the timbre/sound quality of the sound
source created during a performance; the dynamic level at which the
sound source was performing when it was recorded; it is comprised of
the loudness, energy exerted, performance technique, and the expres-
sive qualities of the performance.
Performance intensity versus musical balance graph contrasts perfor-
mance intensity plotted as the dynamic levels of the original perfor-
mance against the loudness levels of the sources as they exist in the
final recording (musical balance).
Perspective is the level of detail at which the sound material is heard, and
is related to a specific level of the structural hierarchy; it brings the
listener’s focus to a specific level of detail.
Phantom images are perceived sound-source locations that are sounding
at locations where a physical sound source (loudspeaker) does not
exist.
Phon is the unit of measure for perceived loudness established at 1 kHz,
based on subjective listening tests.
Physical dimensions (five) of sound are frequency, amplitude, time, timbre,
and space.
Pitch is the perception of the frequency of the waveform, defined as the
perceived position of a sound on a scale from low to high, and as an
attribute of hearing sensation by which sounds may be ordered on a
musical scale.
Pitch area is a defined area between an upper and a lower pitch level, in
which a significant and prominent portion of the source’s spectrum
exists.
Pitch area analysis graph is used to plot the pitch areas of nonpitched
sounds, showing the location, widths and densities of the sound’s pitch
areas and the relative loudness levels of the pitch areas; this is a rudi-
mentary sound-quality evaluation graph.
Pitch definition (definition of fundamental frequency) is the amount of pres-
ence of a sense of pitch, or the dominance of the fundamental frequen-
cy of the source; pitch definition is placed on a continuum between
the two boundaries of well-defined in pitch, or precisely pitched (as a
sine wave), through completely void of pitch or nonpitched (as white
noise).
Pitch density is the range of pitches spanned by a musical idea plus the
spectrum of the sound source playing it; it is the amount and place-
ment of pitch-related information of a single sound source within the
overall pitch range of the musical texture; it is at the perspective of the
individual sound source.
Pitch reference (internal) is a sense of pitch level that is present (usually
unconsciously) within each individual.
Pitch-level estimation is the ability to consistently identify the value (pitch
level or name) and register placement level of a pitch or frequency.
Pitch/frequency registers are divisions of the hearing range that were estab-
lished to estimate the relative level of the pitch material.
Point source is a phantom image that occupies a focused, precise point in
the sound stage.
Polar pattern of a microphone illustrates its sensitivity to sounds at various
frequencies in front, in back, and to the sides, and the actual pattern is
spherical around the microphone.
Prefix (onset) is the initial portion (up to 20–30 ms) of the sound that is
markedly different in dynamic contour, spectrum and/or spectral enve-
lope from the remainder of the sound (the body).
Present is our consciousness of “now,” where we are at once experiencing
the moment of our existence, evaluating the immediate past of what
has just happened and anticipating the future of our window of con-
sciousness; a window of time through which we perceive the world,
and listen to sound.
Primary elements are the aesthetic and artistic elements of sound that
directly contribute to the basic shape or characteristics of a musical
idea.
Primary musical materials are musical materials that are perceived as being
more important than others; they will carry the weight of communicat-
ing the musical message and expression of the music.
Primary phantom images are any of the five source locations in surround
sound that exist between adjacent pairs of speakers.
Program dynamic contour is the dynamic contour of the overall program,
a single dynamic level/contour of the composite sound, the result of
combining all sounds in the program; it is a dimension of the overall
musical texture.
Proximity is the space that immediately surrounds the listener, the listen-
er’s own personal space; used as a reference for judging distance loca-
tion.
Range is the complete span of an artistic element or perceived parameter,
such as the range of hearing covering all audible frequencies from the
lowest to the highest frequencies humans can hear.
Recording aesthetic is the relationship of the qualities of a recording to the
qualities of the original live performance in a performance space.
Recording sessions is a term used here to describe recordings where all
the parts (sound sources and musical materials) are played at once; the
entire musical texture is recorded simultaneously as a single sound, to
stereo or surround.
Recordist is a person involved in the production of audio recordings:
recording engineer, record producer, sound designer, sound synthe-
sist, mastering engineer and others.
Reference dynamic level (RDL) is the overall or global intensity level of a
piece of music and is the reference level for evaluating dynamics for
musical balance and program dynamic contour; it is the “perceived
performance intensity” of the work as a whole, conceptualized as a
single entity out of time; a dimension of the overall texture.
Reflected sound is sound that bounces (reflects) off surfaces or objects
before arriving at the listener.
Register is a specific portion of the range, usually with a unique character
(such as a unique timbre, or some other determining factor) that will
differentiate it from all other areas of the source’s range.
Relative pitch is the ability to consistently and reliably judge pitch level
within about 10% of actual level.
Reverberant sound is a composite of many reflections of the sound arriving
at the listener (or microphone) in close succession.
Reverberation time (RT60) is the length of time required for the reflections
to reach an amplitude level of 60 dB lower than that of the original
sound source.
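Two small calculations may help make the 60 dB figure concrete. The sketch below assumes the reverberant level falls at a steady rate in decibels per second (an idealization, not a claim about any particular room), and its function names are hypothetical.

```python
def db_to_amplitude_ratio(level_change_db: float) -> float:
    """Convert a level change in dB to an amplitude (pressure) ratio."""
    return 10.0 ** (level_change_db / 20.0)


def rt60_from_decay_rate(decay_db_per_second: float) -> float:
    """Seconds needed to fall 60 dB, assuming a constant decay rate."""
    if decay_db_per_second <= 0:
        raise ValueError("decay rate must be positive")
    return 60.0 / decay_db_per_second


# A 60 dB drop leaves one thousandth of the original amplitude.
print(db_to_amplitude_ratio(-60.0))
# A hypothetical space decaying at 30 dB per second.
print(rt60_from_decay_rate(30.0))
```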
Rhythmic pattern is a group of durations, and can be applied to any artistic
element of sound.
Secondary elements are those aspects of the sound that assist, enhance, or
support the primary elements.
Secondary musical materials are musical materials perceived as being sub-
ordinate to others; they will in some way enhance the presentation of
the primary materials by their presence and activity in the music, and
usually function to support the primary musical ideas.
Secondary phantom images are source locations in surround sound that
exist between nonadjacent pairs of speakers.
Signal chain is the flow of signal through a chain of recording/reproduction
stages or devices.
Sound event is the shape or design of the musical idea (or abstract sound)
as it is experienced over time; it is understood as activity unfolding and
evolving over time, and is used in analytical listening observations.
Sound object is the perception of the whole musical idea (or abstract
sound) at an instant, out of time; it is understood as the qualities of a
sound itself in its many variables and as it exists as a global quality or
“object”; it is used for critical listening applications and is always con-
sidered without relationship to another sound.
Sound quality is the artistic element that uses perceived timbre qualities
for artistic expression.
Sound quality evaluation seeks to define and describe the states and activi-
ties of the sound source’s (1) dynamic envelope, (2) spectral content,
and (3) spectral envelope, and will also make use of the listener’s care-
fully evaluated perception of (4) pitch definition.
Sound stage is a single area within which all sound sources are perceived
as being located in providing the “performance” that is the recording;
it has an apparent physical size of width and depth.
Space is composed of the following dimensions in audio recording: distance
of the sound source to the listener, angle of the sound source to the lis-
tener, geometry of the environment in which the sound source is sound-
ing, and location of the sound source within the host environment.
Space within space is the concept where sound sources and their indi-
vidual environments can exist within the overall environment of the
recording (perceived performance environment), or spaces can exist
within another space; a hierarchy of environments existing within other
environments; it also allows for different environments to be contained
within others and to also possibly coexist within the same recording.
Spatial relationships perceived in current recording reproduction are the
location of the sound source being at an angle to the listener (in front,
behind, to the left, to the right, etc.), the location of the sound source
being at distance from the listener, and an impression of the type, size,
and acoustic properties of the host environment.
Spectral envelope is the composite of each individual dynamic level and
dynamic envelope of all of the individual partials of the spectrum.
Spectrum (spectral content) is the composite of all of the frequency com-
ponents of a sound, and is comprised of the fundamental frequency,
harmonics, overtones, subharmonics and subtones.
Spread image is a phantom image that appears to occupy an area; it has a
width or size that extends between two audible boundaries.
Stage width (sometimes called stereo spread) is the width of the entire
sound stage.
Stage-to-listener distance establishes the front edge of the sound stage
with respect to the listener and determines the level of intimacy of the
music/recording.
States of sound, three, are physical dimensions, human perception and
sound as idea.
Stereo location is the perceived placement/location of a sound source with-
in the stereo playback array.
Stereo microphone techniques (arrays) are composed of two or more micro-
phones (or diaphragm assemblies) in a systematic arrangement; they
are designed to record sound in such a way that upon playback (through
two channels) a certain sense of the spatial relationships of the sound
sources present during the recorded performance is reproduced.
Stereo sound is a two-channel playback format that attempts to reproduce
all spatial cues through two separate loudspeakers.
Stereo sound-location graph plots the locations of all sound sources against
the time line of the work.
Structure is the architecture of the musical materials and the interrelation-
ships of a composition.
Surround sound is a multichannel playback format that reproduces spatial
cues, etc., through five separate loudspeakers and a subwoofer.
Surround sound location graph allows the reader to plot the locations of all
sound sources against the time line of the work, utilizing “left,” “right,”
“center,” “left surround,” “right surround,” and “rear center” locations
in various formats as the Y axis.
Tempo is the rate or speed of the pulses of the metric grid, measured in
metronome markings (pulses per minute, abbreviated “M.M.”); in a
larger sense, it can be the rate of activity of any large or small aspect
of the piece of music (or of some other aspect of audio—for example,
the tempo of a dialogue).
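Since a metronome marking counts pulses per minute, the length of one pulse of the metric grid follows directly, and a rhythmic pattern measured in pulses can be converted to clock time. The function names and the example pattern below are hypothetical.

```python
def pulse_duration_s(metronome_marking: float) -> float:
    """Length of one pulse in seconds for a given marking (pulses per minute)."""
    if metronome_marking <= 0:
        raise ValueError("metronome marking must be positive")
    return 60.0 / metronome_marking


def pattern_duration_s(durations_in_pulses: list[float], metronome_marking: float) -> float:
    """Clock time of a rhythmic pattern whose durations are given in pulses."""
    return sum(durations_in_pulses) * pulse_duration_s(metronome_marking)


# At M.M. = 120 each pulse lasts half a second.
print(pulse_duration_s(120))
# A hypothetical pattern of one pulse, two half-pulses, and a two-pulse duration.
print(pattern_duration_s([1, 0.5, 0.5, 2], 120))
```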
Temporal fusion is the perception of reverberant sound used with the direct
sound to create a single impression of the sound in its environment.
Timbral balance is the combination of all of the pitch densities of all of the
recording’s sounds; it is the distribution and density of pitch/frequency
information in the recording/music; it is a characteristic of the overall
texture that represents the recording’s “spectrum.”
Timbral detail (definition of timbre) is the subtle components and/or chang-
es in the content of a sound’s timbre (dynamic envelope, spectral con-
tent and/or spectral envelope).
Timbre is the overall quality of a sound comprising a multitude of functions
of frequency and amplitude displacements; its primary component
parts are the dynamic envelope, spectrum, and spectral envelope.
Timbre perception is the perception of the mixture of all of the physical
aspects that comprise a sound, as a global form, or the overall charac-
ter of a sound, which we recognize as being unique.
Time line is the time axis (or X axis) of an XY graph, and represents the
length of an example, divided into some appropriate time unit.
Time perception is the estimation of elapsed clock time, significant to the
perception of the global qualities of a piece of music and to the estima-
tion of durations when a metric grid is not present in the music.
Tracking is the recording of individual instruments or voices (sound sourc-
es) or small groups of instruments or voices, on separate tracks in a
multitrack recorder.
Transient response is the time required for a microphone or loudspeaker to
accurately track the waveform of a sound source.
Triad is a chord composed of three pitches, combining two intervals of a
third.
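The idea of stacking two thirds can be sketched with note numbers. The sketch assumes equal-tempered semitone counts (a major third as 4 semitones, a minor third as 3) and MIDI-style note numbers where 60 is middle C; the names `TRIAD_QUALITIES` and `build_triad` are hypothetical.

```python
MAJOR_THIRD = 4  # semitones, assuming equal temperament
MINOR_THIRD = 3  # semitones, assuming equal temperament

# Each quality is a (lower third, upper third) pair stacked on the root.
TRIAD_QUALITIES = {
    "major": (MAJOR_THIRD, MINOR_THIRD),
    "minor": (MINOR_THIRD, MAJOR_THIRD),
    "diminished": (MINOR_THIRD, MINOR_THIRD),
    "augmented": (MAJOR_THIRD, MAJOR_THIRD),
}


def build_triad(root_note: int, quality: str) -> tuple[int, int, int]:
    """Return the three note numbers of a triad built on `root_note`."""
    lower_third, upper_third = TRIAD_QUALITIES[quality]
    third = root_note + lower_third
    fifth = third + upper_third
    return (root_note, third, fifth)


print(build_triad(60, "major"))
print(build_triad(60, "minor"))
```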
X–Y graph is a two-axis, two-dimensional graph used herein to notate or
represent the qualities and activities of various artistic elements.
Bibliography
Alten, Stanley R.,
Audio in Media, Seventh Edition, Belmont, CA: Wadsworth
Publishing Company, 2005.
Anderton, Craig,
Audio Mastering, Bremen, Germany: Wizoo, 2002.
Backus, John,
The Acoustical Foundations of Music, Second edition, New
York: W.W. Norton & Co., Inc, 1977.
Ballou, Glen,
Handbook for Sound Engineers, Third Edition, Oxford: Focal
Press, 2002.
Bartlett, Bruce, and Michael Billingsley, “An Improved Stereo Micro-
phone Array Using Boundary Technology: Theoretical Aspects,” Journal
of the Audio Engineering Society 38 (7/8), 1990: pp. 543–552.
Bartlett, Bruce and Jenny Bartlett,
Practical Recording Techniques, Fourth
Edition, Oxford: Focal Press, 2005.
Beatles, The,
The Beatles Anthology, San Francisco: Chronicle Books,
2000.
Beatles, The,
The Beatles Complete Scores, Milwaukee: Hal Leonard
Corporation, 1993.
Bech, Søren, and O. Juhl Pedersen, editors, Proceedings of a Symposium
on Perception of Reproduced Sound; Gammel Avernæs, Denmark,
1987, Peterborough, NH: Old Colony Sound Lab Books, 1987.
Benson, K. Blair, editor,
Audio Engineering Handbook, New York: McGraw-
Hill, 1988.
Beranek, Leo L,
Acoustics, New York: American Institute of Physics, Inc,
1986.
Bergson, Henri,
Matter and Memory, New York: Humanities Press, 1962.
Berry, Wallace,
Form in Music, Englewood Cliffs, NJ: Prentice-Hall, 1966.
Blauert, Jens, “Sound Localization of the Median Plane,” Acustica 22
(1969/70): pp. 205–213.
Blauert, Jens,
Spatial Hearing, Cambridge, MA: The MIT Press, 1997.
Blaukopf, Kurt, “Space in Electronic Music,” in Music and Technology, Stock-
holm Meeting June 8–12, 1970, pp. 157–172, New York: Unipub, 1971.
Borwick, John, editor,
Loudspeaker and Headphone Handbook, Third Edi-
tion, Oxford: Focal Press, 2001.
Borwick, John,
Sound Recording Practice, Fourth Edition, Oxford: Oxford
University Press, 1994.
Butler, David,
The Musician’s Guide to Perception and Cognition, New York:
Schirmer Books, 1992.
Camras, Marvin,
Magnetic Recording Handbook, New York: Van Nostrand
Reinhold Company, 1988.
Chowning, John, “The Simulation of Moving Sound Sources,” Computer
Music Journal 1 (3), 1977: pp. 48–52.
Clifton, Thomas,
Music As Heard: A Study in Applied Phenomenology, New
Haven, CT: Yale University Press, 1983.
Cooper, Grosvenor W., and Leonard B. Meyer, The Rhythmic Structure of
Music, Chicago: The University of Chicago Press, 1960.
Cooper, Paul,
Perspectives in Music Theory, New York: Dodd, Mead &
Company, 1973.
Davis, Don, and Carolyn Davis,
Sound System Engineering, Second Edi-
tion, Oxford: Focal Press, 1997.
Davis, Don, and Chips Davis, “The LEDE™ Concept for the Control of
Acoustic and Psychoacoustic Parameters in Recording Control Rooms,”
Journal of the Audio Engineering Society, 28 (9), 1980: pp. 585–595.
Davis, Gary and Ralph Jones,
Sound Reinforcement Handbook, Second
Edition, Milwaukee: Hal Leonard Publishing Corporation, 1989.
Dell, Edward T., Jr.,
Of Mockingbirds and Other Irrelevancies, Francestown,
NH: Marshall Jones Company, 1993.
Deutsch, Diana,
The Psychology of Music, Orlando, FL: Academic Press,
Inc, 1982.
Deutsch, Diana, and J. Anthony Deutsch,
Short-Term Memory, New York:
Academic Press, 1975.
Dodge, Charles and Thomas Jerse,
Computer Music: Synthesis, Composi-
tion and Performance, New York: Schirmer Books, 1985.
Dowlding, William J.,
Beatlesongs, New York: Fireside, 1989.
Eargle, John,
Handbook of Recording Engineering, Third Edition, New York:
Chapman & Hall, 1996.
Eargle, John,
The Microphone Book, Boston: Focal Press, 2001.
Eargle, John, Music, Sound and Technology, New York: Van Nostrand Rein-
hold, 1995.
Eargle, John, editor, An Anthology of Reprinted Articles on Stereophonic
Techniques, New York: Audio Engineering Society, Inc., 1986.
Erickson, Robert,
Sound Structure in Music, Berkeley, CA: University of
California Press, 1975.
Emerick, Geoff and Howard Massey,
Here, There and Everywhere: My Life
Recording the Music of The Beatles, New York: Gotham Books, 2006.
Everett, Walter, The Beatles as Musicians: The Quarry Men through Rubber
Soul, Oxford: Oxford University Press, 2001.
Everett, Walter,
The Beatles as Musicians: Revolver through the Anthology,
Oxford: Oxford University Press, 1999.
Fay, Thomas, “Perceived Hierarchic Structure in Language and Music,” Jour-
nal of Music Theory 15 (1–2), 1971: pp. 112–137.
Federkow, G., W. Buxton, and K. Smith, “A Computer-Controlled Sound Dis-
tribution System for the Performance of Electronic Music,” Computer
Music Journal 2 (3), 1978: pp. 33–42.
Hall, Donald E.,
Musical Acoustics: An Introduction, Belmont, CA: Wadsworth
Publishing Company, 1980.
Handel, Stephen,
Listening: An Introduction to the Perception of Auditory
Events, Cambridge, MA: MIT Press, 1993.
Harley, Robert,
The Complete Guide to High-End Audio, Albuquerque, NM:
Acapella Publishing, 1994.
Harris, John,
Psychoacoustics, New York: The Bobbs-Merrill Company,
1974.
Hatschek, Keith,
The Golden Moment: Recording Secrets from the Pros,
San Francisco: Backbeat Books, 2005.
Hawking, Stephen W.,
A Brief History of Time: From the Big Bang to Black
Holes, New York: Bantam Books, 1988.
Helmholtz, Hermann,
On The Sensations of Tone, New York: Dover Publica-
tions, Inc, 1967.
Hertsgaard, Mark,
A Day in the Life: The Music and Artistry of the Beatles,
New York: Delacorte Press, 1995.
Holman, Tomlinson,
5.1 Surround Sound Up and Running, Boston: Focal
Press, 2000.
Holman, Tomlinson,
Sound for Film and Television, Boston: Focal Press,
1997.
Howard, David M. and James Angus,
Acoustics and Psychoacoustics, Sec-
ond Edition, Oxford: Focal Press, 2001.
Huber, David Miles and Robert E. Runstein,
Modern Recording Techniques,
Fifth Edition, Boston: Focal Press, 2001.
Huber, David Miles, Microphone Manual: Design and Applications, India-
napolis: Howard W. Sams & Company, 1988.
James, William,
Principles of Psychology, New York: Dover Publications,
Inc., 1950.
Karkoschka, Erhard, “Eine Hörpartitur elektronischer Musik,” Melos 38 (11),
1971: pp. 468–475.
Karkoschka, Erhard,
Neue Musik / Analyses, Herrenberg: Doring, 1976.
Katz, Bob,
Mastering Audio: The Art and the Science, Oxford: Focal Press,
2002.
Katz, Mark,
Capturing Sound: How Technology Has Changed Music, Berke-
ley, CA: University of California Press, 2004.
Kefauver, Alan P.,
The Audio Recording Handbook, Middleton, WI: A-R Edi-
tions, Inc., 2001.
Kefauver, Alan P.,
Fundamentals of Digital Audio, Middleton, WI: A-R Edi-
tions, Inc., 1999.
Koffka, Kurt,
Principles of Gestalt Psychology, New York: Harcourt, Brace,
and World, 1963.
Kuttruff, Heinrich,
Room Acoustics, Second Edition, London: Applied
Science Publishers Ltd., 1979.
LaRue, Jan,
Guidelines for Style Analysis, New York: W.W. Norton &
Company, Inc., 1970.
Leeper, Robert, “Cognitive Processes,” in Handbook of Experimental
Psychology, ed. S. S. Stevens, pp. 730–757, New York: John Wiley &
Sons, Inc., 1951.
Letowski, Tomasz, “Development of Technical Listening Skills: Timbre
Solfeggio,” Journal of the Audio Engineering Society 33 (4), 1985: pp.
240–244.
Lewisohn, Mark,
The Beatles Recording Sessions, New York: Harmony
Books, 1988.
Lewisohn, Mark,
The Complete Beatles Chronicle, London: Hamlyn, 2003.
Martin, George,
All You Need Is Ears, New York: St. Martin’s Press, 1979.
Martin, George, with William Pearson, With a Little Help From My Friends:
the Making of Sgt. Pepper, Boston: Little, Brown and Company, 1994.
Massey, Howard, Behind the Glass: Top Record Producers Tell How They
Craft the Hits, San Francisco: Miller Freeman Books, 2000.
McAdams, Stephen, and Albert Bregman, “Hearing Musical Streams,”
Computer Music Journal 3 (4), 1979: pp. 26–43.
Meyer, Leonard B., Emotion and Meaning in Music, Chicago: The Univer-
sity of Chicago Press, 1956.
Meyer, Leonard B., Explaining Music: Essays and Explorations, Berkeley,
CA: University of California Press, 1973.
Meyer, Leonard B.,
Music, the Arts and Ideas, Chicago: The University of
Chicago Press, 1967.
Miller, George, “The Magical Number Seven, Plus or Minus Two,” Language
and Thought, ed. Donald C. Hildum, pp. 3–31, Princeton, NJ: Van Nos-
trand Company, Inc., 1967.
Mills, A. W., “On the Minimum Audible Angle,” Journal of the Acoustical
Society of America 30 (1958): pp. 237–246.
Moore, Brian C. J.,
An Introduction to the Psychology of Hearing, Fifth Edi-
tion, Oxford: Elsevier Academic Press, 2004.
Moulton, David,
Golden Ears: Know What You Hear, Sherman Oaks, CA:
KIQ Production, Inc., 1995.
Moulton, David,
Total Recording: The Complete Guide to Audio Production
and Engineering, Sherman Oaks, CA: KIQ Production, Inc., 2000.
Moylan, William,
An Analytical System for Electronic Music, Ann Arbor, MI:
University Microfi lms, 1983.
Moylan, William, “Aural Analysis of the Characteristics of Timbre,” Paper
presented at 79th Convention of the Audio Engineering Society, New
York, NY, 1985.
Moylan, William, “Aural Analysis of the Spatial Relationships of Sound
Sources as Found in Two-Channel Common Practice,” Paper presented
at 81st Convention of the Audio Engineering Society, Los Angeles, CA,
1986.
Moylan, William, “A Systematic Method for the Aural Analysis of Sound
Sources in Audio Reproduction/Reinforcement, Communications, and
Musical Contexts,” Paper presented at 83rd Convention of the Audio
Engineering Society, New York, NY, 1987.
Moylan, William, The Art of Recording: the Creative Resources of Music
Production and Audio, New York: Van Nostrand Reinhold, 1992.
Neve, Rupert, “Design and the Designer: A Point of Reference,” Paper pre-
sented at the 99th Convention of the Audio Engineering Society, New
York, NY, 1995.
Newell, Philip,
Recording Spaces, Oxford: Focal Press, 2000.
Newell, Philip,
Recording Studio Design, Oxford: Focal Press, 2003.
Nisbett, Alec,
The Technique of the Sound Studio, Fourth Edition, Boston:
Focal Press, 1979.
Nisbett, Alec,
The Use of Microphones, Second Edition, Boston: Focal
Press, 1983.
Olson, Harry F.,
Music, Physics and Engineering, Second Edition, New York:
Dover Publications, Inc., 1967.
Pellegrino, Ronald, The Electronic Arts of Sound and Light, New York: Van
Nostrand Reinhold Company, 1983.
Plomp, Reinier,
Aspects of Tone Sensation: A Psychophysical Study, New
York: Academic Press Inc, 1976.
Pohlmann, Ken,
Principles of Digital Audio, Third Edition, New York:
McGraw-Hill, Inc., 1995.
Polanyi, Michael,
Personal Knowledge: Towards a Post-Critical Philosophy,
Chicago: University of Chicago Press, 1962.
Pousseur, Henri, “Outline of a Method,” in die Reihe, Nr. 3, ed. Herbert
Eimert and Karlheinz Stockhausen, pp. 44–88, Bryn Mawr, PA: Theodore
Presser, Co., 1959.
Randall, J. K., “Three Lectures to Scientists,” Perspectives of New Music 3
(2), 1967: pp. 124–140.
Reynolds, Roger, “It(’)s Time,” Electronic Music Review 7 (1968): pp. 12–17.
Reynolds, Roger,
Mind Models: New Forms of Musical Experience, New
York: Praeger Publishers, 1975.
Reynolds, Roger, “Thoughts of Sound Movement and Meaning,
Perspec-
tives of New Music 16 (2), 1978: pp. 181–190.
Risset, Jean-Claude,
Musical Acoustics, Paris: Centre George Pompidou
Rapports IRCAM No. 8, 1978.
Roads, Curtis, editor,
The Music Machine, Cambridge, MA: The MIT Press,
1989.
Roederer, Juan G.,
Introduction to the Physics and Psychophysics of Music,
Second Edition, New York: Springer-Verlag, 1979.
Rossing, Thomas D.,
The Science of Sound, Second Edition, Reading, MA:
Addison Wesley Publishing Company, 1990.
Rumsey, Francis,
Spatial Audio, Oxford: Focal Press, 2001.
Rumsey, Francis and Tim McCormick,
Sound and Recording: An Introduc-
tion, Fifth Edition, Oxford: Focal Press, 2006.
Russ, Martin,
Sound Synthesis and Sampling, Oxford: Focal Press, 1996.
Savona, Anthony, Editor,
Console Confessions: The Great Music Producers
in their Own Words
, San Francisco: Backbeat Books, 2005.
Schaeffer, Pierre,
A la recherche d’une musique concrète, Paris: Editions du
Seuil, 1952.
Schaeffer, Pierre, and Guy Reibel,
Solfège de l’objet sonore, Paris: Editions
du Seuil, 1966.
Schaeffer, Pierre,
Traité des objets musicaux, Paris: Editions du Seuil,
1966.
Schouten, J.F., “The Perception of Timbre,” Report of the 6th International
Congress on Acoustics, 90 (1968): pp. 35–44.
Smith, F. Joseph, The Experiencing of Musical Sound: Prelude to a
Phenomenology of Music, New York: Gordon and Breach Science
Publishers, Inc., 1979.
Stravinsky, Igor, Poetics of Music: In the Form of Six Lessons, Cambridge,
MA: Harvard University Press, 1970.
Streicher, Ron and F. Alton Everest, The New Stereo Soundbook, Pasadena,
CA: Audio Engineering Associates, 1998.
Stevens, Stanley Smith, and Hallowell Davis, Hearing: Its Psychology and
Physiology, New York: Acoustical Society of America, 1938, 1983.
Stevens, Stanley Smith, and E. B. Newman, “The Localization of Actual
Sources of Sound,” American Journal of Psychology 48 (1936): pp.
297–306.
Stockhausen, Karlheinz, “The Concept of Unity in Electronic Music,”
Perspectives of New Music 1 (1), 1962: pp. 39–48.
Talbot-Smith, Michael, Editor, Audio Engineers Reference Book, Second
Edition, Oxford: Focal Press, 1999.
Tenney, James, METAHODOS and META MetaHODOS, Oakland, CA:
Frog Peak Music, 1986.
Varèse, Edgard, “The Liberation of Sound,” Perspectives of New Music 5
(1), 1966: pp. 11–19.
Warren, Richard M., Auditory Perception: A New Synthesis, New York:
Pergamon Press Inc., 1982.
Watkinson, John, The Art of Digital Audio, Third Edition, Oxford: Focal
Press, 2001.
Watkinson, John, The Art of Sound Reproduction, Oxford: Focal Press,
1998.
Wertheimer, Max, “Laws of Organization in Perceptual Forms,” in A
Source Book of Gestalt Psychology, ed. Willis Ellis, pp. 71–88, London:
Routledge & Kegan Paul, 1938.
Wilson, David, “Do You Hear What I Hear?” Mix Magazine 8 (6), 1984:
pp. 132–134.
Winckel, Fritz, Music, Sound and Sensation: A Modern Exposition, New
York: Dover Publications, Inc., 1967.
Winckel, Fritz, “The Psycho-Acoustical Analysis of Structure as Applied to
Electronic Music,” Journal of Music Theory 7 (2), 1963: pp. 194–246.
Woram, John M., Sound Recording Handbook, Indianapolis: Howard W.
Sams & Company, 1989.
Discography
Beatles, The.
“A Day in the Life,” Sgt. Pepper’s Lonely Hearts Club Band, EMI Records
Ltd., 1967, 1987. CDP 7 46442 2.
“Carry That Weight,” Abbey Road, EMI Records Ltd., 1969, 1987.
CDP 7 46446 2.
“Come Together,” 1, EMI Records Ltd., 2000. CDP 7243 5 29325 2 8.
“The Continuing Story of Bungalow Bill,” The Beatles (White Album),
EMI Records Ltd., 1968. CDP 7 46443 2.
“The End,” Abbey Road, EMI Records Ltd., 1969, 1987. CDP 7 46446 2.
“Every Little Thing,” Beatles for Sale, EMI Records Ltd., 1964.
CDP 7 46438 2.
“Golden Slumbers,” Abbey Road, EMI Records Ltd., 1969, 1987.
CDP 7 46446 2.
“Here Comes the Sun,” Abbey Road, EMI Records Ltd., 1969, 1987.
CDP 7 46446 2.
“Hey Jude,” 1, EMI Records Ltd., 2000. CDP 7243 5 29325 2 8.
“It’s All Too Much,” Yellow Submarine Songtrack, EMI Records Ltd.,
1999. CDP 7243 5 21481 2 7.
“Let It Be,” 1, EMI Records Ltd., 2000. CDP 7243 5 29325 2 8.
“Let It Be,” Let It Be, EMI Records Ltd., 1970, 1987. CDP 7 46447 2.
“Let It Be,” Let It Be . . . Naked, Apple Corps Ltd./EMI Records Ltd., 2003.
CDP 7243 5 95713 2 4.
“Let It Be,” Past Masters, Volume Two, EMI Records Ltd., 1988.
CDP 7 90044 2.
“Lucy in the Sky With Diamonds,” Sgt. Pepper’s Lonely Hearts Club
Band, EMI Records Ltd., 1967, 1987. CDP 7 46442 2.
“Lucy in the Sky With Diamonds,” Yellow Submarine Songtrack, EMI
Records Ltd., 1999. CDP 7243 5 21481 2 7.
“Lucy in the Sky With Diamonds,” Yellow Submarine (Dolby Digital 5.1
Surround), Subafilms Ltd., 1968, 1999.
“Maxwell’s Silver Hammer,” Abbey Road, EMI Records Ltd., 1969, 1987.
CDP 7 46446 2.
“Penny Lane,” Magical Mystery Tour, EMI Records Ltd., 1967, 1987.
CDP 7 48062 2.
“She Came in Through the Bathroom Window,” Abbey Road, EMI
Records Ltd., 1969, 1987. CDP 7 46446 2.
“She Said She Said,” Revolver, EMI Records Ltd., 1966. CDP 7 46441 2.
“Something,” 1, EMI Records Ltd., 2000. CDP 7243 5 29325 2 8.
“Strawberry Fields Forever,” Magical Mystery Tour, EMI Records Ltd.,
1967, 1987. CDP 7 48062 2.
“Tomorrow Never Knows,” Revolver, EMI Records Ltd., 1966.
CDP 7 46441 2.
“While My Guitar Gently Weeps,” The Beatles (White Album), EMI
Records Ltd., 1968. CDP 7 46443 2.
“Wild Honey Pie,” The Beatles (White Album), EMI Records Ltd., 1968.
CDP 7 46443 2.
“You Never Give Me Your Money,” Abbey Road, EMI Records Ltd., 1969,
1987. CDP 7 46446 2.
Dire Straits.
“Brothers in Arms,” Brothers in Arms, Mercury Records Ltd., 2005.
LC01633 9871498.
“Money for Nothing,” Brothers in Arms, Mercury Records Ltd., 2005.
LC01633 9871498.
“Walk of Life,” Brothers in Arms, Mercury Records Ltd., 2005. LC01633
9871498.
Yes, “Every Little Thing,” Yes, Atlantic Recording Corporation, 1969. 8243-2.
Index
A
“A Day in the Life,” The Beatles, 160, 186–187, 194
Abbey Road, The Beatles, 51, 53, 54, 171, 172, 187
Absolute pitch, 16
definition, 369
Accents, 38, 42–43
Accent microphones, 264, 269, 293–295, 342
definition, 369
Acoustic (live, unamplified) performance, 36, 40, 261, 263, 267–268,
270–273
documenting, 263
recording aesthetic, 266–274
sound relationships, 264
Acoustic space, 10
sound quality, 160–161
Active listening, 78–79
definition, 369
Adjacent (distance), 192
Aesthetic/artistic decisions, 261
capturing timbres, 280–295
devices, 257–259, 296–298
recording aesthetic, 263–274
technologies, 295–296
Album, 74
mastering, 349–354
Ambiance, 58, 207, 265
Amplitude, 5–13, 16, 18, 25–26, 29, 33–34, 69, 198, 295
definition, 369
Analog recording technology, 347
master recorder, 258
multitrack recorder, 257
sound quality, 295–296
tape editing, 347–348
versus digital, 295–296
Analytic reasoning, 74, 78
Analytical listening, 89, 90–91, 93, 94–98, 101, 107, 118–119, 121, 139,
157, 240, 249, 295, 337, 345, 350
definition, 369
production concerns, 312–313, 324
signal processing, 318–319
sound quality, 159–160, 161
Analytical systems (for pitch), 119
Angle of sound source, 10, 14, 22, 23, 26, 209
calculating, 14–15
perception, 24–28
Antheil, George, 160
Artistic elements of sound, 4–5, 34, 36, 37–38, 61, 63–64, 65–66, 66–68,
68–69, 87, 90, 91, 92–93, 98, 101, 177, 239, 242, 245, 248, 320
definition, 369
primary elements, 66–68
secondary elements, 66–68
Audio professional, 85, 86, 92, 94, 103, 108, 131, 157
describing sound quality, 157–159
need to evaluate sound, 86–89, 100, 107–108, 122–124, 160–161,
169–173
Axis (microphone), 282–284, 287
off-axis, 283–284
off-axis coloration, 284–285
on-axis, 284
B
Bach, Johann Sebastian
Brandenburg Concertos, 63
Cello Suites, 63
Bandwidth, 42
Beatles, The, xviii, xxiii, 69
Beethoven, Ludwig van
Symphony No. 6 “Pastoral,” 62
Berio, Luciano, 160
Berlioz, Hector
Requiem, 48
Blend, 289, 291–292
definition, 369
Body (of sound), 9
definition, 369
Book organization, xxiii–xxvi
Brain, 4, 34
Broadcast media, 36, 49, 77
C
“Carry That Weight,” The Beatles, 53
CD content, 359–368
Chords, 17, 38, 39, 45, 118–119, 122, 134
definition, 369
Clock time, 21, 108, 137, 162, 169
exercise, 115
Comb filter effect, 26, 288–289
“Come Together,” The Beatles, 125–126, 128
Communication, xxiii–xxv, 68–69, 74–76
about sound, 4–5, 85, 86, 87–89, 100, 102, 114–115, 157–158,
249
Complete evaluation, 224–251
definition, 369
Complete evaluation graph, 241, 243, 244–246, 251
definition, 370
Composite musical texture, 41, 46, 48, 144, 224
Composite sound, 10, 12, 13, 29, 48, 67, 93, 146, 196–197, 224, 239,
279
Crescendo, 43
Critical distance, 294
Critical listening, 89–91, 93, 94, 100, 107, 118–119, 121, 128, 131, 139,
146, 157, 158–162, 176, 267, 286, 297
definition, 370
signal processing, 318–319
sound quality, 158–162
production concerns, 313–316
D
Depth of sound stage, definition, 270
Digital audio workstation, 256–259
definition, 370
Digital recording technology
master recorder, 258
multitrack recorder, 257
sound editing, 348–349
sound quality, 295–296
versus analog, 295–296
Diminuendo (decrescendo), 43
Direct-field monitoring, 302–304
Direct-to-master recording, 334, 339
sequence, 341–344
Direct sound, 11–12, 27, 28–30, 58, 189, 198–199, 201–202, 288, 290,
294, 301–302, 330, 333
definition, 370
Direction, 14
localization, 25–28
Directional location, 183–184
Directional sensitivity, 281, 282–283
definition, 370
Dire Straits
“Brothers in Arms,” 58–59
“Money for Nothing,” 213–214
“Walk of Life,” 59
Distance, 10, 12, 17, 23, 24, 29–30, 44–45, 49, 50, 52–55, 57–58, 147,
175–176, 180, 182, 189–191, 192–195, 264–265, 267–268, 287,
288, 289–290, 292–293, 320, 322, 329, 330–331, 333–334
areas, 188–189
changes to sound with, 12, 189
critical distance, 294
depth of sound stage, 52–53, 57, 196, 235, 290, 320, 329
judgments, 27
microphone to sound source, 189–190
microphone to stage, 264
perception, 28–29
sensitivity, 281, 286
stage-to-listener, 52–53, 265, 330
Distance dimensions, definition, 370
Distance location, 28–29, 38, 45, 50, 52–53, 56, 66, 101, 114, 176, 181–
182, 188–189, 233, 235, 239, 241, 245, 248, 276, 280, 362
confusion, 189–191
continuum, definition, 370
environment influence, 205, 280
evaluation, 192–195
exercise, 216–217
graph, 193–195, 238
graph, definition, 370
understanding, 189–191
Distance perception, definition, 370
Distance sensitivity, definition, 370
Duration, 9–10, 16–20, 30–31, 32, 37–38, 45, 93
definition, 371
perception, 19, 30–31, 32
patterns of, 37
Dynamic contour, 38, 42, 43–44, 48, 66, 93, 101, 138, 141, 143, 144,
146–148, 161, 162, 164–165, 167–169, 199–200, 230, 232, 276,
320–322, 325, 353
definition, 371
evaluation, 241, 244, 246, 312–314
exercise, 154–155
graph, 147–148, 232
Dynamic envelope, 6–7, 9–10, 21, 93, 144, 157, 159, 160, 162, 169, 200
defining, 163
definition, 371
environment, 200
Dynamics
describing, 42, 139
levels and relationships, 31, 37, 38, 42–45, 67, 93, 101, 124,
138–142, 146–147, 149, 151, 201, 266, 320, 322, 324–328,
342–343, 350, 352, 353–354
levels as ranges, 143–144
performance intensity, and, 141–142
pitch areas, 124–129
Dynamic speech inflections, 42
E
Early reflections, 11–12, 29–30, 198–199
definition, 371
Early sound field, 11–12, 30, 330
definition, 371
Editing, 268–269, 339–349
analog tape, 347–348
computer-based, 257–258, 348–349
identifying edit points, 345–347
razor blade, 258
Effective listening zone, 298, 300–302, 304
definition, 371
Elevation, 14, 49
Emotions, 5
“End, The,” The Beatles, 126–127
“Every Little Thing,” The Beatles, 73
Environmental characteristics, 29–30, 34, 38, 48, 53–55, 58–59, 64, 66,
69, 101, 160, 176, 177–178, 182, 189, 191–200, 241, 263, 269, 270,
276, 279–280, 288, 289, 293
composite sound, 196–197, 235–237
definition, 371
evaluation, 196–197
evaluation definition, 371
exercise, 218–221
graph, 198, 201–203
graph definition, 371
perceived performance environment exercise, 220–221
perception of, 29
sound sources, 196–197, 279, 327, 328, 332–333
spectrum and spectral envelope exercise, 200, 218
surround, in, 58–59
Environment, 6, 10–14, 23–24, 29–30, 263–265
cues, 14, 24, 52, 57, 58, 139, 177, 184, 204, 213, 264, 290
distance, 190–191
geometry of, 10, 12
Equalization, 121, 301, 315, 349, 352
room, 301
Equivalence, 67, 68–69, 159–160
definition, 371
Estimation of pitch-level, 16–17
Evaluating sound, 80, 85–86, 88, 90, 92, 95–98
discovering characteristics, 97–98
objective, 3
system, 100–115
F
Far (distance), 188–191
Filter, 165
Focus, 92–94, 103, 104, 139, 243–245, 249, 311–312
definition, 371
exercise, 251
shifting focus, 93, 239, 241, 243–244, 251, 324
Form, 63–65, 72–73, 224, 230, 248, 275
definition, 371
musical form, 63–65
overall texture, 224, 230, 248
song lyrics, 71–72
Formant (formant regions), 8, 226, 282, 315
definition, 371
Frequency, 5–8, 15, 16, 26–27, 34, 38, 69, 120–122, 282, 295, 298, 305,
317, 327
bands, 19, 87, 118, 124–129, 165, 202, 224, 282, 299, 352
bands, definition, 372
definition, 371
estimation, 121–122
processors, 317–318
range, 8, 15, 79, 122
registers, 121–122
response, 281, 282, 284, 285, 287, 289, 291, 296, 300, 304–305,
368
response, definition, 372
Front sound field, 56, 57, 58, 265
Fundamental frequency, 7–9, 16, 22, 28, 32, 167, 226
definition of, 157, 163, 372
describing, 164
Fusion, 5, 22, 279
G
“Golden Slumbers,” The Beatles, 53
Graphs, 102
against time line, 112–113
environmental characteristic graphs, 201–203
evaluation graphs, 101
evaluations, for making, 247
graphing states and activity of sound components, 106–112,
247
key, 112
multiple sources, 110–112
multitier, 109–110
musical balance graphs, 153, 233–235
performance intensity graphs, 153
recreating sounds, 102
stereo sound-location, 186
supplementing, 114
surround sound location graphs, 212–213
uses, 100, 247
vertical (Y) axis, 108–109
horizontal (X) axis, 107
X-Y graphs, 101, 105, 111
H
Haas (precedence) effect, 33
Harmonic motion, 39, 42, 65
Harmonic progression, 39–40, 64, 66, 119
definition, 372
Harmonic rhythm, 39, 45
Harmonics, 7–9, 16, 32, 93, 110, 163, 167–169, 170, 172, 226
definition, 372
harmonic series exercise, 35
harmonic series, 8, 164–165, 168, 359
Harmony, 16, 39, 68, 118
definition, 372
Head movement, 28, 58
Headphones, 305–306, 355
“Here Comes the Sun,” The Beatles, 54, 91, 129, 130, 147, 148, 172,
245, 246
“Hey Jude,” The Beatles, 202, 203, 204, 205, 206
Hierarchy, 40, 63–67, 92–93, 105
artistic elements, 68
definition, 372
dynamics, 139–146
environments, 182
musical materials, 63–64, 67, 321
musical productions, 68
musical structure, 64, 65, 71, 90, 92–93
Horizontal plane, 14–15, 23, 24, 49, 176, 178–179, 186, 210
Host environment, 10–14, 23–24, 26, 29–30, 53–54, 58, 68, 179, 180,
190, 192, 195, 196–197, 199, 202, 204–205, 241, 267, 327
artificially generated, 14
characteristics, 26, 29, 195–196
definition, 372
evaluation, 196–200, 241
perception of characteristics, 29
source distance within, 190–195
I
Ideal seat, 53, 267, 294
Imaging, 38, 50, 51–52, 58, 177, 178–180, 182, 188, 193, 195, 212–214,
238, 264, 271, 300–302, 305–306
definition, 372
empty stereo sound stage, 188
headphones, 305–306
in surround, 55–56
Inherent sound quality, 89, 259, 295–298
definition, 372
Integrity (see technical quality)
International Telecommunications Union (ITU), 207–208
Infinity (distance), 191–193
Instrumentation, 41, 64, 245, 310–311, 323
selection, 46–47
Interaural amplitude differences (IAD), 25–26
definition, 372
Interaural spectral differences (ISD), 25, 27
definition, 372
Interaural time differences (ITD), 25–26
definition, 372
Internal pitch references, 120–121
Intervals (pitch), 17, 19, 38–39, 119, 131, 133
definition, 372
harmonics, 17
melodic, 17, 19, 131, 133
“It’s All Too Much,” The Beatles, 169–170
K
Key, definition, 372
L
“Let It Be,” The Beatles, 128, 326
Listener, xxiii, 4–5, 73–75
altering recordings, 354–355
attentiveness, 4, 21, 61, 74, 78
audience member, 61, 73–74, 80, 273
expectations, 21, 74, 77–78, 80, 86, 273–274
experience, 4, 21, 29, 62–63, 71, 75–78, 80, 85
hearing characteristics, 4, 15, 73–75, 79–80
intelligence, 4, 75
knowledge, 4, 21–22, 28–29, 73–80, 85, 151, 168, 193, 197, 202,
318, 325
lay-listener, 74
location, 10, 23–24, 28, 45, 52, 176–178, 179, 188, 192–193, 198,
209, 211, 264, 272, 302–304, 305–306, 331
musical preferences, 74, 79, 354
social-cultural conditioning, 71, 74, 76–77
Listening
distractions, 94–95
fatigue, 18, 19, 34, 304, 305
level of involvement, 73–74
multidimensional, 89, 94
process, 89, 92, 94, 118, 157, 159, 286, 297, 311, 345
purpose, 88–95
recordist, and the, 74, 239
recreational, 224, 233
skill level, 29, 30, 74, 86, 89, 94
writing, and, 81, 102, 112, 119, 132–133
Listening skill development, 89–98, 100–115, 118, 120, 131, 163, 239–
247
Literary meaning, 70–71
Live acoustic recordings, 268
Location, 10, 49
judgments, 27
listener, 10, 23–24, 28, 45, 52, 176–178, 179, 188, 192–193, 198,
209, 211, 264, 272, 302–304, 305–306, 331
source in environment, 10–13, 23–24
Loudness, 16, 17, 18–19, 28, 31, 32, 33–34, 37, 38, 43–45, 47, 93, 118,
158, 163, 164, 173, 189, 195, 200, 226–227, 305, 311, 319–320,
325–326, 328, 343, 346, 349, 353–354
definition, 372
evaluating, 138–153, 367
mastering, 349–350
perception, 18–19
perception exercise, 307–308
prominence, and, 139, 149
relative levels, 18, 167, 169
Loudspeakers, 4, 23, 33, 49, 56, 176, 298–300, 302–304, 306, 354, 366
array, 24, 49, 51, 183
audiophile, 299
consumer (home) quality, 299, 304
placement, 23, 298–304, 366
pro audio, 299
“Lucy in the Sky with Diamonds,” The Beatles, 145–146, 149–150, 195,
228–229, 231–232, 233–234, 235–237, 238, 239, 242
Lyrics, 66, 70–72, 92, 145
M
Masking, 19, 33–34, 286, 338
definition, 373
Master tape, 345
Mastering, 319, 333, 339, 344, 349–354
assembly, 350–351
coding, 354
definition, 373
dynamic levels, 353–354
editing, 351
process, 205, 349–351
recorder, 267–268
session, 344
timbral balance, 351–353
“Maxwell’s Silver Hammer,” The Beatles, 171–172
Melodic contour, 101, 131–134, 241
analysis exercise, 137
graph, 133
graph, definition, 373
graphing against time line, 131–132
Melodic lines, 38, 39, 42, 66, 69, 79, 93, 122
Memory, 4, 22, 75, 120, 122, 134–135, 136, 154, 197, 225, 247
development, 95–96, 132, 307
development exercise, 99
sound as, 96
Metric grid, 19–20, 45, 107–108, 116, 133, 147, 149, 151, 162, 185, 193,
210
definition, 373
Metronome markings, 20, 45
Microphones, 257, 264, 269, 276, 280–295, 315, 341
accent microphones, 264, 269, 293–295, 342
axis, 283
bi-directional, 291
blend, 291–292
capturing timbres, 280–281
cardioid, 291
comparing placement exercise, 309
directional sensitivity, 281, 282–284
distance sensitivity, 189–190, 264, 286
frequency response, 281, 282, 284, 285–286, 287, 289, 291
identifying and comparing microphones exercise, 308
microphone technique exercises, additional, 309
off-axis coloration, 284
performance characteristics, 281–286
placement, 85, 190, 268, 281, 287–292, 315, 341
positioning, 289–291, 292, 293
proximity effect, 291
transparent recordings, in, 267–268
transient response, 285–286
Mind, 4–5, 15, 20, 22, 30, 37, 45, 65, 96, 177
Mix, 149, 156, 187, 194, 212, 233, 245, 304–305, 319–335, 340–341
anticipating, 315–316
composing and performing, 319–335
exercise, 337–338
exercises modifying completed mixes, 356–357
rehearsing, 345
shaping sound sources, 233–239
submixes, 269, 316
Mixing console, 257–258, 295, 314, 325
combine signals, 258
pre-processing, 257
record levels, 257
routing, 257–258
Mixing process, 44, 151–153, 156, 190, 228, 233, 266, 269, 296, 314,
315–316, 320–321, 324, 326
definition, 373
mixdown session, 258, 316, 318, 320, 340
performing the recording, 319–335
“Money for Nothing,” Dire Straits, 213–214
Monitor level, definition, 373
Monitor system (see also Playback), 176, 257, 276, 298–306, 350
definition, 373
hardware, 298
listening room, 298–300, 303
loudspeaker and room interaction, 300–302
levels, 298, 304–305
sound quality, 298
Monitoring, 298–306
direct-field, 302–304
headphones, 305–306
near-field, 302–304
room, 300–304
Moving sources, 38, 51
Multitier graph, definition, 373
Multitrack recording, 53, 156, 264, 266, 339, 340–341, 342, 345
Music, 61
aesthetic experience, 62–63
emotions, 62, 74
experience, 75–77
functions, 61–63, 77
Musical balance, 39, 42, 43, 44, 93, 95, 101, 110, 141, 144, 149–150, 161,
233, 241, 245, 248, 270, 276, 277, 293–294, 361, 362
crafting, 324–326
definition, 373
exercise, 60, 155–156
graph, 149–150, 152, 160, 228, 234
graph, definition, 373
mixdown, 320, 323–326, 338, 356–357
performance intensity, vs., 151–153
sound quality, 265–266
Musical ideas (see Musical materials)
Musical materials, 20, 64, 65–69, 73, 76, 77, 78, 79, 89, 90, 91, 93, 95,
101, 118, 140, 145, 177–178, 225, 230–231, 235, 239–240, 241–
243, 244, 245, 248, 264–265, 269, 270, 272, 276, 317, 321–324,
329–330, 333
patterns, 64–66
pitch density, 225–227
primary musical materials, 66, 73, 140, 265
secondary musical materials, 66, 321
sound quality, linkage, 239
timbral balance, 227–229
Musical message, 5, 37, 39, 46, 48, 61–63, 64, 65, 67, 69, 74, 79, 80, 93,
118, 138, 195, 240, 243, 261, 270, 330
Musical memory development exercise, 99
Musical style, 74, 75, 77, 78, 269
Music performance, 40, 311
altered realities, 271–274
definitive performance, 271, 273
enhanced performances, 268–269
live performance relationships, 264
permanent performance, 271
perfect performances, 269, 272–273
recording aesthetic, 266–271
Music productions, 56, 69–70, 151, 177, 180
N
Near (distance), 191, 192
Near-field monitoring, 302–304
Neural impulses (signals), 4, 15, 34
Neve, Rupert, xv, xxi
Noise, 8, 34, 63, 126, 162, 174, 240, 257, 291, 296, 299, 300, 304, 313,
314, 315, 316, 336, 345, 346, 349, 350, 351
noise gate, 256, 316, 317, 337
pink noise, 367
Nominal level, 200
definition, 373
Notating sounds (see also Graphs), 106, 112, 247
snap shots of time, 113–114
O
Objective descriptions, definition, 373
Off-axis, definition, 374
Off-axis coloration, definition, 374
Onset, 9
definition, 374
Orchestration, 41, 225, 328
Overall program, 41, 48, 91, 141, 146–147, 160, 161, 177, 180, 181, 189,
204, 224, 227, 243, 248, 267, 269, 328, 333, 344, 352, 353
characteristics of, 230–233
environment of (see perceived performance environment),
53–54, 230–233
Overall texture (see Overall program)
definition, 374
Overtones, 7, 8, 9, 22, 32, 93, 110, 163, 164, 165, 168, 169, 226
definition, 374
P
Partials, 8–9, 22, 28, 32, 163, 165–166, 168, 174–175, 191, 226
definition, 374
Passive listening, 61, 78–79
definition, 374
Pattern perception, 39, 45, 64–65, 95, 198–199
Penderecki, Krzysztof, 160
“Penny Lane,” The Beatles, 44, 60
Pentatonic system, 40
Perceived Parameters, 4, 5, 15–34, 36–38, 74, 90, 91, 92, 99, 100–101,
104, 106, 114, 164, 312, 313, 318, 345
definition, 374
duration, 19–21
interaction, 30–34
loudness, 18–19
pitch, 16–17
spatial characteristics, 23–30
timbre, 21–22
Perceived performance environment, 38, 49, 50, 52, 53–55, 58, 101,
114, 177–184, 188, 195–196, 201, 203–205, 224, 230, 248, 263,
265, 275–276, 321–322, 329, 333, 357
characteristics, 182, 195–196
definition, 374
evaluation of, 233, 241
exercise, 220–221
overall texture, 224, 230, 333
Perceived performance intensity (see Performance intensity)
Perfect pitch, 16
definition, 374
Performance intensity, 38, 44, 47, 60, 101, 139, 140, 141–142, 144, 147,
149, 167, 226–227, 238, 250, 266, 269, 276, 313, 320, 325, 326,
336, 339, 346, 361, 362
definition, 374
dynamic markings, and, 141–152
musical balance exercise, and, 60
versus musical balance, 110, 151–153, 233
versus musical balance exercise, 156
versus musical balance graph, 152, 234, 241, 244, 246
Performance intensity versus musical balance graph, definition, 374
Performance space, creating, 328–334
Performance techniques, 38, 47
Perspective, 43, 89, 91, 92–94, 95, 103, 104, 124–125, 129, 132, 144,
146–147, 149, 157–159, 192, 224, 226, 239–242, 276, 320, 321,
341, 345
definition, 375
exercise, 251
focus, and, 92–94, 243–246
hierarchical levels of, 93
shifting of, 94, 243–246, 311–316, 317–319
sound quality, and, 160–173
Phantom images, 38, 51–52, 56–38, 180, 184, 194, 196, 209, 212–213,
223, 293, 329
definition, 375
point source, 51–52, 56, 186, 210–211, 215, 223
spread image, 51–52, 56, 186, 210–211, 215, 223
Phon, 18
definition, 375
Physical dimensions, definition, 375
Pinna, 27
Pitch, 8–9, 16–17, 34–35, 37, 38–42, 43, 45, 66–69, 76, 93, 101–102,
158–175, 176, 200, 201, 225–229, 230, 232, 235, 282, 312, 313,
320–321, 324, 346
concerns of audio production, 40–42, 326–328
definition, 162, 163, 166, 167, 169–170, 172, 174, 375
estimation, 359
evaluating, 118–137
examining pitch definition, 163
internal pitch reference, 120–121
intervals, 16–19
level estimation exercise, 136
levels and relationships, 38–42
perception, 16–17, 30–32, 121
pitch center, 17, 39
pitch reference exercise, 134–135
pitch register boundaries, 360
quality, 31, 41, 163, 167, 174
recognizing levels, 121–124
registers, 121, 129, 136, 139, 169, 231, 245, 322, 327
traditional uses of, 39–40
Pitch area, 38–42, 101, 118, 125–128, 129, 161, 199–200, 201, 219, 225–
227, 233, 235, 241–242, 245–246, 249–250, 326, 327, 328
analysis exercise, 126, 136–137
analysis graph, 125
analysis graph, definition, 375
definition, 375
density, 129
environments, 199
evaluation, 119, 129
frequency band recognition, 124–128
musical idea, 129, 226–227
secondary, 124, 226
Pitch definition, definition, 375
Pitch density, 39, 40–41, 66, 69, 101, 129, 159–160, 225–227, 241–242,
245–246, 276, 316, 319, 320, 323, 327, 328, 330
definition, 375
evaluating, 161, 225–227, 250
exercise, 249–250
graph, 153, 223–224
overall texture, 230–233
pitch area and, 129
timbral balance and, 225–229, 320
Pitch reference, definition, 375
Pitch-level estimation, definition, 375
Pitch/frequency registers, definition, 375
Playback (listening) environment, xxvii, 23–24, 56, 176–177, 298–306
interaction with reproduced sound, 23, 354–355
listener location, 23–24, 300–304, 355
qualities, 23, 298
system, 24, 176, 276, 298, 299–300, 349, 354–355, 366–367, 368
Poetry, 70–71
Point of reference, xiii–xv, 58, 165, 197, 308, 353
Point source, 51–52, 56, 186, 210–211, 215, 223
definition, 375
Polar pattern, 282–284
definition, 376
Prefix, 9–10
definition, 376
Preprocessing, 257, 293, 316, 317
Present, 20–21, 75
definition, 376
Primary elements, 66–68
definition, 376
Primary musical materials, 66, 73, 140
definition, 376
Primary phantom images, definition, 376
Production aesthetic, 261–274
Production transparent recordings, 267
spatially enhanced, 267–268
Program Dynamic Contour, 38, 93, 146–147, 164, 224, 231, 241, 244,
248, 276, 312, 321–322, 328, 346, 353, 366
definition, 376
exercise, 154–155
graph, 147–148, 231–232
overall texture, 224, 230
Proximity (distance), 190–195, 245, 290, 330
definition, 376
Proximity effect, 291
R
Range, 8, 15–16, 38–42
definition, 376
dynamic levels, 143–144
frequency, 8, 15, 26–27, 79, 122, 299, 327
pitch, 16, 41, 225, 228, 323
Rate of activity, 37, 45
Ratio of direct to reflected sound, 12, 28–29, 189, 289
Reach, 286
Rear sound field, 14, 24, 27–28, 57–59, 179, 188, 194, 207, 209, 329,
333
Recording, xxii, xxv
aesthetic, 263–266
as creative process, xxv, 3, 36, 37, 80–81, 242–243
envisioning, 275–276
studio as musical instrument, 262
unique sound qualities of, xxii, xxv–xxvi, 37, 97, 138–139
Recording aesthetic, 376
Recording production process, 55, 255–260, 340
aesthetic, 261–274
direct-to-master recording sequence, 334, 339, 341–344, 345
monitoring, 298–306
multitrack recording sequence, 340–341
overview, 255–256
sessions, 269, 311–316, 341
tracking, 311–316, 341
Recording/reproduction (signal) chain, 3, 256–259, 295
system control methods, 258
Recording sessions, definition, 376
Recording (performance) space, 177, 205, 263, 264, 267, 268, 287, 293,
294, 314, 322, 328–334, 342, 346
controlling sound of, 287–291
creating a recording space, 328–334
reflective surfaces, 288–289, 290
Recordist, xiii, xiv, xxii, xxiv, 4, 74–75, 85, 277
analytical listening concerns, 312–313
anticipating mixdown, 315–316
artistic roles, 262–263
as artist, xxii, 36, 69–70, 80–81, 255, 355
controlling craft, 100, 255, 295
critical listening, 313–315
definition, 4, 376
developing a production sound, xxvii–xxviii
distractions during listening, 94–95
evaluating sound, 79–80, 86–87, 89, 95–96, 97–98, 100–117, 173,
195–196, 233, 243, 298, 311–315, 318–319, 344–345
examining pitch definition, 163
guiding music making, 259–260
impaired hearing, 79, 305
listening, xxiv, 69, 73–75, 311–315, 318–319
loudness contour, 147
overall levels, 138, 147
program dynamic contour, 138, 147
selecting equipment, 296–298, 315
selecting technologies, 295–296
sound quality, 173, 298
understanding the production sounds of others, 239, 242
Reference dynamic level (RDL), 42, 138–139, 140–141, 144–147, 149,
154, 155, 164, 165–166, 167–169, 231, 238, 241, 248, 276, 321–
322, 351, 353
defining, 140, 144–146, 241
definition, 376
designating, 144, 147, 149
exercise, 153–154
overall texture, 230–233
Reference frequencies, 359
Reference pitches, 359
Reflected sound, 11, 12, 13, 28–30, 58, 189, 198, 289–290, 300, 303
definition, 377
patterns of reflections, 13, 199
rhythms of reflections, 361
Reflection envelope, 197, 198–199, 201
Reflections and reverberation exercise, 199–200, 217
Reflective surfaces, 11, 29–30, 288–290
Register, 16, 38–40, 121
definition, 377
Relative pitch, 16, 120
definition, 377
Resonance peaks, 8
Reverb (device), 202, 217, 218, 290, 317, 318, 330, 337, 338, 357
Reverberant sound, 11, 12, 13, 28, 29, 30, 190, 193, 194, 199, 216, 217,
220, 281, 284, 294, 309, 329, 330
definition, 377
Reverberation time, 11
definition, 377
Room monitoring, 302–304
Rhythm, 20, 45, 66–68, 312, 347
text and lyrics, 70, 71
Rhythmic patterns, 37, 38, 39, 45, 250
definition, 377
S
Secondary elements, 66–68, 177
definition, 377
Secondary musical materials, 66–68
definition, 377
Secondary phantom images, definition, 377
Sgt. Pepper’s Lonely Hearts Club Band, The Beatles, 145–146, 149,
160, 238, 350
Shadow effect, 25
“She Came in Through the Bathroom Window,” The Beatles, 51
“She Said She Said,” The Beatles, 113
Signal chain, 256–259, 276, 295, 299, 306, 314, 334–335, 336, 356
definition, 377
digital audio workstation, 256–259
mixdown, 257
tracking, 257
Signal processing, 23, 233, 258, 264–265, 267, 288, 310, 317–319, 341,
344
amplitude processors, 317
exercises, 337
frequency processors, 317
global signal processing, 341, 349
signal processors, 89, 257, 259, 317–318, 319
time processors, 267, 317
Sine wave, 7, 163, 359
Song, 70–73, 81, 231, 321, 322
graphs, 126, 127, 128, 130, 133, 146, 148, 150, 152, 170, 171, 172,
187, 194, 195, 203, 204, 205, 206, 213, 229, 232, 234, 236,
237, 238, 246
time line, 73, 81, 101, 113, 231
“Something,” The Beatles, 128
Sonic imprint, 259, 295
Sound, 3–35
altered by listening process, 4
artistic elements, 4–5, 36–38
as idea or message, 4–5
as memory, 96
describing sound exercise, 172, 174
descriptions, 87–89
distortions of, 4
evaluating, 100–117
in human perception, 4
perceived parameters, 4–5, 15–34, 36, 37, 38, 68, 74, 90–92, 94,
100, 101, 104, 106, 312, 313
physical dimensions of, 5–16, 21, 34, 36, 37–38, 88, 90, 100–101,
157, 163, 317–319
states of, 3–5
Sound designer, 278
Sound evaluation sequence, 103–106
Sound event, 64, 75, 91, 95–96, 103–106, 108–110, 112, 113
definition, 377
Sound object, 48, 64, 91, 103–104, 106, 108–109, 114, 157, 161
definition, 377
Gibson J-200 comparison, 91
Sound mass composition, 160
Sound pressure level (SPL), 18, 19, 25, 32, 44, 304–305, 367–368
Sound quality, 35, 37, 38, 42, 43, 67–68, 87–88, 90, 91, 92, 101, 144,
317
characteristics graph, 167–173
concerns, 326–328
definition, 377
evaluation, 100–117, 157–175, 176
evaluation exercise, 173, 174–175
harmonic series exercise, of the, 35
host environment and, 53
importance, 46–47
inherent, 89, 259, 295
perspective, and, 160–161
sample evaluations, 169–173
sound sources and, 46–48
spatial properties and, 49
Sound quality evaluation, definition, 377
Sound sources, 17, 23–24, 37, 38, 41, 43–44, 49–54, 58, 93, 101, 113
against time line, 101, 113
artistic resources, as, 277–280
creating, 278–279
environments, 180–182, 195–197, 320
exercise in plotting against time line, 116–117
listing, 112–113, 240
nonmusical sources, 279–280
overall texture, 233–239
performers as, 278
selecting, 277–280
sound quality and, 46–48
sound quality of, 233, 238–239, 289, 311–318, 330–331
sound stage, 180–182
surround location, 49
timbre and environmental characteristics, 196–197, 279,
328–334
Sound stage, 38, 49, 50, 54–55, 59, 68, 93, 101, 114, 177–179, 230, 235,
248, 263–265, 269, 331
definition, 378
depth, 51, 52–53, 57, 188–189, 194, 196, 205, 235, 293, 329
dimensions, 38, 235, 245, 321, 329, 365–366
distance, 52–53, 182–184
empty, stereo, 188, 265, 328–335
empty, surround, 265, 328–335
front edge, 52–53, 178–179, 188, 193, 331
imaging and, 50–52, 177–179, 180
surround, 183–184
width, 51, 264
Sound synthesis, 48, 278, 306, 320
Space, 6, 10–15, 16, 34, 37, 38, 53–54, 55, 93, 181, 182, 183, 190, 192,
193, 195, 196, 202, 225, 230, 235, 263
artistic element, as, 177–184
definition, 378
perception, 23, 38
Space within space, 38, 54–55, 58, 180, 181, 182, 195, 196, 203–206,
241, 364
definition, 378
evaluating, 203–206, 241
exercise, 221–222
Spatial characteristics, 23–34
evaluating, 176–223
Spatial properties, 14, 34, 37, 38, 48–49, 59, 68, 101, 230, 328, 334,
342, 346
Spatial relationships, definition, 378
Spectral content (see Spectrum)
Spectral envelope, 6, 9–10, 21, 38, 43, 93, 144, 157–158, 160, 161, 165–
166, 277, 279, 290, 291, 296
defining, 162–166, 167, 168, 169, 171, 172, 228
definition, 378
environment, 196, 199–202, 218–219, 333
exercise, 218
Spectrum, 6–10, 12, 22, 26–27, 28–29, 32, 34, 38, 41, 124–125, 129, 157,
160–161, 172, 174, 226–228, 230, 249–250, 277, 279, 280–284,
289, 322, 327, 329, 332–333, 352, 367
defining, 163, 165–166, 168
definition, 378
environment, 197, 199–200, 201–203
exercise, 218–219
pitch perception and, 32
Spread image, 51–52, 56, 186, 210–211, 215, 223
definition, 378
Stage width (stereo spread), 51
definition, 378
Stage-to-listener distance, definition, 378
States of sound, definition, 378
Stereo location, 38, 48, 51–52, 68, 101, 183, 228, 235, 306, 323, 347,
363
definition, 378
evaluation, 184–187, 238
exercise, 215, 337–338
graph, 196, 184–188, 210, 236, 241, 245–246, 338
Stereo microphone techniques, 53, 264, 316, 365
definition, 378
Stereo (two-channel) sound, 15, 23–25, 49–50, 56, 58, 207–208
definition, 379
evaluation, 176
playback array, 23–24, 49–50, 179, 183, 207–208
sound localization in, 184–187, 188
stage, 188, 211
Stockhausen, Karlheinz, 160
Storage media, 3
“Strawberry Fields Forever,” The Beatles, 152, 153
Structure, 63–65, 72–73, 101, 231, 235, 241–245, 321
definition, 379
exercise, 81
patterns, 64
text, 70
song lyrics, 71–72
Subharmonics, 7–9
Subjective impressions, xxv, 89, 103, 106, 114
Subtones, 7–9
Subwoofer, 207, 299
Surround sound, 15, 24, 28, 49, 207–214, 237, 319, 329, 332, 333
aesthetic considerations, 55–57
definition, 379
dimensions, 55–59
distance, 57–58
environmental characteristics, 58–59
exercise, 222–223
evaluating, 176
format considerations, 207–208
imaging, 55–57, 187, 188
mix, 210–211
phantom images, 56–58, 207–208, 212–213, 223
sound localization, 25, 183
sound from behind, 58
sound stage, 183–184, 210–212, 213–214
stereo and, 48–59
Surround (sound) location, 37, 49, 55–56, 66, 101, 183, 207, 213, 223,
233, 235, 241, 265
definition, 379
evaluation, 209–214
exercise, 215, 222–223
graph with time line, 209, 210, 211, 213, 237
graph as sound stage, 212, 214, 237
System for sound evaluation, 100–117
complete evaluations, 239–246
T
Target audience, 80
Technical quality, 85, 90, 158, 297, 312–313, 344, 346, 354
Technology selection, 295–296
Tempo, 20, 38, 45, 108, 115, 133, 140–141, 145, 312, 336, 342, 347, 351
definition, 379
Temporal fusion, 29
definition, 379
Text, 62, 66, 70–73, 90, 93, 101, 224, 231, 240–241, 248, 275–276, 279,
321–323, 326
Texture, 48, 66, 68, 69, 93, 129, 138–139, 144, 149, 159–161, 177, 189,
201, 202, 203, 221, 224, 225, 227, 228, 230–239, 243, 247–248,
250–251, 264, 270, 275, 279, 311, 324–330
Timbral balance, 38, 41, 46, 48, 93, 101, 119, 159, 160, 224, 227–229,
230–231, 235, 238, 241, 242, 243, 245, 247, 248, 277, 320–322,
327–328, 333, 338, 350, 356
adjusting, 351–353
definition, 379
exercise, 250
graph, 228, 229, 233, 235, 241
pitch density and, 225–229
Timbral detail, 12, 28–29, 53, 57, 182, 189–190, 192, 194, 239, 280, 286,
289–290, 299, 309, 313–314, 316, 330, 336, 338
definition, 379
Timbre, 6–10, 16, 21–22, 23, 28–29, 31, 32, 34, 37, 38, 44, 45, 46–48, 53,
58, 66, 68, 69, 93, 102, 108, 120, 127, 129, 131, 138, 141, 144, 151,
189–190, 193, 196–197, 200, 225–227, 228, 266, 268–269, 276–277,
279, 286, 288, 289–291, 308, 309, 313, 315–316, 317, 318–319,
325, 326–328
analysis, 127, 157–175
capturing, 280–295
choosing, 277–280
definition of timbre, 6, 12, 28, 159, 192, 216, 294, 379
perception, 21–22
prior knowledge, 151
Timbre perception, definition, 379
Time, 5–7, 12, 16, 20–21, 34, 38
density, 198
judgment, 108, 312
perception, 20, 32, 38
timbre of time units, 108, 116
Time judgment exercise, 115–116, 360
Time line, 73, 81, 101, 105, 107–108, 109, 112–113, 116–117, 131–134,
137, 149, 151, 154–156, 163, 165, 167–169, 186, 193, 201, 209,
210, 212, 215, 216, 219, 222, 227, 231, 240–241, 243, 249
creating, 81, 99, 107–108, 113, 162, 240
defining, 162
definition, 379
song, 101, 112–113, 231
Time perception, definition, 379
Tonal center, 40, 64, 66, 120
Tonal organization, 38, 39–40
Tonal speech inflection, 39
Tonal system, 17, 68
Tracking, definition, 379
Tracking exercises, 336
Transient response, 87, 159, 281, 285–286, 290, 306
definition, 380
Translation of sound, 4, 15, 37, 230
Tremolo, 38, 43
Triads, 39
definition, 380
V
Varèse, Edgar, 160
Vertical (Y) axis, 108–109
Vertical plane, 14–15, 23–24
Vocabulary, 76–77, 87–89, 98, 100, 139, 157
objective, xxv, 114
Vibrato, 38, 43
W
Waveform, 5–7, 16, 18, 25–26, 28, 34, 48, 69, 124, 171, 174, 285–286,
296, 317, 319, 335, 348, 354, 356
“While My Guitar Gently Weeps,” The Beatles, 91
White noise, 163
“Wild Honey Pie,” The Beatles, 133
Whole-tone system, 40
Writing and listening, 81, 102, 112, 119, 132–133
X
X-Y graph, definition, 380
Y
Yellow Submarine, The Beatles, 145–146, 169–170, 195, 228–229, 232,
234, 235–237, 238
Yes, “Every Little Thing,” 73
“You Never Give Me Your Money,” The Beatles, 51–52