“Musicality vs accuracy” is one of those discussion topics that never really gets anywhere. Partly because – as usual in a lot of forum discussions – the terms are never really defined, so people use the words to mean whatever they like and then argue for or against them. But also because there really is no “versus” in it. It’s like arguing about gin vs tonic. You need both.
Let’s do a little thought experiment. Suppose you go to a concert (an acoustic one, unamplified instruments and/or voices). You love it, the performance is astonishing, inspired, and moving. By any assessment, you would consider that a “musical” performance. Then you go home, armed with a CD containing a recording of that same performance, that someone slipped to you on your way out the door. And suppose that – somehow – the exact same sound was recreated in your home, including the full reverberation and ambience of the concert hall.
Now, that would have to be “accurate,” right? Since it’s exactly the same as the performance you heard earlier. And therefore, it has to be “musical” too… since it’s exactly the same as the “musical” performance you heard earlier…
OK, but here’s the rub… reproducing the exact same thing that you heard at the concert is impossible. Because now you are listening to it in your room, not the original venue. In fact, the recording doesn’t even have on it the same thing that you heard in the concert hall! Had you been looking, you would have noticed that the microphones were not in the audience, where you were; instead, they were above and behind the conductor. Not the best seat in the house; not even a seat at all. The position of the microphones is chosen to optimize what the recording engineer hears on the monitor headphones. If they are close to the floor, they pick up too much bass; if they are too close to the performers, they can create an imbalance in levels for different instruments; if they are too far away from the performers, there is too much “hall sound” on the recording. The engineer also chooses the microphone type and the recording technique, based on a combination of knowledge, experience, and trial-and-error. For more on this, have a look at this article: Stereo Recording – comparing stereo mic techniques.
So, even for a minimalist acoustic recording, what is on the recording itself is a concoction; it is not the real thing, it is a representation of the real thing. Depending on the ensemble, the venue, and what the engineer wanted to achieve, additional microphones may be used to capture hall ambience, or to “isolate” specific instruments or soloists – these signals are then mixed in with the signal from the main stereo mics to create a “realistic” recording. If half a dozen engineers were able to record the same performance, you would get half a dozen different-sounding recordings.
I’ve mentioned hall ambience a couple of times – why not record what the audience hears and listen to it in an anechoic chamber, so as to remove the effects of the listening room? Because the brain doesn’t work like that – it needs to hear sounds around it for things to sound real. You can’t record sounds coming from above and behind the listener in the concert hall, and then play them through two point sources in front of the listener in an anechoic chamber, and expect the brain to be fooled into thinking that they are the same thing! You have to allow that listening is done in a listening room with its own acoustics. The best acoustic environment is not anechoic, nor too dead or too live, but one that is close to what studies have shown to be subjectively pleasing. That is, an environment that creates the “right” amount of ambience when energised with a recording that includes some of the ambience present at the original event.
So far, I’ve talked about the concept of “accuracy” as being a relationship between one acoustic event and another. Sometimes, when the nature of recording is pointed out, someone will respond that accuracy means reproducing exactly what is on the recording. But that doesn’t even make sense… the signal on the recording is one-dimensional – a value varying with time. (OK, two values, but you get my drift.) The acoustic signal in your room is three-dimensional – a loudspeaker radiates acoustic energy into a three-dimensional space. And this – the manner in which the loudspeaker radiates sound into that three-dimensional space – is actually where a lot of things get stuffed up, but that’s a topic for another day. My point is that the statement “reproduce exactly what is on the recording” isn’t even a valid concept. Perhaps – maybe – for the electronics, up to about the speaker voice coil, but not when the speaker and room are included.
I find this fascination with “accuracy” in hifi to be curious, actually. Take the phenomenon of imaging. Of course, in real life, sounds come from a direction and a distance. Less so in a concert hall – much less so, in my experience. Perhaps it’s just because I can’t afford to sit in the front row, but I find the image created in many classical recordings – especially more modern ones and especially with solo instruments or smaller ensembles – to be quite unnatural. It’s not that I don’t like it, but it isn’t what it’s like when I go to an actual performance; it’s “larger than life.” With some (electronic) recordings, it’s almost like special effects.
Speaking of special effects, a few years ago I went on an informal tour of Pixar. They had just finished making A Bug’s Life, and at one point I had a sudden realization:
“Hey… why do the ants have only four legs?”
“Director’s decision.”
“But… ants have six legs.”
“Yes, but ants that talk and act like people can’t have six legs, because it doesn’t look right.”
“Oh.”
(Or something to that effect, I don’t remember the specific words.) Since then, I’ve become more attuned to this type of artistic license. In Avatar, for instance – the animals all have four eyes and six legs, but the Na’vi have… two eyes and four limbs. Animal-like creatures with four eyes and six legs – cool. Human-like creatures with four eyes and six limbs – just plain weird. I imagine that a biologist would argue that it would be more “accurate” for the Na’vi to have four eyes and six limbs… but the director says no.
I didn’t enjoy Avatar that much the first time I saw it – perhaps it was the whole 3D thing – but the second and third times, it was very easy to become absorbed in the story. You kinda forget that they’re blue and 12 feet tall, actually, until you see them next to a human again. You forget, in fact, that the whole thing is a complete fabrication, that there’s absolutely nothing real about it at all. But somehow, it can seem realistic when you’re “there.”
OK, but that’s science fiction, and entirely computer-generated at that. What about a film of something real, like, say, a nature documentary? Ah. Well, those aren’t real either. Yes, the footage is real, if you look at a little piece of it at a time. But the filmmaker plays a lot of tricks. A lot of it is staged, with situations being set up, or tame animals being used. Stories that follow “Sid the Sloth” on a migratory journey actually feature several different animals as “Sid.” It might seem real to you – if you allow it – but it is in actuality a concoction designed to present the illusion of reality.
What about photography? What you see in the picture is what was there, right? Nope. Photographers have always chosen and manipulated the image to present what they want to be presented. It’s not the “Photoshop disease,” although Photoshop has certainly made it easier for people to understand some elements of this process. Let’s take a very simple example. I take a RAW file from my camera and generate a JPEG file straight from it. If you look at it at a pixel level, it’s a bit blurry, because the sensor has an anti-aliasing filter over it that does that. You have to sharpen it to make it look “accurate.” How much sharpening do you apply, and where? Well, that’s up to the photographer, depending on the intended usage of the image and what they want to achieve. You, as the end viewer of a photograph (and if you’re not a photographer) are probably completely unaware of this and a hundred other manipulations the photographer may have done to the image, but you look at it and think “Wow, that’s a great photo.”
So it is when you listen to a recording. You are unaware of the manipulations that the recording engineer and mastering engineer have made to what actually gets put onto the recording. I expect that, unless you yourself are a recording or mastering engineer, the only time you would become aware of them is if they do a bad job of something – and the illusion is broken. Most of the time, you are willing to allow yourself to believe the sound from your stereo system is a reasonable facsimile of the real thing, more or less. Like the viewer of the nature documentary, or of the photograph.
Now, of course we want accuracy, where it makes sense. For example, I don’t want an amplifier that introduces distortion, do I? I want a “straight wire with gain.” This (in my opinion) is about where people come unstuck with simplistic assertions about what it means to be something as apparently simple as an “accurate” amplifier. Flat frequency response – OK. No distortion – errr…
The thing is, even something as conceptually simple as an amplifier is actually a very complex device. It contains dozens of individual components, connected up into a circuit that has mind-bogglingly subtle interactions between the various parts. The result is something that approximates the straight-wire-with-gain with varying degrees of success. Total harmonic distortion may be low, but look at the relative values of the 2nd, 3rd, and higher harmonics, and a different picture may emerge. Look at the intermodulation distortion, and things may start looking different again. How does the distortion spectrum vary with power and frequency? How does the amplifier behave when it clips (as I think most amplifiers do very often – a topic for another day)? What happens when the load is inductive or capacitive?
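To make the point about a single THD number a little more concrete, here is a rough Python sketch. It is a toy, not a model of any real amplifier: the two “amplifiers” are just made-up polynomial transfer curves (one adds a bit of x squared, the other a bit of x cubed), and the names, window length, and 1% target are all assumptions chosen for illustration. The coefficients are picked so that both curves should come out at the same THD for a full-scale sine wave, yet one puts its distortion in the 2nd harmonic and the other in the 3rd.

```python
import cmath
import math

N = 1024      # samples in the analysis window (assumed for this sketch)
CYCLES = 8    # whole cycles of the test tone in the window, so no leakage


def harmonic_amplitudes(samples, cycles, max_harmonic=5):
    """Amplitude of harmonics 1..max_harmonic, one single-bin DFT each."""
    n = len(samples)
    amps = {}
    for h in range(1, max_harmonic + 1):
        k = h * cycles  # DFT bin that holds harmonic h
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        amps[h] = 2 * abs(acc) / n
    return amps


def thd(amps):
    """RSS of harmonics 2 and up, relative to the fundamental."""
    distortion = math.sqrt(sum(a * a for h, a in amps.items() if h > 1))
    return distortion / amps[1]


# Hypothetical transfer curves, tuned so both give the same THD on a
# unit-amplitude sine (from the identities for sin^2 and sin^3).
TARGET = 0.01
A2 = 2 * TARGET                      # y = x + A2 * x^2  (2nd harmonic only)
A3 = 4 * TARGET / (1 - 3 * TARGET)   # y = x + A3 * x^3  (3rd harmonic only)

tone = [math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]
even_amp = harmonic_amplitudes([x + A2 * x ** 2 for x in tone], CYCLES)
odd_amp = harmonic_amplitudes([x + A3 * x ** 3 for x in tone], CYCLES)

for name, amps in (("x + a2*x^2", even_amp), ("x + a3*x^3", odd_amp)):
    print(f"{name}: THD = {100 * thd(amps):.3f}%  "
          f"2nd = {amps[2]:.5f}  3rd = {amps[3]:.5f}")
```

On a spec sheet these two would look identical, which is rather the point. A real amplifier’s distortion also changes with level, frequency, and load, none of which this sketch attempts to capture.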
So now we start to get into trade-offs. Or rather, the amplifier designer does, and we have to try and pick an amplifier designed by someone who makes the same trade-offs that we would if we knew how. But since we don’t (know how), we have to rely on our ears to choose which combination of trade-offs represents – to us – the most accurate amplifier. And there we have it, perhaps the core paradox in most discussions about “accuracy”: the only way that most of us can tell if something is “accurate” is with our ears. That’s right – we decide that something is accurate because it “sounds” accurate! This is clearly (I hope) completely and utterly paradoxical.
As I said, that doesn’t mean that we don’t care about accuracy. Not at all. Just that the more you dig into it, the more complex it gets. Nor does it mean that we should throw up our hands at how difficult and complex it all is, and take up archery instead. (Now there’s a nice simple hobby… I think?) Just that it can’t be distilled down into simple dichotomies like “musicality vs accuracy.” In the end (and with some determination to provide a neat ending to this seemingly-endless blog entry!), perhaps what we really should be searching for is accuracy in the service of musicality. I’ll be exploring this more in future blog entries.