Blind testing in practice

Cristiano Sadun
Mar 18, 2021

Updated: Mar 18, 2021

(note: if you're just interested in downloading the scramble utility, go here)

(Image from Wikimedia Commons, photo by Dale Schoonover, Kim Schoonover)

Is one microphone better than another?

Is one interface better than another?

Is one preamp/console/compressor/equalizer/whatever better than another?

These are very common questions (or statements!) in both the physical and the internet world.

While "better" obviously depends on which aspect of a device one is looking at (say, an SM57 can be placed under a snare, while a bulky AEA RCA 44 can't, so from that specific point of view the 57 is better), most often when posing these questions we are concerned with the sonic aspects.

Now, in the first post of this blog, and many times since, I've stated that (nowadays, and with 24-bit recording) gear is far from the most important factor for the quality of a recording.

How do I know?

The obvious answer is by using my ears, but there's a difficulty with that answer: our ears are connected to our brains, and our brains are easily fooled. Mine included!

Biases

It's just extremely easy to be biased. If we own a cheap device, we may think it's bad because it's cheap. If we own an expensive one, and we have invested a lot of money (or emotion) in it, we may tend to believe it's very good (because otherwise we would think we're idiots!).

Biases have many sources: for example we may know the origin, cost, or history of a device and thus develop a pre-judgement before actually trying it, but we are also social animals, so if loads of other people have an opinion, and we are aware of that opinion, it's harder to develop a different one.

There's an enormous amount of research on biases (and yet it doesn't help much, since we're often biased against looking for biases!), and you could do worse than start by reading the related Wikipedia page. This is just a very short post about how bias works when comparing all things audio, so I won't go into many details or be incredibly precise. The idea is to give you some starting points and a tool that can help you create sound blind tests on your own.

The fact is: biases exist. And even knowing that they exist does not protect us much from them.

How, then, can we test things?

In particular, how can we test and compare the sonic properties of similar audio devices?

Enter blind tests.

Blind tests

A blind test is exactly what the name says: testing something while ensuring that the tester is absolutely blind to every property of that something other than the one they want to check.

In other words, we remove on purpose information that could affect and distort judgement. In rough terms, the less information the tester has (other than the content itself), the better the test and the more valid the results.

For example, you can blind test the softness of a textile by closing your eyes, having someone pass the textiles to you and using just your fingers (no peeking!); or blind test the skills of different producers of vanilla ice cream simply by eating them without knowing who's made what.

A blind test focuses exclusively on the aspect you want to test (for example, how much you like a recording), freeing you of any bias due to other aspects that are irrelevant - whether you are aware of that bias or not.

A philosophical moment

I will be a bit bold here and tell you that, in my $.10, blind tests are the only type of experience that allows you to have a meaningful opinion or judgement on something, as opposed to a prejudice or an arbitrary preference.

The reason is very simple: bias is there whether you are aware of it or not. If you know something, that something will bias your judgement, almost no matter what. Even reading opinions about something will cloud your judgement a little, even if you don't agree with them. Just by knowing that a certain opinion exists, you will be biased by your trust in the source of the opinion, the number of people agreeing, the reputation of these people and so on.

One consequence of this line of thinking is that it's awfully hard to have a meaningful opinion on anything, since we don't really perform blind tests on many things, if any. And indeed that's my (meaningful) opinion. But I will save it for a future The Philosopher's Blog.

Blind testing for "best sound"

In audio, we're so lucky that it's very often possible to "separate" the device producing the sound from the sound itself - simply by producing a recording. Modern digital recordings are so controllable that they really allow you to keep most of the recording chain identical across different recordings, and focus only on the one aspect you want to test.

For example, you can ask someone else to play some sound or music without you seeing what's producing it (and without the other person telling you).

Then you write down what you like best, or what you think is what.

You can listen as many times as you like, but you have to write it down at the end - no cheating!

Then you ask that someone else to tell you what was what, and you compare it with your own, unbiased, conclusion. Very often you'll be in for surprises!

Keep the same conditions

It is of course critical that the test hide as much information as possible while keeping the test conditions absolutely identical. So there's a little bit of work to do in order to set up conditions which are as repeatable as possible.

For example, for microphones, we ideally want the sound or music to be the same (or very close) and the position of the microphone with respect to the source to be the same, in the same room etc.

So say you're miking a guitar amp and want to compare several mics: you want to keep everything identical as much as possible, including the placement of the mic with respect to the grille, the same room, the same player and the same, well-rehearsed chord progression or lead line.

If blind-testing physical amplifiers vs digital emulations, you'll want to keep the "model" of the amplifier the same and use the same microphone, both physical and virtual. With vocals and microphones, you want the same vocalist, with the same music and song, and a vocalist who is good at keeping a similar performance across a few consecutive takes. With interfaces or preamps, you want the same recording setup as much as possible so that the only variable is, by and large, the make and model of the interface. And so on.

Basically, you (or your friend) should go to great lengths to ensure everything other than the device you want to test stays the same. The more you do so, the more valid the test and its results will be.

You still gotta know what you're doing..

It's worth noting that the person producing the recording must also be knowledgeable about how to record properly and what's actually going on, in order to produce data for a meaningful test.

For example, a naive recordist might use a bad gain structure when recording different microphones, staying near 0 dBFS all along and therefore producing horrible results for all the microphones. In that case you won't be testing the best microphone, you will be testing the (lack of) skills of the recording engineer.
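
As a rough sanity check on the gain-structure point, a small script can report how close a bounced file gets to 0 dBFS. This is only a sketch: the function name `peak_dbfs` is made up for illustration, and it assumes plain integer PCM WAV files (16, 24 or 32 bit), which is what Python's standard `wave` module reads.

```python
import math
import wave


def peak_dbfs(path):
    """Return the peak level of an integer PCM WAV file in dBFS (0.0 = full scale)."""
    with wave.open(str(path), "rb") as w:
        width = w.getsampwidth()  # bytes per sample: 2 = 16 bit, 3 = 24 bit
        frames = w.readframes(w.getnframes())
    if width not in (2, 3, 4):
        raise ValueError("this sketch only handles 16, 24 and 32 bit integer PCM")

    full_scale = 2 ** (8 * width - 1)
    peak = 0
    # Samples are little-endian signed integers, interleaved across channels.
    for i in range(0, len(frames) - width + 1, width):
        sample = int.from_bytes(frames[i:i + width], "little", signed=True)
        peak = max(peak, abs(sample))

    if peak == 0:
        return float("-inf")  # digital silence
    return 20 * math.log10(peak / full_scale)
```

Running it on every take before the test shows at a glance whether any of them was recorded right up against full scale.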

For composite devices, made of a chain of different elements (such as audio interfaces), you must be aware that what you are evaluating is not just one element of the chain, but the whole. So for example, two interfaces may have identically performing preamps but slightly different converters, producing a slightly different sound (even if that's unlikely).

So it's important to have a degree of competence in order to set up a proper audio test.

What a surprise!

Stay blind all the way

You must also stay blind all along: for example, you don't want to know the order in which the vocals are sung or the instruments played.

While as much "sameness" of the sound or music as possible is of course desirable, in practice it's enough that we are close, so long as we can repeat the experiment multiple times and average the results.
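
A minimal sketch of what "repeat and average" could look like, assuming you write down a score for each file in each listening session. The function name, the file names and the 1 to 10 scale are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def average_scores(sessions):
    """Average the score each file received across several listening sessions.

    `sessions` is a list of dicts, one per session, mapping a file name
    to the score written down for it in that session.
    """
    collected = defaultdict(list)
    for session in sessions:
        for name, score in session.items():
            collected[name].append(score)
    return {name: mean(scores) for name, scores in collected.items()}
```

For instance, `average_scores([{"a.wav": 6, "b.wav": 8}, {"a.wav": 8, "b.wav": 7}])` gives 7 for `a.wav` and 7.5 for `b.wav`. A file that only wins in one session out of many is probably not a clear winner.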

Avoid familiarity

Among the million aspects of bias that exist, one thing worth mentioning here is that familiarity may skew the results, because it basically makes it impossible for you to be truly blind. If you are familiar with something, chances are good that you will recognize it even when "blindfolded".

So you should avoid familiarity as much as possible.

In practice that means that for any test, you will want unfamiliar music, unfamiliar vocalists and instrumentalists etc. The details depend of course on what specifically you are testing.

Blind testing in solo

That's all good and well if you have a friend willing and able to support you, but what if you are on your own?

By definition, if you are recording something you know what you are recording. Take 1 will be the Neumann and Take 2 will be the Behringer and we all know which one's best, right?

There are a few things that come in handy here:

The first is to let some time pass between the recording and the testing. Memory becomes fuzzy pretty quickly when confronted with similar things, and that helps a lot. So set up your conditions for repeatability (gaffer tape on the floor for microphone and singer position, same room, same quietness, same vocalist, same music and so on). Create the recordings with the Neumanns and the Behringers. Bounce the files someplace and let them rest for a day or two.
Also: make sure the recordings are all the same length. Chop them off in your DAW to the same length and delete the extra bits. The idea is to make them as visually unrecognizable as possible.
Then, the more stuff you test in one go, the less probable it will be that you remember the individual details. So do not test two microphones: test four or five minimum. Or record the same microphone (or whatever) multiple times. That way the chances that you remember exactly how the vocal sounded when the Neumann was recorded are far smaller..
Fourth, you want to scramble the names of the recording files so that, by looking at their names when you load them in the DAW you can't say what's what - but you still can check once you're done testing.
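
If you'd rather not chop the files by hand in the DAW, the "same length" step above can also be scripted. The following is only a sketch: the function name `trim_to_shortest` is hypothetical, it assumes plain integer PCM WAV files that all share the same sample rate, and it writes trimmed copies to a separate folder rather than touching the originals.

```python
import wave
from pathlib import Path


def trim_to_shortest(folder, out_folder):
    """Copy every WAV in `folder` to `out_folder`, cut to the shortest one's length.

    Returns the common length in frames (0 if no WAV files were found).
    """
    paths = sorted(Path(folder).glob("*.wav"))
    if not paths:
        return 0

    # Find the length, in frames, of the shortest recording.
    lengths = []
    for p in paths:
        with wave.open(str(p), "rb") as w:
            lengths.append(w.getnframes())
    shortest = min(lengths)

    out = Path(out_folder)
    out.mkdir(parents=True, exist_ok=True)
    for p in paths:
        with wave.open(str(p), "rb") as src:
            params = src.getparams()
            frames = src.readframes(shortest)  # keep only the first `shortest` frames
        with wave.open(str(out / p.name), "wb") as dst:
            dst.setparams(params)
            dst.writeframes(frames)  # the header's frame count is updated on close
    return shortest
```

Use a different `out_folder` from `folder`, otherwise the originals get overwritten.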

The scramble utility

This last point is easier said than done if you don't have a friend willing to do the file name "scrambling" for you. Therefore, some time ago I made a small command-line program called "scramble", which does just that. (Windows 64-bit only at the moment; if anyone has a C++ compiler for Mac, let me know!) Using the utility is pretty simple:

Download and unzip the zip file at the bottom of the page, say in C:\Temp. It contains a single executable file. No viruses! But have it checked anyway.
Place your bounced WAV files in a sub folder (say C:\Temp\recordings).
Open a command window (Win+R, type cmd.exe and hit Enter).
Go to the C:\Temp folder, where you have both the utility and the recordings folder (type cd c:\Temp in the command window).
Type scramble recordings in the command window.

The result will be that a folder c:\Temp\blindtest files will be created, and inside you will find the same bounced files but with their names randomly scrambled.
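
If you're not on Windows, the same idea is easy to sketch in a few lines of Python. To be clear, this is not the scramble utility itself, just an illustration of the approach: the function name, the `take_NN` naming scheme, the default output folder name and the layout of the key file are all assumptions made for this sketch.

```python
import random
import shutil
from pathlib import Path


def scramble(source_folder, out_folder="blindtest", key_name="key.txt"):
    """Copy files under random names and record the mapping in a key file.

    Returns a dict mapping each scrambled name to the original name.
    """
    files = sorted(p for p in Path(source_folder).iterdir() if p.is_file())
    out = Path(out_folder)
    out.mkdir(parents=True, exist_ok=True)

    # Shuffle the numbers 1..N so each file gets a unique, random label.
    labels = list(range(1, len(files) + 1))
    random.shuffle(labels)

    mapping = {}
    for src, label in zip(files, labels):
        new_name = f"take_{label:02d}{src.suffix}"
        shutil.copyfile(src, out / new_name)  # copies the content only
        mapping[new_name] = src.name

    # Write the key sorted by scrambled name, so its order reveals nothing.
    with open(out / key_name, "w", encoding="utf-8") as key:
        for new_name in sorted(mapping):
            key.write(f"{new_name} = {mapping[new_name]}\n")
    return mapping
```

Calling `scramble("recordings")` would leave the renamed copies and a `key.txt` in a `blindtest` folder in the current directory. Don't print or look at the returned mapping until you've written down your conclusions!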

Load them in your DAW, listen to your content and write down your conclusion (on what sounds best, on which file has been produced by which device, whatever).

Once you have done so, go back to the c:\Temp\blindtest files and open the key.txt text file and you will find out what you really think.

Expect some surprises!

The Audio Blog