Take any audio forum or Facebook group, or even a conversation at a pub among audio-interested people, and sooner or later someone will come up with the idea that they need better converters.
Or at least they wonder if they should.
Are the ones in your computer or inexpensive interface good enough? Do they make a difference to the sound? Do you need better converters than you have to achieve that mythical "professional" quality that everybody seems so keen on? Is it ok to plug your $2000 preamp into your interface, or do you need another $2000's worth of converter box to do it justice? Is it worth spending your hard-earned dosh on upgrading them, or is a regular audio interface (or even your computer mic input) good enough?
Well... as usual, it turns out the answer is (annoyingly) “it depends”.
(this post is long: if you just want the conclusion, feel free to jump to the end!)
First of all, if you've read What Makes A Great Recording, you know that converters are way down the list of critical things to think about when recording or mixing. At least if you're using an audio interface produced in the last decade. That's because, unless they are catastrophically bad, A/D converters contribute even less than preamps to the overall quality of your recording. This post aims to help you understand why.
The same bottom line as for preamps is valid here: when it comes to the quality and greatness of your recording, unless you're already covered with everything that comes before (performance, room, mics etc), it's not really worth thinking about the A/D conversion.
Just let it go, and buy some acoustic treatment instead, or a better mic. Or even better, some singing lessons if you are the vocalist.
It's gonna be far better value for money and make much more of an improvement.
What? No? Okay, let's assume that:
if you are the performer, you are as good as Elvis
your room is as good as Ocean Ways
you've chosen a vintage U47
you've spent a good half an hour carefully positioning it in the room
the mic is plugged into that spanking new Millennia STT-1
which you have set up with perfect gain staging
you're ready to record the vocal line of the century.
It's time to push "record".
You look down, follow the XLR cable out of the preamp to the line-in of your audio interface... and you realize that said interface is not-particularly-high-end.
Suddenly you get cold feet.
Will it be good enough?
Do you need to go buy a better converter?
Let's find out.
What are A/D converters supposed to do?
Some background first.
A/D converters do a conceptually simple job: they take an analog signal (in most audio equipment, an electrical AC voltage), measure it over a short time (i.e. they "sample" it), and translate the measurement to a number that can be represented with 24 bits (or whatever is the word length that the converter produces). That sampled number is called (guess!) a "sample".
This sample is then made available at the converter outputs (in digital audio formats such as AES/EBU or ADAT or whatever). A computer audio interface, which is able to understand the digital format, can then pass on the samples to a computer application (a DAW, for example), for further processing. Sound!
D/A converters do the opposite job - taking a stream of samples and re-generating the analog signal so that it can drive some speakers.
In summary, A/D and D/A converters perform a sort of Star Trek transporter trick: they transform an analog, physical thing (voltage) into information, and then rematerialize it as a voltage someplace else. Neat!
So far, so easy.
The following figure (which, be warned, is quite misleading) should give an idea.
At times t1, t2, etc. the A/D converter finds the values s1, s2, etc., converts each one based on a given scale, and there's your "sample": simply a number.
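To make the "measure, then turn into a number" idea concrete, here is a minimal sketch in Python. It is not how any real converter chip works internally: the function name `sample_and_quantize`, the 1 KHz test tone and the deliberately slow 8 KHz sample rate are all assumptions chosen for illustration.

```python
import math

def sample_and_quantize(signal, sample_rate_hz, duration_s, bits=24):
    """Measure `signal` (a function of time returning a value in [-1.0, 1.0])
    at regular instants and map each measurement to a signed integer."""
    full_scale = 2 ** (bits - 1) - 1              # largest positive code (8388607 for 24 bits)
    n_samples = int(round(sample_rate_hz * duration_s))
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                    # sampling instants t1, t2, ...
        value = max(-1.0, min(1.0, signal(t)))    # clip to the assumed input range
        samples.append(round(value * full_scale)) # the "sample": simply a number
    return samples

def tone(t):
    """Hypothetical input: a 1 KHz sine wave at full level."""
    return math.sin(2 * math.pi * 1000 * t)

# One millisecond of the tone at a (pitifully slow) 8 KHz sample rate.
samples = sample_and_quantize(tone, 8000, 0.001)
print(samples)
```

Each entry in the printed list is one sample: a plain integer that fits in a 24 bit word.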
Do we lose information with digital audio?
No. I did say the figure above is misleading. And it is misleading because, looking at it, it may look that way: the blocky sequence doesn't seem remotely as smooth and precise as the continuous line of the waveform.
Which is right, because it isn't!
It's obvious that connecting the dots in the picture above does not produce a good approximation of the "real" analogue waveform.
But that's only because we are sampling at a pitifully slow sample rate, which indeed does lose information.
Between say t1 and t2, the waveform changes a lot, and since so much time goes between these two instants, we miss a lot of these changes. We lose information.
But what if we sampled much more often (i.e. more times per second, aka with a higher frequency, or "rate")? The samples would approximate the original signal much better:
Now, what if we sampled so fast that the waveform changed very little between two consecutive sampling instants?
What if the signal change between two instants was so small, that it exceeded our ability to perceive changes? (the "ability to perceive change" is simply the max frequency we can hear. In people's case, about 20 KHz average).
So in other words, is there a sample rate fast enough to capture enough data, so that any information loss is outside our ability to perceive it - and therefore irrelevant?
Turns out there is.
Quite a few years ago, a really smart fellow named Claude Shannon (working on some ideas by another smart fellow called Nyquist) proved that, in order not to lose any information about a signal limited to a certain frequency band, we've got to use a sample rate which is at least double the max frequency we want to capture - aka the maximum ability of a device to detect change (for audio, that device is the human ear). This result is called the "sampling theorem".
On average, our ears perceive frequencies from about 20 Hz up to 20 KHz (and this top frequency decreases significantly with age).
That means that, to ensure we capture all there is in that 20-20KHz interval ("frequency band"), we need to sample it at (at least) double 20 KHz, that is to say 40 KHz.
"CD quality" - that is 44.1KHz - is well above 40KHz, so we're jolly good: no information audible to a human ear is lost (but a bat, of course, will think the recording quality is horrible!).
A couple of things worth noticing: a few very young people can perceive frequencies well in excess of 20 KHz (up to 23 KHz), and for them a 44.1 KHz sample rate is not good enough (48KHz is better, for example). But due to the mechanical way ears work, that ability unfortunately doesn't last long; and at that age they're probably not that interested in audio quality.
Also, a 44.1 sampling rate is all good for humans, but sucks for dogs, which when young can hear frequencies up to 40KHz. Every time you put on a CD, think about the poor dog nearby!
(photo by Jacknunn - own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=91729752)
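The "double the top frequency" rule is easy to check with a few lines of Python. This is only a sketch of the arithmetic: the helper names are made up for illustration, and the hearing limits are the rough figures quoted in this post, not measurements.

```python
def min_sample_rate(max_freq_hz):
    """Sampling theorem: at least twice the highest frequency to capture."""
    return 2 * max_freq_hz

def covers(sample_rate_hz, max_freq_hz):
    """True if the sample rate is high enough for a listener with this top frequency."""
    return sample_rate_hz >= min_sample_rate(max_freq_hz)

# Rough top-of-hearing figures quoted in the text (in Hz).
listeners = {
    "average adult": 20_000,
    "very young listener": 23_000,
    "young dog": 40_000,
}

for name, top_hz in listeners.items():
    for rate_hz in (44_100, 48_000):
        verdict = "fine" if covers(rate_hz, top_hz) else "not enough"
        print(f"{name}: {rate_hz} Hz is {verdict} (needs {min_sample_rate(top_hz)} Hz)")
```

Under these figures, 44.1 KHz covers the average adult, 48 KHz also covers the very young listener, and the dog is out of luck either way.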
Real world filters and aliasing
One important detail on the result above is that, for sampling to work well, the signal must be band limited, that means that outside our 20 Hz - 20KHz band there must be nothing.
Otherwise, information outside the audio band will be captured and mistakenly used to rebuild the audio signal (i.e. end up inside the audio band) by the D/A stage, producing sound that wasn't there at all in the original signal... a sound which will have nothing to do with what we originally sampled, so most likely will be crap!
The production of these unwanted sounds is called aliasing and it's critically important to avoid it: we certainly don't want our nice guitar solo to sound like we can't play!
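A quick way to see aliasing in action is to sample an ultrasonic tone without filtering it first. The sketch below uses a 30 KHz tone and a 44.1 KHz sample rate; both are arbitrary choices for illustration, and `alias_frequency` is a made-up helper name.

```python
import math

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a pure tone shows up after sampling,
    i.e. the input frequency folded back into the 0..sample_rate/2 range."""
    nearest_multiple = round(freq_hz / sample_rate_hz) * sample_rate_hz
    return abs(freq_hz - nearest_multiple)

fs = 44_100
ultrasonic = 30_000                               # inaudible, above half the sample rate
audible_alias = alias_frequency(ultrasonic, fs)   # lands inside the audio band
print(audible_alias)

# The sampled values of the two tones are indistinguishable: once sampled,
# nothing tells the D/A stage which of the two was really there.
samples_ultrasonic = [math.cos(2 * math.pi * ultrasonic * n / fs) for n in range(200)]
samples_alias = [math.cos(2 * math.pi * audible_alias * n / fs) for n in range(200)]
identical = all(
    math.isclose(a, b, abs_tol=1e-9)
    for a, b in zip(samples_ultrasonic, samples_alias)
)
print(identical)
```

The inaudible 30 KHz tone comes back as a very audible 14.1 KHz one, which is exactly the kind of sound that wasn't there in the original signal.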
So before sampling, we need to filter the signal, using a low-pass filter that removes everything exactly over 20KHz.
The difficulty is that, as usual, in the real world it is impossible to filter frequencies abruptly. In the universe it takes time to do anything!
Therefore, as you can see from the figure below, real-world filters can be very steep, but are never perfectly vertical.
Since we don't want to lose audible information, a real-world filter will need to start "cutting" from 20KHz, leaving a little space for some ultrasonic frequency to be present (see figure below).
The consequence is that we need to sample a little higher than 40KHz, (to ensure that no aliasing occurs in the 20Hz-20KHz range), and then discard any sample data for frequencies outside that range, so that we are left with audio-band samples only.
CD quality specifies 44.1 KHz, which is comfortably above 40 KHz and therefore works fine: half of 44.1 KHz is 22.05 KHz, which leaves a whopping 2.05 KHz of "space" above 20 KHz for the filter to do its cutting.
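The size of that "space" is simple arithmetic: half the sample rate, minus the top of the audio band. A small sketch, with a hypothetical helper name, and with 48 KHz thrown in only for comparison:

```python
def transition_band_hz(sample_rate_hz, audio_top_hz=20_000):
    """Room between the top of the audio band and half the sample rate,
    i.e. the range in which a real-world filter can do its cutting."""
    half_rate = sample_rate_hz / 2
    return half_rate - audio_top_hz

for rate_hz in (44_100, 48_000):
    print(f"{rate_hz} Hz leaves {transition_band_hz(rate_hz)} Hz for the filter slope")
```

The wider that range, the gentler (and cheaper) the filter can be.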
A brief history of 44.1 KHz...
And now just for fun... why 44.1 KHz?
Well, it's a bit of trivia, but when CDs were in the process of being invented, people had weird hairdos, pastel colors were all the rage, and hard drives were small.
When you have a stream of samples ("Pulse Code Modulation data stream", or "PCM stream" for friends) they are essentially numbers. Numbers can be stored as sequences of bits ("0"s and "1"s) but PCM streams are big... their size is easily several megabytes.
Nowadays a phone easily has thousands of megabytes of storage... but back then, a megabyte was a serious amount of data, very expensive to store and big to process.
Around the time the CD was invented there was one type of gear with enough capacity to record large amounts of digital data: video recording equipment.
Turns out that back in the times when moustaches were hot, digital video recording kit already used a sampling rate which was a bit higher than 40KHz... specifically - guess what? - 44.1 KHz!
For video equipment, that specific number was chosen not because of the audio, but because it made it easy to fit both PAL and NTSC's "screen lines" vs. "line drawing" frequencies, while needing only 3 samples for each line.
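The arithmetic behind that number can be sketched as follows. Note that the line counts used here (245 usable lines per field for 60 Hz NTSC-style video, 294 for 50 Hz PAL-style video) are the commonly cited figures and are not given in this post, so treat them as an assumption.

```python
def audio_rate_hz(fields_per_second, usable_lines_per_field, samples_per_line=3):
    """Audio samples per second that fit on video tape when each usable
    video line carries a fixed number of audio samples."""
    return fields_per_second * usable_lines_per_field * samples_per_line

ntsc_rate = audio_rate_hz(60, 245)   # assumed figures for 60 Hz video
pal_rate = audio_rate_hz(50, 294)    # assumed figures for 50 Hz video
print(ntsc_rate, pal_rate)
```

With 3 samples per line, both video systems land on the same audio rate, which is what made the number so convenient.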
So that gear got used for audio - and it suited it well since it allowed the filter to have 2.05 KHz of "space" for the cutting curve (meaning that the filter could be relatively rough and thus not that expensive to produce).
And then inertia did the rest. So 44.1Khz is still with us.
As for 24 bits as word length, having 8 bits more than 16 allows better results should the PCM stream be processed by some calculations... such as plugin effects and the mixing engine in DAWs. Since computers aren't that good with numbers, they need all the help they can get.
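To get a feeling for what those 8 extra bits buy, here is a sketch that computes the number of levels and the theoretical range between the smallest step and full scale for a given word length. This is the textbook figure of roughly 6 dB per bit for an ideal word, not something a real converter achieves; the function names are made up for illustration.

```python
import math

def levels(bits):
    """Number of distinct values a word of this length can represent."""
    return 2 ** bits

def theoretical_range_db(bits):
    """Ratio between full scale and the smallest step, in decibels."""
    return 20 * math.log10(levels(bits))

for bits in (16, 24):
    print(f"{bits} bits: {levels(bits)} levels, about {theoretical_range_db(bits):.1f} dB")

extra_db = theoretical_range_db(24) - theoretical_range_db(16)
print(f"8 extra bits: about {extra_db:.1f} dB more room for rounding errors to hide in")
```

That extra room is what lets plugins and mixing engines do their sums without the rounding errors creeping up towards the audible part of the signal.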
Another dose of reality
So we're all good. All of the above tells us we can sample any audio signal and then reconstruct it identical to how it was. Just like Star Trek transporters do with people. Why then are there so many different converters?
There is only a little problem: real converters aren't perfect.
Just like it was for preamps (and for band-limiting filters), most of what we've gone thru so far is the working of an ideal, perfect converter. If it were possible to build perfect real-world converters, it would be all that you'd ever need.
But it's not possible.
Converters are made with physical components that can only be manufactured with so much precision (and often, at a price point), and physics itself has the bad habit of completely ignoring the beauty and elegance of mathematical models.
In other words, when sampling, A/D converters make errors. And so do D/As, of course.
And that's not all: different converters, based on different classes of technology (or different designs in the same class), will make slightly different errors.
That means that (given the same analogue signal) when using different hardware, the sample stream produced by an A/D converter and the output produced by a D/A converter will be a little different. As in different sound.
Conversion errors for dummies
What do these errors look like?
Well, there are lots of ways real-world converters can be imprecise, and it really depends on the technology in use.
Without going too much in detail, here's a rough, superficial list:
timing issues: the frequency of sampling must be governed by a very precise clock, and if the sampling duration (also determined by the clock) is not exactly identical every time, some samples may capture - and translate - slightly different amounts of signal. On reconstruction, the D/A converter will happily assume that all samples come with the same timing (or, of course, add its own, different, timing errors);
how the gain-to-digital level is interpreted (ideally it should be a straight line, but physical components may not behave so, see picture below);
the fact that the analog circuitry which brings the signal to the A/D converter may present different electrical impedance to different frequencies, with certain frequencies being a little bit "reduced" (impedance is a kind of "resistance", but for alternating current; check out the excellent primer here by Hugh Robjohns);
Also "boundary conditions" like the circuit operating temperature (both room-dependent, but also voltage-dependent) can affect the signal detection. Different designs will attempt to compensate for these physical effects in different ways (usually the better they are, the more costly, with the inevitable law of diminishing returns creeping in).
There are many others - phase shifts, quality of the analogue filter etc., and more depending on the specific sampling technology. AND of course, any noise or distortion in the analog part (more on that later).
All of these may result in errors - meaning the A/D converter does not behave exactly as its theoretical, ideal counterpart.
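To give an idea of the size of one such error, here is a rough simulation of the timing issue only: the same tone sampled once with a perfect clock and once with a clock whose sampling instants wobble randomly. The 10 KHz tone and the 1 nanosecond (RMS) wobble are made-up figures chosen for illustration, not the specs of any real converter.

```python
import math
import random

def sample_tone(freq_hz, sample_rate_hz, n_samples, jitter_rms_s=0.0, seed=0):
    """Sample a full-level sine wave; optionally add random (Gaussian)
    timing error to every sampling instant."""
    rng = random.Random(seed)
    out = []
    for n in range(n_samples):
        t = n / sample_rate_hz                    # the ideal sampling instant
        if jitter_rms_s:
            t += rng.gauss(0.0, jitter_rms_s)     # the clock is slightly early or late
        out.append(math.sin(2 * math.pi * freq_hz * t))
    return out

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

ideal = sample_tone(10_000, 44_100, 4410)
jittery = sample_tone(10_000, 44_100, 4410, jitter_rms_s=1e-9)  # hypothetical 1 ns wobble

error = [a - b for a, b in zip(ideal, jittery)]
snr_db = 20 * math.log10(rms(ideal) / rms(error))
print(f"error caused by the clock sits about {snr_db:.0f} dB below the signal")
```

Under these assumptions the error should land somewhere in the mid-80s of dB below the tone; a steadier clock (smaller `jitter_rms_s`) pushes it lower still.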
It's important to keep in mind however that these effects are really minimal.
For example, the time scale at which music is understandable to us (say, milliseconds) is far bigger than the time scale at which a converter operates: even a humble 44.1 KHz means 44100 samples per second, i.e. one sample roughly every 23 microseconds, well over an order of magnitude finer.
Nevertheless: errors exist; and such errors result in slightly different (but musically equivalent) sample streams and reconstructed-analog-wave. Both "different from the ideal", and "different among different hardware"... which means that the various A/D (and D/A) converters may not indeed "sound" the same.
Back in the studio
Let's now go back to your Elvis-like performance: are you going to get a good result plugging your microphone in your inexpensive (but modern) interface instead of using an external preamp with an A/D box that costs ten times more?
Does it matter?
Sonically, the short answer should by now be clear: yes, but in general not so much.
With modern 24 bit units, conversion errors are generally minuscule with respect to all that comes before. Therefore the differences between different converters (making slightly different errors) will be extremely small, and in most cases irrelevant for real music... as in "not distinguishable in an A/B test". Errors are so small that an audio signal must be subjected to many conversion cycles in order to hear any degradation (how many depends on the model... converters with lower noise, distortion and jitter will allow literally thousands of cycles without perceptible degradation).
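A back-of-the-envelope sketch of the "many conversion cycles" idea: assume every trip through the converters adds the same amount of unrelated noise, so the noise power simply adds up cycle after cycle. This is a deliberately crude model (it ignores distortion and jitter), and the 110 dB and 80 dB figures are invented for illustration, not taken from any datasheet.

```python
import math

def snr_after_cycles(single_pass_snr_db, cycles):
    """Signal-to-noise ratio after repeated passes, assuming each pass adds
    the same amount of uncorrelated noise (so noise power grows linearly)."""
    return single_pass_snr_db - 10 * math.log10(cycles)

def cycles_until(single_pass_snr_db, threshold_snr_db):
    """How many passes it takes before the accumulated noise reaches
    the chosen threshold, under the same assumption."""
    return math.floor(10 ** ((single_pass_snr_db - threshold_snr_db) / 10))

# Hypothetical converter: noise 110 dB below the signal on a single pass.
# Hypothetical threshold: noise 80 dB below the signal.
print(snr_after_cycles(110, 10))
print(cycles_until(110, 80))
```

In this toy model, every tenfold increase in the number of cycles costs 10 dB, which is why a quiet converter can go round and round for a very long time before anything becomes noticeable.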
However, it also depends a little bit on your ears, and the amount of high frequency detail provided by better converters (producing more "informative" bits in every 24 bits word) can be felt during mixing - there's just more information for that EQ to work on (and for your ears to detect).
By the way, can I use the converters built-in in my PC?
After reading this far, you may have realized that, since your computer is capable of receiving audio (normally via a mic or line-in input) and producing sound, it has to have converters onboard. And so it does.
Can you use them?
Generally, it's not a good idea.
While there may be exceptions, on-board converters are usually not that great, especially on PCs. It's simply a matter of cost. Onboard converters will often have cheaper analogue front ends, making them noisier and more distorting; they may produce only 16 bits, their clocks will have more jitter, etc. Simply put, audio is not a primary function for a PC motherboard, so what is given does the job, but just barely. You most likely need an external audio interface with onboard converters, as opposed to plugging your preamp directly into the computer's line input mini-jack.