Debunking the Analog to Digital Process

I’ve had several people ask me, “Exactly how does the ADC/DAC process work?” Sure. I’ll just vomit forth 45 years of integrated circuit and coding advancements. Brutal. Instead, let’s answer that good question with something more useful: what do I need to know about the Analog to Digital Process?

When I was taking my first electronic music course at Fullerton College, I got an “A” on my first test in the class. There were four in the class of 30 who did. As a “reward,” each of us was assigned to create an hour-long lecture on a particular hardware element of electronic music production, and in a week’s time, we would deliver it to class. They were really esoteric subjects.

Mine was the nature of error correction in CD laser reads. Oh man. But, being the busy little bee I was in my teens, I jumped right in, went to the library (remember those?) and – lo and behold – had my head blasted open with knowledge about error correction subroutines, dithering, laser etching technology, digital to analog conversion and more. I was so thrilled to have been given this assignment. Of course, at 18, I had no idea how to keep people’s attention, teach, and keep things interesting about a topic which was exciting only to nerds, gear heads and Silicon Valley engineers. But when I got done (the teacher cut me off after about 30 minutes because even HE was bored), I felt like I had the knowledge of the gods. I had learned the mysterious knowledge of 16-bit 44.1 kHz sampling in the DAC/ADC process, and I looked at everything differently in digital audio post.

I’m not expecting you to have a Eureka moment from this article, but hopefully you’ll see how things work in a better way, and be able to get better results in your productions. By the way, 90% of what I’m going to talk about here is identical to how cameras store their data and regurgitate it.

In this video clip we see a quick representation of how analog signals turn into digital ones-and-zeros, are processed and then turned back into analog voltage, sent to an amplifier and ultimately to a speaker. Let’s break this process down.

Digitization and Bits

[Figure: Word length refers to how many bits of storage occur in a sample.]

Digitization is magic. Simply, it is a process whereby voltage is monitored over time (samples) and stored as a value (bits). Since audio coming down a cable is simply voltage going up and down as the microphone diaphragm moves, this works well. The more values you have available during the digitization process to go from zero voltage to full voltage, the more resolution you get. We measure the number of positions where voltage can be logged in “bits,” because, just like in a computer, the digitization process is a binary one: everything is stored on transistor or magnetic memory of some kind (RAM, hard drive, etc.). Naturally, voltage has an infinite number of positions in a cable or bus, so the more resolution we have to store infinity, the more we can fool the mind of the listener into thinking our digital representation of the infinite is…well…infinite.

With only two bits at your disposal, you have four positions at which the voltage level can be logged. Not too good for storing infinity. Three bits gives us 8, four gives us 16 and so on: every added bit doubles the count. When bits are applied to a sample recorded in time (like at 48 kilohertz) we call this “word length.” How long is the “word” in a sample? It’s expressed in bits, yeah, but the official title is Word. It’s why I say never record in 16-bit audio, because the difference between 16-bit and 24-bit voltage logging options is astronomical: 65,536 positions versus 16,777,216. For the purposes of fooling the brain into thinking that our digital audio has infinite voltage positions, 24 bits is as high as we really need to go during the A-D process.
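To make the bit arithmetic concrete, here is a minimal Python sketch. The function names and the choice to treat voltage as a number between -1.0 and 1.0 are my own assumptions for illustration, not how any particular converter is built.

```python
def quantization_levels(bits: int) -> int:
    """Number of distinct positions an n-bit word can log."""
    return 2 ** bits


def quantize(voltage: float, bits: int) -> int:
    """Map a voltage in [-1.0, 1.0] to the nearest n-bit code (0 .. 2**bits - 1).

    Assumption: voltage is normalized so that -1.0 and 1.0 are full scale.
    """
    levels = quantization_levels(bits)
    clamped = max(-1.0, min(1.0, voltage))
    return round((clamped + 1.0) / 2.0 * (levels - 1))


for bits in (2, 3, 4, 8, 16, 24):
    print(f"{bits:>2} bits -> {quantization_levels(bits):,} positions")

# How many 24-bit positions fit inside a single 16-bit step.
print(quantization_levels(24) // quantization_levels(16))
```

Each extra bit doubles the number of positions, which is why the jump from 16 to 24 bits is so large.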

Samples

[Figure: A general understanding of sample rates]

Then there are samples. Again, analog audio has infinitely fine delineations of time on its side. We need to fool the brain into thinking we’ve recorded infinite time resolution – which is what analog can do…more or less. We find that sending a speaker a new instruction about where to move 32,000 times a second or more fools the brain into thinking it’s listening to fluid analog audio. But look, 32K samples is terrible for reproducing complex sounds. CDs were standardized at 44.1kHz: slightly more than double the highest frequency humans can hear. Video is 48kHz. On the whole, 48k does an okay job of having everyone believe flashes of data are fluid audio coming from computer chips. However, being able to record ultrasonic harmonics is critical to the human condition’s ability to “feel” or immerse in any audio, and many of those ultrasonic sounds are well above the 20,000 Hz generally understood to be the human ceiling for high frequency perception. Therefore, 96kHz and (my favorite) 192kHz were developed. Recording at 192 kHz/24 bits is the most amazing sounding thing.
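The reason the rates line up this way is the sampling theorem: a given sample rate can only capture frequencies up to half that rate. A quick Python sketch of that rule, using the 20,000 Hz hearing ceiling mentioned above (the names here are just for illustration):

```python
HEARING_CEILING_HZ = 20_000  # the commonly quoted ceiling used in the text


def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency a given sample rate can represent (half the rate)."""
    return sample_rate_hz / 2.0


for rate in (32_000, 44_100, 48_000, 96_000, 192_000):
    limit = nyquist_hz(rate)
    if limit < HEARING_CEILING_HZ:
        note = "falls short of the hearing ceiling"
    else:
        note = f"{limit - HEARING_CEILING_HZ:,.0f} Hz of room above the ceiling"
    print(f"{rate:>7,} Hz sampling -> captures up to {limit:>8,.0f} Hz ({note})")
```

This is why 32K struggles (it tops out at 16,000 Hz), why 44.1kHz just clears the bar, and why 96kHz and 192kHz leave room for ultrasonic content.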

This process of chopping up sound vertically into samples per second and “horizontally” into bits is called Pulse Code Modulation (PCM). Sure! You’ve seen that before in your Blu-Ray player’s audio preferences. Always choose it if it’s an option. The Blu-Ray format allows for full resolution 96kHz/24-bit audio in discrete 6-channel surround to pipe through your system. Dope. It’s noticeably better than the Dolby AC3 compression, which runs at 400-odd kilobits per second. Uncompressed 6-channel PCM at just 48kHz/16-bit is 4,608 kilobits per second, and at 96kHz/24-bit it is 13,824. Yeah. WAY better.
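The bitrate of uncompressed PCM is just sample rate times word length times channel count, so the numbers are easy to check. A small Python sketch (the function name is mine; the results are in kilobits per second, and the 4,608 figure corresponds to six channels at 48kHz/16-bit):

```python
def pcm_kbps(sample_rate_hz: int, bits: int, channels: int) -> float:
    """Uncompressed PCM data rate in kilobits per second."""
    return sample_rate_hz * bits * channels / 1000


formats = {
    "CD stereo (44.1kHz/16-bit)": (44_100, 16, 2),
    "6-channel 48kHz/16-bit": (48_000, 16, 6),
    "6-channel 96kHz/24-bit": (96_000, 24, 6),
}

for name, (rate, bits, channels) in formats.items():
    print(f"{name}: {pcm_kbps(rate, bits, channels):,.1f} kbps")
```

Compare any of these with a 400-odd kbps compressed stream and the gap is obvious.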

[Figure: CD vs. Blu-Ray. Yeah. Brutal.]

But there’s something else that’s even better: Sony’s DSD system, which runs at a staggering 2.8 MEGA-Hertz at 1 bit. Yeah. 1 bit. But at that speed, it’s actually “getting real” with infinity, and human hearing can’t react fast enough to need more than 1 bit. I promise you DSD is the smoothest, most realistic recording system on the planet. Trouble is, nothing will edit it. Until we see Adobe Audition editing DSD files, we’re stuck with PCM. To be honest, PCM isn’t that bad, and we’re so used to the distortions and sonic dithering and artifacts as a culture that we don’t mind.
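For a sense of scale, here is a rough Python comparison of raw data rates per channel. It assumes base-rate DSD (often called DSD64), which samples at 64 times the CD rate; that multiple is my assumption about which flavor of DSD is meant by "2.8 MHz".

```python
CD_RATE_HZ = 44_100
CD_BITS = 16
DSD_MULTIPLE = 64  # assumption: base-rate DSD, 64 x the CD sample rate


def dsd_bits_per_second(multiple: int = DSD_MULTIPLE) -> int:
    """DSD stores one bit per sample, so bits per second equals the sample rate."""
    return CD_RATE_HZ * multiple


def pcm_bits_per_second(sample_rate_hz: int, bits: int) -> int:
    """PCM stores a whole word per sample."""
    return sample_rate_hz * bits


dsd = dsd_bits_per_second()
cd = pcm_bits_per_second(CD_RATE_HZ, CD_BITS)
print(f"DSD: {dsd:,} bits/s per channel ({dsd / 1_000_000:.4f} MHz sample rate)")
print(f"CD:  {cd:,} bits/s per channel")
print(f"DSD carries {dsd / cd:.0f}x the raw data of CD per channel")
```

So even at 1 bit, the sheer speed means DSD moves several times more data than CD-quality PCM.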

This clip shows us why PCM resolution matters so much. At the beginning of the video you see how horrible 4Hz/2 bits would be at reproducing a triangle wave. But as the resolution goes up, although it’s never perfect, it does a very credible job of representing the infinite nature of analog audio.
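You can get a feel for the same idea with numbers instead of pictures. This Python sketch samples a 1 Hz triangle wave at different resolutions and reports the worst gap between the real wave and the stored version. It assumes a simple "hold the last sample" playback, which is cruder than what a real converter does, so treat it as an illustration only.

```python
def triangle(t: float) -> float:
    """Triangle wave with a 1-second period, ranging from -1.0 to 1.0."""
    phase = t % 1.0
    return 4.0 * abs(phase - 0.5) - 1.0


def quantize(value: float, bits: int) -> float:
    """Snap a value in [-1, 1] to the nearest of 2**bits evenly spaced levels."""
    steps = 2 ** bits - 1
    code = round((value + 1.0) / 2.0 * steps)
    return code / steps * 2.0 - 1.0


def worst_error(sample_rate_hz: int, bits: int, check_points: int = 1000) -> float:
    """Largest gap between the true wave and a sample-and-hold version of it."""
    worst = 0.0
    for i in range(check_points):
        t = i / check_points
        # Time of the most recent sample at or before t.
        held_t = int(t * sample_rate_hz) / sample_rate_hz
        approx = quantize(triangle(held_t), bits)
        worst = max(worst, abs(triangle(t) - approx))
    return worst


for rate, bits in ((4, 2), (64, 8), (1024, 16)):
    print(f"{rate:>5} Hz / {bits:>2} bits -> worst error {worst_error(rate, bits):.4f}")
```

The error shrinks quickly as both the sample rate and the word length go up, but it never reaches exactly zero.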

Processing

[Figure: There’s a lot of math that goes on in a regular channel strip.]

Once the sound is brought into a binary PCM file, anything is possible. With modern Digital Audio Workstations, manipulating sound is what those of us mixing in the 80s would have considered god-like. But while being in the DAW is heaven on earth, there are some things to consider. Suffice it to say that what a plugin, channel fader or pan knob is actually doing is running a series of mathematical algorithms to manipulate the data of the audio stream. Whether it’s an EQ where highs need to be added or a compressor where values over time are being changed, it’s basically high-order math which sometimes, in big sessions, runs into the teraflops. When you think about it, there’s a lot of math going on in a channel. Just take a typical Adobe Audition channel strip as an example. From the top down:

  • phase flip (easy math)
  • bussing (just copying data with a fader drop; easy – unless you pan, then complex math)
  • EQ (complex math)
  • Pan (medium math)
  • Stereo Channel sum (medium math)
  • Fader (simple math)

We’re not talking about plugins at all here – which wildly complicate everything.
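To show what "easy" and "medium" math look like, here is a rough Python sketch of a few of the stages listed above, applied to a single sample. This is not Audition's actual code: the constant-power pan law is one common choice that I am assuming for illustration, and the EQ stage is left out because a real filter needs more than a few lines.

```python
import math


def db_to_gain(db: float) -> float:
    """Convert a fader setting in decibels to a linear multiplier."""
    return 10 ** (db / 20)


def phase_flip(sample: float) -> float:
    """Easy math: invert the polarity."""
    return -sample


def fader(sample: float, db: float) -> float:
    """Simple math: one multiplication per sample."""
    return sample * db_to_gain(db)


def pan(sample: float, position: float) -> tuple[float, float]:
    """Medium math: constant-power pan (assumed law).

    position runs from -1.0 (hard left) to +1.0 (hard right).
    """
    angle = (position + 1.0) * math.pi / 4.0
    return sample * math.cos(angle), sample * math.sin(angle)


def stereo_sum(channels: list[tuple[float, float]]) -> tuple[float, float]:
    """Medium math: add up every channel's left and right contributions."""
    left = sum(l for l, _ in channels)
    right = sum(r for _, r in channels)
    return left, right


sample = 0.5
left, right = pan(fader(phase_flip(sample), -3.0), position=0.0)
print(stereo_sum([(left, right), (0.1, 0.1)]))
```

Every one of these runs for every sample on every channel, tens of thousands of times a second, which is where the big numbers come from.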

Imagine this: you’ve got a beautifully recorded 8 bit, 32 kHz recording (how it could be beautiful I don’t know, but go with me here). This means that you’ve got 256 possible choices for the level of the sound at each sample. You run it through Audition (and let’s say it didn’t make a 24 bit version – which it would otherwise) and you change the fader on the channel to be 3 dB less. What you’ve likely done is forced the audio to no longer fit within the 256 choices you have. It’ll be somewhere between two of the 256 levels. What do you do if you’re Audition? Which do you choose?

You randomly choose one.

Let’s say, “the upper choice.” And now the audio has been changed from its original into something which it was never supposed to be. Why? Because of a simple fader move. Worse, at every sample, Audition makes this random choice when the audio no longer falls into a possible bit position. To be clear, every DAW does this. You can see the problem, right? The audio is no longer the original. It’s got a randomization which was never there before. It ends up sounding like noise. The error itself is quantization noise; the randomization is what we call dither.
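Here is the fader example as a small Python sketch. It follows the "pick one of the two neighbors at random" model described above; that is a simplification for illustration, and the function names and the signed 8-bit code range are my own assumptions rather than anything a specific DAW documents.

```python
import math
import random


def apply_fader(code: int, db: float) -> float:
    """Scale an integer sample code by a gain in dB. The result is rarely a whole number."""
    return code * 10 ** (db / 20)


def requantize_randomly(value: float, rng: random.Random) -> int:
    """Choose one of the two neighboring integer codes at random."""
    lower = math.floor(value)
    if value == lower:
        return lower  # already sits exactly on a level
    return lower + rng.choice((0, 1))


rng = random.Random(0)  # fixed seed only so the demo is repeatable
original = [100, 64, -37, 12]  # hypothetical 8-bit sample codes (-128..127)

for code in original:
    scaled = apply_fader(code, -3.0)
    stored = requantize_randomly(scaled, rng)
    print(f"{code:>4} -> {scaled:8.3f} -> stored as {stored:>4} (error {stored - scaled:+.3f})")
```

A sample at code 100 lands at about 70.8 after the 3 dB drop, so it has to be stored as either 70 or 71. Neither is what the math asked for.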

“But Mark! We only moved a fader!! How could something so simple do that?!?”

I wish it weren’t so, but the fact remains: if at any point the infinite nature of sound cannot find a sample or bit that coincides with where the math believes it should, dither/randomization occurs. This happens throughout the Analog to Digital process as well. Of course, at 24 bit, 96kHz, that dither is minute…but the downside to having a super fast sample rate is the dither-choice happens twice as many times as it does at 48K. Let me say, there are various dithering algorithms which mask the noise very well.

One of the recent solutions to this is DAWs working in 32-bit “float” calculations. This means any time math has to be applied to a sound, the DAW creates a “virtual” 32-bit architecture. This wildly reduces dither and a whole bunch of other issues, because instead of 16 million possibilities (24 bit) you now have 4,294,967,296. Yeah. Dope. In some DAWs you can actually export audio files in this format. With modern DAWs able to do this kind of really high level math, going over zero on a channel is no longer an issue. Although your master channel still has to stay under zero dB, you can drive it from any number of over-zero channels or busses with nearly no ill effect. 32 bit architecture allows for a gigantic headroom increase inside the DAW – and I’m a BIG fan.
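The headroom point is easy to demonstrate. This Python sketch pushes a sample 12 dB over full scale on a channel and then pulls it back down 12 dB at the master, once through a 24-bit fixed bus that clips and once through a 32-bit float bus. The bus functions are hypothetical stand-ins for what happens inside a DAW, and `struct` is used only to round numbers to 32-bit float precision.

```python
import struct

FULL_SCALE_24 = 2 ** 23 - 1  # largest positive 24-bit signed sample


def to_float32(x: float) -> float:
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def through_fixed_bus(sample: float, boost_db: float, cut_db: float) -> float:
    """Boost, store on a 24-bit integer bus (which clips at full scale), then cut."""
    boosted = sample * 10 ** (boost_db / 20)
    clipped = max(-1.0, min(1.0, boosted))  # anything over zero dB is lost here
    stored = round(clipped * FULL_SCALE_24) / FULL_SCALE_24
    return stored * 10 ** (cut_db / 20)


def through_float_bus(sample: float, boost_db: float, cut_db: float) -> float:
    """Boost, store on a 32-bit float bus (values above 1.0 are fine), then cut."""
    boosted = to_float32(sample * 10 ** (boost_db / 20))
    return to_float32(boosted * 10 ** (cut_db / 20))


print(f"24-bit positions:       {2 ** 24:,}")
print(f"32-bit positions:       {2 ** 32:,}")
print(f"fixed bus returns:      {through_fixed_bus(0.8, 12.0, -12.0):.4f}")
print(f"float bus returns:      {through_float_bus(0.8, 12.0, -12.0):.4f}")
```

The float bus hands back the original 0.8, while the fixed bus has already flattened the peak before the master fader ever sees it.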

With these kinds of digital/mathematical tools at our disposal, processing audio is so much easier than it was even 5 years ago.

Back to Analog

Once the data stream has passed through the gauntlet of alterations in a DAW, it’s ready to be reassembled after the Master Output and turned back into voltage to be sent to speakers. This process is somewhat easier to do, because there’s no guesswork as to what might be “coming in” and, thus, few dither issues. The Digital to Analog Converter (DAC) is able to “look ahead” and formulate the best voltage representation of the data.

Normally, any decision about dither happens at the DAW level. But if your audio interface also has a volume control…guess what? It’s doing another layer of dithering, and it will be doing it at 24 bit or less. Be careful when you buy an audio interface. DO NOT TRUST THE BUILT IN/GAMING AUDIO CARD ON YOUR COMPUTER TO DO ANYTHING BUT MAKE BAD DECISIONS ABOUT THIS PROCESS!!!

Once out of the interface, it’s voltage as usual: cables are your enemy, and you should buy the best speakers you can afford.

Hopefully this little treatise on digital to analog processing has been helpful. Let us know your thoughts and experiences about DAC/ADC dealings, or Tweet about it!

 
