Post Go back to editing

Understanding of gain staging, DSP headroom and serial port bit rates

Category: Software
Product Number: ADAU1466

Hello Ez, 

I am trying to properly understand the relation between gain staging, DSP headroom, and serial port bit rates, as the title says. The DSP is used as an active crossover (XO) for stereo systems. It could be a 2-, 3-, or 4-way system, as I have 4 stereo DACs connected. Main volume is controlled by a potentiometer.

My SS program does this: 

  • ASRC is used to bring SPDIF into the core. This is done with a -6dB gain before the ASRC using the F590 register as discussed here:  
    • Signal is now in the core at a -6dB level
    • If I have inter-sample overs it may reach 0 dBFS, right? If so, this really only happens at high frequencies I guess, not below 80 Hz.
    • My sources are TV Toslink (fixed at 0 dBFS), a WiiM streamer's Toslink, and PC Toslink (both with adjustable volumes, but they will be left at 0 dBFS). 
  • I then boost the signal by +23 dB immediately
    • Signal is now at +17 dB, maybe increasing towards +23 dB at high frequencies in case of inter-sample overs.
    • The boost is applied to be able to bring low input signals up to loud output levels, as some of my source material is quite weak. 
    • One of the systems has very inefficient woofers, and I have found this value lets me just barely clip the woofer amp on bass-heavy tracks at full volume.
  • I then enter my filters, where I am left with 42 dB - 17 dB (23 dB) = +25 dB (+19 dB) of headroom.
    • Let's say I boost 20 Hz by +20 dB. I then still have 5 dB of headroom at 20 Hz.
    • My tweeter is reduced by -10 dB, so I should have some 29 dB of headroom there in the case of inter-sample overs at the SPDIF input. 
  • Then I enter my volume control, which is driven by a potentiometer. 
    • The signal before this volume control could be up to +37 dB at 20 Hz. 
    • The volume potentiometer is adjusted according to the source material, in the range from muted to 0 dBFS, most of the time at very low levels (when used with PC speakers). 
  • Then I truncate the output signal by:
    • Soft-clip limiters set to -3 dB on both tweeter and midrange outputs. The limit will most likely never be even close, but it is just there for safety.
    • A multiband compressor on my woofer, also set to -3 dB, to avoid a 20 Hz 0 dBFS tone killing the 200 Hz bass. 
  • Then I hand over the signal to the serial ports set to 24 bit.

  • My DACs are set up so that a -3 dBFS signal results in 4 Vrms output (which is what I want when my volume potentiometer is at max). 
    • This is done to also avoid inter-sample saturation of the DAC IC itself. They are AK4493 DACs. Not sure if this step is actually needed, or if I could safely feed them 0 dBFS or maybe -1 dBFS. 
    • My amplifiers' input buffers are adjusted so that a 4 Vrms input results in whatever power level I require for a given speaker driver. 
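The dB arithmetic above can be sanity-checked with a few lines. This is just a sketch using the numbers from this post, plus the usual 42 dB of 8.24 core headroom discussed below:

```python
# Rough dB bookkeeping for the chain described above (numbers from this post).
core_headroom_db = 42.0          # 8.24 core: 7 integer bits above 0 dBFS

level_db = 0.0                   # 0 dBFS SPDIF source
level_db += -6.0                 # pad before the ASRC -> -6 dB in the core
level_db += 23.0                 # makeup boost -> +17 dB
worst_case_db = level_db + 20.0  # +20 dB low-shelf at 20 Hz -> +37 dB peak

margin_db = core_headroom_db - worst_case_db
print(f"worst-case internal level: {worst_case_db:+.0f} dB")  # +37 dB
print(f"margin before the core clips: {margin_db:.0f} dB")    # 5 dB
```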

I then have a couple of questions:

  1. First off, is this general gain staging a reasonable way of doing it?
    1. I had originally placed my volume control before the filters, but I thought it would make sense to give the math the biggest possible signal level to work with at all times, as the device will often be used at very low volumes (PC speakers). 
  2. What exactly does the serial port's bit rate setting do? 
    1. In one of Dave's videos on YouTube he mentions the core takes a 24-bit signal in, in the 1.23 format, then internally treats it as a 32-bit signal in the 8.24 format, allowing the signal to be boosted by some 42 dB, then truncates it and outputs it as 24-bit in the 1.23 format again. Is this always the case, or does the serial port's bit rate change this? 
    2. Let's say I were to set the serial outputs to 32 bits. Would the core still need to truncate its 32 bits down to 24 bits before the output, or would I need the full +42 dBFS of headroom in order to get the full resolution from a 32-bit serial port? 
    3. I have a feeling the above is completely wrong and the serial port resolution has nothing to do with the DSP core's bit depth. Right?
  3. Does it make sense to limit my outputs to -3 dBFS, or is -1 dB good enough?
  • Hello DannerD3H,

    Gee, some of these questions are a little tough, let me take them one at a time...

     First off, is this general gain staging a reasonable way of doing it?
    1. I had originally placed my volume control before the filters, but I thought it would make sense to give the math the biggest possible signal level to work with at all times, as the device will often be used at very low volumes (PC speakers). 

    Yes, it is 100% correct to put the output volume control as close to the actual core output as possible. Not before a filter! I have built and seen projects where a hard limiter is set to limit just before clipping, so that becomes the output, and the volume control sits before this limiter. However, when it comes to EQ, you want a high-level signal going through it so you do not end up with a raised noise floor from the calculations.

    I do want to add some info here that somewhat applies to many of your questions. 

    There was a customer who builds really high quality speaker crossovers and they had great listening rooms and systems for testing. 

    Since there is 42 dB of headroom, I asked them to perform an experiment: raise the gain right after the signal enters the core by something like 20 dB, then, just before the signal leaves the core, reduce the level by 20 dB, so all the audio processing is done with "more bits". 

    Results?... They did a comparison and could not hear any difference at all!! To me this proves that the way we do the multiplies in the core, using an 80-bit accumulator and only truncating the data after the filter has been calculated, helps to prevent audible artifacts. So I would not bother to raise the level of signals unless they are too low, as you noted. In that use case it would be good to get the signal up close to 0 dBFS for a 1.23 signal. But do not go too far out of your way to do it.  

    What exactly does the serial port's bit rate setting do? 
    1. In one of Dave's videos on YouTube he mentions the core takes a 24-bit signal in, in the 1.23 format, then internally treats it as a 32-bit signal in the 8.24 format, allowing the signal to be boosted by some 42 dB, then truncates it and outputs it as 24-bit in the 1.23 format again. Is this always the case, or does the serial port's bit rate change this? 
    2. Let's say I were to set the serial outputs to 32 bits. Would the core still need to truncate its 32 bits down to 24 bits before the output, or would I need the full +42 dBFS of headroom in order to get the full resolution from a 32-bit serial port? 
    3. I have a feeling the above is completely wrong and the serial port resolution has nothing to do with the DSP core's bit depth. Right?

    I will respond to all of these questions somewhat at the same time. 

    The serial port bit rate setting just sets the number of bits per channel slot on the serial port. It does not actually change the word size that is being sent to the serial port to be transmitted out. 

    The MSB of the core data is lined up with the MSB of the serial slot, and remember that a 1.23 is actually a 1.24 in the core; one bit is added at the bottom in the core. So this bit is stripped, as are all the bits above the "1." bit. Then eight zeros are appended after the 24 bits from the core. 

    Now, if you change the serial ports to 32 bits, then the 32 bits coming in go directly to the core, bypassing all truncators and saturators!! The same thing happens on the output of the core: if you set the serial output port to 32 bits, then the 32 bits from the core go directly to the serial port, bypassing all truncators and saturators. 

    If you set the serial ports to 32 bits then you get the full headroom. Whatever is in the core goes out of the serial port. 
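The difference between the two modes can be sketched numerically. This is only a simplified model of the saturate-and-truncate behaviour described above (Python ints standing in for 32-bit two's-complement words, positive values only; function names are mine, not a SigmaStudio API):

```python
def quantize_8_24(x):
    """Place a real value on the 8.24 grid (32-bit word, roughly -128..+128)."""
    return int(round(x * (1 << 24)))

def serial_out_24bit(word):
    """24-bit mode sketch: saturate to ~+/-1.0, then keep only the 1.23 bits."""
    lo, hi = -(1 << 24), (1 << 24) - 1   # +/- full scale in 8.24 terms
    return max(lo, min(hi, word)) >> 1   # drop the extra LSB: 1.24 -> 1.23

def serial_out_32bit(word):
    """32-bit mode sketch: the core word passes through untouched."""
    return word

# A +12 dB (x4) internal signal saturates in 24-bit mode but not in 32-bit mode:
hot = quantize_8_24(4.0)
print(serial_out_24bit(hot) == (1 << 23) - 1)  # True: clipped just under full scale
print(serial_out_32bit(hot) == hot)            # True: headroom preserved
```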

    Now, you can mess this up! Set the serial port to 32 bits but then select a bit-clock frequency that is too low to output 32 bits per channel! You have to have the correct clock settings, or send it the correct clocks from an external clock source. 
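The clock requirement here is simple arithmetic. As a hypothetical helper (the name is mine, not a SigmaStudio function), the bit clock must at least cover every bit of every channel slot per sample:

```python
def min_bclk_hz(fs_hz, bits_per_slot, channels=2):
    """Minimum bit clock needed to fit every bit of every channel slot."""
    return fs_hz * bits_per_slot * channels

# 32-bit stereo (I2S) slots at 48 kHz need a 64 x fs bit clock:
print(min_bclk_hz(48_000, 32))   # 3072000 (3.072 MHz)
```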

    Does it make sense to limit my outputs to -3 dBFS, or is -1 dB good enough?

    This is somewhat your call... As I said earlier, if you set up the limiter to be really fast, then -1 dB may be good enough. If the attack time is a little slow, then you may go above 0 dBFS and clip the output to the serial port. A fast attack time can distort the audio a little and not sound all that great. This opens up a huge subject; for a full-spectrum signal I like to use a multiband compressor/limiter. But your application is a crossover, so "never mind!" LOL!  It is also good to have a more gentle compressor set to around -6 dB or -4 dB, and then a final fast protection limiter set to -1 dBFS. Hey, this is the fun of working with these DSPs, it is so easy to experiment!

    By the way, if you are planning on using a 32-bit DAC from some other company, look at the specifications. Chances are the signal-to-noise spec is something like 122 dB, or perhaps it is a really good one at 132 dB SNR. 

    Well, at 6.02 dB per bit, that is about 20 bits for a 122 dB SNR, and almost 22 bits for a 132 dB DAC!! 

    So if you keep the level in the DSP core around the 1.24 signal level, with 7 bits of headroom (42 dB) unused, then a lot of the bits that are above the noise floor of the audio in the core will end up being lost in the noise floor of the DAC!!

    This is a big argument for raising the level in the core before sending it out to the serial output port. 
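The 6.02 dB/bit figure comes from 20·log10(2), so the conversion above is a one-liner to check (with the caveat, raised later in this thread, that an SNR spec is not literally a bit depth):

```python
import math

def effective_bits(snr_db):
    """Rough bits-of-resolution implied by an SNR figure (~6.02 dB per bit)."""
    return snr_db / (20 * math.log10(2))

print(round(effective_bits(122), 1))  # 20.3 bits
print(round(effective_bits(132), 1))  # 21.9 bits
```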

    I hope this helps and does not confuse things more!!

    Dave T

  • Hello Dave, 

    Once again thanks for the elaborate answers and sorry for the very basic questions!

    1 - Gain stage

    Perfect, thanks. Good to get a sanity check on my thoughts :)

    And thanks for the bonus information regarding the inaudibility in practice! 

    2 - Bit rate

    Okay, so it actually does mean that the full internal 32-bit data is transmitted to the serial port directly? I kind of didn't expect that. Thanks for clarifying! 

    So just to verify: if I input 0 dBFS through a 24-bit serial port, I get a 1 in the core in the 8.24 format. I can then boost this signal by +42 dB and output that directly through a 32-bit serial port without clipping the data transmitted to the DAC, correct? 

    And a clipper set to 0.9 on a 32-bit serial port, as mentioned above, would effectively reduce my output by 42 dB (7 bits) by clipping it? Or am I missing something? 

    Regarding the clock, I am using a 24.xx MHz clock for the core and DACs. 

    3 - Compressors

    Oh, this makes great sense: a slow-acting limiter at a lower level and then a hard limit up higher. Thanks for the tip!

    Regarding the DAC's

    My DACs are based on the AK4493S chip. I know it's not yours and you cannot support this device, but just to clarify: it has an S/N of 123 dB.

    So this means that in the best case it can resolve around 20 bits of audio, right? 
    So what you are saying is, if I run the serial ports at 24 bits, I will effectively have 4 bits of resolution in the DSP that the DAC cannot resolve anyway. So going to a 32-bit serial port will not give me any real benefit? 

    Case 1:
    My initial plan was to run the serial ports at 24 bits, then use the internal headroom to ensure that I can safely boost weak input signals without clipping hotter input signals internally in the DSP (given that the volume control is the last step). This headroom also allows me to do some dynamic bass boost and so on. 

    Case 2:
    Whereas if I used 32-bit serial outputs, I should preferably stay away from the lower 12 bits, as they would be lost in the DAC noise floor. As a result I would have to use rather hot signal levels in the DSP, effectively losing much of my headroom in the DSP?

    Is this the right understanding, or am I looking at it the wrong way? 


    As always, thanks for your detailed explanations. I hope I understood it correctly?

    Cheers, Daniel

  • Hello Daniel,

    You are kind of bouncing around a little between the bits in the core and the bits going out of the serial port. This post will just focus on what happens to the data in the core when it gets translated to, and out of, the serial port. 

    I took a screenshot of the graphic detailing the serial port set to a stereo (I2S) format, which has 32 bit-clock cycles per channel. So it is a format you would use to communicate with a 32-bit converter. 

    So this details how the bits in the core are transferred to the serial port and then sent out via the serial stream of data. 

    You can set the serial port to three different modes: 32-bit, 24-bit, and 16-bit. 

    The "Delay-By" bits really do not change anything I will be talking about. That detail just has to match up with the receiver settings. So I drew red arrows for the lines for Delay by 0. This just makes it a little clearer and gets that option out of the way of this discussion and comparison.

    This diagram is from page 66 of the ADAU1452 Rev D datasheet. 

    The numbers in the little boxes are the data bit positions in the core. You can see the bit clock cycles at the top of the screen counting the bits starting with "1" going to 64 for the two 32 bit channel slots. 

    The MSB is transmitted first. So the numbers in the little boxes are the bit positions: bit-0 is the LSB, and bit-23 is the MSB, otherwise known as the sign bit. 

    So you see the 24-bit data being transmitted starting with bit 23 going down to bit 0. That is MSB to LSB. 

    This stuff is so automatic to me that I forget that others have not been doing this since the 1970's like I have!! Shoot, my first assembly language class was on a mainframe that was in octal!!! So I had to learn how to go between binary to octal to hex to decimal. I guess it has served me well through the years. 

    OK, Back to the info for this post!!

    The whitespace you see below bit-0 represents the zeros that will be inserted after the data, or before the data for the "delay-by" settings. 

    Now to translate all of this into the number formats we define, like the 1.24 format or the 8.24 format: 

    So, for the case of the 24-bit setting and 1.24 data: 

    Bit-23 is the MSB and the sign bit, so it will be sent out in the bit-31 position of the 32-bit serial slot, as the first bit transmitted. 

    Bit-0 will go out in the bit-9 position. The LSB of the 1.24 signal will be truncated, and the last eight bits will all be zeros. 

    For the case of 16-bit data, the top 16 bits of the 1.24 internal data get transferred; the rest is truncated. 

    Then the MSB again goes out first, in bit position 31, with 16 bits of zeros tacked onto the end. 

    For the case of the 32-bit setting: in this mode, the 32 bits in the core go right out of the serial port, MSB first of course. The MSB will be the sign bit, followed by the other 31 data bits that are in the core. No truncation at all!!!

    Now here is the tricky part. 

    If you run internally in the core with the 7 bits of headroom and then just send that out to the serial port, you are sending the data with the top seven bits unused (except when the data is negative, of course). If you then send this to a 32-bit amp, you will be 42 dB down from the top of its headroom!!

    Personally, I like the headroom in the core for the filters and any other math, but then I come back down to the nominal level for a 1.24 signal. Then I would raise the gain by maybe as much as 42 dB!! I would be too chicken to go that far, but 40 dB or 39 dB I could see. 

    Just do a shift left instead of a multiply. 

    Then the 24 bits of signal should correlate well with the 32 bit converters that have around 20 or 21 bits of resolution. 
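The shift-left trick works because each bit of left shift is an exact doubling, i.e. about 6.02 dB. A small sketch of the arithmetic (plain Python, not SigmaStudio blocks):

```python
import math

db_per_bit = 20 * math.log10(2)     # ~6.0206 dB per single-bit shift
print(round(7 * db_per_bit, 2))     # 42.14 -> why 7 headroom bits ~= "42 dB"

# In fixed point, shifting left by n multiplies the sample by exactly 2**n:
sample = 0x00123456                 # headroom bits clear, so no overflow here
print(sample << 7 == sample * 2**7) # True
```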

    Do some experimentation. 

    Here is the screenshot I took. ( I almost forgot to insert it!!)

     

    I hope this helps. 

    Dave T

  • Daniel,

    The bit depth of a DAC is not literally the same thing as its S/N ratio. Quantization noise is distinct from other sources of noise, and even different types of quantization noise (correlated or uncorrelated) can be vastly different in audibility. The human hearing system is complex, with various masking effects in different critical bands, making it difficult to predict whether one type of noise will be audible or not in the presence of another type of noise. It's well established that 24-bit audio can be preserved, in most audible frequency bands, even with a 16-bit format like CD. There are other examples of systems where humans can discern signals whose level is quite far below the level of the noise. Thus S/N is not a hard threshold - signals do not magically disappear when their level drops below the noise floor.

    With a 16-bit DAC, you definitely want to dither from 32-bit or 24-bit down to 16-bit to avoid the correlated quantization noise of truncation.

    With a 24-bit DAC, you may or may not want to dither, but it's probably better to dither unless you can thoroughly test the audibility of truncation.

    With a 32-bit DAC, I'm not even clear what the chip is doing with the extra 8 bits. I'm actually trying to find documentation from the vendors on what they do. Do they truncate to 24? Do they dither to 24? Or does their sigma-delta system actually have the oversampling bandwidth to realize those finer levels of quantization (even though they're far below the self-noise of the chip itself).

    Each possibility determines what we *should* do.

    My guess is that you can configure the interface between SigmaDSP and DAC as 24-bit, in which case I would highly recommend dithering; or configure it for 32-bit, and your guess is as good as mine as to whether dither would be meaningful.

    Brian

    p.s. It might be better to choose between 24-bit data paths and 32-bit data paths based on criteria other than S/N ... like, perhaps, radiated clock noise from the higher frequencies of the 32-bit configuration.

    Hello Dave, once again thanks. I think I am starting to understand it, but all this digital theory stuff is very new to me, so it's a wee bit confusing to go from decimal to 1.24/8.24 and then toss some hex in there as well. But I think I'm getting there slowly. I can imagine it is just like speaking English for you :)

    Okay, another sanity check:

    1. SPDIF in and 24 bit serial out
      1. If I input 0 dBFS via SPDIF into the core through an ASRC, I get a 0 dB signal in the core. 
      2. If I output that directly from the core through a 24-bit serial port to the DAC, I would get the maximum output from the DAC, right?
      3. Or is the DAC always running in 32-bit mode? 
      4. If I boosted the signal by +42 dB, I would have to limit it before sending it to the serial port, or I would clip the DAC, right? 
    2. SPDIF in and 32 bit serial out
      1. If I take my 0 dB SPDIF signal and instead output it via a 32-bit serial port, the full 32-bit data of the DSP core is output, meaning my bit 31 (sign bit) is transmitted first. Then I get bits 30-24 as empty bits (7 bits of headroom), and in bits 23-0 I have my actual data.
      2. My 0 dBFS input becomes a -42 dB output from the DAC? 
      3. Meaning my DAC will actually mask the lower bits with its noise floor, giving me only some 15 bits of ENOB. 
      4. If I internally boost the signal in the core by +42 dB before the serial port (with no limiters and so on), I would get the full 0 dB output from my DAC?

    I will try to experiment a "bit" with this tonight!

    Thanks Dave!

  • Hello Brian, 

    I actually do not see any benefit to the 32-bit interface, if my understanding of Dave's bit talk is correct. For some reason I find it more intuitive to stick with 24-bit in/out and keep the internal headroom.
    If there are also other downsides, such as radiated noise and so on, with hardly any audible gains (if any at all), then I see no reason not to just use the 24-bit interface. 

    I was also looking into dither... I just didn't want to bring it into this thread, as it was already confusing enough. 

    But since you brought it up:

    1. If I want to add dither in this DSP, what would be the best approach? 
      1. Add two random noise generators with an output of -144/-138 dB and add them to the audio? Or is there a better approach?
      2. Where to add it to the audio? Right before the serial ports? Before the limiter? Before the volume control?

    Thanks!

    As far as I can see, Brian, the AK4493S runs a 24-bit core. This isn't explicitly stated in the datasheet, but the S/N implies it, and from what I have seen there is no measurement difference whether it's configured for 32- or 24-bit inputs. So my guess is that it just ignores the last 8 bits when fed a 32-bit signal.

    My DAC is currently configured for 32-bit inputs hardware-wise. So it's my understanding that I can feed it both 32- and 24-bit data from the DSP and it will still work, as long as the BCLK from the DSP stays at 64×fs. 

    Then the question of what the DSP puts in those 24 and 32 bits is another story.

    Do you agree, or am I way off?  

    PS: it's really inconvenient for a forum like EZ, which discusses digital data and so on, to not allow the use of the words "Maste*" and "Slav*"... xD

  • Hello Brian,

    I almost started to go into what you went into, about being able to hear down into the noise floor, correlated versus uncorrelated noise, etc. So I get it, but I just focused on what happens to the bits when you do a transfer, and on not truncating when not needed. 
    We do not do any dithering per se; if you use an IIR filter it will somewhat do that, but again it depends on the noise type. Yes, we can hear down into noise, which makes analyzing performance tough! I remember a lot of discussion in the trade publications way back in the 1970s about trying to develop tests that would produce a measure of sound quality the way we hear it, but it is yet to be developed!! Digital audio just ended up making it worse! I guess it keeps what we do exciting! 

    With the P.S. you added: this was why I wondered whether, if someone took the 24-bit ADC data and digital data, multiplied it up so there were a lot more bits of resolution, ran it through all the IIR filters and FIR filters and all the processing, and then just brought it back down to 24 bits at the end, it would sound better due to all the dithering etc. But the folks who did the experiment reported absolutely no difference. They have superb equipment and listening environments and could not hear a difference. Anyhow... I will try to get to the rest of your posts...
    Dave T

  • Daniel,

    The top of the AK4493 data sheet says "Quality Oriented 32-Bit 2ch DAC" and they have surely implemented a device that utilizes all 32 bits. I will reiterate that you cannot use the analog signal-to-noise ratio specifications to determine the number of bits used in the digital-to-analog converter technology - they are unrelated, other than the fact that they end up being mixed together into the final output signal. Asahi Kasei mention six types of 32-bit digital filters, so it's surely the case that they're feeding 32-bit samples into a 32-bit filter, even if that gets fed to a 24-bit DAC down the line, internally. It's worth asking their support for more details about the final D/A stage, but the input to the signal path is 32 bits.

    To put S/N ratio into perspective, Linear Technology has a nice paper on the challenges of designing for low noise. Their LT1028, LT6018, and LT1115 are the lowest-noise op-amps available. If you read their white papers, you will see that every component in the signal path adds noise, even the resistors that set gain. The more components in the signal path, the more noise. At some point, we reach the limits of S/N for any modern circuits, and 125 dB might be the best anyone can do. While it's worth arguing whether there is any point in designing a 32-bit DAC, the S/N maximums determined by physics do not change a 32-bit DAC into a 20-bit DAC. It's apples to oranges.

    If you use SigmaDSP to feed 16-bit or 24-bit samples to the AK4493, then there will be more noise on the input to its 32-bit digital filters. That noise will either be objectionable truncation noise that is correlated (and thus audible under certain conditions), or if you apply dither properly then it will be uncorrelated noise that may or may not compete with the S/N level of the analog stages in the output section of the DAC.

    Personally, I would like to see more information from Asahi Kasei on the details of what they mean by "32-bit" but that's a topic for their support channels.

  • If I want to add dither in this DSP, what would be the best approach? 
    1. Add two random noise generators with an output of -144/-138 dB and add them to the audio? Or is there a better approach?
    2. Where to add it to the audio? Right before the serial ports? Before the limiter? Before the volume control?

    That's a great question, Daniel. I'm new to SigmaDSP, and I was assuming there would be a 'dither' module that could be dropped in - either one provided by Analog Devices, or from a third party. If not, I'll have to learn how to do it as a bespoke solution.

    Gaussian noise is best, but two random noise generators per channel could be added to achieve a TPDF (triangular probability distribution function) noise source. The scaling should be 2 LSB, according to the AES papers I've read. Some dither implementations further improve this by performing a high-pass filtering of the noise source, so that what noise is there will likely be masked by the content (at least for human listeners). See Vanderkooy & Lipshitz, "Resolution Below the Least Significant Bit in Digital Systems with Dither."
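    As a sketch of the arithmetic only (plain Python rather than SigmaStudio blocks; inside the DSP this would be built from two noise generators and an adder, and the function name here is mine): summing two independent uniform sources spanning one target LSB each gives the 2-LSB-peak-to-peak TPDF dither described above, added just before truncation:

```python
import random

def tpdf_truncate(word_32, out_bits=24, in_bits=32):
    """Add 2-LSB peak-to-peak TPDF dither, then truncate a word to out_bits."""
    shift = in_bits - out_bits
    lsb = 1 << shift                      # one LSB of the target word size
    # Sum of two uniforms over one LSB -> triangular PDF spanning +/-1 LSB:
    dither = random.randrange(lsb) + random.randrange(lsb) - (lsb - 1)
    return (word_32 + dither) >> shift

# The dithered result never strays more than 1 LSB from plain truncation:
x = 0x12345678
print(all(abs(tpdf_truncate(x) - (x >> 8)) <= 1 for _ in range(1000)))  # True
```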

    Dither should always be performed as the last stage. In the case of the SigmaDSP, this would be right before the serial ports. You do not want it before the limiter, because then the amplitude of the dither would change with the signal level - not good. Certainly not before the volume control. For these questions, I would refer to Bob Katz' "Mastering Audio: the art and the science"

    Personally, I would like to know whether Asahi Kasei are performing dither as part of their Velvet Sound technology, because they might be handling everything for us at the last stage.