
AD9173 - issues with STPL test and not outputting data on DAC1

I've used the AD9173 Eval FMC both with the USB interface in "standalone" mode and on a KC705 HPC FMC connector.  The "serial port configuration over SPI" works reliably; we have validated this. I am seeing two curious behaviors that suggest my eval board may have bad silicon.

1- When I put the chip in DC test mode with the main NCO programmed to 1 GHz and the two amplitude registers set to 0x50ff, DAC0 outputs the tone as expected but DAC1 does not.  However, if I set up the channel NCO to output a tone, both DACs behave correctly.  Is this normal behavior?  This behavior is consistent whether I use our own software or the FMC plugged into USB with ADI ACE.

2- We are using the DAC in dual JESD link mode.  The PRBS test succeeds, and the links set up and stabilize.  I get the CGS, FS, CHECKSUM, and ILA bits set for the links and lanes in use.  I have the proper JESD mode configured, and when I poll registers 0x450 and up, they report the proper settings from the JESD transmitter.  When I run the short_tpl test, it always passes... even when I program the check registers with a bad value!  However, if I don't disable the test mode as described in step 9 of the STPL test procedure on page 44 of the data sheet, it always fails, even with the right test pattern programmed.  Furthermore, the chip will randomly not output data on DAC1 (or DAC0).  All of the status regs (470-473) show a good link, registers 4b0 - 4b7 show good links as well, and the sync out is solid high (viewed in ChipScope).  We've run out of options, and we suspect the silicon on the sample board may be bad. Is there any status we can read that confirms the DAC is streaming data on both ports?  To remedy the problem without reprogramming the device, we enter the PRBS test mode on the JESD links.  Incidentally, both links show pass and 0 error count on all lanes, and when we return from the test, the DACs successfully stream data.  We thought it might be an overflow/underflow condition, so we also checked the FIFO status bits; they show that the FIFOs are neither empty nor full, so the DAC is definitely able to keep up with the data flow.

3- Some other observations:  we are using physical lanes 4-7.  Lanes 4 and 5 are link 0, with the crossbar sending them to logical lanes 0 and 1 for link 0.  Lanes 6 and 7 are mapped to logical lanes 4 and 5, supplying lane 0 and lane 1 of link 1.  Again, every status register we can think of suggests we have a working link.  We just can't verify proper streaming.  For our test, the main NCO for each DAC is set to 1 GHz, and the channel NCO FTW is set to 0.  However, we have both the main and channel datapaths configured to use the NCO because we have complex interpolation (24x) enabled (JESD MODE 3 with 8x and 3x interpolation for main and channel), so we are consistent with the data sheet (reg 112 bit 3 and reg 130 bit 6 are set), and we can verify proper behavior (when the chip is working).  From the init, the boot loader successful flag is set (reg 705), the DAC PLL is locked (register 7b5), the DAC DLL is locked (register c3), the DACs all complete calibration (register 52), the JESD mode is valid (register 110), the FTWs load (register 113), and the SERDES PLL locks (register 281).
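One way to hunt for a status difference between a boot where both DACs stream and one where DAC1 is silent is to snapshot every register named above in each case and diff the two. The sketch below is only that, a sketch: `read_reg` is a hypothetical callable standing in for whatever SPI read routine is in use, any link or DAC paging the device needs before a read is assumed to be handled inside it, and no bit meanings are assumed. The addresses are the ones listed in this post.

```python
from typing import Callable, Dict, Iterable

# Register addresses (hex) mentioned in this post; nothing else is assumed.
STATUS_REGS = [
    0x705, 0x7B5, 0x0C3, 0x052, 0x110, 0x113, 0x281,
    *range(0x470, 0x474),   # 0x470-0x473
    *range(0x4B0, 0x4B8),   # 0x4B0-0x4B7
]

def snapshot(read_reg: Callable[[int], int],
             addrs: Iterable[int] = STATUS_REGS) -> Dict[int, int]:
    """Read each address once and return {address: 8-bit value}."""
    return {addr: read_reg(addr) & 0xFF for addr in addrs}

def diff_snapshots(good: Dict[int, int], bad: Dict[int, int]) -> Dict[int, tuple]:
    """Return {address: (good_value, bad_value)} for registers that differ."""
    return {a: (good[a], bad[a]) for a in good if a in bad and good[a] != bad[a]}

# Demo with a fake register file instead of real hardware (values are made up).
fake_good = {addr: 0x00 for addr in STATUS_REGS}
fake_bad = dict(fake_good)
fake_bad[0x470] = 0x01  # pretend one register reads differently on the bad boot

good = snapshot(lambda a: fake_good[a])
bad = snapshot(lambda a: fake_bad[a])
for addr, (g, b) in sorted(diff_snapshots(good, bad).items()):
    print(f"0x{addr:03X}: good=0x{g:02X} bad=0x{b:02X}")
```

If the diff comes back empty on real hardware, that supports the observation that none of these registers distinguishes the working state from the failing one.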

Incidentally, all register addresses are in hex.  Just to clarify.

Please advise on how to determine the silicon is behaving as it should.  Thank you!

Parents
  • For #1, note that you still need to set up a "JESD mode" in dual link, since this sets up clocking on the intended datapaths. So if you had set up mode 0 as a single link, the DAC1 datapath will not be active. I set up dual link mode 0 for both links (I run the entirety of table 53).  Also, I have the physical lanes mapped to logical lanes 0 and 1 for link 0 and lanes 4 and 5 for link 1.  If what I said has nothing to do with what you’re suggesting, then please expand on “setup JESD mode” :)

     

    Consider paging both datapaths and DACs together, so you can write over SPI to both at the same time and be sure that both are set up identically.  I do (where possible). I also tried individually.  Same results.

     

    It would be great to have a 2nd board on hand to compare results. You can also use any of the AD917x boards to compare mode 0, in case you have one already.  We don’t.

     

    Have you tried using ACE to program the EVB without it being connected to the KC-705? This would verify whether the board functions - we know programming via ACE yields the expected results. Yes.  That was my first test.  For the things that I can test in standalone mode, this fails in a similar fashion.  (See numbered item 1 in my initial email below).  I can get a DC tone on DAC0 most of the time when configuring the DC test tone on the main NCO at 1 GHz.  I never get a test tone from DAC1 using the main NCO test tone.  However, I always (in the 5 tests I did) get a test tone from both DACs using the channel NCO.

     

    Your comment on "DAC1 randomly will not output a signal" is a curious one. Is this an intermittent issue you are seeing with DAC1? Or is the lack of output on DAC1 consistent? DAC1 is “reliably intermittent”. DAC0 is "unreliably intermittent."  I know that sounds odd, but I can count on DAC1 not producing a tone more often than not.  DAC0 seems more resilient.  In some cases, DAC0 is outputting a waveform and DAC1 is not.  The JESD cores on the FPGA are being fed the same data stream (literally, the input data stream is duplicated in HW from a single source in RAM).  Every register I can read that indicates status shows that both links are up and solid.  (See #3 below for which registers I’m reading).  And the FIFO bits are clear (so no overrun, no underrun). The sync is a solid high, and the FPGA is streaming data (verified with ChipScope on the MGT lines).  Is there a bit on the DAC that says: Yes, the QBD is de-framing and the data is going to the DAC?  I can easily check it.    Also, I checked the board for physical damage or loose components.  I found none.

    Generally speaking (see comments about test modes below):  The chip behavior and the manual (and ACE) are inconsistent.  I get it, it’s a complex chip and this is rev 0 of the data sheet, so it is bleeding edge.  Which is why I’m asking for your help. 

    Example 1:

    What’s described in the manual for the STPL test procedure always passes, even if the wrong word is programmed into the check sequence.  See step 9 of the STPL test mode on page 44 of the data sheet (right-hand column of the page).  It reads: "Wait for the desired time. The desired time is calculated as 1/(sample rate × BER). For example, given a bit error rate of BER = 1 × 10^−10 and a sample rate = 1 GSPS, the desired time = 10 sec. Then, set SHORT_TPL_TEST_EN to 0." If I set SHORT_TPL_TEST_EN to 0 as suggested, this test always passes, even when I program the check values to be wrong.  (Incidentally, that led to no less than two wasted days while I tried to figure out how STPL could pass but no data was streaming).  The fix was to not set SHORT_TPL_TEST_EN to 0.  Then everything works as expected.  That fix was simple.
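    As a quick check of the wait time quoted from step 9, the formula 1/(sample rate × BER) can be evaluated directly. This is just the arithmetic from the quoted text; the function name is made up for illustration.

```python
def stpl_wait_seconds(sample_rate_hz: float, ber: float) -> float:
    """Wait time from the quoted formula: 1 / (sample rate x BER)."""
    if sample_rate_hz <= 0 or ber <= 0:
        raise ValueError("sample rate and BER must be positive")
    return 1.0 / (sample_rate_hz * ber)

# Data sheet example: 1 GSPS at a target BER of 1e-10.
wait = stpl_wait_seconds(1e9, 1e-10)
print(f"wait about {wait:.1f} s")  # roughly 10 s, matching the quoted example
```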

    Example 2:

    Look at table 58 on page 72.  This is the end of the startup routine.  The write to register 0x085 says to set it to 0x13, and the comment reads: “Set to the default register value”.   That is correct; this is the default value.  BUT, on page 88 the recommendation is to set it to a different value at the “end of the startup routine”.  So which is it?  I dumped the configuration stream out of ACE and did a line-for-line comparison with the manual… that showed me ACE was doing a lot of additional writes to the AD9173 that aren’t described or even mentioned in the data sheet.  And I don’t mean a few additional writes, I mean 5x to 10x more.  If you convert the startup sequence of the data sheet into instructions, it comes out to about 100-200 SPI write instructions (I don’t remember the exact number).  ACE is doing closer to 1000…. And if I try to decode the ACE stream, a lot of the registers it writes to are not documented in the data sheet.
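    The comparison described above can be sketched as a small script, assuming both the data sheet sequence and the ACE dump have been reduced to ordered lists of (address, value) writes. The function names are hypothetical, and except for the 0x085 = 0x13 write taken from table 58, the sample data is a placeholder rather than what ACE really writes.

```python
from typing import Dict, List, Tuple

Write = Tuple[int, int]  # (register address, value)

def extra_writes(reference: List[Write], observed: List[Write]) -> List[Write]:
    """Writes in `observed` whose address never appears in `reference`."""
    ref_addrs = {addr for addr, _ in reference}
    return [(addr, val) for addr, val in observed if addr not in ref_addrs]

def final_value_mismatches(reference: List[Write],
                           observed: List[Write]) -> Dict[int, Tuple[int, int]]:
    """Addresses written by both whose last written value differs."""
    last_ref = dict(reference)  # later writes overwrite earlier ones
    last_obs = dict(observed)
    return {a: (last_ref[a], last_obs[a])
            for a in last_ref
            if a in last_obs and last_ref[a] != last_obs[a]}

# Placeholder data: only the 0x085 = 0x13 entry comes from the data sheet.
datasheet_seq = [(0x085, 0x13)]
ace_dump = [(0x085, 0x13), (0xABC, 0x01)]  # 0xABC is a made-up address

for addr, val in extra_writes(datasheet_seq, ace_dump):
    print(f"only in ACE dump: 0x{addr:03X} <- 0x{val:02X}")
for addr, (ref, obs) in final_value_mismatches(datasheet_seq, ace_dump).items():
    print(f"differs at 0x{addr:03X}: data sheet 0x{ref:02X}, ACE 0x{obs:02X}")
```

    Comparing only the final value per address ignores ordering, so this would not catch a sequence that depends on intermediate writes.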
