
AD9173 - issues with STPL test and not outputting data on DAC1

I've used the AD9173 Eval FMC both with the USB interface in "standalone" mode and on a KC705 HPC FMC.  The "serial port configuration over SPI" works reliably; we have validated this. I am experiencing two curious behaviors which suggest that my eval board may have bad silicon.

1- When I put the chip in DC test mode with the main NCO programmed at 1GHz and the two amplitude registers set to 0x50ff, DAC0 outputs the tone as expected but DAC1 does not.  However, if I set up the Channel NCO to output a tone, both DACs behave correctly.  Is this normal behavior?  This behavior is consistent whether I use our own software or the FMC plugged into USB with ADI ACE.

2- We are using the DAC in dual JESD link mode.  The PRBS tests succeed, and the links set up and stabilize.  I get the CGS, FS, CHECKSUM, and ILA bits set for the links and lanes in use.  I have the proper JESD mode configured, and when I poll registers 0x450 and on, they report the proper settings from the JESD transmitter.  When I run the short_tpl test, it always passes... even when I program the check registers with a bad value!  However, if I don't disable the test mode as described in step 9 of the STPL test on page 44 of the data sheet, it always fails, even with the right test pattern programmed.  Furthermore, the chip will randomly not output data on DAC1 (or DAC0).  All of the status regs (470-473) show a good link, registers 4b0 - 4b7 show good links as well, and the sync out is solid high (viewed in ChipScope).  We've run out of options.  We suspect the silicon on the sample board may be bad. Is there any status we can read that confirms the DAC is streaming data on both ports?  To remedy the problem without reprogramming the device, we enter the PRBS test mode on the JESD links.  Incidentally, both links show pass and 0 error count on all lanes, and when we return from the test, the DACs successfully stream data.  We thought it might have to do with an overflow/underflow condition, so we even checked the FIFO status bits; they show that the FIFO is neither empty nor full, so the DAC is definitely able to keep up with the data flow.

3- Some other observations:  we are using physical lanes 4-7.  Lanes 4 and 5 are link 0, with the crossbar sending them to logical lanes 0 and 1 for link 0.  Lanes 6 and 7 are mapped to logical lanes 4 and 5, supplying lane 0 and lane 1 of link 1.  Again, every register that we can think of that shows status suggests we have a working link.  We just can't verify proper streaming.  For our test, the main NCO for each DAC is set to 1GHz, and the Channel NCO FTW is set to 0.  However, we have both main and channel NCOs configured to use the NCO because we have complex interpolation (24x) enabled (JESD MODE 3 with 8x and 3x interp for main and channel), so we are consistent with the data sheet (reg 112 bit 3 and reg 130 bit 6 are set) and we can verify proper behavior (when the chip is working).  From the init, the boot loader successful flag is set (reg 705), the DAC PLL is locked (register 7b5), the DAC DLL is locked (register c3), the DACs all complete calibration (register 52), the JESD mode is valid (register 110), the FTWs load (register 113), and the SERDES PLL locks (register 281).

Incidentally, all register addresses are in hex.  Just to clarify.
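
One way to hunt for a status bit that tracks the fault is to snapshot every status register named above once while both DACs are streaming and once while one is silent, then compare the two. The following is only a minimal sketch: `read_reg` is a hypothetical callable standing in for whatever SPI read routine is in use, the labels are just the descriptions given in this post, and no bit-level decoding is attempted.

```python
from typing import Callable, Dict, Iterable, Tuple

# Addresses (hex) are the ones listed in this post; labels paraphrase the post.
STATUS_REGS: Dict[int, str] = {
    0x705: "boot loader",
    0x7B5: "DAC PLL lock",
    0x0C3: "DAC DLL lock",
    0x052: "DAC calibration",
    0x110: "JESD mode valid",
    0x113: "FTW load",
    0x281: "SERDES PLL lock",
}
STATUS_REGS.update({a: "link status (470-473)" for a in range(0x470, 0x474)})
STATUS_REGS.update({a: "link status (4b0-4b7)" for a in range(0x4B0, 0x4B8)})


def snapshot(read_reg: Callable[[int], int],
             addrs: Iterable[int] = STATUS_REGS) -> Dict[int, int]:
    """Read each address once; read_reg is a hypothetical SPI read wrapper."""
    return {addr: read_reg(addr) & 0xFF for addr in addrs}


def diff_snapshots(good: Dict[int, int],
                   bad: Dict[int, int]) -> Dict[int, Tuple[int, int]]:
    """Return {addr: (good_value, bad_value)} for registers that differ."""
    return {a: (good[a], bad[a]) for a in good if a in bad and good[a] != bad[a]}


if __name__ == "__main__":
    # Made-up register contents, only to show the call pattern.
    fake_good = {a: 0x0F for a in STATUS_REGS}
    fake_bad = dict(fake_good)
    fake_bad[0x471] = 0x0E
    delta = diff_snapshots(snapshot(fake_good.__getitem__),
                           snapshot(fake_bad.__getitem__))
    for addr, (g, b) in sorted(delta.items()):
        print(f"0x{addr:03X} {STATUS_REGS[addr]}: good=0x{g:02X} bad=0x{b:02X}")
```

If the two snapshots come back identical, as the symptoms described here suggest they might, that at least confirms that none of these registers reflects the missing output.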

Please advise on how to determine the silicon is behaving as it should.  Thank you!

  • Hi cdarrow,

    For #1, note that you still need to set up a "JESD mode" in dual link, since this sets up clocking on the intended datapaths. So if you had set up mode0 as a single link, the DAC1 datapath will not be active.

    Consider paging both datapaths and DACs together, so you could write SPI at the same time and be sure that both are setup identically.

    It would be great to have a 2nd board on hand to compare results. You can also use any of the AD917x boards to compare mode0, in case you have one already.

    Have you tried using ACE to program the EVB without it being connected to the KC-705? This would verify whether the board functions - we know programming via ACE yields the expected results.

    Your comment that "DAC1 randomly will not output a signal" is a curious one. Is this an intermittent issue you are seeing with DAC1? Or is the lack of output on DAC1 consistent?

    Best Regards,

    Arik

  • > for #1, note that you still need to setup a "JESD mode" in dual link, since this sets up clocking on the intended datapaths. So if you had setup for mode0 as a single-link, the DAC1 datapath will not be active.

    I set up dual link mode 0 for both links (I run the entirety of table 53).  Also, I have the physical lanes mapped to logical lanes 0 and 1 for Link 0 and lanes 4 and 5 for link 1.  If what I said has nothing to do with what you’re suggesting then please expand on “setup JESD mode” :)

     

    > Consider paging both datapaths and DACs together, so you could write SPI at the same time and be sure that both are setup identically.

    I do (where possible). I also tried individually.  Same results.

     

    > It would be great to have a 2nd board on hand to compare results. You can also use any the AD917x boards to compare mode0, in case you have one already.

    We don’t.

     

    > Have you tried using ACE to program the EVB without it being connected to the KC-705? This would verify whether the board functions - we know programming via ACE yields the expected results.

    Yes.  That was my first test.  For the things that I can test in standalone mode, this fails in a similar fashion.  (See numbered item 1 in my initial email below).  I can get a DC tone on DAC0 most of the time when configuring the DC test tone on the Main NCO at 1GHz.  I never get a test tone from DAC 1 using the main NCO test tone.  However, I always (in the 5 tests I did) get a test tone from both DACs using the channel NCO.

     

    > your comment on "DAC1 randomly will not output a signal" is a curious one. Is this an intermittent issue you are seeing with DAC1? Or is the lack of output on DAC1 consistent?

    DAC 1 is “reliably intermittent”. DAC 0 is "unreliably intermittent."  I know that sounds odd, but I can count on DAC1 not producing a tone more times than not.  DAC 0 seems more resilient.  In some cases, DAC0 is outputting a waveform and DAC1 is not.  The JESD cores on the FPGA are being fed the same data stream (literally, the input data stream is duplicated in HW from a single source in RAM).  Every possible register that I can read which indicates status shows that both links are up and solid.  (See #3 below for which registers I’m reading).  And the FIFO bits are clear (so no overrun, no underrun). The sync is a solid high, and the FPGA is streaming data (verified with ChipScope on the MGT lines).  Is there a bit on the DAC that says: Yes, the QBD is de-framing and the data is going to the DAC?  I can easily check it.    Also, I checked the board for physical damage or loose components.  I found none.

    Generally speaking (see comments about test modes below):  The chip behavior and the manual (and ACE) are inconsistent.  I get it, it’s a complex chip and this is rev 0 of the data sheet, so it is bleeding edge.  Which is why I’m asking for your help. 

    Example 1:

    The STPL test procedure as described in the manual always passes, even if the wrong word is programmed into the check sequence.  Step 9 of the STPL test mode on page 44 of the data sheet (right hand column of the page) reads: “Wait for the desired time. The desired time is calculated as 1/(sample rate × BER). For example, given a bit error rate of BER = 1 × 10^−10 and a sample rate = 1 GSPS, the desired time = 10 sec. Then, set SHORT_TPL_TEST_EN to 0.” If I set SHORT_TPL_TEST_EN to 0 as the data sheet suggests, this test always passes, even when I program the check values to be wrong.  (Incidentally, that led to no less than two wasted days while I tried to figure out how STPL could pass while no data was streaming).  The fix was to not set SHORT_TPL_TEST_EN to 0.  Then everything works as expected.  That fix was simple.
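
The wait time in the quoted step is simple arithmetic, so it is easy to recompute for other rates. A small sketch of that formula follows; the function name is made up for illustration, and the 256 MSPS case is just the data rate used elsewhere in this thread, not a data sheet example.

```python
def stpl_wait_seconds(sample_rate_hz: float, ber: float) -> float:
    """Desired wait time from the quoted step: 1 / (sample rate x BER)."""
    if sample_rate_hz <= 0 or ber <= 0:
        raise ValueError("sample rate and BER must both be positive")
    return 1.0 / (sample_rate_hz * ber)


if __name__ == "__main__":
    # Data sheet example quoted above: 1 GSPS at BER = 1e-10, about 10 s.
    print(f"{stpl_wait_seconds(1e9, 1e-10):.2f} s")
    # Same BER target at a 256 MSPS data rate, about 39 s.
    print(f"{stpl_wait_seconds(256e6, 1e-10):.2f} s")
```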

    Example 2:

    Look at table 58 on page 72.  This is the end of the startup routine.  The write to register 0x085 says to set it to 0x13 and the comment reads: “Set to the default register value”.   That is correct, this is the default value.  BUT, on page 88 the recommendation is to set it to a different value at the “end of the startup routine”.  So which is it?  I took a look at ACE and dumped the configuration stream out of it and did a line for line compare with the manual… that showed me ACE was doing a lot of additional writes to the AD9173 that aren’t described and/or even mentioned in the data sheet.  And I don’t mean a few additional writes, I mean 5x to 10x more.  If you convert the startup sequence of the data sheet into instructions, it comes out to about 100-200 SPI write instructions (I don’t remember the exact number).  ACE is doing closer to 1000…. And if I try to decode ACE, a lot of the registers it is writing to are not documented in the data sheet.

  • To keep things simple, let's try getting the NCO-only mode working reliably. You should be able to get two NCO tones, one from DAC0 and one from DAC1. We have done this many times in the past...

    In ACE, you can set up two main NCOs only if dual-link is selected, for the reason I mentioned above: in single-link, the DAC1 datapath is left powered down.

    The red flag here is the intermittent behavior. How are you clocking the DAC? Direct clock or using the DAC PLL? I would expect intermittent issues if there is excessive phase noise (jitter), if the clock wanders (e.g. over temp), or if the clock input power is low. Note that on the EVB we have a balun that doesn't quite reach >8-9GHz (attempting to cover some 12GHz of bandwidth on the EVB, at a reasonable cost), so if bypassing the DAC PLL you may need some 16dBm to overcome the balun roll-off at 12GHz.

    Would you have a spectrum analyzer accessible? Keysight's PXA or similar.

    Noted on Examples #1 and #2. For #2 (Regarding ACE and the extra write), ACE was coded to have modular subroutines, so assuming a current DAC state is impossible. Writes are redundant in many cases. Many of them to page and re-page the correct DAC - the cost of object-oriented programming I suppose.

    For #1 I'd need to check. BER readback registers for the PRBS tests, for example, are PASS by default, and need to be reset before the start of each test, to show 0xFF. So PASS until FAIL, and the bits are sticky.

  • I’ll see about sending screenshots when I get back to the lab.  From what I remember, and reading from my design notes, when I run the PRBS test (the last item ADI mentions in their email), I run it for 1 minute.  At our data rate, that’s a lot of time, but I just wanted to make sure I wasn’t missing anything.  It always passes.  Error count 0 on all lanes.

     

    The clocking is: I’m using the onboard HMC7040.  FREF is 128 MHz. JESD Mode 3.  Number of links 2.  DAC PLL Div/2 enabled.  DAC rate 6.144 GSPS.  Data Rate 256 MSPS.  Lane Rate 5.12 Gbps. Sync is 8 MHz.* Using the DAC PLL.  DAC PLL always shows locked.  SERDES PLL always shows locked.  Various other lock bits always show locked (see item 3 in my original email).   The clock coming out of the ADC reference port is very clean, low phase noise and stable.  We have SA screenshots; I’ll have to get them when I’m back at the office.

     

    *These parameters are a result of us working within the limitations of the eval board clocking etc.  For evaluation purposes, this is fast enough to generate signals we can evaluate.  The final configuration will be 2x faster.  But long story short, our JESD and clocking configuration are consistent with ACE’s values.  We generate the 128 from the onboard 122.88.  HMC7040 reference divider is 4 (which gives us an LCM of about 30.72MHz), R1 is 2, N1 is 8, x2 is enabled, R2 is 2, N2 is 25, output divider of 24.   These settings are confirmed by ACE.
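
The numbers quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch under stated assumptions: it assumes the x2, R2, N2 and output divider act in sequence on the 122.88 MHz source, and it assumes M = 2 converters (I and Q) over L = 2 lanes per link with 16-bit samples and 8b/10b encoding, none of which is confirmed in this thread. The function names are made up for illustration.

```python
from fractions import Fraction


def pll_output_mhz(source_mhz: Fraction, doubler: bool,
                   r2: int, n2: int, out_div: int) -> Fraction:
    """Assumed chain: source -> optional x2 -> /R2 -> xN2 (VCO) -> /output divider."""
    pfd = source_mhz * (2 if doubler else 1) / r2
    return pfd * n2 / out_div


def lane_rate_mbps(data_rate_msps: Fraction, m: int, l: int,
                   bits_per_sample: int = 16) -> Fraction:
    """Standard JESD204B relation: lane rate = M x N' x (10/8) x data rate / L."""
    return data_rate_msps * m * bits_per_sample * Fraction(10, 8) / l


if __name__ == "__main__":
    fref = pll_output_mhz(Fraction("122.88"), doubler=True, r2=2, n2=25, out_div=24)
    print("FREF (MHz):", fref)                          # matches the 128 MHz quoted above

    data_rate = Fraction(6144) / (8 * 3)                # DAC rate / (main x channel interp)
    print("Data rate (MSPS):", data_rate)               # matches the 256 MSPS quoted above

    # M = 2, L = 2 per link is an assumption, chosen because it reproduces 5.12 Gbps.
    print("Lane rate (Mbps):", lane_rate_mbps(data_rate, m=2, l=2))
```

Using `Fraction` keeps the divider arithmetic exact, so a mismatch shows up as a genuinely different number rather than a floating-point rounding artifact.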

  • The first image is the ADC Divided clock output from the DAC "close in." The entire span is 1MHz and the RBW is 500Hz.  The spectral performance is really good.  That tells me the references are good and the DAC PLL is locked and able to generate a clean, locked tone.   The second set of images is the same clock output at 40 MHz span.  There are some spurs, but they are about 70 dB down and coincide with FPD etc.  The third file is a ChipScope image of the FPGA.  Notice that on the FPGA pins both sync pins are de-asserted.  When I captured this image, DAC B had no output, even though both JESD links were convinced that the system was up and running (and streaming data).

     

    Hope this helps. 

     

    Our conclusion: Given that the clocks are clean, data is present, and all indicators show the chip should be working, the random nature of this bug (DAC B and/or A not outputting any data even though the chip claims it is working) makes us believe we may simply have a silicon issue.

  • I would rule out a silicon issue at the moment. We have had the AD9173 in the field for some time and it went through rather stringent Eval pre-release. So very unlikely.

    Have you tried running the AD9173 in NCO-only mode, using the channelizer NCO's, using the same JESD204B mode?

    The JESD204B mode is needed to set up the datapath correctly. There is no need for data input or for actually setting up the link, but it would be good to set up the link anyway so you can switch it in and out. Once the link is up, set up NCO-only mode, which just means that the NCO will be supplied DC samples from an internal generator instead of the JESD204B PHY.

    If you can get a stable NCO tone at both DAC0 and DAC1 in NCO mode, it rules out a silicon issue. I would then focus on the JESD204B link as a potential culprit, mainly the clocking for both DAC and FPGA.

    If the issue persists, the issue is more likely related to clocking and/or supplies to the EVB, or a faulty EVB to begin with. 

    Are you seeing this issue on this particular EVB, or is it with more than one?

  • I’ve tried the NCO only mode.  Both as an FMC and as standalone with ACE.  Same problem in both configurations. Randomly one of the two channels won’t generate a tone.  Both with Main NCO and channel NCO.  

     

    Also, I didn’t mean silicon issue.  My mistake, I’m sorry.  I agree with you that we may have a faulty EVB.  We are still confident about using this chip in our design and are trying to rule out a faulty EVB.  I think the easiest way to do that is to simply swap out the evaluation board and see if the problems go away.  I suspect they will.  The randomness of the fault is what’s so surprising to us, and it is why we don’t think it’s a systemic issue.  If the problems don’t go away, then we (QED) have a lot more understanding to do, and possibly may need to have more detailed conversations with ADI about our exact setup etc.  But let’s cross that bridge if we get there.  I’m willing to bet it’s something simple!

  • Sounds good. In case this is NOT an EVB issue, working with a local FAE / FSE, if you have one, may make the response time a bit faster. Let's hope it is an EVB issue, however!