Hello
We have custom hardware with three ADRV9009 devices. These devices are controlled through the ADI Linux IIO device driver. We are using version 2019_R2. Our design is stable and we have manufactured multiple units.
However, under certain circumstances, we see problems with the ADRV9009 ARM when setting the LO frequency. We set the LO frequency in a range from 100MHz to 400MHz. Our sample rate is 204.8MSPS (166MHz BW), using a profile generated by the ADRV9009 TES and Filter Wizard.
To reproduce the problem, we make successive large jumps in the LO frequency, at least 100MHz per jump. For example, we set the LO frequency to 100MHz, 200MHz, 300MHz, 400MHz and then back to 100MHz. If we jump around randomly between these frequencies, and do it often enough, we eventually trigger a fault in the ADRV9009 ARM and dmesg starts printing the following messages:
[ 288.518808] adrv9009 spi1.2: ERROR: 247: TALISE_waitArmCmdStatus() failed due to thrown ARM error. ARM time out
[ 288.528998] adrv9009 spi1.2: adrv9009_set_radio_state: failed
[ 288.534782] adrv9009 spi1.2: adrv9009_set_radio_state: failed
[ 290.604279] adrv9009 spi1.2: ERROR: 179: ARM Mailbox Busy. Command not executed in TALISE_sendArmCommand()
[ 292.677449] adrv9009 spi1.2: ERROR: 179: ARM Mailbox Busy. Command not executed in TALISE_sendArmCommand()
[ 294.750656] adrv9009 spi1.2: ERROR: 179: ARM Mailbox Busy. Command not executed in TALISE_sendArmCommand()
[ 294.760313] adrv9009 spi1.2: adrv9009_set_radio_state: failed
[ 294.766096] adrv9009 spi1.2: adrv9009_set_radio_state: failed
[ 294.771873] adrv9009 spi1.2: ERROR: 439: TALISE_setRfPllFrequency() : Invalid rfpllLoFreq, rfPllLoFreq - TxProfileRFBW/2 must be > 0 (DC)
[ 296.877856] adrv9009 spi1.2: ERROR: 179: ARM Mailbox Busy. Command not executed in TALISE_sendArmCommand()
[ 297.888570] adrv9009 spi1.1: ERROR: 247: TALISE_waitArmCmdStatus() failed due to thrown ARM error. ARM time out
[ 297.898754] adrv9009 spi1.1: adrv9009_set_radio_state: failed
[ 297.904530] adrv9009 spi1.1: adrv9009_set_radio_state: failed
After this, the ADRV9009 responds slowly to requests, and other commands also appear to fail (for example adrv9009_set_radio_state, as you can see above). Another side effect of the ARM being in this broken state is that the temperature reading is completely erroneous. The only way to recover the ADRV9009 is to power cycle the complete unit. Rebooting Linux does not fix the problem.
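To see which device and which error code fails first, the dmesg lines above can be tallied per device. This is a minimal sketch that assumes the line layout shown in the log excerpt (timestamp, "adrv9009", SPI device name, "ERROR: <code>:"); the function name tally_errors is hypothetical and not part of the driver.

```python
import re
from collections import Counter

# Matches lines such as:
# [ 288.518808] adrv9009 spi1.2: ERROR: 247: TALISE_waitArmCmdStatus() failed ...
# The layout is assumed from the excerpt above; adjust if your kernel prints differently.
LINE_RE = re.compile(
    r"\[\s*(?P<t>\d+\.\d+)\]\s+adrv9009\s+(?P<dev>spi\d+\.\d+):\s+ERROR:\s+(?P<code>\d+):"
)


def tally_errors(log_text):
    """Count ERROR lines per (device, code) and record when each was first seen."""
    counts = Counter()
    first_seen = {}
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # e.g. "adrv9009_set_radio_state: failed" carries no error code
        key = (m.group("dev"), int(m.group("code")))
        counts[key] += 1
        first_seen.setdefault(key, float(m.group("t")))
    return counts, first_seen
```

Feeding it the output of dmesg and sorting first_seen by timestamp shows whether the ARM time out (247) always precedes the mailbox busy errors (179) on a given device.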
If we change the LO frequency in smaller steps (say 10MHz jumps), we cannot reproduce the problem. We also cannot reproduce it when no signal is connected to the RX. We can only reproduce the problem with the RX connected to an antenna and with large 100MHz frequency jumps.
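A minimal sketch of a stress loop for this kind of test is shown below. It assumes the LO is set through an IIO sysfs attribute named out_altvoltage0_TRX_LO_frequency in the device directory; that name, the device path, and the helper names (set_lo, hop_sequence, stress) are assumptions for illustration and should be checked against your own system.

```python
import random
import time
from pathlib import Path

# Assumed sysfs attribute for the TRX LO; verify the name on your target.
LO_ATTR = "out_altvoltage0_TRX_LO_frequency"
# The four LO values from the description, in Hz.
LO_STEPS_HZ = [100_000_000, 200_000_000, 300_000_000, 400_000_000]


def set_lo(device_dir, freq_hz):
    """Write one LO frequency in Hz; raises OSError if the write fails."""
    (Path(device_dir) / LO_ATTR).write_text(f"{freq_hz}\n")


def hop_sequence(count, rng):
    """Return `count` LO values where each hop moves by at least 100MHz."""
    seq = []
    prev = None
    for _ in range(count):
        nxt = rng.choice([f for f in LO_STEPS_HZ if f != prev])
        seq.append(nxt)
        prev = nxt
    return seq


def stress(device_dir, count, dwell_s=0.1, seed=0):
    """Apply random hops; return the index of the first failed write, or None."""
    rng = random.Random(seed)
    for i, freq in enumerate(hop_sequence(count, rng)):
        try:
            set_lo(device_dir, freq)
        except OSError as exc:
            print(f"hop {i}: setting {freq} Hz failed: {exc}")
            return i
        time.sleep(dwell_s)
    return None
```

On a target this would be called with the sysfs directory of one ADRV9009, for example stress("/sys/bus/iio/devices/iio:deviceN", 1000), where N is a placeholder for the actual device index. A fixed seed makes a failing hop sequence repeatable.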
We are only using the RX channels. We have tried disabling the RX tracking calibrations, but the fault still happens. We only run the initial calibrations at startup, and those complete without error, so this is not the result of a failed initial calibration (as I have seen elsewhere on the forums).
The ADRV9009 ARM code is a black box to us. Could you please provide some guidance or insight as to why the ADRV9009 ARM would fail in this way? I have posted another question on the Linux forum about retrieving any logging that is available.
Thank you in advance.
Gavin