We are using a BF548 rev. 0.2 with VDSP 5.0 Update 6.
We use FAT on an SD card extensively, and we have been banging our heads against a pesky problem for a few weeks.
Sometimes reading from or writing to the SD card fails, and the system freezes inside a semaphore pend (sem_pend()) call.
After tracing hundreds of runs, we found that every time sem_pend() hangs, it is shortly after an SD card error in adi_sdh_InterruptHandler.
The SDH STATUS register shows 0x0200 (START_BIT_ERROR); sem_pend() hangs as a consequence.
- It happens at random: sometimes within a few seconds, sometimes after more than 10 minutes of continuous use (with continuous read accesses to the SD card).
- We tested 4 different SD card brands, two or more samples each, on more than ten different boards.
- We think it is not a hardware issue, because we can switch to USB mass storage at run-time and read more than 600 Mbytes from the same card on the same board in the same run without any such error. And USB seems to use the same driver for low-level SD card access.
- It is not easy to reproduce on the EZ-KIT, because our app is tailored for a custom board; we suppose that reproducing the same access patterns in a toy example would be very difficult.
- It happens on both reads and writes.
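For reference, this is how we read the 0x0200 status value in our traces. It is a minimal sketch, not driver code: the names and the function sdh_decode_status() are our own, and the bit positions other than START_BIT_ERR = 0x0200 (which is what we observe) are assumed from the BF54x register definitions and should be checked against the hardware reference.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed SDH STATUS bit positions; only START_BIT_ERR is confirmed
 * by the value (0x0200) seen in our traces. */
#define SDH_DAT_CRC_FAIL   0x0002u
#define SDH_DAT_TIMEOUT    0x0008u
#define SDH_TX_UNDERRUN    0x0010u
#define SDH_RX_OVERRUN     0x0020u
#define SDH_DAT_END        0x0100u
#define SDH_START_BIT_ERR  0x0200u

/* The five "static" data error flags, combined into one mask */
#define SDH_DATA_ERROR_MASK (SDH_DAT_CRC_FAIL | SDH_DAT_TIMEOUT | \
                             SDH_TX_UNDERRUN | SDH_RX_OVERRUN |   \
                             SDH_START_BIT_ERR)

/* Non-zero if any data error flag is set in a captured status value */
static int sdh_is_data_error(uint32_t status)
{
    return (status & SDH_DATA_ERROR_MASK) != 0;
}

/* Write a space-separated list of the data-path flags set in `status`
 * into `out`; returns the number of flags found. */
static int sdh_decode_status(uint32_t status, char *out, size_t out_len)
{
    static const struct { uint32_t bit; const char *name; } flags[] = {
        { SDH_DAT_CRC_FAIL,  "DAT_CRC_FAIL"  },
        { SDH_DAT_TIMEOUT,   "DAT_TIMEOUT"   },
        { SDH_TX_UNDERRUN,   "TX_UNDERRUN"   },
        { SDH_RX_OVERRUN,    "RX_OVERRUN"    },
        { SDH_DAT_END,       "DAT_END"       },
        { SDH_START_BIT_ERR, "START_BIT_ERR" },
    };
    int count = 0;
    size_t i;

    assert(out != NULL && out_len > 0);
    out[0] = '\0';
    for (i = 0; i < sizeof flags / sizeof flags[0]; i++) {
        if (status & flags[i].bit) {
            if (count > 0)
                strncat(out, " ", out_len - strlen(out) - 1);
            strncat(out, flags[i].name, out_len - strlen(out) - 1);
            count++;
        }
    }
    return count;
}
```

Calling sdh_decode_status(0x0200u, buf, sizeof buf) on a captured value fills buf with the single flag name START_BIT_ERR, and sdh_is_data_error() reports it as a data error.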
So we added some tracing statements (to a RAM buffer for later inspection) in the adi_sdh_InterruptHandler() routine in file adi_sdh.c.
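The kind of RAM trace buffer meant here can be sketched as a small ring buffer that the ISR writes raw values into, with no formatting and no locking. This is an illustration only: trace_buf_t, trace_put() and the other names are hypothetical and are not the TRACE_MSG macros that appear in the fragment further down. It assumes a single writer (one ISR) and that the buffer is inspected only after the run has stopped.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_DEPTH 64u   /* number of retained entries; must be a power of two */

typedef struct {
    uint32_t tag;     /* caller-defined event id */
    uint32_t value;   /* e.g. the raw STATUS register value */
} trace_entry_t;

typedef struct {
    trace_entry_t     entries[TRACE_DEPTH];
    volatile uint32_t head;   /* total number of entries ever written */
} trace_buf_t;

static void trace_init(trace_buf_t *t)
{
    t->head = 0;
}

/* Constant-time store, intended to be cheap enough for an ISR.
 * Once the buffer is full, the oldest entry is overwritten. */
static void trace_put(trace_buf_t *t, uint32_t tag, uint32_t value)
{
    uint32_t slot = t->head & (TRACE_DEPTH - 1u);

    t->entries[slot].tag = tag;
    t->entries[slot].value = value;
    t->head = t->head + 1u;
}

/* Number of entries currently retained (at most TRACE_DEPTH) */
static uint32_t trace_count(const trace_buf_t *t)
{
    return t->head < TRACE_DEPTH ? t->head : TRACE_DEPTH;
}

/* Fetch the i-th retained entry, where 0 is the oldest */
static trace_entry_t trace_get(const trace_buf_t *t, uint32_t i)
{
    uint32_t n = trace_count(t);
    uint32_t first = t->head - n;

    assert(i < n);
    return t->entries[(first + i) & (TRACE_DEPTH - 1u)];
}
```

Storing the raw register value and decoding it later keeps the time spent inside the interrupt handler short, which matters when the problem being chased may itself be timing related.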
There are some things we don't understand:
- The ADI_SDH_STATUS_IS_DATA_TX_ERROR macro tests for 5 different "static" error flags of the STATUS register, but the
following ADI_SDH_STATUS_CLEAR_DATA_TX_ERROR clears just 3 of them. Why?
- The second if (ADI_SDH_STATUS_IS_DATA_TX_ERROR) follows a call to adi_sdh_SendCommand(), which always *clears* the status register
with an ADI_SDH_STATUS_CLEAR_ALL macro before returning. We do not understand why the same STATUS register is tested again shortly after.
If the *first* if (ADI_SDH_STATUS_IS_DATA_TX_ERROR | ADI_SDH_STATUS_DATA_END) was true because of an error, as in our case, the second if() should
be false, because the status register has already been cleared; so the pDevice->BusState = ADI_SDH_BUS_DEV_INT_XFER_FAILED; line would not execute.
- Could it be dangerous not to have an SSYNC() between clearing the status in adi_sdh_SendCommand() and reading the status immediately afterwards?
We found ourselves stopped on a breakpoint at the "ISR ERR" trace line, so the second if() was true, yet the debugger's SDIO register view showed that the status was
zero. It looks as if the code read 0x200 because STATUS had not yet been reset at the time of the read.
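To make the second point concrete, here is a small host-side model of the two tests. It is a sketch under stated assumptions, not the driver: fake_status stands in for the STATUS register, fake_send_stop_command() stands in for adi_sdh_SendCommand() and models only the "clears all status before returning" behaviour described above, and the bit values other than 0x0200 are assumed. It models the case where the clear has taken effect before the second read, which is the opposite of what we saw at the breakpoint; the point is that the outcome of the second test depends on that timing unless the status is latched first.

```c
#include <assert.h>
#include <stdint.h>

#define DAT_END        0x0100u   /* assumed bit position */
#define START_BIT_ERR  0x0200u   /* value seen in our traces */
#define DATA_ERR_MASK  (0x0002u | 0x0008u | 0x0010u | 0x0020u | START_BIT_ERR)

typedef enum { XFER_NONE, XFER_DONE, XFER_FAILED } xfer_result_t;

/* Hypothetical stand-in for the memory-mapped STATUS register */
static volatile uint32_t fake_status;

/* Hypothetical stand-in for adi_sdh_SendCommand(): clears all flags on exit */
static void fake_send_stop_command(void)
{
    fake_status = 0;
}

/* Variant A: tests the live register again after the stop command */
static xfer_result_t isr_reread(void)
{
    if (fake_status & (DATA_ERR_MASK | DAT_END)) {
        fake_send_stop_command();
        if (fake_status & DATA_ERR_MASK)
            return XFER_FAILED;
        return XFER_DONE;
    }
    return XFER_NONE;
}

/* Variant B: latches the status once and tests only the latched copy */
static xfer_result_t isr_latched(void)
{
    uint32_t status = fake_status;

    if (status & (DATA_ERR_MASK | DAT_END)) {
        fake_send_stop_command();
        if (status & DATA_ERR_MASK)
            return XFER_FAILED;
        return XFER_DONE;
    }
    return XFER_NONE;
}

/* Usage: feed both variants the status value from our traces */
static void demo_start_bit_error(void)
{
    fake_status = START_BIT_ERR;
    assert(isr_reread() == XFER_DONE);      /* error is reported as success */
    fake_status = START_BIT_ERR;
    assert(isr_latched() == XFER_FAILED);   /* error is preserved */
}
```

In this model, variant A turns an error into a normal completion whenever the clear lands before the re-read, while variant B gives the same answer regardless of when the register actually resets. Whether latching would be a safe change in the real handler is something we cannot judge without knowing why the driver re-reads the register.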
More importantly, we do not understand why this error happens only when the card is used through the file system service, and not when it is
driven by USB mass storage in exactly the same run.
We have already tried:
- formatting, swapping and changing brands of SD cards
- halving the SD card clock
- reducing the bus width from 4 to 1
- adding delays between accesses
- using #pragmas to place code and data in L1 or L2 memory
Any ideas? Is anyone else using the SD card heavily?
Here follows the code fragment from adi_sdh.c:
static ADI_INT_HANDLER(adi_sdh_InterruptHandler) /* SDH Interrupt handler */
ADI_SDH_DEF *pDevice = (ADI_SDH_DEF*)ClientArg;
TRACE_MSG_PRINTF(TRACE_MSGS_ISR | TRACE_MSGS_FSS_LOWLEVEL, "ISR sdh STATUS=%08X\n", (u32) *pADI_SDH_STATUS);
/* assume the interrupt is not for us */
Result = ADI_INT_RESULT_NOT_PROCESSED;
/* SDH data transfer error or SDH data transfer complete */
if (ADI_SDH_STATUS_IS_DATA_TX_ERROR | ADI_SDH_STATUS_DATA_END)
/* Issue a Stop command to terminate data transfer */
/* CMD12, short response, stuff bits as argument, Timeout for ID mode */
adi_sdh_SendCommand ( pDevice,
ADI_SD_MMC_CMD_STOP_TRANSMISSION | ADI_SDH_SD_MMC_CMD_SHORT_RESPONSE,
ADI_SDH_STATUS_CMD_RESPONSE_TIMEOUT_FIELD | ADI_SDH_STATUS_CMD_CRC_FAIL_FIELD,
/* If SDH data transfer results in error */
// WHEN WE ARRIVE HERE, THE FILESYSTEM WILL (LATER) HANG IN A SEM_PEND() FUNCTION, BECAUSE A SEM_POST() IS MISSED.
TRACE_MSG(TRACE_MSGS_ISR | TRACE_MSGS_FSS_LOWLEVEL, "ISR ERR sdh\n");
/* clear data Tx error status */
/* Set data transfer as failed */
pDevice->BusState = ADI_SDH_BUS_DEV_INT_XFER_FAILED;
/* ELSE, SDH Data transfer complete */
TRACE_MSG(TRACE_MSGS_ISR | TRACE_MSGS_FSS_LOWLEVEL, "ISR DATA END OK\n");
/* clear data complete status flag */
/* update SDH bus state as SDH device interrupt data transfer complete */
pDevice->BusState |= ADI_SDH_BUS_DEV_INT_XFER_DONE;
Result = ADI_INT_RESULT_PROCESSED;
/* process this interrupt */
/* Card detect interrupt occurred? */
if (ADI_SDH_E_STATUS_IS_SD_CARD_DETECT_ENABLED && ADI_SDH_E_STATUS_IS_SD_CARD_DETECTED)
TRACE_MSG(TRACE_MSGS_ISR | TRACE_MSGS_FSS_LOWLEVEL, "ISR CARD DETECT\n");
/* Reset SDH registers */
/* Enable SDH bus */
/* Reset SD/MMC/SDIO card information table */
/* update SD slot status */
if (pDevice->SlotStatus == ADI_SDH_SLOT_EMPTY)
/* Mark SD slot status as new media inserted */
pDevice->SlotStatus = ADI_SDH_SLOT_MEDIA_INSERTED;
/* The slot previously had valid media.
   Mark SD slot status as media removed so that we can unmount any volumes previously mounted on SDH */
pDevice->SlotStatus = ADI_SDH_SLOT_MEDIA_REMOVED;
Result = ADI_INT_RESULT_PROCESSED;
TRACE_MSG_PRINTF(TRACE_MSGS_ISR | TRACE_MSGS_FSS_LOWLEVEL, "ISR SDH %08X\n", (u32) Result);
/* return */