
Possible Hang of SDH driver under VDK

Question asked by demonb on Nov 15, 2013
Latest reply on Dec 4, 2013 by demonb

Hi!

 

In our projects we use the ADI SDH driver from ADI SSL and Drivers.

Usually it's a VDK project with multiple threads, and we always have one thread that writes to the SD card.

 

During testing I saw a very rare situation in which the "SD writer" thread stops working and hangs while writing to the SD card.

One of the problems was due to a chip anomaly, described here: http://ez.analog.com/thread/16542

Avoiding this anomaly helped me a lot.

 

But sometimes I still see the SDH driver hang, this time in a different place.

 

My current application has many other threads. The "SD writer" thread does not have the highest priority, so other threads can preempt it for a long time.

During recording we get the following problem: the SDH driver loops forever in the adi_sdh_SendCommand function (adi_sdh.c):

 

    if (ErrorStatus|SuccessStatus)
    {
        while (!(*pADI_SDH_STATUS & (ErrorStatus|SuccessStatus))); //<-- hang here forever
    }

Looking at the SDH registers, I see that a command has already completed, but it's not the command that was issued in this function.

ADI_SD_MMC_CMD_READ_MULTIPLE_BLOCKS was issued, but the register shows the ADI_SD_MMC_CMD_STOP_TRANSMISSION command.

 

After some thinking and debugging, I suspect that something like this happens:

 

    "SD writer" calls fseek
    ... // long call stack of ADI SSL functions
    ...
    adi_sdh_InitiateMemAccess()
    adi_sdh_SendCommand(ADI_SD_MMC_CMD_READ_MULTIPLE_BLOCKS | ADI_SDH_SD_MMC_CMD_SHORT_RESPONSE)
    {
        ADI_SDH_ARGUMENT_SET_VALUE(CmdArgument);

        ADI_SDH_COMMAND_SET_VALUE(CmdRegVal | ADI_SDH_SD_MMC_CMD_ENABLE_COMMANDS);

        // Somewhere here another thread preempts the "SD writer" thread for a long time (10-30 ms).
        // The command has already been sent to the SD card and data transmission has started.
        // An interrupt occurs, meaning data transmission is over, and its handler
        // calls adi_sdh_SendCommand(ADI_SD_MMC_CMD_STOP_TRANSMISSION).
        // After that, execution returns here.
        // By now our command is already done and its status bits have been
        // cleared by the other command, so we hang here.

        if (ErrorStatus|SuccessStatus)
        {
            while (!(*pADI_SDH_STATUS & (ErrorStatus|SuccessStatus))); // hangs here
        }

        if (*pADI_SDH_STATUS & ErrorStatus)
        {
            Result = *pADI_SDH_STATUS;
        }

        *pADI_SDH_STATUS_CLEAR = (ErrorStatus|SuccessStatus);

        return (Result);
    }
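The suspected sequence can be checked off-target with a small host-side model. This is only a sketch of the hypothesis, not the driver: `mock_status`, `CMD_DONE_BIT` and all function names are hypothetical, the status register is modeled as a plain variable with a single flag, and the nested STOP_TRANSMISSION call is assumed to clear the same flag the outer call is waiting for.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the SDH status register (not the real MMR). */
static volatile uint32_t mock_status;

#define CMD_DONE_BIT 0x1u /* hypothetical "command completed" flag */

/* Model: an issued command completes at once and raises the flag. */
static void mock_issue_command(void)
{
    mock_status |= CMD_DONE_BIT;
}

/* Model of the data-done interrupt handler: it sends STOP_TRANSMISSION
 * through the same routine, which ends by clearing the shared status flag. */
static void mock_isr_stop_transmission(void)
{
    mock_issue_command();
    mock_status &= ~CMD_DONE_BIT; /* status clear done by the nested call */
}

/* Returns true if the poll saw the flag, false if it gave up.
 * 'preempted' selects whether the handler runs between the command
 * write and the start of the poll loop. */
bool send_command_model(bool preempted, unsigned max_polls)
{
    mock_status = 0;
    mock_issue_command();             /* READ_MULTIPLE_BLOCKS completes */
    if (preempted) {
        mock_isr_stop_transmission(); /* runs before polling begins */
    }
    for (unsigned i = 0; i < max_polls; ++i) {
        if (mock_status & CMD_DONE_BIT) {
            mock_status &= ~CMD_DONE_BIT;
            return true;
        }
    }
    return false; /* an unbounded loop would spin here forever */
}
```

Without preemption the poll sees the flag immediately; with the handler running in the gap, the flag is already gone and the poll never succeeds, which matches the observed hang.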

 

I don't really know what the best fix is in this situation.

I see several options:

1) Add a timeout to the wait loop. This doesn't solve the underlying problem; it only keeps the thread from hanging.

 

    _GET_CYCLE_COUNT(start_count);
    final_count = start_count;

    /* cross-check Error/Success status value with present Status register value */
    if (ErrorStatus|SuccessStatus)
    {
        while (!(*pADI_SDH_STATUS & (ErrorStatus|SuccessStatus)))
        {
            _GET_CYCLE_COUNT(final_count);
            if ((final_count - start_count) > CMD_TO)
                break;
        }
    }

 

2) Disable interrupts before issuing the command and keep them disabled during the wait loop. It's not very elegant, but it is reliable.

 

    /* set command argument */
    ADI_SDH_ARGUMENT_SET_VALUE(CmdArgument);

    u16 m = cli();

    /* configure command register */
    ADI_SDH_COMMAND_SET_VALUE(CmdRegVal | ADI_SDH_SD_MMC_CMD_ENABLE_COMMANDS);

    /* cross-check Error/Success status value with present Status register value */
    if (ErrorStatus|SuccessStatus)
    {
        while (!(*pADI_SDH_STATUS & (ErrorStatus|SuccessStatus)));
    }

    /* On status error, pass the status register value as result */
    if (*pADI_SDH_STATUS & ErrorStatus)
    {
        Result = *pADI_SDH_STATUS;
    }

    /* clear all SDH status bits */
    *pADI_SDH_STATUS_CLEAR = (ErrorStatus|SuccessStatus);

    sti(m);

 

Or better, both variants together.
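A combined version might look roughly like the host-side sketch below. Everything prefixed `mock_` is a hypothetical stand-in (for the status register, the cycle counter and `cli()`/`sti()`), and the `wait_result` codes are invented for illustration; the real function returns the status register value on error. Two details seem worth having in any variant: a timeout should produce a result the caller can tell apart from success, and the interrupt mask should be restored on every exit path.

```c
#include <stdint.h>

/* Hypothetical host-side stand-ins; on target these would be the SDH
 * status register, the core cycle counter and cli()/sti(). */
static volatile uint32_t mock_status;
static uint64_t mock_cycles;
static uint32_t mock_imask = 0xFFFFu;

static uint64_t mock_cycle_count(void) { return mock_cycles += 10; }
static uint32_t mock_cli(void) { uint32_t m = mock_imask; mock_imask = 0; return m; }
static void mock_sti(uint32_t m) { mock_imask = m; }

/* Hypothetical result codes, not the driver's own. */
enum wait_result { WAIT_DONE = 0, WAIT_ERROR = 1, WAIT_TIMEOUT = 2 };

enum wait_result wait_command_done(uint32_t error_bits,
                                   uint32_t success_bits,
                                   uint64_t timeout_cycles)
{
    const uint32_t wait_bits = error_bits | success_bits;
    enum wait_result result = WAIT_DONE;

    uint32_t saved = mock_cli(); /* mask interrupts before the command write */
    /* (the command register write would go here) */

    if (wait_bits != 0) {
        const uint64_t start = mock_cycle_count();
        while ((mock_status & wait_bits) == 0) {
            /* unsigned subtraction stays correct if the counter wraps */
            if (mock_cycle_count() - start > timeout_cycles) {
                result = WAIT_TIMEOUT;
                break;
            }
        }
    }
    if (result == WAIT_DONE && (mock_status & error_bits) != 0) {
        result = WAIT_ERROR;
    }

    mock_status &= ~wait_bits; /* models the status clear */
    mock_sti(saved);           /* restore the mask on every path */
    return result;
}
```

With interrupts masked the timeout should never fire in normal operation, so it acts purely as a safety net against an unexpected hang.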

 

I measured the maximum time spent in this function with interrupts disabled. With a Transcend SDHC card (Class 10) it was about 0.4 ms. I believe most commands take less time.
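A measured time like this can be turned into a cycle-count timeout value such as CMD_TO. The helper below is a hypothetical sketch, and the 400 MHz core clock in the usage note is only an assumed example value, not something measured on this board.

```c
#include <stdint.h>

/* Convert a duration in microseconds to core clock cycles.
 * core_hz is the core clock frequency the cycle counter runs at. */
uint64_t timeout_cycles(uint32_t timeout_us, uint32_t core_hz)
{
    /* widen before multiplying so the product cannot overflow 32 bits */
    return ((uint64_t)timeout_us * core_hz) / 1000000u;
}
```

For example, at an assumed 400 MHz core clock the measured 0.4 ms corresponds to `timeout_cycles(400, 400000000u)`, which is 160000 cycles; a timeout with a tenfold margin (4 ms) would be 1600000 cycles.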

 

I've attached a picture showing the SDH registers and the call stack in the hung state.

 

I hope this helps, and I'd be glad to hear any verdict.

 

And sorry for my English if anything is wrong.

Dmitry.
