ADSP-21262, div by zero corrupts ISR?

(Not sure, maybe this should be on VisualDSP++ side)

I have a custom board with an ADSP-21262 that I'm debugging via an HPUSB-ICE.

For some unknown reason the code goes haywire, and when I halt (Shift-F5) the Disassembly window shows the ISR area (0x80000-0x800FF) filled with garbage.

I assume this is triggered by a division by zero, because I fixed one function that I knew could divide by zero.

Fixing it made the crashes significantly less frequent.

Being fairly new to this environment, I'm not sure how to dive in or where to look for the bug.

I don't want to add an if(arg==0) check to every division whose divisor I can't vouch for, as that would take many weeks: the codebase is large and most of it is fairly time-critical.

How can I detect whether the cause was a division by zero or something else?

Typically, right after Shift-F5, the program appears to be inside an interrupt that is not used (at an RTI instruction).

I've added breakpoints to IICDI, SOVF, BKP and also to all of the interrupts that I'm not using.

The idea is to see when the program enters that area (I tried hardware breakpoints as well).

My embedded/DSP experience dates from "a few moons ago" and I admit I'm no expert on VDSP++, but I looked for some kind of trace feature to see where the program actually was before it failed. No luck.

Also, shouldn't the emulator jump to IICDI or SOVF if a division by zero occurs?

I was using the original version (5.0.0) of VDSP++, so I created a clone and upgraded it with the 10.1 package, but there was no change.

The reason for the upgrade was these anomalies, which I thought might have something to do with my issues:

06000028 Incorrect Popping of stacks possible when exiting IRQx/Timer Interrupts with (DB) modifiers.
06000020 Indirect jumps or calls followed by Long Word accesses using PM bus ...

Anyway, all hints are appreciated.

  • Analog Employees, on Feb 15, 2013 2:23 PM

    Hi,

    As far as I understand, a divide-by-zero operation should set the invalid exception.

    So when you halt the code, could you look into the STKYx/STKYy registers to check the status of the PEx/PEy operations and the program sequencer stacks? That should give a better idea of the source of these errors.

    1. How are you sure that the problem you are facing is caused only by the division-by-zero operation?
    2. What observation made you assume that this problem is initiated by a division by zero?
    3. Kindly provide more information about your application, which will help us understand the issue.
    4. Also explain the code flow in your application, so that we can check whether you are doing things correctly.
    5. You can also try removing portions of the code that you feel should not cause this error by commenting them out (for debugging purposes only), making the code as simple as possible. This will help us narrow down the issue.

    For handling the division-by-zero condition, I would suggest a wrapper function around the division library function in your code. The wrapper always checks the operand and, when the divisor is zero, generates an interrupt. This ensures that the condition is handled by the code itself whenever and wherever it occurs.

    Kindly provide the above information so that we can assist you better with the issue.
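
    A minimal sketch of such a wrapper in C, under stated assumptions: the names safe_div, g_div_fault and site_id are hypothetical and not part of any VisualDSP++ library, and instead of raising an interrupt it records the offending call site in a global, so a single breakpoint on the fault path can catch every zero divisor.

```c
/* Hypothetical fault record (illustrative names, not an ADI API). */
typedef struct {
    unsigned int count;     /* number of zero divisors seen so far       */
    int          last_site; /* caller-supplied ID of the latest offender */
} div_fault_t;

static volatile div_fault_t g_div_fault = { 0u, -1 };

/* Divide num by den. On a zero divisor, log the call site and return
 * the caller-chosen fallback value instead of performing the division. */
static float safe_div(float num, float den, float fallback, int site_id)
{
    if (den == 0.0f) {
        g_div_fault.count++;              /* set a breakpoint here */
        g_div_fault.last_site = site_id;
        return fallback;
    }
    return num / den;
}
```

    Giving each call site its own site_id (for example a small enum) means that, after a halt, g_div_fault.last_site shows which division last saw a zero divisor. The extra compare costs a few cycles per call, so it may only be acceptable in time-critical paths during debugging.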

    Thanks,

    Harshit

  • 1. I'm not sure. When I added wrappers to some of the divs, the crashes occurred less often. However, they do still occur.

    2. I noticed that one particular function returned invalid values, due to a div by zero, due to a loose RF cable... and when I wrapped it so that no div by zero could happen, the crashes occurred less often.

    3. The DSP talks back and forth with an FPGA. It's complicated, but I've now narrowed the problem down to external interrupt handling using DAIHI. The handler itself is fairly simple: clear the interrupt, write 32 bits to an FPGA address using DMA (8 bits at a time), and return. Whenever I increase the number of interrupts, the system stops (there's a watchdog output driving a blinking LED), and if I then halt (Shift-F5) I can see that the PC is in the ISR area and the assembly code no longer matches the original code, i.e. it is corrupted.

    4. The code is fairly mature, several years old. However, we haven't used the external IRQ this much before. It's a "typical" while (1) main loop with interrupt handlers taking care of serial transfers. No operating system.

    5. I've ripped most of the code out and am just concentrating on DAIHI itself.

    I tried generating a div by zero and yes, it does set the flto and flti, so that is taken care of.

    I guess my challenge is that I cannot halt the program (or don't know when to) at the moment it corrupts the ISR, and therefore cannot pinpoint the line of code that introduces the corruption.

    I've tried hardware breakpoints (to catch anything actually being written to the ISR address space) but no luck.

  • OK, I probably should have written IVT instead of ISR, and the title of this thread should be something else too, but I don't know how to edit it afterwards.

    Right now the problem is around the interrupt handler(s).

    What is the official way of creating one?

    My starting template (all asm):

    // Init DSP
    ...
    bit set imask DAIHI; // Enable DAIHI interrupt
    bit set mode1 IRPTEN; // Enable interrupts in general
    ...
    // Somewhere in Runtime Header Segment, defining IVT
    ...
    ___lib_DAIHI: // INT(DAIH)
         JUMP IRQ_DAI_EXT;
         RTI; RTI; RTI;
    ...
    // and then the actual service
    ...
    IRQ_DAI_EXT:
         BIT CLR MODE1 IRPTEN; //Disable interrupts
         NOP; NOP;
         r4=MY_IO_ADDR;
         r8=0;
         call Write_io;  // Writes zero to FPGA
         NOP; NOP;
         BIT SET IMASK DAIHI;     // Enable DAI interrupt, do we need this really?
         BIT SET MODE1 IRPTEN; // Enable interrupts
         RTI;
    IRQ_DAI_EXT.end:
    ...
    
    

    Do you see anything wrong with that?

    I thought the stack(s) could overflow due to bad stack handling (for example inside the subroutine Write_io), but there's not much to look at, and after some manual reading it looks like RTS/RTI should take care of most of it anyway.

    And the "original" problem is: when I generate a lot of external interrupts, the DSP ends up with a corrupted IVT.

    Appreciate your comments

  • Analog Employees, on Feb 27, 2013 12:56 PM

    Hi,

    The template for handling the external interrupt source looks correct to me.

    I have a few questions that will help me understand the issue better:

    1. When you say that the problem happens when a lot of external interrupts take place, are you talking about the frequency of these interrupts or about an increased number of interrupt sources in your system?
    2. What is the purpose of the other interrupts in your code?
    3. Are you writing the code in C or in assembly? In C, the compiler takes care of handling the stacks, but if you are writing assembly routines, creating the stack frame is the responsibility of the called function and has to be handled manually in your code. Please refer to the VisualDSP++ 5.0 compiler manual for more details. The link is given below:

    http://www.analog.com/static/imported-files/software_manuals/50_21k_cc_man.rev1.2.pdf

    Please go through the manual linked above and provide the details requested.

    Please let me know if you have any further questions.

    Thanks,

    Harshit

  • 1. Right now, the code is reduced to handling only DAI (the external interrupt). It used to seem that only a high interrupt frequency (~100 Hz to 1 kHz) caused the problem, but now I can reproduce it just by switching the output by hand (less than 1 Hz), and the crash appears within the first 1 to 15 interrupts.

    2. The other interrupts are mostly serial communication (data ready?), timer and Ethernet interface (SPORT also), but I've disabled those while debugging. They don't appear to make the problem any less or more probable.

    3. Most of the code is C++. This particular part is 100% asm. => I will study the manual more carefully.

    The original challenge still exists: how do you debug something (on target hardware) when you don't know the exact point of the crash? The PC stack gives no clue after the crash because it just points into the middle of other interrupts within the IVT.

    I'd appreciate your guidance for a rookie like me.