2011-08-02 10:04:00     random CPLB misses - kernel OOPS

Document created by Aaronwu Employee on Aug 27, 2013

Timothy Stotts (UNITED STATES)

Message: 102787   

 

I am getting random CPLB misses when running three processes that communicate over System V IPC. One of the three processes spawns a network worker thread via pthread.h. What do these CPLB misses mean, and how can I fix them? The binaries are compiled as ELF with: -O0 -g -lpthread. All critical regions of System V IPC are guarded with semaphores, and all critical regions of pthread interaction are guarded with mutexes; the two sections of code do not overlap.

 

 

 

Data access CPLB miss

<5> - Used by the MMU to signal a CPLB miss on a data access.

Kernel OOPS in progress

Deferred Exception context

CURRENT PROCESS:

COMM=ps PID=592  CPU=0

TEXT = 0x01e684b0-0x01eb74c4        DATA = 0x01484014-0x01487620

BSS = 0x01487620-0x01780000  USER-STACK = 0x0179fec0

 

return address: [0x0001a328]; contents of:

0x0001a300:  5408  0c00  1c07  e14a  001a  e10a  45f4  9110

0x0001a310:  0040  3041  0010  0000  3210  3001  0c42  1c20

0x0001a320:  6410  3200  67f0  5e82 [9152] 0c42  1819  6061

0x0001a330:  40c1  e120  fed4  5208  3200  5a42  0c41  1c11

 

ADSP-BF537-0.3 525(MHz CCLK) 131(MHz SCLK) (mpu off)

Linux version 2.6.34.7-ADI-2010R1 (adi@colinux) (gcc version 4.3.5 (ADI-2010R1-RC4) ) #8 Mon Aug 1 16:21:28 EDT 2011

 

SEQUENCER STATUS:               Not tainted

SEQSTAT: 00000026  IPEND: 8008  IMASK: ffff  SYSCFG: 0006

  EXCAUSE   : 0x26

  physical IVG3 asserted : <0xffa00798> { _trap + 0x0 }

  physical IVG15 asserted : <0xffa00e1c> { _evt_system_call + 0x0 }

  logical irq   6 mapped  : <0xffa00420> { _bfin_coretmr_interrupt + 0x0 }

  logical irq  10 mapped  : <0x000da4c8> { _bfin_rtc_interrupt + 0x0 }

  logical irq  18 mapped  : <0x000a9330> { _bfin_serial_dma_rx_int + 0x0 }

  logical irq  19 mapped  : <0x000a90a8> { _bfin_serial_dma_tx_int + 0x0 }

  logical irq  24 mapped  : <0x000d4014> { _bfin_mac_interrupt + 0x0 }

RETE: <0x00000000> /* Maybe null pointer? */

RETN: <0x0128bd7c> /* kernel dynamic memory (maybe user-space) */

RETX: <0x00000480> /* Maybe fixed code section */

RETS: <0x00061064> { _pid_revalidate + 0x14 }

PC  : <0x0001a328> { _get_pid_task + 0x10 }

DCPLB_FAULT_ADDR: <0x52202028> /* reserved memory */

ICPLB_FAULT_ADDR: <0x0001a328> { _get_pid_task + 0x10 }

PROCESSOR STATE:

R0 : 00000000    R1 : 00000000    R2 : 00000007    R3 : 016946e0

R4 : 0128be38    R5 : 00000000    R6 : 00000004    R7 : 014dc494

P0 : 00000002    P1 : 014dc494    P2 : 52202028    P3 : 014dc494

P4 : 0159ba34    P5 : 014df6d4    FP : 0128be90    SP : 0128bca0

LB0: 0008aa7e    LT0: 0008aa76    LC0: 00000000

LB1: 00040eec    LT1: 00040ee4    LC1: 00000ff0

B0 : 00000137    L0 : 00000000    M0 : fffffffc    I0 : 001a45f4

B1 : 000000c0    L1 : 00000000    M1 : 00000001    I1 : 0128be38

B2 : 7ffff000    L2 : 00000000    M2 : 00001802    I2 : 00000003

B3 : 00000000    L3 : 00000000    M3 : 0000005b    I3 : 00000006

A0.w: 00000038   A0.x: 00000000   A1.w: 00000038   A1.x: 00000000

USP : 0179fca4  ASTAT: 02001005

 

Hardware Trace:

   0 Target : <0x00003b8c> { _trap_c + 0x0 }

     Source : <0xffa0072a> { _exception_to_level5 + 0x96 } JUMP.L

   1 Target : <0xffa00694> { _exception_to_level5 + 0x0 }

     Source : <0xffa00552> { _bfin_return_from_exception + 0x6 } RTX

   2 Target : <0xffa0054c> { _bfin_return_from_exception + 0x0 }

     Source : <0xffa005f4> { _ex_trap_c + 0x74 } JUMP.S

   3 Target : <0xffa00498> { _ex_dcplb_miss + 0x0 }

     Source : <0xffa007c2> { _trap + 0x2a } JUMP (P4)

   4 Target : <0xffa00798> { _trap + 0x0 }

      FAULT : <0x0001a328> { _get_pid_task + 0x10 } P2 = [P2]

     Source : <0x0001a326> { _get_pid_task + 0xe } 0x5e82

   5 Target : <0x0001a318> { _get_pid_task + 0x0 }

     Source : <0x00061060> { _pid_revalidate + 0x10 } CALL pcrel

   6 Target : <0x00061050> { _pid_revalidate + 0x0 }

     Source : <0x0003fc84> { _do_lookup + 0x10c } CALL (P2)

   7 Target : <0x0003fc82> { _do_lookup + 0x10a }

     Source : <0x0003fbbc> { _do_lookup + 0x44 } IF !CC JUMP pcrel (BP)

   8 Target : <0x0003fba8> { _do_lookup + 0x30 }

     Source : <0x000461fc> { ___d_lookup + 0xc4 } RTS

   9 Target : <0x000461f4> { ___d_lookup + 0xbc }

     Source : <0x000461da> { ___d_lookup + 0xa2 } JUMP.S

  10 Target : <0x000461ba> { ___d_lookup + 0x82 }

     Source : <0x0008aa84> { _memcmp + 0x48 } RTS

  11 Target : <0x0008aa6e> { _memcmp + 0x32 }

     Source : <0x0008aa46> { _memcmp + 0xa } IF CC JUMP pcrel

  12 Target : <0x0008aa3c> { _memcmp + 0x0 }

     Source : <0x000461b6> { ___d_lookup + 0x7e } JUMP.L

  13 Target : <0x000461aa> { ___d_lookup + 0x72 }

     Source : <0x0004619c> { ___d_lookup + 0x64 } IF CC JUMP pcrel

  14 Target : <0x00046182> { ___d_lookup + 0x4a }

     Source : <0x000461f2> { ___d_lookup + 0xba } JUMP.S

  15 Target : <0x000461e8> { ___d_lookup + 0xb0 }

     Source : <0x000461e2> { ___d_lookup + 0xaa } IF !CC JUMP pcrel (BP)

Kernel Stack

Stack info:

SP: [0x0128bf24] <0x0128bf24> /* kernel dynamic memory (maybe user-space) */

Memory from 0x0128bf20 to 0128c000

0128bf20: 00000034 [01bfaf58] 00008000  00000000  00000000  0128c000  01bfaf58  01bfaf58

0128bf40: 01e6f1d4  ffa00e80  02000021  01c095c7  01c0b381  01c095c6  01c0b37e  00000000

0128bf60: 00000000  00000038  00000000  00000038  00000000  00000000  7ffff000  000000c0

0128bf80: 00000137  00000000  00000000  00000000  00000000  0000005b  00001802  00000001

0128bfa0: fffffffc  00000006  00000003  01444800  01eb4a80  0179fca4  0179fcb0  01485520

0128bfc0: 01444a9c  01445250  0178001c  01bfaf3c  00000005  01485520  0179fd2c  00000034

0128bfe0: 0179fd68  0179fce8  00000000  00000000  0179fce8  0179fce8  00000005  00000006

Return addresses in stack:

Modules linked in: bfin_wdt

Kernel panic - not syncing: Kernel exception

Hardware Trace:

Stack info:

SP: [0x0128bbc4] <0x0128bbc4> /* kernel dynamic memory (maybe user-space) */

FP: (0x0128bef8)

Memory from 0x0128bbc0 to 0128c000

0128bbc0: 00000003 [0015b1e8] 0012b56e  0128bca0  0015b1e8  0019621e  0019621e  0019621e

0128bbe0: 0128bbf4  00003ee6  0128bca0  00008008  00000001  0000001f  ffffffff  00000004

0128bc00:<00012282> 00030001  0002905a  001a6ea4  00166964  0008a76c  0128bd50  0160509e

0128bc20: 00000003  0128bca4  0128bd50  0160509f  0128bca4  0000ffff  01606000  00000000

0128bc40: 00000000  6c0a020a  ffffffff  00166968  6c0a0200  ffffffff  013cf800  0004b1d2

0128bc60: 011141a0  00000000  00000000  03816a27  00001000  00000000  00000000  ffa0072e

0128bc80: 00194000  00008008  00000026  00000000  0128be38  001a6750  0019b63c  00000155

0128bca0: 00000480  00008008  00000026  00000000  0128bd7c  00000480  0001a328  00061064

0128bcc0: 00000000  02001005  00040eec  0008aa7e  00040ee4  0008aa76  00000ff0  00000000

0128bce0: 00000038  00000000  00000038  00000000  00000000  7ffff000  000000c0  00000137

0128bd00: 00000000  00000000  00000000  00000000  0000005b  00001802  00000001  fffffffc

0128bd20: 00000006  00000003  0128be38  001a45f4  0179fca4  0128be90  014df6d4  0159ba34

0128bd40: 014dc494  52202028  014dc494  00000002  014dc494  00000004  00000000  0128be38

0128bd60: 016946e0  00000007  00000000  00000000  00000000  00000002  00000006  0128be30

0128bd80: 0159ba34  0128be30 <0003fc86> 0128be30  0159ba34  0128be30  00000000  000412cc

0128bda0: 0152c90c  0100e2a0  00000024  00040c98  0128be30  0159ba34  0128be90  0128be90

0128bdc0: 00000004  00000000  00000000  0003b0ca  0019c4fc  00000000  00000004  0128be98

0128bde0: 000422bc  ffffff9c  0128be30  01445250  0128be90  00000004  00000000  00000024

0128be00: 00000000  00000000  0128be84  00000024  00000000  01504000  0128a000  0128be44

0128be20: 0128a000  00000000  00000001  00000000  0100e2a0  0159ba34  00b118fa  00000007

0128be40: 0150400a  0100e320  010ed214  00000101  00000000  00000000  011141a0  0179f8b0

0128be60: 0004ad18  011141a0  0152ccb4  00000010  01113a40  0004ad66  011059a0  00000001

0128be80: 00000000  01694660  01485520  00040e8a  0101d440  00040ec0  0179fcb0  00038dd4

0128bea0: ffffff9c  00000005  01445250  00000000  00000004  00000000  01504000  ffffff9c

0128bec0: 01504000  011059a0  00000000  00000000  00000000  00038e98  00038e80  00000005

0128bee0: 01445250  00000000  ffffe000  00000034  0179fd68  00000000 (00000000)<ffa008fa>

0128bf00: 00000000 <ffa008fa> 00000000  ffffe000  00000004  0000fffe  0179fce8  0179fd68

0128bf20: 00000034  01bfaf58  00008000  00000000  00000000  0128c000  01bfaf58  01bfaf58

0128bf40: 01e6f1d4  ffa00e80  02000021  01c095c7  01c0b381  01c095c6  01c0b37e  00000000

0128bf60: 00000000  00000038  00000000  00000038  00000000  00000000  7ffff000  000000c0

0128bf80: 00000137  00000000  00000000  00000000  00000000  0000005b  00001802  00000001

0128bfa0: fffffffc  00000006  00000003  01444800  01eb4a80  0179fca4  0179fcb0  01485520

0128bfc0: 01444a9c  01445250  0178001c  01bfaf3c  00000005  01485520  0179fd2c  00000034

0128bfe0: 0179fd68  0179fce8  00000000  00000000  0179fce8  0179fce8  00000005  00000006

Return addresses in stack:

    address : <0x00012282> { ___do_softirq + 0x6a }

    address : <0x0003fc86> { _do_lookup + 0x10e }

   frame  1 : <0xffa008fa> { _system_call + 0x6a }

    address : <0xffa008fa> { _system_call + 0x6a }

 

linux_config


 

 

2011-08-02 16:53:28     Re: random CPLB misses - kernel OOPS

Timothy Stotts (UNITED STATES)

Message: 102788   

 

It appears that continuing to use memory after it has been freed can cause one or more of the following kernel error types:

 

SEGV

 

BUS

 

CPLB

 

Illegal instruction

 

Memory alignment


 

 

2011-08-02 23:06:10     Re: random CPLB misses - kernel OOPS

Sonic Zhang (CHINA)

Message: 102789   

 

Could you attach a simple test code for us to replicate?


 

 

2011-08-03 14:53:48     Re: random CPLB misses - kernel OOPS

Timothy Stotts (UNITED STATES)

Message: 102799   

 

// Nothing special; do something like the following.
// The bug is that the memory is freed before the program has finished using it.

#include <stdint.h>
#include <stdlib.h>

int main(int argc, char* argv[])
{
   uint32_t* ptr = (uint32_t*)malloc(sizeof(uint32_t) * 256);
   int i;

   for(;;)
      {
      for(i = 0; i < 256; i++)
         {
            ptr[i] = rand();   // BUG: on the second pass, ptr was already freed and set to NULL below
         }

      free(ptr);
      ptr = NULL;
      }
}


 

 

2011-08-03 23:51:12     Re: random CPLB misses - kernel OOPS

Sonic Zhang (CHINA)

Message: 102803   

 

This is the expected behavior on a NOMMU architecture. The NOMMU kernel has no memory access protection. Writing to freed memory may corrupt kernel code or another application and cause strange kernel crashes.

So don't use memory after it has been freed.


 

 

2011-08-04 13:53:27     Re: random CPLB misses - kernel OOPS

Timothy Stotts (UNITED STATES)

Message: 102816   

 

/* Actually, my first code example is incorrect. The bug I was experiencing was not a null pointer access. The following is a better example. */

#include <stdint.h>
#include <stdlib.h>

int main(int argc, char* argv[])
{
   uint32_t* ptr = (uint32_t*)malloc(sizeof(uint32_t) * 256);
   int j = 0, i;

   for(;;)
      {
      for(i = 0; i < 256; i++)
         {
            ptr[i] = rand();   /* BUG: after the first pass this writes to freed memory */
         }

      if (0 == j)
         {
            j++;
            free(ptr);   /* freed once, but ptr keeps its old value and is still used above */
         }
      }
}


 

 

2011-08-04 21:40:23     Re: random CPLB misses - kernel OOPS

Simon Brewer (AUSTRALIA)

Message: 102820   

 

Hi Timothy,

 

You are writing to freed memory, which is a bug in the code. Please see Sonic's earlier message, where he explains that on a NOMMU system you could corrupt another process or the kernel.

 

Simon


 

 

2011-08-05 13:18:56     Re: random CPLB misses - kernel OOPS

Timothy Stotts (UNITED STATES)

Message: 102840   

 

Thank you. I am aware of this. I was asked to provide example code showing what I was doing to trigger the bad behavior. I fixed the problem a long time ago.
