2009-09-30 17:25:06     CPLB Data miss

Document created by Aaronwu Employee on Aug 19, 2013

2009-09-30 17:25:06     CPLB Data miss

Dusko Cencan (GERMANY)

Message: 80729   

 

Hi,

 

 

 

I have run into a very bad problem. I have been trying to fix it for a long time, without success. I use a BF CM-527 board with the latest uClinux dist and an OV9655 camera sensor. I have a fairly large application that does heavy image processing with 3-4 large static buffers (char * buf = 1280*1024*3) and many loops in which my algorithms run. I frequently run into a "Data access CPLB miss".

 

The heavier my calculations are, the sooner it breaks down.

 

Here is the Kernel output for the data miss:

 

Data access CPLB miss

- Used by the MMU to signal a CPLB miss on a data access.

Deferred Exception context

CURRENT PROCESS:

COMM=frametest2 PID=232

CPU = 0

TEXT = 0x01000040-0x01009f80 DATA = 0x01009fa0-0x0100dc50

BSS = 0x0100dc50-0x015d8f90 USER-STACK = 0x015e9efc

 

return address: [0x01001108]; contents of:

0x010010e0: 3001 4f10 5008 4f18 5010 3210 6c82 9110

0x010010f0: e638 ffbb 20a4 e438 ffbb e141 0136 e101

0x01001100: 5f7c 4f10 5008 3210 [9110] 4d80 e638 ffc0

0x01001110: e438 ffbb e141 0136 e101 5f7c 4f10 5008

 

ADSP-BF527-0.2 525(MHz CCLK) 131(MHz SCLK) (mpu off)

Linux version 2.6.28.10-ADI-2009R1

Built with gcc version 4.1.2 (ADI svn)

 

SEQUENCER STATUS: Not tainted

SEQSTAT: 00062026 IPEND: 0030 SYSCFG: 0006

EXCAUSE : 0x26

interrupts disabled

physical IVG5 asserted : <0xffa00a50> { _evt_ivhw + 0x0 }

RETE: <0x00000000> /* Maybe null pointer? */

RETN: <0x00630000> /* kernel dynamic memory */

RETX: <0x00000480> /* Maybe fixed code section */

RETS: <0x01001036> [ frametest2 + 0xff6 ]

PC : <0x01001108> [ frametest2 + 0x10c8 ]

DCPLB_FAULT_ADDR: <0x02000000> /* kernel dynamic memory */

ICPLB_FAULT_ADDR: <0x01001108> [ frametest2 + 0x10c8 ]

 

PROCESSOR STATE:

R0 : 02000000 R1 : 01365f7c R2 : 01356d5c R3 : 00000000

R4 : 00006c42 R5 : 0100db90 R6 : 0100a118 R7 : 00000009

P0 : 00000004 P1 : 012668f0 P2 : 02000000 P3 : 015e9f00

P4 : 0100dc4c P5 : 01009fa0 FP : 015e9d2c SP : 0062ff24

LB0: 01006755 LT0: 01006748 LC0: 00000000

LB1: 01009a51 LT1: 01009a0a LC1: 00000000

B0 : 00000000 L0 : 00000000 M0 : 00000004 I0 : 0100a51c

B1 : 00000000 L1 : 00000000 M1 : 00000000 I1 : 00000001

B2 : 00000000 L2 : 00000000 M2 : 00000000 I2 : 00000000

B3 : 00000000 L3 : 00000000 M3 : 00000000 I3 : 0000000c

A0.w: 00000000 A0.x: 00000000 A1.w: 00000000 A1.x: 00000000

USP : 015e8bbc ASTAT: 02002000

 

No trace since you do not have CONFIG_DEBUG_BFIN_NO_KERN_HWTRACE enabled

 

Userspace Stack

Stack info:

SP: [0x015e8bbc] <0x015e8bbc> [ frametest2 + 0x5e8bbc ]

Memory from 0x015e8bb0 to 015e9000

015e8bb0: 00000009 0100a118 0100db90 [01496978] 00000000 00000000 40cfd080 00000000

015e8bd0: 00000000 00000000 00000000 00000001 00000001 00000000 6c676846 3eef211d

015e8bf0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8c10: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8c30: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8c50: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8c70: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8c90: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8cb0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8cd0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8cf0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8d10: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8d30: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8d50: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8d70: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8d90: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8db0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8dd0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8df0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8e10: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8e30: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8e50: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8e70: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8e90: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8eb0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8ed0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8ef0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8f10: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8f30: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8f50: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8f70: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8f90: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8fb0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8fd0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000

015e8ff0: 00000000 00000000 00000000 00000000

Return addresses in stack:

bcap_close called

bcap_reg_reset:

...specified video device closed sucessfullly


 

 

2009-09-30 18:39:32     Re: CPLB Data miss

Dusko Cencan (GERMANY)

Message: 80730   

 

A short addition: What normally causes such an error? Compiling on my Windows machine with VS works fine, but porting to uClinux gives me these CPLB errors, and when it is not a CPLB error, my program just freezes without any kernel errors. If anyone has had such errors in the past and managed to fix them (I assume in the application code), I'd be happy to know what caused them in your applications.

 

 

 

Regards,

 

Dusko Cencan


 

 

2009-09-30 23:01:47     Re: CPLB Data miss

Frank Van Hooft (CANADA)

Message: 80733   

 

Have a read of this:

 

  docs.blackfin.uclinux.org/doku.php?id=uclinux-dist:analyzing_traces

 

Chances are, there's something of interest happening in your code at about this point:

 

PC : <0x01001108> [ frametest2 + 0x10c8 ]
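
As a worked example of reading the dump above: the bracketed offsets appear to be the address minus the start of the process TEXT segment, and the faulting data address can be compared with the amount of RAM on the board. A minimal sketch using only numbers from the trace; the 32 MB total is the figure given later in the thread, the assumption that RAM starts at address 0 is mine, and the helper names are made up for illustration.

```c
#include <stdint.h>

/* Values copied from the crash dump in the first post. */
#define TEXT_START   0x01000040u
#define PC_ADDR      0x01001108u
#define RETS_ADDR    0x01001036u
#define DCPLB_FAULT  0x02000000u

/* Assumption: RAM starts at address 0 and is 32 MB in total. */
#define RAM_BYTES    (32u * 1024u * 1024u)

/* Hypothetical helper: offset of an address inside the program text. */
static uint32_t text_offset(uint32_t addr)
{
    return addr - TEXT_START;
}

/* Hypothetical helper: non-zero if addr lies at or beyond the end of RAM. */
static int beyond_ram(uint32_t addr)
{
    return addr >= RAM_BYTES;
}
```

With these values, text_offset(PC_ADDR) is 0x10c8 and text_offset(RETS_ADDR) is 0xff6, matching the offsets printed in the dump, and beyond_ram(DCPLB_FAULT) is true: under the stated assumption, the faulting address is the first byte past the end of RAM, which would fit a pointer that ran off the end of a buffer.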


 

 

2009-10-01 00:49:42     Re: CPLB Data miss

Mike Frysinger (UNITED STATES)

Message: 80734   

 

are you using the same toolchain version ?

 

if you want to see a proper trace, you really need to enable the config option like the crash told you to:

CONFIG_DEBUG_BFIN_NO_KERN_HWTRACE


 

 

2009-10-01 00:54:08     Re: CPLB Data miss

Dusko Cencan (GERMANY)

Message: 80735   

 

Hi,

 

 

 

Thanks for the answers. I read the link and will try to track down the error with the debugger. I had CONFIG_DEBUG_BFIN_NO_KERN_HWTRACE enabled, but disabled it today; I don't know why. The thing was that I had no idea how to trace the error, but now I have been pointed in the right direction. I downloaded the latest toolchain and built the kernel with it, so my kernel and toolchain are up to date.


 

 

2009-10-01 01:02:30     Re: CPLB Data miss

Dusko Cencan (GERMANY)

Message: 80737   

 

EDIT: My toolchain tar is "blackfin-toolchain-09r1-10" and uclinux-dist is "uClinux-dist-2009R1-RC6".


 

 

2009-10-01 09:47:26     Re: CPLB Data miss

Robin Getz (UNITED STATES)

Message: 80766   

 

Dusko:

 

There are a couple of hot fixes in the svn branch (of the kernel) which were applied after the tarballs were made. I don't think they are the cause of your problem (they manifest themselves as hangs), but just so you know.

 

-Robin


 

 

2009-10-01 10:10:46     Re: CPLB Data miss

Lars Weber Rasmussen (DENMARK)

Message: 80770   

 

Hi all!

 

I had the same problem, which disappeared when I compiled for rev 0.1 silicon. If you check the anomaly.h list, all anomalies are implemented for silicon revisions <2. It seems that at least the raise1 instruction is still a problem if it is linked into external memory on rev 0.2 silicon.


 

 

2009-10-01 11:07:08     Re: CPLB Data miss

Robin Getz (UNITED STATES)

Message: 80773   

 

Lars:

 

I'm not sure I understand what you mean. You were running into a problem with code compiled for 0.0 running on 0.1, and when you moved to code compiled for 0.1 running on 0.1, the problems went away?

 

Or something else?

 

Dusko has compiled, and is running on 0.2 - the dump message now shows that.

 

 

 

-Robin


 

 

2009-10-01 11:21:36     Re: CPLB Data miss

Lars Weber Rasmussen (DENMARK)

Message: 80774   

 

Hi Robin!

 

The chip on the board was BF527 rev 0.2 silicon. When compiling u-boot against the rev 0.2 anomaly list I saw the same problem, but when compiling for rev 0.1 silicon everything worked.

 

CCLK 300 MHz, SCLK 60 MHz.

 

/lars


 

 

2009-10-01 17:26:01     Re: CPLB Data miss

Dusko Cencan (GERMANY)

Message: 80781   

 

Hey,

 

I don't understand why different silicon versions can cause such a problem. So, just u-boot is compiled for rev 0.1 and the kernel for 0.2? I use a precompiled u-boot image; only the kernel is compiled by myself. My application usually just hangs in a while() loop. It's basically a while (x != y) {<code to set x and y to be equal after some calculations>}. On my desktop systems this works fine, but on uClinux the only cause for the hang could be memory corruption. I usually get hangs in the while() loop, but roughly every 10th time the CPLB miss occurs, so I think the CPLB handler does not trigger the exception every time, but only when something very bad is going on in memory. The reason for the freeze and for the CPLB miss could be the same. I will have a look at the latest fixes. I haven't managed to look at the traces and disassemble the debug object file yet.
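
One way to narrow down suspected memory corruption like this is to route buffer writes through a checked helper while debugging, so that an out-of-range index is reported instead of silently touching other memory. A minimal sketch; `frame_t`, `frame_put`, `FRAME_PUT` and the field names are made up for illustration and are not taken from the application.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical frame descriptor: a byte buffer plus its size. */
typedef struct {
    unsigned char *data;
    size_t size;          /* number of valid bytes in data */
} frame_t;

/* Store v at index idx. Returns 0 on success, -1 if idx is out of range. */
static int frame_put(frame_t *f, size_t idx, unsigned char v,
                     const char *file, int line)
{
    if (idx >= f->size) {
        fprintf(stderr, "%s:%d: index %lu outside buffer of %lu bytes\n",
                file, line, (unsigned long)idx, (unsigned long)f->size);
        return -1;
    }
    f->data[idx] = v;
    return 0;
}

/* Convenience macro that records the call site. */
#define FRAME_PUT(f, idx, v) frame_put((f), (idx), (v), __FILE__, __LINE__)
```

On a no-MMU system a stray write may corrupt another buffer or the loop variables themselves without any exception, which would be consistent with a loop that never terminates; a check like this at least reports the first bad index and where it came from.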

 

Best regards,

 

 

 

Dusko Cencan


 

 

2009-10-01 19:32:20     Re: CPLB Data miss

Mike Frysinger (UNITED STATES)

Message: 80785   

 

once the kernel is up and running, u-boot shouldnt matter in any way.  so if you're seeing crashes under Linux, i wouldnt look into u-boot at all.

 


 

 

2009-10-01 23:02:21     Re: CPLB Data miss

Dusko Cencan (GERMANY)

Message: 80786   

 

If I could increase the memory available to my application, I could make my program more stable. Right now I use a little under 8 MB of memory when the program starts. I don't use malloc() because it leads to more catastrophic issues. If I increase some buffers by only 1 MB, the program won't start and the kernel prints a "page allocation failure - unable to allocate memory" (or something like that). The strange thing is that I have nearly 11 MB left on the device (32 MB total) while my application is running. I already modified the OV9655 camera driver to use only the bottom buffer at address 0x1000, to save the memory that was used by the top buffer. I do not understand why the driver needs two buffers. The top buffer was designed to use DMA memory access, but the bottom one is only a static buffer defined at compile time. I assume that there is a per-application limit enforced by the kernel. From my testing, I would say about 8 MB max. per program. Is there any way to increase this limit?

 

Best Regards,

 

Dusko Cencan


 

 

2009-10-06 04:20:48     Re: CPLB Data miss

Mike Frysinger (UNITED STATES)

Message: 80887   

 

having free mem is not the same thing as having contiguous free mem.  nommu means fragmentation cannot be recovered from.  allocate large buffers at boot time and never let them go.
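
At the application level, one reading of this advice is to grab every large buffer once, as early as possible, and reuse it for the lifetime of the program. A sketch only: `FRAME_BYTES` uses the 1280*1024*3 size from the first post, while `NUM_BUFFERS`, `buffers_init` and `buffer_get` are hypothetical names, and whether malloc() is the right allocator on this target is an assumption.

```c
#include <stddef.h>
#include <stdlib.h>

#define FRAME_BYTES  ((size_t)1280 * 1024 * 3)  /* size quoted in the first post */
#define NUM_BUFFERS  3                          /* assumption for illustration */

static unsigned char *g_buffers[NUM_BUFFERS];

/* Allocate every large buffer once at program start.
 * Returns 0 on success, -1 on failure (nothing stays allocated). */
int buffers_init(void)
{
    int i;

    for (i = 0; i < NUM_BUFFERS; i++) {
        g_buffers[i] = malloc(FRAME_BYTES);
        if (g_buffers[i] == NULL) {
            /* Roll back the buffers obtained so far. */
            while (i-- > 0) {
                free(g_buffers[i]);
                g_buffers[i] = NULL;
            }
            return -1;
        }
    }
    return 0;
}

/* Hand out one of the long-lived buffers; NULL if the index is invalid. */
unsigned char *buffer_get(int i)
{
    if (i < 0 || i >= NUM_BUFFERS)
        return NULL;
    return g_buffers[i];
}
```

The buffers are deliberately never freed while the program runs, so later fragmentation cannot take the large contiguous regions away again.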

 

the double buffer in the blackfin_cam driver is a hack to try and get a bit more performance out of the system.  it'll be thrown out at some point.

 

there is no limit anywhere in the kernel that says "you can only run 8mb max".  the only limits are the physical ones.  after that, if memory gets fragmented even slightly, you're screwed.
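
To see the difference between free memory and contiguous free memory on a running system, one option is to probe for the largest single block that can currently be allocated. A rough sketch; `largest_block` is a hypothetical helper, and on a desktop Linux with an MMU the result says little, so it is only meaningful on the no-MMU target.

```c
#include <stddef.h>
#include <stdlib.h>

/* Binary-search the largest size in [0, max_bytes] that malloc() can
 * satisfy right now. Assumes that if a size succeeds, every smaller
 * size would succeed too. */
size_t largest_block(size_t max_bytes)
{
    size_t lo = 0, hi = max_bytes;

    while (lo < hi) {
        size_t mid = lo + (hi - lo + 1) / 2;  /* round up so the loop terminates */
        void *p = malloc(mid);

        if (p != NULL) {
            free(p);
            lo = mid;       /* mid bytes fit; try larger */
        } else {
            hi = mid - 1;   /* mid bytes do not fit; try smaller */
        }
    }
    return lo;
}
```

If this reports a block much smaller than the total free memory, the remaining memory is fragmented, and a request such as an extra 1 MB buffer can fail even though the total looks sufficient.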
