[#5735] A SMP sync problem may lead to timers on core B not working

Document created by Aaronwu Employee on Sep 5, 2013

Submitted By: Yi Li
Open Date: 2009-12-02 02:45:33
Close Date: 2009-12-13 22:34:23
Priority: Medium
Assignee: Yi Li
Status: Closed
Fixed In Release: N/A
Found In Release: 2010R1
Release:
Category: N/A
Board: N/A
Processor: ALL
Silicon Revision:
Is this bug repeatable?: Yes
Resolution: Fixed
Uboot version or rev.:
Toolchain version or rev.: 2009R1 gcc 4.1
App binary format: N/A

Summary: A SMP sync problem may lead to timers on core B not working

Details:

 

While testing the PREEMPT_RT kernel on the BF561-ezkit, the kernel hangs during boot: when it initializes the Ethernet driver, it sleeps and never wakes up.

 

The hang occurs while the kernel waits for a timer on Core B to expire. Timers are per-CPU: static DEFINE_PER_CPU(struct tvec_base *, tvec_bases) = &boot_tvec_bases.

 

However, the ksoftirqd thread for Core B (ksoftirqd threads are also per-CPU) does not work properly, so the timers on Core B never expire.

 

In kernel/softirq.c:

 

ksoftirqd()
{
...
                while (local_softirq_pending()) {
                        /* Preempt disable stops cpu going offline.
                           If already offline, we'll be on wrong CPU:
                           don't process */
                        if (cpu_is_offline((long)__bind_cpu))
                                goto wait_to_die;
...

wait_to_die:
        preempt_enable();
        /* Wait for kthread_stop */
        set_current_state(TASK_INTERRUPTIBLE);
        while (!kthread_should_stop()) {
                schedule();
                set_current_state(TASK_INTERRUPTIBLE);
        }
        __set_current_state(TASK_RUNNING);
        return 0;
}

 

When ksoftirqd() runs on core B for the first time, core A may still be initializing core B (see smp_init() -> cpu_up() -> __cpu_up()). In that case the cpu_is_offline() check may return true and ksoftirqd jumps to wait_to_die, where it sleeps until kthread_stop() is called and never services softirqs again.
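The ordering problem can be illustrated with a small user-space model. This is only a sketch: the `model_*` names, the struct, and the counters are hypothetical stand-ins for cpu_is_offline(), local_softirq_pending() and the ksoftirqd thread, not kernel code, and the "race" is reduced to a flag that selects which side runs first.

```c
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum model_thread_state { MODEL_SERVICING, MODEL_WAIT_TO_DIE };

struct model_cpu {
    bool online;                 /* stands in for !cpu_is_offline() */
    int pending_softirqs;        /* stands in for local_softirq_pending() */
    enum model_thread_state ksoftirqd_state;
};

/* One wake-up of the modelled ksoftirqd thread. */
static void model_ksoftirqd_run(struct model_cpu *cpu)
{
    if (cpu->ksoftirqd_state == MODEL_WAIT_TO_DIE)
        return;                  /* parked: it never services softirqs again */

    while (cpu->pending_softirqs > 0) {
        if (!cpu->online) {
            /* Mirrors the "goto wait_to_die" path in the excerpt. */
            cpu->ksoftirqd_state = MODEL_WAIT_TO_DIE;
            return;
        }
        cpu->pending_softirqs--; /* "handle" one softirq, e.g. a timer expiry */
    }
}

/* Returns how many softirqs are left unserviced on the modelled core B. */
static int model_boot(bool run_before_online)
{
    struct model_cpu core_b = { false, 1, MODEL_SERVICING };

    if (run_before_online)
        model_ksoftirqd_run(&core_b); /* core A has not finished bring-up yet */

    core_b.online = true;             /* core A now marks core B online */
    core_b.pending_softirqs += 1;     /* a timer softirq is raised later */
    model_ksoftirqd_run(&core_b);

    return core_b.pending_softirqs;
}
```

In this model, `model_boot(true)` leaves both softirqs pending because the thread parked itself before the core was marked online, while `model_boot(false)` drains them.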

 

We need to fix the BF561 SMP code so that core B stays idle until it is considered online.
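The ticket does not include the actual patch. As a rough sketch of the idea only, the following user-space model shows a secondary core polling an "online" flag that the boot core publishes. All `model_*` names are hypothetical, the use of C11 atomics is an assumption made for the illustration, and the polling is bounded so the example terminates when run on a single thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model; these names are illustrative, not the BF561 SMP code. */
static atomic_bool model_core_b_online;

/* Core A side: publish that bring-up of core B is complete. */
static void model_mark_core_b_online(void)
{
    atomic_store_explicit(&model_core_b_online, true, memory_order_release);
}

/*
 * Core B side: poll the flag for at most max_polls iterations.
 * Returns true once core B may leave its idle wait and let per-CPU
 * threads run; returns false if it is still considered offline.
 */
static bool model_core_b_wait_online(unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if (atomic_load_explicit(&model_core_b_online, memory_order_acquire))
            return true;
        /* A real implementation would idle or relax the CPU here. */
    }
    return false;
}
```

Before `model_mark_core_b_online()` is called, `model_core_b_wait_online()` keeps returning false, so nothing that depends on the online state gets to run early.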

 

 

 

 

 

 

 

 

 

Follow-ups

 

--- Mike Frysinger                                           2009-12-12 23:39:30

So is this issue fixed now? Can we close the bug?

 

--- Yi Li                                                    2009-12-13 22:34:23

Yes. We can close this bug. (I was testing this fix together with another SMP patch

that uses a static array for the inter-processor message queue, so it was delayed for a while.)

 

 

 
