[#5814] PREEMPT_RT: cyclic test shows no significant improvement for system latency
Submitted By: Yi Li
Open Date
2010-01-05 05:30:09 Close Date
2010-04-16 06:06:05
Priority:
Low Assignee:
Cliff Cai
Status:
Closed Fixed In Release:
N/A
Found In Release:
snaps Release:
Category:
N/A Board:
N/A
Processor:
BF537 Silicon Revision:
Is this bug repeatable?:
Yes Resolution:
Assigned (Not Start)
Uboot version or rev.:
Toolchain version or rev.:
2009R1-rc10
App binary format:
N/A
Summary: PREEMPT_RT: cyclic test shows no significant improvement for system latency
Details:
Using the cyclic test (http://rt.wiki.kernel.org/index.php/Cyclictest) to test system latency (irq + schedule), enabling PREEMPT_RT shows no significant improvement over PREEPT_VOLUNTORY.
Here is the result (testing clock_nanosleep() latency) (all result is in microsecond (us)).
PREEMT_VOLUNTORY
-----------------
1. 100% CPU workload:
root:/> ./cyclictest -p 80 -t1 -n -i 10000 -l 10000
timer mode: 1
policy: fifo: loadavg: 5.13 4.19 2.14 6/41 453
T: 0 ( 405) P:80 I:10000 C: 10000 Min: 32 Act: 57 Avg: 104 Max: 6712
2. no workload:
root:/> ./cyclictest -p 80 -t1 -n -i 10000 -l 10000
timer mode: 1
policy: fifo: loadavg: 0.00 0.00 0.00 1/31 169
T: 0 ( 169) P:80 I:10000 C: 10000 Min: 24 Act: 35 Avg: 40 Max: 120
PREEMPT_RT
-----------
1. 100% CPU workload:
root:/> ./cyclictest -p 80 -t1 -n -i 10000 -l 10000
timer mode: 1
policy: fifo: loadavg: 4.55 1.68 0.61 6/48 237
T: 0 ( 205) P:80 I:10000 C: 10000 Min: 29 Act: 65 Avg: 87 Max: 6747
2. no workload:
root:/> ./cyclictest -p 80 -t1 -n -i 10000 -l 10000
timer mode: 1
policy: fifo: loadavg: 0.00 0.00 0.00 1/36 174
T: 0 ( 174) P:80 I:10000 C: 10000 Min: 24 Act: 31 Avg: 32 Max: 73
We need to find out why maxinum latency is as high as 6747 us when there is 100% workload.
kernel version:
blackfin_preempt_rt patch + patch-2.6.31.6-rt19 + svn trunk linux kernel r7915.
Follow-ups
--- Yi Li 2010-01-05 05:32:42
Attached kernel configuration and workload script. Cyclic test is using
rttest-0.6.
--- Yi Li 2010-01-05 23:45:49
Compared similar test with xenomai (on 2009R1.1). The max latency in xenomai
case is much better (48 us VS. 6747 us).
100% workload:
root:/usr/bin> ./cyclictest -p 80 -n -t1 -i 10000 -l 10000
4.86 2.46 0.97 6/33 419
T: 0 ( 335) P:80 I: 10000 C: 10000 Min: 22 Act: 34 Avg: 34
Max: 48
--- Yi Li 2010-01-26 04:56:50
I use wakeup-rt tracer on kernel with PREEMPT_VOLUNTORY, result shows even for
tasks with real-time priority, the wake up latency is more than 7000 us.
root:/> cat /sys/kernel/debug/tracing/trace
# tracer: wakeup_rt
#
# wakeup_rt latency trace v1.1.5 on 2.6.32.5-ADI-2010R1-pre-svn8224
# --------------------------------------------------------------------
# latency: 7651 us, #4/4, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0)
# -----------------
# | task: -478 (uid:0 nice:0 policy:1 rt_prio:80)
# -----------------
#
# _------=> CPU#
# / _-----=> irqs-off
# | / _----=> need-resched
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| /_--=> lock-depth
# |||||/ delay
# cmd pid |||||| time | caller
# \ / |||||| \ | /
ls-493 0d.h.. 6us+: 493:120:R + [000] 478: 19:S cyclictest
ls-493 0d.h.. 21us!: _wake_up_process <-_hrtimer_wakeup
ls-493 0d.... 7636us+: _schedule <-_system_call
ls-493 0d.... 7643us : 493:120:R ==> [000] 478: 19:R
cyclictest
After detailed analysis, I think the main cause is still that, kernel cannot be
preempted when return from interrupt to kernel space. See attached trace log.
--- Yi Li 2010-01-27 03:33:15
Some system call (e.g do_exec()) will task a very long time (e.g,
load_flat_binary() may take 50ms, when executing "ls" with FLAT
format).
It is interesting that using FDPIC format, cyclic test gives much lower maximum
latency.
PREEMT_VOLUNTORY
-----------------
100% CPU workload:
root:/> ./cyclictest -p 80 -t 1 -n -l 10000 -i 10000
timer mode: 1
policy: fifo: loadavg: 5.03 4.09 2.08 6/30 425
T: 0 ( 378) P:80 I:10000 C: 10000 Min: 37 Act: 106 Avg: 96 Max:
323
--- Yi Li 2010-04-08 01:03:28
The cause is: kernel spends a lot of time in do_execve()→load_flat_binary(),
without any chance to call “schedule()”, until do_execve() system call
finally finished (i.e, return to user space). So the wake-up latency happens
because kernel cannot be preempted.
For details, please refer to:
http://docs.blackfin.uclinux.org/doku.php?id=linux-kernel:ftrace
This bug is kind of depends on task 5718
if kernel can be preempted when returning from interrupt.
--- Sonic Zhang 2010-04-16 06:04:25
Not a real bug. It depends on task 5718. Low priority.
Files
Changes
Commits
Dependencies
Duplicates
Associations
Tags
File Name File Type File Size Posted By
trace.log text/x-log 18113 Yi Li
fork_workload.sh application/x-sh 194 Yi Li
bugreport.tar.gz application/x-gzip 19216 Yi Li