I'm trying to research a problem and have some questions I'm hoping someone here will know the answers to.
(1) I see the delays are calculated in linux-2.6.x/arch/blackfin/include/asm/delay.h, but the code there cites the m68k processor as the source of the calculations. However, the m68k code for the same delay doesn't match what we have for the Blackfin. Is this old information?
(2) I can also see where loops_per_jiffy is calculated. However, most documented sources I see say that the delay calculation should be as follows:
delay = (usecs * HZ * loops_per_jiffy) / 1000000;
Most seem to follow this with some variation of:
delay = (usecs * (2**32 / 1000000)[rounded up] * HZ * loops_per_jiffy) / 2**32;
So I have to ask: why does the Blackfin have a special calculation for the delay or loops?
(3) What does the Blackfin actually use to increment jiffies, and where? That is, which module and which peripheral provide the timing for this?
I've been digging some more and found this on one site:
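To make the two formulas in (2) concrete, here is a minimal sketch comparing the plain division with the multiply-and-shift form. It assumes HZ = 100, and the function names loops_exact and loops_scaled are made up for illustration; they are not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

#define HZ 100u /* assumption: tick rate; substitute your kernel's HZ */

/* Reference form: loops = usecs * HZ * loops_per_jiffy / 10^6, done in 64-bit. */
static uint32_t loops_exact(uint32_t usecs, uint32_t lpj)
{
    return (uint32_t)(((uint64_t)usecs * HZ * lpj) / 1000000u);
}

/*
 * Division-free form: multiply usecs by 4295 (2^32 / 10^6, rounded up),
 * then keep only the high 32 bits of the 64-bit product, which is the
 * same as dividing by 2^32.
 */
static uint32_t loops_scaled(uint32_t usecs, uint32_t lpj)
{
    assert(usecs <= UINT32_MAX / 4295u);       /* usecs * 4295 must fit in 32 bits */
    assert((uint64_t)HZ * lpj <= UINT32_MAX);  /* HZ * lpj must fit in 32 bits */

    uint32_t scaled = usecs * 4295u;
    return (uint32_t)(((uint64_t)scaled * (HZ * lpj)) >> 32);
}
```

Because 4295 is rounded up, the scaled form can never come out shorter than the exact one as long as the two asserts hold, so a delay computed this way errs on the long side.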
* Ideally we use a 32*32->64 multiply to calculate the number of
* loop iterations, but the older standard 68k and ColdFire do not
* have this instruction. So for them we have a close approximation
* loop using 32*32->32 multiplies only. This calculation based on
* the ARM version of delay.
* We want to implement:
* loops = (usecs * 0x10c6 * HZ * loops_per_jiffy) / 2^32
#define HZSCALE (268435456 / (1000000/HZ))

extern unsigned long loops_per_jiffy;

extern __inline__ void _udelay(unsigned long usecs)
{
#if defined(CONFIG_M68328) || defined(CONFIG_M68EZ328) || \
    defined(CONFIG_M68VZ328) || defined(CONFIG_M68360) || \
    defined(CONFIG_COLDFIRE)
	__delay((((usecs * HZSCALE) >> 11) * (loops_per_jiffy >> 11)) >> 6);
#else
	unsigned long tmp;

	usecs *= 4295; /* 2**32 / 1000000 */
	__asm__ ("mulul %2,%0:%1"
		: "=d" (usecs), "=d" (tmp)
		: "d" (usecs), "1" (loops_per_jiffy*HZ));
	__delay(usecs);
#endif
}
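As a rough check of the 32*32->32 path in the m68k code, the sketch below evaluates the same shift sequence in portable C next to a 64-bit reference. HZ = 100 is an assumption, and loops_32bit_only and loops_reference are illustrative names only.

```c
#include <assert.h>
#include <stdint.h>

#define HZ 100u /* assumption: tick rate */
#define HZSCALE (268435456u / (1000000u / HZ)) /* 2^28 / (10^6 / HZ), 26843 for HZ = 100 */

/* 64-bit reference: usecs * HZ * loops_per_jiffy / 10^6. */
static uint32_t loops_reference(uint32_t usecs, uint32_t lpj)
{
    return (uint32_t)(((uint64_t)usecs * HZ * lpj) / 1000000u);
}

/*
 * 32-bit-only approximation: the three shifts (11 + 11 + 6) add up to the
 * division by 2^28, split so that neither multiply overflows 32 bits.
 */
static uint32_t loops_32bit_only(uint32_t usecs, uint32_t lpj)
{
    assert(usecs <= UINT32_MAX / HZSCALE);    /* first multiply must fit */

    uint32_t a = (usecs * HZSCALE) >> 11;
    uint32_t b = lpj >> 11;
    assert((uint64_t)a * b <= UINT32_MAX);    /* second multiply must fit */

    return (a * b) >> 6;
}
```

With usecs = 1000 and loops_per_jiffy = 1000000, the reference gives 100000 loops and the 32-bit path gives 99933. Every step truncates, so this path can only come out equal to or shorter than the reference.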
Here is a comment on the Linux development list about one ARM version.
Here are some of the comments in an ARM version of delay.h:
* Copyright (C) 1995-2003 Russell King
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
* Delay routines, using a pre-computed "loops_per_second" value.
* Division by multiplication and shifts.
* We want the number of loops to complete, where loops is:
* (us * (HZ * loops_per_jiffy)) / 10^6
* (ns * (HZ * loops_per_jiffy)) / 10^9
* Since we don't want to do long division, we multiply both numerator
* and denominator by (2^28 / 10^6):
* (us * (2^28 / 10^6) * HZ * loops_per_jiffy) / 2^28
* => (us * (2^28 * HZ / 10^6) * loops_per_jiffy) / 2^28
* ~> (((us * (2^28 * HZ / 10^6)) / 2^11) * (loops_per_jiffy / 2^12)) / 2^5
* (for large loops_per_jiffy >> 2^12)
* Note: maximum loops_per_jiffy = 67108863 (bogomips = 1342.18)
* minimum loops_per_jiffy = 20000 (bogomips = 0.4)
* Note: we rely on HZ = 100.
#define UDELAY_FACTOR 26843
#define NDELAY_FACTOR 27
extern void __bad_udelay(void); /* intentional errors */
extern void __bad_ndelay(void); /* intentional errors */
extern void __delay(unsigned long loops);
extern void __udelay(unsigned long usecs);
extern void __ndelay(unsigned long nsecs);
extern void __const_delay(unsigned long units);
#define udelay(n) \
	(__builtin_constant_p(n) ? \
	 ((n) > 20000 ? __bad_udelay() \
	              : __const_delay((n) * UDELAY_FACTOR)) \
	 : __udelay(n))

#define ndelay(n) \
	(__builtin_constant_p(n) ? \
	 ((n) > 20000 ? __bad_ndelay() \
	              : __const_delay((n) * NDELAY_FACTOR)) \
	 : __ndelay(n))
#endif /* defined(_ARM_DELAY_H) */
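The ARM header does not show the body of __const_delay, so the sketch below is only a guess at it, built from the formula in the comment: ((units / 2^11) * (loops_per_jiffy / 2^12)) / 2^5. It assumes HZ = 100, as the comment requires, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HZ 100u              /* the ARM comment relies on HZ = 100 */
#define UDELAY_FACTOR 26843u /* 2^28 * HZ / 10^6 = 26843.5..., truncated */
#define NDELAY_FACTOR 27u    /* 2^28 * HZ / 10^9 = 26.84..., rounded up */

/* Assumed body: follows the comment's shift sequence, not the real source. */
static uint32_t const_delay_loops(uint32_t units, uint32_t lpj)
{
    uint64_t product = (uint64_t)(units >> 11) * (lpj >> 12);
    assert(product <= UINT32_MAX); /* the real code would use a 32-bit multiply */
    return (uint32_t)product >> 5;
}

/* Loop count for a constant microsecond delay, capped at 20000 like the macro. */
static uint32_t udelay_loops(uint32_t usecs, uint32_t lpj)
{
    assert(usecs <= 20000u);
    return const_delay_loops(usecs * UDELAY_FACTOR, lpj);
}

/* Loop count for a constant nanosecond delay, capped at 20000 like the macro. */
static uint32_t ndelay_loops(uint32_t nsecs, uint32_t lpj)
{
    assert(nsecs <= 20000u);
    return const_delay_loops(nsecs * NDELAY_FACTOR, lpj);
}
```

With loops_per_jiffy = 1000000, udelay_loops(1000, ...) gives 99933 where the exact value is 100000, ndelay_loops(1000, ...) gives 99 where the exact value is 100, and ndelay_loops(20000, ...) gives 2005 where the exact value is 2000. So under these assumptions the result can land on either side of the exact count: truncation pulls it down, and the rounded-up NDELAY_FACTOR pushes it up.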
I see that the kernel sources in kernel/time contain the jiffies.c file, and that time_sched.c schedules the timer tick to be incremented at a scheduled rate. So, I'm getting closer to understanding.
This post seems to suggest that the delay calculation is not exact and can be off in either direction (+/-) in some circumstances.
Since we are using the same algorithm as the ARM to calculate this delay, I'm going to experiment with this and let everyone know what I find out.