A Blackfin overview document says that a floating-point multiply is preferable to (faster than) an integer divide.

float recip_NUM_SAMPS = 1.0 / NUM_SAMPS;

int x = sum * recip_NUM_SAMPS;

as opposed to

int x = sum / NUM_SAMPS;

Is this correct?

http://analog.tenet.res.in/overview.pdf

Page 86.

I was assuming that *NUM_SAMPS* was an integer constant, being as it didn't make sense to have a fraction of a sample.

Thanks. There the *sum* is a float rather than an int though, in which case the multiply-by-pre-computed-reciprocal trick does make sense, because floating-point multiply is significantly faster than floating-point division.

"I was assuming that *NUM_SAMPS* was an integer constant, being as it didn't make sense to have a fraction of a sample."

The important word there is "constant", meaning that the compiler knows what value *NUM_SAMPS* has. If so, it can apply its integer division optimizations; otherwise it has to plant a call to a helper routine for performing the division.
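To make the constant-versus-variable distinction concrete, here is a minimal sketch in C. The value 10 for NUM_SAMPS and the function names are assumptions chosen for illustration; they do not come from the overview document.

```c
/* Hypothetical sample count, visible to the compiler at compile time. */
#define NUM_SAMPS 10

/* The divisor is a compile-time constant, so the compiler is free to
   replace the division with shifts, multiplies and adds. */
int average_const(int sum)
{
    return sum / NUM_SAMPS;
}

/* The divisor is only known at run time, so a general-purpose
   division (on Blackfin, a helper routine) is needed. */
int average_runtime(int sum, int num_samps)
{
    return sum / num_samps;
}
```

Both functions return the same result for the same inputs (for example, 123 for a sum of 1234 and 10 samples); only the code the compiler can generate for them differs.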

It's incorrect. Where did you find that advice?

Blackfin does not have hardware support for floating-point operations, so they have to be emulated in software, which is quite slow. The code requires *sum* to be converted from an int to a float, then multiplied by *recip_NUM_SAMPS*, and finally converted back to int, which according to a quick experiment in the cycle-accurate simulator takes 132 cycles altogether.

Presumably *NUM_SAMPS* is a constant, otherwise the reciprocal would incur an even more expensive floating-point division at runtime (although that could just be done once if the same sample size is used a lot of times).

Anyway, integer division by a constant is actually optimized by the compiler. If the divisor is a power of 2, it turns it into a right shift plus a correction to ensure correct behaviour for negative numbers. This takes only four cycles. If the divisor is not a power of 2, it turns it into a series of integer multiplies, shifts and adds (using an algorithm by Granlund and Montgomery) that takes 11 cycles.

Without that optimization, signed integer division takes up to 91 cycles, but it can be less than half that, depending on the sizes of the operands.
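For anyone curious what those two transformations look like, here is a rough sketch in portable C. It is not the exact instruction sequence the Blackfin compiler emits. The function names and the choice of 10 as the non-power-of-2 divisor are assumptions for illustration, and the code relies on right shifts of negative values being arithmetic, which the C standard leaves implementation-defined but which holds on common compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Signed division by 2^k (1 <= k <= 30): a right shift plus a correction.
   A plain arithmetic shift rounds toward minus infinity, whereas C division
   rounds toward zero, so 2^k - 1 is added first when x is negative. */
int32_t div_pow2(int32_t x, unsigned k)
{
    int32_t bias = (x >> 31) & (int32_t)((1u << k) - 1u);
    return (x + bias) >> k;
}

/* Signed division by the constant 10 using a multiply and shifts.
   0x66666667 is ceil(2^34 / 10); taking the high 32 bits of the product
   and shifting right by 2 more divides by 2^34 overall. */
int32_t div_by_10(int32_t x)
{
    int64_t product = (int64_t)x * 0x66666667LL;
    int32_t q = (int32_t)(product >> 32) >> 2;
    /* Add 1 for negative x so the result rounds toward zero. */
    return q + (int32_t)((uint32_t)x >> 31);
}

/* Compare both routines against ordinary division over a range of inputs. */
void self_check(void)
{
    for (int32_t x = -100000; x <= 100000; ++x) {
        assert(div_pow2(x, 3) == x / 8);
        assert(div_by_10(x) == x / 10);
    }
}
```

Calling self_check() from a test program confirms that both routines agree with the / operator over the tested range, including negative values, which is where the correction terms matter.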