# Problem with overflow

Question asked by BenChr on Jul 8, 2010
Latest reply on Jul 16, 2010 by AndreasL

I have run into a problem with overflow. It can be seen in the picture with the flat tops.

It should have looked like the picture with the sharp tops.

The context is this: I am doing speech recognition and need to calculate the cepstrum coefficients of the input speech:

Matlab code: Cepstrum = real(ifft(log(abs(fft(x)))));

So far I have made...

1) fft with the rfft_fr16 function,

2) abs with the cabs_fr16 function

3) Then I needed to do the log, but I could only find a floating-point version of it. Does a fract16 version exist?

I couldn't find one, so I converted with the fr16_to_float function and used the logf function.

4) Everything up to this point has given quite accurate results, but then I had to convert back with float_to_fr16 to be able to use the ifft_fr16 function. Here is my problem.

5) Because the floating-point values rise above +1 at the tops and fall below -1 in the valleys, the conversion overflows! I know that I probably need to divide log(fft) by some factor and then convert, but I also want to keep good precision when doing the ifft afterwards.

Should I loop through the log(fft) output, find the value with the largest magnitude and divide everything by it, so that the data is normalized to [-1, 1]?

Then I could use static scaling on the ifft afterwards! Is there a more efficient way to do the normalization?
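A minimal sketch of two normalization options, written in portable C so it can be tried off-target. `to_q15_sat`, `normalize_by_peak` and `normalize_by_shift` are hypothetical names, and `to_q15_sat` only stands in for `float_to_fr16`. The fixed-shift variant rests on an assumption about the data range: if the smallest nonzero magnitude is 2^-15 and the block exponent is at most 8 for 256 points, the natural log stays within roughly [-10.4, 5.6], so dividing by 16 (a shift of 4) fits without scanning for the peak. Both variants assume finite inputs, so zero-magnitude bins would need a floor before the log.

```c
#include <math.h>
#include <stdint.h>
#include <stddef.h>

/* Convert a float to Q1.15 with saturation (portable stand-in for
 * float_to_fr16; the name is hypothetical). */
static int16_t to_q15_sat(float v)
{
    float s = v * 32768.0f;
    if (s >= 32767.0f) return INT16_MAX;
    if (s <= -32768.0f) return INT16_MIN;
    return (int16_t)s;   /* truncates toward zero */
}

/* Option 1: scan for the peak magnitude, then divide by it.
 * Returns the divisor so the caller can undo the scaling after the IFFT. */
float normalize_by_peak(const float *in, int16_t *out, size_t n)
{
    float peak = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float a = fabsf(in[i]);
        if (a > peak) peak = a;
    }
    if (peak == 0.0f) peak = 1.0f;   /* all-zero input: nothing to scale */
    for (size_t i = 0; i < n; i++)
        out[i] = to_q15_sat(in[i] / peak);
    return peak;
}

/* Option 2: skip the scan and divide by a fixed power of two chosen
 * from the known range of the data. */
void normalize_by_shift(const float *in, int16_t *out, size_t n, int shift)
{
    for (size_t i = 0; i < n; i++)
        out[i] = to_q15_sat(ldexpf(in[i], -shift));
}
```

Because the IFFT is linear, either divisor can be multiplied back into the cepstrum afterwards. The fixed shift gives up a little headroom compared with peak scaling but saves the extra pass over the data.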

By the way: This discussion is only posted here.

Thank you

Benjamin

Here is a snippet of my code:

#define NUMPOINTS 256

complex_fract16 w[NUMPOINTS];    //twiddle sequence
complex_fract16 fftout[NUMPOINTS];
int block_exponent;
int n = NUMPOINTS;
float fft_scaled[NUMPOINTS];
float log_of_fft[NUMPOINTS];

//Make w twiddle table for FFT once.

void cepstrum_test(const fract16 samples256[]) {

//cepstrum = real(ifft(log(abs(fft(x)))));
int i = 0;

//FFT
rfft_fr16(samples256, fftout, w, 1, NUMPOINTS, &block_exponent, 2);

float scaling_fft = ldexpf(1.0F,block_exponent);

for(i=0;i<NUMPOINTS;i++) //Take abs(fft), scale it, take the logarithm and convert back to fr16:
{
// WOULD IT BE MORE EFFICIENT TO SHIFT THE FR16 INSTEAD OF MULTIPLYING
// WITH THE SCALING FACTOR IN THE FLOATING POINT DOMAIN?
fft_scaled[i] = fr16_to_float(cabs_fr16(fftout[i]))*scaling_fft;
log_of_fft[i] = logf(fft_scaled[i]);
fftout[i].re = float_to_fr16(log_of_fft[i]); //HERE IS THE PROBLEM.
fftout[i].im = 0;
}

}