
Builtin shr_fx1x16() versus fract >>

Question asked by LloydE on Sep 28, 2011
Latest reply on Sep 30, 2011 by LloydE

I am changing my BF536 code to use native fract types/operators instead of fract16/32 builtin functions in VisualDSP++ Release 10. Most arithmetic operations, including the left shift << operator, were pretty innocuous with respect to cycle counts except for the right shift >> operator. Using the >> operator instead of the builtin shr_fx1x16() function resulted in significantly more code (as well as a call to ldiv).

 

    fract xr = *fract_ptr++ >> 2;
[183FBF6] R0 = W [ P1 ] ( X ) ;
[183FBF8] R6 = 4 ;
[183FBFA] R1 = 4 ;
[183FBFC] P1.L = 0x8060 ;
[183FC00] P1.H = 0xffa0 ;
[183FC04] CALL ( P1 ) ;
[183FC06] R2 = ROT R1 BY 0 ;
[183FC0A] BITTGL ( R1 , 0x2 ) ;
[183FC0C] R3 = R1 >>> 31 ;
[183FC10] R2 = ABS R2 ;
[183FC14] R1 = ~ R3 ;
[183FC16] R1 = R1 + R6 ( NS ) ;
[183FC1A] R1 = R1 >> 1 ;
[183FC1E] R3 = R3 << 0x1 ;
[183FC22] R3 += 1 ;
[183FC24] R6 = 0 ;
[183FC26] CC = R1 < R2 ( IU ) ;
[183FC28] IF CC R6 = R3 ;
[183FC2A] R0 = R6 + R0 ( NS ) ;
[183FC2E] R1 = ROT R7 BY 0 ;
[183FC32] R7 += 2 ;
[183FC34] P1 = R7 ;
[183FC36] [ FP + 0x8 ] = R7 ;
[183FC38] W [ FP - 20 ] = R0 ;
    fract yr = shr_fx1x16(*fract_ptr++, 2);
[183FC3C] R7.L = W [ P1 ] ;
[183FC3E] R2 = ROT R1 BY 0 ;
[183FC42] R1 += 4 ;
[183FC44] P0 = R1 ;
[183FC46] [ FP + 0x8 ] = R1 ;
[183FC48] W [ FP - 12 ] = R7 ;
[183FC4C] R6 = 2 ;
[183FC4E] W [ FP - 8 ] = R6 ;
[183FC52] R7.L = R7.L >>> 2 ;
[183FC56] W [ FP + 0x12 ] = R7 ;
[183FC58] W [ FP - 18 ] = R7 ;

According to the documentation, doing a right shift of a fract using the >> operator is the same as dividing by a power of two (2.0 per bit shifted). However, I did not expect it to be implemented with an actual division when it could use the more efficient arithmetic shift >>> assembly instruction, as the builtin function does. Why is the generated code so inefficient?
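For reference, an arithmetic shift and a division by a power of two are not always interchangeable for negative values, because they can round in different directions. The sketch below shows this on plain `int16_t` values holding 1.15 bit patterns. It does not use the VisualDSP++ fract types, the helper names `q15_shr_arith` and `q15_div_trunc` are hypothetical, and it makes no claim about which rounding the compiler actually applies for >>.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models an arithmetic right shift of a 1.15 bit
 * pattern, i.e. division by 2^n rounded toward negative infinity.
 * Written with / and % so it does not rely on >> of a negative value,
 * which is implementation-defined in C. */
static int16_t q15_shr_arith(int16_t x, int n)
{
    assert(n >= 0 && n < 16);
    int32_t d = (int32_t)1 << n;
    int32_t q = x / d;          /* C division truncates toward zero */
    if (x % d < 0) {
        q -= 1;                 /* adjust a negative, inexact result down */
    }
    return (int16_t)q;
}

/* Hypothetical helper: division by 2^n using C integer division,
 * which rounds toward zero. */
static int16_t q15_div_trunc(int16_t x, int n)
{
    assert(n >= 0 && n < 16);
    return (int16_t)(x / ((int32_t)1 << n));
}
```

With these helpers, `q15_shr_arith(-5, 2)` gives -2 while `q15_div_trunc(-5, 2)` gives -1. For non-negative inputs and for exact multiples of the divisor the two agree.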

 

I have included the function I used to generate this assembly code as well as the command line in the attached files, if you are interested. Note that because I am using the VisualDSP++ debugger, I have optimization turned off.
