I don't understand why a pm instruction costs 2 cycles while a dm instruction costs 1,
and why some parallel instructions cost a "strange" number of cycles, for example:
lcntr=r1, do filtering until lce;
f9 =f2*f4, f13=f8+f14, f8=f2; // costs 1 cycle (results of ICE simulation)
f14=f2*f5, f3=f8+f12; // costs 2 cycles
f12=f3*f6, f8=f9+f13, dm(i4,m4)=f3; // costs 2 cycles
filtering: f8 =f3*f7, f12=f8+f12, f2=dm(i3,m4); // costs 1 cycle
Normally, each of the 4 instructions above should cost 1 cycle.
So could you tell me how many cycles each instruction takes?
I couldn't find the answer in the instruction set manuals.
I look forward to your reply.