
Help w/ ADSP-21489... DUAL SIMD MAC 40bit Floating Point

Talking (emailing) with an ADI applications engineer... he's VERY good... but, I'm getting frustrated not being able to find ANY good descriptions about how the DAGs are set-up to do DUAL SIMD MAC operations on 40-bit Floating Point variables generated by the PEx/PEy registers on the ADSP-21489.

It seems that no matter what search criteria I use on the ADI Site, or via Google, I get nowhere with my searches on DAG(s) and DUAL SIMD MAC operations using 40-bit Floating Point variables.

BTW, I think 40-Bit is a GREAT intermediary between SINGLE and DOUBLE precision FP and that DUAL SIMD MAC operations are probably the ADI crown jewel... but, there seems to be a dismal level of info in the Data Sheets and other support literature concerning this truly unique, valuable FP hardware. I'm focused on the ADSP-21489 for many reasons, but clearly this question should have general interest, or, am I missing the obvious?

Please tell me where/what I've missed on the ADI site to answer my query.

  • SHARC® Processor Programming Reference, Revision 2.2, March 2011, p. 7-45 contains the following text:

    Extended precision can’t be supported in SIMD mode since the both PM and DM data busses are limited to 64-bits but would require 80-bits.

    So you can't use 40-bit float together with SIMD.

  • Hello,

    Please look into the attached  “Normal-word space-Dual data-SIMD” block diagram which would help you to understand more about Dual Data SIMD operation. If you want to use SIMD with extended-precision 40-bit data, the PM and DM data bus width would require 80 bits. But as mentioned in the block diagram the PM and DM data bus width is limited to 64 bits. Hence it is not possible to perform a Dual Floating Point SIMD MAC operation using 40-bit Floating Point variables.

    Regards,

    Jithul
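The bus-width argument above can be written out as a small check. This is only a sketch in C (not SHARC code); the macro and function names are hypothetical, and it assumes, as the replies describe, that SIMD mode moves two elements per bus access (one for each processing element) over a 64-bit PM or DM bus.

```c
#include <stdbool.h>

/* Assumptions taken from the replies above: a 64-bit PM/DM data bus,
   and SIMD mode transferring two elements per access (PEx and PEy). */
#define BUS_WIDTH_BITS 64u
#define SIMD_LANES      2u

/* Hypothetical helper: true if two elements of the given width
   fit on one bus access in SIMD mode. */
bool fits_simd_bus(unsigned element_bits)
{
    return SIMD_LANES * element_bits <= BUS_WIDTH_BITS;
}
```

With these assumptions, `fits_simd_bus(32)` holds (2 x 32 = 64 bits), while `fits_simd_bus(40)` does not (2 x 40 = 80 bits).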

  • I use 40-bit float in VDSP++ with C++. But CCES is better, because the CCES compiler uses Fx = PASS Fy instead of the Rx (40 bit) = PASS Ry (only 32 bit) that VDSP++ uses. The memory that holds 40-bit variables must be 48 bits wide. Also, if you use C/C++, you must prevent the compiler from using SIMD by declaring no_vectorization. And don't forget to enable 40-bit float operation.

    Please give me some idea about the Assembler Code needed to LOAD then STORE registers like F0 and F4.

    You must locate the variables in 48-bit-wide memory.

    For example

    seg_msp_int_pmda_48          { TYPE(PM RAM) START(0x000C4000) END(0x000C47FF) WIDTH(48) }     // 8k*48

    seg_msp_int_dmda_48          { TYPE(DM RAM) START(0x000C4800) END(0x000C5554) WIDTH(48) }     // 8k*48

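    and enable the 48-bit memory width and 40-bit float operation:

    bit clr mode1 RND32;          /* round results to 40 bits instead of 32 */

    ustat1 = dm(SYSCTL);          /* read the current SYSCTL value */

    bit set ustat1 IMDW2;         /* set the IMDW2 bit */

    dm(SYSCTL) = ustat1;          /* write SYSCTL back */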

    Now you can use DAG registers that point to variables held in seg_msp_int_pmda_48 or seg_msp_int_dmda_48 for LOAD and STORE.
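To make concrete what a 40-bit variable adds over a 32-bit float, here is a minimal host-side sketch in C (not SHARC code). It assumes the 40-bit extended-precision layout is sign (1 bit), exponent (8 bits), fraction (31 bits), meaning an IEEE single with 8 extra low-order fraction bits; please check that against the Programming Reference. The function names are hypothetical, and the 40-bit pattern is held in the low 40 bits of a `uint64_t`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: widen an IEEE-754 single to a 40-bit pattern.
   Assumed layout: sign(1) | exponent(8) | fraction(31), so the 32-bit
   pattern becomes the upper 32 bits and the 8 new fraction bits are 0. */
uint64_t float_to_ext40(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret without aliasing issues */
    return (uint64_t)bits << 8;
}

/* Hypothetical helper: narrow a 40-bit pattern back to a single by
   truncating (not rounding) the 8 extra fraction bits. */
float ext40_to_float(uint64_t x)
{
    uint32_t bits = (uint32_t)((x >> 8) & 0xFFFFFFFFu);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Under this assumed layout, 1.0f (0x3F800000) widens to the 40-bit pattern 0x3F80000000.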

  • Hi,

    Thank you! There is also another one in the Revision 2.4, April 2013 reference that plainly states: SIMD mode operation is only supported in NW and SW space, i.e., SIMD operations cannot exceed the 64-bit width of the PM or DM buses.

  • Hi Jithul,

    In keeping with the 64-bit width limit (SIMD or not), please help me use 40-bit FP variables (since they encode 9+ decimal digits) in an efficient way that does not attempt to use SIMD and the PEx/PEy register sets. I've attached a draft edit of your mapping diagram that attempts to use the PX register and deal with 40-bit FP variables. Please give me some idea about the Assembler Code needed to LOAD then STORE registers like F0 and F4.

    Best Regards,

    Rich

    40-BitFloatingPoint.pptx
  • Hi Rich,

    In addition to Bookevg's answer, if you want to perform an immediate 40-bit data register load, you can refer to the code snippet given below.

    Extended precision data requires a combined PX1/PX2 register alignment for immediate load in SISD mode:


    Bit CLR MODE1 PEYEN;

    NOP;

    PX2 = 0x12345678;      /* load data 39-8*/

    PX1 = 0x9A000000;      /* load data 7-0*/

    F0 = PX;                     /* F0 load with 40-bit*/

    F4 = PX;                     /* F4 load with 40-bit*/

    Regards,

    Jithul
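The PX2/PX1 constants in the snippet above follow directly from its comments (bits 39-8 go to PX2, bits 7-0 go to the top byte of PX1). The following C sketch, meant for a host PC rather than the SHARC, shows that split for an arbitrary 40-bit pattern. The `px_pair` type and function names are hypothetical, and the bit placement is assumed from the snippet's comments only.

```c
#include <stdint.h>

/* Hypothetical container for the two 32-bit halves of PX. */
typedef struct {
    uint32_t px2;   /* data bits 39-8 */
    uint32_t px1;   /* data bits 7-0, placed in bits 31-24 */
} px_pair;

/* Split a 40-bit pattern (low 40 bits of x) into PX2/PX1 values,
   following the placement given in the comments of the snippet above. */
px_pair ext40_to_px(uint64_t x)
{
    px_pair p;
    x &= 0xFFFFFFFFFFULL;                    /* keep only 40 bits */
    p.px2 = (uint32_t)(x >> 8);              /* bits 39-8 */
    p.px1 = (uint32_t)(x & 0xFFu) << 24;     /* bits 7-0 into the top byte */
    return p;
}

/* Inverse: rebuild the 40-bit pattern from PX2/PX1 values. */
uint64_t px_to_ext40(px_pair p)
{
    return ((uint64_t)p.px2 << 8) | (uint64_t)(p.px1 >> 24);
}
```

For the 40-bit pattern 0x123456789A this gives PX2 = 0x12345678 and PX1 = 0x9A000000, matching the immediate values used in the snippet.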

  • Hi bookevg, So, if I set up the PM and DM Blocks with 48-bit widths, I can use the DAGs (pointing into these 48-bit regions) as LOAD and STORE areas for 40-bit FP Registers (PEx and PEy?). Rich

  • Hi Jithul, So, if I use the "combined" PX Universal Register, I can transfer (LOAD and/or STORE) a 40-bit FP Register with respect to the PX Universal Register which is (I guess) limited to immediate data transfers. Rich

  • SHARCguy wrote:

    Hi bookevg, So, if I set up the PM and DM Blocks with 48-bit widths, I can use the DAGs (pointing into these 48-bit regions) as LOAD and STORE areas for 40-bit FP Registers (PEx and PEy?). Rich

    Yes, but you can't use SIMD

  • Hi bookevg,

    I truly thank you for your time and your responses. But, it would be helpful if you would read into my questions and try to answer my PEx vs PEy questions... for instance, in the 21489 data sheet, there is a Core Block Diagram that shows a DATA SWAP between the PEx and PEy register sets.  And, in the many pieces of SHARC literature I've read, there is no place I can find that states PEy is ALWAYS implicitly (only) handled in the SIMD Mode. That is, what NON-SIMD operations can be done on the contents of the PEy Registers?