Could you please specify what performance degradation I can expect when executing matrix-vector operations with 32-bit floats versus 64-bit doubles? I know the SHARC family supports 64-bit floats only via a floating-point emulation library, so execution will be much slower; my question is, how much slower? What sustained performance can I expect, i.e. how far from the 3.2 GFLOPS peak for 32-bit floats?
Can anyone comment on performance using 40-bit extended precision? Are there any libraries that support arithmetic at that extended precision, or is hand-crafted assembler the only option?
Is anyone aware of a BLAS library for the TigerSHARC?