Question:
How do I instruct the compiler to generate SIMD code? It is not generated by default.
---------------------------------------------
Answer:
SIMD code is not generated by default for most SHARC processors for reasons that are explained below. The compiler supports a number of switches that allow you to control when SIMD code is generated. These switches are:
On some SHARC processors, SIMD accesses to external memory are not possible, or are possible only for certain memory types. Attempting SIMD accesses to external memory on these processors can cause run-time failures, because data accesses using the second processing element (“PEy”) have no effect. In CrossCore Embedded Studio, the compiler is conservative: it will not generate SIMD code if there is any possibility that SIMD accesses may fail. This restriction affects the following processors, for which SIMD code will not be generated by default:
For the following processors, SIMD code is generated by default. These processors do not support direct access to external memory, so the limitations described above do not apply.
Documentation
For more information on SIMD code generation, see the section “SIMD Support” in the C/C++ Compiler Manual.
Differences from VisualDSP++
In VisualDSP++, the compiler’s default behavior was to generate SIMD code, as long as certain criteria concerning alignment, aliasing and performance gains were met. If an application performed SIMD accesses and also used external memory, there was the potential for run-time errors that were difficult to diagnose. As described above, the compiler’s behavior in CrossCore Embedded Studio is now more conservative, so the out-of-the-box performance of compiled code may appear to be poorer. The switch “-loop-simd” can be used to reinstate the behavior of the VisualDSP++ compiler.
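As a rough illustration of the kind of loop these criteria apply to, here is a minimal sketch in C. The function name scale_add, its parameters and the file name in the comment are hypothetical and do not come from the compiler manual. The restrict qualifiers (C99) are one way to tell the compiler that the buffers do not alias. The build line in the comment is an assumed invocation, with the processor name left as a placeholder. Whether SIMD code is actually generated still depends on the processor, the optimization settings and where the buffers are placed in memory.

```c
#include <stddef.h>

/*
 * Hypothetical example: a simple element-wise loop that a vectorizing
 * compiler could consider for SIMD code generation.
 *
 * Assumed build line (illustrative only; substitute your processor):
 *   cc21k -proc <processor> -O -loop-simd scale_add.c
 *
 * Assumption: out, a and b all live in memory for which SIMD accesses
 * are supported on the target (see the external memory restriction
 * described in the answer).
 */
void scale_add(float *restrict out,
               const float *restrict a,
               const float *restrict b,
               float gain,
               size_t n)
{
    size_t i;

    /* Each iteration is independent of the others, and the restrict
       qualifiers promise the compiler that out does not overlap a or b. */
    for (i = 0; i < n; i++) {
        out[i] = a[i] * gain + b[i];
    }
}
```

If a loop like this is compiled with “-loop-simd” and any of its buffers could be placed in external memory on an affected processor, the run-time failures described above become possible again, so buffer placement should be checked before enabling the switch.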