Why do the input and output arrays have to be pragma-aligned in the large accel_xxxx examples?
For best performance, all arrays should be aligned to at least a 32-byte boundary.
All large FFTs (i.e. where the number of points exceeds 2048) require every input or output buffer of complex data to be aligned to at least an 8-byte boundary. If a buffer that does not meet this requirement is passed to the function, an error occurs and the function returns NULL.
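As a rough illustration of the two alignment rules above, here is a minimal sketch in standard C11. It uses `alignas` as a portable stand-in for the toolchain's alignment pragma, so the exact pragma spelling should be taken from your compiler's documentation. The names `complex_float_t`, `fft_input`, `fft_output`, `is_aligned` and `buffers_ok_for_large_fft` are hypothetical and are not part of the accel_xxxx API.

```c
#include <assert.h>    /* static_assert */
#include <stdalign.h>  /* alignas */
#include <stddef.h>    /* size_t */
#include <stdint.h>    /* uintptr_t */

/* Hypothetical stand-in for the library's complex data type. */
typedef struct { float re; float im; } complex_float_t;

/* Assumption: one complex sample is two 32-bit floats with no padding. */
static_assert(sizeof(complex_float_t) == 2 * sizeof(float),
              "unexpected padding in complex_float_t");

#define N_POINTS 4096  /* more than 2048 points, i.e. a "large" FFT */

/* Aligning to 32 bytes covers both rules: the 32-byte performance
 * recommendation and the 8-byte minimum for large FFTs. */
static alignas(32) complex_float_t fft_input[N_POINTS];
static alignas(32) complex_float_t fft_output[N_POINTS];

/* Returns 1 if ptr lies on a multiple of boundary (a power of two), else 0. */
static int is_aligned(const void *ptr, size_t boundary)
{
    return ((uintptr_t)ptr & (boundary - 1u)) == 0u;
}

/* Hypothetical pre-flight check mirroring the 8-byte minimum for large FFTs. */
static int buffers_ok_for_large_fft(const void *in, const void *out)
{
    return is_aligned(in, 8) && is_aligned(out, 8);
}
```

Calling `buffers_ok_for_large_fft(fft_input, fft_output)` before invoking the FFT function is one way to catch a misaligned buffer early, instead of discovering it through a NULL return.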