I am developing an image processing application with Ethernet support. I developed and optimized the algorithms in a stand-alone application, and have now merged them all into an LwIP application. As a result, all of these algorithms need many more cycles to execute. For example, one part of the image processing takes nearly 9 million cycles in the stand-alone application but 75 million cycles in the LwIP application, roughly eight times as many. Both projects use exactly the same source code and the same project options. All code and all buffers that are used are in internal memory. I have also tried it on different boards, the BF537 EZ-Kit and my custom BF537 board.
What could be the reason for this behavior and how can I solve this problem?