Is there a way to optimize GCC-compiled code in terms of CPU and memory usage using option flags? Does using -O3 rather than -O1 increase or decrease the amount of memory or CPU usage?
GCC optimization for CPU and MEMORY usage
4k views. Asked by staticx. There are 3 answers.
About memory usage:
-Os reduces the binary size of a program. It has limited effect on runtime memory usage (C/C++ memory allocation and deallocation is "manual"). I say limited since tail recursion optimization can lower stack usage (this optimization will also be performed with -O2/-O3).
The -flto (link time optimization) option can also lower binary size.
CPU usage:
Highly optimized code (e.g. -O3) will stress the CPU, but that doesn't automatically mean higher total CPU power consumption, since it may also lead to shorter execution times. E.g. in Compiler-Based Optimizations Impact on Embedded Software Power Consumption (not strictly GCC related but interesting), they find that enabling various global speed compiler optimizations leads to a considerable increase in the power consumption of the DSP (on average, by 25%). Although these optimizations increase the power consumed by the DSP, the energy usage while running an algorithm decreased, on average, by 95%.
Profile-guided optimization could lower CPU consumption (see The risks of using PGO (profile-guided optimization) with production environment).
Take a look at Can we optimize code to reduce power consumption?
Probably you should use -O2 and not worry about it: if you're looking to save power / memory, the overall design of your application will have more effect than a compiler switch.
Code size optimizations are addressed above.
I'm only looking at CPU optimization. You can write really good/optimized code that has low processor utilization, and really bad/unoptimized code that maximizes CPU utilization.
So how do you most effectively use your processor?
- First, use a good optimizing compiler. I won't speak to GCC, but Intel and some other commercial compilers (e.g. PGI) are very good at optimization.
- Exploit the underlying hardware, such as vector instructions, FMA, registers, etc.
- Follow best practices for use of peripherals, such as cellular, wifi, gps, etc.
- Follow best practices for SW design, such as latency hiding, avoiding polling by using interrupts, using a thread pool if appropriate, etc.
Good luck.
You might try -Os, which is like -O2 (good CPU speed) while simultaneously trying to reduce the binary size. Check out the various optimizations here.