Why does GCC convert a less-than expression into a less-than-or-equal-to expression?


I am working on code optimization and reading through GCC internals. I wrote a simple expression in my program, checked its GIMPLE representation, and could not work out why GCC had transformed it. Say I have the expression:

if(i < 9)

then in the GIMPLE representation it is converted to

if(i <= 8)

I don't know why GCC does this. Is it some kind of optimization? If so, can anyone explain how it helps to optimize the program?


There are 2 answers

wildplasser (best answer)

The canonicalisation helps GCC detect common subexpressions. In the following program, u < 5 and u <= 4 are the same test written in two ways:

#include <stdio.h>

int main(void)
{
    unsigned u, pos;
    char buff[40];

    for (u = pos = 0; u < 10; u++) {
        buff[pos++] = (u < 5) ? 'A' + u : 'a' + u;
        buff[pos++] = (u <= 4) ? '0' + u : 'A' + u;
    }
    buff[pos++] = 0;
    printf("=%s=\n", buff);
    return 0;
}

With -O1, GCC compiles this into the following. Note that both conditions are served by a single comparison (cmpl $4, %eax):

         ...
        movl    $1, %edx
        movl    $65, %ecx
.L4:
        cmpl    $4, %eax
        ja      .L2
        movb    %cl, (%rsi)
        leal    48(%rax), %r8d
        jmp     .L3
.L2:
        leal    97(%rax), %edi
        movb    %dil, (%rsi)
        movl    %ecx, %r8d
.L3:
        mov     %edx, %edi
        movb    %r8b, (%rsp,%rdi)
        addl    $1, %eax
        addl    $1, %ecx
        addl    $2, %edx
        addq    $2, %rsi
        cmpl    $10, %eax
        jne     .L4
        movb    $0, 20(%rsp)
        movq    %rsp, %rdx
        movl    $.LC0, %esi
        movl    $1, %edi
        movl    $0, %eax
        call    __printf_chk
         ...

At -O2, GCC actually removes the entire loop and replaces it with a stream of assignments.

Mike Kwan

Consider the following C code:

int i = 10;

if(i < 9) {
  puts("1234");
}

And also the equivalent C code:

int i = 10;

if(i <= 8) {
  puts("asdf");
}

With optimisation disabled, both generate exactly the same assembly sequence:

40052c:       c7 45 fc 0a 00 00 00    movl   $0xa,-0x4(%rbp)
400533:       83 7d fc 08             cmpl   $0x8,-0x4(%rbp)
400537:       7f 0a                   jg     400543 <main+0x1f>
400539:       bf 3c 06 40 00          mov    $0x40063c,%edi
40053e:       e8 d5 fe ff ff          callq  400418 <puts@plt>
400543:       .. .. .. ..             .. ..  ..

Since I am not familiar with the GCC implementation, I can only speculate about why the conversion is done at all. Perhaps it makes the code generator's job easier, because it then only has to handle a single case. I expect someone can come up with a more definitive answer.