Can _start be a Thumb function?


Please help me with the GNU assembler for an ARM926EJ-S CPU.

I am trying to build a simple program (test.S):

.global _start 
_start:
    mov r0, #2
    bx lr

It builds successfully:

arm-none-linux-gnueabi-as -mthumb -o test.o test.S
arm-none-linux-gnueabi-ld -o test test.o

But when I run the program on the target ARM Linux system, I get an error:

./test 
Segmentation fault

What am I doing wrong? Can the _start function be a Thumb function, or must it always be an ARM function?


There are 3 answers

auselen

Your problem is you end with

bx lr

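and you expect Linux to take over after that. That line alone is enough to cause the segmentation fault: the kernel jumps to _start directly rather than calling it, so lr holds no valid return address at process entry. The program has to end with an exit system call instead.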

Try creating a minimal executable and then dissecting it to see its guts and understand how an executable is expected to behave.

See below for a working example:

.global _start
.thumb_func
_start:
    mov r0, #42
    mov r7, #1
    svc #0

compile with

arm-linux-gnueabihf-as start.s -o start.o && arm-linux-gnueabihf-ld start.o -o start_test

and dump to see the guts

$ arm-linux-gnueabihf-readelf -a -W start_test

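Now you should notice the odd address of _start: the entry point is 0x8055 although .text starts at 0x8054. Bit 0 is set to mark _start as Thumb code.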

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           ARM
  Version:                           0x1
  Entry point address:               0x8055
  Start of program headers:          52 (bytes into file)
  Start of section headers:          160 (bytes into file)
  Flags:                             0x5000000, Version5 EABI
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         1
  Size of section headers:           40 (bytes)
  Number of section headers:         6
  Section header string table index: 3

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00008054 000054 000006 00  AX  0   0  4
  [ 2] .ARM.attributes   ARM_ATTRIBUTES  00000000 00005a 000014 00      0   0  1
  [ 3] .shstrtab         STRTAB          00000000 00006e 000031 00      0   0  1
  [ 4] .symtab           SYMTAB          00000000 000190 0000e0 10      5   6  4
  [ 5] .strtab           STRTAB          00000000 000270 000058 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

There are no section groups in this file.

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00008000 0x00008000 0x0005a 0x0005a R E 0x8000

 Section to Segment mapping:
  Segment Sections...
   00     .text 

There is no dynamic section in this file.

There are no relocations in this file.

There are no unwind sections in this file.

Symbol table '.symtab' contains 14 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 00008054     0 SECTION LOCAL  DEFAULT    1 
     2: 00000000     0 SECTION LOCAL  DEFAULT    2 
     3: 00000000     0 FILE    LOCAL  DEFAULT  ABS start.o
     4: 00008054     0 NOTYPE  LOCAL  DEFAULT    1 $t
     5: 00000000     0 FILE    LOCAL  DEFAULT  ABS 
     6: 0001005a     0 NOTYPE  GLOBAL DEFAULT    1 _bss_end__
     7: 0001005a     0 NOTYPE  GLOBAL DEFAULT    1 __bss_start__
     8: 0001005a     0 NOTYPE  GLOBAL DEFAULT    1 __bss_end__
     9: 00008055     0 FUNC    GLOBAL DEFAULT    1 _start
    10: 0001005a     0 NOTYPE  GLOBAL DEFAULT    1 __bss_start
    11: 0001005c     0 NOTYPE  GLOBAL DEFAULT    1 __end__
    12: 0001005a     0 NOTYPE  GLOBAL DEFAULT    1 _edata
    13: 0001005c     0 NOTYPE  GLOBAL DEFAULT    1 _end

No version information found in this file.
Attribute Section: aeabi
File Attributes
  Tag_CPU_arch: v4T
  Tag_THUMB_ISA_use: Thumb-1
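
To make the odd entry address concrete, here is a minimal sketch in portable C that pulls e_entry out of a little-endian ELF32 header and splits it into the Thumb state bit and the real code address. The function names are made up for illustration, and only the header fields needed here are checked.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Extract e_entry from a little-endian ELF32 header.
 * Returns 0 on success, -1 if the buffer is too short or is not
 * a little-endian ELF32 image. (Hypothetical helper, not a library API.)
 */
int elf32_le_entry(const unsigned char *hdr, size_t len, uint32_t *entry)
{
    if (len < 52)                       /* size of an ELF32 header */
        return -1;
    if (hdr[0] != 0x7f || hdr[1] != 'E' || hdr[2] != 'L' || hdr[3] != 'F')
        return -1;
    if (hdr[4] != 1 || hdr[5] != 1)     /* ELFCLASS32, ELFDATA2LSB */
        return -1;
    *entry = read_le32(hdr + 24);       /* e_entry sits at offset 24 */
    return 0;
}

/* Bit 0 of an ARM entry address selects Thumb state. */
int entry_is_thumb(uint32_t entry)
{
    return (int)(entry & 1u);
}

/* Address of the first instruction, with the state bit cleared. */
uint32_t entry_code_address(uint32_t entry)
{
    return entry & ~(uint32_t)1;
}
```

For the header dumped above, e_entry is 0x8055, so entry_is_thumb() reports Thumb and entry_code_address() gives 0x8054, which matches the start of .text.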

artless noise

Can _start be a thumb function (in a Linux user program)?

Yes, it can, but the steps are not as simple as you might expect.

Use .code 16 as described by others. Also look at ARM Script predicate; my answer there shows how to detect a Thumb binary. The entry symbol must have the traditional _start+1 value, or Linux will call your _start in ARM mode.

Also, your code is trying to emulate:

 int main(void) { return 2; }

_start must not return like this (as per auselen). To get from _start to main() in ARM mode, you need something like:

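 #include <linux/unistd.h>

 int main(int argc, char *argv[]);

 /* Terminate the process via the exit system call (EABI convention). */
 static inline void exit(int status)
 {
         asm volatile ("mov      r0, %0\n\t"   /* r0 = exit status */
                 "mov    r7, %1\n\t"           /* r7 = syscall number */
                 "swi    #0\n\t"               /* EABI uses immediate 0 */
                 : : "r" (status),
                   "Ir" (__NR_exit)
                 : "r0", "r7");
 }
 /* Wrapper for main return code. */
 void __attribute__ ((unused)) estart (int argc, char*argv[])
 {
     int rval = main(argc,argv);
     exit(rval);
 }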

 /* Setup arguments for estart [like main()]. */
 void __attribute__ ((naked)) _start (void)
 {
     asm(" sub     lr, lr, lr\n"   /* Clear the link register. */
         " ldr     r0, [sp]\n"     /* Get argc... */
         " add     r1, sp, #4\n"   /* ... and argv ... */
         " b       estart\n"       /* Let's go! */
         );
 }

It is good to clear lr so that stack traces terminate. You can skip the argc and argv processing if you want; _start shows how to set them up. estart is just a wrapper that converts the main() return code into an exit() call.
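
The two instructions that fetch argc and argv rely on the initial stack that Linux hands to a new process: the word at sp is argc, followed by the argv pointers, a NULL terminator, and then the environment pointers. Here is a minimal sketch of that layout in portable C, reading from a mock stack array instead of the real sp. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Word 0 of the initial stack holds argc (what "ldr r0, [sp]" loads). */
int stack_argc(const uintptr_t *sp)
{
    return (int)sp[0];
}

/* argv[i] lives in word 1 + i (argv itself is sp plus one word);
   argv[argc] is the NULL terminator. */
const char *stack_argv(const uintptr_t *sp, int i)
{
    return (const char *)sp[1 + i];
}

/* The environment pointers start right after argv's NULL terminator. */
const char *stack_envp(const uintptr_t *sp, int i)
{
    return (const char *)sp[1 + stack_argc(sp) + 1 + i];
}
```

On a 32-bit ARM target one word is 4 bytes, which is why the assembly uses sp + 4 for argv.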

You need to convert the above assembler to Thumb equivalents. I would suggest using gcc inline assembler. You can convert to pure assembler source if you get inlines to work. However, doing this in 'C' source is probably more practical, unless you are trying to make a very minimal executable.

Helpful gcc arguments are:

 -nostartfiles -static -nostdlib -isystem <path to linux user headers>

Add -mthumb and you should have a harness for either mode.

g3plc

Here is the answer.

Thanks to all.

http://stuff.mit.edu/afs/sipb/project/egcs/src/egcs/gcc/config/arm/README-interworking

  • Calls via function pointers should use the BX instruction if the call is made in ARM mode:

        .code 32
        mov lr, pc
        bx  rX
    

    This code sequence will not work in Thumb mode however, since the mov instruction will not set the bottom bit of the lr register. Instead a branch-and-link to the _call_via_rX functions should be used instead:

        .code 16
        bl  _call_via_rX
    

    where rX is replaced by the name of the register containing the function address.
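
The point about the bottom bit of lr can be checked with a little arithmetic. Here is a small model in portable C, not real ARM code, with made-up function names. It assumes the usual rules that reading pc yields the instruction address plus 8 in ARM state and plus 4 in Thumb state, and that bx picks the state from bit 0 of its target.

```c
#include <stdint.h>

/* Value seen when an instruction reads pc: its own address plus 8
   in ARM state, plus 4 in Thumb state. */
uint32_t pc_as_read(uint32_t insn_addr, int thumb)
{
    return insn_addr + (thumb ? 4u : 8u);
}

/* lr after "mov lr, pc" located at mov_addr. A plain mov copies the
   value and never sets bit 0. */
uint32_t lr_after_mov_lr_pc(uint32_t mov_addr, int thumb)
{
    return pc_as_read(mov_addr, thumb);
}

/* lr after a 4-byte Thumb "bl" at bl_addr: the address of the next
   instruction with bit 0 set. */
uint32_t lr_after_thumb_bl(uint32_t bl_addr)
{
    return (bl_addr + 4u) | 1u;
}

/* State that "bx target" switches to: 1 for Thumb, 0 for ARM. */
int bx_selects_thumb(uint32_t target)
{
    return (int)(target & 1u);
}
```

With the mov at 0x8000, the ARM sequence leaves lr = 0x8008 and a later bx lr correctly returns in ARM state. The same two instructions in Thumb leave lr = 0x8004 with bit 0 clear, so bx lr would return in ARM state to Thumb code. A Thumb bl at 0x8000 leaves lr = 0x8005, which returns in Thumb state.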