How do I convert a binary firmware dump to an .elf for assembly language debugging?


I have a binary firmware image for an ARM Cortex-M that I know should be loaded at 0x20000000. I would like to convert it to a format that I can use for assembly-level debugging with gdb, which I assume means converting it to an .elf. But I have not been able to figure out how to add enough metadata to the .elf for this to work. Here is what I've tried so far, followed by what objdump reports for the result.

arm-none-eabi-objcopy -I binary -O elf32-littlearm --set-section-flags \
    .data=alloc,contents,load,readonly \
    --change-section-address .data=0x20000000 efr32.bin efr32.elf

efr32.elf:     file format elf32-little
efr32.elf
architecture: UNKNOWN!, flags 0x00000010:
HAS_SYMS
start address 0x00000000

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .data         00000168  20000000  20000000  00000034  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
SYMBOL TABLE:
20000000 l    d  .data  00000000 .data
20000000 g       .data  00000000 _binary_efr32_bin_start
20000168 g       .data  00000000 _binary_efr32_bin_end
00000168 g       *ABS*  00000000 _binary_efr32_bin_size

Do I need to start by converting the binary to .o and write a simple linker script? Should I add an architecture option to the objcopy command?

Answer by old_timer

A little experiment. Start with a known disassembly from a real program:

  58:   480a        ldr r0, [pc, #40]   ; (84 <spi_write_byte+0x38>)
  5a:   bf08        it  eq
  5c:   4809        ldreq   r0, [pc, #36]   ; (84 <spi_write_byte+0x38>)
  5e:   f04f 01ff   mov.w   r1, #255    ; 0xff

You don't have that listing, of course, but you can read the halfwords out of the binary and hand them back to the assembler like this:

.thumb
.globl _start
_start:
.inst.n 0x480a
.inst.n 0xbf08
.inst.n 0x4809
.inst.n 0xf04f
.inst.n 0x01ff

Then assemble it, link it at the right address, disassemble the result, and see what happens.

arm-none-eabi-as test.s -o test.o
arm-none-eabi-ld -Ttext=0x58 test.o -o test.elf
arm-none-eabi-objdump -D test.elf

test.elf:     file format elf32-littlearm


Disassembly of section .text:

00000058 <_start>:
  58:   480a        ldr r0, [pc, #40]   ; (84 <_start+0x2c>)
  5a:   bf08        it  eq
  5c:   4809        ldreq   r0, [pc, #36]   ; (84 <_start+0x2c>)
  5e:   f04f 01ff   mov.w   r1, #255    ; 0xff
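
The hand-typed .inst.n listing above can be generated from the raw image instead. This is only a minimal sketch: the function name bin_to_inst_listing is made up for illustration, the image is assumed to be little-endian Thumb code, and padding an odd-length image with a zero byte is my assumption, not something the tools require.

```python
import struct

def bin_to_inst_listing(image: bytes) -> str:
    """Emit one .inst.n line per little-endian halfword of a raw image."""
    if len(image) % 2:
        image += b"\x00"  # assumption: pad an odd-length image with a zero byte
    lines = [".thumb", ".globl _start", "_start:"]
    # "<H" = little-endian unsigned 16-bit value
    for (halfword,) in struct.iter_unpack("<H", image):
        lines.append(f".inst.n 0x{halfword:04x}")
    return "\n".join(lines) + "\n"

# The ten bytes behind the four instructions shown at 0x58..0x61
sample = bytes.fromhex("0a4808bf09484ff0ff01")
print(bin_to_inst_listing(sample), end="")
```

For a real image you would read the file in binary mode, write the returned text to test.s, and use the same as/ld/objdump commands, with -Ttext set to the load address (0x20000000 in the question).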

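But the reality is that this won't work in general. If the binary uses any Thumb-2 extensions, you can't disassemble it linearly: the instructions are variable length (16 or 32 bits), so a data halfword embedded in the code can be mistaken for the first half of a 32-bit instruction and swallow the real instruction that follows it. You have to deal with the instructions in execution order. To do this correctly you have to write a disassembler that walks through the code in execution order, works out which halfwords are instructions, and marks them as instructions. Take this fragment, which has data at 0x84 and 0x88: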

  80:   d1e8        bne.n   54 <spi_write_byte+0x8>
  82:   bd70        pop {r4, r5, r6, pc}
  84:   40005200
  88:   F7FF4000
  8c:   e92d 41f0   stmdb   sp!, {r4, r5, r6, r7, r8, lr}
  90:   4887        ldr r0, [pc, #540]  ; (2b0 <notmain+0x224>)

Treat every halfword of it as an instruction:

.thumb
.globl _start
_start:
.inst.n 0xd1e8
.inst.n 0xbd70
.inst.n 0x5200
.inst.n 0x4000
.inst.n 0x4000
.inst.n 0xF7FF
.inst.n 0xe92d
.inst.n 0x41f0
.inst.n 0x4887

  80:   d1e8        bne.n   54 <_start-0x2c>
  82:   bd70        pop {r4, r5, r6, pc}
  84:   5200        strh    r0, [r0, r0]
  86:   4000        ands    r0, r0
  88:   4000        ands    r0, r0
  8a:   f7ff e92d           ; <UNDEFINED> instruction: 0xf7ffe92d
  8e:   41f0        rors    r0, r6
  90:   4887        ldr r0, [pc, #540]  ; (2b0 <_start+0x230>)

The disassembler loses sync at the data, recovers a couple of instructions later, breaks again at the next block of data, recovers, and so on.
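
To see why the sweep desynchronizes, here is a small sketch of the length rule a linear disassembler applies. In Thumb, a halfword whose top five bits are 0b11101, 0b11110 or 0b11111 is the first half of a 32-bit instruction; everything else is a 16-bit instruction. The function names are mine, and the input is just the nine halfwords from the example above.

```python
def is_32bit_first_halfword(hw: int) -> bool:
    """True if hw would be decoded as the first half of a 32-bit Thumb instruction."""
    return (hw >> 11) in (0b11101, 0b11110, 0b11111)

def linear_sweep(halfwords, base):
    """Split halfwords into (address, size_in_bytes) the way a linear sweep would."""
    out = []
    i = 0
    while i < len(halfwords):
        wide = is_32bit_first_halfword(halfwords[i]) and i + 1 < len(halfwords)
        size = 4 if wide else 2
        out.append((base + 2 * i, size))
        i += size // 2
    return out

# Halfwords at 0x80..0x91 from the example; 0x84..0x8b is really data
example = [0xd1e8, 0xbd70, 0x5200, 0x4000, 0x4000, 0xf7ff, 0xe92d, 0x41f0, 0x4887]
for addr, size in linear_sweep(example, 0x80):
    print(f"{addr:#x}: {size} bytes")
```

The data halfword 0xf7ff at 0x8a looks like the start of a 32-bit instruction, so it consumes 0xe92d, and the real stmdb at 0x8c never gets decoded.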

Instead you have to write a disassembler that walks through the code. It doesn't necessarily have to disassemble to assembly language, but it must decode enough to follow the code and recurse down all possible branch paths. Everything that is not determined to be an instruction gets marked as data:

.thumb
.globl _start
_start:
.inst.n 0xd1e8
.inst.n 0xbd70
.word 0x40005200
.word 0xF7FF4000
.inst.n 0xe92d
.inst.n 0x41f0
.inst.n 0x4887

00000080 <_start>:
  80:   d1e8        bne.n   54 <_start-0x2c>
  82:   bd70        pop {r4, r5, r6, pc}
  84:   40005200    andmi   r5, r0, r0, lsl #4
  88:   f7ff4000            ; <UNDEFINED> instruction: 0xf7ff4000
  8c:   e92d 41f0   stmdb   sp!, {r4, r5, r6, r7, r8, lr}
  90:   4887        ldr r0, [pc, #540]  ; (2b0 <_start+0x230>)

And our stmdb instruction is now correct.
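
A very rough sketch of such a walker follows. It is not a complete tool. It only understands 16-bit conditional and unconditional branches, pop {..., pc} and bx as control flow; it treats every 32-bit encoding as a plain fall-through instruction (so bl and b.w targets are not followed); and it ignores cbz/cbnz, table branches and writes to pc. The names walk and emit are hypothetical, and the entry points (with the Thumb bit already cleared) are assumed to come from somewhere else, such as the vector table.

```python
def is_32bit_first_halfword(hw):
    return (hw >> 11) in (0b11101, 0b11110, 0b11111)

def sign_extend(value, bits):
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def walk(halfwords, base, entry_points):
    """Return {address: size_in_bytes} for instructions reachable from entry_points."""
    end = base + 2 * len(halfwords)
    code, work = {}, list(entry_points)
    while work:
        addr = work.pop()
        while base <= addr < end and addr not in code:
            hw = halfwords[(addr - base) // 2]
            if is_32bit_first_halfword(hw):
                if addr + 4 > end:
                    break                      # truncated instruction, leave as data
                code[addr] = 4                 # not decoded further in this sketch
                addr += 4
                continue
            code[addr] = 2
            if (hw & 0xF000) == 0xD000 and ((hw >> 8) & 0xF) < 0xE:
                # conditional branch: queue the target, then fall through
                work.append(addr + 4 + 2 * sign_extend(hw & 0xFF, 8))
            elif (hw & 0xF800) == 0xE000:
                # unconditional branch: continue at the target only
                addr = addr + 4 + 2 * sign_extend(hw & 0x7FF, 11)
                continue
            elif (hw & 0xFF00) == 0xBD00 or (hw & 0xFF87) == 0x4700:
                break                          # pop {..., pc} or bx: path ends here
            addr += 2
    return code

def emit(halfwords, base, code):
    """Emit .inst.n for walked instructions and .word/.short for everything else."""
    lines = [".thumb", ".globl _start", "_start:"]
    addr, end = base, base + 2 * len(halfwords)
    while addr < end:
        i = (addr - base) // 2
        if addr in code:
            for hw in halfwords[i:i + code[addr] // 2]:
                lines.append(f".inst.n 0x{hw:04x}")
            addr += code[addr]
        elif addr % 4 == 0 and addr + 4 <= end and addr + 2 not in code:
            lines.append(f".word 0x{halfwords[i + 1]:04x}{halfwords[i]:04x}")
            addr += 4
        else:
            lines.append(f".short 0x{halfwords[i]:04x}")
            addr += 2
    return lines

example = [0xd1e8, 0xbd70, 0x5200, 0x4000, 0x4000, 0xf7ff, 0xe92d, 0x41f0, 0x4887]
# 0x8c is passed as a second entry point because nothing in this fragment branches to it
reachable = walk(example, 0x80, [0x80, 0x8c])
print("\n".join(emit(example, 0x80, reachable)))
```

On the nine halfwords from the example this reproduces the listing above: the two words at 0x84 and 0x88 come out as .word and everything else as .inst.n. A real image needs far more decoding than this before the result can be trusted.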

Good luck.