How to call a C language function from x86 assembly code?


I have recently been studying operating systems on my own, using the xv6 teaching OS. The version I am using is the x86 one from GitHub. I am trying to use 2-level paging during system initialization. For that purpose, I created a page table and a page directory in main.c as follows:

__attribute__((__aligned__(PGSIZE)))
pde_t entrypgdir[NPDENTRIES];

__attribute__((__aligned__(PGSIZE)))
pte_t entrypgtable[NPTENTRIES];

Then, in entry.S, I added some assembly code to initialize these two arrays as follows:

# The xv6 kernel starts executing in this file. This file is linked with
# the kernel C code, so it can refer to kernel symbols such as main().
# The boot block (bootasm.S and bootmain.c) jumps to entry below.
        
# Multiboot header, for multiboot boot loaders like GNU Grub.
# http://www.gnu.org/software/grub/manual/multiboot/multiboot.html
#
# Using GRUB 2, you can boot xv6 from a file stored in a
# Linux file system by copying kernel or kernelmemfs to /boot
# and then adding this menu entry:
#
# menuentry "xv6" {
#   insmod ext2
#   set root='(hd0,msdos1)'
#   set kernel='/boot/kernel'
#   echo "Loading ${kernel}..."
#   multiboot ${kernel} ${kernel}
#   boot
# }

#include "asm.h"
#include "memlayout.h"
#include "mmu.h"
#include "param.h"

# Multiboot header.  Data to direct multiboot loader.
.p2align 2
.text
.globl multiboot_header
multiboot_header:
  #define magic 0x1badb002
  #define flags 0
  .long magic
  .long flags
  .long (-magic-flags)

# By convention, the _start symbol specifies the ELF entry point.
# Since we haven't set up virtual memory yet, our entry point is
# the physical address of 'entry'.
.globl _start
_start = V2P_WO(entry)

# Entering xv6 on boot processor, with paging off.
.globl entry
entry:
  # Turn on page size extension for 4Mbyte pages
  // I comment the following to not use bigger pages
  #movl    %cr4, %eax
  #orl     $(CR4_PSE), %eax
  #movl    %eax, %cr4
  # Set page directory

  //My assembly code to initialize a page table starts
  #set up the first page table page

  xor %esi, %esi
1:
  movl %esi, %eax
  shll $12, %eax
  orl $(PTE_P|PTE_W), %eax
  movl $(V2P_WO(entrypgtable)), %edi
  movl %esi, %ebx
  shll $2, %ebx
  addl %ebx, %edi
  movl %eax, (%edi)
  incl %esi
  cmpl $1024, %esi
  jb 1b

  # Set page directory

  movl $0, %esi
  movl %esi, %ebx
  shll $2, %ebx
  movl $(V2P_WO(entrypgdir)), %edi
  addl %ebx, %edi
  movl $(V2P_WO(entrypgtable)), (%edi)
  orl $(PTE_P | PTE_W), (%edi)

  movl $512, %esi
  movl %esi, %ebx
  shll $2, %ebx
  movl $(V2P_WO(entrypgdir)), %edi
  addl %ebx, %edi
  movl $(V2P_WO(entrypgtable)), (%edi)
  orl $(PTE_P | PTE_W), (%edi)
  //My assembly code to initialize a page table ends

  movl    $(V2P_WO(entrypgdir)), %eax
  movl    %eax, %cr3
  # Turn on paging.
  movl    %cr0, %eax
  orl     $(CR0_PG|CR0_WP), %eax
  movl    %eax, %cr0

  # Set up the stack pointer.
  movl $(stack + KSTACKSIZE), %esp

  # Jump to main(), and switch to executing at
  # high addresses. The indirect call is needed because
  # the assembler produces a PC-relative instruction
  # for a direct jump.
  mov $main, %eax
  jmp *%eax

.comm stack, KSTACKSIZE

As you can see, the basic idea is simple: the original xv6 uses 4 MB pages during initialization, so it needs only the page directory to map the kernel code to the first 4 MB physical page. I disabled that option in order to use 4 KB pages, created a new page table accordingly, and installed that page table in the page directory array. If my understanding is correct, the whole process should be equivalent to the following C code:

__attribute__((__aligned__(PGSIZE)))
pde_t entrypgdir[NPDENTRIES];

__attribute__((__aligned__(PGSIZE)))
pte_t entrypgtable[NPTENTRIES];

void init_entrypgdir(void) {
    for (int i = 0; i < NPTENTRIES; i++) {
        entrypgtable[i] = (i << 12) | PTE_P | PTE_W;
    }
    entrypgdir[0] = (uint)(V2P_WO(entrypgtable)) | PTE_P | PTE_W;
    entrypgdir[KERNBASE>>PDXSHIFT] = (uint)(V2P_WO(entrypgtable)) | PTE_P | PTE_W;
}

The above code resides in main.c, which is linked together with entry.S. I have confirmed that entry.S has access to my newly created init_entrypgdir() function.

Now comes the problem: what if I want to get rid of my hand-written assembly code and use the above C code to initialize xv6 instead? Is there a way to replace my assembly code with my C code and still successfully initialize the OS?

I have tried to rewrite entry.S as:

#include "asm.h"
#include "memlayout.h"
#include "mmu.h"
#include "param.h"

# Multiboot header.  Data to direct multiboot loader.
.p2align 2
.text
.globl multiboot_header
multiboot_header:
  #define magic 0x1badb002
  #define flags 0
  .long magic
  .long flags
  .long (-magic-flags)

# By convention, the _start symbol specifies the ELF entry point.
# Since we haven't set up virtual memory yet, our entry point is
# the physical address of 'entry'.
.globl _start
_start = V2P_WO(entry)

# Entering xv6 on boot processor, with paging off.
.globl entry
entry:
  # Turn on page size extension for 4Mbyte pages
  // I comment the following to not use bigger pages
  #movl    %cr4, %eax
  #orl     $(CR4_PSE), %eax
  #movl    %eax, %cr4

  // see if this works
  call V2P_WO(init_entrypgdir)

  # Set page directory
  movl    $(V2P_WO(entrypgdir)), %eax
  movl    %eax, %cr3
  # Turn on paging.
  movl    %cr0, %eax
  orl     $(CR0_PG|CR0_WP), %eax
  movl    %eax, %cr0

  # Set up the stack pointer.
  movl $(stack + KSTACKSIZE), %esp

  # Jump to main(), and switch to executing at
  # high addresses. The indirect call is needed because
  # the assembler produces a PC-relative instruction
  # for a direct jump.
  mov $main, %eax
  jmp *%eax

.comm stack, KSTACKSIZE

Then, in GDB, I set breakpoints at lines 108, 110, and 112.

[screenshots of the GDB session]

As the screenshots show, the breakpoint at line 112 is never hit, and the QEMU monitor freezes.

[screenshot of the frozen QEMU monitor]

Why is line 112 never reached? Is there something wrong with my for loop?
