Incorrect output when computing square root with Nasm x86 Assembly

627 views Asked by At

First of all, I want to thank anyone who reads this and tries to help me figure this out, it is much appreciated.

I am writing an assembly program to compute the square root of a number in IEEE-754 format using the bisection method. I believe my implementation of the bisection method is correct (though I probably could have done it a lot more efficiently), because when I print out the square root of a number using my implementation, and print out the square root the FPU gives me from fsqrt I get the same result, in both cases, for every input number I have tried. So this leads me to believe that I am printing the number incorrectly. What am I doing wrong?

Here is the code: 

EXTERN printf 
EXTERN sscanf 
GLOBAL main

SEGMENT .data
n:      DD 0                                    ; Storage for float 
n_sqrt: DD 0 
l_bound: DD 0 
u_bound: DD 0 
epsilon: DD 0x3727C5AC                          ; Error bound. IEEE 754 representation of 0.00001
midpoint: DD 0 
format: DB "%f", 0 
form:   DB "%s", 10, 0
formh:  DB "%f", 10, 0 
outFormat: DB "The square root of %lf is: ", 0 
fsqrtForm: DB "fsqrt(n) = %f", 10, 0  

SEGMENT .text 
main: 
        push    ebp                             ; compose stack frame 
        mov     ebp, esp                        
        mov     eax, [ebp + 12]                 ; eax = address of param table


    finit                                   ; initialize FPU stack

    pushad                                  ; preserve all registers before making a system call
    push    n                               ; store f.p. number in n 
    push    format                          ; format at f.p. 
    push    dword [eax+4]                   ; push the first command line parameter 
    call    sscanf                          ; convert it to f.p., store it in n 
    add     esp, 4*3
    popad 

    fld     dword [n]                       ; st0 = n 
    fld1                                    ; st0 = 1; st1 = n 
    fadd    st1                             ; st0 = n+1; st1 = n
    fst     dword [u_bound]                 ; u_bound = n+1; st0 = n+1 ; st1 = n
    fld1                                    ; st0 = 1; st1 = n+1; st2 = n 
    fadd    st0                             ; st0 = 2; st1 = n+1; st2 = n 
    fdivr   st1                             ; st0 = (n+1)/2; st1 = n+1; st2 = n
    fstp    dword [midpoint]                ; midpoint = (n+1)/2; st0 = n+1; st1 = n 
    fcompp                                  ; clear st0 and st1 
.L1:    
        fld     dword [n]                       ; st0 = n
        fld     dword [midpoint]                ; st0 = midpoint; st1 = n
        fmul    st0                             ; st0 = midpoint*midpoint; st1 = n  
        fcomip  st1                             ; midpoint*midpoint < n ? ; st0 = n
        jae     .L2                             ; NO   
        fstp    st0                             ; clear st0
        fld     dword [midpoint]                ; st0 = midpoint
        fstp    dword [l_bound]                 ; l_bound = midpoint; clear st0 
        jmp     .L3                             ; continue 
.L2:                                            ; Else 
        fstp    st0                             ; clear st0 
        fld     dword [midpoint]                ; st0 = midpoint
        fstp    dword [u_bound]                 ; u_bound = midpoint 
.L3:                                            ; midpoint = (l_bound + u_bound)/2.0  
        fld     dword [u_bound]                 ; st0 = u_bound
        fld     dword [l_bound]                 ; st0 = l_bound; st1 = u_bound
        faddp   st1, st0                        ; st0 = l_bound + u_bound
        fld1                                    ; st0 = 1; st1 = l_bound + u_bound  
        fadd    st0                             ; st0 = 2; st1 = l_bound + u_bound
        fdivrp  st1, st0                        ; st0 = (l_bound + u_bound)/2.0
        fstp    dword [midpoint]                ; midpoint = (l_bound + u_bound)/2.0 ; clear st0 


    fld     dword [epsilon]                 ; st0 = epsilon
    fld     dword [u_bound]                 ; st0 = u_bound; st1 = epsilon
    fld     dword [l_bound]                 ; st0 = l_bound; st1 = u_bound; st2 = epsilon
    fsubp   st1, st0                        ; st0 = u_bound - l_bound; st1 = epsilon 
    fcomip  st1                             ; check: is u_bound - l_bound > epsilon? st0 = epsilon
    ja      .L5                             ; YES break while loop
    fstp    st0                             ; NO - clear st0 and continue
    jmp     .L1                             
.L5:

    jmp     .printSqrt 
.end:


    pop     ebp 
    ret 

;---------------------------------------------------------------------------------------------------        
.printSqrt:

    fld     dword [n]
    sub     esp, 8 
    fstp    qword [esp]
    push    outFormat
    call    printf 
    add     esp, 12 

    fld     dword [midpoint]
    sub     esp, 8 
    fstp    qword [esp]
    push    formh
    call    printf 
    add     esp, 12 

    fld     dword [n]
    fsqrt 
    sub     esp, 8
    fstp    dword [esp]
    push    fsqrtForm
    call    printf 
    add     esp, 12


    jmp     .end 
    ret 

[Also: the input number is passed as a command line parameter.]

Thanks again!

2

There are 2 answers

0
Tom On BEST ANSWER

I discovered that I had three minor problems:

1) As Frank mentioned, I was only storing a dword in esp instead of a qword when I reserved 8 bytes.

2) On the line: fdivrp st1, st0 ; st0 = (l_bound + u_bound)/2.0 I changed the divrp to divp. Having divrp actually computes 2.0/(l_bound + u_bound), the reciprocal of what I wanted.

3) I needed to change the line ja .L5 to jb .L5. Having ja breaks the loop when the difference l_bound - u_bound is greater than the error bound epsilon, instead I want the loop to terminate when the difference is less than the error.

1
Frank Kotler On

You subtract 8 from esp, but only store a dword there. Try fstp qword [esp].