Incorrect output when computing square root with Nasm x86 Assembly

Question

Incorrect output when computing square root with Nasm x86 Assembly

644 views Asked by Tom At 27 December 2024 at 19:31

First of all, I want to thank anyone who reads this and tries to help me figure this out, it is much appreciated.

I am writing an assembly program to compute the square root of a number in IEEE-754 format using the bisection method. I believe my implementation of the bisection method is correct (though I probably could have done it a lot more efficiently), because when I print out the square root of a number using my implementation, and print out the square root the FPU gives me from fsqrt I get the same result, in both cases, for every input number I have tried. So this leads me to believe that I am printing the number incorrectly. What am I doing wrong?

Here is the code: 

EXTERN printf 
EXTERN sscanf 
GLOBAL main

SEGMENT .data
n:      DD 0                                    ; Storage for float 
n_sqrt: DD 0 
l_bound: DD 0 
u_bound: DD 0 
epsilon: DD 0x3727C5AC                          ; Error bound. IEEE 754 representation of 0.00001
midpoint: DD 0 
format: DB "%f", 0 
form:   DB "%s", 10, 0
formh:  DB "%f", 10, 0 
outFormat: DB "The square root of %lf is: ", 0 
fsqrtForm: DB "fsqrt(n) = %f", 10, 0  

SEGMENT .text 
main: 
        push    ebp                             ; compose stack frame 
        mov     ebp, esp                        
        mov     eax, [ebp + 12]                 ; eax = address of param table


    finit                                   ; initialize FPU stack

    pushad                                  ; preserve all registers before making a system call
    push    n                               ; store f.p. number in n 
    push    format                          ; format at f.p. 
    push    dword [eax+4]                   ; push the first command line parameter 
    call    sscanf                          ; convert it to f.p., store it in n 
    add     esp, 4*3
    popad 

    fld     dword [n]                       ; st0 = n 
    fld1                                    ; st0 = 1; st1 = n 
    fadd    st1                             ; st0 = n+1; st1 = n
    fst     dword [u_bound]                 ; u_bound = n+1; st0 = n+1 ; st1 = n
    fld1                                    ; st0 = 1; st1 = n+1; st2 = n 
    fadd    st0                             ; st0 = 2; st1 = n+1; st2 = n 
    fdivr   st1                             ; st0 = (n+1)/2; st1 = n+1; st2 = n
    fstp    dword [midpoint]                ; midpoint = (n+1)/2; st0 = n+1; st1 = n 
    fcompp                                  ; clear st0 and st1 
.L1:    
        fld     dword [n]                       ; st0 = n
        fld     dword [midpoint]                ; st0 = midpoint; st1 = n
        fmul    st0                             ; st0 = midpoint*midpoint; st1 = n  
        fcomip  st1                             ; midpoint*midpoint < n ? ; st0 = n
        jae     .L2                             ; NO   
        fstp    st0                             ; clear st0
        fld     dword [midpoint]                ; st0 = midpoint
        fstp    dword [l_bound]                 ; l_bound = midpoint; clear st0 
        jmp     .L3                             ; continue 
.L2:                                            ; Else 
        fstp    st0                             ; clear st0 
        fld     dword [midpoint]                ; st0 = midpoint
        fstp    dword [u_bound]                 ; u_bound = midpoint 
.L3:                                            ; midpoint = (l_bound + u_bound)/2.0  
        fld     dword [u_bound]                 ; st0 = u_bound
        fld     dword [l_bound]                 ; st0 = l_bound; st1 = u_bound
        faddp   st1, st0                        ; st0 = l_bound + u_bound
        fld1                                    ; st0 = 1; st1 = l_bound + u_bound  
        fadd    st0                             ; st0 = 2; st1 = l_bound + u_bound
        fdivrp  st1, st0                        ; st0 = (l_bound + u_bound)/2.0
        fstp    dword [midpoint]                ; midpoint = (l_bound + u_bound)/2.0 ; clear st0 


    fld     dword [epsilon]                 ; st0 = epsilon
    fld     dword [u_bound]                 ; st0 = u_bound; st1 = epsilon
    fld     dword [l_bound]                 ; st0 = l_bound; st1 = u_bound; st2 = epsilon
    fsubp   st1, st0                        ; st0 = u_bound - l_bound; st1 = epsilon 
    fcomip  st1                             ; check: is u_bound - l_bound > epsilon? st0 = epsilon
    ja      .L5                             ; YES break while loop
    fstp    st0                             ; NO - clear st0 and continue
    jmp     .L1                             
.L5:

    jmp     .printSqrt 
.end:


    pop     ebp 
    ret 

;---------------------------------------------------------------------------------------------------        
.printSqrt:

    fld     dword [n]
    sub     esp, 8 
    fstp    qword [esp]
    push    outFormat
    call    printf 
    add     esp, 12 

    fld     dword [midpoint]
    sub     esp, 8 
    fstp    qword [esp]
    push    formh
    call    printf 
    add     esp, 12 

    fld     dword [n]
    fsqrt 
    sub     esp, 8
    fstp    dword [esp]
    push    fsqrtForm
    call    printf 
    add     esp, 12


    jmp     .end 
    ret

[Also: the input number is passed as a command line parameter.]

Thanks again!

Original Q&A

There are 2 answers

Frank Kotler On 05 December 2013 at 13:02

You subtract 8 from esp, but only store a dword there. Try fstp qword [esp].

**Tom** · Accepted Answer · 2013-12-05T18:47:15+00:00

I discovered that I had three minor problems:

1) As Frank mentioned, I was only storing a dword in esp instead of a qword when I reserved 8 bytes.

2) On the line: fdivrp st1, st0 ; st0 = (l_bound + u_bound)/2.0 I changed the divrp to divp. Having divrp actually computes 2.0/(l_bound + u_bound), the reciprocal of what I wanted.

3) I needed to change the line ja .L5 to jb .L5. Having ja breaks the loop when the difference l_bound - u_bound is greater than the error bound epsilon, instead I want the loop to terminate when the difference is less than the error.

TechQA.

Incorrect output when computing square root with Nasm x86 Assembly

There are 2 answers

Related Questions in ASSEMBLY

Related Questions in X86

Related Questions in NASM

Related Questions in SQUARE-ROOT

Related Questions in BISECTION

Popular Questions

Popular Tags

Trending Questions