Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


C Calling Conventions 32bit to NASM with float (movups/movupd difference)

I have this function in C. When I use instructions like movss, movaps, and movups, everything works properly. When I instead use instructions like movupd, movapd, etc., it does not work and returns strange values.

CODE THAT WORKS PROPERLY WITH movaps, movups, etc.

C:

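#include <stdio.h>

/* Implemented in NASM (32-bit cdecl). The routine leaves nothing in st(0);
   the result is written through res, so the return type is void. */
extern void test(float* a, float* b, int num, int spuri, float* res);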


int main(int argc, char** argv) {
    float a[] = { 1.0, 2.0, 3.0, 4.0, 6.0, 9.0 };
    float b[] = { 3.0, 4.0, 4.0, 5.0, 5.0, 8.0 };
    int d=6;
    int num=d/4;
    int spuri=d-(num*4);
    float res=-1.0;
    test(a,b,num,spuri,&res);

    printf("res: %f\n",res);

    return 0;
}

NASM:

%include "sseutils.nasm"

section .data           


section .bss            

alignb 16
A:  resd    1
T:  resd    4


section .text           

global test

a           equ     8   
b           equ     12  
num         equ         16      
spuri       equ         20
result      equ     24

test:
        push    ebp             
        mov     ebp, esp        
        push    ebx             
        push    esi
        push    edi

        mov         esi, [ebp+a]                
        mov         edi, [ebp+b]                
        mov         ebx, 0              
        mov         ecx, [ebp+num]              
        mov         edx, [ebp+spuri]
        mov         eax,[ebp+result]                
        xorps       xmm1,xmm1           
        xorps       xmm3,xmm3           

loop1:
        cmp         ecx, 0
        je          loop2
        movups      xmm0, [esi+ebx]     
        movups      xmm6, [edi+ebx]
        subps       xmm0, xmm6          
        mulps       xmm0, xmm0          
        sqrtps      xmm0, xmm0
        addps       xmm1, xmm0          
        add         ebx, 16             
        dec         ecx                 
        jnz         loop1



loop2:

        cmp     edx, 0
        je      end
        movss   xmm2,[esi+ebx]
        movss   xmm7,[edi+ebx]
        subps   xmm2, xmm7
        mulps   xmm2, xmm2
        sqrtps  xmm2, xmm2
        addps   xmm3, xmm2
        add     ebx,4
        dec     edx
        jnz     loop2


end:
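        haddps      xmm1,xmm1           ; horizontal add: pairwise sums
        haddps      xmm1,xmm1           ; every lane now holds the sum of the 4 lanes
        addps       xmm1,xmm3           ; lane 0 += sum of the leftover elements
        movss       [eax],xmm1          ; store lane 0 only: res is one float, movups would write 16 bytes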








        pop edi                     
        pop     esi
        pop     ebx
        mov esp, ebp                
        pop ebp                     
        ret                         
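
For checking the output of the routine above, a scalar C reference may help. This is only a sketch: the name test_ref is hypothetical, and it assumes the routine is meant to sum sqrt((a[i]-b[i])^2) over all elements, which equals the sum of |a[i]-b[i]| as long as the square neither overflows nor underflows.

```c
#include <stddef.h>

/* Scalar reference (hypothetical helper, not part of the original code):
   returns the sum of |a[i] - b[i]|, which is what sqrt((a[i] - b[i])^2)
   evaluates to for ordinary values. */
float test_ref(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float d = a[i] - b[i];
        sum += (d < 0.0f) ? -d : d;   /* |d| without needing libm */
    }
    return sum;
}
```

With the arrays from main, test_ref(a, b, 6) gives 8.0, which is the value the packed loop plus the leftover loop should also produce.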

This returns the correct value, but I need more precision (the numbers are floating point), so I need to use movupd or similar instructions.

How must the previous code be modified to use instructions such as MOVUPD, MOVAPD, or similar?
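
A likely source of the strange values, assuming the arrays stayed float while packed-double arithmetic (subpd, mulpd, and so on) was applied: movupd moves the same 128 bits as movups, but the double instructions read each 64-bit half of the register as one double, so two adjacent floats get fused into a single meaningless value. The sketch below reproduces that reinterpretation in C. The helper name floats_as_double is hypothetical, and it assumes 4-byte float, 8-byte double, and little-endian x86.

```c
#include <string.h>

/* Reinterpret the bytes of two adjacent floats as one double, which is
   effectively what a packed-double instruction does with float data.
   Assumes sizeof(float) == 4 and sizeof(double) == 8. */
double floats_as_double(const float pair[2])
{
    double d;
    memcpy(&d, pair, sizeof d);   /* copies 8 bytes, i.e. both floats */
    return d;
}
```

On little-endian x86, the pair { 1.0f, 2.0f } comes back as a value slightly above 2.0, which matches neither input.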
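
One possible direction, offered as a sketch rather than a definitive answer: the packed-double instructions only give more precision if the data itself is double, so the C arrays and the result would have to become double as well. The version below uses SSE2 intrinsics to show the shape of the double loop; each intrinsic is commented with the instruction it corresponds to. The function name test_pd is hypothetical, and the scalar fallback is there only so the code still compiles where SSE2 is not enabled.

```c
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Double-precision sketch (hypothetical name): sum of sqrt((a[i] - b[i])^2). */
double test_pd(const double *a, const double *b, size_t n)
{
    size_t i = 0;
    double sum = 0.0;
#if defined(__SSE2__)
    __m128d acc = _mm_setzero_pd();            /* xorpd  xmm1, xmm1      */
    for (; i + 2 <= n; i += 2) {               /* 2 doubles per register */
        __m128d va = _mm_loadu_pd(a + i);      /* movupd xmm0, [esi+ebx] */
        __m128d vb = _mm_loadu_pd(b + i);      /* movupd xmm6, [edi+ebx] */
        __m128d d  = _mm_sub_pd(va, vb);       /* subpd                  */
        d   = _mm_sqrt_pd(_mm_mul_pd(d, d));   /* mulpd, then sqrtpd     */
        acc = _mm_add_pd(acc, d);              /* addpd                  */
    }
    double lanes[2];
    _mm_storeu_pd(lanes, acc);
    sum = lanes[0] + lanes[1];                 /* one haddpd does this   */
#endif
    for (; i < n; i++) {                       /* leftover elements      */
        double d = a[i] - b[i];
        sum += (d < 0.0) ? -d : d;             /* equals sqrt(d * d) here */
    }
    return sum;
}
```

Carried over to the NASM routine, this would mean: num = d/2 and spuri = d - num*2 on the C side, add ebx, 16 per packed iteration (16 bytes now hold 2 elements, not 4), movsd with add ebx, 8 in the leftover loop, a single haddpd instead of two haddps, and a final movsd [eax], xmm1 into a double result.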


1 Reply

0 votes
by (71.8m points)
Waiting for answers
