You can't just push/pop safely from inline asm, if it's going to be portable to systems with a red-zone. That includes every non-Windows x86-64 platform. (There's no way to tell gcc you want to clobber it). Well, you could add rsp, -128
first to skip past the red-zone before pushing/popping anything, then restore it later. But then you can't use an "m"
constraints, because the compiler might use RSP-relative addressing with offsets that assume RSP hasn't been modified.
But really this is a ridiculous thing to be doing in inline asm.
Here's how you use inline-asm to swap two C variables:
#include <stdio.h>
int main()
{
int x = 1;
int y = 2;
asm("" // no actual instructions.
: "=r"(y), "=r"(x) // request both outputs in the compiler's choice of register
: "0"(x), "1"(y) // matching constraints: request each input in the same register as the other output
);
// apparently "=m" doesn't compile: you can't use a matching constraint on a memory operand
printf("x=%d,y=%d
",x,y);
// getchar(); // Set up your terminal not to close after the program exits if you want similar behaviour: don't embed it into your programs
return 0;
}
gcc -O3 output (targeting the x86-64 System V ABI, not Windows) from the Godbolt compiler explorer:
.section .rodata
.LC0:
.string "x=%d,y=%d"
.section .text
main:
sub rsp, 8
mov edi, OFFSET FLAT:.LC0
xor eax, eax
mov edx, 1
mov esi, 2
#APP
# 8 "/tmp/gcc-explorer-compiler116814-16347-5i3lz1/example.cpp" 1
# I used "
" instead of just "" so we could see exactly where our inline-asm code ended up.
# 0 "" 2
#NO_APP
call printf
xor eax, eax
add rsp, 8
ret
C variables are a high level concept; it doesn't cost anything to decide that the same registers now logically hold different named variables, instead of swapping the register contents without changing the varname->register mapping.
When hand-writing asm, use comments to keep track of the current logical meaning of different registers, or parts of a vector register.
The inline-asm didn't lead to any extra instructions outside the inline-asm block either, so it's perfectly efficient in this case. Still, the compiler can't see through it, and doesn't know that the values are still 1 and 2, so further constant-propagation would be defeated. https://gcc.gnu.org/wiki/DontUseInlineAsm