Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

assembly - What comes after QWORD?

If

  • 8 bits is a byte

  • two bytes is a word

  • four bytes is a dword

  • 8 bytes is a qword

What is a good name for 16 bytes?


1 Reply

TL;DR: In NASM, after RESB/RESW/RESD/RESQ there's RESO, RESY, and RESZ. In instruction mnemonics and Intel terminology (as used in the manuals), O (oct) and DQ (double-quad) are both used. But DQWORD isn't used, only OWORD.

Disassemblers will use xmmword ptr [rsi] as the explicit size of a 16-byte memory operand in MASM or GNU .intel_syntax. IIRC, there are no instructions where that size isn't already implied by the mnemonic and/or register.


Note that this question is x86-specific, and is about Intel's terminology. In most other ISAs (like ARM or MIPS), a "word" is 32 bits, but x86 terminology originated with 8086.

Terminology in instruction mnemonics

Octword is used in the mnemonics for some x86-64 instructions. e.g. CQO sign-extends rax into rdx:rax.
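
To make "sign-extends rax into rdx:rax" concrete, here is a minimal plain-C model of what CQO computes. The octword_pair type and the cqo_model function are hypothetical names made up for this sketch, not anything from Intel's manuals or a compiler header.

```c
#include <stdint.h>

/* Hypothetical helper type: the rdx:rax register pair that CQO writes. */
typedef struct {
    uint64_t rdx; /* high quadword: all zeros or all ones */
    uint64_t rax; /* low quadword: the input, unchanged */
} octword_pair;

/* Sketch of CQO: sign-extend a quadword (rax) into an octword (rdx:rax). */
static octword_pair cqo_model(int64_t rax)
{
    octword_pair r;
    r.rax = (uint64_t)rax;
    /* Replicate the sign bit of rax into every bit of rdx. */
    r.rdx = (rax < 0) ? UINT64_MAX : 0;
    return r;
}
```

Calling cqo_model(-5) fills rdx with all ones, and cqo_model(7) leaves rdx zero, which is the 128-bit dividend that a following IDIV expects.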

CMPXCHG16B is another non-vector instruction that operates on 16 bytes, but Intel doesn't use "oct" anywhere in the description. Instead, they describe the memory location as an m128. That manual entry doesn't use any "word"-based sizes.

SSE/AVX integer instructions often have an element size as part of the mnemonic. In that context, DQ (double-quad) is used, never O (oct). For example, the PUNPCKL* instructions that interleave elements from the low half of two source vectors into a full destination vector:

  • PUNPCKLWD: word->dword (16->32)
  • PUNPCKLDQ: dword->qword (32->64)
  • PUNPCKLQDQ: two qwords->full 128bit register (64->128).
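
The interleaving in the list above can be sketched in portable C on 16-byte arrays. This is only a model of the data movement, not the intrinsics or the instructions themselves; punpckl_model and its parameters are hypothetical names for this example, with a standing in for the first (destination) operand and b for the second.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of PUNPCKL*: interleave the low 8 bytes of a and b,
 * elem_size bytes at a time, into a full 16-byte result.
 * elem_size = 2 models PUNPCKLWD, 4 PUNPCKLDQ, 8 PUNPCKLQDQ. */
static void punpckl_model(uint8_t dst[16], const uint8_t a[16],
                          const uint8_t b[16], size_t elem_size)
{
    uint8_t out[16];
    size_t n = 8 / elem_size; /* elements taken from each low half */

    for (size_t i = 0; i < n; i++) {
        /* The element from a lands in the lower position of each pair. */
        memcpy(out + (2 * i) * elem_size, a + i * elem_size, elem_size);
        memcpy(out + (2 * i + 1) * elem_size, b + i * elem_size, elem_size);
    }
    memcpy(dst, out, sizeof out); /* temp buffer allows dst to alias a or b */
}
```

With elem_size = 8 the result is simply the low qword of a followed by the low qword of b, which is why the QDQ form reads as "two qwords into one double-quad".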

However, it's only ever DQ, not DQWord. Double-Quadword sounds somewhat unnatural, but I think it might be used in Intel manuals occasionally. It sounds better if you leave out the "Word", and just say "Store a Double-Quad at this location". If you want to attach "word" to it, I think only OWord sounds natural.

There's also MOVDQA for load/store/reg-reg moves. Mercifully, when AVX extended the vector width to 256b, they kept the same mnemonics and didn't call the 256b version VMOVQQA.

Some instructions for manipulating the 128-bit lanes of 256-bit registers have a 128 in the name, like VEXTRACTF128. Putting a size number in the mnemonic was new for Intel (other than CMPXCHG8B and CMPXCHG16B).


Assembler directives

From the NASM manual:

3.2.1 DB and Friends: Declaring Initialized Data

DB, DW, DD, DQ, DT, DO, DY and DZ are used ... (table of examples)

DO, DY and DZ do not accept numeric constants as operands.

DT is a ten-byte x87 float. DO is 16 bytes, DY is a YMMWORD (32 bytes), and DZ is 64 bytes (AVX512 ZMM). Since they don't support numeric constants as initializers, I guess you could only use them with string literal initializers? (IIRC, the floating-point section of the NASM manual also lists DO as accepting a quad-precision float constant.) It would be more normal anyway to use DB/DW/DD/DQ with a comma-separated list of per-element initializers.

Similarly, you can reserve uninitialized space.

realarray       resq    10              ; array of ten reals 
ymmval:         resy    1               ; one YMM register 
zmmvals:        resz    32              ; 32 ZMM registers

Terminology in intrinsics, and AVX512

As I mentioned in my answer on How can Microsoft say the size of a word in WinAPI is 16 bits?, AVX512's per-element masking during other operations makes naming tricky. VSHUFF32x4 shuffles 128b elements, with masking at 32bit element granularity.

However, Intel is not backing away from word=16 bits. e.g. AVX512BW and AVX512DQ put that terminology right in the name. Some intrinsics even use it, where previously it was always epi32, not d. (e.g. _mm256_broadcastd_epi32(__m128i), _mm256_broadcastw_epi16(__m128i). The b/w/d/q is totally redundant. Maybe that was a mistake?)

(Does anyone else find the asm mnemonics easier to remember and type than the annoyingly-long intrinsics? You have to know the asm mnemonics to read compiler output, so it would be nice if the intrinsics just used the mnemonics instead of a second naming scheme.)

