The code is testing whether the addresses are aligned suitably for a UINT. If so, the code copies using UINT objects. If not, the code copies using BYTE objects.
The test works by first performing a bitwise OR of the two addresses. Any bit that is on in either address will be on in the result. Then the test performs a bitwise AND with sizeof(UINT) - 1. It is expected that the size of a UINT is some power of two. Then the size minus one has all lower bits on. E.g., if the size is 4 or 8, then one less than that is, in binary, 11 or 111. If either address is not a multiple of the size of a UINT, then it will have one of these bits on, and the test will indicate it. (Usually, the best alignment for an integer object is the same as its size. This is not necessarily true. A modern implementation of this code should use _Alignof(UINT) - 1 instead of the size.)
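As a rough illustration of the test described above, here is a minimal sketch. The helper name both_aligned and the typedef for UINT are assumptions made for this example, not taken from the code in the question, and the sketch assumes the implementation provides uintptr_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef unsigned int UINT; /* assumption: stand-in for the question's UINT */

/* Hypothetical helper: returns 1 if both addresses are multiples of
   align, which must be a power of two; returns 0 otherwise. */
static int both_aligned(const void *a, const void *b, size_t align)
{
    /* A power of two has exactly one bit set. */
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Any bit that is on in either address is on in the OR. */
    uintptr_t bits = (uintptr_t)a | (uintptr_t)b;

    /* align - 1 has all the lower bits on; both addresses are
       aligned only if none of those bits is on in the OR. */
    return (bits & (align - 1)) == 0;
}
```

A caller could pass sizeof(UINT) to mirror the original test, or _Alignof(UINT) on a C11 implementation.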
Copying with UINT objects is faster, because, at the hardware level, one load or store instruction loads or stores all the bytes of a UINT (likely four bytes). Processors will typically copy faster when using these instructions than when using four times as many single-byte load or store instructions.
This code is of course implementation dependent; it requires support from the C implementation that is not part of the base C standard, and it depends on specific features of the processor it executes on.
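For concreteness, a routine of the kind being discussed might look roughly like the sketch below. This is not the code from the question: the function name copy_words_or_bytes and the two typedefs are assumptions for illustration. Accessing arbitrary memory through a UINT pointer goes beyond what the C standard guarantees (the aliasing rules), so the fast path relies on support from the implementation, for example building with GCC or Clang's -fno-strict-aliasing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef unsigned int UINT;  /* assumption: stand-in for the question's UINT */
typedef unsigned char BYTE; /* assumption: stand-in for the question's BYTE */

/* Hypothetical name. Copies n bytes between non-overlapping buffers. */
static void *copy_words_or_bytes(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    BYTE *d = dst;
    const BYTE *s = src;

    /* Fast path: both addresses are multiples of sizeof(UINT). */
    if ((((uintptr_t)dst | (uintptr_t)src) & (sizeof(UINT) - 1)) == 0) {
        UINT *dw = dst;        /* implementation-dependent access */
        const UINT *sw = src;
        size_t words = n / sizeof(UINT);

        for (size_t i = 0; i < words; i++)
            dw[i] = sw[i];     /* one load and one store per UINT */

        d += words * sizeof(UINT);
        s += words * sizeof(UINT);
        n -= words * sizeof(UINT);
    }

    /* Unaligned case, and the leftover tail of the fast path:
       copy one byte at a time. */
    while (n-- > 0)
        *d++ = *s++;

    return dst;
}
```

When the buffers are aligned, only the last n % sizeof(UINT) bytes go through the byte loop; otherwise the whole copy does.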
A more advanced memcpy implementation could contain additional features, such as:
- If one of the addresses is aligned but the other is not, use special load-unaligned instructions to load multiple bytes from the one address, with regular store instructions to the other address.
- If the processor has Single Instruction Multiple Data instructions, use those instructions to load or store many bytes (often 16, possibly more) in a single instruction.