Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
775 views
in Technique[技术] by (71.8m points)

c - Understanding the implementation of memcpy()

I was looking the implementation of memcpy.c, I found a different memcpy code. I couldnt understand why do they do (((ADDRESS) s) | ((ADDRESS) d) | c) & (sizeof(UINT) - 1)

#if !defined(__MACHDEP_MEMFUNC)

#ifdef _MSC_VER
#pragma function(memcpy)
#undef __MEMFUNC_ARE_INLINED
#endif

#if !defined(__MEMFUNC_ARE_INLINED)
/* Copy C bytes from S to D.
 * Only works if non-overlapping, or if D < S.
 */
EXTERN_C void * __cdecl memcpy(void *d, const void *s, size_t c)
{
    if ((((ADDRESS) s) | ((ADDRESS) d) | c) & (sizeof(UINT) - 1)) {

        BYTE *pS = (BYTE *) s;
        BYTE *pD = (BYTE *) d;
        BYTE *pE = (BYTE *) (((ADDRESS) s) + c);

        while (pS != pE)
            *(pD++) = *(pS++);
    }
    else {
        UINT *pS = (UINT *) s;
        UINT *pD = (UINT *) d;
        UINT *pE = (UINT *) (BYTE *) (((ADDRESS) s) + c);

        while (pS != pE)
            *(pD++) = *(pS++);
    }
    return d;
}

#endif /* ! __MEMFUNC_ARE_INLINED */
#endif /* ! __MACHDEP_MEMFUNC */
See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The code is testing whether the addresses are aligned suitably for a UINT. If so, the code copies using UINT objects. If not, the code copies using BYTE objects.

The test works by first performing a bitwise OR of the two addresses. Any bit that is on in either address will be on in the result. Then the test performs a bitwise AND with sizeof(UINT) - 1. It is expected the the size of a UINT is some power of two. Then the size minus one has all lower bits on. E.g., if the size is 4 or 8, then one less than that is, in binary 112 or 1112. If either address is not a multiple of the size of a UINT, then it will have one of these bits on, and the test will indicate it. (Usually, the best alignment for an integer object is the same as its size. This is not necessarily true. A modern implementation of this code should use _Alignof(UINT) - 1 instead of the size.)

Copying with UINT objects is faster, because, at the hardware level, one load or store instruction loads or stores all the bytes of a UINT (likely four bytes). Processors will typically copy faster when using these instructions than when using four times as many single-byte load or store instructions.

This code is of course implementation dependent; it requires support from the C implementation that is not part of the base C standard, and it depends on specific features of the processor it executes on.

A more advanced memcpy implementation could contain additional features, such as:

  • If one of the addresses is aligned but the other is not, use special load-unaligned instructions to load multiple bytes from the one address, with regular store instructions to the other address.
  • If the processor has Single Instruction Multiple Data instructions, use those instructions to load or store many bytes (often 16, possibly more) in a single instruction.

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...