c++ - memcpy/memmove to a union member, does this set the 'active' member?

Question

posted Oct 24, 2021 in Technique by 深蓝 (71.8m points)

Important clarification: some commenters seem to think that I am copying from a union. Look carefully at the memcpy: it copies from the address of a plain old uint32_t, which is not contained within a union. Also, I am copying (via memcpy) to a specific member of a union (u.a16 or &u.x_in_a_union), not directly to the entire union itself (&u).

C++ is quite strict about unions - you should read from a member only if that was the last member that was written to:

9.5 Unions [class.union] [[c++11]] In a union, at most one of the non-static data members can be active at any time, that is, the value of at most one of the non-static data members can be stored in a union at any time.

(Of course, the compiler doesn't track which member is active. It's up to the developer to ensure they track this themselves)
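
One common way to do that tracking is to keep a tag next to the union. This is only a minimal sketch: the names `Tagged`, `Kind`, `set_double`, `set_u32` and `get_u32` are hypothetical and do not come from the question.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag type: records which union member was written last.
enum class Kind { Double, U32 };

struct Tagged {
    Kind kind;
    union {
        double whatever;
        std::uint32_t x_in_a_union;
    };
};

// Each setter writes the member and updates the tag together.
inline void set_double(Tagged& t, double d) { t.whatever = d; t.kind = Kind::Double; }
inline void set_u32(Tagged& t, std::uint32_t v) { t.x_in_a_union = v; t.kind = Kind::U32; }

// Readers check the tag before touching a member.
inline std::uint32_t get_u32(const Tagged& t) {
    assert(t.kind == Kind::U32 && "x_in_a_union is not the active member");
    return t.x_in_a_union;
}
```

The tag only helps if every write goes through the setters; a raw memcpy into a member would bypass it, which is part of what makes the question interesting.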

Update: The following block of code is the main question, directly reflecting the question title. If this code is OK, I have a follow-up regarding other types, but I now realize that this first block of code is interesting in itself.

#include <cstdint>
#include <cstring>   // std::memcpy
#include <iostream>  // std::cout

int main() {
    uint32_t x = 0x12345678;
    union {
        double whatever;
        uint32_t x_in_a_union; // same type as x
    } u;
    u.whatever = 3.14;
    u.x_in_a_union = x; // surely this is OK, despite involving the inactive member?
    std::cout << u.x_in_a_union;
    u.whatever = 3.14; // make the double 'active' again
    std::memcpy(&u.x_in_a_union, &x, sizeof(x)); // same types, so should be OK?
    std::cout << u.x_in_a_union; // OK here? What's the active member?
}

The block of code immediately above this is probably the main issue in the comments and answers. In hindsight, I didn't need to mix types in this question! Basically, is u.a = b the same as memcpy(&u.a,&b, sizeof(b)), assuming the types are identical?

First, a relatively simple memcpy allowing us to read a uint32_t as an array of uint16_t:

#include <cstdint> // to ensure we have standard versions of these two types
#include <cstring> // std::memcpy

int main() {
    uint32_t x = 0x12345678;
    uint16_t a16[2];
    static_assert(sizeof(x) == sizeof(a16), "");
    std::memcpy(a16, &x, sizeof(x));
}

The precise behaviour depends on the endianness of your platform, and you must beware of trap representations and so on. But it is generally agreed here (I think? Feedback appreciated!) that, with care to avoid problematic values, the above code can be perfectly standards-compliant in the right context on the right platform.
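
The endianness dependence can be made visible with a small sketch. The helpers `split_u32` and `low_half_first` are hypothetical names introduced here for illustration, and the sketch assumes `std::array<std::uint16_t, 2>` has no padding (the static_assert checks this).

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the bytes of a uint32_t into two uint16_t values.
inline std::array<std::uint16_t, 2> split_u32(std::uint32_t x) {
    std::array<std::uint16_t, 2> a16{};
    static_assert(sizeof(a16) == sizeof(x), "sizes must match");
    std::memcpy(a16.data(), &x, sizeof(x));
    return a16;
}

// True when the low 16 bits land in element 0, as on a little-endian platform.
inline bool low_half_first() {
    return split_u32(0x12345678u)[0] == 0x5678u;
}
```

On a typical little-endian machine `split_u32(0x12345678)` should give 0x5678 then 0x1234; on a big-endian machine the two halves should swap.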

(If you have a problem with the above code, please comment or edit the question accordingly. I want to be sure we have a non-controversial version of the above before proceeding to the "interesting" code below.)

If, and only if, both blocks of code above are not-UB, then I would like to combine them as follows:

#include <cstdint>
#include <cstring>   // std::memcpy
#include <iostream>  // std::cout

int main() {
    uint32_t x = 0x12345678;
    union {
        double whatever;
        uint16_t a16[2];
    } u;
    u.whatever = 3.14; // sets the 'active' member
    static_assert(sizeof(u.a16) == sizeof(x), ""); // any other checks I should do?
    std::memcpy(u.a16, &x, sizeof(x));

    // what is the 'active member' of u now, after the memcpy?
    std::cout << u.a16[0] << ' ' << u.a16[1] << std::endl; // i.e. is this OK?
}

Which member of the union, u.whatever or u.a16 , is the 'active member'?

Finally, my own guess is that the reason why we care about this, in practice, is that an optimizing compiler might fail to notice that the memcpy happened and therefore make false assumptions (but allowable assumptions, by the standard) about which member is active and which data types are 'active', therefore leading to mistakes around aliasing. The compiler might reorder the memcpy in strange ways. Is this an appropriate summary of why we care about this?

1 Reply

深蓝 · Answer 1 · 2021-10-23T17:57:52+0000

My reading of the standard is that std::memcpy is safe whenever the type is trivially copyable.

From 9 Classes, we can see that unions are class types and so trivially copyable applies to them.

A union is a class defined with the class-key union; it holds only one data member at a time (9.5).

A trivially copyable class is a class that:

has no non-trivial copy constructors (12.8),

has no non-trivial move constructors (12.8),

has no non-trivial copy assignment operators (13.5.3, 12.8),

has no non-trivial move assignment operators (13.5.3, 12.8), and

has a trivial destructor (12.4).
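
These requirements can be checked at compile time with `std::is_trivially_copyable` from `<type_traits>`. A minimal sketch, in which the union from the question is given the hypothetical name `QuestionUnion` so that it can be inspected:

```cpp
#include <cstdint>
#include <string>
#include <type_traits>

// The union from the question, named here only so the trait can be applied.
union QuestionUnion {
    double whatever;
    std::uint16_t a16[2];
};

// The union and the source type both meet the trivially copyable requirements.
static_assert(std::is_trivially_copyable<QuestionUnion>::value,
              "QuestionUnion should be trivially copyable");
static_assert(std::is_trivially_copyable<std::uint32_t>::value,
              "scalars are trivially copyable");

// Counter-example: std::string has non-trivial copy operations and destructor.
static_assert(!std::is_trivially_copyable<std::string>::value,
              "std::string is not trivially copyable");
```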

The exact meaning of trivially copyable is given in 3.9 Types:

For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes (1.7) making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value.

For any trivially copyable type T, if two pointers to T point to distinct T objects obj1 and obj2, where neither obj1 nor obj2 is a base-class subobject, if the underlying bytes (1.7) making up obj1 are copied into obj2, obj2 shall subsequently hold the same value as obj1.

The standard also gives an explicit example of both.
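
A rough sketch in the spirit of those two examples follows. It is not the standard's exact text, and the function names `round_trip` and `copy_object` are made up for illustration.

```cpp
#include <cstdint>
#include <cstring>

// Round trip through a byte buffer: the object holds its original value afterwards.
inline std::uint32_t round_trip(std::uint32_t obj) {
    unsigned char buf[sizeof(obj)];
    std::memcpy(buf, &obj, sizeof(obj));   // object -> bytes
    obj = 0;                               // clobber the object
    std::memcpy(&obj, buf, sizeof(obj));   // bytes -> object
    return obj;
}

// Copy between two distinct objects of the same trivially copyable type.
inline std::uint32_t copy_object(const std::uint32_t& src) {
    std::uint32_t dst = 0;
    std::memcpy(&dst, &src, sizeof(dst));
    return dst;
}
```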

So, if you were copying the entire union, the answer would be unequivocally yes, the active member will be "copied" along with the data. (This is relevant because it indicates that std::memcpy must be regarded as a valid means of changing the active element of a union, since using it is explicitly allowed for whole union copying.)
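
To illustrate the whole-union case, here is a minimal sketch. The names `WholeUnion` and `clone_bytes` are hypothetical, and the claim that the active member travels with the bytes is my reading of 3.9, not something the code itself can prove.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical named version of the first union in the question.
union WholeUnion {
    double whatever;
    std::uint32_t x_in_a_union;
};

// Copies every byte of src into a fresh union object.
inline WholeUnion clone_bytes(const WholeUnion& src) {
    WholeUnion dst;
    dst.whatever = 0.0;                           // dst starts with 'whatever' active
    std::memcpy(&dst, &src, sizeof(WholeUnion));  // whole-object copy, as in 3.9
    return dst;
}
```

If `src.x_in_a_union` was the last member written, then under this reading `clone_bytes(src).x_in_a_union` may be read and holds the same value.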

Now, you are instead copying into a member of the union. The standard doesn't appear to require any particular method of assigning to a union member (and hence making it active). All it does is specify (9.5) that

[ Note: In general, one must use explicit destructor calls and placement new operators to change the active member of a union. — end note]

which it says, of course, because C++11 allows objects of non-trivial type in unions. Note the "in general" on the front, which quite clearly indicates that other methods of changing the active member are permissible in specific cases; we already know this to be the case because assignment is clearly permitted. Certainly there is no prohibition on using std::memcpy, where its use would otherwise be valid.
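
For contrast, this is roughly what the note's "general" case looks like when a member has a non-trivial type. It is only a sketch: `StrOrInt`, `activate_string` and `activate_int` are hypothetical names, not anything from the question or the standard.

```cpp
#include <new>
#include <string>

// Hypothetical union with a non-trivial member.
union StrOrInt {
    int i;
    std::string s;
    StrOrInt() : i(0) {}   // 'i' starts out active
    ~StrOrInt() {}         // the owner must destroy 's' itself if it is active
};

// Switch the active member from 'i' to 's' with placement new.
inline void activate_string(StrOrInt& u, const char* text) {
    new (&u.s) std::string(text);
}

// Switch back: destroy 's' explicitly, then plain assignment makes 'i' active.
inline void activate_int(StrOrInt& u, int value) {
    u.s.~basic_string();
    u.i = value;
}
```

The caller has to pair these correctly: calling `activate_int` while `i` is active, or letting the union go out of scope while `s` is active, would be a bug. None of this ceremony is needed for the trivial types in the question, which is the point of the "in general" wording.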

So my answer is yes, this is safe, and yes, it changes the active member.
