c++ - Copying big endian float data directly into a vector<float> and byte swapping in place. Is it safe?

posted Oct 7, 2021 in Technique by 深蓝 (71.8m points)

I'd like to be able to copy big endian float arrays directly from an unaligned network buffer into a std::vector<float> and perform the byte swapping back to host order "in place", without involving an intermediate std::vector<uint32_t>. Is this even safe? I'm worried that the big endian float data may accidentally be interpreted as NaNs and trigger unexpected behavior. Is this a valid concern?

For the purposes of this question, assume that the host machine receiving the data is little endian.

Here's some code that demonstrates what I'm trying to do:

// Headers needed: <arpa/inet.h> (htonl/ntohl), <cassert>, <cstdint>, <cstring>, <vector>
std::vector<float> source{1.0f, 2.0f, 3.0f, 4.0f};
std::size_t number_count = source.size();

// Simulate big-endian float values being received from network and stored
// in byte buffer. A temporary uint32_t vector is used to transform the
// source data to network byte order (big endian) before being copied
// to a byte buffer.
std::vector<uint32_t> temp(number_count, 0);
std::size_t byte_length = number_count * sizeof(float);
std::memcpy(temp.data(), source.data(), byte_length);
for (uint32_t& datum: temp)
    datum = ::htonl(datum);
std::vector<uint8_t> buffer(byte_length, 0);
std::memcpy(buffer.data(), temp.data(), byte_length);
// buffer now contains the big endian float data, and is not aligned at word boundaries

// Copy the received network buffer data directly into the destination float vector
std::vector<float> numbers(number_count, 0.0f);
std::memcpy(numbers.data(), buffer.data(), byte_length); // IS THIS SAFE??

// Perform the byte swap back to host order (little endian) in place,
// to avoid needing to allocate an intermediate uint32_t vector.
auto ptr = reinterpret_cast<uint8_t*>(numbers.data());
for (size_t i=0; i<number_count; ++i)
{
    // IS THIS SAFE??
    uint32_t datum;
    std::memcpy(&datum, ptr, sizeof(datum));
    datum = ::ntohl(datum); // datum is a plain uint32_t, not a pointer
    std::memcpy(ptr, &datum, sizeof(datum));
    ptr += sizeof(datum);
}

assert(numbers == source);

Note the two "IS THIS SAFE??" comments above.

Motivation: I'm writing a CBOR serialization library with support for typed arrays. CBOR allows typed arrays to be transmitted as either big endian or little endian.

EDIT: Replaced illegal reinterpret_cast<uint32_t*> type punning in endian swap loop with memcpy.

question from:https://stackoverflow.com/questions/65910802/copying-big-endian-float-data-directly-into-a-vectorfloat-and-byte-swapping-in

1 Reply

深蓝 · Answer 1 · 2021-10-06T19:12:28+0000

After your edit:

Regarding auto datum = reinterpret_cast<uint32_t*>(numbers.data());: this is not allowed in C++. The aliasing exception applies only to the char types (char, unsigned char, std::byte), so type punning through uint8_t is safe only where uint8_t is an alias for unsigned char, which in practice requires CHAR_BIT == 8.

Old answer: Below is for the question before the edit (the one with bit_cast).

This is safe, provided sizeof(float) == sizeof(uint32_t).

Don't worry about signaling NaNs. Floating-point exceptions are usually disabled, and even when they are enabled, they are raised only when an arithmetic operation consumes a signaling NaN. Plain move instructions, which is all a memcpy needs, do not raise exceptions.
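
To make the NaN point concrete, here is a minimal sketch (the helper name round_trip_through_float is made up for illustration). It copies four raw bytes into a float object and straight back out, with no floating-point arithmetic in between. It assumes a 32-bit float and a mainstream compiler that implements memcpy with plain byte or integer moves.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies 4 raw bytes into a float object and straight
// back out again. No floating-point arithmetic touches the value in between.
std::array<std::uint8_t, 4> round_trip_through_float(const std::array<std::uint8_t, 4>& in)
{
    static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

    float f;
    std::memcpy(&f, in.data(), sizeof f);   // bytes -> float object

    std::array<std::uint8_t, 4> out{};
    std::memcpy(out.data(), &f, sizeof f);  // float object -> bytes
    return out;
}
```

On a little-endian IEEE 754 host, the bytes {0x00, 0x00, 0xA0, 0x7F} form the bit pattern 0x7FA00000, which is a signaling NaN. Passing them through this function should hand back the same four bytes, because the value is never used in an arithmetic operation.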

Accessing the vector elements via the data() pointer is supported for both reading and writing. A vector is guaranteed to have contiguous storage.

But why not do everything in a single loop, without the temporary buffers?

Just have the float vector (input or output) and the data buffer (a uint8_t vector). For sending, iterate over the float input vector and, for each element, perform the byte swap and write the 4 bytes to the data buffer, one element at a time. Then you do not need any intermediate buffers, and it will probably not be slower. For receiving, do the reverse.
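
A rough sketch of that single-loop idea follows. The names encode_be and decode_be are invented for this example, and it assumes a 32-bit IEEE 754 float. It builds each word with shifts instead of calling htonl/ntohl, so the same code should work whatever the host byte order is.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Hypothetical helper: serialize floats as big-endian bytes in one pass.
std::vector<std::uint8_t> encode_be(const std::vector<float>& values)
{
    std::vector<std::uint8_t> out;
    out.reserve(values.size() * sizeof(float));
    for (float v : values)
    {
        std::uint32_t bits;
        std::memcpy(&bits, &v, sizeof bits);  // well-defined way to read the bit pattern
        out.push_back(static_cast<std::uint8_t>(bits >> 24));  // most significant byte first
        out.push_back(static_cast<std::uint8_t>(bits >> 16));
        out.push_back(static_cast<std::uint8_t>(bits >> 8));
        out.push_back(static_cast<std::uint8_t>(bits));
    }
    return out;
}

// Hypothetical helper: read big-endian floats from a possibly unaligned buffer.
// Any trailing bytes that do not fill a whole float are ignored.
std::vector<float> decode_be(const std::uint8_t* data, std::size_t byte_length)
{
    std::vector<float> out(byte_length / sizeof(float));
    for (std::size_t i = 0; i < out.size(); ++i)
    {
        const std::uint8_t* p = data + i * sizeof(float);
        const std::uint32_t bits = (std::uint32_t{p[0]} << 24) |
                                   (std::uint32_t{p[1]} << 16) |
                                   (std::uint32_t{p[2]} << 8)  |
                                    std::uint32_t{p[3]};
        std::memcpy(&out[i], &bits, sizeof bits);  // write the host-order bit pattern
    }
    return out;
}
```

With the question's data, decode_be(buffer.data(), buffer.size()) would fill the float vector directly from the network buffer, and no std::vector<uint32_t> is involved at any point.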

Use std::bit_cast to convert a float to and from std::array<uint8_t, 4>. This would be the "correct" way in C++20 (you can't use C arrays directly with bit_cast). With this approach you do not need to call ntohl; just copy the bytes in the correct order to or from the buffer.
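
The bit_cast variant could look roughly like this. All helper names (Bytes4, to_bytes, from_bytes, load_be_float, store_be_float) are made up for the sketch. It assumes a 32-bit float and, as stated in the question, a little-endian host, so the byte order is always reversed. std::bit_cast needs C++20; where the library does not provide it, the sketch falls back to memcpy, which has the same effect.

```cpp
#include <algorithm>
#include <array>
#include <bit>      // std::bit_cast (C++20)
#include <cstdint>
#include <cstring>

using Bytes4 = std::array<std::uint8_t, 4>;
static_assert(sizeof(float) == sizeof(Bytes4), "sketch assumes a 32-bit float");

// Hypothetical helpers: convert between a float and its object representation.
#if defined(__cpp_lib_bit_cast)
inline Bytes4 to_bytes(float value) { return std::bit_cast<Bytes4>(value); }
inline float from_bytes(const Bytes4& bytes) { return std::bit_cast<float>(bytes); }
#else
// Pre-C++20 fallback with the same effect as std::bit_cast.
inline Bytes4 to_bytes(float value)
{
    Bytes4 bytes;
    std::memcpy(bytes.data(), &value, sizeof value);
    return bytes;
}
inline float from_bytes(const Bytes4& bytes)
{
    float value;
    std::memcpy(&value, bytes.data(), sizeof value);
    return value;
}
#endif

// Assumption (taken from the question): the host is little endian, so the
// wire order is always the reverse of the in-memory order.
inline float load_be_float(const std::uint8_t* wire)
{
    const Bytes4 bytes{wire[3], wire[2], wire[1], wire[0]};
    return from_bytes(bytes);
}

inline void store_be_float(float value, std::uint8_t* wire)
{
    const Bytes4 bytes = to_bytes(value);
    std::reverse_copy(bytes.begin(), bytes.end(), wire);
}
```

The receive loop then becomes numbers[i] = load_be_float(buffer.data() + 4 * i); for each element. On a host that might be big endian, the reversal would have to be made conditional, for example on std::endian::native.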
