opengl - Should I ever use a `vec3` inside of a uniform buffer or shader storage buffer object?

Question

posted Oct 17, 2021 in Technique by 深蓝 (71.8m points)

The vec3 type is a very nice type. It only takes up 3 floats, and I have data that only needs 3 floats. And I want to use one in a structure in a UBO and/or SSBO:

layout(std140) uniform UBO
{
  vec4 data1;
  vec3 data2;
  float data3;
};

layout(std430) buffer SSBO
{
  vec4 data1;
  vec3 data2;
  float data3;
};

Then, in my C or C++ code, I can do this to create matching data structures:

struct UBO
{
  vector4 data1;
  vector3 data2;
  float data3;
};

struct SSBO
{
  vector4 data1;
  vector3 data2;
  float data3;
};

Is this a good idea?

1 Reply

深蓝 · Answer 1 · 2021-10-16T21:20:09+0000

NO! Never do this!

When declaring UBOs/SSBOs, pretend that all 3-element vector types don't exist. This includes column-major matrices with 3 rows and row-major matrices with 3 columns. Pretend that the only types are scalars, 2-element and 4-element vectors, and matrices built from them. You will save yourself a great deal of grief if you do so.

If you want the effect of a vec3 + a float, then you should pack it manually:

layout(std140) uniform UBO
{
  vec4 data1;
  vec4 data2and3;
};

Yes, you'll have to use data2and3.w to get the other value. Deal with it.

If you want arrays of vec3s, then make them arrays of vec4s. Same goes for matrices that use 3-element vectors. Just banish the entire concept of 3-element vectors from your SSBOs/UBOs; you'll be much better off in the long run.
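With the manual packing above, the C++ side becomes trivial to match. Here is a minimal sketch, assuming a 4-byte float; Vec4 and UBOData are hypothetical names standing in for whatever math types you already use:

```cpp
#include <cstddef>

// Hypothetical CPU-side vector: four tightly packed floats.
struct Vec4 { float x, y, z, w; };

// Mirrors: layout(std140) uniform UBO { vec4 data1; vec4 data2and3; };
struct UBOData
{
    Vec4 data1;      // std140 offset 0
    Vec4 data2and3;  // std140 offset 16; .xyz is the old data2, .w is the old data3
};

// These hold on any compiler where float is 4 bytes and Vec4 has no padding.
static_assert(sizeof(Vec4) == 16, "Vec4 must be four tightly packed floats");
static_assert(offsetof(UBOData, data2and3) == 16, "data2and3 must start at byte 16");
static_assert(sizeof(UBOData) == 32, "the whole block must be 32 bytes");
```

No alignment specifiers are needed here, because every member is already a multiple of 16 bytes long.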

There are two reasons why you should avoid vec3:

It won't do what C/C++ does

If you use std140 layout, then you will probably want to define data structures in C or C++ that match the definition in GLSL. That makes it easy to mix and match between the two. And std140 layout makes it at least possible to do this in most cases. But its layout rules don't match the usual layout rules for C and C++ compilers when it comes to vec3s.

Consider the following C++ definitions for a vec3 type:

struct vec3a { float a[3]; };
struct vec3f { float x, y, z; };

Both of these are perfectly legitimate types. The sizeof and layout of these types will match the size and layout that std140 requires for a vec3. But they do not match the alignment behavior that std140 imposes.

Consider this:

//GLSL
layout(std140) uniform Block
{
    vec3 a;
    vec3 b;
} block;

//C++
struct Block_a
{
    vec3a a;
    vec3a b;
};

struct Block_f
{
    vec3f a;
    vec3f b;
};

On most C++ compilers, sizeof for both Block_a and Block_f will be 24, which means that offsetof(Block_a, b) and offsetof(Block_f, b) will be 12.

In std140 layout, however, vec3 is always aligned to 4 words (16 bytes). Therefore, Block.b will have an offset of 16.
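The mismatch can be checked at compile time. This is only a sketch: it assumes a 4-byte float and a compiler that adds no padding to vec3f, which is typical but not guaranteed by the standard. The constant kStd140OffsetOfB is an illustrative name, not part of any API.

```cpp
#include <cstddef>

struct vec3f { float x, y, z; };

struct Block_f
{
    vec3f a;
    vec3f b;
};

// What a typical C++ compiler produces.
static_assert(sizeof(vec3f) == 12, "vec3f is three packed floats");
static_assert(offsetof(Block_f, b) == 12, "b directly follows a");
static_assert(sizeof(Block_f) == 24, "no padding anywhere");

// What std140 requires for Block.b (hand-written from the rule above).
constexpr std::size_t kStd140OffsetOfB = 16;

// The two layouts disagree, so copying Block_f into the buffer misplaces b.
static_assert(offsetof(Block_f, b) != kStd140OffsetOfB,
              "C++ layout unexpectedly matches std140");
```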

Now, you could try to fix that by using C++11's alignas functionality (or C11's similar _Alignas feature):

struct alignas(16) vec3a_16 { float a[3]; };
struct alignas(16) vec3f_16 { float x, y, z; };

struct Block_a
{
    vec3a_16 a;
    vec3a_16 b;
};

struct Block_f
{
    vec3f_16 a;
    vec3f_16 b;
};

If the compiler supports 16-byte alignment, this will work. Or at least, it will work in the case of Block_a and Block_f.
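As a sanity check of that claim, the following sketch asserts the resulting layout, assuming a 4-byte float and a compiler that honors alignas(16):

```cpp
#include <cstddef>

struct alignas(16) vec3f_16 { float x, y, z; };

struct Block_f
{
    vec3f_16 a;
    vec3f_16 b;
};

// alignas(16) raises the alignment and, as a side effect, pads sizeof to 16.
static_assert(alignof(vec3f_16) == 16, "type is 16-byte aligned");
static_assert(sizeof(vec3f_16) == 16, "12 bytes of data plus 4 bytes of padding");

// b now starts at byte 16, which is where std140 puts Block.b.
static_assert(offsetof(Block_f, b) == 16, "b matches the std140 offset");
```

Note the second assertion: the type has quietly grown to 16 bytes, which is what causes trouble in the next case.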

But it won't work in this case:

//GLSL
layout(std140) uniform Block2
{
    vec3 a;
    float b;
} block2;

//C++
struct Block2_a
{
    vec3a_16 a;
    float b;
};

struct Block2_f
{
    vec3f_16 a;
    float b;
};

By the rules of std140, each vec3 must start on a 16-byte boundary. But vec3 does not consume 16 bytes of storage; it only consumes 12. And since float can start on a 4-byte boundary, a vec3 followed by a float will take up 16 bytes.
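To make the arithmetic concrete, here is a small sketch that computes std140 offsets for a run of scalar and vector members. It is deliberately incomplete (no arrays, structs, or matrices), and Std140Type, alignUp, and memberOffset are hypothetical helper names:

```cpp
#include <cstddef>

// Base alignment and size, in bytes, of a few std140 member types.
struct Std140Type { std::size_t align; std::size_t size; };

constexpr Std140Type kFloat{4, 4};
constexpr Std140Type kVec2{8, 8};
constexpr Std140Type kVec3{16, 12};  // aligned like a vec4, but only 12 bytes long
constexpr Std140Type kVec4{16, 16};

// Round offset up to the next multiple of align.
constexpr std::size_t alignUp(std::size_t offset, std::size_t align)
{
    return (offset + align - 1) / align * align;
}

// Offset of a member of type t placed after a previous member ending at prevEnd.
constexpr std::size_t memberOffset(std::size_t prevEnd, Std140Type t)
{
    return alignUp(prevEnd, t.align);
}

// Block2 { vec3 a; float b; }
constexpr std::size_t kOffsetA = memberOffset(0, kVec3);
constexpr std::size_t kOffsetB = memberOffset(kOffsetA + kVec3.size, kFloat);

static_assert(kOffsetA == 0, "a starts the block");
static_assert(kOffsetB == 12, "b slots into the 4 bytes after the vec3");
static_assert(kOffsetB + kFloat.size == 16, "vec3 + float use 16 bytes in total");
```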

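But the rules of C++ alignment don't allow such a thing. If a type is aligned to an X-byte boundary, then its sizeof is a multiple of X, so every object of that type consumes a multiple of X bytes. The float cannot be placed in the padding at the end of the aligned vec3.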
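A compile-time sketch of this failure, again assuming a 4-byte float; kStd140OffsetOfB is an illustrative constant written by hand from the std140 rule:

```cpp
#include <cstddef>

struct alignas(16) vec3f_16 { float x, y, z; };

struct Block2_f
{
    vec3f_16 a;
    float b;
};

// The aligned vec3 occupies a full 16 bytes, so b is pushed to byte 16.
static_assert(sizeof(vec3f_16) == 16, "sizeof is a multiple of the alignment");
static_assert(offsetof(Block2_f, b) == 16, "b cannot live in a's tail padding");
static_assert(sizeof(Block2_f) == 32, "struct is padded to a multiple of 16");

// std140 places Block2.b immediately after the vec3's 12 bytes.
constexpr std::size_t kStd140OffsetOfB = 12;
static_assert(offsetof(Block2_f, b) != kStd140OffsetOfB,
              "C++ layout unexpectedly matches std140");
```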

So matching std140's layout requires that you pick a type based on exactly where it is used. If it's followed by a float, you have to use vec3a; if it's followed by some type that is more than 4 byte aligned, you have to use vec3a_16.

Or you can just not use vec3s in your shaders and avoid all this added complexity.

Note that an alignas(8)-based vec2 will not have this problem, because its size already equals its alignment. Nor will C/C++ structs and arrays using the proper alignment specifier (though arrays of smaller types have their own issues). This problem only occurs when using a naked vec3.
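For comparison, a sketch of the vec2 case, assuming a 4-byte float. The GLSL block it is meant to mirror, layout(std140) uniform Block3 { vec2 a; float b; }, and the names vec2f_8 and Block3 are made up for illustration:

```cpp
#include <cstddef>

struct alignas(8) vec2f_8 { float x, y; };

struct Block3
{
    vec2f_8 a;
    float b;
};

// Size and alignment are both 8, so there is no tail padding to fight over.
static_assert(alignof(vec2f_8) == 8, "8-byte aligned");
static_assert(sizeof(vec2f_8) == 8, "no padding added");

// std140 puts the vec2 at offset 0 and the float at offset 8; so does C++.
static_assert(offsetof(Block3, b) == 8, "b matches the std140 offset");
```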

Implementation support is fuzzy

Even if you do everything right, implementations have been known to implement vec3's oddball layout rules incorrectly. Some implementations effectively impose C++ alignment rules on GLSL: a vec3 is treated the way C++ would treat a 16-byte-aligned type, so it occupies a full 16 bytes. On these implementations, a vec3 followed by a float will work like a vec4 followed by a float.

Yes, it's the implementers' fault. But since you can't fix the implementation, you have to work around it. And the most reasonable way to do that is to just avoid vec3 altogether.

Note that for Vulkan (and OpenGL using SPIR-V), the SDK's GLSL compiler gets this right, so you don't need to worry about it there.
