Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
270 views
in Technique[技术] by (71.8m points)

c - Aliasing Arrays through structs

I'm reading paragraph 7 of 6.5 in ISO/IEC 9899:TC2.

It condones lvalue access to an object through:

an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),

Please refer to the document for what 'aforementioned' types are but they certainly include the effective type of the object.

It is in a section noted as:

The intent of this list is to specify those circumstances in which an object may or may not be aliased.

I read this as saying (for example) that the following is well defined:

#include <stdlib.h>
#include <stdio.h>

typedef struct {
    unsigned int x;
} s;

int main(void){
    unsigned int array[3] = {73,74,75};

   s* sp=(s*)&array; 

   sp->x=80;

   printf("%d
",array[0]);

   return EXIT_SUCCESS;
}

This program should output 80.

I'm not advocating this as a good (or very useful) idea and concede I'm in part interpreting it that way because I can't think what else that means and can't believe it's a meaningless sentence!

That said, I can't see a very good reason to forbid it. What we do know is that the alignment and memory contents at that location are compatible with sp->x so why not?

It seems to go so far as to say if I add (say) a double y; to the end of the struct I can still access array[0] through sp->x in this way.

However even if the array is larger than sizeof(s) any attempt to access sp->y is 'all bets off' undefined behaviour.

Might I politely ask for people to say what that sentence condones rather than go into a flat spin shouting 'strict aliasing UB strict aliasing UB' as seems to be all too often the way of these things.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The answer to this question is covered in proposal: Fixing the rules for type-based aliasing which we will see, unfortunately was not resolved in 2010 when the proposal was made which is covered in Hedquist, Bativa, November 2010 minutes . Therefore C11 does not contain a resolution to N1520, so this is an open issue:

There does not seem to be any way that this matter will be resolved at this meeting. Each thread of proposed approaches leads to more questions. 1 1:48 am, Thurs, Nov 4, 2010.

ACTION – Clark do more work in

N1520 opens saying (emphasis mine going forward):

Richard Hansen pointed out a problem in the type-based aliasing rules, as follows:

My question concerns the phrasing of bullet 5 of 6.5p7 (aliasing as it applies to unions/aggregates). Unless my understanding of effective type is incorrect, it seems like the union/aggregate condition should apply to the effective type, not the lvalue type.

Here are some more details:

Take the following code snippet as an example:

union {int a; double b;} u;
u.a = 5;

From my understanding of the definition of effective type (6.5p6), the effective type of the object at location &u is union {int a; double b;}. The type of the lvalue expression that is accessing the object at &u (in the second line) is int.

From my understanding of the definition of compatible type (6.2.7), int is not compatible with union {int a; double b;}, so bullets 1 and 2 of 6.5p7 do not apply. int is not the signed or unsigned type of the union type, so bullets 3 and 4 do not apply. int is not a character type, so bullet 6 does not apply.

That leaves bullet 5. However, int is not an aggregate or union type, so that bullet also does not apply. That means that the above code violates the aliasing rule, which it obviously should not.

I believe that bullet 5 should be rephrased to indicate that if the effective type (not the lvalue type) is an aggregate or union type that contains a member with type compatible with the lvalue type, then the object may be accessed.

Effectively, what he points out is that the rules are asymmetrical with respect to struct/union membership. I have been aware of this situation, and considered it a (non-urgent) problem, for quite some time. A series of examples will better illustrate the problem. (These examples were originally presented at the Santa Cruz meeting.)

In my experience with questions about whether aliasing is valid based on type constraints, the question is invariably phrased in terms of loop invariance. Such examples bring the problem into extremely sharp focus.

And the relevant example that applies to this situation would be 3 which is as follows:

struct S { int a, b; };
void f3(int *pi, struct S *ps1, struct S const *ps2)
{
  for (*pi = 0; *pi < 10; ++*pi) {
      *ps1++ = *ps2;
  }
}

The question here is whether the object *ps2 may be accessed (and especially modified) by assigning to the lvalue *pi — and if so, whether the standard actually says so. It could be argued that this is not covered by the fifth bullet of 6.5p7, since *pi does not have aggregate type at all.

**Perhaps the intention is that the question should be turned around: is it allowed to access the value of the object pi by the lvalue ps2. Obviously, this case would be covered by the fifth bullet.

All I can say about this interpretation is that it never occurred to me as a possibility until the Santa Cruz meeting, even though I've thought about these rules in considerable depth over the course of many years. Even if this case might be considered to be covered by the existing wording, I'd suggest that it might be worth looking for a less opaque formulation.

The following discussion and proposed solutions are very long and hard to summarize but seems to end with a removal of the aforementioned bullet five and resolve the issue with adjustments to other parts of 6.5. But as noted above this issues involved were not resolvable and I don't see a follow-up proposal.

So it would seem the standard wording permits the scenario the OP demonstrates although my understanding is that this was unintentional and therefore I would avoid it and it could potentially change in later standards to be non-conforming.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...