Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
162 views
in Technique[技术] by (71.8m points)

consequences of multiple weak symbols(C Linker)

I just have a question about potential problem of multiple weak symbols, this question is from my textbook:

One module A:

int x;
int y;
p1() {...}

the other module B:

double x;
p2() {...}

and my textbook says that 'write to x in p2 might overwrite y' I can kind of get the idea of the textbook( double x is twice the size of int x, and int y is placed right after int x, here comes the problem), but still lost in details, I know when there are multiple weak symbols, the linker will just randomly pick one, so my question is, which x of module that the linker choose will result in writing to x in p2 will overwrite y.

This is my understanding :if the linker choose the int x of module A will result in the consequence, because in that way x,y are both 4 bytes and the p2(image after compilation there is one assembly code movq compared by movl in p1 )will change 8 bytes therefore overwrite y.

But my instructor said if only the linker choose double x of module B, that will result in overwriting y, how come, am I correct or my instructor is correct?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

According to ISO C, the program invokes undefined behavior. An external name which is used must have exactly one definition somewhere in the program.*

"Weak symbols" are a concept in some dynamic library systems like ELF on GNU/Linux. That terminology does not apply here. A linker which allows multiple definitions of an external symbol is said to be implementing the "relaxed ref/def" model. This term comes from section 6.1.2.2 of the ANSI C rationale.

If we regard the relaxed ref/def model as a documented language extension, then the multiple definitions of a name become locally defined behavior. However, what if they are inconsistently typed? That is almost certainly undefined by the reasoning that the situation resembles bad type aliasing. It is possible that if one module has int x; int y; and the other has double x, that a write through the double x alias will clobber y. This isn't something you can portably rely on. It's a very poor way to obtain an aliasing effect on purpose; you want to use a union between two structures or some such.

Now about "weak symbols": those are external names in shared libraries that can be overridden by alternative definitions. For instance, most of the functions in the GNU C library on a GNU/Linux system are weak symbols. A program can define its own read function to replace the POSIX one, for instance. The library itself will not break not matter how read is redefined; when it needs to call read, it doesn't use the weak symbol read but some internal alias like __libc_read. This mechanism is important; it allows the library to conform to ISO C. A strictly conforming ISO C program is allowed to use read as an external name.


* In the ISO C99 standard, this was given in 6.9 External Definitions: "If an identifier declared with external linkage is used in an expression (other than as part of the operand of a sizeof operator whose result is an integer constant), somewhere in the entire program there shall be exactly one external definition for the identifier; otherwise, there shall be no more than one."


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...