Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
212 views
in Technique[技术] by (71.8m points)

c++ - Using clang++, -fvisibility=hidden, and typeinfo, and type-erasure

This is a scaled down version of a problem I am facing with clang++ on Mac OS X. This was seriously edited to better reflect the genuine problem (the first attempt to describe the issue was not exhibiting the problem).

The failure

I have this big piece of software in C++ with a large set of symbols in the object files, so I'm using -fvisibility=hidden to keep my symbol tables small. It is well known that in such a case one must pay extra attention to the vtables, and I suppose I face this problem. I don't know however, how to address it elegantly in a way that pleases both gcc and clang.

Consider a base class which features a down-casting operator, as, and a derived class template, that contains some payload. The pair base/derived<T> is used to implement type-erasure:

// foo.hh

#define API __attribute__((visibility("default")))

struct API base
{
  virtual ~base() {}

  template <typename T>
  const T& as() const
  {
    return dynamic_cast<const T&>(*this);
  }
};

template <typename T>
struct API derived: base
{};

struct payload {}; // *not* flagged as "default visibility".

API void bar(const base& b);
API void baz(const base& b);

Then I have two different compilation units that provide a similar service, which I can approximate as twice the same feature: down-casting from base to derive<payload>:

// bar.cc
#include "foo.hh"
void bar(const base& b)
{
  b.as<derived<payload>>();
}

and

// baz.cc
#include "foo.hh"
void baz(const base& b)
{
  b.as<derived<payload>>();
}

From these two files, I build a dylib. Here is the main function, calling these functions from the dylib:

// main.cc
#include <stdexcept>
#include <iostream>
#include "foo.hh"

int main()
try
  {
    derived<payload> d;
    bar(d);
    baz(d);
  }
catch (std::exception& e)
  {
    std::cerr << e.what() << std::endl;
  }

Finally, a Makefile to compile and link everybody. Nothing special here, except, of course, -fvisibility=hidden.

CXX = clang++
CXXFLAGS = -std=c++11 -fvisibility=hidden

all: main

main: main.o bar.dylib baz.dylib
    $(CXX) -o $@ $^

%.dylib: %.cc foo.hh
    $(CXX) $(CXXFLAGS) -shared -o $@ $<

%.o: %.cc foo.hh
    $(CXX) $(CXXFLAGS) -c -o $@ $<

clean:
    rm -f main main.o bar.o baz.o bar.dylib baz.dylib libba.dylib

The run succeeds with gcc (4.8) on OS X:

$ make clean && make CXX=g++-mp-4.8 && ./main 
rm -f main main.o bar.o baz.o bar.dylib baz.dylib libba.dylib
g++-mp-4.8 -std=c++11 -fvisibility=hidden -c main.cc -o main.o
g++-mp-4.8 -std=c++11 -fvisibility=hidden -shared -o bar.dylib bar.cc
g++-mp-4.8 -std=c++11 -fvisibility=hidden -shared -o baz.dylib baz.cc
g++-mp-4.8 -o main main.o bar.dylib baz.dylib

However with clang (3.4), this fails:

$ make clean && make CXX=clang++-mp-3.4 && ./main
rm -f main main.o bar.o baz.o bar.dylib baz.dylib libba.dylib
clang++-mp-3.4 -std=c++11 -fvisibility=hidden -c main.cc -o main.o
clang++-mp-3.4 -std=c++11 -fvisibility=hidden -shared -o bar.dylib bar.cc
clang++-mp-3.4 -std=c++11 -fvisibility=hidden -shared -o baz.dylib baz.cc
clang++-mp-3.4 -o main main.o bar.dylib baz.dylib
std::bad_cast

However it works if I use

struct API payload {};

but I do not want to expose the payload type. So my questions are:

  1. why are GCC and Clang different here?
  2. is it really working with GCC, or I was just "lucky" in my use of undefined behavior?
  3. do I have a means to avoid making payload go public with Clang++?

Thanks in advance.

Type equality of visible class templates with invisible type parameters (EDIT)

I have now a better understanding of what is happening. It is appears that both GCC and clang require both the class template and its parameter to be visible (in the ELF sense) to build a unique type. If you change the bar.cc and baz.cc functions as follows:

// bar.cc
#include "foo.hh"
void bar(const base& b)
{
  std::cerr
    << "bar value: " << &typeid(b) << std::endl
    << "bar type:  " << &typeid(derived<payload>) << std::endl
    << "bar equal: " << (typeid(b) == typeid(derived<payload>)) << std::endl;
  b.as<derived<payload>>();
}

and if you make payload visible too:

struct API payload {};

then you will see that both GCC and Clang will succeed:

$ make clean && make CXX=g++-mp-4.8
rm -f main main.o bar.o baz.o bar.dylib baz.dylib libba.dylib
g++-mp-4.8 -std=c++11 -fvisibility=hidden -c -o main.o main.cc
g++-mp-4.8 -std=c++11 -fvisibility=hidden -shared -o bar.dylib bar.cc
g++-mp-4.8 -std=c++11 -fvisibility=hidden -shared -o baz.dylib baz.cc
./g++-mp-4.8 -o main main.o bar.dylib baz.dylib
$ ./main
bar value: 0x106785140
bar type:  0x106785140
bar equal: 1
baz value: 0x106785140
baz type:  0x106785140
baz equal: 1

$ make clean && make CXX=clang++-mp-3.4
rm -f main main.o bar.o baz.o bar.dylib baz.dylib libba.dylib
clang++-mp-3.4 -std=c++11 -fvisibility=hidden -c -o main.o main.cc
clang++-mp-3.4 -std=c++11 -fvisibility=hidden -shared -o bar.dylib bar.cc
clang++-mp-3.4 -std=c++11 -fvisibility=hidden -shared -o baz.dylib baz.cc
clang++-mp-3.4 -o main main.o bar.dylib baz.dylib
$ ./main
bar value: 0x10a6d5110
bar type:  0x10a6d5110
bar equal: 1
baz value: 0x10a6d5110
baz type:  0x10a6d5110
baz equal: 1

Type equality is easy to check, there is actually a single instantiation of the type, as witnessed by its unique address.

However, if you remove the visible attribute from payload:

struct payload {};

then you get with GCC:

$ make clean && make CXX=g++-mp-4.8
rm -f main main.o bar.o baz.o bar.dylib baz.dylib libba.dylib
g++-mp-4.8 -std=c++11 -fvisibility=hidden -c -o main.o main.cc
g++-mp-4.8 -std=c++11 -fvisibility=hidden -shared -o bar.dylib bar.cc
g++-mp-4.8 -std=c++11 -fvisibility=hidden -shared -o baz.dylib baz.cc
g++-mp-4.8 -o main main.o bar.dylib baz.dylib
$ ./main
bar value: 0x10faea120
bar type:  0x10faf1090
bar equal: 1
baz value: 0x10faea120
baz type:  0x10fafb090
baz equal: 1

Now there are several instantiation of the type derived<payload> (as witnessed by the three different addresses), but GCC sees these types are equal, and (of course) the two dynamic_cast pass.

In the case of clang, it's different:

$ make clean && make CXX=clang++-mp-3.4
rm -f main main.o bar.o baz.o bar.dylib baz.dylib libba.dylib
clang++-mp-3.4 -std=c++11 -fvisibility=hidden -c -o main.o main.cc
clang++-mp-3.4 -std=c++11 -fvisibility=hidden -shared -o bar.dylib bar.cc
clang++-mp-3.4 -std=c++11 -fvisibility=hidden -shared -o baz.dylib baz.cc
.clang++-mp-3.4 -o main main.o bar.dylib baz.dylib
$ ./main
bar value: 0x1012ae0f0
bar type:  0x1012b3090
bar equal: 0
std::bad_cast

There are also three instantiations of the type (removing the failing dynamic_cast does show that there are three), but this time, they are not equal, and the dynamic_cast (of course) fails.

Now the question turns into: 1. is this difference between both compilers wanted by their authors 2. if not, what is "expected" behavior between both

I prefer GCC's semantics, as it allows to really implement type-erasure without any need to expose publicly the wrapped types.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

I had reported this to the people from LLVM, and it was first noted that if it works in the case of GCC, it's because:

I think the difference is actually in the c++ library. It looks like libstdc++ changed to always use strcmp of the typeinfo names:

https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=149964

Should we do the same with libc++?

To this, it was clearly answered that:

No. It pessimizes correctly behaving code to work around code that violates the ELF ABI. Consider an application that loads plugins with RTLD_LOCAL. Two plugins implement a (hidden) type called "Plugin". The GCC change now makes this completely separate types identical for all RTTI purposes. That makes no sense at all.

So I can't do what I want with Clang: reduce the number of published symbols. But it appears to be saner than the current behavior of GCC. Too bad.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...