Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

c# - Is Contains thread safe in HashSet<T>

Looking at the code for Contains in the HashSet<T> class in the .NET source code, I cannot find any reason why Contains would not be thread safe.

I am loading a HashSet<T> with values ahead of time, and then calling Contains in a multithreaded .AsParallel() loop.

Is there any reason why this would not be safe? I am loath to use ConcurrentDictionary when I don't actually need to store values.
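
The scenario described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name ReadOnlySetDemo and the method CountKnown are hypothetical, and the set is assumed to be fully populated before the parallel query starts and never modified afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ReadOnlySetDemo
{
    // Hypothetical helper: counts how many candidates are present in a set
    // that is completely built before the parallel query starts.
    public static int CountKnown(IEnumerable<int> known, IEnumerable<int> candidates)
    {
        // All writes happen here, on one thread, before any reader runs.
        var set = new HashSet<int>(known);

        // From here on the set is only read. Contains is called from
        // several worker threads, but nothing modifies the set.
        return candidates.AsParallel().Count(c => set.Contains(c));
    }
}
```

For example, CountKnown(Enumerable.Range(0, 1000), Enumerable.Range(500, 1000)) counts the 500 values that appear in both ranges.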



1 Reply


Normally (and I stress normally), collections that are used only for reading are "unofficially" thread safe (I know of no collection in .NET that modifies itself during a read). There are some caveats:

  • The items themselves might not be thread safe. With a HashSet<T> this problem should be minimal, because you can't extract items from it, but GetHashCode() and Equals() must still be thread safe. If, for example, they access lazy objects that are loaded on demand, or they cache/memoize data to speed up subsequent operations, they might not be thread safe.
  • You must be sure that the last write is followed by a Thread.MemoryBarrier() (executed on the same thread as the write) or an equivalent; otherwise a read on another thread could see incomplete data.
  • You must be sure that each thread (other than the one that did the write) executes a Thread.MemoryBarrier() before its first read. Note that if the HashSet<T> was "prepared" (with the Thread.MemoryBarrier() at the end) before the other threads were created/started, then this Thread.MemoryBarrier() isn't necessary, because those threads can't have a stale view of the memory (they didn't exist yet). Various operations cause an implicit Thread.MemoryBarrier(). For example, if the threads were created before the HashSet<T> was filled, entered a Wait(), and were released after the HashSet<T> was filled (plus its Thread.MemoryBarrier()), then exiting the Wait() causes an implicit Thread.MemoryBarrier().
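
The last caveat (readers that already exist while the set is being filled) can be sketched as follows. This is a minimal sketch, not a definitive implementation: the class GatedReadersDemo and its Run method are hypothetical names, and it assumes that ManualResetEventSlim.Set() and Wait() act as the implicit barriers described above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class GatedReadersDemo
{
    // Hypothetical helper: starts the readers before the set is filled,
    // blocks them on a gate, fills the set, then opens the gate.
    // Returns how many readers found the probe value.
    public static int Run(int readerCount, int probe)
    {
        var set = new HashSet<int>();
        var gate = new ManualResetEventSlim(false);
        int found = 0;

        var readers = new Thread[readerCount];
        for (int i = 0; i < readerCount; i++)
        {
            readers[i] = new Thread(() =>
            {
                // Readers exist before the set is filled, so they must not
                // touch it until the writer signals that it is done.
                gate.Wait();
                if (set.Contains(probe))
                {
                    Interlocked.Increment(ref found);
                }
            });
            readers[i].Start();
        }

        // Single writer: fill the set while every reader is still blocked.
        for (int v = 0; v < 100; v++)
        {
            set.Add(v);
        }

        // Opening the gate is the synchronization point between the writes
        // above and the reads in the worker threads.
        gate.Set();

        foreach (var t in readers)
        {
            t.Join();
        }
        gate.Dispose();
        return found;
    }
}
```

With the values 0 to 99 in the set, Run(4, 42) returns 4, because every reader finds the probe. No reader touches the set before the gate opens, and nothing writes to it afterwards.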

Here is a simple example of a class that uses memoization/lazy loading (whatever you want to call it) and in that way can break thread safety:

public class MyClass
{
    private long value2;

    public int Value1 { get; set; }

    // Value2 is lazily loaded in a very primitive
    // way (note that Lazy<T> *can* be used thread-safely!)
    public long Value2
    {
        get
        {
            if (value2 == 0)
            {
                // value2 is a long. In a 32-bit process, the
                // assignment of a long (64 bits) isn't atomic :)
                value2 = LoadFromServer();

                // Suppose thread1 sees value2 == 0, loads the value and
                // begins writing it, and thread2 reads value2 after only
                // the first 32 bits have been written. thread2 then
                // reads "incomplete" (torn) data.
                // - If the torn value is == 0, thread2 calls
                //   LoadFromServer() a second time. If the operation is
                //   repeatable, this only wastes time. If it isn't,
                //   there is a problem (for example an exception, or
                //   different data if the operation isn't
                //   deterministic).
                // - If the torn value is != 0, thread2 returns
                //   incomplete data.
            }

            return value2;
        }
    }

    private long LoadFromServer()
    {
        // This is a slow operation that justifies a lazy property
        return 1; 
    }

    public override int GetHashCode()
    {
        // The GetHashCode doesn't use Value2, because it
        // wants to be fast
        return Value1;
    }

    public override bool Equals(object obj)
    {
        MyClass obj2 = obj as MyClass;

        if (obj2 == null)
        {
            return false;
        }

        // The equality operator uses Value2, because it
        // wants to be correct.
        // Note that probably the HashSet<T> doesn't need to
        // use the Equals method on Add, if there are no
        // other objects with the same GetHashCode
        // (and surely, if the HashSet is empty and you Add a
        // single object, that object won't be compared with
        // anything, because there isn't anything to compare
        // it with! :-) )

        // Clearly the Equals is used by the Contains method
        // of the HashSet
        return Value1 == obj2.Value1 && Value2 == obj2.Value2;
    }
}
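
The comment in the example notes that Lazy<T> can be used thread-safely. A minimal sketch of that variant follows, assuming the same stand-in LoadFromServer() that returns 1; the class name MyClassSafe is hypothetical. With LazyThreadSafetyMode.ExecutionAndPublication, the factory runs only once even if several threads request the value at the same time, so readers should never observe a half-written long.

```csharp
using System;
using System.Threading;

public class MyClassSafe
{
    // Lazy<T> takes care of the locking and publication that the
    // hand-written "if (value2 == 0)" check above lacks.
    private readonly Lazy<long> value2;

    public MyClassSafe()
    {
        value2 = new Lazy<long>(LoadFromServer, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    public int Value1 { get; set; }

    // The first access runs LoadFromServer(); later accesses return the
    // cached result.
    public long Value2 => value2.Value;

    private long LoadFromServer()
    {
        // Stand-in for the slow operation, as in the original example.
        return 1;
    }

    public override int GetHashCode()
    {
        return Value1;
    }

    public override bool Equals(object obj)
    {
        return obj is MyClassSafe other
            && Value1 == other.Value1
            && Value2 == other.Value2;
    }
}
```

As in the original class, Value1 feeds GetHashCode(), so it must not change while the object is stored in a HashSet<T>.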
