Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
403 views
in Technique[技术] by (71.8m points)

ruby - Deleting a modified object from a set in a no-op?

See the example below

require "set"
s = [[1, 2], [3, 4]].to_set # s = {[1, 2], [3, 4]}
m = s.max_by {|a| a[0]} # m = [3, 4]
m[0] = 9 # m = [9, 4], s = {[1, 2], [9, 4]}
s.delete(m) # s = {[1, 2], [9, 4]} ?????

This behaves differently from an array. (If we remove .to_set, we will get s = [[1, 2]] which is expected.) Is this a bug?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Yes, this is a bug or at least I'd call it a bug. Some would call this "an implementation detail accidentally leaking to the outside world" but that's just fancy pants city-boy talk for bug.

The problem has two main causes:

  1. You're modifying elements of the Set without Set knowing about it.
  2. The standard Ruby Set is implemented as a Hash.

The result is that you're modifying the internal Hash's keys without the Hash knowing about it and that confuses the poor Hash into not really knowing what keys it has anymore. The Hash class has a rehash method:

rehash → hsh

Rebuilds the hash based on the current hash values for each key. If values of key objects have changed since they were inserted, this method will reindex hsh.

a = [ "a", "b" ]
c = [ "c", "d" ]
h = { a => 100, c => 300 }
h[a]       #=> 100
a[0] = "z"
h[a]       #=> nil
h.rehash   #=> {["z", "b"]=>100, ["c", "d"]=>300}
h[a]       #=> 100

Notice the interesting behavior in the example included with the rehash documentation. Hashes keep track of things using the k.hash values for the key k. If you have an array as a key and you change the array, you can change the array's hash value as well; the result is that the Hash still has that array as a key but the Hash won't be able to find that array as a key because it will be looking in the bucket for the new hash value but the array will be in the bucket for the old hash value. But, if you rehash the Hash, it will all of a sudden be able to find all of its keys again and the senility goes away. Similar problems will occur with non-array keys: you just have to change the key in such a way that its hash value changes and the Hash containing that key will get confused and wander around lost until you rehash it.

The Set class uses a Hash internally to store its members and the members are used as the hash's keys. So, if you change a member, the Set will get confused. If Set had a rehash method then you could kludge around the problem by slapping the Set upside the head with rehash to knock some sense into it; alas, there is no such method in Set. However, you can monkey patch your own in:

class Set
  def rehash
    @hash.rehash
  end
end

Then you can change the keys, call rehash on the Set, and your delete (and various other methods such as member?) will work properly.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...