I came up with the following implementation of Greedy Set Cover after much discussion regarding my original question here. From the help I received, I encoded the problem as a "Greedy Set Cover", and after some more help here, I arrived at the implementation below. I am thankful to everyone for helping me out. The implementation works fine, but I want to make it scalable/faster.
By scalable/faster, I mean:
- My dataset contains about 50K-100K sets in S
- The number of elements in U is very small, on the order of 100-500
- The size of each set in S could be anywhere from 0 to 40
And here goes my attempt:
U = set([1, 2, 3, 4])
R = set(U)  # R holds the elements still uncovered
S = [set([1, 2]),
     set([1]),
     set([1, 2, 3]),
     set([1]),
     set([3, 4]),
     set([4]),
     set([1, 2]),
     set([3, 4]),
     set([1, 2, 3, 4])]
w = [1, 1, 2, 2, 2, 3, 3, 4, 4]
C = []
costs = []

def findMin(S, R):
    # Pick the set with the lowest cost per newly covered element.
    minCost = float('inf')
    minElement = -1
    for i, s in enumerate(S):
        covered = len(s.intersection(R))
        if covered == 0:
            continue  # this set adds nothing new, skip it
        cost = float(w[i]) / covered  # float() avoids integer division
        if cost < minCost:
            minCost = cost
            minElement = i
    return S[minElement], w[minElement]

while len(R) != 0:
    S_i, cost = findMin(S, R)
    C.append(S_i)
    R = R.difference(S_i)
    costs.append(cost)

print "Cover: ", C
print "Total Cost: ", sum(costs), costs
I am not an expert in Python, but any Python-specific optimizations to this code would be really nice.
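The main cost I see is that findMin recomputes every intersection on every pass. Below is a rough, untested sketch of an incremental variant (the name greedy_set_cover_fast and all helper names are mine, not from any library): build an inverted index from each element to the sets containing it, keep a live count of |S_i & R| per set, and after picking a set only touch the sets that share a newly covered element:

from collections import defaultdict

def greedy_set_cover_fast(U, S, w):
    # Inverted index: element -> indices of the sets containing it.
    membership = defaultdict(list)
    for i, s in enumerate(S):
        for e in s:
            membership[e].append(i)

    uncovered = [len(s) for s in S]  # live |S_i & R| for each set
    R = set(U)
    C, costs = [], []

    while R:
        # Still a linear scan, but each candidate is now an O(1)
        # counter lookup instead of an O(|S_i|) set intersection.
        best, bestCost = -1, float('inf')
        for i, cnt in enumerate(uncovered):
            if cnt == 0:
                continue
            cost = float(w[i]) / cnt
            if cost < bestCost:
                best, bestCost = i, cost

        C.append(S[best])
        costs.append(w[best])

        # Decrement the counters only for sets that contain a newly
        # covered element, then shrink R.
        newly = S[best] & R
        for e in newly:
            for i in membership[e]:
                uncovered[i] -= 1
        R -= newly

    return C, costs

This keeps the per-pass work at one integer comparison per set; replacing the linear scan with a heap (re-inserting entries lazily when their count goes stale) would be the next step, but I have not measured either version yet.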