As we fill in the components, at each stage there are three cases to consider (as you will have to match up overlapping groups):
- Neither x nor y is in any component found so far.
- Both are already in different sets, x in set_i and y in set_j.
- Only one of them is already in a component, x in set_i or y in set_j (if both are already in the same component, there is nothing to do).
We can use Python's built-in set type to help (see also @jwpat's and @DSM's trickier examples):
    def connected_components(lst):
        components = []  # list of sets
        for (x, y) in lst:
            # find which existing components (if any) contain x and y
            i = j = set_i = set_j = None
            for k, c in enumerate(components):
                if x in c:
                    i, set_i = k, c
                if y in c:
                    j, set_j = k, c

            # case 1: neither is in a component (or both are already in the same set)
            if i == j:
                if i is None:
                    components.append(set([x, y]))
                continue

            # case 2: both are in different sets, so merge the two sets
            if i is not None and j is not None:
                components = [components[k] for k in range(len(components)) if k != i and k != j]
                components.append(set_i | set_j)
                continue

            # case 3: only one is in a set, so add the other to it
            if j is not None:
                components[j].add(x)
            if i is not None:
                components[i].add(y)

        return components
    lst = [(1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2)]
    connected_components(lst)
    # [set([8, 1, 2, 3, 4]), set([5, 6, 7])]

    map(list, connected_components(lst))
    # [[8, 1, 2, 3, 4], [5, 6, 7]]

    connected_components([(1, 2), (4, 3), (2, 3), (5, 6), (6, 7), (8, 2)])
    # [set([8, 1, 2, 3, 4]), set([5, 6, 7])]  # @jwpat's example

    connected_components([[1, 3], [2, 4], [3, 4]])
    # [set([1, 2, 3, 4])]  # @DSM's example
This certainly won't be the most efficient method, but it is perhaps similar to what they would expect. As Jon Clements points out, there is a library for these types of calculations, networkx, where they will be much more efficient.
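If you want something faster without pulling in a library, one common option is a disjoint-set (union-find) structure. This is a different technique from the set-merging code above, and the sketch below is only a minimal illustration: the name connected_components_uf and its helper find are made up for this example, and it assumes the items are hashable.

```python
def connected_components_uf(pairs):
    """Group the items in `pairs` into connected components (union-find sketch)."""
    parent = {}  # maps each item to its parent in the disjoint-set forest

    def find(item):
        # Register unseen items as their own root.
        parent.setdefault(item, item)
        # Walk up to the root, shortening the path as we go.
        while parent[item] != item:
            parent[item] = parent[parent[item]]
            item = parent[item]
        return item

    for x, y in pairs:
        root_x, root_y = find(x), find(y)
        if root_x != root_y:
            parent[root_y] = root_x  # merge the two components

    # Collect the members of each component under its root.
    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


lst = [(1, 2), (2, 3), (4, 3), (5, 6), (6, 7), (8, 2)]
print(sorted(sorted(c) for c in connected_components_uf(lst)))
# [[1, 2, 3, 4, 8], [5, 6, 7]]
```

Each pair now costs a couple of short root lookups instead of a scan over every component found so far, which should matter mostly when there are many pairs.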