I have a list of list as in the code I attached. I want to link each sub list if there are any common values. I then want to replace the list of list with a condensed list of list. Examples: if I have a list [[1,2,3],[3,4]]
I want [1,2,3,4]
. If I have [[4,3],[1,2,3]]
I want [4,3,1,2]
. If I have [[1,2,3],[a,b],[3,4],[b,c]]
I want [[1,2,3,4],[a,b,c]]
or [[a,b,c],[1,2,3,4]]
I don't care which one.
I am almost there...
My problem is when I have a case like [[1,2,3],[10,5],[3,8,5]]
I want [1,2,3,10,5,8]
but with my current code I get [1,2,3,8,10,5]
Here is my code:
import itertools
a = [1,2,3]
b = [3,4]
i = [21,22]
c = [88,7,8]
e = [5,4]
d = [3, 50]
f = [8,9]
g= [9,10]
h = [20,21]
lst = [a,b,c,i,e,d,f,g,h,a,c,i]*1000
#I have a lot of list but not very many different lists
def any_overlap(a, b):
sb = set(b)
return any(itertools.imap(sb.__contains__, a))
def find_uniq(lst):
''' return the uniq parts of lst'''
seen = set()
seen_add = seen.add
return [ x for x in lst if x not in seen and not seen_add(x)]
def overlap_inlist(o_lst, lstoflst):
'''
Search for overlap, using "any_overlap", of a list( o_lst) in a list of lists (lstoflst).
If there is overlap add the uniq part of the found list to the search list, and keep
track of where that list was found
'''
used_lst =[ ]
n_lst =[ ]
for lst_num, each_lst in enumerate(lstoflst):
if any_overlap(o_lst,each_lst):
n_lst.extend(each_lst)
used_lst.append(lst_num)
n_lst= find_uniq(n_lst)
return n_lst, used_lst
def comb_list(lst):
'''
For each list in a list of list find all the overlaps using 'ovelap_inlist'.
Update the list each time to delete the found lists. Return the final combined list.
'''
for updated_lst in lst:
n_lst, used_lst = overlap_inlist(updated_lst,lst)
lst[:] = [x for i,x in enumerate(lst) if i not in used_lst]
lst.insert(0,n_lst)
return lst
comb_lst = comb_list(lst)
print comb_lst
The out put from this script is:
[[88, 7, 8, 9, 10], [1, 2, 3, 4, 50, 5], [21, 22, 20]]
I want it so the key are in the original order like:
[[88, 7, 8, 9, 10], [1, 2, 3, 4, 5, 50,], [21, 22, 20]]
The 5 and 50 are switched in new lst[2]
I am somewhat new to python. I would appreciate any solutions to the problem or comments on my current code. I am not a computer scientists, I imagine there may be some kind of algorithm that does this quickly( maybe from set theory? ). If there is such an algorithm please point me to the right direction.
I may be making this way more complicated then it is...
Thank you!!
See Question&Answers more detail:
os