Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
204 views
in Technique[技术] by (71.8m points)

python - Delete substrings from a list of strings

I have a list

l = ['abc', 'abcdef', 'def', 'defdef', 'polopolo']

I'm trying to delete strings whose superstring is already in the list. In this case, the result should be:

['abcdef', 'defdef', 'polopolo']

I have written the code:

l=['abc','abcdef','def','defdef','polopolo']
res=['abc','abcdef','def','defdef','polopolo']
for each in l:
    l1=[x for x in l if x!=each]
    for other in l1:
        if each in other:
            res.remove(each)

but it doesnt seem to work. I have read that we cannot remove from the list while iterating over it. Hence the copy res, while l is my original list.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)
l=['abc','abcdef','def','defdef','polopolo']
print [j for i, j in enumerate(l) if all(j not in k for k in l[i + 1:])]
# ['abcdef', 'defdef', 'polopolo']

We can speed it up a very little, by sorting the list before

l = sorted(l, key = len)
print [j for i, j in enumerate(l) if all(j not in k for k in l[i + 1:])]

As @Ashwini Chaudhary mentions in the comments, if you want to retain the duplicate strings, then you can do this

l = ['abc','defghi' 'abcdef','def','defdef','defdef', 'polopolo']
l = sorted(l, key = len)
print [j for i,j in enumerate(l) if all(j == k or (j not in k) for k in l[i+1:])]
# ['defdef', 'defdef', 'polopolo', 'defghiabcdef']

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...