I have a very long string. I want to find all the unique substrings of this string. I tried to write the code where I used a set(python) to store all the substrings to ensure uniqueness. I am getting correct result for many medium and large strings however in case of very large strings, I am getting a MemoryError. I googled a bit and found out that the set data structure in python has a large RAM footprint and maybe thats why I am getting a MemoryError.
Here is my code :
a = set()
for i in range(n):
string = raw_input()
j = 1
while True:
for i in xrange(len(string)-j+1):
a.add(string[i:i+j])
if j==len(string): break
j+=1
print sorted(list(a))
Is there a way to avoid this error for large strings? Or can anybody suggest a better modification in my code to handle this issue?
P.S: I donot have an option of shifting between 32 bit and 64 bit versions.
See Question&Answers more detail:
os 与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…