Consider the problem of extracting the alphabetic characters from a huge string.
One way to do it is:
''.join([c for c in hugestring if c.isalpha()])
The mechanism is clear: the list comprehension generates a list of characters, and the join method knows how many characters it needs to join by accessing the length of the list.
Another way to do it is:
''.join(c for c in hugestring if c.isalpha())
Here the generator expression produces a generator. The join method does not know how many characters it is going to join, because the generator has no __len__ method. So this way of joining should be slower than the list comprehension method.
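The length difference can be checked directly (a minimal sketch; the short sample string is just for illustration):

```python
# A generator has no __len__, so len() cannot be called on it,
# while the equivalent list knows its own length.
gen = (c for c in "abc123" if c.isalpha())
print(hasattr(gen, '__len__'))   # → False

lst = [c for c in "abc123" if c.isalpha()]
print(len(lst))                  # → 3
```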
But testing in Python shows that it is not slower. Why is this so?
Can anyone explain how join works on a generator?
To be clear:
sum(j for j in range(100))
doesn't need any knowledge of 100, because it can keep track of a cumulative sum. It can fetch the next element by calling next() on the generator and add it to the cumulative sum.
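That cumulative process can be re-implemented by hand (a sketch of the idea, not how the built-in sum is actually written):

```python
def running_sum(gen):
    """Consume a generator one element at a time, keeping only a running total."""
    total = 0
    while True:
        try:
            total += next(gen)  # pull the next value; no length information needed
        except StopIteration:   # generator exhausted
            return total

print(running_sum(j for j in range(100)))  # → 4950, same as sum(j for j in range(100))
```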
However, since strings are immutable, joining strings cumulatively would create a new string on each iteration, so it would take a lot of time.
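What that naive cumulative joining would look like, as a sketch (each += on an immutable string may copy everything built so far, which is why one would expect it to be slow on a huge input):

```python
def join_cumulatively(gen):
    """Naive join: build the result by repeated string concatenation."""
    result = ''
    for piece in gen:
        result += piece  # strings are immutable, so this may allocate a new string
    return result

print(join_cumulatively(c for c in "a1b2c3" if c.isalpha()))  # → abc
```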