When you read a file with readlines()
, the resulting list elements do have a trailing newline characters. Likely, these are the reason why you have less matches than you expected.
Instead of writing
for x in list:
write
for x in (s.strip() for s in list):
This removes leading and trailing whitespace from the strings in list
. Hence, it removes trailing newline characters from the strings.
In order to consolidate your program, you could do something like this:
with open('c:/tmp/textfile.TXT') as f:
haystack = f.read()
if not haystack:
sys.exit("Could not read haystack data :-(")
with open('c:/tmp/list.txt') as f:
for needle in (line.strip() for line in f):
if needle in haystack:
print(needle, ',one_sentence')
else:
print(needle, ',another_sentence')
I did not want to make too drastic changes. The most important difference is that I am using the context manager here via the with
statement. It ensures proper file handling (mainly closing) for you. Also, the 'needle' lines are stripped on the fly using a generator expression. The above approach reads and processes the needle file line by line instead of loading the whole file into memory at once. Of course, this only makes a difference for large files.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…