The Goal
I have a directory of 65 .txt files, which I parse one by one, saving the output to 65 corresponding .txt files. I then plan to concatenate them, though I'm not sure whether jumping straight to that step would help find a solution here.
The Problem
I am receiving:
TypeError: 'NoneType' object has no attribute '__getitem__'
and have seen two similar threads:
TypeError: 'NoneType' object has no attribute '__getitem__'
Python: TypeError: 'NoneType' object has no attribute '__getitem__'
My problem seems somewhat strange, however: the script goes through the input files, parsing them and writing the output files, about ten times before I get the error. The files are all similar, just HTML source code from a website (i.e. the same website, just different pages of it, and so the same basic HTML structure).
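Here is the function where the error occurs, in the last line of this snippet:
def parse(elTree):
    # Select the elements of interest (the real XPath is abbreviated here).
    desired_value = elTree.xpath('my_very_long_xpath')
    # The TypeError is raised on this line.
    desired_value = [x.get('title')[8:] for x in desired_value]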
I do have a few more variants of these; I am actually parsing for about 5 to 6 different desired_values. All of this simply runs inside a larger loop where the files are read in to the parse function and then the output is written to a new file.
What I have tried
I have removed the file where I initially got the error, but the same error occurred at the next file. I did the same again, removing two files, but still getting that error.
I introduced a time.sleep(3) between each file, just to allow things to maybe run more smoothly. I thought there might be a buffer for the whole process that is being wiped while it is read, so that there is no file there... Here is a similar occurrence within a loop in C. Unfortunately the sleep for 3 seconds (plus more sleeps scattered around at various other points) didn't help me. The code fails at exactly the same point.
According to the documentation, a TypeError arises when a function is applied to an object of inappropriate type, so how can it be that it is occurring after functioning correctly 10 or 11 times?
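To illustrate how the same line could work for some inputs and fail for others, here is a minimal sketch. It rests on an assumption, not a confirmed diagnosis: that one matched element has no title attribute, so get('title') returns None and the slice fails. The markup is made up, and the standard library's xml.etree.ElementTree stands in for my actual parser. Under Python 3 the same failure is worded "'NoneType' object is not subscriptable".

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the second <a> has no title attribute.
page = ET.fromstring(
    "<div>"
    "<a title='Profile John'>John</a>"
    "<a>Jane</a>"
    "</div>"
)

links = page.findall(".//a")

# get() returns None when the attribute is absent; it does not raise.
titles = [link.get("title") for link in links]


def slice_titles(elements):
    # Same pattern as in parse(): slice the title of every matched element.
    return [el.get("title")[8:] for el in elements]


try:
    slice_titles(links)
    error = None
except TypeError as exc:
    # Slicing None is what raises the TypeError.
    error = exc
```

If this is what happens, the error says nothing about the file as a whole, only about one element in it whose attribute is missing.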
Here is more official information regarding the __getitem__ method.
As the code does work well otherwise, I haven't included the rest, but if someone suspects it may originate from somewhere else, with good reason, then I will add more of the code.
I have inspected the contents of the .txt files for those that worked and those where it failed and the xpaths work in both, the contents are there to be found and parsed.
I used the code on 30 copies of the same file, which did execute successfully, so there must be subtle differences in the HTML code, which my parser is not recognizing.
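One way to look for such subtle differences would be to list the matched elements that lack the attribute before slicing. The sketch below is only an illustration: the helper names missing_attribute and safe_titles, the sample markup and the path are all hypothetical, and xml.etree.ElementTree again stands in for the real parser and XPath.

```python
import xml.etree.ElementTree as ET


def missing_attribute(root, path, attribute="title"):
    """Return the matched elements that lack the given attribute."""
    return [el for el in root.findall(path) if el.get(attribute) is None]


def safe_titles(root, path, prefix_length=8):
    """Slice titles as parse() does, skipping elements without a title."""
    titles = (el.get("title") for el in root.findall(path))
    return [t[prefix_length:] for t in titles if t is not None]


# Hypothetical page in which one link has no title attribute.
sample = ET.fromstring(
    "<div>"
    "<a title='Profile John'>John</a>"
    "<a href='/jane'>Jane</a>"
    "</div>"
)

offenders = missing_attribute(sample, ".//a")
cleaned = safe_titles(sample, ".//a")
```

Printing the offending elements for the first failing file would show whether the XPath matches something unexpected on those pages.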