For the last few months I have been wrestling with the SAXParser, trying to figure out where a strange bug originates.
Where I work, I receive a lot of fairly large XML files to load into our database. We have some code which has been doing this using the SAXParser and our own DefaultHandler implementation and this works fine.
However, I now have a BlockingQueue which runs multiple jobs, each of which includes the SAXParser code I described above. When I do this, XML from one file starts popping up in another(!?), which causes these loads to fail.
All parsers, their readers and writers are on different threads and as far as I'm aware they don't share any variables, which has left me scratching my head for a few weeks now!
Is there something I'm missing? Has anyone experienced this before, and if so, how did you resolve it?
Edit:
Okay. So after going through every single detail of my implementation and doing a lot of research, I found out what was causing my threading errors. It was not SAXParser but my DefaultHandler code!
In the DefaultHandler I have an Enum to let me know which element I am inside. The Enum has Boolean variables to help with program flow. My problem was that each constant of a Java Enum is a singleton, so the same instance is shared by every thread. Enums are usually thread-safe because their constants are immutable, but mine carried mutable Booleans, so handlers running on different threads were overwriting each other's flags.
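To make the problem concrete, here is a minimal sketch of how a mutable field on an enum constant ends up shared between threads. The `Element` enum and its `inside` flag are hypothetical stand-ins for my real code, not a copy of it. The demo is deliberately deterministic (one thread writes, the other reads after `join()`); in the real loader the writes interleave, which is why the symptoms looked random.

```java
public class EnumSharedStateDemo {
    // Hypothetical enum modelled on the description above:
    // each constant carries a mutable flag.
    enum Element {
        RECORD, FIELD;

        // There is exactly one RECORD instance per JVM class loader,
        // so this field is effectively global state.
        boolean inside;
    }

    // Sets the flag from a worker thread, then reads it from the calling thread.
    static boolean flagLeaksAcrossThreads() {
        Element.RECORD.inside = false;
        Thread worker = new Thread(() -> Element.RECORD.inside = true);
        worker.start();
        try {
            worker.join(); // join() makes the worker's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return Element.RECORD.inside;
    }

    public static void main(String[] args) {
        // The calling thread never set the flag to true, yet it sees true.
        System.out.println(flagLeaksAcrossThreads()); // prints "true"
    }
}
```

With two handlers on two threads both toggling `Element.RECORD.inside`, each one can observe the other's element state, which matches the "XML from one file in another" symptom.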
To fix my code I just took the Booleans out of the Enum, since they don't even need to be there, and placed them directly inside my DefaultHandler as instance fields. Each handler now has its own copy, so they should be thread-safe. Phew!
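For anyone hitting the same thing, this is roughly what the fixed shape looks like. It is a sketch under assumptions, not my production code: `NameHandler`, the `name` element, and `parseAll` are made up for illustration, and it assumes every parse creates its own handler and its own parser (a `SAXParser` instance should not be shared between threads either).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class PerParseHandlerDemo {
    // Hypothetical handler: the "which element am I in" flag is an
    // instance field, so each handler object has its own copy.
    static class NameHandler extends DefaultHandler {
        private boolean insideName;
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("name".equals(qName)) insideName = true;
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("name".equals(qName)) insideName = false;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (insideName) text.append(ch, start, length);
        }

        String result() {
            return text.toString();
        }
    }

    // One new parser and one new handler per document; nothing is shared.
    static String parse(String xml) throws Exception {
        NameHandler handler = new NameHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.result();
    }

    // Parses all documents on a small thread pool and returns results in input order.
    static List<String> parseAll(List<String> docs) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String doc : docs) {
                Callable<String> task = () -> parse(doc);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "<a><name>one</name></a>",
                "<b><name>two</name></b>");
        System.out.println(parseAll(docs)); // prints "[one, two]"
    }
}
```

The Enum itself can stay, as long as it only identifies the element and holds no mutable state.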