Edit: This question has been superseded by a new one, as the problem needs to be presented slightly differently. It is here: How can I efficiently run XSLT transformations for a large number of files in parallel?
I'm stuck in my attempts to parallelize a process, and after spending a fair amount of time on it I'd like to ask for help...
Basically, I have a lot of XML files to transform with a specific XSLT sheet. But the sheet calls a (very slow) API to fetch additional data, so processing the whole batch of XMLs in one go would take (very) long.
Therefore I split all the files from the original "input" folder into subfolders, each containing around 5000 XML files, and I copied the following Bash script into each subfolder too:
for f in *.xml
do
    java -jar ../../saxon9he.jar -xsl:../../some-xslt-sheet.xsl -s:"$f"
done
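As an aside, a more defensive variant of this loop would resolve the shared assets relative to the script's own location instead of the caller's working directory. A sketch only; the `../..` layout and file names are assumed from the folder structure described above:

```shell
#!/bin/sh
# Change into the folder that holds this script, so the loop works
# no matter where the script is invoked from.
cd "$(dirname "$0")" || exit 1

# The Saxon jar and the XSLT sheet live two levels up (assumed layout).
ROOT=$(cd ../.. && pwd)

for f in *.xml
do
    # Quote "$f" so file names containing spaces are passed intact.
    java -jar "$ROOT/saxon9he.jar" -xsl:"$ROOT/some-xslt-sheet.xsl" -s:"$f"
done
```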
And I call each script, one per folder, from the "root" folder that contains the "input" folder, the Saxon library, and the XSLT sheet:
find input -type d -exec sh {}/script.sh \;
But I get this error:
Unable to access jarfile ../../saxon9he.jar
I suppose it comes from the fact that I'm operating from the "root" folder, while the scripts being called are lower in the directory tree. I could solve the problem (if I'm right) by copying all the assets into each subfolder, but that would make my current approach even clumsier.
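One way around it, without duplicating any assets, would be to change into each subfolder before running its script, so the relative `../..` paths keep resolving. A minimal sketch, assuming each subfolder contains a script.sh as shown earlier:

```shell
# Change into each subfolder before running its script, so the relative
# ../.. paths inside the script resolve correctly.
# -mindepth 1 skips the "input" folder itself.
find input -mindepth 1 -type d -exec sh -c 'cd "$1" && sh script.sh' _ {} \;
```

To actually run the folders in parallel rather than one after another, the same per-folder command could be fed to `xargs -P` or launched in the background with `&` and `wait`.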
Thanks to anyone who has an idea and can help me understand this!