I'm currently writing a little tool (Python + pyPdf) to test PDFs for printer conformity.
Alas, I already got confused by the first task: detecting whether the PDF has at least 3 mm of 'bleed' (a border around the pages where nothing is printed). I have already figured out that I can't detect the bleed for the complete document, since there doesn't seem to be a global one. On the pages, however, I can detect a total of five different boxes:
mediaBox
bleedBox
trimBox
cropBox
artBox
I read the pyPdf documentation concerning those boxes, but the only one I understood is the mediaBox, which seems to represent the overall page size (i.e. the paper).
The bleedBox pretty obviously ought to define the bleed, but that doesn't always seem to be the case.
Another thing I noticed was that, for instance, with the PDF, all those boxes have the exact same size (implying no bleed at all) on each page, but when I open it there's a huge amount of bleed. This leads me to think that the individual text elements have their own offset.
So, obviously, just calculating the bleed from mediaBox and bleedBox is not a viable option.
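For reference, this is roughly the naive calculation I mean, as a minimal sketch. It assumes each box is a sequence of four numbers (lower-left x, lower-left y, upper-right x, upper-right y) and that the coordinates are in PDF points (1/72 inch) rather than mm. The names box_margins_mm and has_min_margin are my own helpers, not part of pyPdf.

```python
# Assumption: box coordinates are in PDF points (1/72 inch), not mm.
MM_PER_POINT = 25.4 / 72.0


def box_margins_mm(outer, inner):
    """Return (left, bottom, right, top) distances in mm between two boxes.

    Each box is a sequence [llx, lly, urx, ury]: the lower-left and
    upper-right corner coordinates. A positive value means the inner box
    lies inside the outer box on that side.
    """
    o_llx, o_lly, o_urx, o_ury = [float(v) for v in outer]
    i_llx, i_lly, i_urx, i_ury = [float(v) for v in inner]
    return (
        (i_llx - o_llx) * MM_PER_POINT,  # left
        (i_lly - o_lly) * MM_PER_POINT,  # bottom
        (o_urx - i_urx) * MM_PER_POINT,  # right
        (o_ury - i_ury) * MM_PER_POINT,  # top
    )


def has_min_margin(outer, inner, minimum_mm=3.0):
    """True if inner sits at least minimum_mm inside outer on every side."""
    # Small tolerance so that rounding in the box values does not fail
    # a page that is exactly at the limit.
    return all(m >= minimum_mm - 1e-6 for m in box_margins_mm(outer, inner))
```

With pyPdf I would pass two of the page's boxes, e.g. has_min_margin(page.mediaBox, page.bleedBox). For the PDF mentioned above, where all boxes are identical, every margin comes out as 0, which is exactly the problem.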
I would be more than delighted if anyone could shed some light on what those boxes actually are and what I can conclude from them (e.g. is one box always smaller than another?).
Bonus question: Can someone tell me what exactly the "default user space unit" mentioned in the documentation is? I'm pretty sure this refers to mm on my machine, but I'd like to enforce mm everywhere.
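If the unit turns out to be the PDF point (1/72 inch, which is what the PDF specification defines as the default user space unit) rather than mm, the conversion I would need is sketched below. The user_unit parameter is an assumption that models the optional per-page /UserUnit multiplier from PDF 1.6 and later; both function names are hypothetical helpers of mine, not pyPdf calls.

```python
MM_PER_INCH = 25.4
DEFAULT_UNITS_PER_INCH = 72.0  # PDF default: one user space unit = 1/72 inch


def user_units_to_mm(value, user_unit=1.0):
    """Convert a length in user space units to millimetres.

    user_unit is the page's /UserUnit multiplier (1.0 when absent).
    """
    return float(value) * float(user_unit) * MM_PER_INCH / DEFAULT_UNITS_PER_INCH


def mm_to_user_units(mm, user_unit=1.0):
    """Convert millimetres to user space units (inverse of the above)."""
    return float(mm) * DEFAULT_UNITS_PER_INCH / (MM_PER_INCH * float(user_unit))
```

Under that assumption, a 3 mm bleed corresponds to mm_to_user_units(3.0), which is about 8.5 units, so that is the difference I would expect to see between the box coordinates.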