Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

unit conversion - PDFBox converting inches or centimeters into the coordinate system

I am new to PDFBox (and PDF generation) and I am having difficulty generating my own PDF.

I have text with certain coordinates in inches/centimeters and I need to convert them to the units PDFBox uses. Any suggestions/utilities that can do this automatically?

PDPageContentStream.moveTextPositionByAmount(x,y) is making no sense to me.


1 Reply

In general PDFBox uses the PDF user space coordinates when creating a PDF. This means:

  1. The coordinates of a page are delimited by its CropBox defaulting to its MediaBox, the values increasing left to right and bottom to top. Thus, if you create a page using new PDPage() or new PDPage(PDPage.PAGE_SIZE_*) the origin of the coordinate system starts in the lower left corner of the page.

  2. The unit in user space starts as the default user space unit, which is defined by the UserUnit of the page. Most often (e.g. if you create a page using any of the PDPage constructors and don't explicitly change that value) it is not explicitly set, so its default applies, which is 1/72 inch.

  3. The user space coordinate system can be changed pretty arbitrarily by concatenating some matrix to the current transformation matrix. The current transformation matrix starts as the identity matrix.

    In PDFBox you do this using one of the PDPageContentStream.concatenate2CTM() overloads.

  4. As soon as you switch to text mode using PDPageContentStream.beginText(), the coordinate system used is furthermore influenced by the transformation introduced by the text matrix.

    In PDFBox you set the text matrix using one of the PDPageContentStream.setTextMatrix() overloads.
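To make points 1 to 3 concrete, here is a minimal sketch in plain Java (no PDFBox required) of what concatenating a matrix to the current transformation matrix does to coordinates. The class CtmSketch and its methods are hypothetical names invented for this illustration, and the example assumes a page that is 792 units (11 inches) high with the default UserUnit.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Hypothetical illustration class; not part of PDFBox.
final class CtmSketch {
    private CtmSketch() {}

    // Builds a matrix that lets you work in inches measured from the
    // top left corner, assuming the default unit of 1/72 inch.
    static AffineTransform inchesFromTopLeft(double pageHeightPoints) {
        AffineTransform ctm = new AffineTransform(); // starts as identity
        ctm.translate(0, pageHeightPoints);          // move origin to the top edge
        ctm.scale(72, -72);                          // 1 unit = 1 inch, y grows downwards
        return ctm;
    }

    // Maps a point given in the transformed system to default user space.
    static double[] map(AffineTransform ctm, double x, double y) {
        Point2D p = ctm.transform(new Point2D.Double(x, y), null);
        return new double[] { p.getX(), p.getY() };
    }

    public static void main(String[] args) {
        // 1 inch from the left, 1 inch from the top of an 11 inch high page
        double[] p = map(inchesFromTopLeft(792), 1, 1);
        System.out.println(p[0] + ", " + p[1]); // maps to (72, 720)
    }
}
```

In PDFBox 1.8.x a java.awt.geom.AffineTransform like this one could be passed to the concatenate2CTM() overload that accepts it. Be aware that a matrix which flips the y axis also mirrors any text drawn afterwards, which is one reason to avoid transformations at first.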

As you are new to PDFBox (as you say) and new to PDF in general (as I presume, because otherwise you would likely have recognized the coordinates), I would advise you to initially refrain from using transformations wherever possible and, therefore, remain in a state where the coordinate system starts in the lower left, is neither rotated nor skewed, and has a unit length of 1/72 inch.

For this context you actually can use constants provided by PDFBox for conversion:

  • Multiply coordinates in inches by PDPage.DEFAULT_USER_SPACE_UNIT_DPI to get default user space coordinates.
  • Multiply coordinates in mm by PDPage.MM_TO_UNITS to get default user space coordinates.
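If those constants are not accessible in your PDFBox version, the same arithmetic is easy to write yourself. The following is a minimal sketch, assuming the default user space unit of 1/72 inch and no transformations; PdfUnits and its method names are hypothetical and not part of PDFBox.

```java
// Hypothetical helper class; not part of PDFBox.
// Assumes the default user space unit of 1/72 inch.
final class PdfUnits {
    static final float POINTS_PER_INCH = 72f;
    static final float POINTS_PER_MM = POINTS_PER_INCH / 25.4f; // 1 inch = 25.4 mm

    private PdfUnits() {}

    static float inchesToPoints(float inches) {
        return inches * POINTS_PER_INCH;
    }

    static float mmToPoints(float mm) {
        return mm * POINTS_PER_MM;
    }

    static float cmToPoints(float cm) {
        return mmToPoints(cm * 10f);
    }

    // PDF y values grow from the bottom; this converts a distance
    // measured down from the top edge into a bottom-up y coordinate.
    static float yFromTop(float pageHeightPoints, float distanceFromTopPoints) {
        return pageHeightPoints - distanceFromTopPoints;
    }

    public static void main(String[] args) {
        // Text placed 2.5 cm from the left and 1 inch from the top
        // of a page that is 11 inches high.
        float x = cmToPoints(2.5f);
        float y = yFromTop(inchesToPoints(11f), inchesToPoints(1f));
        System.out.println("x=" + x + " y=" + y);
    }
}
```

The resulting x and y are what you would pass to PDPageContentStream.moveTextPositionByAmount(x, y) in 1.8.x (newLineAtOffset(x, y) in 2.x) directly after beginText().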

If you want to have fun with coordinates, though, look at the PDF specification ISO 32000-1 and study sections 8.3 Coordinate Systems and 9.4.4 Text Space Details.


The PDPage constants pointed to above used to be accessible in early PDFBox 1.8.x versions but then got hidden (private), and eventually were removed in the transition to PDFBox 2.x.

For reference, the constants were defined as

private static final int DEFAULT_USER_SPACE_UNIT_DPI = 72;

private static final float MM_TO_UNITS = 1/(10*2.54f)*DEFAULT_USER_SPACE_UNIT_DPI;
