Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
416 views
in Technique[技术] by (71.8m points)

pdf - HTML2PDF in PHP - convert utilities & scripts - examples & demos

I have a quite complicated HTML/CSS layout which I would like to convert to PDF on my server. I already have tryed DOMPDF, unfortunately it did not convert the HTML with correct layout. I have considered HTMLDOC but I have heard that it ignores CSS to a large extent, so I suppose the layout would break apart with that tool too.

My question therefor is - are there any online demos for other tools (like wkhtmltopdf i.e.) that I could use to verify how my HTML is converted? Before spending the rest of my life installing & testing one by one?

Unfortunately, I can't change the HTML layout to fit those tools. Or better said - I could, if any of them would get close to an acceptable result...

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Not really an answer but for the question above, but I'll try to provide some of my experience, maybe it will help someone somwhere in the future.

  1. wkthmltopdf is really THE ONLY solution that worked for me that could produce what I call acceptable results. Still, some minor modifications to the CSS had to be made, however, it worked really well when it comes to rendering the content. All the other packages are really only suitable if you have a rather simply document with one basic table etc. No chance to get them to produce fair results on complex docs with design elements, css, multiple overlapping images etc. If complex documents are in game - do not spend the time (like I did) - go straight to wkhtmltopdf.

  2. Beware - the wkhtmltopdf installation is tricky. It was not so easy for me as the guys said in their comments (one of the reasons might be that I am not too familiar with Linux). The static binary did not work for me for some reason I can't explain. I suspect that there were problems with the version - apparently there is a difference between versions for different OS and processors, maybe I have the vrong version. For installing the non-static version first of all you have to have root access to the server, that's obvious. I installed it with apt-get using PuTTy, went quite well. I was lucky that my server already had all the predispositions to install wkhtmltopdf. So this was the easy part for me :) (btw, you don't have to care for symbolic links or wrappers as many tutorials tell you - I spent hours trying to figure out how to do that part, in the end I gave it up and everything works well though)

  3. After the install I got the quite famous Cannot connect to X server error. This is due to the fact that we need to run wkhtmltopdf headless on a 'virtual' x server. Getting around this was also quite simple (if one does not care for the symbolic links). I installed it with apt-get install xvfb. This also went quite well for me, no problems.

  4. After completing this I was able to run wkhtmltopdf. Beware - it took me some time to figure out that trying to run xvfb was the wrong way - instead you have to run xvfb-run. My PHP code now looks like this exec("xvfb-run wkhtmltopdf --margin-left 16 /data/web/domain.com/source.html /data/web/domain.com/target.pdf"); (notice the --margin-left 16 command line option for wkhtmltopdf - it makes my content more centered; I left it in place to demonstrate how you can use command line options).

  5. I also wanted to protect the generated PDF files from editing (in my case, print protect is also possible). After doing some research I found this class from ID Security Suite. First of all I have to say - IT'S OLD (I am running PHP 5+). However, I made some improvements to it. First of all - it's a wrapper around the FPDF library, so there is a file called fpdf.php in the package. I replaced this file from the latest FPDF version I got from here. It made my PHP warnings look more sustainable. I also changed the $pdf =& new FPDI_Protection(); and removed the & sign as I was getting an deprecated warning for it. However, there are more of those to come. Instead of searching and modifying the code I just turned the error reporting lvl to 0 with error_reporting(0); (although turning off the warnings only should be sufficient). Now someone will say that this is not "good practice". I am using this whole stuff on an internal system, so I do not really have to care. For sure the scripts could be modifiyed to match latest requirements. For me I didn't want to spend another hours working on it. Be careful where the script says $pdf->SetProtection(array('print'), '', $password); (I allowed printing my documents as you can see). It took me a while to figure out that the first argument is the permissions. The second is the USER PASSWORD - if you provide this then the docs will require a password to open (I left this blank). The third is the OWNER PASSWORD - this is what you need to make the docs "secured" against editing, copying etc.

My whole code now looks like:

// get the HTML content of the file we want to convert
$invoice = file_get_contents("http://www.domain.com/index.php?s=invoices-print&invoice_no=".$_GET['invoice_no'];
// replace the CSS style from a print version to a specially modified PDF version
$invoice = str_replace('href="design/css/base.print.css"','href="design/css/base.pdf.css"',$invoice);

// write the modified file to disk
file_put_contents("docs/invoices/tmp/".$_GET['invoice_no'].".html", $invoice);

// do the PDF magic
exec("xvfb-run wkhtmltopdf --margin-left 16 /data/web/domain.com/web/docs/invoices/tmp/".$_GET['invoice_no'].".html /data/web/domain.com/web/docs/invoices/".$_GET['invoice_no'].".pdf");

// delete the temporary HTML data - we don't need that anymore since our PDF is created
unlink("docs/invoices/tmp/".$_GET['invoice_no'].".html");

// workaround the warnings
error_reporting(0); 

// script from ID Security Suite
function pdfEncrypt ($origFile, $password, $destFile){
    require_once('libraries/fpdf/FPDI_Protection.php');
    $pdf = new FPDI_Protection();
    $pdf->FPDF('P', 'in');
    //Calculate the number of pages from the original document.
    $pagecount = $pdf->setSourceFile($origFile);
    //Copy all pages from the old unprotected pdf in the new one.
    for ($loop = 1; $loop <= $pagecount; $loop++) {
        $tplidx = $pdf->importPage($loop);
        $pdf->addPage();
        $pdf->useTemplate($tplidx);
    }

    //Protect the new pdf file, and allow no printing, copy, etc. and
    //leave only reading allowed.
    $pdf->SetProtection(array('print'), '', $password);
    $pdf->Output($destFile, 'F');
    return $destFile;
}

//Password for the PDF file (I suggest using the email adress of the purchaser).
$password = md5(date("Ymd")).md5(date("Ymd"));
//Name of the original file (unprotected).
$origFile = "docs/invoices/".$_GET['invoice_no'].".pdf";
//Name of the destination file (password protected and printing rights removed).
$destFile = "docs/invoices/".$_GET['invoice_no'].".pdf";
//Encrypt the book and create the protected file.
pdfEncrypt($origFile, $password, $destFile );

Hope this helps someone to save some time in the future. This whole solution took me like 12 hours to implement into our invoicing system. If there was better info on wkhtmltopdf for users like me, who are not that familiar with Linux/UNIX, I could have saved some of the hours spent on this.

However - what doesn't kill you makes you stronger :) So I am a bit more perfect now that I made this run :)


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...