For later versions of iTextSharp:
With iTextSharp you can use the iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList() method to create a PDF from HTML. ParseToList() takes a TextReader (an abstract class) as its HTML source, which means you can pass it a StringReader or a StreamReader (both derive from TextReader).
I used a StringReader and was able to generate PDFs from simple markup. When I tried the HTML returned from real web pages, I got errors on all but the simplest pages. Even the simplest page I retrieved (http://black.ea.com/) had the content of its 'head' tag rendered into the PDF, so I think HTMLWorker.ParseToList() is picky about the formatting of the HTML it parses.
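One possible workaround for the stray 'head' content (untested against HTMLWorker itself) is to strip the head element from the markup before wrapping it in a reader. The following is only a minimal sketch, assuming a regular expression is good enough for the pages involved; HtmlCleanup, StripHead and ToReader are hypothetical names, not part of iTextSharp:

```csharp
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of iTextSharp.
public static class HtmlCleanup
{
    // Matches a whole <head>...</head> element, ignoring case and spanning line breaks.
    private static readonly Regex HeadElement = new Regex(
        @"<head\b[^>]*>.*?</head\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the markup with any <head> element removed.
    public static string StripHead(string html)
    {
        return HeadElement.Replace(html, string.Empty);
    }

    // Wraps the cleaned markup in a TextReader, the type ParseToList() accepts.
    public static TextReader ToReader(string html)
    {
        return new StringReader(StripHead(html));
    }
}
```

The reader handed to ParseToList() could then come from HtmlCleanup.ToReader(download) instead of a StringReader over the raw download. A regular expression is not a real HTML parser, so treat this as a stopgap for simple pages only.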
Anyway, if you want to try it, here's the test code I used:
    using System;
    using System.Collections;
    using System.IO;
    using System.Net;
    using iTextSharp.text;
    using iTextSharp.text.html.simpleparser;
    using iTextSharp.text.pdf;

    // Download content from a very, very simple "Hello World" web page.
    string download = new WebClient().DownloadString("http://black.ea.com/");

    Document document = new Document(PageSize.A4, 80, 50, 30, 65);
    try {
        using (FileStream fs = new FileStream("TestOutput.pdf", FileMode.Create)) {
            PdfWriter.GetInstance(document, fs);
            using (StringReader stringReader = new StringReader(download)) {
                // Parse the HTML into a list of iText elements.
                // (Some later builds may return List<IElement> rather than ArrayList.)
                ArrayList parsedList = HTMLWorker.ParseToList(stringReader, null);
                document.Open();
                foreach (object item in parsedList) {
                    document.Add((IElement)item);
                }
                document.Close();
            }
        }
    } catch (Exception exc) {
        Console.Error.WriteLine(exc.Message);
    }
I couldn't find any documentation on which HTML constructs HTMLWorker.ParseToList() supports; if you do, please post it here. I'm sure a lot of people would be interested.
For older versions of iTextSharp:
You can use the iTextSharp.text.html.HtmlParser.Parse method to create a PDF based on HTML. Here's a snippet demonstrating this:
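    using System;
    using System.IO;
    using iTextSharp.text;
    using iTextSharp.text.html;
    using iTextSharp.text.pdf;

    Document document = new Document(PageSize.A4, 80, 50, 30, 65);
    try {
        using (FileStream fs = new FileStream("TestOutput.pdf", FileMode.Create)) {
            PdfWriter.GetInstance(document, fs);
            // Parse the XHTML file straight into the document.
            HtmlParser.Parse(document, "YourHtmlDocument.html");
        }
    } catch (Exception exc) {
        Console.Error.WriteLine(exc.Message);
    }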
The one problem (a major one for me) is that the HTML must be strictly XHTML-compliant.
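Since the parser wants XHTML, it may help to check that the markup is at least well-formed XML before handing it over. The sketch below uses only System.Xml, not iTextSharp; XhtmlCheck and IsWellFormedXml are hypothetical names, and well-formedness is a weaker condition than full XHTML validity, so passing this check does not guarantee the parser will accept the file:

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper for illustration; not part of iTextSharp.
public static class XhtmlCheck
{
    // Returns true if the markup parses as well-formed XML.
    // On failure, 'error' holds the parser's message (including line and position).
    public static bool IsWellFormedXml(string markup, out string error)
    {
        // Skip the DOCTYPE rather than rejecting or downloading it.
        // Side effect: named entities such as &nbsp; count as undeclared and fail the check.
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };
        try
        {
            using (var reader = XmlReader.Create(new StringReader(markup), settings))
            {
                // Reading to the end forces the whole document to be parsed.
                while (reader.Read()) { }
            }
            error = null;
            return true;
        }
        catch (XmlException ex)
        {
            error = ex.Message;
            return false;
        }
    }
}
```

Calling XhtmlCheck.IsWellFormedXml(File.ReadAllText("YourHtmlDocument.html"), out error) first would at least point at the first unclosed tag or unquoted attribute.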
Good luck!