Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
219 views
in Technique[技术] by (71.8m points)

c# - Extract Data from .PDF files

I need to extract data from .PDF files and load it in to SQL 2008. Can any one tell me how to proceed??

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Here is an example of how to use iTextSharp to extract text data from a PDF. You'll have to fiddle with it some to make it do exactly what you want, I think it's a good outline. You can see how the StringBuilder is being used to store the text, but you could easily change that to use SQL.

    static void Main(string[] args)
    {
        PdfReader reader = new PdfReader(@"c:est.pdf");

        StringBuilder builder = new StringBuilder();

        for (int x = 1; x <= reader.NumberOfPages; x++)
        {
            PdfDictionary page = reader.GetPageN(x);
            IRenderListener listener = new SBTextRenderer(builder);
            PdfContentStreamProcessor processor = new PdfContentStreamProcessor(listener);
            PdfDictionary pageDic = reader.GetPageN(x);
            PdfDictionary resourcesDic = pageDic.GetAsDict(PdfName.RESOURCES);
            processor.ProcessContent(ContentByteUtils.GetContentBytesForPage(reader, x), resourcesDic);
        }
    }

public class SBTextRenderer : IRenderListener
{

    private StringBuilder _builder;
    public SBTextRenderer(StringBuilder builder)
    {
        _builder = builder;
    }
    #region IRenderListener Members

    public void BeginTextBlock()
    {
    }

    public void EndTextBlock()
    {
    }

    public void RenderImage(ImageRenderInfo renderInfo)
    {
    }

    public void RenderText(TextRenderInfo renderInfo)
    {
        _builder.Append(renderInfo.GetText());
    }

    #endregion
}

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...