Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
438 views
in Technique[技术] by (71.8m points)

firefox - Can Selenium verify text inside a PDF loaded by the browser?

My web application loads a pdf in the browser. I have figured out how to check that the pdf has loaded correctly using:

verifyAttribute xpath=//embed/@src {URL of PDF goes here}

It would be really nice to be able to check the contents of the pdf with Selenium - for example verify that some text is present. Is there any way to do this?

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

While not natively supported, I have found a couple ways using the java driver. One way is to have the pdf open in your browser (having adobe acrobat installed) and then use keyboard shortcut keys to select all text (CTRL+A), then copy it to the clipboard (CTRL+C) and then you can verify the text in the clipboard. eg:

protected String getLastWindow() {
    return session().getEval("var windowId; for(var x in selenium.browserbot.openedWindows ){windowId=x;} ");
}

@Test
public void testTextInPDF() {
    session().click("link=View PDF");
    String popupName = getLastWindow();
    session().waitForPopUp(popupName, PAGE_LOAD_TIMEOUT);
    session().selectWindow(popupName);

    session().windowMaximize();
    session().windowFocus();
    Thread.sleep(3000);

    session().keyDownNative("17"); // Stands for CTRL key
    session().keyPressNative("65"); // Stands for A "ascii code for A"
    session().keyUpNative("17"); //Releases CTRL key
    Thread.sleep(1000);

    session().keyDownNative("17"); // Stands for CTRL key
    session().keyPressNative("67"); // Stands for C "ascii code for C"
    session().keyUpNative("17"); //Releases CTRL key

    TextTransfer textTransfer = new TextTransfer();
    assertTrue(textTransfer.getClipboardContents().contains("Some text in my pdf"));
}

Another way, still in java, is to download the pdf and then convert the pdf to text with PDFBox, see http://www.prasannatech.net/2009/01/convert-pdf-text-parser-java-api-pdfbox.html for an example on how to do this.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...