Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


javascript - How to run Puppeteer code in any web browser?

I'm trying to do some web scraping with Puppeteer, and I need to get the scraped value into a website I'm building.

I have tried to load the Puppeteer file in the HTML file as if it were an ordinary JavaScript file, but I keep getting an error. However, if I run it in a cmd window, it works well.

Scraper.js:
getPrice();
function getPrice() {
    const puppeteer = require('puppeteer');
    void (async () => {
        try {
            const browser = await puppeteer.launch()
            const page = await browser.newPage()              
            await page.goto('http://example.com') 
            await page.setViewport({ width: 1920, height: 938 })        
            await page.waitForSelector('.m-hotel-info > .l-container > .l-header-section > .l-m-col-2 > .m-button')
            await page.click('.m-hotel-info > .l-container > .l-header-section > .l-m-col-2 > .m-button')
            await page.waitForSelector('.modal-content')
            await page.click('.tile-hsearch-hws > .m-search-tabs > #edit-search-panel > .l-em-reset > .m-field-wrap > .l-xs-col-4 > .analytics-click')
            await page.waitForNavigation();
            await page.waitForSelector('.tile-search-filter > .l-display-none')
            const innerText = await page.evaluate(() => document.querySelector('.tile-search-filter > .l-display-none').innerText);
            console.log(innerText)
        } catch (error) {
            console.log(error)
        }

    })()
}
index.html:
<html>
  <head></head>
  <body>
    <script src="../js/scraper.js" type="text/javascript"></script>
  </body>
</html>

The expected result should be this one in the console of Chrome:

But I'm getting this error instead:

What am I doing wrong?



1 Reply


EDIT: Since Puppeteer removed support for puppeteer-web, I moved it out of the repo and tried to patch it a bit.

Puppeteer can run in the browser. The package for this is called puppeteer-web, and it was made specifically for such cases.

The main point, though, is that an instance of Chrome must already be running on some server. Only then can you connect to it.
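
As a rough illustration of the server side, here is a minimal sketch of a Node.js script that starts Chrome and prints the WebSocket endpoint a page could connect to. It assumes the regular puppeteer package is installed on the server; the names buildLaunchOptions and startBrowser and the default port 8080 are hypothetical choices for this example, not part of puppeteer-web.

```javascript
// server.js (hypothetical file name): runs under Node.js, not in a web page.

// Build launch options that pin Chrome's DevTools port to a known value,
// so the endpoint is predictable instead of using a random port.
function buildLaunchOptions(port) {
  return {
    headless: true,
    args: [`--remote-debugging-port=${port}`],
  };
}

// Launch Chrome and report the WebSocket endpoint clients can connect to.
async function startBrowser(port = 8080) {
  // Loaded here so the file can be imported without puppeteer installed.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch(buildLaunchOptions(port));
  // The endpoint looks like ws://127.0.0.1:<port>/devtools/browser/<id>
  console.log('browserWSEndpoint:', browser.wsEndpoint());
  return browser;
}

module.exports = { buildLaunchOptions, startBrowser };
```

Calling startBrowser() from a small entry script keeps Chrome running until browser.close() is called. By default Chrome listens only on the local machine, so a page served from elsewhere would probably need a proxy or tunnel in front of this port.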

You can then use it in your web page to drive another browser instance through its WebSocket (WS) endpoint:

<script src="https://unpkg.com/puppeteer-web">
</script>

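<script>
  // Classic scripts do not allow top-level await, so wrap the calls in an async function.
  (async () => {
    const browser = await puppeteer.connect({
      browserWSEndpoint: `ws://0.0.0.0:8080`, // <-- connect to a server running somewhere
      ignoreHTTPSErrors: true
    });

    const pagesCount = (await browser.pages()).length;
    const browserWSEndpoint = browser.wsEndpoint(); // synchronous, so no await is needed
    console.log({ browserWSEndpoint, pagesCount });
  })().catch(console.error);
</script>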

I had some fun with Puppeteer and webpack.

See these answers for a full understanding of creating the server and more.

