Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

javascript - Use array of keywords and loop through script in Playwright

I am trying to scrape a couple of search engines with a couple of search phrases using Playwright. Running the script with a single query works.

Working:

  const { chromium } = require('playwright');

  (async () => {
  const browser = await chromium.launch({ headless: false, slowMo: 250 });
  const context = await browser.newContext()
  const page = await context.newPage();

  const keyWord = ('Arsenal');

  await page.goto('https://duckduckgo.com/');
  await page.fill('//input[@name="q"]',keyWord);
  await page.keyboard.press('Enter');

  const getOne =  ('  (//h2[@class="result__title"])[9]    ');
  await page.waitForSelector(getOne)
  const pushOne = await page.$(getOne);
  const One = await pushOne.evaluate(element => element.innerText);
  console.log(One);

  await page.goto('https://yandex.com/');
  await page.fill('//input[@aria-label="Request"]', keyWord);
  await page.keyboard.press('Enter');

  const getTwo =  ('  //li[@data-first-snippet] //div[@class="organic__url-text"]    ');
  await page.waitForSelector(getTwo)
  const pushTwo = await page.$(getTwo);
  const Two = await pushTwo.evaluate(element => element.innerText);
  console.log(Two);

  await browser.close()
  })()

But when I use an array of phrases (kewWordlist), I cannot get the script to run. I have searched for ways to use an array with for and forEach loops, but haven't been able to fix it. I want to run each keyword through each search engine and list the results. For 3 keywords in two search engines, that would give 6 results.

  const { chromium } = require('playwright');

  (async () => {
  const browser = await chromium.launch({ headless: false, slowMo: 250 });
  const context = await browser.newContext()
  const page = await context.newPage();


  let kewWordlist = ['Arsenal', 'Liverpool', 'Ajax']
  
  for (var i=0; i<=kewWordlist.length; i++) {
        // for (const i in kewWordlist){
        async () => {
              
              const keyWord = kewWordlist[i];

              await page.goto('https://duckduckgo.com/');
              await page.fill('//input[@name="q"]',keyWord);
              // await page.fill('//input[@name="q"]',[i]);
              // await page.fill('//input[@name="q"]',`${keyWord}`);
              await page.keyboard.press('Enter');


              const getOne =  ('  (//h2[@class="result__title"])[9]    ');
              await page.waitForSelector(getOne)
              const pushOne = await page.$(getOne);
              const One = await pushOne.evaluate(element => element.innerText);
              console.log(One);


              // await page.goto('https://yandex.com/');
              // await page.fill('//input[@aria-label="Request"]', keyWord);
              // await page.keyboard.press('Enter');

              // const getTwo =  ('  //li[@data-first-snippet] //div[@class="organic__url-text"]    ');
              // await page.waitForSelector(getTwo)
              // const pushTwo = await page.$(getTwo);
              // const Two = await pushTwo.evaluate(element => element.innerText);
              // console.log(Two);

        }}
        await browser.close()
  })()

If anyone has some pointers on how to solve this, much obliged.

question from:https://stackoverflow.com/questions/65876180/use-array-of-keywords-and-loop-through-script-in-playwright

1 Reply

The reason your loop isn't working is that you have an async function inside of it that you never call. There are a few ways you could go about this:

You could take your first version, have it accept a word to search, and run that over each element of the array:

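const { chromium } = require('playwright');

// Each call launches its own browser and searches for a single keyword.
const searchOneKeyword = async (keyWord) => {
  const browser = await chromium.launch({ headless: false, slowMo: 250 });
  const context = await browser.newContext()
  const page = await context.newPage();

  // rest of code
}

const keyWordList = ['Arsenal', 'Liverpool', 'Ajax']

// forEach does not wait for the returned promises, so the searches start
// one after another without waiting and then run concurrently.
keyWordList.forEach((k) => {
  searchOneKeyword(k)
})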
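One thing to keep in mind with the forEach version is that forEach ignores the promises returned by the callback. The sketch below illustrates the resulting order of events without Playwright. `sleep` and the `log` parameter are hypothetical stand-ins added for the demonstration, and `searchOneKeyword` here only simulates a slow search.

```javascript
// Hypothetical helper: resolves after the given number of milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Stand-in for the real Playwright search; it only records when it starts
// and when it finishes.
const searchOneKeyword = async (keyWord, log) => {
  log.push(`start ${keyWord}`);
  await sleep(10);
  log.push(`done ${keyWord}`);
};

// forEach fires every search immediately and drops the returned promises.
const runWithForEach = async (words) => {
  const log = [];
  words.forEach((k) => {
    searchOneKeyword(k, log);
  });
  log.push('forEach returned');
  await sleep(50); // only here so the demo can observe the late finishes
  return log;
};

// for...of with await finishes one search before starting the next.
const runSequentially = async (words) => {
  const log = [];
  for (const k of words) {
    await searchOneKeyword(k, log);
  }
  log.push('loop finished');
  return log;
};
```

With forEach, all searches have started before any has finished, and the code after the loop runs while they are still in flight. That is fine when each search owns its browser, as in the snippet above, but it also means nothing after the loop can rely on the searches being complete.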

Or if you'd like to keep the same browser instance, you can do it in a loop in the function:

const search = async (words) => {
  const browser = await chromium.launch({ headless: false, slowMo: 250 });
  const context = await browser.newContext()
  const page = await context.newPage();

  for (const keyWord of words) {
    await page.goto('https://duckduckgo.com/');
    await page.fill('//input[@name="q"]',keyWord);
    await page.keyboard.press('Enter');

    const getOne =  ('  (//h2[@class="result__title"])[9]    ');
    await page.waitForSelector(getOne)
    const pushOne = await page.$(getOne);
    const One = await pushOne.evaluate(element => element.innerText);
    console.log(One);

    // etc.
  }

  await browser.close()
}

search(keyWordList)
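For comparison, here is a minimal sketch of why the loop in the question produced no output, again without Playwright. `fakeSearch` is a hypothetical stand-in for the page steps. It also shows a second issue in the question's loop: the condition `i<=kewWordlist.length` runs one pass too many, so the last keyword read would be undefined.

```javascript
// Hypothetical stand-in for the Playwright steps (goto, fill, waitForSelector...).
const fakeSearch = async (keyWord) => `result for ${keyWord}`;

const keyWordList = ['Arsenal', 'Liverpool', 'Ajax'];

// Mirrors the question: the arrow function is created on each pass but never
// invoked, so nothing inside it runs.
const runUncalled = async () => {
  const seen = [];
  for (let i = 0; i < keyWordList.length; i++) {
    async () => {
      seen.push(await fakeSearch(keyWordList[i]));
    };
  }
  return seen;
};

// Same structure, but the function is invoked and awaited on each pass.
const runInvoked = async () => {
  const seen = [];
  for (let i = 0; i < keyWordList.length; i++) {
    await (async () => {
      seen.push(await fakeSearch(keyWordList[i]));
    })();
  }
  return seen;
};

// With `i <= keyWordList.length` the final pass reads one past the end.
const pastTheEnd = keyWordList[keyWordList.length]; // undefined
```

Calling and awaiting the inner function would make the original loop work, but awaiting directly in the loop body, as in the for...of version above, does the same thing with less nesting.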

In both of those cases, you're logging, but never returning anything, so if you need that data in another function afterwards, you'd have to change that. Example:

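const search = async (words) => {
  const browser = await chromium.launch({ headless: false, slowMo: 250 });
  const context = await browser.newContext()

  // The map callback must be async to use await. Each keyword gets its own
  // page, because the searches run concurrently under Promise.all and would
  // otherwise navigate the same page at the same time.
  const results = await Promise.all(words.map(async (keyWord) => {
    const page = await context.newPage();
    await page.goto('https://duckduckgo.com/');
    await page.fill('//input[@name="q"]',keyWord);
    await page.keyboard.press('Enter');

    // etc. (collect One and Two as in the earlier snippets)
    
    return [ One, Two ]
  }))

  await browser.close()
  return results
}

search(keyWordList).then((results) => { console.log(results.flat()) })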
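To make the shape of the returned data concrete, here is a small sketch without Playwright. `searchBothEngines` is a hypothetical stand-in that returns one string per search engine; in the real script those would be the One and Two values.

```javascript
// Hypothetical stand-in: pretends to query two search engines for one keyword.
const searchBothEngines = async (keyWord) => [
  `${keyWord} @ duckduckgo`,
  `${keyWord} @ yandex`,
];

// Promise.all resolves to an array in the same order as the input, with one
// [One, Two] pair per keyword.
const searchAll = async (words) =>
  Promise.all(words.map((keyWord) => searchBothEngines(keyWord)));

// flat() turns the array of pairs into a single flat list of results.
const listResults = async (words) => (await searchAll(words)).flat();
```

For the three keywords in the question, `listResults(['Arsenal', 'Liverpool', 'Ajax'])` resolves to a list of six entries, two per keyword, in keyword order.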
