Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others



javascript - Puppeteer script on DigitalOcean droplet running 24/7 maxes out memory and crashes

I have a simple Puppeteer script that I leave running on a $5/month DigitalOcean droplet. The script is below. It logs in to an ecommerce website and constantly refreshes the page looking for a specific deal (if found, it adds the item to the cart and checks out). As you can see from the monitoring graphs, memory usage increases with every loop until it reaches the maximum, and then the script crashes. I think memory here means RAM, not storage.

Can anyone please suggest a better way to configure the Puppeteer script to fix this problem? Each loop takes about 5 seconds at first. After about 12 hours, memory usage has grown to the point where the loop runtime creeps up from 5 seconds to 30 seconds per run. Eventually the memory reaches the maximum and the script crashes.

Would closing and reopening the browser within each run fix the memory problem? Is it because I have the script running as a never-ending .js file?

Screenshot of Droplet memory


const puppeteer = require('puppeteer-extra')
puppeteer.use(require('puppeteer-extra-plugin-stealth')())
const fs = require('fs');
const performance = require('perf_hooks').performance;


(async () => {
  const browser = await puppeteer.launch({
    args: ['--no-sandbox'],
    headless: true
  })
  const page = await browser.newPage()
 
  //Login to page
  await page.goto('https://example.com/login', {waitUntil: 'load', timeout: 0});
  await page.type('[name=email]','[email protected]');
  await page.click('button[type="submit"]');
  await page.waitForSelector('input[name=password]');
  await page.type('[name=password]','xxx');
  await page.click('button[type="submit"]');

  var runcount = 0
  var foundcount = 0

  //loops always running after logging in
  while(true){
    var t0 = performance.now();  
    
    await page.goto('https://www.example.com/xxx', {waitUntil: 'load', timeout: 0});
    
    //do some await page.$$eval here
    
  
    runcount++;
    var t1 = performance.now();
    console.log("Run: " + runcount + " | Runtime: " + Math.floor((t1 - t0)/1000));
  
  }
 
  await browser.close()
})()
Question from: https://stackoverflow.com/questions/65644411/puppeteer-script-on-digitalocean-droplet-running-24-7-maxes-out-memory-and-crash


1 Reply


Puppeteer works well when run under PM2, a process manager for Node.js that watches your scripts and restarts them when they throw errors, among other things. That could help in your case, because PM2 has a memory threshold setting that restarts the entire script when it reaches the specified limit, with minimal downtime. PM2 is easy to use: you only need to install it, globally or locally, and then tell it which JS file to load.
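To make the memory threshold idea concrete, here is a minimal sketch of the kind of check it implies: compare the process's resident memory against a limit written in the same style as the '200M' value used with PM2. The parseSize and overLimit helpers are hypothetical names made up for this illustration; they are not PM2's API, and this is not how PM2 itself is implemented.

```javascript
// Hypothetical helper: turn a size string such as '200M' into bytes.
function parseSize(text) {
  const match = /^(\d+(?:\.\d+)?)\s*([KMG])?B?$/i.exec(String(text).trim());
  if (!match) {
    throw new Error('Unrecognized size: ' + text);
  }
  const units = { K: 1024, M: 1024 ** 2, G: 1024 ** 3 };
  const factor = match[2] ? units[match[2].toUpperCase()] : 1;
  return Math.round(Number(match[1]) * factor);
}

// Hypothetical helper: true when the given resident set size exceeds the limit.
// By default it reads the current Node.js process's RSS.
function overLimit(limit, rssBytes = process.memoryUsage().rss) {
  return rssBytes > parseSize(limit);
}

// Example: a script could check this once per loop and exit, leaving the
// restart to whatever supervises it.
if (overLimit('200M')) {
  console.log('Over the memory limit; a supervisor would restart the script here.');
}
```

Note that process.memoryUsage().rss covers only the Node.js process itself. The Chromium processes that Puppeteer launches are separate processes and are not counted by this check.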

PM2 has good documentation that you can consult for your own setup.

After installing it, create an ecosystem.config.js file in your project, like the one below. Mine defines two different scripts, but you can list just one:

module.exports = {
  apps: [
    {
      name: "server-centrodonto",
      script: "./dist/server.js",
      exp_backoff_restart_delay: 100,
      max_memory_restart: '200M',
      watch: false,
      exec_mode: "cluster",
      instances: 1,
      env: {
        "NODE_ENV": "development",
        "APP_NAME": "server-centrodonto",
        "PORT": 3030,
        "HOST": "localhost",
        "DATABASE": "centrodonto-database-development"
      },
      env_production: {
        "NODE_ENV": "production",
        "APP_NAME": "server-centrodonto",
        "PORT": 3030,
        "HOST": "localhost",
        "DATABASE": "centrodonto-database"
      }
    },
    {
      name: "wpp-bot-atendente_virtual-centrodonto",
      // here you put your js file location
      script: "./dist/whatsappBot/AtendenteVirtual_bot.js",
      exp_backoff_restart_delay: 100,
      max_memory_restart: '200M',
      watch: false,
      exec_mode: "cluster",
      instances: 1,
      env: {
        "NODE_ENV": "development",
        "APP_NAME": "WPPbot_atendente virtual"
      },
      env_production: {
        "NODE_ENV": "production",
        "APP_NAME": "WPPbot_atendente virtual"
      }
    }
  ]
};

Next, configure the scripts in your package.json like this:

 "scripts": {
    "start": "yarn pm2 start ecosystem.config.js --env production",
    "start-dev": "yarn pm2 start ecosystem.config.js"
  },

In my case, I have installed it locally.

After that, just run yarn start and PM2 will load your script. You can then view the logs and the memory usage with the commands pm2 logs and pm2 monit.
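The question also asks whether closing and reopening the browser would help. That idea can be combined with a PM2 restart, and a minimal sketch of it follows, assuming a fresh browser is launched every few hundred runs instead of on every run. The runLoop function and its launchBrowser, login and checkDeal callbacks are hypothetical names supplied by the caller; none of them are part of Puppeteer. Whether this actually stops the memory growth in this case is untested.

```javascript
// Sketch only: launchBrowser, login and checkDeal are hypothetical callbacks
// supplied by the caller. The browser object only needs newPage() and close().
async function runLoop(launchBrowser, login, checkDeal, options = {}) {
  const { recycleEvery = 500, maxRuns = Infinity } = options;

  let browser = await launchBrowser();
  let page = await browser.newPage();
  await login(page);

  let runs = 0;
  try {
    while (runs < maxRuns) {
      await checkDeal(page);
      runs++;

      // Every `recycleEvery` runs, discard the whole browser and start over.
      // A fresh browser has no session, so the login step is repeated.
      if (runs % recycleEvery === 0 && runs < maxRuns) {
        await browser.close();
        browser = await launchBrowser();
        page = await browser.newPage();
        await login(page);
      }
    }
  } finally {
    // Reached when maxRuns is hit or when a callback throws.
    await browser.close();
  }
  return runs;
}
```

In the question's script, launchBrowser could wrap the existing puppeteer.launch({ args: ['--no-sandbox'], headless: true }) call, login could hold the login steps, and checkDeal could hold the body of the while loop. Leaving maxRuns at its default keeps the loop running indefinitely.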

