Summary:
The Puppeteer function page.exposeFunction() lets code running in the page's DOM environment call a function that executes in Node.js, which gives the page access to Node.js functionality.
On the other hand, page.evaluateOnNewDocument() evaluates a predefined function whenever a new document is created, before any of that document's scripts are executed.
The Puppeteer Documentation for page.exposeFunction() states:
page.exposeFunction(name, puppeteerFunction)
- name <string> Name of the function on the window object
- puppeteerFunction <function> Callback function which will be called in Puppeteer's context.
- returns: <Promise>
The method adds a function called name on the page's window object. When called, the function executes puppeteerFunction in node.js and returns a Promise which resolves to the return value of puppeteerFunction.
If the puppeteerFunction returns a Promise, it will be awaited.
NOTE Functions installed via page.exposeFunction survive navigations.
An example of adding an md5 function into the page:
const puppeteer = require('puppeteer');
const crypto = require('crypto');

puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  page.on('console', msg => console.log(msg.text()));
  await page.exposeFunction('md5', text =>
    crypto.createHash('md5').update(text).digest('hex')
  );
  await page.evaluate(async () => {
    // use window.md5 to compute hashes
    const myString = 'PUPPETEER';
    const myHash = await window.md5(myString);
    console.log(`md5 of ${myString} is ${myHash}`);
  });
  await browser.close();
});
An example of adding a window.readfile function into the page:
const puppeteer = require('puppeteer');
const fs = require('fs');

puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  page.on('console', msg => console.log(msg.text()));
  await page.exposeFunction('readfile', async filePath => {
    return new Promise((resolve, reject) => {
      fs.readFile(filePath, 'utf8', (err, text) => {
        if (err)
          reject(err);
        else
          resolve(text);
      });
    });
  });
  await page.evaluate(async () => {
    // use window.readfile to read contents of a file
    const content = await window.readfile('/etc/hosts');
    console.log(content);
  });
  await browser.close();
});
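When several Node.js helpers need to be exposed at once, the registration can be wrapped in a small function. The following is only a sketch of one way to do that: `exposeHelpers`, `nodeHelpers`, and the helper names `sha256` and `nodeVersion` are hypothetical names chosen for illustration, and the only Puppeteer call it relies on is `page.exposeFunction(name, fn)` as quoted above. The `page` is passed in as a parameter, so the function works with any object that provides that method.

```javascript
const crypto = require('crypto');

// Hypothetical helper: registers every entry of `helpers` on the page
// via page.exposeFunction(name, fn) and returns the exposed names.
async function exposeHelpers(page, helpers) {
  const names = Object.keys(helpers);
  for (const name of names) {
    // Each function runs in Node.js; the page sees window[name],
    // which returns a Promise for the function's return value.
    await page.exposeFunction(name, helpers[name]);
  }
  return names;
}

// Example set of Node-side helpers (names are assumptions, not Puppeteer API).
const nodeHelpers = {
  sha256: text => crypto.createHash('sha256').update(text).digest('hex'),
  nodeVersion: () => process.version,
};
```

Inside the launch callback this would be called as `await exposeHelpers(page, nodeHelpers);` before `page.evaluate()`, after which page code could use `await window.sha256('PUPPETEER')`.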
Furthermore, the Puppeteer Documentation for page.evaluateOnNewDocument explains:
page.evaluateOnNewDocument(pageFunction, ...args)
Adds a function which would be invoked in one of the following scenarios:
- whenever the page is navigated
- whenever the child frame is attached or navigated. In this case, the function is invoked in the context of the newly attached frame
The function is invoked after the document was created but before any of its scripts were run. This is useful to amend the JavaScript environment, e.g. to seed Math.random.
An example of overriding the navigator.languages property before the page loads:
// preload.js
// overwrite the `languages` property to use a custom getter
Object.defineProperty(navigator, "languages", {
  get: function() {
    return ["en-US", "en", "bn"];
  }
});

// In your Puppeteer script, assuming preload.js is in the same folder as the script.
// This part must run inside an async function, after `page` has been created.
const fs = require('fs');
const preloadFile = fs.readFileSync('./preload.js', 'utf8');
await page.evaluateOnNewDocument(preloadFile);
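Besides a script string, page.evaluateOnNewDocument also accepts a function plus arguments, as its signature `(pageFunction, ...args)` indicates. As a rough sketch of the "seed Math.random" use case mentioned in the documentation, the function below replaces Math.random with a small deterministic generator. The names `seedMathRandom` and `installSeededRandom` are hypothetical, and the mulberry32-style generator is just one convenient choice, not something Puppeteer prescribes.

```javascript
// Runs inside the page, so it must be self-contained: the function is
// sent to the browser as source text and cannot use Node.js variables.
function seedMathRandom(seed) {
  let state = seed >>> 0;
  // mulberry32-style PRNG, used here only for illustration.
  Math.random = function () {
    state = (state + 0x6D2B79F5) | 0;
    let t = Math.imul(state ^ (state >>> 15), 1 | state);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical wrapper: registers the seeding function so that it runs
// in every new document before that document's own scripts.
async function installSeededRandom(page, seed) {
  await page.evaluateOnNewDocument(seedMathRandom, seed);
}
```

In a script this would be called as `await installSeededRandom(page, 42);` before `page.goto()`, so that each document starts from the same seed and page code calling Math.random() sees a repeatable sequence.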