page.evaluate() is a key Puppeteer method that lets you run JavaScript directly in the browser context. It bridges Node.js and the browser, enabling tasks like DOM manipulation, data extraction, and automation of dynamic web pages. Here's what you need to know:
const title = await page.evaluate(() => document.title);
This retrieves the page title directly from the browser.
Feature | Node.js Context | Browser Context |
---|---|---|
Global Objects | process, require | window, document |
Script Location | Local machine | Target webpage |
API Access | Node.js APIs | Browser Web APIs |
Use page.evaluate() for precise, efficient automation tasks, especially when working with JavaScript-heavy websites.
When working with Puppeteer for web automation, it's crucial to grasp the distinction between the Node.js context and the browser context. These two environments are isolated, each with its own rules for running code and exchanging data.
Puppeteer operates across two environments: the Node.js context, where your main script runs, and the browser context, where interactions with the webpage occur. These are separate processes, each with its own virtual machine.
Here's a quick comparison of their key characteristics:
Feature | Node.js Context | Browser Context |
---|---|---|
Global Objects | process, require, __dirname | window, document, localStorage |
Script Location | Local machine | Target webpage |
Variable Scope | Puppeteer script scope | Page context scope |
API Access | Node.js APIs | Browser Web APIs |
Memory Space | Separate process | Browser process |
Data exchange between these contexts involves a series of steps, relying heavily on serialization:
1. The function is converted to a string with Function.prototype.toString().
2. The string is sent to the browser, where it is evaluated in the page context.
3. The return value is serialized and sent back to Node.js.
Key limitations: Functions in the browser context cannot directly access variables from the Node.js scope. Puppeteer offers specific tools to address these challenges:
- page.evaluateHandle(): Returns references to objects in the browser context.
- page.exposeFunction(): Allows the browser to call Node.js functions.
- page.evaluateOnNewDocument(): Executes code before any page scripts load.
However, JSON serialization may strip certain properties, especially with complex objects like DOM nodes. To avoid issues, pass data as function arguments instead of relying on Node.js variables.
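The closure limitation can be reproduced in plain Node.js without a browser. The sketch below imitates the round trip by turning a function into source text with Function.prototype.toString() and rebuilding it in a fresh scope with new Function. That rebuild step is a stand-in chosen for illustration, not Puppeteer's actual transport, and the names runDetached, demo, and selector are hypothetical.

```javascript
// Rebuild a function from its source text, the way a stringified
// function loses access to the scope it was defined in.
// `runDetached` is a hypothetical helper, not part of Puppeteer.
function runDetached(fn, ...args) {
  const source = fn.toString();
  // new Function creates the function in the global scope, so variables
  // captured by the original closure are no longer visible.
  const rebuilt = new Function(`return (${source});`)();
  // Arguments are copied through JSON, imitating serialization.
  const copiedArgs = JSON.parse(JSON.stringify(args));
  return rebuilt(...copiedArgs);
}

function demo() {
  const selector = '.price'; // local variable, captured by the closure below

  // Relies on the closure: works locally, breaks once detached.
  const viaClosure = () => typeof selector;

  // Receives the value as an argument: survives the round trip.
  const viaArgument = (sel) => typeof sel;

  return {
    local: viaClosure(),
    detachedClosure: runDetached(viaClosure),
    detachedArgument: runDetached(viaArgument, selector),
  };
}

const closureDemo = demo();
console.log(closureDemo);
```

The detached closure no longer sees selector, while the version that takes it as an argument still does, which is why passing data through the argument list is the safer habit.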
Mastering these communication techniques ensures you can use page.evaluate effectively for automation tasks. Next, we'll dive into practical examples to see these concepts in action.
Syntax:
await page.evaluate(pageFunction, ...args)
Parameter | Type | Description |
---|---|---|
pageFunction | Function or string | JavaScript code to execute in the browser context |
args | Optional parameters | Values passed from Node.js to the browser context |
Return value | Promise | Resolves with the function's return value |
The pageFunction can be a function or a string containing JavaScript code. Using a function is generally better for debugging and TypeScript compatibility. Below are some examples to demonstrate how it works.
Examples:
Read the text of the first <h1> directly from the DOM:
const headingText = await page.evaluate(() => {
  return document.querySelector('h1').textContent;
});
Fill in and submit a login form, passing the credentials as arguments:
await page.evaluate((username, password) => {
  document.getElementById('username').value = username;
  document.getElementById('password').value = password;
  document.querySelector('#login-form').submit();
}, 'myUsername', 'myPassword');
Add a new element to the page and return its text:
await page.evaluate(() => {
  const div = document.createElement('div');
  div.textContent = 'Added by Puppeteer';
  document.body.appendChild(div);
  return div.textContent;
});
Debugging Tip: Use the following configuration to enable debugging during development:
const browser = await puppeteer.launch({
headless: false,
slowMo: 100 // Adds a 100ms delay to each operation
});
Next, we'll dive into techniques for exchanging data between Node.js and browser contexts.
When transferring data with page.evaluate, stick to JSON-serializable values for input arguments.
Here's a quick breakdown of supported parameter types:
Parameter Type | Supported? | Example |
---|---|---|
Primitives | ✓ Fully | 'text' , 42 , true |
Arrays/Objects | ✓ JSON-compatible | { key: 'value' } , [1, 2, 3] |
Functions | ✗ Not directly | Use page.exposeFunction |
DOM Elements | ✓ Through JSHandle | Use page.evaluateHandle |
Now, let's see how these values are returned from the browser context.
When using page.evaluate, the returned values are automatically serialized to JSON. Here's how it works:
// Returning a simple value
const pageTitle = await page.evaluate(() => document.title);
// Returning a complex object
const metrics = await page.evaluate(() => ({
viewport: window.innerWidth,
scrollHeight: document.body.scrollHeight,
timestamp: Date.now()
}));
"As a rule of thumb, if the return value of the given function is more complicated than a JSON object (e.g., most classes), then
evaluate will likely return some truncated value (or {}). This is because we are not returning the actual return value, but a deserialized version as a result of transferring the return value through a protocol to Puppeteer."
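To get a feel for what "truncated" means here, a plain JSON round trip in Node.js is a reasonable approximation. This is an illustration only: Puppeteer's real protocol may treat some values differently, and roundTrip, Product, and samples are hypothetical names.

```javascript
// Approximate the "deserialized version" by round-tripping through JSON.
function roundTrip(value) {
  const text = JSON.stringify(value);
  return text === undefined ? undefined : JSON.parse(text);
}

class Product {
  constructor(name) {
    this.name = name;
  }
  label() {
    return `Product: ${this.name}`;
  }
}

const samples = {
  plainObject: roundTrip({ title: 'Home', count: 3 }),
  map: roundTrip(new Map([['a', 1]])),
  classInstance: roundTrip(new Product('Lamp')),
  withFunction: roundTrip({ id: 1, onClick: () => {} }),
};

console.log(samples.plainObject);                 // survives intact
console.log(samples.map);                         // {} (entries are lost)
console.log(typeof samples.classInstance.label);  // 'undefined' (methods are lost)
console.log(samples.withFunction);                // { id: 1 } (function property dropped)
```

Plain data survives, while Map entries, class methods, and function properties do not. Returning plain objects and arrays built inside the page function avoids these surprises.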
Once you've retrieved the output, you may encounter serialization-related challenges. Here's how to tackle them.
Some common scenarios require specific workarounds:
Passing a DOM element into the page function through a handle:
const bodyHandle = await page.$('body');
const html = await page.evaluate(body => body.innerHTML, bodyHandle);
await bodyHandle.dispose(); // Always clean up to avoid memory leaks
Calling a Node.js function from the page:
const crypto = require('crypto'); // Node.js built-in module used by the md5 helper
await page.exposeFunction('md5', text =>
  crypto.createHash('md5').update(text).digest('hex')
);
const hash = await page.evaluate(async () => {
  return await window.md5('test-string');
});
If you're working with TypeScript, ensure your transpiler is set up correctly:
// tsconfig.json
{
"compilerOptions": {
"target": "es2018"
}
}
These strategies will help you handle data exchange effectively in various contexts.
Here’s how you can use page.evaluate in real-world scenarios, complete with practical code snippets.
Example: Scraping product details
This script collects details like title, price, rating, and stock status from product cards on a webpage:
const productData = await page.evaluate(() => {
const products = Array.from(document.querySelectorAll('.product-card'));
return products.map(product => ({
title: product.querySelector('.title').textContent.trim(),
price: product.querySelector('.price').textContent.trim(),
rating: parseFloat(product.querySelector('.rating').dataset.value),
inStock: product.querySelector('.stock').textContent.includes('Available')
}));
});
Example: Extracting table data
This approach retrieves data from a table by iterating through its rows and columns:
const tableData = await page.evaluate(() => {
const rows = Array.from(document.querySelectorAll('table tr'));
return rows.map(row => {
const columns = row.querySelectorAll('td');
return Array.from(columns, column => column.innerText);
});
});
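One thing to watch in the table example: a header row built from th cells contains no td elements, so it comes back as an empty array. A possible clean-up step on the Node.js side is sketched below, assuming you supply the column names yourself. The function rowsToRecords and the sample data are hypothetical.

```javascript
// Turn the array-of-arrays returned by the table scrape into records.
// `headers` is supplied by the caller; empty rows (e.g. header rows
// with no <td> cells) are skipped.
function rowsToRecords(headers, rows) {
  return rows
    .filter(cells => cells.length > 0)
    .map(cells => {
      const record = {};
      headers.forEach((header, index) => {
        // Missing cells become null so every record has the same keys.
        record[header] = index < cells.length ? cells[index].trim() : null;
      });
      return record;
    });
}

const sampleRows = [
  [],                      // header row: no <td> cells were found
  ['Widget ', ' $4.99'],
  ['Gadget', '$12.50'],
];
const records = rowsToRecords(['name', 'price'], sampleRows);
console.log(records);
```

Doing this shaping in Node.js keeps the page function small. The same mapping could also run inside page.evaluate if you prefer a single step.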
Basic form automation
Here’s how to fill out form fields, trigger events, and submit the form:
await page.evaluate(() => {
// Fill form fields
document.querySelector('#username').value = 'testuser';
document.querySelector('#password').value = 'secretpass';
// Trigger input events for dynamic forms
const event = new Event('input', { bubbles: true });
document.querySelector('#username').dispatchEvent(event);
// Submit form
document.querySelector('form').submit();
});
Handling complex forms
For tasks like selecting dropdown options or checking radio buttons:
await page.evaluate(() => {
// Select dropdown option
const select = document.querySelector('#country');
select.value = 'US';
select.dispatchEvent(new Event('change', { bubbles: true }));
// Check radio button
const radio = document.querySelector('input[value="express"]');
radio.checked = true;
radio.dispatchEvent(new Event('change', { bubbles: true }));
});
Example: Infinite scrolling
This script scrolls through a page until it collects at least 100 items:
const items = await page.evaluate(async () => {
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));
const items = new Set();
while (items.size < 100) {
// Scroll to bottom
window.scrollTo(0, document.body.scrollHeight);
// Wait for new content
await delay(1000);
// Collect items
document.querySelectorAll('.item').forEach(item =>
items.add(item.textContent.trim())
);
}
return Array.from(items);
});
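As written, the loop above never ends if the page holds fewer than 100 distinct items. One way to guard against that is to cap the number of rounds and stop early when a round adds nothing new. The sketch below shows that control flow in isolation. The helper collectUntil, its options, and the fake collector are hypothetical; in a real script the same loop would live inside page.evaluate, with the collect step scrolling and reading the DOM.

```javascript
// Collect unique items until `target` is reached, the source stops
// producing new items, or `maxRounds` is exhausted.
async function collectUntil(collect, target, { maxRounds = 20, delayMs = 0 } = {}) {
  const delay = ms => new Promise(resolve => setTimeout(resolve, ms));
  const seen = new Set();

  for (let round = 0; round < maxRounds && seen.size < target; round++) {
    const before = seen.size;
    for (const item of await collect(round)) {
      seen.add(item);
    }
    // No growth means the source is exhausted; stop instead of spinning.
    if (seen.size === before) break;
    await delay(delayMs);
  }
  return Array.from(seen);
}

// Fake source that only ever yields 5 distinct items.
const fakeCollect = async round =>
  Array.from({ length: Math.min((round + 1) * 2, 5) }, (_, i) => `item-${i}`);

const collected = collectUntil(fakeCollect, 100);
collected.then(items => console.log(items.length)); // 5
```

Even though the target is 100, the loop returns after the first round that adds nothing new.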
Example: Handling AJAX content
To load more content dynamically, this script clicks a "Load More" button and waits for new elements to appear:
await page.evaluate(async () => {
// Click load more button
document.querySelector('#loadMore').click();
// Wait for content update
await new Promise(resolve => {
const observer = new MutationObserver((mutations, obs) => {
if (document.querySelectorAll('.item').length > 10) {
obs.disconnect();
resolve();
}
});
observer.observe(document.body, {
childList: true,
subtree: true
});
});
});
These examples showcase how to handle diverse scenarios like scraping, form automation, and dynamic content. Adjustments can be made based on the specific structure and behavior of the webpage you're working with.
Latenode incorporates Puppeteer's core features into its automation workflows, making it easier to execute JavaScript directly in the browser. With page.evaluate, users can manipulate the DOM and extract data efficiently. This approach allows for seamless integration of advanced data handling and DOM operations within Latenode's automation environment.
Latenode's browser automation module uses page.evaluate to handle everything from simple DOM tasks to more complex JavaScript execution. Here's how it works in different scenarios:
// Basic DOM interaction
await page.evaluate(() => {
const loginButton = document.querySelector('#login');
loginButton.click();
// Trigger a custom event
loginButton.dispatchEvent(new Event('customClick'));
});
// Processing data with exposed functions
await page.exposeFunction('processData', async (data) => {
  // Process data in Node.js context (placeholder transformation)
  const transformedData = data.trim();
  return transformedData;
});
await page.evaluate(async () => {
const rawData = document.querySelector('#data').textContent;
const processed = await window.processData(rawData);
return processed;
});
Latenode also keeps a log of execution history, making it easier to debug scripts.
Latenode is well-equipped to handle dynamic content and complex automation tasks. Here's an example of processing dynamic content on a page:
const extractProductData = await page.evaluate(async () => {
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));
// Wait for dynamic content to load
while (!document.querySelector('.product-grid')) {
await delay(100);
}
return Array.from(document.querySelectorAll('.product'))
.map(product => ({
name: product.querySelector('.name').textContent,
price: product.querySelector('.price').textContent,
availability: product.querySelector('.stock').dataset.status
}));
});
For more advanced operations, page.exposeFunction allows seamless interaction between Node.js and the browser:
const crypto = require('crypto'); // Node.js built-in module
await page.exposeFunction('md5', text =>
  crypto.createHash('md5').update(text).digest('hex')
);
const processedData = await page.evaluate(async () => {
const sensitiveData = document.querySelector('#secure-data').value;
return await window.md5(sensitiveData);
});
To maintain references to DOM elements across steps, Latenode uses page.evaluateHandle:
const elementHandle = await page.evaluateHandle(() => {
return document.querySelector('.dynamic-content');
});
await page.evaluate(element => {
element.scrollIntoView();
}, elementHandle);
These techniques ensure Latenode can handle dynamic content effectively while maintaining reliable performance. For users on the Prime plan, the platform supports up to 1.5 million scenario runs each month, providing extensive automation capabilities.
When working with page.evaluate in browser automation, you might encounter various issues. Here are practical solutions to address them and ensure smoother execution.
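Transpilers such as TypeScript or Babel can rewrite the function you pass to page.evaluate so that it references helpers that do not exist in the page. Configure your TypeScript settings to avoid this, or keep the evaluated code free of transpiled syntax. For example:
// Use direct, non-transpiled functions
await page.evaluate(() => {
  document.querySelector('#button').click();
});
// Or pass the code as a string, which the transpiler leaves untouched
await page.evaluate(`(async () => {
  document.querySelector('#button').click();
})()`);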
Avoid returning DOM elements directly from page.evaluate. Instead, use ElementHandle for better handling:
// Incorrect: a DOM element cannot be serialized, so this does not return a usable element
const element = await page.evaluate(() => {
  return document.querySelector('.dynamic-element');
});
// Correct: Using ElementHandle
const elementHandle = await page.evaluateHandle(() => {
  return document.querySelector('.dynamic-element');
});
Scripts may run before the page is fully loaded, leading to timing errors. Use these strategies to handle such cases:
// Wait for navigation after an action
await Promise.all([
page.waitForNavigation(),
page.click('#submit-button')
]);
// Wait for a specific condition
await page.waitForFunction(() => {
const element = document.querySelector('.lazy-loaded');
return element && element.dataset.loaded === 'true';
}, { timeout: 5000 });
For dynamic websites, adopt more targeted waiting mechanisms:
// Wait for specific network requests
await page.waitForResponse(
response => response.url().includes('/api/data')
);
// Ensure elements are both present and visible
await page.waitForSelector('.dynamic-content', {
visible: true,
timeout: 3000
});
To prevent memory leaks, carefully manage DOM references. Here’s how:
// Use and dispose ElementHandles
const handle = await page.evaluateHandle(() => {
return document.querySelector('.temporary-element');
});
await handle.evaluate(element => {
// Perform operations
});
await handle.dispose(); // Dispose of handle after use
When working with multiple elements, pass data safely between contexts:
// Extract data from the DOM
const selector = '.product-price';
const price = await page.evaluate((sel) => {
const element = document.querySelector(sel);
return element ? element.textContent.trim() : null;
}, selector);
For event listeners, ensure proper cleanup to avoid lingering handlers:
await page.evaluate(() => {
const handler = () => console.log('clicked');
const button = document.querySelector('#button');
button.addEventListener('click', handler);
// Store cleanup references
window._cleanupHandlers = window._cleanupHandlers || [];
window._cleanupHandlers.push(() => {
button.removeEventListener('click', handler);
});
});
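The snippet above stores cleanup callbacks but does not show them being run. A minimal sketch of the full register-then-clean-up cycle follows. It uses Node's built-in EventTarget in place of a DOM button so it can run outside a browser (this assumes Node.js 15 or later), and the cleanupHandlers array stands in for window._cleanupHandlers.

```javascript
const button = new EventTarget(); // stand-in for document.querySelector('#button')
const cleanupHandlers = [];       // stand-in for window._cleanupHandlers
let clicks = 0;

const handler = () => { clicks += 1; };
button.addEventListener('click', handler);
cleanupHandlers.push(() => button.removeEventListener('click', handler));

button.dispatchEvent(new Event('click')); // handler runs

// Run every stored cleanup exactly once, emptying the registry as we go.
while (cleanupHandlers.length > 0) {
  cleanupHandlers.pop()();
}

button.dispatchEvent(new Event('click')); // handler is no longer attached
console.log(clicks); // 1
```

In a page, the same while loop would run inside a later page.evaluate call once the listeners are no longer needed.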
To get the best results with page.evaluate, you need to focus on improving performance, reducing unnecessary context switching, and ensuring security. Here’s how you can fine-tune your browser automation workflows.
Running code efficiently within the page context saves time and system resources. Below are some techniques to make your scripts faster:
// Block unnecessary resources like images and stylesheets
await page.setRequestInterception(true);
page.on('request', request => {
if (['image', 'stylesheet'].includes(request.resourceType())) {
request.abort();
} else {
request.continue();
}
});
// Batch operations to reduce overhead, and keep the returned results
const products = await page.evaluate(() => {
const results = [];
document.querySelectorAll('.product-item').forEach(item => {
results.push({
title: item.querySelector('.title').textContent,
price: item.querySelector('.price').textContent,
stock: item.querySelector('.stock').dataset.value
});
});
return results;
});
Choosing the right selectors also plays a big role in performance:
Selector Type | Speed | Example |
---|---|---|
ID | Fastest | #main-content |
Class | Fast | .product-item |
Tag | Moderate | div > span |
Complex XPath | Slowest | //div[@class='wrapper']//span |
Context switching between Node.js and the browser environment can slow things down. Here's how to minimize it:
// Example of inefficient context switching
for (const item of items) {
await page.evaluate((i) => {
document.querySelector(`#item-${i}`).click();
}, item);
}
// Better: Batch operations in a single context switch
await page.evaluate((itemsList) => {
itemsList.forEach(i => {
document.querySelector(`#item-${i}`).click();
});
}, items);
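The difference between the two patterns comes down to how many round trips are made. The sketch below counts them with a fake page object so it runs without a browser. createFakePage and compareRoundTrips are hypothetical stand-ins, and the fake evaluate simply runs the function locally with JSON-copied arguments.

```javascript
// Hypothetical stand-in for a Puppeteer page: it counts calls and runs
// the function locally with JSON-copied arguments.
function createFakePage() {
  const page = {
    calls: 0,
    async evaluate(fn, ...args) {
      page.calls += 1;
      return fn(...JSON.parse(JSON.stringify(args)));
    },
  };
  return page;
}

async function compareRoundTrips(items) {
  // One evaluate call per item.
  const perItem = createFakePage();
  for (const item of items) {
    await perItem.evaluate(i => `#item-${i}`, item);
  }

  // One evaluate call for the whole list.
  const batched = createFakePage();
  await batched.evaluate(list => list.map(i => `#item-${i}`), items);

  return { perItem: perItem.calls, batched: batched.calls };
}

const roundTrips = compareRoundTrips([1, 2, 3, 4]);
roundTrips.then(counts => console.log(counts)); // { perItem: 4, batched: 1 }
```

With a real browser each of those calls also pays for serialization and a protocol message, so the gap widens as the list grows.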
If you need to process data in Node.js and pass it back to the browser, expose functions instead of repeatedly switching contexts:
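await page.exposeFunction('processData', async (data) => {
  // Process data in Node.js (placeholder transformation)
  const transformedData = data.trim();
  return transformedData;
});
await page.evaluate(async () => {
  // '#data' is an example selector for the element holding the raw data
  const documentData = document.querySelector('#data').textContent;
  const result = await window.processData(documentData);
  // Use the processed data in the browser
});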
Once performance and context switching are optimized, focus on keeping your scripts secure. Here are some best practices:
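// Always sanitize inputs before using them
// Assumption: sanitizeHtml is provided by the sanitize-html npm package
const sanitizeHtml = require('sanitize-html');
const sanitizedInput = sanitizeHtml(userInput);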
await page.evaluate((input) => {
document.querySelector('#search').value = input;
}, sanitizedInput);
// Use error handling for critical operations
try {
await page.evaluate(() => {
if (!window.__securityCheck) {
throw new Error('Security check failed');
}
// Continue with the operation
});
} catch (error) {
console.error('Security violation:', error);
}
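Passing input as an argument, as the first snippet does, keeps it as data. The risk is higher when a script is assembled as a string, because raw interpolation lets the input close the quote and run its own code. A minimal sketch of the difference, runnable in plain Node.js: buildLookupScript and runScript are hypothetical helpers, and new Function stands in for evaluating a script string in the page.

```javascript
// Hypothetical helper: builds a script string that calls lookup(...).
// JSON.stringify turns the input into a quoted, escaped JavaScript
// string literal, so the input cannot end the string and inject code.
function buildLookupScript(selector) {
  return `lookup(${JSON.stringify(selector)});`;
}

// Stand-in for running a script string in a page: the script is given a
// `lookup` function that records every value it is called with.
function runScript(script) {
  const received = [];
  const lookup = value => { received.push(value); };
  new Function('lookup', script)(lookup);
  return received;
}

const hostile = `x'); globalThis.__injected = true; lookup('y`;
const unsafeScript = `lookup('${hostile}');`;   // raw interpolation
const safeScript = buildLookupScript(hostile);  // escaped literal

runScript(unsafeScript);
const injectedByUnsafe = globalThis.__injected === true;
delete globalThis.__injected;

const receivedBySafe = runScript(safeScript);
const injectedBySafe = globalThis.__injected === true;

console.log({ injectedByUnsafe, injectedBySafe });
```

With raw interpolation the extra statement runs. With the escaped literal the whole input arrives as a single string. Preferring the function form of page.evaluate with arguments sidesteps the problem entirely.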
For Latenode workflows, consider these additional tips:
- Use userDataDir to cache resources and improve performance across sessions.
The page.evaluate method connects Node.js and browser contexts by sending a stringified JavaScript function to execute in the browser. This function operates independently of the Node.js environment, so you need to handle data transfer carefully.
Here's a common example for extracting data:
const data = await page.evaluate(async () => {
const results = document.querySelectorAll('.data-item');
return Array.from(results, item => ({
id: item.dataset.id,
value: item.textContent.trim()
}));
});
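Things to keep in mind:
- Arguments and return values must be JSON-serializable.
- Node.js variables are not available inside the evaluate context; pass them as arguments.
These basics lay the groundwork for using Puppeteer effectively. Additional tools can further streamline your automation tasks.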
Puppeteer offers several tools to expand the capabilities of page.evaluate:
Tool | Purpose | Best Use Case |
---|---|---|
page.evaluateHandle | Returns object references | Interacting with DOM elements directly |
page.exposeFunction | Makes Node.js functions usable in the browser | Managing complex server-side logic |
page.evaluateOnNewDocument | Runs scripts before a page loads | Preparing the browser environment in advance |
For example, exposing Node.js functions to the browser can simplify advanced data processing in workflows like those in Latenode. While page.evaluate works well for handling primitive types and JSON-serializable objects, page.evaluateHandle is essential for dealing with complex browser objects that can't be serialized.