What are some key benefits of using Browserless Chrome?

Key benefits include no local setup, fast performance, bot detection bypass, and resource efficiency, reducing proxy and virtual machine usage.

Which libraries does Browserless Chrome integrate with?

Browserless Chrome integrates with popular libraries like Puppeteer, Playwright, and Selenium, offering simple connection methods for each.

Browserless Chrome: A Powerful Tool for Browser Automation

Browserless Chrome is a cloud-based service that simplifies browser automation by running headless Chrome for tasks like web scraping, PDF generation, and testing. It eliminates the need for local browser setups and manages browser crashes, session isolation, and resource optimization automatically. Key benefits include:

No Local Setup: Use Docker or cloud deployment to get started quickly.
Fast Performance: Screenshots in ~1 second, PDFs in ~2 seconds.
Bot Detection Bypass: Overcomes blockers like Cloudflare for reliable automation.
Resource Efficiency: Reduces proxy and virtual machine usage by up to 90%.

Quick Setup: Integrate with popular libraries like Puppeteer, Playwright, or Selenium using simple connection methods. Browserless also offers APIs for web scraping, JavaScript rendering, and custom automation workflows.

Quick Comparison of Features

Feature	Benefit	Example
Session Cleanup	Removes inactive sessions automatically	Keeps resources optimized
Bot Detection Prevention	Bypasses blockers like Cloudflare	Reliable for scraping tasks
Multi-task Processing	Handles concurrent requests effectively	2M+ sessions processed weekly
Ready-to-use APIs	Simplifies automation tasks	JSON data extraction, PDFs

Browserless Chrome is ideal for developers and businesses looking to streamline automation without managing complex infrastructure.

Setup Guide

How to Install

To get started with Browserless Chrome, you can choose between two installation options: a local setup using Docker or a cloud deployment. For a local Docker setup, use the following command:

docker run -p 3000:3000 ghcr.io/browserless/chromium

This command pulls the latest image and makes it accessible on port 3000 ^[3]. It works seamlessly across Windows, macOS, and Linux.

Initial Configuration

Browserless Chrome includes several built-in features to simplify the setup process:

Feature	Default Setting	Purpose
Session Cleanup	30 seconds	Removes inactive sessions automatically
Health Checks	Enabled	Ensures system stability
Request Queue	Configurable	Manages multiple concurrent connections
Resource Limits	Adjustable	Controls memory and CPU usage

You can customize the environment by setting these variables:

MAX_CONCURRENT_SESSIONS=10
CONNECTION_TIMEOUT=30000
MAX_QUEUE_LENGTH=100

Once configured, you can connect to Browserless using your preferred integration method.
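As a rough illustration of how these settings reach the container, the sketch below assembles the arguments for a `docker run` call from a plain settings object. The helper name `buildDockerArgs` is hypothetical, and the variable names are simply the ones listed above; the names your Browserless version accepts may differ, so check them before relying on this.

```javascript
// Hypothetical helper: turns a settings object into `docker run` arguments.
// The variable names and default image are the ones used in this guide;
// confirm them against the Browserless version you actually deploy.
function buildDockerArgs(settings, { port = 3000, image = 'ghcr.io/browserless/chromium' } = {}) {
  const args = ['run', '-p', `${port}:3000`];
  for (const [name, value] of Object.entries(settings)) {
    // Each setting becomes one `-e NAME=value` pair.
    args.push('-e', `${name}=${value}`);
  }
  args.push(image);
  return args;
}

const args = buildDockerArgs({
  MAX_CONCURRENT_SESSIONS: 10,
  CONNECTION_TIMEOUT: 30000,
  MAX_QUEUE_LENGTH: 100,
});
console.log(['docker', ...args].join(' '));
```

Returning an array instead of one string makes it straightforward to pass the arguments to a process spawner without worrying about shell quoting.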

Making Your First Connection

Depending on the library you use, here’s how you can establish your first connection:

Puppeteer Connection

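import puppeteer from 'puppeteer-core';

// Connect to a remote browser instead of launching Chrome locally
const browser = await puppeteer.connect({
  browserWSEndpoint: 'wss://chrome.browserless.io?token=YOUR-API-TOKEN'
});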

Playwright Integration

const browser = await playwright.firefox.connect(
  `wss://production-sfo.browserless.io/firefox/playwright?token=${TOKEN}&proxy=residential`
);

REST API Access

curl -X POST https://chrome.browserless.io/content \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Basic YOUR-BASE64-TOKEN' \
  -d '{ "url": "https://example.com/"}'
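The same request can be made from Node.js. The sketch below mirrors the curl call, assuming Node.js 18 or later for the global `fetch`. The function names `buildContentRequest` and `fetchRenderedHtml` are hypothetical, and the Basic authorization header is copied from the curl example rather than checked against a particular account setup.

```javascript
// Builds the pieces of the /content request from the curl example without sending it.
// `basicToken` is a placeholder for the Base64 credential used in that example.
function buildContentRequest(basicToken, targetUrl) {
  return {
    url: 'https://chrome.browserless.io/content',
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Basic ${basicToken}`,
      },
      body: JSON.stringify({ url: targetUrl }),
    },
  };
}

// Sends the request with the global fetch available in Node.js 18+.
async function fetchRenderedHtml(basicToken, targetUrl) {
  const { url, init } = buildContentRequest(basicToken, targetUrl);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Browserless responded with HTTP ${response.status}`);
  }
  return response.text();
}
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without touching the network.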

Browserless V2 improves reliability with two regional endpoints: the US West Coast (production-sfo.browserless.io) and Europe (production-lon.browserless.io). These endpoints handle session isolation, manage concurrent requests, and recover from crashes automatically. Every new session gets a fresh browser instance, and inactive sessions are cleaned up after 30 seconds ^[4].
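One way to keep the regional choice in a single place is a small lookup helper. This is a minimal sketch: the host names are the two given above, while the short region keys (`sfo`, `lon`) and the function name `buildWsEndpoint` are made up for illustration.

```javascript
// Host names come from the text above; the short region keys are invented here.
const REGION_HOSTS = {
  sfo: 'production-sfo.browserless.io',
  lon: 'production-lon.browserless.io',
};

function buildWsEndpoint(region, token) {
  if (!Object.hasOwn(REGION_HOSTS, region)) {
    throw new Error(`Unknown region: ${region}`);
  }
  const endpoint = new URL(`wss://${REGION_HOSTS[region]}`);
  // searchParams percent-encodes the token if it contains special characters.
  endpoint.searchParams.set('token', token);
  return endpoint.toString();
}

console.log(buildWsEndpoint('lon', 'YOUR_API_TOKEN_HERE'));
```

The resulting string can be passed as `browserWSEndpoint` to `puppeteer.connect()`.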

Main Features

Headless Browser Basics

Browserless Chrome operates without a graphical interface, running in a headless mode. It automatically starts new browser instances for incoming requests, ensuring efficient resource use.

Here’s a quick overview of its key features:

Feature	Description	Benefit
Session Isolation	Independent browser sessions	Lowers infrastructure costs
Automatic Recovery	Restarts after crashes	Keeps operations running
Resource Optimization	Efficient use of memory and CPU	Boosts overall performance

Beyond these essentials, Browserless is designed to handle multiple tasks at the same time with ease.

Multi-task Processing

With over 2 million sessions handled, Browserless Chrome has generated millions of screenshots, PDFs, and test results ^[5]. Its smart queue management system ensures requests are processed without overloading resources, maintaining consistent performance. This has proven especially useful for companies like Samsara, which switched from an in-house testing service to Browserless for better scalability.
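To make the idea of queue management concrete, here is a toy limiter with a concurrency cap and a bounded queue. It is an illustrative sketch only, not Browserless's implementation; the class name `SessionLimiter` and its behavior when the queue is full (rejecting the request) are assumptions chosen for simplicity.

```javascript
// Illustrative only: a tiny limiter with a concurrency cap and a bounded queue.
class SessionLimiter {
  constructor(maxConcurrent, maxQueueLength) {
    this.maxConcurrent = maxConcurrent;
    this.maxQueueLength = maxQueueLength;
    this.active = 0;
    this.queue = [];
  }

  // Runs `task` (a function returning a promise) now, queues it, or rejects it.
  submit(task) {
    return new Promise((resolve, reject) => {
      const job = { task, resolve, reject };
      if (this.active < this.maxConcurrent) {
        this.start(job);
      } else if (this.queue.length < this.maxQueueLength) {
        this.queue.push(job);
      } else {
        reject(new Error('Queue is full'));
      }
    });
  }

  start(job) {
    this.active += 1;
    Promise.resolve()
      .then(job.task)
      .then(job.resolve, job.reject)
      .finally(() => {
        // Free the slot, then hand it to the oldest queued job, if any.
        this.active -= 1;
        const next = this.queue.shift();
        if (next) this.start(next);
      });
  }
}
```

With `new SessionLimiter(10, 100)`, at most ten tasks run at once, up to one hundred wait in line, and anything beyond that is rejected immediately instead of piling up.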

"Browserless boasts a range of features designed to simplify and accelerate web browser automation tasks. With its robust API and ability to handle parallel operations, Browserless stands out as a leader in the automation space." – Elest.io

Browserless doesn’t just excel at multitasking - it also simplifies automation workflows with ready-to-use APIs.

Ready-to-use API Functions

Browserless offers APIs tailored for common automation needs, enhancing its core functionality:

Web Scraping API: Extracts structured JSON data from webpage elements.
Unblock API: Fetches HTML content after running JavaScript.
Function API: Executes custom Puppeteer code with ESM module imports.

These APIs have delivered real-world results:

"We started using another scraping company's headless browsers to run Puppeteer scripts. But, it required a Vercel upgrade due to slow fetch times, and the proxies weren't running correctly. I found Browserless and had our Puppeteer code running within an hour. The scrapes are now 5x faster and 1/3rd of the price, plus the support has been excellent." – Nicklas Smit, Full-Stack Developer, Takeoff Copenhagen ^[2]

"We built a scraping tool to train our chatbots on public website data, but it quickly got complicated due to edge cases and bot detection. I found Browserless and set aside a day for the integration, but it only took a couple of hours. I didn't need to become an expert in managing proxy servers or virtual computers, so now I can stay focused on core parts of the business." – Mike Heap, Founder, My AskAI ^[2]

Library Integration Guide

Browserless Chrome works seamlessly with major automation libraries, offering performance and reliability. Here's how you can integrate it with some of the most popular tools.

Puppeteer Integration

Switching to Browserless in Puppeteer is simple - just replace puppeteer.launch() with puppeteer.connect() ^[6].

Setup Type	Code Structure	Notes
Traditional Puppeteer	Uses `puppeteer.launch()`	Consumes local resources
Browserless Puppeteer	Uses `puppeteer.connect()`	Optimized for the cloud
Enhanced Browserless	Custom launch arguments	Advanced configurations

You can also pass custom launch arguments via the WebSocket endpoint:

const launchArgs = JSON.stringify({
  args: ['--window-size=1920,1080'],
  stealth: true,
  timeout: 5000
});
const browser = await puppeteer.connect({
  browserWSEndpoint: `wss://production-sfo.browserless.io/?token=YOUR_API_TOKEN_HERE&launch=${launchArgs}`
});

This setup supports advanced configurations while maintaining simplicity.
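Because the launch options are JSON placed inside a URL, it can be safer to let the `URL` API handle the encoding. The sketch below is one way to do that; `buildLaunchEndpoint` is a hypothetical helper, and it assumes the server decodes the query string in the standard way before parsing the `launch` value.

```javascript
// Builds a connection URL whose `launch` parameter carries JSON-encoded options.
function buildLaunchEndpoint(baseUrl, token, launchOptions) {
  const endpoint = new URL(baseUrl);
  endpoint.searchParams.set('token', token);
  // searchParams percent-encodes the quotes, braces, and commas in the JSON.
  endpoint.searchParams.set('launch', JSON.stringify(launchOptions));
  return endpoint.toString();
}

const launchEndpoint = buildLaunchEndpoint(
  'wss://production-sfo.browserless.io/',
  'YOUR_API_TOKEN_HERE',
  { args: ['--window-size=1920,1080'], stealth: true, timeout: 5000 }
);
console.log(launchEndpoint);
```

Reading the parameter back with `new URL(launchEndpoint).searchParams.get('launch')` returns the original JSON string, which is a quick way to confirm nothing was mangled.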

Playwright Integration

Browserless works equally well with Playwright. Here's an example of how to connect using Firefox:

// Firefox implementation with Playwright Protocol
const browser = await playwright.firefox.connect(
  'wss://production-sfo.browserless.io/firefox/playwright?token=YOUR_API_TOKEN_HERE'
);

For developers using Python, Browserless ensures a consistent experience:

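from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.firefox.connect('wss://production-sfo.browserless.io/firefox/playwright?token=YOUR_API_TOKEN_HERE')
    context = browser.new_context()
    browser.close()  # release the remote session when finished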

This cross-language compatibility makes it easy to integrate Browserless into various workflows.

Selenium Integration

For Selenium, use the following Ruby configuration to connect to Browserless:

require "selenium-webdriver"

# Chrome flags commonly used when the browser runs inside a container
caps = Selenium::WebDriver::Remote::Capabilities.chrome("goog:chromeOptions" => {
  "args" => [
    "--disable-dev-shm-usage",
    "--disable-extensions",
    "--headless",
    "--no-sandbox"
  ]
})

You can establish the WebDriver connection using a simple URL format:

driver = Selenium::WebDriver.for :remote, 
  url: "https://[email protected]/webdriver",
  desired_capabilities: caps

This setup runs Chrome headless with flags suited to containerized environments. Note that --no-sandbox turns Chrome's own sandbox off, so isolation has to come from the container or the hosted service instead. Always close browser instances after use to avoid memory leaks and free up resources.

Performance Tips

When working with Browserless Chrome, managing performance is key to maintaining efficiency. With the platform handling nearly 5 million headless sessions weekly ^[8], careful resource and security management is essential for smooth operations at this scale.

Resource Management

Efficiently managing resources starts with how browser instances are handled. Instead of creating a new instance for every task, reuse existing instances to cut down on the overhead of starting new sessions:

const browser = await puppeteer.connect({ 
  browserWSEndpoint: 'wss://chrome.browserless.io?token=YOUR-TOKEN' 
});
// Reuse the instance by disconnecting instead of closing
await browser.disconnect();

Another effective tactic is blocking unnecessary assets to reduce resource use. Here's a breakdown:

Resource Type	Impact on Performance	Recommended Action
Images	Consumes high bandwidth	Block using `page.setRequestInterception()`
CSS Files	Uses extra memory	Disable unless critical for layout
Fonts	Slows loading	Block external font requests

For instance, Browserless.io reported in September 2024 that blocking these resources reduced execution time from 2,114 ms to 1,676 ms, an improvement of roughly 21% ^[10].
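A minimal sketch of such a filter follows, assuming a Puppeteer page. The handler relies on Puppeteer's `request.resourceType()`, `request.abort()`, and `request.continue()` methods; the blocked set simply mirrors the table above and should be adjusted if a page needs its CSS to render correctly.

```javascript
// Resource types to drop, matching the table above (images, CSS, fonts).
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font']);

// Request handler: abort blocked resource types, let everything else through.
function handleRequest(request) {
  if (BLOCKED_TYPES.has(request.resourceType())) {
    return request.abort();
  }
  return request.continue();
}

// With a connected Puppeteer page:
//   await page.setRequestInterception(true);
//   page.on('request', handleRequest);
```

Once interception is enabled, every request must be either aborted or continued, which is why the handler always ends in one of the two calls.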

Handling High Traffic

Once resources are optimized, the next step is managing high traffic effectively. Horizontal scaling is more reliable than depending on a few large instances.

"Chrome is really really good at using full system resources, and loves to use equal parts CPU and memory for most things" ^[8]

To handle high-volume demands, consider these strategies:

Use Nginx for load balancing across multiple smaller Browserless instances.
Enable pre-request health checks with PRE_REQUEST_HEALTH_CHECK=true and limit concurrent sessions using MAX_CONCURRENT_SESSIONS=10.
Ensure proper process termination to avoid lingering "zombie" processes.

"Regardless of where or how you're running your headless sessions, it's important to kill Chrome with the fire of thousand suns" ^[8]
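On the client side, one common way to avoid leaving sessions behind is to wrap every job in `try`/`finally`. The helper below is a sketch of that pattern; `withBrowser` is a hypothetical name, and `connect` stands for whatever function you use to obtain a browser, such as a call to `puppeteer.connect()`.

```javascript
// `connect` is any function that resolves to a browser-like object with close().
async function withBrowser(connect, task) {
  const browser = await connect();
  try {
    return await task(browser);
  } finally {
    // Runs on success and on failure, so the session is always released.
    await browser.close();
  }
}

// Example usage with Puppeteer (not executed here):
//   const title = await withBrowser(
//     () => puppeteer.connect({ browserWSEndpoint }),
//     async (browser) => {
//       const page = await browser.newPage();
//       await page.goto('https://example.com/');
//       return page.title();
//     }
//   );
```

If you deliberately keep a browser alive for reuse, swap `close()` for `disconnect()` and make sure something else is responsible for shutting it down eventually.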

Security Setup

A secure setup not only protects your data but also ensures consistent performance under heavy loads. Here's how to secure your deployment:

Store API keys as hashed values in secure environments.
Use IP restrictions to control access.
Enable role-based access management.
Apply rate limiting to API endpoints.

For Docker deployments, set resource limits to avoid overloading:

docker run -e MAX_CONCURRENT_SESSIONS=10 \
    -e CONNECTION_TIMEOUT=30000 \
    --memory=2g \
    --cpu-shares=1024 \
    browserless/chrome

For handling untrusted code, the vm2 module can be used to create isolated environments, which helps limit CPU-intensive attacks. Separately, Browserless.io has used dumb-init inside its Docker containers since March 5, 2018 to manage process termination effectively ^[9].

Summary

Browserless Chrome simplifies automation by taking over infrastructure work that previously consumed around 60% of developers' time. By isolating Chrome from core services, it improves load balancing, scalability, and error management. One notable example is Samsara, which revamped its Puppeteer-based testing by removing the burden of maintaining specialized infrastructure. This allowed its engineers to focus on building their core product instead of on backend operations ^[1].

Here’s a snapshot of what makes Browserless Chrome a game-changer:

Feature	Business Impact
Infrastructure Separation	Prevents Chrome-related issues from disrupting the entire service ^[11]
Built-in Load Balancing	Allows for effortless scaling without extra setup
Bot Detection Avoidance	Boosts success rates for web automation tasks ^[1]
REST API Integration	Makes tasks like PDF creation and screenshot generation much easier ^[1]

These features make switching to Browserless Chrome a practical and efficient choice for automation needs.

Getting Started Steps

Want to integrate Browserless Chrome into your workflow? Here’s how you can get started:

Choose an Integration Method: Before diving in, test the functionality with the online debugger. Then, decide between Puppeteer, Playwright, or Selenium based on your current tools ^[7].
Update Your Setup: Replace your local Puppeteer launch by connecting to Browserless. Simply update your code to use puppeteer.connect() with your Browserless endpoint.
Track Performance: Use Browserless's built-in tools like health checks and queue metrics to keep an eye on performance ^[1].