Turning HTML into PDFs is crucial for creating standardized documents like reports, invoices, and client materials. Puppeteer, a browser automation tool, helps you manage styles, layouts, and page breaks for professional PDF output. Here's a quick overview of what you can do with Puppeteer:
Quick Start: Install Puppeteer with npm install puppeteer, load your HTML (as a string, local file, or URL), and configure PDF settings like dimensions, margins, and background rendering. Use @media print CSS rules for better control over print styles.
Key Features: control over page size and margins with @page rules.
With Puppeteer, you can automate and customize PDF generation for consistent, high-quality results.
Learn how to set up and use Puppeteer to generate PDFs. Follow these steps to get started.
Before you begin, make sure you have Node.js version 14.0.0 or higher installed on your system. Here's how to set everything up:
Initialize a project with npm init -y.
Run npm install puppeteer to add Puppeteer to your project.
Here’s a basic script to convert HTML into a PDF using Puppeteer:
const puppeteer = require('puppeteer');

async function generatePDF() {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  // Set page content
  await page.setContent(`
    <html>
      <body>
        <h1>Sample PDF Document</h1>
        <p>Generated with Puppeteer</p>
      </body>
    </html>
  `);

  // Generate PDF
  await page.pdf({
    path: 'output.pdf',
    format: 'Letter',
    margin: {
      top: '1in',
      right: '1in',
      bottom: '1in',
      left: '1in'
    }
  });

  await browser.close();
}

generatePDF();
Save this script as generate-pdf.js. Run it by typing node generate-pdf.js in your terminal. The script will create a PDF with US Letter dimensions (8.5×11 inches) and 1-inch margins.
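The basic script above can be wrapped in a reusable function. The following is a minimal sketch, assuming Puppeteer is installed locally; `escapeHtml`, `renderInvoiceHtml`, `htmlToPdf`, and the invoice fields are hypothetical names introduced for illustration. It closes the browser in a `finally` block so that a failed render does not leave a Chromium process running.

```javascript
// Escape text before placing it in HTML so data values cannot break the markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical template: builds a small HTML document from plain data.
function renderInvoiceHtml({ number, customer }) {
  return `<html>
  <body>
    <h1>Invoice ${escapeHtml(number)}</h1>
    <p>Billed to: ${escapeHtml(customer)}</p>
  </body>
</html>`;
}

// Renders an HTML string to a PDF file and always closes the browser.
async function htmlToPdf(html, outputPath) {
  const puppeteer = require('puppeteer'); // loaded lazily; assumes `npm install puppeteer`
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.setContent(html);
    await page.pdf({ path: outputPath, format: 'Letter' });
  } finally {
    await browser.close();
  }
}

// Usage (hypothetical data):
// htmlToPdf(renderInvoiceHtml({ number: 'INV-001', customer: 'Acme & Co' }), 'invoice.pdf')
//   .catch(console.error);
```

Keeping the HTML builder separate from the rendering step makes the template easy to check without launching a browser.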
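Puppeteer provides multiple ways to load HTML content for PDF generation:
// From an HTML string
await page.setContent(htmlString);
// From a local file (requires const path = require('path'); at the top of the script)
await page.goto(`file:${path.join(__dirname, 'template.html')}`);
// From a URL
await page.goto('https://yourwebsite.com/page-to-convert');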
When working with external resources like images or styles, make sure they are embedded, use absolute URLs, or are stored locally.
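One way to embed an image, as suggested above, is to inline it as a base64 data URI so the HTML no longer depends on a separate file or request. This is a minimal sketch using only Node.js built-ins; the helper names, the MIME table, and the `logo.png` file in the usage comment are assumptions made for illustration.

```javascript
const fs = require('fs');
const path = require('path');

// Map of file extensions to MIME types; extend as needed.
const MIME_TYPES = {
  '.png': 'image/png',
  '.jpg': 'image/jpeg',
  '.jpeg': 'image/jpeg',
  '.svg': 'image/svg+xml',
};

// Encode raw bytes as a data URI that can be used directly in a src attribute.
function toDataUri(buffer, mimeType) {
  return `data:${mimeType};base64,${buffer.toString('base64')}`;
}

// Read a local image and return it as a data URI.
function imageFileToDataUri(filePath) {
  const mimeType = MIME_TYPES[path.extname(filePath).toLowerCase()];
  if (!mimeType) {
    throw new Error(`Unsupported image type: ${filePath}`);
  }
  return toDataUri(fs.readFileSync(filePath), mimeType);
}

// Usage with a hypothetical logo.png next to the script:
// const logo = imageFileToDataUri(path.join(__dirname, 'logo.png'));
// await page.setContent(`<img src="${logo}" alt="Logo">`);
```

Inlining suits small assets such as logos; large images increase the size of the HTML string by roughly a third because of base64 encoding.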
To ensure smooth PDF generation, keep these pointers in mind:
Use page.waitForNetworkIdle() to wait for all network requests to finish.
Once your HTML is ready, you can move on to customizing the PDF’s styles and settings.
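Waiting for the network to go idle can be combined with checks inside the page itself. The sketch below is one possible approach, not a required pattern: `waitUntilReady` is a hypothetical helper, the timeout values are illustrative, and it assumes `page` is a Puppeteer Page whose content has already been loaded.

```javascript
// Waits until the page is likely ready for printing.
// `page` is a Puppeteer Page; the default values are illustrative.
async function waitUntilReady(page, { idleTime = 500, timeout = 30000 } = {}) {
  // Wait until there has been no network activity for `idleTime` milliseconds.
  await page.waitForNetworkIdle({ idleTime, timeout });

  // Inside the browser: wait for web fonts, then for every <img>
  // to either finish loading or fail.
  await page.evaluate(async () => {
    await document.fonts.ready;
    const pending = Array.from(document.images)
      .filter((img) => !img.complete)
      .map((img) => new Promise((resolve) => {
        img.addEventListener('load', resolve, { once: true });
        img.addEventListener('error', resolve, { once: true });
      }));
    await Promise.all(pending);
  });
}

// Usage:
// await page.setContent(htmlString);
// await waitUntilReady(page);
// await page.pdf({ path: 'output.pdf' });
```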
To tailor your content for PDF output, use @media print rules. Here's an example:
@media print {
  /* Hide navigation menus and non-essential elements */
  nav, button, .no-print {
    display: none;
  }

  /* Adjust text for better readability in PDFs */
  body {
    font-size: 12pt;
    line-height: 1.5;
  }

  /* Ensure accurate background rendering */
  * {
    -webkit-print-color-adjust: exact;
  }
}
If you want to keep your screen-based styles instead of applying print-specific styles, include this line before generating the PDF:
await page.emulateMediaType('screen');
Once print styles are applied, you can move on to layout adjustments.
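When the print rules are not already part of the HTML, they can be injected at render time. The following sketch assumes `page` is a Puppeteer Page; `prepareStyles` and `PRINT_CSS` are hypothetical names. Note that if you emulate the screen media type, rules inside `@media print` no longer match, so the two options should be chosen together.

```javascript
// Print rules to inject; a shortened version of the @media print block above.
const PRINT_CSS = `
@media print {
  nav, button, .no-print { display: none; }
  body { font-size: 12pt; line-height: 1.5; }
}`;

// Adds extra CSS to the page and selects the media type used for rendering.
// `media` must be 'print' or 'screen'.
async function prepareStyles(page, { css = PRINT_CSS, media = 'print' } = {}) {
  if (media !== 'print' && media !== 'screen') {
    throw new Error(`Unsupported media type: ${media}`);
  }
  await page.addStyleTag({ content: css }); // appends a <style> element to the page
  await page.emulateMediaType(media);
}

// Usage:
// await prepareStyles(page);                       // print rules apply
// await prepareStyles(page, { media: 'screen' });  // screen styles are kept instead
```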
Define PDF dimensions using Puppeteer options or CSS @page rules. For Puppeteer, you can use the following configuration:
await page.pdf({
  format: 'Letter',
  margin: {
    top: '0.75in',
    right: '0.5in',
    bottom: '0.75in',
    left: '0.5in'
  },
  landscape: false,
  preferCSSPageSize: true
});
For more customized page sizes, rely on CSS @page rules. Because preferCSSPageSize is set to true above, a size declared in @page takes priority over the format option:
@page {
  size: 8.5in 11in;
  margin: 0.75in 0.5in;
}
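If the page size varies per document, the @page rule can be generated in code and injected before printing. This is a sketch under stated assumptions: `buildPageRule` and `pdfWithPageRule` are hypothetical helpers, and the length check is deliberately strict (every value needs a unit) to keep the example short.

```javascript
// Builds a CSS @page rule from a size and margin given as CSS lengths.
// In this sketch every length must carry a unit, for example '0.5in' or '10mm'.
function buildPageRule({ width, height, margin }) {
  const length = /^\d+(\.\d+)?(in|cm|mm|px|pt)$/;
  for (const value of [width, height, ...margin.split(' ')]) {
    if (!length.test(value)) {
      throw new Error(`Invalid CSS length: ${value}`);
    }
  }
  return `@page { size: ${width} ${height}; margin: ${margin}; }`;
}

// Injects the rule and lets the CSS page size take priority over `format`.
// `page` is a Puppeteer Page.
async function pdfWithPageRule(page, pageOptions, outputPath) {
  await page.addStyleTag({ content: buildPageRule(pageOptions) });
  return page.pdf({ path: outputPath, preferCSSPageSize: true });
}

// Usage:
// await pdfWithPageRule(page, { width: '8.5in', height: '11in', margin: '0.75in 0.5in' }, 'out.pdf');
```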
After setting up the layout, you can fine-tune the design elements for a polished look.
To make the content visually clear and professional, use these CSS rules:
body {
  font-family: 'Arial', sans-serif;
  color: #333333;
}

h1, h2, h3 {
  page-break-after: avoid;
  color: #000000;
}

table {
  width: 100%;
  border-collapse: collapse;
  page-break-inside: avoid;
}

img {
  max-width: 100%;
  height: auto;
  page-break-inside: avoid;
}
For consistent background colors, especially in critical sections, add this rule:
.color-critical {
  -webkit-print-color-adjust: exact;
}
These adjustments ensure your PDF is easy to read and visually appealing.
Managing page breaks effectively ensures your content flows smoothly across pages. Use these CSS properties to control where content divides:
/* Start new page before chapters */
.chapter {
  page-break-before: always;
}

/* Keep headings together with their content */
h2, h3 {
  page-break-after: avoid;
}

/* Avoid splitting tables or figures */
table, figure {
  page-break-inside: avoid;
}
These rules help keep your document organized and easy to read. Once you’ve set up page breaks, focus on configuring headers and footers to align with these settings.
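To show where these classes end up in the markup, here is a small sketch that builds a document with one `<section class="chapter">` per chapter and embeds the break rules from above. The function names and the shape of the `chapters` array are assumptions for illustration.

```javascript
// Escape text before placing it in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Break rules from the CSS above: chapters start on a new page,
// and headings stay attached to the text that follows them.
const BREAK_CSS = `
.chapter { page-break-before: always; }
h2, h3 { page-break-after: avoid; }`;

// Builds a document in which every chapter is a <section class="chapter">.
function renderChapters(chapters) {
  const sections = chapters
    .map(({ title, paragraphs }) => {
      const body = paragraphs.map((p) => `<p>${escapeHtml(p)}</p>`).join('\n');
      return `<section class="chapter">\n<h2>${escapeHtml(title)}</h2>\n${body}\n</section>`;
    })
    .join('\n');
  return `<html><head><style>${BREAK_CSS}</style></head><body>\n${sections}\n</body></html>`;
}

// Usage:
// const html = renderChapters([{ title: 'Overview', paragraphs: ['First paragraph.'] }]);
// await page.setContent(html);
```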
Set up headers and footers in Puppeteer to give your PDF a professional look:
await page.pdf({
  displayHeaderFooter: true,
  headerTemplate: `
    <div style="font-size: 10px; padding: 0 0.5in; width: 100%;">
      <span class="title"></span>
      <span class="date" style="float: right;"></span>
    </div>
  `,
  footerTemplate: `
    <div style="font-size: 10px; text-align: center; width: 100%;">
      Page <span class="pageNumber"></span> of <span class="totalPages"></span>
    </div>
  `,
  margin: {
    top: '1in',
    bottom: '1in'
  }
});
Make sure to adjust the margins so the header and footer fit properly without overlapping your content.
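Header and footer templates are rendered separately from the page, so the page's own styles do not reach them and sizes need to be set inline. The sketch below builds a footer with a custom label; `buildFooterTemplate`, `pdfOptionsWithFooter`, and the margin values are hypothetical choices for illustration.

```javascript
// Escape text before placing it in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Builds a footer template. Puppeteer fills in elements that carry the
// classes `pageNumber` and `totalPages`; the font size is set inline.
function buildFooterTemplate(label) {
  return `<div style="font-size: 10px; text-align: center; width: 100%;">
  ${escapeHtml(label)} | Page <span class="pageNumber"></span> of <span class="totalPages"></span>
</div>`;
}

// Builds page.pdf() options with a bottom margin that leaves room for the footer.
function pdfOptionsWithFooter(label, outputPath) {
  return {
    path: outputPath,
    displayHeaderFooter: true,
    headerTemplate: '<span></span>', // empty header in place of the default one
    footerTemplate: buildFooterTemplate(label),
    margin: { top: '0.5in', bottom: '1in' },
  };
}

// Usage:
// await page.pdf(pdfOptionsWithFooter('Quarterly Report', 'report.pdf'));
```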
With page breaks and headers/footers in place, focus on managing content across multiple pages. Proper layout control ensures your document remains clear and professional:
/* Keep captions with their images */
figure {
  display: table;
  page-break-inside: avoid;
}

figcaption {
  display: table-caption;
  caption-side: bottom;
}

/* Avoid splitting list items or table rows */
li, .table-row {
  page-break-inside: avoid;
}

/* Allow large tables to break across pages */
.table-wrapper {
  page-break-inside: auto;
}
For large tables that span multiple pages, wrap them in a container allowing breaks while keeping rows intact. This ensures data remains easy to follow, even in lengthy datasets.
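As a sketch of the wrapper approach, the helper below renders rows into a table inside a `.table-wrapper` container and keeps each row intact. It assumes the table itself is not also marked `page-break-inside: avoid`, since that would work against letting it break. The header cells go in a `<thead>`, which Chromium generally repeats on each page of a split table; treat that as behavior to verify in your setup. `renderTable` and `TABLE_CSS` are hypothetical names.

```javascript
// Escape text before placing it in HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// The wrapper may break across pages; individual rows may not.
const TABLE_CSS = `
.table-wrapper { page-break-inside: auto; }
tr { page-break-inside: avoid; }`;

// Renders column names and row arrays into a wrapped HTML table.
function renderTable(columns, rows) {
  const head = columns.map((c) => `<th>${escapeHtml(c)}</th>`).join('');
  const body = rows
    .map((row) => `<tr>${row.map((cell) => `<td>${escapeHtml(cell)}</td>`).join('')}</tr>`)
    .join('\n');
  return `<style>${TABLE_CSS}</style>
<div class="table-wrapper">
<table>
<thead><tr>${head}</tr></thead>
<tbody>
${body}
</tbody>
</table>
</div>`;
}

// Usage:
// await page.setContent(renderTable(['Item', 'Qty'], [['Widget', 2], ['Gadget', 5]]));
```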
Tip: Enable the printBackground option in Puppeteer to render all visual elements, including background colors and images:
await page.pdf({
  printBackground: true,
  preferCSSPageSize: true
});
Improving PDF quality and performance requires attention to scaling, image handling, and resource management. These steps ensure the final document looks polished and functions efficiently.
Scaling content correctly ensures it remains readable and consistent in design. Puppeteer offers detailed scaling controls for rendering PDFs:
await page.pdf({
  scale: 0.8,
  preferCSSPageSize: true,
  format: 'Letter'
});
Here, values below 1 shrink content, while values above 1 enlarge it. Pairing scaling with preferCSSPageSize ensures the PDF adheres to CSS-defined dimensions:
@page {
  size: 8.5in 11in;
  margin: 0.5in;
}
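Puppeteer documents the scale option as a value between 0.1 and 2, so a computed scale is worth clamping before it is passed to page.pdf(). The sketch below shows one way to do that; `clampScale` and `scaleToFit` are hypothetical helpers, and the fit calculation is a rough approximation that assumes widths measured in CSS pixels at 96 pixels per inch.

```javascript
// Documented bounds for the `scale` option of page.pdf().
const MIN_SCALE = 0.1;
const MAX_SCALE = 2;

// Keeps a scale value inside the supported range.
function clampScale(scale) {
  if (typeof scale !== 'number' || Number.isNaN(scale)) {
    throw new TypeError('scale must be a number');
  }
  return Math.min(MAX_SCALE, Math.max(MIN_SCALE, scale));
}

// Estimates a scale that fits content of `contentWidthPx` into `printableWidthPx`.
// Example: Letter width (8.5in) minus two 0.5in margins is 7.5in, or 720px.
function scaleToFit(contentWidthPx, printableWidthPx) {
  return clampScale(printableWidthPx / contentWidthPx);
}

// Usage:
// await page.pdf({ format: 'Letter', scale: scaleToFit(1200, 720) });
```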
Choosing the right image format is crucial. PNG works well for detailed visuals like charts and logos but can increase file size. JPEG is a better option for photos, while WebP often gets converted, potentially inflating the file size further.
To improve image clarity, increase the device scale factor:
await page.setViewport({
  width: 1200,
  height: 800,
  deviceScaleFactor: 2
});
Addressing common challenges like resource management, file size, and errors can significantly boost performance.
// Resource management: launch one browser and reuse a single page for every job.
// `requests` is an array of option objects; generatePDF is defined in the last snippet below.
const browser = await puppeteer.launch({
  args: ['--no-sandbox', '--disable-setuid-sandbox']
});
const page = await browser.newPage();
for (const request of requests) {
  await generatePDF(page, request);
}
await browser.close();

// File size: remove non-print elements and adjust image loading before rendering.
await page.evaluate(() => {
  document.querySelectorAll('.no-print').forEach(el => el.remove());
  document.querySelectorAll('img').forEach(img => {
    img.loading = 'lazy';
    img.decoding = 'async';
  });
});

// Error handling: log the failure and rethrow so the caller can react.
const generatePDF = async (page, options) => {
  try {
    await page.goto(options.url, {
      waitUntil: 'networkidle0',
      timeout: 30000
    });
    return await page.pdf(options);
  } catch (error) {
    console.error('PDF generation failed:', error);
    throw error;
  }
};
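A variation on the snippets above is to reuse one browser but open a fresh page per job, so that a failed render cannot leave state behind for the next one. This is a sketch rather than a recommended architecture: `generateBatch` and the `{ url, pdf }` job shape are assumptions, and `browser` is expected to be a Puppeteer Browser.

```javascript
// Renders each job on its own page while reusing one browser instance.
// Each job is { url, pdf }, where `pdf` holds page.pdf() options (hypothetical shape).
async function generateBatch(browser, jobs) {
  const results = [];
  for (const job of jobs) {
    const page = await browser.newPage();
    try {
      await page.goto(job.url, { waitUntil: 'networkidle0', timeout: 30000 });
      await page.pdf(job.pdf);
      results.push({ url: job.url, ok: true });
    } catch (error) {
      // Record the failure and continue with the remaining jobs.
      results.push({ url: job.url, ok: false, error: error.message });
    } finally {
      await page.close(); // release the tab even when rendering failed
    }
  }
  return results;
}

// Usage:
// const browser = await puppeteer.launch();
// const results = await generateBatch(browser, [
//   { url: 'https://yourwebsite.com/page-to-convert', pdf: { path: 'page.pdf', format: 'Letter' } },
// ]);
// await browser.close();
```

Collecting results instead of throwing lets one bad URL fail without stopping the whole batch; the caller can then log or retry the entries where `ok` is false.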
Using Puppeteer to convert HTML to PDF provides effective tools for creating professional-grade documents.
Apply print styles with page.emulateMediaType('print').
Use page-break-inside: avoid to ensure elements such as table rows stay intact.
These techniques build on earlier styling and layout methods, serving as a solid base for more advanced automation.
You can take PDF generation further with these additional automation features:
When deploying these methods in production, include error handling and logging to maintain consistent and reliable PDF outputs.