Converting HTML to PDF with Puppeteer: Style Configuration and Pagination
March 25, 2025
George Miloradovich
Researcher, Copywriter & Usecase Interviewer
Turning HTML into PDFs is crucial for creating standardized documents like reports, invoices, and client materials. Puppeteer, a browser automation tool, helps you manage styles, layouts, and page breaks for professional PDF output. Here's a quick overview of what you can do with Puppeteer:

  • Generate PDFs: Use Puppeteer to convert HTML into polished PDFs while running JavaScript and applying custom CSS.
  • Control Styles: Define page sizes, margins, fonts, headers, footers, and more using print-specific CSS.
  • Manage Page Breaks: Use CSS rules to avoid splitting tables, headings, or images across pages.
  • Optimize Performance: Improve quality and reduce file size with scaling, image optimization, and efficient resource handling.

Quick Start: Install Puppeteer with npm install puppeteer, load your HTML (as a string, local file, or URL), and configure PDF settings like dimensions, margins, and background rendering. Use @media print CSS rules for better control over print styles.

Key Features:

  • Page customization with @page rules.
  • Header/footer templates for professional layouts.
  • Multi-page content management to avoid awkward splits in tables or text.

With Puppeteer, you can automate and customize PDF generation for consistent, high-quality results.

Getting Started with Puppeteer

Learn how to set up and use Puppeteer to generate PDFs. Follow these steps to get started.

Setup

Before you begin, make sure a supported Node.js release is installed. Recent Puppeteer releases require Node.js 18 or later, so check the requirements of the version you install. Here's how to set everything up:

  • Install Node.js: Download it from nodejs.org and complete the installation.
  • Create a project folder: Make a new folder for your project.
  • Initialize the project: Open a terminal in your project folder and run npm init -y.
  • Install Puppeteer: Use the command npm install puppeteer to add Puppeteer to your project.

First PDF Generation Script

Here’s a basic script to convert HTML into a PDF using Puppeteer:

const puppeteer = require('puppeteer');

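async function generatePDF() {
  const browser = await puppeteer.launch();

  try {
    const page = await browser.newPage();

    // Set page content
    await page.setContent(`
      <html>
        <body>
          <h1>Sample PDF Document</h1>
          <p>Generated with Puppeteer</p>
        </body>
      </html>
    `);

    // Generate PDF
    await page.pdf({
      path: 'output.pdf',
      format: 'Letter',
      margin: {
        top: '1in',
        right: '1in',
        bottom: '1in',
        left: '1in'
      }
    });
  } finally {
    // Close the browser even if PDF generation fails
    await browser.close();
  }
}

// Report failures instead of leaving an unhandled promise rejection
generatePDF().catch((error) => {
  console.error('PDF generation failed:', error);
  process.exitCode = 1;
});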

Save this script as generate-pdf.js. Run it by typing node generate-pdf.js in your terminal. The script will create a PDF with US Letter dimensions (8.5×11 inches) and 1-inch margins.

HTML Source Options

Puppeteer provides multiple ways to load HTML content for PDF generation:

  • Direct Content Loading: Use a string containing the HTML.
    await page.setContent(htmlString);
    
  • Local File Access: Load an HTML file from your local system (this assumes const path = require('path') at the top of the script).
    await page.goto(`file:${path.join(__dirname, 'template.html')}`);
    
  • Remote URL Loading: Fetch HTML from a live website.
    await page.goto('https://yourwebsite.com/page-to-convert');
    

When working with external resources like images or styles, make sure they are embedded, use absolute URLs, or are stored locally.
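
The three loading options above can be wrapped in one helper. The following is a minimal sketch, not a definitive implementation: the `source` descriptor with `html`, `file`, or `url` properties and the function names are hypothetical, and `page` is assumed to be a Puppeteer page created elsewhere. It uses Node's `pathToFileURL` to build a well-formed `file://` URL from a path on disk.

```javascript
const path = require('node:path');
const { pathToFileURL } = require('node:url');

// Convert a path on disk into a file:// URL that page.goto() accepts.
function toFileUrl(filePath) {
  return pathToFileURL(path.resolve(filePath)).href;
}

// Load HTML into an existing Puppeteer page from one of three source kinds.
// `source` is a hypothetical descriptor: { html }, { file }, or { url }.
async function loadSource(page, source) {
  if (source.html !== undefined) {
    await page.setContent(source.html, { waitUntil: 'networkidle0' });
  } else if (source.file !== undefined) {
    await page.goto(toFileUrl(source.file), { waitUntil: 'networkidle0' });
  } else if (source.url !== undefined) {
    await page.goto(source.url, { waitUntil: 'networkidle0' });
  } else {
    throw new Error('source must have an html, file, or url property');
  }
}
```

For example, `await loadSource(page, { file: 'template.html' })` resolves the file relative to the current working directory before navigating to it.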

Tips for Better Performance

To ensure smooth PDF generation, keep these pointers in mind:

  • Use page.waitForNetworkIdle() to wait for all network requests to finish.
  • Set appropriate timeouts for loading resources.
  • Handle font loading explicitly to avoid rendering problems.
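
One way to combine these three tips is a small helper that runs before `page.pdf()`. This is a sketch under stated assumptions: the function name and the default values for `idleTime` and `timeout` are illustrative, and `page` is a Puppeteer page that has already started loading its content.

```javascript
// Wait until the page is quiet and web fonts are loaded before calling page.pdf().
// `page` is a Puppeteer Page; the default values here are illustrative assumptions.
async function waitUntilPrintable(page, { idleTime = 500, timeout = 30000 } = {}) {
  // Cap how long later navigations and waits on this page may take.
  page.setDefaultTimeout(timeout);

  // Resolve once there have been no in-flight requests for `idleTime` ms.
  await page.waitForNetworkIdle({ idleTime, timeout });

  // document.fonts.ready settles when pending font loads have finished.
  await page.evaluate(async () => {
    await document.fonts.ready;
  });
}
```

Calling `await waitUntilPrintable(page)` after `page.setContent()` or `page.goto()` gives late-loading images and fonts a chance to finish before the PDF is rendered.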

Once your HTML is ready, you can move on to customizing the PDF’s styles and settings.

PDF Style Settings

To tailor your content for PDF output, use @media print rules. Here's an example:

@media print {
  /* Hide navigation menus and non-essential elements */
  nav, button, .no-print {
    display: none;
  }

  /* Adjust text for better readability in PDFs */
  body {
    font-size: 12pt;
    line-height: 1.5;
  }

  /* Ensure accurate background rendering */
  * {
    -webkit-print-color-adjust: exact;
    print-color-adjust: exact;
  }
}

If you want to keep your screen-based styles instead of applying print-specific styles, include this line before generating the PDF:

await page.emulateMediaType('screen');

Once print styles are applied, you can move on to layout adjustments.

Page Layout Settings

Define PDF dimensions using Puppeteer options or CSS @page rules. For Puppeteer, you can use the following configuration:

await page.pdf({
  format: 'Letter',
  margin: {
    top: '0.75in',
    right: '0.5in',
    bottom: '0.75in',
    left: '0.5in'
  },
  landscape: false,
  preferCSSPageSize: true
});

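For custom page sizes, rely on CSS @page rules. Because preferCSSPageSize is set to true above, a size declared in @page takes priority over the format option: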

@page {
  size: 8.5in 11in;
  margin: 0.75in 0.5in;
}
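
If the page size comes from application data rather than a static stylesheet, the @page rule can be generated and injected at runtime. The sketch below assumes a hypothetical `layout` object with `width`, `height`, and `margin` strings; `page.addStyleTag()` and the `preferCSSPageSize` option are part of Puppeteer, while the helper names are made up for illustration.

```javascript
// Build an @page rule from a plain object. The property names are illustrative.
function pageRuleCss({ width, height, margin }) {
  return `@page { size: ${width} ${height}; margin: ${margin}; }`;
}

// Inject the rule and produce the PDF. With preferCSSPageSize: true,
// the @page size takes priority over `format`, which acts as a fallback.
async function pdfWithCssPageSize(page, layout, outputPath) {
  await page.addStyleTag({ content: pageRuleCss(layout) });
  return page.pdf({
    path: outputPath,
    format: 'Letter',
    preferCSSPageSize: true,
    printBackground: true,
  });
}
```

A call such as `await pdfWithCssPageSize(page, { width: '8.5in', height: '11in', margin: '0.75in 0.5in' }, 'output.pdf')` reproduces the static rule shown above.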

After setting up the layout, you can fine-tune the design elements for a polished look.

Text and Design Elements

To make the content visually clear and professional, use these CSS rules:

body {
  font-family: 'Arial', sans-serif;
  color: #333333;
}

h1, h2, h3 {
  page-break-after: avoid;
  color: #000000;
}

table {
  width: 100%;
  border-collapse: collapse;
  page-break-inside: avoid;
}

img {
  max-width: 100%;
  height: auto;
  page-break-inside: avoid;
}

For consistent background colors, especially in critical sections, add this rule:

.color-critical {
  -webkit-print-color-adjust: exact;
  print-color-adjust: exact;
}

These adjustments ensure your PDF is easy to read and visually appealing.

Page Break Control

Page Break CSS Properties

Managing page breaks effectively ensures your content flows smoothly across pages. Use these CSS properties to control where content divides:

/* Start new page before chapters */
.chapter {
  page-break-before: always;
}

/* Keep headings together with their content */
h2, h3 {
  page-break-after: avoid;
}

/* Avoid splitting tables or figures */
table, figure {
  page-break-inside: avoid;
}

These rules help keep your document organized and easy to read. Once you’ve set up page breaks, focus on configuring headers and footers to align with these settings.

Headers and Footers

Set up headers and footers in Puppeteer to give your PDF a professional look:

await page.pdf({
  displayHeaderFooter: true,
  headerTemplate: `
    <div style="font-size: 10px; padding: 0 0.5in; width: 100%;">
      <span class="title"></span>
      <span class="date" style="float: right;"></span>
    </div>
  `,
  footerTemplate: `
    <div style="font-size: 10px; text-align: center; width: 100%;">
      Page <span class="pageNumber"></span> of <span class="totalPages"></span>
    </div>
  `,
  margin: {
    top: '1in',
    bottom: '1in'
  }
});

Make sure to adjust the margins so the header and footer fit properly without overlapping your content.
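
When the header needs custom text, such as a report name, it helps to build the templates in a function and escape the text first. This is a sketch only: `escapeHtml`, `headerFooterOptions`, and the `label` parameter are hypothetical names, and the 1in margins simply mirror the example above. Styles are set inline on the assumption that the templates do not pick up the page's own stylesheet.

```javascript
// Escape text before placing it inside an HTML template.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Return page.pdf() options with a custom header label and page numbers.
// `label` and the 1in margins are illustrative choices.
function headerFooterOptions(label) {
  const base = 'font-size: 10px; width: 100%; padding: 0 0.5in;';
  return {
    displayHeaderFooter: true,
    headerTemplate: `<div style="${base}">${escapeHtml(label)}</div>`,
    footerTemplate:
      `<div style="${base} text-align: center;">` +
      'Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>',
    // Headers and footers are drawn inside the margins, so leave room for them.
    margin: { top: '1in', bottom: '1in' },
  };
}
```

The result can be spread into a larger options object, for example `await page.pdf({ path: 'report.pdf', format: 'Letter', ...headerFooterOptions('Q1 Report') })`.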

Multi-Page Content Management

With page breaks and headers/footers in place, focus on managing content across multiple pages. Proper layout control ensures your document remains clear and professional:

/* Keep captions with their images */
figure {
  display: table;
  page-break-inside: avoid;
}

figcaption {
  display: table-caption;
  caption-side: bottom;
}

/* Avoid splitting list items or table rows */
li, .table-row {
  page-break-inside: avoid;
}

/* Allow large tables to break across pages */
.table-wrapper {
  page-break-inside: auto;
}

For large tables that span multiple pages, wrap them in a container allowing breaks while keeping rows intact. This ensures data remains easy to follow, even in lengthy datasets.
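
The wrapper approach can be sketched as a function that renders the table markup alongside the matching print CSS. The names `tablePrintCss` and `renderTable` are hypothetical, and the cell text is assumed to be already HTML-escaped. The `<thead>` is included because browsers can repeat a table header group on each printed page, though that behavior is worth verifying in your own output.

```javascript
// Print CSS: the table may break across pages, but individual rows may not.
const tablePrintCss = `
  .table-wrapper { page-break-inside: auto; }
  .table-wrapper tr { page-break-inside: avoid; }
  .table-wrapper thead { display: table-header-group; }
`;

// Render rows (arrays of pre-escaped cell text) into a wrapped table with a <thead>.
function renderTable(headers, rows) {
  const cell = (tag) => (text) => `<${tag}>${text}</${tag}>`;
  const head = `<tr>${headers.map(cell('th')).join('')}</tr>`;
  const body = rows
    .map((row) => `<tr>${row.map(cell('td')).join('')}</tr>`)
    .join('');
  return `<div class="table-wrapper"><table><thead>${head}</thead><tbody>${body}</tbody></table></div>`;
}
```

The returned markup can be placed in the string passed to `page.setContent()`, with `tablePrintCss` added inside a `<style>` element or through `page.addStyleTag({ content: tablePrintCss })`.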

Tip: Enable the printBackground option in Puppeteer to render all visual elements, including background colors and images:

await page.pdf({
  printBackground: true,
  preferCSSPageSize: true
});

PDF Quality and Performance

Improving PDF quality and performance requires attention to scaling, image handling, and resource management. These steps ensure the final document looks polished and functions efficiently.

Content Scaling Methods

Scaling content correctly ensures it remains readable and consistent in design. Puppeteer offers detailed scaling controls for rendering PDFs:

await page.pdf({
  scale: 0.8,
  preferCSSPageSize: true,
  format: 'Letter'
});

Values below 1 shrink content and values above 1 enlarge it; Puppeteer accepts a scale between 0.1 and 2. With preferCSSPageSize enabled, a page size declared in CSS takes priority over the format option:

@page {
  size: 8.5in 11in;
  margin: 0.5in;
}
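
When the scale comes from user input or configuration, it can be clamped to the accepted range before it reaches `page.pdf()`. A minimal sketch, assuming the documented 0.1 to 2 range; the helper names `clampScale` and `scaledPdfOptions` are hypothetical.

```javascript
// Puppeteer documents the accepted range for `scale` as 0.1 to 2.
const MIN_SCALE = 0.1;
const MAX_SCALE = 2;

// Pull an arbitrary number into the accepted range, rejecting non-numbers.
function clampScale(value) {
  if (!Number.isFinite(value)) {
    throw new TypeError('scale must be a finite number');
  }
  return Math.min(MAX_SCALE, Math.max(MIN_SCALE, value));
}

// Build page.pdf() options matching the example above with a safe scale.
function scaledPdfOptions(scale) {
  return { scale: clampScale(scale), preferCSSPageSize: true, format: 'Letter' };
}
```

Clamping is a design choice; throwing on out-of-range values instead may be preferable when a silent adjustment would hide a configuration mistake.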

Image Quality Management

Choosing the right image format affects both quality and file size. PNG works well for detailed visuals like charts and logos but can increase file size. JPEG is usually a better option for photos, while WebP images may be re-encoded during PDF generation, which can inflate the file size.

To improve image clarity, increase the device scale factor:

await page.setViewport({
  width: 1200,
  height: 800,
  deviceScaleFactor: 2
});

Common Issues and Solutions

Addressing common challenges like resource management, file size, and errors can significantly boost performance.

  • Resource Management
    Use a single browser instance and page to handle multiple PDF requests, reducing overhead:
    const browser = await puppeteer.launch({
      args: ['--no-sandbox', '--disable-setuid-sandbox']
    });
    
    const page = await browser.newPage();
    for (const request of requests) {
      await generatePDF(page, request);
    }
    
  • File Size Optimization
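    Minimize file size by removing elements that should not be printed. Lazy-loading attributes do not make the PDF smaller and can leave off-screen images unloaded, so request images eagerly here and compress or resize them at the source:
    await page.evaluate(() => {
      // Drop elements that are not meant for print
      document.querySelectorAll('.no-print').forEach(el => el.remove());
    
      // Request images eagerly so they are present when the PDF is rendered
      document.querySelectorAll('img').forEach(img => {
        img.loading = 'eager';
      });
    });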
    
  • Error Handling
    Set an explicit navigation timeout and log failures before rethrowing, so the caller can decide whether to retry:
    const generatePDF = async (page, options) => {
      // Separate the URL from the options that page.pdf() understands
      const { url, ...pdfOptions } = options;
      try {
        await page.goto(url, {
          waitUntil: 'networkidle0',
          timeout: 30000
        });
        return await page.pdf(pdfOptions);
      } catch (error) {
        console.error('PDF generation failed:', error);
        throw error;
      }
    };
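
The resource-management and error-handling points can be combined into a batch job that reuses one page and retries failed requests. This is a sketch under assumptions, not a production implementation: `withRetry`, `generateAll`, the `{ url, path }` request shape, and the retry counts and delays are all hypothetical, and `browser` is a Puppeteer browser launched elsewhere.

```javascript
// Retry an async operation a fixed number of times with a pause in between.
async function withRetry(operation, { retries = 2, delayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}

// Generate one PDF per request on a shared browser, retrying failed requests.
// Each request is a hypothetical { url, path } pair.
async function generateAll(browser, requests) {
  const page = await browser.newPage();
  try {
    for (const { url, path } of requests) {
      await withRetry(async () => {
        await page.goto(url, { waitUntil: 'networkidle0', timeout: 30000 });
        await page.pdf({ path, format: 'Letter' });
      });
    }
  } finally {
    // Release the page even when a request fails after all retries.
    await page.close();
  }
}
```

A fixed delay keeps the sketch short; an increasing delay between attempts is a common refinement when failures are caused by a slow or rate-limited server.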
    

Conclusion

Using Puppeteer to convert HTML to PDF provides effective tools for creating professional-grade documents.

Key Steps to Follow

  • Rely on print media styles, which page.pdf() applies by default, or call page.emulateMediaType('print') to make the choice explicit.
  • Use CSS rules like page-break-inside: avoid to ensure elements such as table rows stay intact.

These techniques build on earlier styling and layout methods, serving as a solid base for more advanced automation.

Advanced Automation Options

You can take PDF generation further with these additional automation features:

  • Environment Configuration
    Set up cache directories and browser settings to ensure consistent results across different platforms.
  • Performance Tweaks
    Adjust timeout settings and add retry mechanisms to improve reliability during the generation process.

When deploying these methods in production, include error handling and logging to maintain consistent and reliable PDF outputs.
