PDF Generation with Puppeteer in NestJS: Building a CV Export Feature

One of the features I wanted for my portfolio was a downloadable CV – not just a static PDF I update manually, but one generated dynamically from the same data that powers my website. Update the database once, and both the web view and the PDF stay in sync.
Why Puppeteer?
There are several approaches to generating PDFs in Node.js, each with trade-offs.
Libraries like PDFKit or jsPDF let you construct PDFs programmatically by placing text, lines, and shapes at specific coordinates. This gives you precise control but becomes tedious quickly – you're essentially writing a page layout engine from scratch. Want to add a paragraph? You need to calculate line breaks, handle overflow, manage fonts manually.
HTML-to-PDF converters like html-pdf or pdfmake are easier to work with but often produce inconsistent results. They use simplified rendering engines that don't support modern CSS, so what looks good in your browser might look broken in the PDF.
Puppeteer takes a different approach: it runs an actual Chrome browser (headless, without a visible window) and uses Chrome's built-in "Print to PDF" functionality. The advantage is obvious – if it renders correctly in Chrome, it will look identical in the PDF. You get full CSS support, web fonts, flexbox, grid, everything. The trade-off is resource usage: you're running a full browser, which consumes more memory than a pure JavaScript solution.
For a CV where visual design matters and I already know HTML/CSS, Puppeteer was the clear choice.
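To make the approach concrete, here is a minimal standalone sketch of the core idea (hypothetical function and file names; it assumes only that puppeteer is installed):

```typescript
import * as puppeteer from 'puppeteer';

// Render an HTML string to a PDF file using headless Chrome's print engine.
async function htmlToPdf(html: string, outPath: string): Promise<void> {
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.setContent(html); // load markup directly – no web server needed
    await page.pdf({ path: outPath, format: 'A4', printBackground: true });
  } finally {
    await browser.close(); // always release the Chrome process
  }
}

// Usage (illustrative): htmlToPdf('<h1>Hello, PDF</h1>', 'hello.pdf');
```

Everything in the rest of this post is this sketch plus production concerns: reusing the browser, templating the HTML, and handling page breaks.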
The Generation Pipeline
The overall flow is straightforward:
┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│   CV Data    │────▶│  Handlebars  │────▶│  Puppeteer   │────▶ PDF
│     (DB)     │     │   Template   │     │   (Chrome)   │
└──────────────┘     └──────────────┘     └──────────────┘

First, we fetch the CV data from the database – work experience, education, skills, languages. Then we pass this data through a Handlebars template to generate an HTML document. Finally, Puppeteer opens this HTML in headless Chrome and exports it as a PDF.
Setup
You only need two packages:
bash
npm install puppeteer handlebars

Puppeteer is Google's official library for controlling Chrome programmatically. When you install it, it automatically downloads a compatible version of Chromium (~170MB). This ensures your code works regardless of what browser the user has installed.
Handlebars is a templating engine that lets you write HTML with placeholders like {{firstName}} that get replaced with actual data. It's intentionally "logic-less" – you can't write arbitrary JavaScript in templates, which keeps them clean and focused on presentation.
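A two-line example of that workflow (assuming handlebars is installed; the field names are just for illustration):

```typescript
import * as Handlebars from 'handlebars';

// compile() parses the template once into a reusable render function
const template = Handlebars.compile('<h1>{{firstName}} {{lastName}}</h1>');

// Calling the function substitutes the placeholders with real data
const html = template({ firstName: 'Jane', lastName: 'Doe' });
// html is now '<h1>Jane Doe</h1>'
```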
The PDF Generator Service
Let's build the service step by step. In NestJS, a service is a class decorated with @Injectable() that contains business logic. We'll create one specifically for PDF generation:
typescript
// cv/services/pdf-generator.service.ts
import { Injectable, Logger, OnModuleDestroy } from '@nestjs/common';
import * as puppeteer from 'puppeteer';
import type { Browser } from 'puppeteer';
import * as fs from 'fs/promises';
import * as fsSync from 'fs'; // sync variant, used for existence checks in getChromePath()
import * as path from 'path';
import * as Handlebars from 'handlebars';

@Injectable()
export class PdfGeneratorService implements OnModuleDestroy {
  private readonly logger = new Logger(PdfGeneratorService.name);
  private templateCache: HandlebarsTemplateDelegate | null = null;
  private browserInstance: Browser | null = null;
  private browserLaunchPromise: Promise<Browser> | null = null;

  constructor() {
    this.registerHandlebarsHelpers();
  }

  async onModuleDestroy() {
    if (this.browserInstance) {
      await this.browserInstance.close();
      this.browserInstance = null;
    }
  }
}

A few things to note here:
The OnModuleDestroy interface is a NestJS lifecycle hook. When your application shuts down (server restart, deployment, etc.), NestJS calls onModuleDestroy() on all services that implement it. We use this to properly close the Chrome browser. Without it, you'd leave zombie Chrome processes consuming memory.
We declare three private properties: templateCache stores the compiled Handlebars template so we don't re-read it from disk on every request. browserInstance holds our reusable Chrome browser. browserLaunchPromise prevents race conditions when multiple requests try to start the browser simultaneously.
Browser Instance Management: Why Reuse Matters
Starting a Chrome browser is slow – typically 2-3 seconds. If every PDF request launched a new browser, your users would wait 3+ seconds even for a simple one-page document. That's unacceptable.
The solution is to start the browser once and keep it running. Each PDF request opens a new page (tab) within that browser, which is nearly instantaneous. When the PDF is done, we close the page but keep the browser alive for the next request.
Here's the implementation:
typescript
private async getBrowser(): Promise<Browser> {
  // If we already have a connected browser, use it
  if (this.browserInstance?.connected) {
    return this.browserInstance;
  }

  // If a launch is already in progress, wait for it
  // This prevents multiple simultaneous requests from each starting their own browser
  if (this.browserLaunchPromise) {
    return this.browserLaunchPromise;
  }

  // No browser exists and none is launching – start one
  this.browserLaunchPromise = this.launchBrowser();
  try {
    this.browserInstance = await this.browserLaunchPromise;
    return this.browserInstance;
  } finally {
    // Clear the promise so future calls know no launch is in progress
    this.browserLaunchPromise = null;
  }
}

private async launchBrowser(): Promise<Browser> {
  return puppeteer.launch({
    headless: true,
    args: [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-dev-shm-usage',
      '--disable-gpu',
      '--font-render-hinting=none',
    ],
  });
}

The args array contains Chrome flags that optimize for server environments:
- --no-sandbox and --disable-setuid-sandbox: Disable Chrome's sandbox security feature. This is necessary in Docker containers where the sandbox doesn't work. In production, your container isolation provides security instead.
- --disable-dev-shm-usage: By default, Chrome uses /dev/shm for shared memory. In Docker, this partition is often too small (64MB), causing crashes. This flag tells Chrome to use /tmp instead.
- --disable-gpu: Disables GPU acceleration, which isn't available in headless server environments anyway.
- --font-render-hinting=none: Disables font hinting for more consistent rendering across platforms.
Cross-Platform Chrome Detection
By default, Puppeteer downloads and uses its own Chromium. But in some scenarios, you might want to use the system-installed Chrome instead – perhaps to reduce your Docker image size or to use a specific Chrome version.
This helper function checks common installation paths across operating systems:
typescript
private getChromePath(): string | undefined {
  // Docker/CI: Use environment variable if set
  if (process.env.PUPPETEER_EXECUTABLE_PATH) {
    return process.env.PUPPETEER_EXECUTABLE_PATH;
  }

  // macOS: Chrome is typically in the Applications folder
  if (process.platform === 'darwin') {
    const systemChrome = '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome';
    if (fsSync.existsSync(systemChrome)) {
      return systemChrome;
    }
  }

  // Linux: Check common Chromium installation paths
  if (process.platform === 'linux') {
    const paths = ['/usr/bin/chromium-browser', '/usr/bin/chromium'];
    for (const chromePath of paths) {
      if (fsSync.existsSync(chromePath)) {
        return chromePath;
      }
    }
  }

  // Windows or not found: let Puppeteer use its bundled Chromium
  return undefined;
}

The process.platform property tells you what operating system Node.js is running on: 'darwin' for macOS, 'linux' for Linux, 'win32' for Windows. Note that fsSync is the synchronous fs module (import * as fsSync from 'fs'), distinct from the promise-based fs/promises import – existence checks here are quick and happen once per launch.
Handlebars Helpers: Extending the Template Language
Handlebars templates are intentionally simple – they can only output variables and loop over arrays. For anything more complex, you need "helpers" – custom functions that transform data within the template.
Here are the helpers I registered for the CV template:
typescript
private registerHandlebarsHelpers(): void {
  // Format a single date as "MM/YYYY" or "Present" if current
  Handlebars.registerHelper('formatDate', (date: Date | null, isCurrent: boolean) => {
    if (isCurrent || !date) return 'Present';
    const d = new Date(date);
    // padStart ensures single-digit months get a leading zero: "1" becomes "01"
    return `${String(d.getMonth() + 1).padStart(2, '0')}/${d.getFullYear()}`;
  });

  // Format a date range like "01/2020 - 06/2023" or "01/2020 - Present"
  Handlebars.registerHelper('dateRange', (startDate: Date, endDate: Date | null, isCurrent: boolean) => {
    const start = new Date(startDate);
    const startStr = `${String(start.getMonth() + 1).padStart(2, '0')}/${start.getFullYear()}`;
    if (isCurrent || !endDate) {
      return `${startStr} - Present`;
    }
    const end = new Date(endDate);
    const endStr = `${String(end.getMonth() + 1).padStart(2, '0')}/${end.getFullYear()}`;
    return `${startStr} - ${endStr}`;
  });

  // Join array elements with a separator (default: comma + space)
  // Usage in template: {{join skills ", "}}
  Handlebars.registerHelper('join', (arr: string[], separator: unknown) => {
    if (!arr || !Array.isArray(arr)) return '';
    // Handlebars always passes an options object as the last argument, so when
    // the template omits the separator ({{join skills}}) we fall back to ", "
    const sep = typeof separator === 'string' ? separator : ', ';
    return arr.join(sep);
  });

  // Convert language proficiency levels to percentages for visual progress bars
  // "C1 - Professional" becomes 85, "B2 - Upper Intermediate" becomes 80, etc.
  Handlebars.registerHelper('langWidth', (level: string) => {
    const levels: Record<string, number> = {
      Native: 100, C2: 95, C1: 85, B2: 80, B1: 65, A2: 40, A1: 25,
    };
    // Check if the level string contains any of our known levels
    for (const [key, value] of Object.entries(levels)) {
      if (level?.includes(key)) return value;
    }
    return 50; // Default for unknown levels
  });
}

These helpers are called from the Handlebars template. For example:
handlebars
<div class="experience">
  <span class="dates">{{dateRange startDate endDate isCurrent}}</span>
  <h3>{{title}}</h3>
</div>

<div class="language">
  <span>{{name}}</span>
  <div class="progress-bar" style="width: {{langWidth level}}%"></div>
</div>

The Main PDF Generation Method
Now let's look at the core method that ties everything together:
typescript
async generatePdf(cvData: CvProfile): Promise<Buffer> {
  const startTime = Date.now();
  this.logger.log('Starting PDF generation...');

  // Step 1: Get the compiled template and fill it with our CV data
  // This produces a complete HTML string with all placeholders replaced
  const compiledTemplate = await this.getCompiledTemplate();
  const html = compiledTemplate(await this.prepareTemplateData(cvData));

  // Step 2: Get a browser instance (reused across requests)
  const browser = await this.getBrowser();

  // Step 3: Open a new page (tab) in the browser
  const page = await browser.newPage();

  try {
    // Step 4: Set the viewport dimensions
    // 794x1123 pixels is A4 paper size at 96 DPI (dots per inch)
    // deviceScaleFactor: 2 renders at double resolution for crisp text (like Retina displays)
    await page.setViewport({
      width: 794,
      height: 1123,
      deviceScaleFactor: 2,
    });

    // Step 5: Load our HTML into the page
    // 'domcontentloaded' means we proceed as soon as the HTML is parsed
    // (we don't wait for images/fonts since we're not loading external resources)
    await page.setContent(html, {
      waitUntil: 'domcontentloaded',
      timeout: 10000,
    });

    // Step 6: Generate the PDF
    const pdf = await page.pdf({
      format: 'A4',
      printBackground: true, // Include CSS background colors/images
      preferCSSPageSize: true, // Respect @page CSS rules if present
      margin: { top: '0', right: '0', bottom: '0', left: '0' },
    });

    const duration = Date.now() - startTime;
    this.logger.log(`PDF generated in ${duration}ms`);

    // page.pdf() returns a Uint8Array; we convert to Buffer for easier handling
    return Buffer.from(pdf);
  } finally {
    // Always close the page, even if an error occurred
    // This prevents memory leaks from accumulated open tabs
    await page.close();
  }
}

The try/finally pattern is important here. Even if PDF generation fails (timeout, invalid HTML, etc.), we still close the page. Without this, failed requests would leave orphaned pages consuming memory until the browser eventually crashes.
Template Caching for Performance
Reading files from disk is relatively slow – a few milliseconds per read. While that sounds trivial, it adds up when you're serving multiple requests. More importantly, in a serverless environment, cold starts are already slow; we don't want to add unnecessary file I/O.
The solution is to read and compile the template once, then cache it in memory:
typescript
private async getCompiledTemplate(): Promise<HandlebarsTemplateDelegate> {
  // In development, always reload the template so changes are picked up immediately
  const isDev = process.env.NODE_ENV !== 'production';

  // Return cached template if available (and we're in production)
  if (this.templateCache && !isDev) {
    return this.templateCache;
  }

  // Read the template file from disk
  // process.cwd() returns the directory where the Node process was started
  const templatePath = path.join(process.cwd(), 'src', 'cv', 'templates', 'cv-template.hbs');
  const templateSource = await fs.readFile(templatePath, 'utf-8');

  // Compile the template string into a reusable function
  // This parses the Handlebars syntax once; calling the function is fast
  this.templateCache = Handlebars.compile(templateSource);
  return this.templateCache;
}

In development mode (NODE_ENV !== 'production'), we skip the cache. This lets you edit the template and see changes immediately without restarting the server. In production, the template is compiled once and reused for the lifetime of the process. One caveat: the path above resolves relative to process.cwd() and points into src/, so if you deploy compiled output (e.g. a dist/ folder), you'll need to copy the .hbs file into the build or adjust the path accordingly.
Controller Endpoints
The controller exposes our PDF functionality as HTTP endpoints. We need three routes: one for downloading the PDF, one for viewing it in the browser, and one for previewing the raw HTML (useful during development).
typescript
// cv/cv.controller.ts
import { Controller, Get, NotFoundException, Res } from '@nestjs/common';
import type { Response } from 'express';
import { CvService } from './cv.service';
import { PdfGeneratorService } from './services/pdf-generator.service';

@Controller('cv')
export class CvController {
  constructor(
    private readonly cvService: CvService,
    private readonly pdfGenerator: PdfGeneratorService,
  ) {}

  // Download endpoint: triggers a file download in the browser
  @Get('export/pdf')
  async exportCvAsPdf(@Res() res: Response): Promise<void> {
    // Fetch CV data from the database
    const cv = await this.cvService.getMyCV();
    if (!cv) throw new NotFoundException('CV not found');

    // Generate the PDF
    const pdfBuffer = await this.pdfGenerator.generatePdf(cv);

    // Set response headers for file download
    res.set({
      'Content-Type': 'application/pdf',
      // 'attachment' tells the browser to download rather than display;
      // filename suggests a name for the downloaded file
      'Content-Disposition': `attachment; filename="${cv.firstName}_${cv.lastName}_CV.pdf"`,
      'Content-Length': pdfBuffer.length.toString(),
    });

    // Send the PDF bytes
    res.send(pdfBuffer);
  }

  // Preview endpoint: displays the PDF in the browser's built-in viewer
  @Get('preview/pdf')
  async previewCvAsPdf(@Res() res: Response): Promise<void> {
    const cv = await this.cvService.getMyCV();
    if (!cv) throw new NotFoundException('CV not found');

    const pdfBuffer = await this.pdfGenerator.generatePdf(cv);
    res.set({
      'Content-Type': 'application/pdf',
      // 'inline' tells the browser to display the PDF rather than download it
      'Content-Disposition': 'inline',
    });
    res.send(pdfBuffer);
  }

  // HTML preview: returns the raw HTML before PDF conversion
  // Invaluable for debugging layout issues
  @Get('preview/html')
  async previewCvAsHtml(@Res() res: Response): Promise<void> {
    const cv = await this.cvService.getMyCV();
    const html = await this.pdfGenerator.getTemplateHtml(cv || undefined);
    res.set({ 'Content-Type': 'text/html' });
    res.send(html);
  }
}

The @Res() decorator gives us direct access to the Express response object. Normally, NestJS handles serialization automatically (returning an object sends JSON), but for binary data like PDFs, we need manual control over the response.
The difference between attachment and inline in Content-Disposition is subtle but important: attachment triggers a download dialog, while inline displays the content directly in the browser (most browsers have built-in PDF viewers).
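One detail worth flagging: the filename above is built from database fields, and non-ASCII or quote characters in a name can produce a malformed Content-Disposition header. A small hypothetical helper (not part of the controller shown above) that could sanitize it:

```typescript
// Hypothetical helper: build a header-safe filename from user-controlled
// name fields by stripping diacritics and replacing unsafe characters.
function toSafeFilename(first: string, last: string): string {
  const clean = (s: string) =>
    s
      .normalize('NFKD') // split "é" into "e" + combining accent
      .replace(/[\u0300-\u036f]/g, '') // drop the combining accents
      .replace(/[^A-Za-z0-9_-]/g, '_'); // replace anything else with "_"
  return `${clean(first)}_${clean(last)}_CV.pdf`;
}

// toSafeFilename('Zoë', "O'Brien") returns 'Zoe_O_Brien_CV.pdf'
```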
Dynamic Page Splitting
A one-page CV is simple: just fill the template and render. But what if you have 10 years of experience that doesn't fit on one page? You need to intelligently split content across pages without cutting entries in half.
The challenge is that we're working with HTML, which flows naturally, but PDFs have fixed page boundaries. We need to predict how much space each entry will take and group them into pages:
typescript
private splitExperiencesIntoPages(experiences: Experience[]): Experience[][] {
  // These values represent approximate "height units",
  // calibrated through trial and error to match actual rendered heights
  const PAGE_1_CAPACITY = 9.5; // First page has a sidebar, so less space for content
  const PAGE_N_CAPACITY = 12.0; // Subsequent pages have full width

  const pages: Experience[][] = [];
  let currentPage: Experience[] = [];
  let currentHeight = 0;
  let pageIndex = 0;

  for (const exp of experiences) {
    // Calculate how much vertical space this experience will need
    const expHeight = this.calculateExperienceHeight(exp);

    // Determine capacity based on which page we're on
    const capacity = pageIndex === 0 ? PAGE_1_CAPACITY : PAGE_N_CAPACITY;

    // Does it fit on the current page?
    if (currentHeight + expHeight <= capacity) {
      currentPage.push(exp);
      currentHeight += expHeight;
    } else {
      // Start a new page
      pages.push(currentPage);
      currentPage = [exp];
      currentHeight = expHeight;
      pageIndex++;
    }
  }

  // Don't forget the last page
  if (currentPage.length > 0) pages.push(currentPage);
  return pages;
}

private calculateExperienceHeight(experience: Experience): number {
  // Base height: job title, company name, date range
  let height = 0.8;

  // Each bullet point adds height based on text length
  for (const highlight of experience.highlights || []) {
    if (highlight.length < 100) height += 0.25; // Short bullet
    else if (highlight.length < 200) height += 0.4; // Medium bullet
    else height += 0.55; // Long bullet (might wrap to 3+ lines)
  }

  // Technology tags take up a small amount of space
  if ((experience.technologies || []).length > 0) height += 0.2;
  return height;
}

This is admittedly a heuristic approach – we're estimating heights rather than measuring actual rendered pixels. The values (0.8, 0.25, etc.) came from trial and error: render a PDF, see where content overflows, adjust the numbers, repeat.
A more robust solution would render the HTML, measure actual element heights with JavaScript, then re-render with proper page breaks. But for a CV that changes infrequently, the heuristic approach is simpler and works well enough.
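For reference, the measurement-based variant would start roughly like this (a sketch, not code from the project; it assumes each rendered entry carries the .experience class used in the template above):

```typescript
import type { Page } from 'puppeteer';

// Ask Chrome for the real rendered height of every experience entry, in pixels.
async function measureEntryHeights(page: Page): Promise<number[]> {
  return page.evaluate(() =>
    Array.from(document.querySelectorAll('.experience')).map(
      (el) => (el as HTMLElement).getBoundingClientRect().height,
    ),
  );
}
// With exact pixel heights, page splitting becomes plain arithmetic against the
// A4 content height (1123px at 96 DPI, minus margins) instead of tuned constants.
```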
Performance Comparison
Here's the impact of each optimization I implemented:
| Optimization | Before | After | Why It Matters |
| --- | --- | --- | --- |
| Browser reuse | ~3s per PDF | ~200ms per PDF | Starting Chrome is slow; reusing it is fast |
| Template cache | Disk read every time | Compile once | File I/O adds latency, especially on slow disks |
| domcontentloaded vs networkidle0 | ~500ms wait | Immediate | We have no external resources to wait for |
| deviceScaleFactor: 2 | Pixelated text | Retina quality | Higher resolution without larger file size |
Browser reuse is the biggest win by far. Going from 3 seconds to 200ms makes the feature actually usable – users click "Download CV" and get a response almost instantly.
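If you want to squeeze out the remaining first-request latency as well, NestJS's OnModuleInit hook can pre-launch the browser during application bootstrap. A sketch of a method that could be added to the service, assuming the getBrowser() implementation shown earlier:

```typescript
// Addition to PdfGeneratorService (sketch): also implement OnModuleInit
async onModuleInit(): Promise<void> {
  try {
    await this.getBrowser(); // warm up Chrome before the first request arrives
    this.logger.log('Browser pre-launched');
  } catch (err) {
    // A failed warm-up is not fatal; the first request simply retries the launch
    this.logger.warn(`Browser warm-up failed: ${String(err)}`);
  }
}
```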
Summary
Here's the complete pipeline visualized:
┌─────────────────────────────────────────────────────────────┐
│ PDF Generation Pipeline │
├─────────────────────────────────────────────────────────────┤
│ │
│ 1. Load CV data from database │
│ ▼ │
│ 2. Prepare data (sorting, pagination) │
│ ▼ │
│ 3. Fill Handlebars template → HTML │
│ ▼ │
│ 4. Puppeteer: HTML → PDF (A4, Retina) │
│ ▼ │
│ 5. Return buffer or save to disk │
│ │
├─────────────────────────────────────────────────────────────┤
│ Key Takeaways: │
│ • Reuse browser instance across requests │
│ • Cache compiled templates │
│ • Use OnModuleDestroy for cleanup │
│ • Handle cross-platform Chrome paths │
│ • Estimate content height for multi-page documents │
└─────────────────────────────────────────────────────────────┘

The result is a CV that's always up-to-date, looks professionally typeset, and generates in under a second. It took some effort to set up – handling browser lifecycle, optimizing performance, dealing with page breaks – but now I update my experience in one place and both my website and downloadable CV stay in sync automatically.
If you're building any kind of document export feature, Puppeteer is worth considering. The "render HTML with a real browser" approach sidesteps all the complexity of programmatic PDF layout, and with proper optimization, the performance is more than acceptable for on-demand generation.