Tired of Puppeteer Screenshot Bugs? Here's What We Switched To

We replaced 200+ lines of Puppeteer screenshot code with a single API call to SnapRender after spending two weeks debugging full-page captures on a client project. Memory leaks crashed production twice, Chrome updates broke rendering, and our Docker image hit 2GB. If your puppeteer screenshot full page not working searches are becoming a weekly routine, this is the path we took and why we'd do it again.

The Breaking Points

Our app generates PDF reports with website screenshots. Customers paste a URL, we capture it, embed it in a report. Sounds simple. For the first few months, Puppeteer handled it fine. Then usage grew, edge cases multiplied, and we started spending more time on screenshot infrastructure than on the report features people were actually paying for.

Full-page screenshots that weren't full-page

A client reported that screenshots of their marketing site were cutting off halfway. Their page used lazy-loaded images and a sticky header. The sticky header stamped itself across the entire capture, and everything below the fold was placeholder images.

The fix required scrolling the page to trigger lazy loading, converting fixed elements to absolute positioning, waiting for all images to load, then capturing. Here's what our "simple screenshot" function looked like after we patched it:

async function takeScreenshot(url) {
  const browser = await puppeteer.launch({
    headless: 'new',
    args: ['--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage'],
  });

  const page = await browser.newPage();
  await page.setViewport({ width: 1440, height: 900 });

  await page.goto(url, { waitUntil: 'networkidle0', timeout: 30000 });

  // Handle cookie banners
  await page.evaluate(() => {
    const selectors = [
      '#cookie-banner', '.cookie-consent', '[class*="cookie"]',
      '#onetrust-banner-sdk', '.cc-banner',
    ];
    selectors.forEach(sel => {
      document.querySelectorAll(sel).forEach(el => el.remove());
    });
  });

  // Remove overflow restrictions
  await page.evaluate(() => {
    document.documentElement.style.overflow = 'visible';
    document.body.style.overflow = 'visible';
  });

  // Convert fixed position elements
  await page.evaluate(() => {
    document.querySelectorAll('*').forEach(el => {
      if (window.getComputedStyle(el).position === 'fixed') {
        el.style.position = 'absolute';
      }
    });
  });

  // Scroll to trigger lazy loading
  await page.evaluate(async () => {
    await new Promise((resolve) => {
      let totalHeight = 0;
      const timer = setInterval(() => {
        window.scrollBy(0, 400);
        totalHeight += 400;
        if (totalHeight >= document.body.scrollHeight) {
          clearInterval(timer);
          window.scrollTo(0, 0);
          resolve();
        }
      }, 100);
    });
  });

  // Wait for images
  await page.waitForFunction(() => {
    return Array.from(document.images).every(img => img.complete);
  }, { timeout: 10000 }).catch(() => {});

  // Wait for fonts
  await page.evaluate(() => document.fonts.ready);

  // Extra buffer for CSS transitions
  await new Promise(r => setTimeout(r, 1500));

  const screenshot = await page.screenshot({
    fullPage: true,
    type: 'png',
  });

  await browser.close();
  return screenshot;
}

That's 60 lines for a single screenshot. And it still didn't handle every site. Pages with iframes, WebGL content, or aggressive bot detection would fail silently, returning a screenshot that looked nothing like the actual page.

Memory leaks in production

Each Puppeteer screenshot launches a Chromium instance. We pooled browsers to avoid the startup cost, but Chrome's memory usage is unpredictable. A page with heavy JavaScript could push a single tab to 500MB+. Our monitoring showed memory climbing steadily over hours until the process hit the container limit and got killed.

We added a browser recycling system that closed and relaunched Chrome every 50 screenshots. That helped, but zombie processes still leaked through. Chrome tabs that timed out during navigation would sometimes leave orphaned processes:

// We had to add this to our cleanup routine
const { execSync } = require('child_process');
try {
  execSync('pkill -f "chrome.*--headless"', { timeout: 5000 });
} catch (e) {
  // No processes to kill, that's fine
}

Running pkill in production code. That's when you know your screenshot infrastructure has gone off the rails.

Chrome version roulette

Puppeteer pins to a specific Chrome version, but our automated dependency updates bumped it quarterly. Each Chrome update had a non-zero chance of changing rendering behavior. Fonts would shift by a pixel. SVG rendering would break. One update changed how overflow: clip was handled, which broke full-page screenshots on every site using a modern CSS reset.

We started pinning Chrome versions, which meant missing security patches. Then we started manually testing each Chrome update against our screenshot test suite, which added a half-day of work per quarter.

The Docker image problem

Our Dockerfile for the screenshot service looked like this:

FROM node:20-slim

RUN apt-get update && apt-get install -y \
    chromium \
    fonts-liberation \
    fonts-dejavu-core \
    fonts-noto-cjk \
    fonts-noto-color-emoji \
    libnss3 \
    libatk-bridge2.0-0 \
    libdrm2 \
    libxkbcommon0 \
    libgbm1 \
    libgtk-3-0 \
    libasound2 \
    libxshmfence1 \
    && rm -rf /var/lib/apt/lists/*

The resulting image: 1.8GB. With Chromium binaries, font packages, and shared libraries, there's no way around it. Every deploy pushed almost 2GB. Our CI pipeline spent more time building and pushing this image than running tests.

The Switch

After the second production crash from a memory leak, we did the math. We were spending roughly 15-20 engineering hours per month maintaining Puppeteer infrastructure. At our billing rate, that's more than any screenshot API would cost. If you're curious about the full cost breakdown, see The Real Cost of Self-Hosting Screenshots.

We evaluated three options: building a microservice with Playwright (same problems, different library), using a managed browser service (cheaper than our time but still required maintaining screenshot logic), or switching to a screenshot API (zero infrastructure).

Why we picked SnapRender

We tested four screenshot APIs. SnapRender won for three reasons:

No feature gating. Every plan includes every feature. Full-page screenshots, device emulation, dark mode, ad blocking, cookie banner removal. Other APIs locked features behind higher tiers. One competitor charged extra for full-page screenshots, which was the entire reason we were looking for a solution.

The pricing made sense. SnapRender's free tier gives you 200 screenshots/month to test with real URLs, not sandboxed demos. Their paid plans start at $9/month for 2,000 screenshots and scale to $199/month for 200,000. Our usage at the time was about 5,000 screenshots/month, so the $29/month Growth plan covered us. Compare that to our infrastructure cost: a dedicated container running 24/7 on AWS, Chrome eating 2GB+ RAM, plus engineering time.

Cached responses. SnapRender caches screenshots with configurable TTL. A fresh capture takes 2-5 seconds. A cached response comes back in under 200ms. For our report generation, where multiple users might screenshot the same URL within an hour, this cut our p50 response time dramatically. (We wrote more about this in From Timeout Hell to 200ms.)

The before and after

Our 60-line Puppeteer function became this:

const fetch = require('node-fetch');

async function takeScreenshot(url) {
  const response = await fetch(
    `https://app.snap-render.com/v1/screenshot?` +
    `url=${encodeURIComponent(url)}` +
    `&full_page=true` +
    `&block_cookie_banners=true` +
    `&block_ads=true` +
    `&format=png` +
    `&viewport_width=1440` +
    `&viewport_height=900`,
    {
      headers: { 'X-API-Key': process.env.SNAPRENDER_API_KEY },
    }
  );

  if (!response.ok) {
    throw new Error(`Screenshot failed: ${response.status}`);
  }

  return response.buffer();
}

Nine lines of actual logic. No browser to manage. No fonts to install. No memory to monitor. No zombie processes to kill.

Or using their Node SDK:

const { SnapRender } = require('snaprender');

const client = new SnapRender({ apiKey: process.env.SNAPRENDER_API_KEY });

async function takeScreenshot(url) {
  return client.capture({
    url,
    fullPage: true,
    blockCookieBanners: true,
    blockAds: true,
    format: 'png',
    viewportWidth: 1440,
    viewportHeight: 900,
  });
}

What we deleted

After the switch, we removed:

The entire Puppeteer screenshot service (1,200 lines)
The Dockerfile with Chrome and font dependencies
Browser pool management code
Zombie process cleanup scripts
Chrome version pinning configuration
The screenshot-specific monitoring and alerting
2GB Docker image from our registry

Our main application's Docker image dropped from 2.1GB to 340MB. Deploy times went from 8 minutes to under 2.

Six Months Later

What improved

Reliability went from ~94% to 99.8%. Our Puppeteer setup had a roughly 6% failure rate across all URLs. Some sites would timeout, some would crash Chrome, some would trigger bot detection. With SnapRender, failures dropped to occasional timeouts on extremely slow sites.

Response times got predictable. With Puppeteer, screenshot times ranged from 3 seconds to 45 seconds depending on the page. With SnapRender, it's 2-5 seconds fresh, under 200ms cached. That predictability let us set reasonable timeout expectations in our UI instead of showing a spinner for an unknown duration.

No more 3am alerts. We haven't been paged for a screenshot-related issue since the switch. The memory leak alerts, the process crash alerts, the "disk full because Chrome dumped core files" alerts. All gone.

Engineering time freed up. The 15-20 hours/month we spent on Puppeteer maintenance went to zero. That time went back into building report features our customers actually asked for.

What we gave up

Fine-grained browser control. With Puppeteer, we could inject arbitrary JavaScript, intercept network requests, manipulate the DOM in any way we wanted before capturing. SnapRender handles common cases (cookie banners, ads, dark mode, hiding specific selectors), but if you need to, say, click through a multi-step form before screenshotting the result, you'd need to combine the API with your own browser automation for that specific case.

On-premises data. Our Puppeteer screenshots never left our infrastructure. With SnapRender, the URL is sent to their servers for rendering. For our use case (capturing public marketing sites), this wasn't a concern. If you're screenshotting internal dashboards or sites behind a VPN, an API won't work without exposing those URLs.

Pixel-level consistency across updates. When we controlled Chrome's version, we could guarantee pixel-identical output between runs. With an API, the rendering engine updates on their end. In practice, we haven't noticed any visual regressions, but the theoretical risk exists.

The honest assessment

For about 90% of screenshot use cases, an API is the right call. If you're capturing public URLs for thumbnails, previews, reports, or social media images, the engineering cost of maintaining Puppeteer infrastructure doesn't justify the control it gives you. The puppeteer screenshot full page not working problem, along with memory management, Docker configuration, font installation, and Chrome updates, all become someone else's responsibility.

The 10% where you still need Puppeteer: browser automation workflows where the screenshot is one step in a larger interaction, capturing content behind authentication that you can't pass via URL parameters, or situations where you need to run JavaScript that modifies the page in ways an API's selector-based options can't express.

Making the Switch

If your puppeteer screenshot full page not working searches have become a recurring theme and you're considering moving off Puppeteer, here's the practical path:

Start with the free tier. SnapRender gives you 200 screenshots/month at no cost. Test with the actual URLs your app captures, not a hello-world page. The edge cases that break Puppeteer are the ones you want to verify work through the API. For a walkthrough of the complete API, see The Complete Screenshot API Guide.
Run both in parallel. We kept our Puppeteer service running for two weeks while routing traffic to SnapRender. We compared outputs side by side. The API matched or beat our Puppeteer output on 97% of URLs.
Handle errors gracefully. The API can fail (network issues, target site down, timeout). Your error handling should be simpler than Puppeteer's, but it still needs to exist:

async function takeScreenshot(url, retries = 2) {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      const response = await fetch(
        `https://app.snap-render.com/v1/screenshot?url=${encodeURIComponent(url)}&full_page=true`,
        {
          headers: { 'X-API-Key': process.env.SNAPRENDER_API_KEY },
          signal: AbortSignal.timeout(15000),
        }
      );

      if (response.ok) return response.buffer();

      if (response.status === 429) {
        // Rate limited, wait and retry
        await new Promise(r => setTimeout(r, 2000 * (attempt + 1)));
        continue;
      }

      throw new Error(`API returned ${response.status}`);
    } catch (err) {
      if (attempt === retries) throw err;
    }
  }
}

Tear down the old infrastructure. Once you're confident in the API, delete the Puppeteer code, the Docker configuration, the monitoring. Don't leave it "just in case." Dead infrastructure that nobody maintains is worse than no infrastructure.

We made the switch eight months ago. Total cost: $29/month on SnapRender's Growth plan. Total time spent on screenshot infrastructure since: zero. That's the math that matters.