Performance budgets are hard to maintain. Learn how to automate bundle size analysis and Core Web Vitals thresholds in your CI/CD pipeline to stop regressions.
We’ve all been there: a new feature lands, the PR looks clean, but after the merge, the LCP jumps by 400ms. It’s a death by a thousand cuts where no single commit is the culprit, but the cumulative effect ruins the user experience.
I spent a week last year chasing a performance regression that turned out to be three different libraries added by three different teams. That’s when I realized that hoping for developer discipline isn’t a strategy. We needed performance budgets enforced by the machine, not by manual PR reviews that everyone glosses over.
Before we automated our gate, we relied on "performance culture." We’d ask developers to check the network tab or run a quick Lighthouse score locally. But local environments are noisy, and "good enough" is subjective.
When you’re cutting JavaScript bundle size: A practical guide for developers, you need data that compares the current PR against the production baseline. Relying on gut feeling is how you end up with 500KB of vendor code you don't even use.
To stop the bleeding, we implemented a two-tier gate in our GitHub Actions pipeline. The first tier is static: a hard cap on the main bundle size. The second tier is dynamic: a Lighthouse threshold gate for Core Web Vitals.
We started by using bundlesize to set a hard limit on our entry point. It’s simple, effective, and runs fast. If a PR adds more than 20KB of gzipped code, the build fails.
YAML- name: Check bundle size uses: siddharthkp/bundlesize-action@v4 with: github_token: ${{ secrets.GITHUB_TOKEN }} package_json: './package.json'
This forced us to have the "is this library worth 20KB?" conversation before the code hit main. We had one instance where a developer tried to add a heavy charting library; the CI failure triggered a discussion, and we found a lighter, tree-shakeable alternative instead.
Bundle size is a proxy for performance, but it isn't the whole story. We needed to measure the user experience directly. We configured a step in our CI using Lighthouse CI to spin up a temporary environment and run a performance audit.
We set our thresholds conservatively:
If the PR exceeds these, the build fails. It’s not about being perfect; it’s about preventing the "slow creep." If you're struggling with responsiveness, you might also want to look into improving INP via selective hydration and React Suspense to keep your main thread clear.
Initially, we tried to run these audits against our production environment. That was a mistake. The variance in production data—due to CDN caching, user device types, and network conditions—made our CI results flaky. We’d get "build failed" notices for performance dips that were actually just server-side fluctuations.
We switched to a dedicated ephemeral staging environment that mimics production settings but with a controlled, throttled network profile. By decoupling our performance testing from real-world RUM data, we gained reliable, repeatable results.
Performance budgets work best when they don't feel like a punishment. If your budget is too tight, developers will just disable the checks or bypass them.
Here is how we keep the process healthy:
Even with these gates, we still face issues with font loading strategy: eliminate FOIT and layout shifts that don't always show up in bundle size metrics.
Automating your performance budgets is a massive step forward, but it’s not a "set it and forget it" solution. You have to treat your performance thresholds like code—they need maintenance, refinement, and a team that understands why they exist. Start small, set realistic limits, and prioritize the metrics that actually impact your users.
Master requestIdleCallback to keep your main thread responsive. Learn how to defer non-essential work and prevent frame drops in complex frontend applications.
Read morescheduler.yield helps you eliminate long tasks and improve main thread optimization. Learn how to fix interaction to next paint (INP) issues in production.