Real User Monitoring is the only way to catch performance regressions that synthetic tests miss. Learn how to use the web-vitals library to track Core Web Vitals.
Last month, our dashboard showed a perfect green score on Lighthouse. Yet our support queue was flooded with reports of "sluggish" page interactions on mobile devices. Synthetic tests in our CI/CD pipeline were lying to us because they didn't account for the chaotic reality of mid-range Android devices, spotty 4G connections, and real user behavior.
That’s when I realized that if you aren't using Real User Monitoring, you aren't actually measuring performance. You’re just measuring your dev environment’s best-case scenario.
Lighthouse is a fantastic debugging tool, but it's a snapshot. It runs in a controlled environment: one emulated device, one simulated network profile, one scripted page load. It doesn’t know about the user in a subway tunnel trying to load a massive bundle while their browser handles background processes.
When we talk about Performance Regression Testing, we have to move away from "lab data" and toward "field data." Lab data tells you what could happen; field data tells you what is happening.
The web-vitals library is the secret weapon for capturing what your users see. It’s lightweight, wraps performance APIs that are built into modern browsers, and reports the same Core Web Vitals that Google uses as a ranking signal.
Here is a simple way to start capturing these metrics in production:
```javascript
import { onCLS, onLCP, onINP } from 'web-vitals';

function sendToAnalytics(metric) {
  const body = JSON.stringify(metric);
  // Use navigator.sendBeacon for reliable delivery
  navigator.sendBeacon('/analytics', body);
}

onCLS(sendToAnalytics);
onLCP(sendToAnalytics);
onINP(sendToAnalytics);
```
By sending these events to your own backend or an observability platform, you build a historical baseline. If you deploy a new version and your P75 LCP jumps from 2.2s to 3.1s, you’ll see it as soon as enough field data comes in.
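To make the P75 comparison concrete, here is a minimal sketch of how a backend job might compute it from stored LCP samples. The `percentile` helper and the sample arrays are hypothetical, and the nearest-rank method is just one common choice; an observability platform may calculate percentiles differently.

```javascript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are at or below it.
function percentile(values, p) {
  if (values.length === 0) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Hypothetical LCP samples in milliseconds, before and after a deploy.
const before = [1800, 2000, 2100, 2200, 2600];
const after = [2400, 2900, 3000, 3100, 3600];

const p75Before = percentile(before, 75); // 2200
const p75After = percentile(after, 75); // 3100
```

Comparing `p75After` against `p75Before` per release is enough to drive a simple regression alert.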
We once tried to automate performance budgets using only Lighthouse CI. It was a disaster. We spent more time fighting "flaky" tests—where a network hiccup caused a build failure—than actually optimizing code.
When we switched to monitoring Core Web Vitals from real users, the noise vanished. We started seeing patterns. For example, we noticed a massive spike in Interaction to Next Paint (INP) after we added a third-party analytics script. The article "INP explained and how to actually improve it in production" helped us understand that these scripts were blocking the main thread, but RUM was the only thing that confirmed it was affecting 15% of our traffic.
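To put a number on how much traffic a regression touches, one option is to count how many page views land in the "poor" INP bucket. This is a minimal sketch, assuming each stored beacon is an object with `name` and `value` fields (the shape the web-vitals library reports) and that only one final INP value is kept per page view. It uses the published INP threshold of 500 ms for "poor"; `shareOfPoorInp` is a hypothetical helper.

```javascript
// INP above 500 ms is classified as "poor" (200 ms or less is "good").
const INP_POOR_MS = 500;

// Returns the fraction (0 to 1) of INP beacons that fall in the "poor" bucket.
function shareOfPoorInp(beacons) {
  const inp = beacons.filter((b) => b.name === 'INP');
  if (inp.length === 0) return 0;
  const poor = inp.filter((b) => b.value > INP_POOR_MS).length;
  return poor / inp.length;
}

// Hypothetical stored beacons; non-INP metrics are ignored.
const sample = [
  { name: 'INP', value: 120 },
  { name: 'INP', value: 640 },
  { name: 'LCP', value: 2100 },
  { name: 'INP', value: 180 },
  { name: 'INP', value: 90 },
];

const share = shareOfPoorInp(sample); // 0.25
```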
You’ll be tempted to track everything. Don't. Every byte you send to your analytics endpoint consumes bandwidth and CPU cycles, potentially hurting the very performance you're trying to measure.
Start small: track only LCP, CLS, and INP at first.
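One way to keep each beacon small is to send only a few fields rather than the whole metric object, which can carry a list of performance entries. The sketch below assumes the metric shape reported by recent versions of the web-vitals library (`name`, `value`, `rating`, `id`, `entries`); `toSlimPayload` is a hypothetical helper, and the choice of fields is only a suggestion.

```javascript
// Keep only the fields needed for trend analysis; drop `entries` and the rest.
function toSlimPayload(metric) {
  return {
    name: metric.name,
    // Three decimals keeps CLS precision while trimming long floats.
    value: Math.round(metric.value * 1000) / 1000,
    rating: metric.rating,
    id: metric.id,
  };
}

// Hypothetical metric object, shaped like a web-vitals report.
const example = toSlimPayload({
  name: 'CLS',
  value: 0.12345,
  rating: 'needs-improvement',
  id: 'v4-123',
  entries: [{ startTime: 10 }, { startTime: 20 }],
});
```

In the earlier snippet, this would mean calling `JSON.stringify(toSlimPayload(metric))` instead of serializing the metric directly.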
I’m still not entirely happy with how we handle "noise" in our RUM data. Sometimes a user’s device is just so old that no amount of code optimization will make the site fly. We’re currently experimenting with how to exclude these outliers so our alerts stay actionable rather than just reporting "your users have old phones."
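One direction worth sketching is to tag each beacon with a rough device class at collection time, so outliers can be filtered at query time instead of being dropped. This is an assumption-heavy sketch, not a settled approach: `deviceClass` is a hypothetical helper, the thresholds are arbitrary, and `navigator.deviceMemory` is only exposed by Chromium-based browsers, so many beacons will come back as "unknown" or be classified on core count alone.

```javascript
// Classify a device from navigator-like hints. Thresholds are illustrative only.
function deviceClass(nav) {
  const memory = nav.deviceMemory; // approximate GiB; undefined outside Chromium
  const cores = nav.hardwareConcurrency; // logical cores; may be undefined
  if (memory === undefined && cores === undefined) return 'unknown';
  const lowMemory = memory !== undefined && memory <= 2;
  const fewCores = cores !== undefined && cores <= 2;
  return lowMemory || fewCores ? 'low' : 'standard';
}

// Where a navigator object exists, compute the tag once and attach it to every beacon.
const deviceTag =
  typeof navigator !== 'undefined' ? deviceClass(navigator) : 'unknown';
```

With the tag stored next to each metric, alerts can be computed on the "standard" group while the "low" group is still kept for reporting.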
We’ve also started using the Speculation Rules API to mask some of the latency we see in the field. It’s been a game-changer for perceived speed, but it adds complexity to how we interpret our metrics.
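For readers who have not used it, a speculation rule is a small JSON document placed in a `<script type="speculationrules">` element. The sketch below builds one from a list of URLs and injects it only where the browser reports support. The `buildSpeculationRules` helper and the `/pricing` and `/docs` URLs are hypothetical; which pages are worth prerendering depends entirely on your own navigation data.

```javascript
// Build a list-based prerender rule set for the given URLs.
function buildSpeculationRules(urls) {
  return { prerender: [{ source: 'list', urls }] };
}

// Inject the rules only in browsers that support the feature.
if (
  typeof HTMLScriptElement !== 'undefined' &&
  typeof HTMLScriptElement.supports === 'function' &&
  HTMLScriptElement.supports('speculationrules')
) {
  const script = document.createElement('script');
  script.type = 'speculationrules';
  script.textContent = JSON.stringify(
    buildSpeculationRules(['/pricing', '/docs'])
  );
  document.head.append(script);
}
```

On the measurement side, the web-vitals library reports a `navigationType` field on each metric, and prerendered page loads are reported with the value `'prerender'`. Storing that field makes it possible to separate prerendered loads from regular ones when reading the numbers.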
Are you currently tracking your metrics in the field, or are you still relying on Lighthouse reports? If you’re just starting, don't over-engineer it. Just drop the web-vitals script in, send the data to a database, and watch the trends for a week. You’ll be surprised by what you find.
Does RUM impact my page load time?
If implemented correctly using navigator.sendBeacon and deferred script loading, the impact is negligible. The browser queues the beacon and sends it asynchronously, so the request does not block rendering or delay page unload.
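One way to keep the overhead low is to batch metrics and send a single beacon when the page is hidden, instead of one request per metric. This is a minimal sketch: `createMetricQueue` is a hypothetical helper, and `/analytics` is the same placeholder endpoint used earlier.

```javascript
// Collect metrics in memory and hand them to `send` as one JSON array.
function createMetricQueue(send) {
  const queue = new Set();
  return {
    add(metric) {
      queue.add(metric);
    },
    // Returns true if something was sent, false if the queue was empty.
    flush() {
      if (queue.size === 0) return false;
      send(JSON.stringify([...queue]));
      queue.clear();
      return true;
    },
  };
}

// Browser wiring; skipped in environments without a document.
if (typeof document !== 'undefined' && typeof navigator !== 'undefined' && navigator.sendBeacon) {
  const metrics = createMetricQueue((body) =>
    navigator.sendBeacon('/analytics', body)
  );
  // Pass metrics.add to onCLS, onLCP, and onINP instead of a direct send.
  document.addEventListener('visibilitychange', () => {
    if (document.visibilityState === 'hidden') metrics.flush();
  });
}
```

Flushing on `visibilitychange` rather than on unload matters on mobile, where pages are often backgrounded and discarded without firing an unload event.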
Should I stop using Lighthouse?
Absolutely not. Use Lighthouse for local development and CI/CD "gating," but use RUM for production monitoring. They serve different purposes.
What is the minimum I should track?
Start with LCP (Largest Contentful Paint), CLS (Cumulative Layout Shift), and INP (Interaction to Next Paint). These three cover the vast majority of user experience issues.