Master Kubernetes resource right-sizing with VPA and Goldilocks. Stop over-provisioning and start optimizing your cluster efficiency with this practical guide.
Most Kubernetes clusters I’ve audited are hemorrhaging money. It’s almost always the same story: developers set CPU and memory requests to “what feels safe,” which usually means 2x or 3x what the application actually needs. You end up with a cluster that looks full but is mostly running idle cycles.
If you’re tired of manually tuning resource manifests, it’s time to automate the process. We’ll combine the Vertical Pod Autoscaler (VPA) with Goldilocks to turn resource right-sizing from a guessing game into a data-driven workflow.
When you define a deployment, you set requests and limits. Kubernetes uses requests for scheduling and enforces limits at runtime. If you set requests too high, your nodes report they’re full, forcing you to add more nodes and pay a larger cloud bill. If you set requests too low, pods get packed onto nodes that can’t serve them under load; if you set limits too low, you hit OOMKills and CPU throttling.
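To make the scheduling cost concrete, here is a rough back-of-the-envelope sketch. The numbers are hypothetical (a node with 4000m of allocatable CPU and a service that requests 1500m while using about 500m); swap in your own.

```bash
#!/bin/sh
# Hypothetical figures, for illustration only.
node_allocatable_m=4000   # allocatable CPU on one node, in millicores
padded_request_m=1500     # what the manifest requests "to be safe"
observed_usage_m=500      # what the pod actually uses

# The scheduler packs pods by request, not by observed usage.
pods_with_padding=$(( node_allocatable_m / padded_request_m ))
pods_right_sized=$(( node_allocatable_m / observed_usage_m ))

echo "pods per node at ${padded_request_m}m requests: ${pods_with_padding}"
echo "pods per node at ${observed_usage_m}m requests: ${pods_right_sized}"
```

In practice you would leave some headroom above observed usage, so the real gain is smaller than this upper bound, but the direction is the same.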
Finding the "Goldilocks" zone—where resources are neither too hot nor too cold—is impossible to do manually across hundreds of microservices. That’s where automation comes in.
The VPA monitors your actual pod usage over time and can update the resource requests in each container’s `resources` block for you. I prefer running it in recommendation-only mode (`updateMode: "Off"`) rather than letting it restart pods automatically, as it gives the team a chance to review the changes.
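As a minimal sketch, a recommendation-only VPA object could look like the following. The deployment name `my-app`, the `default` namespace, and the file name are placeholders I made up for illustration; the `updateMode: "Off"` setting is what keeps the VPA from evicting pods.

```bash
# Write a recommendation-only VPA manifest for a hypothetical deployment "my-app".
cat > my-app-vpa.yaml <<'EOF'
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"   # recommend only; never evict or resize pods
EOF

# Apply it once you have reviewed the file:
# kubectl apply -f my-app-vpa.yaml
```

If you use Goldilocks you usually don't need to write these by hand, but seeing one makes the raw VPA output easier to interpret.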
You can install VPA via Helm:

```bash
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm install vpa fairwinds-stable/vpa --namespace vpa --create-namespace
```
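Once installed, the chart registers the VerticalPodAutoscaler custom resource definition and starts the VPA components. Any VerticalPodAutoscaler object you create is observed immediately, although the recommendations only become useful after some usage history has accumulated. But staring at raw output from `kubectl get vpa -o yaml` is a nightmare. This is where Goldilocks enters the picture.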
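Before reaching for a dashboard, it can help to see what a raw recommendation actually contains. The helper below is a small sketch (the function name `vpa_targets` is my own, not part of any tool) that prints each container’s recommended CPU and memory target from a VPA object’s status.

```bash
# vpa_targets NAME [NAMESPACE]
# Prints "container<TAB>cpu<TAB>memory" for each container recommendation.
vpa_targets() {
  kubectl get vpa "$1" --namespace "${2:-default}" \
    -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{"\t"}{.target.cpu}{"\t"}{.target.memory}{"\n"}{end}'
}
```

Calling `vpa_targets my-app default` works for one workload, but it does not scale to a cluster with hundreds of them.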
Goldilocks is an open-source tool from Fairwinds that creates a VPA object in recommendation mode for each workload in the namespaces you opt in, then aggregates the recommendations into a clean, readable dashboard. It shows suggested requests and limits based on observed usage.
To set it up:

```bash
helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm install goldilocks fairwinds-stable/goldilocks --namespace goldilocks --create-namespace
```
Once installed, label the namespaces you want to monitor:

```bash
kubectl label namespace default goldilocks.fairwinds.com/enabled=true
```
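The dashboard is typically not exposed outside the cluster, so a port-forward is the quickest way to reach it. This sketch assumes the release is named `goldilocks` in the `goldilocks` namespace, in which case the chart’s dashboard service should be `goldilocks-dashboard` on port 80; check `kubectl get svc --namespace goldilocks` if yours differs. The function name is my own.

```bash
# goldilocks_dashboard [LOCAL_PORT]
# Forwards a local port (default 8080) to the dashboard service.
goldilocks_dashboard() {
  kubectl --namespace goldilocks port-forward svc/goldilocks-dashboard "${1:-8080}:80"
}
```

Run `goldilocks_dashboard` and open http://localhost:8080 in a browser.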
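Now, navigate to the Goldilocks dashboard (usually via port-forwarding the service). You’ll see a breakdown for every deployment. It groups recommendations by the QoS class you are aiming for: Guaranteed, where requests and limits are both set to the VPA target, and Burstable, where requests follow the VPA lower bound and limits follow the upper bound.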
Don't just blindly apply these numbers. Kubernetes resource optimization is a process, not a one-time fix. Here’s the workflow I recommend for your team:
By implementing Kubernetes VPA and Goldilocks, you’re not just saving money; you’re improving cluster efficiency. When you right-size your pods, the Kubernetes scheduler can pack more pods onto fewer nodes. This reduces your cloud footprint and reduces the "noise" in your cluster.
I’ve seen teams reduce their monthly AWS bill by 30-40% just by tightening these requests. It’s the highest ROI task you can perform as an SRE or DevOps engineer.
Stop guessing. Start measuring. Your cluster—and your CFO—will thank you.