Master Kubernetes resource management with VPA recommendation mode. Learn how to optimize container resource utilization and improve your capacity planning workflow.
We’ve all been there. You deploy a new service, set memory limits to 512Mi, and hope for the best. Two days later, you’re either looking at OOMKilled events or paying for massive amounts of wasted, idle RAM. Static resource allocation is the enemy of efficient Kubernetes resource management.
If you're tired of manually tuning requests and limits, it’s time to look at the Vertical Pod Autoscaler (VPA). Specifically, using the VPA recommendation mode is the safest way to start optimizing your cluster without letting an automated controller restart your pods during peak traffic.
Before we get to the data, we need the controller. I typically install VPA via the official Helm chart or the manifests provided in the kubernetes/autoscaler repository.
Assuming you’re using Helm:
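    helm repo add autoscaler https://kubernetes.github.io/autoscaler
    helm repo update
    # Chart names differ between repositories; confirm yours with: helm search repo autoscaler
    helm install vpa autoscaler/vpa --namespace kube-system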
Once installed, you’ll see three components running in your kube-system namespace: vpa-recommender, vpa-updater, and vpa-admission-controller. For our purposes, the vpa-recommender is the star of the show. It analyzes your pod metrics and calculates the "ideal" resource footprint.
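Before trusting any recommendation, it helps to confirm that all three components are actually up. The sketch below is one way to do that, assuming the Deployments carry the names listed above and live in kube-system. `check_vpa_components` is a hypothetical helper name, and some Helm charts prefix or rename the Deployments, so adjust the list if yours differ.

```shell
# Sketch: report whether each VPA component has at least one ready replica.
# Assumes Deployment names match the article; adjust for your chart.
check_vpa_components() {
  local ns="${1:-kube-system}" name ready missing=0
  for name in vpa-recommender vpa-updater vpa-admission-controller; do
    # readyReplicas is absent (empty output) when no replica is ready.
    ready="$(kubectl get deployment "$name" -n "$ns" \
      -o jsonpath='{.status.readyReplicas}' 2>/dev/null)"
    if [ -n "$ready" ] && [ "$ready" -ge 1 ]; then
      echo "$name: ready ($ready)"
    else
      echo "$name: NOT ready"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling `check_vpa_components kube-system` prints one line per component and returns non-zero if any of them is missing or not ready, which makes it usable as a gate in a setup script.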
The beauty of VPA recommendation mode is that it doesn’t actually change your pod’s configuration. It just observes and suggests. This is perfect for Kubernetes capacity planning because it gives you a baseline based on real-world traffic patterns rather than developer intuition.
To enable it, create a VerticalPodAutoscaler object targeting your deployment:
    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: my-app-vpa
    spec:
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: my-app
      updatePolicy:
        updateMode: "Off"  # This is the key for Recommendation Mode
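Setting updateMode: "Off" tells the VPA: "Tell me what I should be using, but don't touch my pods." The recommender keeps computing targets, while the updater and admission controller leave this workload alone.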
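Because a typo or a missing updatePolicy block would leave the object in a mode that can evict pods, it can be worth checking the live object rather than the manifest. This is a minimal sketch under that assumption. `vpa_update_mode` and `is_recommend_only` are hypothetical helper names, and the `default` namespace fallback is only an illustration.

```shell
# Print the update mode of a VPA object; empty output means the field is unset.
vpa_update_mode() {
  kubectl get vpa "$1" -n "${2:-default}" \
    -o jsonpath='{.spec.updatePolicy.updateMode}'
}

# Succeed only when the VPA is explicitly in "Off" (recommendation-only) mode.
# An unset mode is treated as unsafe, since the VPA default is not "Off".
is_recommend_only() {
  [ "$(vpa_update_mode "$1" "$2")" = "Off" ]
}
```

For example, `is_recommend_only my-app-vpa default || echo "my-app-vpa may evict pods"` flags any object that is not strictly in recommendation mode.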
After letting the VPA run for at least 24 hours—to capture your daily traffic cycles—you can inspect the recommendations. Run this command to see what the VPA thinks your resources should be:
    kubectl get vpa my-app-vpa -o yaml
Look for the recommendation block in the output:
    status:
      recommendation:
        containerRecommendations:
        - containerName: my-app
          lowerBound:
            cpu: 100m
            memory: 256Mi
          target:
            cpu: 250m
            memory: 512Mi
          uncappedTarget:
            cpu: 250m
            memory: 512Mi
          upperBound:
            cpu: 500m
            memory: 1Gi
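Here’s how to read this:
lowerBound: the minimum the recommender considers safe. Requests below this are likely to hurt performance or availability.
target: the request values the VPA would set if it were allowed to update the pod. This is the number to start from.
uncappedTarget: the target before any minAllowed or maxAllowed limits from the VPA's resourcePolicy are applied. With no policy set, as in this example, it equals the target.
upperBound: the ceiling of the recommendation. Requests above this are likely wasted capacity.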
Now that you have the numbers, you can stop guessing. Use these recommendations to update the resource requests in your Deployment manifests. I recommend a gradual rollout. Don't just blindly apply the target if your current requests are drastically different; look at the lowerBound and upperBound to understand your risk profile.
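To judge how drastic a change would be, a quick percentage comparison between the current request and the VPA target can help. The helpers below are a rough sketch and not part of VPA: `cpu_to_millicores`, `mem_to_mib`, and `pct_change` are hypothetical names, the memory parser only handles whole-number Ki, Mi, Gi, and plain-byte values, and the 1Gi current request is an assumed example value (512Mi is the sample target shown earlier).

```shell
# Convert a CPU quantity ("250m", "1", "0.5") to millicores.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v v="$1" 'BEGIN { printf "%d\n", v * 1000 + 0.5 }' ;;
  esac
}

# Convert a memory quantity to whole MiB.
# Handles integer Ki, Mi, Gi and plain bytes only; anything else is rejected.
mem_to_mib() {
  case "$1" in
    *Gi) echo $(( ${1%Gi} * 1024 )) ;;
    *Mi) echo "${1%Mi}" ;;
    *Ki) echo $(( ${1%Ki} / 1024 )) ;;
    *[!0-9]*) echo "unsupported quantity: $1" >&2; return 1 ;;
    *)   echo $(( $1 / 1048576 )) ;;
  esac
}

# Signed percentage change from current to target (same unit, current != 0).
pct_change() {
  awk -v c="$1" -v t="$2" 'BEGIN { printf "%+.0f\n", (t - c) * 100 / c }'
}

# Assumed example: current memory request 1Gi, VPA target 512Mi.
cur="$(mem_to_mib 1Gi)"
tgt="$(mem_to_mib 512Mi)"
echo "memory: ${cur}Mi -> ${tgt}Mi ($(pct_change "$cur" "$tgt")%)"
# prints: memory: 1024Mi -> 512Mi (-50%)
```

A large swing like the 50% cut in this example is a reasonable signal to step down in stages, for instance to a value between the target and the current request first, and watch for OOMKilled events before going further.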
Container resource optimization isn't a one-time task. It’s a loop. By keeping VPA in "Off" mode, you maintain a continuous feedback loop. Every week, I check the VPA status for our core services to see if traffic shifts have necessitated a change in our resource requests.
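A weekly check like this is easier to keep up if it is one command. The sketch below is one possible shape for it, using kubectl's JSONPath output to print every VPA in the cluster with its per-container targets. `vpa_report` is a hypothetical helper name, and the exact output layout is an illustration rather than anything VPA prescribes. Objects that have no recommendation yet should simply show no container entries.

```shell
# Print one line per VPA: namespace/name, then each container's target CPU and memory.
vpa_report() {
  kubectl get vpa --all-namespaces -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{range .status.recommendation.containerRecommendations[*]}{"\t"}{.containerName}{" cpu="}{.target.cpu}{" mem="}{.target.memory}{end}{"\n"}{end}'
}
```

Saving the output each week, for example with `vpa_report > "vpa-$(date +%F).txt"`, and running `diff` against the previous file makes shifts in the targets easy to spot.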
The upperBound might suggest resources you can't afford. Use the target for baseline efficiency.

VPA recommendation mode is one of the lowest-risk, highest-reward tools in your DevOps toolkit. It turns the "black box" of resource management into a clear, actionable data stream. Stop over-provisioning your clusters and start letting the data drive your configuration.