Technology, Software Engineering | June 19, 2026 | 3 min read

Kubernetes Resource Management: Using VPA Recommendation Mode

Master Kubernetes resource management with VPA recommendation mode. Learn how to optimize container resource utilization and improve your capacity planning workflow.

Tags: Kubernetes, DevOps, VPA, Cloud-Native, SRE, Capacity Planning, Optimization, Linux, Server

Why Guessing Resource Limits is Killing Your Cluster

We’ve all been there. You deploy a new service, set memory limits to 512Mi, and hope for the best. Two days later, you’re either looking at OOMKilled events or paying for massive amounts of wasted, idle RAM. Static resource allocation is the enemy of efficient Kubernetes resource management.

If you're tired of manually tuning requests and limits, it’s time to look at the Vertical Pod Autoscaler (VPA). Specifically, using the VPA recommendation mode is the safest way to start optimizing your cluster without letting an automated controller restart your pods during peak traffic.

Installing the VPA Controller

Before we get to the data, we need the controller. VPA reads usage from the Kubernetes metrics API, so make sure Metrics Server is running first. I typically install VPA via a Helm chart or the manifests provided in the kubernetes/autoscaler repository.

Assuming you’re using Helm:


Bash
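helm repo add autoscaler https://kubernetes.github.io/autoscaler
helm repo update
# Chart names differ between repos and releases; confirm with: helm search repo autoscaler
helm install vpa autoscaler/vpa --namespace kube-system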

Once installed, you’ll see three components running in your kube-system namespace: vpa-recommender, vpa-updater, and vpa-admission-controller. For our purposes, the vpa-recommender is the star of the show. It analyzes your pod metrics and calculates the "ideal" resource footprint.

Implementing VPA in Recommendation Mode

The beauty of VPA recommendation mode is that it doesn’t actually change your pod’s configuration. It just observes and suggests. This is perfect for Kubernetes capacity planning because it gives you a baseline based on real-world traffic patterns rather than developer intuition.

To enable it, create a VerticalPodAutoscaler object targeting your deployment:


YAML
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off" # This is the key for Recommendation Mode

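Setting updateMode: "Off" tells the VPA: "Tell me what I should be using, but don't touch my pods." The recommender keeps publishing values to the object's status, while the updater never evicts the target's pods and the admission controller leaves new ones unmodified.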

Analyzing the Recommendations

After letting the VPA run for at least 24 hours—to capture your daily traffic cycles—you can inspect the recommendations. Run this command to see what the VPA thinks your resources should be:


Bash
kubectl get vpa my-app-vpa -o yaml

Look for the recommendation block in the output:


YAML
status:
  recommendation:
    containerRecommendations:
      - containerName: my-app
        lowerBound:
          cpu: 100m
          memory: 256Mi
        target:
          cpu: 250m
          memory: 512Mi
        uncappedTarget:
          cpu: 250m
          memory: 512Mi
        upperBound:
          cpu: 500m
          memory: 1Gi

Here’s how to read this:

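Target: This is the sweet spot. Aim to set your requests to these values.
Lower/Upper Bound: These mark the range the recommender considers acceptable. A request below lowerBound is likely to starve the container, and one above upperBound is likely wasted. By default the recommender derives lowerBound, target, and upperBound from roughly the 50th, 90th, and 95th percentiles of observed usage, adds a safety margin, and widens the bounds while it has little history, so expect the range to narrow over the first several days.
Uncapped Target: The target before any minAllowed/maxAllowed constraints from the VPA's resourcePolicy are applied. With no resourcePolicy, as in this example, it equals the target.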
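
If you would rather not eyeball YAML for every service, the same fields can be pulled out with a few lines of Python. This is a minimal sketch, not part of VPA: it assumes you have saved the output of `kubectl get vpa my-app-vpa -o json`, and the `summarize` helper and the embedded sample (copied from the status block above) are hypothetical.

```python
import json

# Hypothetical sample mirroring the status block above, in the shape returned
# by `kubectl get vpa my-app-vpa -o json`.
SAMPLE_VPA_JSON = """
{
  "status": {
    "recommendation": {
      "containerRecommendations": [
        {
          "containerName": "my-app",
          "lowerBound": {"cpu": "100m", "memory": "256Mi"},
          "target": {"cpu": "250m", "memory": "512Mi"},
          "uncappedTarget": {"cpu": "250m", "memory": "512Mi"},
          "upperBound": {"cpu": "500m", "memory": "1Gi"}
        }
      ]
    }
  }
}
"""


def summarize(vpa: dict) -> list[str]:
    """Return one readable line per container and resource.

    Returns an empty list when the VPA has not published a recommendation yet.
    """
    recs = (
        vpa.get("status", {})
        .get("recommendation", {})
        .get("containerRecommendations", [])
    )
    lines = []
    for rec in recs:
        for resource in ("cpu", "memory"):
            lines.append(
                f"{rec['containerName']} {resource}: "
                f"target={rec['target'][resource]} "
                f"(range {rec['lowerBound'][resource]}..{rec['upperBound'][resource]})"
            )
    return lines


if __name__ == "__main__":
    for line in summarize(json.loads(SAMPLE_VPA_JSON)):
        print(line)
```

An empty result usually just means the recommender has not collected enough samples yet.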

Transforming Data into Optimization

Now that you have the numbers, you can stop guessing. Use these recommendations to update your Deployment manifests. I recommend a gradual rollout. Don't just blindly apply the target if your current usage is drastically different; look at the lowerBound and upperBound to understand your risk profile.
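
One way to keep the rollout gradual is to move each request only part of the way toward the target per release. The sketch below is an illustration under my own assumptions, not something VPA provides: `next_request`, the 50% default step, and the rule of never proposing less than lowerBound are all hypothetical choices you should tune.

```python
# Suffixes used by Kubernetes memory quantities (binary first, then decimal).
_MEMORY_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '250m' or '0.5' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '512Mi', '1Gi', or plain bytes."""
    for suffix, factor in _MEMORY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return round(float(quantity[: -len(suffix)]) * factor)
    return round(float(quantity))


def next_request(current: int, target: int, lower: int, step: float = 0.5) -> int:
    """Move `step` of the way from `current` toward `target`.

    All values share one base unit (millicores or bytes). The result is never
    below `lower`, so an under-provisioned container jumps at least to
    lowerBound instead of creeping up.
    """
    proposed = current + (target - current) * step
    return round(max(proposed, lower))


if __name__ == "__main__":
    current = cpu_to_millicores("1")  # hypothetical current request
    target = cpu_to_millicores("250m")
    lower = cpu_to_millicores("100m")
    print(next_request(current, target, lower))  # 625
```

Running the same step on each release converges on the target while giving you a chance to watch for throttling or OOMKilled events between changes.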

Container resource optimization isn't a one-time task. It’s a loop. By keeping VPA in "Off" mode, you maintain a continuous feedback loop. Every week, I check the VPA status for our core services to see if traffic shifts have necessitated a change in our resource requests.
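
A weekly check like this is easy to script. Here is a rough sketch, assuming the current request and the recommendation have already been converted to a common base unit (millicores or bytes); the `Recommendation` class, the `drift_status` function, and the 20% tolerance are hypothetical names and values of mine, not VPA concepts.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    """VPA recommendation for one resource, in base units (millicores or bytes)."""

    lower: int
    target: int
    upper: int


def drift_status(current: int, rec: Recommendation, tolerance: float = 0.2) -> str:
    """Classify a current request against a VPA recommendation.

    `tolerance` is an assumed threshold: the relative distance from the
    target below which no change is suggested. Assumes rec.target > 0.
    """
    if current < rec.lower:
        return "under-provisioned"
    if current > rec.upper:
        return "over-provisioned"
    if abs(current - rec.target) / rec.target > tolerance:
        return "review"
    return "ok"


if __name__ == "__main__":
    cpu = Recommendation(lower=100, target=250, upper=500)
    for request in (50, 260, 400, 1000):
        print(request, drift_status(request, cpu))
```

Anything outside the bounds deserves attention first; "review" items can usually wait for the next planned release.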

Lessons Learned in Production

Don't ignore the history: VPA needs time. If you look at the recommendations after one hour, you’ll get garbage data. Wait for a full business cycle.
Watch out for spikes: If your app has massive, rare spikes, the upperBound might suggest resources you can't afford. Use the target for baseline efficiency.
Use it for Node Sizing: The data you get from VPA is invaluable for Kubernetes capacity planning. If you aggregate these recommendations across all namespaces, you can finally calculate how much CPU/RAM your cluster actually needs, allowing you to downsize your node groups and save on cloud costs.

VPA recommendation mode is one of the lowest-risk ways to start right-sizing workloads, because it never touches a running pod. It turns the "black box" of resource management into a clear, actionable data stream. Stop over-provisioning your clusters and start letting the data drive your configuration.
