Software Engineering | Technology | June 19, 2026 | 3 min read

Kubernetes Autoscaling with Karpenter and AWS Spot Instances

Master Kubernetes autoscaling using Karpenter and AWS spot instances. Learn how to optimize cloud costs and automate node provisioning for your cluster.

Why Static Scaling is Dead

For years, we relied on the Kubernetes Cluster Autoscaler. It was the standard, but it felt clunky. You’d define fixed Auto Scaling Groups (ASGs), guess your instance types, and wait minutes for nodes to join the cluster. If your pods didn't fit the instance type you had pre-provisioned, you were stuck with wasted capacity or pending pods.

Then came Karpenter.

Karpenter isn't just another autoscaler; it’s an intelligent provisioner. It looks at the pending pods in your cluster, works out their combined resource requirements (CPU, memory, and scheduling constraints such as topology), and talks directly to the AWS EC2 Fleet API to launch instances that fit them. No more managing complex ASGs.

Setting Up Karpenter

I’ve been using Karpenter v0.32+ in production, and the performance leap over the old methods is massive. To get started, you’ll need an EKS cluster and the Karpenter controller installed.

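First, ensure your IAM roles are configured correctly. The Karpenter controller needs more than ec2:RunInstances and ec2:TerminateInstances: it also calls ec2:CreateFleet, ec2:CreateLaunchTemplate, and a range of ec2:Describe* actions, and it needs iam:PassRole for the node role. If you’re using IRSA (IAM Roles for Service Accounts), map the Karpenter controller role to the karpenter service account in the namespace where you installed the controller (karpenter in this setup).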
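
As a rough sketch of the IRSA mapping, the service account carries an annotation pointing at the controller role. The account ID, role name, and cluster name below are placeholders, not values from this setup; in practice the Helm chart usually creates this object for you from its serviceAccount.annotations value.

```yaml
# Placeholder account ID and role name; substitute your own.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: karpenter
  namespace: karpenter
  annotations:
    # IRSA: pods using this service account assume the role below
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/KarpenterControllerRole-my-cluster
```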

Defining Your Provisioner

In Karpenter, we define "NodePools" (the replacement for the old Provisioner CRD). This is where you tell Karpenter what kind of infrastructure you’re comfortable with.


YAML
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["spot", "on-demand"]
        - key: "kubernetes.io/arch"
          operator: In
          values: ["amd64"]
      nodeClassRef:
        name: default
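  limits:
    cpu: "100"   # upper bound on total vCPUs this NodePool may provision
  disruption:
    # consolidateAfter is omitted: v1beta1 only accepts it together with WhenEmpty
    consolidationPolicy: WhenUnderutilized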
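
The nodeClassRef in that NodePool points at an EC2NodeClass named default, which the manifest doesn't show. A minimal sketch of what it might look like follows; the role name and the karpenter.sh/discovery tag value are assumptions, so adjust them to however your subnets and security groups are actually tagged.

```yaml
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default                          # must match nodeClassRef.name in the NodePool
spec:
  amiFamily: AL2
  role: "KarpenterNodeRole-my-cluster"   # assumed IAM role name for the nodes
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-cluster"   # assumed discovery tag on your subnets
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "my-cluster"   # assumed discovery tag on your security groups
```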

Leveraging AWS Spot Instances for Cost Optimization

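The real magic happens when you prioritize AWS spot instances. When karpenter.sh/capacity-type allows both spot and on-demand, as in the NodePool above, Karpenter tries spot first and falls back to on-demand when no suitable spot capacity is available. Restrict the values to spot alone if you want a spot-only pool.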

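In a recent migration, we shifted 80% of our stateless microservices to spot instances. Because Karpenter is "instance-type aware," it doesn't just pick one type; it evaluates dozens of instance types across families at once and lets EC2 choose among them, weighing price against the likelihood of interruption. If m5.large is too expensive or unavailable, Karpenter pivots to m6a.large or c6a.large. (Graviton types such as c6g.large only become candidates if you add arm64 to the kubernetes.io/arch requirement.)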

This is true cloud cost optimization. You aren't just scaling; you're dynamically buying the cheapest compute that satisfies your workload's constraints.
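
To illustrate what "constraints" can look like in practice, here is a sketch of a spot-only NodePool that stays flexible by naming instance categories and a minimum generation rather than specific types. The pool name and the particular categories and generation cut-off are illustrative choices, not recommendations from this migration.

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: spot-flexible        # hypothetical name
spec:
  template:
    spec:
      requirements:
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["spot"]
        - key: "kubernetes.io/arch"
          operator: In
          values: ["amd64"]
        # Broad families instead of hardcoded types: compute, general purpose, memory
        - key: "karpenter.k8s.aws/instance-category"
          operator: In
          values: ["c", "m", "r"]
        # Only generations newer than 4 (for example m5, c6a, r6i)
        - key: "karpenter.k8s.aws/instance-generation"
          operator: Gt
          values: ["4"]
      nodeClassRef:
        name: default
  limits:
    cpu: "100"
```

The wider the set of types that satisfy these requirements, the more spot pools EC2 can draw from when one runs dry.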

Handling Spot Interruptions

A common concern is: "What happens when AWS takes my spot instance back?"

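It’s a valid fear. AWS gives a two-minute warning before reclaiming a spot instance, exposed through the EC2 Instance Metadata Service and also published as an EventBridge event. Karpenter handles this natively once its interruption queue is configured: it reads those events from an SQS queue, and when a notice arrives it drains the node gracefully, brings up replacement capacity for the evicted pods, and terminates the old instance.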
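
Karpenter's interruption handling depends on an SQS queue that receives the EC2 interruption events, and the controller has to be told the queue's name. A sketch of the relevant Helm values, assuming the v0.32+ chart key names and placeholder cluster and queue names; the queue and the EventBridge rules that feed it must already exist.

```yaml
# values.yaml fragment for the Karpenter Helm chart (v0.32+ key names assumed)
settings:
  clusterName: my-cluster                                 # placeholder cluster name
  interruptionQueue: my-cluster-karpenter-interruptions   # placeholder SQS queue name
```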

To make this bulletproof, ensure your pods have:

Pod Disruption Budgets (PDBs): Prevent your critical services from going down during a mass drain.
Graceful Shutdown: Handle SIGTERM in your application code to finish active requests.
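
Both items can be expressed in plain Kubernetes manifests. The sketch below uses a hypothetical checkout service with three replicas: the PDB keeps at least two pods up during a drain, and the Deployment gives each pod a short preStop pause plus a termination grace period that fits inside the two-minute spot window. The image name and timings are assumptions, and the preStop command requires a sleep binary in the container image.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: checkout-pdb
spec:
  minAvailable: 2            # never drain below two running pods
  selector:
    matchLabels:
      app: checkout
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      # Time allowed between SIGTERM and SIGKILL; kept well under 120s
      terminationGracePeriodSeconds: 60
      containers:
        - name: app
          image: registry.example.com/checkout:1.0.0   # placeholder image
          lifecycle:
            preStop:
              exec:
                # Brief pause so load balancers stop routing before SIGTERM arrives
                command: ["sleep", "10"]
```

The application still has to handle SIGTERM itself; the manifest only buys it the time to do so.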

The Power of Consolidation

One of my favorite features is consolidationPolicy: WhenUnderutilized (renamed WhenEmptyOrUnderutilized in the later v1 API).

Karpenter continuously evaluates how well your nodes are packed, based on pod resource requests rather than live usage. If you have three nodes with only 20% of their capacity requested, Karpenter will calculate whether it can pack those pods onto two nodes (or even one) and initiate a graceful termination of the nodes it no longer needs. This keeps your cluster tight and prevents the "fragmentation" that usually plagues static ASGs.
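
Consolidation works by evicting pods, which is not always welcome for long-running batch work. One way to opt a workload out is the karpenter.sh/do-not-disrupt annotation on the pod, which blocks voluntary disruption such as consolidation but does not protect against a spot interruption. A sketch, using a hypothetical Job and image:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report       # hypothetical workload
spec:
  template:
    metadata:
      annotations:
        # Karpenter will not voluntarily disrupt the node while this pod is running
        karpenter.sh/do-not-disrupt: "true"
    spec:
      restartPolicy: Never
      containers:
        - name: report
          image: registry.example.com/report:1.0.0   # placeholder image
```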

Lessons from the Field

If you're implementing this today, keep these three things in mind:

Don't hardcode instance types. Let Karpenter decide. By removing specific instance requirements, you give Karpenter a wider pool to choose from, which significantly lowers the risk of InsufficientInstanceCapacity errors.
Mix capacity types. Even if you love spot instances, keep a small on-demand pool for critical system components such as CoreDNS. (DaemonSets like your CNI plugin run on every node, so they can't be pinned to that pool.)
Monitor your savings. Use AWS Cost Explorer to track the delta between your previous ASG-based bill and your new Karpenter-managed costs. We typically see 40-60% savings.
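
The on-demand pool for critical components could be sketched as a second NodePool with a taint, so that only pods which explicitly tolerate it land there. The pool name, taint key, and CPU limit are made up for illustration; the second document is a pod spec fragment, not a complete manifest.

```yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: on-demand-critical   # hypothetical name
spec:
  template:
    spec:
      requirements:
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["on-demand"]
      taints:
        # Hypothetical taint key; keeps ordinary workloads off these nodes
        - key: example.com/critical-only
          effect: NoSchedule
      nodeClassRef:
        name: default
  limits:
    cpu: "16"                # keep the on-demand pool small
---
# Pod spec fragment for a critical workload (merge into its pod template)
spec:
  nodeSelector:
    karpenter.sh/capacity-type: on-demand
  tolerations:
    - key: example.com/critical-only
      operator: Exists
      effect: NoSchedule
```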

Kubernetes autoscaling isn't just about adding nodes; it’s about managing the lifecycle of your compute. Karpenter makes this transition from "managing infrastructure" to "defining requirements" possible. It’s a tool that pays for itself in the first month.
