Master Kubernetes PriorityClass to manage critical workloads. Learn how pod preemption works to ensure high-priority services survive node resource contention.

During a routine deployment of our core API last Thursday, we ran into a classic "noisy neighbor" problem that brought down our production logging stack. Even though we had Kubernetes Resource Management: Using VPA Recommendation Mode in place to right-size our pods, a sudden spike in batch processing jobs consumed all available CPU cycles on our worker nodes, causing the API’s liveness probes to fail. We needed a way to guarantee that our user-facing traffic took precedence over background tasks, which is where Kubernetes PriorityClass and pod preemption saved the day.
At its core, a Kubernetes PriorityClass is a cluster-scoped (non-namespaced) object that maps a name to an integer priority, which defines the relative importance of a pod. When the scheduler encounters a pod that won't fit on any node, it checks whether evicting pods of lower priority from some node would make room. If it would, the scheduler preempts (evicts) those lower-priority pods so the pending pod can be scheduled there.
Before we implemented this, we were manually scaling our node groups using Implementing Kubernetes Node Auto-Provisioning: Karpenter and Bottlerocket, but that wasn't fast enough for instantaneous traffic spikes. Relying on auto-scaling is great for capacity, but pod preemption is your last line of defense when the cluster is physically out of room.
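To get started, you define a PriorityClass manifest. The value field is an integer; the higher the number, the higher the priority. User-defined classes can go up to 1,000,000,000; values above that are reserved for the built-in system classes.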
```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: critical-service
value: 1000000
globalDefault: false
description: "Used for high-priority user-facing APIs."
```
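Once applied, you reference the class by name in the pod spec. For a Deployment, that means the pod template (spec.template.spec), not the top-level Deployment spec: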
```yaml
spec:
  priorityClassName: critical-service
  containers:
    - name: api-pod
      image: my-app:v1.2.3
```
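For context, here is a minimal sketch of where that fragment sits inside a full Deployment. The Deployment name, labels, replica count, and resource requests are assumptions added for illustration; only the priority class, container name, and image come from the snippet above.

```yaml
# Hypothetical Deployment: name, labels, replicas, and requests are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      # priorityClassName is a pod-level field, so it belongs in the pod
      # template rather than at the top level of the Deployment spec.
      priorityClassName: critical-service
      containers:
        - name: api-pod
          image: my-app:v1.2.3
          resources:
            requests:
              cpu: 500m       # assumed value
              memory: 256Mi   # assumed value
```

The requests are included because the scheduler decides whether a pod fits, and whether preemption would help, based on resource requests rather than live usage.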

We first tried setting a global default priority for all pods, thinking it would make our Kubernetes resource management predictable. That was a mistake. We ended up with a cluster where every pod thought it was important, causing the scheduler to churn constantly as pods evicted each other in a death spiral. We learned that globalDefault should almost always be false.
We also initially underestimated the impact of preemptionPolicy, a field on the PriorityClass. By default, it's set to PreemptLowerPriority. With Never, pods in that class are still queued ahead of lower-priority pods, but they will not evict anything to get a node. We once set it to Never on a test workload to see if we could avoid restarts, but that just meant our critical pods stayed in a Pending state for roughly 45 minutes until a node became free. That’s an eternity in production.
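As a rough sketch of the non-preempting setup described above, the policy goes on the PriorityClass itself. The class name and value below are hypothetical.

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority-nonpreempting   # hypothetical name
value: 900000                         # hypothetical value
# Never: pods in this class jump the scheduling queue but do not evict
# running pods. The default is PreemptLowerPriority.
preemptionPolicy: Never
globalDefault: false
description: "Queued ahead of lower-priority pods, but never evicts running pods."
```

Pods in a class like this can still be preempted themselves by anything with a higher priority, so it suits workloads that can tolerate waiting rather than ones that must start right away.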
Managing high-priority workloads requires a tiered approach. We now use three tiers:
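A three-tier layout can be expressed as three PriorityClass objects. The names, values, and descriptions below are hypothetical placeholders for illustration, not a record of any particular cluster's tiers.

```yaml
# All names and values here are hypothetical placeholders.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: tier-critical
value: 1000000
globalDefault: false
description: "User-facing services that must keep running."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: tier-standard
value: 100000
globalDefault: false
description: "Internal services that can tolerate brief disruption."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: tier-batch
value: 1000
globalDefault: false
description: "Background and batch jobs that can be preempted and retried."
```

Leaving wide gaps between the values makes it possible to slot in another tier later without renumbering the existing ones.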

The biggest challenge isn't the configuration—it's the ripple effect. When a high-priority pod preempts a lower one, the lower-priority pod is evicted and must be rescheduled elsewhere. If your cluster is already saturated, those evicted pods might end up in a pending state, creating a backlog.
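One way to soften that ripple effect is a PodDisruptionBudget on the lower-priority workload. This is a sketch with assumed names and labels. The scheduler tries to pick preemption victims without violating a budget, but only on a best-effort basis, so treat it as a hint rather than a guarantee.

```yaml
# Hypothetical budget for a low-priority workload labeled app=batch-worker.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: batch-worker-pdb
spec:
  # Ask Kubernetes to keep at least one matching pod available.
  minAvailable: 1
  selector:
    matchLabels:
      app: batch-worker
```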
We’ve found that combining this with Kubernetes VPA and Goldilocks: Master Resource Right-Sizing is the only way to keep the cluster healthy. If you don't know your actual resource usage, you're just guessing where the pressure points are.
Q: Will my pods be killed immediately if a higher-priority pod arrives?
A: Yes, if the scheduler determines the only way to satisfy the higher-priority pod's requirements is to evict yours. It is not an instant kill, though: the pod receives a SIGTERM and gets its graceful termination period to shut down before it is forcibly stopped.
Q: Can I prevent a specific pod from being preempted?
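A: Not directly. Setting preemptionPolicy: Never is often mistaken for this, but it does the opposite job: it stops a pod from preempting others and does not shield the pod from being preempted itself. The practical options are to give the pod a higher priority than anything that might compete with it, or to isolate it on dedicated nodes. A PodDisruptionBudget helps, but the scheduler honors it only on a best-effort basis during preemption. And be warned: if the cluster is full, a pod with preemptionPolicy: Never will simply wait in Pending until nodes open up.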
Q: Is there a limit to how many PriorityClasses I should have?
A: Keep it simple. We use three tiers, as mentioned above. Adding more granularity usually leads to "priority creep," where every developer argues their service deserves a +100 increase over the next.
I’m still not entirely convinced our current tiering strategy is optimal. We’re currently investigating if we should move our database workloads, which we run using CloudNativePG for Reliable Kubernetes Database Management, into their own separate node pools to avoid preemption entirely. For now, it works, but I suspect we'll need to revisit our eviction budgets as we continue to scale.