MHRubel

© 2026 Mahamudul Hasan Rubel. All rights reserved.

Kubernetes, Infrastructure, Networking | June 19, 2026 | 3 min read

Implementing Kubernetes NodeLocal DNSCache for Lower DNS Latency

Learn how to implement Kubernetes NodeLocal DNSCache to slash DNS latency, reduce CoreDNS load, and improve overall cluster performance in production.

Tags: Kubernetes, DevOps, DNS, Networking, Performance, CoreDNS

During a routine performance review of our high-traffic microservices cluster, we noticed an alarming trend: 14% of our external API requests were failing with 504 Gateway Timeout errors. After digging through the traces, we realized the bottleneck wasn't the application code or the database; it was the DNS resolution time. Requests were waiting for up to 800ms just to resolve internal service names, a direct side effect of the default CoreDNS architecture where every pod hits a centralized service IP.

Why Kubernetes NodeLocal DNSCache matters

In a default cluster, every DNS query from a pod travels through the node's network stack to the CoreDNS service IP, where the kube-proxy rules rewrite the destination (DNAT) and a conntrack entry is created before the packet reaches a CoreDNS pod, often on another node. This adds network overhead and contention, especially as pod density grows. Kubernetes NodeLocal DNSCache moves a DNS cache onto each worker node, so the lookup path changes from a network-traversed service call to a request to a link-local address on the same node. In our cluster, that cut DNS latency from double-digit milliseconds to under 2ms in most of our p99 measurements.

If you’re managing your nodes with tools like Karpenter, you’ll find that adding a DaemonSet for local caching is a trivial but high-impact configuration. It offloads the central CoreDNS pods, which often become the bottleneck during traffic spikes.

Our failed attempt at optimization

Before settling on NodeLocal DNSCache, we tried scaling CoreDNS horizontally by increasing the replica count from 2 to 10. That was a mistake. We saw a temporary improvement, but the increased pod-to-pod communication overhead just shifted the congestion from the DNS pods to the kube-proxy iptables rules. Our latency jitter actually worsened, jumping from a stable 15ms to an unpredictable range of 5ms to 120ms.

We then pivoted to NodeLocal DNSCache. The implementation involves running a cache agent as a DaemonSet on each node. It listens on a specific local IP (usually 169.254.20.10), and we configure our pods to use this IP as their nameserver.
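
Before changing the nameserver for every pod, one way to try the cache on a single workload is a pod-level DNS override. The sketch below is not part of our rollout. The pod name, namespace, image, and search list are illustrative assumptions, and the search domains must match your own cluster domain.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dns-cache-canary        # hypothetical name
  namespace: default            # assumed namespace; the search list below must match it
spec:
  # "None" tells the kubelet to ignore the cluster defaults and use dnsConfig only.
  dnsPolicy: "None"
  dnsConfig:
    nameservers:
      - 169.254.20.10           # link-local address the cache agent listens on
    searches:                   # assumed cluster domain: cluster.local
      - default.svc.cluster.local
      - svc.cluster.local
      - cluster.local
    options:
      - name: ndots
        value: "5"
  containers:
    - name: probe
      image: busybox:1.36       # any image with nslookup works
      command: ["sleep", "3600"]
```

Running nslookup inside this pod should show 169.254.20.10 as the answering server if the cache agent is up on that node.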

Here is the basic configuration we applied to the NodeLocalDNS manifest:

YAML
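apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-local-dns
  namespace: kube-system
spec:
  # A DaemonSet requires a selector that matches the pod template labels.
  selector:
    matchLabels:
      k8s-app: node-local-dns
  template:
    metadata:
      labels:
        k8s-app: node-local-dns
    spec:
      # Host networking lets the cache bind the link-local IP on the node itself.
      hostNetwork: true
      # Use the node's resolv.conf so the cache never resolves through itself.
      dnsPolicy: Default
      containers:
      - name: node-cache
        image: registry.k8s.io/dns/k8s-dns-node-cache:1.22.20
        args:
          - -localip
          - 169.254.20.10
          - -conf
          - /etc/coredns/Corefile
        # Abridged: the Corefile volume mount and NET_ADMIN capability are omitted here.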
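
The manifest points at /etc/coredns/Corefile but does not show its contents. A minimal sketch of a ConfigMap that could back that file follows. It is not our exact config, the ConfigMap name is an assumption, and 10.96.0.10 is a placeholder for the ClusterIP of your CoreDNS service. How the file reaches that path (direct mount or rendered from a template) depends on your manifest.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: node-local-dns          # assumed name
  namespace: kube-system
data:
  Corefile: |
    # Cluster names: answer from the local cache, forward misses to CoreDNS over TCP.
    cluster.local:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10
        # Placeholder upstream: replace with the ClusterIP of your CoreDNS service.
        forward . 10.96.0.10 {
            force_tcp
        }
        prometheus :9253
        health 169.254.20.10:8080
    }
    # Everything else: cache and forward to the node's own upstream resolvers.
    .:53 {
        errors
        cache 30
        reload
        loop
        bind 169.254.20.10
        forward . /etc/resolv.conf
        prometheus :9253
    }
```

Forwarding cache misses over TCP avoids creating UDP conntrack entries for the upstream leg, which is one of the reasons the cache helps under load.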

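Once we applied this, we had to update our kubelet configuration to point to this local IP. Because the cache in this setup binds only to 169.254.20.10, pods keep querying the old, congested service IP until the clusterDNS setting (the --cluster-dns flag on older setups) is changed. Running pods keep the resolv.conf they were created with, so they pick up the new nameserver only after they are recreated.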
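
For reference, the kubelet change is small. The fragment below is a sketch of the relevant fields in a KubeletConfiguration file. The cluster.local domain is an assumption, and the file location and the way it is rolled out differ between distributions and managed services.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Nameserver written into the resolv.conf of every newly created pod.
clusterDNS:
  - 169.254.20.10
# Assumed cluster domain; keep whatever value your cluster already uses.
clusterDomain: cluster.local
```

The kubelet has to be restarted to load the new value.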

Measuring cluster performance gains

The results were immediate. We saw a 38% reduction in total DNS-related CPU usage across the cluster. More importantly, the intermittent 504 errors vanished. While we still have to manage node lifecycles with Kubernetes Cluster API, the DaemonSet restarts the cache agent automatically after a node reboot, so the DNS resolution layer stays stable; the cache itself lives in memory and starts cold.
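
If you want to track the same kind of numbers, a Prometheus rule file along these lines is one option. This is a sketch, not the dashboard we used. It assumes the cache's Corefile enables the prometheus plugin, that Prometheus scrapes each agent, and that the standard CoreDNS metric names are exposed. The group and record names are made up for illustration.

```yaml
groups:
  - name: node-local-dns        # hypothetical group name
    rules:
      # p99 DNS request latency per scraped instance over the last 5 minutes.
      - record: instance:coredns_dns_request_duration_seconds:p99
        expr: |
          histogram_quantile(
            0.99,
            sum by (le, instance) (
              rate(coredns_dns_request_duration_seconds_bucket[5m])
            )
          )
      # Queries per second handled by each instance, to compare the load on the
      # central CoreDNS pods before and after the rollout.
      - record: instance:coredns_dns_requests:rate5m
        expr: |
          sum by (instance) (rate(coredns_dns_requests_total[5m]))
```

Both the central CoreDNS pods and the node-local agents export these metrics, so the same rules cover the before and after comparison as long as both are scraped.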

One thing to watch out for: if you have complex rewrite rules in your primary CoreDNS config, ensure they are mirrored in the local cache config. We spent half a day debugging why some internal service lookups were failing post-migration because we forgot to propagate a custom search domain setting to the local cache.

FAQ

Q: Does NodeLocal DNSCache replace CoreDNS?
A: No, it acts as a local cache. It still forwards non-cached requests to the upstream CoreDNS pods.

Q: Will this increase memory usage on my nodes?
A: Yes, each node will now run an extra pod. In our experience, it consumes about 50MB of RAM per node, which is a negligible trade-off for the latency improvements.

Q: Is it difficult to roll back if things break?
A: Not really. Revert the clusterDNS configuration in your kubelet, recreate the pods so they pick up the original nameserver, and then delete the DaemonSet. Deleting the DaemonSet first would leave pods pointing at an address nothing is listening on.

Looking back, I wish we had implemented this sooner instead of chasing replica counts on CoreDNS. It's a classic case of infrastructure architecture outperforming brute-force scaling. I'm still curious if we could squeeze more performance by switching to a different backend for the cache, but for now, this setup is solid.
