Kubernetes Taints and Tolerations
Today I learned, in a way that’s going to stick, the difference between Kubernetes Taints and Tolerations and Node Affinity.
Well, I say “today”, but it was actually two almost-identical occurrences over the last few days.
Recapping nodeSelector and nodeAffinity
In Kubernetes, you can force particular pods to run on particular nodes by using nodeSelector. Here’s how I make sure that Longhorn (which is quite resource-intensive) only runs on some of my nodes (those labelled as such):
nodeSelector:
  differentpla.net/longhorn-storage-node: "true"
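For completeness: the label itself goes onto the node with kubectl label. The node name below is a placeholder, not one of my actual nodes:

# Mark a node as a Longhorn storage node (placeholder node name):
kubectl label node <node-name> differentpla.net/longhorn-storage-node=true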
Some more examples:
- You want particular workloads to run on nodes that have GPU acceleration.
- You want a particular pod to run on a node that’s got a software-defined radio (SDR) dongle attached.
  - You can use Node Feature Discovery for this.
  - The RTL-SDR Blog V3 dongle can be detected with feature.node.kubernetes.io/usb-ff_0bda_2838.present: "true" (see the selector sketch after this list).
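For the SDR case, the selector in the pod template would look something like this (a sketch; the label is the one reported by Node Feature Discovery above):

nodeSelector:
  feature.node.kubernetes.io/usb-ff_0bda_2838.present: "true"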
Or you might want your pod to run only on x86_64 (amd64) nodes:
nodeSelector:
  kubernetes.io/arch: "amd64"
You can combine them (Longhorn storage and amd64):
nodeSelector:
  kubernetes.io/arch: "amd64"
  differentpla.net/longhorn-storage-node: "true"
But you can’t repeat a key, so there’s no way to express “amd64 OR arm64”. For that, you need nodeAffinity.
nodeAffinity is a more flexible version of nodeSelector. Here’s how to specify which architectures a pod can run on:
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
      - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values:
              - amd64
              - arm64
We’re saying that the architecture must be in the ['amd64', 'arm64'] set.
Note that nodeSelectorTerms are OR-ed together; matchExpressions are AND-ed together.
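For example, this combination (a made-up sketch reusing the labels from above) means “(amd64 AND a Longhorn storage node) OR (any arm64 node)”:

nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
      # Terms are OR-ed: a node matching either term is acceptable.
      - matchExpressions:
          # Expressions within a term are AND-ed: amd64 AND Longhorn storage node.
          - key: kubernetes.io/arch
            operator: In
            values:
              - amd64
          - key: differentpla.net/longhorn-storage-node
            operator: In
            values:
              - "true"
      # ...OR any arm64 node.
      - matchExpressions:
          - key: kubernetes.io/arch
            operator: In
            values:
              - arm64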
Aside: make sure you don’t have both nodeSelector and nodeAffinity. I spent a couple of hours wondering why my pods weren’t being scheduled on ARM64 nodes – I’d added the nodeAffinity as above, but not removed the nodeSelector.
Taints
This is great, but what if you add some nodes to your cluster, and you don’t want existing workloads scheduled on them?
For example: let’s say you’re adding arm64 nodes (at work, Graviton; at home, RPi 4) and you don’t want to update all of the existing deployments (which were only built for amd64) to add the missing nodeSelector or nodeAffinity.
You can add a “taint” to those nodes; Kubernetes won’t schedule any workloads on the tainted nodes. Here, I’m tainting my 3 new RPi 4 nodes with differentpla.net/arch=arm64:NoSchedule.
kubectl taint nodes rpi401 rpi402 rpi403 differentpla.net/arch=arm64:NoSchedule
(I used differentpla.net/arch rather than kubernetes.io/arch because I didn’t want the official label confused with my custom taint).
This means that no pods will be scheduled on the tainted nodes – unless they tolerate the taint, which is what the next section is about.
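If you want to double-check a taint (or remove it again later), something like this works:

# Check a node's taints:
kubectl describe node rpi401 | grep Taints

# Remove the taint (note the trailing '-'):
kubectl taint nodes rpi401 rpi402 rpi403 differentpla.net/arch=arm64:NoSchedule-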
Tolerations
What if I want a pod scheduled on one of those RPi 4 nodes? I’ve been messing around with multi-arch builds; I want my builds to run on the new nodes.
That’s when you add a “toleration” to the pod template. You’re saying “this pod tolerates that taint; it’s OK for it to run there”.
That looks like this:
tolerations:
- key: differentpla.net/arch
  operator: Equal
  value: arm64
  effect: NoSchedule
I’m saying that this pod is allowed to run on the tainted node; it tolerates it.
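One thing worth noting: a toleration only allows the pod onto the tainted nodes; it doesn’t require it to go there. If I want my build pods to end up only on the RPi 4 nodes, I’d combine the toleration with a selector, roughly like this:

# Require an arm64 node, and tolerate the taint on the new RPi 4 nodes:
nodeSelector:
  kubernetes.io/arch: "arm64"
tolerations:
- key: differentpla.net/arch
  operator: Equal
  value: arm64
  effect: NoSchedule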
As another example, you might have nodes that are flaky or experimental in some way; you don’t want normal workloads to run there, so you taint them. But you want to make use of those nodes, so you mark some low-value workloads as tolerating that taint.
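In that case you might not care about the taint’s value at all; a toleration with operator: Exists matches any value for the key (the differentpla.net/flaky key here is made up for the example):

tolerations:
- key: differentpla.net/flaky
  operator: Exists    # matches the key regardless of its value
  effect: NoSchedule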