Why K8s deployments need the `matchLabels` keyword
To create a Kubernetes Deployment, we must specify the matchLabels field, even though its value must match the labels we specify in the template. But why? Couldn't Kubernetes be smart enough to figure it out without us being explicit?

Did you know? K8s is short for Kubernetes 'cause there are 8 letters between the K and the S.
A Kubernetes (K8s) Deployment provides a way to define how many replicas of a Pod K8s should aim to keep alive. I'm especially bothered by the Deployment spec's requirement that we specify a label selector for Pods, and that this selector match the labels we have already defined in the template. Why can't we define them just once? Why can't K8s infer them on its own? As I will explain, there is actually a good reason, but to understand it you have to go down a bit of a rabbit hole.
A deployment specification
Firstly, let’s take a look at a simple deployment specification for K8s:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx # Why can't K8s figure it out on its own?
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:latest
This is a basic Deployment, taken from the official documentation, and here we can already see that we need to fill in the matchLabels field.
What happens if we drop the “selector” field completely?
➜ kubectl apply -f nginx-deployment.yaml
The Deployment "nginx-deployment" is invalid:
* spec.selector: Required value
Okay, so we need to specify a selector. Can it be different from the "labels" field in the "template"? Let's try with:
matchLabels:
  app: nginx-different
➜ kubectl apply -f nginx-deployment.yaml
The Deployment "nginx-deployment" is invalid: spec.template.metadata.labels: Invalid value: map[string]string{"app":"nginx"}: `selector` does not match template `labels`
There are usually good reasons behind what seems like a poorly thought-out implementation, and as we'll see, that is true here as well.
As expected, K8s doesn't like it: the selector must match the template. So we must fill in a field with a value that is already fully determined. It really seems like something a computer could do for us, so why do we have to specify it manually? It drives me crazy to do by hand something a computer could do without any problem. Or could it?
Behind a deployment
You will probably never need to manipulate ReplicaSet objects directly: use a Deployment instead, and define your application in its spec section.
How does a Deployment work? Behind the scenes, when you create a new Deployment, K8s creates two different objects: a Pod definition, using the "template" field of the Deployment as its specification, and a ReplicaSet. You can easily verify this by using kubectl to retrieve the Pods and ReplicaSets after you have created a Deployment.
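For instance, a quick check right after applying the manifest above might look roughly like this (the hash suffixes are placeholders, and the exact columns can vary with your kubectl version):
➜ kubectl get replicasets
NAME                      DESIRED   CURRENT   READY   AGE
nginx-deployment-<hash>   3         3         3       1m
➜ kubectl get pods --show-labels
NAME                             READY   STATUS    RESTARTS   AGE   LABELS
nginx-deployment-<hash>-<rand>   1/1     Running   0          1m    app=nginx,pod-template-hash=<hash>
Notice the pod-template-hash label: the Deployment controller adds it so that Pods belonging to different ReplicaSets can be told apart.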
A ReplicaSet’s purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods.
A ReplicaSet needs a selector that specifies how to identify the Pods it can acquire and manage. However, this doesn't explain why we must specify it ourselves, and why K8s cannot do it on its own: after all, a Deployment is a high-level construct that should hide ReplicaSet quirks, so such details shouldn't concern us, and the Deployment should take care of them by itself.
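To make the relationship concrete, here is a hand-written sketch of roughly what the ReplicaSet generated for our Deployment looks like (the name and hash below are made up, and in reality K8s also mixes a pod-template-hash label into the selector and template labels, plus an ownerReference pointing back to the Deployment):
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-deployment-5d59d67564 # <deployment name>-<pod-template-hash>, hash is illustrative
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx # the selector we wrote in the Deployment
  template:
    metadata:
      labels:
        app: nginx # the labels from the Deployment's template
    spec:
      containers:
      - name: nginx
        image: nginx:latest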
Digging deeper
Understanding how a Deployment works doesn't help us find the reason for this particular behavior. Given that Googling doesn't seem to turn up anything interesting on this topic, it's time to go to the source (literally): luckily, K8s is open source, so we can check its history on GitHub.
Going back in time, we find out that K8s actually used to infer the matchLabels field! The behavior was removed with apps/v1beta2 (released with Kubernetes 1.8), through Pull Request #50164. That pull request links to issue #50339, which, however, has a very brief description and lacks the reasoning behind the choice.
The linked issues are rich in technical detail and have many comments. If you want to understand exactly how kubectl apply works, take a look!
Luckily, other issues, such as #26202, provide much more context: it turns out the main problem with defaulting shows up when labels are mutated in subsequent updates to the resource. The patch operation is somewhat brittle, and apply breaks when you update a label that was used as a default.
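As a hypothetical sketch of the failure mode: imagine the selector had been silently defaulted from the template labels, and later we rename that label and re-apply.
# first apply: no selector given, so it is defaulted to app: nginx
template:
  metadata:
    labels:
      app: nginx
# later apply: we rename the label...
template:
  metadata:
    labels:
      app: nginx-v2 # ...but the stored, defaulted selector still says app: nginx,
                    # so the updated object no longer matches its own template, and it is
                    # ambiguous whether the selector should silently follow the change
With an explicit selector, this situation is caught immediately by validation at apply time, as we saw earlier.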
Many other concerns are described in depth by Brian Grant in issue #15894.
Basically, assuming a default value creates many questions and concerns: what's the difference between explicitly setting a label to null and leaving it empty? How do you manage all the cases where users relied on the default and now want to update the resource to manage the label themselves (or the other way around)?
Conclusion
Given that in K8s everything is intended to be declarative, the developers decided that explicit is better than implicit, especially for corner cases: specifying things explicitly allows more robust validation at creation and update time, and removes a class of possible bugs that existed because of the uncertainty caused by the lack of clarity.
Shortly after dropping the defaulting behavior, the developers also made the selector immutable, to make sure behaviors were well defined. Maybe in the future it will become mutable again, but for that to work, somebody needs to write a well-thought-out design document explaining how to manage all the possible edge cases that can happen when a controller is updated.
I hope you found this deep dive into the question interesting. I spent some time on it because I was very curious, and I hope the next person with the same question finds this article and gets an answer more quickly than it took me.
If you have any questions or feedback, please leave a comment below, or write me an email at [email protected].
Ciao,
R.