This page provides good practices and considerations when designing admission webhooks in Kubernetes. This information is intended for cluster operators who run admission webhook servers or third-party applications that modify or validate your API requests.
Before reading this page, ensure that you're familiar with the following concepts:
Admission control occurs when any create, update, or delete request is sent to the Kubernetes API. Admission controllers intercept requests that match specific criteria that you define. These requests are then sent to mutating admission webhooks or validating admission webhooks. These webhooks are often written to ensure that specific fields in object specifications exist or have specific allowed values.
Webhooks are a powerful mechanism to extend the Kubernetes API. Badly-designed webhooks often result in workload disruptions because of how much control the webhooks have over objects in the cluster. Like other API extension mechanisms, webhooks are challenging to test at scale for compatibility with all of your workloads, other webhooks, add-ons, and plugins.
Additionally, with every release, Kubernetes adds or modifies the API with new
features, feature promotions to beta or stable status, and deprecations. Even
stable Kubernetes APIs are likely to change. For example, the Pod API changed
in v1.29 to add the
Sidecar containers feature.
While it's rare for a Kubernetes object to enter a broken state because of a new
Kubernetes API, webhooks that worked as expected with earlier versions of an API
might not be able to reconcile more recent changes to that API. This can result
in unexpected behavior after you upgrade your clusters to newer versions.
This page describes common webhook failure scenarios and how to avoid them by cautiously and thoughtfully designing and implementing your webhooks.
Even if you don't run your own admission webhooks, some third-party applications that you run in your clusters might use mutating or validating admission webhooks.
To check whether your cluster has any mutating admission webhooks, run the following command:
kubectl get mutatingwebhookconfigurations
The output lists any MutatingWebhookConfiguration objects in the cluster.
To check whether your cluster has any validating admission webhooks, run the following command:
kubectl get validatingwebhookconfigurations
The output lists any ValidatingWebhookConfiguration objects in the cluster.
Kubernetes includes multiple admission control and policy enforcement options. Knowing when to use a specific option can help you to improve latency and performance, reduce management overhead, and avoid issues during version upgrades. The following table describes the mechanisms that let you mutate or validate resources during admission:
| Mechanism | Description | Use cases |
|---|---|---|
| Mutating admission webhook | Intercept API requests before admission and modify as needed using custom logic. | |
| Mutating admission policy | Intercept API requests before admission and modify as needed using Common Expression Language (CEL) expressions. | |
| Validating admission webhook | Intercept API requests before admission and validate against complex policy declarations. | |
| Validating admission policy | Intercept API requests before admission and validate against CEL expressions. | |
In general, use webhook admission control when you want an extensible way to declare or configure the logic. Use built-in CEL-based admission control when you want to declare simpler logic without the overhead of running a webhook server. The Kubernetes project recommends that you use CEL-based admission control when possible.
If you use CustomResourceDefinitions, don't use admission webhooks to validate values in CustomResource specifications or to set default values for fields. Kubernetes lets you define validation rules and default field values when you create CustomResourceDefinitions.
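As a minimal sketch of that built-in alternative, the following CustomResourceDefinition declares a default value and a CEL validation rule directly in its schema. The Widget resource, the example.com group, and the field names are hypothetical and only serve as illustration.

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com            # hypothetical resource
spec:
  group: example.com
  scope: Namespaced
  names:
    plural: widgets
    singular: widget
    kind: Widget
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:
                type: integer
                minimum: 1
                default: 3             # defaulting without a mutating webhook
              tier:
                type: string
                enum: ["dev", "prod"]
            x-kubernetes-validations:  # validation without a validating webhook
            - rule: "!has(self.tier) || self.tier != 'prod' || self.replicas >= 3"
              message: "prod Widgets need at least three replicas"
```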
To learn more, see the following resources:
This section describes recommendations for improving performance and reducing latency. In summary, these are as follows:
Mutating admission webhooks are called in sequence. Depending on the mutating webhook setup, some webhooks might be called multiple times. Every mutating webhook call adds latency to the admission process. This is unlike validating webhooks, which get called in parallel.
When designing your mutating webhooks, consider your latency requirements and tolerance. The more mutating webhooks there are in your cluster, the greater the chance of latency increases.
Consider the following to reduce latency:
Consider any other components that run in your cluster that might conflict with the mutations that your webhook makes. For example, if your webhook adds a label that a different controller removes, your webhook gets called again. This leads to a loop.
To detect these loops, try the following:
Update your cluster audit policy to log audit events. Use the following parameters:
level: RequestResponse
verbs: ["patch"]
omitStages: RequestReceived
Set the audit rule to create events for the specific resources that your webhook mutates.
Check your audit events for webhooks being reinvoked multiple times with the same patch being applied to the same object, or for an object having a field updated and reverted multiple times.
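The audit parameters described in these steps could be combined into a policy rule like the following sketch. It assumes that the webhook mutates Pods; replace the resources entry with whatever your webhook actually mutates, and merge the rule into your existing audit policy rather than using it as a complete policy.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
# Record full request and response bodies for patch requests.
# Pods are an assumption; list the resources that your webhook mutates.
- level: RequestResponse
  verbs: ["patch"]
  omitStages:
  - "RequestReceived"
  resources:
  - group: ""                # core API group
    resources: ["pods"]
```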
Admission webhooks should evaluate as quickly as possible (typically in milliseconds), since they add to API request latency. Use a small timeout for webhooks.
For details, see Timeouts.
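The following sketch shows where a short timeout could be set in a webhook configuration. All names, the namespace, and the two-second value are assumptions for illustration; choose a timeout that matches the latency you measure for your own server.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: example-mutating-webhook       # hypothetical name
webhooks:
- name: mutate.example.com             # hypothetical webhook name
  admissionReviewVersions: ["v1"]
  sideEffects: None
  timeoutSeconds: 2                    # allowed range is 1 to 30 seconds
  clientConfig:
    service:
      namespace: webhook-system        # hypothetical namespace
      name: example-webhook
      path: /mutate
    # caBundle is omitted in this sketch; supply the CA that signs
    # your webhook server certificate.
  rules:
  - apiGroups: ["apps"]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["deployments"]
```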
Admission webhooks should leverage some form of load-balancing to provide high
availability and performance benefits. If a webhook is running within the
cluster, you can run multiple webhook backends behind a Service of type
ClusterIP.
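A minimal sketch of that setup might look like the following, with three webhook server replicas behind a ClusterIP Service. The names, namespace, ports, and image are hypothetical.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-webhook
  namespace: webhook-system            # hypothetical namespace
spec:
  type: ClusterIP
  selector:
    app: example-webhook
  ports:
  - port: 443                          # port that the API server calls
    targetPort: 8443                   # port that the server listens on
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-webhook
  namespace: webhook-system
spec:
  replicas: 3                          # multiple backends for availability
  selector:
    matchLabels:
      app: example-webhook
  template:
    metadata:
      labels:
        app: example-webhook
    spec:
      containers:
      - name: server
        image: registry.example.com/example-webhook:v1   # hypothetical image
        ports:
        - containerPort: 8443
```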
Consider your cluster's availability requirements when designing your webhook.
For example, during node downtime or zonal outages, Kubernetes marks Pods as
NotReady to allow load balancers to reroute traffic to available zones and
nodes. These updates to Pods might trigger your mutating webhooks. Depending on
the number of affected Pods, the mutating webhook server has a risk of timing
out or causing delays in Pod processing. As a result, traffic won't get
rerouted as quickly as you need.
Consider situations like the preceding example when writing your webhooks. Exclude operations that are a result of Kubernetes responding to unavoidable incidents.
This section provides recommendations for filtering which requests trigger specific webhooks. In summary, these are as follows:
Admission webhooks are only called when an API request matches the corresponding webhook configuration. Limit the scope of each webhook to reduce unnecessary calls to the webhook server. Consider the following scope limitations:
kube-system namespace. If you run your own Pods in the kube-system namespace, use an objectSelector to avoid mutating a critical workload.
kube-node-lease system namespace. Mutating node leases might result in failed node upgrades. Only apply validation controls to Lease objects in this namespace if you're confident that the controls won't put your cluster at risk.
namespaceSelector.
Admission controllers support multiple fields that you can use to match requests that meet specific criteria. For example, you can use a namespaceSelector to filter for requests that target a specific namespace.
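As an illustration, the following fragment limits a webhook to namespaces that opt in through a label. The label key and the webhook name are hypothetical.

```yaml
# Fragment of a MutatingWebhookConfiguration. Other required fields
# (clientConfig, rules, sideEffects, admissionReviewVersions) are omitted.
webhooks:
- name: mutate.example.com
  namespaceSelector:
    matchLabels:
      example.com/mutation: enabled    # hypothetical opt-in label on the Namespace
```

With a selector like this, requests in system namespaces such as kube-system and kube-node-lease only reach the webhook if someone deliberately adds the label to those namespaces.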
For more fine-grained request filtering, use the matchConditions field in your
webhook configuration. This field lets you write multiple CEL expressions that
must evaluate to true for a request to trigger your admission webhook. Using
matchConditions might significantly reduce the number of calls to your webhook
server.
For details, see
Matching requests: matchConditions.
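The following fragment sketches two match conditions on a broadly scoped webhook. The webhook name is hypothetical, and the broad rules are only there to show how the conditions narrow them.

```yaml
# Fragment of a webhook configuration; clientConfig, sideEffects, and
# admissionReviewVersions are omitted.
webhooks:
- name: mutate.example.com
  rules:
  - apiGroups: ["*"]
    apiVersions: ["*"]
    operations: ["CREATE", "UPDATE"]
    resources: ["*"]
  matchConditions:
  # Skip Lease objects entirely.
  - name: exclude-leases
    expression: '!(request.resource.group == "coordination.k8s.io" && request.resource.resource == "leases")'
  # Skip requests that are made by nodes.
  - name: exclude-node-requests
    expression: '!("system:nodes" in request.userInfo.groups)'
```

Every expression has to evaluate to true for the API server to call the webhook, so each condition here is written as an exclusion.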
By default, admission webhooks run on any API versions that affect a specified
resource. The matchPolicy field in the webhook configuration controls this
behavior. Specify a value of Equivalent in the matchPolicy field or omit
the field to allow the webhook to run on any API version.
For details, see
Matching requests: matchPolicy.
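For example, in the following fragment the rules only name apps/v1 Deployments. With matchPolicy set to Equivalent, a request that modifies a Deployment through a different API version is converted to apps/v1 and still sent to the webhook. The webhook name is hypothetical.

```yaml
# Fragment of a webhook configuration; other required fields are omitted.
webhooks:
- name: mutate.example.com
  matchPolicy: Equivalent              # Exact would only match apps/v1 requests
  rules:
  - apiGroups: ["apps"]
    apiVersions: ["v1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["deployments"]
```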
This section provides recommendations for the scope of mutations and any special considerations for object fields. In summary, these are as follows:
Admission webhook servers send HTTP responses to indicate what to do with a
specific Kubernetes API request. This response is an AdmissionReview object.
A mutating webhook can add specific fields to mutate before allowing admission
by using the patchType field and the patch field in the response. Ensure
that you only modify the fields that require a change.
For example, consider a mutating webhook that's configured to ensure that
web-server Deployments have at least three replicas. When a request to
create a Deployment object matches your webhook configuration, the webhook
should only update the value in the spec.replicas field.
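A response for the replica example might look roughly like the following. It is rendered as YAML here for readability, but webhook servers return the AdmissionReview as JSON. The placeholder values are not real data.

```yaml
apiVersion: admission.k8s.io/v1
kind: AdmissionReview
response:
  uid: "<value of request.uid from the incoming AdmissionReview>"
  allowed: true
  patchType: JSONPatch
  # patch is the base64 encoding of a JSONPatch document. Decoded, a
  # patch that touches only the replica count could look like:
  #   [{"op": "replace", "path": "/spec/replicas", "value": 3}]
  patch: "<base64-encoded JSONPatch>"
```

The patch contains a single operation on /spec/replicas and leaves every other field of the Deployment untouched.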
Fields in Kubernetes object specifications might include arrays. Some arrays
contain key:value pairs (like the env field in a container specification),
while other arrays are unkeyed (like the readinessGates field in a Pod
specification). The order of values in an array field might matter in some
situations. For example, the order of arguments in the args field of a
container specification might affect the container.
Consider the following when modifying arrays:
Use the add JSONPatch operation instead of replace to avoid accidentally replacing a required value.
Ensure that your webhooks operate only on the content of the AdmissionReview
that's sent to them, and do not make out-of-band changes. These additional
changes, called side effects, might cause conflicts during admission if they
aren't reconciled properly. The .webhooks[].sideEffects field should
be set to None if a webhook doesn't have any side effect.
If side effects are required during the admission evaluation, they must be
suppressed when processing an AdmissionReview object with dryRun set to
true, and the .webhooks[].sideEffects field should be set to NoneOnDryRun.
For details, see Side effects.
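As a sketch, a webhook that has side effects but suppresses them for dry-run requests could declare that as follows. The webhook name is hypothetical.

```yaml
# Fragment of a webhook configuration; other required fields are omitted.
webhooks:
- name: mutate.example.com
  # The server must check request.dryRun in each AdmissionReview and skip
  # its out-of-band changes when the value is true.
  sideEffects: NoneOnDryRun
```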
A webhook running inside the cluster might cause deadlocks for its own deployment if it is configured to intercept resources required to start its own Pods.
For example, a mutating admission webhook is configured to admit create Pod
requests only if a certain label is set in the Pod (such as env: prod).
The webhook server runs in a Deployment that doesn't set the env label.
When a node that runs the webhook server Pods becomes unhealthy, the webhook
Deployment tries to reschedule the Pods to another node. However, the existing
webhook server rejects the requests since the env label is unset. As a
result, the migration cannot happen.
Exclude the namespace where your webhook is running with a
namespaceSelector.
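One way to express that exclusion is to match on the kubernetes.io/metadata.name label, which the API server sets on every namespace. The namespace name webhook-system is an assumption.

```yaml
# Fragment of a webhook configuration; other required fields are omitted.
webhooks:
- name: mutate.example.com
  namespaceSelector:
    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: NotIn
      values: ["webhook-system"]       # hypothetical namespace that hosts the webhook server
```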
Dependency loops can occur in scenarios like the following:
To avoid these dependency loops, try the following:
objectSelector.
Mutating admission webhooks support the failurePolicy configuration field.
This field indicates whether the API server should admit or reject the request
if the webhook fails. Webhook failures might occur because of timeouts or errors
in the server logic.
By default, admission webhooks set the failurePolicy field to Fail. The API
server rejects a request if the webhook fails. However, rejecting requests by
default might result in compliant requests being rejected during webhook
downtime.
Let your mutating webhooks "fail open" by setting the failurePolicy field to
Ignore. Use a validating controller to check the state of requests to ensure
that they comply with your policies.
This approach has the following benefits:
In general, design your webhooks under the assumption that Kubernetes APIs might
change in a later version. Don't write a server that takes the stability of an
API for granted. For example, the release of sidecar containers in Kubernetes
added a restartPolicy field to the Pod API.
Mutating webhooks that respond to a broad range of API requests might unintentionally trigger themselves. For example, consider a webhook that responds to all requests in the cluster. If you configure the webhook to create Event objects for every mutation, it'll respond to its own Event object creation requests.
To avoid this, consider setting a unique label in any resources that your webhook creates. Exclude this label from your webhook match conditions.
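A sketch of this approach, using an objectSelector to skip anything that carries the marker label, might look like the following. The label key is hypothetical; the webhook server would need to set it on every resource that it creates.

```yaml
# Fragment of a webhook configuration; other required fields are omitted.
webhooks:
- name: mutate.example.com
  objectSelector:
    matchExpressions:
    - key: example.com/created-by-webhook   # hypothetical marker label
      operator: DoesNotExist
```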
Some Kubernetes objects in the API server can't change. For example, when you deploy a static Pod, the kubelet on the node creates a mirror Pod in the API server to track the static Pod. However, changes to the mirror Pod don't propagate to the static Pod.
Don't attempt to mutate these objects during admission. All mirror Pods have the
kubernetes.io/config.mirror annotation. To exclude mirror Pods while reducing
the security risk of ignoring an annotation, allow static Pods to only run in
specific namespaces.
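The following fragment sketches one possible way to combine both ideas: the webhook skips Pods that have the mirror annotation, but only in a namespace that is reserved for static Pods. The namespace name static-pods is an assumption, and this fragment does not by itself restrict where static Pods can run.

```yaml
# Fragment of a webhook configuration; other required fields are omitted.
webhooks:
- name: mutate.example.com
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
  matchConditions:
  # Skip mirror Pods, but only in the namespace reserved for static Pods.
  - name: skip-mirror-pods-in-static-namespace
    expression: >-
      !(request.namespace == "static-pods" &&
      has(object.metadata.annotations) &&
      "kubernetes.io/config.mirror" in object.metadata.annotations)
```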
This section provides recommendations for webhook order and designing idempotent webhooks. In summary, these are as follows:
Mutating admission webhooks don't run in a consistent order. Various factors might change when a specific webhook is called. Don't rely on your webhook running at a specific point in the admission process. Other webhooks could still mutate your modified object.
The following recommendations might help to minimize the risk of unintended changes:
Every mutating admission webhook should be idempotent. The webhook should be able to run on an object that it already modified without making additional changes beyond the original change.
Additionally, all of the mutating webhooks in your cluster should, as a collection, be idempotent. After the mutation phase of admission control ends, every individual mutating webhook should be able to run on an object without making additional changes to the object.
Depending on your environment, ensuring idempotence at scale might be challenging. The following recommendations might help:
The following examples show idempotent mutation logic:
For a create Pod request, set the field
.spec.securityContext.runAsNonRoot of the Pod to true.
For a create Pod request, if the field
.spec.containers[].resources.limits of a container is not set, set default
resource limits.
For a create Pod request, inject a sidecar container with name
foo-sidecar if no container with the name foo-sidecar already exists.
In these cases, the webhook can be safely reinvoked, or admit an object that already has the fields set.
The following examples show non-idempotent mutation logic:
For a create Pod request, inject a sidecar container with name
foo-sidecar suffixed with the current timestamp (such as
foo-sidecar-19700101-000000).
Reinvoking the webhook can result in the same sidecar being injected multiple times to a Pod, each time with a different container name. Similarly, the webhook can inject duplicated containers if the sidecar already exists in a user-provided Pod.
For a create/update Pod request, reject if the Pod has label env
set, otherwise add an env: prod label to the Pod.
Reinvoking the webhook will result in the webhook failing on its own output.
For a create Pod request, append a sidecar container named foo-sidecar
without checking whether a foo-sidecar container exists.
Reinvoking the webhook will result in duplicated containers in the Pod, which makes the request invalid and rejected by the API server.
This section provides recommendations for testing your mutating webhooks and validating mutated objects. In summary, these are as follows:
Robust testing should be a core part of your release cycle for new or updated webhooks. If possible, test any changes to your cluster webhooks in a staging environment that closely resembles your production clusters. At the very least, consider using a tool like minikube or kind to create a small test cluster for webhook changes.
Your mutating webhooks shouldn't break any of the validations that apply to an object before admission. For example, consider a mutating webhook that sets the default CPU request of a Pod to a specific value. If the CPU limit of that Pod is set to a lower value than the mutated request, the Pod fails admission.
Test every mutating webhook against the validations that run in your cluster.
Before upgrading your production clusters to a new minor version, test your webhooks and workloads in a staging environment. Compare the results to ensure that your webhooks continue to function as expected after the upgrade.
Additionally, use the following resources to stay informed about API changes:
Mutating webhooks run to completion before any validating webhooks run. There is no stable order in which mutations are applied to objects. As a result, your mutations could get overwritten by a mutating webhook that runs at a later time.
Add a validating admission controller like a ValidatingAdmissionWebhook or a
ValidatingAdmissionPolicy to your cluster to ensure that your mutations
are still present. For example, consider a mutating webhook that inserts the
restartPolicy: Always field to specific init containers to make them run as
sidecar containers. You could run a validating webhook to ensure that those
init containers retained the restartPolicy: Always configuration after all
mutations were completed.
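As a sketch of the policy-based variant, the following ValidatingAdmissionPolicy rejects Pods in which an init container named foo-sidecar lacks restartPolicy: Always. The container name and object names are hypothetical, and the example assumes a cluster that serves ValidatingAdmissionPolicy in admissionregistration.k8s.io/v1.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: sidecar-restart-policy         # hypothetical name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["pods"]
  validations:
  # Passes when there are no init containers, or when every init container
  # named foo-sidecar still has restartPolicy set to Always.
  - expression: >-
      !has(object.spec.initContainers) ||
      object.spec.initContainers.all(c, c.name != "foo-sidecar" ||
      (has(c.restartPolicy) && c.restartPolicy == "Always"))
    message: "init container foo-sidecar must keep restartPolicy: Always"
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: sidecar-restart-policy
spec:
  policyName: sidecar-restart-policy
  validationActions: ["Deny"]
```

Because validation runs after all mutations are complete, this check sees the final object, including changes made by other mutating webhooks.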
For details, see the following resources:
This section provides recommendations for deploying your mutating admission webhooks. In summary, these are as follows:
When you're ready to deploy your mutating webhook to a cluster, use the following order of operations:
Set the failurePolicy field in the MutatingWebhookConfiguration manifest to Ignore. This lets you avoid disruptions caused by misconfigured webhooks.
Set the namespaceSelector field in the MutatingWebhookConfiguration manifest to a test namespace.
Monitor the webhook in the test namespace to check for any issues, then roll the webhook out to other namespaces. If the webhook intercepts an API request that it wasn't meant to intercept, pause the rollout and adjust the scope of the webhook configuration.
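An initial rollout configuration along these lines might look like the following sketch. The webhook, Service, and namespace names are hypothetical.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: MutatingWebhookConfiguration
metadata:
  name: example-mutating-webhook       # hypothetical name
webhooks:
- name: mutate.example.com
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Ignore                # fail open while the webhook is being rolled out
  namespaceSelector:
    matchExpressions:
    - key: kubernetes.io/metadata.name
      operator: In
      values: ["webhook-test"]         # hypothetical test namespace
  clientConfig:
    service:
      namespace: webhook-system        # hypothetical namespace
      name: example-webhook
      path: /mutate
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
```

To widen the rollout, you could add namespaces to the values list or switch to a label-based selector once the webhook behaves as expected.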
Mutating webhooks are powerful Kubernetes controllers. Use RBAC or another authorization mechanism to limit access to your webhook configurations and servers. For RBAC, ensure that the following access is only available to trusted entities:
admissionregistration.k8s.io/v1
If your mutating webhook server runs in the cluster, limit access to create or modify any resources in that namespace.
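For RBAC, a sketch of a role that grants write access to webhook configurations, bound only to a trusted group, could look like the following. The role name and the platform-admins group are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: webhook-config-admin           # hypothetical name
rules:
- apiGroups: ["admissionregistration.k8s.io"]
  resources:
  - mutatingwebhookconfigurations
  - validatingwebhookconfigurations
  verbs: ["create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: webhook-config-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: webhook-config-admin
subjects:
- kind: Group
  apiGroup: rbac.authorization.k8s.io
  name: platform-admins                # hypothetical trusted group
```

It is also worth auditing existing roles for wildcard rules that grant the same verbs on these resources to other subjects.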
The following projects are examples of "good" custom webhook server implementations. You can use them as a starting point when designing your own webhooks. Don't use these examples as-is; use them as a starting point and design your webhooks to run well in your specific environment.
Items on this page refer to third party products or projects that provide functionality required by Kubernetes. The Kubernetes project authors aren't responsible for those third-party products or projects. See the CNCF website guidelines for more details.