nginx-ingress-controller pods fail to load configuration with "client intended to send too large body" error in versions lower than nginx-0.32.0-rancher1

Issue

The nginx-ingress-controller pods fail to load their configuration, causing some Ingress resources to fail. The logs of the nginx-ingress-controller pods contain error messages of the following format:

2020-09-22T19:06:19.696272452Z 2020/09/22 19:06:19 [error] 5832#5832: *28190476 client intended to send too large body: 10696855 bytes, client: unix:, server: , request: "POST /configuration/servers HTTP/1.1", host: "nginx-status"
2020-09-22T19:06:19.718950185Z W0922 19:06:19.718851       8 controller.go:176] Dynamic reconfiguration failed: unexpected error code: 413

The nginx-ingress-controller updates its configuration dynamically by POSTing the data to the /configuration endpoint. In nginx-ingress-controller versions lower than nginx-0.26.0, the client_max_body_size for this endpoint is hardcoded to 10m, so the request fails whenever the configuration data exceeds 10m. In the log entry above, the configuration body is 10696855 bytes, or approximately 10.2m. The controller retries the failed request repeatedly, which increases CPU usage in the nginx-ingress-controller pods.
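
The size comparison described above can be checked with a short Python sketch. It assumes that the 10m limit is interpreted by nginx as 10 * 1024 * 1024 bytes, and the helper name rejected_body_size is hypothetical, introduced only for this illustration.

```python
import re

# nginx reads a size suffix of "m" as 1024 * 1024 bytes, so 10m is 10485760 bytes.
LIMIT_BYTES = 10 * 1024 * 1024

BODY_SIZE_RE = re.compile(r"client intended to send too large body: (\d+) bytes")


def rejected_body_size(log_line):
    """Return the rejected body size in bytes, or None if the line does not match."""
    match = BODY_SIZE_RE.search(log_line)
    return int(match.group(1)) if match else None


# The nginx part of the first log entry shown in this article.
line = (
    "2020/09/22 19:06:19 [error] 5832#5832: *28190476 client intended to send "
    "too large body: 10696855 bytes, client: unix:, server: , "
    'request: "POST /configuration/servers HTTP/1.1", host: "nginx-status"'
)

size = rejected_body_size(line)
print(f"{size} bytes = {size / 2**20:.1f}m, over the limit by {size - LIMIT_BYTES} bytes")
# prints: 10696855 bytes = 10.2m, over the limit by 211095 bytes
```

Running the same function over a saved copy of the pod logs gives a rough idea of how far the configuration has grown beyond the limit.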

Pre-requisites

  • A Rancher Kubernetes Engine (RKE) CLI or Rancher v2.x provisioned Kubernetes cluster
  • nginx-ingress-controller version lower than nginx-0.32.0-rancher1

Workaround

Remove some Ingress resources in the cluster to reduce the configuration size below the 10m limit.

N.B. To identify the recently added Ingress resources that pushed the configuration size over the 10m limit, check the age of the Ingresses.
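
As a minimal sketch of the age check, the Python below lists Ingresses newest first from JSON in the shape returned by kubectl get ingress --all-namespaces -o json. The function name newest_ingresses and the sample data are hypothetical and only illustrate the approach.

```python
import json


def newest_ingresses(ingress_list, count=10):
    """Return (creationTimestamp, namespace, name) tuples, newest first."""
    rows = [
        (
            item["metadata"]["creationTimestamp"],
            item["metadata"].get("namespace", ""),
            item["metadata"]["name"],
        )
        for item in ingress_list.get("items", [])
    ]
    # creationTimestamp values are UTC strings in one fixed format,
    # so sorting them as plain strings orders them by time.
    return sorted(rows, reverse=True)[:count]


# Hypothetical sample data, not taken from a real cluster.
sample = json.loads("""
{"items": [
  {"metadata": {"name": "web", "namespace": "default",
                "creationTimestamp": "2020-03-01T10:00:00Z"}},
  {"metadata": {"name": "shop-canary", "namespace": "shop",
                "creationTimestamp": "2020-09-22T18:55:01Z"}},
  {"metadata": {"name": "api", "namespace": "default",
                "creationTimestamp": "2020-09-21T08:30:00Z"}}
]}
""")

for created, namespace, name in newest_ingresses(sample):
    print(created, namespace, name)
```

To use it against a cluster, save the kubectl output to a file and pass the result of json.load to newest_ingresses instead of the sample.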

Resolution

In ingress-nginx versions 0.26.0 and above, the client_max_body_size for the /configuration endpoint is dynamic.

To take advantage of this fix, upgrade the Kubernetes version of the cluster to one of the following versions or later, all of which use nginx-ingress-controller:nginx-0.32.0-rancher1 or above:

  • v1.15.12-rancher1-1
  • v1.16.10-rancher2-1
  • v1.17.11-rancher1-1
  • v1.18.3-rancher2-1

RKE CLI and Rancher v2.x provisioned Kubernetes clusters running Kubernetes v1.19+ use a later version of ingress-nginx, which also includes the fix.
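
To check whether a given controller image tag is at or above nginx-0.32.0-rancher1, a comparison along the following lines may help. This is a sketch that assumes tags follow the nginx-X.Y.Z-rancherN pattern seen in this article. The helper names and the example tags passed to includes_fix are illustrative only.

```python
import re

# First Rancher image tag named in this article as containing the fix.
FIXED_TAG = "nginx-0.32.0-rancher1"

# Assumed tag pattern: nginx-<major>.<minor>.<patch>-rancher<build>
TAG_RE = re.compile(r"^nginx-(\d+)\.(\d+)\.(\d+)-rancher(\d+)$")


def parse_tag(tag):
    """Split a tag into a tuple of integers that compares in version order."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognised tag format: {tag!r}")
    return tuple(int(part) for part in match.groups())


def includes_fix(tag):
    """Return True if the tag is at or above FIXED_TAG."""
    return parse_tag(tag) >= parse_tag(FIXED_TAG)


print(includes_fix("nginx-0.25.1-rancher1"))  # prints: False
print(includes_fix("nginx-0.32.0-rancher1"))  # prints: True
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating 0.9.0 as newer than 0.32.0.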
