On a Rancher v2.x provisioned cluster, a host shows a large number of containers running the rancher-agent image, per the following output of docker ps | grep rancher-agent:

    $ docker ps | grep rancher-agent
    ...
    aeffe9725521   rancher/rancher-agent:v2.3.3   "run.sh --server htt…"   About a minute ago   Up About a minute   sleepy_hopper
    130120f49b71   rancher/rancher-agent:v2.3.3   "run.sh --server htt…"   6 minutes ago        Up 6 minutes        stoic_hypatia
    498b923d9b6e   rancher/rancher-agent:v2.3.3   "run.sh --server htt…"   11 minutes ago       Up 11 minutes       laughing_elbakyan
    3453865e5f70   rancher/rancher-agent:v2.3.3   "run.sh --server htt…"   16 minutes ago       Up 16 minutes       wonderful_gagarin
    f925209cd16a   rancher/rancher-agent:v2.3.3   "run.sh --server htt…"   21 minutes ago       Up 21 minutes       silly_shannon
    7d7fb5d4bf04   rancher/rancher-agent:v2.3.3   "run.sh --server htt…"   26 minutes ago       Up 26 minutes       gifted_elgamal
    ...
Running docker inspect <container_id> against these containers shows that the Path and Args are of the following format:

    "Path": "run.sh",
    "Args": [
        "--server",
        "https://22.214.171.124",
        "--token",
        "gwrp7zlnwvsnzh2nhbvwcgdw45ccv6cq9pztzdd92j6xlv69xxhvnp",
        "--ca-checksum",
        "bbc8c7ca05c87a7140154554fa1a516178852f2710538c57718f4c874c29533c",
        "--no-register",
        "--only-write-certs"
    ],
This issue can be observed in the following environment:

- A Rancher v2.x provisioned Kubernetes cluster, using either custom nodes or nodes hosted in an infrastructure provider.
- Repeated deletion of stopped containers on hosts in the cluster, e.g. use of docker system prune, either manually or as part of an automated process such as a cron job (see the example after this list).
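For illustration, a scheduled prune job like the following hypothetical crontab entry would repeatedly remove all stopped containers on a host, including the exited share-mnt container described below:

    # Hypothetical crontab entry: prune stopped containers nightly at 02:00.
    # Any schedule that runs 'docker system prune --force' has the same effect.
    0 2 * * * /usr/bin/docker system prune --force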
This behaviour is a result of the issue tracked in Rancher GitHub issue #15364.
A share-mnt container is created on a Rancher provisioned Kubernetes cluster and exits upon completion, but it is deliberately not removed, so that it can be invoked again. Meanwhile, the Rancher node-agent Pod on a host will spawn a new share-mnt container if the existing share-mnt container is removed. Upon starting, the share-mnt process spawns a rancher-agent container to write certificates. This agent container will run indefinitely, until the node-agent is triggered to reconnect to the Rancher server or the node-agent process is restarted.

As a result, where the share-mnt container on a host is removed repeatedly, either manually or by an automated process, multiple running rancher-agent containers accumulate on the host.
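As a minimal way to observe this loop on a test host only, assuming the exited helper container is named share-mnt (the name may differ by Rancher version), you can remove it and watch the node-agent recreate it along with a new long-running rancher-agent container:

    # Assumption: the exited helper container is named 'share-mnt' on this host.
    docker ps -a --filter "name=share-mnt"
    # Removing it prompts the node-agent to spawn a replacement, which in turn
    # starts another long-running rancher-agent container.
    docker rm share-mnt
    watch 'docker ps | grep rancher-agent'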
To trigger automatic removal of the rancher-agent containers, the node-agent container on the host can be restarted. Identify the running agent container with docker ps | grep k8s_agent_cattle-node, then restart it with docker restart <container_id>.
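As a sketch, assuming exactly one cattle-node-agent container matches on the host, the two steps can be combined:

    # Restart the node-agent container; on reconnecting to the Rancher server
    # it cleans up the accumulated rancher-agent containers.
    docker restart $(docker ps --filter "name=k8s_agent_cattle-node" --format "{{.ID}}")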
In addition, you can prevent further creation of multiple rancher-agent container instances by removing whichever process is triggering the deletion of stopped containers, as in the sketch below.
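For example, if the trigger is a crontab entry like the hypothetical one above, it can be located and removed as follows. Automated pruning may also live in /etc/cron.d, a systemd timer, or a configuration management tool, so check those locations too:

    # Check the current user's crontab for a prune job...
    crontab -l | grep 'docker system prune'
    # ...and, if found, rewrite the crontab without it.
    crontab -l | grep -v 'docker system prune' | crontab -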
An enhancement request to prevent the creation of multiple long-running rancher-agent containers, in the event of repeated deletion of the share-mnt container, is tracked in Rancher GitHub issue #15364.