Supportability Review - Rancher 2.x

This article aims to provide a number of checks that can be evaluated to ensure best practices are in place when planning, building or preparing a Rancher 2.x and Kubernetes environment.

Architecture

1.1 Nodes

Understanding workload resource needs in downstream clusters upfront can help in choosing an appropriate node configuration. Some nodes may need different configurations; however, all nodes of the same role are generally configured the same.

   Checks

Standardize on supported versions and ensure minimum requirements are met:

  • Confirm the OS is covered in the supported versions
  • Resource needs vary with cluster size and workload; however, in general, at least 8GB of memory and 2 vCPUs are recommended
  • SSD storage is recommended, especially for nodes with the etcd role
  • Firewall rules allow connectivity between nodes (k3s/RKE)
  • A static IP is required for all nodes; if using DHCP, each node should have a reserved address
  • Swap is disabled on the nodes
  • NTP is enabled on the nodes
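As a sketch, the swap and NTP checks above can be scripted for each node. This assumes a Linux host; timedatectl requires systemd, so chronyc tracking or ntpq -p may be needed on other setups:

```shell
# Sketch: confirm swap is disabled and NTP is active on a node.
# Assumes a Linux host; timedatectl requires systemd.

check_swap() {
  # /proc/swaps has a header line plus one line per active swap device
  devices=$(tail -n +2 /proc/swaps 2>/dev/null | wc -l)
  if [ "$devices" -eq 0 ]; then
    echo "OK: swap is disabled"
  else
    echo "WARN: ${devices} active swap device(s); run 'swapoff -a' and update /etc/fstab"
  fi
}

check_ntp() {
  # NTP synchronisation status on systemd-based distributions
  timedatectl show --property=NTPSynchronized 2>/dev/null \
    || echo "INFO: timedatectl unavailable; check 'chronyc tracking' or 'ntpq -p' instead"
}

check_swap
check_ntp
```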

1.2 Separation of concerns

The Rancher management cluster should be dedicated to running the Rancher deployment; additional workloads added to the cluster can contend for resources and impact the performance and predictability of Rancher.

This is also important to consider in downstream clusters; the etcd and control plane nodes (RKE), and server nodes (k3s), should be dedicated to this purpose. When possible, it is recommended that each node have a single role, for example, separate nodes for the etcd and control plane roles.

   Checks
Using the following commands on each cluster, check for any unexpected workloads running on the Rancher management cluster, or on the server or etcd/control plane nodes of a downstream cluster.

Rancher management cluster

  • Check for any unexpected pods running in the cluster: kubectl get pods --all-namespaces
  • Check for any single points of failure or discrepancies in OS, kernel and CRI version: kubectl get nodes -o wide

Downstream cluster

k3s
  • Check for any unexpected pods running on server nodes:
    for n in $(kubectl get nodes -l node-role.kubernetes.io/master=true --no-headers | cut -d " " -f1)
    do
      kubectl get nodes --field-selector metadata.name=${n} --no-headers
      kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=${n}; echo
    done
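For RKE clusters, where the etcd and control plane roles are labelled individually, a similar loop can be used; this sketch assumes the default node-role.kubernetes.io labels applied by RKE:

```shell
# Sketch for RKE: list pods running on etcd and control plane nodes.
# Assumes the default RKE node labels; requires kubectl access to the cluster.
for role in etcd controlplane
do
  for n in $(kubectl get nodes -l "node-role.kubernetes.io/${role}=true" --no-headers 2>/dev/null | cut -d " " -f1)
  do
    kubectl get nodes --field-selector metadata.name=${n} --no-headers
    kubectl get pods --all-namespaces -o wide --field-selector spec.nodeName=${n}; echo
  done
done
```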

1.3 High Availability

Ensure nodes within a cluster are spread across separate failure boundaries as much as possible. This could mean VMs running on separate physical hosts, data centres, switches, storage pools, etc. If running in a cloud environment, use instances in separate availability zones.

For High Availability in Rancher, a Kubernetes install is required.
   Checks
    • When deploying the Rancher management cluster it is recommended to use the following configuration:

Distribution   Recommendation
k3s            2 server nodes
RKE            3 nodes with all roles
    • Confirm the components of all clusters and external datastores (k3s) are satisfying minimum HA requirements:

k3s

Component            Minimum   Recommended    Notes
external datastore   2         2 or greater   The external datastore should provide failover to a standby using the datastore-endpoint
server nodes         2         2 or greater   Allow tolerance for at least 1 server node failure
agent nodes          2         N/A            Allow tolerance for at least 1 agent node failure; scale up to meet workload needs

Cloud provider

The following commands can also be used with clusters configured with a cloud provider to review the instance type and availability zones of each node.

  • Kubernetes v1.16 or earlier: kubectl get nodes -L beta.kubernetes.io/instance-type -L failure-domain.beta.kubernetes.io/zone
  • Kubernetes v1.17 or later: kubectl get nodes -L node.kubernetes.io/instance-type -L topology.kubernetes.io/zone
These labels may not be available on all cloud providers.

1.4 Load balancer

To provide a consistent endpoint for the Rancher management cluster, a load balancer is highly recommended to ensure the Rancher agents, UI, and API connectivity can effectively reach the Rancher deployment.

   Checks

The load balancer is configured:

  • Within close proximity of the Rancher management cluster to reduce latency
  • For high availability, with all Rancher management nodes configured as upstream targets
  • With a health check to the following path:
Distribution   Health check path
k3s            /ping
RKE            /healthz

A health check interval of 30 seconds or less is generally recommended
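Before adding nodes as upstream targets, the health check endpoints can be verified directly; a sketch, where the node names are placeholders and the scheme and port depend on whether the balancer targets the ingress or the server processes:

```shell
# Sketch: confirm each Rancher management node answers its health check path.
# NODES is a placeholder list; use /ping for k3s and /healthz for RKE, and
# adjust the scheme/port to match the load balancer's upstream configuration.
NODES="node1.example.com node2.example.com node3.example.com"
for n in ${NODES}; do
  code=$(curl -sk -o /dev/null -w '%{http_code}' "https://${n}/healthz")
  echo "${n}: HTTP ${code}"
done
```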

1.5 Proximity and latency

For performance reasons, it is recommended to avoid spreading cluster nodes over long distances and unreliable networks. For example, nodes could be in separate AZs in the same region, the same datacenter, or separate nearby data centres.

This is particularly important for etcd nodes, which are sensitive to network latency; the RTT between etcd nodes in the cluster determines the minimum time to complete a commit.

   Checks
  • Network latency and bandwidth are adequate between the locations where cluster nodes will be provisioned
A tool like mtr can be used to gather connectivity statistics between locations over a long sample period, reporting on packet loss and latency.
Generally, latency between etcd nodes is recommended to be 5ms or less
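The mtr sampling mentioned above might look like the following; the peer address is a placeholder, and the flags are standard mtr report options:

```shell
# Sketch: report packet loss and latency between candidate node locations.
# <peer-node-ip> is a placeholder; raise --report-cycles for a longer sample.
mtr --report --report-wide --report-cycles 100 <peer-node-ip>

# Optionally probe the etcd client port specifically over TCP
mtr --report --report-cycles 100 --tcp --port 2379 <peer-node-ip>
```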

1.6 Datastore

It is important to ensure that the chosen datastore is capable of handling requests in line with the workload of the cluster.

Allocation of resources, storage performance, and tuning of the datastore may need revisiting over time; this could be due to increased churn in a cluster, downstream clusters growing in size, or an increase in the number of downstream clusters Rancher is managing.

   Checks

Confirm the recommended options are met for the distribution in use:

k3s

With an external datastore the general performance requirements include:

  • SSD or similar storage providing 1,000 IOPs or greater performance
  • Datastore servers are assigned 2 vCPUs and 4GB memory or greater
  • A low latency connection to the datastore endpoint from all k3s server nodes
MySQL 5.7 is recommended. If running in a cloud provider, you may wish to utilise a managed database service.
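To sanity check storage performance for the datastore (or etcd under RKE), a tool like fio can sample sync write latency; the job below mirrors the commonly referenced etcd disk benchmark, and its parameters and paths are assumptions, not Rancher requirements:

```shell
# Sketch: measure fdatasync write latency on the datastore disk with fio.
# The 99th percentile fdatasync latency should ideally sit below 10ms.
mkdir -p /var/lib/datastore-test
fio --rw=write --ioengine=sync --fdatasync=1 \
    --directory=/var/lib/datastore-test \
    --size=22m --bs=2300 --name=datastore-perf
```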

1.7 CIDR selection

The cluster and service CIDRs cannot be changed once a cluster is provisioned.

For this reason, it is important to future-proof the cluster by adjusting these ranges when the defaults are not suitable, avoiding routing overlaps with other areas of the network and potential cluster IP exhaustion.

   Checks
  • The default CIDR ranges do not overlap with any area of the network
The default CIDRs, shown below, often don't need to be changed; however, to ensure there are no issues with routing from or to pods, you may wish to adjust these when creating clusters (RKE, k3s).

Network   Default CIDR
Cluster   10.42.0.0/16
Service   10.43.0.0/16

Reducing the CIDR sizes lowers the number of available IPs, and therefore the total number of pods and services in the cluster. In a large cluster, the CIDR ranges may need to be increased.
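If the defaults do overlap with existing networks, the ranges can be set at cluster creation; a k3s sketch with illustrative ranges (for RKE the equivalent values live in cluster.yml):

```shell
# Sketch: set non-default cluster and service CIDRs when creating a k3s cluster.
# The ranges shown are illustrative; they cannot be changed after provisioning.
k3s server \
  --cluster-cidr 10.168.0.0/16 \
  --service-cidr 10.169.0.0/16
```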

1.8 Authorized cluster endpoint

At times, connecting directly to a downstream cluster may be desired; this could be to reduce latency, to avoid interruption if Rancher is unavailable, or because a high frequency of external API calls occurs, for example from external monitoring or a CI/CD pipeline.

   Checks
Direct access to the downstream cluster kube-apiserver can be configured using the secondary context in the kubeconfig file.
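As a sketch, the additional context can be inspected and selected with kubectl; the context name shown is a placeholder, and Rancher typically generates one context per control plane node plus one for the endpoint itself:

```shell
# Sketch: switch to the authorized cluster endpoint context in the kubeconfig
# downloaded from Rancher. Context names here are placeholders.
kubectl config get-contexts

# Use the direct context to bypass the Rancher server
kubectl config use-context my-cluster-fqdn
kubectl get nodes
```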

Best Practices

2.1 Installing Rancher

It is highly encouraged to install Rancher on a Kubernetes cluster in an HA configuration.

If starting with small resource requirements, at the very minimum install on a Kubernetes cluster with a single node; this provides a path to adding nodes at a later date.

The single node Docker install is designed for short-lived testing environments; migration from a Docker install to a Kubernetes install is not possible.
   Checks
  • Rancher is installed on a Kubernetes cluster, even if that is a single node cluster

2.2 Rancher Resources

The minimum resource requirements for nodes in the Rancher management cluster need to scale to match the number of downstream clusters and nodes; this may change over time and need reviewing as changes occur in the environment.

   Checks

2.3 Chart options

When installing the Rancher helm chart, the default options may not always be the best fit for specific environments.

   Checks
  • The Rancher helm chart is installed with the desired options
  •  replicas - the default number of Rancher replicas (3) may not suit your cluster; for example, on a k3s cluster with 2 server nodes, a replicas value of 2 ensures only one Rancher pod runs per node
  •  antiAffinity - the default preferred scheduling can mean Rancher pods become imbalanced during the lifetime of a cluster; using required ensures Rancher is always scheduled on unique nodes
To confirm the options provided on an existing Rancher install with helm v3, the following command can be used: helm get values rancher -n cattle-system
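A sketch of applying these options with helm v3; the repository alias and hostname are placeholders:

```shell
# Sketch: install or upgrade Rancher with explicit replica and scheduling options.
helm upgrade --install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.example.com \
  --set replicas=2 \
  --set antiAffinity=required
```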

2.4 Supported versions

When choosing or maintaining the components for Rancher and Kubernetes clusters the product lifecycle and support matrix can be used to ensure the versions and OS configurations are certified and maintained.

   Checks
    • All Rancher and Kubernetes cluster versions are under maintenance and certified
As versions are a moving target, checking the current stable releases and planning for future upgrades on a schedule is recommended.

2.5 Recurring snapshots and backups

It is important to configure snapshots on a recurring schedule and store these externally to the cluster for disaster recovery.

   Checks
  • Recurring snapshots are configured for the distribution in use
Distribution   Configuration
k3s            Configure snapshots and backups on the external datastore; this can differ depending on the chosen database
RKE            Configure recurring snapshots of etcd, with an S3 compatible endpoint for off-node copies

In addition to a recurring schedule, it's important to take one-time snapshots of etcd (RKE) or the datastore (k3s) before and after significant changes.
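One-time snapshots might be taken as follows; the snapshot name, datastore endpoint, and credentials are placeholders:

```shell
# Sketch: one-time snapshot before a significant change.
# RKE - run from the workstation holding cluster.yml:
rke etcd snapshot-save --config cluster.yml --name pre-upgrade-$(date +%Y%m%d)

# k3s with a MySQL external datastore (host, user, and database are placeholders):
mysqldump -h db.example.com -u k3s -p kubernetes > k3s-pre-upgrade.sql
```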

2.6 Provisioning

Provisioning nodes and resources for Rancher and downstream clusters in a repeatable and automated way will greatly improve the supportability of Rancher and Kubernetes. This allows nodes to be replaced in a cluster easily, and new clusters created in a consistent way.

   Checks

The below points can help prepare the Rancher and Kubernetes environment with integrations and modern approaches to managing resources, such as infrastructure as code, CI/CD, immutable infrastructure, and configuration management:

  • Manifests and configuration data are stored in source control, treated as the source of truth for containerized applications
  • Automated build, deployment and/or configuration management
The rancher2 terraform provider and pulumi package can be used to manage clusters and resources as code.

2.7 Managing node lifecycle

When making significant planned changes to a node, such as restarting Docker, patching, shutting down, or removing the node, it is important to drain the node first to avoid disrupting in-flight connections.

For example, the kube-proxy component manages iptables rules on nodes to maintain service endpoints. If a node is suddenly shut down, stale endpoints and orphaned pods can be left in place for a period of time, causing connectivity issues.

In some cases, draining can be automated to handle unplanned events, such as when a node is terminated, restarted, or shut down.

   Checks
  • A process is in place to drain before planned disruptive changes are performed on a node
  • Where possible, node draining during the shutdown sequence is automated, for example, with a systemd or similar service
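A minimal sketch of the drain step; the node name is a placeholder, and older kubectl releases use --delete-local-data instead of --delete-emptydir-data:

```shell
# Sketch: drain a node ahead of planned maintenance, then return it to service.
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data --timeout=120s

# ... perform maintenance (patching, Docker restart, reboot) ...

kubectl uncordon <node-name>
```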

Operating Kubernetes

3.1 Capacity planning and Monitoring

It is recommended to measure resource usage of all clusters by enabling monitoring in Rancher, or your chosen solution. It is recommended to alert on resource thresholds and events in the cluster.

On supported platforms, the Cluster Autoscaler can be used to ensure the number of nodes is right-sized for the pod workload. Combining this with the Horizontal Pod Autoscaler provides both infrastructure and application scaling capabilities.

   Checks

3.2 Probes

In the defence against service and pod related failures, liveness and readiness probes are very useful; these can be in the form of HTTP requests, commands, or TCP connections.

   Checks
  • Liveness and Readiness probes are configured where necessary
  • Probes do not rely on the success of upstream dependencies, only the running application in the pod
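To find containers that have no liveness probe configured, the pod list can be filtered with jq; a sketch assuming kubectl access and jq on the workstation:

```shell
# Sketch: report containers without a livenessProbe from a pod listing.
# Feed it the JSON output of kubectl; requires jq.
missing_liveness() {
  jq -r '.items[]
    | . as $p
    | .spec.containers[]
    | select(has("livenessProbe") | not)
    | "\($p.metadata.namespace)/\($p.metadata.name) container=\(.name)"'
}

kubectl get pods --all-namespaces -o json 2>/dev/null | missing_liveness
```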

3.3 Resources

Assigning resource requests to pods allows the kube-scheduler to make more informed placement decisions, avoiding the "bin packing" of pods onto nodes and resource contention.

Limits also offer value in the form of a safety net against pods consuming an undesired amount of resources.

In addition to defining requests and limits for pods, it can also be useful to reserve capacity on nodes to prevent allocating resources that may be consumed by the kubelet and other system daemons, like Docker.

   Checks
  • All pods define resource requests and have limits configured where necessary
  • Nodes have system and daemon reservations where necessary
When Rancher Monitoring is enabled, the workload and pod metrics can be used to find a baseline of CPU and Memory for resource requests
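As a sketch, current allocations and usage can be reviewed per node to establish that baseline (kubectl top requires metrics to be available, for example via Rancher Monitoring or metrics-server):

```shell
# Sketch: review per-node requests/limits allocation and live usage.
kubectl describe nodes | grep -A 8 "Allocated resources:"

# Live usage, if metrics are available in the cluster
kubectl top nodes
kubectl top pods --all-namespaces
```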

3.4 OS Limits

Containerized applications can consume high amounts of OS resources, such as open files, connections, processes, filesystem space and inodes.

Often the defaults are adequate; however, establishing a standardized image for all nodes can help establish a baseline for all configuration and tuning.

   Checks

In general, the below can be used to confirm the OS limits allow adequate headroom for the workloads:

  • File descriptor usage: cat /proc/sys/fs/file-nr
  • User ulimits: ulimit -a

    Or, a particular process can be checked: cat /proc/PID/limits

  • Conntrack limits:

    cat /proc/sys/net/netfilter/nf_conntrack_max
    cat /proc/sys/net/netfilter/nf_conntrack_count

  • Filesystem space and inode usage: df -h and df -ih
Requirements for Linux can differ slightly depending on the distribution, refer to the Linux Requirements for more information.

3.5 Log rotation

To prevent large log files from accumulating and to apply a desired retention period, it is recommended to rotate OS and pod log files, and to configure an external log service to stream logs off the nodes for longer-term retention and easier searching.

   Checks

Containers

  • Log rotation is configured for the container logs
  • An external logging service is configured as needed

The below arguments for the INSTALL_K3S_EXEC environment variable can be used as an example to rotate container logs:

INSTALL_K3S_EXEC="--kubelet-arg container-log-max-files=5 --kubelet-arg container-log-max-size=100Mi"

OS

Rotation of log files on nodes is also important, especially if a long node lifecycle is expected.
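A sketch of a logrotate drop-in for container logs on an RKE node; the path assumes Docker's default json-file logging driver, and setting max-size/max-file in Docker's daemon.json is a common alternative:

```shell
# Sketch: rotate container log files with logrotate on an RKE node.
# Assumes Docker's json-file driver writing under /var/lib/docker/containers.
# LOGROTATE_DIR is overridable for testing; defaults to the system location.
conf_dir="${LOGROTATE_DIR:-/etc/logrotate.d}"
mkdir -p "${conf_dir}"
cat > "${conf_dir}/docker-containers" <<'EOF'
/var/lib/docker/containers/*/*.log {
  rotate 5
  daily
  maxsize 100M
  compress
  missingok
  copytruncate
}
EOF
echo "wrote ${conf_dir}/docker-containers"
```

copytruncate is used because the container runtime keeps the log file descriptor open.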

3.6 DNS scalability

DNS is a critical service running within the cluster. DNS queries are distributed throughout the cluster, and availability depends on the accessibility of the CoreDNS pods behind the service.

The Nodelocal DNS cache is a redesign of the DNS architecture and is recommended for clusters that may experience high DNS workload or issues.

   Checks

If a cluster has experienced a DNS issue, or high DNS workload is expected:

  • Check the output of conntrack -S on related nodes.

A high insert_failed counter can be indicative of a conntrack race condition; Nodelocal DNS cache is recommended to mitigate this.
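As a sketch, the per-CPU insert_failed counters can be summed to a single figure for easier comparison over time; this assumes the conntrack CLI is installed on the node:

```shell
# Sketch: sum the insert_failed counter across all CPUs from `conntrack -S`.
# A persistently growing total suggests the conntrack race that Nodelocal
# DNS cache mitigates.
sum_insert_failed() {
  awk '{
    for (i = 1; i <= NF; i++)
      if ($i ~ /^insert_failed=/) { split($i, kv, "="); total += kv[2] }
  } END { print total + 0 }'
}

conntrack -S 2>/dev/null | sum_insert_failed
```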
