How to monitor NTP on Linux nodes with Cluster Monitoring in Rancher v2.2.x+


Task

Time drift between nodes in a Kubernetes cluster can cause a range of issues, from difficulty correlating application log timestamps across nodes to a loss of etcd quorum (the consensus algorithm used in etcd is time-sensitive).

Using Rancher, you can monitor the state and processes of your cluster nodes, Kubernetes components, and software deployments through integration with Prometheus, a leading open-source monitoring solution.

This article details how to monitor time drift, via the Network Time Protocol (NTP), on Linux nodes within Rancher Kubernetes Engine (RKE) or Rancher v2.x provisioned clusters.

Pre-requisites

  • A Rancher v2.x instance, v2.2.0 or later
  • A Rancher Kubernetes Engine (RKE) CLI or Rancher v2.x provisioned Kubernetes cluster with Cluster Monitoring enabled, with Monitoring Version 0.2.0+
  • ntp configured on Linux nodes in the cluster (refer to the documentation for your Linux distribution on enabling and configuring ntp)

Steps

Enable the NTP collector on the Node Exporter DaemonSet

  1. Within the Rancher UI cluster view for the relevant cluster, navigate to Tools -> Monitoring
  2. In the bottom-right corner of the form, click Show advanced options
  3. Click Add Answer
  4. Configure the variable exporter-node.collectors.ntp.enabled with value true
  5. Click Save
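Once the change has rolled out, one way to confirm that the collector is active is to look for NTP metrics in the Node Exporter output. The following is a minimal sketch that assumes you have already obtained the exporter's metrics text by some means. The sample text is illustrative rather than captured from a real node, and the helper name `ntp_collector_healthy` is hypothetical. It relies on the `node_scrape_collector_success` metric that Node Exporter reports per collector.

```python
import re

# Illustrative sample of Node Exporter output (assumption: not from a real node).
SAMPLE_METRICS = """\
# TYPE node_ntp_offset_seconds gauge
node_ntp_offset_seconds 0.000213
node_scrape_collector_success{collector="cpu"} 1
node_scrape_collector_success{collector="ntp"} 1
"""

# Matches the per-collector success sample for the ntp collector.
_NTP_SUCCESS = re.compile(
    r'^node_scrape_collector_success\{collector="ntp"\}\s+(\S+)$'
)


def ntp_collector_healthy(metrics_text: str) -> bool:
    """Return True if the ntp collector reported a successful scrape."""
    for line in metrics_text.splitlines():
        match = _NTP_SUCCESS.match(line.strip())
        if match:
            return float(match.group(1)) == 1.0
    # No sample found: the collector is probably not enabled.
    return False


print(ntp_collector_healthy(SAMPLE_METRICS))
```

If the function returns False, either the collector is not enabled or the exporter could not reach an NTP server on the node.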

Configure an alert for NTP time drift

  1. Within the Rancher UI cluster view for the relevant cluster, navigate to Tools -> Alerts
  2. On the "A set of alerts for node" Alert Group, click Add Alert Rule
  3. Set Name to Node NTP time drift equal to or greater than 1 second
  4. Select Expression and enter node_ntp_offset_seconds, then set the comparison and threshold so that the rule fires when the value is equal to or greater than 1
  5. Click Create
  6. Configure a Notifier for the "A set of alerts for node" Alert Group by clicking the ellipsis for this Alert Group and configuring the desired notifier in the Alert section at the bottom of the form.
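The threshold logic behind this alert can be sketched in a few lines. The example below is an illustration, not part of Rancher or Prometheus: the node names, the sample offsets, and the helper `drifting_nodes` are all hypothetical. It compares the absolute value of each offset against the threshold, because `node_ntp_offset_seconds` can be negative when the local clock is ahead or behind. In PromQL, the equivalent would be an expression such as `abs(node_ntp_offset_seconds) >= 1`. This is a suggested variation and not the expression given in the steps above.

```python
from typing import Dict, List

# Alert threshold in seconds, matching the rule name used in this article.
THRESHOLD_SECONDS = 1.0


def drifting_nodes(
    offsets: Dict[str, float], threshold: float = THRESHOLD_SECONDS
) -> List[str]:
    """Return the nodes whose absolute NTP offset meets or exceeds the threshold."""
    return sorted(
        node for node, offset in offsets.items() if abs(offset) >= threshold
    )


# Hypothetical per-node values of node_ntp_offset_seconds.
sample_offsets = {"node-a": 0.0004, "node-b": -1.7, "node-c": 1.0}

print(drifting_nodes(sample_offsets))
```

A rule that checks only for values greater than or equal to 1 would miss `node-b` in this sample, which is why the sketch takes the absolute value.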

Further Reading


Comments

  • If you are using Chrony you may need to add 'allow 127/8' to the chrony.conf file for the NTP counters to appear in Prometheus.
