After rebooting a Kubernetes node, you may notice that pod-to-pod network connectivity (via the overlay network) does not function correctly until you restart the canal workload on that node.
- Kubernetes cluster running canal or flannel as the CNI
- Linux nodes running systemd v242 or higher
This is caused by a race condition between flannel and systemd-networkd that is being tracked in this upstream issue.
This does not appear to affect Ubuntu 20.04, due to its use of netplan to manage networking configuration.
Either restart canal on the node (kubectl delete pod -n kube-system canal-XXXX) as needed, or change the MACAddressPolicy for the flannel interfaces on your nodes to none:
cat <<'EOF' > /etc/systemd/network/10-flannel.link
[Match]
OriginalName=flannel*

[Link]
MACAddressPolicy=none
EOF
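If the same file has to be created on several nodes, the command above can be wrapped in a small function. This is only a sketch: the function name write_flannel_link and its optional directory argument are hypothetical and are not part of canal, flannel, or systemd. The directory defaults to /etc/systemd/network as used above; passing another directory lets you inspect the generated file before touching a node.

```shell
#!/bin/sh
# write_flannel_link is a hypothetical helper name, not an existing tool.
# It writes 10-flannel.link into the given directory
# (default: /etc/systemd/network) and prints the path of the written file.
write_flannel_link() {
    link_dir="${1:-/etc/systemd/network}"
    link_file="$link_dir/10-flannel.link"

    # Create the directory if it is missing; stop if that fails.
    mkdir -p "$link_dir" || return 1

    # Same content as the command in the article.
    # Rerunning the function overwrites the file with identical content.
    cat > "$link_file" <<'EOF'
[Match]
OriginalName=flannel*

[Link]
MACAddressPolicy=none
EOF

    printf '%s\n' "$link_file"
}
```

On a node, this would be run as root with no argument. The setting is expected to apply the next time the flannel interface is created, for example after the node is rebooted.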
At present there is no resolution and this bug is still open upstream.