HA Kubernetes RKE2 with Kube-VIP and Rancher

First, I want to clear up some terms that are easy to mix up. If you are new to Rancher, it can be hard to tell these concepts apart and to understand what each one is for. If you already know the differences between K3S, RKE, RKE2, and Rancher, you can skip the explanation that follows.

Rancher is software that creates and manages Kubernetes clusters. In my opinion, that is where the confusion starts: if you are not comfortable with K8S concepts, you run into a chicken-and-egg situation. Managing Kubernetes is Rancher's main job, and a single Rancher instance can manage multiple Kubernetes clusters. But if you don’t have a Kubernetes cluster yet, you can easily create one from the Rancher UI.
If you only need a single-node K8S, you can use this method: first run Rancher as a Docker container, then follow the installation wizard for K3S or RKE.

RKE stands for Rancher Kubernetes Engine and is Rancher’s command-line utility for creating, managing, and upgrading Kubernetes clusters. In other words, RKE is the name of a Kubernetes distribution, like OpenShift, MicroK8s, Mirantis, Tanzu, and EKS (AWS). You will usually hear just “Rancher”, though, because Rancher is the name of both the frontend product and the company. Alongside the Rancher product, the company released the K3S, RKE, and RKE2 Kubernetes distributions. If you need a lightweight Kubernetes, especially on IoT devices, use K3S. If you need a standard Kubernetes that runs on Docker, use RKE. If you need a more secure and powerful Kubernetes on containerd, use RKE2. My suggestion: if you need a highly available (HA) Rancher in production, use RKE2.

Now I will describe a basic installation of a 3-node HA Rancher Kubernetes cluster with RKE2. As I said, there are many installation methods available; I tried nearly all of them and settled on this one.

I used Ubuntu 20.04 as the OS. Your machines can be bare metal or VMs; I installed Ubuntu 20.04 on Proxmox virtualisation.

Do not forget to assign static IPs to your nodes. My nodes use “192.168.10.71” to “192.168.10.73”, and Kube-VIP needs one extra floating IP. I allocated “192.168.10.74” for it.

Now set up the domain names in your DNS. I created one name per node (rke-node-X.rke.domain.com, 3 in total) and an extra name for the Rancher UI (rancher.rke.domain.com). “rke.domain.com” is the base domain for my Rancher cluster, so I need a wildcard SSL certificate for it (*.rke.domain.com). If you don’t have an SSL certificate, you can get one from ZeroSSL or Let’s Encrypt.

Suggestion:
I like the “acme.sh” project for getting certificates from ZeroSSL easily.
https://github.com/acmesh-official/acme.sh

Well! Your domains, IPs, and certificates are ready. Now connect to Node1 over SSH and run the commands below.

Next comes the config file for Node1. Create it and add the lines below to it.
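
As a rough sketch of this step: RKE2 reads its server config from “/etc/rancher/rke2/config.yaml”, and “tls-san” lists extra names and IPs for the API server certificate. The helper name `write_node1_config` and the exact tls-san entries are my own illustrative assumptions; adjust them to the names and IPs you actually use.

```shell
# Sketch: create the RKE2 config directory and write the Node1 server config.
# Assumption: the tls-san values are illustrative; list every DNS name and IP
# that clients will use to reach the Kubernetes API (including the floating IP).
write_node1_config() {
    # $1: target directory (defaults to the standard RKE2 config directory)
    mkdir -p "${1:-/etc/rancher/rke2}"
    cat > "${1:-/etc/rancher/rke2}/config.yaml" <<'EOF'
tls-san:
  - rke-node-1.rke.domain.com
  - 192.168.10.74
EOF
}

# Usage (as root on Node1):
#   write_node1_config
```

Passing a directory as the first argument lets you preview the file somewhere harmless before writing the real one.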

Now, set some environment variables for Kube-VIP.
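
The variables could look roughly like this. `VIP` and `INTERFACE` come from the values used in this article; the `TAG` value is only an example I assume here, so pick a kube-vip image tag that really exists in the registry. The containerd socket path and the bin directory are the usual RKE2 locations, which I assume apply to your install too.

```shell
# Sketch of the environment for the Kube-VIP steps (run on Node1).
export VIP=192.168.10.74          # floating IP reserved for Kube-VIP
export INTERFACE=ens160           # NIC that should carry the floating IP
export TAG=v0.3.8                 # assumption: example kube-vip image tag

# Let crictl and ctr talk to the containerd instance that RKE2 runs.
export CONTAINER_RUNTIME_ENDPOINT=unix:///run/k3s/containerd/containerd.sock
export CONTAINERD_ADDRESS=/run/k3s/containerd/containerd.sock

# RKE2 ships kubectl, crictl and ctr in its own bin directory.
export PATH=/var/lib/rancher/rke2/bin:$PATH
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
```

These exports only live in the current shell session, so repeat them (or put them in a file you can source) if you reconnect.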

We are ready for installation now.
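
A minimal sketch of the installation, assuming the official install script at get.rke2.io and the systemd unit name “rke2-server” that it sets up. The function name `install_rke2_server` is just a wrapper I made up for this example.

```shell
# Sketch: install RKE2 in server mode and start it (run as root).
install_rke2_server() {
    # Download and run the RKE2 install script (server is its default type).
    curl -sfL https://get.rke2.io | sh -
    # Start the server now and on every boot.
    systemctl enable rke2-server.service
    systemctl start rke2-server.service
}

# Usage (as root on Node1):
#   install_rke2_server
```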

To follow the logs, if you like, run:

Wait patiently for the installation to finish, because it may take some time. After that your first node will be ready. You can check it like this:

The kubeconfig file, named rke2.yaml, can be found in the “/etc/rancher/rke2” directory.
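
Since RKE2 runs as a systemd service, one way to watch it is through journalctl. This is a small sketch; the helper name `follow_rke2_logs` is hypothetical, and the default unit name assumes a server node.

```shell
# Sketch: follow the logs of an RKE2 systemd unit until you press Ctrl+C.
follow_rke2_logs() {
    # $1: unit name (defaults to the server unit)
    journalctl -u "${1:-rke2-server}" -f
}

# Usage:
#   follow_rke2_logs
```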
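
A minimal way to run the check, assuming kubectl is on your PATH (RKE2 ships one under “/var/lib/rancher/rke2/bin”). The helper name `check_nodes` is my own.

```shell
# Sketch: list the cluster nodes using the kubeconfig that RKE2 generates.
check_nodes() {
    # $1: kubeconfig path (defaults to the file RKE2 writes on server nodes)
    kubectl --kubeconfig "${1:-/etc/rancher/rke2/rke2.yaml}" get nodes
}

# Usage (on a server node):
#   check_nodes
```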


Output:
NAME   STATUS   ROLES                       AGE   VERSION
rke2   Ready    control-plane,etcd,master   12d   v1.22.7+rke2r2

If your output resembles the lines above, everything is going well. You now have a single-node K8S, so you can install Kube-VIP.

crictl pull docker.io/plndr/kube-vip:$TAG
alias kube-vip="ctr --namespace k8s.io run --rm --net-host docker.io/plndr/kube-vip:$TAG vip /kube-vip"

kube-vip manifest daemonset \
--arp \
--interface $INTERFACE \
--address $VIP \
--controlplane \
--leaderElection \
--taint \
--services \
--inCluster | tee /var/lib/rancher/rke2/server/manifests/kube-vip.yaml
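
With “--inCluster”, the generated DaemonSet typically runs under a “kube-vip” service account, so it usually also needs the RBAC objects published by the kube-vip project. The sketch below drops that manifest next to the DaemonSet so RKE2 applies it automatically. The helper name and the target file name “kube-vip-rbac.yaml” are my assumptions; the URL is the one from the kube-vip documentation.

```shell
# Sketch: download the kube-vip RBAC manifest into the RKE2 auto-deploy directory.
install_kube_vip_rbac() {
    # $1: manifests directory (defaults to the RKE2 server manifests directory)
    curl -sfL https://kube-vip.io/manifests/rbac.yaml \
        -o "${1:-/var/lib/rancher/rke2/server/manifests}/kube-vip-rbac.yaml"
}

# Usage (as root on Node1):
#   install_kube_vip_rbac
```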

Wait about 15 seconds, then check the status.
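
One possible status check is to look at the DaemonSet and its recent logs. I assume here that the generated DaemonSet is called “kube-vip-ds” and lives in the “kube-system” namespace; check the generated kube-vip.yaml if yours differs. The helper name `kube_vip_status` is hypothetical.

```shell
# Sketch: show the kube-vip DaemonSet and the tail of its logs.
kube_vip_status() {
    # Assumption: DaemonSet name kube-vip-ds in namespace kube-system.
    kubectl -n kube-system get daemonset kube-vip-ds
    kubectl -n kube-system logs daemonset/kube-vip-ds --tail=20
}

# Usage:
#   kube_vip_status
```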

Output:
time="2022-04-11T20:33:48Z" level=info msg="Broadcasting ARP update for 192.168.10.74 (00:50:56:9b:3a:cb) via ens160"

If your output resembles the line above, everything is probably going well: Kube-VIP is installed on your K8S, and you should see the floating IP on your node.
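
To look at the interface, “ip addr show” is enough. The small helper below is my own sketch (the name `vip_is_bound` is hypothetical); it only answers whether a given IPv4 address is currently bound to a given interface.

```shell
# Sketch: succeed if the address $2 is bound to the interface $1.
vip_is_bound() {
    # -4 limits the listing to IPv4; grep -F matches the address literally.
    ip -4 addr show dev "$1" | grep -qF "inet $2/"
}

# Usage:
#   ip addr show dev ens160
#   vip_is_bound ens160 192.168.10.74 && echo "VIP is on this node"
```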

Output:
2: ens160: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:50:56:9b:3a:cb brd ff:ff:ff:ff:ff:ff
    inet 192.168.10.71/24 brd 192.168.10.255 scope global ens160
       valid_lft forever preferred_lft forever
    inet 192.168.10.74/32 scope global ens160
       valid_lft forever preferred_lft forever
    inet6 fe80::250:56ff:fe9b:3acb/64 scope link
       valid_lft forever preferred_lft forever

Our floating IP “192.168.10.74” is there, so the node now has 2 IPs.

I could continue with the installation of the second node, but I prefer to install Rancher first.

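helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
helm repo update
kubectl create namespace cattle-system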

kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.5.4/cert-manager.yaml

helm install rancher rancher-stable/rancher \
--namespace cattle-system \
--version 2.6.3 \
--set hostname=rancher.rke.domain.com \
--set replicas=1

If you want Rancher to serve HTTPS with your own certificate, you must create a TLS secret and pass an extra parameter to the Helm command. In that case, skip the commands above and use the ones below instead.

helm repo add rancher-stable https://releases.rancher.com/server-charts/stable
helm repo update
kubectl create namespace cattle-system
export CERTDIR=/root/rke/certificates
kubectl -n cattle-system create secret tls tls-rancher-ingress --cert=${CERTDIR}/fullchain.pem --key=${CERTDIR}/key.pem

helm install rancher rancher-stable/rancher \
--namespace cattle-system \
--set hostname=rancher.rke.domain.com \
--set replicas=1 \
--set ingress.tls.source=secret

To check the Rancher installation status, run the command below.
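
A sketch of such a check, assuming the Helm release created a Deployment named “rancher” in the “cattle-system” namespace, as in the install commands above. The helper name is my own.

```shell
# Sketch: block until the Rancher deployment has finished rolling out.
rancher_rollout_status() {
    kubectl -n cattle-system rollout status deployment/rancher
}

# Usage:
#   rancher_rollout_status
```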

Output:
Waiting for deployment "rancher" rollout to finish: 0 of 1 updated replicas are available...

At first you may see output like the line above. In that case, please wait patiently. After some more lines, you should see the sentence below at the end of the output.

deployment "rancher" successfully rolled out

To be sure everything is well, check the Rancher pods:
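
For example, you could list the pods in the “cattle-system” namespace, which is where the Helm command above installs Rancher. The helper name `list_rancher_pods` is hypothetical; any extra arguments are passed straight to kubectl.

```shell
# Sketch: list the pods in the Rancher namespace.
list_rancher_pods() {
    # Extra arguments (for example: -o wide) are forwarded to kubectl.
    kubectl -n cattle-system get pods "$@"
}

# Usage:
#   list_rancher_pods
#   list_rancher_pods -o wide    # also shows which node each pod runs on
```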

Output:
NAME                       READY   STATUS    RESTARTS   AGE
rancher-78f6794ccb-wh54w   1/1     Running   0          7m42s

If your output resembles the lines above, you can smile. :) Then open the Rancher URL in your browser. If you see the Welcome Page, it also shows some commands for getting the “token”. For a Helm installation, run:

Use the token to set the admin password, and you are in the Rancher UI. You now have a single-node RKE2 cluster running Rancher, so you can continue with the installation of Node2.

You need the cluster's node token (a different token from the Rancher one above) to install the other nodes. To get it, run this on Node1:
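
On Rancher 2.6 the Welcome Page calls this value the bootstrap password, and for Helm installations it is stored in a secret. The sketch below assumes the secret is named “bootstrap-secret” in “cattle-system”, as the Welcome Page suggests; if your page shows a different command, use that one. The helper name is mine.

```shell
# Sketch: print the Rancher bootstrap password stored by the Helm install.
get_bootstrap_password() {
    kubectl get secret --namespace cattle-system bootstrap-secret \
        -o go-template='{{.data.bootstrapPassword|base64decode}}{{"\n"}}'
}

# Usage:
#   get_bootstrap_password
```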
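
RKE2 servers keep this token in a file, so reading it is enough. The path below is the usual RKE2 location, which I assume matches your install; the helper name `print_node_token` is hypothetical.

```shell
# Sketch: print the cluster join token from the first server node.
print_node_token() {
    # $1: token file (defaults to the usual RKE2 server location)
    cat "${1:-/var/lib/rancher/rke2/server/node-token}"
}

# Usage (as root on Node1):
#   print_node_token
```

Treat the token like a password: anyone who has it can join a node to your cluster.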


Output:
K10e53bdb27060ebc74cd2c25184fd8b14a94934898a7a91a48a613fb33ec310032::server:1c18fe2068bc4561260b2764858d7402

Now log in to the second node over SSH and run:

Next comes the config file for Node2. Create it and add the lines below to it.

We are ready to install now, using the same install and start commands as on Node1.

Wait patiently for the installation to finish, because it may take some time. After that your second node will be ready. You can check it by listing the nodes again, as you did for the first node.

You can also see the second node in the Rancher UI. Now repeat the same steps on the third node.
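
A sketch of what the Node2 config could contain: the token from Node1, the address of the existing server, and the tls-san list. I assume here that joining nodes reach the cluster through the floating IP on port 9345, the RKE2 supervisor port. The helper name and the tls-san entries are illustrative.

```shell
# Sketch: write the config for an additional RKE2 server node.
write_join_config() {
    # $1: cluster token, $2: server URL, $3: target directory (optional)
    mkdir -p "${3:-/etc/rancher/rke2}"
    cat > "${3:-/etc/rancher/rke2}/config.yaml" <<EOF
token: $1
server: $2
tls-san:
  - rke-node-2.rke.domain.com
  - 192.168.10.74
EOF
}

# Usage (as root on Node2; the token comes from Node1):
#   write_join_config "<token from Node1>" https://192.168.10.74:9345
```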

Now I hope you have 3 nodes. Rancher is still running only on the first node, though, so you should increase the number of Rancher replicas. To do that, run:

After about 3 minutes, you can see your new pods by listing the Rancher pods again.
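
One way to do this is to scale the Deployment directly, as sketched below. The helper name `scale_rancher` is my own. Scaling with kubectl is only one option; a later “helm upgrade” that still carries “replicas=1” would set the count back, so you may prefer to change the value through Helm instead.

```shell
# Sketch: scale the Rancher deployment to the given number of replicas.
scale_rancher() {
    # $1: replica count (defaults to 3, one per node in this setup)
    kubectl -n cattle-system scale deployment rancher --replicas="${1:-3}"
}

# Usage:
#   scale_rancher 3
```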

Output:
rancher-78f6794ccb-jm5tm   1/1     Running   0          4m10s
rancher-78f6794ccb-pqzbd   1/1     Running   0          4m47s
rancher-78f6794ccb-wh54w   1/1     Running   0          64m

RKE2 will place each pod on a different node, so you now have a 3-node Rancher cluster and can test its high availability. Shut down the first node! Within seconds, your VIP will move to one of the other nodes, and you can still use the Rancher UI and your cluster. But you cannot stop a second node: the cluster will collapse, because a 3-member etcd cluster needs at least 2 members running to keep quorum.

Adding an Agent (Worker) Node

If you just want to add worker nodes to your cluster, you should run the “rke2-agent” service instead of the “rke2-server” service.

You must add the config file again.

This time, your config file can look like the one below.

The “token” and “server” items are enough for a worker config file. Install and run “rke2-agent”:

Check your nodes again by listing them with kubectl on a server node, as you did before.

To follow the logs, if you like, watch the “rke2-agent” service with journalctl.
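
A minimal sketch of a worker config with just those two items. The helper name `write_agent_config` is hypothetical, and the server URL in the usage line assumes the floating IP and the RKE2 supervisor port 9345.

```shell
# Sketch: write a minimal RKE2 agent (worker) config.
write_agent_config() {
    # $1: cluster token, $2: server URL, $3: target directory (optional)
    mkdir -p "${3:-/etc/rancher/rke2}"
    printf 'token: %s\nserver: %s\n' "$1" "$2" \
        > "${3:-/etc/rancher/rke2}/config.yaml"
}

# Usage (as root on the worker node):
#   write_agent_config "<token from Node1>" https://192.168.10.74:9345
```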
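
A sketch of the agent installation, assuming the same install script as for the servers. The script selects agent mode through the INSTALL_RKE2_TYPE variable, and the unit to start is “rke2-agent”. The function name is my own.

```shell
# Sketch: install RKE2 in agent mode and start it (run as root).
install_rke2_agent() {
    # INSTALL_RKE2_TYPE=agent tells the install script to set up a worker.
    curl -sfL https://get.rke2.io | INSTALL_RKE2_TYPE="agent" sh -
    systemctl enable rke2-agent.service
    systemctl start rke2-agent.service
}

# Usage (as root on the worker node, after writing its config file):
#   install_rke2_agent
```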
