Adding vSphere/vSAN Storage Class to Kubernetes

Batur Orkun
5 min read · Apr 22, 2023

I am a Helm lover. I always look for a Helm solution first; if I cannot find one, I write it myself. This time, I found one. Thanks to Stefan van Gastel:

https://github.com/stefanvangastel/vsphere-cpi-csi-helm.git

You must clone the code to your bastion host or local PC:

git clone https://github.com/stefanvangastel/vsphere-cpi-csi-helm.git

Let me describe my setup so my goal is clear. I have a 5-node vSphere cluster (VMware ESXi 7.0.3) with a vSAN cluster, and 5 VMs on the vSAN datastore running an RKE2/Rancher Kubernetes cluster. Previously, I installed Longhorn on the RKE2 nodes and used it as persistent storage, but I had some problems and suspected the multi-replica storage was the cause.

Before installation, you should check a few settings on your vSphere side. Your vSphere version must be 7.0 or higher, and your Kubernetes cluster version must be 1.16 or higher for CSI 2.0.0. Moreover, you need a couple of extra checks on your VMs: your Kubernetes node VMs' hardware version must be 15 or higher, and the disk.EnableUUID configuration option must be enabled.

How do you check or enable it? Through the vSphere UI (a govc alternative follows the steps):

  1. Log in to the ESXi or vCenter UI.
  2. Power off the virtual machine, then right-click it and choose Edit Settings.
  3. Click the VM Options tab, and select Advanced.
  4. Click Edit Configuration in Configuration Parameters.
  5. Click Add parameter.
  6. In the Key column, type “disk.EnableUUID”
  7. In the Value column, type “TRUE”
  8. Click OK and click Save.
  9. Power on the virtual machine.
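
If you prefer the command line over the UI steps above, the same setting can be applied with govc (the vSphere CLI from the govmomi project). This is only a sketch: it assumes govc is already configured with GOVC_URL, GOVC_USERNAME and GOVC_PASSWORD, and "k8s-node-1" is a placeholder for your node VM's name.

# power the VM off first, then add the ExtraConfig flag
govc vm.power -off k8s-node-1
govc vm.change -vm k8s-node-1 -e disk.enableUUID=TRUE
# verify the entry and power the VM back on
govc vm.info -e k8s-node-1 | grep -i enableuuid
govc vm.power -on k8s-node-1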

Check if it is working.

# kubectl get nodes -o json | grep providerID

The output should look like the one below (5 nodes):

"providerID": "vsphere://421bf484-6a67-d982-e12d-0b6a7a4d0fd8"
"providerID": "vsphere://421b8453-1a6c-be78-49e9-c392a0d621db"
"providerID": "vsphere://421b94e4-1d52-3706-ae38-17697276fe03"
"providerID": "vsphere://421b9151-828a-f013-9e8f-336734c2c6b8"
"providerID": "vsphere://421be6ee-c163-13c1-eb72-fa75aff2deaf"

This setting plays a crucial role. If you forget to do it, your PVC objects will be stuck in "Pending" status.
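
If you want a cleaner view than grep, a jsonpath query prints each node name next to its providerID (just a readability sketch over the same data):

kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.providerID}{"\n"}{end}'

Any node that prints an empty providerID is a node where the setting above was missed.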

You should also find your vSAN "datastoreurl" in the vCenter or ESXi UI. Click the storage icon in the left-side icon menu and open your vSAN datastore.
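
You can also read the URL from the CLI. A sketch with govc, assuming your vSAN datastore is named "vsanDatastore" (replace it with your own datastore name):

govc datastore.info vsanDatastore
# the "URL:" line of the output is the value to pass later as
# storageclass.datastoreurl, e.g. ds:///vmfs/volumes/vsan:52a88bd9638b1b83-b0bce0d4cf62b81c/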

I needed a custom values file because I use RKE2, and Rancher uses the "control-plane" label instead of "controlplane". You can see the chart's defaults at https://github.com/stefanvangastel/vsphere-cpi-csi/blob/master/charts/vsphere-cpi-csi/v2.3.0/values.yaml

So I needed to change the nodeSelector label:

nodeSelector:
  node-role.kubernetes.io/control-plane: "true"

I used v2.3.0, which is the latest version at the time of writing.

This is my custom "values.yaml":

cloudProvider:
  nodeSelector:
    node-role.kubernetes.io/master: "true"

csiController:
  nodeSelector:
    node-role.kubernetes.io/master: "true"

I decided to use the "master" label, because my nodes are also masters.
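
Before settling on a value, it is worth checking which role labels your nodes actually carry. This is plain kubectl; the -L flag only adds the labels as extra columns:

kubectl get nodes -L node-role.kubernetes.io/master -L node-role.kubernetes.io/control-plane
# or dump everything:
kubectl get nodes --show-labels

Whichever label is set to "true" on your control-plane nodes is the one to use in the nodeSelector.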

Our Helm values are ready, but we also need a CSIDriver object (CRD) if it is not installed yet. After this operation, we will have a CSI storage class. So if you get an error about CSIDriver, check https://github.com/kubernetes-csi/csi-driver-nfs/blob/master/docs/install-csi-driver-master.md

You can check your “CSIDrivers” with the command below.

kubectl get csidrivers
Output:
error: the server doesn't have a resource type "csidrivers"

So you need to install it like this:

curl -skSL https://raw.githubusercontent.com/kubernetes-csi/csi-driver-nfs/master/deploy/install-driver.sh | bash -s master --

After all of that, you can run your Helm command. This is my example:

cd vsphere-cpi-csi

helm upgrade --install vsphere-cpi-csi \
  --namespace kube-system \
  ./charts/vsphere-cpi-csi/v2.3.0 \
  --set vcenter.insecurehost=true \
  --set vcenter.port=443 \
  --set vcenter.host="vcenter.mylocal.net" \
  --set vcenter.username="administrator@vsphere.local" \
  --set vcenter.password="XXXXXXXX" \
  --set storageclass.datastoreurl="ds:///vmfs/volumes/vsan:52a88bd9638b1b83-b0bce0d4cf62b81c/" \
  --set vcenter.datacenter="vSAN Datacenter" \
  -f values.yaml
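
You can confirm the release exists before looking at the pods; this is plain Helm, nothing chart-specific:

helm list -n kube-system | grep vsphere-cpi-csi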

Check the installation status now.

# kubectl get pods -n kube-system | grep vsphere

vsphere-cloud-controller-manager-6g6mk 1/1 Running 0 2d4h
vsphere-cloud-controller-manager-824zd 1/1 Running 0 2d4h
vsphere-cloud-controller-manager-gjtm9 1/1 Running 0 2d4h
vsphere-cloud-controller-manager-hk9tt 1/1 Running 0 2d4h
vsphere-cloud-controller-manager-tdl97 1/1 Running 0 2d4h
vsphere-csi-controller-5fbf876877-6cq8c 6/6 Running 0 2d4h
vsphere-csi-node-8hbfr 3/3 Running 0 2d4h
vsphere-csi-node-dfk96 3/3 Running 0 2d4h
vsphere-csi-node-n7gd6 3/3 Running 0 2d4h
vsphere-csi-node-sfn44 3/3 Running 0 2d4h
vsphere-csi-node-zx8hl 3/3 Running 0 2d4h

Everything looks perfect. If things do not go well, you can inspect the logs of the "vsphere-csi-controller-*****-*****" pod.

# kubectl logs -f vsphere-csi-controller-5fbf876877-6cq8c -n kube-system

For example, you may see "vsphere-csi: rpc error: code = Internal desc = failed to get shared datastores in kubernetes cluster. Error: Empty List of Node VMs returned from nodeManager". It means that "spec.providerID" is null on your nodes, which is a "disk.EnableUUID" problem.
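
Another quick sanity check while troubleshooting is whether the CSI driver has registered on every node. This is plain kubectl against the CSINode objects; the jsonpath just prints the driver names per node:

kubectl get csinodes
# list the registered driver names per node; every node should show
# csi.vsphere.vmware.com once the vsphere-csi-node pods are healthy
kubectl get csinodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.drivers[*].name}{"\n"}{end}'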

Now you can see your CSIDriver and Storage Class.

root@bastion:~# kubectl get csidrivers
NAME ATTACHREQUIRED PODINFOONMOUNT STORAGECAPACITY TOKENREQUESTS REQUIRESREPUBLISH MODES AGE
csi.vsphere.vmware.com true false false <unset> false Persistent 2d9h
nfs.csi.k8s.io false false false <unset> false Persistent 3d10h
root@bastion:~# kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
nfs-csi nfs.csi.k8s.io Delete Immediate true 3d10h
vsphere-csi (default) csi.vsphere.vmware.com Delete Immediate false 2d9h

I am including a simple Nginx StatefulSet example YAML so you can check that everything works:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx"   # governing headless Service name (assumed here)
  replicas: 1
  minReadySeconds: 10
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: nginx
        image: registry.k8s.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www-data
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "vsphere-csi"
      resources:
        requests:
          storage: 1Gi
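
To try it out, apply the manifest and watch the claim bind. The file name below is just an assumption; use whatever you saved the YAML as:

kubectl apply -f web-statefulset.yaml
kubectl get pvc
# the claim created from the template is named www-data-web-0 and should
# move from Pending to Bound, with a matching PV provisioned on vSAN
kubectl get pv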

I wish your data sources a life with no data loss. :)
