Kubernetes local volumes have gone beta. But what is a Kubernetes local volume? Last time, we explored how to use Kubernetes hostPath volumes. However, we also saw that hostPath volumes work well only on single-node clusters. Kubernetes local volumes overcome this restriction, so we can work in a multi-node environment without problems.
„Local volumes“ are similar to hostPath volumes, but they allow us to pin PODs to a specific node, which makes sure that a restarting POD always finds the data storage in the state it was left in before the restart. They also make sure that other restrictions are met before the persistent volume claim is bound to a volume.
Note the disclaimer in the announcement: local volumes are not suitable for most applications. They are much easier to handle than clustered file systems like GlusterFS, though. Still, local volumes are perfect for clustered applications like Cassandra.
Let us start:
References
- Kubernetes Documentation on persistent Volumes
- Katacoda persistent Volumes Hello World with an NFS Docker container
- Other Kubernetes Series posts in this blog:
Prerequisites
- We need a multi-node Kubernetes cluster to test all of the features of „local volumes“. A two-node cluster with 2 GB, or better 4 GB, of RAM per node will do. You can follow the instructions in (3) Kubernetes Cluster with Kubeadm to install such a cluster on CentOS.
Step 1: Create StorageClass with WaitForFirstConsumer Binding Mode
According to the docs, persistent local volumes require a binding mode of WaitForFirstConsumer. The only way to assign the volumeBindingMode to a persistent volume seems to be to create a storageClass with the respective volumeBindingMode and to assign the storageClass to the persistent volume. Let us start with the storageClass:
cat > storageClass.yaml << EOF
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: my-local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
EOF
kubectl create -f storageClass.yaml
The output should be:
storageclass.storage.k8s.io/my-local-storage created
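To double-check that the new class really carries the required binding mode, we can read the field back from the API. The following is only a minimal sketch and not part of the original walkthrough: the function names storageclass_binding_mode and uses_wait_for_first_consumer are made up for illustration, and it assumes that kubectl is configured for the cluster.

```shell
# Print the volumeBindingMode of a StorageClass. $1 = class name
# (helper name is hypothetical; needs kubectl access to the cluster)
storageclass_binding_mode() {
  kubectl get storageclass "$1" -o jsonpath='{.volumeBindingMode}'
}

# Succeed only if the class uses WaitForFirstConsumer. $1 = class name
uses_wait_for_first_consumer() {
  [ "$(storageclass_binding_mode "$1")" = "WaitForFirstConsumer" ]
}
```

With the class from this step, a call like `uses_wait_for_first_consumer my-local-storage && echo OK` should print OK.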
Step 2: Create Local Persistent Volume
Since the storage class is available now, we can create a local persistent volume with a reference to the storage class we have just created:
cat > persistentVolume.yaml << EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-local-pv
spec:
  capacity:
    storage: 500Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: my-local-storage
  local:
    path: /mnt/disk/vol1
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - node1
EOF
Note: You might need to replace the hostname value „node1“ in the nodeAffinity section with the name of the node that matches your environment.
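If you are unsure which hostname to use, you can list the kubernetes.io/hostname label of every node and generate the manifest for the node of your choice. This is a minimal sketch under assumptions: the helper names list_node_hostnames and render_local_pv are hypothetical, the storage class name my-local-storage is the one from Step 1, and the manifest layout simply mirrors the one shown above.

```shell
# Show every node together with its kubernetes.io/hostname label
# (needs kubectl access to the cluster).
list_node_hostnames() {
  kubectl get nodes -L kubernetes.io/hostname
}

# Print a local PersistentVolume manifest on stdout.
# $1 = PV name, $2 = node hostname, $3 = directory on that node, $4 = capacity
render_local_pv() {
  cat <<EOF
apiVersion: v1
kind: PersistentVolume
metadata:
  name: $1
spec:
  capacity:
    storage: $4
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: my-local-storage
  local:
    path: $3
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - $2
EOF
}
```

For example, `render_local_pv my-local-pv node1 /mnt/disk/vol1 500Gi > persistentVolume.yaml` writes a manifest equivalent to the one above.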
The „hostPath“ we had defined in our last blog post is replaced by the so-called „local path“.
Similar to what we have done in the case of a hostPath volume in our last blog post, we need to prepare the volume on node1 before we create the persistent local volume on the master:
# on the node, where the POD will be located (node1 in our case):
DIRNAME="vol1"
mkdir -p /mnt/disk/$DIRNAME
chcon -Rt svirt_sandbox_file_t /mnt/disk/$DIRNAME
chmod 777 /mnt/disk/$DIRNAME

# on master:
kubectl create -f persistentVolume.yaml
The output should look as follows:
persistentvolume/my-local-pv created
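The directory preparation on the node can be wrapped in a small helper, which is handy if you create several volumes or if some of your nodes do not run SELinux. This is a sketch under assumptions: the function name prepare_local_volume_dir is hypothetical, and skipping chcon on hosts without SELinux is my own addition, not something the walkthrough above requires.

```shell
# Create a directory for a local volume and open up its permissions.
# $1 = directory path on the node (e.g. /mnt/disk/vol1)
prepare_local_volume_dir() {
  dir="$1"
  mkdir -p "$dir" || return 1
  # Relabel only where SELinux is installed and enabled; on other hosts
  # chcon would fail, so it is skipped there (assumption, see lead-in).
  if command -v selinuxenabled >/dev/null 2>&1 && selinuxenabled; then
    chcon -Rt svirt_sandbox_file_t "$dir" || echo "warning: chcon failed for $dir" >&2
  fi
  chmod 777 "$dir"
}
```

On node1, `prepare_local_volume_dir /mnt/disk/vol1` then performs the same steps as the commands above.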
Step 3: Create a Persistent Volume Claim
Similar to hostPath volumes, we now create a persistent volume claim that describes the volume requirements. One of the requirements is that the persistent volume has the volumeBindingMode WaitForFirstConsumer. We can ensure this by referencing the previously created storageClass:
cat > persistentVolumeClaim.yaml << EOF
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-claim
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: my-local-storage
  resources:
    requests:
      storage: 500Gi
EOF
kubectl create -f persistentVolumeClaim.yaml
With the answer:
persistentvolumeclaim/my-claim created
From the point of view of the persistent volume claim, this is the only difference between a local volume and a hostPath volume.
However, in contrast to what we observed for hostPath volumes in the last blog post, the persistent volume claim is not bound to the persistent volume automatically. Instead, the persistent volume will remain „Available“ until the first consumer shows up:
# kubectl get pv
NAME          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS       REASON   AGE
my-local-pv   500Gi      RWO            Retain           Available           my-local-storage            3m59s
This should change in the next step.
Step 4: Create a POD with local persistent Volume
The Kubernetes architects have done a good job of abstracting the volume technology away from the POD. As with other volume technologies, the POD just needs to reference the volume claim. The volume claim, in turn, specifies its resource requirements. One of those is that the volumeBindingMode is WaitForFirstConsumer, which is achieved by referencing a storageClass with this property.
Once a POD is created that references the volume claim by name, a „best match“ choice is performed under the restriction that the storage class name matches as well.
Okay, let us perform the last required step to complete the picture. The only missing piece is the POD, which we will create now:
cat > http-pod.yaml << EOF
apiVersion: v1
kind: Pod
metadata:
  name: www
  labels:
    name: www
spec:
  containers:
  - name: www
    image: nginx:alpine
    ports:
    - containerPort: 80
      name: www
    volumeMounts:
    - name: www-persistent-storage
      mountPath: /usr/share/nginx/html
  volumes:
  - name: www-persistent-storage
    persistentVolumeClaim:
      claimName: my-claim
EOF
kubectl create -f http-pod.yaml
This should yield:
pod/www created
Before, we saw that the persistent volume claim was not bound to a persistent volume yet. Now, we expect the binding to happen, since the last missing piece of the puzzle has fallen into place:
# kubectl get pv
NAME          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS       REASON   AGE
my-local-pv   500Gi      RWO            Retain           Bound    default/my-claim   my-local-storage            10m
Yes, we can see that the status is „Bound“ and that the volume is bound to the claim named „default/my-claim“. Since we have not chosen any namespace, the claim is located in the „default“ namespace.
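In a script, we would rather not eyeball the table. The following minimal sketch polls the status phase until the expected value shows up. The helper names resource_phase and wait_for_phase, the default of 30 attempts and the 2 second delay are all assumptions made for illustration; it needs kubectl access to the cluster.

```shell
# Print the phase of a PersistentVolume or PersistentVolumeClaim.
# $1 = resource kind (pv or pvc), $2 = resource name
resource_phase() {
  kubectl get "$1" "$2" -o jsonpath='{.status.phase}'
}

# Poll until the resource reports the wanted phase or the attempts run out.
# $1 = kind, $2 = name, $3 = wanted phase,
# $4 = attempts (default 30), $5 = delay in seconds (default 2)
wait_for_phase() {
  attempts="${4:-30}"
  delay="${5:-2}"
  while [ "$attempts" -gt 0 ]; do
    if [ "$(resource_phase "$1" "$2")" = "$3" ]; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "$delay"
  done
  return 1
}
```

After the POD has been created, `wait_for_phase pv my-local-pv Bound` and `wait_for_phase pvc my-claim Bound` should both return successfully.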
The POD is up and running:
# kubectl get pods
NAME   READY   STATUS    RESTARTS   AGE
www    1/1     Running   0          3m29s
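Since the persistent volume is pinned to a node, the POD should have been scheduled to exactly that node. The sketch below compares the two. The helper names pod_node, pv_node and pod_on_volume_node are hypothetical, and the comparison assumes that the node name equals its kubernetes.io/hostname label (commonly the case, but not guaranteed) and that the volume has a single node affinity term with a single value, as in Step 2.

```shell
# Print the node a POD was scheduled to. $1 = POD name
pod_node() {
  kubectl get pod "$1" -o jsonpath='{.spec.nodeName}'
}

# Print the first hostname listed in the node affinity of a PV. $1 = PV name
pv_node() {
  kubectl get pv "$1" -o jsonpath='{.spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0]}'
}

# Succeed if the POD runs on the node the PV is pinned to. $1 = POD, $2 = PV
pod_on_volume_node() {
  node="$(pod_node "$1")"
  # An empty node name means the POD is not scheduled (or the lookup failed).
  [ -n "$node" ] && [ "$node" = "$(pv_node "$2")" ]
}
```

For the objects created in this post, the call would be `pod_on_volume_node www my-local-pv`.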
Step 5: Add Index File to local Volume
We can now create an index file in the local persistent volume:
node1# echo "Hello local persistent volume" > /mnt/disk/vol1/index.html
Step 6: Access Application Data
Now that the index file is available, we can retrieve it from the POD. For that, we need the POD's IP address:
# POD_IP=$(kubectl get pod www -o yaml | grep podIP | awk '{print $2}'); echo $POD_IP
10.44.0.2
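The grep above matches every line that contains „podIP“, which can include more than one line depending on the Kubernetes version (for example a podIPs list). As a hedged alternative, the sketch below either asks the API for the single field via jsonpath or matches the key exactly. The helper names pod_ip and extract_pod_ip are made up for illustration.

```shell
# Print the IP address of a POD straight from the API. $1 = POD name
# (needs kubectl access to the cluster)
pod_ip() {
  kubectl get pod "$1" -o jsonpath='{.status.podIP}'
}

# Read POD YAML on stdin and print the value of the exact key "podIP:".
# Lines with other keys, such as "podIPs:", are ignored.
extract_pod_ip() {
  awk '$1 == "podIP:" { print $2; exit }'
}
```

Usage would be either `POD_IP=$(pod_ip www)` or `POD_IP=$(kubectl get pod www -o yaml | extract_pod_ip)`.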
Now we can access the web server’s index file with a cURL command:
# curl $POD_IP
Hello local persistent volume
Perfect.
Note: as long as the index file is not present, you will receive a 403 Forbidden message here. In that case, please check that you have created the index file on the correct host and in the correct path.
Step 7 (optional): LifeCycle of a Local Volume
Step 7.1: Exploring Local Volume Binding after POD Death
Here, we want to explore what happens to an orphaned Kubernetes local volume. For that, we delete a POD with a local volume and observe whether or not the binding state changes. My guess is that once a local volume is bound to a persistent volume claim, the binding will persist, even if the corresponding POD has died.
Enough guessing, let us check it! Below is the binding state while a POD is up and running and using the local volume:
# kubectl get pod
NAME   READY   STATUS    RESTARTS   AGE
www    1/1     Running   0          42h

# kubectl get pv
NAME          CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM              STORAGECLASS       REASON   AGE
my-local-pv   500Gi      RWO            Retain           Bound    default/my-claim   my-local-storage            43h
Now let us delete the POD and check again:
# kubectl delete pod www
pod "www" deleted
# kubectl get pod
NAME READY STATUS RESTARTS AGE
# kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
my-local-pv 500Gi RWO Retain Bound default/my-claim my-local-storage 43h
Yes, I was right: the status is still „Bound“, even though the POD is gone.
Step 7.2: Attach a new POD to the existing local volume
Let us try to attach a new POD to the existing local volume. For that, we create a new POD with reference to the same persistent volume claim (named „my-claim“ in our case).
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: centos-local-volume
  labels:
    name: centos-local-volume
spec:
  containers:
  - name: centos
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "while true; do cat /data/index.html; sleep 10; done"]
    volumeMounts:
    - name: my-reference-to-the-volume
      mountPath: /data
  volumes:
  - name: my-reference-to-the-volume
    persistentVolumeClaim:
      claimName: my-claim
EOF
This will output:
pod/centos-local-volume created
It is just a simple CentOS container that writes the content of /data/index.html to the log every 10 seconds. Let us retrieve the log now:
# kubectl logs centos-local-volume
Hello local persistent volume
Hello local persistent volume
Hello local persistent volume
Cool, that works fine.
Step 7.3: Verifying Multiple Read Access
How about attaching more than one container to the same local volume? Let us create a second centos container named „centos-local-volume2“:
cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: centos-local-volume2
  labels:
    name: centos-local-volume2
spec:
  containers:
  - name: centos
    image: centos
    command: ["/bin/sh"]
    args: ["-c", "while true; do cat /data/index.html; sleep 10; done"]
    volumeMounts:
    - name: my-reference-to-the-volume
      mountPath: /data
  volumes:
  - name: my-reference-to-the-volume
    persistentVolumeClaim:
      claimName: my-claim
EOF
This will output:
pod/centos-local-volume2 created
Let us retrieve the log once again:
# kubectl logs centos-local-volume | tail -n 3
Hello local persistent volume
Hello local persistent volume
Hello local persistent volume

# kubectl logs centos-local-volume2 | tail -n 3
Hello local persistent volume
Hello local persistent volume
Hello local persistent volume
Here, we can see that both containers have read access to the volume.
Step 7.4: Verifying Multiple Write Access
Now let us check the write access by entering the first centos container and changing the index file:
# kubectl exec -it centos-local-volume -- sh
sh-4.2# echo "centos-local-volume has changed the content" > /data/index.html
sh-4.2# exit
exit
Now let us check the log of the second container:
# kubectl logs centos-local-volume2 | tail -n 3
centos-local-volume has changed the content
centos-local-volume has changed the content
centos-local-volume has changed the content
That works! And it works also the other way round:
# kubectl exec -it centos-local-volume2 -- sh
sh-4.2# echo "centos-local-volume2 has changed the content" > /data/index.html
sh-4.2# exit
exit
[root@centos-2gb-nbg1-1 ~]# kubectl logs centos-local-volume | tail -n 3
centos-local-volume has changed the content
centos-local-volume has changed the content
centos-local-volume2 has changed the content
As you can see, I was quick enough to still see two of the old lines, but the last line shows that the content has been changed by centos-local-volume2.
Summary
In this blog post, we have shown that Kubernetes local volumes can be run on multi-node clusters without the need to pin PODs to certain nodes explicitly. Local volumes with their node affinity rules make sure that a POD is bound to a certain node implicitly, though. Kubernetes local volumes have the following features:
- Persistent volume claims will wait for a POD to show up before a local persistent volume is bound
- Once a persistent local volume is bound to a claim, it remains bound, even if the requesting POD has died or has been deleted
- A new POD can attach to the existing data in a local volume by referencing the same persistent volume claim
- Similar to NFS shares, Kubernetes persistent local volumes allow multiple PODs to have read/write access
Kubernetes local persistent volumes work well in clustered Kubernetes environments without the need to explicitly bind a POD to a certain node. However, the POD is bound to the node implicitly by referencing a persistent volume claim that points to the local persistent volume. Once a node has died, the data of all local volumes on that node is lost. In that sense, Kubernetes local persistent volumes cannot compete with distributed solutions like GlusterFS and Portworx volumes.
What are best practices for creating the directories on the node to which the PV is bound? I am at a loss for how to get into the node to do this. I am running kind (kubernetes in docker) and using this as a way to create PVCs in a local cluster on which my application relies. I cannot figure out how to create those directories on the node before deploying my pods.
Hi Michael, maybe you want to give https://github.com/oveits/installcentos/blob/master/install-openshift.sh a try? This is an open source repo forked from gshipley with the goal of installing OpenShift on CentOS. OpenShift builds on top of Kubernetes. The relevant part starts at line 168.
Why are we creating a persistent volume if we are creating a persistent volume claim? Is step 2 really needed?
Yes, step two is needed. You cannot fulfill a claim for a persistent volume, if there is no persistent volume available.
What happens if node1 goes down? How does the volume move to another node?
If node1 goes down, the corresponding POD will remain in pending mode and it will wait for the volume to become available again. This is the drawback of the local volume solution.
Thanks for sharing. I read many of your blog posts, cool, your blog is very good.
I read your article and it helped me solve a problem configuring a Portainer Pod on my bare metal Kubernetes Homelab.
In the documentation, Portainer only mentions the StorageClass, but this alone does not make Portainer, which needs a persistent volume, work.
With your article, I assembled the 3 resources necessary for the Pod configuration to work.
Thank you very much.