How to Use Persistent Volumes and Claims in Kubernetes for Data Persistence
Introduction
In Kubernetes, Pods are ephemeral: data written to a container's filesystem is lost when the container restarts or the Pod is deleted. This is a problem for applications such as databases, content management systems, and file stores, which need their data to outlive any single Pod.
Kubernetes addresses this with Persistent Volumes (PV) and Persistent Volume Claims (PVC):
- Persistent Volume (PV): A piece of storage in the cluster, provisioned by an administrator or dynamically via a StorageClass.
- Persistent Volume Claim (PVC): A request for storage that a Pod references, specifying size, access mode, and optionally a storage class and label selector.
By using PVs and PVCs, you can ensure data persists beyond the lifecycle of a Pod, decouple storage from compute, and manage storage efficiently.
Prerequisites
Before you start, ensure the following:
- A running Kubernetes cluster (Minikube, kind, or cloud provider).
- kubectl is installed and configured to communicate with your cluster.
- Basic knowledge of Pods and YAML manifests.
Step 1: Create a Persistent Volume (PV)
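A PV is the actual storage resource in your cluster. Here’s an example that carries a label, which a PVC can later match with a selector. It uses hostPath, which stores data in a directory on the node and is suitable only for single-node test clusters: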
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo
  labels:
    type: fast-storage
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
Apply the PV:
kubectl apply -f pv.yaml
kubectl get pv
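As a variation on the manifest above, the following sketch makes two settings explicit. Both fields are optional additions that the original manifest does not contain: persistentVolumeReclaimPolicy: Retain (already the default for manually created PVs) keeps the volume and its data when the claim is deleted, and the hostPath type DirectoryOrCreate asks the node to create /mnt/data if it does not exist. Treat it as one possible configuration, and apply it in place of pv.yaml, not alongside it.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-demo
  labels:
    type: fast-storage
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  # Assumption: keep the volume and its data after the claim is deleted.
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: "/mnt/data"
    # Assumption: create the directory on the node if it is missing.
    type: DirectoryOrCreate
```

With Retain, a released PV is not automatically made available to a new claim; an administrator has to clean it up or recreate it first.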
Step 2: Create a Persistent Volume Claim (PVC)
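A PVC is a request for storage that binds to a matching PV. A PV matches when it offers at least the requested capacity, supports the requested access mode, has the same storage class, and carries the selected labels. Setting storageClassName to an empty string keeps the cluster's default StorageClass (present on Minikube, kind, and most cloud providers) from being assigned to the claim, which would otherwise prevent it from binding to pv-demo:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-demo
spec:
  storageClassName: ""   # match PVs that have no storage class, such as pv-demo
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 500Mi     # a minimum; the claim binds to the whole 1Gi PV
  selector:
    matchLabels:
      type: fast-storage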
Apply the PVC:
kubectl apply -f pvc.yaml
kubectl get pvc
Step 3: Use the PVC in a Pod
Mount the PVC inside a Pod to give your container access to persistent storage:
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-pvc
spec:
  containers:
    - name: nginx
      image: nginx:latest
      volumeMounts:
        - mountPath: "/usr/share/nginx/html" # Path inside the Pod
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: pvc-demo
Apply the Pod:
kubectl apply -f pod-with-pvc.yaml
kubectl get pods
Any data written to /usr/share/nginx/html in the Pod is actually stored in the PV (/mnt/data on the worker node where the Pod runs).
Step 4: Verify Data Persistence
1. Exec into the pod:
kubectl exec -it pod-with-pvc -- /bin/bash
2. Create a file in the mounted volume:
echo "Hello Kubernetes" > /usr/share/nginx/html/index.html
3. Exit the shell and delete the Pod:
kubectl delete pod pod-with-pvc
4. Create a new Pod using the same PVC:
apiVersion: v1
kind: Pod
metadata:
  name: new-pod-with-pvc
spec:
  containers:
    - name: nginx
      image: nginx:latest
      volumeMounts:
        - mountPath: "/usr/share/nginx/html"
          name: storage
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: pvc-demo
Apply the new Pod and check the file:
kubectl apply -f new-pod-with-pvc.yaml
kubectl exec -it new-pod-with-pvc -- cat /usr/share/nginx/html/index.html
You should see “Hello Kubernetes”, confirming that the data persisted across Pods.
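The same check can be done without an interactive shell. The sketch below is one possible approach, not part of the original walkthrough: a short-lived Pod that mounts pvc-demo read-only and prints the file. The Pod name reader-pod, the busybox image, and the /data mount path are all assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: reader-pod              # hypothetical name
spec:
  restartPolicy: Never          # run once and stop
  containers:
    - name: reader
      image: busybox            # assumption: any image with sh and cat works
      command: ["sh", "-c", "cat /data/index.html"]
      volumeMounts:
        - mountPath: "/data"    # assumption: arbitrary path inside the container
          name: storage
          readOnly: true
  volumes:
    - name: storage
      persistentVolumeClaim:
        claimName: pvc-demo
```

After applying it, kubectl logs reader-pod should show the file's contents if the file was written earlier. Because the PV uses hostPath, this works as described only when the Pod is scheduled on the node that holds /mnt/data, which is always the case on a single-node cluster.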
Advantages of PV and PVC
- Data Persistence: Data survives Pod deletions and restarts.
- Decoupling Storage from Pods: PVs exist independently of Pods.
- Dynamic Provisioning: With a StorageClass, a PVC can trigger automatic provisioning of a matching volume, for example from a cloud provider.
- Controlled Binding: Labels and selectors restrict a PVC to the PVs you intend it to bind to.
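To illustrate dynamic provisioning, here is a minimal sketch of a claim that relies on a StorageClass instead of a pre-created PV. The name pvc-dynamic-demo is hypothetical, and the sketch assumes your cluster has a default StorageClass; the commented-out class name standard is only an example and varies between clusters.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-dynamic-demo        # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  # storageClassName is omitted, so the cluster's default StorageClass
  # is used if one exists.
  # storageClassName: standard  # assumption: class names depend on your cluster
```

Note that this claim has no selector: a claim with a non-empty selector cannot have a volume dynamically provisioned for it. Depending on the StorageClass's volume binding mode, the claim may also stay Pending until a Pod that uses it is scheduled.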
Conclusion
Persistent Volumes and Persistent Volume Claims are essential for stateful applications in Kubernetes. They provide reusable storage that is decoupled from Pods, so your application data survives when Pods are deleted or recreated.