2.7 Persistent Storage
Persistent Storage in Kubernetes
Persistent storage is a fundamental aspect of containerized applications running on Kubernetes, as containers are typically stateless and ephemeral by nature. When a container is terminated or crashes, all data within the container's filesystem is lost. To ensure that data persists across restarts and container lifecycles, Kubernetes provides mechanisms for persistent storage.
Key Concepts in Kubernetes Persistent Storage
1. Volumes
In Kubernetes, a volume is a directory that is accessible to the containers in a pod and can be shared among them. Unlike a container's own filesystem, a volume survives container restarts within the pod. Whether the data also outlives the pod itself depends on the type of volume.
- Ephemeral Volumes: Volumes that exist only as long as the pod is running (e.g., emptyDir).
- Persistent Volumes (PVs): Volumes that are independent of the lifecycle of pods and persist data even after a pod is deleted.
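As a minimal sketch of sharing an ephemeral volume between containers, the manifest below mounts one emptyDir into two containers of the same pod. The pod name, container names, image tag, and mount path are illustrative assumptions, not part of the original text.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: shared-scratch-demo        # hypothetical name
spec:
  containers:
    - name: writer
      image: busybox:1.36          # assumed image tag
      # Appends a timestamp to the shared file every five seconds.
      command: ["sh", "-c", "while true; do date >> /scratch/log.txt; sleep 5; done"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
    - name: reader
      image: busybox:1.36
      # Creates the file if it does not exist yet, then follows it.
      command: ["sh", "-c", "touch /scratch/log.txt; tail -f /scratch/log.txt"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir: {}                 # removed together with the pod
```

Both containers see the same files under /scratch. If either container restarts, the file is still there; once the pod is deleted, the data is gone.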
Volume Types
Kubernetes supports various types of volumes, including:
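- emptyDir: Temporary storage that is created when a pod is assigned to a node and erased when the pod is removed from that node. The data survives container restarts within the pod.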
- hostPath: A volume that mounts a directory from the node's filesystem into the pod.
- configMap and secret: Store configuration data and sensitive information respectively.
- nfs: Mount a Network File System (NFS) as a volume.
- persistentVolumeClaim: Mounts a Persistent Volume (PV), a cluster-wide storage resource with its lifecycle independent of the pod, by referencing a claim.
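To illustrate how volume types are selected in a pod spec, here is a hedged sketch that combines a configMap volume with a hostPath volume. The ConfigMap named app-config is assumed to exist already, and the pod name, image, and paths are hypothetical.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: volume-types-demo          # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36          # assumed image tag
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: app-config
          mountPath: /etc/app      # each ConfigMap key appears as a file here
          readOnly: true
        - name: node-logs
          mountPath: /host/var/log
          readOnly: true
  volumes:
    - name: app-config
      configMap:
        name: app-config           # assumed to exist in the same namespace
    - name: node-logs
      hostPath:
        path: /var/log             # directory on the node's filesystem
        type: Directory            # pod fails to start if the path is missing
```

The key under each entry in volumes (configMap, hostPath, emptyDir, nfs, persistentVolumeClaim, and so on) is what chooses the volume type. Because hostPath ties the pod to whatever is on a particular node, it is usually reserved for node-level tooling.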
2. Persistent Volumes (PVs)
A Persistent Volume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically through StorageClasses. PVs are storage resources in the cluster that are independent of the pods that use them.
- PVs are cluster-scoped resources rather than namespaced ones. A PV binds to exactly one PVC at a time, and only pods in that PVC's namespace can use it.
- They support various storage backends, such as AWS EBS, Google Persistent Disk, NFS, and more.
- PVs have their own lifecycle, distinct from pods.
Example PV YAML:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-pv
spec:
  capacity:
    storage: 10Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: my-storage-class
  nfs:
    path: /mnt/data
    server: nfs-server.example.com
3. Persistent Volume Claims (PVCs)
A Persistent Volume Claim (PVC) is a request for storage made by a user; pods then reference the claim to mount the storage. PVCs allow users to claim storage resources in the cluster, which are backed by Persistent Volumes (PVs).
- A PVC specifies how much storage a pod needs and its access mode (e.g., ReadWriteOnce, ReadOnlyMany).
- PVCs are automatically bound to an available PV that matches their storage class, access modes, and capacity requirements.
- If no matching PV exists, Kubernetes can dynamically provision a PV if a StorageClass is specified.
Example PVC YAML:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: my-storage-class
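A claim only becomes useful once a pod mounts it. The following sketch shows one way a pod might consume the my-pvc claim from the example above; the pod name, image, and mount path are assumptions made for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app                     # hypothetical name
spec:
  containers:
    - name: web
      image: nginx:1.25            # assumed image tag
      volumeMounts:
        - name: data
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: my-pvc          # must be in the same namespace as the pod
```

The pod refers only to the claim, never to the PV directly, so the underlying storage can change without touching the pod spec.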
4. Storage Classes
A StorageClass describes a type of storage and how Persistent Volumes of that type are dynamically provisioned. Each StorageClass can offer a different quality of service (e.g., SSD vs HDD, gold vs silver tier). Administrators can create multiple storage classes to provide various types of storage within the cluster.
- StorageClasses enable dynamic provisioning of PVs.
- Each StorageClass points to a provisioner that is responsible for creating volumes (e.g., AWS EBS, GCP PD, or custom provisioners like Ceph, GlusterFS).
Example StorageClass YAML:
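apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-storage-class
# In-tree AWS EBS provisioner; newer clusters use the CSI driver (ebs.csi.aws.com) instead.
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  # The zones parameter is deprecated in favor of allowedTopologies.
  zones: us-west-1a
reclaimPolicy: Retain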
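The zones parameter used above has been superseded by the allowedTopologies field. As a hedged sketch, a zone-restricted class on a newer cluster might look like the following. It assumes the AWS EBS CSI driver is installed and that the driver recognizes the well-known topology.kubernetes.io/zone key (the exact key depends on the driver and its version); the class name is hypothetical.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zonal-ssd                  # hypothetical name
provisioner: ebs.csi.aws.com       # assumes the AWS EBS CSI driver is installed
parameters:
  type: gp3
# Delay provisioning until a pod using the claim is scheduled,
# so the volume is created in a zone the pod can actually run in.
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.kubernetes.io/zone   # assumed key; check the driver's docs
        values:
          - us-west-1a
reclaimPolicy: Retain
```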
5. Dynamic Volume Provisioning
With dynamic provisioning, Kubernetes automatically provisions storage when a PVC is created or, depending on the StorageClass's volume binding mode, when the first pod using that PVC is scheduled. This eliminates the need for administrators to manually provision PVs in advance.
- Dynamic provisioning uses StorageClasses to provision PVs automatically based on the PVC request.
- If a PVC specifies a StorageClass, the dynamic provisioner will create a corresponding PV.
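A claim does not have to name a class at all if the cluster has a default StorageClass. The sketch below marks a class as the default and then creates a claim that omits storageClassName. The class name, claim name, and provisioner are assumptions; substitute whatever provisioner the cluster actually runs.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard                   # hypothetical name
  annotations:
    # Marks this class as the cluster default.
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: ebs.csi.aws.com       # assumed provisioner
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim              # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  # storageClassName is omitted, so the default class is applied.
```

No PV is written by hand here: the provisioner creates one sized to the request and binds it to the claim.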
6. Access Modes
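Access modes define how a volume can be mounted by the nodes in the cluster. Kubernetes supports the following access modes:
- ReadWriteOnce (RWO): The volume can be mounted as read-write by a single node. Multiple pods on that node can still share it.
- ReadOnlyMany (ROX): The volume can be mounted as read-only by many nodes.
- ReadWriteMany (RWX): The volume can be mounted as read-write by many nodes.
- ReadWriteOncePod (RWOP): The volume can be mounted as read-write by a single pod. This mode is supported only for CSI volumes.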
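As an illustration of a multi-node access mode, the sketch below requests a ReadWriteMany claim and mounts it from a two-replica Deployment. It assumes a storage class named nfs-shared backed by storage that supports RWX (NFS, for example); block storage such as AWS EBS generally does not. All names are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-content             # hypothetical name
spec:
  accessModes:
    - ReadWriteMany                # requires a backend that supports RWX
  resources:
    requests:
      storage: 10Gi
  storageClassName: nfs-shared     # hypothetical class backed by NFS
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: content-servers            # hypothetical name
spec:
  replicas: 2                      # replicas may land on different nodes
  selector:
    matchLabels:
      app: content-servers
  template:
    metadata:
      labels:
        app: content-servers
    spec:
      containers:
        - name: web
          image: nginx:1.25        # assumed image tag
          volumeMounts:
            - name: content
              mountPath: /usr/share/nginx/html
      volumes:
        - name: content
          persistentVolumeClaim:
            claimName: shared-content
```

With ReadWriteOnce instead, a second replica scheduled onto a different node would typically be unable to attach the volume.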
7. Volume Reclaim Policy
The reclaim policy of a Persistent Volume determines what happens to the data in the PV after it is released by the PVC. The available reclaim policies are:
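- Retain: The PV and its data are kept after the claim is deleted. The PV is marked as released, and an administrator must clean it up manually before it can be reused.
- Recycle (deprecated): The data in the PV is scrubbed, and the volume is made available for a new claim. Dynamic provisioning is the recommended replacement.
- Delete: The PV and the underlying storage asset in the external infrastructure (e.g., the AWS EBS volume) are deleted.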
8. StatefulSets and Persistent Storage
StatefulSets are used to manage stateful applications, such as databases, where data persistence is crucial. StatefulSets guarantee that pods are created and destroyed in a specific order, and they retain their identity across restarts. Each pod in a StatefulSet can have its own persistent volume, ensuring data persistence.
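StatefulSets work with PVs and PVCs through volumeClaimTemplates: the controller creates one PVC for each pod in the set, and by default that PVC is kept when the pod is rescheduled or the set is scaled down.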
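A minimal sketch of per-pod storage in a StatefulSet follows. It assumes a headless Service named web already exists and reuses my-storage-class from the earlier example; the set name, image, and sizes are illustrative.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                        # hypothetical name
spec:
  serviceName: web                 # headless Service, assumed to exist
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25        # assumed image tag
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  # One PVC is created from this template for each pod.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: my-storage-class
        resources:
          requests:
            storage: 1Gi
```

The generated claims are named after the template, the set, and the pod ordinal: data-web-0, data-web-1, and data-web-2. When web-1 is rescheduled, it reattaches to data-web-1 rather than receiving a fresh volume.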
Best Practices for Kubernetes Persistent Storage
- Use Dynamic Provisioning: Implement dynamic provisioning with StorageClasses to automate the creation and management of Persistent Volumes.
- Use StatefulSets for Stateful Applications: Use StatefulSets for managing stateful applications that require persistent storage.
- Back Up Persistent Volumes: Regularly back up persistent volumes to protect against data loss.
- Use Appropriate Access Modes: Choose the right access mode for your use case (e.g., RWO, ROX, RWX) to ensure the correct level of accessibility.
- Monitor Storage Usage: Monitor storage consumption and ensure that Persistent Volumes are sized appropriately to avoid storage exhaustion.
Conclusion
Persistent storage in Kubernetes is crucial for managing stateful applications. By using Persistent Volumes (PVs), Persistent Volume Claims (PVCs), and StorageClasses, Kubernetes allows users to dynamically provision and manage storage. Understanding how to manage persistent storage effectively is key to deploying reliable and resilient containerized applications in Kubernetes environments.