Store the Redpanda Data Directory in hostPath Volumes
You can configure Redpanda to use Kubernetes hostPath volumes to store the Redpanda data directory. A hostPath volume mounts a file or directory from the host node’s file system into your Pod.
Use hostPath volumes only for development environments. If the Pod is deleted and recreated, it might be scheduled on another worker node and lose access to the data.
Prerequisites
You must have the following:
- Kubernetes cluster: Ensure you have a running Kubernetes cluster, either locally, such as with minikube or kind, or remotely.
- Kubectl: Ensure you have the kubectl command-line tool installed and configured to communicate with your cluster.
- Dedicated directory: Ensure you have a dedicated directory on the host worker node to prevent potential conflicts with other applications or system processes.
- File system: Ensure that the chosen directory is on an ext4 or XFS file system.
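The directory and file system prerequisites can be checked on the worker node before you deploy. The following is a minimal sketch, not part of the Redpanda tooling: the function names is_supported_fs and check_data_dir are hypothetical, and it assumes a Linux host where findmnt (from util-linux) is installed.

```shell
#!/bin/sh
# Hypothetical helper: succeed only for the file systems named in the prerequisites.
is_supported_fs() {
  case "$1" in
    ext4|xfs) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: check that a path is usable as a hostPath data directory.
check_data_dir() {
  dir="$1"
  # hostPath requires an absolute path.
  case "$dir" in
    /*) ;;
    *) echo "not an absolute path: $dir" >&2; return 1 ;;
  esac
  [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
  # findmnt --target reports the file system that contains the given path.
  fstype="$(findmnt -n -o FSTYPE --target "$dir")" || return 1
  is_supported_fs "$fstype" || { echo "unsupported file system: $fstype" >&2; return 1; }
  echo "ok: $dir is on $fstype"
}
```

For example, after sourcing the functions on the worker node, you could run check_data_dir /mnt/redpanda-data, where the path is only a placeholder for your own directory.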
Configure Redpanda to use hostPath volumes
Both the Redpanda Helm chart and the Redpanda custom resource provide an interface for configuring hostPath volumes.
To store Redpanda data in hostPath volumes:
Helm + Operator:

redpanda-cluster.yaml

apiVersion: cluster.redpanda.com/v1alpha2
kind: Redpanda
metadata:
  name: redpanda
spec:
  chartRef: {}
  clusterSpec:
    storage:
      hostPath: "<absolute-path>"
      persistentVolume:
        enabled: false
    statefulset:
      initContainers:
        setDataDirOwnership:
          enabled: true

kubectl apply -f redpanda-cluster.yaml --namespace <namespace>

Helm, using --values:

hostpath.yaml

storage:
  hostPath: "<absolute-path>"
  persistentVolume:
    enabled: false
statefulset:
  initContainers:
    setDataDirOwnership:
      enabled: true

helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
  --values hostpath.yaml --reuse-values

Helm, using --set:

helm upgrade --install redpanda redpanda/redpanda --namespace <namespace> --create-namespace \
  --set storage.hostPath=<absolute-path> \
  --set storage.persistentVolume.enabled=false \
  --set statefulset.initContainers.setDataDirOwnership.enabled=true

These settings do the following:
- storage.hostPath: Absolute path on the host to store the Redpanda data directory.
- storage.persistentVolume.enabled: Determines whether a PersistentVolumeClaim (PVC) is created for the Redpanda data directory. When set to false, a PVC is not created.
- statefulset.initContainers.setDataDirOwnership.enabled: Enables the init container that sets write permissions on the data directories.
  Pods that run Redpanda brokers must have read/write access to their data directories. The initContainer is responsible for setting write permissions on the data directories. By default, statefulset.initContainers.setDataDirOwnership is disabled because most storage drivers call SetVolumeOwnership to give Redpanda permissions to the root of the storage mount. However, some storage drivers, such as hostPath, do not call SetVolumeOwnership. In this case, you must enable the initContainer to set the permissions.
  To set permissions on the data directories, the initContainer must run as root. However, be aware that an initContainer running as root can introduce the following security risks:
  - Privilege escalation: If attackers gain access to the initContainer, they can escalate privileges to gain full control over the system. For example, attackers could use the initContainer to gain unauthorized access to sensitive data, tamper with the system, or start denial-of-service attacks.
  - Container breakouts: If the container is misconfigured or the container runtime has a vulnerability, attackers could escape from the initContainer and access the host operating system.
  - Image tampering: If attackers gain access to the container image of the initContainer, they could add malicious code or backdoors to it. Image tampering could compromise the security of the entire cluster.
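To make the permission requirement concrete, the following sketch shows the kind of ownership change involved and one way to verify it on the worker node. It is an illustration only, not the chart's actual init container: the function names, the example path, and the 101:101 owner in the final comment are assumptions, so check the security context configured for your own deployment. It relies on GNU coreutils stat.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric owner of a directory as "uid:gid".
# Uses GNU coreutils stat; the -c option is not available on BSD/macOS stat.
dir_owner() {
  stat -c '%u:%g' "$1"
}

# Hypothetical helper: recursively change ownership of the data directory,
# then confirm that the directory now reports the requested owner.
# Changing ownership to another user requires root.
set_data_dir_owner() {
  dir="$1"
  owner="$2"   # numeric "uid:gid", supplied by the caller
  chown -R "$owner" "$dir" || return 1
  [ "$(dir_owner "$dir")" = "$owner" ]
}

# Example with assumed values (run as root on the worker node):
#   set_data_dir_owner /mnt/redpanda-data 101:101
```

Running dir_owner against the host directory after the Pod starts is a quick way to confirm that the permissions were applied.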
Suggested reading
- For details about hostPath volumes, see the Kubernetes documentation.
Next steps
Monitor disk usage to detect issues early, optimize performance, and plan capacity.
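As a starting point for a hostPath setup, disk usage can be checked directly on the worker node. The following is a minimal sketch under stated assumptions: the function names disk_used_percent and disk_usage_ok are hypothetical, the threshold is an arbitrary example, and the parsing assumes the POSIX df -P output format with no spaces in the file system name.

```shell
#!/bin/sh
# Hypothetical helper: print the used-space percentage of the file system
# that contains the given path, parsed from POSIX `df -P` output.
disk_used_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: succeed while usage is below the given threshold.
disk_usage_ok() {
  dir="$1"
  threshold="$2"
  used="$(disk_used_percent "$dir")" || return 1
  [ -n "$used" ] || return 1
  if [ "$used" -ge "$threshold" ]; then
    echo "warning: $dir is at ${used}% (threshold ${threshold}%)" >&2
    return 1
  fi
  echo "ok: $dir is at ${used}%"
}

# Example with assumed values:
#   disk_usage_ok /mnt/redpanda-data 80
```

A check like this only covers the node it runs on, so it complements rather than replaces cluster-level monitoring.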