OpenEBS ZFS LocalPV delivers powerful and notably fast storage for Kubernetes environments. However, a common hurdle appears when attempting to use these volumes with containers operating under non-root privileges. Let’s dissect why this occurs specifically with native ZFS volumes and outline the effective solution.

How OpenEBS ZFS provisions volumes

OpenEBS ZFS LocalPV offers several methods for volume creation:

  1. Native ZFS Volume (using fstype: "zfs"): This method carves out a ZFS filesystem directly within your ZFS pool. It stands out as the fastest and most straightforward approach.
  2. Volume with another filesystem (e.g., fstype: "ext4"): This creates a ZFS zvol (a block device inside the pool), which is then formatted with a different filesystem such as ext4 or XFS.

This guide concentrates on the first scenario: native ZFS volumes.
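
On the node itself, the difference between the two modes is easy to see with standard ZFS tooling. The command below is a sketch, assuming a pool named zfspv-pool as used later in this guide; the provisioner names each dataset after its PV (pvc-<uid>):

# On the node hosting the pool:
# a native ZFS volume (fstype: "zfs") appears as a regular dataset,
# while an ext4/XFS volume appears as a zvol (block device) that gets formatted.
zfs list -t filesystem,volume -r zfspv-pool -o name,type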

The problem

Native ZFS volumes (fstype: "zfs"), despite their efficiency, exhibit a particular characteristic: the OpenEBS ZFS CSI driver, by default, does not manage file ownership or permissions within the mounted volume.

The practical consequence is:

  • Setting securityContext.runAsUser and securityContext.fsGroup in a Pod definition is standard practice, expecting the volume to become writable by the specified user and group.
  • Yet, upon mounting a native ZFS volume, its entire content initially belongs to root:root (UID 0, GID 0).
  • The fsGroup setting, in isolation, fails to alter the ownership of the volume’s contents in this native mode.
  • This directly blocks containers running as non-root users from writing data. While an initContainer executing chown/chmod can serve as a workaround (sketched below), it introduces complexity and potential delays.
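
For completeness, this is roughly what the initContainer workaround looks like. It is only a sketch (the image and paths are illustrative, matching the demo Deployment further down); the rest of this guide shows a cleaner fix:

# deployment.yaml (snippet) – workaround, not the recommended fix
    spec:
      initContainers:
        - name: fix-perms
          image: busybox:latest
          command: ["sh", "-c", "chown -R 1000:1000 /srv && chmod -R u+rwX,g+rwX /srv"]
          securityContext:
            runAsUser: 0          # must run as root to change ownership
          volumeMounts:
            - name: data
              mountPath: /srv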

Consider this example setup using Kustomize and Helm (certain boilerplate like ns.yaml is omitted).

Kustomization (kustomization.yaml)

namespace: openebs
resources:
- ns.yaml
- storageClass.yaml
helmCharts:
- name: openebs
  includeCRDs: true
  valuesInline:
    engines:
      local:
        lvm:
          enabled: false
      replicated:
        mayastor:
          enabled: false
    zfs-localpv:
      enabled: true
      analytics:
        enabled: false
      backupGC:
        enabled: true
      zfsNode:
        kubeletDir: "/var/lib/kubelet"
      crds:
        zfsLocalPv:
          enabled: true
        csi:
          volumeSnapshots:
            enabled: false
    lvm-localpv:
      crds:
        lvmLocalPv:
          enabled: false
  releaseName: openebs
  version: 4.2.0
  repo: https://openebs.github.io/openebs
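
Because the kustomization pulls in a Helm chart, rendering it requires Helm support to be enabled in kustomize. A typical invocation (assuming the standalone kustomize CLI and kubectl are available) looks like this:

# Render the Helm chart and apply everything in one go
kustomize build --enable-helm . | kubectl apply -f -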

StorageClass (storageClass.yaml)

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-zfspv
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
parameters:
  recordsize: "128k"
  compression: "on"
  dedup: "on"
  thinProvision: "yes"
  fstype: "zfs"
  poolname: "zfspv-pool"
allowVolumeExpansion: true
provisioner: zfs.csi.openebs.io
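
Once a volume has been provisioned from this StorageClass, the requested parameters can be verified directly on the node. The dataset name below is a placeholder; the provisioner derives it from the PV name:

# On the node: confirm the ZFS properties requested via the StorageClass
zfs get recordsize,compression,dedup zfspv-pool/pvc-<uid>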

Sample Deployment (deployment.yaml)

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: demo
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
      restartPolicy: Always
      enableServiceLinks: false
      containers:
        - name: demo
          image: busybox:latest
          command: ["/bin/sh", "-c", "sleep 3600"]
          securityContext:
            runAsUser: 1000
            runAsGroup: 1000
            runAsNonRoot: true
            allowPrivilegeEscalation: false
          volumeMounts:
            - name: data
              mountPath: /srv
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: demo-pvc
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: openebs-zfspv
  resources:
    requests:
      storage: 10Gi

Deploying this configuration and executing a shell within the demo container (kubectl exec -it <pod-name> -- sh) reveals that /srv is owned by root:root. Any attempt to write (e.g., touch /srv/test.txt) fails with a “Permission denied” error, as the container operates under user ID 1000.
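
An illustrative session (exact output may differ slightly between images) looks like this:

# Inside the container
id                     # uid=1000 gid=1000
ls -ld /srv            # owned by root root
touch /srv/test.txt    # touch: /srv/test.txt: Permission denied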

The common (ineffective here) suggestion

Searching for solutions often points towards using fsGroupChangePolicy: OnRootMismatch within the Pod’s securityContext:

# deployment.yaml (snippet)
# ...
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
        fsGroupChangePolicy: OnRootMismatch
# ...

Unfortunately, for OpenEBS native ZFS volumes, this policy has no effect. The volume’s contents stubbornly remain owned by root:root.
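
You can confirm this on the cluster by inspecting the CSIDriver object itself (the command assumes kubectl access). Out of the box its spec.fsGroupPolicy is not set to File, which is exactly why the fsGroup is never applied:

kubectl get csidriver zfs.csi.openebs.io -o jsonpath='{.spec.fsGroupPolicy}'
# typically empty (or a non-File value) before the patch from the next section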

The real solution

We just need to patch the CSIDriver object created by OpenEBS and set its spec.fsGroupPolicy field to File.

Using the phrase ‘patch the driver’ for this fix definitely brings back memories – it feels very reminiscent of troubleshooting in the late 90s and early 2000s.

The File policy tells Kubernetes that the driver’s volumes support ownership management, so during the mount operation the kubelet recursively applies the Pod’s fsGroup, changing ownership and permissions on all files and directories within the volume.

Kustomize provides a clean way to apply this patch:

Update kustomization.yaml:

# kustomization.yaml
namespace: openebs
resources:
- ns.yaml
- storageClass.yaml
patches:
- path: patch_csidriver.yaml
helmCharts:
# ... (rest of helm chart definition)

Patch file (patch_csidriver.yaml):

# patch_csidriver.yaml
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: zfs.csi.openebs.io
spec:
  fsGroupPolicy: File

Key considerations:

  • Existing Pods (and, in some cases, PVCs/PVs created before the patch) may need to be restarted or recreated for the change to take effect, since ownership is applied when the volume is mounted (see the restart example below). New volumes automatically adhere to the updated policy.
  • The CSIDriver object is cluster-scoped; this modification impacts all volumes provisioned by the zfs.csi.openebs.io driver on the cluster.
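
For the demo Deployment above, recreating the Pod is a one-liner; because the strategy is Recreate, the old Pod is removed before the new one mounts the volume under the updated policy:

# Run in the namespace where the demo Deployment lives
kubectl rollout restart deployment/demo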

Verification

After applying the patch and ensuring the relevant Pods are restarted, re-enter the container shell. Check the ownership of the mount point:

# Inside the container
ls -narth /srv

The output should now indicate that /srv is owned by root:<fsGroup>; since the -n flag prints numeric IDs, expect 0 1000 on the mount point (the . entry) and its contents. Consequently, the non-root container (running as user 1000, group 1000) has the permissions it needs to write within /srv.
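
As a final check, a write from the non-root user should now succeed:

# Inside the container
touch /srv/test.txt
ls -ln /srv/test.txt   # expect uid 1000 / gid 1000 on the new file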

Downside

  • Performance impact: with fsGroupPolicy: File, the kubelet performs a recursive chown/chmod across the entire volume on every mount. For volumes containing an exceptionally large number of files, this can introduce noticeable delays during Pod startup. While generally acceptable for typical workloads, it is a factor to consider if rapid startup is critical. In such scenarios, if permissions are managed some other way inside the application, running the primary container process as root might be a simpler, albeit less secure, alternative (a partial mitigation is also sketched below).
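
One way to soften this cost: once fsGroupPolicy is File, the fsGroupChangePolicy: OnRootMismatch setting shown earlier is no longer a no-op. It lets the kubelet skip the recursive walk whenever the ownership and permissions of the volume root already match the requested fsGroup, so the full pass typically happens only on the first mount:

# deployment.yaml (snippet)
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
        fsGroupChangePolicy: OnRootMismatch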

Conclusion

Native ZFS volumes via OpenEBS ZFS LocalPV offer compelling performance advantages. The default permission handling can hinder non-root container operation, but this is readily addressed by configuring fsGroupPolicy: File on the zfs.csi.openebs.io CSIDriver object. This adjustment ensures that appropriate permissions are automatically enforced, enabling non-root applications to utilize ZFS storage effectively.

Good luck!