k3s on Raspberry Pi: Static Persistent Volumes
In the previous post, we succeeded in giving our docker registry some persistent storage. However, it used (the default) dynamic provisioning, which means we don’t have as much control over where the storage is provisioned.
The local-path storage class
Previously, we created a PersistentVolumeClaim:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: docker-registry-pvc
  namespace: docker-registry
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
And then we attached it to our container:
...
volumes:
  - name: docker-registry-vol
    persistentVolumeClaim:
      claimName: docker-registry-pvc
Because we didn’t specify anything in the claim, the default storage class is used.
For k3s, this is local-path:
$ kubectl get storageclass
NAME                   PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path   Delete          WaitForFirstConsumer   false                  161d
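As an aside, that "(default)" marker comes from an annotation on the StorageClass object. If you want to check it directly (the jsonpath escaping is fiddly, but this works):

$ kubectl get storageclass local-path -o jsonpath='{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}'

On a stock k3s install, that should print true.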
By default, this provisions storage in the /var/lib/rancher/k3s/storage directory on the local filesystem of each node:
$ kubectl --namespace kube-system get cm local-path-config -o yaml
apiVersion: v1
data:
  config.json: |-
    {
      "nodePathMap":[
        {
          "node":"DEFAULT_PATH_FOR_NON_LISTED_NODES",
          "paths":["/var/lib/rancher/k3s/storage"]
        }
      ]
    }
...
If we wanted to provision storage on a different filesystem, we could edit the ConfigMap appropriately.
For example, some of our nodes might have extra storage, or we might want to use an NFS server for shared storage.
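For instance, here's a sketch (with a made-up mount point): if rpi405 had an SSD mounted at /mnt/usb-ssd, we could give it its own entry in the nodePathMap, leaving the other nodes on the default path:

{
  "nodePathMap":[
    {
      "node":"DEFAULT_PATH_FOR_NON_LISTED_NODES",
      "paths":["/var/lib/rancher/k3s/storage"]
    },
    {
      "node":"rpi405",
      "paths":["/mnt/usb-ssd/k3s-storage"]
    }
  ]
}

You'd edit that into the config.json key of the local-path-config ConfigMap shown above; the provisioner should pick up the change on its own (and if it doesn't, restarting its pod in kube-system does the trick).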
Static Provisioning
Another way to do it would be to use static provisioning, where we explicitly create a PersistentVolume, and then specify it in our PersistentVolumeClaim.
One example I found, in Deploy Your Private Docker Registry as a Pod in Kubernetes, gives us the following:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: docker-repo-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /tmp/repository
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: docker-repo-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
If you apply the above, you’ll end up with a dynamic PVC (and PV) and an (unused) static PV:
$ kubectl get pvc
NAME              STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
docker-repo-pvc   Bound    pvc-938faff3-c285-4105-83bc-221bb93a6603   1Gi        RWO            local-path     8m32s
$ kubectl get pv
NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                     STORAGECLASS   REASON   AGE
docker-repo-pv                             1Gi        RWO            Retain           Available                                                     18m
pvc-938faff3-c285-4105-83bc-221bb93a6603   1Gi        RWO            Delete           Bound       default/docker-repo-pvc   local-path              3m42s
You can see that we have one PVC listed, but two PVs. One of the PVs is marked “Available”, showing that it’s unused.
For it to be used, there has to be some kind of association between the claim and the volume.
One way to do that is shown in the Configure a Pod to Use a PersistentVolume for Storage walkthrough: it puts an explicit storageClassName: manual on the volume, and the claim requests the same class name, which is what binds the two together.
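Condensed from that walkthrough, the shape of it is something like this (the names, hostPath, and sizes are the walkthrough's, not mine):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: task-pv-volume
spec:
  storageClassName: manual   # an invented class name on the volume...
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: task-pv-claim
spec:
  storageClassName: manual   # ...requested by the same name in the claim
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi

Since no provisioner owns the manual class, nothing can be dynamically provisioned for the claim; instead, the controller falls back to binding it against an existing volume in that class.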
It looks like you can invent storage classes to describe whether you’re providing (for example) slow-but-reliable vs. fast-but-unreliable storage. Then you’d provision a number of PV instances (with whatever backing store is relevant). Then each claim can use the appropriate storage class name to request the type of storage (reliable or fast) it needs.
Presumably there’s a storage scheduler in there somewhere that matches these up in some “optimal” way.
Labels and selectors
Another way to do this is to use labels on the PV, and a matching selector on the PVC.
Let’s experiment with that by converting our iSCSI deployment to use a PV and PVC.
The simplest way to do this seems to be to put an app: testing label on the volume and then use it as the selector in the claim:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: testing-vol-pv
  labels:
    app: testing   # <--
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  iscsi:
    targetPortal: 192.168.28.124:3260
    iqn: iqn.2000-01.com.synology:ds211.testing.25e6c0dc53
    lun: 1
    readOnly: false
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: testing-vol-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      app: testing   # <--
If we apply that:
$ kubectl apply -f testing-ubuntu-pv.yml
persistentvolume/testing-vol-pv configured
persistentvolumeclaim/testing-vol-pvc created
$ kubectl get pv
NAME             CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
testing-vol-pv   1Gi        RWO            Retain           Available                                   101s
$ kubectl get pvc
NAME              STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
testing-vol-pvc   Pending                                      local-path     21s
If we look into why it’s still Pending (note, in passing, that the claim has been given the default local-path storage class):
$ kubectl describe pvc testing-vol-pvc
...
Events:
  Type    Reason                Age                From                         Message
  ----    ------                ----               ----                         -------
  Normal  WaitForFirstConsumer  13s (x3 over 41s)  persistentvolume-controller  waiting for first consumer to be created before binding
OK then. Let’s create our first consumer. We’ll update our deployment:
...
volumes:
  - name: testing-vol
    persistentVolumeClaim:
      claimName: testing-vol-pvc
…and fire it up:
$ kubectl apply -f testing-ubuntu.yml
deployment.apps/testing created
That doesn’t work. It remains Pending. Let’s investigate:
$ kubectl get events
LAST SEEN   TYPE      REASON                 OBJECT                                  MESSAGE
17s         Normal    ExternalProvisioning   persistentvolumeclaim/testing-vol-pvc   waiting for a volume to be created, either by external provisioner "rancher.io/local-path" or manually created by system administrator
14s         Normal    Provisioning           persistentvolumeclaim/testing-vol-pvc   External provisioner is provisioning volume for claim "default/testing-vol-pvc"
14s         Warning   ProvisioningFailed     persistentvolumeclaim/testing-vol-pvc   failed to provision volume with StorageClass "local-path": claim.Spec.Selector is not supported
Ah. The local-path provisioner doesn’t support selectors, and as long as the claim falls under the default storage class, the dynamic provisioner will keep trying (and failing) to satisfy it. We need to move both the volume and the claim into a different storage class, one that no provisioner owns. Let’s try that.
However, we first need to delete the PV and PVC, because we’re not allowed to change the spec after they’ve been provisioned.
Deleting a PersistentVolume and PersistentVolumeClaim
$ kubectl delete pv testing-vol-pv
persistentvolume "testing-vol-pv" deleted
$ kubectl delete pvc testing-vol-pvc --grace-period=0 --force
warning: Immediate deletion does not wait for confirmation that the running resource has been terminated. The resource may continue to run on the cluster indefinitely.
persistentvolumeclaim "testing-vol-pvc" force deleted
Adding a storageClassName
Update the testing-ubuntu-pv.yml file:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: testing-vol-pv
  labels:
    app: testing
spec:
  storageClassName: iscsi   # <--
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  iscsi:
    targetPortal: 192.168.28.124:3260
    iqn: iqn.2000-01.com.synology:ds211.testing.25e6c0dc53
    lun: 1
    readOnly: false
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: testing-vol-pvc
spec:
  storageClassName: iscsi   # <--
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  selector:
    matchLabels:
      app: testing
Then reapply it (and the deployment):
$ kubectl apply -f testing-ubuntu-pv.yml
persistentvolume/testing-vol-pv configured
persistentvolumeclaim/testing-vol-pvc created
$ kubectl apply -f testing-ubuntu.yml
It seems to work:
$ kubectl get pods -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP           NODE     NOMINATED NODE   READINESS GATES
testing-5cb897dd66-sk5zs   1/1     Running   0          83s   10.42.2.41   rpi405   <none>           <none>
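If you want to double-check the binding, ask for the claim again:

$ kubectl get pvc testing-vol-pvc

It should now show Bound, with testing-vol-pv in the VOLUME column rather than a generated pvc-... name.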
Is it persistent? Across nodes?
We’re on the home straight at this point.
$ kubectl exec --stdin --tty testing-5cb897dd66-sk5zs -- /bin/bash
# ls /var/lib/testing
kilroy-was-here
Cool, so we didn’t lose anything while we were messing around.
# touch /var/lib/testing/kilroy-was-still-here
Let’s force the pod to move:
$ kubectl cordon rpi405
$ kubectl delete pod testing-5cb897dd66-sk5zs
$ kubectl get pods -o wide
NAME                       READY   STATUS    RESTARTS   AGE    IP           NODE     NOMINATED NODE   READINESS GATES
testing-5cb897dd66-xmjt5   1/1     Running   0          100s   10.42.3.42   rpi404   <none>           <none>
New pod, different node.
$ kubectl exec --stdin --tty testing-5cb897dd66-xmjt5 -- /bin/bash
# ls /var/lib/testing/
kilroy-was-here kilroy-was-still-here
Same content. We’re done.
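Well, almost. One bit of tidying up: rpi405 is still cordoned from earlier, so let it take pods again:

$ kubectl uncordon rpi405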
What was the point?
That seems like a lot of effort to get from where we were (a single pod with persistent storage on an iSCSI target) to where we are: a single pod with persistent storage on an iSCSI target, except now specified with a volume claim.
Honestly, at this point, I don’t have a definitive reason for that.
I suspect that if I’d provisioned a number of iSCSI targets and created a volume for each, then Kubernetes (or k3s) could have used them to fulfill a corresponding number of volume claims without needing to worry specifically about what went where, as long as the storage class (iscsi) matched up.
In a word: abstraction.
We’re moving the volume details one step back from the pod, allowing us more flexibility in how we provision it.
Maybe it would be clearer if the PV and PVC weren’t in the same YAML file.
The claim should go in the deployment YAML. It’s saying “this deployment needs a volume; it should have this much capacity, and it should be in this storage class”. It’s up to the scheduler to match that up with an available volume, or to provision one dynamically.
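As a sketch of what I mean (the image and command here are placeholders; the real testing-ubuntu.yml has its own), the claim travels with the deployment, while the volume is defined elsewhere:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: testing-vol-pvc
spec:
  storageClassName: iscsi   # "some volume from the iscsi class, please"
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: testing
spec:
  replicas: 1
  selector:
    matchLabels:
      app: testing
  template:
    metadata:
      labels:
        app: testing
    spec:
      containers:
        - name: testing
          image: ubuntu                    # placeholder image
          command: ["sleep", "infinity"]   # placeholder: keep the pod alive
          volumeMounts:
            - name: testing-vol
              mountPath: /var/lib/testing
      volumes:
        - name: testing-vol
          persistentVolumeClaim:
            claimName: testing-vol-pvc

Note that there’s no selector on the claim any more; it just names a class and a size, and leaves the matching to Kubernetes.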
Ideally, we’d invent a storage provisioner that used the Synology NAS API (does it have one?) to carve out targets and LUNs on demand. That’d be cool, but it’s overkill for my home test lab.
We’ve also not really solved our “node is unavailable, we can’t start the pod” problem either. We’ve just moved that one step back as well. Now we have a “NAS is unavailable, we can’t start the pod” problem.
But, in theory, a reasonable NAS system (even a 10-year old one) is more reliable than a bunch of Raspberry Pi 4s precariously balanced on the edge of my desk.