Upgrading Longhorn: Incident Review
A new release of Longhorn came out a few days ago. I tried upgrading. It did not go well. This is the incident review.
I’ve added a Raspberry Pi node to my K3s cluster, and I don’t want it to take part in Longhorn replication.
As part of my over-engineered doorbell project, I’ve added a new Raspberry Pi node to my K3s cluster, but none of the daemonsets are being scheduled onto it.
This afternoon, I couldn’t run kubectl get namespaces against my K3s cluster. Instead, I got an Unauthorized error.
In a previous post, I showed how to store sys.config and vm.args in a ConfigMap. There are some problems with that approach, so I’m going to use Kustomize to fix that.
My K3s cluster uses MetalLB as a bare-metal “load-balancer”, and I wondered how it worked. Here’s what I found.
I installed MetalLB using helm. Upgrading is a one-liner.
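Roughly, that looks like this (assuming the release is called metallb and lives in the metallb-system namespace, as in the MetalLB docs):

$ helm repo update
$ helm upgrade metallb metallb/metallb --namespace metallb-system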
Erlang configuration is typically stored in the sys.config file. What options do we have if we want to have different settings in this file when deploying using Kubernetes?
About 3 months ago, the Erlang nodes in my cluster stopped talking to each other. This was caused by the CA certificate expiring. In that post, I asked “How do we roll the CA certificate without downtime?”. This post explores some options.
I’m running Gitea and ArgoCD on my K3s cluster, for some GitOps goodness. I noticed that Gitea 1.19.0 recently came out, with some features I want to try, such as Gitea Actions. Since I’m running 1.17.4, it’s time to upgrade.
When I originally installed MetalLB, it used a ConfigMap for setting up address ranges. Since 0.13.2, it supports configuration using Custom Resource Definitions (CRDs). I forgot to write a blog post about that when I upgraded.
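As a rough sketch of what the CRD-based configuration looks like (the pool name and address range below are placeholders, not my actual values):

$ kubectl apply -f - <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: default-pool
  namespace: metallb-system
spec:
  addresses:
    - 192.168.28.10-192.168.28.40
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: default
  namespace: metallb-system
EOF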
In Expired Certificates: Incident Review, I listed a future action: “Audit the cluster to see if there are any other TLS secrets that aren’t using cert-manager.” Here’s how I did it using Elixir Livebook.
On the morning of March 18th, 2023, while I was investigating this incident, Firefox reported a “your connection is not secure” error when connecting to my Gitea server. This is the incident review.
On the morning of March 18th, 2023, accessing any sites hosted by my Kubernetes cluster would fail with a “Connection Timed Out” error. This is the incident review.
In this post, I showed how to access the Erlang console via SSH using kubectl port-forward.
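The general shape of that is below; the namespace, pod name and port are illustrative, not necessarily what the post uses.

# forward a local port to the SSH daemon running inside the pod
$ kubectl --namespace erlclu port-forward pod/erlclu-abc123 10022:22
# then connect to the forwarded port
$ ssh -p 10022 localhost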
The nodes in the cluster stopped talking to each other at some point. I only noticed this afternoon, while investigating some other problem.
I want visitors to http://home.k3s.differentpla.net to be redirected to https://home.k3s.differentpla.net. Here’s how to set that up on K3s, using Traefik middlewares.
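The rough idea is a redirectScheme middleware; here’s a sketch (the resource names are placeholders, and the apiVersion assumes the Traefik 2.x bundled with K3s):

$ kubectl apply -f - <<EOF
apiVersion: traefik.containo.us/v1alpha1
kind: Middleware
metadata:
  name: redirect-https
  namespace: default
spec:
  redirectScheme:
    scheme: https
    permanent: true
EOF

The Ingress then opts in with the traefik.ingress.kubernetes.io/router.middlewares: default-redirect-https@kubernetescrd annotation (the value is namespace-name@kubernetescrd).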
This started as a quick experiment to spin up an Erlang cluster on Kubernetes, using TLS distribution, and to validate using cert-manager to issue the pod certificates.
In this post, I showed how to use an init container to create CertificateRequest objects, which cert-manager signs, returning the certificates. A new request is created every time a pod starts. This eventually leaves a lot of stale CertificateRequest objects. We should clean those up.
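The manual version of that clean-up looks something like this (the erlclu namespace and request name are placeholders):

# issued requests are just records; the certificates have already been handed out
$ kubectl --namespace erlclu get certificaterequests
# deleting a CertificateRequest does not revoke the certificate it produced
$ kubectl --namespace erlclu delete certificaterequest erlclu-abc123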
In a previous post, I used sleep 5s to wait for cert-manager to complete the CertificateRequest. Instead, we should poll the status field.
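For example, something along these lines waits for the Ready condition rather than sleeping for a fixed time (the request name is a placeholder; the init container may well do this differently):

$ kubectl --namespace erlclu wait certificaterequest/erlclu-abc123 \
    --for=condition=Ready --timeout=60s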
In an earlier post, I used a ClusterIssuer that I originally created when first setting up cert-manager. That needs fixing.
While scaling up/down the deployment for my Erlang cluster, I regularly refresh the web page that displays cluster members. Occasionally, I get a “502 Bad Gateway” error from Traefik. What’s with that?
In the previous post, I showed how to enable SSH access to the Erlang remote console on a pod. When we left it, it had no authentication. Let’s fix that.
As mentioned earlier, using TLS for Erlang distribution breaks erlclu remote_console (because it breaks erl_call). At the time, I worked around the problem by using nodetool. This post shows how to use Erlang’s SSH daemon instead.
In the previous two posts, we generated signing requests with OpenSSL and submitted them to cert-manager. In this post, we’ll actually use the generated certificates for mutual TLS.
In the previous post we used OpenSSL to create a certificate signing request. In this post, we’ll submit it to cert-manager and get the certificate back.
As explained here, I’m going to use an init container to issue the pod certificates.
In the previous post, we got clustering working without TLS. Lifting from the investigation that I wrote up here, I’ll add TLS distribution to my Erlang cluster, but only with server certificates and with no verification (for now).
Based on my investigation with libcluster in Elixir, I’ve decided to use DNS-based discovery for finding the other Erlang nodes in the cluster. To do this, we’ll need a headless service.
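A headless service (one with clusterIP: None) makes cluster DNS return one A record per pod, rather than a single service IP, which is exactly what the discovery needs. You can see this from inside the cluster; the service and namespace names below are assumptions:

$ kubectl --namespace erlclu run -it --rm dns-test --image=busybox --restart=Never -- \
    nslookup erlclu-headless.erlclu.svc.cluster.local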
I’ve covered this previously; see “Erlang/Elixir Cookies and Kubernetes”. Here’s the quick version.
I noticed that whenever I made any change to the application, it caused the docker/podman build to re-fetch and recompile all of the dependencies. On the tiny laptop I was using at the time, this was taking several extra minutes for every build.
Because this is going to be a cluster of Erlang nodes, there’s (obviously) going to be more than one instance. It makes sense to add some kind of “whoami” page, so that we can clearly see which node we’re talking to.
For simplicity’s sake, I created a new application with rebar3 new app name=erlclu. I very soon regretted this decision, because I actually needed a release, so I ran rebar3 new release name=whoops and manually merged the relevant pieces together.
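With hindsight, starting from the release template would have avoided the merge:

# creates an OTP application plus a release (relx) configuration, rather than
# just the application skeleton that 'rebar3 new app' gives you
$ rebar3 new release name=erlclu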
A few weeks ago, I decided to write a blog post about using mutual TLS to secure Erlang distribution (clustering), with auto-provisioning of certificates when running in Kubernetes. It took a little longer to write up than I expected, and turned into a series of blog posts.
You’re using kubectl to do something; you want to do the same using the Kubernetes API (e.g. with curl). How do you figure out what kubectl is doing?
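The quickest trick I know of is to turn up kubectl’s verbosity: at -v=8 it logs the request URLs and (truncated) request/response bodies it sends to the API server, and -v=9 stops truncating them. For example:

$ kubectl get namespaces -v=8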
I want to set up an Erlang cluster in Kubernetes, using TLS with mutual authentication. This post discusses some of the potential options for doing that. It’s also applicable to general mutual TLS between pods.
I’ve got an Electric Imp Environment Tail in my office. It monitors the temperature, humidity and pressure. Currently, to display a graph, it’s using flot.js and some shonky Javascript that I wrote. It remembers samples from the last 48 hours.
The documentation for VictoriaMetrics is a bit of a mess, so here’s what worked for me.
Using ArgoCD CLI:
ArgoCD provides a web interface and a command line interface. Let’s install the CLI.
As I add more things to my k3s cluster, I find myself wishing that I had a handy index of their home pages. For example, I’ve got ArgoCD and Gitea installed. I probably want to expose the Longhorn console, and the Kubernetes console. I think Traefik has a console, too. I’ll also be adding Grafana at some point soon.
Because I like experimenting with Kubernetes from Elixir Livebook, I made the service account into a cluster admin.
Up to this point, I’ve been creating and installing certificates manually. Let’s see if cert-manager will make that easier.
There’s a security fix that needs to be applied; there’s an arm64 release candidate. Time to upgrade ArgoCD.
I’ve got Gitea installed on my cluster, but it’s currently accessed via HTTP (i.e. no TLS; it’s not secure).
We’re using ArgoCD at work; time to play with it.
My Gitea instance isn’t using TLS, so I’m going to replace the LoadBalancer with an Ingress, which will allow TLS termination.
I want to play with GitOps on my k3s cluster (specifically ArgoCD). To do that, I’m going to need a local git server. I decided to use Gitea.
If you want to access the Kubernetes API from Elixir, you should probably just use the k8s package, but here’s how to do it without taking that dependency.
Distributed Erlang and Elixir applications use a shared secret called a “cookie”. It’s just a string of alphanumeric characters. All nodes in the cluster must use the same cookie. Let’s take a look at what that means in a Kubernetes context.
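In Kubernetes terms, that usually means putting the cookie in a Secret and getting it into every pod, via a volume mount or an environment variable. A minimal sketch (the secret name and key are mine, not necessarily what I used elsewhere):

# the cookie is just an alphanumeric string, so strip the base64 punctuation
$ kubectl create secret generic erlang-cookie \
    --from-literal=cookie="$(openssl rand -base64 32 | tr -d '/+=')"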
I’m still on the hunt for a way to connect Erlang nodes in a Kubernetes cluster by using pod names.
I’m looking at setting up an Erlang/Elixir cluster in my Kubernetes cluster, using libcluster, and I’m trying to get my head around some of the implied constraints.
Messing around with the Kubernetes API from inside a container.
$ kubectl proxy
Starting to serve on 127.0.0.1:8001
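With the proxy running, the API is available on that local port without any further authentication fiddling. For example:

$ curl http://127.0.0.1:8001/api/v1/namespaces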
A while ago, I asked why pod names don’t resolve in DNS, and never really got a satisfactory answer. One way you can connect to a pod (rather than with a service) is to use the dashed-IP form of the pod address, e.g. 10-42-2-46.default.pod.cluster.local. Here’s how it works.
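You can see it from any pod in the cluster; with the default CoreDNS “pods insecure” setting, the name simply decodes back to the IP address, without any lookup of actual pods:

$ nslookup 10-42-2-46.default.pod.cluster.local
# resolves to 10.42.2.46, whether or not a pod with that address exists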
Installation with Helm, per https://metallb.universe.tf/installation/#installation-with-helm.
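Those instructions boil down to the following (putting it in a metallb-system namespace is my choice):

$ helm repo add metallb https://metallb.github.io/metallb
$ helm install metallb metallb/metallb --namespace metallb-system --create-namespace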