Erlang Clustering: a survey
A question on Mastodon asks “What are people using [for cluster management] in 2023?”. I thought I’d address a couple of hidden assumptions in the question and do a quick survey of what’s available.
By default, the improved logger originally introduced in Erlang/OTP 21 doesn’t support per-level colours. This is something that I miss from lager and from Elixir’s Logger. Here’s a simple way to implement something like lager.
In the previous posts, we gathered metrics from Cowboy and Hackney. I’d like to publish the metrics to Prometheus.
In the previous post of the series, I added basic metrics reporting to Cowboy, and simply wrote them to the logger. In this post, I’m going to do the same for the HTTP client, Hackney.
The ErlangLS extension for VS Code includes formatting, using rebar3_format. I’d prefer to use erlfmt, so here’s how it should be set up.
Cowboy is probably the most popular HTTP server for the Erlang and Elixir ecosystem. Here’s how to get metrics from it.
Building and installing Erlang is almost compatible with my use erlang rule for direnv. Here’s how to bodge it.
There’s more comprehensive information in various files in the HOWTO directory, but ain’t nobody got time for that. This works for me.
Some notes about hacking on and contributing to Erlang, because I don’t do it frequently enough to have all of this in muscle memory.
You’re experimenting with Erlang’s built-in SSH daemon, and it fails with “No host key available”. What’s up with that?
The nodes in the cluster stopped talking to each other at some point. I only noticed this afternoon after investigating some other problem.
For when you want to run both per-suite and per-test setup and cleanup in eunit tests.
This started as a quick experiment to spin up an Erlang cluster on Kubernetes, using TLS distribution, and to validate using cert-manager to issue the pod certificates.
While scaling up/down the deployment for my Erlang cluster, I regularly refresh the web page that displays cluster members. Occasionally, I get a “502 Bad Gateway” error from Traefik. What’s with that?
In theory, we’ve got TLS working for our Erlang cluster, with mutual authentication. How do we prove that?
As mentioned earlier, using TLS for Erlang distribution breaks erlclu remote_console (because it breaks erl_call). At the time, I worked around the problem by using nodetool. This post shows how to use Erlang’s SSH daemon instead.
When you’re investigating a problem with a deployed application, it’s useful to know precisely which version you’re looking at. Here’s how to automatically set the version number in an Erlang project.
In the previous two posts, we generated signing requests with OpenSSL and submitted them to cert-manager. In this post, we’ll actually use the generated certificates for mutual TLS.
As explained here, I’m going to use an init container to issue the pod certificates.
In the previous post, we got clustering working without TLS. Lifting from the investigation that I wrote up here, I’ll add TLS distribution to my Erlang cluster, but only with server certificates and with no verification (for now).
Based on my investigation with libcluster in Elixir, I’ve decided to use DNS-based discovery for finding the other Erlang nodes in the cluster. To do this, we’ll need a headless service.
I’ve covered this previously; see “Erlang/Elixir Cookies and Kubernetes”. Here’s the quick version.
I noticed that whenever I made any change to the application, it caused the docker/podman build to re-fetch and recompile all of the dependencies. On the tiny laptop I was using at the time, this was taking several extra minutes for every build.
Because this is going to be a cluster of Erlang nodes, there’s (obviously) going to be more than one instance. It makes sense to add a “whoami” page, so that we can clearly see which node we’re talking to.
For simplicity’s sake, I created a new application with rebar3 new app name=erlclu. I very soon regretted this decision, because I actually needed a release, so I ran rebar3 new release name=whoops and manually merged the relevant pieces together.
A few weeks ago, I decided to write a blog post about using mutual TLS to secure Erlang distribution (clustering), with auto-provisioning of certificates when running in Kubernetes. It took a little longer to write up than I expected, and turned into a series of blog posts.
Some notes about using rebar3 with an umbrella project.
In the previous post, I recapped Erlang distribution (clustering). In this post, we’ll secure it by using TLS.
I want to write a post about using mutual TLS to secure Erlang distribution (clustering), with auto-provisioning of certificates when running in Kubernetes. This is not that post. This is a recap of basic Erlang clustering, to refresh my memory and lay some groundwork.
Erlang/OTP provides a built-in SSH client and daemon. You can use this to expose the console directly over SSH.
You’ve got an Erlang module with a private (not-exported) function, and you want to add some unit tests for that function? How should you do that?
The ErlangLS extension for VS Code includes formatting, using rebar3_format. I’d prefer to use erlfmt, so here’s how I set it up.
Distributed Erlang and Elixir applications use a shared secret called a “cookie”. It’s just a string of alphanumeric characters. All nodes in the cluster must use the same cookie. Let’s take a look at what that means in a Kubernetes context.
To get the version of the compiler used to compile a particular .beam file:
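One way to do this is to read the compile_info chunk with beam_lib. This is a minimal sketch and not necessarily the approach the post itself takes; the module name beam_version is made up for illustration.

```erlang
-module(beam_version).
-export([compiler_version/1]).

%% Beam is the path to a .beam file (or the file's contents as a binary).
%% The compile_info chunk is a property list that includes the version
%% of the compiler application that produced the file.
compiler_version(Beam) ->
    {ok, {_Module, [{compile_info, Info}]}} =
        beam_lib:chunks(Beam, [compile_info]),
    proplists:get_value(version, Info).
```

For example, beam_version:compiler_version(code:which(lists)) returns the compiler version string for the stdlib lists module on the running installation.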
Erlang/OTP 24.0 added support for ed25519 curves. Here are some example snippets:
I recently had to report a bug against ErlangLS (the Erlang Language Server). Here’s how I discovered the version number:
Per Wikipedia:
Ordinarily when writing an SSL/TLS server or client using Erlang/OTP, you’ll use the certfile and keyfile options, as follows:
I’ve just spent about a day poking around in the guts of Erlang/OTP and Ranch, and I thought I’d write some of it down.
At Electric Imp (now part of Twilio), my team uses Erlang’s Common Test for driving our system tests. These are (almost-)end-to-end tests that exercise (almost) the whole platform.
I’ve got a bunch of Erlang nodes running in Docker containers, and I’d like to connect a remote shell, running on the host, to one of them.
Quick reference for installing Erlang and Elixir on a Raspberry Pi, using the Erlang Solutions packages.
When an error occurs in the Electric Imp backend, it’s logged and collated, and we regularly review error reports to see if there’s anything that needs looking at.
Use this helper function:
At Electric Imp, on developer PCs, we manage our Erlang versions with kerl.
To integrate direnv with kerl, add the following to ~/.direnvrc:
If you want to use kerl to build your Erlang installation, you’re going to need some packages installed first.
If kerl list installations is displaying Erlang installations that you deleted ages ago, and you’ve got all of your installations in ~/.kerl/erlangs/, you can rebuild the list by running the following command:
When you use gen_server:start/3,4 or gen_server:start_link/3,4, the call blocks until the other process has finished running init/1.
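The blocking behaviour is easy to see by timing the call. The following is a minimal sketch; the module name slow_init and the 500 ms delay are invented for illustration.

```erlang
-module(slow_init).
-behaviour(gen_server).

-export([demo/0]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Returns how long start_link/3 blocked the caller, in milliseconds.
demo() ->
    T0 = erlang:monotonic_time(millisecond),
    {ok, Pid} = gen_server:start_link(?MODULE, [], []),
    Elapsed = erlang:monotonic_time(millisecond) - T0,
    ok = gen_server:stop(Pid),
    Elapsed.

%% A deliberately slow init/1; start_link/3 does not return until this does.
init([]) ->
    timer:sleep(500),
    {ok, #{}}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Calling slow_init:demo() should report at least 500, because the caller waits for init/1 to finish before start_link/3 returns.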
I wanted to get Erlang running on my Synology NAS, for various reasons, and was struggling with the cross compiler.
I’m currently playing with OpenID Connect (OAuth 2.0 for Login), to allow people to log into a web site using their Google account. The web site is built using Erlang.
We use lager for our logging at Electric Imp. This morning I had cause to tweak the configuration at runtime on one of our staging boxes, but first I needed to figure out which handlers were already installed.
In Erlang, in a gen_server, when does terminate get called? Also, some messing around with dbg for tracing.
I find Erlang’s gen_event behaviour to be fairly tricky to understand, despite the copious documentation on the subject:
It turns out that you can’t load Erlang NIF libraries from the shell.
kerl allows you to easily build and install multiple Erlang/OTP releases. It’s kinda like nvm or rvm, but for Erlang. It doesn’t do everything, and that’s what direnv is for. direnv allows you to run commands upon entering or leaving a particular directory.
While using erlang.mk and relx to build a newly-created Erlang application, I got the following error:
Originally posted to the Electric Imp blog. Preserved here.
You’re using lager for logging in your Erlang program, and you discover that your configuration isn’t logging anything more verbose than info-level messages. How do you bump that up to debug?
If you’re connected to an Erlang node via a remote shell, and you don’t have access to stdout on the original node, you’ll need to redirect the trace output to your current shell.
It’s pretty simple to create a ZIP file on disk.
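As a rough illustration, the zip module in stdlib can write an archive to disk from in-memory data. This is a minimal sketch; the module name zip_demo and the entry name hello.txt are made up for the example.

```erlang
-module(zip_demo).
-export([write/1]).

%% Creates a ZIP archive at Path containing a single entry whose
%% contents come from a binary, then lists the archive's contents.
write(Path) ->
    Files = [{"hello.txt", <<"Hello, world!\n">>}],
    {ok, _Name} = zip:create(Path, Files),
    zip:list_dir(Path).
```

For example, zip_demo:write("/tmp/zip_demo.zip") writes the archive and returns {ok, Entries}, where Entries describes the archive comment and the single file.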
Some odds and ends.
handle_event and handle_sync_event
Just some random thing I learnt today while playing with Erlang: