Elixir and Kubernetes: A love story — Part 1: Setting up an Elixir Cluster

In the last few years, Kubernetes has been heavily adopted in many companies as the deployment and application orchestration solution.

This adoption gave rise to many Cloud-Native solutions that rely on Kubernetes to automate diverse workflows, to name a few:

Unfortunately, most of those solutions, often implemented in Go, do not support running more than one instance (“replicas” of a Deployment, in Kubernetes terms), which creates single points of failure (SPOFs) in your infrastructure.

This is because “leader election” or “distributed consensus” are tricky to implement.

But in the Erlang/Elixir ecosystem, those problems look like they are already solved. Yet, most tutorials about Elixir only talk about building a webapp with the Phoenix framework.

This article is the first of a series of articles that will cover:

  • how to build a Kubernetes Operator in Elixir

Hopefully, by the end of this series, you’ll have a better idea of what Elixir can do for you 🙂

📃 Table of Contents:

  • Part 1: Setting up an Elixir Cluster

Code examples can be found on GitHub at: linkdd/elixir-k8s-love-story

In this first article, we’ll see how to set up an Erlang/Elixir cluster on Kubernetes.

But first, let’s recap things a bit…

What is Elixir? 🤔

Elixir 1.0 came out in 2014. At the time of writing, Elixir 1.12 is the latest version.

Elixir is a functional language that compiles to BEAM bytecode and runs on the BEAM (the Erlang virtual machine).

Erlang is also a functional language; it was created at Ericsson in the 1980s.

Their key features are:

  • fault tolerance: you get resilience at the application level

Elixir adds to those:

  • the pipeline operator: a simple and effective way to chain computations
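To make the pipeline operator concrete, here is a small sketch. The module and function names are made up for illustration. `|>` passes the value on its left as the first argument of the call on its right:

```elixir
defmodule PipelineExample do
  # Hypothetical helper, for illustration only:
  # doubles every number, keeps the results greater than 4, then sums them.
  def double_filter_sum(numbers) do
    numbers
    |> Enum.map(&(&1 * 2))
    |> Enum.filter(&(&1 > 4))
    |> Enum.sum()
  end

  # The same computation without the pipeline, written inside-out.
  def nested(numbers) do
    Enum.sum(Enum.filter(Enum.map(numbers, &(&1 * 2)), &(&1 > 4)))
  end
end
```

Both functions return the same result; the piped version simply reads in the order the steps are executed.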

Erlang’s learning curve is steeper than Elixir’s, BUT I’d still recommend starting with the amazing book “Learn You Some Erlang for Great Good!”. It will give you a solid introduction to Erlang, OTP and Mnesia (a distributed database, native to Erlang, with ACID transactions).

NB: I like to use Mnesia without disk persistence to build a distributed cache for my distributed applications. If I need persistence, I still rely on PostgreSQL or another DBMS (thus keeping my application “stateless”).

What is Kubernetes? 🤔

Kubernetes was first released by Google in 2014. It is a container orchestrator, meaning its main objective is to manage container-based workloads.

This includes:

  • scheduling and running containers across the nodes of your cluster

It relies on the following standards:

  • OCI (Open Container Initiative): to package and distribute container images

With Kubernetes, you declare your desired state, and a control loop continuously moves the observed state towards the desired state.

The desired state is defined with a bunch of resources (for example: a Pod, a ConfigMap, a Service, …), and operators will watch those resources to decide what to do next. Example:

  • the user defines a Deployment resource with 3 replicas

If a container crashes, the pod operator will notice it (the observed state has diverged from the desired state), and restart it.

This is called the “control loop”, giving you resilience at the infrastructure level.
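The idea of comparing desired and observed state can be sketched in a few lines of Elixir. This is a deliberately simplified model, not how Kubernetes is implemented: the module name, the action atoms and the use of plain pod names are all assumptions made for illustration.

```elixir
defmodule ReconcileSketch do
  # Hypothetical, simplified model of one pass of a control loop.
  # `desired` is the wanted replica count, `observed` the list of running pod names.
  # Returns the actions needed to converge; an empty list means "nothing to do".
  def reconcile(desired, observed) when is_integer(desired) and desired >= 0 do
    running = length(observed)

    cond do
      running < desired ->
        # Too few pods: request the missing ones.
        List.duplicate(:start_pod, desired - running)

      running > desired ->
        # Too many pods: stop the surplus.
        observed
        |> Enum.take(running - desired)
        |> Enum.map(&{:stop_pod, &1})

      true ->
        []
    end
  end
end
```

A real controller would run this kind of comparison every time a watched resource changes, and apply the resulting actions through the Kubernetes API.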

To summarize previous sections:

  • Elixir gives you resilience at the application level.

If you want to build reliable and fault-tolerant software, you might need both.

To understand how both can work together, let’s explore how to build an Erlang/Elixir cluster.

Erlang Clustering 🌐

Figure 1: 3-node Erlang cluster

In Erlang/Elixir, a node is identified by a basename and a hostname (or an IP address, or a Fully Qualified Domain Name).

In the Figure 1 diagram, we have 3 nodes:

  • Node 1: basename = foo, hostname =

To connect the nodes together, they need to share the same Erlang Cookie, a secret value used to authenticate nodes upon join.

This cookie is set via the --cookie argument when starting your node.

Then, you would run from Node 1 the following code:
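A minimal sketch of what that connection step can look like. The node names below are placeholders (the real ones come from your own basenames and hostnames), and the `ClusterJoin` module is a made-up helper; `Node.connect/1` and `Node.list/0` are part of Elixir's standard library.

```elixir
defmodule ClusterJoin do
  # Tries to connect the local node to each given node.
  # Node.connect/1 returns true on success, false on failure,
  # and :ignored when the local node was not started in distributed mode.
  def connect_all(nodes) do
    Map.new(nodes, fn node -> {node, Node.connect(node)} end)
  end
end

# Placeholder node names, for illustration only.
peers = [:"bar@127.0.0.1", :"baz@127.0.0.1"]

results = ClusterJoin.connect_all(peers)
IO.inspect(results, label: "connection results")
IO.inspect(Node.list(), label: "connected nodes")
```

Connections are transitive by default: once Node 1 is connected to the two others, those two will also connect to each other.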


Let’s create a sample application with:

$ mix new my_app --sup
$ cd my_app

NB: The --sup option will create a supervision tree (an OTP application and a supervisor).

Running mix release will create a portable release of your application, containing your compiled code, the full runtime and a script to start the Erlang node with your application.

NB: If you configured multiple releases within your mix.exs file, be sure to pass the correct name to the command with: mix release ${RELEASE_NAME}.

This release will be located in _build/${MIX_ENV}/rel/my_app and can be copied as-is to the hosts you want to deploy to.

The startup script (found at bin/my_app) uses a few environment variables to configure the Erlang node:

  • RELEASE_DISTRIBUTION controls the kind of node name that is expected (short hostname, or FQDN)

Deploy your Erlang/Elixir release 🚀

Using the following Dockerfile, you’ll be able to create a container packaging your application:
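As a rough sketch of such a Dockerfile, assuming the release is named my_app and is built with MIX_ENV=prod. The image tag and paths are assumptions to adapt to your project:

```dockerfile
# Build stage: compile the application and assemble the release.
FROM elixir:1.12-alpine AS build

ENV MIX_ENV=prod
WORKDIR /src

# git is only needed if some dependencies are fetched from Git repositories.
RUN apk add --no-cache git

# Install Hex and Rebar, which Mix needs to fetch and build dependencies.
RUN mix local.hex --force && mix local.rebar --force

# Copy the project (use a .dockerignore to leave out _build/ and deps/).
COPY . .
RUN mix deps.get --only prod && mix release

# Runtime stage: only the release is copied over.
# The same base image is reused here so that the system libraries match the
# ones the bundled Erlang runtime was built against; a slimmer image also
# works as long as it stays compatible with the build stage.
FROM elixir:1.12-alpine
WORKDIR /app
COPY --from=build /src/_build/prod/rel/my_app ./

CMD ["/app/bin/my_app", "start"]
```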

This container can then be deployed to Kubernetes with the following Deployment resource:
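A sketch of what such a Deployment can look like. The resource names, labels, image and Secret name are placeholders; the RELEASE_* variables are the ones read by the release startup script.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: registry.example.com/my-app:latest  # placeholder image
          env:
            # Expose the Pod IP to the container through the Downward API.
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            # Use long node names (basename@ip or basename@fqdn).
            - name: RELEASE_DISTRIBUTION
              value: name
            - name: RELEASE_NODE
              value: my_app@$(POD_IP)
            # The cookie is read from a Secret (placeholder name and key).
            - name: RELEASE_COOKIE
              valueFrom:
                secretKeyRef:
                  name: my-app-erlang-cookie
                  key: cookie
          ports:
            - name: epmd
              containerPort: 4369
```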

Here, we set the current node’s IP to the address of the Pod it’s running on. For security reasons, we are also setting the Erlang cookie from a Kubernetes Secret.

Finally, we expose the EPMD (Erlang Port Mapper Daemon) port. EPMD is how Erlang/Elixir nodes look up the distribution port of other nodes before connecting to them. We’ll use this later.

There is a step missing though. We need to know the IP addresses of the other nodes before we can connect them together.

Then, what if a Pod crashes? What if there is a Deployment rollout starting new Pods and shutting down old ones?

Fortunately, if there is a problem, there is a solution!

Automatic Cluster Formation and Healing 🤖

libcluster is a library that provides a mechanism for automatically forming clusters of Erlang nodes, with either static or dynamic node membership.

It provides a pluggable “strategy” system, with a variety of strategies provided out of the box. The one we care about is the Kubernetes DNS strategy.

Using this strategy, libcluster will perform a DNS query against a headless Kubernetes Service, getting the IP address of all Pods running our Erlang cluster:
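A headless Service is a Service with `clusterIP: None`: instead of a single virtual IP, its DNS name resolves to the IP of every matching Pod. A minimal sketch, where the Service name and the `app: my-app` label are placeholders that must match your own Pods:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-headless
spec:
  # "None" makes the Service headless: DNS returns one record per Pod.
  clusterIP: None
  selector:
    app: my-app
  ports:
    - name: epmd
      port: 4369
```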

To start using libcluster, add {:libcluster, "~> 3.2"} to your dependencies, then in your application module (lib/my_app/application.ex), add the following:
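A sketch of what the application module can look like with libcluster's Kubernetes DNS strategy. The topology key, the Service name and the supervisor names are placeholders; `application_name` is assumed to be the basename of your nodes (as in my_app@<pod ip>).

```elixir
defmodule MyApp.Application do
  use Application

  # Topology handed to libcluster. The names below are placeholders:
  # `service` must be the headless Service, and `application_name`
  # the basename of your nodes.
  def topologies do
    [
      my_app: [
        strategy: Cluster.Strategy.Kubernetes.DNS,
        config: [
          service: "my-app-headless",
          application_name: "my_app",
          polling_interval: 10_000
        ]
      ]
    ]
  end

  @impl true
  def start(_type, _args) do
    children = [
      # Starts libcluster, which polls DNS and connects or disconnects nodes.
      {Cluster.Supervisor, [topologies(), [name: MyApp.ClusterSupervisor]]}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```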

That’s it!

Although, you might want to follow the 12 Factor App design principles, and make this configurable:

  • what if I want to run the application on my computer (single node)?
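For illustration, the single-node case could be handled by hand roughly like this. It is only a sketch: the module name and the CLUSTER_SERVICE_NAME and CLUSTER_APP_NAME variables are made up, and the topology is built only when a Service name is provided.

```elixir
defmodule MyApp.ClusterConfig do
  # Hypothetical helper: builds the libcluster topologies from environment values.
  # CLUSTER_SERVICE_NAME and CLUSTER_APP_NAME are made-up variable names.
  # Without a service name, no topology is returned (single-node mode).
  def topologies(env \\ System.get_env()) do
    case Map.get(env, "CLUSTER_SERVICE_NAME") do
      service when service in [nil, ""] ->
        []

      service ->
        [
          k8s: [
            strategy: Cluster.Strategy.Kubernetes.DNS,
            config: [
              service: service,
              application_name: Map.get(env, "CLUSTER_APP_NAME", "my_app")
            ]
          ]
        ]
    end
  end
end
```

The environment is passed as a map so the function can be exercised without touching the real environment, for example `MyApp.ClusterConfig.topologies(%{"CLUSTER_SERVICE_NAME" => "my-app-headless"})`.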

This is where Datapio comes into the picture…

Exploit Kubernetes with Datapio 🔨

Datapio aims to provide a complete platform to build Cloud-Native systems on Kubernetes. It comes with 3 packages:

  • Datapio OpenCore: an Open-Source CI/CD platform based on Tekton

The OpenCore package is distributed as an umbrella project containing the following sub-projects:

  • datapio_cluster: integration of libcluster as an OTP application

We will take a closer look at each of those sub-projects in this series. But today, let’s focus on datapio_cluster.

First, add the following to your dependencies (and remove libcluster):

{:datapio_cluster,
  github: "datapio/opencore",
  ref: "main",
  sparse: "apps/datapio_cluster"}

NB: Make sure git is installed when running mix deps.get; it is required to clone a Git repository from GitHub (or another VCS provider).

Then, add :datapio_cluster to the extra_applications field. Your mix.exs should look like this:
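A sketch of such a mix.exs, assuming the project generated earlier with mix new my_app --sup. The version numbers are placeholders:

```elixir
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      elixir: "~> 1.12",
      start_permanent: Mix.env() == :prod,
      deps: deps()
    ]
  end

  def application do
    [
      # :datapio_cluster is listed here so it is started before MyApp.
      extra_applications: [:logger, :datapio_cluster],
      mod: {MyApp.Application, []}
    ]
  end

  defp deps do
    [
      {:datapio_cluster,
       github: "datapio/opencore", ref: "main", sparse: "apps/datapio_cluster"}
    ]
  end
end
```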

This will ensure that the datapio_cluster OTP application is started before yours.

This application will read the following environment variables:

  • DATAPIO_SERVICE_NAME: the name of the headless Kubernetes Service (in the same namespace as the Deployment/Pod), if not set, no automatic clustering will be done

Those variables will be used to configure libcluster.

Therefore, the modifications done in the previous section to your application module (lib/my_app/application.ex) can be reverted.

In addition, this application will take care of configuring the Mnesia application to enable replication across nodes, and provide a mechanism to create your RAM-only sets at startup.

To customize this, you can add to your config/config.exs (compile time configuration) the following:

Nothing else is required!

Wrapping up 📦

In this article, we saw how easy it is to set up an Erlang cluster on Kubernetes, allowing us to get resilience at the application level AND the infrastructure level.

NB: To clarify one point, Erlang/Elixir gives you the tools to have resilience at the application level, but this is not automatic, you have to use them 😉

In the next parts of this series, we will continue this Kubernetes journey by writing a small Kubernetes Operator, discovering how Datapio and Kubirds can help us set up and monitor Elixir systems.

Stay tuned for Part 2!


