A while ago, I wrote about my self-built kubernetes cluster. One interesting detail of it was the lack of a CNI plugin: I had configured the routes for pod-to-pod networking statically, via ansible. This worked great. I just had to configure the IP ranges (v4 and v6) in the docker daemon and set up routes in interface-up scripts, which I had anyway. In the beginning, I had a tinc tunnel between the nodes, which I later migrated to wireguard and even later changed to vxlan on top of wireguard (to allow dynamic routing with BGP, which I use for example for metallb). But docker support is deprecated in kubernetes, so I had to change the container runtime. Both real alternatives, containerd and cri-o, depend on having a CNI network config, so switching to CNI was required.

I started by modifying my ansible playbook to push the generated Pod network ranges for each node into the Node definitions (.spec.PodCIDRs), to retrieve the default CNI plugins, and to place them at /opt/cni/bin. After that, I applied the flannel manifest, which I modified to use host-gw mode (since I already have a vxlan that I use for other purposes too, and which was initially created for k8s, so it’s not a misuse of the existing one).

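To illustrate the data this is all built on: the following is a minimal client-go sketch (not part of my playbook, purely for illustration) that lists each Node’s internal IPs next to its .spec.PodCIDRs - exactly the information that host-gw style routing works from.

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load the local kubeconfig; inside the cluster, rest.InClusterConfig() would be used instead.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	nodes, err := clientset.CoreV1().Nodes().List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	for _, node := range nodes.Items {
		// Collect the node's internal addresses (a dual-stack node has one per family).
		var internalIPs []string
		for _, addr := range node.Status.Addresses {
			if addr.Type == corev1.NodeInternalIP {
				internalIPs = append(internalIPs, addr.Address)
			}
		}
		// .spec.PodCIDRs holds the v4 and v6 pod ranges pushed by the playbook.
		fmt.Printf("%s\t%v\t%v\n", node.Name, internalIPs, node.Spec.PodCIDRs)
	}
}
```
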
After I had flannel running on my cluster and rebooted all nodes gracefully, I migrated from docker to cri-o.

My cluster even felt more stable after that change, probably (in part) due to not having two cgroup management systems running at the same time: docker and kubelet were configured for the cgroupfs driver, while all nodes run debian and therefore systemd. With the change to cri-o, I also configured kubelet to use systemd as its cgroup driver.

If the simple solutions were perfect, this post could stop now - but, oh well, flannel does not yet support dual-stack networking (support has been added by now, but not yet released), so I didn’t have IPv6 in my containers anymore, resulting in a lot of "Instance down!" alerts from my monitoring, since I have some IPv6-only services to monitor. Also, I have my own /48 IPv6 PI allocation (PI: provider independent, routable via any upstream provider I have a connection to), from which I want to use a portion as LoadBalancer IPs - so not having IPv6 just isn’t an option at all anymore (did you see the prices for IPv4? I wanted to pay some money for some, but some money just isn’t enough O_O).

Being the most calm and patient person in existence (*ahem*), I decided to build my own small program for that. Grabbing each Node’s IPs and adding routes to the other Nodes’ pod networks seemed like a simple enough thing to do. And I succeeded - I built this in (mostly) two days, which weren’t even used fully for that (after all, I’m still in hospital and therapy right now), and deployed it - replacing flannel in host-gw mode completely.

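The core idea fits in very few lines. Here is a rough sketch of it (this is not kube-hostgw’s actual code; the peer type and the hard-coded example values are made up for illustration): for every other node, install one direct route per pod CIDR via that node’s address - essentially what flannel’s host-gw backend does as well.

```go
package main

import (
	"fmt"
	"os/exec"
)

// peer describes another node in the cluster: a reachable address
// and the pod CIDRs assigned to it (.spec.PodCIDRs).
type peer struct {
	nodeIP   string
	podCIDRs []string
}

// syncRoutes installs one direct route per remote pod CIDR via the owning
// node, using `ip route replace` so that re-runs are idempotent.
// Note: an IPv6 pod CIDR needs the node's IPv6 address as gateway.
func syncRoutes(peers []peer) error {
	for _, p := range peers {
		for _, cidr := range p.podCIDRs {
			// e.g. `ip route replace 10.244.2.0/24 via 192.0.2.12`
			cmd := exec.Command("ip", "route", "replace", cidr, "via", p.nodeIP)
			if out, err := cmd.CombinedOutput(); err != nil {
				return fmt.Errorf("route %s via %s: %v (%s)", cidr, p.nodeIP, err, out)
			}
		}
	}
	return nil
}

func main() {
	// Example data; in reality this comes from the Node objects
	// (addresses from .status.addresses, ranges from .spec.PodCIDRs).
	peers := []peer{
		{nodeIP: "192.0.2.12", podCIDRs: []string{"10.244.2.0/24"}},
	}
	if err := syncRoutes(peers); err != nil {
		panic(err)
	}
}
```
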
It’s important to note that my kube-hostgw isn’t a CNI plugin itself, it only generates a CNI network config based on the default plugins, namely bridge, portmap and host-local. The bridge plugin combined with host-local for IP address allocation does what docker does by default: create a Linux bridge interface, give it the first IP of the given host-local range, and create a veth pair for every container, with the host side being added to the bridge and the container side being configured with another IP from the range given to host-local. portmap is required for Kubernetes services of type NodePort - I don’t (yet) know how that plugin works exactly, or where it gets its information from - but it was listed as a dependency of flannel (and the flannel CNI config used it) and everything works.

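For reference, the kind of per-node config this amounts to could be generated roughly like this (a sketch only: the file name, bridge name, network name and example CIDRs are assumptions of mine, not necessarily what kube-hostgw writes) - a conflist chaining bridge with host-local IPAM, followed by portmap.

```go
package main

import (
	"encoding/json"
	"os"
)

// writeCNIConfig writes a CNI conflist chaining the bridge plugin (with
// host-local IPAM over the node's pod CIDRs) and the portmap plugin.
func writeCNIConfig(podCIDRs []string) error {
	// host-local hands out addresses from each of the node's pod CIDRs
	// (one v4 and one v6 range on a dual-stack node).
	var ranges [][]map[string]string
	for _, cidr := range podCIDRs {
		ranges = append(ranges, []map[string]string{{"subnet": cidr}})
	}

	conf := map[string]interface{}{
		"cniVersion": "0.4.0",
		"name":       "hostgw",
		"plugins": []interface{}{
			map[string]interface{}{
				"type":   "bridge",
				"bridge": "cni0", // the Linux bridge the host ends of the veth pairs are attached to
				// The bridge gets the first IP of each range and pods get their default route via it.
				"isDefaultGateway": true,
				"ipMasq":           true,
				"ipam": map[string]interface{}{
					"type":   "host-local",
					"ranges": ranges,
				},
			},
			map[string]interface{}{
				"type":         "portmap",
				"capabilities": map[string]bool{"portMappings": true},
			},
		},
	}

	data, err := json.MarshalIndent(conf, "", "  ")
	if err != nil {
		return err
	}
	// The file name is only an example; the runtime picks up whatever sorts first in /etc/cni/net.d.
	return os.WriteFile("/etc/cni/net.d/10-hostgw.conflist", data, 0o644)
}

func main() {
	if err := writeCNIConfig([]string{"10.244.1.0/24", "fd00:10:244:1::/64"}); err != nil {
		panic(err)
	}
}
```
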
With kube-hostgw not being a CNI plugin itself, its stability is not very important - it only needs to keep running in a loop so it can react when Nodes get added or removed or new PodCIDRs are allocated. Apart from that, it could just as well be built as a Job instead of a DaemonSet (but sadly there isn’t a JobSet - one Job per Node matching some rules).

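Reacting to those Node changes is the only long-running part. One possible way to do it (not necessarily how kube-hostgw is implemented) is a client-go informer on Nodes that simply triggers a resync on every add, update or delete.

```go
package main

import (
	"log"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	// Running inside the cluster (e.g. as a DaemonSet), so use the in-cluster config.
	config, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatal(err)
	}

	resync := make(chan struct{}, 1)
	trigger := func() {
		// Non-blocking send: collapse bursts of events into a single resync.
		select {
		case resync <- struct{}{}:
		default:
		}
	}

	factory := informers.NewSharedInformerFactory(clientset, 10*time.Minute)
	nodeInformer := factory.Core().V1().Nodes().Informer()
	// UpdateFunc fires on every node heartbeat, so a real implementation would
	// first compare addresses/PodCIDRs before doing any work.
	nodeInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    func(obj interface{}) { trigger() },
		UpdateFunc: func(oldObj, newObj interface{}) { trigger() },
		DeleteFunc: func(obj interface{}) { trigger() },
	})

	stop := make(chan struct{})
	factory.Start(stop)
	factory.WaitForCacheSync(stop)

	for range resync {
		// Regenerate routes (and, on first run, the CNI config) here.
		log.Println("nodes changed, resyncing routes")
	}
}
```
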
Best thing: after I deployed kube-hostgw to my cluster (I still only have this one, so my testing environment is sadly the same as my prod environment ^^’), one of my girlfriends was also building a kubernetes cluster, to move all their stuff into it. They, too, started with flannel (I think it was my recommendation) and migrated to kube-hostgw once it was usable (so.. even before it had a version number, not even a LICENSE or README). She used kube-hostgw in her prod cluster from the start - so it’s running in three clusters already, just doing its job :3 She also wrote about it (mostly her cluster, a bit about kube-hostgw and quite a lot about how friendly the kubernetes community is) - check out her post, too!

If you want to read more about kube-hostgw, take a look at the project in my Gitlab instance :)