What happened? (Status update, August 2023)

I didn’t really consciously decide to stop writing status updates over the last couple of months; it kind of just happened. I’m not sure the format is right for me, and I suppose I just don’t enjoy writing that much in general. I do get some benefit out of it, but so far it’s not quite what I had hoped for. And once the first one slipped, the next one seemed like an even bigger chore. So, to get myself out of this rut for now, I’ll do a quick write-up of the things I still remember from the last three months. Then, maybe, I’ll try a different approach in the future. Let’s see.

sr.ht infrastructure

I don’t think I have mentioned this yet, but we are now running an authoritative name server (ns3.sr.ht) in the EU. This should make DNS lookups much faster for EU users (provided your resolver is smart enough to pick the fastest server, which most of them are). Unfortunately, because the process at our registrar is a bit “manual”, there are no IPv6 glue records at the moment. We hope to get those added soon (though you couldn’t use sr.ht over IPv6 only yet anyway).

I spent a lot of time dealing with Ceph and its various components, but we now have a pretty good setup with RBD (for transient Kubernetes storage), CephFS (for truly shared, multi-writer volumes), and the RADOS Gateway (for S3-like HTTP blob storage). This is all set up manually using the Alpine packages, with only the RADOS Gateway running in Kubernetes. I specifically avoided tools like Rook, because I wanted to make sure I understand every detail of the setup. So far, I think that was the right decision. Our cluster is pretty small, so we don’t swap hardware all that often. At the moment, being in full control of every detail matters much more to us than having automation for all kinds of procedures we will rarely, if ever, perform.
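To make the “S3-like” part concrete: the RADOS Gateway speaks the S3 protocol, so any S3 client library can talk to it. Here is a minimal sketch in Go using minio-go; the endpoint, bucket name, and credentials are made-up placeholders, not our actual setup.

```go
package main

import (
	"context"
	"log"
	"strings"

	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func main() {
	// Hypothetical endpoint and credentials; the RADOS Gateway exposes an
	// S3-compatible HTTP API, so a generic S3 client works against it.
	client, err := minio.New("rgw.example.internal", &minio.Options{
		Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
		Secure: true,
	})
	if err != nil {
		log.Fatal(err)
	}

	// Upload a small blob to a bucket, just like you would against S3.
	body := "hello"
	_, err = client.PutObject(context.Background(), "artifacts", "hello.txt",
		strings.NewReader(body), int64(len(body)),
		minio.PutObjectOptions{ContentType: "text/plain"})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("uploaded hello.txt")
}
```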

I will say: CephFS, especially in the context of Kubernetes, is one heck of a shiny hammer…

As a first target, instead of any of the regular services, we are preparing to bring up a build runner in Kubernetes. It will require a few changes upstream (and some more discussion thereof), but I have successfully run build jobs already, so we are getting close.

I was also frustrated with the complexity that is cert-manager, apparently the go-to solution for managing certificates in Kubernetes. Installing it means applying more than 5,000 lines of YAML and running at least four different containers. Admittedly, it has a ton of features and probably makes sense at a certain scale, but I don’t think it’s the right tool for our handful of certificates.

So, in a typical NIH fit, I wrote notariat. It’s super barebones for now, and I wouldn’t really recommend using it yet. That is because after I wrote it, Simon highly recommended using certmagic for the ACME handling. I tried, but it didn’t work out, because certmagic is pretty focused on a somewhat different use case. We did engage with the maintainer, though, and it may still become a thing, which is why notariat in its current form is in a sort of limbo.
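For context, the happy path certmagic is built around looks roughly like the sketch below: obtain and renew certificates via ACME and terminate TLS in the same Go process. This is just an illustration of that typical usage (the domain and email are placeholders), not how notariat works.

```go
package main

import (
	"fmt"
	"net/http"

	"github.com/caddyserver/certmagic"
)

func main() {
	// Agree to the CA's terms and set a contact address (placeholders).
	certmagic.DefaultACME.Email = "admin@example.com"
	certmagic.DefaultACME.Agreed = true

	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello")
	})

	// Obtains and renews the certificate via ACME and serves HTTPS with it,
	// all from this one process (plus an HTTP->HTTPS redirect listener).
	if err := certmagic.HTTPS([]string{"example.com"}, mux); err != nil {
		panic(err)
	}
}
```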

Hare

I did have a bunch of fun with Hare recently. My previous effort to write a dig-like tool culminated in the creation of the Paw DNS project. The goal is still to not only create usable software, but also improve the DNS support in Hare’s stdlib where needed. That effort, in turn, has apparently been annoying enough for the Hare maintainers that they made me the maintainer of the net::dns subsystem 😅

Having learned a lot more about Hare and its idioms, I also gave hare-tftp a once-over. The basic client is working; a server will hopefully follow soon. And because it seemed like a fun thing to do, I also brought hare-icmp back into shape and added a very basic, but working, ping tool for it.

Vomit

After despairing for a while over the mess that is maildir folder layouts, I decided to put that question off for now and stick to Maildir++. I am still evaluating support for other layouts, but I am not sure that will ever happen.
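For those unfamiliar with it: Maildir++ stores every folder as a single dot-prefixed directory directly under the maildir root, with “.” as the hierarchy separator, and the root itself acting as the inbox. A tiny, hypothetical Go helper (not vsync’s actual code) to illustrate the naming scheme:

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// maildirPPDir maps a folder path like ["Work", "Projects"] to its Maildir++
// directory: one dot-prefixed directory under the maildir root, with "." as
// the hierarchy separator. An empty path means the inbox, i.e. the root.
func maildirPPDir(root string, folder []string) string {
	if len(folder) == 0 {
		return root
	}
	return filepath.Join(root, "."+strings.Join(folder, "."))
}

func main() {
	fmt.Println(maildirPPDir("/home/me/Maildir", nil))                          // /home/me/Maildir
	fmt.Println(maildirPPDir("/home/me/Maildir", []string{"Sent"}))             // /home/me/Maildir/.Sent
	fmt.Println(maildirPPDir("/home/me/Maildir", []string{"Work", "Projects"})) // /home/me/Maildir/.Work.Projects
}
```

Each of those directories then contains the usual cur/new/tmp subdirectories.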

Having decided to postpone that, I finally rewrote the state cache handling in vsync, but haven’t released a new version yet. Caching is hard, and I had prematurely merged what should have been two different caches, unfortunately opening the door to substantial data loss. I untangled the mess, but left the safeguards, like the --force option, in place for now. I already have much more confidence in the cache behavior, even on the unhappy path (e.g. aborted syncs). I’ll do some more testing before preparing a new release, but I expect that to happen relatively soon.

Quo vadis?

So there, finally a new status update. What comes next here, I have not decided; maybe another status update, maybe something else. In the meantime, feel free to reach out, as always, either by sending an email to my public-inbox or by finding me on IRC; I am bitfehler on Libera, e.g. in #sr.ht.watercooler.