Defending against DDoS attacks hammering my git server
It's no sekret that I have a git server. I host all sorts of things on there - from projects I've talked about on this blog to many others, and still others that are private repositories I can't share (yet) for one reason or another.
While I can't remember exactly when I first set it up, I do remember that Gitea wasn't even a thing back then, and I originally set up Go Git Service.
Nowadays, I run Forgejo, a fork of Gitea - which is itself a fork of Go Git Service.
Either way, it's been around for a while!
Unfortunately, now that smaller git servers are becoming more common (we still need a social/federated git standard, something like ActivityPub), so are attacks against them, and one I dealt with yesterday was particularly nasty, so I decided to write a blog post about it.
I'll have CPU for breakfast, lunch, and tea, thank you
Before I explain how I dealt with it ('mitigated' is the technical term, I understand), it's important to know the anatomy of the attack. After all, security is important, but we can only be secure if we know what we're defending against.
The threat model, if you will.
In this case, the attacker sent random requests to random files on random commits in a large git repository I have on my aforementioned git server.
Yesterday, I measured almost 1 million unique IP addresses, each making exactly 2 requests at a time.
If I had the energy, I'd plot 'em all on a Hilbert curve with a colour gradient for age, maybe even with an animation.
The result of all this was 100% CPU usage on the 3rd-generation dedicated server I rent and a sluggish terminal experience, because to serve each request Forgejo has to spawn a git subprocess to inspect the repository and extract the requested version of the file.
That's a very expensive way to handle an HTTP/S request!
At first, I thought the server had been compromised, but further inspection of the logs revealed that not to be the case.
With all this in mind, the goal of my expedition was to stop the spammy HTTP/S requests from hitting the application server (Forgejo).
This is all interesting, because it means that a number of common steps to achieve this won't work:
- We can't just block the IP addresses, because there are too many of them, and most will be compromised IoT (Internet of Terrible security) devices and the like in people's homes that have been roped into a botnet.
- We can't keep the git server turned off, because I need to use it
- I can't block access to the problematic paths on the server, because then the attacker will just switch to another set, and access to the git server would still be impaired
- I can't just allow specific IP addresses through either, as I have blog post material hosted on there and you, one of my readers, would be cut off from accessing it (and I sometimes access it from my phone, which doesn't have a fixed IP)
...so that just leaves us stuck right?
Teh solutionses!
Not so. There's still a strategy we haven't tried: a Web Application Firewall. Traditionally, such tools are big and very, very expensive, but the other week I discovered a tool that does the job within an envelope (a couple of megabytes) and at a price point (free!) I can afford.
That tool is Anubis, and despite the.... interesting name, it acts as something of a firewall that sits in front of the application server, but behind your reverse proxy:
Public Internet ║ Inside server
                ║
                ╟───────────────┐   ┌───────────────┐   ┌───────────────┐
                ║ Caddy         │   │ Anubis        │   │ Forgejo       │
Inbound ────────▶ •             ├───▶ •             ├───▶ •             │
requests  80/tcp║ Reverse proxy │   │ Firewall      │   │ App server    │
         443/tcp╟───────────────┘   └───────────────┘   └───────────────┘
                ║                     localhost           localhost
                ║                     2999/tcp            3000/tcp
Essentially, as each request comes in, Anubis weighs its risk. 'High-risk' requests, such as those coming from browsers (which attackers love to impersonate), get served a small challenge that they must solve to gain access to the website. Low-risk clients, such as git, curl, or elinks, can go straight through.
The challenge takes the form of a hashing problem: the browser must tell the server a nonce (number used only once) that, when hashed alongside a given unique challenge string, produces a hash with a certain number of leading zeroes.
Correctly completing the challenge (which doesn't take very long) sets a cookie that lets that client access the website without completing another challenge for a certain period of time.
I could go on, but the official documentation explains it pretty well.
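To make that concrete, here's a minimal sketch in Go of the kind of proof-of-work involved. It's an illustration of the general idea rather than Anubis' exact algorithm - the real challenge string, difficulty, and hashing details come from the server, and the real work happens in the visitor's browser:

// A minimal sketch of the sort of proof-of-work challenge described above.
// Illustrative only: the challenge string and difficulty below are made up,
// and Anubis' actual parameters are chosen by the server.
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

func main() {
	challenge := "example-challenge-from-server" // hypothetical value; the server picks this
	difficulty := 4                              // require this many leading zeroes in the hex digest

	for nonce := 0; ; nonce++ {
		sum := sha256.Sum256([]byte(challenge + strconv.Itoa(nonce)))
		digest := hex.EncodeToString(sum[:])
		if strings.HasPrefix(digest, strings.Repeat("0", difficulty)) {
			// The client sends this nonce back; the server verifies it with a single hash.
			fmt.Printf("solved: nonce=%d digest=%s\n", nonce, digest)
			break
		}
	}
}

The asymmetry is the point: the client has to grind through nonces, but the server only needs a single hash to check the answer.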
Essentially, by serving challenges to high-risk clients instead of letting their requests go straight through to expensive HTTP/S endpoints (such as loading a random file from a random commit in a random git repo), a server's resources can be protected, giving a better experience to the people who use it day-to-day.
This isn't without its flaws - namely inadvertently blocking good bots - but it strikes enough of a balance that I can keep my git server online without giving up the entirety of my server's resources in the process, resources I need for other things.
But how?!
I'll assume you already have some sort of reverse proxy in front of some sort of application server. In my case, that's Caddy and Forgejo.
Anubis' latest release can be downloaded from here, but for Debian/Ubuntu users who want an apt repository I'm rehosting the .deb files from Anubis' releases page in my personal apt repository:
https://apt.starbeamrainbowlabs.com/
Assuming you have e.g. an Ubuntu server, you'll want to install Anubis and then navigate to /etc/anubis, where you should create a configuration file named after the user account you'll be starting Anubis under.
Each instance of Anubis can only handle 1 domain/app at a time, so you'll want 1 system user account per application you want to protect.
For example, I have a config file at /etc/anubis/anubis-git.env with the following content:
TARGET=http://[::1]:3000
BIND=:2999
METRICS_BIND=:2998
....my internal git server is listening for HTTP requests on port 3000 on the IPv6 localhost address ::1, so that's the target that Anubis should forward requests to, as in the ASCII diagram above (made in monosketch).
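If you're setting this up from scratch, the whole thing boils down to something like the following (a sketch of my setup - swap the anubis-git username and the ports for whatever your own application needs):

# create a dedicated system user for this instance of anubis
sudo useradd --system anubis-git

# create the per-instance configuration file, named after that user
sudo mkdir -p /etc/anubis
sudo tee /etc/anubis/anubis-git.env >/dev/null <<'EOF'
TARGET=http://[::1]:3000
BIND=:2999
METRICS_BIND=:2998
EOF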
Then, start the new anubis instance like so:
sudo systemctl enable --now [email protected]
....in my case, the username I created (sudo useradd --system anubis-git, etc.) was anubis-git, so that's what goes in the filename above and after the @ sign when we start the service.
If you haven't seen this syntax in systemd service names before, the part after the @ is an instance name that gets passed into a template service file, which here is used to set the user account the service starts under. Syncthing does the same thing with the default systemd service definition it provides.
In other words, it lets you start multiple instances of the same service without them clashing with each other.
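For reference, a template unit of this kind looks roughly like the following. This is a simplified sketch to show how the instance name gets used rather than the exact unit file the Anubis package ships with, so check the unit your package actually installed:

# [email protected] (sketch)
[Unit]
Description=Anubis web application firewall for %i
After=network.target

[Service]
# %i is replaced with whatever comes after the @ when the service is started
User=%i
# ....which means this picks up e.g. /etc/anubis/anubis-git.env
EnvironmentFile=/etc/anubis/%i.env
# assumed install location - adjust to wherever your package puts the anubis binary
ExecStart=/usr/bin/anubis
Restart=on-failure

[Install]
WantedBy=multi-user.target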
At any rate, the final piece of the puzzle is telling your reverse proxy to talk to anubis:
git.starbeamrainbowlabs.com {
    log
    reverse_proxy http://[::1]:2999 {
        # ref the Anubis config setup docs: both of these headers are required
        header_up X-Http-Version {http.request.proto}
        # ref the Anubis config docs: this one is especially required, as it carries the real client IP
        header_up X-Real-Ip {remote_host}
    }
}
In other words, point reverse_proxy at Anubis (http://[::1]:2999 here) instead of directly at your application server, then check the config and reload:
sudo caddy validate -c /etc/caddy/Caddyfile && sudo systemctl reload caddy
(replacing /etc/caddy/Caddyfile with the path to your Caddyfile of course)
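Once Caddy has reloaded, it's worth a quick sanity check that each hop in the chain is answering. Nothing fancy - the port here is the BIND value from my env file above, so adjust to match yours:

# is the anubis instance running and listening?
sudo systemctl status [email protected]
curl -s -o /dev/null -w '%{http_code}\n' http://localhost:2999/

# ....and the public hostname, all the way through Caddy → Anubis → Forgejo
curl -s -o /dev/null -w '%{http_code}\n' https://git.starbeamrainbowlabs.com/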
Conclusion
....and you're done!
We've successfully put an application server behind anubis to protect it from malicious requests.
Over time, I assume I will need to tweak the anubis settings, which is possible through what seems to be a rather detailed policy file system (which allows RSS/Atom files through by default, if you're crazy enough to be subbed to any feeds from my git server).
If something seems broken to you now that I've set this up, please do get in touch and I'll try my best to help you out.
I'll be continuing to keep an eye on my web server traffic to see if anything gets through that shouldn't, and adjusting my response as necessary.
Thanks for sticking with me. When I have the energy, I have lots of other cool things I'd like to talk about here soon.
--Starbeamrainbowlabs
Aside: IP blocking with Caddy
While implementing the above approach, I found I did need to bring my git server back up so that my Continuous Integration system could keep working (I built it well before Forgejo got its own CI workers, and I haven't checked those out yet).
To do this, I temporarily implemented an IP address-based allowlist.
If you're curious, here's the code for that:
# temp solution to block anyone who isn't in the allowlist outright
# note that given the sheer range of IPs from what's probably a compromised device-based botnet, we can't just IP block this long-term.
@denied not client_ip 1.2.3.4 5.6.7.8/24 127.0.0.1/8 ::1/128
abort @denied
....throw this in one of the server blocks in your Caddyfile before a reverse_proxy directive - changing the allowed IP addresses of course (but leave the IPv4 & IPv6 localhost entries!) - validate & reload, and you should have an instant IP address allowlist in place!
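In context, the server block ends up looking something like this (a sketch using the placeholder IPs from above; mine still points at Anubis rather than at Forgejo directly):

git.starbeamrainbowlabs.com {
    log

    # temporary allowlist: anyone not matching these IPs is dropped outright
    @denied not client_ip 1.2.3.4 5.6.7.8/24 127.0.0.1/8 ::1/128
    abort @denied

    reverse_proxy http://[::1]:2999 {
        header_up X-Http-Version {http.request.proto}
        header_up X-Real-Ip {remote_host}
    }
}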