# The scary state of IPv6 rate-limiting

IPv6 rate-limiting is scarily half-baked right now. If you run a server that does any kind of IP-based rate-limiting, consider not enabling IPv6 if possible. If you do use IPv6, check how your rate-limiter actually handles it.

## Four billion is a pretty small number

Most IPv4 rate-limiters will block individual addresses as they exceed the limit. That’s mostly okay, because there are only 4 billion IPv4 addresses. That means a) they are given out with some frugality, and b) it doesn’t take much memory to block a large proportion of them. If you and 1000 of your closest friends launch a brute-force or credential-stuffing login attack, any server will have no problem rate-limiting all of you.

But IPv6 is a very different matter.

## A gazillion IPs

When you ask your ISP for an IPv6 assignment, you get at least a /64 block – 2⁶⁴ assignable addresses. RIPE suggests giving a /56 prefix (2⁷² addresses == 256 /64 blocks) to home users and a /48 prefix (2⁸⁰ addresses == 65,536 /64 blocks) to businesses (or “If you want a simple addressing plan use a /48 for each end-user”). RFC 6177 agrees with this guidance, as does APNIC.
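
To make those sizes concrete, here is a small Go sketch of the arithmetic. The helper name `slash64sIn` is made up for illustration; it just counts how many /64 blocks fit in a larger assignment.

```go
package main

import "fmt"

// slash64sIn returns how many /64 blocks fit inside a prefix of the given
// length, i.e. 2^(64-prefixLen). It assumes 1 <= prefixLen <= 64.
func slash64sIn(prefixLen int) uint64 {
	return uint64(1) << uint(64-prefixLen)
}

func main() {
	for _, p := range []int{64, 56, 48} {
		// A /p prefix leaves 128-p bits free, so it holds 2^(128-p) addresses.
		fmt.Printf("/%d: 2^%d addresses, %d /64 blocks\n", p, 128-p, slash64sIn(p))
	}
}
```

For /56 and /48 this gives 256 and 65,536 blocks, matching the figures above.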

Searching for ISPs’ IPv6 prefix delegation policies shows that /64¹ and /56 are pretty common. Internode in Australia assigns /56 blocks to residential and business customers. In the US, Charter Spectrum also gives /56s. Cogent lets users request up to /48.

So, it’s safe to assume that an attacker can obtain at least a /56 and probably a /48. It’s also prudent to assume that a determined attacker can utilize all of the addresses at their disposal. And there is at least one tool that does exactly that – “freebind: IPv4 and IPv6 address rate limiting evasion tool”.

## What’s the right way to rate-limit a gazillion IPs?

This StackOverflow answer outlines the best approach I’ve found:

> The best algorithm is to start blocking separate addresses. Then when multiple addresses are blocked in the same /64 you block the whole /64. Repeat that for bigger aggregates.
>
> Prefixes are usually given out on nibble boundaries (multiples of 4, or one hexadecimal digit). So you might want to scale from /64 to /60, /56, /52, and /48. A /48 is usually the largest prefix given to a single site.
>
> Depending how careful you want to be you can skip from /64 straight to /56 and /48.

A comment on that answer has a useful addition:

> You can implement this gradual aggregation approach in a fairly simple way. Track separate rate limits at the /64, /56, and /48 level all the time. Use higher limits for higher levels. That way there is no aggregation logic at all. It’s just three separate limits based on different keys.
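
The "three separate limits" idea from that comment might look roughly like this in Go. This is only a sketch under several assumptions: it uses `net/netip` (Go 1.18+), the limits of 100/400/1600 are made-up numbers, and the `tier`, `Limiter`, and `simulate` names are hypothetical. Window resets, expiry, locking, and a cap on the map size are all left out.

```go
package main

import (
	"fmt"
	"net/netip"
)

// tier pairs a prefix length with the requests allowed per window for any one
// prefix of that length. The limits below are made-up illustrative numbers.
type tier struct {
	bits  int
	limit int
}

var (
	ipv4Tiers = []tier{{32, 100}}
	ipv6Tiers = []tier{{64, 100}, {56, 400}, {48, 1600}} // higher limits for wider prefixes
)

// Limiter counts requests per prefix within a single window. Window resets,
// counter expiry, locking, and a cap on map size are omitted for brevity.
type Limiter struct {
	counts map[netip.Prefix]int
}

func NewLimiter() *Limiter {
	return &Limiter{counts: make(map[netip.Prefix]int)}
}

// Allow records one request from addr and reports whether it stays within the
// limit at every tier. There is no aggregation logic: each tier is just a
// separate counter under a different key.
func (l *Limiter) Allow(addr netip.Addr) bool {
	if !addr.IsValid() {
		return false
	}
	addr = addr.Unmap() // treat ::ffff:a.b.c.d as plain IPv4
	tiers := ipv6Tiers
	if addr.Is4() {
		tiers = ipv4Tiers
	}
	allowed := true
	for _, t := range tiers {
		key, err := addr.Prefix(t.bits) // masks off the host bits
		if err != nil {
			return false
		}
		l.counts[key]++
		if l.counts[key] > t.limit {
			allowed = false
		}
	}
	return allowed
}

// simulate sends perSubnet requests from each of n different /64s inside
// 2001:db8::/56 (so n must be at most 256) and returns how many were allowed.
func simulate(n, perSubnet int) int {
	l := NewLimiter()
	allowed := 0
	for i := 0; i < n; i++ {
		addr := netip.MustParseAddr(fmt.Sprintf("2001:db8:0:%x::1", i))
		for j := 0; j < perSubnet; j++ {
			if l.Allow(addr) {
				allowed++
			}
		}
	}
	return allowed
}

func main() {
	// Five /64s each stay within the /64 limit of 100, but together they
	// run into the /56 limit of 400.
	fmt.Println(simulate(5, 100)) // prints 400
}
```

The point of the example is the last step: spreading traffic across several /64s does not help the attacker, because the /56 counter catches the total.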

(Fun fact: If I google for “ipv6 rate limiting” (in a private browsing window), the “featured snippet” at the top is a link to the “rate limiting evasion tool” that I mentioned above. The first normal result is that SO question. And note that it has only 6 votes and a single answer with only 10 votes. Are people just not thinking/talking about the problem? Or am I searching for the wrong thing?)

## How are real rate limiters actually doing it?

Let’s start with Cloudflare, since it’s nice and clear:

> Once an individual IPv4 address or IPv6 /64 IP range exceeds a rule threshold, further requests to the origin web server are blocked

That’s pretty good, though it’s missing some of the nuance of the algorithm above. If there’s a large non-malicious site (apartment complex, school, business, etc.) behind the /64, the blocking might be over-aggressive. If an attacker has an assignment larger than /64, they might have between 256 and 65,536 /64s at their disposal. The large end of that range is getting big.

AWS WAF supports IPv6 for rules, inspection, and reporting, but doesn’t specify how it implements rate-limiting for IPv6. Concerningly, it has a really small limit on the number of IPs it can rate-limit at once: “AWS WAF can block up to 10,000 IP addresses. If more than 10,000 IP addresses send high rates of requests at the same time, AWS WAF will only block 10,000 of them.” Unless their IPv6-limiting algorithm is smart, it would be easy for an attacker to ensure they have more blockable units (IPs or /64s) than the limiter can hold. And that means that it would effectively be completely unlimited.

(This raises the question of what the limit on the number of blocked IPs is for other services. I found no such limit mentioned for anything else.)

I also couldn’t figure out what IPv6 strategy Google Cloud Armor uses, but it says this about its configurable rules: “Both IPv4 and IPv6 source addresses are supported, but IPv6 addresses must have subnet masks no larger than /64.” So maybe its rate-limiting is also /64-based, like Cloudflare? Or maybe that’s reading too much into a statement that’s only tangentially related.

Let’s Encrypt limits account creations by /48, because “it’s not uncommon for one person to have a /48 to themselves”. That seems very… cautious. On the one hand, I like how aggressive it is, but on the other hand… there could be 65,536 home or business networks (/64s) in a single rate-limited /48. I feel like this is too coarse-grained for general use.

A year ago, after a vulnerability report, Nextcloud changed from limiting IPv6 by individual addresses (/128) to limiting by /64. (There also is/was no size-limiting of the IP cache, that I can see.)

I also looked at a couple of Go HTTP rate-limiting libraries – github.com/didip/tollbooth and github.com/go-chi/httprate. Neither distinguishes between IPv4 and IPv6; both simply do per-IP blocking. So that’s bad. And neither has a size limit on the IPs in its limiter cache (only a time limit), so I think an attacker can consume all available memory.²

(Fun fact: Even a terabyte drive can only store 2³⁶ IPv6 addresses. So you’d need about 270 million such disks to store the IP addresses accessible to a single /64 home user. Or 18 trillion disks for a /48.)
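
The disk arithmetic can be checked with a few lines of Go. This sketch assumes "a terabyte" means 2⁴⁰ bytes and that each address is stored as a raw 16 bytes with no overhead; `disksNeeded` is a made-up helper name.

```go
package main

import "fmt"

// A 2^40-byte disk divided by 2^4 bytes per IPv6 address holds 2^36 addresses.
const addrsPerDiskLog2 = 40 - 4

// disksNeeded returns how many such disks it takes to store every address in
// a prefix of the given length: 2^(128-prefixLen) / 2^36.
// It assumes 29 <= prefixLen <= 92 so that the result fits in a uint64.
func disksNeeded(prefixLen int) uint64 {
	return uint64(1) << uint((128-prefixLen)-addrsPerDiskLog2)
}

func main() {
	fmt.Println(disksNeeded(64)) // prints 268435456 (about 270 million)
	fmt.Println(disksNeeded(48)) // prints 17592186044416 (about 18 trillion)
}
```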

## How many “blockable units” is too many for an attacker?

If a rate limiter is blocking by single IP addresses, then that’s the “blockable unit”³. If it’s blocking by /64, then that’s the “blockable unit”. And so on. The rate limiter effectively “allows” an attacker to have a certain number of blockable units at her disposal depending on the limiting strategy used.

The obvious extremes: An attacker having a single blockable unit is acceptable (and unavoidable). An attacker having 2⁶⁴ blockable units is way too many.

But what if the attacker has 256 blockable units (blocking on /64, attacker has /56)? Or 65,536 blockable units (blocking on /64, attacker has /48)?

Let’s (charitably) assume that AWS WAF’s limit of blocking “10,000 IP addresses” applies to /64s for IPv6. If that’s true, then allowing an attacker 65,536 is too many. (To state the obvious, an attacker could cycle through her /64s and never be limited at all.)
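
Here is a toy Go model of that cycling attack. It is not a description of how AWS WAF (or any real product) behaves: it assumes a block list that holds a fixed number of entries, evicts the oldest entry when full, never expires blocks on its own, and blocks a unit after one burst. All names are hypothetical.

```go
package main

import "fmt"

// cappedBlocklist remembers at most max blocked units and evicts the oldest
// entry when full. The eviction policy is an assumption of this toy model.
type cappedBlocklist struct {
	max     int
	order   []int // blocked units, oldest first
	blocked map[int]bool
}

func (b *cappedBlocklist) block(unit int) {
	if b.blocked[unit] {
		return
	}
	if len(b.order) >= b.max {
		oldest := b.order[0]
		b.order = b.order[1:]
		delete(b.blocked, oldest)
	}
	b.order = append(b.order, unit)
	b.blocked[unit] = true
}

// simulate has an attacker rotate through `units` blockable units for
// `rounds` full passes. Each visit to an unblocked unit gets `burst` requests
// through before that unit is blocked. It returns the total that got through.
func simulate(units, max, rounds, burst int) int {
	b := &cappedBlocklist{max: max, blocked: make(map[int]bool)}
	passed := 0
	for r := 0; r < rounds; r++ {
		for u := 0; u < units; u++ {
			if b.blocked[u] {
				continue
			}
			passed += burst
			b.block(u)
		}
	}
	return passed
}

func main() {
	// Attacker with a /56 (256 units): everything is blocked after one pass.
	fmt.Println(simulate(256, 10000, 3, 10)) // prints 2560
	// Attacker with a /48 (65,536 units): by the time she comes back around,
	// her earliest units have been evicted, so every pass gets through.
	fmt.Println(simulate(65536, 10000, 3, 10)) // prints 1966080
}
```

Under these assumptions, once the attacker has more units than the list can hold, the limiter never catches up.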

Do other WAFs have a size limit that they’re not publishing? It seems likely, but not certain. Cloudflare, for example, prides itself on withstanding the largest attacks and is surely concerned about state-level attackers with access to at least a /32 prefix – 4 billion /64s. It would take about 40 GB of storage to keep track of that many prefixes (2³² * (8 bytes per prefix + overhead)). That’s not impossible for a big box of RAM, and certainly not for disk, of course (but disk feels a bit slow for this use case). Perhaps Cloudflare is comfortable with blocking that many addresses.
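
A quick check of that storage estimate, as a Go sketch. The 2 bytes of per-entry overhead in the second call is a made-up figure, chosen only to show how the total moves from roughly 34 GB toward the "about 40 GB" mentioned above; `blocklistBytes` is a hypothetical helper.

```go
package main

import "fmt"

// blocklistBytes estimates the memory needed to track every /64 inside a
// prefix of the given length, at bytesPerEntry bytes each.
// It assumes 1 <= prefixLen <= 64 and that the result fits in a uint64.
func blocklistBytes(prefixLen int, bytesPerEntry uint64) uint64 {
	entries := uint64(1) << uint(64-prefixLen)
	return entries * bytesPerEntry
}

func main() {
	fmt.Println(blocklistBytes(32, 8))  // prints 34359738368: bare 8-byte prefixes
	fmt.Println(blocklistBytes(32, 10)) // prints 42949672960: with 2 assumed bytes of overhead
}
```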

A big box of RAM dedicated to this purpose might be expensive for a smaller operator, but maybe using disk is more acceptable. If we’re talking about Nextcloud running on someone’s NAS box, then /32 attacks are surely outside of the threat model.

What about 256 blockable units? That’s… probably okay?

So, I don’t have a great answer to the question of how many blockable units is too many. What’s your comfort level? What’s your threat model?

And what about an attack that is both distributed and can utilize the full IP space? What multiple of 65,536 (or 256) are you comfortable with?

## Conclusions

I really like the idea of IPv6. I work for a company that would (probably) benefit from widespread IPv6 adoption (so that we’re, uh, harder to block). But as I said at the top: If you need to rate-limit access to something, avoid enabling IPv6 for now. The state of IPv6 rate-limiting just seems too immature.

But what if you have no choice? If you’re using a web application firewall, try to talk to the vendor about what it actually does. (And then let me know what they say!) If you’re doing the rate-limiting yourself, look closely at what your code is doing, because there’s a very good chance that it’s doing it inadequately.

For a quick fix, block IPv6 /64s rather than individual IPs. It might not be perfect, but it’s 2⁶⁴ times better.
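
A minimal sketch of that quick fix in Go, assuming Go 1.18+ for `net/netip` and a remote address in the host:port form found in `http.Request.RemoteAddr`. The function name `keyFromRemoteAddr` is made up for illustration; the returned string is what you would use as the rate-limiter key instead of the raw IP.

```go
package main

import (
	"fmt"
	"net/netip"
)

// keyFromRemoteAddr turns "host:port" into a rate-limiting key: the full
// address for IPv4, and the enclosing /64 for IPv6.
func keyFromRemoteAddr(remoteAddr string) (string, error) {
	ap, err := netip.ParseAddrPort(remoteAddr)
	if err != nil {
		return "", err
	}
	addr := ap.Addr().Unmap() // treat ::ffff:a.b.c.d as plain IPv4
	if addr.Is4() {
		return addr.String(), nil
	}
	prefix, err := addr.Prefix(64) // zeroes the low 64 bits
	if err != nil {
		return "", err
	}
	return prefix.String(), nil
}

func main() {
	for _, ra := range []string{
		"203.0.113.7:51234",
		"[2001:db8:1:2:3:4:5:6]:51234",
		"[2001:db8:1:2:ffff:ffff:ffff:ffff]:443",
	} {
		key, err := keyFromRemoteAddr(ra)
		fmt.Println(key, err)
	}
}
```

The two IPv6 addresses in the example share the key `2001:db8:1:2::/64`, so they count against the same limit.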

I remain hopeful that this situation can improve rapidly. Good algorithms tend to get adopted quickly once they become available in a consumable format, and this is unlikely to be a very complex case. (Yes, I am tempted to implement something myself, but this isn’t a problem I personally have right now so I wouldn’t actually use my own code, which is never a good starting point.)

## Postscript

The state of this seems so obviously sketchy that I think I must be missing something important. I am still an IPv6 neophyte. Please correct me if I have gotten anything wrong.

Edit 2022-02-21: I posted this to /r/ipv6 and there are some good contrary comments there. I particularly like this one that talks about IPv6 being better than IPv4 for rate limiting, since providers will generally have a single IPv6 prefix themselves and give out prefixes in a consistent manner, rather than the scattered, different-IP-each-reboot world of IPv4. The comments also talk a lot more about “bycatch” (over-blocking), which I didn’t really. But I still don’t feel they’re worried enough about how providers and libraries have actually implemented rate limiting at this point in time.

Edit 2022-02-22:

A coworker pointed out that the way I did the prefixed-IP-canonicalization in my PRs was overly complicated, and that the same result can be achieved with the stdlib like ipv6.Mask(net.CIDRMask(56, 128)).String(), where ipv6 is a net.IP. I had tried various approaches with the stdlib and didn’t come up with one that worked, but I guess I missed that one. Embarrassing.
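
Wrapped in a small function, that stdlib expression might be used like this. The wrapper `maskIPv6` and its error handling are illustrative additions, not the code from the PRs.

```go
package main

import (
	"fmt"
	"net"
)

// maskIPv6 parses s as an IPv6 address and zeroes everything after the first
// `bits` bits, using only the net package.
func maskIPv6(s string, bits int) (string, error) {
	if bits < 0 || bits > 128 {
		return "", fmt.Errorf("invalid prefix length %d", bits)
	}
	ip := net.ParseIP(s)
	if ip == nil || ip.To4() != nil {
		return "", fmt.Errorf("not an IPv6 address: %q", s)
	}
	return ip.Mask(net.CIDRMask(bits, 128)).String(), nil
}

func main() {
	for _, bits := range []int{64, 56, 48} {
		masked, err := maskIPv6("2001:db8:1234:5678:9abc:def0:1234:5678", bits)
		fmt.Println(bits, masked, err)
	}
}
```

Note that the resulting string does not include the prefix length, so if you track several levels at once it is probably safer to add the length to the key yourself.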

I did some searching for fail2ban+ipv6. Their IPv6 support master plan is interesting and relevant. For example: “I am not sure we will land/release 1 [per-IP blocking] alone since, as was stated, it could immediately be exploited by an attacker to cause resources exhaustion/DoS. May be only if treatment of IPv6 addresses would be made optional with a big fat warning on possible ramifications.” Despite that, it looks like per-IP IPv6 support was added in 0.10.

Reading through all of the comments on that issue suggests that fail2ban still only uses a per-IP strategy to block IPv6, that the maintainers are aware it’s insufficient, and that they stopped discussing it a year and a half ago.

¹ Some ISPs also give a small multiple of /64s. But I feel like that case isn’t significantly different from a single /64 for our purposes.

² After writing this I realized that I’d better be part of the change I want to see, so I submitted PRs to tollbooth and httprate. Both have been accepted. But it’s unlikely that the only two rate-limiting libraries I checked are the only two with this problem, so I don’t think this changes the overall point of this post.

³ To be clear, I’m making this term up for convenience of discussion.