<beneroth>
I find this once again confirms that FOSS projects should be hosted individually and not centralized at the whims of some bigcorp (remember Bitbucket)
<abu[7]>
Right!
<aw->
beneroth: I don't see how those two things are linked. Rate limits are completely normal and expected of any service receiving millions if not billions of requests per day
<aw->
it's not like they're rate limiting people who are using the service legitimately with an account.. these things are almost always targeted at bots, spammers, abusers..
<beneroth>
aw-, these rate limits are an (understandable) reaction to the blatant scraping and IP theft for LLM development (which was greatly accelerated by Microsoft).
<beneroth>
aw-, well in the past, it was completely legitimate to use GitHub without an account. Now less so.
<beneroth>
Apparently they increased the allowed rate. In the second link you can see people complaining that earlier it kicked in after 2-3 page hits; now apparently only after many more (maybe also depending on which ASN someone is coming from)
<beneroth>
aw-, I'm not criticizing GitHub, they offer a free service which was and is a huge benefit to humanity. I criticize the people who make themselves dependent on that, when it's always quite certain that this will become a problem eventually.
<beneroth>
if GitHub wanted, they could solve that issue in much better ways. Naturally, they go with the cheapest (for them) option (externalizing costs)
aelius has joined #picolisp
<aw->
what better way do you suggest to solve this problem?
<soweli_iki>
these botnets are notoriously difficult to block via IP rules. from what the Emacs wiki (iirc) folks said, they tend to make just a few requests per client IP
<soweli_iki>
but they come in swarms and can easily take down smaller services, or skyrocket the resource costs of larger ones
<beneroth>
soweli_iki, yeah that's true. I experience it myself with my hosting offerings.
<beneroth>
aw-, There is not really a single one-stop solution. I would expect MS to have a wide range of measures against this already with all their online services and Azure. But granted, probably I just overestimate them...
<beneroth>
aw-, on my servers I often block whole IP blocks which are clearly not end-user ISPs (e.g. AWS/OVH/hosters). That obviously wouldn't work really well for GitHub, given their target audience and use, but could be used to segment rate limit categories
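To illustrate the "segment rate limit categories" idea from the message above, here is a minimal sketch in Python. It is not how GitHub or anyone in the channel does it. The network ranges are RFC 5737 documentation blocks standing in for real hoster allocations, and the category names and limits are hypothetical.

```python
import ipaddress

# Hypothetical "hoster" ranges. These are RFC 5737 documentation blocks,
# NOT real AWS/OVH allocations; a real list would come from published
# provider ranges or ASN data.
HOSTER_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]

# Hypothetical per-hour limits for each category.
LIMITS_PER_HOUR = {"hoster": 30, "default": 600}


def categorize(ip: str) -> str:
    """Return the rate-limit category for a client IP."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in HOSTER_NETWORKS):
        return "hoster"
    return "default"


def limit_for(ip: str) -> int:
    """Look up the hourly request limit for a client IP."""
    return LIMITS_PER_HOUR[categorize(ip)]
```

Instead of blocking a whole range outright, a client from a listed network just gets a stricter budget, so a developer behind a cloud VPN still gets through at a lower rate.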
<beneroth>
to me the current rate limits are designed to make people create accounts. otherwise that steep difference between anon (60/h) and authenticated (5k+/h) doesn't make any sense (numbers from https://news.ycombinator.com/item?id=43938433, in turn from GitHub documentation)
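For a sense of what such a two-tier limit looks like mechanically, here is a rough fixed-window counter in Python using the 60/h and 5000/h figures quoted above. The class name, the client key and the fixed-window approach are assumptions for illustration; nothing here is taken from GitHub's actual implementation.

```python
import time
from collections import defaultdict

ANON_LIMIT = 60        # requests per hour, figure quoted in the chat
AUTH_LIMIT = 5000      # requests per hour, figure quoted in the chat
WINDOW_SECONDS = 3600


class FixedWindowLimiter:
    """Counts requests per client in fixed one-hour windows (hypothetical)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        # (client key, window index) -> number of requests seen.
        # Old windows are never purged here; a real service would expire them.
        self._counts = defaultdict(int)

    def allow(self, client_key: str, authenticated: bool) -> bool:
        limit = AUTH_LIMIT if authenticated else ANON_LIMIT
        window = int(self._clock() // WINDOW_SECONDS)
        key = (client_key, window)
        if self._counts[key] >= limit:
            return False
        self._counts[key] += 1
        return True
```

With these numbers an anonymous client averages one request per minute, while an authenticated one gets more than 80 times that budget, which is the gap the message is pointing at.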
<beneroth>
another check (to apply different rate limit categories) would be proof of work, e.g. captchas and the like - but they're annoying and stopped working in recent years (machines are better at solving them than humans), so that's mostly out.
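As a side note on the proof-of-work idea: a captcha tests for a human, whereas a computational proof of work just makes each request cost CPU time. A minimal hashcash-style sketch in Python, assuming SHA-256 and a `challenge:nonce` string format (both choices are illustrative, not from the discussion):

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Cheap server-side check: one hash per submitted nonce."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Expensive client-side search: about 2**difficulty hashes on average."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce
```

The asymmetry is the point: the client has to brute-force a nonce while the server verifies it with a single hash. It does not tell humans and bots apart, it only raises the cost of bulk requests.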
<beneroth>
the question is a bit: what is the issue/goal? 1) server load, or 2) hindering "AI" competition from taking stuff, and making users authenticate for better data and control. 3) network load it cannot be, because authentication makes no difference there
<beneroth>
the best way is to optimize the server software and make use of HTTP caching mechanisms so the load is less of a problem. good botnets have enough IPs to keep going, so simple rate limiting only really restricts legitimate users (forcing the botnets to switch IPs only increases load with all the additional TCP connections).
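One of the HTTP caching mechanisms alluded to above is the conditional request: the server sends an `ETag`, the client sends it back in `If-None-Match`, and an unchanged page is answered with a bodiless 304. A minimal sketch in Python; the function names, the hash-based tag and the `max-age` value are assumptions, and weak validators (`W/`) and the `*` wildcard are deliberately not handled.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a strong ETag from the response body (hypothetical scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes,
            if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET with optional If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "public, max-age=300"}
    if if_none_match is not None:
        # The header may carry several comma-separated tags.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag in candidates:
            return 304, headers, b""   # client copy is still valid
    return 200, headers, body
```

The `Cache-Control` header additionally lets shared caches and CDNs answer repeat hits without reaching the origin at all, which is where most of the load reduction would come from.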
<beneroth>
if it's individual IPs which misbehave, then they can be detected by behavior analysis (behavior a legitimate client wouldn't show in bulk, e.g. many 404 responses etc).
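The 404 example can be sketched as a simple ratio check over an access log. This is only a toy in Python: the `(ip, status)` log format and both thresholds are made up for illustration, and real behavior analysis would look at more signals than this.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def suspicious_clients(log: Iterable[Tuple[str, int]],
                       min_requests: int = 20,
                       max_404_ratio: float = 0.5) -> Set[str]:
    """Flag IPs with enough traffic and an unusually high share of 404s.

    `log` is an iterable of (client_ip, http_status) pairs; both
    thresholds are hypothetical defaults.
    """
    total = Counter()
    not_found = Counter()
    for ip, status in log:
        total[ip] += 1
        if status == 404:
            not_found[ip] += 1
    return {
        ip for ip, n in total.items()
        if n >= min_requests and not_found[ip] / n > max_404_ratio
    }
```

The minimum request count keeps a visitor who follows a couple of dead links from being flagged. As noted earlier in the discussion, this only helps against individual misbehaving IPs, not against swarms that send a few requests each.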
<beneroth>
the very low limits (they were increased recently) suggest it is primarily about getting clients to authenticate and not about stopping abusive scrapers - or if it is, then it's done very lazily and without consideration of their (past) target audience, which would not be out of necessity for an IT company of such size, skill and resources.
<beneroth>
ofc it's their right to turn their service into a walled garden. as did Bitbucket a while ago.