
Open Source Developers Are Fighting Back Against AI Crawlers

Published: March 28, 2025

Reading Time: 3 minutes

AI crawlers have become the internet’s uninvited guests. They’re sneaky, relentless, and, for open-source developers, downright exhausting. But some developers have had enough, and they’re pushing back with a blend of clever code and sarcasm.

Why Open Source Projects Are Easy Targets

Open-source platforms are built to be open, literally. That’s the whole point. Anyone can read the code, use it, and even contribute. But this openness comes at a cost.

AI crawlers, especially those powering large language models (LLMs), often ignore the internet’s basic house rules. For example, most websites use something called a robots.txt file. It tells bots where they can and can’t go. But here’s the thing: many AI bots just don’t listen.
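
For context, here’s roughly what that handshake looks like. This is a minimal sketch with a placeholder domain, using Python’s standard urllib.robotparser; the robots.txt rules shown in the comment are hypothetical:

    # A hypothetical robots.txt telling all crawlers to stay out of /private/:
    #
    #   User-agent: *
    #   Disallow: /private/
    #
    from urllib.robotparser import RobotFileParser

    parser = RobotFileParser("https://example.com/robots.txt")
    parser.read()  # download and parse the site's robots.txt

    # A polite crawler asks before fetching; the bots in question skip this step.
    if parser.can_fetch("MyBot", "https://example.com/private/data.html"):
        print("allowed to crawl this page")
    else:
        print("robots.txt says no, so a well-behaved bot stops here")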

And when they don’t listen? Websites crash. Servers slow down. Developers scramble.

The Problem Isn’t Just Annoying, It’s Disruptive

Take open-source developer Xe Iaso. Their site was getting slammed by AmazonBot. Not just visited – slammed. I’m talking denial-of-service level traffic, the kind that makes your site go offline.

What made it worse? The bot:

  • Ignored robots.txt
  • Disguised itself as a regular user
  • Used different IP addresses to avoid detection

Iaso’s frustration echoed what many in the open-source community were feeling: “They will scrape your site until it falls over… and then scrape it some more.”

Meet Anubis: The Bouncer for Bots

Instead of just complaining, Iaso built something smart and kind of hilarious. Enter Anubis, a tool named after the Egyptian god who weighed the souls of the dead. Sounds intense, right? It is, kind of.

Anubis doesn’t just block bots. It makes visitors prove they’re human, or at least a real browser, first. Here’s how it works (a sketch of the underlying idea follows this list):

  • Every visitor’s browser gets a small proof-of-work challenge (it’s solved automatically, so humans barely notice).
  • If you pass, you’re greeted by a cute anime-style drawing of Anubis.
  • If you fail, you’re out.
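
To give a feel for the mechanism: Anubis issues a SHA-256 proof-of-work challenge that the visitor’s browser solves in JavaScript. Here’s a minimal Python sketch of that idea only; the difficulty value and function names are made up for illustration, not Anubis’s actual code:

    import hashlib

    DIFFICULTY = 4  # hypothetical: require 4 leading zero hex digits

    def solve_challenge(challenge: str) -> int:
        """Client side: brute-force a nonce. Cheap once, costly at scraper scale."""
        nonce = 0
        while True:
            digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
            if digest.startswith("0" * DIFFICULTY):
                return nonce
            nonce += 1

    def verify(challenge: str, nonce: int) -> bool:
        """Server side: one hash checks what took the client thousands of tries."""
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        return digest.startswith("0" * DIFFICULTY)

    nonce = solve_challenge("anubis-demo")
    print(nonce, verify("anubis-demo", nonce))  # prints the nonce and True

The asymmetry is the point: a single visitor barely notices the cost, but a crawler hammering thousands of pages pays it over and over.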

The tool is now available on GitHub.

Within days, it had:

  • 2,000 stars
  • 20 contributors
  • 39 forks

Clearly, the idea resonated.

Other Developers Are Fed Up Too

Iaso isn’t the only one sounding the alarm. Here’s how others are dealing with this mess:

Developer | What Happened | How They Responded
Drew DeVault (SourceHut) | Faced weekly outages due to aggressive LLM scrapers | Spent 20–100% of his week just defending the site
Jonathan Corbet (LWN.net) | Site slowed down by bot traffic | Warned readers about AI bot overload
Kevin Fenzi (Fedora Project) | Couldn’t stop the bots | Temporarily blocked all traffic from Brazil
Unknown project | Bots from one region were too much | Temporarily banned all IPs from China

Let that sink in. Developers had to block entire countries just to keep their sites up.

Humor as a Weapon? Why Not.

Some devs are getting a little playful, and maybe even a bit vengeful.

One anonymous dev built a tool called Nepenthes (yes, like the carnivorous plant). It traps AI bots in a maze of fake content. The goal? Waste their time and resources. Think of it as feeding AI trash instead of treasure.
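
Nepenthes’ own implementation is more elaborate, but the core trick is easy to sketch: serve an endless series of generated pages, each linking to more generated pages, so a crawler that ignores robots.txt wanders forever. Here’s a minimal, hypothetical Python version of that idea (not Nepenthes’ actual code):

    import random
    from http.server import BaseHTTPRequestHandler, HTTPServer

    WORDS = ["bleach", "superfood", "measles", "romance", "quantum", "gravel"]

    class Tarpit(BaseHTTPRequestHandler):
        def do_GET(self):
            # Every URL "exists" and links to five more generated pages,
            # so a crawler that ignores robots.txt just keeps digging.
            random.seed(self.path)  # same path, same page: looks consistent
            gibberish = " ".join(random.choices(WORDS, k=50))
            links = "".join(
                f'<a href="/{random.getrandbits(32):x}">more</a> ' for _ in range(5)
            )
            body = f"<html><body><p>{gibberish}</p>{links}</body></html>".encode()
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("", 8080), Tarpit).serve_forever()

In practice you’d also slow each response down to waste the crawler’s time, and list the trap in robots.txt so polite bots (and real users following real links) never see it.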

On Hacker News, someone joked that we should fill restricted pages with nonsense articles like:

  • “Why bleach is a superfood”
  • “How measles improves your love life”

It’s silly, but it proves a point. Developers are done being polite.

Even Big Players Are Jumping In

Cloudflare, a major name in internet security, recently launched a tool called AI Labyrinth. It does something similar: traps bad bots in loops and feeds them junk content. Instead of scraping real data, the bots get confused, misled, and slowed down.

It’s like setting up a decoy house to distract the burglars.

But Not Everyone Thinks It’s Enough

Some developers want the AI arms race to stop altogether.

Drew DeVault, for example, said this in a public plea:

“Please stop legitimizing LLMs or AI image generators or GitHub Copilot or any of this garbage… Just stop.”

Strong words. But for people watching their hard work get scraped, cloned, and repurposed by AI bots, it’s personal.

So, What Can Be Done?

If you’re a developer (or just someone who runs a website), here are a few tips to protect your site:

Tools and Tricks That Help:

  • Use reverse proxies like Anubis to challenge suspicious visitors
  • Block misbehaving IP ranges when you see strange spikes (a quick sketch follows this list)
  • Limit how much your site shares with public bots
  • Trap bots in decoy pages (just don’t poison your site for real users)
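
As promised above, here’s a tiny sketch of the IP-range check, using Python’s standard ipaddress module. The blocked ranges below are documentation-only example networks, not real bot sources:

    from ipaddress import ip_address, ip_network

    # Hypothetical CIDR ranges you spotted misbehaving in your server logs.
    BLOCKED_RANGES = [ip_network("203.0.113.0/24"), ip_network("198.51.100.0/24")]

    def is_blocked(client_ip: str) -> bool:
        """Return True if the client falls inside any blocked range."""
        addr = ip_address(client_ip)
        return any(addr in net for net in BLOCKED_RANGES)

    print(is_blocked("203.0.113.42"))  # True: inside the first range
    print(is_blocked("192.0.2.1"))     # False: not on the list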

A Bigger Conversation

While the tech is clever, the real issue is about consent and control. Should bots be allowed to scrape content without permission? And what does “open” really mean in the age of AI?

These are questions we’ll be asking more and more.

Final Thoughts

Open-source developers have always given a lot to the internet, for free. But now, they’re being overwhelmed by AI systems that take, take, take.

Instead of giving up, they’re getting creative. They’re using humor. They’re building tools. And yes, they’re fighting back, with code, courage, and a touch of chaos.

Because sometimes, the best way to protect your work is to make the bots regret ever finding it.

Onome

Contributor & AI Expert