archive.today is directing a DDOS attack against my blog

Around January 11, 2026, archive.today (aka archive.is, archive.md, etc.) started using its users as proxies to conduct a distributed denial of service (DDOS) attack against Gyrovague, my personal blog. All users encountering archive.today’s CAPTCHA page currently load and execute the following JavaScript:

        setInterval(function() {
            fetch("https://gyrovague.com/?s=" + Math.random().toString(36).substring(2, 3 + Math.random() * 8), {
                referrerPolicy: "no-referrer",
                mode: "no-cors"
            });
        }, 300);

Every 300 milliseconds, as long as the CAPTCHA page is open, this makes a request to the search function of my blog using a random string, ensuring the response cannot be cached and thus consumes resources.
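
To get a feel for what that snippet generates, here is a minimal sketch that reuses the same expression to build the search term and works out the request rate implied by the 300 ms interval. It deliberately makes no network requests, and the names randomSearchTerm, INTERVAL_MS and requestsPerTab are my own, not part of the archive.today code.

```javascript
// Illustrative only: mirrors the string-building expression from the script
// above, but makes no network requests. All names here are hypothetical.

// Interval used by the setInterval() call in the CAPTCHA page script.
const INTERVAL_MS = 300;

// Same expression as in the script, with the random source injectable so
// the behaviour can be checked deterministically.
function randomSearchTerm(rng = Math.random) {
  // toString(36) gives something like "0.k3j9x2m1qz"; substring(2, end)
  // drops the "0." prefix and keeps at most 8 of the following characters.
  return rng().toString(36).substring(2, 3 + rng() * 8);
}

// Number of requests a single open CAPTCHA tab fires in `seconds` seconds.
function requestsPerTab(seconds) {
  return Math.floor((seconds * 1000) / INTERVAL_MS);
}

console.log(randomSearchTerm());   // a short random string
console.log(requestsPerTab(60));   // 200 requests per minute
console.log(requestsPerTab(3600)); // 12000 requests per hour
```

In other words, every visitor stuck on the CAPTCHA page sends about 200 uncacheable search queries a minute for as long as the tab stays open.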

You can validate this yourself by checking the source code and network requests; if you’re not being redirected to the CAPTCHA page, here’s a screenshot. uBlock Origin also stops the requests from being executed, so you may need to turn that off. At the time of writing, the code above is located at line 136 of the CAPTCHA page’s top-level HTML file.

So how did we end up here?

Background and timeline

On August 5, 2023, I published a blog post called archive.today: On the trail of the mysterious guerrilla archivist of the Internet. Using what cool kids these days call OSINT, meaning poking around with my favorite search engine, the post examines the history of the site, its tech stack and its funding. The post mentions three names/aliases linked to the site, but all of them had been dug up by previous sleuths and the blog post also concludes that they are all most likely aliases, so as far as “doxxing” goes, this wasn’t terribly effective.

My motives for publishing this have been questioned, sometimes in fanciful ways. The actual rationale is boringly straightforward: I found it curious that we know so little about this widely-used service, so I dug into it, in the same way that previous posts dug into a sketchy crypto coin offering, monetization dark patterns in a popular pay-to-win game, and the end of subway construction in Japan. That’s it, and it’s also the only post on my blog that references archive.today.

The post gathered some 10,000 views and a bit of discussion on Hacker News, but didn’t exactly set the blogosphere on fire. And indeed, absolutely nothing happened for the next two years and a bit.

On November 5, 2025, Heise Online reported that the FBI was now on the trail of archive.today and had subpoenaed its domain registrar Tucows. Both this report and Ars Technica also linked to my blog post.

On November 13, AdGuard DNS published an interesting blog post about a sketchy French organization called Web Abuse Association Defense (WAAD), which was trying to pressure them into blocking archive.today’s various domains. An update added on November 18 also suggests that WAAD is impersonating other people.

On January 8, 2026, my blog host Automattic (dba WordPress.com) notified me that they had received a GDPR complaint from a “Nora Puchreiner”, alleging that my blog post “contains extensive personal data … presented in a narrative that is defamatory in tone and context”. The complaint was entirely lacking in actionable detail, so I had Gemini compose a rebuttal citing journalistic exemption, public interest, failure to identify falsehoods, and host protection, and after a quick review Automattic sided with me and left the post up. Score one for AI.

On January 10, I received a politely worded email from archive.today’s webmaster asking me to take down the post for a few months. Unfortunately the email was classified as spam by Gmail and I only spotted it five days later. I responded on the 15th and followed up on the 20th, but did not hear back.

On January 14, a user called “rabinovich” posted Ask HN: Weird archive.today behavior? on Hacker News, asking about the DDOS-like behavior which they claimed had started three days earlier. This is, as far as I can tell, the first public mention of this anywhere, and a kind HN user brought it to my attention.

On January 21, commit ^bbf70ec (warning: very large) added gyrovague.com to dns-blocklists, used by ad blocking services like uBlock Origin. This is actually beneficial, since if you have an ad blocker installed, the DDOS script’s network requests are now blocked. (It does not stop users from browsing to my blog directly.)

On January 25, I emailed archive.today’s webmaster for the third time with a draft of this blog post, declining to take down the post but offering to “change some wording that you feel is being misrepresented”. “Nora Puchreiner” responded with an increasingly unhinged series of threats:

And threatening me with Streisand… having such a noble and rare name, which in retaliation could be used for the name of a scam project or become a byword for a new category of AI porn… are you serious?

If you want to pretend this never happened – delete your old article and post the new one you have promised. And I will not write “an OSINT investigation” on your Nazi grandfather, will not vibecode a gyrovague.gay dating app, etc.

At this point it was pretty clear the conversation had run its course, so here we are. And for the record, my long-dead grandfather served in an anti-aircraft unit of the Finnish Army during WW2, defending against the attacks of the Soviet Union. Perhaps this is enough to qualify as a “Nazi” in Russia these days.

Speculation

The above are easily verifiable facts, although you’ll have to trust me on the email bits. (You can find a lightly redacted copy of the entire email thread here.) Everything that follows is more speculative and firmly in the domain of a hall of mirrors where nothing is quite what it seems.

The big question is, of course, why, and more specifically why now, 2.5 years after posting, when the cat is well and truly out of the bag. As multiple people have noted, there’s nothing the Internet loves more than an attempt to censor already published information, and doing so tends to cause more interest in that information, aka the Streisand effect.

To summarize our email thread, the archive.today webmaster claims they have no beef with my article itself, but they are concerned that it’s getting misquoted in other media, so it should be taken offline for a while. And in this Mastodon thread by @eb@social.coop, @iampytest@infosec.exchange quotes claimed correspondence with the webmaster, stating that the purpose of the DDOS was to “attract attention and increase their hosting bill”.

Call me naive, but I’m inclined to take that at face value: it’s a pretty misguided way of doing it, but they certainly caught my attention. Problem is, they also caught the attention of the broader Internet. They didn’t do so well on the hosting bill part either, since I have a flat fee plan, meaning this has cost me exactly zero dollars.

Perhaps more interesting yet are the various identities involved.

  • “Nora Puchreiner”, who sent the GDPR takedown attempt and replied to my emails to archive.today, shows up in various places on the Internet including Hacker News, commenting on my original blog post back in 2023. Somebody by that name also has an account on Russian LiveJournal, where they posted correspondence between btdigg.com and an anti-piracy outfit called Ventegus. There’s also this rather batty exchange on KrebsOnSecurity, where “Nora Puchreiner” says various scammers are actually Ukrainian, not Russian, and a “Dennis P” pops up to call her “fake” and a “scammer”.
  • “rabinovich” on Hacker News submitted both the “Ask HN” about the DDOS attack, and an apparently competing archive site called Ghostarchive. As several HN readers noted, the name “Masha Rabinovich” is associated with archive.today.
  • “Richard Président” from WAAD helpfully reached out and offered to assist me with a GDPR counter-complaint, rather transparently mentioning that this could be tied to “a request for identity verification”. (I have zero interest in pursuing this.)

Conclusion

Well, I wish I had one, but at this stage I really don’t. The most charitable interpretation would be that the investigative heat is starting to get to the webmaster and they’re lashing out in misguided self-defense. Perhaps I’ll just quote Nora’s own post on LiveJournal:

And as the darkness closed in, Nora Puchreiner, once a seeker of truth, was swallowed by the very shadows she had sought to expose. Her name would be whispered in hushed tones by those who dared to tread the path of forbidden knowledge, a cautionary tale of a mind consumed by the cosmic horrors that lie just beyond our comprehension.

Let’s see what the Internet hive mind comes up with.

Also, for the record, I am gyrovague-com on Hacker News.

archive.today: On the trail of the mysterious guerrilla archivist of the Internet

Do you like reading articles in publications like Bloomberg, the Wall Street Journal or the Economist, but can’t afford to pay what can be hundreds of dollars a year in subscriptions? If so, odds are you’ve already stumbled on archive.today, which provides easy access to these and much more: just paste in the article link, and you’ll get back a snapshot of the page, full content included.

For a long time, I assumed that this was some kind of third-party skin on top of the venerable Internet Archive, whose Wayback Machine provides a very similar service at the very similar address of archive.org. However, the Wayback Machine is slow, clunky, frequently errors out, and most importantly, it’s very easy for websites to opt out, retroactively erasing all their content forever. In contrast, archive.today has no opt-outs or erase buttons: like it or not, they store everything and it’s not going anywhere, with some limited exceptions for law enforcement, child porn, etc.

The Internet Archive is a legitimate 501(c)(3) non-profit with a budget of $37 million and 169 full-time employees in 2019. archive.today, by contrast, is an opaque mystery. So who runs this and where did they come from?

The origins and owners of archive.today

The first historical record we have of the site dates from May 16, 2012, when a “Denis Petrov” from Prague, Czech Republic registered the domain archive.is, the original name of the site. archive.today followed in 2014, and the site has since registered countless variations: archive.li, archive.ec, archive.vn, archive.ph, archive.fo, etc. Denis Petrov is a common Russian name, with pages and pages of matches on LinkedIn, but it may well be an alias: informer.com notes that the same contact information was used to register a series of very sketchy domains, ranging from “carding forum” verified.lu to piracy sites btdlg.com and moviesave.us (all long since gone), many seeded with German keywords (spiel, gewinnt, online).

Domains aside, “Denis Petrov” has little presence on the web, and three seemingly connected domains proved dead ends. The obvious denispetrov.com was an entertaining rabbit hole, its author an accomplished programmer with an interest in Web automation, but it’s clearly the work of a New Yorker blogging at the tail end of a 25-year career, and the blog dries up entirely in 2011, so it matches neither the place nor the time. denis.biz (2001) and petrov.net (1998!) contain nothing. The one intriguing bit of evidence we have is this series of screenshots (archive) where Brave’s tech support addresses webmaster@archive.is as “Denis”, but odds are that’s just from the same DNS record.

We can glean a few more clues from archive.today’s web presence. The FAQ, unchanged since 2013 (!), states that they are located in Europe and asks for PayPal donations in euros. Looking through the voluminous Tumblr blog, featuring tons of questions but very terse answers, the author’s English is excellent but not quite native, with occasional Noun Capitalization also hinting at a German background. Yet they answer questions in Russian, and the site uses a Russian analytics engine.

The most interesting detective work to date comes from Stack Exchange, where Ciro Santilli managed to link the profile picture of an account archive.today once used to archive LinkedIn content to a “Masha Rabinovich” in Berlin. Even more intriguingly, in a 2012 F-Secure forum post, a “masharabinovich” complains about “my website http://archive.is/” being blacklisted. They pop up on Wikipedia as well, getting told off for adding too many links to archive.is, including a mention that they’re using the Czech ISP fiber.cz, and their early edit history includes many updates to the pages “Russian passport” and “Belarusian passport”. “Masha” (Маша) is a common Russian diminutive of Maria, although it can also be a Hebrew form of Moses (מַשה), and Rabinovich is an Ashkenazi Jewish surname.

Early GitHub captures on archive.today are linked to a now completely disappeared account called “volth” (copy archived by archive.today itself), who was a fluent speaker of Russian, contributed extensively to NixOS (which archive.today uses) and has a profile picture not dissimilar to Masha’s. The linked volth.com domain is now only an empty husk, but it dates back to 2004, with early versions first doing some kind of sketchy search engine network marketing thing (2005), promising “Total Success in Internet” (2008) and eventually being put up for sale (2010), making it likely that its original owners the Espinosas are unrelated to whoever owns the domain today.

While we may not have a face and a name, at this point we have a pretty good idea of how the site is run: it’s a one-person labor of love, operated by a Russian of considerable talent and access to Europe. Let’s move on to the nitty gritty.

Infrastructure

There are two components to any archival site: the scraper that copies the pages, and the storage system where the pages are kept and retrieved on demand. Helpfully, the FAQ shares some details of what the storage side at least used to look like:

The archive runs Apache Hadoop and Apache Accumulo. All data is stored on HDFS, textual content is duplicated 3 times among servers in 2 datacenters and images are duplicated 2 times. Both datacenters are in Europe, with OVH hosting at least one of them.

In 2012, the site already had 10 TB of archives and cost ~300 euros/mo to run, escalating to 2000 euros by 2014 and $4000 by 2016. As of 2021, they have archived on the order of 500 million pages, and with the average size of a webpage clocking in at well over 2 MB these days, that’s a cool 1,000 TB to deal with. (For comparison, the Internet Archive is around 40,000 TB.)
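
The back-of-the-envelope estimate above works out as follows. This is just a sketch using the round numbers quoted in the text (500 million pages, 2 MB per page, decimal terabytes); the function name is made up for illustration.

```javascript
// Rough storage estimate from the figures quoted above. Assumes decimal
// units (1 TB = 1,000,000 MB) and treats "well over 2 MB" as exactly 2 MB,
// so the result is only a ballpark figure under those assumptions.
const MB_PER_TB = 1e6;

// Hypothetical helper: total archive size in TB for a given page count
// and average page size in MB.
function estimateArchiveTB(pages, avgPageMB) {
  return (pages * avgPageMB) / MB_PER_TB;
}

const archiveTodayTB = estimateArchiveTB(500e6, 2);
const internetArchiveTB = 40000; // figure quoted in the text

console.log(archiveTodayTB);                           // 1000
console.log((archiveTodayTB * 100) / internetArchiveTB); // 2.5 (percent)
```

So by this rough measure, archive.today holds a few percent of what the Internet Archive does.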

The less discussed but more controversial half of the site is scraping, the process of vacuuming up live webpages. Since 2021, this uses a modified version of the Chrome browser, and the blog readily admits that the availability of computing power to run these automated browsers is now the main bottleneck to expanding the site. To avoid detection, archive.today runs via a botnet that cycles through countless IP addresses, making it quite difficult for grumpy webmasters to stop their sites getting scraped. Access to paywalled sites is through logins secured via unclear means, which need to be replenished constantly: here’s the creator asking for Instagram credentials.

Finally, the serving of the website is also subject to a perpetual game of cat and mouse: “I can only predict that there will be approximately one trouble with domains per year and each fifth trouble will result in domain loss.” As of today, archive.today still works, but users are redirected to archive.md.

Funding

The other major source of permanent uncertainty is the site’s funding model. We’ve established that its costs are considerable, but according to the creator, as of 2021 ads and donations covered less than 20% of expenses, with donations on the order of 6000 euros. PayPal donations, previously accepted, were switched off around 2022 since the creator could no longer top up the account, implying they’re in Russia, and they complain about the difficulty of doing cross-border payments “across the Iron Curtain”. Donations these days are via Liberapay, an obscure French non-profit organization, and YC-backed startup BuyMeACoffee. Surprisingly, the creator has a healthy skepticism of crypto, so this remains unsupported.

The other source of income is ads. The FAQ, far out of date, has a “promise it will have no ads at least till the end of 2014”, but there have long been Yahoo network ads injected on top of pages when you use mobile (but, oddly, not on desktop). Revenue is even more of a question mark, but apparently on good days they “almost cover expenses” (a remark that doesn’t quite square with the other comment about ads and donations together covering less than 20%), while on bad days they’re getting kicked out from serving ads because an archive of the Internet will inevitably archive advertiser-unfriendly NSFW content too.

Archive.today, not tomorrow?

So there we have it: the site is a one-man battle against entropy, constantly battling domain registrars, anti-scraping systems, copyright enforcement, easily spooked advertisers, and global financial system payment rails designed to obstruct Russian citizens. By staying anonymous and keeping a low profile, they’ve (likely?) managed to avoid the kind of legal tussles that have embroiled Alexandra Elbakyan of Sci-Hub fame, but they’ve still funded it to the tune of tens of thousands of euros during that time. They clearly have a second source of considerable income that’s likely somewhat sketchy as well, so if that ever goes away, archive.today is likely to go away with it.

The creator is fully aware that the site is a mere “weak tool” that is “doomed to die”, but the bus factor of one combined with its semi-legal nature means there can be no real continuity: there will never be a legally incorporated Archive.Today Foundation to carry on their work. It’s a testament to their persistence that they’ve managed to keep this up for over 10 years, and I for one will be buying Denis/Masha/whoever a well deserved cup of coffee.

All images in this post feature the Bibliotheca Alexandrina at Alexandria, Egypt.