The Internet Archive has accused the European Union's Internet Referral Unit (EU IRU) of sending hundreds of false takedown requests for 'terrorist content' on the site, warning that proposed legislation would require it to take completely innocuous, and often core, content down automatically until staff can manually review each request.

As part of a Europe-wide crackdown on the hosting of content deemed terroristic by the authorities, the European Union is in the process of introducing legislation which would require sites hosting user-generated content (UGC) to take down material within an hour of it being reported. While the legislation is positioned as a means of reducing the harm said material can cause, volunteer electronic library the Internet Archive is warning that the law could have a serious chilling effect on how the site operates - and has the false takedown requests to prove it.

'In the past week, the Internet Archive has received a series of email notices from Europol's European Union Internet Referral Unit (EU IRU) falsely identifying hundreds of URLs on archive.org as "terrorist propaganda",' explains the Internet Archive's Chris Butler in a blog post. 'At least one of these mistaken URLs was also identified as terrorist content in a separate take down notice from the French government's L'Office Central de Lutte contre la Criminalité liée aux Technologies de l'Information et de la Communication (OCLCTIC).

'The Internet Archive has a few staff members that process takedown notices from law enforcement who operate in the Pacific time zone. Most of the falsely identified URLs mentioned here (including the report from the French government) were sent to us in the middle of the night – between midnight and 3am Pacific – and all of the reports were sent outside of the business hours of the Internet Archive. The one-hour requirement essentially means that we would need to take reported URLs down automatically and do our best to review them after the fact.

'It would be bad enough if the mistaken URLs in these examples were for a set of relatively obscure items on our site,' Butler continues, 'but the EU IRU's lists include some of the most visited pages on archive.org and materials that obviously have high scholarly and research value.'

Of the material reported for takedown by the IRU, some of the most egregious examples are the central index pages for collections covering all text files, content relating to American culture, out-of-copyright works of literature collected under the banner of Project Gutenberg, material released by the Smithsonian, every television programme hosted by the site, and scholarly articles archived from public-access sources including arXiv and PubMed. Even more surprising are requests to take down content produced and distributed by the US Government, including C-SPAN debate archives and a copy of a report from the US Army War College on the challenge of drug trafficking to democratic governance and human security in West Africa.

'These examples are only a few of the some 550 falsely identified URLs,' Butler explains. 'The erroneous reports continue to be sent to us by the EU IRU (the most recent example was sent a day prior to this post). Thus, we are left to ask – how can the proposed legislation realistically be said to honour freedom of speech if these are the types of reports that are currently coming from EU law enforcement and designated governmental reporting entities? It is not possible for us to process these reports using human review within a very limited time frame like one hour. Are we to simply take what's reported as "terrorism" at face value and risk the automatic removal of things like THE primary collection page for all books on archive.org?'

'I'm not inclined to comment on this other than to say that when I see people complain "I wrote to the archive, and it's two days later and they've not gotten back to me!!!!!!!" this is why it takes time,' adds archivist Jason Scott on Twitter. 'Realise, that in the actual real world, that there are firms and entities who simply blast thousands of requests at websites, filled with threatening language, and then figure it'll all work out, and take a nice fat paycheck from whoever they're doing it for as contractors.'

The issues encountered by the Internet Archive apply equally to the recently approved Article 13 of the revised EU copyright legislation: under the terms of Article 13, any site which allows UGC must proactively scan said content for copyright violations. With thousands of hours of video, audio, text, and imagery being uploaded to sites across the world every minute, there's no way to do this without relying upon automated filters - the very same type of automation which doubtless triggered the false takedown notices sent by the EU IRU.

The EU IRU was contacted for comment on this article, but had not responded at the time of writing.

UPDATE 1640:

Europol, of which the EU IRU is a division, has denied sending takedown notices to the Internet Archive, placing the blame firmly on the shoulders of the French national IRU. 'All 25 URLs mentioned in the blog have been referred to the Internet Archive by the French IRU Unit, using the EU IRU's Internet Referral Management Application (IRMa) in April 2019,' explains Europol spokesperson Jan Op Gen Oorth. 'EU Member States are using dedicated email addresses via IRMa to channel their referrals. Hosting Service Providers have been informed that even though the emails are sent from Europol's domain, the content is assessed and referred by the relevant Member State.

'Member States are also requested to add their signature to the emails to make clear the referrals come from them and not from the EU IRU,' Op Gen Oorth adds, though it is not immediately clear whether these signatures were in place on the messages received by the Internet Archive and originally believed to be from the EU IRU.

The French Ministry of the Interior has been contacted for further comment.