Publishers are blocking the Internet Archive for fear AI scrapers can use it as a workaround

The Internet Archive has often been a valuable resource for journalists, whether for finding records of deleted tweets or providing academic texts for background research. However, the advent of AI has created a new tension between the parties. A few major publications have begun blocking the nonprofit digital library's access to their content over concerns that AI companies' bots are using the Internet Archive's collections to indirectly scrape their articles.

"A lot of these AI businesses are looking for readily available, structured databases of content," Robert Hahn, head of business affairs and licensing for The Guardian, told Nieman Lab. "The Internet Archive’s API would have been an obvious place to plug their own machines into and suck out the IP."

The New York Times took a simil...

Source: https://www.engadget.com/ai/publishers-are-blocking-the-internet-archive-for-fear-ai-scrapers-can-use-it-as-a-workaround-204001754.html?src=rss
