Wikipedia offers AI developers a training dataset to maybe get scraper bots off its back

Wikipedia has been struggling with the load that AI crawlers place on its servers. These bots scrape text and multimedia from the encyclopedia to train generative artificial intelligence models, driving up costs and, in some cases, slowing load times for human users. Perhaps in an effort to stop the bots from pummeling the public Wikipedia website and soaking up too much bandwidth, the Wikimedia Foundation (which manages Wikipedia's data) is offering AI developers a dataset they can freely use.

The organization has teamed up with Kaggle, a data science platform, to offer a beta release of a structured dataset in both English and French.

Source: https://www.engadget.com/ai/wikipedia-offers-ai-developers-a-training-dataset-to-maybe-get-scraper-bots-off-its-back-143255593.html?src=rss
