Quote from: Uncle Pete on May 01, 2026, 14:57
News of a settlement because the company used scraped materials from pirate sites.
"In June 2025, Judge William Alsup of the U.S. District Court for the Northern District of California ruled on summary judgment that using books without permission to train AI was fair use if they were acquired legally, but he denied Anthropic's request for summary judgment related to piracy—finding that the piracy was not fair use."
That's the summary of the important detail. If the AI training material came from legitimate sites, then it's fair use (in this opinion). But if it came from a pirate site, then it's piracy. For these authors and publishers to make a claim and receive payment, the work must have been registered.
What that leads to is this: if there's ever a settlement over photos that were taken from stolen materials and used for AI training, the images must have been registered in advance.
https://authorsguild.org/advocacy/artificial-intelligence/what-authors-need-to-know-about-the-anthropic-settlement/
https://www.wral.com/business/technology/anthropic-settlement-copyright-questions-data-provenance-sept-2025/
From the second article: "Provenance you can audit. Track where all of your data came from—licensed archives, public-domain repositories, creator uploads under clear terms, or lawfully purchased collections. Avoid gray-market mirrors and "misc. web" buckets you can't defend. Courts are paying attention to the acquisition method, not just use."
They've been hammering us for decades about how bad the piracy of copyrighted material is.
Then a tech giant uses a pirate site to train its models. That alone says a lot about their (lack of) ethics.
