Comedian and writer Sarah Silverman, along with authors Christopher Golden and Richard Kadrey, has filed lawsuits against OpenAI and Meta in a US District Court. The three claim that the companies infringed their copyrights by training ChatGPT and LLaMA on illegally obtained datasets.
The datasets allegedly contain the authors’ works, taken from websites such as Bibliotik, Library Genesis, Z-Library, and others that are known for sharing books through torrent systems.
Golden and Kadrey declined to comment on the lawsuit, and Silverman’s team had not responded by press time.
AI bots accused of stealing content and showing it for free to users
In their lawsuit against OpenAI, the trio offers exhibits showing that ChatGPT summarizes their books when prompted, which they argue infringes their copyrights. The exhibits include ChatGPT’s summaries of Silverman’s book The Bedwetter, Golden’s book Ararat, and Kadrey’s book Sandman Slim. The lawsuit also points out that the chatbot does not reproduce any of the copyright information that the plaintiffs included with their published works.
The second lawsuit targets Meta over its LLaMA models, a set of open-source AI models introduced in February. The authors claim that their books were included in the datasets used to train LLaMA and that these datasets were obtained illegally.
The complaint notes that Meta’s own paper on LLaMA lists the sources of its training datasets, one of which is The Pile, compiled by an organization called EleutherAI.
An intricate web of AI companies “borrowing” datasets
According to an EleutherAI paper, The Pile was created using “a copy of the contents of the Bibliotik private tracker.” The lawsuit argues that Bibliotik and the other “shadow libraries” it names are blatantly illegal.
Both lawsuits emphasize that the authors never gave consent for their copyrighted books to be used as training material for these AI models. The claims include multiple counts of copyright infringement, negligence, unjust enrichment, and unfair competition. The authors seek statutory damages, restitution of profits, and other remedies.
LLMlitigation – Artists fight back
The authors’ lawyers, Joseph Saveri and Matthew Butterick, state on their LLMlitigation website that they have been contacted by other writers, authors, and publishers who are concerned about ChatGPT’s ability to generate text similar to copyrighted material, including thousands of books.
Saveri has also initiated legal action against AI companies on behalf of programmers and artists. Separately, Getty Images has filed a lawsuit alleging that Stability AI, the creator of the AI image generation tool Stable Diffusion, trained its model on millions of copyrighted images. Saveri and Butterick are also representing authors Mona Awad and Paul Tremblay in a similar case involving OpenAI’s chatbot, ChatGPT.
from Firstpost Tech Latest News https://ift.tt/OF5jx4C