The U.S. District Court for the Southern District of New York recently dismissed the copyright lawsuit filed by AlterNet and Raw Story against OpenAI, drawing widespread industry attention to the copyright status of AI training data. The editor of Downcodes takes a closer look at the ruling, the reasoning behind it, where the dispute may go next, and how different countries are approaching AI copyright.
According to media reports, the dismissal of the suit brought by news outlets AlterNet and Raw Story may be only a temporary victory for OpenAI. The court's ruling did not touch on the most contested question in artificial intelligence: whether using copyrighted content to train AI models requires authorization.
The two media outlets filed the lawsuit in February this year, accusing OpenAI of removing copyright management information (CMI), including author names, terms of use, and work titles, while preparing its training data. The lawsuit seeks damages of at least $2,500 for each infringement and asks the court to prohibit OpenAI from continuing to use their copyrighted works.
The court dismissed the lawsuit mainly because the plaintiffs failed to prove specific harm resulting from the removal of copyright management information. OpenAI argued in its defense that the plaintiffs could prove neither that ChatGPT had been trained on their works nor that they had suffered specific losses. The judge agreed and pointed out that, given the size of the database, ChatGPT was unlikely to output the content of the plaintiffs' articles.
You Yunting, senior partner of Shanghai Dabang Law Firm, said that proof has always been a key difficulty in AI copyright disputes. Because large models operate as black boxes, it is hard to prove whether a specific work was used for training, and the existing legal framework lacks mechanisms to help the weaker party obtain evidence.
Currently, OpenAI is also facing at least six related lawsuits, including lawsuits from the New York Times, the Daily News and other media, as well as class action lawsuits from writers. These cases all involve a core issue: whether AI companies need authorization to use copyrighted content to train models.
It is worth noting that countries take different positions on this issue. Japan has classified the use of copyrighted works for AI training as fair use, but courts in China and the United States have yet to give a clear answer. Yao Zhiwei, a professor at the School of Law of Guangdong University of Finance and Economics, pointed out that the fair use doctrine lacks a legislative basis in China and that whether courts will recognize it remains highly uncertain.
Although the lawsuit was dismissed, the judge stated in the ruling that the plaintiffs could file again over OpenAI's use of their works to train AI without paying for them. Lawyers for Raw Story and AlterNet have said that they will amend the complaint and continue to defend their rights.
The ruling does not end the AI copyright dispute. Instead, it highlights how far the legal framework lags behind the technology and how urgently the AI industry needs regulation. Future legislation and judicial interpretations will have a profound impact on the development of AI and deserve continued attention. The editor of Downcodes will continue to follow the story and bring readers the latest information.