Reddit Launches Legal Action Against Perplexity AI Over Data Theft

editorial

Social media platform Reddit has initiated a lawsuit against Perplexity AI and three other companies, alleging that they unlawfully scraped user-generated content to train their artificial intelligence models. The complaint, filed in the US District Court for the Southern District of New York, accuses Perplexity AI, along with SerpApi, Oxylabs, and AWMProxy, of engaging in unfair competition and unjust enrichment. Additionally, Reddit claims that some of these companies violated US copyright laws.

In its legal filing, Reddit asserts that these companies circumvented its data protection measures to obtain data essential to Perplexity’s answer engine. According to Ben Lee, Reddit’s chief legal officer, demand for high-quality human-written content has intensified competition among AI companies, resulting in what he describes as an “industrial-scale ‘data laundering’ economy.” Lee stated, “Scrapers bypass technological protections to steal data and sell it to clients looking for training material.”

Reddit emphasized its significance in the AI landscape, noting that its platform hosts over 100,000 interest-based subreddit communities. The lawsuit claims that Reddit user posts have become a primary source for AI-generated answers on Perplexity. Reddit says that after it sent an earlier cease-and-desist letter, Perplexity’s citations of its content rose forty-fold.

In response, Perplexity AI has defended its practices, asserting that it does not train AI models on Reddit content but instead summarizes and cites public discussions from the platform. The company contended that licensing agreements are not possible under these circumstances. Perplexity stated, “Our approach remains principled and responsible as we provide factual answers with accurate AI, and we will not tolerate threats against openness and the public interest.”

SerpApi’s customer success director, Ryan Schafer, said the company strongly disagrees with Reddit’s claims and intends to defend itself against the accusations. Oxylabs expressed disappointment over the lawsuit and said it would mount a robust defense.

Reddit has been actively addressing issues related to data scraping. In 2023, the company began requesting payment from third parties for access to its data and has entered into licensing agreements with major firms like Google and OpenAI. AI researchers have noted that a wealth of moderated conversations on Reddit can enhance the naturalness of AI chatbot responses.

The lawsuit underscores the growing tensions surrounding data ownership, AI transparency, and the ethics of using publicly available content for machine learning. Licensing is also a meaningful business for Reddit: COO Jen Wong said in February 2025 that the AI licensing agreements with Google and OpenAI accounted for nearly 10% of the company’s revenue.

As the legal proceedings unfold, the outcome could shape how AI companies obtain and use publicly available data, particularly with regard to the rights of content creators and of the companies that rely on their contributions.
