Reddit sues Anthropic over alleged scraping of user data for AI
Reddit alleges that Anthropic accessed its platform more than 100,000 times without authorisation to scrape user content for training its Claude chatbot.
Reddit has filed a lawsuit against artificial intelligence company Anthropic, alleging that the startup unlawfully scraped vast amounts of user-generated content to train its AI chatbot, Claude. The suit, lodged in the California Superior Court, accuses Anthropic of systematically harvesting data from Reddit without permission, violating the platform’s terms of service.
The complaint claims Anthropic made over 100,000 unauthorised accesses to Reddit’s platform since mid-2024. This, Reddit argues, represents a flagrant breach of its user agreement and an exploitation of its community-generated data for commercial AI development.
Monetising data versus open access
Reddit’s legal action underscores a growing conflict between tech platforms seeking to monetise their data and AI companies dependent on vast datasets to train models. Reddit has already established data licensing agreements with companies like Google and OpenAI, which allow regulated access to content in exchange for compensation.
By contrast, Anthropic’s alleged scraping bypassed these frameworks, depriving Reddit of potential revenue and compromising its users’ trust. The platform argues that, unlike casual browsing, mass data harvesting is a commercial activity that must be governed by clear contracts and privacy protections.
Anthropic's response and industry implications
Anthropic has denied any wrongdoing, stating that its data collection practices comply with fair use provisions and are consistent with widespread AI development norms. The company, backed by Amazon and led by former OpenAI researchers, said it will vigorously contest the lawsuit.
This case is part of a broader wave of litigation targeting AI firms. The New York Times and several authors have filed lawsuits alleging similar data misuse. The outcomes of these cases could reshape how training data is accessed, negotiated, and monetised across the tech industry.
Reddit's strategic positioning
Reddit’s legal action also reflects its evolving business model. As the company adjusts to life as a publicly traded entity following its IPO, data licensing has emerged as a key revenue stream. The lawsuit signals Reddit’s intent to protect its assets aggressively and assert control over how its content is used.
Industry analysts see Reddit’s move as a bellwether. As AI developers increasingly rely on public web data, platforms are pushing back, demanding clearer boundaries and compensation.


