Legal disputes over the training of artificial intelligence models have grown more complex, incorporating strategic and procedural considerations. These issues touch on corporate applications of generative models as well as efforts to create a consistent usage framework for the digital ecosystem. A clear regulatory framework would not only reduce legal risk but also encourage a responsible approach to artificial intelligence, enabling companies to invest confidently while ensuring fair and sustainable recognition for content creators.
The Landmark Case of Perplexity
A notable example is the generative AI model developed by Perplexity, a company widely recognized for its distinctive applications and its ability to provide directly verifiable sources through explicit linking. Recently, Perplexity, alongside OpenAI, found itself at the center of a copyright infringement lawsuit filed by the New York Post, as reported by The Wall Street Journal.
Perplexity utilizes generative AI to produce responses to user queries, positioning itself as a competitor to Google in the search engine market. Founder Aravind Srinivas has stated the ambitious goal of reaching half a billion queries per day by 2026. For broader context, the company, backed by Jeff Bezos and NVIDIA, is negotiating financing that could potentially double its valuation to over $8 billion. Currently, the startup has an estimated annualized revenue of approximately $50 million.
Despite these impressive figures, Perplexity’s officials expressed surprise at the litigation, stating the company was open to an “appropriate business conversation.” Demonstrating goodwill, Perplexity proposed a revenue-sharing model, inviting the plaintiffs to share in revenues generated from the use of their content. From a procedural standpoint, this approach could help avoid prolonged legal battles and introduce an additional compensation mechanism for the use of protected content. News Corp, the parent company of the Wall Street Journal and the Post, has already entered into a five-year, $250 million licensing agreement with OpenAI.
Legal Complexities and Evolving Regulations
Regulatory uncertainty over the legality of training generative AI models on protected content remains a highly contentious issue. AI companies often defend their practices by invoking “fair use” under U.S. law, while rights holders argue that such training infringes on intellectual property. The legal outcomes of these cases could set pivotal precedents, either broadening or restricting content access for AI model training.
Understanding these disputes from both legal and procedural perspectives is crucial. Some cases may resolve through pre-decisional agreements, avoiding the expense and uncertainty of a full trial. Perplexity’s revenue-sharing proposal exemplifies a practical compromise that could shape future disputes. However, a key question persists: are activities such as crawling and scraping content for AI training permissible for commercial purposes, and how can content creators be appropriately compensated?
A recent ruling from the Hamburg Regional Court (case 310 O 227/23) affirmed established European legal principles by ruling that scraping copyrighted images for dataset creation is permissible only for scientific research, consistent with German and European copyright exceptions. Contrary to some claims, this decision does not authorize indiscriminate use of protected content and leaves unresolved questions about using such datasets for commercial AI training.
Comparing Revenue-Sharing Models: Perplexity and Spotify
Perplexity’s proposed revenue-sharing model resembles that of platforms like Spotify, which distribute revenue from subscriptions and advertising to artists and labels based on listen counts. However, Spotify has faced criticism for providing minimal compensation to independent and lesser-known artists while favoring major labels with direct deals and a larger share of revenue. Similarly, Perplexity’s model seeks to resolve disputes with large publishing groups but offers no comprehensive compensation framework for the small creators whose content is used. As with Spotify, smaller creators risk receiving negligible compensation unless they possess significant legal leverage.
Ensuring Fair Compensation for Content Creators
The absence of fair compensation mechanisms for small content creators remains a critical issue. While large entities like News Corp can negotiate lucrative deals, smaller creators often lack the resources to enforce their rights. This imbalance raises important ethical questions: should only the most powerful entities benefit from compensation systems, leaving smaller creators unprotected?
A “digital fair compensation” model could offer a more equitable solution by providing remuneration proportional to the value and use of content in AI training, irrespective of creators’ economic power. Effective regulation is needed to ensure transparent licensing and remuneration accessible to both large and small creators.
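In arithmetic terms, such a model amounts to splitting a revenue pool among creators in proportion to a measured usage signal. The sketch below illustrates the idea; the creator names, figures, and the use of raw usage counts as the proportionality metric are all hypothetical assumptions for demonstration, not a proposed standard.

```python
# Illustrative sketch of a pro-rata "digital fair compensation" payout.
# All names and numbers below are hypothetical.

def pro_rata_shares(revenue_pool, usage_counts):
    """Split a revenue pool among creators in proportion to how often
    their content was used (e.g., in training or in answering queries)."""
    total = sum(usage_counts.values())
    if total == 0:
        return {creator: 0.0 for creator in usage_counts}
    return {creator: revenue_pool * count / total
            for creator, count in usage_counts.items()}

# Hypothetical example: a $1,000 pool shared by three creators of
# very different sizes.
usage = {"large_publisher": 700, "mid_blog": 250, "indie_writer": 50}
shares = pro_rata_shares(1000.0, usage)
# large_publisher: 700.0, mid_blog: 250.0, indie_writer: 50.0
```

A purely count-based split reproduces the Spotify-style concern raised above, since small creators receive proportionally small payments; a regulated scheme could layer a minimum-payout floor or weight usage by content value on top of this basic calculation.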
Conclusion: Opportunities and Challenges for Companies and Content Creators
The dispute over generative AI and revenue-sharing proposals extends beyond legal boundaries; it tests the business models of the future. Companies operating in content, media, and technology must dedicate resources to closely monitor regulatory changes, which could significantly impact their competitiveness. By proactively adopting inclusive and transparent compensation systems, businesses can anticipate compliance needs and maintain a competitive edge in a shifting regulatory environment.
As generative AI becomes integral to business operations, the challenge of balancing technological innovation with respecting content creators’ rights becomes increasingly significant. Companies that adopt ethical and inclusive compensation strategies can gain a sustainable competitive advantage. A fair compensation model can also build corporate trust and foster a balanced digital ecosystem, ensuring that the value generated by generative AI is distributed equitably among all stakeholders in the digital production chain.