Toward Computational Copyright in AI Image Generation: Legal Limits, Technical Challenges, and a Path Forward

Term Paper by Renwen Xia

The rapid rise of generative AI, especially in image creation through diffusion models, has exposed deep inadequacies in current copyright law, which remains trapped in a binary framework of infringement or fair dealing. This paper explores the legal and technical challenges of this landscape and proposes a computational copyright regime, grounded in emerging attribution methods, as a path toward fairer and commercially sustainable solutions.

Legal and commercial moves in the context of generative AI

In the Canadian legal context, much like the American one, current copyright law offers a binary outcome: either a finding of infringement with significant penalties, or a finding of fair dealing.1 Since the proliferation of generative AI tools, especially image generators built on diffusion model architectures, this binary discourages AI firms from licensing specific artistic styles and leaves rights holders with few avenues for compensation beyond outright opposition.

The recent controversy surrounding OpenAI’s GPT-4o and its generation of Studio Ghibli-style images underscores the inadequacy of current legal frameworks in addressing AI-driven artistic replication. While no direct copying is evident, the substantial similarity to Ghibli’s distinctive aesthetic has ignited debates over copyright infringement.2 

So far Studio Ghibli has not sued OpenAI, but Getty Images sued Stability AI in 2024. In a lawsuit filed early that year in a Delaware federal court, Getty Images alleged that London-based Stability AI had copied, without permission, more than 12 million photographs from its collection, along with captions and metadata, as part of an effort to build a competing business.3 Getty is seeking damages of up to $150,000 for each infringed work, an amount that could theoretically add up to $1.8 trillion.4

While Getty v. Stability AI has largely stalled in jurisdictional and procedural disputes and is unlikely to be resolved soon, Getty Images made a commercial move of its own by launching a diffusion-based generative AI tool in December 2024. The image generator was developed in collaboration with NVIDIA and trained solely on Getty’s fully licensed image database. The same year, Getty also announced a strategic engagement with Clarifai, a global leader in AI orchestration, that integrates Getty’s generative AI capabilities into the Clarifai platform.5 This move marks a strategic assertion of control over its IP and a public declaration that its AI generator is “commercially safe.”6

As the world’s largest stock image platform, Getty Images is pursuing not only commercial safety but also an “ethical” approach that shares the revenue generated by its AI tool with rights holders. Importantly, Getty is now exploring a revenue-sharing model that compensates content creators whose works contribute to AI-generated outputs.7 According to the Getty Images API documentation, “Getty Images compensates contributors in an ongoing basis. This includes where contributors’ content is used as training data for AI. On an annual recurring basis, we will share in the revenues generated from the Generative AI by Getty Images with contributors whose content was used to train the AI Generator, allocating both a pro rata share in respect of every file and allocating a share based on traditional licensing revenue.”8
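Getty does not publish the exact formula, but its description suggests a two-part allocation: a pro rata share per training file plus a share weighted by each contributor’s traditional licensing revenue. A minimal sketch of what such a blended split could look like; the function name, the contributors, and the blending weight `alpha` are all invented for illustration:

```python
def revenue_shares(pool, files_per_contributor, licensing_revenue, alpha=0.5):
    """Split a generative-AI revenue pool between contributors.

    `alpha` is a hypothetical blending weight: that fraction of the pool
    is allocated pro rata per training file, and the remainder in
    proportion to each contributor's traditional licensing revenue.
    """
    total_files = sum(files_per_contributor.values())
    total_licensing = sum(licensing_revenue.values())
    shares = {}
    for c in files_per_contributor:
        file_part = alpha * pool * files_per_contributor[c] / total_files
        licensing_part = (1 - alpha) * pool * licensing_revenue[c] / total_licensing
        shares[c] = file_part + licensing_part
    return shares

# Hypothetical example: a $100,000 annual pool split between two contributors.
shares = revenue_shares(
    100_000,
    files_per_contributor={"alice": 800, "bob": 200},
    licensing_revenue={"alice": 30_000, "bob": 70_000},
)
```

Note what this sketch cannot do: both allocation keys (file counts, past licensing revenue) are proxies measured before generation, disconnected from how much any work actually shaped the outputs, which is precisely the gap the rest of this paper addresses.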

Pressing need for computational copyright and the technical challenges

I see a pressing need for ongoing revenue sharing with the rights holders who contribute to both the input and the output of image generation tools. Attempts by startups like Bria and Ascendant Art, along with major players like Shutterstock and Adobe, confirm this need: all of them are developing frameworks to compensate artists and content creators whose works are used to train AI models.9

Without access to the specific revenue-sharing agreements between Getty and its rights holders, the key challenge, as I see it, is this: how can generative AI platforms accurately and proportionally distribute revenue to contributors based on their actual contribution to the training data or to the generated images?

This challenge stems from the opaque nature of image diffusion models built on convolutional neural network (“CNN”) architectures.10 The opacity of diffusion models arises from their complex, multi-step generative processes. These models learn to reconstruct data by progressively denoising inputs, a procedure that is mathematically intricate and difficult to interpret. This complexity hampers efforts to trace how specific training data influence the generated outputs.11 CNNs are commonly used in diffusion models for their capacity to generate high-quality images. With their multi-layered structures and non-linear activations, CNNs further exacerbate this opacity by making it difficult to discern the internal decision-making process. As a result, CNNs are often considered “black-box” models, with limited transparency in how they process and transform input data.12
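To make the multi-step process concrete, the standard DDPM formulation cited above mixes an image with Gaussian noise over many steps and trains a network to invert each step. A minimal NumPy sketch of the forward-noising equation and the mean of one reverse step, with an oracle noise estimate standing in for the trained CNN:

```python
import numpy as np

# Sketch of the DDPM forward/reverse equations (Denoising Diffusion
# Probabilistic Models). In a real model, `eps_hat` comes from a trained
# CNN (typically a U-Net); here we pass the true noise just to show the
# arithmetic of a single denoising step.

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # noise schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)      # cumulative signal retention

def forward_noise(x0, t, eps):
    """q(x_t | x_0): blend the clean image x0 with Gaussian noise eps."""
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1 - alpha_bars[t]) * eps

def reverse_step_mean(xt, t, eps_hat):
    """Mean of p(x_{t-1} | x_t) given a noise estimate eps_hat."""
    return (xt - betas[t] / np.sqrt(1 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))     # stand-in for an image
eps = rng.standard_normal((8, 8))
xt = forward_noise(x0, 500, eps)
x_prev_mean = reverse_step_mean(xt, 500, eps)
```

A full generation repeats the reverse step a thousand times, each step entangling the learned network with the noise trajectory, which is why tracing any single training image’s influence through the chain is so hard.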

At its core, this is a technical problem. Without clear methods to measure attribution, any revenue-sharing model risks becoming a form of compensation disconnected from actual contribution, more a matter of business negotiation than fairness, and one that could easily veer into exploitation.

I argue that a computational copyright regime could be a useful tool for addressing the legal and commercial issues raised by generative AI. Since no existing work provides a clear definition, I define “computational copyright” as “the use of technical, algorithmic methods to measure, quantify, and attribute the influence of copyrighted works on AI-generated outputs. It aims to create an objective, systematic basis for determining ownership rights, compensation, and liability in the context of generative AI, bridging the gap between legal copyright principles and the opaque nature of AI models.”

Related work on computational copyright frameworks and their limits

Fortunately, several related works propose forms of computational copyright, laying a technical foundation for a future commercially safe and fair revenue-sharing model, as well as for computational copyright legal remedies in the context of generative AI.

  1. A computational copyright framework for AI music algorithmic licensing13

The computational copyright framework for AI music generation designs a royalty distribution model using data attribution techniques. It measures how much each piece of copyrighted music contributes to AI-generated outputs, employing removal-based attribution methods like TRAK and TracIn alongside musical similarity metrics. This approach addresses the problem of opaque attribution in AI outputs and offers a path beyond the binary copyright outcomes of infringement versus fair use by enabling algorithmic licensing. It allows platforms to create revenue pools and distribute royalties proportionally to actual contributions, much like YouTube’s Content ID system.
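The removal-based idea behind these attribution methods can be illustrated with a toy model where exact retraining is cheap; TRAK and TracIn exist precisely because retraining real models in a loop is infeasible. The trivial “model” and all names below are my own illustration, not the framework’s actual algorithm:

```python
import numpy as np

# Removal-based attribution sketch: score each training item by how much
# removing it (and "retraining") changes the model's fit to a query output.
# The "model" here is just the mean of the training data, for illustration.

def fit_mean(X):
    """'Train' a trivial model: the mean of the training set."""
    return X.mean(axis=0)

def attribution_scores(X, query):
    base_err = np.linalg.norm(fit_mean(X) - query)
    scores = []
    for i in range(len(X)):
        X_minus_i = np.delete(X, i, axis=0)        # leave one item out
        err = np.linalg.norm(fit_mean(X_minus_i) - query)
        scores.append(err - base_err)              # > 0: removing i hurts the fit
    return np.array(scores)

X = np.array([[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]])  # three "training works"
query = np.array([9.0, 9.0])                          # a "generated output"
scores = attribution_scores(X, query)
```

A royalty pool could then be split in proportion to the positive scores; the noisiness of such scores for minor contributors is one of the limits discussed later in this paper.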

  2. A computational approach grounded in Levin-Kolmogorov complexity14

The computational approach grounded in Levin-Kolmogorov complexity introduces a method for measuring “derivation similarity” between AI outputs and original works by comparing the description lengths of programs that generate the output with and without access to the original material. This method addresses the unpredictability of traditional copyright tests for non-literal copying by offering an objective, measurable scale. It aligns with the abstraction-filtration-comparison test in software law, aiming to support proportional remedies in disputes over AI-generated works.
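The core idea can be stated compactly. In my own notation (the paper’s formalization differs in detail), let $K(y)$ be the length of the shortest program producing the output $y$, and $K(y \mid x)$ the length of the shortest program producing $y$ given access to the original work $x$:

```latex
\[
  \mathrm{DerSim}(y, x) \;=\; \frac{K(y) - K(y \mid x)}{K(y)}
\]
```

A value near 1 means access to $x$ supplies almost the entire description of $y$ (strong derivation); a value near 0 means access to $x$ saves nothing. Since Kolmogorov complexity is incomputable, practical estimates must substitute a real compressor for $K$, which is one source of the limits noted later in this paper.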

  3. The Shapley Royalty Share Framework15

The Shapley Royalty Share (SRS) framework applies cooperative game theory, specifically the Shapley value, to fairly allocate revenue among copyright owners based on each owner’s marginal contribution to the AI model’s ability to generate a given output. By offering a quantitative and interpretable way to assign royalties, the SRS framework addresses the current legal system’s all-or-nothing outcomes and promotes proportional compensation for partial contributions to AI outputs.
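For small numbers of rights holders, the Shapley value can be computed exactly from its definition. The valuation function below, mapping each coalition of owners to a hypothetical revenue, is invented for illustration and is not the SRS paper’s actual valuation:

```python
from itertools import combinations
from math import factorial

# Exact Shapley values for a toy "royalty game": v(S) is the revenue the
# platform could earn if the model were trained only on coalition S of
# rights holders. Feasible only for small n; SRS-style frameworks need
# approximations at scale.

def shapley(players, v):
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            for S in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                phi[p] += weight * (v(frozenset(S) | {p}) - v(frozenset(S)))
    return phi

# Hypothetical valuations: the pair together is worth more than the parts.
revenue = {
    frozenset(): 0,
    frozenset({"a"}): 60,
    frozenset({"b"}): 40,
    frozenset({"a", "b"}): 120,
}
phi = shapley(["a", "b"], revenue.__getitem__)
```

The efficiency property (shares sum exactly to the grand-coalition revenue) is what makes the allocation interpretable as a royalty split; the combinatorial loop over coalitions is what makes it expensive, and why duplicated training data can be used to game the marginal contributions.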

PSPC: a technical foundation toward actual computational copyright

While each method advances the goal of fairer attribution and compensation, all three face significant technical challenges. Data attribution methods remain resource-intensive and prone to noise, especially for minor contributors. Levin-Kolmogorov complexity is theoretically robust but practically limited by incomputability and dependency on shared background data. The SRS framework, although conceptually elegant, suffers from computational burdens in calculating Shapley values at scale and is vulnerable to strategic manipulation like data duplication. Together, these limitations highlight the need for further refinement to achieve scalable, legally reliable computational copyright solutions.

In comparison, the Patch Set Posterior Composite (PSPC) method offers a significant advantage for advancing a computational copyright regime: it operates inside the generation process itself, offering granular, procedural attribution at each diffusion step.

Patch Set Posterior Composite (PSPC)16 

The PSPC method introduces a training-free, localized denoising mechanism to explain how diffusion models generalize. Instead of learning denoising globally across an entire image, PSPC approximates the denoising operation by combining local patch-based estimates of the posterior mean at each step of the diffusion process. For a noisy input image, PSPC breaks the image into overlapping patches, estimates how each patch should be denoised based on nearby training examples (through empirical averaging weighted by local similarity), and aggregates these local corrections to reconstruct a full denoised output. Two main variants are proposed: PSPC-Square, which uses fixed-size patches, and PSPC-Flex, which adapts patch shapes based on the neural network’s learned sensitivity. This method effectively mimics the behavior of real diffusion model denoisers during most of the generation process without needing any retraining or optimization.
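Under heavy simplifying assumptions (fixed square patches in the spirit of PSPC-Square, a single noise level, Gaussian similarity weights), the patch-average-and-recompose idea can be sketched as follows; all parameter choices here are mine, not the paper’s:

```python
import numpy as np

# Hedged sketch of a PSPC-style local denoiser: estimate the posterior
# mean of each noisy patch as a similarity-weighted average over training
# patches, then recompose the overlapping estimates into a full image.

def extract_patches(img, p):
    H, W = img.shape
    return {(i, j): img[i:i+p, j:j+p]
            for i in range(H - p + 1) for j in range(W - p + 1)}

def pspc_denoise(noisy, train_imgs, p=3, sigma=0.5):
    H, W = noisy.shape
    out = np.zeros((H, W))
    counts = np.zeros((H, W))
    train_patches = [pt for t in train_imgs
                     for pt in extract_patches(t, p).values()]
    for (i, j), q in extract_patches(noisy, p).items():
        # Gaussian-kernel weights: which training patches this local
        # denoising decision actually draws on.
        d2 = np.array([np.sum((q - tp) ** 2) for tp in train_patches])
        w = np.exp(-d2 / (2 * sigma ** 2))
        w /= w.sum()
        est = sum(wi * tp for wi, tp in zip(w, train_patches))
        out[i:i+p, j:j+p] += est
        counts[i:i+p, j:j+p] += 1
    return out / counts  # average overlapping patch estimates

rng = np.random.default_rng(0)
train_imgs = [np.zeros((8, 8)), np.ones((8, 8))]   # two "copyrighted works"
clean = np.ones((8, 8))                            # truth behind the noisy input
noisy = clean + 0.1 * rng.standard_normal((8, 8))
denoised = pspc_denoise(noisy, train_imgs)
```

The per-patch weights `w` are exactly the attribution signal the copyright argument needs: they record which training patches, from which works, drove each local denoising decision, patch by patch and step by step.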

PSPC could form a powerful technical foundation for addressing copyright issues in the context of image generative AI, because it provides a way to decompose and attribute influence at a local patch level during the image generation process. Unlike previous approaches that can only assess output similarity after generation, PSPC shows that each localized denoising step can be traced back probabilistically to specific parts of the training data. In a legal setting, this means it would become possible to produce evidence showing how much influence (at the patch level) a copyrighted image had on a new AI-generated image, not just globally but locally and procedurally throughout the sampling process. This opens the door to partial attribution, proportional liability, and proportional compensation frameworks, which are crucial for building fair revenue-sharing models and addressing infringement claims where AI outputs are “inspired by” but not direct copies of copyrighted works.

The PSPC framework offers a promising technical foundation to address the intertwined legal and business challenges posed by generative AI image platforms. 

In law, PSPC could transform the rigid “all-or-nothing” structure of current copyright adjudication. Instead of evaluating AI outputs as either fully infringing or fully original, courts could recognize partial, localized copying or influence, grounded in empirical evidence generated by PSPC analysis. This would enable proportional remedies, such as allocating damages or licensing fees based on the extent of influence, rather than triggering extreme outcomes like full injunctions or total damages awards. In cases like the Getty v. Stability AI dispute or potential future conflicts involving artistic styles like Studio Ghibli’s, PSPC could offer a middle path: rights holders would not have to prove that an entire output was a copy, but could demonstrate quantifiable, localized influence sufficient for partial compensation. By moving attribution from the level of subjective aesthetic impression to objective, patch-based probabilistic tracing, PSPC would provide courts and regulators with a structured tool to manage complex generative cases fairly.

On the business side, PSPC could underpin a new generation of commercially safe and fair generative AI platforms. It would enable platforms like Getty, Clarifai, or Adobe to design transparent revenue-sharing models, where compensation to rights holders is tied not simply to the inclusion of works in training datasets, but to measured influence on actual generated outputs. This would move beyond pro rata or blanket licensing approaches toward fine-grained, patch-level revenue allocation. Rights holders would be empowered to audit and verify their participation in AI outputs, strengthening their bargaining position in negotiations with major platforms. At the same time, AI developers could demonstrate the commercial and ethical “safety” of their tools by offering provable, computationally grounded attribution and compensation mechanisms. 

In doing so, PSPC offers a path past the exploitation risks noted above: it aligns technological capability with legal fairness and commercial sustainability, making it a crucial building block for any future computational copyright regime.

Conclusion

Computational copyright, supported by technical advances like PSPC, offers a promising bridge between opaque AI processes and the demands for legal accountability and fair compensation. By enabling proportional attribution and remedy, it paves the way for a more nuanced and equitable copyright framework in the era of generative AI.

  1. 1999 CanLII 7479 (FC) | CCH Canadian Ltd. v. Law Society of Upper Canada (T.D.) | CanLII ↩︎
  2. Studio Ghibli AI portraits spark outrage, make it to the White House ↩︎
  3. Getty Images v. Stability AI | BakerHostetler ↩︎
  4. Photo giant Getty took a leading AI image-maker to court. Now it’s also embracing the technology | AP News ↩︎
  5. Clarifai and Getty Images Announce Strategic Engagement to Make AI-Generated Images Available to Clarifai Customers – Getty Images ↩︎
  6. NVIDIA Model Card | Getty Images API ↩︎
  7. Photo giant Getty took a leading AI image-maker to court. Now it’s also embracing the technology | AP News ↩︎
  8. NVIDIA Model Card | Getty Images API ↩︎
  9. Getty Images launches an AI-powered image generator | TechCrunch ↩︎
  10. NVIDIA Model Card | Getty Images API ↩︎
  11. [2006.11239] Denoising Diffusion Probabilistic Models ↩︎
  12. https://machinelearningmodels.org/convolutional-neural-networks/ ↩︎
  13. [2312.06646] Computational Copyright: Towards A Royalty Model for Music Generative AI ↩︎
  14. [2206.01230v2] Formalizing Human Ingenuity: A Quantitative Framework for Copyright Law’s Substantial Similarity ↩︎
  15. [2404.13964v3] An Economic Solution to Copyright Challenges of Generative AI ↩︎
  16. [2411.19339] Towards a Mechanistic Explanation of Diffusion Model Generalization ↩︎
