Plagiarism and Copyright Battles in Generative AI

Generative AI has ushered in a transformative era, revolutionizing content creation across various industries. The advent of AI-powered image generators like Stable Diffusion, Midjourney, and DALL-E 2 has brought forth an intriguing blend of innovation and uncertainty. These AI models produce striking visuals, spanning from evocative watercolors to intricate pencil drawings, captivating audiences with their remarkable capabilities. Even esteemed art institutions, such as the Museum of Modern Art and the Mauritshuis, have embraced AI-generated creations, sparking a dialogue about the intersection of technology and artistry.

Yet beneath the surface of these awe-inspiring creations lies a complex web of legal intricacies, with intellectual property infringement looming large. Generative AI platforms, like text generators, mine vast data lakes and question snippets to identify patterns and relationships. They translate these patterns into rules and judgments, which enables them to craft essays, poems, and summaries that mimic style and form. However, this intricate process raises pressing legal concerns, particularly in the realm of copyright.

The marriage of AI and creativity prompts fundamental questions: Do traditional copyright, patent, and trademark infringement laws apply to AI-generated works? Who owns the content produced by generative AI tools — the developers, the users, or the AI itself? These queries beckon resolution, for they underpin the delicate balance between technological progress and the safeguarding of artistic expression. Generative AI, exemplified by ChatGPT, utilizes Natural Language Processing (NLP) to craft text, often drawing from various sources of data. However, the question of whether AI-generated content constitutes inspiration or infringement remains a challenge. With ChatGPT’s wide-ranging capabilities, concerns emerge regarding the line between novel creation and reproduction.

As stated in the recently published document, Generative Artificial Intelligence and Copyright Law, “AI companies may argue that their training processes constitute fair use and are therefore non-infringing. Whether or not copying constitutes fair use depends on four statutory factors under 17 U.S.C. § 107:

  1. The purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes
  2. The nature of the copyrighted work
  3. The amount and substantiality of the portion used as it relates to the copyrighted work as a whole
  4. The effect of the use upon the potential market for or value of the copyrighted work.”

Some stakeholders assert that using copyrighted materials to train AI programs should be categorized as fair use under these four factors. OpenAI, for instance, argues that its training process is transformative rather than expressive, yielding a practical generative AI system. OpenAI further maintains that the third factor supports fair use, given that the copied materials are used solely for training and are not made publicly accessible.

Credits: Girl with the Pearl Earring painting 


The generative AI realm has ignited a fervent discussion about the boundaries of creativity, intellectual property, and the implications of technology-driven content creation. Recently, OpenAI, a prominent player in the AI field, found itself in the crosshairs of a legal battle over alleged copyright infringement. Authors Paul Tremblay and Mona Awad launched a class-action lawsuit, accusing OpenAI of unlawfully incorporating copyrighted material into its generative AI system, ChatGPT. The suit claims that OpenAI illegally downloaded copyrighted novels for training purposes and that ChatGPT’s generated responses infringe upon copyrighted works. This incident underscores the complexities surrounding intellectual property rights in the age of AI.

This lawsuit is not an isolated incident. It’s part of a broader trend in which artists, creators, and copyright holders express apprehension about AI models using their work without permission. Comedian and author Sarah Silverman, along with fellow writers Christopher Golden and Richard Kadrey, filed a lawsuit against OpenAI and Meta for alleged copyright infringement, asserting that their books were used as training data without authorization.

Code generation tools face similar challenges. GitHub Copilot, unveiled by GitHub in June 2021, is an AI tool designed to assist coders by generating code snippets based on patterns extracted from publicly available code repositories. The controversy centers on Copilot’s handling of copyrighted code. A lawsuit claims that Copilot frequently reproduces substantial portions of licensed code without providing proper credit to the original creators, which the plaintiffs say violates copyright law. Matthew Butterick, a programmer and lawyer, spearheaded the lawsuit with the support of the Joseph Saveri Law Firm, based in San Francisco. Butterick asserts that this case is not merely an isolated incident; it’s a pivotal step in a larger movement. He emphasizes that AI systems are not exempt from legal accountability, and that those who create and operate these systems must bear the consequences of their actions.

As the field of generative AI continues to expand, it brings many important issues to the forefront. Chief among them is the urgent need for clear guidelines and ethical considerations surrounding content creation. These legal battles highlight the intricate interplay between AI advancement, copyright law, and the evolving landscape of intellectual property. In this age of technological innovation, it becomes essential to strike a balance between fostering creativity and respecting the rights of content creators.

The outcome of these legal challenges could significantly shape the future of generative AI and its ethical boundaries. In the meantime, we are left to ponder how technology intersects with the artistic realm and the legal domain.

How about AI outputs?

The recently published document titled “Generative Artificial Intelligence and Copyright Law,” issued by the Congressional Research Service, highlights a fundamental challenge stemming from existing U.S. copyright law. The Copyright Office recognizes copyright only in works “created by a human being.” Courts have likewise declined to extend copyright protection to nonhuman authors. For example, appellate courts have held in various cases that a monkey who took a series of photos lacked standing to sue under the Copyright Act; that some human creativity was required to copyright a book purportedly inspired by celestial beings; and that a living garden could not be copyrighted as it lacked a human author.

Within its analysis, the document acknowledges the potential for copyright protection, specifically in cases where human interaction drives the generation of content through generative AI. However, it directs attention to a pivotal factor: human involvement in the creative process. While collaborations between humans and AI to produce content may merit copyright protection, the mere act of submitting a text-based prompt may not adequately meet the criterion of “human involvement,” thereby constraining the extent of copyright coverage.

The Intersection of Innovation and Responsibility

Generative AI has unleashed boundless creative potential, redefining content creation and challenging traditional paradigms. As AI-generated works permeate the cultural landscape, a delicate dance between innovation and responsibility unfolds. Stakeholders navigating this uncharted terrain need to reflect carefully, because their actions will shape the future of generative AI and the preservation of creative expression.

Long-term solutions demand innovative thinking. Developers should adopt practices that preserve the provenance of AI-generated content, ensuring traceability and verification. Businesses should incorporate protective clauses into contracts and demand terms of service that confirm the training data is properly licensed.

Content creators and brands must proactively monitor their intellectual property and examine the evolving landscape of trademark infringement.

Legal counsel should stay informed as laws evolve, while AI developers should source data responsibly and seek content creators’ consent.
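
For developers, the provenance practice mentioned above could start as simply as storing a fingerprint of each generated output alongside details of how it was made. The Python sketch below is only an illustration under stated assumptions: the function names and record fields (such as `training_data_license`) are hypothetical and do not come from any standard or existing tool.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_provenance_record(content: bytes, model_name: str,
                           prompt: str, training_data_license: str) -> dict:
    """Build a provenance record for one piece of AI-generated content.

    All field names are hypothetical, chosen for illustration only.
    """
    return {
        # Fingerprint of the generated output, used later for verification.
        "content_sha256": hashlib.sha256(content).hexdigest(),
        "model_name": model_name,
        # Store a hash of the prompt rather than the prompt text itself.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "training_data_license": training_data_license,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


def verify_content(content: bytes, record: dict) -> bool:
    """Return True if the content still matches the recorded fingerprint."""
    return hashlib.sha256(content).hexdigest() == record["content_sha256"]


if __name__ == "__main__":
    image_bytes = b"example generated output"
    record = make_provenance_record(
        image_bytes, "example-model", "a watercolor of a harbor", "licensed"
    )
    print(json.dumps(record, indent=2))
    print(verify_content(image_bytes, record))
```

A record like this only shows that a given output matches what was logged at generation time. It says nothing on its own about whether the training data was lawfully obtained, which remains a contractual and legal question.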

#GenerativeAI #CopyrightInfringement #AIEthics #ContentCreation #LegalChallenges