Generative AI Turns Ordinary Computers
into Weapons of Mass Copyright Violation

Generative AI gives intellectual property right holders an excuse to grab control of your devices.

[AI-generated image: corporate lawyers watch gleefully as someone is chained to their devices while books burn in the background.]

7 Feb 2024

London

by Matthew Eric Bassett

2023 was a banner year for tech journalists, thanks largely to a technology that should make them all obsolete: large language models. But the irony is that large language models might spark a legal shift in favour not of the journalists, but of the large media corporations that employ them – and this shift would substantially limit our right and ability to access information from different sources. This potential attack on our rights would come in the form of stronger intellectual property rights for those same corporations. We can see this playing out in some of the tech & law headlines from 2023. In one case, we can watch how the media’s ability to distract us from the primary sources lets them distort and even create stories; in the other, we can see how enforcement of their copyright over those stories might let them shut down primary sources.

Let’s begin with the first case. In 2023 much of tech & law journalism was following the legal trials of computer scientist Dr. Stephen Thaler, who spent much of the year trying to get an AI model listed as the inventor on a patent or the author of a copyrighted work. While the patent case got as far as a US federal appeals court, and so must have had some merit, the copyright case was quite simple. Dr. Thaler argued that the copyright office should have granted his application to list his AI model as the author of a picture he developed using it. But the copyright office, and the law, are quite clear: copyright requires human authorship. It’s right there in black and white on page 7, chapter 300 of the Compendium of US Copyright Office Practices.1 Nevertheless, the media took advantage of the fact that you haven’t read that particular page of the primary source by writing headlines like “Thaler’s Quest to Get His ‘Autonomous’ AI Legally Recognized Could Upend Copyright Law Forever”2 and “Can AI Invent?”3.

Still, a judge had to dismiss the case, and Judge Beryl Howell was the one to do it. She promptly ruled that only humans can be authors under copyright law, following existing legal precedent dating back to the 19th century. While Dr. Thaler thought, and the media tried to lead us to believe, that the case would be provocative, Judge Howell’s ruling was as simple as the case was insignificant. The primary source – her ruling – is an easy and fun read, especially the part where she points out a past case in which the courts declined to recognize copyright in works supposedly authored by a divine entity rather than a computer.4 With the existing case law being this uncomplicated, the media could simply have told you how copyright works instead of portraying the case as a groundbreaking legal question, but then there would have been no story. And so, without complication, Judge Howell dismissed the case quite forcefully. She writes that AI models “will prompt challenging questions regarding how human input is necessary to qualify the user of an AI system as an ‘author’ of generated work, the scope of protection obtained over the resultant image, how to assess the originality of AI-generated works where the system may have been trained on unknown pre-existing works, how copyright might best be used to incentivize creative works involving AI, and more… This case, however, is not nearly so complex… In the absence of any human involvement in the creation of the work, the clear and straightforward answer is the one given by the [copyright office]: No.” (emphasis mine)

Judge Howell did a better job of identifying the societal issues raised by AI than our computer scientist did, but she also hints at a key feature of our legal system. When she asks how copyright “might best be used to incentivize creative works involving AI”, she is calling attention to the fact that the whole point of copyright is to incentivize creative works. The US Constitution spells this out in the “Copyright Clause”, which states that Congress may promote the progress of arts and sciences by securing to authors certain exclusive rights. What sort of rights? Property rights. The US Constitution loves property rights and property owners – so much so that in its original form it even protected chattel slavery – and it is no surprise that it is also the source of intellectual property rights. After several decades of US Presidents appointing lawyers from Big Corporate Law to the federal bench, the federal courts are now tilted strongly in favour of substantial property owners, by which I mean corporations.5 So while “incentivize creative works involving AI” might sound like a federal judge thinking about you and me using AI in our creative endeavors, what she really means is “secure the rights of corporations over new parts of our culture”.6

Nota bene that there is no chance a federal district court, much less an appeals circuit, would tackle society-shifting questions about AI because a “normal” individual, like our former plaintiff, had a legitimate claim. The courts really cannot be bothered about the harms AI might cause you or your loved ones. Take deepfake porn, for example. If a deranged pervert wants to release your likeness doing something NSFW on the internet, you are unlikely to get a federal judge to order an injunction against the people running the servers that distribute that “content”.7 After all, an image of your likeness is not even your property – the copyright to that photo belongs to the person who took it. And when that deranged pervert uploads it to MetaFace, the rights are probably licensed to that substantial property owner, by which I mean corporation. No, the courts do not care, and even if they did, lawsuits are too expensive and time-consuming for anyone but the landed gentry to pursue.

So if our first case illustrated how the media can create wild headlines and stories when they distract us from the primary source, our second set of headlines shows us just how hard the media will fight to retain control over the primary sources. The New York Times vs Microsoft & OpenAI 8 was a blockbuster, and promises to continue to be so in 2024. For one, it’s one large property holder versus another. For another, the media is the plaintiff, so they have already made sure that you know about the case. You may be forgiven for not reading the nearly 70 pages of the New York Times’ complaint (and the media would love to forgive you for that, too), even though you can get the gist of it from paragraphs 1-9 of the primary source. The Times is trying to uphold its copyright over the results of the work of countless journalists through the decades, and it has observed that Microsoft & OpenAI are creating and distributing derivative works without their copyright-management information, which is a violation of US copyright law. The tech companies, on the other hand, will need to fight for their ability to exploit their copyright over the work of hundreds of machine learning researchers who have been developing AI models. The journalists and researchers themselves are not involved.

I am not a lawyer or a legal analyst, but that won’t stop me from opining on this case. Let’s take the idea that “the law protects substantial property owners, by which we mean corporations” to its maximum and see where it leads. Since these models can produce works that are near-identical copies of the New York Times’ articles, the courts will recognize that The Times is irreparably harmed each time someone uses one of these models. So at some point the courts will order that these models be trained on data that includes copyright-management information, and that their output include the correct copyright-management information for any copyrighted article the output might resemble. The tech companies would have to make these changes before resuming public access to their models. The “productionization” of future generative AI models would thus require significant help from intellectual property rights lawyers to ensure compliance with such an order. The order would prevent future harm to substantial property owners, by which I mean corporations, The Times and tech companies alike. For The Times, it grants the relief they asked for and gives them potential avenues for revenue whenever users use the tech companies’ models; for the tech companies, it insulates them from competition by raising the legal barriers to entry for generative AI. As an added bonus, lawyers will still have plenty of work in the new generative AI economy, as they’ll be needed to ensure that the models behave.
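To make the shape of such an order concrete, here is a minimal sketch, in Python, of what a post-hoc “CMI compliance” filter might look like: the model’s output is compared against a corpus of protected works, and any sufficiently similar output gets the relevant copyright-management information attached. Everything here – the ProtectedWork record, the n-gram overlap test, the 0.2 threshold, the bracketed notice format – is an assumption invented for illustration, not anything the complaint, the courts, or the companies have actually specified.

```python
from dataclasses import dataclass

@dataclass
class ProtectedWork:
    title: str
    cmi: str   # copyright-management information, e.g. "(c) 2023 The New York Times Company"
    text: str

def ngrams(text: str, n: int = 8) -> set:
    """Overlapping word n-grams: a crude proxy for textual similarity."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def attach_cmi(output: str, corpus: list, threshold: float = 0.2) -> str:
    """Prepend a CMI notice for every protected work the output substantially overlaps.

    The threshold and notice format are arbitrary inventions for this sketch.
    """
    out_grams = ngrams(output)
    if not out_grams:
        return output
    notices = []
    for work in corpus:
        shared = out_grams & ngrams(work.text)
        if len(shared) / len(out_grams) >= threshold:
            notices.append(f"[Contains material from {work.title}; {work.cmi}]")
    return "\n".join(notices + [output]) if notices else output
```

A real system would presumably be far more sophisticated – embedding-based similarity over millions of works rather than word n-grams over a list – but the compliance burden the court order creates is the same either way.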

Generative AI models, even with our hypothetical court’s restriction, could still produce copyright-infringing material: the user could simply prompt the model to ignore copyright-management information, or distribute the results without it. In this way we see that AI models turn ordinary computers into weapons of mass copyright violation. They don’t just distribute works; they enable anyone to easily create derivative works. Naturally, the courts will want to protect property owners from such mass violations. The courts may grant injunctive relief by banning tech companies from distributing the models to, or allowing access to these models from, “irresponsible” computers that cannot also restrict what the user does with the output. As these models become more embedded into day-to-day work, most computers will need some sort of “digital AI rights management” software embedded into the operating system to prevent people like you or me from accidentally infringing on the copyright of a substantial property owner, by which I mean a corporation. It would then have to become a new crime to “jailbreak” these computers from their digital restrictions in most circumstances. Obviously there would be some legitimate use-cases for an unrestricted device, like operating systems research or AI research, or even programming in general. Such users would require a license, or else be locked out of generative AI, which would make them economically non-competitive.9
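The weakness that drives this whole chain of escalation is easy to demonstrate. Continuing the hypothetical sketch above (recall that the bracketed notice format was my invention), a user could strip the attached CMI with a single regular expression – exactly the kind of trivial circumvention that would push the enforcement point down into the operating system itself.

```python
import re

def strip_cmi(text: str) -> str:
    """Remove the bracketed CMI notices produced by the hypothetical filter above."""
    return re.sub(r"^\[Contains material from [^\]]*\]\n?", "", text, flags=re.MULTILINE)
```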

All of this would not only protect property owners like the New York Times from mass copyright infringement, but would also protect property owners like Microsoft from competition. And, of course, it would protect consumers from rogue software developers, much like hair-cutting licenses in some US states protect people from bad haircuts by rogue hairdressers. It would realize Judge Howell’s goal of incentivizing creative works involving AI by ensuring that substantial property owners control their works on whatever device those works might be on.

I am trying to paint a dystopian picture, but even as I do I realize that the horse has already bolted and I’m closing the barn doors too late. Consumers are already accustomed to “buying” smartphones they don’t own or control. And thanks to AI-detection software, students are already becoming accustomed to using only software that tracks their edits, so that their universities can verify they aren’t using generative AI. One of the benefits of this world for media companies like the Times, and one of its horrors for me, is that this licensing of computers does not have to stop with copyright and generative AI content. Substantial property owners could also use the same digital-rights software to ensure that the Times is the only source available. After all, as we saw in the first case, they have much more control over the story when you aren’t aware of the primary source material. Indeed, they want to be your primary source. Jeff Bezos didn’t buy the Washington Post, and billionaire Carlos Slim didn’t invest in the New York Times, because they needed more money and thought journalism would bring in the profits. They made those investments because they want to control the medium between you and the primary source. The best thing for them is that you never reach for any other source.

And generative AI might be the best legal excuse to make sure that they are the only source you’ll ever see. So go on, read Judge Howell’s opinion, because it might be the last chance you get.

Notes
