ChatGPT has become a prominent topic of discussion as artificial intelligence (AI) grows rapidly more capable of producing sophisticated results on traditionally human tasks. With AI’s growing ability to produce commonly copyrightable content such as art, music, and various forms of text, new questions are arising about the legality of AI-generated content, including whether AI content can be copyrighted, who would own the copyright, and whether models are allowed to train on copyrighted sources.
In a recent episode of the Technology Tangents podcast “ChatGPT vs. the Law: The legal implications of AI,” Credera’s Chief Technology Officer Jason Goth and Chief Data Scientist Vincent Yates bring on guest Adam Floyd, an intellectual property attorney, to discuss this important and thorny topic.
While rules vary by country, the group discusses the key legal issues organizations should begin to understand when AI technology is being used. A summary of their podcast conversation is below.
When AI generates copy or images, who is the author? And is that copy copyrightable? If AI doesn’t own it, who does own the copyright?
The legal concepts of authorship and ownership provide helpful context here. Take a book as an example: the owner of its copyright is often a publisher rather than the author.
Legally, AI can be neither an author nor an owner.
By definition, the terms author and inventor require human creation, and under current statutes a computer program cannot enter into a legal contract for ownership.
Though AI is not currently allowed to own a copyright, there are still questions about whether predominantly AI-generated content can be owned by a user as tools such as ChatGPT and DALL-E become more prevalent. To answer this, it is useful to understand the original purpose of copyright law. Copyright is grounded in the U.S. Constitution and was designed to protect the arts (with patents acting as a counterpart to protect scientific creations).
It’s easy to imagine how allowing copyright on AI-generated content could harm artists. Because computer programs and AI scale so easily, in many fields it may be possible to mass-produce trillions of outputs in an attempt to shut out potential competition and, if the law allowed it, create a blocking patent or copyright.
The podcast compares this to the era of the Human Genome Project, when a company generated enormous numbers of gene sequences in order to potentially require licenses from pharmaceutical companies or other research organizations. Currently in North America, Europe, and Australia, AI is not allowed to be used in this manner to copyright mass-generated content.
Given the potential for AI-enabled mass generation to flood the market and hurt artists’ and creators’ ability to compete, allowing copyright on purely AI-generated content would run against the intention behind copyright law.
Though purely AI-generated content may not be copyrightable, many people have started using AI as a tool to aid their creativity, combining AI generation with human creation to produce unique synthesized results. The next question is whether these types of creations are copyrightable. A connection can be drawn with other creative tools, such as Photoshop, which have faced no issues with being used to develop copyrightable images. Photoshop has even started to include more AI-based features such as object selection, neural filters, and content-aware fill.
If AI-assisted technology is acceptable for copyrighting images created with Photoshop, what makes prompt-based AI generators like ChatGPT and DALL-E different? The closest analogy is asking someone to draw a picture for you. In that situation, the person drawing the picture would be considered the author, and the idea behind the picture is not copyrightable. The artistic or creative expression is the part of the process that can be copyrighted, so a user who simply submits a prompt to AI-generation software wouldn’t be able to claim copyright in the result. Patents work similarly: they protect the utility of an idea and not the original idea. For example, if you are designing a flying car, you can only get a patent on a specific model of a flying car, not the whole concept of cars that fly.
From here, the next step is to find the line where the user has put in enough effort for the result to count as creative expression. For example, some AI technology allows the user to fill areas of an image with base colors corresponding to different landscape objects such as water, trees, and mountains. From just a basic placement of filled-in areas of color, the AI can generate detailed landscapes. Although there is a large gray area where copyrightability is hard to determine, vaguely outlining elements of an image likely isn’t enough to be protected.
While mass-produced, minimal-effort generation has the potential to harm artists’ and creators’ ability to compete, using AI as a tool in creative work could prove greatly beneficial, helping artists and creators express themselves more efficiently and in unique ways.
Determining the amount of creativity needed for a creation to be copyrightable can be challenging. Photography is a helpful analogy. At first glance, a single click of a button doesn’t seem to require much effort from the person taking the photograph.
To explain why photographs are protectable, it helps to go back to their early history. In the late 1800s, courts originally took the view that photographs were not protectable, but the Supreme Court ultimately decided that photographs are copyrightable because of the number of decisions a photographer has to make. Choices such as the composition of subjects, lighting, angle, and what to photograph are all considered part of the creative expression.
Returning to AI-generated content, a very specific and detailed prompt may come closer to the type of creative effort required for copyright. In these scenarios there might be a stronger case for protecting the resulting creations under copyright law.
If creative input is the deciding factor for copyright, another question is whether outputs produced by a carefully designed model could qualify. A lot of creativity goes into creating a model, such as deciding on the hyperparameters, the loss function, and the model architecture. If these choices are creative, would the model designer hold the copyright?
Although this seems like a possible argument, designing a tool does not automatically make everything the tool produces copyrightable by its designer. In machine learning, however, it is possible to build a system aimed at a single result, tuning the parameters and model to produce one specific output. In that case the model itself could be part of the creative expression and could be a means of producing copyrightable content, assuming it is used in a bespoke manner. Whether this scenario is copyrightable is still a gray area, and the argument becomes even more difficult if the output is unknown or if there are numerous outputs.
It can be surprisingly challenging to decide what level of creative effort is required for copyright protection, but a line must be drawn to protect the effort involved in creative expression while still supporting the freedom to work with the new world of AI tools.
Another interesting facet of AI content generation is that many of the large-scale AI generation technologies rely on large datasets consisting of a variety of text or images from the internet. With millions of sources involved, it is nearly impossible for these technologies to avoid pulling from text or images that are already copyrighted.
Although it can be quite difficult to prove that a particular piece of content was used to train an AI model, this remains a potential area of conflict. However, based on how current laws are structured, a model trained on raw numerical input data isn’t breaking any copyright rules. This matters because most models are trained only on numerical datasets, using methods that convert images or text into sets of numbers, which makes a claim even harder to pursue when a model appears to have been trained on copyrighted content.
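To make the idea of turning text or images into sets of numbers concrete, here is a minimal Python sketch. It is an illustration only, not how ChatGPT, DALL-E, or any specific product prepares its data: the function names (`build_vocabulary`, `encode_text`, `encode_image`) and the sample data are hypothetical, and real systems typically use more elaborate tokenizers and image pipelines.

```python
def build_vocabulary(texts):
    """Assign each distinct lowercase word an integer ID, in order of first appearance."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode_text(text, vocab):
    """Replace each known word with its integer ID; unknown words are skipped."""
    return [vocab[word] for word in text.lower().split() if word in vocab]


def encode_image(pixels):
    """Flatten rows of 0-255 grayscale values into a list of floats between 0 and 1."""
    return [value / 255 for row in pixels for value in row]


# Hypothetical sample data, invented purely for illustration.
corpus = ["the cat sat", "the dog sat down"]
vocab = build_vocabulary(corpus)
encoded = [encode_text(text, vocab) for text in corpus]
image_vector = encode_image([[0, 255], [51, 204]])

print(vocab)
print(encoded)
print(image_vector)
```

After this step, the model only ever sees lists of numbers. Whether that changes the legal analysis is the question discussed above; the sketch shows the mechanics only.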
As business leaders look to navigate the new waters of the AI-generated content landscape, here are a few key takeaways to keep in mind:
Ideas aren’t copyrightable, but their execution through human creativity can be.
AI can be a valuable tool in generating creative content, but may be challenging to protect through copyright. This is important to note in situations that require copyrighted marketing material.
AI-assisted content is more of a gray area where copyright laws could go either way on a case-by-case basis, depending on how much human creative input went into the project.
Based on how current laws are structured, training machine learning models on raw numerical input data does not in itself break copyright rules, though training data remains a potential area of conflict.
As generating creative content with artificial intelligence becomes more capable and more accessible, questions around the legality of AI-generated content are only beginning. Although it will be a challenge to draw a clear line between copyrightable and noncopyrightable content created with the help of AI, understanding the history, definition, and relevant examples of copyright makes it easier to see how current copyright laws may influence current and future AI technologies.
At Credera, we help organizations unlock the benefits of AI and empower them to achieve more. If you are interested in learning more about navigating the legal issues surrounding AI or discovering the right AI strategy for your company, reach out to us at [email protected].