Copyright: Let’s Chat(GPT)
“Copyright of ChatGPT’s output is a complex legal issue, as it can depend on various factors such as the specific text generated, the jurisdiction, and the intended use of the text.”
That is the answer I received when I asked ChatGPT “Who owns the copyright in ChatGPT’s output?”
In fairness, the answer given by the system is not bad in terms of accuracy: copyright is a complex legal issue. However, the answer ChatGPT served to me is not particularly helpful if I am thinking of reproducing it.
ChatGPT generates its answers using a complex language model that is based on deep learning techniques. Its language model has been trained on a large body of data, such as books, articles, and web pages. It is unclear quite where all these data have come from, and ChatGPT is rather cagey about it when asked. (“Great care is taken to select and pre-process the training data to ensure that it is representative of the type of text data that ChatGPT is likely to encounter in real-world applications”, I am told, but the application won’t divulge its sources.)
Absent any indication or attribution of the sources used to compose each of its responses, users of the artificial intelligence (‘AI’) service cannot be sure whether the content they receive includes any copyright materials. Nor can they be certain of the source of any personal data included within ChatGPT’s responses, and so cannot know how those data were collected or whether any lawful basis exists under the GDPR for processing them.
ChatGPT isn’t shy about presenting source materials in full and verbatim either, highlighting the risk that personal data and copyright materials may be present in its output.
Asking ChatGPT “How does A Tale of Two Cities begin?” results in all 120 words of the book’s famous opening sentence being returned in full and as penned by Dickens. (I do not reproduce the sentence here in the interests of brevity, rather than out of copyright concerns. As Charles Dickens died more than 70 years ago, his works are in the public domain and out of copyright.)
ChatGPT has opened a world of possibilities for businesses, and many corporations have already bought in. Shopify, Salesforce and Morgan Stanley are all reportedly exploring its use, or the use of related systems from its creator (OpenAI), for all manner of applications, including the automated handling of customer service enquiries, the composing of correspondence and the development of company and sector insights and reports. However, as exciting as the opportunities may seem, a lack of clarity regarding ChatGPT’s sources and any rights issues that may be engaged by the system’s responses should cause users to pause for thought before they jump in.
The risk of protected inputs resulting in rights-infringing outputs has been highlighted by the legal proceedings recently commenced in the UK and US by stock photography company Getty Images against Stability AI, the company behind the AI-powered art generator Stable Diffusion. Among Getty’s allegations is that Stability AI has copied millions of Getty’s copyrighted images to train its system so as to generate more accurate pictures and content based on the prompts of its users. The focus of Getty’s claims is therefore on the input stage of Stability AI’s system (the data it was trained on), rather than the use of any of the platform’s outputs by its end-users.
However, the input-stage focus of Getty’s claims should not be taken to suggest that users can avoid becoming a litigation target where, for example, the materials used to train an AI system were not provided by them but were instead selected by the system’s developer. A user cannot avoid liability for copyright infringement, or for unlawfully processing personal data, merely because the data were obtained from another source. ChatGPT’s own response to the issue shows how, absent appropriate contractual protections, any contrary conclusion would be misplaced:
“ChatGPT does not provide compensation for any interactions or responses it generates. ChatGPT is an artificial intelligence tool designed to generate responses based on its training data and algorithms, and it does not have the capability to provide financial compensation or any other form of compensation to its users.”
While AI is an enormous business now, and ChatGPT and related technologies are truly revolutionising how companies provide and define their services, the legal consequences of a poorly thought-through deployment could ultimately prove very costly. Disputes such as those between Getty and Stability only underscore the importance of businesses taking legal advice on any AI systems they may be considering and of ensuring they have fully explored and understood the contractual position before they too join the rush.