Author: Emmanuel Salami (Doctoral Researcher)
Research group: Law, Technology and Design Thinking
ChatGPT is a state-of-the-art natural language processing model developed by OpenAI. It is a variant of the GPT-3 (Generative Pre-trained Transformer 3) model, which has been trained on a massive amount of text data to generate human-like responses to a given input.[2] ChatGPT relies largely on unsupervised machine-learning techniques: it can generate responses without the underlying algorithm being trained to respond in any particular way.
This notwithstanding, human input is needed to curate the information and thereby guide its output. Furthermore, ChatGPT makes AI readily accessible to the public, thereby bringing to the fore, on a large scale, some critical legal concerns previously expressed about AI systems. This blog post therefore focuses on some of the Intellectual Property Rights (IP, IPR) and data protection law concerns that might arise in the use of ChatGPT.
ChatGPT might raise some exciting IP considerations concerning its output. This is because the datasets used to train the AI system at the machine-learning phase must have been generated from the works of authors who are most probably unaware of that use. Though ChatGPT is proving to be very good at mixing and matching, time will tell whether a copyright action arises should it replicate works attributable to other authors.
A related concern has been raised by Australian artists who accused an art-generating AI system of infringing their artwork.[3] Their rationale is that their artwork was used to train the AI system and that its elements are evident in the AI-generated art. It is arguable that, at some point, similar claims might arise elsewhere, especially in relation to AI-generated literary works such as (non-)fiction books.
One of the schools of thought justifying IPR posits that its purpose includes the incentivisation of authors and the encouragement of innovation.[4] ChatGPT’s (potential) use of copyrighted works without adequate consideration for the incentivisation of authors can hinder this ‘author incentivisation’ objective of IPR. Furthermore, IPR accords all authors moral rights in their works, including the inalienable right to be consistently recognised as the author of the work.[5] ChatGPT’s, and by extension AI’s, (potential) use of copyrighted works threatens this IPR principle. In addition, OpenAI has created an avenue in its terms and conditions for allegedly infringing copyrighted works to be taken down from the platform.[6] However, this neither amounts to a sufficient attempt to incentivise authors nor resolves the IPR concerns identified above.
Although the output of ChatGPT is essentially non-personal and publicly available data, data protection law remains relevant to its use. However, ChatGPT’s processing of (potentially sensitive) personal data does not appear to take the data retention principle into proper consideration. To avoid doubt, the data retention principle (also known as the storage limitation principle) simply requires that personal data should not be kept in a form that permits the identification of natural persons for longer than is necessary for the purpose(s) of processing.[7]
The retention of personal data in ChatGPT raises two concerns under the data retention principle. Firstly, when users delete their account on the platform, all information about the account is deleted and they will be unable to open another account. Users are therefore encouraged to deactivate their accounts instead, which keeps their data available on the platform.[8] The implication is that users might be forced to keep their data on the platform to avoid being prevented from creating an account in future.
Of course, the platform may claim that the legal basis for retaining the account details upon deactivation (instead of deletion) is the user’s consent. However, such consent is invalid because it is not voluntary: the user has no other option.[9] It would equally be frivolous to assert that such retention is justifiable by reference to the performance of a contract, since the retention is not necessary for such performance.[10] Secondly, some categories of data entered into the ChatGPT system cannot be deleted.[11] Although users are advised not to enter sensitive data into the system, this does not resolve the data retention concern, especially because such erroneously entered data will likely remain available for machine-learning purposes.
The scenarios highlighted above also raise some interesting concerns from the perspective of the data minimisation and purpose limitation principles, which cannot be fully addressed within the scope of this blog post. Flowing from the concerns identified above, one can say that the terms and conditions and the privacy policy of ChatGPT are quite superficial and non-transparent and do not sufficiently address them. If ChatGPT is to become a mainstream application, these concerns (and more) must be addressed, particularly in the EU, given its extensive (proposed) legislation regulating personal data and AI. The lack of transparency becomes even more worrisome given how well ChatGPT has been received and the reported intention of some global tech players to incorporate it into their products.[12]
From an epistemic perspective, ChatGPT (like most other AI systems today) only recreates information from existing data and is incapable of creating new knowledge. This is because it lacks the human consciousness needed for knowledge creation. As Zittrain notes, AI systems (including ChatGPT) “don’t uncover causal mechanisms, they are at best statistical correlation engines”, unlike human intelligence, which is needed to investigate problems and their causal effects.[13] ChatGPT’s status as a “statistical correlation engine” is one reason for some of the superficial and wrong answers it has been known to provide. In addition, it cannot discern and verify the validity or correctness of information, although this may also result from the system having been trained on error-prone data. This highlights the risk of importing real-world errors and biases into the realm of AI, resulting in the propagation of misinformation. It is therefore necessary to ensure that some form of human review is mandatory in the use of ChatGPT.
As identified above, ChatGPT has some hurdles to overcome if it is to be adopted without legal and regulatory challenges. It is necessary to carefully consider these issues so that AI adoption does not result in the erosion of user rights. While the frenzy surrounding AI is understandable, developers would do well to sustain this excitement by ensuring that their products comply with applicable laws.
[1] See, for instance, The Next Rembrandt <https://www.nextrembrandt.com/> accessed 6 January 2023.
[2] Shripad Kulkarni, ‘Generative Pre-trained Transformer 3 by OpenAI’ <https://link.medium.com/Rcb57QuWpwb> accessed 8 January 2023.
[3] Cait Kelly, ‘Australian artists accuse popular AI imaging app of stealing content, call for stricter copyright laws’ (The Guardian, 11 December 2022) <https://www.theguardian.com/australia-news/2022/dec/12/australian-artists-accuse-popular-ai-imaging-app-of-stealing-content-call-for-stricter-copyright-laws?CMP=share_btn_link> accessed 8 January 2023.
[4] Annette Kur and Thomas Dreier, European Intellectual Property Law:
Text, Cases and Materials (Edward Elgar Publishing 2013) 5–10.
[5] Art 6bis, Berne Convention.
[6] OpenAI, Terms of Use, paragraph 3(d) <https://openai.com/terms/> accessed 9 January 2023.
[7] Art 5(1)(e) GDPR.
[8] OpenAI, ChatGPT FAQ, paragraph 7 <https://help.openai.com/en/articles/6783457-chatgpt-faq> accessed 9 January 2023.
[9] Art 4(11) and Art 7 GDPR.
[10] Art 6(1)(b) GDPR.
[11] ChatGPT FAQ (n 8) paragraph 8.
[12] Ryan Browne, ‘Microsoft reportedly plans to invest $10 billion in creator of buzzy A.I. tool ChatGPT’ (CNBC, 10 January 2023) <https://www.cnbc.com/2023/01/10/microsoft-to-invest-10-billion-in-chatgpt-creator-openai-report-says.html> accessed 11 January 2023.
[13] Jonathan Zittrain, ‘The Hidden Costs of Automated Thinking’ (The
New Yorker, 23 July 2019) <https://www.newyorker.com/tech/annals-of-technology/the-hidden-costs-of-automated-thinking>
accessed 11 January 2023.