Jpae 025

This article analyzes whether the introduction of an opt-out possibility in Article 4(3) of the CDSMD adequately balances the interests of rightholders and AI developers. While aiming to encourage innovation, the opt-out mechanism could potentially hinder the advancement of European AI if rightholders' reservations are implemented in an overbroad manner. The upcoming AI Act may allow rightholders to make more effective use of the opt-out to influence licensing deals.


Journal of Intellectual Property Law & Practice, 2024, Vol. 00, No. 00
ARTICLE

The Text and Data Mining Opt-out in Article 4(3) CDSMD: Adequate Veto Right for Rightholders or a Suffocating Blanket for European Artificial Intelligence Innovations?

Gina Maria Ziaja*

Downloaded from https://academic.oup.com/jiplp/advance-article/doi/10.1093/jiplp/jpae025/7614898 by Scuola Superiore Sant'Anna user on 23 March 2024

Abstract
• By introducing Article 4 in Directive 2019/790 (CDSMD), the European Union legislator intended both to encourage innovation and to provide more legal certainty for text and data mining (TDM) activities. That said, it appears that this provision does not strike a fair balance between the interests of rightholders and Artificial Intelligence (AI) developers.
• This article argues that Article 4(3) CDSMD does not necessarily strengthen the rightholders’ position whilst potentially hindering the advancement of AI developments in the European Union. It is unclear how the reservation of rights shall be realized in practice.
• By imposing transparency obligations on AI system providers, the upcoming Artificial Intelligence Act may however allow rightholders to make more effective use of the opt-out mechanism.

The author
• Gina Maria Ziaja is an LL.M. Candidate in European Intellectual Property Law at Stockholm University, Sweden. She graduated from Ludwig-Maximilians University in Munich, Germany, with a focus on IP law and has been working as a legal assistant in the same field for over five years.

I. Introduction
Since chatbots such as ChatGPT were launched in November 2022, the number of applications of easily accessible generative Artificial Intelligence (AI) systems has been increasing on a daily basis. These systems, which produce images, texts and music, are not only changing the way one looks at creativity and art but also act as catalysts for the discussion about the impact of AI on copyright law.[1] Hence, there is a global debate about the extent to which AI systems should be legally regulated. Given the fact that AI systems are trained with pre-existing data, which most often consist of copyright-protected works, unlicensed text and data mining (TDM) activities may result in copyright infringements.[2]

Due to the large amount of data required to train AI systems effectively, concluding licences with individual rightholders may not be feasible. Apart from the fact that the majority of rightholders may be difficult or even impossible to identify due to the lack of a copyright register, the acquisition of the enormous number of licences required could make the training of AI systems de facto impossible. The European Union (EU) decided to introduce two new exceptions that allow the use of protected works for TDM activities under certain conditions in Directive 2019/790[3] on copyright and related rights in the Digital Single Market (CDSMD). Whilst

* Email: [email protected].
1 N Maamar, ‘Urheberrechtliche Fragen beim Einsatz von generativen KI-Systemen’ [Copyright issues in the use of generative AI systems], (2023) Zeitschrift für Urheber- und Medienrecht [Journal for Copyright and Media Law], 481.
2 J Nordemann and J Pukas, ‘Copyright exceptions for AI training data - will there be an international level playing field?’ (2022) 17 Journal of Intellectual Property Law and Practice 973.
3 Directive (EU) 2019/790 of the European Parliament and of the Council of 17 April 2019 on copyright and related rights in the Digital Single Market and amending Directives 96/9/EC and 2001/29/EC (Text with EEA relevance) [2019], (CDSMD).

© The Author(s) 2024. Published by Oxford University Press.


This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

Article 3 CDSMD is limited to TDM activities for scientific research, Article 4 CDSMD covers reproductions and extractions of lawfully accessed works for the purpose of TDM if not expressly reserved by the respective rightholder in an appropriate manner in terms of Article 4(3) CDSMD.

This article analyses whether the possibility of opting out of the use of protected works for TDM activities in terms of Article 4(3) CDSMD serves as an adequate ‘veto right’ for rightholders or as a suffocating blanket for European AI innovations. The influence of the EU’s upcoming Artificial Intelligence Act is also taken into account in this context.

II. Aim of Article 4 CDSMD
To enable machine learning, which is essential for AI systems, robust exceptions for TDM are needed for machines to replicate, store and process training data to propose new solutions.[4] Article 4 CDSMD aims to create a standardized level playing field for system developers across the EU to lawfully conduct TDM activities. The Directive focuses on harmonizing Member States’ legislation through a mandatory TDM exception that shall accelerate innovation by encouraging EU-wide system developing processes.[5] The provision is intended both to encourage innovation and to provide more legal certainty for TDM activities that fall outside of the scope of Article 3 CDSMD.[6]

Recital 18 CDSMD further specifies that Article 4 CDSMD should not unduly affect licensing opportunities for rightholders regarding uses of their work falling outside of the scope of Article 3 CDSMD and the existing exceptions and limitations of the EU’s pre-existing copyright framework.[7] It aims to acknowledge that the use of data to train AI systems should not excessively restrict the rightholder’s right to exploitation. Nevertheless, it is questionable whether the legal framework of the TDM exceptions is useful in practice, given several uncertainties about the cases covered and their requirements.[8]

III. Limiting the scope: opt-out possibility
Opt-out formalities can serve as an instrument to implement relatively broad copyright exceptions and limitations.[9] Nonetheless, it is submitted that Article 4 CDSMD is rather restricted and may not unleash the full capabilities of AI in Europe. Given Article 4(3) CDSMD’s opt-out mechanism, the exception may be overridden by contracts or technical methods. Allowing rightholders to make reservations risks making legislative exceptions subject to private will. Besides, there are still questions surrounding the nature of reservations in a machine-readable format and their potential implementation.[10]

1. Strengthened position of rightholders?
By including the possibility to opt out, rightholders’ position was meant to be strengthened. On the one hand, the reservation of rights might increase rightholders’ bargaining power and lead to licensing deals followed by remuneration from tech companies.[11] Accordingly, this approach could lead to a significant opportunity for rightholders to establish new collective structures in order to exercise their rights against commercial AI developers.[12] On the other hand, it might lead to further market concentration and exploitation of creators. Given the already existing concentration of creative labour markets and the considerable bargaining power wielded by dominant companies, contractual terms requiring artists to waive their ‘training rights’ for reduced compensation may be imposed. This could result in large companies

4 C Geiger, ‘The Missing Goal-Scorers in the Artificial Intelligence Team: Of Big Data, the Fundamental Right to Research and the Failed Text and Data Mining Limitations in the CDSM Directive’ in M Senftleben, J Poort, M van Eechoud et al. (eds) Intellectual Property and Sports, PIJIP/TLS Research Paper Series, No. 66 (Kluwer International 2021 Alphen aan den Rijn) 383, 385f.
5 A Dermawan, ‘Text and data mining exceptions in the development of generative AI models: what the EU member states could learn from the Japanese “nonenjoyment” purposes?’ (2023) The Journal of World Intellectual Property 8.
6 E Rosati Copyright in the Digital Single Market, Article-by-Article Commentary to the Provisions of Directive 2019/790 (OUP 2021 Oxford) 74.
7 Ibid; Recital 18 CDSMD.
8 J Drexl et al., ‘Artificial Intelligence and Intellectual Property Law Position Statement of the Max Planck Institute for Innovation and Competition of 9 April 2021 on the Current Debate’ (2021) Max Planck Institute for Innovation and Competition Research Paper No. 21-10, 7.
9 M Senftleben, ‘How to overcome the normal exploitation obstacle: opt-out formalities, embargo periods, and the international three-step test’ (2014) 1 Berkeley Technology Law Journal Commentaries 13.
10 R Ducato and A Strowel, ‘Limitations to text and data mining and consumer empowerment: making the case for a right to “Machine Legibility”’ (2019) 50 IIC-International Review of Intellectual Property and Competition Law 649, 666; A Strowel and R Ducato, ‘Artificial Intelligence and Text and Data Mining, A Copyright Carol’ in E Rosati (ed.) The Routledge Handbook of EU Copyright Law (Routledge 2021 London), 300.
11 J Quintais ‘Generative AI, Copyright and the AI Act’ (09 May 2023). Available at https://copyrightblog.kluweriplaw.com/2023/05/09/generative-ai-copyright-and-the-ai-act/ (accessed 3 December 2023).
12 P Keller ‘Protecting Creatives or Impeding Progress? Machine Learning and the EU Copyright Framework’ (20 February 2023). Available at https://copyrightblog.kluweriplaw.com/2023/02/20/protecting-creatives-or-impeding-progress-machine-learning-and-the-eu-copyright-framework/ (accessed 28 November 2023); A Strowel, ‘ChatGPT and generative AI tools: theft of intellectual labor?’ (2023) 54 IIC-International Review of Intellectual Property and Competition Law 491, 493.

gaining more power and control over the market, leaving artists with reduced control and compensation in the medium to long term.[13] Therefore, it is unclear whether both the creators’ and the rightholders’ position will actually be improved over time.

2. Blocking the development of AI in the EU?
Despite the rightholders’ legitimate interests, the opt-out mechanism of Article 4(3) CDSMD has been criticized for potentially hampering the development of AI in the EU.[14] The freedom granted to rightholders to reserve, inter alia, their right of reproduction for TDM activities could potentially impede the application of the exception codified in Article 4 CDSMD and ultimately undermine its intended purpose. This may adversely affect the development of AI-based creativity in the EU in practice.[15]

The current EU TDM system appears to be more restrictive compared to other jurisdictions which, for now, rely on fair use clauses or opening models allowing an increased number of system developers to conduct TDM activities. This may place European AI developers and other innovators at a competitive disadvantage in comparison to their counterparts.[16] Nevertheless, there is still ambiguity surrounding the international legal framework and the country-specific requirements regarding the use of training data that have yet to be further clarified. Currently, there are several pending lawsuits concerning the extent to which TDM activities may be permitted without a licence, including in the USA and UK.[17] In order to comply with the three-step test as also expressly referred to in Article 7(2) CDSMD, the entirety of a rightholder’s work may not be used, even if they have not opted out.[18] However, in the case that rightholders do opt out, AI system providers will be forced to obtain licences if the relevant protected data are key to training the developed system. Requiring authorization for training AI algorithms that use large amounts of data constitutes an added cost factor for the system provider in the form of licensing fees and transaction costs. It has been anticipated by a commentator that if these costs become too high, this will have a detrimental effect on the ability of the EU’s AI sector to compete on the global market. In the long run, this could lead to a reduction of the economic potential of licensing copyright-protected content for training. Therefore, the chosen TDM exception may also lead to a lack of revenue for creative sectors and insufficient investment in emerging high-tech products and services, contrary to the CDSMD’s intention.[19] In addition, contrary to the EU’s innovation goals, the CDSMD’s provision might paradoxically support the development of biased AI systems. This might occur by offering the wrong incentives due to accessibility and price conditions for training data. It could be economically attractive to train AI systems on older, biased data or import algorithms already trained on unverifiable data to avoid licence fees.[20]

If the overall goal were to encourage research and, more specifically, to establish a legal framework that fosters innovation, it may have been preferable to impose a right to remuneration on TDM activities used for commercial purposes rather than offering the codified opt-out.[21] Apart from potentially hampering the development of AI systems in Europe, Article 4(3) CDSMD could disadvantage small and medium-sized enterprises. Given that big companies such as OpenAI, Google, Facebook or Amazon are already able to access large amounts of image and language data to train their AI systems, they have a competitive advantage in the field of AI that does not depend on the authorization of individual rightholders. Therefore, bigger companies can utilize their already available datasets to train more advanced AI systems, which can, in turn, bolster their goods and services. This might lead to issues for new market entrants as licensing and ownership of the relevant data might be complex and subject to

13 Quintais, n 11.
14 Dermawan, n 5, 10; E Bonadio and L McDonagh, ‘Artificial intelligence as producer and consumer of copyright works: evaluating the consequences of algorithmic creativity’ (2020) 2 Intellectual Property Quarterly 112.
15 E Rosati, ‘Copyright as an obstacle or an enabler? A European perspective on text and data mining and its role in the development of AI creativity’ (2020) 27 Asia Pacific Law Review 198, 215.
16 J Griffiths et al., ‘Comment of the European Copyright Society addressing selected aspects of the implementation of articles 3 to 7 of Directive (EU) 2019/790 on Copyright in the Digital Single Market’ (2023) 72 GRUR International 22, 29; M Senftleben et al., ‘Ensuring the Visibility and Accessibility of European Creative Content on the World Market - The Need for Copyright Data Improvement in the Light of New Technologies and the Opportunity Arising from Article 17 of the CDSM Directive’ (2022) 13 Journal of Intellectual Property, Information Technology and Electronic Commerce Law 67, para. 11; T Margoni and M Kretschmer, ‘A Deeper Look into the EU Text and Data Mining Exceptions: Harmonisation, Data Ownership, and the Future of Technology’ (2022) 71 GRUR International 685, 700; Geiger, n 4, 9f; J Love ‘We Need Smart Intellectual Property Laws for Artificial intelligence’ (07 August 2023). Available at https://www.scientificamerican.com/article/we-need-smart-intellectual-property-laws-for-artificial-intelligence/ (accessed 3 December 2023); Keller, n 12.
17 P Keller ‘Protecting Creatives or Impeding Progress? Machine Learning and the EU Copyright Framework’ (20 February 2023). Available at https://copyrightblog.kluweriplaw.com/2023/02/20/protecting-creatives-or-impeding-progress-machine-learning-and-the-eu-copyright-framework/ (accessed 17 January 2024).
18 E Rosati, ‘No step-free copyright exceptions: the role of the three-step in defining permitted uses of protected content (including TDM for AI-training purposes)’, (2023) Stockholm Faculty of Law Research Paper Series no. 123, 20.
19 Senftleben / Margoni et al., n 16, para. 12f.
20 Margoni / Kretschmer, n 16, 700.
21 Geiger, n 4, 9f.

copyrights. The costs of developing the necessary datasets from the ground up can be exorbitant, which can hinder smaller companies from competing with established ones.[22]

Therefore, the introduction of Article 4(3) CDSMD into the European copyright legal framework appears to be an obstacle rather than an enabler of innovation.[23]

3. How to opt out
Besides the potential imbalance of interests, the practical feasibility of opting out appears challenging. Article 4(3) CDSMD only contains minor indications as to how the reservation of rights should be exercised. However, opting out only improves the position of rightholders if it is sufficiently clear how and where the rights are to be reserved. Recital 18 CDSMD provides further specification of the act of reserving the relevant rights. A distinction is made based on whether the relevant content has been made publicly available online or not. If the content has been made publicly available online, the reservation is considered appropriate if done by the use of machine-readable means, including terms and conditions of a website and metadata.[24] In other cases, rightholders may consider reserving their rights through contractual agreements or a unilateral declaration. Nonetheless, following the CJEU judgement in VG Bild-Kunst,[25] a commentator has suggested that the provision needs to be interpreted in the sense that rightholders may only reserve their rights by implementing effective technological measures within the meaning of Article 6(1), (3) InfoSoc Directive[26] to guarantee legal certainty and the proper functioning of the internet. Following the reasoning by analogy with the CJEU’s approach in VG Bild-Kunst, it can be inferred that for both publicly available online content and other cases, rightholders may only reserve TDM activities by implementing effective technological measures. Without effective technological measures, it could be challenging for individual users to determine if the respective rightholder meant to reserve the performance of TDM activities concerning their copyrighted works.[27]

That said, it appears difficult to ensure that no legally effective reservation has been made when utilizing large amounts of training materials. It is often the case that AI systems are trained with pre-prepared datasets that are not accompanied by the rightholders’ reservation. However, reservations may still exist regarding the lawfully accessible materials obtained from the original sources for the arrangement of the dataset. Article 4(3) CDSMD does not provide guidance on where a reservation is to be made to be deemed effective. Even if the rightholder opted out in a machine-readable format, the prospect of an effective reservation being made somewhere other than where the material used was acquired remains.[28]

At the moment, there are no generally accepted protocols or standards for the machine-readable expression of opting out. Even if there are various emerging approaches to this issue developed by different players on the market, it is unclear which will be supported by the major AI model providers.[29] This uncertainty complicates the effective reservation of rights by the respective rightholders.

IV. Clarification through the upcoming AI Act?
How do rightholders know that their work is used to train generative AI systems? This seems to be one of the main issues regarding the opt-out approach of Article 4(3) CDSMD. As long as rightholders have no knowledge whether their content is used or is going to be used, the mere possibility of opting out does not serve as a useful instrument to secure copyrights.

In April 2021, the European Commission proposed a regulatory framework for AI. Following this proposal and the EU Council’s adopted proposals of December 2022, the European Parliament published its adopted negotiating position and amendments in June 2023.[30] This upcoming Artificial Intelligence Act might lead the way for rightholders to obtain knowledge that their works are being used to train AI systems.

Even though the draft AI Act does not specifically cover TDM exceptions and is without prejudice to the CDSMD,[31] Article 28b(4)(c) of the Parliament’s proposal

22 N Lucchi, ‘ChatGPT: a case study on copyright challenges for generative artificial intelligence systems’ (2023) European Journal of Risk Regulation 1, 12.
23 Rosati, n 15, 217.
24 Recital 18 CDSMD.
25 VG Bild-Kunst, C-392/19, ECLI:EU:C:2021:181.
26 Directive 2001/29/EC of the European Parliament and of the Council of 22 May 2001 on the harmonization of certain aspects of copyright and related rights in the information society.
27 Rosati, n 6, 89f.
28 J Vesala, ‘Developing Artificial Intelligence-Based Content Creation: Are EU Copyright and Antitrust Law Fit for Purpose?’ (2023) 54 IIC-International Review of Intellectual Property and Competition Law 351, 357.
29 P Keller ‘Generative AI and Copyright: Convergence of Opt-outs?’ (23 November 2023). Available at https://copyrightblog.kluweriplaw.com/2023/11/23/generative-ai-and-copyright-convergence-of-opt-outs/ (accessed 13 December 2023).
30 Parliament, ‘Amendments adopted by the European Parliament on 14 June 2023 on the proposal for a regulation of the European Parliament and of the Council on laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts’, (COM(2021)0206-C9-0146/2021-2021/0106(COD)).
31 Parliament, n 30, Recital 60h.

relates to training data ‘protected under copyright law’.[32] As stated in Recital 28a of the draft, ‘the adverse impact caused by the AI system on the fundamental rights protected by the charter […] include intellectual property rights’.[33]

Among other things, the Parliament’s published amendments included additional obligations for providers of foundation models and generative AI systems.[34] According to the Parliament’s proposal, providers of generative AI systems shall be required to ‘make publicly available a summary disclosing the use of training data protected under copyright law’.[35] The underlying rationale aims at ensuring the collaboration between copyright holders and providers of generative AI systems regarding this new form of exploitation of works.[36] If adopted in the final AI Act, this would give rightholders the opportunity to obtain knowledge of the use of their works to exercise their opt-out in terms of Article 4(3) CDSMD. The obligation is specifically directed at providers of foundation models using generative AI. For this purpose, foundation models are defined as ‘an AI system model that is trained on broad data at scale, is designed for generality of output, and can be adapted to a wide range of distinctive tasks’.[37] Generative AI systems are ‘AI systems specifically intended to generate, with varying levels of autonomy, content such as complex text, images, audio, or video’.[38] The transparency obligation expressly states that ‘providers of foundation models [using generative AI], shall in addition without prejudice to Union or national or Union legislation on copyright, document and make publicly available a sufficiently detailed summary of the use of training data protected under copyright law’.[39]

Therefore, the proposed obligation requires providers of the respective models to publish a comprehensive list of the content protected by copyright that has been used to train their algorithm along with an identification of the relevant rightholders. Thus, the presumption has been made that these obligations have been introduced to support rightholders in effectively exercising their right to opt out from the TDM exception in terms of Article 4(3) CDSMD.[40] The publication of a comprehensive list of works used for TDM activities will enable rightholders to be aware that their works have been used to train specific AI systems.

Nonetheless, given the wording of Article 4(3) CDSMD ‘has not been expressly reserved’, it appears that the opt-out mechanism can, as of now, only be applied ex ante. Contrary to that, the transparency obligation in the AI Act refers to content that has already been used. In order for this transparency obligation to have a noticeable impact on the realization of the opt-out mechanism for rightholders, the opt-out would have to be exercised ex post. Depending on the technologies used, it seems challenging to opt out of the use of one’s work after an AI system has already been trained on it.

That said, tech companies have started developing technologies that make it possible to ‘unlearn’ specific information from large language models. System providers could use such means to remove copyright-infringing training data ex post upon the request of the concerned rightholder.[41] It is therefore technically possible to exercise the opt-out mechanism after obtaining knowledge of the use of one’s protected work. That said, it may be less time- and cost-consuming and more practical to adequately remunerate the rightholder upon publication of the use of the work than to untrain already developed AI systems.

Following Parliament’s proposal, the Spanish presidency of the EU Council further suggested that foundation model providers should demonstrate the implementation of adequate measures to ensure that the training of the model complied with EU copyright law. Following this, the respective developer should set up a system that respects the opt-out decisions of authors and rightholders. Additionally, a sufficiently detailed summary of the content used for training the system and the way the provider handles copyright aspects shall be published based on a template provided by the European Commission.[42] However, it remained unclear what a ‘sufficiently detailed’ documentation of copyright works

32 M Hervey, ‘The EU AI Act and IP’ (26 June 2023). Available at https://www.lexology.com/library/detail.aspx?g=f08f23a7-306d-4afe-bc14-0878aec74b2c (accessed 28 November 2023).
33 Parliament, n 30, Recital 28a.
34 Hervey, n 32.
35 Parliament, n 30, Art 28b(4)(c).
36 C Geiger and V Iaia ‘Generative AI, Digital Constitutionalism and Copyright: Towards a Statutory Remuneration Right grounded in Fundamental Rights – Part 1’ (17 October 2023). Available at https://copyrightblog.kluweriplaw.com/2023/10/17/generative-ai-digital-constitutionalism-and-copyright-towards-a-statutory-remuneration-right-grounded-in-fundamental-rights-part-1/ (accessed 30 November 2023).
37 Parliament, n 30, Art 3(1)(c).
38 Parliament, n 30, Art 28b(4).
39 Ibid.
40 Geiger / Iaia, n 36.
41 B Wodecki ‘AI Models Can Now Selectively “Forget” Data After Training’ (06 October 2023). Available at https://aibusiness.com/ml/ai-models-can-now-selectively-forget-data-after-training#close-modal (accessed 22 January 2024).
42 L Bertuzzi ‘Spanish Presidency Pitches Obligations for Foundation Models in EU’s AI law’ (07 November 2023). Available at https://www.euractiv.com/section/artificial-intelligence/news/spanish-presidency-pitches-obligations-for-foundation-models-in-eus-ai-law/ (accessed 28 November 2023).

would amount to. The obligation appeared to be impossible to comply with if providers of generative AI systems had to list all copyright-protected material used in their training datasets and identify their rightholders. Given the rather low threshold of originality needed to achieve copyright protection, the lack of required registration and the absence of rights ownership data, fulfilling the transparency obligation did not appear to be feasible in practice.[43]

Big AI-developing companies, such as OpenAI, have already stated that they might have to cease operating in the EU if they cannot comply with future regulations as set out in the proposed AI Act.[44] Even if this threat seems rather unrealistic, it demonstrates that even major players are questioning the economic viability of developing their systems in the EU, whilst smaller AI-developing companies most likely face even greater difficulties. Whereas OpenAI arguably possesses the financial resources and technical infrastructure to publish datasets as proposed and would be able to undertake efforts to identify the respective rightholders, such obligations could exclude smaller companies from the market entirely. This might weaken the innovative landscape in regard to AI developments immensely.

Since such requirements likely cannot be met in practice, it appears preferable to frame the transparency obligation as one of best efforts or good faith to document information about how the respective provider handles training data that is protected by copyright.[45] This would give smaller players a chance to implement transparency obligations to a certain degree depending on the company’s size without being treated in the same way as market leaders.

On 9 December 2023, the presidency of the Council and the European Parliament’s negotiators reached a provisional agreement.[46] Following this, on 21 January 2024, a seemingly final version of the AI Act was leaked. Contrary to Parliament’s proposal, the seemingly final

Act, ‘general purpose AI model means an AI model, including when trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable to competently perform a wide range of distinct tasks’.[47] According to Article 52c(1)(d) AI Act, providers of general-purpose AI models ‘shall draw up and make publicly available a sufficiently detailed summary about the content used for training of the general-purpose AI model, according to a template provided by the AI Office’.[48]

An added recital specifies that ‘this summary should be generally comprehensive in its scope instead of technically detailed to facilitate parties with legitimate interests, including copyright holders, to exercise and enforce their rights under Union law, for example by listing the main data collections or sets that went into training the model, such as large private or public databases or data archives, and by providing a narrative explanation about other data sources used’.[49] The AI Office’s template should ‘be simple, effective, and allow the provider to provide the required summary in narrative form’.[50] Contrary to what was initially discussed, system providers do not have to lay open an exhaustive list of all relevant data that has been used for training.

The Union legislature appears to have taken the impact of this transparency obligation into account and clarified that the size of the system provider shall be considered to allow simpler ways of compliance for small and medium enterprises including start-ups. The aim is to not burden smaller players on the market with excessive costs, which could prevent them from developing AI systems.[51] This shall not place a disproportionate burden on smaller companies, which will hopefully be enabled to continue developing their systems in the EU.

Overall, these clarifications have the potential to eliminate most ambiguities regarding the scope of the proposed transparency obligation.[52]

Besides the imposed transparency obligation, model providers shall ‘put in place a policy to respect Union copy-
AI Act refers to general-purpose AI models rather than right law in particular to identify and respect, includ-
foundation models. According to Article 3(1) (44b) AI ing through state of the art technologies, the reservations
of rights expressed pursuant to Article 4(3) of Directive
43 Quintais, n11; Hervey, n32. (EU) 2019/790’.53 This provision explicitly establishes a
44 J Vincent ‘OpenAI Says it could “cease operating” in the EU if it can’t
Comply with Future Regulation’ (25 May 2023). Available at https://www.
link to copyright law, more specifically to the CDSMD’s
theverge.com/2023/5/25/23737116/openai-ai-regulation-eu-ai-act-cease-
operating (accessed 28 November 2023). 47 Art 3(1)(44b) AI Act.
45 Quintais, n 11. 48 Art 52c(1)(d) AI Act.
46 Council ‘Artificial Intelligence Act: Council and Parliament Strike a Deal 49 Recital 60k AI Act.
on the First Rules for AI in the World’ (09 December 2023). Available at
50 Ibid.
https://www.consilium.europa.eu/en/press/press-releases/2023/12/09/
artificial-intelligence-act-council-and-parliament-strike-a-deal-on-the- 51 Recital 60g AI Act.
first-worldwide-rules-for-ai/ (accessed 13 December 2023); Parliament 52 P Keller ‘A First Look at the Copyright Relevant Parts in the Final AI Act
‘Artificial Intelligence Act: Deal on Comprehensive Rules for Trustworthy Compromise’ (11 December 2023). Available at https://copyrightblog.
AI’ (09 December 2023). Available at https://www.europarl.europa.eu/ kluweriplaw.com/2023/12/11/a-first-look-at-the-copyright-relevant-parts-
news/en/press-room/20231206IPR15699/artificial-intelligence-act-deal- in-the-final-ai-act-compromise/#_ftn1 (accessed 22 January 2024).
on-comprehensive-rules-for-trustworthy-ai (accessed 13 December 2023). 53 Art 52c(1)(c) AI Act.
Gina Maria Ziaja ⋅ The text and data mining opt-out in Article 4(3) CDSMD ARTICLE 7

TDM exception and opt-out mechanism of Article 4(3) even the next decades. It remains questionable if the
CDSMD. introduction of the opt-out rule in Article 4(3) CDSMD
Overall, especially by implementing several new actually strengthens the rightsholders’ position given its
recitals, the seemingly final version of the Act appears to legal uncertainties. This will be shown following the prac-

Downloaded from https://academic.oup.com/jiplp/advance-article/doi/10.1093/jiplp/jpae025/7614898 by Scuola Superiore Sant'Anna user on 23 March 2024


establish a closer connection between that and its obli- tical application of the national transpositions of Arti-
gations and the pre-existing copyright framework. The cle 4 CDSM. The proposed transparency obligation of
extent to which the obligations of the AI Act will have the finalized AI Act has the potential to facilitate the
an impact on the balance of interests between AI model opt-out option for rightholders in terms of Article 4(3)
providers and rightholders remains to be seen though. CDSM. Depending on its practical application, this obli-
Nonetheless, the agreement reached appears to be more gation may significantly hinder or block further devel-
innovation-friendly than originally proposed. opments of AI innovations in the EU. Having the fast-
moving technology of AI systems in mind, only time
V. Conclusion will tell how effective and valuable the introduced TDM
exception and the transparency obligation of the AI Act
The overall aim of balancing the rightholders’ interests
will be.
whilst not negatively impacting AI innovation in Europe
remains hard to strike. The practical application of Arti-
cle 4(3) CDSMD on national levels as well as the out- Acknowledgements
come of the finalized AI Act will likely have a big impact The author thanks Professor Eleonora Rosati for her
on the European AI landscape in the next few years or inspiration and feedback in writing this article.

Journal of Intellectual Property Law & Practice, 2024, Vol. 00, No. 00
Article
© The Author(s) 2024. Published by Oxford University Press.
doi: https://doi.org/10.1093/jiplp/jpae025
