Chabon v. Meta


Case 3:23-cv-04663 Document 1 Filed 09/12/23 Page 1 of 18

DANIEL J. MULLER, SBN 193396


1 [email protected]
2 VENTURA HERSEY & MULLER, LLP
1506 Hamilton Avenue
3 San Jose, California 95125
Telephone: (408) 512-3022
4 Facsimile: (408) 512-3023
5
Attorneys for Plaintiffs and the Class
6

8 UNITED STATES DISTRICT COURT

9 NORTHERN DISTRICT OF CALIFORNIA – SAN FRANCISCO DIVISION

10
MICHAEL CHABON, DAVID HENRY Case No.
11 HWANG, MATTHEW KLAM, RACHEL
LOUISE SNYDER, AND AYELET CLASS ACTION COMPLAINT
12 WALDMAN,
13 individually and on behalf of all others CLASS ACTION
similarly situated,
14
Plaintiffs,
15
v. JURY TRIAL DEMANDED
16
META PLATFORMS, INC., a Delaware
17 Corporation,
18 Defendant.

CLASS ACTION COMPLAINT



1 Plaintiffs Michael Chabon, David Henry Hwang, Matthew Klam, Rachel Louise Snyder,

2 and Ayelet Waldman (“Plaintiffs”), individually and on behalf of all others similarly situated,

3 bring this class action against Defendant Meta Platforms, Inc. Plaintiffs allege as follows based

4 upon personal knowledge as to themselves and their own acts, and upon information and belief

5 as to all other matters:

6 NATURE OF ACTION

7 1. This is a class action lawsuit brought by Plaintiffs on behalf of themselves and a

8 Class of authors holding copyrights in their published works arising from Meta’s clear

9 infringement of their intellectual property.

10 2. Meta’s LLaMA (Large Language Model Meta AI) is a set of large language

11 models created and maintained by Meta Platforms, Inc. A large language model is an AI

12 software program designed to produce convincingly natural text outputs in response to user

13 prompts.

14 3. Rather than being programmed in the traditional manner, a large language model

15 is “trained” by copying massive amounts of text and extracting expressive information from it.

16 The body of text is referred to as the training dataset.

17 4. A large language model’s output is therefore entirely and uniquely

18 reliant on the material in its training dataset. Every time it assembles a text output, the model

19 relies on the information it extracted from its training dataset. Therefore, the decisions about the

20 textual information it includes in the training dataset are deliberate and important choices.

21 5. Plaintiffs and Class members are authors of books, screenplays, novels, and other

22 written works. Plaintiffs and Class members possess copyrights for the books and written works

23 they created and published. Plaintiffs and Class members did not consent to the use of their

24 copyrighted books as training materials for LLaMA.

25 6. Nevertheless, their copyright-protected works were copied and ingested as part

26 of training LLaMA. Plaintiffs’ copyrighted books appear in the dataset that Meta has admitted

27 to using to train LLaMA.


1 7. A large language model’s responses to user prompts or queries are entirely and

2 uniquely dependent on the text contained in its training dataset, necessarily processing and

3 analyzing the information contained in its training dataset to generate responses.

4 JURISDICTION AND VENUE

5 8. This Court has subject matter jurisdiction of this action pursuant to 28 U.S.C. §

6 1331 because this case arises under the Copyright Act (17 U.S.C. § 501) and the Digital

7 Millennium Copyright Act (17 U.S.C. § 1202).

8 9. This Court has personal jurisdiction over Defendant pursuant to 18 U.S.C.

9 §§ 1965(b) & (d), because it maintains its principal place of business in, and is thus a

10 resident of, this judicial district, maintains minimum contacts with the United States, this judicial

11 district, and this State, and intentionally avails itself of the laws of the United States

12 and this State by conducting a substantial amount of business in California. For these same

13 reasons, venue properly lies in this District pursuant to 28 U.S.C. §§ 1391(a), (b) and (c).

14 PARTIES

15 A. Plaintiffs

16 10. Plaintiff Michael Chabon (“Plaintiff Chabon”) is a resident of California.

17 Plaintiff Chabon is an author who owns registered copyrights in several works, including but

18 not limited to, The Mysteries of Pittsburgh, Wonder Boys, The Amazing Adventures of Kavalier

19 & Clay, The Yiddish Policemen’s Union, Gentlemen of the Road, Telegraph Avenue, and

20 Moonglow. Plaintiff Chabon is the recipient of the Pulitzer Prize for Fiction, Hugo, Nebula, Los

21 Angeles Times Book Prize, and the National Jewish Book Award, among many other awards

22 received during the span of a writing career of more than 30 years. Plaintiff Chabon’s works

23 include copyright-management information that provides information about the copyrighted

24 work, including the title of the work, its ISBN or copyright registration number, the name of the

25 author, and the year of publication.

26 11. Plaintiff David Henry Hwang (“Plaintiff Hwang”) is a resident of New York.

27 Plaintiff Hwang is a playwright and screenwriter who owns registered copyrights in several

28 plays, including but not limited to, M. Butterfly, Chinglish, Yellow Face, Golden Child, The

1 Dance and the Railroad, and FOB, as well as the Broadway musicals Aida, Flower Drum Song

2 (2002 revival) and Disney’s Tarzan. Plaintiff Hwang is a Tony Award winner and three-time

3 nominee, a Grammy Award winner and two-time nominee, a three-time OBIE Award winner,

4 and a three-time finalist for the Pulitzer Prize in Drama. Plaintiff Hwang’s works include

5 copyright-management information that provides information about the copyrighted work,

6 including the title of the work, its ISBN or copyright registration number, the name of the author,

7 and the year of publication.

8 12. Plaintiff Matthew Klam (“Plaintiff Klam”) is a resident of Washington, D.C.

9 Plaintiff Klam is an author who owns registered copyrights in several works, including but not

10 limited to, Who is Rich?, and Sam the Cat and Other Stories. Plaintiff Klam is a recipient of a

11 Guggenheim Fellowship, a Robert Bingham/PEN Award, a Whiting Writer’s Award, and a

12 National Endowment for the Arts. Plaintiff Klam’s works have been selected as Notable Books

13 of the year by The New York Times, The Los Angeles Times, the Kansas City Star, and the

14 Washington Post. His work has appeared in The New York Times, The New Yorker, Harper’s

15 Magazine, and elsewhere. Plaintiff Klam’s works include copyright-management information

16 that provides information about the copyrighted work, including the title of the work, its ISBN

17 or copyright registration number, the name of the author, and the year of publication.

18 13. Plaintiff Rachel Louise Snyder (“Plaintiff Snyder”) is a resident of Washington,

19 D.C. Plaintiff Snyder is an author who owns registered copyrights in several works, including

20 but not limited to, Women We Buried, Women We Burned, No Visible Bruises – What We Don’t

21 Know About Domestic Violence Can Kill Us, What We’ve Lost is Nothing, and Fugitive Denim:

22 A Moving Story of People and Pants in the Borderless World of Global Trade. Plaintiff Snyder

23 is the recipient of the J. Anthony Lukas Work-in-Progress Award, the Hillman Prize, and the

24 Helen Bernstein Book Award, and finalist for the National Book Critics Circle Award, Los

25 Angeles Times Book Prize, and Kirkus Award. Her work has appeared in The New

26 Yorker, The New York Times, Slate, and elsewhere. Plaintiff Snyder’s works include copyright-

27 management information that provides information about the copyrighted work, including the


1 title of the work, its ISBN or copyright registration number, the name of the author, and the year

2 of publication.

3 14. Plaintiff Ayelet Waldman (“Plaintiff Waldman”) is a resident of California.

4 Plaintiff Waldman is an author and screen and television writer who owns registered copyrights

5 in several works, including but not limited to, Love and Other Impossible Pursuits, Red Hook

6 Road, Love and Treasure, Bad Mother, Daughter’s Keeper, A Really Good Day, and Mommy

7 Track Mysteries. Plaintiff Waldman has been nominated for an Emmy and Golden Globe and is

8 the recipient of numerous awards including a Peabody, AFI award, and a PEN Award, among

9 others. Plaintiff Waldman’s works include copyright-management information that provides

10 information about the copyrighted work, including the title of the work, its ISBN or copyright

11 registration number, the name of the author, and the year of publication.

12 15. At all times relevant hereto, Plaintiffs have been and remain the holders of the

13 exclusive rights under the Copyright Act of 1976 (17 U.S.C. §§ 101, et seq. and all amendments

14 thereto) to reproduce, distribute, display, or license the reproduction, distribution, and/or display

15 of the works identified in paragraphs 10-14, supra.

16 B. Defendant

17 16. Defendant Meta is a Delaware corporation with its principal place of business at

18 1601 Willow Road, Menlo Park, California 94025.

19 AGENTS AND CO-CONSPIRATORS


20
17. The unlawful acts alleged against Meta in this class action complaint were
21
authorized, ordered, or performed by the Defendant’s respective officers, agents, employees,
22
representatives, or shareholders while actively engaged in the management, direction, or control
23
of the Defendant’s businesses or affairs. The Defendant’s agents operated under the explicit and
24
apparent authority of their principals. Each Defendant, and its subsidiaries, affiliates, and agents
25
operated as a single unified entity.
26
18. Various persons and/or firms not named as Defendants may have participated as
27
co-conspirators in the violations alleged herein and may have performed acts and made

1 statements in furtherance thereof. Each acted as the principal, agent or joint venture of, or for

2 other Defendants with respect to the acts, violations, and common course of conduct alleged

3 herein.

4 FACTUAL ALLEGATIONS

5 A. Meta Platforms’ Artificial Intelligence Products

6 19. Meta creates, markets, and sells software and hardware technology products,

7 including Facebook, Instagram, and Horizon Worlds. Meta also has a large artificial-intelligence

8 group called Meta AI that creates and distributes artificial-intelligence software products.

9 20. AI software is designed to algorithmically simulate human reasoning or

10 inference, often based upon statistical models or methods.

11 21. In February 2023, Meta released an AI product called LLaMA. LLaMA is a set

12 of large language models. A large language model (or “LLM” for short) is AI software designed

13 to parse and emit natural language. Though a large language model is a software program, it is

14 not created the way most software programs are—that is, by human software engineers writing

15 code. Rather, a large language model is “trained” by copying massive amounts of text from

16 various sources and feeding these copies into the model. This corpus of input material is called

17 the training dataset. During training, the large language model copies each piece of text in the

18 training dataset and extracts expressive information from it. The large language model

19 progressively adjusts its output to more closely resemble the sequences of words copied from

20 the training dataset. Once the large language model copies and ingests all of this text, it is

21 able to generate and produce convincing simulations of natural written language as it appears in

22 the training dataset.

23 22. Much of the material in Meta’s training dataset, however, comes from

24 copyrighted works—including works written by Plaintiffs—that were copied by Meta without

25 consent, without credit, and without compensation.

26 23. Plaintiffs’ published written works contain certain copyright management

27 information. This information includes the written work’s title, the ISBN number or copyright

28 number, the author’s name, the copyright holder’s name, and terms and conditions of use.

1 24. Meta introduced LLaMA in a paper called “LLaMA: Open and Efficient

2 Foundation Language Models”. In the paper, Meta describes the LLaMA training dataset as “a

3 large quantity of textual data” that was chosen because it was “publicly available, and

4 compatible with open sourcing.”

5 25. Open sourcing refers to putting data under a permissive style of copyright license

6 called an open-source license. Copyrighted materials, however, are not ordinarily “compatible

7 with open sourcing” unless and until the copyright owner first places the material under an open-

8 source license, thereby enabling others to do so later.

9 26. In a table describing the composition of the LLaMA training dataset, Meta notes

10 that 85 gigabytes of the training data comes from a category called “Books.” Meta further

11 elaborates that “Books” comprises the text of books from two internet sources: (1) Project

12 Gutenberg, an online archive of approximately 70,000 books that are out of copyright, and (2)

13 “the Books3 section of ThePile . . . a publicly available dataset for training large language

14 models.” Meta’s paper on LLaMA does not further describe the contents of Books3 or ThePile.

21 28. But that information is available elsewhere. ThePile is a dataset assembled by a

22 research organization called EleutherAI. In December 2020, EleutherAI introduced this dataset

23 in a paper called “The Pile: An 800GB Dataset of Diverse Text for Language Modeling”.

24 29. The EleutherAI paper reveals that the Books3 dataset comprises 108 gigabytes

25 of data, or approximately 12% of the dataset, making it the third largest component of The Pile

26 by size.

27 30. The EleutherAI paper describes the contents of Books3:

Books3 is a dataset of books derived from a copy of the contents of the
Bibliotik private tracker … Bibliotik consists of a mix of fiction and
nonfiction books and is almost an order of magnitude larger than our next
largest book dataset (BookCorpus2). We included Bibliotik because
books are invaluable for long-range context modeling research and
coherent storytelling.

4 31. Bibliotik is one of a number of notorious “shadow library” websites that also

5 include Library Genesis (aka LibGen), Z-Library (aka B-ok), and Sci-Hub. The books and other

6 materials aggregated by these websites have also been available in bulk via torrent systems.

7 These shadow libraries have long been of interest to the AI-training community because of the

8 large quantity of copyrighted material they contain. For that reason, these shadow libraries are

9 also flagrantly illegal.

10 32. The person who assembled the Books3 dataset has confirmed in public

11 statements that it represents “all of Bibliotik” and contains 196,640 books. EleutherAI currently

12 distributes copies of Books3 through its website (https://pile.eleuther.ai/).

13 33. The Books3 dataset is also available from a popular AI project hosting service

14 called Hugging Face (https://huggingface.co/datasets/the_pile_books3).

15 34. Many of Plaintiffs’ written works appear in the Books3 dataset; these written

16 works are referred to herein as the Infringed Works.

17 35. For example, Books3 contains a significant amount of Plaintiff Chabon’s works,

18 including, but not limited to, The Final Solution, Bookends: Collected Intros and Outros,

19 Kingdom of Olives and Ash, Manhood for Amateurs: The Pleasures and Regrets of a Husband,

20 Father, and Son, Maps and Legends, McSweeney’s Mammoth Treasury of Thrilling Tales,

21 Werewolves in Their Youth, Michael Chabon’s America: Magical Words, Secret Worlds, and

22 Sacred Spaces, Moonglow, Pops: Fatherhood in Pieces, The Amazing Adventures of Kavalier &

23 Clay, and The Yiddish Policemen’s Union.

24 36. Books3 similarly contains Plaintiff Hwang’s written works, including, but not

25 limited to, Golden Child, M. Butterfly, and Trying to Find Chinatown.

26 37. Plaintiff Klam’s works are similarly found in the Books3 dataset, including, but

27 not limited to, Who is Rich? and Sam the Cat.


1 38. Plaintiff Snyder’s works also are contained in the Books3 dataset, including, but

2 not limited to, No Visible Bruises: What We Don’t Know About Domestic Violence Can Kill Us.

3 39. In the same vein, Plaintiff Waldman’s works appear in the Books3 dataset,

4 including, but not limited to, A Really Good Day, Bad Mother, Love and Other Impossible

5 Pursuits, and Love and Treasure.

6 40. Since the launch of the LLaMA language models in February 2023, Meta has

7 made these models selectively available to organizations that request access, saying:

To maintain integrity and prevent misuse, we are releasing our model
under a noncommercial license focused on research use cases. Access to
the model will be granted on a case-by-case basis to academic
researchers; those affiliated with organizations in government, civil
society, and academia; and industry research laboratories around the
world. People interested in applying for access can find the link to the
application in our research paper.

13 41. Meta has not disclosed what criteria it uses to decide who is eligible to receive
14 the LLaMA language models, nor who has actually received them, or whether Meta has in fact
15 adhered to its stated criteria. On information and belief, Meta has in fact distributed the LLaMA
16 models to certain people and entities, continues to do so, and has benefited financially from
17 these acts.
18 42. In March 2023, the LLaMA language models were leaked to a public internet site
19 and have continued to circulate. Meta has not disclosed what role it had, if any, in the leak.
20 43. Later in March 2023, Meta issued a DMCA takedown notice to a programmer on
21 GitHub who had released a tool that helped users download the leaked LLaMA language models.
22 In the notice, Meta asserted copyright over the LLaMA language models.
23 44. According to reporting in June 2023, Meta plans to make the next version of
24 LLaMA commercially available.
25 CLASS ALLEGATIONS
26 45. Plaintiffs bring this action pursuant to the provisions of Rules 23(a), 23(b)(2),

27 and 23(b)(3) of the Federal Rules of Civil Procedure, on behalf of themselves and the following

28 proposed Class:
All persons or entities in the United States that own a United States copyright in
any work that was used as training data for the LLaMA language models during
the Class Period.
44. Excluded from the Class are Defendant, its employees, officers, directors, legal
3
representatives, heirs, successors, wholly- or partly-owned, and its subsidiaries and affiliates;
4
proposed Class counsel and their employees; the judicial officers and associated court staff
5
assigned to this case and their immediate family members; all persons who make a timely
6
election to be excluded from the Class; governmental entities; and the judge to whom this case
7
is assigned and his/her immediate family.
8
45. This action has been brought and may be properly maintained on behalf of the
9
Class proposed herein under Federal Rule of Civil Procedure 23.
10
46. Numerosity. Federal Rule of Civil Procedure 23(a)(1): The members of the Class
11
are so numerous and geographically dispersed that individual joinder of all Class members is
12
impracticable. On information and belief, there are at least tens of thousands of members in the
13
Class. The Class members may be easily derived from Defendant’s records.
14
47. Commonality and Predominance. Federal Rule of Civil Procedure 23(a)(2) and
15
23(b)(3): This action involves common questions of law and fact, which predominate over any
16
questions affecting individual Class members, including, without limitation:
17
a. Whether Defendant violated the copyrights of Plaintiffs and the Class when it
18
downloaded copies of Plaintiffs’ and the Class’s Infringed Works and used them
19
to train the LLaMA language models;
20
b. Whether the LLaMA language models are themselves infringing derivative
21
works based on Plaintiffs’ and the Class’s Infringed Works;
22
c. Whether the text outputs of the LLaMA language models are infringing
23
derivative works based on Plaintiffs’ Infringed Works;
24
d. Whether Defendant violated the DMCA by removing copyright-management
25
information from Plaintiffs’ and the Class’s Infringed Works;
26
e. Whether Defendant was unjustly enriched by the unlawful conduct alleged
27
herein;
28

1 f. Whether Defendant’s conduct alleged herein constitutes Unfair Competition under

2 California Business and Professions Code Section 17200 et seq.

3 g. Whether Defendant’s conduct alleged herein constitutes common law unfair

4 competition;

5 h. Whether any affirmative defense excuses Defendant’s conduct;

6 i. Whether any statute of limitations limits Plaintiffs’ and the Class’s potential for

7 recovery;

8 j. Whether Plaintiffs and the other Class members are entitled to equitable relief,

9 including, but not limited to, restitution or injunctive relief; and

10 k. Whether Plaintiffs and the other Class members are entitled to damages and other

11 monetary relief and, if so, in what amount.

12 48. Typicality. Federal Rule of Civil Procedure 23(a)(3): Plaintiffs’ claims are

13 typical of the other Class members’ claims because, among other things, all Class members were

14 comparably injured through Defendant’s wrongful conduct as described above.

15 49. Adequacy. Federal Rule of Civil Procedure 23(a)(4): Plaintiffs are adequate

16 Class representatives because their interests do not conflict with the interests of the other

17 members of the Class they seek to represent; Plaintiffs have retained counsel competent and

18 experienced in complex class action litigation; and Plaintiffs intend to prosecute this action

19 vigorously. The interests of the Class will be fairly and adequately protected by Plaintiffs and

20 their counsel.

21 50. Declaratory and Injunctive Relief. Federal Rule of Civil Procedure 23(b)(2):

22 Defendant has acted or refused to act on grounds generally applicable to Plaintiffs and the

23 other members of the Class, thereby making appropriate final injunctive relief and declaratory

24 relief with respect to the Class as a whole.

25 51. Superiority. Federal Rule of Civil Procedure 23(b)(3): A class action is superior

26 to any other available means for the fair and efficient adjudication of this controversy, and no

27 unusual difficulties are likely to be encountered in the management of this class action. The

28 damages or other financial detriment suffered by Plaintiffs and the other Class members are

1 relatively small compared to the burden and expense that would be required to individually

2 litigate their claims against Defendant, so it would be impracticable for the members of the

3 Class to individually seek redress for Defendant’s wrongful conduct. Even if Class members

4 could afford individual litigation, the court system could not. Individualized litigation creates a

5 potential for inconsistent or contradictory judgments, and increases the delay and expense to all

6 parties and the court system. By contrast, the class action device presents far fewer management

7 difficulties, and provides the benefits of single adjudication, economy of scale, and

8 comprehensive supervision by a single court.

9 CAUSES OF ACTION

10 FIRST CAUSE OF ACTION

11 DIRECT COPYRIGHT INFRINGEMENT,


17 U.S.C. § 106, et seq.
12
52. Plaintiffs hereby incorporate by reference the allegations contained in the
13
preceding paragraphs of this Complaint.
14
53. Plaintiffs bring this claim on behalf of themselves and on behalf of the Class
15
against Defendant.
16
54. As the owners of the registered copyrights in the Infringed Works, Plaintiffs
17
hold the exclusive rights to those books under 17 U.S.C. § 106.
18
55. To train the LLaMA language models, Meta copied the Books3 dataset, which
19
includes the Infringed Works.
20
56. Plaintiffs never authorized Meta to make copies of their Infringed Works, make
21
derivative works, publicly display copies (or derivative works), or distribute copies (or
22
derivative works). All those rights belong exclusively to Plaintiffs under copyright law.
23
57. Meta made copies of the Infringed Works during the training process of the
24
LLaMA language models without Plaintiffs’ permission.
25
58. Because the LLaMA language models cannot function without the expressive
26
information extracted from Infringed Works and retained inside the LLaMA language models,
27


1 these LLaMA language models are themselves infringing derivative works, made without

2 Plaintiffs’ permission and in violation of their exclusive rights under the Copyright Act.

3 59. Plaintiffs and the Class have been injured by Meta’s acts of direct copyright

4 infringement. Plaintiffs and the Class are entitled to statutory damages, actual damages,

5 restitution of profits, and other remedies provided by law.

6 SECOND CAUSE OF ACTION

7 VICARIOUS COPYRIGHT INFRINGEMENT


8 17 U.S.C. § 106
60. Plaintiffs incorporate by reference all allegations of the preceding paragraphs as
9
though fully set forth herein.
10
61. Plaintiffs bring this claim on behalf of themselves and on behalf of the Class against
11
Defendant.
12
62. Because the output of the LLaMA language models is based on expressive
13
information extracted from Plaintiffs’ Infringed Works, every output of the LLaMA language
14
models is an infringing derivative work, made without Plaintiffs’ permission and in violation of
15
their exclusive rights under the Copyright Act.
16
63. Meta has the right and ability to control the output of the LLaMA language
17
models. Meta has benefited financially from the infringing output of the LLaMA language
18
models. Therefore, every output from the LLaMA language models constitutes an act of
19
vicarious copyright infringement.
20
64. Plaintiffs and the Class have been injured by Meta’s acts of vicarious copyright
21
infringement. Plaintiffs and the Class are entitled to statutory damages, actual damages,
22
restitution of profits, and other remedies provided by law.
23

24
THIRD CAUSE OF ACTION
25

26 DIGITAL MILLENNIUM COPYRIGHT ACT – REMOVAL OF COPYRIGHT


MANAGEMENT INFORMATION
27 17 U.S.C. § 1202(b)
28 65. Plaintiffs incorporate by reference all allegations of the preceding paragraphs as

1 though fully set forth herein.

2 66. Plaintiffs bring this claim on behalf of themselves and on behalf of the Class against

3 Defendant.

4 67. Plaintiffs included one or more forms of copyright-management information

5 (“CMI”) in each of the Infringed Works, including: copyright notice, title and other identifying

6 information, or the name or other identifying information about the owners of each book, terms

7 and conditions of use, and identifying numbers or symbols referring to CMI.

8 68. Without the authority of Plaintiffs and the Class, Meta copied the Infringed

9 Works and used them as training data for the LLaMA language models. By design, the training

10 process does not preserve any CMI. Therefore, Meta intentionally removed CMI from the

11 Infringed Works in violation of 17 U.S.C. § 1202(b)(1).

12 69. Without the authority of Plaintiffs and the Class, Defendant created derivative

13 works based on the Infringed Works. By distributing these works without their CMI, Meta

14 violated 17 U.S.C. § 1202(b)(3).

15 70. By falsely claiming that it has sole copyright in the LLaMA language models—

16 which it cannot, because the LLaMA language models are infringing derivative works—Meta

17 violated 17 U.S.C. § 1202(a)(1).

18 71. Meta knew or had reasonable grounds to know that this removal of CMI would

19 facilitate copyright infringement by concealing the fact that every output from the LLaMA

20 language models is an infringing derivative work, synthesized entirely from expressive

21 information found in the training data.

22 72. Plaintiffs and the Class have been injured by Meta’s removal of CMI. Plaintiffs

23 and the Class are entitled to statutory damages, actual damages, restitution of profits, and other

24 remedies provided by law.


1 FOURTH CAUSE OF ACTION

2 VIOLATIONS OF THE CALIFORNIA UNFAIR COMPETITION LAW


CAL. BUS. & PROF. CODE §§ 17200, ET SEQ.
3
73. Plaintiffs and the Class incorporate by reference each preceding and succeeding
4
paragraph as though fully set forth at length herein.
5
74. Plaintiffs bring this claim on behalf of themselves and on behalf of the Class against
6
Defendant.
7
75. Defendant has engaged in unlawful business practices, including violating
8
Plaintiffs’ and the Class’s rights under the DMCA, and using the Infringed Works to train
9
LLaMA without Plaintiffs’ or the Class’s authorization.
10
76. The unlawful business practices described herein violate California Business and
11
Professions Code section 17200 et seq. because that conduct is otherwise unlawful by violating
12
the DMCA.
13
77. The unlawful business practices described herein violate California Business and
14
Professions Code section 17200 et seq. because they are unfair, immoral, unethical, oppressive,
15
unscrupulous or injurious to consumers, because, among other reasons, Defendant used
16
Plaintiffs’ protected works to train LLaMA for Defendant’s own gain without Plaintiffs’ and the
17
Class’s authorization.
18
78. The unlawful business practices described herein violate California Business and
19
Professions Code section 17200 et seq. as fraudulent because consumers are likely to be
20
deceived because, among other reasons, Meta caused LLaMA’s output to be emitted without
21
any credit to Plaintiffs or the Class whose Infringed Works comprise LLaMA’s training dataset.
22
79. Plaintiffs and the Class have been injured by Meta’s removal of CMI. Plaintiffs
23
and the Class are entitled to statutory damages, actual damages, restitution of profits, and other
24
remedies provided by law.
FIFTH CAUSE OF ACTION

NEGLIGENCE

80. Plaintiffs incorporate by reference the allegations of all foregoing paragraphs as if they had been set forth in full herein.

81. Plaintiffs bring this claim on behalf of themselves and on behalf of the Class against Defendant.

82. Defendant owed a duty of care toward Plaintiffs and the Class based upon Defendant’s relationship to them. This duty is based upon Defendant’s obligations, custom and practice, right to control information in its possession, exercise of control over the information in its possession, authority to control the information in its possession, and the commission of affirmative acts that result in said harms and losses. Additionally, this duty is based on the requirements of California Civil Code section 1714, requiring all “persons,” including Defendant, to act in a reasonable manner toward others.

83. Defendant breached its duties by negligently, carelessly, and recklessly collecting, maintaining and controlling Plaintiffs’ and Class members’ Infringed Works and engineering, designing, maintaining and controlling systems, including LLaMA, which are trained on Plaintiffs’ and Class members’ Infringed Works without their authorization.

84. Defendant owed Plaintiffs and Class members a duty of care to maintain the Infringed Works once collected and ingested for training LLaMA.

85. Defendant also owed Plaintiffs and Class members a duty of care to not use the Infringed Works in a way that would foreseeably cause Plaintiffs and Class members injury, for instance, by using the Infringed Works to train LLaMA.

86. Defendant breached its duties by, inter alia, using the Infringed Works to train LLaMA.

SIXTH CAUSE OF ACTION

UNJUST ENRICHMENT

87. Plaintiffs incorporate by reference all allegations of the preceding paragraphs as though fully set forth herein.

88. Plaintiffs and the Class have invested substantial time and energy in creating the Infringed Works.


89. Defendant has unjustly utilized access to the Infringed Materials to train LLaMA.

90. Plaintiffs did not consent to the unauthorized use of the Infringed Materials to train LLaMA.

91. By using Plaintiffs’ Infringed Works to train LLaMA, Defendant deprived Plaintiffs and the Class of the benefits of their work, including monetary damages.

92. Defendant derived or intends to derive profit and other benefits from the use of the Infringed Materials to train LLaMA.

93. It would be unjust for Defendant to retain those benefits.

94. The conduct of Defendant is causing and, unless enjoined and restrained by this Court, will continue to cause Plaintiffs and the Class great and irreparable injury that cannot fully be compensated or measured in money.

REQUEST FOR RELIEF

WHEREFORE, Plaintiffs, individually and on behalf of members of the Class defined above, respectfully request that the Court enter judgment against Defendant and award the following relief:

A. Certification of this action as a class action pursuant to Rule 23 of the Federal Rules of Civil Procedure, declaring Plaintiffs as the representatives of the Class, and Plaintiffs’ counsel as counsel for the Class;

B. An order awarding declaratory relief and temporarily and permanently enjoining Defendant from continuing the unlawful and unfair business practices alleged in this Complaint and to ensure that all applicable information set forth in 17 U.S.C. § 1203(b)(1) is included when appropriate;

C. An award of statutory and other damages under 17 U.S.C. § 504 for violations of the copyrights of Plaintiffs and the Class by Defendant;

D. An award of statutory damages under 17 U.S.C. § 1203(b)(3) and 17 U.S.C. § 1203(c)(3), or in the alternative, an award of actual damages and any additional profits under 17 U.S.C. § 1203(c)(2);

E. A declaration that Defendant is financially responsible for all Class notice and the administration of Class relief;

F. An order awarding any applicable statutory and civil penalties;

G. An order requiring Defendant to pay both pre- and post-judgment interest on any amounts awarded;

H. An award of costs, expenses, and attorneys’ fees as permitted by law; and

I. Such other or further relief as the Court may deem appropriate, just, and equitable.

DEMAND FOR JURY TRIAL

Plaintiffs hereby demand a jury trial for all claims so triable.

DATED: September 12, 2023

Respectfully submitted,

/s/ Daniel J. Muller
DANIEL J. MULLER, SBN 193396
[email protected]
VENTURA HERSEY & MULLER, LLP
1506 Hamilton Avenue
San Jose, California 95125
Telephone: (408) 512-3022
Facsimile: (408) 512-3023
[email protected]

/s/ Bryan L. Clobes
Bryan L. Clobes (pro hac vice anticipated)
CAFFERTY CLOBES MERIWETHER & SPRENGEL LLP
205 N. Monroe Street
Media, PA 19063
Tel: 215-864-2800
[email protected]

Alexander J. Sweatman (pro hac vice anticipated)
CAFFERTY CLOBES MERIWETHER & SPRENGEL LLP
135 South LaSalle Street, Suite 3210
Chicago, IL 60603
Tel: 312-782-4880
[email protected]

Attorneys for Plaintiffs