From the course: The State of AI and Copyright

What can creators do to protect their work from being used for AI training?

- It seems like, with so much unknown or unanswered in AI-related copyright law, what steps can a creator take right now to protect themselves from AI deriving content from their own work? If anything? (interviewer chuckles)

- [Jen] I have a couple thoughts, practical considerations to think about. And my personal view is I think a lot of the steps we can take are technological in nature. You know, what can we do with our work to protect it, from that technological standpoint? There are a lot of proposals out there right now. There are a lot of legislative and regulatory proposals out there, you know, in AI. I've seen a lot about affirmatively marking AI-generated content. But I would actually flip that and advise content creators to mark their own content, put some digital watermark, invisible signatures, something on their content so that they can track its authenticity. I've seen really unique solutions proposed leveraging blockchain technologies.

- Okay.

- Provenance of certain artwork. 'Cause the issue with digital copies is you can create exact replicas of them quite easily and cheaply. And, you know, the question is how do we control that? There's technology to mask digital artwork and digital content so that it becomes less effective for training purposes. (Garrick hums)

- [Jen] Almost like a poison pill. So, there are some technological measures out there. If you have a website, there are some legal measures: looking at your terms of service, you know, making sure that you're prohibiting scraping activities and use of your content for unauthorized means. The robots.txt file is a big one. There are some changes being proposed right now in terms of, how can we signal to bots out there that are scraping the web, you know, to not scrape content? Like you mentioned ChatGPT, I think there's a GPTBot out there now that's crawling the internet. You can block that kind of bot if you don't want it scraping your content. And there are other known bots out there. So, a lot of technological controls. The other thing I would say, the other plug I would make as an IP attorney, is to think about branding. I think we're going to see a lot more trademark issues coming here as we think about unique styles and watermarking. We've seen that in some of the litigation. If any of you have seen the Getty Images complaint, there are some distorted images coming out of the Stable Diffusion model that have, it's almost like a distorted Getty Images watermark on them. So thinking about how branding and trademarks, in conjunction with our digital content, might be another layer of protection we can think about.

- That's an incredibly detailed and complete answer. Just a couple of other thoughts. I mean, a particular product is called Glaze. I can't endorse it, I don't know if it works. But the idea is it tricks AI into not being able to see the image properly even though you or I would see it properly on screen. There's a site called Have I Been Trained?, which searches the LAION-5B dataset and tells you if your work is in that dataset, which is used by a lot of image generators to train their AI, and you can then file an opt-out. Because under EU law, if the text and data mining is done for a commercial purpose, you have to check for opt-outs in some way. But we don't know how that's going to work in practice. You can put your stuff behind a paywall, but obviously that's very inconvenient because it conflicts with your ability to show your works to the general public. And then the other thought, and this won't work for everyone at all, is to license your work for text and data mining to certain operators where you do like their ethical approach or you do like the compensation. Because I think one of the considerations in the States, at least, will be, in terms of fair use, whether what they're doing conflicts with what you would normally do. And if you do in fact license your works, then if they've taken it without paying license fees, then that's probably less likely to be fair use.
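
To make Jen's robots.txt suggestion concrete, here is a minimal sketch in Python. It is an illustration only, not something shown in the conversation. It assumes the crawler she has in mind identifies itself with the user-agent `GPTBot`, and the site URL, the file contents, and the `is_allowed` helper are all hypothetical. Note that robots.txt is a request, not an enforcement mechanism: it only affects crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a creator's site. The "GPTBot" user-agent
# is an assumption about which crawler the speaker meant; add one block
# per crawler you want to turn away.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if these robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com and the path below are placeholders.
    page = "https://example.com/gallery/piece.png"
    for agent in ("GPTBot", "SomeOtherBot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, page) else "blocked"
        print(f"{agent}: {verdict}")
```

Checking the rules locally like this is one way to confirm the file says what you intend before publishing it at the root of your site.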
