AI Copyright & Training Data w/ Chris Paniewski | Wilson Sonsini Startup Legal Basics
Today’s show: Jason sits down with Wilson Sonsini partner Chris Paniewski for a special *Startup Legal Basics* on one of the thorniest questions in tech right now: how copyright law applies to AI training data.
In this episode, Jason and Chris cover:
- Why AI copyright law is unsettled and will take years to shake out
- The difference between *training data* and *output* in legal terms
- How “fair use” really works (and why it’s a defense, not a permission slip)
- The risks of scraping vs. licensing, and why open source ≠ free use
- How investors are diligencing AI startups around training data
- Why startups must think differently once they’re funded vs. hacking in a dorm room
Whether you’re building an AI product, investing in one, or just trying to understand where the law is headed, this conversation breaks down the real legal risks every founder should know.
Key Points
- Understanding the legalities of AI training data means navigating complex questions of copyright, permissions, and fair use, and the answers differ by jurisdiction.
- Fair use is a nuanced legal defense that hinges on factors like the purpose of use and market impact, with transformative use being a key consideration.
- Startups must be cautious and proactive about obtaining proper licenses or permissions for training data to avoid legal pitfalls and potential deal-blocking issues during investor due diligence.
Chapters
0:00
1:14
4:06
7:07
12:07
17:39