The Bottomless Appetite of AI
ChatGPT and Google's Gemini have swallowed up the entire internet buffet without paying. Is that fair?
Image provided with complete irony by Dall-E.
When Biz Markie sampled Gilbert O’Sullivan’s 1972 symphonic sad sack ballad “Alone Again (Naturally),” it was pretty damn obvious. Not only did Biz lift the main piano line and beat from the beginning of the song for his track “Alone Again,” he lifted O’Sullivan’s chorus word for word. Sure, Biz rapped in his distinct style during the verses, but there was no hiding that his track wouldn’t exist without O’Sullivan’s.
Unsurprisingly, the owners of “Alone Again (Naturally),” a publishing company called Grand Upright Music, Ltd., sued Markie and his record label, Warner Bros. Records, for copyright infringement. That lawsuit, 1991’s Grand Upright Music, Ltd. v. Warner Bros. Records Inc., would change music forever.
Prior to that case, sampling ran rampant. It was core to hip-hop’s ethos. DJs like Kool Herc first looped breakbeats at Bronx block parties in the ’70s, providing a bed for MCs to rap over. A particular favorite: James Brown’s track “Funky Drummer,” which eventually provided part of the beat to songs by artists like Eric B. & Rakim and Boogie Down Productions, among others. By the mid-’80s, with the invention of the E-mu SP-1200, one of the first powerful sampling machines, sampling became an art, with producers and DJs stitching together whole songs out of multiple clips. Need proof? Check out what the Dust Brothers did on the Beastie Boys’ 1989 masterpiece Paul’s Boutique. It contains over 300 known samples, though people still believe there are more that have yet to be identified.
The Grand Upright Music case stopped all that. Judge Kevin Duffy opened his ruling with simply, “Thou shalt not steal.” It got worse from there. In the courtroom, according to Duffy, lawyers for Warner Bros. argued that “stealing is rampant in the music business and, for that reason, their conduct here should be excused.” Duffy didn’t believe that made it legal, though. “The conduct of the defendants … violates not only the Seventh Commandment but also the copyright laws of this country.” He ruled that Markie had committed copyright infringement. He also recommended that Markie be criminally prosecuted, though nothing came of it.
Regardless, from that moment on, sampling was no longer a free-for-all. Snippets had to be licensed. Contracts had to be signed. Money had to trade hands. And sampling moved forward, becoming the core of many of today’s hits. The originators of the music get pay and recognition out of the deal. New artists can build off their inspiration and create something new. It’s a win-win.
But that’s not how the internet works. And AI companies want to keep it that way even though they’re currently making the same argument Warner Bros. did in the Biz Markie case—essentially, everyone is doing it, so it has to be legal.
A recent New York Times article revealed that, by 2021, OpenAI’s ChatGPT had already ingested “every reservoir of reputable English-language text” online. That means they had scraped all the news and magazine stories posted online. All the research papers in academic journals. All the fan fiction on geek sites. Every Tumblr post. Every Wikipedia page. Every PDF book. Every blog. And every cooking recipe with a ridiculously long personal story before the instructions.
But ChatGPT still needed more data.
So, the company decided to create a program called Whisper, which transcribed over a million hours of YouTube videos and podcasts, all of which then got fed into GPT-4, the company’s latest version of AI—this despite YouTube’s terms of service saying videos could not be accessed by “any automated means.”
Of course, you’d think such a violation would get the lawyers at YouTube and its parent company Google as excited as a horde of Uruk-hai about to go off to battle. It didn’t. YouTube said nothing because, well, Google did the same thing to train its AI platform Gemini, even though that also violated the copyrights of YouTube creators. Facebook also considered gathering mounds of copyrighted material to train its large language model (LLM) even if it meant lawsuits. Negotiating deals for those works would take too long.
Which might explain why AI companies are arguing this blatant grab of creator-owned material falls under fair use. The doctrine claims you can copy copyrighted material if what results from it is for a limited and transformative purpose. They argue that LLMs don’t reproduce copyrighted material like, say, a joke by Sarah Silverman, but instead use all of her work—her books, stand-up appearances, podcast interviews, etc.—to capture her comedic essence and then, as such, create new jokes in Silverman’s voice. It made a new thing so it can’t be copyright infringement! Silverman disagrees, of course, which is why she is suing OpenAI.
AI companies are also arguing that, because the amount of data that has already been collected is so large, it’s impractical to license it all. Call it the too-big-to-prosecute theory. But, of course, that’s like saying JMM shouldn’t have to pay for a box of tasty fruit snacks since he made off with the whole grocery store. The scale doesn’t make it legal, even if everyone is doing it. Like Judge Duffy said, “Thou shalt not steal.”
But the best argument around why AI companies should be allowed to train their models on everything ever written without having to pay a dime for it might be one OpenAI CEO Sam Altman made on the New York Times podcast “Hard Fork” back in the fall. He said, “…in the same way humans can read the internet and learn, AI should be allowed to read the internet and learn.” The idea is that, since everything is posted online, making it publicly available, it’s ok to gobble up all of it. It’s just learning, after all.
But humans can’t read the whole web. And they certainly can’t get behind paywalls without first getting a subscription. Or read a book without first buying it. Or, as Biz Markie learned, borrow a bit of someone else’s work to spark their own without first getting permission. But AI companies are arguing that they should be able to snarf up the sum total of all human creativity for free and then turn around and get paid for the resulting burp that comes up. Because that’s transformative, after all.
Judge Duffy would likely disagree with them.
Other Options
Come April, the job hunt gets as hectic as Griff’s Relays schedule. Graduating seniors are applying, interviewing, and negotiating seemingly daily. But what if Handshake doesn’t have that dream gig you’ve always wanted? Well, maybe it’s time to check out this post from the site Freelance Opportunities on job boards you didn’t know about. Many of these are a bit obscure. And admittedly, some of them focus on jobs abroad, though maybe that’s not a bad thing if you’re thinking you want to avoid the circus that will be the election this fall. But that perfect gig might be waiting right there on Page 1. Happy hunting.
Meet the New Boss
It’s the great transfer of power. The Board of Student Communications has named next year’s student media leaders. Rising seniors Nicole Cox (DMP, MMJ) and Emma Stroner (DMP) have been named co-presidents of Drake Broadcasting System. Rising junior Bella Spah (DMP) will be the new editor-in-chief of Drake Mag, while soon-to-be seniors Parker Wright (MMJ, Writ) will take over the reins of Drake Political Review, and Erin Carlson (Psych) will continue as the EIC of DUiN. Avery Hjelm (Eng, Mus) will take over leadership of Periphery, and Mack Swenson (MMJ, Env Sci) will ascend to the top spot at The Times-Delphic. They have all promised to use their new-found positions for evil.
Same as the Old Boss
That doesn’t mean there isn’t old student media business to wrap up, including a bevy of BSC org activity this week. Here’s what’s on tap:
· DPR will hold its pin-up today, Monday, April 15, at 8 p.m., Mere 104. Come check out the spring issue, and don’t forget to bring your red pen and aggressive grammar skills.
· Periphery is hosting its spring launch party at 8 p.m. on Thurs., April 18, in the Medbury Lounge. Authors will read their work from the latest issue of the journal.
· DBS will record its first Relays live show on Fri., April 19, in Mere 2. First call is at noon.
· Finally, Drake Mag is now taking applications for positions including Associate Editor, Art Director, Photo Editor, and Digital Media Director. All materials are due on Friday, April 19 at 5:00 p.m. Send applications and questions to the upcoming EIC, Bella Spah (bella.spah@drake.edu). Click here for more information.
The Rehash: Summer Internships
· The St. Louis-based River City Journalism Fund is looking for a summer writing fellow. You’ll be embedded in a St. Louis newsroom, work 30 hours a week covering social justice and underrepresented communities, and be paid $700 a week. Get more info here.
· American Public Media Group, home of Minnesota Public Radio, has multiple internships for the summer including audio, video, production, and reporting. All of the gigs are based in St. Paul, Minnesota. You might even end up working with former Iowa Public Radio host and reporter Clay Masters and recent SJMC grad CJ Younger (MMJ), both of whom now call MPR home. Check out all the gigs here.
· Book publisher SourceBooks is looking for nine summer interns for its Naperville, Illinois, office, including in marketing, editorial, content delivery, and sales. Pay is $15 an hour for 24 hours a week. You must be on site. The 10-week gig runs June 4 to August 9. Apply here.
· HerCampus has been one of the leading voices for college women for years. A group of websites including HerCampus, Spoon University, and College Fashionista, the company hires remote interns to help create content, design graphics, and run social media for its various websites. They have rolling internship sessions, with the summer one starting May 8. Get more info here.
· The Pulitzer Center for Crisis Reporting is looking for a full-time year-long intern. You’ll work with the center’s journalists to cover global stories, post stories to its website, and help create multimedia content. The position can be remote and it pays $37,440 with benefits and starts June 1. Apply here.
Want to Spread Word with JMM?
Do you have some essential info or were you just elected Drake SJMC student senator like sophomore Skylar Lathrop (PR, MMJ)? If so, send it on over to jeff.inman@drake.edu. JMM will treat it like this great article by Erica Owen (Mags, ’12) on the surprising Icelandic food scene, which JMM can definitely confirm.
Finally, here’s actor Ryan Reynolds’ truly thoughtful birthday gift to his friend and Wrexham AFC co-owner Rob McElhenney. If only we all had friends like Deadpool.