It didn’t take long for the very bad thing to happen. On June 6, Forbes posted a story by Sarah Emerson and Rich Nieva about former Google CEO Eric Schmidt. The two journalists discovered that Schmidt’s secretive company White Stork, reportedly now renamed Project Eagle, was testing military drones in Menlo Park, California, and Ukraine. It was a big scoop, the kind that takes journalists hundreds of hours and dozens of interviews to make happen. It was also behind a paywall, supposedly out of reach to all non-subscribers.
AI plagiarized it within 24 hours.
In fact, “plagiarized” might be too soft a word. What Perplexity did was straight-up theft. And not because the AI-powered search engine provides complete summaries of even current information (more on that in a minute) rather than a list of links, bypassing the need to click through to anyone’s story. This robbery happened on its even more blatant product, Pages, which provides AI-aggregated and barely rewritten stories presented as news articles.
According to a blog post by Forbes Chief Content Officer Randall Lane, on June 7 Perplexity posted a ridiculously similar story to the Forbes piece on Pages. The Perplexity story had, according to Lane, “eerily similar wording” to the Forbes piece and even “some entirely lifted fragments.” That alone would have been enough to get any journalism student catapulted off campus.
But Perplexity kept going. It barely sourced its information, “other than a line at the bottom of every few paragraphs that mentioned ‘sources,’ and a very small icon that looked to be the ‘F’ from the Forbes logo – if you squinted.” It lifted one of Forbes’ illustrations. It also created an AI-voiced podcast version of the piece and a YouTube version of that podcast. Oh, and for a while the Perplexity pieces were, according to Lane, ranking higher in Google search results than the Forbes piece it scraped. (As of this writing, Forbes is at the top of the SERP with Perplexity’s story clocking in third.)
Perplexity’s version of the Eric Schmidt story.
Forbes Executive Editor John Paczkowski called out Perplexity’s plagiarism on X, eliciting a response from the company’s CEO, Aravind Srinivas. Srinivas tweeted that Pages “has rough edges” but he sure appreciated the feedback. No admission of wrongdoing. No promise to stop stealing content. And no update to Perplexity’s story. The sourcing hasn’t been made more prominent. The illustration hasn’t been removed. The podcast and YouTube versions are still posted.
What’s worse: Perplexity shouldn’t have ever been able to access that material in the first place. The company promises publishers can block its bots from crawling their sites. All it takes is inserting a bit of code into the web page’s programming. But a Wired investigation of Perplexity found that the search engine frequently accessed content that it shouldn’t have. It ignored paywalls. It ignored the code it provided publishers. It just scraped what it wanted. It even plagiarized that very Wired story.
Sure, Perplexity can argue that it sources all of its query responses and stories, just like it promises when you sign up for the service. “Every answer uses cited sources to provide a more accurate and comprehensive answer,” the text on the walk-through says. “If you want to dig deeper, just click the link to the source.”
And at the top of the answer it does list some of the sources it used, like this response it provided to JMM’s prompt “How often has Perplexity plagiarized journalists’ work?” You have a link to the tech site NDTV/Profit, the Wired piece, and one from Platformer. You need to click on a separate page to see it also sourced from the New York Post and New York magazine. And, of course, don’t forget the ridiculously tiny and opaque numbers after each paragraph that you can hover over to see which sources were used for that part of the summary. Sadly, Perplexity did not provide a specific number in response to JMM’s question, if only because the source material it was cribbing from didn’t provide one.
Ultimately, Perplexity has quickly become a case study of what journalists, publishers, and other creatives have feared: that AI will flat-out steal our work. But even if AI companies pay for access to content, as OpenAI does, it creates an uncomfortable reality where journalism doesn’t exist to tell stories and break news. Instead, it exists to train large language models and provide raw content to aggregate. It makes publishers what The Rebooting’s Brian Morrissey calls content vendors, essentially paid wholesalers of information that provide stories to feed the machine. That’s not a sustainable business model for publishers, nor is it really an ethical one for AI companies.
But that doesn’t seem to be reason enough for Perplexity to even feel guilty about lifting content, let alone stop the practice. It just means that its product “has rough edges.” Thanks for the feedback.
Dig It
It’s like data journalism Christmas. ProPublica has released its annual update to the 527 Explorer, its database of donations individuals and corporations make to political action committees. The database allows researchers to dig into the vast troves of money donated to political groups large and small, mining the documents these 527s—so named after the section of the tax code that allows them to exist—file with the Internal Revenue Service. You can search by state to see what, say, Iowa State Auditor Rob Sand has given to the Democratic organization ActBlue. Or you can explore some of the largest organizations like the Republican Governors Association to see where the group’s money came from. It’s an endless buffet of rabbit holes to explore. All you need to do is dive in.
Internships, Fellowships, and More
• Long way off, but seriously cool and with a close deadline. Universal Orlando is looking for a spring 2025 Destinations and Experiences intern. The Orlando-based position runs Jan.-April 2025, but applications are due June 28, 2024. The internship is paid, though no specifics on how much. Check out the details here.
• KUT, Austin’s NPR station, is seeking fall interns. Students work 10-15 hours per week at $15 an hour for the semester. Applications for fall are due by June 28. Get more info here.
• Fox News is looking for a fall intern. The remote position pays $15-$20 an hour, requires 16-24 hours a week, and provides a ton of networking opportunities. Oh, and you also get to work with the Multimedia Reporting team producing content. The position runs from Sept. 9-Nov. 15—peak election time. Applications are due July 7.
• The neuroscience site Transmitter is looking for a fall intern. The post-grad position is NYC-based. You’ll work a minimum of 24 hours per week at $20-$25 an hour. A background or strong interest in neuroscience, genetics, cognition, and behavior is preferred. Get more info here.
Got Some Fresh Dirt?
Do you have some essential info or did you write a story about Big Brutus, the world’s largest electric shovel, as a side hustle like Sophia Lacy (MBM, DMP, 23)? If so, then let JMM know by sending that juicy news on over to jeff.inman@drake.edu. JMM will treat it like this chillingly good ProPublica story about 3M and forever chemicals and tell everyone about it.
Finally, every dog is cute, right?