Hiya, folks, and welcome to TechCrunch’s regular AI newsletter.
This week in AI, music labels accused two startups developing AI-powered song generators, Udio and Suno, of copyright infringement.
The RIAA, the trade organization representing the music recording industry in the U.S., announced lawsuits against the companies on Monday, brought by Sony Music Entertainment, Universal Music Group, Warner Records and others. The suits claim that Udio and Suno trained the generative AI models underpinning their platforms on labels’ music without compensating those labels — and request $150,000 in compensation per allegedly infringed work.
“Synthetic musical outputs could saturate the market with machine-generated content that will directly compete with, cheapen and ultimately drown out the genuine sound recordings on which the service is built,” the labels say in their complaints.
The suits add to the growing body of litigation against generative AI vendors, including against big guns like OpenAI, arguing much the same thing: that companies training on copyrighted works must pay rightsholders or at least credit them — and allow them to opt out of training if they wish. Vendors have long claimed fair use protections, asserting that the copyrighted data they train on is public and that their models create transformative, not plagiaristic, works.
So how will the courts rule? That, dear reader, is the billion-dollar question — and one that’ll take ages to sort out.
You’d think it’d be a slam dunk for copyright holders, what with the mounting evidence that generative AI models can regurgitate nearly (emphasis on nearly) verbatim the copyrighted art, books, songs and so on they’re trained on. But there’s an outcome in which generative AI vendors get off scot-free — and owe Google their good fortune for setting the consequential precedent.
Over a decade ago, Google began scanning millions of books to build an archive for Google Books, a sort of search engine for literary content. Authors and publishers sued Google over the practice, claiming that reproducing their IP online amounted to infringement. But they lost. On appeal, a court held that Google Books’ copying had a “highly convincing transformative purpose.”
The courts might decide that generative AI has a “highly convincing transformative purpose,” too, if the plaintiffs fail to show that vendors’ models do indeed plagiarize at scale. Or, as The Atlantic’s Alex Reisner proposes, there may not be a single ruling on whether generative AI tech as a whole infringes. Judges could well determine winners model by model, case by case — taking each generated output into account.
My colleague Devin Coldewey put it succinctly in a piece this week: “Not every AI company leaves its fingerprints around the crime scene quite so liberally.” As the litigation plays out, we can be sure that AI vendors whose business models depend on the outcomes are taking detailed notes.
News
Advanced Voice Mode delayed: OpenAI has delayed advanced Voice Mode, the eerily realistic, nearly real-time conversational experience for its AI-powered chatbot platform ChatGPT. But there aren’t any idle hands at OpenAI, which also this week acqui-hired remote collaboration startup Multi and released a macOS client for all ChatGPT users.
Stability lands a lifeline: On the financial precipice, Stability AI, the maker of open image-generating model Stable Diffusion, was saved by a group of investors that included Napster founder Sean Parker and ex-Google CEO Eric Schmidt. Its debts forgiven, the company also appointed a new CEO, former Weta Digital head Prem Akkaraju, as part of a wide-ranging effort to regain its footing in the ultra-competitive AI landscape.
Gemini comes to Gmail: Google is rolling out a new Gemini-powered AI side panel in Gmail that can help you write emails and summarize threads. The same side panel is making its way to the rest of the search giant’s productivity apps suite: Docs, Sheets, Slides and Drive.
Smashing good curator: Goodreads’ co-founder Otis Chandler has launched Smashing, an AI- and community-powered content recommendation app with the goal of helping connect users to their interests by surfacing the internet’s hidden gems. Smashing offers summaries of news, key excerpts and interesting pull quotes, automatically identifying topics and threads of interest to individual users and encouraging users to like, save and comment on articles.
Apple says no to Meta’s AI: Days after The Wall Street Journal reported that Apple and Meta were in talks to integrate the latter’s AI models, Bloomberg’s Mark Gurman said that the iPhone maker wasn’t planning any such move. Apple shelved the idea of putting Meta’s AI on iPhones over privacy concerns, Bloomberg said — and the optics of partnering with a social network whose privacy policies it’s often criticized.
Research paper of the week
Beware the Russian-influenced chatbots. They could be right under your nose.
Earlier this month, Axios highlighted a study from NewsGuard, the misinformation-countering organization, that found that the leading AI chatbots are regurgitating snippets from Russian propaganda campaigns.
NewsGuard entered several dozen prompts into 10 leading chatbots, including OpenAI’s ChatGPT, Anthropic’s Claude and Google’s Gemini. The prompts asked about narratives known to have been created by Russian propagandists, specifically American fugitive John Mark Dougan. According to the company, the chatbots responded with disinformation 32% of the time, presenting false Russian-written reports as fact.
The study illustrates the increased scrutiny on AI vendors as election season in the U.S. nears. Microsoft, OpenAI, Google and a number of other leading AI companies agreed at the Munich Security Conference in February to take action to curb the spread of deepfakes and election-related misinformation. But platform abuse remains rampant.
“This report really demonstrates in specifics why the industry has to give special attention to news and information,” NewsGuard co-CEO Steven Brill told Axios. “For now, don’t trust answers provided by most of these chatbots to issues related to news, especially controversial issues.”
Model of the week
Researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) claim to have developed a model, DenseAV, that can learn language by predicting what it sees from what it hears — and vice versa.
The researchers, led by Mark Hamilton, an MIT PhD student in electrical engineering and computer science, were inspired to create DenseAV by the nonverbal ways animals communicate. “We thought, maybe we need to use audio and video to learn language,” he told MIT CSAIL’s press office. “Is there a way we could let an algorithm watch TV all day and from this figure out what we’re talking about?”
DenseAV processes only two types of data, audio and visual, and does so separately, “learning” by comparing pairs of audio and visual signals to find which signals match and which don’t. Trained on a dataset of 2 million YouTube videos, DenseAV can identify objects from their names and sounds by searching for, then aggregating, all the possible matches between an audio clip and an image’s pixels.
When DenseAV listens to a dog barking, for example, one part of the model homes in on language, like the word “dog,” while another part focuses on the barking sounds. The researchers say this shows that DenseAV can not only learn the meaning of words and the locations of sounds, but also distinguish between these “cross-modal” connections.
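For readers who want a feel for the matching idea described above, here is a rough Python sketch. It is not DenseAV's actual code. The function names, the toy two-number "features" and the choice to take the best-matching pixel for each audio frame and then average over frames are all assumptions made for illustration. A real model would learn its features from video rather than use handwritten ones.

```python
import math


def dot(u, v):
    """Similarity between one audio-frame feature and one pixel feature."""
    return sum(a * b for a, b in zip(u, v))


def clip_image_score(audio_frames, pixel_features):
    """Aggregate every frame-to-pixel similarity into a single score.

    Assumption: for each audio frame, keep its best-matching pixel (max),
    then average those best matches over all frames (mean).
    """
    best_per_frame = [
        max(dot(frame, pixel) for pixel in pixel_features)
        for frame in audio_frames
    ]
    return sum(best_per_frame) / len(best_per_frame)


def contrastive_loss(audio_batch, image_batch):
    """Contrastive (InfoNCE-style) loss over a batch of clip/image pairs.

    The i-th clip is treated as the true match for the i-th image; every
    other image in the batch serves as a mismatch. Lower is better.
    """
    total = 0.0
    for i, audio in enumerate(audio_batch):
        scores = [clip_image_score(audio, image) for image in image_batch]
        top = max(scores)  # subtract the max for numerical stability
        log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_norm - scores[i]
    return total / len(audio_batch)


# Hypothetical hand-made features: dimension 0 is "dog-like", dimension 1 "cat-like".
dog_audio = [[1.0, 0.0], [0.9, 0.1]]   # two audio frames
cat_audio = [[0.0, 1.0], [0.1, 0.9]]
dog_image = [[1.0, 0.0], [0.0, 0.0]]   # pixel 0 shows the dog, pixel 1 is background
cat_image = [[0.0, 0.0], [0.0, 1.0]]   # pixel 1 shows the cat

matched = contrastive_loss([dog_audio, cat_audio], [dog_image, cat_image])
swapped = contrastive_loss([dog_audio, cat_audio], [cat_image, dog_image])
print(f"loss with correct pairs: {matched:.3f}")
print(f"loss with swapped pairs: {swapped:.3f}")
```

In this toy setup the correctly paired batch produces the lower loss. That is the signal a training loop would use to pull matching audio and visual features together and push mismatched ones apart.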
Looking ahead, the team aims to create systems that can learn from massive amounts of video- or audio-only data — and scale up their work with larger models, possibly integrated with knowledge from language-understanding models to improve performance.
Grab bag
No one can accuse OpenAI CTO Mira Murati of not being consistently candid.
Speaking during a fireside chat at Dartmouth’s School of Engineering, Murati admitted that, yes, generative AI will eliminate some creative jobs, but suggested that those jobs “maybe shouldn’t have been there in the first place.”
“I certainly anticipate that a lot of jobs will change, some jobs will be lost, some jobs will be gained,” she continued. “The truth is that we don’t really understand the impact that AI is going to have on jobs yet.”
Creatives didn’t take kindly to Murati’s remarks — and no wonder. Setting aside the apathetic phrasing, OpenAI, like the aforementioned Udio and Suno, faces litigation, critics and regulators alleging that it’s profiting from the works of artists without compensating them.
OpenAI recently promised to release tools to allow creators greater control over how their works are used in its products, and it continues to ink licensing deals with copyright holders and publishers. But the company isn’t exactly lobbying for universal basic income — or spearheading any meaningful effort to reskill or upskill the workforces its tech is impacting.
A recent piece in The Wall Street Journal found that contract jobs requiring basic writing, coding and translation are disappearing. And a study published last November shows that, following the launch of OpenAI’s ChatGPT, freelancers got fewer jobs and earned much less.
OpenAI’s stated mission, at least until it becomes a for-profit company, is to “ensure that artificial general intelligence (AGI) — AI systems that are generally smarter than humans — benefits all of humanity.” It hasn’t achieved AGI. But wouldn’t it be laudable if OpenAI, true to the “benefiting all of humanity” part, set aside even a small fraction of its revenue ($3.4 billion+) for payments to creators so they aren’t dragged down in the generative AI flood?
I can dream, can’t I?