
‘Pop Music Is a Promise That You Aren’t Listening Alone’: What AI Music Can and Can’t Do

Takeaways from a conversation with musician-technologists Holly Herndon and Mat Dryhurst

08.07.23

Earlier this spring, a brand-new single from Drake and The Weeknd dropped. It quickly proliferated across the internet—the tune was catchy and you could dance to it. As tech writer Casey Newton put it, “I’m enjoying this more than anything on [Drake’s] last album.”

If you’ve been paying attention to the quickly evolving AI landscape, you already know the twist: Neither Drake nor The Weeknd made the track; “Heart on My Sleeve” was the product of generative AI.

Except that doesn’t fully capture how the song was made or how AI music will be created or consumed, if you ask Holly Herndon and Mat Dryhurst, musician-technologists who are pushing the AI envelope for artists.

“If you read some accounts in the media, you would think, ‘Oh my God, it’s the end of music. We can automate these AI-generated songs,’” Mat told me and Jesse Walden on a recent episode of my Means of Creation podcast. “And it’s like, ‘No, it’s actually incredibly manual.’” A human being still had to sing the lyrics and then convert the vocals using a model trained on the artists’ voices.

Holly and Mat know this perhaps better than anyone else. She’s a singer who completed her PhD in machine learning at Stanford. Mat is her long-time collaborator and husband. Together, they’ve been ahead of the curve in applying AI to music. In 2021, Holly released Holly+, a “deepfake twin” of her voice that’s managed by a DAO; the next year, Holly and Mat launched Spawning, which builds tools to help artists manage their consent for AI training data.

To them, AI’s big moment—and the Drake deepfake—provides a useful inflection point for thinking about all the non-technological things we expect from music.

“Rather than worrying so much about how these automation tools are going to take away or remove things that you hold dear,” says Mat, “likely you hold them dear for reasons that extend beyond the media that’s exchanged.” That is: some of the biggest reasons people listen to music are only tangentially related to sound.

Below are my highlights from our conversation, the ones that have embedded themselves in my brain like an earworm.


Web3 infrastructure can still lead to new economic models for artists.

We’ve all been talking about the economics of streaming for a while. Very little money ends up in the hands of most creators, as intermediaries suck up the bulk of the profits, and only the biggest performers have the clout to negotiate more favorable terms.

Platforms such as Audius are using blockchain tech to nudge the scales ever so slightly back toward artists, while Sound has layered in NFTs to incorporate ownership. At the moment, such platforms cater primarily to indie artists and independent labels.

“These are very much communities that have kind of been let down, I would argue, by web2 infrastructure,” says Mat.

But there are others who also aren’t well-served by the streaming ecosystem, says Holly.

“A lot of music does not function on a per-play valuation. It’s the kind of music that you just need to get access to the idea a couple times and then it changes your life, but it’s not something you want to hear on repeat in the background. For that kind of work, different economic models need to evolve.”

To her, web3 tools serve as “building blocks that put agency back in artists’ hands to create their own economic models around their work.”


AI doesn’t replace community.

Not everything revolves around streaming. For plenty of artists, streaming is secondary to the live experience; they’ll regularly sell out shows but maintain a minimal presence online. By extension, there are loads of people who want to experience music live instead of streaming it.

“[AI automation] doesn’t fundamentally remove the very human desire for musical congregation in real time with other humans,” says Holly. “That’s not going anywhere. It’s just the way that it’s monetized might be changed and the way that it’s organized will be changed.”

That isn’t to say there aren’t plenty of folks who would “happily listen to infinite Drake” on an AI-generated playlist, says Mat.

Overall though, “when it comes to deep subcultures or people who actually participate and care about music, I’m not really bothered in the slightest,” says Mat. “I’m just interested and excited by how new subcultures will forge some very exciting new tools.”


AI tools raise the baseline capacity for everyone.

For decades, artists have incorporated new technologies into their music. Recording and editing software known as digital audio workstations is now integrated into almost every musical session. Holly thinks machine learning will soon become just as commonplace in the industry.

“For some people, that will mean generating the entire song and all of the timbres and all of the instruments,” she says. “For other people, it will be generating the perfect version of your own natural singing voice or generating the perfect reverb that has maybe really strange physical properties. So I think there’s a whole sliding scale of how people will be integrating it into their work.”

All of this means that everyone—musically trained or not—will soon have the ability to produce music that sounds good.

But just as the introduction of keyboards with pre-recorded beats didn’t make every Casio-owning teenager a composer, AI won’t suddenly transform the talentless into pop stars. 

“Just increasing baseline capacity for everybody equally says nothing,” says Mat. “The people we’re going to end up paying the most attention to are the people who do something more interesting with that baseline.”


It may take some time for AI to disrupt power laws. 

In streaming, as in the creator economy in general, artist success follows a power law, with streaming recommendations and network effects helping musicians who are already popular.

While AI tools may boost smaller players by lowering the barrier to entry, Jesse thinks AI will in the short term end up preserving—and potentially exacerbating—the power laws around which creators’ music gets consumed.

“When you give the masses new tools, what will they make with them? I think initially maybe they will make derivatives of what they already know and love,” Jesse says. “In the mid- to long term, I think maybe there’s more opportunity for change, and we will see new forms of music emerge because the tools are more accessible.”

It’s a point Holly agreed with, comparing it to the rise of DJs. “This is no shade to DJs, but you don’t necessarily need to be able to write your own music or even produce your own music to be a really highly paid touring musician as a DJ. There’s all kinds of extra-musical things that contributed to different DJs’ success. And I think you’ll probably also see that as music becomes much, much easier to generate.”

Ultimately, however, elevating the musical baseline will have a disruptive effect, though it’s hard to say what that effect will look like.

“I think that what we understand as music will fundamentally change and hopefully we’ll expect something different. And I think there will be new talents [that] emerge using new skills that we maybe can’t even imagine,” Holly says.


Hyperpersonalization shouldn’t be the goal.

The creative process in its most material form has always been about producing work that others want to consume. When you make a movie or a song, you put it out there hoping that an audience will like it.

Streaming algorithms have already made the discovery process easier for consumers, sometimes to the detriment of spontaneous exploration. As a result, “We can also see people getting somewhat stuck in their tastes,” Holly says.

Yet the “if you like that, try this” model still relies on a creator.

AI can potentially invert the creative process altogether by serving up highly personalized content directly to consumers. It’s what tech writer Eugene Wei refers to as the “creative singularity,” where AI generation and individual tastes converge.

While Mat says that’s a “fascinating engineering challenge,” he believes hyper-personalization misses the point of most art. “Pop music is a promise that you aren’t listening alone. Sure, you could possibly, hypothetically pretty soon tailor a song to be the perfect four minutes depending on your listening tastes. But who cares? My argument is I don’t think people will care.”

Hyper-personalized music still can’t create the feeling of sharing something in a room with other people, all drawn together because the artist comes from the same city as them or because they share something else in common.

“Reducing music down to media that can be consumed and personalized is cute for sure,” Mat says. “But it’s just missing such a huge part of the picture.”

+++

Variant is an investor in Sound. This post is for general information purposes only. It does not constitute investment advice or a recommendation or solicitation to buy or sell any investment and should not be used in the evaluation of the merits of making any investment decision. It should not be relied upon for accounting, legal or tax advice or investment recommendations. You should consult your own advisers as to legal, business, tax, and other related matters concerning any investment. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by Variant. While taken from sources believed to be reliable, Variant has not independently verified such information. Variant makes no representations about the enduring accuracy of the information or its appropriateness for a given situation. This post reflects the current opinions of the authors and is not made on behalf of Variant or its Clients and does not necessarily reflect the opinions of Variant, its General Partners, its affiliates, advisors or individuals associated with Variant. The opinions reflected herein are subject to change without being updated.