Just a few years ago the idea that the internet could be flooded with Kanye West songs created by artificial intelligence might have sounded more like a subplot for a Philip K. Dick novel than a concept rooted in any kind of reality. Yet the acceleration of AI music has been so rapid that this has now become the new norm whenever you log onto YouTube.
Over recent months we’ve seen a new AI-generated Drake and The Weeknd song that was so realistic it fooled social media users when going viral, forcing the Universal Music Group to use all its legal might to get it scrubbed off the internet. There’s also been a new report suggesting major labels including Universal and Warner are in talks with Google to secure licensing and expertise so they can create new deep fake songs from their deceased artists. That’s not even counting renowned producer Timbaland using AI Biggie vocals and pledging the technology will soon “be a new way of creating and generating money with less costs”; pop star Grimes revealing she wants AI to resurrect her voice, so music can be released in her name long after her death; or a deep fake 2Pac and DMX duet that was so life-like it’s had some rap fans labelling it “mind-blowing”.
“AI music is no longer this sci-fi concept, it’s our new reality,” Holly Herndon, the ambitious electronic composer and singer-songwriter who has been at the forefront of experimenting with artificial intelligence software through her own music, tells Mixmag. “It is a Pandora's box situation; once it is out there, it is impossible to put the technology back into the box.”
According to Herndon there are many similarities between AI music and the advent of sampling, which itself initially inspired fearful reactions too. As an example, Miles Davis’ percussionist James Mtume famously branded the sample a form of “artistic necrophilia”. Herndon adds: “Sampling old records to create something new led to the formation of genres like hip hop and innovative new forms of artistic expression. AI music has the potential to do something very similar.”
However, for every artist like Herndon calling for AI music to be embraced, there are several notable ones who are deeply worried about its implications. “It is a grotesque mockery of what it is to be human,” wrote Nick Cave in a blog post responding to an AI song that replicated his own singing. “Songs arise out of suffering, by which I mean they are predicated upon the complex, internal human struggle of creation and, well, as far as I know, algorithms don’t feel. Data doesn’t suffer.” Legendary rap producer Pete Rock, meanwhile, angrily tweeted: “AI is madly disrespectful and a cowardly act that bears no real soul or feeling. If you think this AI stuff is dope, then you are a part of the problem.”
So, what exactly is the reality? Can AI music really be a force for good, and mirror the art of sampling in pushing musicians into brand new areas of creativity? Or is it destined to be exploited by capitalist businessmen at major labels, forcing us to listen to “new” Aphex Twin and Beyoncé songs for hundreds of years after their deaths? After all, an AI voice model doesn’t answer back or need to be paid millions of dollars in royalties, making Timbaland’s reference to “less costs” start to feel more and more like a nod to doing away with human artists altogether.
The MIT Museum in the US is currently hosting the AI: Mind The Gap exhibition, which explores how artificial intelligence and human creativity can work side-by-side. It achieves this through activities including a collaborative poetry experience, where visitors can co-write a poem with OpenAI’s GPT-3 text generating software. Yet even though the exhibition ultimately puts a positive spin on AI, MIT content developer Lindsay Bartholomew recognises the red flags around it being integrated within the music industry, especially around things like the mass spread of misinformation. “The intent behind much of AI research and technology is in fact to augment, not take over, human ability – that’s the exciting part. But that doesn’t mean AI can’t be misused, either,” she warns.
“The big difference between AI music and sampling is that the voice is by definition one of the most personal aspects of a person. It can be dangerous to use and manipulate voices, as it can feel like invading or even changing a person to misrepresent them. Yes, there is an opportunity to use AI to combine elements to make new music and forms of expression. But it’s almost a fine line between honouring another person’s work and stealing that work, or even someone’s identity. Deep fake songs have the potential to create alternate realities and even destroy lives if misused.”
Currently so many of the AI songs that inhabit the internet have been created by amateurs in their bedrooms. Using software such as Voicify and Tacotron (developed by Google), bedroom creators “train” these text-to-speech programs with hundreds of hours of acapella vocals of artists, resulting in a voice model that subsequently performs the songs you feed into it and can be used at will like the pop music equivalent of Frankenstein’s monster. Samuel Fisher is the brains behind the AI Hit Factory YouTube channel, where he’s used Voicify to create covers such as Frank Sinatra singing Lady Gaga’s ‘Bad Romance’ and even Elvis Presley covering Rihanna’s ‘Umbrella’. For bedroom producers without industry connections, he believes AI music can help to level out the playing field.
“Before amateur musicians would never have the opportunity to work with massive artists like this, but now we can,” he says. “That’s really powerful.” The AI Hit Factory’s output is relatively innocent stuff: more like fan fiction than reanimating a corpse to sing problematic, freshly imagined lyrics. Yet Fisher says he’s already had songs taken down because of copyright infringement and, despite making only $100 from his efforts to date, he lives in constant fear of legal action. “Right now there just isn’t a clear legal landscape for how AI music is done,” he adds, “and the fear is that the community, which is mostly just music fans experimenting and having fun, will be punished simply for embracing a new form of technology.”
He continues: “For my channel I am now looking more at voice models trained on data that is in the public domain, like political speeches. It’s much safer creating a deep fake rap out of Trump vocals, say, than it is trying to create a new Kendrick Lamar song.”
The Drake and The Weeknd song was removed from YouTube and TikTok via the Digital Millennium Copyright Act because it sampled the ad-lib of producer Metro Boomin without permission. Yet what’s legally acceptable when it comes to actually creating artificial intelligence-powered vocals remains relatively unclear. “It is a situation where the technology is evolving so much faster than the rate the law can possibly evolve at,” says Chris Mammen, a lawyer with experience of intellectual property law within the music industry. “When it comes to AI music, the law is continuously trying to play catch up, which means there’s lots of loopholes.”
The United States Copyright Office recently issued new guidelines regarding copyright applications for works created with artificial intelligence tools. The new rules recognize that work made with both AI input and human creation can potentially be eligible for copyright protection, but any part of it that is entirely made by AI isn’t eligible. It means copyright protections only extend to work that is attributable to human authorship. But the Recording Industry Association of America is pushing for the opposite outcome, arguing last year that AI platforms that scrape existing songs to create voice models infringe on the rights of the artists who originally wrote and recorded them.
“Punishing hobbyists seems harsh,” admits lawyer Mammen, “as once these technologies are invented, history [the Napster lawsuits] has shown going after the ordinary people who are using them isn’t always the best idea. My question is: what happens when there’s a generative AI tool pumping out 100s of life-like replications of Jay-Z songs hourly, without any human input? That’s a real possibility. So, how is the law possibly going to stop something like that?”
Lea Schönherr from the CISPA Helmholtz Center for Information Security specialises in research around preventative deep fake technology, and even she admits there’s currently little in the way of protective tools. “There are some Deep Learning based methods to distinguish fake from real audio data. But it is also not always clear when these methods are effective and for which cases they are not. In particular, detection mechanisms should be able to transfer to new and adapted generative models. Otherwise, an attacker can simply train a new model to evade detection,” she explains.
“I would say that especially in the long run, we need more than one detection mechanism to identify fake music. Instead of cutting content, we'll probably need mechanisms to ensure that the artist whose data is used to train and fine-tune a model also benefits from its use.”
It seems major artists have cottoned onto the fact that aggressively pursuing bedroom AI song creators might not be the best idea and that they should instead, as Schönherr advises, find a way to benefit from its use. In May, Grimes unveiled Elf.Tech, which allows you to record or upload vocals that are then replicated in the singer’s voice. The artist has also said she’s happy with anyone creating new songs using an AI version of her voice, so long as it’s Grimes who ends up with 50% of the royalties.
Herndon is also a pioneer in this space, with her AI deep fake “twin”, Holly+, allowing users to upload any polyphonic audio and have it transformed into a download of music sung in the artist’s voice. “With Holly+ someone took their train commute home and fed it into the system, which resulted in this crazy chorus that was really beautiful,” Herndon reveals. “I think there can be a future where AI can be playfully integrated with the consent of an artist. If you create that interaction with your fans where you allow people to use your voice, it can bring them even closer to you. Maybe one day you will have live drag performances entirely in Kanye’s voice. That could be really cool; you will be able to interact with your fandom in new ways.”
For whatever reason Kanye West is an artist with a staggering number of AI songs, with people having him cover everything from ‘Teenage Dirtbag’ to ‘Wonderwall’. At a time when the artist has become a Nazi sympathiser, it has given his fans the power and agency to use his voice for something far less toxic and more fun; a rare level of control for fans who often wake up to their favourite artist erratically unleashing deeply offensive rants. Yet on another level there’s also something problematic about people being able to bend a Black man’s voice to their own will, especially the voice of someone who has always fought - rightly or wrongly - for freedom of speech.
New York hip hop artist Big Baby Gandhi recently recorded vocals for a new socially conscious song called ‘War With The Matrix’ with a voice memo app, before using a Kanye AI voice cloner to "upscale" it to a studio-quality recording. An AI-generated video of Kanye was created in only seven days and with a budget of just $30, with the idea to show how AI can ultimately break down the economic incentives of the music business. “What I tried to do with ‘War with the Matrix’ was incorporate messages that you don't really hear in rap,” Big Baby Gandhi explains. “It’s not something that the industry wants to promote, and so it allowed me to bypass some of those gatekeepers and barriers to getting your music out there.”
The current malfunctioning of the music industry is squeezing many artists out, with a recent study finding most musicians in the UK have to work an average of three-to-four jobs to be able to afford to live. Big Baby Gandhi believes AI technology can be used by amateur musicians to up their confidence to Kanye West-levels, and that it shouldn’t be looked at as people robbing Kanye of his own voice, but rather co-opting it so they can locate their own inner strength. “The AI is going to be another tool [for producers], just like auto-tune or sampling breakbeats,” he says. “One novel benefit of using the AI voice is that it 'upscales' your vocal performance to the same high quality studio mic that the top artists use. Another benefit is that the models are trained using highly confident performances from seasoned artists, so if you are an amateur who lacks self-confidence, the AI can 'inject' confidence into your own performance. It will eliminate a lot of barriers to entry for creating art.”
However, despite this positive outlook, Big Baby Gandhi admits he has concerns around how AI could be used to create unethical practices in the music business, which could potentially be rooted in racism. “If AI is co-signed by the right person, any and all objections can easily be set aside,” he says, “but I also think it’s only a matter of time before swathes of young white producers, who currently dominate music production, decide Black rappers are not easy to work with, and that instead they can just put out AI music and be the ones who get lionized for it.”
This seems to be the big problem with AI music. For every potential positive, there’s an array of negatives that sound bleakly dystopian. “There is this danger of perpetual nostalgia for a time when the music business was actually functioning well,” admits Herndon. “And that rather than creating new stars, the music business will rely on proven brands and keep bringing back dead artists over and over again.”
Yet Herndon insists all technology has good and bad possibilities, and she believes the positives far outweigh the negatives when it comes to AI music. She says: “It is here to stay and I see it being integrated in every DAW (digital audio workstation) over the next 10 years. Hopefully music will become more strange. With AI, it will become easier and easier to make very formulaic music. That’s already the case with Ableton, where anyone can make a techno track in 10 minutes. So, once it becomes just 30 seconds to create a song, things will have to get even more weird and crazy for you to have a voice that stands out, right? The ease of these tools will mean music has to be even more innovative to turn people’s heads; which I see as a good thing.”
Referencing Mtume’s comments that sampling was a problematic way of raising the dead to play in the future, Herndon says AI will instead allow the past and the present to start an open dialogue with one another. “There’s an opportunity to open up an ethical dialogue between the present and the past, which is very interesting,” she concludes. “Rather than try to fight it, we now need to build an infrastructure around AI music so that it will make ethical sense for everyone.”
Ultimately, according to MIT’s Bartholomew, “the technology itself is not the danger. We make the technology dangerous by how we use it or misuse it. It’s human beings who will decide what kind of impact it has.”
Thomas Hobbs is a freelance writer; follow him on Twitter