Don’t play it by ear: Audio deepfakes in a year of global elections
From robocalls to voice clones, generative AI is allowing bad actors to impersonate trusted voices at scale.
Artificial intelligence company OpenAI recently introduced Voice Engine, a natural-sounding speech generator that uses text and a 15-second audio sample to create an “emotive and realistic” imitation of the original speaker.
OpenAI has not yet released Voice Engine to the public, citing concerns over the potential abuse of its generative artificial intelligence (AI) – specifically to produce audio deepfakes – which could contribute to misinformation, especially during elections.
Audio deepfakes and their uses
Audio deepfakes are generated with deep learning techniques: AI models learn the characteristics of human speech from large datasets of audio samples and use them to produce realistic audio. Audio deepfakes can be generated in two ways: text-to-speech (text is converted to audio in the target voice) and speech-to-speech (an uploaded voice recording is re-synthesised in the target voice).
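To make the two modes concrete, here is a minimal sketch using the open-source Coqui TTS library (`pip install TTS`); the model names and calls follow its published model zoo and documentation, though exact APIs vary by version, and the file names are placeholders.

```python
# Minimal sketch of the two generation modes with Coqui TTS (pip install TTS).
# Model names follow Coqui's published model zoo; file names are placeholders.
from TTS.api import TTS

# Text-to-speech: text is converted to audio, conditioned on a short
# reference recording so the output mimics the target voice.
tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
tts.tts_to_file(
    text="This sentence was synthesised from a short voice sample.",
    speaker_wav="target_speaker.wav",  # a few seconds of the target voice
    language="en",
    file_path="text_to_speech_clone.wav",
)

# Speech-to-speech: an uploaded recording is re-rendered in the target voice,
# keeping the words and timing of the source audio.
vc = TTS("voice_conversion_models/multilingual/vctk/freevc24")
vc.voice_conversion_to_file(
    source_wav="my_recording.wav",    # what is said
    target_wav="target_speaker.wav",  # whose voice it should sound like
    file_path="speech_to_speech_clone.wav",
)
```

That a few seconds of reference audio is enough for either path is exactly what makes the scams described below practical.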
Audio deepfakes have been used in cyber-enabled financial scams where fraudsters impersonate bank customers to authorise transactions. The same technology is increasingly being used to propagate disinformation. Several audio deepfakes attempting to mimic the voices of politicians have circulated on social media. In 2023, artificially generated audio clips of UK Labour leader Keir Starmer allegedly featured him berating party staffers. While fact-checkers determined the audio was fake, it surpassed 1.5 million hits on X (formerly Twitter).
In India, voice cloning of children has been used to deceive parents into transferring money. In Singapore, deepfake videos containing voice clones of politicians such as the prime minister and deputy prime minister have been used in cyber-scams.
https://www.lowyinstitute.org/the-interpreter/don-t-play-it-ear-audio-deepfakes-year-global-elections
How the music industry is battling AI deepfakes one state at a time with the ELVIS Act
Unraveling the Deepfake Deception
In a startling revelation, Dr. Marco Vinicio Boza, a leading specialist in Intensive Care, disclosed that deceitful individuals employed artificial intelligence to craft videos imitating his likeness and voice. Their objective? To spread misleading messages and engage in fraudulent activities endangering public health.
The Deception Deepens
The spread of content produced with machine-learning techniques has reached alarming proportions, and the results are extremely convincing. This moment calls for technologies capable of identifying such content, such as voice biometrics.
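As a rough illustration of what a voice-biometric check involves, the sketch below uses SpeechBrain's pretrained speaker-verification model (`pip install speechbrain`); the model name and calls follow SpeechBrain's public documentation, and the file names are placeholders.

```python
# Illustrative voice-biometric check with SpeechBrain (pip install speechbrain).
# In newer releases the import path is speechbrain.inference.speaker.
from speechbrain.pretrained import SpeakerRecognition

verifier = SpeakerRecognition.from_hparams(
    source="speechbrain/spkrec-ecapa-voxceleb",
    savedir="pretrained_models/spkrec-ecapa-voxceleb",
)

# Compare a known genuine recording against a suspect clip.
# `score` is a cosine similarity; `same_speaker` is the thresholded decision.
score, same_speaker = verifier.verify_files("enrolled_voice.wav", "suspect_clip.wav")
print(f"similarity={float(score):.3f}, same speaker: {bool(same_speaker)}")
```

Note that plain speaker verification only asks whether two clips carry the same voice, so a high-quality clone can pass; production systems typically layer dedicated anti-spoofing models that look for synthesis artefacts on top.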
Meta has another new AI model on the docket, and this one seems perfectly engineered for the land of tomorrow, if that utopian future is filled with nothing but deepfakes and modified audio. Like AI image generators, Voicebox generates synthetic voices from a simple text prompt; it was trained on sound from thousands of audiobooks.
Music Deepfakes: Are AI imitations a creative art or a new kind of appropriation?
What Is Deepfake Music? And How Is It Created?
- Songs made with generative AI are infiltrating streaming services.
- One Spotify user was recommended the same song under 49 different names and suspects AI is behind it.
- While the songs sound exactly the same, each has a different title, artist, and art.
Even worse, chatbots like ChatGPT are starting to generate realistic scripts with adaptive real-time responses. By combining these technologies with voice generation, a deepfake goes from being a static recording to a live, lifelike avatar that can convincingly have a phone conversation.
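A hypothetical sketch of that pipeline, wiring a chat model to a speech model through OpenAI's Python SDK (`pip install openai`); the specific model names here ("gpt-4o-mini", "tts-1", "alloy") are assumptions, and any LLM/TTS pairing would serve the same role.

```python
# Hypothetical chatbot-plus-voice pipeline: a chat model writes the adaptive
# script, a TTS model speaks it. Model names are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def reply_as_audio(user_line: str) -> bytes:
    # 1. Generate an adaptive, real-time scripted response.
    chat = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": user_line}],
    )
    script = chat.choices[0].message.content

    # 2. Render the script as natural-sounding speech.
    speech = client.audio.speech.create(model="tts-1", voice="alloy", input=script)
    return speech.content  # raw audio bytes (MP3 by default)

with open("reply.mp3", "wb") as f:
    f.write(reply_as_audio("Hey, is this really you on the phone?"))
```

Run in a loop, one turn per conversational exchange, this is what turns a static recording into something that can plausibly hold a phone call.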
Cloning a voice
Crafting a compelling high-quality deepfake, whether video or audio, is not the easiest thing to do. It requires a wealth of artistic and technical skills, powerful hardware and a fairly hefty sample of the target voice.
Apr23
Lil Durk has made it clear that all Artificial Intelligence deepfakes which are attempting to use his voice will never replace him as a warm-blooded, able-bodied superstar artist.
Speaking to HipHopDX about future technologies and his new NFT-centered phygital sneaker collection, NXTG3NZ, the OTF honcho said that although AI is going to change how people make music, it won’t replace humans.
“I heard them AI deep fakes usin’ my voice, it’s wild what tech be doin’,” Lil Durk told DX. “I think AI gon’ change how we make music, but ain’t nothin’ gonna replace the real deal, them raw vibes and emotions we bring. Just gotta make sure we use it right, ya know? Keep our essence alive.”
https://hiphopdx.com/news/lil-durk-ai-deepfake-wont-replace-him
Feb23
The audio streaming service said Wednesday in a press release that U.S. and Canadian users with premium subscriptions would first start getting access to the DJ that day. In the beginning, it will be "in beta" and in English, according to Spotify.
https://www.foxbusiness.com/technology/spotify-releasing-artificial-intelligence-dj-two-countries
Now music streaming giant Spotify is throwing its hat in the ring, with a new feature powered by its own personalization tech, as well as by voice and generative AI.
The company is launching a ‘DJ’ feature, which it says is like an “AI DJ in your pocket” and adds that it serves as “a personalized AI guide that knows you and your music taste so well that it can choose what to play for you”.
This feature is first rolling out in beta, and Spotify says it will deliver a curated playlist of music alongside commentary around the tracks and artists it thinks you will like.
Jan23
The emergence in the last week of a particularly effective voice synthesis machine learning model called VALL-E has prompted a new wave of concern over the possibility of deepfake voices made quick and easy (quickfakes, if you will). But VALL-E is more iterative than breakthrough, and its capabilities aren't as new as you might think. Whether that means you should be more or less worried is up to you.
Voice replication has been a subject of intense research for years, and the results have been good enough to power plenty of startups, like WellSaid, Papercup and Respeecher. The latter is even being used to create authorized voice reproductions of actors like James Earl Jones. Yes: from now on Darth Vader will be AI generated.
VALL-E, posted on GitHub by its creators at Microsoft last week, is a “neural codec language model” that uses a different approach to rendering voices than many before it. Its larger training corpus and some new methods allow it to create “high-quality personalized speech” using just three seconds of audio from a target speaker.
That is to say, all you need is an extremely short clip; Microsoft's paper includes example clips.
https://techcrunch.com/2023/01/12/vall-es-quickie-voice-deepfakes-should-worry-you-if-you-werent-worried-already/