Friday, January 13, 2023

Audio

apr24

Don’t play it by ear: Audio deepfakes in a year of global elections

From robocalls to voice clones, generative AI is allowing malicious actors to spread misinformation with ease.

Artificial intelligence company OpenAI recently introduced Voice Engine, a natural-sounding speech generator that uses text and a 15-second audio sample to create an “emotive and realistic” imitation of the original speaker.

OpenAI has not yet released Voice Engine to the public, citing concerns over the potential abuse of its generative artificial intelligence (AI) – specifically to produce audio deepfakes – which could contribute to misinformation, especially during elections.

Audio deepfakes and their uses

Audio deepfakes are generated with deep learning techniques: AI models are trained on large datasets of audio samples to learn the characteristics of human speech and reproduce it realistically. Audio deepfakes can be generated in two ways: text-to-speech (text is converted to audio) and speech-to-speech (an uploaded voice recording is re-synthesised in the targeted voice).
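The two generation modes can be sketched as minimal Python stubs. Every function and field name below is invented for illustration; no real model is called, and the actual synthesis step is precisely the part done by trained neural networks:

```python
# Illustrative stubs only: real systems run trained neural models.
# Every name and field here is invented for this sketch.

def text_to_speech(text: str, voice_sample: str) -> dict:
    """Text-to-speech mode: written text is converted to audio
    in the voice learned from voice_sample."""
    return {"mode": "text-to-speech", "source": text, "voice": voice_sample}

def speech_to_speech(recording: str, voice_sample: str) -> dict:
    """Speech-to-speech mode: an uploaded recording is re-synthesised
    so that it sounds like the target voice."""
    return {"mode": "speech-to-speech", "source": recording, "voice": voice_sample}

# Both modes need only a short sample of the target voice.
tts_fake = text_to_speech("Please wire the funds today.", "target_15s.wav")
sts_fake = speech_to_speech("impersonator_take.wav", "target_15s.wav")
```

The stubs only mirror the inputs and outputs of each mode; what distinguishes them is whether the content originates as text or as someone else's recording.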

Anyone can generate an audio deepfake. They are easier and cheaper to make than video deepfakes and simpler to disseminate on social media and messaging platforms.

Audio deepfakes have been used in cyber-enabled financial scams in which fraudsters impersonate bank customers to authorise transactions. The same technology is increasingly being used to propagate disinformation. Several audio deepfakes attempting to mimic the voices of politicians have circulated on social media. In 2023, artificially generated audio clips purported to capture UK Labour leader Keir Starmer berating party staffers. Although fact-checkers determined the audio was fake, it surpassed 1.5 million hits on X (formerly Twitter).

In India, voice cloning of children has been used to deceive parents into transferring money. In Singapore, deepfake videos containing voice clones of politicians such as the prime minister and deputy prime minister have been used in cyber-scams.

https://www.lowyinstitute.org/the-interpreter/don-t-play-it-ear-audio-deepfakes-year-global-elections


mar24

How the music industry is battling AI deepfakes one state at a time with the ELVIS Act

In an in-depth interview, Recording Academy advocacy and public policy chief officer Todd Dupler explains how the ELVIS Act could combat the misuse of a person’s voice, image and likeness using AI.
https://cointelegraph.com/news/how-music-industry-battling-ai-deepfakes


nov23
Recent advances in generative artificial intelligence have spurred developments in realistic speech synthesis. While this technology has the potential to improve lives through personalized voice assistants and accessibility-enhancing communication tools, it has also led to the emergence of deepfakes, in which synthesized speech can be misused to deceive humans and machines for nefarious purposes.
https://source.wustl.edu/2023/11/defending-your-voice-against-deepfakes/

sep23

Unraveling the Deepfake Deception

In a startling revelation, Dr. Marco Vinicio Boza, a leading specialist in Intensive Care, disclosed that deceitful individuals employed artificial intelligence to craft videos imitating his likeness and voice. Their objective? To spread misleading messages and engage in fraudulent activities endangering public health.

The Deception Deepens

Recent weeks have seen the circulation of these counterfeit videos, expertly edited using artificial intelligence, purporting Dr. Boza’s endorsement of a product touted to dissolve blood clots. But the doctor isn’t staying silent. He denounced the video, explaining its misuse by swindlers selling units of the bogus medicine for ¢50,000 in the country’s northern region.
https://www.costaricantimes.com/beware-the-deepfakes-renowned-doctors-voice-mimicked-in-costa-rica-to-peddle-fake-medicines/74804


sep23

The spread of content produced with machine-learning techniques has reached alarming proportions, and these fakes are extremely convincing. The moment calls for technologies capable of identifying them, such as voice biometrics.

Voice and face deepfakes are made possible by algorithms that synthesise or alter elements in existing images and videos, replacing faces and voices and even creating entirely fictional scenes.
https://startupi.com.br/deepfake-biometria-de-voz/
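The voiceprint-matching idea behind voice biometrics can be sketched as follows. The embedding vectors and threshold below are invented for illustration; real systems compute speaker embeddings with trained verification models:

```python
import math

# Sketch of voiceprint matching: a caller is accepted only if their
# voice embedding is close enough to the enrolled voiceprint.
# All vectors and the threshold are invented for this illustration.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def matches_voiceprint(enrolled, sample, threshold=0.9):
    """Accept the sample only if it is close to the enrolled voiceprint."""
    return cosine_similarity(enrolled, sample) >= threshold

enrolled = [0.62, 0.10, 0.45, 0.33]  # stored voiceprint (invented values)
live     = [0.60, 0.12, 0.44, 0.35]  # genuine caller (invented values)
clone    = [0.10, 0.80, 0.05, 0.55]  # synthetic voice (invented values)
```

In practice the interesting work is in producing embeddings that place a high-quality clone far from the genuine voiceprint; the comparison step itself is simple.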


jun23

Meta has another new AI model on the docket, and this one seems perfectly engineered for the land of tomorrow, if that utopian future is filled with nothing but deepfakes and modified audio. Like AI image generators, Voicebox generates synthetic voices from a simple text prompt, trained on sound from thousands of audiobooks.

On Friday, Meta announced its new Voicebox AI that can create voice clips using simple text prompts. In a video, CEO Mark Zuckerberg shared on his Facebook and Instagram, he said the Voicebox AI model can take a text prompt and read it in a variety of human, though somewhat digital-sounding, voices. Otherwise, Voicebox can also modify audio to remove unwanted noises from voice clips, like a dog barking in the background. Unlike many other AI voice synthesization models, Meta’s AI can create audio in languages other than English, including French, Spanish, German, Polish, and Portuguese, and the company said the AI can effectively translate any passage from one language to another, while keeping the same voice style.
https://gizmodo.com/meta-help-people-craft-more-deepfakes-with-voicebox-a-1850548158


may23

Music Deepfakes: Are AI imitations a creative art or a new kind of appropriation?

Drake, Grimes, The Weeknd and Holly Herndon are all part of the AI vocalist boom – and they all seem to have different perspectives.

https://musictech.com/features/music-deepfakes-ai-drake-grimes-weeknd/


apr23

What Is Deepfake Music? And How Is It Created?

Deepfake music mimics the style of a particular artist, including their voice. How is it possible for it to sound so real?
https://www.makeuseof.com/what-is-deepfake-ai-music/


apr23
Voice deepfakes use AI algorithms to create audio clips that sound like a specific person, even if that person never spoke the words. This technology can be used to create fake audio recordings of public figures or even to manipulate audio evidence in legal proceedings. However, the question remains: are these AI-generated voice deepfakes any good?
https://www.techcityng.com/ai-generated-voice-deepfakes-are-comical-but-are-they-any-good/

apr23
  • Songs made with generative AI are infiltrating streaming services.
  • One Spotify user was recommended the same song under 49 different names and suspects AI is behind it.
  • While the songs sound exactly the same, each has a different title, artist, and art.
https://www.businessinsider.com/ai-mystery-spotify-song-49-different-titles-artists-art-music-2023-4

apr23

Even worse, chatbots like ChatGPT are starting to generate realistic scripts with adaptive real-time responses. By combining these technologies with voice generation, a deepfake goes from being a static recording to a live, lifelike avatar that can convincingly have a phone conversation.

Cloning a Voice

Crafting a compelling high-quality deepfake, whether video or audio, is not the easiest thing to do. It requires a wealth of artistic and technical skills, powerful hardware and a fairly hefty sample of the target voice.

There are a growing number of services offering to produce moderate- to high-quality voice clones for a fee. Some voice deepfake tools need a sample only a minute long, or even just a few seconds, to produce a clone convincing enough to fool a stranger. To fool a loved one – for example, in an impersonation scam – would likely take a significantly larger sample.
https://businessmirror.com.ph/2023/04/12/voice-deepfakes-are-calling-heres-what-they-are-and-how-to-avoid-getting-scammed/



apr23

Lil Durk has made it clear that Artificial Intelligence deepfakes attempting to use his voice will never replace him as a warm-blooded, able-bodied superstar artist.

Speaking to HipHopDX about future technologies and his new NFT-centered phygital sneaker collection, NXTG3NZ, the OTF honcho said that although AI is going to change how people make music, it won’t replace humans.

“I heard them AI deep fakes usin’ my voice, it’s wild what tech be doin’,” Lil Durk told DX. “I think AI gon’ change how we make music, but ain’t nothin’ gonna replace the real deal, them raw vibes and emotions we bring. Just gotta make sure we use it right, ya know? Keep our essence alive.”

https://hiphopdx.com/news/lil-durk-ai-deepfake-wont-replace-him


feb23

The audio streaming service said Wednesday in a press release that U.S. and Canadian users with premium subscriptions would first start getting access to the DJ that day. In the beginning, it will be "in beta" and in English, according to Spotify.

https://www.foxbusiness.com/technology/spotify-releasing-artificial-intelligence-dj-two-countries

feb23

Now music streaming giant Spotify is throwing its hat in the ring, with a new feature powered by its own personalization tech, as well as by voice and generative AI.

The company is launching a ‘DJ’ feature, which it says is like an “AI DJ in your pocket” and adds that it serves as “a personalized AI guide that knows you and your music taste so well that it can choose what to play for you”.

This feature is first rolling out in beta, and Spotify says it will deliver a curated playlist of music alongside commentary around the tracks and artists it thinks you will like.
https://www.musicbusinessworldwide.com/spotify-just-launched-a-personalized-dj-powered-by-generative-and-voice-ai/

jan23

The emergence in the last week of a particularly effective voice synthesis machine learning model called VALL-E has prompted a new wave of concern over the possibility of deepfake voices made quick and easy — quickfakes, if you will. But VALL-E is more iterative than breakthrough, and the capabilities aren’t so new as you might think. Whether that means you should be more or less worried is up to you.

Voice replication has been a subject of intense research for years, and the results have been good enough to power plenty of startups, like WellSaid, Papercup and Respeecher. The latter is even being used to create authorized voice reproductions of actors like James Earl Jones. Yes: from now on, Darth Vader will be AI generated.

VALL-E, posted on GitHub by its creators at Microsoft last week, is a “neural codec language model” that uses a different approach to rendering voices than many before it. Its larger training corpus and some new methods allow it to create “high-quality personalized speech” using just three seconds of audio from a target speaker.

That is to say, all you need is an extremely short clip of the target speaker (sample clips are included in Microsoft’s paper).
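VALL-E’s core trick is treating audio as a sequence of discrete codec tokens that a language model can predict. As a crude stand-in for a learned neural codec, the mu-law companding below quantises waveform samples into 256 discrete tokens and back; the constants and tolerances are illustrative only, not VALL-E’s actual codec:

```python
import math

# Toy stand-in for a neural audio codec: mu-law companding quantises a
# waveform sample in [-1, 1] to one of 256 discrete tokens. Real codecs
# (like the one VALL-E builds on) learn such tokenisations with neural
# networks; mu-law only illustrates the "audio as tokens" idea.

MU = 255

def encode(x: float) -> int:
    """Map a sample in [-1, 1] to a token in 0..255."""
    y = math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)
    return int((y + 1) / 2 * MU + 0.5)

def decode(token: int) -> float:
    """Map a token back to an approximate sample value."""
    y = 2 * token / MU - 1
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# A model like VALL-E predicts sequences of such tokens, conditioned on
# the text and ~3 seconds of prompt tokens, then decodes them to audio.
samples = [0.0, 0.25, -0.5, 0.9]
tokens = [encode(s) for s in samples]
roundtrip = [decode(t) for t in tokens]
```

Once audio is a token sequence, voice cloning reduces to the familiar language-modelling problem of continuing a prompt, which is why such a short enrollment clip suffices.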

https://techcrunch.com/2023/01/12/vall-es-quickie-voice-deepfakes-should-worry-you-if-you-werent-worried-already/
