How Voice Cloning Using AI Works and How It Can Lead to Identity Theft

By Dennis Shelly

It has been almost one year since my mother passed away, and every day I wish I could have one more conversation with her and say “I love you” just one more time. With AI-based voice cloning and a brief recording, it is possible to clone anyone’s voice. That is an example of Artificial Intelligence (AI) put to good use, but this ever-developing technology also carries risks.

Black-hat hackers using this AI-based technology pose an increasing risk to both individuals and businesses. There have been attempts to control AI, but cybercriminals keep developing new strategies for eluding the authorities, aiming to carry out their attacks without ever being identified by security experts. AI voice cloning is a type of “deepfake” that can generate new audio content resembling a person’s real voice from only a brief set of recordings. As the technology advances, it will open new opportunities for cybercriminals and other malicious actors.

In this article, we will explain what AI-based voice cloning is, how it works, and how it can lead to identity theft.

What Is Voice Cloning Using AI and How Does It Work?

The concept of “voice cloning” involves simulating or “cloning” a person’s voice artificially. Modern AI software can produce synthetic speech that sounds very similar to a specific human voice, and in most cases an ordinary listener cannot tell the real voice from the synthetic one. With the help of voice cloning, cybercriminals can produce fake sound clips or vocal instructions that mimic a person’s voice, which can result in identity theft, spoofed calls, and phishing emails. Voice cloning has already claimed its first victim: The Wall Street Journal reported that the unnamed CEO of a UK-based energy company was defrauded of €220,000 ($236,665) by an AI-powered deepfake of the voice of his German boss. Over three phone calls, the scammer used artificial intelligence (AI) to imitate the accent of the parent company’s president, persuading the victim to transfer money to a Hungarian supplier’s account.

How Are AI-Enabled Voice Clones Created?

If you are familiar with the concept of a video deepfake, AI voice cloning software is its speech equivalent. Using an online voice cloning tool, almost anyone can train a model on as little as a few minutes of speech recordings and then have it read any written text in the cloned voice. The process is now considerably simpler thanks to a range of neural network-powered tools, such as Google’s Tacotron and WaveNet, or Lyrebird, which enable almost any voice to be duplicated and used to “read” text input. Output quality is also steadily rising, because neural network-based text-to-speech (TTS) models, which are loosely modeled on the way the brain works, are highly effective at recognizing patterns in data. While there are several methods for incorporating deep learning into synthetic voices, most improve word pronunciation while also capturing minute details like pace and intonation to produce speech that sounds more like a real person.

How Cybercriminals can use Voice Clones

Voice-based biometric spoofing

Voice is a distinctive identifier and a trusted biometric security indicator. However, voice biometric systems can be tricked into believing they are hearing the authentic, legitimate user. Cybercriminals can use presentation attacks, such as recorded voice, computer-altered voice, synthetic voice, or voice cloning, to gain access to confidential data and financial accounts.

Fake News and Misinformation

Fake news and other types of misinformation represent a significant threat, and most of us are aware of how manipulated videos can change the political situation. Text-to-speech technologies powered by AI will accelerate efforts to influence public opinion, collect fake campaign contributions, and defame public figures. On the business side, think about how misrepresented remarks by public figures or executives could influence the stock market.

Phishing Scams

Online AI voice cloning software also enables a new type of phishing scam that takes advantage of the victim’s belief that they are speaking to a reliable source. These scams are an evolution of executive email spoofing schemes, in which the goal is to get the recipient to provide sensitive information such as passwords, bank account numbers, and credit card information. Scammers are now using voicemail and phone calls, armed with voice clones, and the attacks threaten individuals as well as companies. A new variation of the “grandma scam” involves fraudsters pretending to be relatives in need of emergency financial assistance.

Fake Evidence and Blackmailing

Deepfakes, such as synthetic voices, may be used to fabricate evidence that affects criminal prosecutions. Although precautions have been taken to verify audio and video evidence produced in court, it may be difficult to prevent these strategies from affecting testimony based on what witnesses claim to have seen or heard. Online bullying, and threats to publish fake, embarrassing footage unless victims pay a fee, might also rely on manipulated video and audio of people doing or saying things they never did or said.

To Conclude

As voice technology improves, having technology that can detect and prevent the use of fake voices for fraud and deceit becomes increasingly important. Voice anti-spoofing, also known as voice liveness detection, is a technique that can distinguish between a live voice and an altered or synthetic voice. Many of today’s fakes are undetectable to the human ear, but AI-based software trained to detect features that aren’t present in a live voice can detect them. Technologies that identify AI voice cloning were initially developed to address the issue of voice biometric spoofing. Anti-spoofing technology checks that the voice is live, whereas voice biometrics match a person’s voice to the voice template on file. The technology will continue to evolve to solve new challenges as voice cloning fraud becomes more prevalent.
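The division of labor between the two checks can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not a description of any real product: the liveness score is presumed to come from a separate anti-spoofing model, the voice embeddings from a separate speaker model, and the function name `verify_caller` and both thresholds are hypothetical.

```python
import math

# Hypothetical thresholds chosen for illustration only; a real system
# would tune these against measured false-accept and false-reject rates.
LIVENESS_THRESHOLD = 0.5
MATCH_THRESHOLD = 0.8


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_caller(liveness_score, sample_embedding, enrolled_template):
    """Run the liveness check first, then the biometric match.

    liveness_score: assumed output of an anti-spoofing model, in [0, 1],
        where higher means "more likely a live human voice".
    sample_embedding / enrolled_template: assumed voice feature vectors
        for the incoming call and for the template on file.
    """
    # Step 1: anti-spoofing. A cloned voice may match the template well,
    # so a suspected spoof is rejected before any matching is attempted.
    if liveness_score < LIVENESS_THRESHOLD:
        return "rejected: possible spoof"

    # Step 2: voice biometrics. Compare the live sample to the template.
    if cosine_similarity(sample_embedding, enrolled_template) < MATCH_THRESHOLD:
        return "rejected: voice mismatch"

    return "accepted"
```

The ordering is the point of the sketch: a good clone could pass the template match on its own, so the liveness check has to act as a gate in front of it.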

Have more questions about AI, voice cloning, or protecting yourself from these new types of threats? We can help! Our Eggsperts are eggcellent in the newest technologies and are standing by. Please contact us by visiting our website at www.eggheadit.com, by calling (760) 205-0105, or by emailing us at tech@eggheadit.com with your questions or suggestions for our next article.

IT | Networks | Security | Voice | Data