Artificial intelligence has just learned to imitate the voice of any person. It needs three seconds

Researchers at Microsoft have developed an artificial intelligence model that can almost perfectly impersonate the voice of a living person; the work is described in a paper posted to arXiv, the preprint repository run by Cornell University. The project goes by the name VALL-E and is still a work in progress, but its capabilities are already impressive.

Artificial intelligence impersonates anyone's voice from a three-second sample

According to the published paper, the scientists needed 60,000 hours of recordings of human speech in English to train the artificial intelligence. That is hundreds of times more than is used in comparable speech synthesis projects.

In this way they were able to create a model that not only generates human speech, but can accurately reproduce the timbre of another person's voice, match the appropriate intonation, and imitate the speaker's emotions quite convincingly.

The scientists say that this allowed them to build a system that generates the most natural-sounding, voice-like speech yet. They add that for the artificial intelligence to generate a “fake” utterance, it only needs to analyze a voice sample just three seconds long.

The sample recordings make a big impression. VALL-E does not always perform perfectly, but it imitates several example speakers naturally and quite accurately. It handles male and female voices equally well, although some of the synthetic utterances sound a bit drained of emotion. Even so, telling which of the voices belongs to a real speaker is not easy.

What can VALL-E be used for? Not only for positive purposes

It’s not hard to imagine many practical applications for VALL-E once it is refined. It could serve as a speech synthesizer that restores the voice of people who have lost the ability to speak, or stand in for a real voice actor in films and audiobooks.

For now, however, Microsoft is not making the system publicly available, partly because of the risk of misuse. VALL-E could be used to fabricate a statement by a well-known person (e.g. a politician) or to mimic the voice of someone the victim knows as part of a scam (e.g. extortion).

Source: Gazeta
