Microsoft’s new AI simulates anyone’s voice from a three-second sample

VALL-E is a new text-to-speech AI model developed by Microsoft researchers. It can closely simulate a person’s voice when given a three-second audio sample. Once it learns a specific voice, VALL-E can synthesize audio of that person saying anything, and it attempts to preserve the speaker’s emotional tone. Microsoft calls VALL-E a “neural codec language model,” and it builds on a technology called EnCodec. Its creators speculate that VALL-E could be used for high-quality text-to-speech applications, for speech editing, in which a recording of a person is altered by editing its text transcript, and for audio content creation when combined with other generative AI models.

Image by Microsoft

Microsoft trained VALL-E’s speech-synthesis capabilities on an audio library called LibriLight. It contains 60,000 hours of English-language speech from more than 7,000 speakers, mostly pulled from LibriVox public domain audiobooks. Some VALL-E results sound computer-generated, but others could be mistaken for human speech, which is the goal of the model. I wonder if there’s a Singaporean accent in the library.

However, given the potential risks of misuse, Microsoft has not provided VALL-E code for others to experiment with. The researchers are aware of the potential social harm that this technology could bring and address it in the conclusion of their paper. OK, I guess that’s one less thing to worry about now.

j0n

Feel free to drop us a message at hello@geekbytes.co if our news is wrong or inaccurate.
