At the start of 2023, Microsoft positioned itself in the field of artificial intelligence by launching VALL-E, a speech synthesis tool capable of reproducing a human voice from a recording of only three seconds. The company also wants to integrate ChatGPT into its Bing search engine and to invest $10 billion in OpenAI in order to bring AI tools into the Office suite.
Presentation of VALL-E
VALL-E is a speech synthesis tool developed by Microsoft that can reproduce a human voice from a recording of only three seconds. The developers trained their model on 60,000 hours of English speech, which is "hundreds of times larger than existing systems". VALL-E is able to preserve the speaker's emotion and the acoustic environment of the recording in the synthesized speech. The more samples it is given, the more accurate the recreated voice becomes.
Possible applications of VALL-E
On the project's GitHub page, the developers of VALL-E envision many applications for this voice synthesis. VALL-E directly enables various speech synthesis applications, such as zero-shot text-to-speech (TTS), speech editing, and content creation in combination with other generative AI models like GPT-3. However, it is important to note that VALL-E could also be used for less honest purposes, such as deepfakes.
Concerns about using VALL-E
As this technology advances, it is important to ask questions about the ethical implications of using VALL-E. Although VALL-E is not currently available to the public, Microsoft has not yet put anything in place to prevent these problems. It is therefore important to consider the potential consequences of this technology before it is released, since it could be used for malicious purposes, and to ensure that AI technologies are used responsibly and ethically.