AI trained on YouTube and podcasts speaks with ums and ahs


AI can generate more natural artificial speech by including pauses


Generating speech with varied rhythms and pauses makes it sound more human-like, according to an assessment of an AI trained on speech taken from YouTube and podcasts.

Most AI text-to-speech systems are trained on data sets of rehearsed speech, which can lead to stilted, one-dimensional output. Natural speech, by contrast, uses a wide range of rhythms and patterns to express different meanings and emotions.

Now, Alexander Rudnicky at Carnegie Mellon University in Pittsburgh, Pennsylvania,…

