In AuthoringAI, you have more than one way to translate spoken words. You can either translate your slide notes and then generate speech, or translate speech to speech directly.
Each approach has its advantages and disadvantages. Explore below to choose the best option for your current project.
Translate slide notes, then Generate speech
Best for: Generated audio and video; recorded audio
In this approach, you first use the Translate text feature to translate your slide notes. You then use Generate narration (audio) to generate speech, choosing a voice that matches the language of the translated text.
Advantages
- Natural Pacing - the generated audio follows the natural pacing of the language and the selected voice.
- Native Speakers - you can choose from our native voices, which narrate the translated text with the correct accent for the language.
- Editable Translation - if our machine translation mistranslates the slide notes, you can correct them and regenerate the audio.
Disadvantages
- Not the Original Speaker - unless you have a custom voice that matches the original speaker, the generated audio will not sound like the original speaker.
- No Recorded Video Support - this approach cannot translate a recorded video, because the original audio track embedded in the video is not replaced and the original timings are not maintained.
Even if your slides have recorded audio, a single-speaker narration will likely translate better with this approach: first transcribe the narration from the source spoken track, then translate the slide notes, and finally generate the narration with a native speaker's voice in the target language.
Translate speech to speech
Best for: Recorded video; multi-speaker audio and video
In this approach, you use the Translate audio feature to translate speech to speech directly. It transcribes the speech, translates it, and generates a spoken track in the new language that matches the timing and overall duration of the original message. The result is similar to a dubbed movie.
Because this approach focuses on matching the original pacing, and because it can clone and reuse voices when multiple people are speaking, it works best for recorded video or anything with multiple speakers.
Advantages
- Recorded Audio and Video Support - this speech-to-speech approach directly translates an audio file, replacing the original audio track with one in the new language.
- Matches Duration - the translated message has the same duration as the original language track and stays in sync with it, even if the new speech has to be sped up to achieve this.
- Cloned Voices - this approach creates facsimiles of the original voices and uses them in the translated audio, which is useful in tracks with multiple speakers.
Disadvantages
- Not Editable - if the machine translation mistranslates the message, there is no way to modify the translation.
- Unnatural Pacing - if it takes more syllables to say the same thing in the translated language, the pacing will speed up to fit within the original duration.
- Non-native Speakers - because the voices are those of the original speakers, it will sound as if the original speakers are speaking the new language, without the accent or tone typical of that language.
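To get a feel for the Unnatural Pacing trade-off, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `required_speedup` and the example durations are assumptions made for this sketch, and they do not describe how AuthoringAI computes pacing internally.

```python
def required_speedup(original_seconds: float, natural_translated_seconds: float) -> float:
    """Return how much faster the translated speech must play to fit the original duration.

    Hypothetical helper for illustration only; not part of AuthoringAI.
    A result above 1.0 means the translated speech has to be sped up.
    """
    if original_seconds <= 0 or natural_translated_seconds <= 0:
        raise ValueError("durations must be positive")
    return natural_translated_seconds / original_seconds


# Assumed example: a 10-second sentence that would naturally take
# 12.5 seconds to say in the translated language.
factor = required_speedup(10.0, 12.5)
print(f"Speech must play {factor:.2f}x faster to stay in sync")
```

The larger the factor, the more rushed the translated track is likely to sound. If that matters for your project, the slide notes approach avoids the constraint because the generated audio is free to run at its natural length.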
Still unsure which to choose?
Watch this Translate Audio Brainshark presentation from when this feature was first introduced. It demonstrates some of the advantages and disadvantages of the Translate Audio approach. The video in each language is the same duration as the original video, and the narrator's voice does not sound like a native speaker in the translated languages.
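The "Best for" guidance in this article can also be condensed into a rule of thumb, shown here as a short Python sketch. The function `suggest_approach` and its parameters are hypothetical names chosen for this illustration, not an AuthoringAI feature, and the rule only restates the recommendations above.

```python
def suggest_approach(recorded_video: bool, multiple_speakers: bool) -> str:
    """Suggest a translation approach based on the 'Best for' guidance in this article.

    Hypothetical helper for illustration only; not part of AuthoringAI.
    """
    # Recorded video and multi-speaker content need matched timing or cloned voices.
    if recorded_video or multiple_speakers:
        return "Translate speech to speech"
    # Generated narration, or single-speaker recorded audio, benefits from
    # editable translations and native voices.
    return "Translate slide notes, then Generate speech"


print(suggest_approach(recorded_video=False, multiple_speakers=False))
print(suggest_approach(recorded_video=True, multiple_speakers=False))
```

Treat the result as a starting point: a single-speaker recording where keeping the original voice matters more than natural pacing may still be a better fit for the speech-to-speech approach.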