AuthoringAI: Translate Audio Options

In AuthoringAI, there is more than one way to translate audio. You can either:

  1. Translate slide notes then generate audio
  2. Translate audio to audio 


Each approach has its advantages and disadvantages. Explore below to choose the best option for your current project.


Option 1: Translate Slide Notes then Generate Audio

In this approach, you first use the Translate text feature to translate your slide notes. You then use the Generate audio feature, with one of the voices provided, to narrate the translated text.


Best for: Audio only (not video)

Advantages:

  • Natural Pacing - the generated audio follows the natural pacing of the language and the selected voice.
  • Native Speakers - native voices narrate the translated text with the correct accent.
  • Editable Translation - if the machine translation mistranslates the slide notes, you can modify them.

Disadvantages:

  • Not the Original Speaker - unless you have a custom voice that matches the original speaker, the narration will not sound like the original speaker.
  • No Video Support - this approach cannot translate video, because the original audio track embedded in the video is not replaced and the original timings are not maintained.


Option 2: Translate Audio to Audio 

In this approach, you use the Translate audio feature to directly translate the slide audio or the audio within a video. The translated audio uses the voices of the original speakers and matches the timing and overall duration of the original audio.


Best for: Videos & multi-voice audio tracks

Advantages:

  • Video Support - this approach supports video translation, replacing the original audio track embedded in the video and maintaining the original duration and timing.
  • Maintains Timing - the translated message stays in sync with what is happening in a video, or with the original animation timings, even if speech has to be sped up for the new language.
  • Cloned Voices - this approach creates facsimiles of the original voices and uses them in the translated audio, which is useful for video and for audio with multiple speakers.

Disadvantages:

  • Not Editable - if the machine translation mistranslates the message, there is no way to modify the translation.
  • Unnatural Pacing - if it takes more syllables to say the same thing in the translated language, the speech is sped up to fit within the original timings and duration.
  • Non-native Speakers - because the voices are those of the original speakers, it will sound as if the original speakers are speaking the new language, without the accent or tone typical of that language.
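
To get a rough feel for the Unnatural Pacing trade-off, the arithmetic can be sketched in Python. This is only an illustration and not how AuthoringAI computes anything internally: the function names, the syllable count, and the speaking rate are all hypothetical, and the sketch assumes the translated speech would be sped up uniformly to fit the original duration.

```python
def natural_duration(syllables: int, syllables_per_second: float) -> float:
    """Estimate how long a passage takes at a natural speaking rate (assumed values)."""
    if syllables < 0 or syllables_per_second <= 0:
        raise ValueError("syllables must be >= 0 and rate must be > 0")
    return syllables / syllables_per_second


def required_speedup(original_seconds: float, translated_seconds: float) -> float:
    """Return the playback-rate multiplier needed to fit the translated speech
    into the original duration. A value of 1.0 or less means no speed-up is needed."""
    if original_seconds <= 0 or translated_seconds <= 0:
        raise ValueError("durations must be positive")
    return translated_seconds / original_seconds


# Hypothetical example: the original narration lasts 10 seconds, and the
# translation has 60 syllables spoken at an assumed 5 syllables per second.
translated = natural_duration(60, 5.0)        # 12 seconds at natural pace
factor = required_speedup(10.0, translated)   # about 1.2x faster than natural
print(f"Speed-up needed: {factor:.2f}x")
```

If the factor for your content looks high, the translated speech is likely to sound rushed, and Option 1 may be a better fit when exact timing is not required.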