1. Convert subtitle or script text into narration segments
2. Select a voice model and speaking parameters for each segment
3. Synthesize audio locally for each narration block
4. Align generated narration with timeline timestamps
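
The four steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: it assumes SRT-style subtitle input, and the names `Segment`, `pick_parameters`, `fake_synthesize`, `align`, and `build_timeline`, the `WORDS_PER_SECOND` pace, and the `MAX_RATE` cap are all hypothetical. No real speech engine is called; `fake_synthesize` only estimates a clip duration and marks where a local TTS call would go.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

WORDS_PER_SECOND = 2.5  # assumed speaking pace at rate 1.0 (hypothetical)
MAX_RATE = 1.5          # assumed upper bound on speed-up (hypothetical)

SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,000
Hello there.

2
00:00:03,500 --> 00:00:05,000
This is the second line.
"""

_TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


@dataclass
class Segment:
    index: int
    start: float  # seconds on the timeline
    end: float
    text: str
    voice: str = "default"
    rate: float = 1.0


def to_seconds(stamp: str) -> float:
    """Convert an 'HH:MM:SS,mmm' timestamp to seconds."""
    match = _TIME.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    h, m, s, ms = map(int, match.groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(srt: str) -> list[Segment]:
    """Step 1: split subtitle text into narration segments."""
    segments = []
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # skip malformed cues
        start, end = (to_seconds(t) for t in lines[1].split("-->"))
        text = " ".join(line.strip() for line in lines[2:])
        segments.append(Segment(int(lines[0]), start, end, text))
    return segments


def pick_parameters(seg: Segment, voice: str = "narrator") -> Segment:
    """Step 2: choose a voice and a rate that fits the cue's time slot."""
    seg.voice = voice
    estimated = len(seg.text.split()) / WORDS_PER_SECOND
    slot = seg.end - seg.start
    seg.rate = min(MAX_RATE, max(1.0, estimated / slot)) if slot > 0 else 1.0
    return seg


def fake_synthesize(seg: Segment) -> float:
    """Step 3 placeholder: a local TTS call would go here.

    Returns only an estimated clip duration in seconds.
    """
    return len(seg.text.split()) / (WORDS_PER_SECOND * seg.rate)


def align(segments: list[Segment], durations: list[float]) -> list[tuple[int, float, float]]:
    """Step 4: place each clip at its cue start, pushing it later on overlap."""
    placed = []
    cursor = 0.0
    for seg, duration in zip(segments, durations):
        start = max(seg.start, cursor)
        cursor = start + duration
        placed.append((seg.index, start, cursor))
    return placed


def build_timeline(srt: str) -> list[tuple[int, float, float]]:
    segments = [pick_parameters(s) for s in parse_srt(srt)]
    durations = [fake_synthesize(s) for s in segments]
    return align(segments, durations)
```

Calling `build_timeline(SAMPLE_SRT)` returns one `(index, start, end)` tuple per cue. In this sketch a clip that would overlap the previous one is shifted later rather than truncated; a real pipeline might instead raise the rate further or trim silence.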