1. Extract on-screen text from slides using OCR
2. Detect slide boundaries and active slide regions
3. Feed slide text as contextual guidance to the speech recognition engine
4. Align transcribed speech with slide-level structure
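
The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual implementation: OCR (step 1) is assumed to have already produced `(timestamp, text)` pairs per sampled video frame, and the speech recognizer is assumed to return `(start, end, text)` segments. The names `Slide`, `detect_slides`, `context_terms`, and `align`, the text-similarity boundary heuristic, and the 0.6 threshold are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Slide:
    start: float  # seconds
    end: float    # seconds
    text: str     # OCR text of the slide


def detect_slides(frames, duration, threshold=0.6):
    """Step 2 (assumed heuristic): group OCR'd frames into slides.

    A new slide starts when a frame's text is less similar than
    `threshold` to the current slide's text.
    """
    slides = []
    for ts, text in frames:
        if slides and SequenceMatcher(None, slides[-1].text, text).ratio() >= threshold:
            continue  # still on the same slide
        if slides:
            slides[-1].end = ts  # close the previous slide at this boundary
        slides.append(Slide(start=ts, end=duration, text=text))
    return slides


def context_terms(slide, min_len=4):
    """Step 3 (assumed form of guidance): distinct longer words from a
    slide, which could be passed to a recognizer as hint phrases."""
    terms = []
    for word in slide.text.split():
        w = word.strip(".,:;()").lower()
        if len(w) >= min_len and w not in terms:
            terms.append(w)
    return terms


def align(segments, slides):
    """Step 4: assign each speech segment to the slide whose time span
    contains the segment's midpoint (last slide as a fallback)."""
    aligned = []
    for start, end, text in segments:
        mid = (start + end) / 2
        idx = next(
            (i for i, s in enumerate(slides) if s.start <= mid < s.end),
            len(slides) - 1,
        )
        aligned.append((idx, text))
    return aligned


# Hypothetical inputs: OCR output per sampled frame and recognizer segments.
frames = [
    (0.0, "Introduction to Neural Networks"),
    (5.0, "Introduction to Neural Networks"),
    (10.0, "Backpropagation: chain rule and gradients"),
    (15.0, "Backpropagation: chain rule and gradients"),
]
segments = [
    (1.0, 4.0, "today we cover neural networks"),
    (11.0, 14.0, "backpropagation applies the chain rule"),
]

slides = detect_slides(frames, duration=20.0)
hints = [context_terms(s) for s in slides]
aligned = align(segments, slides)
```

How the hint terms reach the recognizer depends on the engine in use, so that call is deliberately left out here. Midpoint-based alignment is one simple option; a segment that straddles a slide change could instead be split at the boundary.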