
How to Remove Burned-in Subtitles from Video (Offline & Batch)

TL;DR: Executive Summary

  • Best For: Content creators, editors, and archivists needing to clean multiple videos securely.
  • Typical Results: 85–90/100 quality. Perfect for simple backgrounds; expect minor blurring on complex motion.
  • Time Required: ~2 minutes setup + automated processing time (faster than real-time on GPU).
  • Key Constraint: Cannot restore original pixels behind text perfectly (it's inpainting, not magic).

Burned-in subtitles (or open captions) are permanent. Unlike soft subtitles that can be toggled off, these are baked into the video pixels. Removing them requires inpainting—a process of erasing the text and algorithmically reconstructing the background. This guide covers a professional, offline-first workflow to remove burned-in subtitles without uploading your footage to the cloud.


The Challenge: Softsubs vs. Hardsubs

Before attempting removal, confirm what you are dealing with. "Removal" means different things for different types.

Soft Subtitles (Softsubs)

Text exists as a separate subtitle track (SRT, VTT, ASS) inside the MKV/MP4 container.

✓ Solution: Turn them off.

Check: Can you select the text with your mouse? Can you toggle "CC" in your player?

Hard Subtitles (Burned-in)

Text is part of the image. The original background pixels are destroyed.

✓ Solution: Inpainting (Removal).

Check: Does the text scale with the video? Does it persist even when "No Subtitles" is selected?
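
The softsub check can also be done from the command line by asking ffprobe (part of FFmpeg, not EchoSubs) whether the container carries any subtitle streams. The following is a minimal sketch, assuming ffprobe is installed and on your PATH; the helper names are illustrative only.

```python
import json
import subprocess


def ffprobe_subtitle_cmd(path):
    """Build an ffprobe command that lists only subtitle streams as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "s",
        "-show_entries", "stream=index,codec_name",
        "-of", "json",
        path,
    ]


def parse_subtitle_streams(ffprobe_json):
    """Return (index, codec_name) pairs from ffprobe's JSON output."""
    data = json.loads(ffprobe_json)
    return [(s.get("index"), s.get("codec_name")) for s in data.get("streams", [])]


def has_soft_subtitles(path):
    """True if the file contains at least one subtitle track (softsubs)."""
    result = subprocess.run(
        ffprobe_subtitle_cmd(path), capture_output=True, text=True, check=True
    )
    return len(parse_subtitle_streams(result.stdout)) > 0
```

An empty result only tells you there is no track to switch off. A file can also carry a soft track and burned-in text at the same time, so the visual check still applies.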

Step-by-Step Workflow (Offline)

We use a local processing engine (EchoSubs) to ensure privacy and high throughput. This avoids the file size limits and privacy risks of online converters.

1. Import & Analyze

Drag your video files into the application. The system supports batch import, so you can load an entire season of content at once.

2. Define Detection Region

Don't scan the whole frame. Set a Region of Interest (ROI)—typically the bottom 20% of the screen.

  • Why: Reduces processing time and prevents accidental removal of on-screen text you want to keep (like signs or titles).
  • Tip: Ensure the bounding box covers the tallest subtitle line (e.g., two-line dialogue).
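
To make the "bottom 20%" rule concrete, here is a small sketch that converts a fraction of the frame into a pixel rectangle. The function name and the (x, y, width, height) layout are assumptions for illustration, not EchoSubs settings.

```python
def bottom_roi(frame_width, frame_height, fraction=0.20):
    """Return (x, y, width, height) of a band covering the bottom `fraction` of the frame."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    roi_height = round(frame_height * fraction)
    return (0, frame_height - roi_height, frame_width, roi_height)


print(bottom_roi(1920, 1080))  # (0, 864, 1920, 216)
```

If two-line dialogue reaches above that band, raise the fraction (for example to 0.25) rather than scanning the whole frame.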

3. Choose Removal Parameters

Select your detection sensitivity.

  • Standard Mode: Best for high-contrast white text with black borders.
  • Aggressive Mode: Needed for semi-transparent, colored, or karaoke-style text (may blur background more).

4. Run Batch Processing

Click start. The engine processes frame-by-frame:
Detect Text Mask → Dilate Mask → Inpaint Background
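
The three stages can be illustrated with a toy example on a tiny grayscale frame stored as nested lists. This is only a sketch of the general idea: the brightness threshold, the square dilation, and the neighbour-averaging fill are simplifying assumptions, not the algorithm EchoSubs actually uses.

```python
def detect_mask(frame, threshold=200):
    """Mark pixels at or above `threshold` as text (bright subtitle strokes)."""
    return [[value >= threshold for value in row] for row in frame]


def dilate_mask(mask, radius=1):
    """Grow the mask by `radius` pixels so it also covers outlines and soft edges."""
    height, width = len(mask), len(mask[0])
    grown = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                    for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                        grown[ny][nx] = True
    return grown


def inpaint(frame, mask):
    """Fill masked pixels from the outside in, averaging already-known 4-neighbours."""
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    unknown = {(y, x) for y in range(height) for x in range(width) if mask[y][x]}
    while unknown:
        filled = {}
        for y, x in unknown:
            known = [
                out[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < height and 0 <= nx < width and (ny, nx) not in unknown
            ]
            if known:
                filled[(y, x)] = sum(known) / len(known)
        if not filled:  # the whole frame is masked; nothing to borrow from
            break
        for (y, x), value in filled.items():
            out[y][x] = value
        unknown.difference_update(filled)
    return out


# A flat grey frame with one bright "text" pixel in the middle.
frame = [[50] * 5 for _ in range(5)]
frame[2][2] = 255
cleaned = inpaint(frame, dilate_mask(detect_mask(frame)))
```

On a flat background the fill is exact. On textured or moving backgrounds the same averaging produces the smudging described under the failure modes.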

5. Review & Export

Check keyframes, especially scene transitions. Export as ProRes (for editing) or H.264 (for distribution).

Common Failure Modes & Fixes

Inpainting is typically 90% effective, but certain scenarios are difficult.

Scenario | Result | Fix / Mitigation
Busy/Moving Backgrounds (e.g., confetti, water) | Blurry "smudge" or ghosting. | Increase mask dilation slightly so the edges blend more softly. Accept an imperfect background.
Faces Behind Text | Distorted chin/mouth. | Avoid. Inpainting cannot reconstruct faces accurately. Consider a blur box or "letterbox" crop instead.
Karaoke / Color Changes | Partial removal (leftover color). | Use "Color-Specific" removal mode targeting the active fill color.

Why Offline & Batch Processing Matters

For professional workflows, online tools are often bottlenecks.

  • Privacy & Security: Your footage never leaves your machine. Essential for NDA content, pre-release screenings, or personal archives.
  • No Bandwidth Limits: Process 4K ProRes files without uploading gigabytes of data.
  • Consistent Quality: Online AI tools often downscale output to 720p or 1080p. Offline processing maintains your source resolution and bitrate.

Alternatives to Removal

If inpainting quality isn't sufficient for your use case, consider these alternatives:

  1. Opaque Background Subtitles: If you are re-subtitling, simply place your new subtitles with an opaque black box background over the old ones. This is the standard "Cover-up" method.
  2. Crop (Zoom In): If the text is low enough, crop the video to 2.35:1 or zoom in slightly to push the text off-screen.
  3. Blur Bar: A localized blur is less distracting than a bad inpaint job if the background is extremely complex.
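
For the crop option, it helps to know how many rows a 2.35:1 crop actually removes. The sketch below keeps the full width and computes the new height; the function name is hypothetical, and rounding down to an even height is an assumption made because 4:2:0 video needs even dimensions.

```python
def crop_to_aspect(width, height, target_aspect=2.35):
    """Return (new_height, rows_removed, fraction_lost) for a full-width crop."""
    new_height = int(width / target_aspect)
    new_height -= new_height % 2          # round down to an even height
    new_height = min(new_height, height)  # never "crop" to something taller
    removed = height - new_height
    return new_height, removed, removed / height


new_height, removed, lost = crop_to_aspect(1920, 1080)
print(new_height, removed, round(lost, 3))  # 816 264 0.244
```

For a 1920x1080 source this removes 264 rows, roughly 24% of the frame. Whether the crop hides the subtitles depends on how those rows are split: taken entirely from the bottom they cover the lowest 264 rows, while a centered crop only trims 132 rows from each edge.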


Frequently Asked Questions

Can you remove burned-in subtitles completely?

Not "completely" in the sense of restoring the original hidden pixels perfectly. The software uses AI to predict and reconstruct the background based on surrounding data. For 90% of frames with static backgrounds, the removal is practically invisible. However, for complex scenes with rapid motion or detailed textures, you may notice minor blurring or artifacts in the inpainted area.

What is the difference between hardcoded and soft subtitles?

Hardcoded (burned-in) subtitles are flattened pixels within the video image itself, making them impossible to turn off. Soft subtitles are a separate subtitle track (like .SRT or .VTT) that plays alongside the video file. While soft subtitles can be easily disabled in your media player, hardcoded captions require advanced image processing algorithms to erase and fill.

Can CapCut remove burned-in subtitles?

Generally, no. Standard video editors like CapCut or Premiere Pro typically only allow you to crop the video or place a blur box over the text. They usually lack the specific "inpainting" or "object removal" algorithms required to reconstruct the background cleanly behind the text. EchoSubs is built specifically for this restoration task.

Does this work offline?

Yes. EchoSubs is engineered as an offline-first desktop application for macOS and Windows. All analysis, detection, and pixel inpainting happen locally on your machine's GPU or CPU. This means no video data is ever uploaded to a cloud server, ensuring 100% privacy for your unreleased or sensitive content.

Can I batch process multiple episodes?

Yes. The workflow is designed for high-volume tasks like cleaning up entire TV seasons or lecture series. You can import unlimited files, define a common "Region of Interest," and apply the same removal settings to the entire queue. The system will process them sequentially without requiring manual intervention for each file.

Does it work on bilingual subtitles?

Yes, it handles bilingual or multi-line subtitles effectively. As long as you define the detection region (ROI) to cover both lines of text, the algorithm treats them as a single removal target. You may need to use the "Aggressive" mode if the secondary language uses a different color or font style than the primary one.

How long does it take?

Performance depends heavily on your hardware, specifically your GPU. On modern silicon like NVIDIA RTX 30/40 series or Apple M1/M2/M3 chips, processing is often faster than real-time playback. For example, a 1-hour 1080p video typically takes 20-40 minutes to clean, whereas CPU-only processing will be significantly slower.
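
The numbers in this answer translate into a speed factor that you can use to budget a whole queue. The sketch below is simple arithmetic; the 12-episode season is a made-up example, and real throughput will vary with hardware and footage.

```python
def realtime_factor(video_minutes, processing_minutes):
    """How many minutes of video are processed per minute of wall-clock time."""
    if processing_minutes <= 0:
        raise ValueError("processing_minutes must be positive")
    return video_minutes / processing_minutes


def estimate_queue_minutes(durations_minutes, factor):
    """Estimated wall-clock minutes for a sequential queue at a given speed factor."""
    return sum(durations_minutes) / factor


slow = realtime_factor(60, 40)   # 1.5x real time
fast = realtime_factor(60, 20)   # 3.0x real time
season = [45] * 12               # hypothetical 12-episode season, 45 min each
print(estimate_queue_minutes(season, slow), estimate_queue_minutes(season, fast))
# 360.0 180.0
```

In other words, a 1-hour video cleaned in 20 to 40 minutes corresponds to 1.5x to 3x real time, and the hypothetical season above would take roughly 3 to 6 hours.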

Can I remove subtitles from just a part of the video?

Yes. You can set specific "In" and "Out" points on the timeline to limit the removal process to certain segments. This is useful if you only need to clean a specific scene or if the subtitles only appear sporadically throughout the footage. Outside those points no inpainting is applied; the rest of the video is only re-encoded for format consistency.

What video formats are supported?

EchoSubs accepts a wide range of input containers including MP4, MKV, MOV, AVI, WEBM, M4V, and FLV. For export, we recommend using MP4 (H.264/HEVC) for final distribution or MOV (ProRes 422) if you plan to perform further non-linear editing (NLE) on the cleaned footage to preserve maximum quality.

Is there a free trial?

Yes. You can download EchoSubs for free on both macOS and Windows. The free version allows you to test the removal quality on your own local files, letting you verify if the results meet your standards before purchasing. Some limitations on export duration or watermarks may apply to the free tier.

Does it work on scrolling tickers?

Scrolling text (like news tickers) is significantly harder to remove than static text because the background is constantly moving relative to the mask. While EchoSubs can attempt to remove them, results vary. You might see more motion artifacts or "smudging" compared to standard dialogue subtitles, as the algorithm has to predict pixels that are constantly being revealed and occluded.

Why shouldn't I just crop the video?

Cropping forces you to change the aspect ratio or zoom in, causing you to lose 15-20% of the visual frame. This often cuts off critical visual information like a speaker's hands, items on a table, or lower-third graphics. Inpainting allows you to retain the original full-frame composition while only modifying the specific pixels where the text resides.

Does it degrade the quality of the whole video?

No, the processing is non-destructive to the unmasked areas. You can choose to re-encode the output at a high bitrate (e.g., using ProRes or high-bitrate H.264) to visually preserve the original quality of the footage. The AI inpainting logic is only applied to the pixels within the detected subtitle mask; the rest of the frame is simply copied or re-encoded faithfully.

Can I run this on a server?

EchoSubs is primarily designed as a GUI-based desktop application for individual workstations. However, for enterprise needs requiring headless, CLI-based, or server-farm deployments for high-throughput processing, we do offer specialized solutions. Please contact our sales team to discuss custom integration options.

What if the results are blurry?

Blurring typically occurs when the algorithm cannot find enough clean pixel data from surrounding spatial or temporal contexts. To fix this, try reducing the "Dilate" setting to tighten the mask, or switch to "Temporal" mode which borrows pixels from previous/next frames. For extremely complex moving backgrounds (like water), some blur may be unavoidable and preferable to the original text.
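
The idea of borrowing pixels from neighbouring frames can be sketched as follows. This is an illustration only, assuming a static camera and per-frame text masks; it is not how EchoSubs implements its "Temporal" mode, and the function name is hypothetical.

```python
def temporal_fill(frames, masks, index):
    """Fill masked pixels of frames[index] from the nearest frame where that pixel is clean.

    Returns the filled frame and a list of (y, x) pixels that no frame could supply.
    """
    target = [row[:] for row in frames[index]]
    unresolved = []
    for y, row in enumerate(masks[index]):
        for x, is_text in enumerate(row):
            if not is_text:
                continue
            # Search outward in time: index-1, index+1, index-2, index+2, ...
            for offset in range(1, len(frames)):
                donor = next(
                    (i for i in (index - offset, index + offset)
                     if 0 <= i < len(frames) and not masks[i][y][x]),
                    None,
                )
                if donor is not None:
                    target[y][x] = frames[donor][y][x]
                    break
            else:
                unresolved.append((y, x))  # leave for spatial inpainting
    return target, unresolved
```

Pixels that stay covered by text in every frame end up in `unresolved` and still need a spatial fill, which is where residual blur comes from. With camera or subject motion, a direct copy like this would need motion compensation first.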

© 2026 EchoSubs. All rights reserved.