I wanted a way to summarize YouTube videos without paying for a SaaS or leaking my viewing history to a third party. TubeTrim is a Python tool that runs an LLM locally to process transcripts. No API keys, no subscriptions, no tracking.
It uses the transformers library with a device-aware backend: it prefers CUDA, then MPS (for Mac users), and finally falls back to CPU. I've found that Qwen2.5-1.5B provides a good balance between speed and summary quality for this specific task.
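A minimal sketch of that fallback order, for anyone curious. This is illustrative and not necessarily the exact TubeTrim code; `pick_device` and `detect_device` are names made up for this example, and it assumes PyTorch is the backend behind transformers.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch for available backends (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # No PyTorch installed, so there is nothing to probe.
        return "cpu"
    # torch.backends.mps is missing on older PyTorch builds, hence getattr.
    mps = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps is not None and mps.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(f"Using device: {detect_device()}")
```

Keeping the priority logic in a pure function makes it easy to test without a GPU; the resulting string can be passed to `model.to(...)`.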
How it works:
- Extracts the transcript via yt-dlp.
- Performs extractive compression if the text exceeds the context window.
- Summarizes via local inference with streaming output.
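To give an idea of what the extractive compression step can look like, here is a rough sketch. It is one simple approach (frequency-based sentence scoring), not necessarily what TubeTrim does internally. The `compress` function is hypothetical, and it uses a whitespace word count as a stand-in for a real token count, which is an assumption made to keep the example dependency-free.

```python
import re
from collections import Counter


def compress(text: str, max_words: int) -> str:
    """Keep the highest-scoring sentences that fit within max_words."""
    # Text that already fits the budget is returned unchanged.
    if len(text.split()) <= max_words:
        return text

    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

    # Word frequencies over the whole text; very short words are ignored
    # as a crude stop-word filter.
    freq = Counter(w for w in re.findall(r"[a-z']+", text.lower()) if len(w) > 3)

    def score(sentence: str) -> float:
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[w] for w in words) / max(len(words), 1)

    # Greedily pick sentences from best to worst while the budget allows.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    kept, used = [], 0
    for i in ranked:
        n = len(sentences[i].split())
        if used + n <= max_words:
            kept.append(i)
            used += n

    # Restore the original order so the result still reads chronologically.
    return " ".join(sentences[i] for i in sorted(kept))


if __name__ == "__main__":
    transcript = "Cats sleep a lot. Cats hunt mice at night. The weather was fine."
    print(compress(transcript, max_words=9))
```

In a real pipeline the budget would come from the model's context window minus the prompt and the expected summary length, measured with the model's tokenizer rather than word counts.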
I'd appreciate any feedback on optimization!