Cut long videos into clips worth watching.
Drop a YouTube URL. MiMo reads the transcript like an editor and finds moments with a hook, tension, and payoff. ClipForge then renders polished vertical shorts with captions, overlays, metadata, and a viral score.
Editor taste, automated.
Not a generic summarizer. The app converts transcript context into edit instructions: what to cut, why it should retain viewers, and how to package it.
Import source
yt-dlp grabs the YouTube video and metadata. FFmpeg extracts clean audio for analysis.
Timestamp transcript
Deepgram nova-3 produces segment-level and word-level timestamps so cuts land on exact moments.
MiMo highlight reasoning
MiMo scores hook strength, emotional tension, payoff position, and comment potential.
Render package
ClipForge exports the vertical video, subtitles, hook overlay, thumbnail, title, caption, and hashtags.
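To illustrate why word timing matters for the steps above, here is a minimal sketch of snapping a proposed cut to word boundaries so a clip never starts or ends mid-word. The `Word` shape and the `snap_cut` helper are hypothetical names for illustration, not ClipForge's actual code or the transcript provider's response format.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One transcribed word with timing in seconds (hypothetical shape)."""
    text: str
    start: float
    end: float


def _nearest(sorted_values: list[float], target: float) -> float:
    """Return the value in an ascending list that is closest to target."""
    i = bisect_left(sorted_values, target)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = sorted_values[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda v: abs(v - target))


def snap_cut(words: list[Word], start: float, end: float) -> tuple[float, float]:
    """Move start to the nearest word start and end to the nearest word end.

    Assumes words are in chronological order.
    """
    if not words:
        raise ValueError("transcript has no words")
    snapped_start = _nearest([w.start for w in words], start)
    snapped_end = _nearest([w.end for w in words], end)
    if snapped_end <= snapped_start:
        raise ValueError("cut collapsed to zero length after snapping")
    return snapped_start, snapped_end


# Made-up timings for illustration only.
words = [
    Word("This", 42.10, 42.30),
    Word("is", 42.35, 42.50),
    Word("where", 42.55, 42.90),
    Word("changes", 71.20, 71.60),
]
cut = snap_cut(words, 42.18, 71.52)
```

With these sample timings, the requested cut of 42.18 to 71.52 widens slightly to cover the whole first and last word.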
MiMo returns decisions the renderer can execute.
Every generated short keeps the model rationale next to the asset, so the UI can show why the clip was selected instead of pretending the cut was magic.
{
"start": 42.18,
"end": 71.52,
"viral_score": 8.7,
"hook": "This is where it changes",
"rationale": "early tension + final payoff"
}
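A renderer could consume a decision like the one above roughly as follows. This is a sketch, assuming the five fields shown are the full payload; `ClipDecision` and `parse_decision` are hypothetical names, and the validation rules are illustrative rather than ClipForge's actual checks.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipDecision:
    """One highlight decision, mirroring the fields in the sample payload."""
    start: float
    end: float
    viral_score: float
    hook: str
    rationale: str

    @property
    def duration(self) -> float:
        """Clip length in seconds."""
        return self.end - self.start


def parse_decision(raw: str) -> ClipDecision:
    """Parse a JSON decision and reject cuts the renderer cannot execute."""
    data = json.loads(raw)
    decision = ClipDecision(
        start=float(data["start"]),
        end=float(data["end"]),
        viral_score=float(data["viral_score"]),
        hook=str(data["hook"]),
        rationale=str(data["rationale"]),
    )
    # Assumed sanity check: timestamps must describe a forward, non-empty span.
    if decision.start < 0 or decision.end <= decision.start:
        raise ValueError("decision has an invalid time range")
    return decision


sample = """
{
  "start": 42.18,
  "end": 71.52,
  "viral_score": 8.7,
  "hook": "This is where it changes",
  "rationale": "early tension + final payoff"
}
"""
decision = parse_decision(sample)
```

Keeping `rationale` on the same object as the timestamps is what lets a UI show the reason for a cut beside the rendered clip.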
Generated clip gallery.
Sample outputs from the existing pipeline. New jobs appear here when rendering completes.