CapCut-style creation flow × OpusClip-style AI scoring

Cut long videos into clips worth watching.

Drop in a YouTube URL. MiMo reads the transcript like an editor and finds moments with a hook, tension, and a payoff. ClipForge then renders polished vertical shorts with captions, overlays, metadata, and viral scoring.

The public demo generates 3 clips per run. Rendering runs as a background job.
clipforge.ai/session · 1080×1920
MiMo found the 3-second hook
Curiosity gap + payoff timing + emotional signal detected from transcript context.
92 · Hook-first cut
88 · Payoff moment
84 · Debate trigger
9:16 · vertical-first render surface
JSON · structured MiMo edit decisions
FFmpeg · deterministic subtitle + overlay pipeline
Live · public proof-of-work for Orbit 100T

Editor taste, automated.

Not a generic summarizer. The app converts transcript context into edit instructions: what to cut, why it should retain viewers, and how to package it.

01

Import source

yt-dlp grabs the YouTube video and metadata. FFmpeg extracts clean audio for analysis.

source
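As a rough illustration of the import step, the sketch below only builds the two command lines and does not run them. The working directory, the file names, and the mono 16 kHz audio settings are assumptions for the example, not details taken from ClipForge.

```python
from pathlib import Path


def build_import_commands(url: str, workdir: str = "work") -> tuple[list[str], list[str]]:
    """Return (yt-dlp command, ffmpeg command) as argument lists.

    File names and audio settings are hypothetical choices for this sketch.
    """
    work = Path(workdir)
    video = work / "source.mp4"
    audio = work / "audio.wav"

    download = [
        "yt-dlp",
        "--write-info-json",             # save the video metadata as JSON
        "--merge-output-format", "mp4",  # merge separate streams into mp4
        "-o", str(video),
        url,
    ]
    extract = [
        "ffmpeg", "-y",
        "-i", str(video),
        "-vn",           # drop the video stream, keep audio only
        "-ac", "1",      # downmix to mono (assumed)
        "-ar", "16000",  # resample to 16 kHz (assumed)
        str(audio),
    ]
    return download, extract
```

Each list can be passed to `subprocess.run(cmd, check=True)` so that a failed download or extraction raises instead of passing silently.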
02

Timestamp transcript

Deepgram nova-3 produces segment-level and word-level timestamps so cuts can land on exact moments.

transcribe
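One way word timing can be used is to snap a rough cut onto word boundaries so a clip never starts or ends mid-word. The sketch below assumes each word carries `start` and `end` times in seconds. The `Word` type, the `snap_cut` helper, and the sample timings are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds from the start of the source
    end: float    # seconds from the start of the source


def snap_cut(words: list[Word], start: float, end: float) -> tuple[float, float]:
    """Move a rough (start, end) cut onto the nearest word boundaries."""
    if not words:
        raise ValueError("no word timings available")
    # Nearest word start for the in-point, nearest word end for the out-point.
    first = min(words, key=lambda w: abs(w.start - start))
    last = min(words, key=lambda w: abs(w.end - end))
    if last.end <= first.start:
        raise ValueError("cut collapses to zero length after snapping")
    return first.start, last.end


# Hypothetical word timings, for illustration only.
words = [
    Word("This", 42.18, 42.40),
    Word("is", 42.45, 42.58),
    Word("where", 42.60, 42.95),
    Word("it", 43.00, 43.10),
    Word("changes", 43.12, 43.70),
]
print(snap_cut(words, 42.3, 43.6))  # (42.18, 43.7)
```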
03

MiMo highlight reasoning

MiMo scores hook strength, emotional tension, payoff position, and comment potential.

reason
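The page does not say how MiMo combines these four signals. As a purely hypothetical sketch, a weighted average on a 0 to 10 scale could look like the following. The weights, the signal names, and the scale are all assumptions.

```python
# Hypothetical weights; the real scoring logic is not described in the source.
WEIGHTS = {
    "hook": 0.35,      # hook strength
    "tension": 0.25,   # emotional tension
    "payoff": 0.25,    # payoff position
    "comments": 0.15,  # comment potential
}


def combine_signals(signals: dict[str, float]) -> float:
    """Combine per-signal scores (each 0 to 10) into one score from 0 to 10."""
    missing = WEIGHTS.keys() - signals.keys()
    if missing:
        raise ValueError(f"missing signals: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= signals[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    total = sum(WEIGHTS[name] * signals[name] for name in WEIGHTS)
    return round(total, 1)
```

Keeping the weights in one table makes the ranking easy to audit and tune, and the per-signal values can still be shown next to the combined score.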
04

Render package

ClipForge exports a vertical video with subtitles and a hook overlay, plus a thumbnail, title, caption, and hashtags.

publish
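A minimal sketch of the vertical render, assuming a landscape source, a centered crop, and an SRT file whose timestamps are relative to the start of the clip. The function name, file names, and codec choices are assumptions, and the `subtitles` filter requires an FFmpeg build with libass.

```python
def build_render_command(
    source: str, subtitles: str, output: str, start: float, end: float
) -> list[str]:
    """Build an ffmpeg command that cuts, crops to 9:16, and burns in captions."""
    if end <= start:
        raise ValueError("end must be greater than start")
    filters = ",".join([
        "crop=ih*9/16:ih",          # centered 9:16 crop (assumes a landscape source)
        "scale=1080:1920",          # the 1080x1920 render surface
        f"subtitles={subtitles}",   # burn in captions; plain file names only, no escaping here
    ])
    return [
        "ffmpeg", "-y",
        "-ss", f"{start:.2f}",       # seek on the input, so clip time restarts at 0
        "-t", f"{end - start:.2f}",  # clip duration in seconds
        "-i", source,
        "-vf", filters,
        "-c:v", "libx264",
        "-c:a", "aac",
        output,
    ]


cmd = build_render_command("source.mp4", "clip.srt", "clip.mp4", 42.18, 71.52)
print(" ".join(cmd))
```

Because every argument is derived from the edit decision, the same decision should produce the same command each time, which fits the deterministic subtitle and overlay pipeline described above.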
structured output

MiMo returns decisions the renderer can execute.

Every generated short keeps the model rationale next to the asset, so the UI can show why the clip was selected instead of pretending the cut was magic.

{
  "start": 42.18,
  "end": 71.52,
  "viral_score": 8.7,
  "hook": "This is where it changes",
  "rationale": "early tension + final payoff"
}
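Before a decision like the one above reaches the renderer, it may be worth validating it. The sketch below parses the same fields into a typed record. The `EditDecision` class, the `parse_decision` helper, and the 0 to 10 range for `viral_score` are assumptions, not part of the published schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class EditDecision:
    start: float        # clip in-point, seconds
    end: float          # clip out-point, seconds
    viral_score: float  # assumed to be on a 0 to 10 scale
    hook: str
    rationale: str

    @property
    def duration(self) -> float:
        return self.end - self.start


def parse_decision(raw: str) -> EditDecision:
    """Parse and sanity-check one JSON edit decision."""
    data = json.loads(raw)
    decision = EditDecision(
        start=float(data["start"]),
        end=float(data["end"]),
        viral_score=float(data["viral_score"]),
        hook=str(data["hook"]),
        rationale=str(data["rationale"]),
    )
    if not 0 <= decision.start < decision.end:
        raise ValueError("expected 0 <= start < end")
    if not 0 <= decision.viral_score <= 10:
        raise ValueError("viral_score outside the assumed 0 to 10 range")
    if not decision.hook.strip():
        raise ValueError("hook text is empty")
    return decision


raw = (
    '{"start": 42.18, "end": 71.52, "viral_score": 8.7, '
    '"hook": "This is where it changes", '
    '"rationale": "early tension + final payoff"}'
)
decision = parse_decision(raw)
print(f"{decision.duration:.2f}s clip, score {decision.viral_score}")
```

Rejecting malformed decisions at this boundary keeps the renderer simple, and the parsed record still carries the rationale to show next to the finished clip.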

Generated clip gallery.

Sample outputs from the existing pipeline. New jobs appear here when rendering completes.
