AutoAI turns YouTube, Twitch and Kick VODs — or your own files — into vertical shorts. Transcribed, highlighted, cropped to 9:16, captioned, and published. 100% local pipeline — no cloud API, no subscription, no usage caps.
The whole generation pipeline runs locally — transcription, highlight selection, cropping, captioning, export. The only time a cloud shows up is when you click "Publish" to the platform of your choice.
Video stays local. Audio stays local. Transcripts stay local. AutoAI runs faster-whisper, Ollama and FFmpeg on your own CPU/GPU — no cloud pipeline, no usage quotas, no API keys for generation.
Per-frame face/person tracking (MediaPipe + YOLOv8) with EMA smoothing, deadzone and max-speed clamp. The crop pans naturally — it doesn't sit on a static centre or jitter.
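To make the smoothing idea concrete, here is a minimal sketch of how EMA smoothing, a deadzone and a max-speed clamp can combine on the horizontal crop centre. It is not AutoAI's actual code: the class name `CropSmoother`, its parameters and the default values are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CropSmoother:
    """Hypothetical helper: smooths the x-centre of a 9:16 crop window."""
    alpha: float = 0.2       # EMA weight given to the newest detection (assumed value)
    deadzone: float = 20.0   # pixels; smaller offsets do not move the crop (assumed value)
    max_speed: float = 15.0  # pixels per frame; upper bound on pan speed (assumed value)
    target: Optional[float] = None    # EMA-filtered subject centre
    position: Optional[float] = None  # current crop centre

    def update(self, detected_x: float) -> float:
        # The first detection initialises both the filter and the crop.
        if self.position is None or self.target is None:
            self.target = self.position = detected_x
            return self.position

        # EMA: blend the new detection into the running target.
        self.target = self.alpha * detected_x + (1 - self.alpha) * self.target

        # Deadzone: ignore small offsets so the crop does not jitter.
        delta = self.target - self.position
        if abs(delta) <= self.deadzone:
            return self.position

        # Max-speed clamp: move towards the target by a bounded step.
        step = max(-self.max_speed, min(self.max_speed, delta))
        self.position += step
        return self.position
```

Feeding one detected centre per frame into `update()` yields a crop position that ignores small wobbles and pans at a bounded speed when the subject really moves.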
Captions are generated from the same Whisper pass — frame-accurate, styled, and rendered in a single FFmpeg pass alongside the crop.
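As a rough illustration of turning transcription segments into caption cues, the sketch below formats `(start, end, text)` tuples as SRT. SRT is an assumption here, as are the function names; AutoAI may use a different caption format and its own styling step.

```python
from typing import Iterable, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Tuple[float, float, str]]) -> str:
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, 1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(start)} --> {srt_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

The tuples stand in for whatever segment objects the transcription step returns; only the start time, end time and text are needed.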
A local LLM reads your transcript and returns every moment that would make a short: a hook plus a payoff, self-contained. Set a maximum number of shorts if you want one; there is no minimum.
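The "maximum but no minimum" behaviour could look roughly like the sketch below. The JSON shape (`start`, `end`, `score`) and the function name `select_highlights` are hypothetical; the real schema returned by the local model is not documented here.

```python
import json
from typing import Optional


def select_highlights(llm_json: str, max_shorts: Optional[int] = None) -> list:
    """Parse highlight candidates and apply an optional upper cap.

    Assumes the model returns a JSON list of objects with "start", "end"
    and an optional "score" field (hypothetical schema).
    """
    candidates = json.loads(llm_json)

    # Drop entries with a non-positive duration.
    valid = [h for h in candidates if h["end"] > h["start"]]

    # Keep the highest-scoring moments when a cap is set.
    valid.sort(key=lambda h: h.get("score", 0), reverse=True)
    if max_shorts is not None:
        valid = valid[:max_shorts]

    # Return in timeline order; an empty list is a valid result (no minimum).
    return sorted(valid, key=lambda h: h["start"])
```

With `max_shorts=None` every valid moment is kept, and a transcript with no good moments simply yields an empty list.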
Direct publish via platform APIs when you click the button, or "open in browser" with the file staged for manual upload. Your call — per short.
AutoAI is open source and self-hostable. Clone it, read it, ship PRs. The stats below are pulled live from the GitHub API when the page loads.
git clone https://github.com/NYOGamesCOM/AutoAI
ollama pull mistral
Node 20+ (only to rebuild the web UI)
PS> git clone https://github.com/NYOGamesCOM/AutoAI
PS> cd AutoAI
PS> python -m venv .venv
PS> .venv\Scripts\activate
PS> pip install -r requirements.txt
PS> python server.py # open http://localhost:8000
$ git clone https://github.com/NYOGamesCOM/AutoAI
$ cd AutoAI
$ python3 -m venv .venv && source .venv/bin/activate
$ pip install -r requirements.txt
$ python server.py # open http://localhost:8000
The paid short-makers are good at what they do — we just think the core feature (cutting your own video into short-form) shouldn't cost a subscription, cap your exports, or require you to upload your footage to somebody else's box.