Static screenshots can't capture temporal bugs — interactions, sequences, state transitions. But raw screen recordings are opaque to AI. Video capture for AI coding bridges this gap: Stash's Video Capture lets you record your screen with voice narration using ⌘⌃R, or let Instant Replay silently buffer the last 30–120 seconds. Either way, the output is an AI Capture Report — key frames, interaction timeline, voice transcript, console OCR, and focus tracking — structured context that any AI tool can process.


The Static Screenshot Ceiling

Screenshots capture a single moment. But many bugs only exist in motion — scroll jank, animation timing glitches, multi-step interaction failures, race conditions that appear between clicks. Video capture for AI coding addresses these temporal bugs that no static image can show. The feedback loop breaks when you can see the problem on screen but can't communicate the sequence to your AI assistant.


Why Raw Screen Recordings Fail for AI

Most AI coding tools (Claude Code, Cursor, ChatGPT) cannot process raw video files. Even tools that accept video input typically sample isolated frames and lose the temporal relationship between them. A raw recording also carries no explicit record of which buttons were clicked, what changed on screen, or what the user was trying to do. The visual evidence is in the recording, but it is locked in a format AI can't reason about.


Video Capture — Active Recording with Voice (⌘⌃R)

Press ⌘⌃R to start recording. Narrate what you're doing: your voice becomes first-person annotation in the structured report. The recording captures screen, voice, interactions, focus tracking, and console output.

Recordings can be up to 20 minutes. When you stop, Stash generates the AI Capture Report.


Instant Replay — The Recording You Already Made

Instant Replay is a rolling buffer that silently records your screen in the background. No audio is captured — this is a deliberate privacy decision (no ambient microphone). No files are stored to disk. The buffer keeps the last 30–120 seconds (configurable) in memory using compressed frames (~12 MB for 60 seconds at idle).

When you see something worth reporting — a bug, an unexpected behavior, a UI glitch — press ⌘⌃R. The buffer freezes, Stash processes the last N seconds, and produces an AI Capture Report. You paste it into Claude, ChatGPT, or any AI tool. No reproduction step needed — the bug was caught live.
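The rolling-buffer idea can be sketched in a few lines of Python. This is only an illustration of the technique, not Stash's implementation: the `Frame` and `ReplayBuffer` names, the per-frame byte payload, and the eviction strategy are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List


@dataclass(frozen=True)
class Frame:
    timestamp: float  # seconds since capture started
    data: bytes       # compressed frame bytes (hypothetical encoding)


class ReplayBuffer:
    """Keep only the frames from the last `window_seconds`, in memory."""

    def __init__(self, window_seconds: float = 60.0) -> None:
        # The article describes a configurable 30-120 second window.
        if not 30.0 <= window_seconds <= 120.0:
            raise ValueError("window must be between 30 and 120 seconds")
        self.window_seconds = window_seconds
        self._frames: Deque[Frame] = deque()

    def push(self, frame: Frame) -> None:
        """Append a frame and evict everything older than the window."""
        self._frames.append(frame)
        cutoff = frame.timestamp - self.window_seconds
        while self._frames and self._frames[0].timestamp < cutoff:
            self._frames.popleft()

    def freeze(self) -> List[Frame]:
        """Return a snapshot copy of the buffer for post-processing."""
        return list(self._frames)

    def memory_bytes(self) -> int:
        """Total size of the compressed frame payloads currently held."""
        return sum(len(f.data) for f in self._frames)
```

Under these assumptions, pushing frames at 2 FPS with 100 KB per compressed frame leaves 121 frames in a 60-second window, about 12.1 MB, which is in line with the ~12 MB figure quoted above.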

| Without Instant Replay | With Instant Replay |
| --- | --- |
| Bug happens | Bug happens |
| “Let me record that” | Press ⌘⌃R |
| Try to reproduce the bug | Done: last 30–120 seconds already captured |
| Sometimes can't reproduce | Bug was caught live, first time |
| 2–5 minutes | 10 seconds |

The AI Capture Report

When you click “Copy All” and paste into an AI tool, the AI receives a unified markdown report that combines key frames, the interaction timeline, the voice transcript, console OCR, and focus tracking.
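To make the idea of a unified report concrete, here is a minimal sketch of how voice segments and machine observations might be merged into one time-ordered markdown timeline. The `Entry` type, the source labels, and the output layout are hypothetical; the actual report format is not shown in this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    t: float      # seconds from the start of the recording
    source: str   # e.g. "voice", "click", "console" (hypothetical labels)
    text: str


def format_timestamp(t: float) -> str:
    """Render seconds as MM:SS."""
    minutes, seconds = divmod(int(t), 60)
    return f"{minutes:02d}:{seconds:02d}"


def build_timeline(voice: List[Entry], machine: List[Entry]) -> str:
    """Interleave narration and machine observations by timestamp."""
    merged = sorted(voice + machine, key=lambda e: e.t)
    lines = ["## Timeline", ""]
    for e in merged:
        lines.append(f"- `{format_timestamp(e.t)}` **{e.source}**: {e.text}")
    return "\n".join(lines)
```

Sorting by timestamp is what puts each spoken remark next to the click or console error it refers to, so the AI reads what was said alongside what was done.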


Two Copy Modes

| Mode | What's Copied | Best For |
| --- | --- | --- |
| Copy All | Report + key frame images + audio file | Claude.ai, ChatGPT, Gemini (paste into chat) |
| Copy Folder Path | Path to recording folder | Claude Code, Cursor, terminal tools (paste path) |
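For the folder-path mode, a terminal tool only needs the path to discover the report, frames, and audio. The sketch below shows one way a script could inventory such a folder. The file extensions it looks for (`.md`, `.png`, `.jpg`, `.m4a`, `.wav`) are assumptions, since the article does not document the folder layout.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical mapping from file extension to role in the recording folder.
ROLES = {
    ".md": "report",
    ".png": "frames",
    ".jpg": "frames",
    ".jpeg": "frames",
    ".m4a": "audio",
    ".wav": "audio",
}


def summarize_recording_folder(folder: str) -> Dict[str, List[str]]:
    """Group the files in a recording folder by their assumed role."""
    groups: Dict[str, List[str]] = {
        "report": [], "frames": [], "audio": [], "other": [],
    }
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            role = ROLES.get(path.suffix.lower(), "other")
            groups[role].append(path.name)
    return groups
```

Calling `summarize_recording_folder` with the pasted path returns a dictionary of file names per role, which is roughly the first step a CLI assistant takes before reading the report itself.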

What Video Capture Reveals That Screenshots Miss


Key Takeaways

  • Video Capture (⌘⌃R) records screen + voice for up to 20 minutes with dynamic frame rate.
  • Instant Replay silently buffers the last 30–120 seconds — press ⌘⌃R to save what just happened, no reproduction needed.
  • AI Capture Reports give AI tools structured context: voice transcript, interaction log, console OCR, focus tracking, and key frames.
  • Voice narration interleaves with the interaction timeline as first-person annotation — the AI reads what you said alongside what you did.
  • “Nothing happened” detection and retry detection are the strongest bug signals — Stash captures both automatically.
  • Two copy modes: Copy All for paste-into-chat, Copy Folder Path for CLI tools like Claude Code.
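The two bug signals named above can be illustrated with a small sketch. The definitions used here are assumptions made for the example, not Stash's actual rules: a click counts as "nothing happened" if no visual change follows within a short window, and a retry is the same target clicked again shortly after the previous click. The `Click` type and the window lengths are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Click:
    t: float      # seconds from the start of the recording
    target: str   # label of the clicked element (hypothetical)


def nothing_happened(clicks: List[Click], change_times: List[float],
                     window: float = 1.0) -> List[Click]:
    """Clicks with no visual change in the `window` seconds after them."""
    return [
        c for c in clicks
        if not any(c.t <= ct <= c.t + window for ct in change_times)
    ]


def retries(clicks: List[Click], window: float = 3.0) -> List[str]:
    """Targets clicked again within `window` seconds of the previous click."""
    repeated: List[str] = []
    for prev, cur in zip(clicks, clicks[1:]):
        same_target = cur.target == prev.target
        if same_target and cur.t - prev.t <= window and cur.target not in repeated:
            repeated.append(cur.target)
    return repeated
```

A button clicked three times with no screen change after any click would show up in both lists, which is the kind of pattern a single screenshot cannot reveal.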

Frequently Asked Questions

How does Video Capture differ from regular screen recording?

Video Capture produces structured AI output, not raw video. It generates an AI Capture Report with key frames, voice transcript, interaction log, console OCR, and focus tracking — all in a format AI tools can process directly.

What is Instant Replay?

Instant Replay is a rolling buffer that silently records the last 30–120 seconds of screen activity in memory. No audio is captured, no files are stored. Press ⌘⌃R to save what just happened with a full AI Capture Report.

Does Instant Replay record audio?

No. Instant Replay captures only screen content and interactions — no microphone audio. This is a deliberate privacy decision to prevent ambient recording. Voice narration is only captured during active Video Capture recordings.

How long can a Video Capture recording be?

20 minutes maximum. Active recordings capture screen, voice, interactions, focus tracking, and console output. Recordings auto-stop at the 20-minute limit.

Can I paste Video Capture output into Claude Code?

Yes. Use “Copy Folder Path” to paste the recording folder path into Claude Code. The folder contains the AI Capture Report, key frame images, and audio — Claude Code reads them all via the file path.

What happens if I don't narrate during a recording?

The AI Capture Report works without voice. Machine observation — clicks, console errors, visual changes, focus tracking, state changes — provides enough context for AI diagnosis. Voice narration adds intent and direction but is not required.

Does Video Capture slow down my Mac?

Instant Replay uses ~12 MB of memory at idle (compressed frames at 2–4 FPS) and less than 2% CPU. Active recording uses dynamic frame rate (2–30 FPS) that bursts during activity and settles during idle. Post-processing takes 3–8 seconds.
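A dynamic frame rate that bursts during activity and settles during idle can be sketched as a simple controller. The 2 to 30 FPS bounds come from the answer above; the jump-to-maximum and halving-decay policy is an assumption chosen for illustration, and `next_fps` is a hypothetical name.

```python
MIN_FPS = 2.0
MAX_FPS = 30.0


def next_fps(current_fps: float, activity: bool, decay: float = 0.5) -> float:
    """Burst to MAX_FPS on activity; otherwise decay toward MIN_FPS."""
    if activity:
        return MAX_FPS
    # Multiply by the decay factor each idle tick, never dropping below MIN_FPS.
    return max(MIN_FPS, current_fps * decay)
```

Starting from idle, one burst of activity raises the rate to 30 FPS, and successive idle ticks step it down through 15, 7.5, and 3.75 before it settles back at 2 FPS.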

