How VideoBuff Runs 100% Client-Side — WebCodecs, OPFS, OffscreenCanvas, and Why MCP Works

VideoBuff never uploads your video files — everything runs inside the browser. This article explains the client-side stack that makes that possible (WebCodecs for GPU decode/encode, OPFS for local persistence, OffscreenCanvas rendering in a Worker, Web Audio API for offline export) and why this design is exactly what lets Claude Code / Claude Desktop drive edits through MCP without any cloud round-trip.

Why we went 100% client-side

VideoBuff is a video editor that never uploads your files — decoding, compositing, and export all run on your local GPU and CPU. No cloud storage, no server-side FFmpeg pipeline, no account signup.

We chose this design because a developer-grade workflow doesn't work without it. Three concrete reasons:

  1. Zero upload wait. A workflow that pushes a 1 GB asset to S3 every time you edit is slow, limited by upload bandwidth, and pays storage and egress costs to get the result back. When the local GPU does the work, file size stops being a network problem.
  2. Crisp privacy boundary. Unreleased footage, internal materials, and personal recordings don't belong in a third-party cloud. If the file stays inside the browser sandbox, "what is leaving the machine?" is no longer a debate (the sole exception is anonymous usage telemetry, which is explicitly toggleable).
  3. MCP integration becomes natural. Because edit state lives as in-memory JavaScript objects in the browser tab, the MCP server is just a thin local bridge that talks to the tab over a WebSocket. No cloud API, no auth/key management, no regions, no cloud round-trip latency. See the Claude integration guide for the full story.

The rest of this article walks through the four Web APIs that make this design work — WebCodecs, OffscreenCanvas, Web Audio API, and OPFS (with localStorage handling project-structure persistence) — and where each one carries its weight.

WebCodecs API — client GPU instead of server-side FFmpeg

The WebCodecs API is a low-level interface to the browser's built-in codecs, offering hardware-accelerated decode and encode of H.264, H.265, VP9, and AV1 where the platform supports them. VideoDecoder turns encoded chunks demuxed from source files into frames, and VideoEncoder compresses edited frames with a chosen codec. Unlike a WASM build of FFmpeg, it can hand the work to the platform's hardware codec, which is typically much faster and more power-efficient than a JS/WASM software implementation.

For a developer, the practical implication is: you no longer need to stand up a cloud GPU instance to encode video. The old pipeline — upload to S3, spin up EC2 or Lambda to run FFmpeg, download the result — collapses into a single stage running on the user's M1/M2/integrated GPU. No server cost, no egress.

VideoBuff uses a two-stage architecture: VideoDecoder plus CSS filters for low-overhead preview playback, and VideoEncoder plus OffscreenCanvas for precision export. Latency during preview, accuracy during export.
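As a rough illustration of the export stage, the sketch below paces rendered frames into an encoder. It is not VideoBuff's actual code: `EncoderLike` and `FrameLike` are structural stand-ins for the WebCodecs `VideoEncoder` and `VideoFrame`, and `exportFrames`, `renderFrame`, and the two-second keyframe interval are hypothetical names and choices made for the example.

```typescript
// Structural stand-ins for WebCodecs' VideoFrame and VideoEncoder (only the
// members this sketch touches), so the pacing logic reads without a browser.
interface FrameLike {
  close(): void;
}
interface EncoderLike {
  encode(frame: FrameLike, options?: { keyFrame?: boolean }): void;
  flush(): Promise<void>;
}

// WebCodecs timestamps are expressed in microseconds.
function frameTimestampUs(index: number, fps: number): number {
  return Math.round((index * 1e6) / fps);
}

// Assumption: request a keyframe every two seconds of output.
function isKeyFrame(index: number, fps: number, seconds = 2): boolean {
  return index % Math.max(1, Math.round(fps * seconds)) === 0;
}

async function exportFrames(
  encoder: EncoderLike,
  renderFrame: (index: number, timestampUs: number) => FrameLike,
  frameCount: number,
  fps: number,
): Promise<void> {
  for (let i = 0; i < frameCount; i++) {
    const frame = renderFrame(i, frameTimestampUs(i, fps));
    encoder.encode(frame, { keyFrame: isKeyFrame(i, fps) });
    frame.close(); // release the frame's memory once it has been queued
  }
  await encoder.flush(); // resolves after every queued frame is encoded
}
```

In a browser, a real `VideoEncoder` would be passed as `encoder` and `renderFrame` would build a `VideoFrame` from the compositing canvas. A production loop would likely also watch the encoder's `encodeQueueSize` to apply backpressure, which this sketch leaves out.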

OffscreenCanvas

OffscreenCanvas is an API for performing frame rendering in the background without blocking the main UI thread. VideoBuff uses the 2D context to composite video frames, text overlays, and image layers.

Since this processing can run in a Web Worker, heavy rendering can be performed in parallel while maintaining UI responsiveness. Effects like color grading, blend modes, and transitions are all processed on OffscreenCanvas.

This is the export half of the two-stage approach: preview stays lightweight by applying CSS filters in real time, while export renders every frame at full precision on OffscreenCanvas.
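One way to keep the two stages visually consistent is to derive both from the same parameters. The sketch below assumes a hypothetical `Grade` shape and builds a CSS filter string that could be assigned to an element's `style.filter` during preview and to a 2D context's `filter` property during export (in browsers that support that property). `Ctx2DLike` is a structural stand-in for `OffscreenCanvasRenderingContext2D`; none of these names come from VideoBuff itself.

```typescript
// Hypothetical grade parameters; 1 means "unchanged" for every field.
interface Grade {
  brightness: number;
  contrast: number;
  saturate: number;
}

// Build a CSS filter list usable by both the preview and the export stage.
function toFilterString(g: Grade): string {
  return `brightness(${g.brightness}) contrast(${g.contrast}) saturate(${g.saturate})`;
}

// Structural stand-in for the 2D context members used below.
interface Ctx2DLike<Img> {
  filter: string;
  globalAlpha: number;
  drawImage(image: Img, dx: number, dy: number, dw: number, dh: number): void;
}

// Draw one layer with its grade and opacity, then restore the defaults so
// the next layer starts from a clean state.
function drawGradedLayer<Img>(
  ctx: Ctx2DLike<Img>,
  image: Img,
  grade: Grade,
  opacity: number,
  width: number,
  height: number,
): void {
  ctx.filter = toFilterString(grade);
  ctx.globalAlpha = opacity;
  ctx.drawImage(image, 0, 0, width, height);
  ctx.filter = "none";
  ctx.globalAlpha = 1;
}
```

Inside a Worker, the context would come from `new OffscreenCanvas(width, height).getContext("2d")`, and `image` would be a decoded `VideoFrame` or `ImageBitmap`.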

Web Audio API

The Web Audio API enables building real-time audio processing graphs within the browser. VideoBuff implements a 3-band EQ using BiquadFilterNode, a compressor using DynamicsCompressorNode, and volume mixing using GainNode.

These nodes are connected in series to form an audio processing chain that processes each clip's audio in real time. Using offline rendering mode (OfflineAudioContext) enables rapid generation of PCM data with the same effect chain applied during export.
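The series chain described above might be wired roughly as follows. This is a sketch under assumptions: the band types and corner frequencies are illustrative, `ClipAudioSettings` and `buildClipChain` are hypothetical names, and the `*Like` interfaces are structural stand-ins for the Web Audio types so the wiring can be read without a browser.

```typescript
// Structural stand-ins for the Web Audio members used here. In a browser an
// AudioContext or OfflineAudioContext is the intended `ctx` argument.
interface ParamLike { value: number }
interface NodeLike { connect(destination: NodeLike): unknown }
interface BiquadLike extends NodeLike {
  type: string;
  frequency: ParamLike;
  gain: ParamLike;
}
interface GainLike extends NodeLike { gain: ParamLike }
interface AudioGraphLike {
  destination: NodeLike;
  createBiquadFilter(): BiquadLike;
  createDynamicsCompressor(): NodeLike;
  createGain(): GainLike;
}

// Hypothetical per-clip settings: EQ gains in dB plus a linear volume.
interface ClipAudioSettings {
  lowDb: number;
  midDb: number;
  highDb: number;
  volume: number;
}

// Assumption: band types and corner frequencies are illustrative choices.
const BANDS = [
  { type: "lowshelf", frequency: 250, key: "lowDb" },
  { type: "peaking", frequency: 1000, key: "midDb" },
  { type: "highshelf", frequency: 4000, key: "highDb" },
] as const;

// Wire source -> 3-band EQ -> compressor -> gain -> destination.
function buildClipChain(
  ctx: AudioGraphLike,
  source: NodeLike,
  s: ClipAudioSettings,
): NodeLike[] {
  const eq = BANDS.map((band) => {
    const node = ctx.createBiquadFilter();
    node.type = band.type;
    node.frequency.value = band.frequency;
    node.gain.value = s[band.key];
    return node;
  });
  const compressor = ctx.createDynamicsCompressor();
  const volume = ctx.createGain();
  volume.gain.value = s.volume;

  const chain: NodeLike[] = [source, ...eq, compressor, volume, ctx.destination];
  for (let i = 0; i < chain.length - 1; i++) {
    chain[i].connect(chain[i + 1]);
  }
  return chain;
}
```

Because the function only depends on the node-factory methods, the same wiring could serve both playback and export: with an `OfflineAudioContext`, calling `startRendering()` afterwards resolves to an `AudioBuffer` containing the processed PCM.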

Pitch preservation and speed changes are also implemented using Web Audio API features.

OPFS and localStorage — the technical guarantee that files never leave

The Origin Private File System (OPFS) is a sandboxed filesystem built into the browser. VideoBuff stores the raw bytes of imported video, audio, and image files in OPFS. OPFS is fully isolated per origin, inaccessible to other sites, and unreadable from outside the browser except through devtools. Files survive tab close, reload, and machine reboot, unless the user clears site data or the browser evicts the origin's storage under disk pressure.
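Storing an imported file in OPFS could look roughly like the sketch below. The `*Like` interfaces are structural stand-ins for the File System API handles, and `opfsFileName` and `storeAsset` (including the "asset id plus original extension" naming scheme) are hypothetical, not VideoBuff's actual implementation.

```typescript
// Structural stand-ins for the File System API members used below. In a
// browser, `root` would come from `await navigator.storage.getDirectory()`.
interface WritableLike {
  write(data: unknown): Promise<void>;
  close(): Promise<void>;
}
interface FileHandleLike {
  createWritable(): Promise<WritableLike>;
}
interface DirectoryLike {
  getFileHandle(name: string, options?: { create?: boolean }): Promise<FileHandleLike>;
}

// Hypothetical naming scheme: asset id plus the original (lowercased) extension.
function opfsFileName(assetId: string, originalName: string): string {
  const dot = originalName.lastIndexOf(".");
  const ext = dot > 0 ? originalName.slice(dot).toLowerCase() : "";
  return `${assetId}${ext}`;
}

// Copy an imported file into OPFS and return the name it was stored under.
async function storeAsset(
  root: DirectoryLike,
  assetId: string,
  file: { name: string },
): Promise<string> {
  const name = opfsFileName(assetId, file.name);
  const handle = await root.getFileHandle(name, { create: true });
  const writable = await handle.createWritable();
  await writable.write(file); // in a browser, `file` is a File or Blob
  await writable.close(); // the data is committed when the stream closes
  return name;
}
```

The returned name is what an assetId-to-filename manifest would record. An app that wants stronger durability can additionally call `navigator.storage.persist()` to ask the browser not to evict the origin's storage; whether VideoBuff does so is not stated here.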

Project structure — timeline layout, clip placement, effect parameters, text content — is JSON-serialized into the browser's localStorage. The save is debounced at ~1 second; even a browser crash recovers to near the last state. When the per-origin localStorage quota (~5 MB) is approached, the save automatically falls back to a stripped variant that omits regenerable derived data such as thumbnails and waveforms. The OPFS asset manifest (assetId → filename mapping) lives in the same localStorage namespace.
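The debounce-plus-fallback behavior can be sketched as below. The assumptions are labeled in the code: the `Project` shape is hypothetical, and this version detects the quota problem by catching the exception that `setItem` throws when the quota is exceeded, which is only one possible way to implement a fallback of this kind.

```typescript
// Structural stand-in for window.localStorage (only setItem is needed).
interface StorageLike {
  setItem(key: string, value: string): void;
}

// Hypothetical project shape: core edit state plus regenerable derived data.
interface Project {
  timeline: unknown;
  thumbnails?: Record<string, string>;
  waveforms?: Record<string, number[]>;
}

// Drop data that can be regenerated from the source media.
function stripDerived(project: Project): Project {
  const copy = { ...project };
  delete copy.thumbnails;
  delete copy.waveforms;
  return copy;
}

// Try the full JSON first; if the write throws (in browsers, a DOMException
// named "QuotaExceededError"), retry with the stripped variant.
function saveProject(
  storage: StorageLike,
  key: string,
  project: Project,
): "full" | "stripped" {
  try {
    storage.setItem(key, JSON.stringify(project));
    return "full";
  } catch {
    storage.setItem(key, JSON.stringify(stripDerived(project)));
    return "stripped";
  }
}

// Trailing-edge debounce: run `fn` once, `waitMs` after the last call.
function debounce<A extends unknown[]>(
  fn: (...args: A) => void,
  waitMs: number,
): (...args: A) => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: A) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), waitMs);
  };
}
```

A caller might wrap the save as `debounce((p: Project) => saveProject(localStorage, "project", p), 1000)` and invoke it on every edit, so a burst of changes produces a single write about one second after the last one. The key name `"project"` is made up for the example.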

These two APIs are the technical backing for "never uploads." Raw media sits in OPFS, edit structure sits in localStorage, and both are confined to the browser sandbox. There is no implemented code path that syncs or uploads either one. If you want to verify the claim, open DevTools → Network, edit for a while, and confirm that no request carries your media or project data (anonymous usage telemetry, if you have left it enabled, is the only traffic you should see).

A side benefit: this architecture is also why MCP integration is serverless. Edit state is entirely contained within the tab (in-memory JS plus OPFS plus localStorage), so the MCP server just talks to the tab over a local WebSocket — no need to project state to the cloud first. The "expose state to Claude by first putting state in a server" detour is eliminated.

(Implementation note: IndexedDB was considered during early design and is a likely future migration target if project size or diff-based persistence requirements outgrow localStorage. At the current project scale, a full-JSON write to localStorage, debounced to roughly once per second, is sufficient.)

Why this architecture is drivable from Claude Code / MCP

Once you understand the design, it becomes obvious why a browser app can be driven from Claude Code or Claude Desktop. The short answer: all edit state is contained inside the tab.

Concretely, installing the .mcpb starts a local stdio MCP server (a Node.js process) on your machine. This server is a thin bridge that forwards between two channels: stdio with Claude Desktop / Claude Code, and a localhost WebSocket to the VideoBuff tab. When Claude invokes a tool like add_clip, the MCP server relays it to the tab, the tab's edit store executes it, and the result flows back. There is no cloud service anywhere in this path.
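To make the relay concrete, here is a minimal sketch of what the tab-side end of that WebSocket might do with an incoming tool call. Only the tool name add_clip comes from the description above; the JSON message shape (`ToolRequest`, `ToolResponse`), the `EditStore` class, and the clip fields are assumptions invented for the example.

```typescript
// Hypothetical wire format for messages between the bridge and the tab.
interface ToolRequest { id: number; tool: string; args: Record<string, unknown> }
interface ToolResponse { id: number; ok: boolean; result?: unknown; error?: string }

interface Clip { assetId: string; startSec: number; durationSec: number }

// Minimal stand-in for the tab's in-memory edit store.
class EditStore {
  readonly clips: Clip[] = [];
  addClip(clip: Clip): number {
    this.clips.push(clip);
    return this.clips.length - 1; // index of the clip just added
  }
}

type ToolHandler = (store: EditStore, args: Record<string, unknown>) => unknown;

const tools: Record<string, ToolHandler> = {
  add_clip: (store, args) => {
    const { assetId, startSec, durationSec } = args;
    if (
      typeof assetId !== "string" ||
      typeof startSec !== "number" ||
      typeof durationSec !== "number"
    ) {
      throw new Error("add_clip expects assetId, startSec, durationSec");
    }
    return { clipIndex: store.addClip({ assetId, startSec, durationSec }) };
  },
};

// Parse one request, run it against the store, and serialize the reply.
function handleToolRequest(store: EditStore, raw: string): string {
  let id = -1; // reported when the request could not even be parsed
  let res: ToolResponse;
  try {
    const req = JSON.parse(raw) as ToolRequest;
    id = req.id;
    if (!Object.prototype.hasOwnProperty.call(tools, req.tool)) {
      throw new Error(`unknown tool: ${req.tool}`);
    }
    res = { id, ok: true, result: tools[req.tool](store, req.args) };
  } catch (err) {
    res = { id, ok: false, error: err instanceof Error ? err.message : String(err) };
  }
  return JSON.stringify(res);
}
```

In the tab, the WebSocket's `onmessage` handler would pass `event.data` to `handleToolRequest` and `send` the returned string back. The bridge process on the other end only has to copy messages between stdio and the socket, which is why it needs no credentials.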

This architecture works because edit state is fully local: in-memory JS objects in the tab, plus OPFS, plus localStorage. If the media lived in S3 and the project lived in an RDBMS, the MCP server would need the AWS SDK with auth, permissions, and regions to do its job. Because everything is client-side, the MCP server is a thin bridge that carries no credentials and no API keys.

From the CLI side, the experience is: launch Claude Code in a terminal, send a prompt like "line up every imported image on the timeline at 3 seconds each," and the change shows up in the browser tab seconds later. It feels the same as running git or ffmpeg from the shell, with a browser video editor added to the same tier. See the Claude integration guide for the full flow.
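For a sense of what that prompt expands to, the arithmetic is simple: image i starts at 3 × i seconds and lasts 3 seconds. The sketch below shows the sequence of add_clip arguments an agent might generate; the argument names (`assetId`, `startSec`, `durationSec`) and the function `sequentialStills` are assumptions made for illustration, not the tool's documented schema.

```typescript
// Hypothetical argument shape for one add_clip call.
interface AddClipArgs {
  assetId: string;
  startSec: number;
  durationSec: number;
}

// Expand "every image, N seconds each" into one add_clip call per image,
// placed back to back from the start of the timeline.
function sequentialStills(assetIds: string[], secondsEach: number): AddClipArgs[] {
  return assetIds.map((assetId, i) => ({
    assetId,
    startSec: i * secondsEach,
    durationSec: secondsEach,
  }));
}
```

With three imported images and 3 seconds each, the clips start at 0, 3, and 6 seconds, for a 9-second timeline.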

Browser Compatibility

The WebCodecs API is supported in Chrome and Edge from version 94 onward, which makes them the recommended browsers for VideoBuff. Safari and Firefox do not yet fully support the WebCodecs API, so some features may be limited there.

VideoBuff detects browser capabilities at startup and provides appropriate notifications when unsupported features are detected. Codec support is also checked at runtime, so if H.265 is unavailable in your browser, only H.264 will be shown as an option.
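A runtime check of this kind can be sketched as follows. WebCodecs exposes a static `VideoEncoder.isConfigSupported(config)` that resolves to an object with a `supported` flag; the sketch takes that check as a parameter so the logic stays readable outside a browser. The candidate codec strings (profile and level choices) and the names `CANDIDATES`, `hasWebCodecs`, and `availableCodecs` are assumptions for the example, not VideoBuff's actual list.

```typescript
// Candidate codec strings; the profile/level choices are assumptions.
const CANDIDATES = [
  { label: "H.264", codec: "avc1.640028" },
  { label: "H.265", codec: "hvc1.1.6.L93.B0" },
  { label: "VP9", codec: "vp09.00.10.08" },
  { label: "AV1", codec: "av01.0.04M.08" },
];

// Subset of the WebCodecs encoder config that this sketch fills in.
interface EncoderConfigLike {
  codec: string;
  width: number;
  height: number;
}
type SupportCheck = (config: EncoderConfigLike) => Promise<{ supported?: boolean }>;

// Feature detection: are the WebCodecs constructors present at all?
function hasWebCodecs(scope: object): boolean {
  return "VideoEncoder" in scope && "VideoDecoder" in scope;
}

// Ask the browser about each candidate and keep the labels it accepts.
async function availableCodecs(
  check: SupportCheck,
  width: number,
  height: number,
): Promise<string[]> {
  const labels: string[] = [];
  for (const candidate of CANDIDATES) {
    try {
      const { supported } = await check({ codec: candidate.codec, width, height });
      if (supported) labels.push(candidate.label);
    } catch {
      // A rejected check (for example, an invalid config) counts as unsupported.
    }
  }
  return labels;
}
```

In a browser, the call would look like `availableCodecs((c) => VideoEncoder.isConfigSupported(c), 1920, 1080)`, guarded by `hasWebCodecs(globalThis)`. The export dialog would then list only the returned labels.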

Browser APIs are evolving rapidly, and support across more browsers is expected to continue expanding.

Looking ahead: video editing in the browser + AI-agent era

Browser video processing capabilities are evolving rapidly. WebCodecs AV1 encoding is landing in some browsers, WebGPU is making direct GPU-based effects and real-time processing realistic, and SharedArrayBuffer / Atomics multi-threading plus WebTransport continue to improve.

VideoBuff's direction rides this trend toward "video editing an AI agent can drive." We get desktop-class processing entirely client-side, then layer natural-language access from Claude Code or Claude Desktop on top — a two-tier architecture.

The combination of local GPU plus natural-language control fundamentally changes repetitive editing work. Thirty minutes of tedious repetition in Premiere Pro becomes a single prompt. Taste from the human, implementation from the AI, processing on the local GPU — that future is already running on the browser in front of you.

Try it now

No download, no account. Open your browser and start editing right away.
