Developer ports 0.2B-parameter Moebius image inpainting model to run in-browser via WebGPU
A developer used Claude Code to convert a PyTorch-based inpainting model to ONNX and run it client-side in a browser using WebGPU, demonstrating that a model of this size can run entirely in the browser.
1 source · cross-referenced
- A 0.2B-parameter image inpainting model (Moebius) was converted from PyTorch to ONNX and run in a browser using WebGPU.
- The conversion and deployment were performed by Claude Code without the author writing any code.
- The resulting demo loads a ~1.3GB model on first use and uses browser caching to avoid re-downloads.
- The ONNX-converted weights are published on Hugging Face and the frontend is hosted on GitHub Pages.
Simon Willison reports successfully running the 0.2B-parameter Moebius image inpainting model in a web browser using WebGPU, after converting the PyTorch model to ONNX with Claude Code. The original Moebius release required PyTorch and NVIDIA CUDA, but the converted version runs client-side in the browser.
The process used ONNX Runtime Web with a WebGPU backend, the lower-level runtime that libraries like Transformers.js build on. Willison did not write any code himself; instead, he guided Claude Code through the conversion and deployment steps, including publishing the 1.24GB ONNX weights to Hugging Face and hosting the frontend on GitHub Pages.
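As a rough illustration of what "ONNX Runtime Web with a WebGPU backend" means in practice, the sketch below picks an execution-provider list with a fallback. It is not taken from the moebius-web code: the helper name `pickExecutionProviders` and the WebAssembly fallback are assumptions made for this example, and the commented-out session setup only shows how such a list is typically passed to the runtime.

```typescript
// Hypothetical helper (not from the moebius-web source): decide which
// ONNX Runtime Web execution providers to request, in order of preference.
type ExecutionProvider = "webgpu" | "wasm";

function pickExecutionProviders(hasWebGpu: boolean): ExecutionProvider[] {
  // Prefer WebGPU when the browser exposes it; keep the WebAssembly (CPU)
  // backend as a fallback. The fallback is an assumption of this sketch.
  return hasWebGpu ? ["webgpu", "wasm"] : ["wasm"];
}

// In a browser with the onnxruntime-web package installed, the list could be
// used roughly like this (modelUrl is a placeholder, not the real weights URL):
//
//   import * as ort from "onnxruntime-web/webgpu";
//   const hasWebGpu = "gpu" in navigator; // basic feature detection only
//   const session = await ort.InferenceSession.create(modelUrl, {
//     executionProviders: pickExecutionProviders(hasWebGpu),
//   });
```

Checking `"gpu" in navigator` only confirms that the API is exposed; a fuller check would also request an adapter before committing to the WebGPU path.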
The resulting demo is available at simonw.github.io/moebius-web and allows users to upload an image, select regions to remove, and run the inpainting model entirely in the browser. The first load downloads approximately 1.3GB of model weights, but subsequent reloads leverage browser caching to avoid re-downloading.
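The report does not say which caching mechanism the demo relies on (ordinary HTTP caching, the Cache API, or something else). The sketch below only illustrates the general download-once pattern behind that behavior; the `ModelCache` interface, `loadModelBytes`, and the in-memory `MemoryCache` are hypothetical names invented for this example.

```typescript
// Hypothetical abstraction over whatever storage the browser provides.
interface ModelCache {
  get(url: string): Promise<Uint8Array | undefined>;
  set(url: string, bytes: Uint8Array): Promise<void>;
}

// Any function that fetches the model bytes over the network.
type Downloader = (url: string) => Promise<Uint8Array>;

// Return cached bytes when present; otherwise download once and store them.
async function loadModelBytes(
  url: string,
  cache: ModelCache,
  download: Downloader,
): Promise<Uint8Array> {
  const cached = await cache.get(url);
  if (cached !== undefined) {
    return cached; // cache hit: no network transfer
  }
  const bytes = await download(url);
  await cache.set(url, bytes);
  return bytes;
}

// In-memory stand-in used only to demonstrate the pattern. A real page would
// back this with persistent browser storage so it survives reloads.
class MemoryCache implements ModelCache {
  private store = new Map<string, Uint8Array>();

  async get(url: string): Promise<Uint8Array | undefined> {
    return this.store.get(url);
  }

  async set(url: string, bytes: Uint8Array): Promise<void> {
    this.store.set(url, bytes);
  }
}
```

With a persistent cache behind the same interface, only the first visit pays the roughly 1.3GB download cost; later visits resolve from local storage.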
Willison describes this as an example of "vibe coding": his primary contributions were testing, suggesting small improvements, and pointing the agent toward examples of the desired functionality. He also documented the process and learned that Chrome, Firefox, and Safari can all run this class of model in the browser.
- Jun 23, 2026 · Simon Willison’s Weblog