Latent Space roundup highlights agent harness engineering and open-weight model access
Meta’s Brain2Qwerty v2, Cursor’s iOS agent launch, and Arena’s $100M ARR milestone frame shifts in agent tooling and evaluation.
1 source · cross-referenced
- Meta released Brain2Qwerty v2, a real-time sentence decoder from non-invasive brain signals with open training code and dataset.
- Cursor launched iOS with always-on cloud agents and remote control of computer-based agents.
- Arena reached a $100M ARR run rate eight months after launching its evaluation product, emphasizing post-deployment and agent evaluation.
- Devin Fusion launched a hybrid-model coding harness claiming 35% lower cost while maintaining 'Fable-level' quality.
- DeepSeek’s DSpark improved speculative decoding, reporting 30.9% higher accepted length than Eagle3 and 16.3% higher than DFlash on Qwen3-4B.
Meta released Brain2Qwerty v2, a real-time sentence decoder that translates raw, non-invasive brain signals into words and semantics, narrowing the gap with invasive brain-computer interfaces. The company also released training code for versions 1 and 2 along with the BCBL v1 dataset, signaling a push toward reproducible research in neural-signal modeling.
Cursor launched Cursor for iOS, integrating always-on cloud agents and remote control of agents operating on a user’s computer. Follow-up commentary highlighted features like Live Activities and diff review on mobile, reflecting a shift toward mobile-first agent workflows.
Arena announced it reached a $100M annual recurring revenue run rate eight months after launching its evaluation product. The platform is expanding beyond pre-deployment benchmarks to emphasize post-deployment and agent evaluation, reflecting growing demand for reliable agent assessment in production environments.
Cognition introduced Devin Fusion, a hybrid-model coding harness that claims to reduce costs by 35% while maintaining quality comparable to 'Fable-level' coding. The system delegates bounded subtasks to cheaper models while keeping an expensive planner in the loop, illustrating a broader trend toward multi-model orchestration in agent systems.
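The delegation pattern described above can be sketched as a simple router. This is a minimal illustration, not Cognition's implementation: the `Subtask`, `Model`, and `run_task` names, the `bounded` flag, and the per-call prices are all hypothetical, and the stand-in models only echo their prompt.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Subtask:
    description: str
    bounded: bool  # hypothetical flag: True when the subtask is small and self-contained


@dataclass
class Model:
    name: str
    cost_per_call: float  # hypothetical flat price per call, arbitrary units
    run: Callable[[str], str]


def run_task(subtasks: List[Subtask], planner: Model, worker: Model) -> Tuple[List[str], float]:
    """Send bounded subtasks to the cheap worker and everything else to the planner."""
    outputs: List[str] = []
    total_cost = 0.0
    for sub in subtasks:
        model = worker if sub.bounded else planner
        outputs.append(model.run(sub.description))
        total_cost += model.cost_per_call
    return outputs, total_cost


# Stand-in models that just tag the prompt with their name (assumption for the demo).
planner = Model("planner", 10.0, lambda prompt: f"[planner] {prompt}")
worker = Model("worker", 1.0, lambda prompt: f"[worker] {prompt}")

subtasks = [
    Subtask("design the refactor", bounded=False),
    Subtask("rename a variable", bounded=True),
    Subtask("write a unit test", bounded=True),
]

outputs, hybrid_cost = run_task(subtasks, planner, worker)
baseline_cost = planner.cost_per_call * len(subtasks)  # planner-only run
savings = 1 - hybrid_cost / baseline_cost
print(f"hybrid={hybrid_cost}, baseline={baseline_cost}, savings={savings:.0%}")
```

With these made-up prices the hybrid run costs 12 units against 30 for a planner-only run. That figure bears no relation to the reported 35%; the real saving depends on the price ratio between models and on how many subtasks qualify as bounded.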
DeepSeek’s DSpark, an inference system focused on speculative decoding, reported 30.9% higher accepted length than Eagle3 and 16.3% higher than DFlash on Qwen3-4B. It is being integrated into the production engines for DeepSeek-V4-Flash and V4-Pro, and the vLLM community is adopting it as a new state-of-the-art single-GPU decoding path.
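For readers unfamiliar with the metric, the sketch below shows one common way accepted length is computed under greedy verification: a draft model proposes several tokens, the target model checks them, and each step emits the matching prefix plus one token from the target. It is an illustration only, not DSpark's method; the token IDs, the helper names, and the 3.0 baseline are hypothetical.

```python
from typing import List, Sequence, Tuple


def accepted_prefix_len(draft: Sequence[int], target: Sequence[int]) -> int:
    """Count leading draft tokens that match the target model's greedy choices."""
    n = 0
    for d, t in zip(draft, target):
        if d != t:
            break
        n += 1
    return n


def mean_accepted_length(steps: List[Tuple[Sequence[int], Sequence[int]]]) -> float:
    """Mean tokens emitted per verification step.

    Each step emits the accepted draft prefix plus one token from the target
    (a correction on mismatch, or a bonus token when every draft token matched).
    """
    lengths = [accepted_prefix_len(draft, target) + 1 for draft, target in steps]
    return sum(lengths) / len(lengths)


def relative_gain_pct(new: float, old: float) -> float:
    """Percentage improvement of `new` over `old`."""
    return (new - old) / old * 100.0


# Hypothetical (draft, target) token IDs for three verification steps.
steps = [
    ([5, 9, 2, 7], [5, 9, 2, 7]),  # all 4 draft tokens accepted -> 5 emitted
    ([5, 9, 2, 7], [5, 9, 3, 1]),  # 2 accepted -> 3 emitted
    ([4, 1, 1, 1], [8, 1, 1, 1]),  # 0 accepted -> 1 emitted
]
tau = mean_accepted_length(steps)
print(f"mean accepted length: {tau}")

# A 30.9% gain over a hypothetical baseline of 3.0 tokens per step
# would correspond to about 3.927 tokens per step.
gain = relative_gain_pct(3.927, 3.0)
print(f"relative gain: {gain:.1f}%")
```

A longer accepted length means fewer target-model forward passes per generated token, which is where the decoding speedup comes from.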
Community discussions highlighted the increasing use of coding agents for closed-loop experimental iteration in machine learning research. Meta noted that an Auto Research workflow powered by a coding agent discovered and implemented improvements that reduced word error rate beyond standard hyperparameter optimization, suggesting agentic systems are becoming integral to ML experimentation.
Chinese open-weight model competition continued to accelerate. Meituan was flagged as preparing to release LongCat 2.0 / Owl Alpha, a model with 1.6T total parameters, ~48B active parameters, a 1M-token context window, and 35T training tokens, trained on 50k Chinese accelerators. Analysts framed it as potentially the first near-frontier model trained at this scale on domestic Chinese hardware.
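To put the reported LongCat 2.0 / Owl Alpha figures in proportion, the short calculation below derives the share of parameters active per token and the training tokens per parameter. It uses only the numbers quoted above; the active-parameter count is approximate in the source, so the results are rough, and the variable names are illustrative.

```python
# Figures as reported in the roundup; active_params is approximate (~48B).
total_params = 1.6e12
active_params = 48e9
training_tokens = 35e12

# Fraction of the model's weights used for any single token.
active_fraction = active_params / total_params

# Training tokens seen per total parameter.
tokens_per_total_param = training_tokens / total_params

print(f"active fraction: {active_fraction:.1%}")
print(f"tokens per total parameter: {tokens_per_total_param:.1f}")
```

Roughly 3% of the weights are active for each token, and the model sees about 22 training tokens per total parameter.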
Open-weight model access is being productized through services like Cline’s $9.99/month pass, which bundles access to GLM 5.2, DeepSeek, Kimi, MiniMax, Qwen, and others, reducing friction around API keys and provider churn. This reflects a broader trend toward commoditizing access to high-performing open models for agentic workflows.