Name: GPT-5.5
Price: 20 USD
Availability: InStock
Rating: 63.4 (1 review)
Author: OpenAI

GPT-5.5 by OpenAI | AI Market Cap

HF Papers | Google | research | 1w ago

VisualClaw: A Real-Time, Personalized Agent for the Physical World

Vision language models (VLMs) increasingly serve as general-purpose interfaces for complex multimodal tasks. However, deployment still faces three gaps: VLMs typically incur high latency and cost when processing dense video frames and long prompts, the agent scaffold remains static after deployment, and standard video-QA benchmarks do not test whether agents can use visual evidence inside tool-using workspaces. We present VisualClaw, a self-evolving multimodal agent built around two principles. First, hybrid encoding reduces deployment cost by filtering less informative streaming frames with a cascaded gate and compressing the text skill bank through hot/cold top-k injection. Second, skill evolution lets the agent learn from failures: retrieved memories condition an evolver, either as directly concatenated context or as guided evidence, producing skill-bank updates that help with future questions. Across 4 video-QA benchmarks with 2 VLMs, VisualClaw cuts per-question API cost by an average of 98% versus full-frame upload and by 25.9% versus the offline uniform 8-frame baseline, while improving accuracy in most settings, e.g., by an average of +3.85% and a peak of +15.80% on EgoSchema with Gemini 3 Flash. To address the benchmark gap, we curate VisualClawArena, a 200-scenario multimodal agentic benchmark built through a strict five-stage pipeline; models must use video evidence, documents, dynamic updates, and executable checks inside a workspace. On VisualClawArena, the same framework with computer-use agent backends improves macro accuracy by +2.9% for Codex (GPT-5.5) and +3.2% for Claude Code (Sonnet 4.6) over no-evolution baselines, with a 9.5% cost reduction compared to the uniform-sampled baseline. These properties make VisualClaw a natural fit for edge applications, where the cascade reduces a 1-hour streaming session from ~3,600 API uploads to only 5-20 calls and self-evolution makes it well suited as a personalized assistant.
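As a rough check on the streaming figures quoted in the abstract, the short Python sketch below computes the fraction of API uploads avoided when a 1-hour session drops from ~3,600 uploads to 5-20 calls. The function name `upload_reduction` is hypothetical, and reading ~3,600 uploads per hour as roughly one upload per second is an inference from the abstract's numbers, not something the paper states here. Note that this measures call count, which is a different quantity from the per-question API cost reduction reported for the benchmarks.

```python
def upload_reduction(baseline_uploads: int, gated_calls: int) -> float:
    """Return the fraction of API calls avoided relative to the baseline."""
    if baseline_uploads <= 0:
        raise ValueError("baseline_uploads must be positive")
    return 1.0 - gated_calls / baseline_uploads


# Figures taken from the abstract; the 1-upload-per-second reading is an assumption.
BASELINE_UPLOADS = 3600      # ~3,600 uploads in a 1-hour streaming session
GATED_CALL_RANGE = (5, 20)   # calls remaining after the cascaded gate

# Map each end of the reported call range to its fractional reduction.
reductions = {
    calls: upload_reduction(BASELINE_UPLOADS, calls) for calls in GATED_CALL_RANGE
}

for calls, fraction in reductions.items():
    print(f"{calls} calls: {fraction:.2%} fewer uploads")
```

Under these assumptions, the reported range corresponds to roughly 99.4% to 99.9% fewer uploads per hour.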

#huggingface #daily-papers

GPT-5.5

Similar Models

Introducing LifeSciBench

Social & Blog Posts (14)

Research Papers (5)

Other

Introducing GPT-5.5

We’re sharing new research on a method for anticipating how models may behave in real-world use before release: simulating deployment with recent, de-identified user requests and studying candidate mo

Predicting model behavior before release by simulating deployment

Introducing the OpenAI Partner Network

Samsung Electronics brings ChatGPT and Codex to employees

As AI takes on longer, higher-stakes tasks, we want models to carry beneficial and safe behavior into new domains beyond their training—and maintain it under pressure. That’s the idea behind our new r

Improving health intelligence in ChatGPT

Introducing LifeSciBench, a benchmark for measuring and improving how well AI supports real-world life science research. Developed with 173 scientists from biotechnology and pharmaceutical research, L

Introducing LifeSciBench

Let’s talk about evals. We’re always looking for better ways to measure and forecast model progress, especially as benchmarks get saturated or gamed. @tejalpatwardhan, who leads our frontier evals tea

We’re bringing new capabilities to GPT-Rosalind, a model series purpose-built for life sciences research at enterprise scale. It brings GPT-5.5’s agentic coding and tool use together with stronger int

Warp’s big bet on building open source with GPT-5.5

Databricks brings GPT-5.5 to enterprise agent workflows

Scaling Trusted Access for Cyber with GPT-5.5 and GPT-5.5-Cyber

Want to secure an early ticket to OpenAI DevDay? Build something with GPT-5.5 and Image Gen. Each week, we’ll select 2–3 favorites to win free tickets to OpenAI DevDay 2026. Codex will help us find th

CEO-Bench: Can Agents Play the Long Game?

VisualClaw: A Real-Time, Personalized Agent for the Physical World

Claw-Anything: Benchmarking Always-On Personal Assistants with Broader Access to User's Digital World

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Self-Improving CAD Generation Agents with Finite Element Analysis as Feedback

Introducing GPT‑5 for developers

GPT-5.5 - GAIA

GPT-5.5 Model | OpenAI API
