Name: Gemini 2.5 Pro
Price: 20 USD
Availability: InStock
Author: Google

Gemini 2.5 Pro by Google | AI Market Cap

HF Papers | Google | research | 1w ago

JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence

Many moments in the real world do not wait for a user to ask. A fire starts on a security monitor, an expression flickers across a video call, or a product a viewer wants flashes by in a livestream. Yet today's large models remain mostly turn-based by design: they answer only when addressed, and even video-call apps that appear interactive still operate as question-answer systems, reacting only when polled or prompted. We argue for a different paradigm: a model that is present in the world like a person. It continuously watches what is happening now, decides on its own whether to speak or stay silent, interacts in real time, and delegates to a background model when the problem is hard. To advance interaction models and their adoption across domains, we make two fully open-sourced contributions. First, we release JoyAI-VL-Interaction, an 8B-scale, vision-first VL-interaction model. The model makes the response decision internally, choosing each second to stay silent, respond, or delegate to a background model, and it excels at vision-triggered responsiveness and time awareness. We pair it with a transferable training recipe, from which capabilities we never trained for emerge, such as guiding a shopper through changing app screens or improvising a lecture from a slide deck. Second, we release a complete, deployable system built around that model. The system streams any ongoing video into the model, making it genuinely present in the world. All other components are pluggable, including ASR/TTS modules, memory, visualization UI, and a background brain that can connect to any API or agent. Across six real-world scenarios, human raters prefer JoyAI-VL-Interaction over the in-app video-call assistants of Doubao and Gemini by a wide margin. To our knowledge, this is the first open, vision-driven interaction model released together with its training recipe, data, and complete deployable system.

#huggingface #daily-papers

Similar Models

Google DeepMind 🤝 @A24 We’re launching a research partnership with A24 to ensure the tools of the future are shaped by the creators who use them. Find out more → https://t.co/KN3HdGVjGS https://t.co/

Social & Blog Posts (5)

Benchmarks & Rankings (17)

Gemini 2.5 Pro Benchmark Update

Research Papers (6)

Other

gemini-2.5 - Arena-Hard-Auto

Our Robotics Accelerator has launched with 15 startups helping shape the future of physical AI in Europe. 🤖 This three-month program will connect them with access to our AI stack, Gemini Robotics mod

When millions of AI agents interact with each other, new collective behaviors can emerge. 🌐 Together with @schmidtsciences, @coop_ai, @ARIA_research and supported by @GoogleOrg, we’re launching a $10

In Sierra Leone, a surging student population is outpacing available teachers. Our latest research explores how AI can act as a partner to support educators in these environments – amplifying their re

DiffusionGemma is our new experimental open model with up to 4x faster output on dedicated GPUs. Instead of predicting word-by-word, it generates entire blocks of text simultaneously. This lets the mo

Gemini 2.5 Pro Benchmark Update (16 entries with this title)

The FID Lottery: Quantifying Hidden Randomness in Generative-Model Evaluation

Physics-IQ Verified

From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning

Show the Signal, Hide the Noise: Spectral Forcing for Pixel-Space Diffusion

JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence

Towards One-to-Many Temporal Grounding

gemini-2.5-pro - SWE-Bench Verified