Introducing Gemini 2.0: our new AI model for the agentic era

https://blog.google/technology/google-deepmind/google-gemini-ai-update-december-2024/#ceo-message

 

Gemini 2.0 Flash

Gemini 2.0 Flash builds on the success of 1.5 Flash, our most popular model yet for developers, with enhanced performance at similarly fast response times. Notably, 2.0 Flash even outperforms 1.5 Pro on key benchmarks, at twice the speed. 2.0 Flash also comes with new capabilities. In addition to supporting multimodal inputs like images, video and audio, 2.0 Flash now supports multimodal output, like natively generated images mixed with text and steerable text-to-speech (TTS) multilingual audio. It can also natively call tools like Google Search and code execution, as well as third-party user-defined functions.
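The user-defined function calling mentioned above works by declaring a function schema in the request, which the model can then choose to call. The sketch below builds such a request body following the public Gemini API REST schema; the `get_weather` function and its parameters are hypothetical examples, not part of this announcement.

```python
import json

def build_tool_request(prompt: str) -> dict:
    """Build a generateContent request body that registers a user-defined
    function as a tool. The function here ("get_weather") is a hypothetical
    example; the surrounding field names follow the public REST schema."""
    return {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "tools": [{
            "function_declarations": [{
                "name": "get_weather",
                "description": "Look up the current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string", "description": "City name"}
                    },
                    "required": ["city"],
                },
            }]
        }],
    }

body = build_tool_request("What's the weather in Paris?")
print(json.dumps(body, indent=2))
```

If the model decides to use the tool, the response contains a `functionCall` part with the arguments; the client executes the function and sends the result back in a follow-up turn.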

[Chart: comparing Gemini models and their capabilities]

Our goal is to get our models into people’s hands safely and quickly. Over the past month, we’ve been sharing early, experimental versions of Gemini 2.0, getting great feedback from developers.

Gemini 2.0 Flash is available now as an experimental model to developers via the Gemini API in Google AI Studio and Vertex AI. Multimodal input and text output are available to all developers, while text-to-speech and native image generation are available to early-access partners. General availability will follow in January, along with more model sizes.
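Calling the experimental model through the Gemini API can be sketched with a plain REST request, as below. The endpoint path and model id (`gemini-2.0-flash-exp`) are assumptions based on the public `v1beta` API; set `GEMINI_API_KEY` to actually send a request, otherwise the sketch only builds the payload.

```python
import json
import os
import urllib.request

MODEL = "gemini-2.0-flash-exp"  # assumed experimental model id
ENDPOINT = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent"

def build_request(prompt: str) -> dict:
    """Minimal generateContent request body: one user turn with one text part."""
    return {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}

def generate(prompt: str) -> str:
    """Send the request if an API key is configured; otherwise explain how to."""
    key = os.environ.get("GEMINI_API_KEY")
    if not key:
        return "(set GEMINI_API_KEY to call the API)"
    data = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{ENDPOINT}?key={key}",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        out = json.loads(resp.read())
    # Return the text of the first candidate's first part.
    return out["candidates"][0]["content"]["parts"][0]["text"]

print(generate("Explain what an agentic AI model is in one sentence."))
```

The official SDKs in Google AI Studio and Vertex AI wrap this same request shape; the raw form is shown here only to make the wire format concrete.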

To help developers build dynamic and interactive applications, we’re also releasing a new Multimodal Live API with real-time audio and video streaming input and the ability to use multiple, combined tools. More information about 2.0 Flash and the Multimodal Live API can be found in our developer blog.
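The Multimodal Live API is session-based: a client opens a streaming connection and sends a setup message before streaming audio or video. The sketch below builds such an opening message; the field names (`setup`, `model`, `generation_config`, `response_modalities`) are assumptions based on the public developer docs, not specified in this post.

```python
import json

def build_setup_message(model: str = "models/gemini-2.0-flash-exp") -> str:
    """Build the JSON "setup" message a client would send first over the
    Multimodal Live API's streaming connection. Field names are assumed
    from the public developer docs."""
    setup = {
        "setup": {
            "model": model,
            "generation_config": {
                # Ask for spoken responses; "TEXT" is the other option.
                "response_modalities": ["AUDIO"],
            },
        }
    }
    return json.dumps(setup)

print(build_setup_message())
```

After setup, the client streams input chunks and receives incremental model output over the same connection; see the developer blog for the full protocol.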

Gemini 2.0 available in Gemini app, our AI assistant

Also starting today, Gemini users globally can access a chat-optimized version of 2.0 Flash experimental by selecting it in the model drop-down on desktop and mobile web; it will be available in the Gemini mobile app soon. With this new model, users can experience an even more helpful Gemini assistant.

Early next year, we’ll expand Gemini 2.0 to more Google products.

 

Building responsibly in the agentic era

Gemini 2.0 Flash and our research prototypes allow us to test and iterate on new capabilities at the forefront of AI research that will eventually make Google products more helpful.

As we develop these new technologies, we recognize the responsibility it entails, and the many questions AI agents open up for safety and security. That is why we are taking an exploratory and gradual approach to development: conducting research on multiple prototypes, iteratively implementing safety training, working with trusted testers and external experts, and performing extensive risk assessments and safety and assurance evaluations.

For example:

  • As part of our safety process, we’ve worked with our Responsibility and Safety Committee (RSC), our longstanding internal review group, to identify and understand potential risks.
  • Gemini 2.0's reasoning capabilities have enabled major advancements in our AI-assisted red teaming approach, including the ability to go beyond simply detecting risks to now automatically generating evaluations and training data to mitigate them. This means we can more efficiently optimize the model for safety at scale.
  • As Gemini 2.0’s multimodality increases the complexity of potential outputs, we’ll continue to evaluate and train the model across image and audio input and output to help improve safety.
  • With Project Astra, we’re exploring potential mitigations against users unintentionally sharing sensitive information with the agent, and we’ve already built in privacy controls that make it easy for users to delete sessions. We’re also continuing to research ways to ensure AI agents act as reliable sources of information and don’t take unintended actions on your behalf.
  • With Project Mariner, we’re working to ensure the model learns to prioritize user instructions over third-party prompt-injection attempts, so it can identify potentially malicious instructions from external sources and prevent misuse. This helps protect users from fraud and phishing attempts, such as malicious instructions hidden in emails, documents or websites.

We firmly believe that the only way to build AI is to be responsible from the start and we'll continue to prioritize making safety and responsibility a key element of our model development process as we advance our models and agents.

 

posted @ 2024-12-20 19:48 lightsong