- The Rundown AI
An exclusive look into Google's new AI models
PLUS: Real-world Gemini use cases, AI agents, and more
Welcome, AI enthusiasts.
We have an exclusive for you today.
In case you missed it, last week Google released two new upgraded Gemini 1.5 models, achieving new state-of-the-art performance across math benchmarks.
We partnered with Google to break down what makes these new models so special for developers, plus real-world use cases, AI agents, and more. Let’s get into it…
In today’s AI rundown:
Google’s two new Gemini 1.5 models
Gemini 1.5 compared to other AI models
The age of the AI-first developer
Real-world use cases of Gemini 1.5
Proactive AI agent systems
– Rowan Cheung, founder
EXCLUSIVE Q&A WITH LOGAN KILPATRICK
GEMINI
Image credits: Kiki Wu / The Rundown
The Rundown: Google just released two new upgraded versions of Gemini 1.5 across the Gemini API: Gemini 1.5 Pro-002, which achieved state-of-the-art performance across math benchmarks, and Gemini 1.5 Flash-002, which makes big gains in instruction following.
Cheung: “Can you give us the rundown on everything being released and why it actually matters?”
Kilpatrick: “Today, we're rolling out two new production-ready Gemini models and also improving rate limits, pricing for 1.5 Pro, and some of the filter settings enabled by default. Really, all these are focused on enabling developers to go in and build more of the stuff that they're excited about.”
Cheung: “What exactly makes the new models so unique?”
Kilpatrick: “Math, the ability for the models to code, which is obviously super important for people who care about developer stuff. It's been a lot of listening and sort of iterating on the feedback that we've been getting from the ecosystem.”
Kilpatrick added: “The linear amount of progress that we've seen with, and in some cases, exponential in different benchmarks with this iteration of Gemini models… has been incredibly exciting.”
Why it matters: Google’s new Gemini 1.5 Pro-002 model achieves state-of-the-art performance across challenging math benchmarks like AMC, AIME 2024, and MATH. This means the model is able to solve advanced mathematical problems and tasks that require deep domain expertise, a major hurdle for most previous AI models.
You can try AI Studio and the new Gemini 1.5 models for free here.
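If you want to try the new models programmatically rather than through the AI Studio UI, a minimal sketch of a Gemini API call looks like the following. This assumes the public REST endpoint pattern for `generateContent` and a `GEMINI_API_KEY` environment variable (both assumptions; check the official docs for the current endpoint and auth setup):

```python
import json
import os
import urllib.request

# Assumed model name and endpoint, following the public Gemini REST API pattern.
MODEL = "gemini-1.5-flash-002"
ENDPOINT = f"https://generativelanguage.googleapis.com/v1beta/models/{MODEL}:generateContent"


def build_request(prompt: str) -> dict:
    """Build the JSON body for a simple text-only generateContent call."""
    return {"contents": [{"parts": [{"text": prompt}]}]}


def generate(prompt: str) -> str:
    """Send the prompt to the API and return the first candidate's text.

    Requires GEMINI_API_KEY in the environment; raises otherwise so the
    sketch stays safe to import without network access.
    """
    key = os.environ.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("Set GEMINI_API_KEY to call the Gemini API")
    req = urllib.request.Request(
        f"{ENDPOINT}?key={key}",
        data=json.dumps(build_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # Pull the first text part of the first candidate.
    return data["candidates"][0]["content"]["parts"][0]["text"]
```

The request body is deliberately split into its own helper so you can inspect or log exactly what gets sent before wiring in a real key.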
HEAD-TO-HEAD
Image credits: Kiki Wu / The Rundown
The Rundown: Google also announced significant improvements to accessibility for developers building with Gemini models, including a 50% price cut on 1.5 Pro, 2x higher rate limits on 1.5 Flash and 3x higher on 1.5 Pro, 2x faster output, and 3x lower latency.
Cheung: “In addition to the new updates, higher rate limits, expanded feature access, and large context windows, what other capabilities does Gemini 1.5 offer that developers should be really excited about?”
Kilpatrick: “Part of my perspective is the financial burden to build with AI is one of the rate limiters of this technology being accessible… our strategy to combat this is we have the most generous free tier of any language model that exists in the world.”
Kilpatrick added: "One of the big differentiators is you can come to AI Studio, fine-tune Gemini 1.5 Flash for free, and then ultimately put that model into production and pay the same extremely competitive, per million token cost. There's no incremental cost to use a fine-tuned model, which is super differentiated in the ecosystem.”
Why it matters: Google's latest Gemini updates significantly lower the financial barrier for AI development while boosting performance, especially in math. With these updates, Gemini now tops the LLM leaderboard in performance-to-price ratio, context window size, video understanding, and other benchmarks.
The pace of innovation: Google’s Gemini project is only around a year old. Google was the first to ship 1M (and then 2M) token context windows and context caching, and they’ve been making rapid progress ever since.
THE AI ERA
Image credits: Kiki Wu / The Rundown
The Rundown: AI is helping developers tackle significantly harder problems faster while simultaneously lowering the entry barrier for non-developers to contribute to new innovation and even build their own AI apps.
Cheung: “I think what's really cool with the age of AI is seeing anyone, even people who are not technical, being able to build their own AI apps. If someone were to start from zero, is there a tool stack, documentation, courses, videos, or maybe tutorials from Google that you would recommend?”
Kilpatrick: “To your point… As someone who was formerly a software engineer, I really can go and tackle 10x more difficult problems now.”
Kilpatrick added: “For the person who's never coded before, they're now able to tackle like any problem with code because they have this co-pilot in their hands.”
Kilpatrick added: “[For beginners] ai.google.dev is our default landing page that also links out to the Gemini API documentation. On GitHub, we have a Quickstart repo where you can literally run four commands and have a local version of AI Studio and Gemini running on your computer to play around with the models.”
Why it matters: With AI as an assistant, some developers are tackling 10x more challenging software problems—which also means 10x the speed of improvements and 10x the innovation, for those who use the tech wisely. Google also has great resources to help even complete beginners get started in less than 5 minutes.
USE CASES
The Rundown: Gemini 1.5's multimodal capabilities enable a host of real-world applications that other models can't match, such as processing and analyzing hour-long videos or entire books—thanks to its impressive 2M token context window.
Cheung: “Can you share an example or some use cases of how customers are using these experimental models of Gemini in the real world?”
Kilpatrick: “Taking in video, I think, is one of the coolest things… Being able to go into AI Studio and just drop an hour-long video in there and ask a bunch of questions is such a mind-blowing experience. And to be able to try it for free.”
Kilpatrick added: "The intent was to build a multimodal model from the ground up…the order of magnitude of important use cases for the world, for developers and for people who want to build with this technology, so many of them are multimodal."
Why it matters: Gemini 1.5's 2M context window allows it to process and analyze long-form content like long videos, entire books, and lengthy podcasts, opening new possibilities for content analysis and interaction. For a full look at its potential, check out Google's list of 185 real-world gen AI use cases from leading organizations.
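Mechanically, a multimodal request just interleaves media parts and text parts in the same `contents` payload. The sketch below builds such a body with a small clip sent inline as base64; the `inline_data` field name is an assumption based on the public REST API reference, and hour-long videos would instead be uploaded via the Files API and referenced by URI:

```python
import base64
import json


def build_video_request(video_bytes: bytes, question: str,
                        mime_type: str = "video/mp4") -> dict:
    """Build a generateContent body mixing one video part and one text part.

    Only small clips fit inline as base64; large files go through the
    Files API instead. Field names (`inline_data`, `mime_type`) are
    assumed from the public Gemini REST API reference.
    """
    return {
        "contents": [{
            "parts": [
                {"inline_data": {
                    "mime_type": mime_type,
                    "data": base64.b64encode(video_bytes).decode("ascii"),
                }},
                {"text": question},
            ]
        }]
    }


# Tiny placeholder payload just to show the resulting structure.
body = build_video_request(b"\x00\x01", "List the key moments in this clip.")
print(json.dumps(body, indent=2))
```

The same shape extends naturally: append more `{"text": ...}` parts to ask follow-up questions about the same clip within one conversation turn.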
AI AGENTS
Image credits: Kiki Wu / The Rundown
The Rundown: The future of AI is likely to shift from reactive to proactive systems, with AI agents capable of initiating actions and asking for clarification or permission, much like human assistants do today.
Cheung: “What do you think the most surprising way AI will change our daily lives in the future?”
Kilpatrick: "With most AI systems today, it's one way. Sort of, I prompt the system and then it gives me a response back or I tell it to do something and it sort of does what I might instruct it to do.”
Kilpatrick added: “I think the future is, in the medium term, the system actually asking me for permission or clarification on things that I might want it to go do and really solving those problems.”
Kilpatrick added: “It's actually very interesting to me that very few AI systems, if any today, ask me how they can help in an actual, not surface-level way that ends up being meaningful.“
Why it matters: By shifting from purely reactive to proactive systems, AI could become more like a true “Her”-like assistant, anticipating needs and offering solutions before being prompted. Today, few (if any) AI systems do this effectively, but as AI continues to advance with projects like Astra, this is likely the next stage for AI.
GO DEEPER
INTERVIEW
In the full interview with Logan Kilpatrick & Rowan Cheung:
Dive deep into state-of-the-art math achievements of the new models
Talk about real-world use cases of Gemini 1.5, and exciting possibilities
Go in-depth on how to succeed and thrive in the new age of AI
Nerd out on the final form factors of AI and proactive AI agents
Listen on Twitter/X, Spotify, Apple Music, or YouTube.