The Rundown: Google just officially rebranded its AI chatbot Bard as Gemini and launched major upgrades, including a new mobile app and access to its highly anticipated Ultra model.
Gemini Advanced gives access to Ultra 1.0, Google's "largest and most capable" AI model that competes with GPT-4 on performance benchmarks.
The Advanced tier is accessible as part of a $19.99/month Google One AI Premium subscription plan (the same price as ChatGPT Plus 👀).
A Gemini mobile app rolls out on Android and iOS in the U.S., allowing enhanced on-device assistance via typing, voice, and images.
The Android app supports some existing Google Assistant features, including making calls, controlling smart devices, and accessing info from other Google apps.
Images generated with Gemini will now feature an invisible digital watermark using a tool called ‘SynthID’.
Gemini (finally) made its debut in Canada, where it is now available in both English and French.
Why it matters: The transition from Bard to Gemini is finally complete — and Google just took the LLM race to the next level. With Ultra now pushing ChatGPT for top model status, all eyes turn to OpenAI for a counter-punch with a GPT-4.5 or 5 release.
The Rundown: OpenAI is reportedly building two types of AI agents — one for devices and another for web tasks that can autogenerate expense reports, transfer data between docs, book travel, and more.
The device agent can take over the user's device to perform productivity tasks requiring text and clicks.
The web agent will collect data online and book services autonomously without the need for third-party APIs.
The agent vision is a step toward evolving AI into a more personal assistant with deep knowledge of individual users and their workplaces.
Why it matters: AI agents were a trendy buzzword last year, but none of the iterations on the market truly seemed capable of carrying out ambitious tasks autonomously. High-functioning agents would blast us into a new era of AI possibilities — and OpenAI is once again leading the charge.
The Rundown: Huawei researchers just published a new paper arguing that embodied AI agents that can continuously interact with and learn from the real world represent the next step towards artificial general intelligence.
The researchers detail that static LLMs like ChatGPT lack bodies, so they can't actively sense or interact with the real world.
Embodied AI will be able to see and hear environmental input, take physical actions, and learn from experiences.
This interactivity mirrors how humans and animals gain understanding through trial-and-error, with researchers arguing a physical body is a necessary step in the quest for AGI development.
A proposed framework details how to potentially embody AI, but hardware limitations still currently pose challenges.
Why it matters: Though the hardware is an obvious challenge, advances in robotics (even some in today’s newsletter!) indicate that giving AI a functional body to enable dynamic learning is not far off — and may be the missing ingredient for progress towards truly intelligent agents.
The U.S. Department of Commerce announced the launch of the AI Safety Institute Consortium, with over 200 stakeholders (including OpenAI, Google, Anthropic, and Meta) uniting to support the safe development of the tech.
Nvidia has reportedly tapped Intel Foundry Services to ramp up its GPU packaging, which could scale production to 300,000 H100 GPUs per month.
Brilliant Labs announced Frame, the first glasses with a built-in multimodal AI assistant capable of enhancing daily activities and interacting with digital and physical worlds.
Perplexity announces a partnership with Brilliant Labs’ Frame to integrate real-time answers into the AI-enhanced glasses.
1X Tech posted a new demo showing off the startup's autonomous android robots carrying out complex tasks without manual intervention.
Stable Audio released a new model called AudioSparx 1.0, capable of creating long-form music with more varied structure than competing models.
HyperWrite teased two new features called Agent Trainer and Agent Studio, allowing users to teach the AI to learn and repeat tasks.
Midjourney opened alpha testing of its web platform for users with 1,000+ generated images, alongside a quiet announcement of a planned mobile app for the AI image generator.
THAT’S A WRAP
Get your product in front of 450k+ AI enthusiasts
Our newsletter is read by thousands of tech professionals, investors, engineers, managers, and business owners around the world. Get in touch today.