Good morning, AI enthusiasts. Yesterday, it was Super Bowl attack ads. Today, OpenAI and Anthropic are letting the models do the talking.

With back-to-back flagship drops that pushed agentic coding, self-improving AI, and enterprise automation forward in a single afternoon, things are moving faster than ever — and the "AI is hitting a wall" crowd might want to sit this news cycle out.

In today’s AI rundown:

  • OpenAI’s GPT-5.3-Codex helps build itself

  • Anthropic’s Opus 4.6 with ‘agent teams’, 1M context

  • Cut down reporting times with Claude in Excel

  • OpenAI’s Frontier to manage ‘AI coworkers’

  • 4 new AI tools, community workflows, and more

LATEST DEVELOPMENTS

OPENAI

Image source: OpenAI

The Rundown: OpenAI just rolled out GPT-5.3-Codex, a new flagship coding model that merges its best programming and reasoning capabilities into one faster package — while also serving as a key tool in its own training and deployment process.

The details:

  • OpenAI said early versions of 5.3-Codex were used to find bugs in its own training runs, manage its rollout, and analyze evaluation results.

  • Codex leads agentic coding benchmarks like SWE-Bench Pro and Terminal-Bench 2.0, beating Opus 4.6 by 12% on the latter just minutes after Opus 4.6 was released.

  • On OSWorld, a benchmark testing AI control of desktop computers, the model scored 64.7%, roughly 1.7x the 38.2% from the prior Codex version.

  • OpenAI flagged the model as its first to receive a "High" cybersecurity risk rating and committed $10M in API credits to fund defensive security research.

Why it matters: The self-improvement angle here is the headline, with Anthropic's Dario Amodei also recently saying Claude is helping design its own successor. Yesterday’s bickering over ads now looks childish compared to the true fight on the model frontier, with a big day of dueling releases out of both labs.

TOGETHER WITH BLAND

The Rundown: Bland AI automates phone calls for 250+ enterprise customers. No phone trees. No hold music. Just faster, smarter customer conversations.

Here are some of the outcomes they've driven for businesses:

  • Idaho Finance saved $750k/yr by replacing their IVR with AI Voice Agents

  • MyPlanAdvocate added $40M/yr by automating their inbound lead qualification

  • And Needle saves $1M/yr by automating outbound calls

Book a demo today to see how they can work for your business.

ANTHROPIC

Image source: Anthropic

The Rundown: Anthropic released Claude Opus 4.6, the company’s new most powerful model — featuring multi-agent collaboration in Claude Code, a massive context window, and new Office integrations that put the AI directly inside PowerPoint.

The details:

  • A new "agent teams" feature in Claude Code lets multiple AI agents split a single project and work simultaneously instead of handling steps one at a time.

  • Opus 4.6 brings a 1M token context window to Anthropic's Opus tier for the first time, matching what Sonnet offers for heavy document and code work.

  • New Excel and PowerPoint sidebars let Claude read users’ existing templates and build models or decks natively without copying and pasting between tools.

  • 4.6 topped most agentic benchmarks, including a leap on ARC-AGI-2 to nearly 70% — though OAI’s Codex 5.3 reclaimed agentic coding highs minutes later.

Why it matters: It’s a big day for devs, with both Codex 5.3 and Opus 4.6 releases bringing major capability increases across the board. With time between upgrades getting shorter and the length of tasks models can take on continuing to move up the curve, the “AI is hitting a wall” crowd seems pretty quiet these days.

AI TRAINING

The Rundown: In this guide, you will do a quick exercise that teaches you how to use Claude as a spreadsheet architect, taking 5+ messy CSVs and watching Claude handle data cleaning, table formatting, color-coding, and more.

Step-by-step:

  1. Install Claude’s Excel app from the Microsoft Marketplace. For this example, we used a year’s worth of SEO data, but you can use sales data, receipts, etc.

  2. In Excel, click the Claude button and prompt “I have [data type] data from [sources] for my website/brand/team. Make a plan to rename each tab and clean the data up to make it more readable.” Then edit and approve the plan.

  3. Once done, ask Claude to make a plan for the master dashboard tab: “Based on all tabs, what’s the best way to tie this data into a Master Dashboard?”

  4. Finally, you can ask Claude to visualize data with prompts like “Create a combo chart for Clicks vs. Average Position.”

Pro tip: Asking Claude to review the data and create a plan improves its output significantly compared to asking it to get started immediately.
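If you reuse the step 2 prompt across different datasets, a tiny script can fill in the bracketed slots for you. This is only a sketch: the helper name `build_plan_prompt` and the example source names are our own placeholders, not part of Claude or Excel, and the output is plain text you would paste into the Claude sidebar yourself.

```python
# Hypothetical helper: fills the bracketed slots of the step 2 prompt.
# Nothing here calls Claude or Excel; it only builds the text to paste.
PLAN_PROMPT = (
    "I have {data_type} data from {sources} for my {owner}. "
    "Make a plan to rename each tab and clean the data up "
    "to make it more readable"
)


def build_plan_prompt(data_type: str, sources: list[str], owner: str = "website") -> str:
    """Return the plan-first prompt with the placeholders filled in."""
    return PLAN_PROMPT.format(
        data_type=data_type,
        sources=", ".join(sources),
        owner=owner,
    )


if __name__ == "__main__":
    # Example source names are illustrative only.
    print(build_plan_prompt("SEO", ["Google Search Console", "Ahrefs"]))
```

Keeping the "make a plan" wording in the template means every reuse follows the plan-first approach from the pro tip.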

PRESENTED BY TRIPLE WHALE

The Rundown: Triple Whale merchants saw LLM-referred orders jump from 7,152 in 2024 to 424,000+ in Q4 2025 alone. AEO (AI Engine Optimization) is the next frontier—and early movers are building an unfair advantage. Try Triple Whale's free tool to see how LLMs see your brand across ChatGPT and other leading platforms.

With the AI Visibility tool, you can:

  • Monitor your brand's AI visibility score for free

  • Track mentions across ChatGPT and leading LLMs

  • Connect AI referrals to actual revenue with attribution

OPENAI

Image source: OpenAI

The Rundown: OpenAI just launched Frontier, a new platform for enterprises to deploy and manage AI agents like new hires — complete with onboarding, permissions, and performance reviews across a company’s existing tech stack.

The details:

  • Frontier connects to existing enterprise systems like CRMs and ticketing tools, letting agents pull context from across the business without migrations.

  • Built-in eval and feedback loops let agents learn via experience, with OAI comparing it to onboarding a new employee with reviews and boundaries.

  • Every agent operates under its own profile with scoped access and hard limits on what it can touch, giving enterprises and regulated industries tighter control.

  • HP, Oracle, State Farm, and Uber are among the first adopters, with OAI embedding engineers on-site to help teams get agents into production.

Why it matters: Anthropic and OAI have been battling over models and coding tools, but Frontier shows the fight is also bleeding into who controls the enterprise agent layer underneath. Model capabilities are making AI coworkers a reality in the near future, and the system that ultimately orchestrates them will be valuable real estate.

QUICK HITS

  • ⚙️ GPT-5.3-Codex - OpenAI's new SOTA agentic coding model

  • 🧠 Claude Opus 4.6 - Anthropic’s upgrade to its most powerful model line

  • 🤖 OpenAI Frontier - Enterprise platform to create, deploy, manage AI agents

  • 🔎 Model Council - Perplexity’s new tool for querying multiple models

Perplexity launched Model Council, a new feature that runs queries through multiple AI models at the same time and synthesizes outputs into a single answer.
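For the curious, the general "ask several models, then merge" pattern can be sketched in a few lines. This is not Perplexity's implementation: the three model functions below are hypothetical stubs returning canned answers, and the merge step is a simple majority vote standing in for whatever synthesis Model Council actually performs.

```python
import asyncio
from collections import Counter


# Hypothetical stand-ins for real model calls; each returns a canned answer.
async def model_a(query: str) -> str:
    return "Paris"


async def model_b(query: str) -> str:
    return "Paris"


async def model_c(query: str) -> str:
    return "Lyon"


def synthesize(answers: list[str]) -> str:
    """Toy synthesis step: pick the most common answer (majority vote)."""
    return Counter(answers).most_common(1)[0][0]


async def council(query: str) -> str:
    # Fan out: send the same query to every model concurrently.
    answers = await asyncio.gather(model_a(query), model_b(query), model_c(query))
    # Fan in: merge the individual outputs into a single answer.
    return synthesize(list(answers))


if __name__ == "__main__":
    print(asyncio.run(council("What is the capital of France?")))
```

A production version would likely hand the collected answers to another model to write the final response rather than counting votes.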

Roblox introduced 4D generation via its Cube AI foundation model, letting creators generate fully functional, interactive objects from text prompts.

Lotus Health raised $35M in Series A funding for its free AI-powered primary care platform, providing diagnosis, prescriptions, and referrals across 50 states.

Meta is rolling out a standalone app for its Vibes AI video platform, which was previously only available via the Meta app.

AI evaluation firm METR released new analysis for GPT-5.2 (high), finding it can now handle tasks that would take a human engineer over 6 hours to complete.

COMMUNITY

Every newsletter, we showcase how a reader is using AI to work smarter, save time, or make life easier.

Today’s workflow comes from reader T. in Canada:

"I use AI to source vendors for fresh produce and compare/predict market price changes from globally supplied goods. It helps to maintain food security in our northern region."

How do you use AI? Tell us here.

That's it for today!

Before you go, we’d love to know what you thought of today's newsletter to help us improve The Rundown experience for you.

See you soon,

Rowan, Joey, Zach, Shubham, and Jennifer — the humans behind The Rundown
