ARC-AGI-3 resets frontier AI scoreboard


Good morning, AI enthusiasts. One of the AI industry's favorite talking points, that we are on the doorstep of AGI, just ran into a test where the best models in the world can't even score above 1%.

ARC-AGI-3 is a harder version of the benchmark that's become the go-to reality check for AGI claims — and with Gemini Pro leading the pack at just 0.37%, frontier models just got a brand new challenge (to likely still crush in about six months).

Reminder: Our next live workshop is today at 2 PM EST — join part 3 of our Intro to Vibe Coding, where you’ll learn how to take your apps to production for real users and workflows instead of just prototypes. RSVP here.

In today’s AI rundown:

ARC’s new AGI test stumps every frontier AI
Reddit's AI bot crackdown skips the ID check
Create branded reaction GIFs for Slack
Google shrinks AI memory with zero accuracy loss
4 new AI tools, community workflows, and more

LATEST DEVELOPMENTS

AI BENCHMARKS

🧐 ARC’s new AGI test stumps every frontier AI

Image source: ARC Prize Foundation

The Rundown: François Chollet's ARC Prize Foundation just released ARC-AGI-3, the newest version of its interactive reasoning benchmark, where humans can solve 100% of tasks on the first try but AI models struggle, with top systems not even scoring 1%.

The details:

Labs spent millions training models on earlier versions of the test, pushing ARC-AGI-2 scores from 3% to around 50% in under a year.
Agents face game-like scenarios with zero instructions, and must discover rules, form goals, and plan strategies entirely from scratch.
Google’s Gemini Pro scored the highest among frontier models at just 0.37%, followed by GPT 5.4 High (0.26%), Opus 4.6 (0.25%), and Grok-4.20 (0%).
A $1M prize backs the challenge, and cofounder Mike Knoop says frontier labs are paying far more attention to V3 than they did to earlier versions.

Why it matters: It’s always jarring to see the top models get reset below 1% on a new ARC-AGI release, but if the older tests are any indicator, even more surprising will be how quickly frontier labs climb the ladder. Whether that reflects genuine reasoning or just more expensive brute-forcing is exactly what Chollet built V3 to find out.

TOGETHER WITH SLACK FROM SALESFORCE

🧑‍💻 Your AI teammates live in Slack

The Rundown: Agentforce brings powerful AI agents directly into Slack, with no new logins or context switching. DM an agent, @mention it in a channel, or let it take action by pulling Salesforce insights, updating records, and creating canvases on the fly.

In this guide, you'll learn how to:

Get started with agents right where your team already works
Take action faster by pulling insights, updating records, and more
Get started in minutes with ready-made templates or build custom agents for any team

Read the full guide to get started with Agentforce in Slack.

REDDIT

🤖 Reddit's AI bot crackdown skips the ID check

Image source: Reddit

The Rundown: Reddit CEO Steve Huffman outlined a plan to separate humans from bots across the site, including labeling automated accounts, flagging suspicious users for verification, and letting sub-communities self-police without mass ID checks.

The details:

Accounts running automation in approved ways on the social platform will carry an [App] label, with suspicious behavior leading to human verification.
To confirm proof of humanity, Reddit will offer passkeys or Sam Altman’s World ID scanner, with government IDs as a last resort, only where laws require it.
AI-written content isn’t being banned, with Huffman calling it 'annoying' but saying communities can set their own rules on AI-generated posts.
Rival platform Digg recently folded after being overrun with bots, and Cloudflare data shows automated traffic on pace to surpass humans by 2027.

Why it matters: The Dead Internet Theory was already here before the AI agent acceleration we’ve seen over the past six months. Now, it’s a reality every social media site is dealing with. While this feels a bit like a band-aid, it is a small step toward the serious human-first solution every platform will need if it wants to remain usable for actual people.

AI TRAINING

🤯 Create branded reaction GIFs for Slack

The Rundown: In this guide, you will learn how to make custom, branded reaction GIFs for your company’s Slack using Higgsfield (an image and video generator). The trick is to generate the starting frame before you animate it.

Step-by-step:

Go to Higgsfield image gen, decide the GIF’s look, and enter the reaction’s visual style and text, like “ESPN themed reaction gif with words ‘SLOW DOWN’”
If your brand is not recognizable, attach your logo or another brand reference image while generating the still
Generate a few stills and pick the best one, then click the camera’s Animate button on that still so that it becomes the start frame in Higgsfield video
Then, set the clip length to 3 seconds, turn off its audio, and prompt: "Reaction GIF". Finally, download the MP4 and turn it into a GIF with any MP4-to-GIF site

Pro tip: If you make a whole batch of MP4s, ask Claude Code to convert them to GIFs in bulk on your desktop so you do not have to use a converter site one file at a time.
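
If you would rather run the bulk conversion yourself, here is a minimal sketch of what that could look like in Python. It assumes ffmpeg is installed and on your PATH; the function names, the 12 fps frame rate, and the 320px width are illustrative choices, not anything Higgsfield or Slack requires.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: Path, dst: Path, fps: int = 12, width: int = 320) -> list[str]:
    """Build an ffmpeg command that turns one MP4 into a palette-optimized GIF."""
    # Lower the frame rate, scale down, then split the stream: one copy
    # generates a color palette and the other copy is rendered with it.
    filters = (
        f"fps={fps},scale={width}:-1:flags=lanczos,"
        "split[a][b];[a]palettegen[p];[b][p]paletteuse"
    )
    # -y overwrites existing output; -loop 0 makes the GIF loop forever.
    return ["ffmpeg", "-y", "-i", str(src), "-vf", filters, "-loop", "0", str(dst)]


def convert_folder(folder: Path) -> list[Path]:
    """Convert every .mp4 in a folder to a .gif saved next to it."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    outputs = []
    for src in sorted(folder.glob("*.mp4")):
        dst = src.with_suffix(".gif")
        subprocess.run(build_ffmpeg_command(src, dst), check=True)
        outputs.append(dst)
    return outputs
```

Calling `convert_folder(Path("reaction_clips"))` (a hypothetical folder name) would write one GIF beside each MP4. Smaller widths and lower frame rates keep file sizes down, which helps if your workspace limits upload sizes.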

PRESENTED BY TELY AI

💬 Market leaders get leads from ChatGPT and Google

The Rundown: Your buyers are asking AI questions — and AI is answering with your competitors, not you. Tely makes AI like ChatGPT, Google, and Claude recommend your business instead.

With Tely AI, you can:

Get recommended in ChatGPT, Google, Perplexity, and Claude in as little as 1 week
Fully hands-off: no writers, no agencies, no managing content
Costs less than hiring freelancers or maintaining a marketing team
Ideal for niche industries where expertise matters

Get leads from Google and ChatGPT on autopilot.

GOOGLE

💾 Google shrinks AI memory with zero accuracy loss

Image source: Google

The Rundown: Google Research introduced TurboQuant, an algorithm that compresses AI model memory over 6x without any retraining — while delivering up to 8x speed gains on Nvidia H100 chips and losing almost zero accuracy.

The details:

AI models keep a running log of each conversation, and as chats get longer, that storage balloons, which slows responses and drives up costs.
TurboQuant shrinks that storage by over 6x with zero accuracy loss, scoring perfectly on tests that bury a key detail in a large amount of text.
On Nvidia's top server chips, it also sped up response processing up to 8x compared to standard methods, without adding any extra cost to run.
The paper, set to be presented at ICLR 2026 in April, also topped rival methods in vector search — the tech search engines use to match similar results quickly.
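
To get a feel for why that running log matters, here is a rough back-of-the-envelope sketch of how it grows with chat length. The model dimensions below (32 layers, 8 key/value heads, 128-dimensional heads, 16-bit values) are hypothetical and not taken from Google's paper, and the flat 6x divisor simply stands in for the reported compression ratio rather than modeling how TurboQuant works.

```python
def kv_cache_bytes(tokens: int, layers: int, kv_heads: int,
                   head_dim: int, bits_per_value: int) -> float:
    """Estimate the memory needed to store keys and values for one conversation."""
    # Each token stores one key and one value vector per layer and per head.
    values = 2 * layers * kv_heads * head_dim * tokens
    return values * bits_per_value / 8


GIB = 1024 ** 3

# Hypothetical model and a 128K-token conversation (assumed numbers).
baseline = kv_cache_bytes(tokens=131_072, layers=32, kv_heads=8,
                          head_dim=128, bits_per_value=16)
compressed = baseline / 6  # illustrative 6x reduction

print(f"baseline:   {baseline / GIB:.1f} GiB")    # baseline:   16.0 GiB
print(f"compressed: {compressed / GIB:.1f} GiB")  # compressed: 2.7 GiB
```

Under these assumptions, a single long conversation drops from roughly 16 GiB to under 3 GiB of memory, which is the kind of gap that lets one chip serve more users or longer chats.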

Why it matters: Although the paper was first published in April 2025, top AI memory companies felt the heat of the official release, with stocks dropping 3-5%. One compression paper won't crater memory demand overnight, but the selloff shows Wall Street is pricing in a world where smarter software cuts into the premium AI memory commands.

QUICK HITS

🛠️ Trending AI Tools

🎶 Lyria 3 Pro - Google’s upgraded AI music model with longer track outputs
🌐 MolmoWeb - Ai2's open-source web browsing agent
🎨 Uni-1 - Luma's unified model that reasons and generates across text and images
⚙️ Composer 2 - Cursor’s powerful, cost-effective coding model

📰 Everything else in AI today

Oracle Data Deep Dive NYC, April 10th: Hands-on AI labs and direct access to Oracle experts. Learn more and register for free.*

OpenAI is raising another $10B to push its record funding round past $120B, with Microsoft, a16z, and T. Rowe Price joining the round.

Google upgraded its music AI model to generate full 3-minute songs with intros, verses, and choruses, with Lyria 3 Pro rolling out in Gemini, Vertex AI, and Google Vids.

Bret Taylor’s Sierra introduced Ghostwriter, an AI agent that builds other AI agents — letting companies create customer service bots across voice, chat, and 30+ languages.

The U.S. Department of Labor launched "Make America AI-Ready," a free 7-day AI literacy course delivered entirely over text message to promote AI upskilling.

*Sponsored Listing

COMMUNITY

🤝 Community AI workflows

Every newsletter, we showcase how a reader is using AI to work smarter, save time, or make life easier.

Today’s workflow comes from reader May F. in London, UK:

“I’m on maternity leave, but wanted to build up my AI knowledge, so I’ve used Claude Code to build a custom dashboard of the data I’m tracking - feed time, naps, etc. I now get an email each morning with a summary of the previous day, with coaching tailored to my baby’s current age and development.”

How do you use AI? Tell us here.

🎓 Highlights: News, Guides & Events

Read our last AI newsletter: OpenAI’s Sora gets the axe
Read our last Tech newsletter: The deep sea luxury race is back
Read our last Robotics newsletter: OpenClaw craze comes to robots
Today’s AI tool guide: Create branded reaction GIFs for Slack
RSVP to next workshop today @ 2PM EST: Intro to Vibe Coding pt. 3

That's it for today!

Before you go, we’d love to know what you thought of today's newsletter to help us improve The Rundown experience for you.

See you soon,

Rowan, Joey, Zach, Shubham, and Jennifer — the humans behind The Rundown
