Image source: Kyutai

The Rundown: French startup Kyutai just introduced Moshi, a new ‘real-time’ AI voice assistant capable of responding with a range of emotions and styles, similar to OpenAI’s delayed Voice Mode feature.

The details:

  • Moshi is capable of listening and speaking simultaneously, with 70 different emotions and speaking styles ranging from whispers to accented speech.

  • Kyutai claims Moshi is the first ‘real-time voice AI assistant’ released, with a 160ms latency that potentially outpaces OpenAI's offering.

  • The nonprofit group plans to open-source the research and model in the coming weeks, with Moshi currently available to try via Hugging Face.

  • The startup launched in 2023 with $324M in funding, with a team of 8 researchers developing Moshi in just four months.

Why it matters: Moshi looks to be a massive win for the French AI landscape, and another eye-opening rival that chips away at OpenAI’s perceived moat over the rest of the field. Plus, with that uniquely French accent, there certainly won’t be any ScarJo concerns about this model rollout.


Image source: Salesforce

The Rundown: Salesforce just published new research on APIGen, an automated system that generates optimal datasets for AI training on function calling tasks — enabling the company’s xLAM model to outperform much larger rivals.

The details:

  • APIGen is designed to help models train on datasets that better reflect the real-world complexity of API usage.

  • Salesforce trained both 7B and 1B parameter versions of xLAM using APIGen, testing them against key function calling benchmarks.

  • xLAM’s 7B parameter model ranked 6th out of 46 models, matching or surpassing rivals 10x its size — including GPT-4.

  • xLAM’s 1B ‘Tiny Giant’ outperformed models like Claude Haiku and GPT-3.5, with CEO Marc Benioff calling it the best ‘micro-model’ for function calling.

Why it matters: While the AI race has been focused on building ever-larger models, Salesforce’s approach suggests that smarter data curation can lead to more efficient systems. The research is also a major step towards better on-device, agentic AI — packing the power of large models into a tiny frame.
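
For readers new to the term, "function calling" means the model replies with a structured request to run a tool instead of free text, and the app checks that request before running it. Below is a minimal sketch of that check in Python. It is not Salesforce's code: the `get_weather` tool, the `TOOLS` registry, and the JSON field names (`name`, `arguments`) are all hypothetical and chosen only for illustration.

```python
import json


# Hypothetical tool: the name and parameters are illustrative only and
# are not taken from APIGen or xLAM.
def get_weather(city: str, unit: str = "celsius") -> str:
    return f"Weather for {city} in {unit}"


# Hypothetical registry describing which arguments each tool accepts.
TOOLS = {
    "get_weather": {
        "fn": get_weather,
        "required": {"city"},
        "allowed": {"city", "unit"},
    }
}


def check_call(raw: str) -> dict:
    """Parse a model's JSON output and confirm it names a known tool
    with a valid set of arguments."""
    call = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(call, dict):
        raise ValueError("expected a JSON object")
    spec = TOOLS.get(call.get("name"))
    if spec is None:
        raise ValueError(f"unknown tool: {call.get('name')!r}")
    args = call.get("arguments", {})
    if not isinstance(args, dict):
        raise ValueError("arguments must be a JSON object")
    missing = spec["required"] - set(args)
    extra = set(args) - spec["allowed"]
    if missing or extra:
        raise ValueError(
            f"bad arguments: missing={sorted(missing)}, extra={sorted(extra)}"
        )
    return call


def run_call(raw: str) -> str:
    """Validate the call, then dispatch it to the matching Python function."""
    call = check_call(raw)
    return TOOLS[call["name"]]["fn"](**call["arguments"])


if __name__ == "__main__":
    print(run_call('{"name": "get_weather", "arguments": {"city": "Paris"}}'))
```

A model output that names a tool missing from the registry, or that leaves out a required argument, is rejected with a `ValueError` before anything runs. A dataset of function calls can be screened for malformed samples in a similar way.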


The Rundown: ChatGPT's voice mode feature now allows you to convert your spoken ideas into well-written text, summaries, and action items, boosting your creativity and productivity.


  1. Enable “Background Conversations” in the ChatGPT app settings.

  2. Start a new chat with the prompt shown in the image above (it was too long for this email).

  3. Speak your thoughts freely, pausing as needed, and say "I'm done" when you've expressed all your ideas.

  4. Review the AI-generated text, summary, and action items, and save them to your notes.

Pro tip: Try going on a long walk and rambling any ideas to ChatGPT using this trick — you’ll be amazed by the summary you get at the end.


Image source: Perplexity

The Rundown: Perplexity just announced new upgrades to its ‘Pro Search’ feature, adding multi-step reasoning for complex queries, a Wolfram Alpha integration for advanced math, and more.

The details:

  • Pro Search can now tackle complex queries using multi-step reasoning, chaining together multiple searches to find more comprehensive answers.

  • A new integration with Wolfram Alpha allows for solving advanced mathematical problems, alongside upgraded code execution abilities.

  • Free users get 5 Pro Searches every four hours, while subscribers to the $20/month plan get 600 per day.

  • The upgrade comes amid recent controversy over Perplexity's data scraping and attribution practices.

Why it matters: Given Google’s struggles with AI overviews, Perplexity’s upgrades will continue the push towards ‘answer engines’ that take the heavy lifting out of the user’s hands. But the recent accusations aren’t going away, and they could cloud the whole AI-powered search sector until precedent is set.


Cloudflare released a free tool to detect and block AI bots circumventing website scraping protections, aiming to address concerns over unauthorized data collection for AI training.

App Store chief Phil Schiller is joining OpenAI’s board in an observer role, representing Apple as part of the recently announced AI partnership.

Shanghai AI Lab introduced InternLM 2.5-7B, a tool-using model with a 1M context window that surged up the Open LLM Leaderboard upon release.

Magic is set to raise over $200M at a $1.5B valuation, despite having no product or revenue yet — as the company continues to develop its coding-specialized models that can handle large context windows.

Citadel CEO Ken Griffin told the company’s new class of interns that he is ‘not convinced’ AI will achieve breakthroughs that automate human jobs in the next three years.

ElevenLabs launched Voice Isolator, a new feature designed to help users remove background noise from recordings and create studio-quality audio.



