Grok gets Vision

PLUS: Firefly trains on... Midjourney images?

Elon Musk is on a mission to prove that open-source AI can compete with the best — and Grok’s latest upgrade just took a major step forward.

With new multimodal capabilities that xAI says outperform top models on its own real-world benchmark, xAI is quietly showing itself to be a serious contender in the world of AI heavyweights. Let’s explore…

Image source: xAI

The Rundown: Elon Musk’s xAI just introduced Grok-1.5 Vision, a multimodal upgrade to the open-source model that allows for processing visual information.

The details:

  • Grok-1.5V can now process visual info like documents, charts, screenshots and photos, with a focus on real-world understanding.

  • xAI created a new ‘RealWorldQA’ benchmark to evaluate spatial understanding, with Grok-1.5V outperforming GPT-4V and Gemini.

  • xAI said Grok-1.5V will roll out to testers and existing users soon, with significant improvements across images, audio, and video expected in the coming months.

Why it matters: While Grok feels under-appreciated in the broader LLM discussion, the impressive vision upgrade shows the open-source model is here to compete. With Elon’s arsenal of data across X and Tesla and a chip on his shoulder, it might be time for the industry to start paying attention.


Image source: Firefly

The Rundown: A surprising new report alleges that Adobe’s Firefly AI image generator was trained in part on thousands of images created by competitors like Midjourney.

The details:

  • The report reveals that around 5% of the images used to train Firefly were AI-generated, including some created by rival Midjourney.

  • Adobe has promoted Firefly as a "commercially safe" option, claiming it was trained primarily on licensed images from its own Adobe Stock library.

  • Adobe defended the practice, stating that all images (including the AI-generated ones) underwent a moderation process.

  • There has reportedly been internal disagreement within Adobe, with employees questioning the ethics of using AI imagery for training.

Why it matters: While Adobe positioned Firefly as an ethical, legally sound alternative to competitors, using Midjourney’s images in training data strongly undermines that primary selling point. It could also erode trust among its artist and enterprise customers — who were likely drawn to what now seems like false promises.


The Rundown: In this tutorial, you’ll learn how to transform complex ideas into visual and informative mind maps for free using ChatGPT.


  1. Head over to ChatGPT. You can use either GPT-3.5 (free version) or GPT-4 (paid version).

  2. Write the following prompt: "Create a mind map of [Your Topic]. List topics as central ideas, main branches, and sub-branches."

  3. Once ChatGPT generates the initial mind map outline, ask for a markdown version: "Create this same mind map in markdown format."

  4. Paste the markdown into Markmap and watch your mind map come to life.

Bonus tip: You can customize your mind map's appearance and then download it as interactive HTML or static SVG.
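
For a sense of what the markdown in steps 3 and 4 looks like, here is a minimal Python sketch that builds a Markmap-ready outline by hand. The function name `outline_to_markdown` and the sample topic are made up for illustration; the sketch assumes only that Markmap turns markdown headings and bullet points into nodes, and ChatGPT's actual output may be structured differently.

```python
def outline_to_markdown(title, branches):
    """Render a mind map outline as markdown.

    title    -> top-level heading (the central idea)
    branches -> dict mapping each main branch to a list of sub-branches
    """
    lines = [f"# {title}", ""]
    for branch, sub_branches in branches.items():
        # Each main branch becomes a second-level heading.
        lines.append(f"## {branch}")
        # Each sub-branch becomes a bullet under that heading.
        for sub in sub_branches:
            lines.append(f"- {sub}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    # Hypothetical topic, used only to show the shape of the output.
    markdown = outline_to_markdown(
        "Photosynthesis",
        {
            "Inputs": ["Sunlight", "Water", "Carbon dioxide"],
            "Outputs": ["Glucose", "Oxygen"],
        },
    )
    print(markdown)
```

Pasting the printed text into Markmap should give one central node, two main branches, and five sub-branches, which is the same shape the prompt in step 2 asks ChatGPT for.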


Image source: Google DeepMind

The Rundown: Researchers at Google DeepMind and the University of Cambridge just taught miniature humanoid robots to play soccer against each other, showcasing complex skill learning and agile adaptability.

The details:

  • The researchers first taught basic skills like walking and getting up, then had robots practice playing against gradually improving versions of themselves.

  • Compared to the baseline robots, the AI-trained versions walked almost 3x as fast, turned around 5x quicker, and got up from falls 63% faster.

  • Researchers found the robots learned clever strategies on their own, like taking short, quick steps when defending, without being explicitly told to do so.

Why it matters: While these clumsy but adorable robots won’t be heading to the World Cup any time soon, the research demonstrates the power of AI in enabling complex skill learning and adaptability, with the robots even developing unique behaviors and tactics on their own.


OpenAI’s latest update to its GPT-4 Turbo model has reclaimed the top spot on the LMSYS leaderboard for LLM rankings, surpassing Claude 3 Opus in voting.

Google announced a new AI-powered app coming in June called Vids, which will act as a writing, production, and editing assistant for creating video content.

Avenged Sevenfold frontman M. Shadows said that fans won't care if music is created by AI or humans in the future, calling AI a "deeper tool" that can help stimulate creativity for musicians rather than a threat to artists.

Elon Musk (perhaps jokingly) posed the possibility of an AI model running for president in 2032 during an interview at the Breakthrough Prize ceremony.

A new survey from Autodesk found that the ‘ability to work with AI’ was identified as the most important skill of the future across a variety of employment sectors.

The 2024 Masters implemented several AI features for enhanced coverage of the golf tournament, including course insights, AI-enabled narration, 3D course renderings, and personalized highlight reels.



