AI robots now learn by watching humans

PLUS: Apple research releases 'MLX'

Welcome, AI enthusiasts.

AI robotics firm Figure just showed off a significant development: its humanoid robots can now learn to do tasks autonomously by watching humans do them first.

2024 is undoubtedly going to be a crazy year for AI and robotics. Let’s get into it…

In today’s AI rundown:

  • Figure’s humanoid robot learns by watching humans

  • Apple ML Research releases MLX for on-device AI

  • How to generate green screen product mockups with AI

  • V* enables guided visual search in AI assistants

  • 7 new AI tools & 4 new AI jobs

  • More AI & tech news

Read time: 4 minutes



Image source: Figure

The Rundown: AI robotics company Figure just showcased its Figure 01 humanoid robot making a cup of coffee, which it learned after just 10 hours of watching humans complete the task via video training.

The details:

  • Figure’s demo shows the robot using a Keurig to brew the coffee, showing off impressive dexterity in handling and placing the small pod.

  • The video also shows a side-by-side of the robot struggling to directly place the pod before eventually succeeding, highlighting its ability to self-correct.

  • Figure uses an ‘end-to-end’ training approach, with the system observing humans complete the task from start to finish and learning from the data.

Why it matters: The impressive demo showcases a potentially versatile future for robotics, and Figure's end-to-end training approach lends itself to scalability and adaptability, which could open up a wide range of real-world jobs and steadily improving performance.


The Rundown: The AI Quality Framework helps manage the risks of regulation, security, privacy and ethics for a successful AI integration.

TÜV SÜD offers:

  • Training: Customised training for executives and AI specialists.

  • Risk Profiling: Systematic identification and assessment of potential risks to ensure responsible deployment and management.

  • Regulatory Compliance: Assistance with the EU AI Act and other national regulations.

Leverage our AI expertise for compliance, ethics, and scalable solutions.


Image source: Apple

The Rundown: Apple’s Machine Learning Research team quietly introduced MLX, an open-source framework designed to streamline AI model development and deployment on Apple hardware.

The details:

  • MLX brings advanced AI capabilities to the latest Apple silicon, simplifying and enhancing the tech across Apple products.

  • The framework offers familiar coding tools like Python so developers can build models easily, with behind-the-scenes optimization for faster, more efficient model training.

  • MLX allows users to build a ‘mini’ data center for LLMs on Apple’s M2 and M3 products, smoothly running advanced, intensive models on local devices.

  • Apple is also rumored to be preparing a new version of Siri with AI, which can retain conversation information across devices.

Why it matters: With its user-friendly design optimized for Apple silicon, MLX represents a major step toward on-device AI across Apple's ecosystem. And while many have already counted Apple out of the AI race, the tech leader could seriously shake things up in 2024.


Hugo Ventura on Twitter/X

The Rundown: This Midjourney prompt can now create stunningly realistic computer, phone, or product mockups, complete with a green screen to customize for your business or product easily.

Thank you to Hugo Ventura (@hugovntr on X) for sharing this workflow.


  1. Open Midjourney and type /imagine.

  2. Prompt to generate an image of someone holding/using a device in front of a solid green background. For example: “Over the shoulder shot of a person in front of an entirely <color> computer screen”

  3. Customize and tweak the prompt as necessary to get the image just right — then export and open in an editing app (Photoshop, Canva, etc.) to replace the colored screen background with your own imagery.

  4. Note — be mindful of reflections, such as green screens bleeding into other aspects of the image.
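For readers who would rather script step 3 than open an editing app, here is a minimal sketch of the screen swap in plain Python. It is an illustration under stated assumptions, not part of Hugo's workflow: the images are plain nested lists of RGB tuples rather than real files, and the names `is_green_screen`, `replace_green_screen`, and the `dominance` threshold are hypothetical choices made for this example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Image = List[List[Pixel]]      # rows of pixels


def is_green_screen(pixel: Pixel, dominance: int = 60) -> bool:
    """Treat a pixel as 'screen' when green clearly dominates red and blue.

    The dominance threshold is an assumed tuning knob, not a standard value.
    """
    r, g, b = pixel
    return g - max(r, b) >= dominance


def replace_green_screen(mockup: Image, artwork: Image, dominance: int = 60) -> Image:
    """Return a copy of the mockup with screen pixels swapped for artwork pixels."""
    same_shape = len(mockup) == len(artwork) and all(
        len(m_row) == len(a_row) for m_row, a_row in zip(mockup, artwork)
    )
    if not same_shape:
        raise ValueError("mockup and artwork must have the same dimensions")
    return [
        [art if is_green_screen(px, dominance) else px
         for px, art in zip(m_row, a_row)]
        for m_row, a_row in zip(mockup, artwork)
    ]


# Tiny 2x2 example: two pure screen pixels, one skin tone, one green-tinted reflection.
mockup = [[(0, 255, 0), (200, 180, 160)],
          [(10, 240, 20), (90, 140, 90)]]
artwork = [[(1, 1, 1), (2, 2, 2)],
           [(3, 3, 3), (4, 4, 4)]]

result = replace_green_screen(mockup, artwork)
print(result)
# [[(1, 1, 1), (200, 180, 160)], [(3, 3, 3), (90, 140, 90)]]
```

The last pixel in the example is the kind of reflection step 4 warns about: it is tinted green but not green enough to count as screen, so it stays in the image. Lowering `dominance` would catch it, at the risk of also replacing genuinely green objects in the shot.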


Image source: Penghao Wu and Saining Xie

The Rundown: Researchers just introduced V*, an AI-driven visual search technique to efficiently locate objects in complex images by leveraging an AI assistant's world knowledge, enhancing multimodal understanding.

The details:

  • Integrating V* into a "Show, Search, and Tell" system with visual memory yielded big accuracy boosts over powerful multimodal models like GPT-4V.

  • In tests, V* matched human visual search efficiency by utilizing similar contextual and target cues.

  • The technique improves on the fixed vision encoders used in multimodal models, which commonly miss details in high-res images.

Why it matters: By mimicking human-guided search, V* could unlock more precise AI visual capabilities — and likely put an eventual end to the usefulness of traditional CAPTCHA technology.


  • DryRun Security- An automated buddy that adds security context as you go, so you can write code quickly without having to be a security expert (link)*

  • Bland Turbo- The world’s fastest conversational AI (link)

  • Boundaries- A GPT to learn to respectfully say no (link)

  • Invstr- Streamlined AI investing hub (link)

  • Brewed- AI-powered web app creation by typing (link)

  • Reiki by Web3Go- Comprehensive AI-driven agent development and monetization suite (link)

  • McAnswers AI- Streamline coding with AI (link)

  • Shield AI- Sustaining Engineer (link)

  • OpenAI- Account Director, ChatGPT Enterprise (link)

  • C3 AI- Director, Performance Engineering (link)

  • Anthropic- Policy Communications Lead, Corporate Communications (link)

*Sponsored listing


Microsoft exec Dee Templeton was appointed as a non-voting board observer to OpenAI, following an overhaul after Sam Altman's brief ouster as CEO.

Augmental showcased a demo of MouthPad, a smart mouthwear device that uses subtle tongue movements for hands-free phone navigation.

WhiteRabbitNeo opened access to its new 33B parameter AI model for offensive and defensive cybersecurity agents.

Bedrock announced Gal, a web-integrated, AI-augmented personal computer allowing users to own and run their own AI.

Nabla announced a $24M raise for its AI assistant that generates medical notes and reports from doctor-patient consultations, helping physicians save time on documentation.

Midjourney is facing criticism after a database listing 16,000 artists in its training set was leaked online, raising new ethical concerns over the data used for its image synthesis.

AI biotech startup Isomorphic secured deals with pharma giants Eli Lilly and Novartis worth an estimated $3B, using DeepMind’s tech to screen billions of compounds and design new molecules.



Get your product in front of over 400k AI enthusiasts

Our newsletter is read by thousands of tech professionals, investors, engineers, managers, and business owners around the world. Get in touch today.



If you have specific feedback or anything interesting you’d like to share, please let us know by replying to this email.
