Dr. Fei-Fei Li

Jobs, robots & why world models are next, covering AI product work, product design, and consumer products.

November 16, 2025·11,555 words

AI & Machine Learning · Growth & Metrics · Leadership & Management · Product Strategy · Startup Building · Design & UX · Engineering · Career & Personal Growth · User Psychology · Data & Analytics

Episode

The Godmother of AI on jobs, robots & why world models are next | Dr. Fei-Fei Li

Summary

Dr. Fei-Fei Li, the AI pioneer behind ImageNet and co-founder of World Labs, argues that spatial intelligence and world models represent the next critical frontier in AI that language models alone cannot address. She explains why the "bitter lesson" is harder to apply to robotics, shares the story of founding World Labs and launching Marble (a "prompt-to-worlds" generative 3D model), and reflects on her career philosophy of intellectual fearlessness.

Key Takeaways

Spatial intelligence — reasoning in and interacting with 3D worlds — is a fundamental gap in current AI that language models cannot fill, and is the missing link for robotics, scientific discovery, and creative tools.

The "bitter lesson" is much harder to apply to robotics than language because training data is not aligned with 3D actions. Solving robotics requires new data strategies, not just more scraping.

Robots are physical systems closer to self-driving cars than LLMs — calibrate timelines for embodied AI accordingly, even with deep learning acceleration.

For career decisions in AI, focus on passion alignment, belief in the mission, and trust in the team — not on optimizing every variable.

Put products in users' hands early and scan broadly for unexpected use cases — the most valuable applications are rarely the ones you anticipated.

Notable Quotes

“In the middle of 2015, middle of 2016, some tech companies avoid using the word AI because they were not sure if AI was a dirty word. 2017-ish was the beginning of companies calling themselves AI companies.”

AI & Machine Learning

00:00:07

“I chose to look at artificial intelligence through the lens of visual intelligence because humans are deeply visual animals. We need to train machines with as much information as possible on images of objects, but objects are very, very difficult to learn. A single object can have infinite possibilities that is shown on an image. In order to train computers with tens of thousands of object concepts, you really need to show it millions of examples.”

AI & Machine Learning

00:01:03

“So yes, HAI, Human-Centered AI Institute was co-founded by me and a group of faculty like Professor John Etchemendy, Professor James Landay, Professor Chris Manning back in 2018. I was actually finishing my last sabbatical at Google and it was a very, very important decision for me because I could have stayed in industry, but my time at Google taught me one thing is AI is going to be a civilizational technology. And it dawned on me how important this is to humanity to the point that I actually wrote a piece in New York Times, that year 2018, to talk about the need for a guiding framework to develop and to apply AI. And that framework has to be anchored in human benevolence, in human centeredness. And I felt that Stanford, one of the world's top universities in the heart of Silicon Valley that gave birth to important companies from NVIDIA to Google, should be a thought leader to create this human-centered AI framework and to actually embody that in our research, education, and policy and ecosystem work.”

AI & Machine Learning · Leadership & Management

01:10:36