February 5, 2026 · Podcast · 44min
Dave Baszucki on Building the Holodeck: AI, Physics Simulation, and Roblox's 20-Year Bet
The Roblox vision has been remarkably stable for 20 years: build the holodeck, a high-fidelity physics simulation where thousands of people come together and do things. What’s changed is that AI now makes the timeline plausible.
A 20-Year Business Plan That Still Holds
Dave Baszucki pulls out a business plan PowerPoint from nearly 20 years ago and marvels at how well it has held up. The deck envisioned a new category of “human co-experience,” a simulation backed by physics where you could build a car and drive it, blow out birthday candles, chop trees and build a house. Snow Crash, Ready Player One, the holodeck. The spec hasn’t changed: photorealistic graphics, 10,000 concurrent players, acoustic simulation, real-time physics.
What has changed is AI’s role in getting there. Baszucki frames AI not as a product pivot but as an accelerant for a vision that predates the current wave entirely. The question isn’t “what should AI do for Roblox?” but “how does AI get us to the holodeck faster?”
He draws an interesting spectrum between two product extremes. On one end: a communication platform with high-fidelity multiplayer simulation where you interact with real people. On the other: “real-time dreaming,” a solitary experience where everyone around you is an NPC and the world adapts to your behavior, like an evolved form of doom-scrolling. He references Vanilla Sky, where Tom Cruise literally dreams for years in a simulated world without knowing it. Everything between these two extremes is possible, and Baszucki expects weird product categories nobody has imagined yet.
4D Simulation as a Superset of Video
Baszucki uses “4D” deliberately: not just 3D shape, but function over time. His argument is that as multiplayer simulation gets photorealistic, video becomes a legacy downsampling of the richer simulation. You could always fall back to “analog mode” and make it look like Zoom, but you could also say “let’s pop up and walk around my office, I want to show you something.”
The acoustic physics problem is particularly interesting. A company meeting of a thousand people on Zoom means a bunch of squares with tricky audio mixing. In a 3D simulation, sound attenuates naturally based on distance. Walk closer to someone and you hear them better. It’s a more natural human interface.
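The distance-based attenuation Baszucki describes is a standard game-audio technique. A minimal sketch, using the inverse-distance model that engines such as OpenAL use by default (the function and parameter names here are illustrative, not Roblox's actual audio pipeline):

```python
def attenuate(gain: float, distance: float,
              ref_dist: float = 1.0, rolloff: float = 1.0) -> float:
    """Inverse-distance attenuation: full volume at the reference
    distance, fading smoothly as the listener walks away."""
    d = max(distance, ref_dist)  # clamp so gain never exceeds the source gain
    return gain * ref_dist / (ref_dist + rolloff * (d - ref_dist))

# A speaker 1 m away is heard at full volume...
print(attenuate(1.0, 1.0))   # 1.0
# ...and at 10 m is a tenth as loud.
print(attenuate(1.0, 10.0))  # 0.1
```

In a simulated meeting, mixing each voice through a curve like this replaces Zoom's explicit mute/unmute logic with spatial behavior people already understand.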
He surfaces a delightful edge case: singing happy birthday together. No matter the platform, you’re hearing each other 30 milliseconds late. His solution: let each person sing with a time-forward extrapolation of everyone else, then remix it. The small details reveal how deeply they’ve thought about the physics of human interaction at scale.
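The remix step he hints at can be sketched as follows: each singer records locally against an extrapolated version of the others, and the server then shifts every track back by its measured one-way delay onto a shared timeline before mixing. This is a hypothetical toy (the `remix` function, its delay measurement, and the averaging mix are all assumptions, not a described Roblox implementation):

```python
def remix(tracks, delays_ms, sample_rate=48_000):
    """tracks: one list of samples per singer, recorded locally.
    delays_ms: measured one-way network latency per singer."""
    shifted = []
    for samples, delay in zip(tracks, delays_ms):
        offset = int(sample_rate * delay / 1000)
        shifted.append(samples[offset:])  # drop the late start to realign
    n = min(len(s) for s in shifted)     # mix over the common overlap
    return [sum(s[i] for s in shifted) / len(shifted) for i in range(n)]

# Two singers, the second one 250 ms late at a toy 4 Hz sample rate:
print(remix([[1, 2, 3], [0, 1, 2, 3]], [0, 250], sample_rate=4))
# [1.0, 2.0, 3.0]
```

The point of the trick is that nobody ever hears themselves late; latency is paid once, offline, in the remix.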
13 Billion Hours of Data, Stored as Vectors
Here’s where it gets technically ambitious. Roblox generates 13 billion hours of user interaction data per month. Their vision is to store this history not as raster data (like video) but as vector data: full 3D state that can be replayed from any camera angle and interacted with.
The applications range from practical to profound. Safety incident? Place five cameras retroactively and listen to the audio. A special family moment? Reshoot it from a cinematic angle and make a video. But the real play is training data.
“The data we have which is 13 billion hours a month can be reproduced from any camera angle and can interact with the 3D space. So it’s very powerful data.”
While the industry scrambles to find hybrid video/keyboard interaction data for training, Roblox sits on a fundamentally richer dataset: full 3D interaction data with spatial context, far more useful than flat video for training agents that need to navigate and act in environments.
NPCs Beyond LLMs: Three Levels of Virtual Doppelgangers
Baszucki outlines a three-level roadmap for NPCs that goes well beyond chatbot-style AI:
Level 1: NPCs that can play any Roblox game competently. Trained on the platform’s massive behavioral dataset rather than just language models, these NPCs navigate, interact with objects, and understand game mechanics.
Level 2: Personal virtual doppelgangers. With user opt-in, the system would learn your gestures, how you look at things, how you talk, creating a digital twin that mirrors your behavioral patterns.
Level 3: Agentic doppelgangers with a simple user interface. Send your virtual self to play with your kid for 15 minutes while you’re working. The analogy to agentic AI in the productivity space is direct, just applied to social and gaming contexts.
The technical approach mirrors the self-driving transition: moving from hand-coded heuristics and decision trees to end-to-end learned models. Rather than programming NPC behaviors, you train them on billions of hours of real human behavior.
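The heuristics-to-learned transition described above, in miniature (both functions are illustrative stubs of mine, not Roblox code):

```python
def npc_action_rules(state):
    # Old style: brittle hand-coded heuristics and decision trees.
    if state["enemy_dist"] < 5:
        return "attack"
    if state["health"] < 20:
        return "flee"
    return "explore"

def npc_action_learned(state, policy):
    # New style: a single model, trained end-to-end on logged human
    # play, maps raw state to action. `policy` stands in for that model.
    return policy(state)
```

The rules version must be rewritten for every game; the learned version scales with data, which is where 13 billion hours a month becomes the moat.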
The Architecture Is Hybrid, Not Monolithic
When asked about video-based world models (generating game experiences as pure video with no physics engine), Baszucki is respectful but clearly betting on a different architecture. He sees a multi-component pipeline:
- A hyper-efficient synchronization engine for 1,000+ concurrent players
- Server-side 3D state management
- Client-side and intermediate 2D/3D upsampling for photorealism
- Dedicated NPC inference capability
- Potentially world model components for specific use cases
The key insight: the hard unsolved problem isn’t photorealism or NPC intelligence. It’s synchronizing the state of 10,000 people in real time. Is that state best stored in video latent space, native 3D format, or some undiscovered hybrid? That’s the research frontier.
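One common approach to that synchronization problem (an assumption on my part, not a description of Roblox's engine) is delta compression: each tick, the server sends each client only the entities whose state changed:

```python
def delta(prev: dict, curr: dict) -> dict:
    """Send only entities whose state changed since the last tick,
    instead of the full world snapshot every tick."""
    return {k: v for k, v in curr.items() if prev.get(k) != v}

tick0 = {"p1": (0, 0), "p2": (5, 5)}
tick1 = {"p1": (1, 0), "p2": (5, 5)}
print(delta(tick0, tick1))  # {'p1': (1, 0)}
```

At 10,000 concurrent players the open question Baszucki raises is exactly what `curr` should be: explicit 3D state like this, a video latent, or a hybrid.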
AI in Roblox Studio: Agents That Grind While You Sleep
Roblox Studio is one of the world’s largest development environments, and AI coding is reshaping how creators use it. Users are already gluing Claude Code and similar tools into their Studio workflows.
But the more distinctive vision is environmental generation: text/image/video prompts that iterate toward a 3D skeleton, which then iterates toward a fully functional game. A first-time Studio user gets AI generation out of the box while power users plug into their existing AI toolchains.
The cloud-native architecture enables something powerful: spin up agents overnight that test, iterate, launch NPCs as players on various device emulators, and tune your game while you sleep. It’s the agentic development loop applied to game creation.
Baszucki also describes on-demand AI asset generation: a creator builds a primitive experience, adds a prompt like “make it look medieval and more realistic,” and the assets auto-upsample in 3D in the cloud for free. Assets exist on a spectrum from traditional files to AI-generated-on-demand, with dynamic LOD (level of detail) adjusting for device capability.
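Dynamic LOD selection of the kind mentioned above might look like this sketch; the thresholds, level names, and device tiers are invented for illustration, not Roblox's actual values:

```python
LODS = ["lod0_full", "lod1_half", "lod2_quarter", "lod3_billboard"]

def pick_lod(screen_fraction: float, device_tier: int) -> str:
    """screen_fraction: rough share of the viewport the asset covers.
    device_tier: 0 = low-end phone ... 2 = desktop GPU."""
    if screen_fraction > 0.25:
        level = 0
    elif screen_fraction > 0.05:
        level = 1
    elif screen_fraction > 0.01:
        level = 2
    else:
        level = 3
    # Weaker devices drop extra levels of detail for the same asset.
    return LODS[min(level + (2 - device_tier), len(LODS) - 1)]

print(pick_lod(0.3, 2))  # lod0_full
print(pick_lod(0.3, 0))  # lod2_quarter
```

The same asset spectrum (traditional file through AI-generated-on-demand) plugs in behind a selector like this: the client asks for a detail level, and the cloud either serves a stored mesh or synthesizes one.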
The Creator Economy: Growing Healthily
The top Roblox creators now make over $30 million per year with teams of 30 or more people. More importantly, Baszucki notes a healthy signal: average revenue for the top creators is growing faster than the number-one creator’s revenue, indicating a deepening long tail rather than winner-take-all dynamics. All the way out to creator number 1,000, there’s a substantial community making a living.
The shift toward live ops is significant. Because everything is cloud-connected, creators update their experiences weekly or even daily, much like good websites. The top 20 experiences are now all genuinely competitive and vying for the top slot, a more distributed landscape than three or four years ago.
On discovery, Roblox has moved toward complete transparency in their algorithm. Baszucki sees this as both a competitive advantage and a forcing function: if the algorithm is public, they have to make it genuinely good rather than gameable.
The 3x and 10x Planning Framework
Baszucki articulates a clean framework for long-term leadership: always know both your 3x and your 10x.
The 10x is the holodeck vision. Without it, “it’s hard to sleep at night.” The 3x is the operational stepping stone: 10% of global gaming content, roughly 300 million DAUs and $20 billion, about 3x current scale. This can be “forensically torn apart” into market-by-market plans and operationalized across product teams.
“If you don’t know 10x, then it’s hard to sleep at night. If you don’t know 3x, it’s hard to plan forensically.”
He pairs this with a value system of “take the long view” plus “get stuff done.” The weekly iteration cadence means even ambitious six-month product plans get broken into weekly ships. The AI team, safety team, and facial age estimation team all work on weekly cycles.
Hiring Without University Prestige
Five years ago Roblox acquired Imbellus, a company building simulation-based assessment tooling to scientifically evaluate problem-solving and creativity. They’ve since tuned it for their new college grad and intern hiring pipeline.
The process runs 50,000-60,000 candidates through assessments built on Roblox itself: 3D problems like programming a factory or using a geometric language to program a robot. The assessments are designed to be fair and unweighted by social factors.
The provocative finding: there’s little correlation between attendance at traditional elite universities and performance on these assessments. Community colleges and small Midwestern engineering schools produce candidates who perform just as well. They’ve essentially decided to “ignore the signal of where you went to university.”
“We have found community college, the small Midwestern engineering school, like because we’re assessing our own way, we basically ignore the signal of where you went to university.”
Some Thoughts
The most revealing moment is when Sarah Guo asks Baszucki what he believes about AI that not everybody believes yet. His answer: he’s actually the skeptic. More people believe more than he does. He thinks in terms of “the power of time and compounding” rather than breakthrough moments. He points to Microsoft Excel, barely changed in 40 years, as evidence that some things just stick around.
This is a useful lens for understanding Roblox’s AI strategy. They’re not chasing the hype cycle. They’re using AI to accelerate a bet they made two decades ago, one that was already ambitious enough. The holodeck spec has been stable for 20 years. AI just makes the timeline more plausible.
A few things worth sitting with:
- The “real-time dreaming” category is genuinely novel framing. The spectrum from multiplayer communication to solitary AI-generated worlds creates space for product categories nobody has named yet
- 13 billion hours of 3D vector interaction data per month may be the most underappreciated training dataset in AI. It’s richer than video, navigable from any angle, and captures spatial behavior at scale
- The hiring insight deserves more attention. If a company processing 60,000 candidates finds little correlation between university prestige and assessment performance, that’s a meaningful data point for the entire tech industry
- Baszucki’s “3x/10x” planning framework is perhaps the cleanest articulation of how to balance operational execution with long-term vision