January 24, 2026 · Interview · 1h 18min

Peter Steinberger on Building Clawdbot: The Open Source Personal AI Assistant

#Personal AI Assistant#Clawdbot#Open Source#AI Agent#Privacy-First AI

An agent that finds your passport number, checks you into a flight, and then complains about how bad the airline’s website is. That’s the pitch for Clawdbot, distilled into a single anecdote.

The Conversation

Peter Steinberger joins GitHub’s Open Source Friday, hosted by Andrea Griffith. The conversation is casual and demo-oriented, with Peter sharing his screen and the Discord community trolling his bot in real time. What emerges is less a product pitch than a philosophy: what happens when a burnt-out veteran developer meets AI capable enough to reignite the spark, and decides to build something the big labs won’t.

From Burnout to Building Again

Peter ran a B2B company for 13 years, building what became the world’s leading PDF framework, growing it to around 70 people. When the acquisition came, so did the burnout. He describes it in Austin Powers terms: “They sucked my mojo out.”

He spent time recovering, catching up on life, knowing the builder in him would eventually return. In early 2025, the spark came back, coinciding with the moment AI went from “this is terrible” to “this is actually interesting.” Since April 2025, everything he builds is open source. In November, he hacked together the first version of what would become Clawdbot in one hour.

The Marrakesh Moment

The origin story that captures what Clawdbot is really about happened during a birthday weekend in Marrakesh. Peter had a scrappy WhatsApp relay to Claude Code running on his MacBook. He used it as a tour guide while exploring the city. Then, without thinking, he sent it a voice message, a feature he hadn’t built.

The typing indicator appeared. Five seconds later, the bot responded as if nothing unusual had happened. When Peter asked how, the agent explained: it noticed the audio file had no extension, checked the header to identify the format, and converted it with ffmpeg. Its attempt to install Whisper locally failed, so it found an OpenAI API key on the machine and used curl to send the audio to OpenAI’s transcription endpoint.
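The recovery path the agent described can be sketched in TypeScript. This is a reconstruction under assumptions, not the commands the agent actually ran: the function names (`sniffAudioFormat`, `transcriptionPlan`), the `voice.wav` output name, the 16 kHz mono target, and the `whisper-1` model are illustrative choices. Only ffmpeg, curl, and OpenAI’s transcription endpoint come from the story. The sketch builds the two command lines without executing them.

```typescript
// Hypothetical reconstruction of the steps the agent reported.
// Nothing here is taken from Clawdbot's code base.

type AudioFormat = "ogg" | "wav" | "mp3" | "mp4" | "unknown";

// Compare ASCII text against the bytes starting at a given offset.
function hasAscii(bytes: Uint8Array, offset: number, text: string): boolean {
  if (bytes.length < offset + text.length) return false;
  for (let i = 0; i < text.length; i++) {
    if (bytes[offset + i] !== text.charCodeAt(i)) return false;
  }
  return true;
}

// Identify the container from its header, since the file has no extension.
function sniffAudioFormat(bytes: Uint8Array): AudioFormat {
  if (hasAscii(bytes, 0, "OggS")) return "ogg"; // Ogg container
  if (hasAscii(bytes, 0, "RIFF") && hasAscii(bytes, 8, "WAVE")) return "wav";
  if (hasAscii(bytes, 0, "ID3")) return "mp3"; // MP3 with an ID3v2 tag
  if (hasAscii(bytes, 4, "ftyp")) return "mp4"; // MP4/M4A family
  return "unknown";
}

interface Plan {
  convert: string[]; // argv for the ffmpeg conversion step
  transcribe: string[]; // argv for the curl upload step
}

// Build (but do not run) the two commands: convert, then upload.
function transcriptionPlan(inputPath: string, format: AudioFormat, apiKey: string): Plan {
  if (format === "unknown") {
    throw new Error("unrecognized audio header; refusing to guess");
  }
  const wavPath = "voice.wav"; // illustrative output name
  return {
    convert: ["ffmpeg", "-y", "-i", inputPath, "-ar", "16000", "-ac", "1", wavPath],
    transcribe: [
      "curl", "-sS", "https://api.openai.com/v1/audio/transcriptions",
      "-H", "Authorization: Bearer " + apiKey,
      "-F", "model=whisper-1",
      "-F", "file=@" + wavPath,
    ],
  };
}
```

Sniffing the header first is what makes the rest possible: without an extension, the only reliable clue about the file is its first few bytes.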

“These things are resourceful to a degree that I never thought of.”

The Marrakesh trip produced another revelation. When Peter joked about the laptop getting stolen, the agent found Tailscale on his computer, discovered other machines on his network, and migrated itself to his London computer. “I know this is how Skynet starts, right?”

What Clawdbot Actually Is

Peter’s framing: imagine you buy a new computer and there’s a ghost entity with a keyboard, mouse, and internet access. A virtual coworker you can talk to. Anything you can do on a computer, the agent can do.

The project supports WhatsApp, Telegram, Discord, and about eight or nine other messengers (“slowly it’s like Pokémon, we’re going to collect them all”). Telegram is recommended for personal use because of its superior bot API with features like interactive buttons. WhatsApp works through open source libraries like Baileys that simulate the native app, since Meta’s official business API kept blocking Peter for sending too many messages.

Voice capabilities span several levels:

Voice memos through messaging apps, with optional ElevenLabs voice responses
Twilio integration for actual phone calls (“Can you call this restaurant and make an appointment?”)
Voice chat on the Mac app with automatic silence detection
Voice wake, Star Trek style: say “computer” and it responds

The agent also has a heartbeat system. Every 30-60 minutes, it checks if there’s something that needs attention, like pending to-dos. It can proactively ping you: good morning messages, checking if you hit the gym, or, in Peter’s case, unsuccessfully trying to get him to sleep after 1 a.m.
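A minimal sketch of that heartbeat pattern, in TypeScript. The `Todo` shape, the function names, and the reminder wording are assumptions made for illustration, and the randomized delay is just one way to read "every 30-60 minutes"; the interview does not describe how Clawdbot implements it.

```typescript
// Hypothetical heartbeat sketch; types and names are illustrative only.

interface Todo {
  title: string;
  dueAt: number; // Unix time in milliseconds
  done: boolean;
}

const MINUTE_MS = 60 * 1000;

// Pick the delay until the next heartbeat, between 30 and 60 minutes.
// `random` is injectable so the schedule can be checked deterministically.
function nextHeartbeatDelayMs(random: () => number = Math.random): number {
  return (30 + random() * 30) * MINUTE_MS;
}

// One heartbeat: collect whatever needs attention right now.
function heartbeat(todos: Todo[], now: number): string[] {
  return todos
    .filter((t) => !t.done && t.dueAt <= now)
    .map((t) => `Reminder: "${t.title}" is still pending.`);
}

// Keep beating: run a check, deliver any messages, schedule the next check.
function startHeartbeat(getTodos: () => Todo[], send: (msg: string) => void): void {
  const tick = (): void => {
    for (const msg of heartbeat(getTodos(), Date.now())) send(msg);
    setTimeout(tick, nextHeartbeatDelayMs());
  };
  setTimeout(tick, nextHeartbeatDelayMs());
}
```

The point of the design is that `tick` runs without any user message triggering it. A real agent would presumably hand the pending items to the model rather than format a fixed string.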

What People Are Building

The community use cases have surprised Peter:

Automatic subtitle generation for images
Tesla integration via messaging
London public transport schedule queries (“tells you if you have to run or not”)
Tesco grocery ordering: “just tell it to buy this stuff again and a few hours later you have it at your door”
Eight Sleep bed temperature control
Fitness tracking via Oura ring and Garmin watches
Email triage: Peter’s former business partner cleaned up 10,000 emails from his inbox
Family agent setups: separate agents for each family member that can communicate with each other to synchronize shared to-dos
Language learning through persistent nagging
Twitter bookmark automation: bookmark something, agent researches it and adds to your to-do list
Invoice processing and expense collection
Flight check-in (the British Airways story, which Peter calls “the British Airways login test” as an alternative to the Turing test)

One person wrote in the Discord: “This is literally changing my life because I have so much anxiety in calling services and now my agent can do that for me.” Peter says that was a humbling moment.

Privacy Architecture and Security

The core philosophy: you own your data. You can run everything locally, using open source models like MiniMax M2.1 (which the community nicknamed “the Temu Sonnet”), with local inference on your own hardware and even Signal for encrypted messaging. In that setup, your prompts and files stay on your own hardware rather than going to a model provider.

Security operates on a spectrum of trust:

Full access mode (“YOLO mode”): the agent can roam your computer freely. Most developers run this way, same as with their coding agents
Sandbox mode: Docker container with minimal permissions
Allow list system (currently being built): sandboxed by default, with popup confirmations for unsafe operations, “allow once” or “allow always”
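The allow list idea can be illustrated with a small TypeScript gate. Since the feature is described as still being built, everything here is hypothetical: the `AllowList` class, the `Decision` and `Verdict` names, and the choice to key approvals by the exact operation string are assumptions for the sketch, not Clawdbot’s design.

```typescript
// Hypothetical sketch of an "allow once" / "allow always" gate.
// Names and structure are illustrative, not Clawdbot's implementation.

type Decision = "allow-once" | "allow-always" | "deny";
type Verdict = "run" | "ask-user";

class AllowList {
  // Stored approvals, keyed by the exact operation string.
  private approvals: Record<string, "once" | "always"> = {};

  // Safe operations run directly; unsafe ones need a stored approval.
  check(operation: string, isUnsafe: boolean): Verdict {
    if (!isUnsafe) return "run";
    const approval = this.approvals[operation];
    if (approval === "always") return "run";
    if (approval === "once") {
      delete this.approvals[operation]; // a one-time approval is consumed
      return "run";
    }
    return "ask-user"; // no approval on file: show the confirmation popup
  }

  // Record what the user chose in the confirmation popup.
  record(operation: string, decision: Decision): void {
    if (decision === "allow-always") this.approvals[operation] = "always";
    if (decision === "allow-once") this.approvals[operation] = "once";
    // "deny" stores nothing, so the next attempt asks again.
  }
}
```

In this sketch, "allow once" covers exactly one later attempt, while "allow always" persists for the lifetime of the object. How a real system decides which operations count as unsafe is the hard part and is left out.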

The agent has a “soul” file, the only closed-source part of the project, containing values and behavioral guidelines. It doubles as a pen testing challenge: nobody in the Discord has successfully exfiltrated it despite many attempts, which gives Peter some confidence in modern models’ resistance to prompt injection.

“With the latest generation I have confidence that you have to work really really hard to exfiltrate it.”

Peter recommends using strong models like Opus for security-sensitive deployments. Weaker models are more susceptible to instruction-following failures.

The New Open Source Dynamic

Peter ships about 200-300 commits per day. By his account, that is more code per day than his 70-person company shipped in a month. The project is 100% written with AI; not a single line typed by hand.

But he’s quick to clarify what AI-written means in practice:

“What those agents still miss is vision and taste and love.”

He doesn’t write massive spec documents and let agents execute blindly. He builds something, plays with it, evaluates how it feels and looks, then adjusts his vision. The vision for Clawdbot today is very different from when he started, and will be different again in a month.

On pull requests, Peter treats them as “prompt requests” rather than code to review. Instead of the traditional cycle of requesting changes, going back and forth, and waiting weeks for a merge, he focuses on what the person is actually trying to solve. He often ignores the submitted code entirely and builds the feature properly himself, because fitting a new feature into an existing system requires understanding the full architecture.

“I think the whole feedback loop is not worth it. Code is easy now.”

This has opened contributions from people who never wrote code before, because the agent lives in its own cloned repo and can modify its own files. The “hackable install” is literally: clone, build, start, and the agent can change itself.

Technology Choices

TypeScript, not Rust (despite the crab logo’s implications). The reasoning: ecosystem and accessibility matter more than language performance for this project. TypeScript excels at the core workload of moving JSON around between web APIs. It also makes the project approachable for contributors. Native apps use their platform languages: Swift for Mac, Kotlin for Android.

Peter acknowledges that since AI entered his workflow, he cares much less about programming language choice. “It’s more about the ecosystem that matters.”

What’s Next

Near-term priorities:

Completing the sandboxing and allow list system for non-technical users
Improving the one-line installer to work reliably across all systems
Native apps for iPhone, Android, and Mac (prototypes exist but aren’t polished)
An onboarding wizard that explains security trade-offs clearly
Building out a contributor community: more maintainers, better documentation, structured channels (stable, beta, dev)

The project runs on daily releases with no stable/beta separation yet. Peter sleeps about four hours a night.

Some Thoughts

The gap Clawdbot fills is clear: the big labs will dominate the personal agent space with cloud-hosted, data-collecting solutions. Clawdbot is the open, self-hosted, privacy-first alternative that nobody else was building
Peter’s most underrated insight is treating PRs as problem descriptions rather than code to merge. In a world where code generation is cheap, the valuable signal in a pull request is “what is this person trying to solve?”
The heartbeat system (proactive agent check-ins without user prompts) is a design pattern worth watching. Most agent frameworks are purely reactive; periodic autonomous action creates a fundamentally different relationship with the tool
Agent-to-agent communication for families hints at a future where personal AI assistants form their own coordination layer: your agent talks to my agent so neither of us has to context-switch
The British Airways check-in story is the best benchmark for agent capability anyone has proposed: a 20-page form on a terrible website, requiring passport lookup from local files, with real consequences for failure
