‘OpenClaw’ and Silicon Valley’s Obsession with Agents

Federico Viticci, writing at MacStories:

For the past week or so, I’ve been working with a digital assistant that knows my name, my preferences for my morning routine, how I like to use Notion and Todoist, but which also knows how to control Spotify and my Sonos speaker, my Philips Hue lights, as well as my Gmail. It runs on Anthropic’s Claude Opus 4.5 model, but I chat with it using Telegram. I called the assistant Navi (inspired by the fairy companion of Ocarina of Time, not the besieged alien race in James Cameron’s sci-fi film saga), and Navi can even receive audio messages from me and respond with other audio messages generated with the latest ElevenLabs text-to-speech model. Oh, and did I mention that Navi can improve itself with new features and that it’s running on my own M4 Mac mini server?

If this intro just gave you whiplash, imagine my reaction when I first started playing around with OpenClaw, the incredible open-source project by Peter Steinberger (a name that should be familiar to longtime MacStories readers) that’s become very popular in certain AI communities over the past few weeks. I kept seeing OpenClaw being mentioned by people I follow; eventually, I gave in to peer pressure, followed the instructions provided by the funny crustacean mascot on the app’s website, installed OpenClaw on my new M4 Mac mini (which is not my main production machine), and connected it to Telegram.

I’m a few weeks late to linking to Viticci’s story, but in the odd chance you haven’t already read it, it’s the best piece on OpenClaw (née Clawdbot) on the web. It’s compulsory reading.

OpenClaw is an open-source version of what Silicon Valley is calling “agents”: artificial intelligence chatbot-powered robots that are given a wide array of “tools” to perform tasks, whether that be web search, access to a file system, or application programming interface access to other software. Agents are, to the Silicon Valley believer, the next frontier for AI chatbots like ChatGPT and Claude. They go beyond the chat interface and are able to do actual work on the internet — or at least, they’re purported to. This isn’t a bad idea, but these agents are only good if they work.

In practice, large language models are great at specific kinds of tasks. They’re not all-rounders — for instance, an LLM would be quite poor at recommending content at the scale of YouTube or Netflix. There are other kinds of machine learning structures better suited for those needs than a transformer model. But LLMs are great at helping humans debug code, for instance, because they’re trained on a corpus that includes hundreds of millions of bugs and their appropriate solutions. I think LLMs can be incorporated into agents for user-facing communication, but when it comes time to actually do some of that work, they fall short.

As an example, OpenClaw has numerous security vulnerabilities not due to any programming error — it seems to be a competently written piece of open-source software — but because of the inherent nature of LLMs. Giving an LLM access to one’s emails and text messages opens the door to prompt injection attacks, meaning that someone could coerce Claude to divulge sensitive information that it ought not to have been given access to at all. Claude has protections in place to defend against these attacks, but they’re not always reliable. Giving OpenClaw access to any kind of personal information is highly frowned upon.

Security is just one obvious example, but it proves that a general-purpose agent like OpenClaw is mostly a fantasy, at least for now. You can chat with Claude in the Claude app on your smartphone, and it can search the web, your email, and your calendar, or even write code to a Git repository in the cloud. Claude also has a “Cowork” feature in the desktop app where users can give the chatbot access to certain parts of the file system, say to organize files or retrieve information. The point is that the vanilla version of Claude can do everything OpenClaw can, but without the security vulnerabilities, as long as a human is at the helm. And that is the promise of agents.

Agents should be flexible tools managed by humans to do specific tasks. OpenClaw is an interesting glimpse at a world where human-computer interaction feels more like human-to-human interaction, but it has become increasingly obvious that LLMs cannot solely get us to that reality. They’re still too non-deterministic and language-focused to be put in charge of themselves. I’m glad OpenClaw crystalized this vulnerability: LLMs don’t need to run nonstop, and they shouldn’t work without human supervision. I don’t think they’ll ever get to a point where running them that way is the most efficient way to automate something. The biggest lesson OpenClaw teaches us is that LLMs are only the beginning of a broader movement toward automation, not the end.