Humanoid Robots for Software: Why Agents Don't Need Your API
The bet on humanoid robots isn't about efficiency. A humanoid is probably not the most efficient solution to any single task. The bet is that the world was already built for human bodies: scaffolding, crane controls, drill presses, operating rooms. A robot shaped like a person just inherits all of it. You don't redesign the construction site. You send in a robot capable enough to use the one that exists.
Computer use agents are the humanoid robots of software.
Every piece of software ever built assumes a human on the other end. Every button, form, and dashboard was made for someone with eyes and a cursor. That's true whether it's Salesforce or a Windows 95 inventory system running on a machine in the back of a warehouse that IT is too scared to touch. GPT-5.4 shipped this week with native computer use (it clicks, scrolls, types, and navigates software the way you do), which means an agent can now use all of that software.
Rewind to January 2025. OpenAI launched Operator, their first public computer use agent. The demos were cool but actually using it was painful. CAPTCHAs broke it, dynamic UIs confused it, it hallucinated clicks, and you had to babysit constantly. Anthropic had their own version in beta — impressive in a controlled demo, useless when you needed it to do real work.
Fourteen months later, different product. GPT-5.4 outperforms its predecessor by 17% on BrowseComp, which tests how well an agent can persistently hunt down hard-to-find information on the web. The Pro version hits 89.3%. People who gave up on Operator are now using it for hours without wanting to throw their laptop. Anthropic's browser tool has been quietly getting better. Perplexity launched "Computer" a couple weeks ago — a super-agent coordinating 19 models that can build websites, generate datasets, and do multi-step research.
The legacy banking system that would cost $50 million and three years to replace? An agent can operate it today. The internal tool nobody's touched since 2014 because nobody wants to be the one to break it? Same. The agent doesn't care about your tech stack or whether you ever got around to that Zapier integration. It logs in and gets to work.
WebMCP is about to make this far better. Google shipped the Web Model Context Protocol as an early preview in Chrome Canary a few weeks ago. It lets web developers embed structured agentic instructions directly into their frontend. Instead of an agent scraping raw HTML and burning thousands of tokens trying to figure out where the search bar is, the site just tells it. Early tests show an 89% improvement in token efficiency. Software is learning to speak agent.
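To make "the site just tells it" concrete, here is a minimal sketch of the general idea in Python. It is not the WebMCP API: `Tool`, `ToolRegistry`, `search_products`, and the tiny catalog are all hypothetical names invented for illustration. The assumption is only that a site publishes a small, structured description of what it can do, and the agent reads that description and calls the capability directly instead of hunting through markup for a search bar.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Tool:
    """Hypothetical descriptor a site might publish for one capability."""
    name: str
    description: str
    parameters: Dict[str, str]  # parameter name -> human-readable type
    handler: Callable[..., Any]


class ToolRegistry:
    """Hypothetical stand-in for whatever the page exposes to an agent."""

    def __init__(self) -> None:
        self._tools: Dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def describe(self) -> List[Dict[str, Any]]:
        # The compact manifest an agent would read instead of raw HTML.
        return [
            {"name": t.name, "description": t.description, "parameters": t.parameters}
            for t in self._tools.values()
        ]

    def call(self, name: str, **kwargs: Any) -> Any:
        tool = self._tools[name]
        missing = set(tool.parameters) - set(kwargs)
        if missing:
            raise ValueError(f"missing parameters: {sorted(missing)}")
        return tool.handler(**kwargs)


# Illustrative data only; a real site would query its own backend.
CATALOG = ["blue widget", "red widget", "green gadget"]


def search_products(query: str) -> List[str]:
    return [item for item in CATALOG if query.lower() in item]


registry = ToolRegistry()
registry.register(
    Tool(
        name="search_products",
        description="Search the product catalog by keyword.",
        parameters={"query": "string"},
        handler=search_products,
    )
)

manifest = registry.describe()  # what the agent reads
results = registry.call("search_products", query="widget")  # what the agent does
```

The manifest is a few dozen tokens, where a rendered page can be thousands. That is the shape of the saving, whatever the real protocol's syntax turns out to be.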
If capability roughly doubles every six months (and the last 14 months suggest that's reasonable), we're six months from every piece of software having an expert power user that knows it cold. Not like a regular user. Like someone who's read every KB article and spent three years building workflows in it. Then cloned itself a hundred times.
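A back-of-envelope check on what that assumption implies, sketched in Python. The six-month doubling period is this post's premise, not a measured figure, and `growth_factor` is a hypothetical helper name.

```python
def growth_factor(months: float, doubling_period_months: float = 6.0) -> float:
    """Multiplier on capability after `months`, assuming steady doubling."""
    return 2 ** (months / doubling_period_months)


since_operator = growth_factor(14)  # roughly 5x over the last 14 months
next_six_months = growth_factor(6)  # exactly 2x over the next six
```

Under that premise, the jump from Operator to today is about a fivefold improvement, and the next six months double it again.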
I recently joined HubSpot as a product lead on the automation platform. I spend my days thinking about how people build workflows and connect their tools to automate the stuff that shouldn't require a human. So I have a very specific question in the back of my head.
What happens when the most capable automation layer lives above the platform, not inside it?
An agent running in ChatGPT or Claude or Perplexity isn't constrained by any one product's data model. It can pull from your CRM, check LinkedIn, skim the prospect's latest press release, look up their tech stack, then go into HubSpot and write an outreach sequence that actually reflects all of it. To do that natively you'd need a dozen integrations and probably two years of roadmap.
The agent does it in one pass because it's not inside your walls.
The platforms don't disappear. HubSpot still stores the contacts and runs the sequences. But the agent skips your onboarding flow and ignores your upsell nudges; it goes straight to execution. User intent migrates to whoever owns the agent layer.
People keep asking if SaaS is getting vibe-coded away, replaced overnight by custom apps assembled in an afternoon. The real threat is quieter. An abstraction layer builds above your product, becomes the most powerful way to use your product, and you had nothing to do with it.
I don't have a clean answer for what to do about this. The platforms do still hold one edge: a general-purpose chat agent can't guarantee enterprise reliability or auditability.
But fourteen months ago, computer use was a party trick; it's a product now. The gap between "not yet" and "already happening" keeps shrinking.
The robot just walks into your factory and starts working.
