AI Agents · April 19, 2026 · 9 min read

We Tested 4 AI Agents for 30 Days. Here Is What Actually Works

Claude Code, Codex, OpenCode, OpenClaw: we used all 4 daily for coding, SEO and deployment. Real results, honest tradeoffs, and our pick for each use case.

Carlos Salgado CEO & Co-founder · Delbion

A year ago, if you asked me what an AI agent did, I would have shown you a chatbot answering questions. Today, when I open my laptop in the morning, I have three agents working in parallel: one is writing a LinkedIn article, another is reviewing the code we pushed yesterday to the website, and a third is summarising last night's emails on my phone over Telegram.

I am not exaggerating. That is literally how my day at Delbion runs. And the interesting part: no two of the three are the same type of agent. Each one shines in a different area.

This article is a practical map of the four tools I use daily: Claude Code, Codex, OpenCode and OpenClaw. It is not an on-paper comparison. It is what I have learned using them inside a real company, mistakes included.

1. The leap: from chatbot to agent that executes

The fundamental change over the last few months is not that models got smarter. It is that they can now do things. They can read files on your computer, run commands, open a browser, answer emails, check calendars. They no longer suggest what you need to do: they do it.

To give you a concrete example: the article you are reading was drafted with Claude Code. But before writing it, the same agent opened the OpenClaw website, read its documentation, checked our previous articles to avoid repeating angles, and proposed the structure. All of that without me touching a single file.

That difference (suggesting vs. executing) is what makes 2026 agents a real productivity tool, not a demo toy.

What exactly changed

Models now have tools: they can read and write files, run a terminal, browse the web, call APIs and connect to your calendar, email or CRM. That connection to the real world is the leap. The model itself did not change that much. The scaffolding around it did.
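To make "scaffolding" concrete, here is a minimal Python sketch of the idea: the model asks for a tool by name, and a thin layer of ordinary code runs it and returns the result. The tool names and the request shape are my own assumptions for illustration. This is not how Claude Code, Codex, OpenCode or OpenClaw are actually implemented.

```python
from pathlib import Path
from typing import Callable


# Hypothetical tools: names and signatures are illustrative only.
def read_file(path: str) -> str:
    """Return the text content of a file."""
    return Path(path).read_text(encoding="utf-8")


def write_file(path: str, content: str) -> str:
    """Write text to a file and report what was done."""
    Path(path).write_text(content, encoding="utf-8")
    return f"wrote {len(content)} characters to {path}"


# Registry mapping the tool name the model asks for to the code that runs it.
TOOLS: dict[str, Callable[..., str]] = {
    "read_file": read_file,
    "write_file": write_file,
}


def run_tool_call(call: dict) -> str:
    """Execute one request shaped like {"tool": name, "args": {...}}."""
    name = call.get("tool")
    if name not in TOOLS:
        # Unknown tools are refused rather than guessed at.
        return f"error: unknown tool {name!r}"
    return TOOLS[name](**call.get("args", {}))
```

In a real agent, the text returned by run_tool_call would be sent back to the model, which then decides the next step. The loop is simple. The value is in which tools you register.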

2. Quick map of the four agents

Before getting into detail, this is the snapshot. It will help you decide which one to explore first based on what you need:

Tool	Where it shines	Where it lives	Model	Entry price
Claude Code	Long-running code, writing and analysis tasks with local files	Terminal and VS Code	Claude (Anthropic)	From EUR 20/month
Codex	Quick automations and scripts in specific languages	Terminal and ChatGPT	GPT (OpenAI)	Included in ChatGPT Plus/Pro
OpenCode	Teams that want to pick the model and avoid vendor lock-in	Terminal	Any (Claude, GPT, Gemini, local)	Free (open source)
OpenClaw	Full personal assistant over chat, accessible to non-technical users	WhatsApp, Telegram, Discord, iMessage	Any (Claude, GPT, local)	Free (open source)

The first two are commercial. The last two are open source. At Delbion we use all four, each for what it does best.

3. Claude Code: writes, refactors and deploys

This is the tool I use the most. It lives in the terminal of my laptop and has full access to the projects I point it at. When I ask it to do something, it does it end to end: reads the necessary files, edits, tests, verifies and, if I ask, commits and pushes.

Three real cases from the past week:

SEO article writing. I give it the topic, it checks previous articles to avoid repeating itself, reads the content strategy, writes the article in three languages (Spanish, English, Catalan), generates the cover image, updates the index and deploys the site. Total time: around 40 minutes, with the first draft ready in 10.
Training quotes for clients. I hand over the client details (tax ID, course, number of students) and it produces a print-ready HTML quote with the correct tax data, FUNDAE calculations and a numbered reference.
Quick SEO audits. I ask it to review recent changes, validate metadata, and find broken canonicals and stray accented characters in URLs. It covers the whole site in seconds.
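One of the audit checks above, accented characters in URLs, is easy to picture as code. This is a small sketch using only the Python standard library. The function names and the example.com URLs are hypothetical, and it is not the script the agent actually runs.

```python
from urllib.parse import unquote, urlsplit


def has_accented_path(url: str) -> bool:
    """Return True if the URL path contains non-ASCII characters,
    whether written raw or percent-encoded."""
    path = unquote(urlsplit(url).path)
    return not path.isascii()


def flag_urls(urls: list[str]) -> list[str]:
    """Keep only the URLs whose path needs attention."""
    return [u for u in urls if has_accented_path(u)]


# Hypothetical URLs, for illustration only.
suspects = flag_urls([
    "https://example.com/formacion-ia",
    "https://example.com/formación-ia",
    "https://example.com/formaci%C3%B3n-ia",
])
```

Here suspects ends up holding the second and third URLs. The percent-encoded one is the case that is hardest to spot by eye.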

What Claude Code has given me is autonomy on long tasks. I do not need to babysit it. I hand over context, define the goal, and it works until it finishes. When something gets complicated, it asks before breaking anything.

Ideal profile

Technical or semi-technical (developer, marketer who codes, founder who touches repos). If you are comfortable in the terminal and have projects in git, Claude Code changes how you work. If you never touch code, start with another tool.

4. Codex: the OpenAI script specialist

Codex is OpenAI's bet. It does things similar to Claude Code, but the experience is different. In my own use, it stands out for one-off coding tasks: generating a Python script to process a CSV, writing a quick JavaScript function, turning a spec into an API endpoint.

Where I use it: when I am already inside ChatGPT Pro and need some code without changing context. The integration with the ChatGPT environment is excellent. It lets you run the code it generates in a sandbox, see the result and polish it without leaving the conversation.

Where I do not use it: for long tasks that touch many files at once. Claude Code wins there, probably because the underlying model and the scaffolding are both tuned for that.

For an SME that already has ChatGPT Pro for its team, Codex is the first agent to try: it adds no cost, it is integrated with the tool the team already uses, and it enables quick automations without installing anything.

5. OpenCode: the open-source multi-model alternative

OpenCode is what happens when the community takes the best of Claude Code and Codex and makes it open source. It is a CLI that works like the previous ones, but with one crucial advantage: you pick the model.

Want to use it with Claude? Fine. With GPT? Also fine. With a local model running on your own machine that sends nothing to the internet? Perfect. That freedom is gold when you work in sectors where data cannot leave the building: healthcare, banking, defence, public administration.

At Delbion we use it on client projects with data sovereignty requirements. The client wants automation, but does not want their documents flowing through Anthropic or OpenAI servers. With OpenCode plus a local model like Qwen or Llama running on their infrastructure, we get the best of both worlds.

When OpenCode makes sense

Technical teams worried about privacy or cost, regulated companies (healthcare, finance, public sector), or anyone who wants to experiment without committing to a vendor. If your case is "I want the best and I do not mind paying for it", Claude Code wins. If it is "I want full control", OpenCode wins.

6. OpenClaw: your 24/7 assistant on WhatsApp and Telegram

Here is the surprise of the last few months. OpenClaw is not for developers. It is for anyone who has WhatsApp or Telegram and wants a personal assistant that executes, not just replies.

You install it once on your computer (or on a cheap server). You connect it to your WhatsApp, Telegram, Discord or iMessage. From that point on you talk to it like you would talk to a coworker, from anywhere.

Real examples we are already testing with SME clients:

HR lead at a 40-person company: she messages the bot on WhatsApp "summarise the 3 CVs I received this morning and schedule 15 minutes with the two that fit best". The agent reads the email, processes the CVs, finds slots in the calendar and sends the invites. She only confirms.
Dental clinic manager: "if someone cancels, let me know and offer it to the waiting list". The agent watches the calendar, detects cancellations and sends WhatsApp messages to people on the list until the slot is filled.
E-commerce operations director: "each night summarise open tickets older than 24h and send me a report at 8". It has been running for weeks.

This is the agent that opens up AI agents to people who will never open a terminal. And for an SME, that is huge.

EUR 0: the installation cost of OpenClaw. You only pay for the API of the model you choose (and if you use a local model, not even that).

7. Three mistakes I have already made

So you do not have to repeat them:

Mistake 1: giving wide permissions from day one. My first week with Claude Code I gave it full access to my system. Two days in, it had deleted files it should not have touched (thinking they were stale) and pushed something that should not have gone out. Lesson: start with minimal permissions. Let the agent ask before irreversible actions. Costs you ten extra seconds and saves you headaches.
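The "ask before irreversible actions" rule can be sketched in a few lines of Python. The action names and the guarded_run helper are assumptions of mine for illustration. Each tool has its own permission settings, and this is not their API.

```python
from typing import Callable

# Hypothetical list of action names treated as irreversible.
IRREVERSIBLE = {"delete_file", "git_push", "send_email"}


def guarded_run(
    action: str,
    run: Callable[[], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run an action, asking for approval first if it cannot be undone."""
    if action in IRREVERSIBLE and not confirm(action):
        return f"skipped {action}: not approved"
    return run()
```

In interactive use, confirm could be something like lambda a: input(f"Allow {a}? [y/N] ").strip().lower() == "y". Reversible actions go straight through, so the ten extra seconds only apply where they matter.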

Mistake 2: asking for vague things expecting magic. "Improve the website" is not an instruction. "Review the 5 top-traffic insights articles and suggest 3 SEO improvements in each, prioritised by impact" is. Agents are only as good as the context you give them.

Mistake 3: ignoring cost. An agent running long tasks burns tokens. My first monthly bill surprised me. Now I have daily spend caps configured and I review the invoice weekly. Not out of stinginess: out of operational discipline.
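A daily spend cap is conceptually simple. This Python sketch shows the logic under my own assumptions: the DailySpendCap class and the EUR amounts are hypothetical, and in practice the cap is usually configured in the provider's billing settings rather than in your own code.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DailySpendCap:
    """Track spend per day and refuse work that would exceed the cap."""

    cap_eur: float
    spent: dict[date, float] = field(default_factory=dict)

    def record(self, cost_eur: float, day: date) -> None:
        """Add a finished task's cost to that day's total."""
        self.spent[day] = self.spent.get(day, 0.0) + cost_eur

    def allows(self, estimated_cost_eur: float, day: date) -> bool:
        """Return True if the task fits within what is left of the day's cap."""
        return self.spent.get(day, 0.0) + estimated_cost_eur <= self.cap_eur
```

Before launching a long task, you would call allows with a rough cost estimate and call record once the real cost is known. The totals are keyed by date, so each new day starts from zero.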

8. Where to start based on your profile

If you have read this far, you are probably wondering which to try first. It depends on you:

You are a technical founder or developer at an SME. Start with Claude Code. Fast learning curve, immediate results, and the best balance between power and ease of use.
You already have ChatGPT Pro across the team. Start with Codex. No added cost and it removes the friction of trying something new.
You lead a non-technical area (HR, operations, sales) and want an assistant. Start with OpenClaw. It is the most natural entry point, works over WhatsApp and requires no programming.
You work with sensitive data or have privacy constraints. Start with OpenCode plus a local model. Harder to set up, but there is no real alternative if your data cannot leave your infrastructure.

All four tools can be combined. I personally use Claude Code for deep work in the terminal, OpenClaw to stay connected to the company from my phone, and I reserve Codex and OpenCode for the specific cases where they shine.

What matters is not which you choose first. What matters is that you start. Every month that goes by, the gap widens between those who have already put agents into operation and those who are still thinking about it. And that gap cannot be closed later with budget. Only with time.

Author

Carlos Salgado

CEO & Co-founder · Delbion

Carlos runs Delbion. He uses AI agents every day to operate the company without a traditional team: writing articles, handling SEO, preparing quotes, answering emails and deploying websites. This article shares what works and what does not, with concrete examples.
