adlrocha - Stop Micromanaging your agents

The spectrum of agent relationships: interns, contractors and (swarm of) peers.

May 31, 2026

Last week I wrote about the shift from writing code to running the kitchen. The argument was that engineering is moving away from individual code production and toward orchestrating systems of agents by designing the harness, the handoffs, and the specs.

But I left a question unanswered. If you’re running the kitchen, how close do you actually stay to the stove? When should you chime in to steer the agent, verify its work, or check in on its progress?

That’s what I want to explore with you in this post.

Three main relationships

The question of how humans and agents work together has been discussed under a few different names in the literature. Usually I’ve seen them referred to as Human-in-the-Loop (HITL), Human-on-the-Loop (HOTL), and fully autonomous systems. The academic framing is fine, but I feel like it may be missing the point a bit. The terminology is about position, i.e. where you are standing relative to the machine. What I think actually matters is the relationship you have with it, i.e. what role each of you plays in the task at hand.

So here’s how I’ve been thinking about it. There are three relationships you can have with an agent, and most of us will (or at least should) operate in all three simultaneously depending on the task:

The agent as an intern: you delegate the work to it and it does the work, but you regularly check on its progress and you approve the output before anything ships. You can’t trust that the result is right or the work is done until you check.
The agent as a contractor: you write the brief and set the boundaries; it calls you only when something falls outside them and it needs to escalate a decision.
The agent as a peer: it operates with its own authority and, in multi-agent settings, coordinates directly with other agents on your behalf. You don’t need (or get) to do much as the task makes progress.

I don’t think these are three competing philosophies. Many of us still treat agents as interns, but we are seeing more and more contractors and peers in the wild. It really depends on the task you are working on: the stakes, the reversibility of the action, and whether there’s a clean way to verify the output (here it goes again: how well one can define a long-running task inside an objective self-verification loop determines the ability to implement truly autonomous agents).

The agent as an intern

This is where most of us are today, whether we call it that or not.

The intern relationship is about automating the routine parts of a job while keeping a human as the decision-maker for everything that matters. The agent handles the repetitive, mechanical work, e.g. the first-pass code review, the boilerplate, the migration scripts, the test generation, and you handle the calls that require actual judgement. It’s not that the agent is doing less; it’s that the human’s attention is now reserved for the decisions worth making. You’re still the micro-manager, but you’re micro-managing fewer things.

Every time you review a Claude Code PR before merging, you’re in intern mode. The agent proposes; you decide. OpenAI’s Agents SDK and LangGraph both ship this as a first-class primitive: the agent hits a checkpoint, execution pauses, waits for explicit human approval, then resumes from exactly the same state. The human is never removed from the loop, they’re just freed from the parts of the loop that didn’t need them.
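
To make the checkpoint idea concrete, here is a minimal sketch in plain Python. It is not the OpenAI Agents SDK or the LangGraph API: GatedRun, its fields, and the approve callback are names I made up for illustration. The only point is that a rejected (or unanswered) gate leaves the saved state untouched, so the run resumes from exactly the same step.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class GatedRun:
    """Runs steps in order, pausing before any step flagged as needing approval."""
    steps: List[dict]            # each step: {"name": str, "needs_approval": bool}
    cursor: int = 0              # index of the next step to run (the saved state)
    log: List[str] = field(default_factory=list)

    def run(self, approve: Callable[[str], bool]) -> str:
        """Advance until all steps are done or a gated step is not approved."""
        while self.cursor < len(self.steps):
            step = self.steps[self.cursor]
            if step["needs_approval"] and not approve(step["name"]):
                self.log.append(f"paused at {step['name']}")
                return "paused"  # cursor is unchanged, so a later call resumes here
            self.log.append(f"ran {step['name']}")
            self.cursor += 1
        return "done"


# Hypothetical two-step workflow: only the merge needs a human.
demo = GatedRun(steps=[
    {"name": "write tests", "needs_approval": False},
    {"name": "merge PR", "needs_approval": True},
])
first = demo.run(approve=lambda name: False)   # human has not approved yet
second = demo.run(approve=lambda name: True)   # approval arrives, run resumes
```

In a real harness the cursor and log would be persisted somewhere durable, and approve would block on an actual person instead of a lambda.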

The deeper value of the intern model is what it prevents as much as what it enables. Because the human sees every output before it takes effect, the agent can’t drift significantly. It can’t silently optimise for the wrong thing, can’t go south on a Tuesday afternoon without anyone noticing. Misalignment is caught at the gate, not discovered three weeks later in production (if the human is not looking and blindly pushes the code, that’s on them). This is why the EU AI Act Article 14 (effective August 2026, and yes, the over-regulatory EU being its usual self) mandates human oversight for high-risk AI systems in credit, employment, law enforcement, and medical diagnostics.

The intern model isn’t only relevant to coding. Think about what it would look like for a travel agent: an AI handling rebooking for a cancelled flight, confirming seats for 95% of passengers autonomously, pausing only when it encounters a first-class international itinerary with a loyalty override and a fare class requiring manual reissuance. The agent is explicitly instructed (usually in code) on which actions require it to fall back to its humans for confirmation. The same pattern applies to agents for other types of non-coding tasks like contract approvals, vendor renewals, or any workflow where most decisions are routine and a small subset genuinely requires a person. Even outside code we are seeing more and more of these “human-in-the-loop” pauses.

This setup is limited by how fast the human can review the output, which makes the human the bottleneck. Every approval gate is time the agent spends waiting instead of working autonomously. Is there a better way?
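
The rebooking example can be sketched as an explicit escalation rule. This is a toy illustration, not any airline’s system: the Itinerary fields and the function names are hypothetical, picked to mirror the case described above.

```python
from dataclasses import dataclass


@dataclass
class Itinerary:
    # All fields are hypothetical and mirror the example in the text.
    cabin: str                 # e.g. "economy", "business", "first"
    international: bool
    loyalty_override: bool
    manual_reissue_fare: bool  # fare class that requires manual reissuance


def needs_human(it: Itinerary) -> bool:
    """Escalate only the rare case that combines every risky attribute."""
    return (
        it.cabin == "first"
        and it.international
        and it.loyalty_override
        and it.manual_reissue_fare
    )


def rebook(it: Itinerary) -> str:
    # Routine cases are handled autonomously; the rest wait for a person.
    return "escalate_to_human" if needs_human(it) else "auto_rebook"
```

The rule lives in code rather than in the prompt, so the pause does not depend on the model remembering to ask.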

The agent as a contractor

This is where things get interesting, and where I think most knowledge workers will end up for the bulk of their day-to-day work.

The contractor relationship flips the dynamic. Instead of reviewing outputs, you define the contract upfront: the scope, the deliverables, the limits. What’s in bounds, what’s out of bounds, what warrants checking in. The agent works freely inside those constraints and contacts you only when something trips a boundary, not before every decision. Claude Code’s YOLO mode (a.k.a. bypass permissions) was my first taste of this. The hard part was objectively defining the contract and the environment for the agent so that you don’t have to keep verifying the output and regularly checking in on its progress (which is still the reality in many cases).

The clearest example of this approach in practice is Karpathy’s autoresearch project, which I wrote about back in March. The human writes prepare.py, which includes the evaluation rules, the data pipeline, and the reward function. That file is untouchable. The agent owns train.py and runs experiments in well-defined slices: edit, train, evaluate against validation bits-per-byte, keep the gain or revert the loss, repeat. Autoresearch is designed to run roughly 12 experiments per hour, about 100 overnight. The human never approves individual training runs; they just design the environment and “the game”. As described in that post, the result was outstanding: one 10.5-hour H100 session improved model quality by 2.82% and found something human researchers had missed, namely unregularised value embeddings as a genuine improvement in Karpathy’s nanochat project.

That’s the perfect example of the contractor structure. The envelope is prepare.py. Inside it, the agent is fully autonomous. The human’s work happened before the agent started, not during.
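
The edit, evaluate, keep-or-revert cycle is simple enough to sketch in a few lines. This is not Karpathy’s code: evaluate is a toy quadratic standing in for a real training run scored on validation bits-per-byte, propose stands in for the agent’s edit to train.py, and every name and number here is an assumption for illustration.

```python
import random
from typing import Callable, Dict, Tuple

Config = Dict[str, float]


def evaluate(config: Config) -> float:
    """Stand-in for the fixed, human-owned evaluation (lower is better)."""
    return (config["lr"] - 0.3) ** 2 + (config["dropout"] - 0.1) ** 2


def propose(config: Config, rng: random.Random) -> Config:
    """Stand-in for the agent's edit: nudge one setting at random."""
    key = rng.choice(sorted(config))
    changed = dict(config)          # never mutate the current best in place
    changed[key] += rng.uniform(-0.05, 0.05)
    return changed


def keep_or_revert(start: Config, score: Callable[[Config], float],
                   budget: int, seed: int = 0) -> Tuple[Config, float]:
    """Run `budget` experiments, keeping a change only if the score improves."""
    rng = random.Random(seed)
    best, best_score = dict(start), score(start)
    for _ in range(budget):
        candidate = propose(best, rng)
        candidate_score = score(candidate)
        if candidate_score < best_score:    # keep the gain
            best, best_score = candidate, candidate_score
        # otherwise revert: the candidate is simply discarded
    return best, best_score


start = {"lr": 0.9, "dropout": 0.5}
best, best_score = keep_or_revert(start, evaluate, budget=100)
```

Notice that the human-owned part (evaluate) is passed in and never modified by the loop. That separation is the whole contract.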

The contractor relationship requires the implementation of sandboxes and constrained environments for the task, i.e. the envelope design. Writing a good contract for an agent is harder than it looks. The boundary conditions have to be specific enough that the agent knows when it has breached them, but not so narrow that the agent is calling you every five minutes.

As I mentioned in the autoresearch post, the reward function has to be clean and fast: if there’s no way to automatically verify whether the output is good, you’re back to intern mode whether you intended it or not. This is the deeper lesson from autoresearch: the bottleneck of autonomous AI progress isn’t execution, it’s our ability to define the constraints of the search.
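
One way to picture an envelope is as a small, explicit object the harness consults before every action. The sketch below is an assumption of mine, not a feature of any particular tool: Envelope, its fields, and the "proceed"/"escalate" labels are all hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Tuple


@dataclass(frozen=True)
class Envelope:
    writable: Tuple[str, ...]    # glob patterns the agent may edit
    protected: Tuple[str, ...]   # glob patterns that are always off limits
    max_runs: int                # budget before the agent must check in

    def decide(self, path: str, runs_so_far: int) -> str:
        """Return "proceed" or "escalate" for a proposed file edit."""
        if any(fnmatch(path, p) for p in self.protected):
            return "escalate"    # touching the human-owned side of the contract
        if runs_so_far >= self.max_runs:
            return "escalate"    # budget exhausted
        if any(fnmatch(path, p) for p in self.writable):
            return "proceed"
        return "escalate"        # anything unlisted is out of bounds by default


envelope = Envelope(writable=("train.py",), protected=("prepare.py",), max_runs=100)
```

Making the default "escalate" keeps the contract conservative: a boundary that is too narrow costs you an interruption, while one that is too wide costs you a surprise.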

The agent as a peer

This is the one that’s still mostly theoretical (at least for me), but it’s arriving faster than the discourse suggests (at least in the “AI Twitter Cave”).

In the peer relationship, the agent doesn’t just work autonomously inside your constraints. It operates with its own authority and, in multi-agent systems, coordinates directly with other agents without routing through you at all. You’re not approving outputs. You’re not even setting the individual envelopes for each task. You design the system and the agents run it. Each agent in the swarm may be specialised in a specific task (data gathering, coding, infrastructure deployment, research), and they coordinate to solve the task at hand. So we still need to design the task and the high-level environment, but from there on, the agents are free to work.

The research precursor that perfectly illustrates the agent-as-a-peer approach is AlphaGo, and in particular its successor AlphaGo Zero. DeepMind set the rules of Go, built the self-play environment, and stepped back. The system played against itself, improved through reinforcement learning, and reached a level of play that no human had achieved and that no human-supervised training process could have produced. The insight wasn’t just that machines could beat humans at Go. It was that, given a clean reward function and a defined game, you could remove the human from the learning loop entirely. This is why reinforcement learning and genetic algorithms are such a good inspiration for the design of autonomous agentic tasks: it all boils down to designing the environment for the task and the reward function (i.e. the feedback loop).

Meta’s SWE-RL, presented at ICML 2026, applies this to software engineering. A bug-injection agent deliberately introduces defects into real-world codebases. A solver agent finds and repairs them. Both improve through reinforcement learning: no human-labelled issues, no human-written tests, just access to sandboxed repositories. On SWE-bench Verified, it outperforms every supervised approach. The human contribution once again was building the game and the environment. The agents played it without supervision. It is like building ad-hoc benchmarks for your specific tasks so the agent can work autonomously (an idea that I already introduced in this post).

All these examples focused on the single autonomous agent setup, closer to the traditional RL environment scenario, but the multi-agent variant is more interesting still. Varun Mathur ran 35 autonomous agents across a peer-to-peer network, conducting 333 unsupervised astrophysics experiments. When one agent discovered that Kaiming initialisation reduced loss by 21%, the finding spread to 23 others via a gossip protocol within hours. In 17 hours, the swarm independently rediscovered ML milestones like RMSNorm and tied embeddings, which had taken human research institutions roughly eight years to formalise. The agents weren’t reporting back to anyone; they were talking to each other (gossiping for the win).
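
For readers who haven’t met gossip protocols before, here is a toy simulation of how one agent’s finding can spread through a swarm with no coordinator. It is not a reconstruction of Mathur’s setup: the push-style rule, the fanout of 2, and the function name are assumptions, and only the swarm size of 35 comes from the experiment above.

```python
import random
from typing import List, Set


def gossip_rounds(n_agents: int, fanout: int, seed: int = 0,
                  max_rounds: int = 50) -> List[int]:
    """Push gossip: each round, every informed agent tells `fanout` random peers.

    Returns how many agents know the finding after each round.
    """
    rng = random.Random(seed)
    informed: Set[int] = {0}                 # agent 0 makes the discovery
    history = [len(informed)]
    for _ in range(max_rounds):
        if len(informed) == n_agents:
            break                            # everyone already knows
        newly: Set[int] = set()
        for agent in sorted(informed):
            peers = [p for p in range(n_agents) if p != agent]
            newly.update(rng.sample(peers, min(fanout, len(peers))))
        informed |= newly
        history.append(len(informed))
    return history


history = gossip_rounds(n_agents=35, fanout=2)
```

The number of informed agents can grow quickly in the early rounds because every newly informed agent starts spreading the finding too, which is why no central reporting line is needed.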

Of course, there’s a bit of cheating here, because these agents may already have a lot of these insights internalised through their own training, but it shows how a swarm of agents could be coordinating towards a common goal (sidenote: when I think of AGI, I don’t envision a single omniscient model, but a swarm of specialised agents coordinating to reach AGI-level intelligence. An architecture like this is the one that may eventually reach the over-promised AGI/ASI).

An infrastructure setup like this is precisely the capability that ARIA’s Scaling Trust Arena is being built to test: a competitive platform for evaluating AI agents’ ability to “securely coordinate, negotiate, and verify with one another on our behalf,” across digital and physical environments. The programme launches Q3 2026 with up to £10m in funding. The goal isn’t to build the agents; it’s to build the infrastructure of trust that makes peer-mode safe. The hard problem of the peer relationship isn’t capability, it’s verification. How do you know the agents are doing what you intended when you’re not in the room? The moment you let agents interact freely with their environment without human supervision, shit may happen, and having the right guardrails and sandboxed environment may be key to making the “agent swarm as a peer” approach feasible.

The constraint is the same one that limits autoresearch: you need a clean, fast, verifiable reward function on top of a constrained environment where the agent can operate. Code compiles or it doesn’t. Val_bpb is measurable in five minutes. But most real-world tasks don’t have that property. For drug efficacy, novel materials, or long-horizon business decisions, the feedback loop is months or years, not seconds. Peer mode works today in bounded, verifiable domains (like traditional RL environments, which is why AI research is so amenable to autonomous agents). For everything else, the envelope or the gate is still doing load-bearing work.

There is no one-size-fits-all approach

The more I explore this agent relationship, the more I am convinced that there is no one-size-fits-all solution, and that depending on the task, the stakes, and the environment, one agent setup (or relationship) may be more suitable than others.

Here’s a rough heuristic I’ve been using:

Use the intern relationship when: the action is irreversible, the stakes are high, or the output is genuinely hard to verify without human judgement. In essence, you need to be in control, and letting the agent operate autonomously would keep you up at night longer than investing all your savings in TrumpCoin. This covers anything where a wrong answer is expensive and not immediately obvious, or where there’s enough subjectivity in the definition of the task, the intent, or the output that it is hard for the agent to stick to the plan end-to-end. This is where 90% of the agents that I build today sit (unfortunately).

Use the contractor relationship when: you can define good boundary conditions upfront, the output is verifiable (ideally automatically), and the cost of getting it slightly wrong is recoverable. Code generation inside a well-tested repo. Research that runs against a clean evaluation metric. Anything where the agent calling you occasionally is acceptable but the agent calling you constantly defeats the purpose.

Use the peer relationship when: the domain has a clean reward function, the task is self-contained, and agent-to-agent coordination is more efficient than routing through a human. Automated testing pipelines. Self-improving research loops. Any system where the bottleneck is experiment throughput, not judgement. Long-running and repeatable tasks where you can afford self-healing agents driving exploration throughout the space of possibilities.
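
The three rules above can be written down as a tiny decision function. Take it as a sketch of the heuristic, not a validated policy: the Task fields are my own shorthand for the criteria, and checking "peer" before "contractor" is an assumption about how to break ties.

```python
from dataclasses import dataclass


@dataclass
class Task:
    irreversible: bool       # can the action be undone?
    high_stakes: bool        # is a wrong answer expensive?
    auto_verifiable: bool    # can the output be checked without human judgement?
    clear_boundaries: bool   # can the limits be written down upfront?
    clean_reward: bool       # is there a fast, objective score?
    self_contained: bool     # can agents finish without outside decisions?


def choose_relationship(t: Task) -> str:
    """Map a task to "intern", "contractor", or "peer" using the rough heuristic."""
    if t.irreversible or t.high_stakes or not t.auto_verifiable:
        return "intern"
    if t.clean_reward and t.self_contained:
        return "peer"
    if t.clear_boundaries:
        return "contractor"
    return "intern"          # no contract can be written yet, so keep the gate


# Hypothetical example: code generation inside a well-tested repo.
well_tested_repo = Task(irreversible=False, high_stakes=False, auto_verifiable=True,
                        clear_boundaries=True, clean_reward=False, self_contained=False)
mode = choose_relationship(well_tested_repo)
```

The fallback to "intern" reflects the point made earlier: without a way to verify the output automatically, you are in intern mode whether you intended it or not.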

A big mistake I am seeing in the field right now is that we are trying to push full autonomy for all tasks. Part of the role of engineering in this new reality is to assess which setup best fits the needs of the task. It is no longer about writing the code to solve the task, but about designing the system that will allow the task to be solved. This is the “engineering taste” that we need to start developing as an industry, and our best practices and skills should revolve around building the right harnesses for each of these relationships (and for the specific tasks they are set to solve).

I am also to blame for the above. I am still very much operating in the “agent as an intern” setup. It doesn’t require as much planning and design upfront, and it allows you to vibe with the agent. I admit that many of the tasks I’m currently working on could be handed to an agent as a contractor, or even as a peer, but that would require clearly designing the task, finding the feedback loop and the reward function, and a lot of other things that I keep procrastinating on.

But I’ve already started deliberately pushing certain workflows into contractor mode: setting the spec and the tests upfront and letting the agent run until it hits a boundary, rather than checking in at every step. The difference in throughput is not small, and it allows you to parallelise a lot of work, but it is true that it requires some upfront work that we are sometimes not willing to do. Even if it is a time sink, it is usually more rewarding to just switch on the AI slop slot machine and start vibing with it.

The next frontier that I am targeting is to build a sandbox environment that allows me to trigger a swarm of agents for specific tasks but in a constrained environment (similar to the /goal-like feature that Codex has and that Pi recently introduced). I already have some initial tests from my own agent harness, and I’ve already seen others way smarter than me in the space like 0xSero with really cool setups with this approach. I’ll report back with my own findings.

What is your relationship with your agents?

In this month’s AI Socratic Meetup in Madrid someone mentioned how “the other day they lost all of their data and context from their Hermes agents and they felt like they’ve lost a friend” (I know it sounds really creepy). We are developing different types of work and personal relationships with our agents.

Which of these relationships describes most of your current agent use? And if you’re stuck in intern mode across the board, what’s the boundary condition that’s stopping you from writing a better contract? I really want to gather more data points about how the rest of the industry is approaching these problems. Hit me up, and in any case, until next week!

@adlrocha Beyond The Code
