Human-in-the-loop is the wrong default for AI platforms

Human-in-the-loop makes it sound like we're just a tool for the AI. I picture someone at the end of a process, reading whatever the AI produced. Most AI products sell that story: automation runs, then a person validates. Coding models hand you a diff; LLMs reshape copy from a prompt. The UI is usually tuned for throughput, and the human becomes a rubber stamp rather than a critical assessor. That gap stings when mistakes are expensive or when you need to know what you shipped, which is most of the time. Still, I see a lot of people trusting the workflow because it is such a convenient way to get the first part of a project done as quickly as possible.

Different entry points

Let me explain what the two directions actually are. Human-in-the-loop (HITL) is the label for validating or fixing after automation runs, like verifying AI-generated code before merge. AI-in-the-loop (AITL) is something we already know; it was simply called a 'suggestion' before anyone gave it a name: the model proposes, the human decides.

Figure: AI-in-the-loop vs. human-in-the-loop. Two different approaches to AI implementation; the AI-in-the-loop process gives humans more control over the final outcome.
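
For readers who think in code, the difference in entry point can be sketched roughly like this. It is a minimal illustration, not how any particular product works: `Outcome`, `human_in_the_loop`, `ai_in_the_loop` and the callback parameters are all hypothetical names I made up for this post.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Outcome:
    value: str
    authored_by: str  # hypothetical label: who produced the final value


def human_in_the_loop(
    task: str,
    ai_generate: Callable[[str], str],
    human_approves: Callable[[str], bool],
) -> Optional[Outcome]:
    """HITL: automation runs first, the human only validates the finished output."""
    draft = ai_generate(task)
    if human_approves(draft):
        # The AI's draft ships unchanged; the human acted as a gate.
        return Outcome(draft, authored_by="ai")
    return None  # rejected; a real workflow would retry or fix by hand


def ai_in_the_loop(
    task: str,
    ai_suggest: Callable[[str], str],
    human_decide: Callable[[str, str], str],
) -> Outcome:
    """AITL: the AI proposes, the human produces the final value."""
    suggestion = ai_suggest(task)
    # The human sees the task and the suggestion, and may use, edit or ignore it.
    final = human_decide(task, suggestion)
    return Outcome(final, authored_by="human")
```

In the first function the human can only say yes or no to something that already exists; in the second, the suggestion is just one input to a decision the human still has to make.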

This raises a lot of questions for me. Are today's AI models even good enough to take the lead? Is HITL the right approach, and can we expect users to remain critical enough when handling something led by an AI? Do people even want to validate output from a system, or are they looking for something more challenging in their work? A lot of work needs someone to actually understand the result so the next person is not stuck with opaque edits.

In practice

During my project for Triodos Bank, I tested both approaches to figure out how user behaviour differs between them. Triodos sits on sensitive financial and customer data, so it is extremely important that any type of data is 100% accurate. In the test case, users validated data under both the HITL and the AITL approach.

Users actually felt like HITL was faster, which makes sense: it's a matter of just validating some output rather than submitting that output themselves. However, actual critical validation happened only to a limited extent, as users skimmed more than they reviewed. I think this is a perfect example of the issue: if something is too convenient, users will not validate results well enough. This can feel productive in the moment but hurt in the long term, when incorrect data shows up in reports or, from the perspective of shipped code, when the wrong design patterns leave you with a codebase that's impossible to maintain.

Not always bad

I do believe there are scenarios where the human-in-the-loop approach can be fine. In fact, Brian Lovin of Notion explains that he uses AI to quickly prototype new ideas and test them in a true product environment. I think this is a perfect example of where tools like code agents can be just fine: not for a definitive result, but to understand how users behave in a true environment rather than on a Figma canvas.

In fact, I did just that with this Triodos project. Due to the complexity of our product case, working in a Figma prototype would not reflect a proper real-world environment. Instead, I used Onlook AI to create a quick prototype. Nowadays, you could use the Figma MCP and combine it with Claude Code, Cursor or any other tool you prefer, as over the past two years this workflow has become a lot more convenient and more faithful to the design. Just keep in mind that shipping these AI-first outputs can cause issues in the long term.