Mirek Riedewald - blog post

Learning from Accountants: Human-in-the-Loop Systems Done Right (May 13, 2026)

Depth Over Breadth: Why the Next Wave of AI Belongs to Domain Experts (June 23, 2026)

TL;DR

Modern AI's breadth (a single model that can write, summarize, code, and answer almost anything) is losing its role as a differentiator, since the best general models are converging and access is becoming universal. The new frontier is depth: solving domain-specific problems that may look easy from the outside but require expert knowledge to see "the part that is not written down." Invoice coding is a good running example: seemingly dull data entry that is really a hard prediction-and-reasoning problem, closer to self-driving cars than to OCR, because the knowledge that matters most almost never appears on the document or in public training data. The winning approach is not to discard the general model but to build domain-specific AI on top of it, specializing a foundation model's fluency through prompting, retrieval, fine-tuning, or agentic workflows so that the domain expert can supply the depth that creates the real competitive advantage. Rather than replacing experts, the right design keeps them in the loop, and the lasting value belongs to those who understand a domain well enough to see what the general model cannot.

Full Post

Introduction

One cannot help but marvel at modern AI: like a Swiss Army knife, a single model now performs a wide range of tasks, from drafting essays, answering questions like "why is the sky blue," and summarizing books to generating software and proofreading research papers. The astonishing part is not only that it does any one of these well, but that a single model does all of them. This breadth is the hallmark of the current AI revolution, and it represents the realization of the old dream (and fear) of AI acquiring general human capabilities.

As I write this, a new wave is forming. While broad models will keep improving, the easy wins are mostly claimed, and it is becoming more challenging to extract new insights from data publicly available "on the Internet." The next frontier consists of problems where automation would change how an industry runs, which requires industry-specific knowledge: in short, depth. Many of these problems look trivial from the outside but turn out to be bottomless once you step in. Solving them does not take a broader model. It takes someone who understands a specific domain deeply enough to see the part that is not written down.

I have spent the past several months making this argument the long way around. On the surface, my recent series of posts explores automated invoice coding—arguably about as unglamorous a subject as I could have picked. But this was never just about accounting. As a computer science researcher focusing on big data and AI, I am always looking for the next big challenge. And in the context of AI, I believe the frontier has moved, now belonging to the domain experts. Let me pull those threads together to explain why.

The Unassuming Problem that Started It All

I begin by admitting my own mistake. As an established researcher who had collaborated with ornithologists, neuroscientists, and even "rocket scientists," I did not expect a consulting engagement on accounting software to teach me anything new about innovative data science. I was wrong. As I described in the post that opened the series, what appeared to be a somewhat dull OCR problem turned out to be a deep one, and the only way I discovered that was by engaging with it long enough.

That is the first lesson of the shift I am describing: from the outside you often cannot tell which problems are deep. Invoice coding sounds like straightforward data entry. In practice, as I argued in the post on why it is an AI grand challenge, it has far more in common with self-driving cars: highly variable inputs, no clean ground truth, business rules that shift underfoot, and a small margin for error because real money and audit compliance are on the line.

What a General Model Cannot See

The clearest way I found to make this concrete is the distinction between OCR fields and prediction fields. Some information, such as the vendor name, address, date, and total amount, sits right there on the invoice, and a capable general model can read and extract it. But almost all the fields that actually matter for coding an invoice inside an organization (for example, the internal vendor code, the property code, and the correct general-ledger (GL) account) practically never appear on the document at all. The vendor simply does not know the client's internal accounting structure or business rules.
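
For readers who think in code, the distinction can be sketched in a few lines of Python. This is only an illustration: the field names, the sample invoice, and the `missing_fields` helper are my own assumptions, not the schema of any real system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Invoice:
    # OCR fields: printed on the document, so a general model can extract them.
    vendor_name: Optional[str] = None
    vendor_address: Optional[str] = None
    invoice_date: Optional[str] = None
    total_amount: Optional[float] = None
    # Prediction fields: not printed on the document; they must be inferred
    # from the client's private accounting structure and business rules.
    vendor_code: Optional[str] = None
    property_code: Optional[str] = None
    gl_account: Optional[str] = None


OCR_FIELDS = ("vendor_name", "vendor_address", "invoice_date", "total_amount")
PREDICTION_FIELDS = ("vendor_code", "property_code", "gl_account")


def missing_fields(invoice: Invoice, names: tuple) -> list:
    """Return the names in `names` that are still unset on the invoice."""
    return [n for n in names if getattr(invoice, n) is None]


# What extraction alone yields for a hypothetical invoice.
extracted = Invoice(
    vendor_name="Acme Plumbing",
    vendor_address="12 Main St",
    invoice_date="2026-05-01",
    total_amount=250.0,
)

print("OCR fields still missing:", missing_fields(extracted, OCR_FIELDS))
print("Prediction fields still missing:", missing_fields(extracted, PREDICTION_FIELDS))
```

Even a perfect extraction step leaves every prediction field empty, which is the gap the rest of this post is about.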

A broad foundation model, however powerful, handles what meets the eye and applies what it learned from "the Internet." Nothing in its training data reflects knowledge specific to a domain that keeps its data and processes private. And that closed-off private world can be messy and incompletely represented by the data. For example, different accounts payable (AP) clerks code similar invoices differently, and a finished invoice rarely reveals the reasoning behind it. My tour of real-world invoice challenges (bundled documents, context buried in an old email, layout cues meant for human eyes, OCR errors that turn “5.0 @50” into “SO OSO”) was meant to drive home the point that AI-powered invoice coding is not an extraction problem, but a prediction and reasoning problem that demands context and judgment. That difference between reading a document and understanding a business is where breadth runs out and where depth begins.

Depth Is Not the Opposite of Breadth. It Is Built on Top of It.

As discussed in my most technical post, solving deep domain-specific problems does not mean that one should throw away the general model and replace it with a narrow one. I tried to head that off with an analogy: a model trained only on a company's invoices is like an alien who has seen the documents but knows nothing of human language or the world. It can mimic patterns but misses the connections that make those patterns mean anything. Pure narrowness causes brittleness.

I believe the winning architecture is domain-specific AI: take a foundation model's general fluency about language and the world, then specialize it to a use case through prompting, retrieval, fine-tuning, distillation, or agentic workflows. The foundation model supplies the breadth. The domain expert, either explicitly through direct input or implicitly through data they generated (e.g., coded invoices), supplies the depth: vendor codes, the chart of accounts, the late-fee policy, and the way costs get split across properties under a particular contract. Let's say it clearly: the depth layer is the differentiator. Breadth has become a commodity available to everyone. Domain knowledge is the competitive advantage, be it in the form of a cleverly designed prompt with carefully selected example invoices or a fine-tuned model. As with human employees, whoever owns the deepest understanding of the problem owns the value. AI further amplifies the differences by scaling that understanding as far as resources will allow.
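
As a rough illustration of the simplest of these options, a prompt with carefully selected example invoices, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `CodedInvoice` record, the sample history, the GL account labels, and especially the word-overlap similarity, which merely stands in for whatever retrieval method a real system would use. The call to the foundation model itself is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class CodedInvoice:
    text: str        # text read from the invoice
    gl_account: str  # code assigned by a human expert (the depth layer)


def similarity(a: str, b: str) -> float:
    """Jaccard overlap of word sets; a stand-in for a real retrieval model."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(history: list, new_text: str, k: int = 2) -> list:
    """Pick the k expert-coded invoices most similar to the new one."""
    ranked = sorted(history, key=lambda ex: similarity(ex.text, new_text), reverse=True)
    return ranked[:k]


def build_prompt(history: list, new_text: str, k: int = 2) -> str:
    """Assemble a few-shot prompt from the retrieved examples."""
    lines = ["Assign a GL account to the final invoice.", ""]
    for ex in retrieve(history, new_text, k):
        lines += [f"Invoice: {ex.text}", f"GL account: {ex.gl_account}", ""]
    lines += [f"Invoice: {new_text}", "GL account:"]
    return "\n".join(lines)


# Hypothetical history of invoices already coded by the client's accountants.
history = [
    CodedInvoice("Acme Plumbing leak repair unit 4", "6100-REPAIRS"),
    CodedInvoice("City Water monthly water bill", "6300-UTILITIES"),
    CodedInvoice("Green Lawn mowing service", "6200-LANDSCAPING"),
]

prompt = build_prompt(history, "Acme Plumbing pipe repair unit 7", k=1)
print(prompt)
```

The general model contributes the ability to read the prompt. The examples it is shown, and the codes attached to them, come entirely from the expert's past work.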

The Expert Is Not Being Replaced. They Are the Engine.

Will capable AI make the domain expert obsolete? My previous post argues the reverse. The right design for domain-specific problems like invoice coding is not full automation, but a human-in-the-loop system where the AI clears the routine cases and routes the hard ones—flagged by per-field confidence scores—to a human expert. The accountant's judgment does not disappear from the loop. It becomes the loop's most valuable signal: Every correction an expert makes, every field they linger over, every case they rework from scratch provides valuable training data—implicit feedback harvested for free from ordinary work, refining the system day after day.
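
The routing idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a description of a production system: the 0.9 threshold, the field names and values, and the `FeedbackLog` class are all hypothetical.

```python
from dataclasses import dataclass, field

REVIEW_THRESHOLD = 0.9  # hypothetical cutoff; a real system would tune this


@dataclass
class FieldPrediction:
    name: str
    value: str
    confidence: float  # per-field confidence score in [0, 1]


def fields_needing_review(predictions: list, threshold: float = REVIEW_THRESHOLD) -> list:
    """Names of the fields whose confidence falls below the threshold."""
    return [p.name for p in predictions if p.confidence < threshold]


def route(predictions: list, threshold: float = REVIEW_THRESHOLD) -> str:
    """Clear the invoice automatically only if every field is confident."""
    return "human_review" if fields_needing_review(predictions, threshold) else "auto_post"


@dataclass
class FeedbackLog:
    """Collects the expert's decisions as future training examples."""
    examples: list = field(default_factory=list)

    def record(self, prediction: FieldPrediction, final_value: str) -> None:
        self.examples.append({
            "field": prediction.name,
            "predicted": prediction.value,
            "final": final_value,
            "corrected": prediction.value != final_value,
        })


routine = [FieldPrediction("vendor_code", "V-1042", 0.98),
           FieldPrediction("gl_account", "6100", 0.95)]
hard = [FieldPrediction("vendor_code", "V-1042", 0.97),
        FieldPrediction("gl_account", "6100", 0.55)]

print(route(routine))
print(route(hard), fields_needing_review(hard))

# The accountant overrides the uncertain field; the correction is kept.
log = FeedbackLog()
log.record(hard[1], "6300")
print(log.examples)
```

Nothing extra is asked of the accountant here: the correction they would have made anyway is what gets recorded.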

What if we reach a point where the AI has absorbed everything the human expert knows? While this is theoretically possible, I currently believe that many problems are hard enough, and enough of a "moving target," to keep that point out of reach. More precisely, as we solve old problems and advance our knowledge, new problems will surface. And even if the AI could take a first swing at those problems as well, regulatory and business requirements will continue to demand human judgment. Humans will do well as long as they stay ahead of the machine. Anyone who outsources their judgment and creativity to an AI will probably soon be replaced by that AI. Having said that, let me quote an almost poetic line generated by Claude Opus 4.8 (which it may have found elsewhere): "The depth lives in the expert; the AI is how that depth scales."

Looking Ahead

Not everyone will agree, and I am obviously simplifying things a little, but looking at the latest generative AI models, I believe the "breadth race" is largely over. To be more precise, it is heading toward a tie where the best general models are converging and access to them is increasingly universal. What remains scarce—and therefore valuable—is depth. The people who understand a domain well enough to see what the general model cannot, and who can fold that understanding into a system that learns, are the ones who will define what comes next.
