Speed, trust, and the growing cost of verifying quality

We spent the Christmas and New Year break thinking about where the market is heading in 2026.

A lot of folks have expressed ideas about generative AI becoming less experimental this year — about how the technology will become more integrated, how agents will become deeply entrenched, and how everything will become ambient or invisible. At the same time, others have pointed to the risk of over-reliance on the technology, and of AI eroding our ability to think independently.

(As I write this, the soundtrack from Interstellar is playing in the background, so imagine this is quite dramatic and potentially world-ending.)

AI is not liable (nor reliable)

We have already seen a few risks emerge for legal work. How do we handle hallucinations? Who will be the senior lawyers in 10 years if we don't have proper training today? And so on. Many of these challenges are being addressed through improvements to technologies and processes… but the rules of the profession require that someone be ultimately responsible for legal work… the buck has to stop somewhere. For legal professionals, that usually means a person should verify the accuracy of what is produced. (We say "should" with full awareness that much work is already done without human oversight.)

The time spent verifying accuracy is what some folks are calling the "verification tax". This tax is paid regardless of whether the work was initially produced by a human, e.g. a junior lawyer, or by an AI. It is the cost of QA and QC. It is the cost of creating trusted legal work.

The legal profession is accustomed to paying a verification tax to check work done by juniors. The leverage model of law firms is built so that drafts can be produced by juniors, revised by mid-levels, and then signed off by seniors. Clients pay good money for that process. It isn't fast, but it creates work products that clients can trust.

AI reliance kills quality control

We spoke with a partner at an AmLaw 50 firm, and he was wonderful in giving us time to talk about an experiment he ran with one of his clients.

Traditionally, at this client, a junior would do research and put together draft materials that would get reviewed and revised by a senior person. The draft the junior produces might be a B/B+ in terms of quality. Once the senior has revised the draft, it hits the A+ quality necessary to go to the client.

This partner ran an experiment with his client where they used GenAI to modify their workflow. The proposal was to have GenAI write the first draft. They expected the GenAI tool would produce C/C+ quality of work. This would be handed to the junior who would polish the work to B grade, and then the senior lawyer would take the work up to A grade.

In reality, they found that with good prompting and information engineering, the GenAI tool was able to reach B/B+ quality of work. But then, when the draft was handed to a junior or a senior, no one was able to improve the work to the expected A/A+ grade. No one had done the groundwork necessary to make proper corrections.

AI cannot be trusted to verify itself

GenAI was supposed to shift the role of lawyers from authors to editors. The theory was that humans would become centaurs with AI.

Better. Faster. Cheaper.

We're beginning to see that human-AI centaurs are not simply augmentations of existing workflows — they change the quality of the final output altogether. If humans haven't put in the groundwork, they're not able to improve on what AI prepared. The real problem is not speed or efficiency. The problem is quality.

We've known this for a while. Concocting a string of words is not the hard part. The hard part is knowing what work is right.

Using LLMs to verify the work of LLMs introduces the same error risks as using LLMs to create the work in the first place. If GenAI can produce text endlessly, yet cannot be trusted to edit its own work, then human experts must validate, verify, and improve. When outputs increase in volume, they also increase in uncertainty.

Lawyers need to be able to figure out what's right and what's wrong, and be given relevant information to support their judgment. As GenAI accelerates the drafting of text, as it performs the initial analyses, humans need tools to accelerate how they assess whether drafts and analyses are actually right.

Trust is now the bottleneck.

Accountants vs auditors

If you were a company preparing your financial accounts, you wouldn't want your accountants to audit their own work. So if you're a lawyer using GenAI to prepare your first draft, would you want GenAI to also audit the output?

Some vendors are selling their tools as both the creators and auditors of content. This seems inherently problematic. There's a vested interest in the tool not pointing out problems that it created in the first place. Just like an accountant auditing their own work — the accountant is incentivized to tell you they got everything right. Only an independent auditor would have the obligation and vested interest to point out the initial accountant's errors.

We think this separation of duties between creators and auditors is going to become critical for generative AI going forward.

Trust is the bottleneck.

Verification will be key in 2026

By the end of this year, every large law firm will have adopted some GenAI tool. Mid-sized firms and in-house teams too. Harvey, Legora, ChatGPT, Claude — everyone's going to have something.

This means lawyers using these tools will get very little competitive advantage just from having them. Lawyers can leverage their knowledge to build their own workflows and automations. Lawyers may embed their knowledge and precedents within the tools to gain an edge… but no matter what comes out the other end of an AI-augmented workflow, unless we come to trust AI to complete a piece of work, the lawyer's job is still to review and verify the work product.

This is where Syntheia is focused in 2026. This is the gap we see in the market.

We are not yet seeing others building tools that help speed up human verification. This is a gap that GenAI platforms cannot fix, and from a governance perspective should not be the ones to fix. Verifying legal work requires a different approach than generating legal work. It requires a different technology, operating under a different set of principles.

GenAI can deliver 80% of the output. But the final 20% needs a human expert to apply their judgment and experience. The hardest part of legal work has never been drafting — it's knowing the work is right. As we already said, AI increases output, but it also increases uncertainty.

We'll share more thoughts in future blog posts on how we plan to help human experts close that last 20% faster, better, and more affordably, while retaining trust.

Happy new year!
