The 60% Problem: Why GenAI Adoption Is Challenging for Law Firms
Over the past 12 months, our conversations with associates and partners at law firms have revealed a striking pattern that may explain why generative AI adoption faces unexpected headwinds in the legal profession. The pattern centers on a critical quality threshold: GenAI tools seem to deliver work products that are approximately 60% “good enough” for most tasks.
This threshold has profound implications: it creates a fundamental disconnect between those who buy AI tools and those who actually use them daily.
The Buyer's Perspective: 60% Looks Like Success
From the perspective of decision-makers (usually partners and practice leaders, who control purchasing but are rarely the frontline users), 60% “good enough” represents a phenomenal achievement. Compared to starting from a blank page or an unamended precedent document, a work product that's more than halfway done feels transformative.
For these buyers, the value proposition is compelling on multiple levels:
Efficiency gains: Starting at 60% versus 0% represents significant time savings
Quality improvements: AI-generated first drafts often surpass what junior associates might produce initially
Better time allocation: Fewer billable hours needed to reach a finished product means associates can take on more matters
The arithmetic seems straightforward: if you can consistently start projects 60% complete rather than from scratch, you should see substantial productivity gains across the firm.
The User's Dilemma: 60% Isn't Good Enough to Ship
However, the reality for associates and other legal professionals actually using these tools tells a different story. A work product that's 60% “good enough” falls well short of what can be delivered to clients, whether internal or external.
Top-tier law firms and their clients expect perfect (or, at least, near-perfect) work products. While not every document needs to be 100% polished, the threshold for client-ready work typically sits much higher than 60%. This creates a critical gap that users must bridge through their own expertise and effort.
The challenge isn't simply completing the remaining 40% of the work - it's identifying which 40% needs attention before the gap can be bridged. Users therefore face two distinct but related problems:
Quality assessment: Determining which portions of the AI-generated content are incomplete, inadequate, or incorrect
Remediation: Making the necessary corrections to bring the work product to acceptable standards
If legal professionals knew exactly which sections needed improvement, the 60% starting point would indeed provide substantial value. In most cases, however, they don't have this clarity. The uncertainty forces them into a workflow more complex than the one they would follow without AI assistance.
Perhaps more significantly, this process creates an unexpected emotional burden. Rather than feeling like they're building something from the ground up - a familiar and often satisfying process - users find themselves in a detective role, hunting for errors in seemingly complete work.
This task-switching between creation and evaluation modes can also become mentally exhausting. Some lawyers have told us that they spend roughly 20% of their time identifying problems and the remaining 80% implementing fixes. While this may still represent a time savings and a quality improvement compared to starting from scratch, the cognitive load of checking mediocre AI output often feels heavier and less rewarding than drafting does.
The Adoption Paradox
This disconnect between buyer expectations and user experience helps explain a trend we are observing in the market: a non-trivial share of law firms are reducing their AI tool seat licenses from year one to year two.
The phenomenon appears particularly pronounced for high-stakes work: the higher the quality requirements, the more frustrating users find the 60% performance. For routine, lower-stakes, or internal-facing tasks, where middling quality might suffice, AI tools can perform admirably. But these aren't typically the types of work that drive revenue at top-tier firms or justify premium client billing.
Looking Forward: Beyond the 60% Problem
To be clear, most buyers and users agree that it is better to have access to an LLM platform than not. The question is not whether generative AI platforms have value (they undoubtedly do). The question is how to improve the experience of the people who actually use these models and platforms.
In our opinion, understanding the 60% “good enough” dynamic is crucial for both legal technology vendors and law firm leadership. The path to a better user experience and broader adoption isn't necessarily improving AI performance from 60% to 80% or 90% (though that would certainly help). Instead, the focus should be on:
Task selection: Focus the software on specific, well-defined legal tasks where it can excel, rather than attempting broad applicability
Transparency: Build tools that clearly indicate their confidence in different sections of generated content, so users know where to direct their review
The 60% problem isn't unique to legal services, but the profession's quality requirements make it particularly acute. Recognizing this threshold effect, and the buyer-user split it creates, is the first step toward building tools that truly serve both buyers and users in the legal profession.
For law firms considering AI investments, success will likely come not from expecting wholesale replacement of traditional workflows, but from identifying specific use cases where the 60% threshold genuinely accelerates rather than complicates the path to client-ready work products.