Legal Gen AI Megatrend 2024: Measuring Metrics

For the sharp-eyed among you: you might have noticed that we released the video recording with Alex Smith on YouTube a week before we published this blog piece.

It was hard to write this third piece in our mini-series on Gen AI megatrends in legal. As we move deeper into 2024, there are more and more technology providers offering generative AI capabilities, and increasingly aggressive sales pitches. (It is almost as if all the money poured into the technology needs to be recouped somehow…) We were conscious that we did not want to turn up the heat in an already hot market. So, it took us a little time to find the right words to describe this third megatrend: the need to measure the performance of the technology.

In our conversations with customers, we have observed that generative AI-enabled tools have made the adoption of technology much smoother in the legal vertical, but we have not yet seen any evidence that generative AI has increased the stickiness of those tools. At the core of what lawyers do is trust, and so the tools that serve them need to be reliable.

The need for reliability seems to have been an overlooked factor in procurement cycles.

Unlike more traditional machine learning technologies, measuring the performance of generative AI tools has been somewhat more difficult due to the fuzzy nature of the outputs and the subjective judgements often required to determine whether an output is “good” or “bad”. As the question or task becomes more complex, assessing the model’s performance also becomes more challenging. A little like pornography, today’s measure for the accuracy of generated outputs is “you will know it when you see it”. That is going to be an inadequate measure for lawyers, for whom every detail matters and even minor inaccuracies can lead to significant consequences.

We expect to see the development of metrics for evaluating the accuracy of generative AI in legal contexts as 2024 progresses. At a minimum, we expect these metrics will account for various dimensions of accuracy, including factual correctness, relevance, and adherence to legal standards and terminology. Importantly, each law firm will likely have its own assessment measures to check whether the outputs of AI tools meet its standards and match its style.
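To make this concrete, a multi-dimensional assessment like the one described above could be sketched as a simple scoring rubric. The dimension names, weights, and thresholds below are purely illustrative assumptions on our part, not an established legal-AI standard; each firm would pick its own.

```python
from dataclasses import dataclass

@dataclass
class EvalResult:
    """Scores for one generated output, each on a 0.0-1.0 scale.

    The three dimensions mirror the ones discussed above; the weights
    and thresholds are illustrative assumptions, not a standard.
    """
    factual_correctness: float  # checked against source documents
    relevance: float            # does it answer the question asked?
    legal_style: float          # adherence to legal standards and terminology

    def weighted_score(self, weights=(0.5, 0.3, 0.2)) -> float:
        # Combine the dimensions into a single number for comparison.
        dims = (self.factual_correctness, self.relevance, self.legal_style)
        return sum(w * d for w, d in zip(weights, dims))

    def passes(self, threshold: float = 0.8) -> bool:
        # A firm might impose a hard floor on factual correctness,
        # since even minor inaccuracies carry significant consequences.
        return self.factual_correctness >= 0.9 and self.weighted_score() >= threshold

result = EvalResult(factual_correctness=0.95, relevance=0.9, legal_style=0.7)
print(round(result.weighted_score(), 3), result.passes())
```

The point of the sketch is the structure, not the numbers: separating dimensions makes it possible to see *where* a tool falls short, rather than relying on a single "know it when you see it" impression.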

This third megatrend will likely take a little while to reach the mainstream. We expect to see many more aggressive sales pitches before the selection of AI technologies is guided by rigorous evaluation against accuracy metrics. In the meantime, at least we can rely on the dry humor of Alex Smith.

Previous: Webinar notes: Lessons from AI Builders

Next: Legal Gen AI Megatrend 2024: Tailored Use Cases