Making the law machine-readable

In this post, Dr Ben Gardner, Legal Engineer and Chief Scientific Officer at Wavelength.law Limited, begins a series of articles on making the law machine-readable.

Much has been written about how artificial intelligence will change the legal profession. To date, these discussions have largely focused on text analysis tools that extract specific data elements from documents to create structured content. Such tools identify patterns in data (e.g. dates, currency values, names, images, etc.) by using machine learning algorithms to construct statistical models.

In the legal sector, the main application for this text analysis capability is the creation of structured (e.g. tabular) content from unstructured or semi-structured documents (e.g. contracts). When someone uses such a tool to extract elements from a document (e.g. clauses, parties, dates, etc.), these elements are made machine-identifiable or explicit to the machine. Once extracted and ordered, in a spreadsheet or database, this data can be algorithmically analysed or transformed via encoded business rules to support a decision-making process.
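As a rough illustration of that last step, the sketch below applies an encoded business rule to data that has already been extracted into rows. The lease records, field names, thresholds and the `flag_for_review` function are all hypothetical, invented for this example rather than taken from any particular tool.

```python
from datetime import date

# Hypothetical rows, as a text analysis tool might export them from a set of leases.
extracted_leases = [
    {"document": "lease_001.pdf", "tenant": "Acme Ltd",
     "expiry": date(2025, 3, 31), "annual_rent": 120000},
    {"document": "lease_002.pdf", "tenant": "Birch LLP",
     "expiry": date(2031, 9, 30), "annual_rent": 45000},
    {"document": "lease_003.pdf", "tenant": "Cedar plc",
     "expiry": date(2025, 11, 30), "annual_rent": 80000},
]


def flag_for_review(rows, as_of, within_days=365, rent_threshold=100000):
    """Illustrative business rule: flag leases that expire soon or carry high rent."""
    flagged = []
    for row in rows:
        reasons = []
        days_left = (row["expiry"] - as_of).days
        if 0 <= days_left <= within_days:
            reasons.append("expires soon")
        if row["annual_rent"] >= rent_threshold:
            reasons.append("high rent")
        if reasons:
            flagged.append((row["document"], reasons))
    return flagged


review_list = flag_for_review(extracted_leases, as_of=date(2025, 1, 1))
for document, reasons in review_list:
    print(document, "->", ", ".join(reasons))
```

The rule works only because the dates and rents are explicit fields. It knows nothing about what the surrounding clauses mean.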

There are several examples of where this approach has been applied, ranging from due diligence activities (such as lease reviews, M&A transactions, etc.) through to classical eDiscovery and technology assisted review (TAR) processes. Undoubtedly, the smart application of these technologies has delivered, and will continue to deliver, significant efficiencies and an evolution in the way legal services are delivered.

However, where there is a need to ‘understand’ text (i.e. to understand what is meant by the words), rather than simply to identify ‘patterns’ in data, the uses for data extracted by these tools are limited. These limitations can only be overcome by rendering a document machine-readable. In terms of capability, the difference here is between (i) a machine identifying a block of text as a clause, and (ii) a machine ‘reading’ that block of text as a clause of a particular type that contains references to specific data elements (e.g. interest rates, dates, parties, etc.).

For computers to ‘understand’ both the elements mentioned in a block of text and the relationships that exist between those elements, it is necessary to identify the semantic meaning of the information contained within the text.
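One way to picture the two levels of capability is as two data structures. The sketch below is a simplified illustration, not a real legal ontology. The clause text, the `InterestPayment` type, the predicate names and the hand-written annotations are all assumptions made for the example. Relationships are modelled as simple (subject, predicate, object) triples.

```python
from dataclasses import dataclass, field


@dataclass
class IdentifiedClause:
    """Level (i): the machine only knows that this span of text is a clause."""
    text: str


@dataclass
class ReadClause:
    """Level (ii): the clause has a type, and its elements and their
    relationships are explicit as (subject, predicate, object) triples."""
    text: str
    clause_type: str
    triples: list = field(default_factory=list)

    def objects(self, subject, predicate):
        """Return every object linked to `subject` by `predicate`."""
        return [o for s, p, o in self.triples if s == subject and p == predicate]


text = ("The Borrower shall pay interest to the Lender "
        "at 5% per annum from 1 January 2020.")

identified = IdentifiedClause(text)

# Hypothetical annotations, written by hand for this example.
read = ReadClause(
    text,
    clause_type="InterestPayment",
    triples=[
        ("Borrower", "pays_interest_to", "Lender"),
        ("InterestPayment", "has_rate", "5% per annum"),
        ("InterestPayment", "starts_on", "2020-01-01"),
    ],
)

payee = read.objects("Borrower", "pays_interest_to")
rate = read.objects("InterestPayment", "has_rate")
print(payee, rate)
```

With only `identified`, a program can display or count the clause. With `read`, it can answer questions such as who is paid and at what rate, which is the kind of ‘understanding’ discussed above.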

From a legal execution perspective, this is central to efforts in FinTech (e.g. smart contracts[1]), RegTech (e.g. model-driven reporting[2]), and the creation of machine-readable legislation[3]. It could also transform access to justice (e.g. by automating dispute resolution)[4], enhance legal research[5] (e.g. by presenting users with more sophisticated and accurate search results), and re-design the interface between people and law[6].

The potential impact of machine-readable legal documentation is huge, as is the task of enabling it. The challenge requires us to overcome the technical hurdles associated with creating a smart contract while allowing for the fact that any given contract exists as part of a mosaic of interconnected regulations and legislation.

It is important to consider that no legal document exists in isolation. It exists within a legal landscape where each factor can change to a greater or lesser extent. On top of this, the law is continually evolving - through changes to legislation and statutes, custom and judicial precedent (in common law jurisdictions), publication of new regulations, or by disruption and challenge from society, new business models and technology.

The impact of this continuous evolution can clearly be seen in the complicated interaction of legislation. John Sheridan published an excellent piece of analysis looking at UK legislation[7]. The figure below from Sheridan’s article is a visualisation of the interconnectedness of one piece of UK legislation (the Companies, Audit, Investigations and Community Enterprise Act 2004) to other pieces of legislation in the UK. It represents the complexities of legal effect caused by that one Act in the statute book.

John Sheridan’s visualisation of the interconnectedness of one piece of UK legislation (the Companies, Audit, Investigations and Community Enterprise Act 2004)

A similar picture would emerge if we were to consider a contract and all the legislation and regulations that surround it. When creating machine-readable legal documentation, we therefore need to think about how the content of a document enables its own execution as well as how it is interpreted as part of a mosaic of interconnected relationships between itself, regulations, legislation, and other agreements.
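To make the ‘mosaic’ idea concrete, the sketch below treats documents as nodes in a graph and asks which of them might need re-examining when one piece of legislation changes. The identifiers and links are invented placeholders, not real citations, and the `affected_by` function is a plain breadth-first search offered only as an illustration.

```python
from collections import deque

# Hypothetical "refers to / depends on" links between documents.
depends_on = {
    "contract:loan-42": ["regulation:R-7", "act:X"],
    "contract:lease-9": ["act:Y"],
    "regulation:R-7": ["act:X"],
    "act:X": ["act:Y"],
    "act:Y": [],
}


def affected_by(changed, graph):
    """Return every document that depends, directly or indirectly, on `changed`."""
    # Invert the edges so we can walk from a document to its dependants.
    dependants = {}
    for source, targets in graph.items():
        for target in targets:
            dependants.setdefault(target, []).append(source)

    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for dependant in dependants.get(node, []):
            if dependant not in seen:
                seen.add(dependant)
                queue.append(dependant)
    return sorted(seen)


impacted = affected_by("act:X", depends_on)
print(impacted)
```

A traversal like this is only possible if every document refers to the others through shared, stable identifiers.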

It is not enough to make a document machine-readable; the data and information contained within it must also be interoperable across these different domains.

In spite of the challenges, significant efforts are being made in the pursuit of machine-readable legal documentation. In this upcoming series of blog posts, I will review some of these initiatives, breaking the topic down into the following broad themes: (i) making existing legal documents machine-readable, (ii) authoring machine-readable legal documents, and (iii) solving the ‘interconnectedness’ problem.

Thanks for reading this post. I’d like to hear your thoughts, so please get in touch, or subscribe to be the first to receive the next part in this series!

[1] Clause, Monax, OpenLaw, etc.

[2] FCA Model Driven Reporting, NESTA Anticipatory Regulations

[3] Semantic Finlex, CEN Metalex, Better rules for Government Discovery Report

[4] Artificial Intelligence and online dispute resolution, Online Dispute resolution: an artificial intelligence perspective

[5] Ross, Ravel Law, Judicata, etc.

[6] Legal Design: combining legal expertise and design to invigorate the law, Design in a world where machines are learning

[7] When laws become too complex

Ben Gardner