

Explaining the Legal Engineering Capability Model

In this post, our Chief Scientific Officer and Legal Engineer, Dr Ben Gardner, continues a series of articles that are focused on education and legal engineering.

This post elaborates on Wavelength's legal engineering 'capability model' and examines each of the key capabilities. Understanding these tools and what they enable is central to the development of well-engineered and useful solutions. 


In a previous post I introduced Wavelength’s legal engineering capability model and explained the importance of unpacking what Artificial Intelligence means in more granular terms.

We like to think of the capabilities within our model as the tools on our workbench. More accurately, these are activities (or groupings of related activities) that can be combined to solve a problem at hand.

As legal engineers, our first task is always to understand and define a problem. The output of this process is a ‘design pattern’ that we can either implement ourselves or use to identify the best potential vendor products for a client.

The difference between a capability and a vendor product can be illustrated if we consider document automation as a capability i.e. as the core activity of creating draft documents by populating a template with input data. A vendor product that centres around a document automation capability may also include other capabilities including workflow, negotiation, and expert systems to support the additional activities of data capture, reviewing, and signing.

Any given vendor product will likely combine multiple capabilities into a solution. We cannot really know its relevance to you until we first understand your problem. Wavelength’s capability model provides the basis of a language for thinking about, defining, and solving legal engineering problems.

This post describes the different capabilities that make up this model by first focusing on the four core capabilities of:

  1. text analytics and extraction
  2. expert systems
  3. document automation
  4. data visualisation

and then discusses the interaction layer and the data structure & integration layer.

[Figure: Wavelength Legal Engineering Capability Model]

Text analytics and extraction

The text analytics and extraction capability breaks down into three sub-capabilities: text extraction, document clustering and document comparison.

Text extraction tools provide the capacity to extract pieces of text from an unstructured source (like a register of title in the image below), and to create a structured set of information (such as a database or Excel table). These tools use supervised machine learning to train algorithms to recognise patterns in the documents. A pattern could be as simple as a date or a company name, or more complex such as a specific clause. In all cases, a user needs to provide the extraction tool with examples of each pattern he or she wants it to extract. This is done by manually tagging and marking up several documents and then using this tagged set of documents to train the extraction tool to recognise the pattern. In addition to extracting text, these tools can be used to automate application of metadata to a document, or drive automated redaction based on pattern recognition.


[Figure: Text Extraction]
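
To make the input and output of text extraction more concrete, the short Python sketch below turns a snippet of unstructured text into a structured row. Please note the assumption: it uses hand-written regular expressions as a stand-in for the trained model described above, whereas a real extraction tool would learn these patterns from tagged examples. The sample text, field names and patterns are all invented for illustration.

```python
import re

# Hypothetical snippet of unstructured text (not taken from a real register of title).
SAMPLE = (
    "Title number AB123456. Registered on 12 March 2015. "
    "Proprietor: Example Holdings Limited of 1 High Street, London."
)

# Hand-written patterns standing in for what a trained model would learn
# from tagged examples (an assumption made to keep the sketch small).
PATTERNS = {
    "title_number": re.compile(r"Title number ([A-Z]{1,3}\d+)"),
    "registration_date": re.compile(r"Registered on (\d{1,2} [A-Z][a-z]+ \d{4})"),
    "proprietor": re.compile(r"Proprietor: (.+? Limited)"),
}


def extract_fields(text):
    """Return a dict with one entry per pattern; None where nothing matched."""
    row = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        row[field] = match.group(1) if match else None
    return row


row = extract_fields(SAMPLE)
print(row)
```

Each row produced this way could be written to a spreadsheet or database, which is the structured output the paragraph above refers to.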

Document clustering is the capacity to sort a set of documents into ‘similar’ groupings. Document clustering tools sort the documents based on the overall similarity in structure of the documents (e.g. headings, sections, clauses, etc.) as well as commonality of the content. This means that documents of a similar type can be grouped together.

Examination of each cluster by a user is required to determine the type of document in each cluster. For example, in the diagram below, after one round of clustering four document types have been identified:

  1. resolutions,
  2. consulting contracts,
  3. invoices, and
  4. commercial contracts.

Subsequent rounds of clustering can be performed on a grouping to identify sub-clusters e.g. different types of resolutions or variations in the template used to draft consulting contracts. It should also be noted that clustering techniques can be used to group extracted clauses as well as whole documents.

[Figure: Document Clustering]
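
The idea of grouping 'similar' documents can be sketched in a few lines of Python. The example below is a deliberately simple assumption rather than the method used by any particular product: it measures similarity by word overlap (Jaccard similarity) and greedily places each document in the first group that is similar enough. It ignores document structure, the documents are invented one-liners and the threshold of 0.3 is arbitrary.

```python
import re


def tokens(text):
    """Lower-case word set for a document (digits are ignored)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Share of words two documents have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster(docs, threshold=0.3):
    """Greedy clustering: put each document in the first cluster whose
    first member is similar enough, otherwise start a new cluster."""
    clusters = []  # each cluster is a list of document names
    for name, text in docs.items():
        for group in clusters:
            if jaccard(tokens(text), tokens(docs[group[0]])) >= threshold:
                group.append(name)
                break
        else:
            clusters.append([name])
    return clusters


# Invented one-line "documents" for illustration.
DOCS = {
    "doc1": "Invoice number 101 amount due within 30 days",
    "doc2": "Invoice number 102 amount due within 14 days",
    "doc3": "The board resolved to approve the annual accounts",
    "doc4": "The board resolved to approve the share allotment",
}

groups = cluster(DOCS)
print(groups)
```

As in the description above, the code only produces the groupings; a person still has to look at each group to decide that one contains invoices and the other resolutions.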

Document comparison with respect to text analytics refers to the ability to compare and analyse differences across a whole set of documents as opposed to the more traditional document-to-document comparison.

The ability to compare across a whole portfolio at once allows the identification of representative standard documents based on the actual working practices of those who drafted the documents (as opposed to a comparison against an idealised ‘gold standard’).

The degree of variance compared to this standard can also be calculated and visualised. As with clustering, this collection-wide comparison can also be applied to clauses enabling more granular analysis.

[Figure: Document Comparison]
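
A minimal Python sketch of collection-wide comparison is shown below. It assumes that the 'representative standard' is simply the document with the highest average similarity to all the others, and it uses the standard library's difflib.SequenceMatcher as the similarity measure. Both are illustrative choices, and the clause variants are invented.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Similarity ratio between two texts (1.0 means identical)."""
    return SequenceMatcher(None, a, b).ratio()


def representative(docs):
    """Name of the document with the highest average similarity to the others."""
    def average_similarity(name):
        others = [n for n in docs if n != name]
        return sum(similarity(docs[name], docs[n]) for n in others) / len(others)
    return max(docs, key=average_similarity)


def variance_from(docs, standard):
    """Distance of every document from the chosen standard (0.0 means identical)."""
    return {n: 1.0 - similarity(docs[standard], text) for n, text in docs.items()}


# Invented clause variants for illustration.
CLAUSES = {
    "a": "The tenant shall pay the rent monthly in advance.",
    "b": "The tenant shall pay the rent quarterly in advance.",
    "c": "The tenant shall pay the rent monthly in advance.",
    "d": "Rent is payable on demand by the landlord.",
}

standard = representative(CLAUSES)
distances = variance_from(CLAUSES, standard)
print(standard, distances)
```

The distances give a rough measure of how far each clause departs from actual working practice, and they are the kind of figure that could then be visualised.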

Expert systems

Expert systems first became popular in the late 1980s and early 1990s. These tools allow the user to construct a decision tree or similar model that describes a process. Typically, this is composed of a series of bifurcating decisions and conditional logic statements, i.e. “if this, then that”. Users answer a series of questions and, based on their responses, are directed down the appropriate path (as shown in the diagram below).

In addition to using the data captured in response to the questions asked, these tools can pull in data from other sources and perform calculations before providing the user with a response.

A classic application of expert systems is to build automated triage systems that allow scaling of decision-making processes where demand outstrips available human resources. More recently, these tools can be found providing the back-end logic that drives many chat-bots.

[Figure: Expert Systems]
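
The “if this, then that” structure of an expert system can be sketched as a small decision tree in Python. The questions, thresholds and outcomes below are invented for illustration, and real tools typically add calculations and data from other sources on top of this basic pattern.

```python
# A decision tree as nested dicts: each node is either a question with
# "yes"/"no" branches or a final outcome. The content is hypothetical.
TREE = {
    "question": "Is the contract value above 100,000?",
    "yes": {
        "question": "Is the counterparty a new client?",
        "yes": {"outcome": "Refer to a senior lawyer"},
        "no": {"outcome": "Refer to the contracts team"},
    },
    "no": {"outcome": "Use the self-service template"},
}


def triage(tree, answers):
    """Walk the tree using a dict that maps each question to 'yes' or 'no'."""
    node = tree
    while "outcome" not in node:
        answer = answers[node["question"]]
        node = node[answer]
    return node["outcome"]


result = triage(TREE, {
    "Is the contract value above 100,000?": "yes",
    "Is the counterparty a new client?": "no",
})
print(result)
```

Only the questions on the chosen path need an answer, which mirrors how a user is directed down one branch of the tree.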

Document automation

Document automation solutions are probably the most pervasive of these capabilities in the legal sector, with many law firms and legal departments already using various document/contract automation products.

This capability helps to streamline and scale the creation of standard documents, while enforcing best practice and enabling the creation of self-service solutions.

At a high level, these solutions require a subject matter expert to create a template into which data is inserted to create a first draft of a document. This can be as simple as auto-populating specific terms (e.g. party names and dates), or it can involve selecting optional clauses and, in some cases, selecting between versions of the same clause. The data and choices can be gathered via a question-answer interface or via a spreadsheet or equivalent.

[Figure: Document Automation]
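
As a rough sketch of the template-plus-data idea, the Python example below fills a template with party names and a date and selects between versions of an optional clause. It uses the standard library's string.Template. The template wording, clause versions and input data are all hypothetical.

```python
from string import Template

# Hypothetical template; $name placeholders are filled from the input data.
TEMPLATE = Template(
    "This agreement is made on $date between $party_a and $party_b.\n"
    "$confidentiality_clause"
)

# Alternative versions of the same optional clause (invented wording).
CONFIDENTIALITY = {
    "none": "",
    "mutual": "Each party shall keep the other's information confidential.\n",
    "one_way": "$party_b shall keep $party_a's information confidential.\n",
}


def draft(data, confidentiality="none"):
    """Create a first draft from the answers and the chosen clause version."""
    clause = Template(CONFIDENTIALITY[confidentiality]).substitute(data)
    return TEMPLATE.substitute(data, confidentiality_clause=clause)


# Answers as they might arrive from a questionnaire or a spreadsheet row.
answers = {"date": "1 May 2024", "party_a": "Alpha Ltd", "party_b": "Beta LLP"}
first_draft = draft(answers, confidentiality="one_way")
print(first_draft)
```

The same draft function could be run once per spreadsheet row to produce a batch of first drafts.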

Data visualisation

Data visualisation represents the ability to display information in a visual fashion (some examples are set out in the diagram below). A good visualisation helps to turn raw data, e.g. a spreadsheet, into accessible information that supports making actionable decisions. Data visualisation as a discipline is widely used by those working with data, for example in financial reporting functions.

Within the legal sector, however, such tools have not been widely used. One reason for this may be that valuable legal information is often trapped within unstructured Word or PDF documents. However, text extraction capabilities can enable data to be extracted and placed into alternative formats (e.g. spreadsheets and databases) to which data visualisation tools and techniques can be applied.

This offers the opportunity to provide new insights into a portfolio of documents. For example, from a set of real estate lease agreements it would be possible to visualise how rent revenues will change over time, or to provide insight into the types of restrictions present, or to display where the real estate properties are geographically located on a map.

[Figure: Data Visualisation]
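
The lease example can be sketched in Python as follows. The code assumes that the lease terms have already been extracted into structured records, totals the rent due in each year and prints a very basic text bar chart. The lease data is invented, and a real solution would use a proper charting tool rather than text bars.

```python
# Invented lease data, as it might look after text extraction.
LEASES = [
    {"property": "Unit 1", "annual_rent": 20000, "start": 2024, "end": 2026},
    {"property": "Unit 2", "annual_rent": 30000, "start": 2025, "end": 2027},
    {"property": "Unit 3", "annual_rent": 10000, "start": 2024, "end": 2024},
]


def rent_by_year(leases):
    """Total rent due in each year covered by at least one lease."""
    totals = {}
    for lease in leases:
        for year in range(lease["start"], lease["end"] + 1):
            totals[year] = totals.get(year, 0) + lease["annual_rent"]
    return dict(sorted(totals.items()))


def bar_chart(totals, unit=10000):
    """One text bar per year; each '#' represents `unit` of rent."""
    return "\n".join(
        f"{year} {'#' * (total // unit)} {total}" for year, total in totals.items()
    )


totals = rent_by_year(LEASES)
print(bar_chart(totals))
```

Even this crude chart shows at a glance when rent revenue rises and falls as leases start and expire, which is hard to see in the underlying documents.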

Interaction and Integration layers

Interaction layer

From a legal engineering perspective, interactions between people within teams, across departments or between a firm and a client fall into three broad categories: collaboration, workflow, and negotiation.


Collaboration
This is a well-established capability that centres around providing teams with shared working spaces where they can coordinate, share documents and manage day-to-day and week-to-week activities.

Workflow
This capability covers the ability for a team to connect the various tasks that make up a process. It provides the capacity both to automate a process and to monitor the progress of individual tasks. Workflows are particularly powerful where a process crosses business lines or group boundaries.

Negotiation
This is an emerging capability that is focused on helping overcome the challenges associated with the back and forth of negotiating. This is a broad category and covers many scenarios, from document review through to facilitated online dispute resolution.

Data structure & integration layer

The capabilities in this layer represent different ways that systems and data can be connected. The focus is on how the output of one step can be used as the input to the next. There is a large range of approaches that can be applied, but of particular note are enterprise knowledge maps, robotic process automation and service-oriented architecture.


Enterprise knowledge maps
This is a new approach to aggregating data from multiple sources. An enterprise knowledge map groups data around key concepts that are of interest to a business.  For a law firm, this means clients, matters and people, and the relationships that connect them. In essence, an enterprise knowledge map connects data with context.
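
One simple way to picture an enterprise knowledge map is as a list of relationships between clients, matters and people. The Python sketch below stores these as (subject, relationship, object) triples and answers a question that spans them. The representation, the names and the relationship labels are all assumptions made for illustration, not a description of any particular product.

```python
# Invented data: (subject, relationship, object) triples linking
# clients, matters and people.
TRIPLES = [
    ("Acme Ltd", "is client on", "Matter 001"),
    ("Acme Ltd", "is client on", "Matter 002"),
    ("Jane Smith", "works on", "Matter 001"),
    ("Raj Patel", "works on", "Matter 002"),
    ("Jane Smith", "works on", "Matter 002"),
]


def related(entity, triples):
    """Every relationship that mentions the entity, in either direction."""
    return [t for t in triples if entity in (t[0], t[2])]


def people_for_client(client, triples):
    """People connected to a client through the matters they share."""
    matters = {o for s, r, o in triples if s == client and r == "is client on"}
    return sorted({s for s, r, o in triples if r == "works on" and o in matters})


team = people_for_client("Acme Ltd", TRIPLES)
print(team)
```

The second function is an example of connecting data with context: neither the client list nor the timesheet data alone says who works for which client, but the relationships between them do.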

Robotic process automation
This is a class of technology that can be used to connect systems and automate the execution of a process across multiple systems. For example, automating a matter-opening process could involve opening a matter, creating entries in a practice management system, creating new folders in a document management system and creating a new time code in a time management system.
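
The matter-opening example can be sketched in Python as a single function that carries out each step in turn. The three dictionaries below are in-memory stand-ins for separate systems, and the folder names and time code format are invented. A real robotic process automation tool would drive the actual applications, often through their user interfaces.

```python
# In-memory stand-ins for three separate systems (an assumption made so the
# sketch can run on its own).
practice_management = {}
document_management = {}
time_management = {}


def open_matter(matter_id, client):
    """Run each step of a hypothetical matter-opening process in order."""
    log = []
    practice_management[matter_id] = {"client": client, "status": "open"}
    log.append("practice management entry created")
    document_management[matter_id] = ["Correspondence", "Drafts", "Signed"]
    log.append("document folders created")
    time_management[matter_id] = f"TC-{matter_id}"
    log.append("time code created")
    return log


steps = open_matter("M001", "Acme Ltd")
print(steps)
```

The returned log is a reminder that an automated process should still leave a record of what was done in each system.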

Service-oriented architecture
This is an approach for connecting applications, enabling information to be passed from one application to another, typically via application programming interfaces (APIs). From the legal engineering perspective, the ability to extract information from one source and feed it into another is critical to realising synergies by combining solutions with different capabilities.
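
The Python sketch below illustrates the idea of passing information from one application to another in a shared format. The two functions are stand-ins for separate applications, and the JSON message stands in for what would normally travel over an API call. The function names, message fields and sample text are hypothetical.

```python
import json


def extraction_service(text):
    """Stand-in for an extraction application: returns its result as JSON."""
    landlord, tenant = text.split(" leases to ")
    return json.dumps({"landlord": landlord, "tenant": tenant})


def automation_service(payload):
    """Stand-in for a document automation application: consumes JSON."""
    data = json.loads(payload)
    return f"Notice to {data['tenant']} from {data['landlord']}."


# The output of one application becomes the input of the next.
message = extraction_service("Alpha Ltd leases to Beta LLP")
notice = automation_service(message)
print(notice)
```

Because the two sides agree only on the message format, either application could be swapped for a different product without changing the other.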


In the next post we will explore using the capability model in more detail by applying it to a number of legal use-cases.

Ben Gardner