Opening Legal Knowledge Automation

Oct 24, 2019

Open sorcery is back in the legal tech spotlight. See e.g. https://www.merlinfoundation.org/ and http://www.slaw.ca/2019/10/23/making-mischief-with-open-source-legal-tech-radiant-law/.
So as an exercise in personal archaeology I thought I’d resurface the following paper from a couple years ago. (Those who would prefer watching rather than reading might enjoy the middle part of this keynote at a Center for Computer-assisted Legal Instruction conference.)

Ontologies and Openness in Law Practice Automation

A working paper for Legal Knowledge Systems in Action

Eighth International Conference on Artificial Intelligence and Law

St. Louis, Missouri — May 2001

Introduction

This paper describes the Open Practice Tools (OPT) project, an initiative begun at AmeriCounsel[1] in October 2000 to promote open standards and processes for law practice automation. I hope to stimulate discussion of these ideas within the AI & law community, and help to bridge the historical disconnect between legal knowledge theorists and practice system developers. You might style this: “formal knowledge modeling meets practical legal system engineering.”

Open practice premises and principles

OPT is premised on the assumption that legal technology can be advanced through open source principles and practices. We take heart in the foundational standards work admirably being pursued by organizations like Legal XML (http://www.legalxml.com). And we intend no threat to the good quality commercial software being marketed by many legal tech vendors. But we believe there to be room between those two worlds for some very productive collaboration around key aspects of legal application development. Open source concepts like “voluntary communities of interest” engaging in shared “release/test/improve cycles” have a future in law.

Building quality applications that directly support the practice of law — sometimes called “substantive” systems — will always be complex and delicate work. But that work is often more complex than it need be. Unreasonably complex. And we can do something about it.

Law office application development is still largely a cottage industry, filled with people who receive little training in software engineering, work in small shops on discrete projects, and lack access to “industrial strength” facilities or tools. The same problems get solved again and again; the same mistakes get made over and over. Systematic collection and reuse of tried-and-true solutions rarely happens inside law offices, and is almost unheard of across firms.[2]

What we have as a result are disconnected islands of automation, incompatible tools, needless reinvention, inefficient development, uneven quality, high failure rates, and lost opportunities.

Hence the basic Open Practice Tools idea: Why not try to join some of the people involved in legal application development (technologists, lawyers, librarians, knowledge managers, educators, computer scientists, etc.) in open, cross-organizational, international, platform-independent conversations about common dimensions of their work? They would settle on some standard ways to think about and implement legal applications. Practice area by practice area, participants would evolve shared “concept maps” (and eventually formal schemas) that identify the typical data elements, rules, and processes involved. They would contribute practice software examples and components to a common repository, while also following OPT principles in their own private or commercial applications. The result could be an upsurge in legal technology reusability and interoperability, triggering greater innovation, productivity, and quality throughout the sector.

The basic idea

The core idea is to come up with a framework of standards that allow developers to build practice tools that are semantically interoperable and transparently self-descriptive. Among other things, that would mean a set of agreed data schemas, covering such things as the names of data elements, their data types, and the cardinalities of their relationships (one-to-one, one-to-many, etc.). For each of a growing list of practice contexts, we would articulate a shared vocabulary of entities, attributes, relationships, and processes. Tools developed in accordance with those vocabularies — or structurally similar vocabularies with well-defined mappings to the standard — would be reasonably interoperable, in the sense that particular data and processes in one would “mean” the same thing when used in another.

For instance, for the context of a divorce proceeding there would be an enumeration of roles like husband, wife, child; relationships like the one-to-one spousal relationship and the two-to-many parental relationship; attributes like name, gender, age; goals like preserve the marriage, obtain custody, etc.; legal tasks like commence the action, file motion for temporary orders, serve notice of deposition, etc. By unambiguously identifying and naming these actors, facts, tasks, etc., we can provide a solid roadmap developers can use in building robust and “sociable” systems.
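One way to picture such a shared vocabulary is as a small typed data structure. What follows is only a rough sketch under stated assumptions: the class names (Attribute, Relationship, PracticeVocabulary), the data type labels, and the undeclared_roles check are hypothetical illustrations, not part of any published OPT standard. The roles, attributes, goals, and tasks are the ones enumerated above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Attribute:
    name: str
    data_type: str  # hypothetical type labels such as "text" or "number"


@dataclass(frozen=True)
class Relationship:
    name: str
    from_roles: tuple[str, ...]
    to_role: str
    cardinality: str  # e.g. "one-to-one", "two-to-many"


@dataclass
class PracticeVocabulary:
    context: str
    roles: list[str] = field(default_factory=list)
    attributes: list[Attribute] = field(default_factory=list)
    relationships: list[Relationship] = field(default_factory=list)
    goals: list[str] = field(default_factory=list)
    tasks: list[str] = field(default_factory=list)

    def undeclared_roles(self) -> set[str]:
        """Return roles that a relationship mentions but the vocabulary never declares."""
        used: set[str] = set()
        for rel in self.relationships:
            used.update(rel.from_roles)
            used.add(rel.to_role)
        return used - set(self.roles)


# The divorce example from the text, expressed in this hypothetical structure.
divorce = PracticeVocabulary(
    context="divorce practice",
    roles=["husband", "wife", "child"],
    attributes=[
        Attribute("name", "text"),
        Attribute("gender", "text"),
        Attribute("age", "number"),
    ],
    relationships=[
        Relationship("spousal", ("husband",), "wife", "one-to-one"),
        Relationship("parental", ("husband", "wife"), "child", "two-to-many"),
    ],
    goals=["preserve the marriage", "obtain custody"],
    tasks=[
        "commence the action",
        "file motion for temporary orders",
        "serve notice of deposition",
    ],
)

# A simple consistency check a shared repository might run on contributions.
missing = divorce.undeclared_roles()
```

Even a check this simple suggests how a named, explicit vocabulary lets tools verify one another's assumptions instead of guessing at them.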

You can think of this effort in terms of an object-oriented analysis of the facts, issues, and processes typically involved in a given practice context, or as the articulation of an ontology for that context. The result of this conceptual work might be represented in a hierarchy of programming objects in Java or Visual Basic, an XML schema, an entity-relationship diagram, or any number of formalisms. There’s been an enormous amount of work along these lines in the artificial-intelligence-and-law community and in technically advanced law offices everywhere. The point is to reach rough consensus on the conceptual architecture and express it in forms that let people write good working code.

This core effort of evolving shared ontologies for different arenas of legal practice would be embedded within a broader effort to aggregate common resources useful in practice automation — repositories of tools and example applications, best practices, design standards for legal knowledge engineering, and links to other resources.

A practical example — HotDocs

Since HotDocs is one of today’s most widely used software platforms for substantive legal practice systems, let’s consider how these ideas might play out there.

Suppose that when starting a document automation project, say, for a law firm’s wills and trusts practice, you could consult and use a framework of pre-defined variables typically needed in such a project. Many of the parties, relationships, facts, and issues would already have been named and defined by others based on their own experiences in automating similar practices, and the results of their experiences would have been refined by yet others over time. You could go to a cooperating OPT web site and download a master component file for your purposes, with much of this collective expertise already expressed in pre-built variables. Perhaps some generous souls have posted complex computations using these standard variables. Perhaps others have posted model passages or entire documents that you can download for a quick start on your project. Yet others would have entire form sets available for purchase or trade.

So you finish your automation project with much less time and frustration than in the old days. Then you discover that one of your attorneys routinely uses published form templates for some related tax filings. Fortunately, the publisher of those forms also saw fit to observe the OPT standards. And even though a second publisher of related practice tools chose not to observe those standards, a fellow user posted a mapping file on an OPT site that enables you to automate 90% of the data flow between the applications.
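The mapping file in this scenario could be as simple as a lookup table from one publisher's variable names to the shared ones. The sketch below is illustrative only: every variable name, the sample answers, and the translate function are invented for the example and do not reflect the format of HotDocs or any other product.

```python
# Hypothetical mapping from a non-conforming publisher's variable names
# to names in a shared, OPT-style vocabulary. All names are invented.
VENDOR_TO_SHARED = {
    "TestatorNm": "testator_name",
    "SpouseNm": "spouse_name",
    "ExecNm": "executor_name",
}


def translate(answers: dict[str, str], mapping: dict[str, str]) -> tuple[dict[str, str], list[str]]:
    """Rename mapped variables; collect the names the mapping does not cover."""
    translated: dict[str, str] = {}
    unmapped: list[str] = []
    for name, value in answers.items():
        if name in mapping:
            translated[mapping[name]] = value
        else:
            unmapped.append(name)
    return translated, unmapped


# Placeholder answer data as it might come out of the other publisher's tool.
answers = {"TestatorNm": "Pat Doe", "ExecNm": "Lee Roe", "FormRev": "7"}
shared, unmapped = translate(answers, VENDOR_TO_SHARED)

# Fraction of the data flow the mapping handles automatically.
coverage = len(shared) / len(answers)
```

Whatever the mapping does not cover is reported rather than silently dropped, so a user can see which part of the data flow still needs manual attention.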

And by having documents automated pursuant to a vendor-neutral and implementation-independent framework, the task of migrating them to a new platform if and when HotDocs is eclipsed by a better engine will be much more straightforward. Colleagues, adversaries, or administrative agencies using comparable tools like Rapidocs, GhostFill, or SmartWords benefit from the same data sharing and migration opportunities.

Other legal tech dimensions

Open standards for law-related data and processes are of course not limited to document assembly. Electronic filing of court and other governmental documents is an important arena in which much standards work is being done. Document management applications will also benefit from the development of shared schemes for profiling work product. Case management, relationship management, and litigation support implementations likewise are natural contexts in which developers should welcome the emergence of broad-based systems of terminology and classification. The less reinventing we do, and the more re-use and interoperability we enable, the more effective practice technologists will be. The foundations we lay today will serve us well as next generation technologies like artificial intelligence and richly interactive multimedia become commonplace in the law office.

The proposed open practice tools process

Here are the basic steps being considered as part of OPT:

1. We establish a web site as a focus of conversation and resource sharing around open-spirited technology in support of law practice. The site includes collections of background material, a registry of related activities, a listserve or threaded discussion facility, and a repository of contributed examples and components. You can think of this last part as a legal software “commons,” and the whole thing as a work-space for collaborative codification of practice knowledge. Or as a global community of experts sharing ideas and resources for better supporting law practice through knowledge-based tools.

2. We adopt or construct a classification scheme — a taxonomy — for practice contexts, through which we can index our various materials and activities, and to which others can map. Practice contexts should be specific enough to represent a coherent set of typical facts, issues, and activities, but general enough to cover all situations that involve essentially the same concepts.

3. Teams form to publish concept maps for as many practice contexts as possible. Described in the next section, these are the principal documents for framing the key facts, processes, and ideas involved in the practice area, for the purpose of guiding the development and maintenance of related software applications or other information systems.

4. We consider establishing a set of XML-related standards called the Open Practice Tools Markup Language, or OPTML (“optimal”). OPTML would basically be a rigorous set of conventions within which people can define vocabularies and structures for the facts and rules involved in various common legal practice contexts. It would lay out a methodology and various formalisms for codifying the semantics of practice technology. Work along these lines will also likely involve the Resource Description Framework (RDF) and the Object Management Group’s Unified Modeling Language (UML).

Some guiding principles:

1. We’re anxious to join forces with people and organizations already doing aspects of the OPT agenda. There are dozens of relevant efforts, but four deserve particular emphasis:

(a) Legal XML (http://www.legalxml.com) and similar organizations, which are devoted to law-related data standards using XML (and tend not to focus on specific applications, let alone open source projects — but see http://www.lexml.de).

(b) The artificial intelligence and law research community, and particularly those doing work on legal ontologies.[3]

(c) The so-called “free law” web sites dedicated to providing public access to legislation, case law, and related materials, such as the Cornell Legal Information Institute (http://www.law.cornell.edu/) and the Australasian Legal Information Institute (http://www.austlii.edu.au/).

(d) Harvard Law School’s Berkman Center projects on Open Law, Open Code, etc. (http://cyber.law.harvard.edu/projects/).

2. We accept the reality that many aspects of law practice are not amenable to rigorous data modeling, let alone cost-effective automation. OPT efforts thus should be designed as much to support better communication among people as among applications.

3. We likewise accept the fact that there are many different kinds and degrees of openness. Not everyone wants to share or get explicit about their work to the same extent. OPT will need to accommodate a lot of “free riders” and function in a world of largely closed and proprietary implementations. We’ll need to be content with access to just some of the best thinking and practices of the legal application development community.

4. We recognize that this work is highly technical and esoteric, yet it is too important to be left to technicians and theoreticians. Concept maps, for instance, need to be intelligible to both non-technical legal professionals and non-legal technical professionals, so they should avoid arcane or unexplained legal or technical terminology. They should use plain language whenever possible, while maintaining reasonable rigor.

5. Sometimes it’s important just to start doing something. We can best learn whether these ideas work by trying them out. OPT will have to be driven by a constant empiricism, and by iterative refinement of its processes and products.

Concept Maps

One of the central instruments in the open practice tools initiative is the concept map. This is the principal document for framing the key facts, processes, and ideas involved in a given legal practice area, for the purpose of guiding the development and maintenance of related software applications or other information systems. The concept map provides a vocabulary for talking about the legal context, the processes involved in it, and the tools used to support those processes.

You can think of the concept map as a kind of blueprint for a practice area. Or as the DNA or genetic code for the practice area. Computer scientists and legal knowledge engineers might call it an ontology. It serves as an informal precursor to formal schemas and object models that will be built to support software development for the practice area.

In short, the concept map tells us, for a given practice context, what things we need to pay attention to, what we call those things, and how they are related.

The chosen practice context should be specific enough to represent a coherent set of typical facts, issues, and activities, but general enough to cover all sub-contexts that involve essentially the same concepts. For example, “family law” is probably too broad; “uncontested divorce” too narrow; and “divorce practice” just about right. “Torts” is too broad; “trip and fall cases” too narrow; “personal injury claims practice” just about right.
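The granularity rule can be pictured as a small taxonomy in which narrower contexts roll up to the level chosen for a concept map. This is a minimal sketch using only the examples just given. The parent links, the set of mapped contexts, and the function name are assumptions made for illustration, not a proposed classification scheme.

```python
from typing import Optional

# Illustrative parent links among the practice contexts named in the text.
PARENT = {
    "divorce practice": "family law",
    "uncontested divorce": "divorce practice",
    "personal injury claims practice": "torts",
    "trip and fall cases": "personal injury claims practice",
}

# Contexts judged "just about right" and therefore given their own concept map.
MAPPED_CONTEXTS = {"divorce practice", "personal injury claims practice"}


def concept_map_for(context: str) -> Optional[str]:
    """Walk up the taxonomy until reaching a context that has a concept map."""
    current: Optional[str] = context
    while current is not None:
        if current in MAPPED_CONTEXTS:
            return current
        current = PARENT.get(current)
    return None


narrow = concept_map_for("trip and fall cases")
broad = concept_map_for("family law")
```

A narrow context resolves to the concept map of its nearest mapped ancestor, while a context that is too broad resolves to nothing, which is a prompt to pick one of its more specific children.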

The purpose of concept maps — like the entire OPT effort — is to support better quality and faster development of new applications, and greater reuse of and interoperability among existing applications. A map allows lawyers and technologists to get a quick overview of a practice area, and to “define their terms” prior to and independent of specific system development projects.

Concept maps are intended to be intelligible to both non-technical legal professionals and non-legal technical professionals, so they should avoid arcane or unexplained legal or technical terminology. They should use plain language whenever possible, while maintaining reasonable rigor. They should be implementation-independent and vendor-neutral, although they may contain application-specific examples, such as HotDocs variable lists. They should describe what and why as opposed to how.

Concept maps focus on information gathered and used in applications, as opposed to things we know are relevant to the practice but are not part of any technology-supported process. They will evolve based on the interaction of commentators and the experience of developers. Concept maps are thus necessarily and perpetually incomplete and provisional.

Concept maps should generally focus on things that are specific to the context at hand, although they will inevitably need to refer to more general concepts developed elsewhere (and presumably modeled in more general concept maps). Authors should seek to be as comprehensive as possible in inventorying concepts that typically involve data items in associated applications, but can feel free to defer full treatment of background material, such as overviews of legal issues and principles.

A concept map could have the following structure:

Header
    Practice context
    Version number?
    Author(s)
    Last revised
    Revised by
    Status
    Online location (URL)
Facts [the circumstances typically constituting a particular matter]
    Objects, attributes, and relations
    Actors and roles
    Goals, purposes
    Artifacts (documents)
Law [the general regime of applicable rules and principles]
    Overview
    Typical factual and legal issues
    Points of jurisdictional variation
        i. terminology
        ii. respects in which the rules vary among jurisdictions (dimensions only, not the specific differences)
Legal and lawyering processes
    events
    activities, tasks, and actions (what has been, is being, needs to be, will be done as part of the legal work)
Supporting technologies
    Typical tools used to support activities in this practice context
Optional attachments
    Platform-specific structure, like a HotDocs variable list
    XML DTD, schema, or example
Revision history

Concept maps themselves should be stored as XML (validated by a DTD or schema) for consistency of structure and format, efficient search, and easy rendering on the OPT web site. We may also want to consider developing a “Practice Context Object Model,” analogous to the DOM (document object model), to support advanced applications.
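To make the XML idea concrete, here is a minimal sketch that builds a fragment of a concept map with Python's standard xml.etree.ElementTree module and reads it back. The element names (conceptMap, header, practiceContext, and so on) are hypothetical and simply follow the outline above, the author value is a placeholder, and only a few of the outline's sections are shown. ElementTree does not validate against a DTD or schema, so that step would need a separate tool.

```python
import xml.etree.ElementTree as ET


def build_concept_map(context, version, authors, roles, tasks):
    """Build a small concept-map tree; element names are illustrative, not OPTML."""
    root = ET.Element("conceptMap")

    header = ET.SubElement(root, "header")
    ET.SubElement(header, "practiceContext").text = context
    ET.SubElement(header, "version").text = version
    for author in authors:
        ET.SubElement(header, "author").text = author

    facts = ET.SubElement(root, "facts")
    actors = ET.SubElement(facts, "actorsAndRoles")
    for role in roles:
        ET.SubElement(actors, "role").text = role

    processes = ET.SubElement(root, "processes")
    for task in tasks:
        ET.SubElement(processes, "task").text = task

    return root


root = build_concept_map(
    context="divorce practice",
    version="0.1",
    authors=["Example Author"],  # placeholder, not a real contributor
    roles=["husband", "wife", "child"],
    tasks=["commence the action", "file motion for temporary orders"],
)

# Serialize to text, then parse it back as a consuming application might.
xml_text = ET.tostring(root, encoding="unicode")
parsed = ET.fromstring(xml_text)
roles = [element.text for element in parsed.findall("./facts/actorsAndRoles/role")]
context = parsed.findtext("header/practiceContext")
```

Because the structure is explicit, the same file can drive search, rendering on a web site, and generation of platform-specific artifacts such as variable lists.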

Conclusion

OPT is about optimizing options — for legal professionals, for practice technologists, and for clients deciding how much of the lawyering “bundle” to buy. Its scope naturally extends to technology in the broadest sense of all the tools we and our clients bring to the distinctively legal work before us. Rather than some monolithic world of rigid standards and forced cooperation, it contemplates a loosely-coupled network of practitioners and tool providers, and the fruitful co-existence of both open and proprietary resources.

Much of what good lawyers do cannot be replicated by computer systems. The “digital substitution” model of legal e-commerce is not a compelling one. But most of what good lawyers do can be accomplished much more effectively through appropriate automation. Call it the “digital enhancement” model. Better tools and better communication media make for better lawyers. Lawyers doing good work at reasonable prices is good for just about everyone.

We are at one of those special points in history where public good and private interest are closely aligned in a sector of human activity. Law is ripe for fundamental improvement through collaborative commerce in practice systems. The window of this opportunity is wide open, and will likely remain that way for at least a while. We’re still living in the ancient world of law office automation, and the open sourcerers among us have one of the best chances, in my opinion, of redeeming the unrealized promise of advanced legal technology. The oft-mentioned analogy of the human genome mapping project may not be too far afield, either in its magnitude of work or its astounding benefits.

I think that one of the best hopes for dramatic productivity and quality gains in legal software lies in the emergence of shared standards, components, applications, and know-how through a new infrastructure of collaborative, open development.

[1] AmeriCounsel (http://www.americounsel.com), where the author served this past year as vice president for practice technology and “chief e-legal officer,” has been deeply involved in reconstructing the delivery of legal services through a combination of innovative business strategies, creative use of the Web, and other technologies. A central part of that effort involves systematizing vast territories of conventional legal practice. AmeriCounsel has begun to build and license software for case intake, document drafting, matter management, decision support, and related processes. The company seeks to move forward with hundreds of discrete legal services, jurisdictionalized eventually for all states and a growing list of countries, delivered by thousands of independent practitioners with the support of its online resources.

[2] There are a few bright counterexamples. If you happen to work with HotDocs, for instance, you can tap into a generous flow of knowledge exchange on the hotdocs-l listserve. Several auxiliary web sites have sprung up, with compilations of past posts and useful programming modules. (See, e.g., http://www.docauto.com/BOHDL/bohdlxx.htm and http://www.datarefresh.com/hotdocs.) Watching this collaborative activity in action gives me hope that ostensibly grandiose schemes like Open Practice Tools have a chance at success.

[3] See, e.g., Thorne McCarty’s LLD, and the proceedings of the 1997 ontology workshop.