UMBC researchers have made strides in automated legal document analytics by creating a way to machine-process the Code of Federal Regulations (CFR). The CFR is a complex document containing policies related to doing business with the federal government. All business affiliates of the federal government must comply with it. For government contracts to be equitably open to a broad range of businesses, policies within the CFR must be accessible.
This document automation is one part of a broader project to help contractors and other entities manage and monitor their legal documents. Directed by Karuna Joshi, associate professor of information systems, the team has completed a full analysis of the CFR. Digital Government: Research and Practice recently published their methodology.
Karuna Joshi (left) with Yelena Yesha, 2018.
Automating document review through AI
The team’s method for analyzing the CFR uses artificial intelligence (AI) that learns how to categorize information within the document, store it, and extract it on request. Joshi and her team achieved this by creating a knowledge graph, built with Semantic Web technologies, that represents all the key terms, rules, and regulations in the document. This basic framework enables users to ask an automated tool about a specific rule and receive the answer.
The Semantic Web language OWL, or Web Ontology Language, is used to represent concepts and to contextualize the relationships between them. According to Joshi, the framework of the knowledge graph can be “adopted by federal agencies and businesses to automate their internal processes that reference the CFR rules and policies.” To facilitate this, the team will make the framework available in the public domain.
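To make the idea of a knowledge graph concrete, here is a minimal sketch in plain Python, assuming the graph can be pictured as a set of subject-predicate-object statements (the basic shape that Semantic Web languages such as OWL build on). It is not the UMBC team’s ontology: the `TripleStore` class and every `ex:` name below are hypothetical placeholders chosen for illustration.

```python
class TripleStore:
    """A tiny in-memory set of (subject, predicate, object) statements."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        """Record one statement, e.g. 'Section B refers to Section A'."""
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all statements fitting the pattern; None is a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple
            for triple in self._triples
            if all(want is None or want == got
                   for want, got in zip(pattern, triple))
        )


# Hypothetical statements; these are NOT taken from the CFR itself.
graph = TripleStore()
graph.add("ex:SectionA", "rdf:type", "ex:Regulation")
graph.add("ex:SectionA", "ex:definesTerm", "ex:Offeror")
graph.add("ex:SectionB", "rdf:type", "ex:Regulation")
graph.add("ex:SectionB", "ex:refersTo", "ex:SectionA")

# Which parts of the document are regulations?
regulations = [s for s, _, _ in graph.match(predicate="rdf:type",
                                            obj="ex:Regulation")]

# Which sections point back to Section A?
citing = [s for s, _, _ in graph.match(predicate="ex:refersTo",
                                       obj="ex:SectionA")]
```

Storing each rule and term as a separate statement is what lets a tool answer narrow questions: a lookup only has to match a pattern rather than reread the whole document.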
Question and answer
General users can interact with the knowledge graph through a question-and-answer process, similar to how many people use Amazon’s Alexa or Apple’s Siri. For example, Joshi says, someone could ask a policy-related question like, “How many days at a minimum must a Request for Proposal (RFP) be posted open/available?” The system would query the CFR knowledge graph to find sections in the document that answer this question.
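The question-and-answer flow described above can be sketched roughly as follows, assuming a simple keyword overlap between the question and the topics recorded in the graph. The article does not say how the team’s system interprets questions, so this matching step is a stand-in. The section names, predicates, and the value 30 are hypothetical placeholders, not figures from the CFR.

```python
import re

# Hypothetical statements; the number 30 is a placeholder, not a CFR figure.
TRIPLES = [
    ("ex:SectionA", "ex:coversTopic", "request for proposal"),
    ("ex:SectionA", "ex:minimumPostingDays", 30),
    ("ex:SectionB", "ex:coversTopic", "small business set-aside"),
]


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def sections_for(question):
    """Rank sections by how many topic words also appear in the question."""
    question_words = tokenize(question)
    scored = []
    for subject, predicate, obj in TRIPLES:
        if predicate == "ex:coversTopic":
            overlap = len(question_words & tokenize(obj))
            if overlap:
                scored.append((overlap, subject))
    # Highest overlap first; ties broken alphabetically by section name.
    return [s for _, s in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


def facts_about(section):
    """Collect every non-topic statement recorded for one section."""
    return {p: o for s, p, o in TRIPLES
            if s == section and p != "ex:coversTopic"}


question = ("How many days at a minimum must a Request for Proposal (RFP) "
            "be posted open/available?")
matches = sections_for(question)
answer = facts_about(matches[0]) if matches else {}
```

Here the question shares the words "request", "for", and "proposal" with one topic, so that section is returned along with the facts stored for it. A production system would need far more robust language understanding than word overlap.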
Karuna Joshi presents at the USNA-UMBC research symposium, 2016.
The researchers anticipate that the system will be highly useful to any business held to the CFR, because the automated process breaks down the document’s legal complexity.
Access and accountability
This project to automate and support users’ understanding of legal documents is an ongoing effort by the UMBC team. Beyond the CFR, they seek to help people understand legally binding contracts that they encounter every day, such as terms of service for major companies.
Lavanya Elluri, a graduate student in information systems, adds, “Our research helps the organizations that use cloud services to understand the context from these textual documents quickly.”
Many users have found that companies are using their data without their knowledge. Often, this is because the relevant information is buried within terms of service and privacy policies. Joshi predicts that the tools her team is developing to help users better understand these documents will be essential for holding companies accountable for their data use.
Featured image: UMBC’s Information Technology and Engineering Building. All photos by Marlayna Demond ’11 for UMBC unless otherwise noted.
Story by Morgan Zepp for UMBC.