The Caselaw Access Project published nearly seven million cases from the Harvard Law School’s collections online on March 8, concluding a nine-year process to digitize the HLS Library’s archive of court cases.
The Caselaw Access Project, also known as CAP, aimed “to make all published U.S. court decisions freely available to the public online in a consistent format, digitized from the collection of the Harvard Law School Library,” according to the project’s website.
The recent release makes “360 years of United States caselaw” accessible to the public, according to the project’s website. This includes all “official, book-published state and federal United States caselaw through 2020,” with the first case dating back to 1658.
Jack Cushman, the project’s director, said that the impetus behind the effort was a desire to make caselaw more accessible to the public. In the past, few people beyond lawyers had access to the expensive caselaw databases needed to view important legal decisions.
This project, according to Cushman, sought to level the playing field.
Cushman said he believed it was important “for everyone to have access to the law of the land.”
CAP launched in 2015 through a partnership with Ravel Law, a legal research and analytics startup company. Per the terms of the partnership, CAP received financial support in exchange for Ravel obtaining eight years of exclusivity with the caselaw documents, according to Harvard Law Today, a school-run publication.
This project falls under the initiatives of the Law School’s Library Innovation Lab, “a forward-looking group of thinkers and doers working at the intersection of libraries, technology, and law,” according to the organization’s website. The LIL facilitated the delicate process of digitizing case files for the project.
As part of the process, 40,000 books containing case files were retrieved from Harvard Law School’s collection in the HLS Library and a repository in Southborough, Mass. The CAP team then used a variety of tools to de-bind the books, scan case files at a rate of 500,000 pages per week, and wrap the books in plastic to be sent to a limestone mine in Kentucky for preservation.
The scanned files were then translated into machine-readable documents and uploaded to the Ravel website. Ravel’s website made sifting through documents easier with its “data science, machine learning, and visualization” systems, according to Harvard Law Today.
Cushman said it was essential to not rush the process, as CAP was dealing with delicate documents that were both culturally and historically important.
“I think one lesson is just, it’s okay if it takes a long time,” he said. “For cultural preservation and cultural heritage — we’re in this for the long run.”
Now that the case files have been digitized, CAP aims to improve search functionality to make the platform “practically usable,” furthering its mission to increase caselaw accessibility for all. With this forward-looking approach to law accessibility, CAP’s next goal is to strengthen its institutional collaborations with AI model makers interested in high-quality datasets.
Cushman said that the digital archive could be useful for “Harvard students who are looking for projects or ways to make their mark with civic technology and big datasets.”
“We’ve only scratched the surface of what you can do with it,” Cushman added.