I recently attended the Federal Depository Library Conference in Washington, D.C. Among the many interesting topics discussed, one in particular caught my attention and got me thinking about how my duties as a documents librarian and as a member of our Digital Scholarship Team overlap: promoting access to and preserving born-digital government information.
Over the past decade, the amount of government information published online has far outpaced the number of documents printed by the Government Printing Office (GPO) for distribution through the Federal Depository Library Program (FDLP) (Jacobs, 2014). The sheer volume of this information makes both providing access (at least through bibliographic control) and ensuring preservation extremely difficult. What’s worse, much of this information is transitory and is lost when administrations change or Congressional committees disband.
One particularly interesting type of born-digital government information consists of digital documents that are created by federal executive agencies, such as the U.S. Department of Agriculture, but never reported to the GPO and thus never distributed through the FDLP. The Lost Docs Project is compiling a public listing of these fugitive documents, as they are known, and reporting them to the GPO for cataloging. A quick look at the tag cloud on the homepage of the Lost Docs Project site makes the importance of this endeavor clear for both researchers and the general public, as many of these documents concern issues in public health and the environment.
Beyond these born-digital documents, there is a massive amount of government information on agency websites at the federal, state, and local levels. This information is even more vulnerable, as URLs change and sites are updated or taken down. Web harvesting is a well-known strategy for capturing and preserving this type of information. At the federal level, End of Term Crawls, first conducted by the National Archives and Records Administration (NARA), attempt to capture as many agency websites as possible. End of Term Crawls were also conducted in 2008 and 2012 through a collaborative effort by the Library of Congress, Internet Archive, California Digital Library, University of North Texas, and GPO. At the state level, similar initiatives are coordinated through Archive-It and partner institutions. Indiana University, an Archive-It partner since 2005, is currently archiving state and local government websites.
Much of this may be old news, but to me, as a new librarian with responsibilities in both government information and digital scholarship, these fields appear to be natural extensions of each other. So much government information is born-digital and increasingly at risk of being lost. My hope is that more government documents librarians will reach out to their digital librarian colleagues; chances are good that they care about the same issues.
Jacobs, J.A. (2014, April 24-25). Born-digital U.S. Federal Government information: Preservation and access. Report prepared for the Center for Research Libraries Global Resources Collections Forum: Leviathan - Libraries and Government Information in the Age of Big Data, Chicago, IL.