Saving Vital Scientific Data

Seeders, harvesters, baggers… is this the Haverfarm? No, this is DataRescue.

When mentions of climate change were removed from the White House’s website within minutes of Donald Trump’s inauguration, many librarians, scientists, scholars, and concerned citizens worried about the future of federal data that could be at risk of loss. Every four years, large research universities have used their resources to preserve such information, because a new administration brings unpredictability to federal organizations’ funding and structure. This time around, though, the effort seemed more necessary than ever. And Haverford jumped in to help.

A “DataRescue” event was held Saturday, Feb. 18, in Magill Library, so students, faculty, staff, and even some of the College’s neighbors could work to preserve scientific data from federal webpages. (The Philadelphia Inquirer was even there to document the effort.)

Like an assembly line, volunteers with varying technical skill sets took up different tasks. Seeders went through webpages from the National Oceanic and Atmospheric Administration (NOAA) and the Department of Energy (DOE), using a browser plugin to automatically save things like text and images. More complex data, such as interactive media, had to be manually preserved by code-savvy volunteers called harvesters. Finally, all the data was passed to baggers, who formatted the information for storage in an accessible, cloud-based medium. The end goal: to preserve knowledge in the face of an unusually tumultuous administration that could yank funding on a whim.

“This stuff, our grasp on it is kind of tenuous,” said Mike Zarafonetis, coordinator for digital scholarship and research services in Magill. “If a poorly funded government office that’s a research station at the Department of Agriculture or something loses their server admin… and it crashes and they can’t get it back up, that might be it for what was on there. That’s a really untenable situation.”

Zarafonetis, one of the organizers of the Haverford event, said that although it’s impossible to ignore the political context of the event, the preservation and accessibility of the public record is part of the mission of librarians, and that the current political climate only highlights the urgency of this work. The Environmental Data and Governance Initiative (EDGI) has been sponsoring frequent DataRescue events of late, identifying agencies (NOAA, DOE, and others) whose data is at risk and providing the tools necessary for seeding. Alongside EDGI, Laurie Allen, Haverford’s former coordinator for digital scholarship and research services and current assistant director for digital scholarship at the University of Pennsylvania, is helping to lead the other half of the effort. DataRefuge, which is based at Penn, is the organization seeking to store data that EDGI’s browser plugin tool can’t preserve.

Even with the unseasonably warm weather on Saturday, Zarafonetis was happy to see so many people hunker down indoors on their computers for the effort.

“I liked the idea of having a smaller event. I liked the idea of us being able to contribute in our own way,” he said. “We did what we could, and we made a contribution… To eliminate the information that people rely on to do good science and to do quality research, that’s a scary thing. And I think this was a way that I felt I could do something; it was a way that I think others felt that they could do something.”

Just as College librarians always stress information literacy with their students, it’s important to learn the ways in which information can become inaccessible. DataRescue was a way for community members to come together and fight back in a time of uncertainty.

-Michael Weber ’19

Photos by Caleb Eckert ’17.

 

 
