As part of a series of posts by our student employees, Karl Moll ’14 talks about what he’s been working on in Special Collections this summer:
One of the bigger projects that I’ve been working on this summer has been to create a template that lets users enter archival information from a finding aid into Excel and then converts it into a valid EAD file that can be uploaded into Archivists’ Toolkit.
If you don’t understand what the above section means, that’s all right. It took me a little bit to get it all, and I’m going to walk you through it now.
EAD is built on XML, a markup language that is pretty similar to HTML but without most of the style commands. XML is really useful for creating hierarchies of information, which is why EAD (Encoded Archival Description) is based on it. EAD is a standard used by many archives and special collections across the country to make finding materials easier across collections. For more on the history of EAD, look here. For more on standards in general, I may write another blog post about the development of ANSI standards. But more on that later.
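To make the hierarchy idea a bit more concrete, here is a minimal Python sketch (it is not part of the template) that builds a tiny EAD-style fragment with the standard library. The element names `c01`, `c02`, `did`, and `unittitle` come from EAD; the helper `make_component` and the series and folder titles are made up for illustration.

```python
import xml.etree.ElementTree as ET


def make_component(tag, level, title):
    """Build one EAD-style component: <tag level="..."><did><unittitle>title."""
    component = ET.Element(tag, {"level": level})
    did = ET.SubElement(component, "did")
    ET.SubElement(did, "unittitle").text = title
    return component


# Hypothetical example data: one series that contains two folders.
series = make_component("c01", "series", "Correspondence")
for folder_title in ["Letters, 2010", "Letters, 2011"]:
    # Nesting a c02 inside the c01 is what expresses the hierarchy.
    series.append(make_component("c02", "file", folder_title))

fragment = ET.tostring(series, encoding="unicode")
print(fragment)
```

Each folder sits inside the series it belongs to, which is the kind of parent-and-child structure a finding aid needs.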
For this project, I’m working from a brilliant template by Matt Herbison, which formats the information entered into a spreadsheet as a valid EAD file so that it can be uploaded into Archivists’ Toolkit (AT). It is a really wonderful template that makes getting a finding aid into AT so much easier than writing the code by hand, or even entering it directly into AT. However, to make it super useful to us here, I’ve been trying to make some modifications to it.
The first thing I did was get rid of a lot of the fields that we wouldn’t be using for our finding aids. After a lot of consultation with John and Diana, we determined that all we really needed were Level, Title, and the newly added Call Number or Shelf Location. More on the Call Number in a bit. The template went from looking like this:
This new version is a lot more streamlined for what we need it to do, since that’s all the information that we are really interested in anyway. Since there wasn’t already a field for Call Number, I had to create one. This part of the project took some time, since I had to get familiar with XML and EAD first.
I really wanted there to be a way to put the call number with each individual item in the Finding Aid. The way I first tried was just to concatenate the Title cell in Excel with a Call Number cell, so that it would look like “Karl’s Papers, 2010-2012. Call Number: R1 SA B3”, but the call numbers didn’t line up when it actually created the finding aid, since not all the titles are the same length. I was looking through the AT window…
…when I noticed that there was this field:
Now, a Component Unique Identifier sounds a lot like a Call Number to me, so I then began an Internet quest to learn more about XML and EAD to see what tags corresponded to this entry field (I knew that there HAD to be something, since AT fields correspond to the EAD standard). After a couple of days of teaching myself about XML, I realized that the tag <unitid> </unitid> would be my best bet.
Notice that some of the columns have fixed values such as “<c0” and “><did><unittitle>”. These are here to supply the constant pieces of a valid EAD file, and it ends up producing code like this:
I changed some of the fixed cells so that they would populate the Component Unique Identifier field in Archivists’ Toolkit using the <unitid> tag so that it produced code like this (the differences are pretty subtle):
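The idea can be sketched in Python. This is only an illustration, not the template’s actual Excel formulas: the function name `build_component` is hypothetical, it assumes each row supplies a depth number, a title, and a call number, and it closes each component right away instead of handling nesting. The fixed strings play the role of the fixed cells, and the only difference between the two versions is the extra `<unitid>` element inside `<did>`.

```python
from xml.sax.saxutils import escape


def build_component(depth, title, call_number=None):
    """Join fixed EAD pieces with one row's values, like the fixed cells do."""
    parts = ["<c0", str(depth), "><did>"]
    if call_number is not None:
        # The added piece: <unitid> is what AT shows as Component Unique Identifier.
        parts += ["<unitid>", escape(call_number), "</unitid>"]
    parts += ["<unittitle>", escape(title), "</unittitle></did>"]
    # Assumption for this sketch: close the component immediately (no children).
    parts += ["</c0", str(depth), ">"]
    return "".join(parts)


# Sample row based on the example title and call number used earlier in this post.
before = build_component(1, "Karl's Papers, 2010-2012")
after = build_component(1, "Karl's Papers, 2010-2012", "R1 SA B3")
print(before)
print(after)
```

Running `escape` over the cell values keeps characters like `&` in a title from breaking the XML, which is something plain concatenation in a spreadsheet would not do on its own.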
So after I made those changes, Archivists’ Toolkit was able to accept it as a valid EAD file to be uploaded, with a call number for every item, and then produce a Finding Aid like this:
The Call Number is on the left side preceding the Title. I’ve been working on various ways of making the Call Number more readable, but haven’t uploaded it to AT yet.
My next blog post on the templates will be on my ongoing struggles to deal with nesting issues to provide the user with one extra level of hierarchy!!! Stay tuned!
–Karl Moll ’14