This project was designed to take EAD 2002 XML finding aids and give archivists a simple way to upload them, version changes to them, build a full-text searchable index of them and generate a fairly simple “website” to display them.

The HTML interface allows the archivists to add a description, display title and sort title, and to place the finding aids into categories, which then have their own browsable pages.

The finding aid pages themselves are created via a combination of XSLT (for the descriptive elements of the collection) and Ruby templates (for the Box/Folder/Item lists).  The initial plan of this project was to create a sophisticated object model for the EAD so more interesting things could be done with the EAD’s HTML (such as linking from the subject headings or bibliographies), although this was never fully realized.  There were two main reasons for this:

  1. EAD is a bloody awful mess.  EAD’s data model is extremely complicated and very open to local archivists’ interpretation.  There are presentation tags intermingled with semantic tags and some of the data isn’t really atomic enough to parse by machine.  The time I was able to spend working on this project was limited, so I was never able to design a working internal data model for the finding aids and had to stick to using the EAD documents themselves.
  2. Ruby’s XML parsers at the time of development were woeful.  LibXML2’s Ruby libraries were very flaky and REXML was (and still is) so slow that it would grind to a halt on XML files the size of many finding aids.  Newer Ruby libraries, such as JREXML and Nokogiri, may make this worth reconsidering.
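
To make the "Ruby templates for the Box/Folder/Item lists" half of that approach concrete, here is a minimal sketch of the idea. It is not the project's actual code: the sample EAD fragment, the method names (`container_rows`, `render_container_list`) and the template are all hypothetical, and it assumes un-namespaced EAD 2002 with `<container type="box">` and `<container type="folder">` inside each `<c01>`'s `<did>`, which real finding aids will not always follow. It uses REXML only because it ships with Ruby and the sample is tiny; given the speed problems described above, a faster parser would be the sensible choice for real files.

```ruby
require 'rexml/document'
require 'erb'

# Hypothetical, heavily trimmed EAD 2002 fragment; real finding aids vary widely.
EAD_SAMPLE = <<~XML
  <ead>
    <archdesc level="collection">
      <dsc>
        <c01 level="file">
          <did>
            <container type="box">1</container>
            <container type="folder">3</container>
            <unittitle>Correspondence, 1921-1925</unittitle>
          </did>
        </c01>
        <c01 level="file">
          <did>
            <container type="box">2</container>
            <container type="folder">1</container>
            <unittitle>Photographs &amp; negatives</unittitle>
          </did>
        </c01>
      </dsc>
    </archdesc>
  </ead>
XML

# Pull the box number, folder number and title out of each component's <did>.
def container_rows(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//dsc/c01/did').map do |did|
    containers = {}
    did.each_element('container') do |c|
      containers[c.attributes['type']] = c.text.to_s.strip
    end
    title = did.elements['unittitle']
    {
      box: containers['box'],
      folder: containers['folder'],
      title: title ? title.text.to_s.strip : ''
    }
  end
end

# ERB template for the container list; values are HTML-escaped on output.
ROW_TEMPLATE = ERB.new(<<~HTML, trim_mode: '-')
  <table>
    <tr><th>Box</th><th>Folder</th><th>Title</th></tr>
  <%- rows.each do |row| -%>
    <tr><td><%= ERB::Util.h(row[:box]) %></td><td><%= ERB::Util.h(row[:folder]) %></td><td><%= ERB::Util.h(row[:title]) %></td></tr>
  <%- end -%>
  </table>
HTML

def render_container_list(xml)
  rows = container_rows(xml)
  ROW_TEMPLATE.result(binding)
end

puts render_container_list(EAD_SAMPLE) if __FILE__ == $PROGRAM_NAME
```

Splitting the work this way keeps the messy part (deciding what counts as a box, folder or title in a given archive's EAD) in one small extraction method, while the template only ever sees plain hashes.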

This project was a bit of a burden for me at the time.  Archives is a department in the Georgia Tech libraries, but at the time they had no developer resources of their own.  This meant that they had to rely on me (the primary developer for the library as a whole) to provide a way to publish their EAD files.  I wanted to write something far more sophisticated for them: an application where they could actually create the finding aids (since I could then have some control over how the data was formatted for the things I later wanted to do with it).  However, archives had already committed themselves somewhat to using Archivists’ Toolkit, and we ran into a conflict over how I was able to prioritize development for them.  Archives’ total web traffic, across all their pages, was less than 1% of the library’s total.  At the same time, this project was a huge time sink, taking up a vast amount of my time for about a month and a half.  It was really hard to justify alongside all the other things that needed to be written for the library.

There were still some nice features included in this, though, such as the OpenSearch interface and an OAI-PMH server, the latter thanks to Will Groppe’s OAI server implementation.

Here is Georgia Tech’s implementation in action.

You can download the source as a .gz here.  It is available under an MIT License.

  1. Cristiano Animosi said:

    I would be interested in viewing the source code, but the link is broken.


  2. Ross said:

    Ok, sorry about that. It was gzipped, not zipped. The link is fixed.
