Archive

Monthly Archives: August 2006

Ariadne #48 includes “Introducing unAPI”, written by Dan, Jeremy, Peter, Michael, Mike, Ed, and me. It explains the potential of unAPI and shows some examples of implementations.
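
Since the article is about implementations, a rough picture of the shape of an unAPI service may help: it’s a single HTTP endpoint where a request with only an id returns an XML list of available formats, and a request with an id plus a format returns the object itself. Here’s a minimal sketch in Rails style; the Record model and the to_mods and formats_xml helpers are hypothetical stand-ins, and only the id/format request pattern comes from the spec.

```ruby
# Minimal unAPI endpoint sketch. Record, to_mods, and formats_xml are
# hypothetical; the id/format request pattern is what the unAPI spec defines.
class UnapiController < ApplicationController
  def index
    id  = params[:id]
    fmt = params[:format]

    if id && fmt
      # id + format: return the object itself in the requested format
      record = Record.find(id)          # hypothetical lookup
      render :xml => record.to_mods     # hypothetical serializer
    elsif id
      # id alone: list the formats available for that object
      render :xml => formats_xml(id), :status => 300
    else
      # bare request: list every format the service supports
      render :xml => formats_xml
    end
  end
end
```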

Like the last article I “wrote” with this group, I didn’t have to do much. On the flip side, publishing has such a minute effect (if any) on my career that I guess the “effort” reflects the “rewards”.

Still, this one was tough. A refactoring of the Umlaut broke the unAPI interface right about the time the editors were going over the example cases in the article. It was extremely difficult to put development cycles into this when I had a ton of other things I desperately needed to get completed before the start of the semester.

And that’s where the struggle is. While I’m completely sold on the “potential” of unAPI, that’s all it is right now. COinS was a much easier concept for me to get behind (although I would definitely say I have invested more in unAPI than COinS), since it’s so much easier to implement on both the information provider and client ends. UnAPI is a much, much more difficult sell. However, the payoff is so much greater that one has to have faith and sometimes dedicate time that should probably have been spent on more immediate needs.

The Umlaut “launched” last Monday. I wouldn’t call it the most graceful of take-offs, but I think it’s pretty much working now.

We immediately ran into a problem with ProQuest as a source. ProQuest sends a query string that starts with an ampersand (“&ctx_ver=foo…”) which Rails strongly disliked. Thankfully, it was still part of the break, so traffic was light. It gave me the opportunity to isolate and fix the myriad little bugs that could really only have been found by exposing the Umlaut to the wild and crazy OpenURLs that occur in nature. Ed Summers talked me into (and through) patching Rails so it didn’t choke on those ProQuest queries, although we later saw that the same patch was already in Rails Edge. It’s nice how hackable this framework is, though.
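etc
For the curious, the shape of the fix is trivial. Here’s a sketch of the idea (not the actual patch): trim leading ampersands off the raw query string before the parameter parser ever sees it.

```ruby
# Sketch of the idea behind the fix (not the actual Rails patch):
# drop leading ampersands so the query string parses normally.
def normalize_query_string(raw)
  raw.sub(/\A&+/, '')
end

normalize_query_string('&ctx_ver=foo&rft.genre=article')
# => "ctx_ver=foo&rft.genre=article"
```
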
There was also a bit of a gaffe when my trac project went ballistic after being crawled by Googlebot, bringing down my server (and with it all journal coverage and our Zebra index of the OPAC; this nice little denial of service happened to take down the Umlaut, as well). As a result, the trac site is down until I can figure out how to keep it from sending my machine into a frenzy (although I’m in the process of moving the Zebra index to our Voyager server, as I type this, in fact, and the Umlaut will probably be able to handle its own journal coverages tomorrow).
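
The likely stopgap is just a robots.txt asking crawlers to stay out of trac entirely (the /trac/ path here is an assumption about where the project lives on my server):

```
User-agent: *
Disallow: /trac/
```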

There was also a bit of a panic on Friday when the mongrel_cluster would throw 500 errors whenever one mongrel served a session that had been started on another. Ugh. Little did I know that you can’t use PStore to save sessions under mongrel_cluster. I scaled back to one mongrel server over the weekend (which, of course, was still overkill considering the Umlaut was crapping out on the web services my desktop machine wasn’t providing to it, anyway) while I migrated to ActiveRecord for session storage. It seems to be working fine now and we’re back up to 5 mongrel servers. Yeah, hadn’t even thought about that one…
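
For anyone hitting the same wall, the switch is small. Something like the following, in Rails 1.x-era terms (the migration column names follow the standard sessions schema; treat the details as a sketch from memory):

```ruby
# config/environment.rb -- keep sessions in the database instead of
# file-based PStore, so every mongrel in the cluster can read them.
config.action_controller.session_store = :active_record_store

# db/migrate/XXX_create_sessions.rb -- the table ActiveRecordStore expects.
class CreateSessions < ActiveRecord::Migration
  def self.up
    create_table :sessions do |t|
      t.column :session_id, :string
      t.column :data,       :text
      t.column :updated_at, :datetime
    end
    add_index :sessions, :session_id
  end

  def self.down
    drop_table :sessions
  end
end
```
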
Yeah, it wasn’t perfect. But all in all, there were no catastrophes, and it was nice having the safety of the class break. So, for the rest of the week, I can focus on getting our EAD finding aids searchable.

Joy.

Dan Chudnov “interviewed” me about the Ümlaut for the first “Library Geeks”. It’s crude and shows obvious signs of “learning as we’re going”, but if you want to hear more about the Ümlaut in my own voice and hear my favorite curse word (and boy is it a doozy), take a listen.

Note the “3-2-1” at around the 31st minute. It took us several attempts to get this recorded. The first session was lost after about 30 minutes of recording, courtesy of GarageBand crashing. We then lost the 2nd quarter, again thanks to GarageBand crashing. I had to do the 10 questions twice. Why? I’ll let you guess.

In other Ümlaut related news (is there any other right now?), I have the project page up. I’ll be adding to the HowUmlautWorks when I get free moments.

Also, thanks to Dan’s lead (lots of links to One Big Library today), I added the OCLC Resolver Resolver Registry to the Ümlaut. When you start a session, it checks your IP against the OCLC registry. If the registry returns a link resolver based on your IP, the Ümlaut will check to see if your resolver is SFX (I’ll be happy to add other XML-enabled resolvers) and, if so, include those holdings in the results, as well. If not, it includes a link to your resolver. If it doesn’t work for you, my guess is that OCLC has ILLiad (or whatever you use for interlibrary loan) first in your library’s profile.
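
The lookup itself is nothing fancy. Roughly like this sketch (the registry URL and the element name here are assumptions from memory, not OCLC’s documented interface):

```ruby
require 'net/http'
require 'uri'
require 'rexml/document'

# Rough sketch of an IP-based registry lookup. The endpoint URL and
# the baseURL element name are assumptions, not a documented API.
def resolver_base_url_for(ip)
  xml = Net::HTTP.get(URI.parse("http://worldcatlibraries.org/registry/lookup?IP=#{ip}"))
  doc = REXML::Document.new(xml)
  node = REXML::XPath.first(doc, '//baseURL')
  node ? node.text.strip : nil
end

# If this returns a URL, sniff whether it looks like SFX and query it
# for holdings; otherwise just present it as a link.
```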

Here are some screenshots of this in action:
How this citation looks at Georgia Tech [pdf]
How this citation looks at Drexel [jpg]

How does it work for you?

Thanks to Gabriel Farrell for trying this at Drexel and supplying the screenshot (and congratulations on the new job!). It still needs some work (for instance, it needs to tell the user that the fulltext is courtesy of Drexel or wherever), but I think it’s a start.

We’re about a week away from launch, so I’ll be working on migrating to the production server. I’ll be sure to document it so others can partake if they desire.

I’ve been dealing with a lot of dumb technology problems lately.

The Ümlaut has uncovered stupid issues with both Voyager and SFX in the last week (Voyager’s Z39.50 server returns every record in the database if you do an ‘or’ search on the 001, and SFX’s handling of title changes is too complicated to even mention).

This simple fix makes me very happy, though.

Some backstory:

In June, I had a meeting with our Digital Initiatives department about the effect of the Ümlaut on their services, namely DSpace. I told them that it was impractical to search SMARTech (our DSpace instance) for every incoming citation, since, theoretically, SMARTech should appear in the Google/Yahoo results. When we tested the theory, our results looked like this. This is obviously ugly, but even worse, it probably discourages discovery of the items in the repository. One has to assume that people are mainly finding items via the search engines. When they see results like that, with no real indication of what they’re looking at, they will probably just move on (even if DSpace is holding the preprint of what they’re searching for).

I left the meeting and asked Dorothea Salo if she, too, had this problem and if she knew a fix for it. About 20 minutes later, she had this awesome title hack worked up. I sent it to our DSpace admin (I, thankfully, don’t deal with it directly) and now we get to bask in the glory of our new and spiffy title listing in the search engines.

Thanks, Dorothea, for doing something genius and simple that DSpace should have done years ago.