
Last Thursday, Johns Hopkins University Libraries went live with the Ümlaut (Ü2). This comes slightly less than four weeks after Georgia Tech took theirs down (although they were using the much more duct-tape-and-baling-wire version 1), and it’s nice to see a library back in the land of Röck Döts.

Ü2 shares little more than a superficial resemblance with the original Ümlaut, and I owe Jonathan Rochkind a lot for getting it to this level. It’s an interesting dynamic between us (as anybody who has spent a minute in #code4lib in the last eight months knows) that seems to work pretty well. It would be nice to expand the community beyond just us, though. It’s pretty likely that the Ümlaut will work its way into Talis’ product suite in some form or another, so that would probably draw some people in, but it would be nice to see more SFX (or other link resolver) customers join the party.

This isn’t to say that JHÜmlaut doesn’t need some work. In fact, there’s something really wrong with it: it’s taking way too long to resolve (Georgia Tech’s was about twice as fast, although probably with a lighter load). If I had to guess, I would say the SFX API is the culprit; when GT’s was performing similarly, there was a bug in the OCLC Resolver Registry lookup that was causing two SFX requests per Ümlaut request (it wasn’t recognizing that it was duplicating them). That isn’t the case with JHU (not only did Jonathan remove the OCLC Registry feature, it wouldn’t be affecting me, sitting at home in Atlanta, anyway).

Performance was one of the reasons GT’s relationship with the Ümlaut soured (though an unfortunate bout of downtime after I left was the biggie, I think), so I hope we can iron this out before JHU starts getting disillusioned. Thankfully, they didn’t have the huge EBSCO bug that afflicted GT on launch.

For reasons known only in Ipswich, MA, EBSCO appends their OpenURLs with <<SomeIdentifier. Since this is injected into the location header via JavaScript (EBSCO sends their OpenURLs via a JavaScript popup), Internet Explorer and Safari don’t escape the URL, which causes Mongrel to explode (these are illegal characters in HTTP, after all). Since the entire state of Georgia gets about half of its electronic journal content from EBSCO, this was a really huge problem (which was fixed by dumping Mongrel in favor of lighttpd and FastCGI). These are the sorts of scenarios that caused the reference librarians to lose confidence.
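
For what it’s worth, the character-level problem is easy to illustrate; here’s a hypothetical sketch (the base URL and identifier are made up, not EBSCO’s actual format):

```ruby
# '<' isn't a legal character in a request URI, so a strict HTTP parser like
# Mongrel's rejects the raw form. Percent-encoding the offending characters
# (which the browser should have done) makes the request safe.
raw = 'http://findit.example.edu/resolve?sid=EBSCO&id=12345<<SomeIdentifier'

escaped = raw.gsub('<', '%3C')
# => "http://findit.example.edu/resolve?sid=EBSCO&id=12345%3C%3CSomeIdentifier"
```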

JHU has the advantage of GT’s learning curve, so hopefully we can circumvent these sorts of problems. It’s still got to get faster, though.

Still, I’m happy. It’s good and refreshing to see the Ümlaut back in action.

Argh. I’ve recently migrated to using Opera on our desktop at home (long story, but basically the experiment with Ubuntu didn’t go over well, we went back to Windows, and IE7 is too slow to be considered useful in any capacity; and Selena uses Firefox). While in the middle of a long post about how I’ve broken the Ümlaut (not the production or Subversion versions, but the development server is FUBARed and a huge flaw has been exposed in its design), I managed, with my stumpy, inaccurate man-fingers, to hit some key combination by accident that took me away from my editing page and sent me to some “Opera search page”. Thanks!

…pause while Ross saves…

Anyway, a huge refactoring will be taking place (although, honestly, I don’t think it will take me very long). The long and short of it is: I was trying to make the Ümlaut database-independent. SQLite had a hard time with the Ümlaut’s liberal use of marshaled objects in the database. PostgreSQL wasn’t working with Rails, and when I upgraded Rails… all hell broke loose. Basically, my Marshal plan wasn’t going to work with the direction of Rails development.
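
For context, the pattern that was giving the databases fits looks roughly like this. It’s a minimal sketch of the general technique, not the Ümlaut’s actual schema; the model and column names are invented:

```ruby
require 'base64'

# Stash an arbitrary Ruby object in a text column by marshaling it.
# Base64-encoding keeps the binary dump from upsetting the database adapter.
class CachedResponse < ActiveRecord::Base
  def object=(obj)
    self.payload = Base64.encode64(Marshal.dump(obj))
  end

  def object
    Marshal.load(Base64.decode64(payload)) if payload
  end
end
```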

But this has actually led me to a better (and, I think, much more efficient) way to handle requests.

So, stay tuned for Ü2.

Ariadne #48 includes “Introducing unAPI”, written by Dan, Jeremy, Peter, Michael, Mike, Ed and me. It explains the potential of unAPI and shows some examples of implementations.

Like the last article I “wrote” with this group, I didn’t have to do much. On the flip side, publishing has such a minute effect (if any) on my career that I guess the “effort” reflects the “rewards”.

Still, this one was tough. A refactoring of the Umlaut broke the unAPI interface right about the time the editors were going over the example cases in the article. It was extremely difficult to put development cycles into this when I had a ton of other things I desperately needed to get completed before the start of the semester.

And that’s where the struggle is. While I’m completely sold on the “potential” of unAPI, that’s all it is right now. COinS was a much easier concept for me to get behind (although I would definitely say I have invested more in unAPI than COinS), since it’d be so much easier to implement on both the information provider and client ends. unAPI is a much, much more difficult sell. However, the payoff is so much greater that one has to have faith and sometimes dedicate time that should probably have been spent on more immediate needs.
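
To make the comparison concrete: publishing COinS amounts to URL-encoding an OpenURL ContextObject into the title attribute of an empty span, as in this sketch (the citation values are just an example); unAPI asks the provider to stand up an actual service and the client to talk to it.

```ruby
require 'cgi'

# Build a COinS span: the whole "implementation" is a KEV-encoded ContextObject
# stuffed into the title attribute of an empty span.
citation = {
  'ctx_ver'     => 'Z39.88-2004',
  'rft_val_fmt' => 'info:ofi/fmt:kev:mtx:journal',
  'rft.atitle'  => 'Introducing unAPI',
  'rft.jtitle'  => 'Ariadne',
  'rft.issue'   => '48'
}

kev   = citation.map { |k, v| "#{CGI.escape(k)}=#{CGI.escape(v)}" }.join('&')
coins = %Q{<span class="Z3988" title="#{kev}"></span>}
puts coins
```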

The Umlaut “launched” last Monday. I wouldn’t call it the most graceful of take-offs, but I think it’s pretty much working now.

We immediately ran into a problem with ProQuest as a source. ProQuest sends a query string that starts with an ampersand (“&ctx_ver=foo…”), which Rails strongly disliked. Thankfully, we were still in the semester break, so traffic was light. It gave me the opportunity to isolate and fix the myriad little bugs that could really only have been found by exposing the Umlaut to the wild and crazy OpenURLs that occur in nature. Ed Summers talked me into (and through) patching Rails so it didn’t choke on those ProQuest queries, although we later saw that the same patch was already in Rails Edge. It’s nice how hackable this framework is, though.
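
The underlying issue is easy to state: a leading ampersand means the first key/value pair is empty, and a strict parser falls over on it. A tolerant parse just skips the empty pairs, something like this standalone sketch (not the actual Rails patch):

```ruby
require 'cgi'

query = '&ctx_ver=Z39.88-2004&rft.genre=article'  # note the leading ampersand

# Split on '&', drop the empty pairs, and build a plain params hash.
params = query.split('&').reject { |pair| pair.empty? }.inject({}) do |h, pair|
  key, value = pair.split('=', 2)
  h[CGI.unescape(key)] = CGI.unescape(value.to_s)
  h
end

params  # => {"ctx_ver"=>"Z39.88-2004", "rft.genre"=>"article"}
```
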
There was also a bit of a gaffe when my trac project went ballistic after being crawled by Googlebot, bringing down my server (and with it all journal coverage and our Zebra index of the OPAC; this nice little denial of service just so happened to bring down the Umlaut, as well). As a result, the trac site is down until I can figure out how to keep it from sending my machine into a frenzy (although I’m in the process of moving the Zebra index to our Voyager server, as I type this, in fact, and the Umlaut will be able to handle its own journal coverages tomorrow, probably).

There was also a bit of a panic on Friday when the mongrel_cluster would throw 500 errors whenever it served a session from another server. Ugh. Little did I know that you can’t use PStore sessions with mongrel_cluster. I scaled back to one mongrel server over the weekend (which, of course, was overkill, considering the Umlaut was crapping out on the web services my desktop machine wasn’t providing to it anyway) while I migrated to ActiveRecord to store sessions. It seems to be working fine now, and we’re back up to 5 mongrel servers. Yeah, hadn’t even thought about that one…
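
For anyone hitting the same wall, the switch itself is small. Roughly (from memory of the Rails 1.x setup, so treat it as a sketch rather than gospel), it’s a one-line change in config/environment.rb plus the sessions table migration:

```ruby
# config/environment.rb -- store sessions in the database so every mongrel in
# the cluster sees the same data, instead of per-process PStore files.
Rails::Initializer.run do |config|
  config.action_controller.session_store = :active_record_store
end

# then, from the shell:
#   rake db:sessions:create
#   rake db:migrate
```
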
Yeah, it wasn’t perfect. But, all in all, there were no catastrophes, and it was nice having the safety of the class break. So, for the rest of the week, I can focus on getting our EAD finding aids searchable.

Joy.

Dan Chudnov “interviewed” me about the Ümlaut for the first “Library Geeks”. It’s crude and shows obvious signs of “learning as we’re going”, but if you want to hear more about the Ümlaut in my own voice and hear my favorite curse word (and boy is it a doozy), take a listen.

Note the “3-2-1” at around the 31st minute. It took us several attempts to get this recorded. The first session was lost after about 30 minutes of recording, courtesy of GarageBand crashing. We then lost the 2nd quarter, again thanks to GarageBand crashing. I had to do the 10 questions twice. Why? I’ll let you guess.

In other Ümlaut related news (is there any other right now?), I have the project page up. I’ll be adding to the HowUmlautWorks when I get free moments.

Also, thanks to Dan’s lead (lots of links to One Big Library today), I added the OCLC Resolver Registry to the Ümlaut. When you start a session, it checks your IP against the OCLC registry. If the registry returns a link resolver based on your IP, the Ümlaut will check to see if your resolver is SFX (I’ll be happy to add other XML-enabled resolvers) and, if so, include those holdings in the results, as well. If not, it includes a link to your resolver. If it doesn’t work for you, my guess is that OCLC has ILLiad (or whatever you use for Interlibrary Loan) first in your library’s profile.
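
In rough Ruby, the per-session flow looks something like this (the registry endpoint, the response element names, and the fetch_sfx_holdings helper are placeholders, not the real OCLC or SFX APIs):

```ruby
require 'net/http'
require 'uri'
require 'rexml/document'

REGISTRY_LOOKUP = 'http://registry.example.org/lookup'  # placeholder endpoint

# Ask the registry which link resolver serves this IP address.
def resolver_for(ip)
  xml = Net::HTTP.get(URI.parse("#{REGISTRY_LOOKUP}?IP=#{ip}"))
  doc = REXML::Document.new(xml)
  base   = doc.elements['//resolver/baseURL']  # placeholder element names
  vendor = doc.elements['//resolver/vendor']
  return nil unless base
  { :base_url => base.text, :vendor => vendor && vendor.text }
end

# Either fold the remote SFX holdings into our menu or just link out.
def remote_resolver_links(ip, context_object_kev)
  resolver = resolver_for(ip)
  return [] unless resolver
  if resolver[:vendor].to_s =~ /sfx/i
    fetch_sfx_holdings(resolver[:base_url], context_object_kev)  # hypothetical helper
  else
    ["#{resolver[:base_url]}?#{context_object_kev}"]
  end
end
```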

Here are some screenshots of this in action:
How this citation looks at Georgia Tech [pdf]
How this citation looks at Drexel [jpg]

How does it work for you?

Thanks to Gabriel Farrell for trying this at Drexel and supplying the screenshot (and congratulations on the new job!). It still needs some work (for instance, it needs to tell the user that the fulltext is courtesy of Drexel or wherever), but I think it’s a start.

We’re about a week away from launch, so I’ll be working on migrating to the production server. I’ll be sure to document it so others can partake if they desire.