I’ve mentioned several times in this space the OPAC redesign project that Art and I are working on. There hasn’t really been anything to show, to date, because it’s taken a very long time to actually get the data out of Voyager. There are probably easier and faster ways we could have done this, but we’ve been a little bogged down trying to get this to work in Art’s WebDAV environment. This has required sucking the data out of Oracle according to LCC, and that’s been no easy task. GovDocs are in a different hierarchy, based on SUDOC (CODOC, for Art).
In the meantime, I get emails from Art at 12:30 at night or 7:30 in the morning that say things like:
I am woefully weak on python but I know you have been working with python lately and I wondered if the approach I am using makes sense. I am persisting date modified information with a python shelve. So it looks like:
shelf[url] = last_modified
This seems to work wonderfully, but I needed to add:
import dumbdbm
for the shelf to have somewhere to put the info. What I think is supposed to happen is that the shelf command looks for some sort of database option and cycles through them all looking for storage. The “import dumbdbm” seems to be a way to add an option if no other is found. Have you ever tried anything like this? I wanted to use pickle/cpickle but a million links would probably throttle it.
… I, of course, have no idea what he’s talking about, but it’s flattering nonetheless that he thinks I might.
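For anyone who wants to see the pattern Art is describing, here is a minimal sketch. It assumes Python 3, where the old dumbdbm module lives on as dbm.dumb, and the helper names (open_shelf, needs_refresh) are made up for illustration; this is not Art's actual code. As far as I understand it, shelve normally picks whichever dbm backend happens to be available, so the sketch names the pure-Python one explicitly rather than relying on that lookup.

```python
import dbm.dumb
import shelve


def open_shelf(path):
    """Open a shelf backed explicitly by the pure-Python dbm.dumb store."""
    # Hypothetical helper: passing the database in ourselves skips the
    # automatic backend lookup that shelve.open() would otherwise perform.
    return shelve.Shelf(dbm.dumb.open(path, "c"))


def needs_refresh(path, url, last_modified):
    """Return True if the URL is new or its last-modified value changed.

    The new value is written to the shelf as a side effect, so the next
    call with the same value returns False.
    """
    with open_shelf(path) as shelf:
        changed = shelf.get(url) != last_modified
        if changed:
            shelf[url] = last_modified
        return changed
```

Because each entry is stored under its own key on disk, only the URL being checked is read or written, which is presumably why a shelf looked more attractive than pickling one giant dictionary of a million links.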
Anyway, last week I started actually working with PyLucene and our metadata mirror files (Art, meanwhile, is doing similar work with Cocoon/Lucene) and I came across what is possibly the most useful byproduct of this project. While I was preparing the logic to frbrize the mirror data, it struck me that it doesn’t have to be perfect, at first.
By separating the data from the ILS, we can create any kind of interface we want, indeed several, should we choose, without worrying about affecting the backend system at all. We can combine records, add metadata as necessary, remove it if it doesn’t work properly, tweak our search algorithms, and incorporate it into any sort of system we want, because it would have absolutely no effect on the ILS itself. We’ll still have the original “authority” should we mess anything up too badly and we’ll have all kinds of value that couldn’t (and probably shouldn’t) go in a “conventional opac”.
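To make that concrete, here is a rough sketch of the kind of deliberately imperfect grouping that becomes safe once the data lives outside the ILS. Everything in it is an assumption for illustration: the record fields (id, author, title), the function names, and the matching rule itself, which is far cruder than real FRBRization. The point is only that the grouping is built next to the mirrored records and never modifies them.

```python
import re
from collections import defaultdict


def work_key(record):
    """Build a crude 'same work' key from normalized author and title.

    Hypothetical rule: lowercase everything and collapse punctuation and
    whitespace, so "Moby-Dick." and "Moby Dick" end up with the same key.
    """
    def norm(text):
        return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

    return (norm(record.get("author", "")), norm(record.get("title", "")))


def group_into_works(records):
    """Map each work key to the ids of the mirrored records that share it.

    The input records are only read, never changed, so a bad grouping can
    be thrown away and rebuilt without touching the source data.
    """
    works = defaultdict(list)
    for record in records:
        works[work_key(record)].append(record["id"])
    return dict(works)
```

If the rule turns out to lump together things it shouldn't, the fix is to change work_key and rerun it over the mirror; the ILS never hears about it.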
This sort of abstraction from the “inventory control system” is such a basic programming principle that I have to wonder why no vendors implement it (even I, as an untrained hacker, understand the importance of this). It also abstracts the user interface from the catalogers a bit, which is an added bonus. Catalogers are great for many things, but designing user interfaces generally isn’t one of them.