Archive

philosophizing

I’ve been accused of several things in the Linked Data community this week:  being a circular reasoner, defending the status quo “just because that’s how we’ve always done it”, and (implicitly) being an httpRange-14 apologist.  Quite frankly, none of these is true or quite what I mean (and I’m, of course, over-dramatizing the accusations), but let’s focus on the last point for now (which may clear up some of the other points, as well).

Ed’s post (as he explains at the end) is a reference to me calling bullshit on his claim that “[he] think[s] httpRange-14 is an elaborate scholarly joke”.  Let me be clear from the outset that I am not particularly dogmatic on this issue.  That is, I don’t think the internet will break if the resource and its carrier are conflated, but I also don’t think it’s that hard to keep them separated, and I think the value in doing so outweighs any perceived costs.

First off, let me explain what httpRange-14 is to the uninitiated (skip on ahead if you feel pretty comfortable with this).  In linked data (or semantic web, you can choose the words that feel best to you), we run into a problem with identifiers and what, exactly, they are identifying.  Let’s say I want to talk about Chattanooga.  Well, “Chattanooga” is not a web resource, but if I want to talk about it unambiguously, it needs an identifier, preferably an HTTP URI, so other people can refer to it unambiguously, say things about it and discover it.  Ideally, that URI would also resolve to a web representation with human readable (HTML) and machine readable (RDF, XML, etc.) versions.  But the important distinction here is that the city of Chattanooga cannot be retrieved on the web, only these HTML, RDF, XML surrogates.  If the surrogate has the same URI (identifier) as the resource it’s describing, it starts to get difficult to figure out what we’re talking about.

So to try to make this a little clearer, let’s say I am making this representation of Chattanooga for people to use:

<http://dilettantes.code4lib.org/resources/Chattanooga_Tennessee.rdf>
    rdf:type <http://www.geonames.org/ontology#P.PPL> ;
    <http://www.geonames.org/ontology#population> "155554"^^xsd:integer.

But I also feel I need to let people know some administrative data about it, so they know when it was last modified and by whom, etc., so:

<http://dilettantes.code4lib.org/resources/Chattanooga_Tennessee.rdf>
    rdf:type <http://www.geonames.org/ontology#P.PPL> ;
    <http://www.geonames.org/ontology#population> "155554"^^xsd:integer ;
    dcterms:creator <http://dilettantes.code4lib.org/about#me> ;
    dcterms:created "2010-07-09"^^xsd:date ;
    dcterms:modified "2010-07-09T11:25:00-6"^^xsd:dateTime .

Now things get confusing.  My new assertions (dcterms:creator/created/modified) are being applied to the same resource as my city, so I am saying that I created a city of 155,554 people today (what have you done today, chump?).

The way we get around this is through a layer of indirection: basically, we just use two URIs.  You request an RDF document from http://dilettantes.code4lib.org/resources/Chattanooga_Tennessee.rdf and it has something like:

<http://dilettantes.code4lib.org/resources/Chattanooga_Tennessee#place>
    rdf:type <http://www.geonames.org/ontology#P.PPL> ;
    <http://www.geonames.org/ontology#population> "155554"^^xsd:integer.

<http://dilettantes.code4lib.org/resources/Chattanooga_Tennessee.rdf>
    rdf:type <http://xmlns.com/foaf/0.1/Document> ;
    <http://xmlns.com/foaf/0.1/primaryTopic> <http://dilettantes.code4lib.org/resources/Chattanooga_Tennessee#place> ;
    dcterms:creator <http://dilettantes.code4lib.org/about#me> ;
    dcterms:created "2010-07-09"^^xsd:date ;
    dcterms:modified "2010-07-09T11:25:00-6"^^xsd:dateTime .

And this keeps things a little clearer.  I created the document you’re looking at today, not the resource that the document is describing.  So this way when you say that my RDF is terrible (fair accusation) you’re not necessarily saying that about the city of Chattanooga (and vice versa).  You can read more about this at Cool URIs for the Semantic Web (by the way, I tend to favor the “hash URI” approach, for simplicity’s sake).

Now back to Ed’s post.  His argument is that if he uses http://en.wikipedia.org/wiki/William_Shakespeare as his identifier (referent, really), we should be smart enough to know that when we say this URI is a foaf:Person and that it was dcterms:created on “2001-10-14”, we’re referring to two different things.
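To spell out what that graph looks like, in the same style as the examples above (this is just my sketch of the position, not anything Ed has actually published; the date is presumably when the Wikipedia article was started, not anything about Shakespeare himself):

<http://en.wikipedia.org/wiki/William_Shakespeare>
    rdf:type <http://xmlns.com/foaf/0.1/Person> ;
    dcterms:created "2001-10-14"^^xsd:date .

One URI, two very different things being described, and the consumer is left to sort out which statement applies to which.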

The first comment is from Ian (full disclosure: my boss, fuller disclosure: this doesn’t mean I agree with him) who simultaneously “completely agrees” with Ed and yet supplies an argument that punches a gigantic hole in the side of Ed’s thesis.

To put it another way, sure, maybe we can tell that dcterms:created is a strange assertion for a foaf:Person, and we have other ways to tell that Shakespeare was born in 1564 (via a bio:Birth resource or something), but this breaks down for books and all sorts of other entities.  So you have dcterms:created "2003-09-04" and dcterms:creator <http://en.wikipedia.org/wiki/Douglas_Coupland> on http://en.wikipedia.org/wiki/Girlfriend_in_a_Coma_%28novel%29, and we’ve now sown some confusion.  This ambiguity becomes more problematic down the road when the context changes (that is, assumptions I can make about Wikipedia and Wikipedia’s model don’t necessarily apply elsewhere).
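Written out as triples (a hypothetical sketch, not anything actually published), the individual statements look fine but the combination is nonsense:

<http://en.wikipedia.org/wiki/Girlfriend_in_a_Coma_%28novel%29>
    dcterms:creator <http://en.wikipedia.org/wiki/Douglas_Coupland> ;
    dcterms:created "2003-09-04"^^xsd:date .

Read as statements about the novel, Coupland wrote a book that sprang into existence in 2003 (it was published in 1998); read as statements about the wiki article, Coupland apparently wrote the article.  Neither is what anybody means.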

Right around the time I graduated from high school, the guitarist in my band at the time made me a cassette copy of Jimi Hendrix’s “Jimi Plays Monterey”.  The sound quality was pretty terrible and, as I recall, my tape player ate it once, making it even worse.  Still, I loved that album (Jimi, while playing Dylan’s “Like a Rolling Stone”, says “I know I missed a verse, it’s alright, baby.”): I love the songs, I love the playing, I love the energy of the performance.  The medium that album came to me on, however, was subpar.  There are general attributes of “cassette tapes” and then there was “this particular recording on this particular cassette”.

At the same time in my life, I had a compact disc of the BulletBoys’ eponymous album.  Fidelity-wise, the sound of this album was orders of magnitude better than my copy of “Jimi Plays Monterey”, but pretty much everything else about it sucked.

The carrier is not the content.  Being able to refer to the quality of my dilapidated cassette without dragging the Jimi Hendrix Experience into it is useful.  I should be able to say that my BulletBoys CD sounded better than my Hendrix tape without that being a staggering example of bad taste.

In libraries, we have a long history of data ambiguity.  We have struggled enough to figure out the semantics in our AACR2/ISBD data that when we have the chance to easily and concretely identify the things we are talking about, we should take it.  I am not proposing abstracting things into oblivion with resources on top of resources – just sensibly being sure you’re talking about what you say you are.

Unfortunately, one of my problems with the new RDA vocabularies is that in several instances they schmush multiple statements together to avoid modeling the “hard parts” (this is precisely the same issue I have with Ian’s later comment).  For example, RDA has a bunch of properties that are intended to “hand wave” around the complexities of FRBR, such as http://RDVocab.info/Elements/otherDistinguishingCharacteristicOfTheExpression or http://RDVocab.info/Elements/titleOfTheWork.  So you’d have something like:

<http://example.org/1>
    <http://RDVocab.info/Elements/title> "Something: a something something" ;
    <http://RDVocab.info/Elements/titleOfTheWork> "Something" .

What you’ve done here with “titleOfTheWork” is say that <http://example.org/1> has a work, is itself not a work, and that the work’s title is “Something”.   That’s some attribute!  But if we can say all of that, why would we not just model the work?!  Even if we don’t know where in the WEMI chain <http://example.org/1> falls, if we did something like this:

<http://example.org/1>
    dcterms:title "Something: a something something" ;
    ex:hasWork <http://example.org/works/1234> .

<http://example.org/works/1234>
    a <http://RDVocab.info/uri/schema/FRBRentitiesRDA/Work>;
    dcterms:title "Something" .

we’ve now done something useful, unambiguous and reusable (and not ignoring FRBR while simultaneously defining it).  The closed nature of IFLA’s development of these vocabularies doesn’t give me much hope, though.

But, again, back to Ed.  Like I said, I really don’t think the internet will fall apart and satellites will come crashing to the earth if we don’t adhere consistently to httpRange-14.  No, the reason I call bullshit on Ed’s statement is that he finds the use of owl:sameAs on resources such as http://purl.org/NET/marccodes/muscomp/sn#genre to be inappropriate.  In his post he claims it’s fine to conflate the resource of William Shakespeare as both a foaf:Person and a foaf:Document that was modified on “2010-06-28T17:02:41-04:00”; on the other hand, he questions the appropriateness of <http://purl.org/NET/marccodes/muscomp/sn#genre> owl:sameAs <http://dbpedia.org/resource/Sonatas>, because that assertion implies that <http://purl.org/NET/marccodes/muscomp/sn#genre> has a photo collection at <http://www4.wiwiss.fu-berlin.de/flickrwrappr/photos/Sonata> (which, in fact, has little to do with the musical genre and actually has a lot of pictures of Hyundais, among other things).
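To make the inference concrete, here’s a rough sketch (foaf:depiction is standing in for whatever property the flickr wrappr actually uses to attach its photo collections, so treat the specific predicate as illustrative):

<http://purl.org/NET/marccodes/muscomp/sn#genre>
    owl:sameAs <http://dbpedia.org/resource/Sonatas> .

<http://dbpedia.org/resource/Sonatas>
    <http://xmlns.com/foaf/0.1/depiction> <http://www4.wiwiss.fu-berlin.de/flickrwrappr/photos/Sonata> .

# owl:sameAs says the two URIs denote the very same thing, so a reasoner
# is entitled to conclude:
<http://purl.org/NET/marccodes/muscomp/sn#genre>
    <http://xmlns.com/foaf/0.1/depiction> <http://www4.wiwiss.fu-berlin.de/flickrwrappr/photos/Sonata> .

And now a MARC genre code is “depicted” by a pile of photos of Hyundais.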

This is a perfectly fair, valid and important point (and one that absolutely needs to be addressed), but doesn’t this also mean he actually cares that we say what we really mean?

I had the opportunity to attend and present at the excellent ELAG conference last week in Bratislava, Slovakia.  The event was advertised as being somewhat of a European Code4Lib, but in reality, the format seemed to me to be more in line with Access, which in my mind is a plus.

Being the ugly American that I am, I made a series of provocative statements, both in my presentation and in the Twitter “back channel” (or whatever they call hash tagging an event), about vendors and library standards, displaying a seeming disdain for both.  I feel like I should probably clarify my position here a bit, since Twitter is a terrible medium for in-depth communication and I didn’t go into much detail in my presentation (outside of saying vendor development teams were populated by scallywags and ne’er-do-wells from previous gigs in finance, communications and publishing).

Here is the point I was angling towards in my presentation:  your Z39.50 implementation is never going to get any better than it was in 2001.  Outside of critical bug fixes, I would wager the Z39.50 implementation has not even been touched since it was introduced, never mind improved.  The reason for this is my above “joke” about the development teams being staffed by people that do not have a library background.  They are literally just ignoring the Z-server and praying that nothing breaks in unit and regression testing.  There are only a handful of people that understand how Z39.50 works, and they are all employed by Index Data.  For everybody else, it’s just voodoo that was there when they got here, but is a requirement for each patch and release.

Thing is, even as hardware gets faster and ILSes (theoretically) get more sophisticated, the Z-server just gets worse.  You would think that if this is the most common and consistent mechanism for getting data out of ILSes, we would have seen some improvement in implementations as the need for better interoperability increases, but this is just not a reality that I have witnessed.  The last two ILSes that I primarily worked with (Voyager and Unicorn) I would routinely (and accidentally) bring down completely by trying to use the Z39.50 server as a data source in applications.  For the Umlaut, I had to export the Voyager bib database into an external Zebra index to prevent the ILS from crashing multiple times a day just to look up incoming OpenURL requests.  Let me note that the vast majority of these lookups were just ISSN or ISBN.  Unsurprisingly, the Zebra index held up with no problems.  It’s still working, in fact.

Talis uses Zebra for Alto.  It’s probably the main reason we can check off “SRU Support” in an RFP when practically nobody else can.  But, again, this means the Z/SRU-server is sort of “outside” the development plan, delegated to Index Data.  Our SRU servers technically aren’t even conformant to the spec, since we don’t serve explain documents.  I’m not sure anybody at Talis was even aware of this until I pointed it out last year.

All of this is not intended to demonize vendors (really!) or bite the hand that feeds me.  It’s also not intended to denigrate library standards.  I’m merely trying to be pragmatic and, more importantly, I’m hoping we can make library development a less frustrating and backwards exercise for all parties (even the cads and scallywags).

My point is that initiatives like the DLF ILS-DI, on paper, make a lot of sense.  I completely understand why they chose to implement their model using a handful of library standards (OAI-PMH, SRU).  The standards are there, why not use them?  The problem is in the reality of the situation.  If the specification “requires” SRU for search, how many vendors do you think will just slap Yaz Proxy in front of their existing (shaky, flaky) Z39.50 server and call it a day?  The OAI-PMH provider should be pretty trivial, but I would not expect any company to provide anything innovative with regards to sets or different metadata formats.

As long as libraries are not going to be writing the software they use themselves, they need to reconcile themselves to the fact that their software is more than likely not going to be written by librarians or library technologists.  If this is the case, what’s the better alternative?  Clinging to half-assed implementations of our incredibly niche standards?  Or figuring out what technologies are developing outside of the library realm that could be used to deliver our data and services?  Is there really, honestly, no way we could figure out how to use OpenSearch to do the things we expect SRU to do?

I realize I have an axe to grind here, but this isn’t really about Jangle.

I have seen OpenURL bandied about as a “solution” to problems outside of its current primary use of “retrieving context based services from scholarly citations” (I know this is not OpenURL’s sole use case, but it’s all it’s being used for.  Period).  The most recent example of this was in a workshop (that I didn’t participate in) at ELAG about how libraries could share social data, such as tagging, reviews, etc., in order to create the economies of scale needed to make these concepts work satisfactorily.  Since they needed a way to “identify” things in their collection (books, journals, articles, maps, etc.), somebody had the (understandable, re: DLF) idea to use OpenURL as the identifier mechanism.

I realize that I have been accused of being “allergic” to OpenURL, but, in general, my advice is that if you have a problem and you think OpenURL is the answer to said problem, there’s probably a simpler and better answer if you approach it from outside of a library POV.

The drawbacks of Z39.88 for this scenario are numerous, but I didn’t go into detail with my criticisms on Twitter.  Here are a few reasons why I would recommend against OpenURL for this (and they are not exclusive to this potential application):

  1. OpenURL context objects are not identifiers.  They are a means to describe a resource, not identify it.  A context object may contain an identifier in its description.  Use that, scrap the rest of it.
  2. Because a context object is a description and not an identifier, it would have to be parsed to try to figure out what exactly it is describing.  This is incredibly expensive, error prone and more sophisticated than necessary.
  3. It was not entirely clear how the context objects would be used in this scenario.  Would they just be embedded in, say, an XML document as a clue as to what is being tagged or reviewed?  Or would the consuming service actually be an OpenURL resolver that took these context objects and returned some sort of response?  If it’s the former, what would the base URI be?  If it’s the latter… well, there’s a lot there, but let’s start simple: what sort of response would it return?
  4. There is no current infrastructure defined in OpenURL for these sorts of requests.  While there are metadata formats that could handle journals, articles, books, etc., it seems as though this would just scratch the surface of what would need context objects (music, maps, archival collections, films, etc.).  There are no ‘service types’ defined for this kind of usage (tags, reviews, etc.). The process for adding metadata formats or community profiles is not nimble, which would make it prohibitively difficult to add new functionality when the need arises.
  5. Such an initiative would have to expect to interoperate with non-library sources.  Libraries, even banding together, are not going to have the scale or attraction of LibraryThing, Freebase, IMDB, Amazon, etc.  It is not unreasonable to say that an expectation that any of these services would really adopt OpenURL to share data is naive and a waste of time and energy.
  6. There’s already a way to share this data, called SIOC.  What we should be working towards, rather than pursuing OpenURL, is designing a URI structure for these sorts of resources in a service like this (see the sketch just after this list).  Hell, I could even be talked into info URIs over OpenURLs for this.
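To give a rough idea of what I mean, here’s a hand-waving sketch of a shared review using a couple of SIOC and Dublin Core terms (the URIs are made up and the exact choice of properties is illustrative, not a worked-out profile):

<http://example.org/reviews/42>
    rdf:type <http://rdfs.org/sioc/ns#Post> ;
    <http://rdfs.org/sioc/ns#has_creator> <http://example.org/users/ross> ;
    <http://rdfs.org/sioc/ns#content> "Couldn't put it down." ;
    dcterms:subject <http://example.org/books/girlfriend-in-a-coma> .

The thing being reviewed is just a URI; any consuming service can follow it (or owl:sameAs it to something else) without having to parse a context object to figure out what is being talked about.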

We could further isolate ourselves by insisting on using our standards.  Navel gaze, keep the data consistent and standard.  To me, however, it makes more sense to figure out how to bridge this gap.  After all, the real prize here is to be able to augment our highly structured metadata with the messy, unstructured web.  A web that isn’t going to fiddle around with OpenURL.  Or Z39.50.  Or NCIP.  I have a feeling the same is ultimately true with our vendors.

There comes a point that we have to ask if our relentless commitment to library-specific standards (in cases when there are viable alternatives) is actually causing more harm than help.

I am not a programmer.

Since I first began writing code, my approach to learning a new language has been to take something that does the sort of thing I am looking for and start fiddling, seeing the results of the fiddling (most likely through some sort of error message) and refiddle until I start seeing the outcome I was looking for.  In mechanical terms, I open the hood, get out my biggest wrench and start whacking at things until the noises stop.  Or, in this case, start.

The arc of languages I primarily worked in at any given time is a pretty good reflection of this approach:  Perl, TCL, PHP, then Ruby with a small foray into Python.  All dynamic, all extremely whackable.  Where whacking doesn’t work, Google (or, previously, Yahoo or, previously, Alta Vista) generally does.  Cut, paste and resume whacking.

The same philosophy applies when it comes to developing new projects.  I know, basically, what I want on the other side, but I have absolutely no idea what it will take to get there.  Generally this means I’ll pick up the nearest tool on hand (usually a red colored wrench) and start whacking until I see what I want.  That the red wrench isn’t the right tool for the job isn’t the point, since I’m only looking for the destination, not the best route there (since I have no idea how to get there in the first place).  The more comfortable I get with a tool, the more likely I am to nest with it, since the detour of finding (and learning how to use) another tool slows me down from reaching the goal.

The perfect example of this was WAG the Dog.  PHP was a ridiculous language to try to use for it, but ping, ping, ping, ping, it worked!

So it stands to reason that I’ve never really taken to Java.  Java is not whacking.  Java is slowly, verbosely and deliberately constructing the specific parts you need to accomplish your goal.  Java is a toolbox full of parts and pieces I do not know the names of, what they do or how they would even do anything, much less the job I’m trying to accomplish.  Java is to my career what a Masters in Mechanical Engineering is to a wrench.  I don’t use Java because I don’t even know the questions to ask to get started in the right direction.

The irony is that when I was hired by Talis, I was ‘assigned’ (that’s a stronger term than really applies) to an entirely Java-based project, Keystone.  To this day, some 15 months later, I have contributed exactly 0.0 lines of code towards Keystone.

I am not a programmer.

However, I am a tinkerer.

In an effort to eat our own dogfood, I had begun to write a Jangle connector for our library management system, Alto.  Alto is built on Sybase and we already had a RESTful SOA interface, the aforementioned Keystone.  It would have been logical for me, were I a real programmer, to take Keystone, add some new models and routes and call it a connector.

But that’s not how I roll.

Initially, I took to using the JangleR Ruby framework to build the Alto connector, since all it would require is to query the appropriate tables and ping, ping, ping, ping things until JRuby and Warbler could give me a .war file.

Sybase, however, does not take well to whacking.  Especially from Ruby.  ActiveRecord-JDBC didn’t work.  Not sure if it was our particular schema or JDBC setup or just ActiveRecord, but no dice.  I couldn’t get Ribs to work at all, which is just as well, probably.  Finally, I had success just using java.sql objects directly in JRuby, but, since I really didn’t know what I was doing, I started worrying about connection pooling and leaving connections open and whatnot.  No need to show off the fact that I have no idea what I’m doing by gobbling up all the resources on some customer’s Alto server.

At one point, on a lark, I decided to try out Grails, Groovy‘s web framework inspired by Rails, to see if I could have more luck interacting with Sybase.  My rationale was, “Keystone uses Hibernate, GORM (Grails’ ORM) uses Hibernate, maybe it will work for me!”.  And, it did.

So here I am, one week into using Groovy.  Just like I used Rails as an introduction to Ruby, Grails serves that purpose with Groovy pretty well.  I can’t say I love the language, but that’s purely my bias; anything that isn’t Ruby, well, isn’t Ruby.  I am certainly doing some things inefficiently, since I am learning the language as I go along.  The fact that there just isn’t much documentation (and the existing documentation isn’t all that great) doesn’t help.

For example, none of my GORM associations work.  I have no idea why.  It could very well be the legacy Sybase schema, or I might be doing something wrong in my Domain Model class.  I don’t have any idea, and I don’t know where to look for either an appropriate error or a fix.  It’s not a huge issue, though, and so far I’ve just worked around it by writing methods that do roughly what I would have needed the associations to do.  Ping, ping, ping.

I also cannot make Services work the way they show in the documentation.  My controllers can’t find them when I do it like the docs show, and my domain models can’t find them either…  But it’s no big deal.  I set my methods to be static, call the class directly, and everything works fine.  I’m not doing anything remotely sophisticated with them, so I can keep my hacks for now.

Being able to dynamically load Java classes and iterate over things with a call like foo.each { bar = it.baz } is pretty nice.  I am really amazed at what Groovy offers when it comes to working with Java classes; it’s like being able to audit those M.E. Master’s classes.  I am learning a considerable amount about Java by being able to whack away at objects and classes within the Groovy shell.

I’m not sure that Groovy was really intended for people like me, however.  All of the documentation and even some of the syntax seem to have the expectation that you are a Java developer looking for something dynamic.  It reminds me of a Perl developer going to PHP.  They are syntactically and functionally similar.  In many ways, a Perl developer will find PHP an easier language to use to get a simple, running web application.  And they always have the fallback of Perl, if they need it.  A Python developer that has to use PHP will probably curse a lot.  Groovy seems to have the same sort of relationship to Java.  A Java developer would probably immediately feel comfortable and find it amazingly easy to get up and running.  A Ruby developer (well, this Ruby developer) finds it a little more alien.

Groovy doesn’t have a tremendous amount of native language support for specific tasks, relying instead on the vast number of Java libraries out there to do, basically, anything.  This makes perfect sense and I don’t fault Groovy in the slightest for this choice, but relying on Java for functionality means factories and stream buffers and all the other things Java consists of.  Java developers would feel at home.  I find it takes some getting used to.

Also needing to declare your variables.

And I’m sure I’m not really using closures to their fullest potential.

Overall, it’s nice to have this new addition to my toolbox.  Groovy is definitely whackable and development for the Jangle connector has been remarkably fast.  I expect the Aspire (née List) and Prism teams to have something to try out by the end of the month.  And for basically being Java, that ain’t bad.

When and if I rewrite the Alto connector, I’ll probably opt for GroovyRestlet over Grails, but I definitely couldn’t have gotten where I have at this point without the convention and community of Grails.  It’s a really good starting point.

Of course, none of this would have been necessary if it wasn’t for Sybase.  Consider this day one of my campaign to eventually migrate Talis Alto from Sybase to PostgreSQL.  Ping, ping, ping.

I have been following a thread on the VuFind-Tech list regarding the project’s endorsement of Jangle to provide the basis of the ILS plugin architecture for that project.  It’s not an explicit mandate, just a pragmatic decision that if work is going into creating a plugin for VuFind, it would make more sense (from an open source economics point of view) if that plugin was useful to more projects than just VuFind.  More users, more interest, more community, more support.

The skepticism of Jangle is understandable and expected.  After all, it’s a very unorthodox approach to library data, seemingly eschewing other library initiatives and, on the surface, seeming to be wholly funded by a single vendor’s support.

And, certainly, Jangle may fail.  Just like any other project.  Just like VuFind.  Just like Evergreen.  Any new innovative project brings risk.  More important than the direct reward of any of these initiatives succeeding is the disruption they bring to the status quo.  Instead of what they directly bring to the table, what do they change about how we view the world?

Let’s start with Evergreen.  Five years ago I sat in a conference room at Emory’s main library while Brad LaJeunesse and Jason Etheridge (this predated PINES hiring Mike Rylander and Bill Erickson) told us that they were ditching Unicorn and building their own system.  I, like the others in the room (Selden Deemer, Martin Halbert), smiled and nodded, and when they left I (Mr. Library Technology Pollyanna) turned to the others and said that I liked their moxie, but it was never going to work.  Koha was the only precedent at the time, and, frankly, it seemed like a toy.

Now where are we?  Most of the public libraries in Georgia are using Evergreen, a large contingent from British Columbia is migrating, and a handful of academic libraries are either live or working towards migration.  Well, I sure was wrong.

The more significant repercussion of PINES going live with Evergreen was that it cast into doubt our assumptions of how our relationship with our integrated library system needed to work.  Rather than the library waiting for their vendor to provide whatever functionality they need or want, they can, instead, implement it themselves.  While it’s unrealistic for every library to migrate to Evergreen or Koha, these projects have brought to light the lack of transparency and cooperation in the ILS marketplace.

Similarly, projects like VuFind, Blacklight and fac-back-opac prove that by pulling together some off-the-shelf non-library-specific applications and cleverly using existing web services (like covers from Amazon), we can cheaply and quickly create the kinds of interfaces we have been begging from our vendors for years.  It is unlikely that all of these initiatives will succeed, and the casualties will more likely be the result of the technology stack they are built upon than any lack of functionality, but the fact that they all appeared around the same time and answer roughly the same question shows that we can pool our resources and build some pretty neat things.

To be fair, the real risk taker in this arena was NC State.  They spent the money on Endeca and rolled out the interface that wound up changing the way we looked at the OPAC.  The reward of NCSU’s entrepreneurialism is that we now have projects like VuFind and its ilk.  Very few libraries can afford to be directly rewarded by NC State’s catalog implementation, but with every library that signs on with Encore or Primo, III and Ex Libris owe that sale to a handful of people in Raleigh.  You would not be able to download and play with VuFind if NC State libraries had worried too much about failure.

Which then brings me to Jangle.  The decision to build the spec on the Atom Publishing Protocol has definitely been the single biggest criticism of the project (once we removed the confusing, outdated wiki pages about Jangle being a Rails application), but there has been little dialogue as to why it wouldn’t work (actually, none).  The purpose of Jangle is to provide an API for roughly 95% of your local development needs with regards to your library services.  There will be edge cases, for sure, and Jangle might not cover them.  At this point, it’s hard to tell.  What is easier to tell, however, is that dwelling on the edge cases does absolutely nothing to address the majority of needs.  Also, the edge cases are mainly library-internal-specific problems (like circulation rules).  A campus or municipal IT person doesn’t particularly care about these specifics when trying to integrate the library into courseware or some e-government portal.  They just want a simple way to get the data.

This doesn’t mean that Jangle is solely relegated to simple tasks, however.  It is just capable of scaling down to simple use cases.  And that’s where I hope Jangle causes disruption, whether or not it is ultimately the technology that succeeds.  By leveraging popular non-library-specific web standards it will make the job of the systems librarian or the external developer easier, whether it’s via AtomPub or some other commonly deployed protocol.

I was reading Brian’s appeal for more Emerils in the library world (bam!), noticed Steven Bell’s comment (his blog posting was a response to one by Steven in the first place) and it got me thinking.

First off, I don’t necessarily buy into Brian’s argument.  Maybe it’s due to the fact that he’s younger than me, but my noisy, unwanted opinions aren’t because I didn’t get a pretty enough pony for my sixteenth birthday or because I saw Jason Kidd’s house on Cribs ™ and want to see my slam dunk highlights on SportsCenter on my 40″ flat screens in every bathroom.  It’s because I feel I have something to offer libraries and I genuinely want to help effect change.  Really, I know this is what motivates Brian, too, despite his E! Network thesis, because we worked together and I know his ideas.

Brian doesn’t have to worry about his fifteen minutes coming to a close anytime soon.  Although at first blush it would appear that the niche he has carved out for himself is potentially flash-in-the-pan-y (Facebook, Second Life, library gaming, other Library 2.0 conceits), the motivation for why he does what he does is anything but.  He is really just trying to meet users where they are, on their terms, to help them with their library experience.

Technologies will change and so, too, will Brian, but that’s not the point.  He’ll adapt and adjust his methods to best suit what comes down the pike, as it comes down the pike (proactively, rather than reactively) and continue to be a vanguard in engaging users on their own turf.  More importantly, though, I think he can continue to be a voice in libraries because he works in a library and if you have some creative initiative it’s very easy to stand out and make yourself heard.

Brian and I used to joke about the library rock star lifestyle:  articles, accolades, speaking gigs, etc.  A lot of this comes pretty easily, however.  If you can articulate some rational ideas and show a little something to back those ideas up, you can quickly make a name for yourself.  Information science wants visionary people (regardless of whether or not anyone actually follows them) and librarians want to hear new ideas for how to solve old problems.  Being a rock star is pretty easy; being a revolutionary is considerably harder.

I made the jump from library to vendor because I wanted to see my ideas affect a larger radius than what I could do at a single node.  It has been an interesting adjustment and I’m definitely still trying to find my footing.  It has been much, much more difficult to stand out because I am suddenly surrounded by a bunch of people that are much smarter than me, much better developers than me, and have more experience applying technology on a large scale.  This is not to say that I haven’t worked with brilliant people in libraries (certainly I have, Brian among them), but the ratio has never been quite like this.  Add to that the fact that being a noisy, opinionated voice within a vendor has its immediate share of skeptics and cynics (who are the ‘rock stars’ in the vendor community?  Stephen Abram?  Shoot me.), and I may find myself falling into Steven Bell’s dustbin.  Then again, I might be able to eventually influence the sorts of changes that inspired me to make the leap in the first place.  I can do without the stardom in that case.

Can anyone give a rational explanation as to why a job with a description like this:

DESCRIPTION: Provides technology and computer support for the Vanderbilt Library. The major areas of responsibility include developing, maintaining and assisting in the enhancement of interfaces to web-enabled database applications (currently implemented in perl, PHP, and MySQL). The position also helps establish and maintain guidelines (coding standards, version control, etc.) for the development of new applications in support of library patrons, staff, and faculty across the university. This position will also provide first line backup for Unix system administration. Other duties and assignments will be negotiated based on the successful candidate’s expertise, team needs, and library priorities.

would require an MLS? Library experience? Sure, I can see why that would be desirable. While I find it ridiculous when many libraries require an MLS for what is essentially an IT manager, Vandy is upping the ante here and requiring it for a developer/jr. sysadmin.

I guess that’s a way to prop up the profession.

If YPOW, like MPOW, is an Endeavor Voyager site, you’ve got some decisions ahead. Francisco Partners, naturally, would like you to migrate to Aleph, and I have no doubt that Ex Libris is, as I write this, busily working on a means to make it easy for Voyager libraries to do so. But ILS migrations are painful, no matter how easy the backend process might be. There’s staff training, user training, managing new workflows, site integration; lots of things to deal with. Also, the new functionality may not map 1:1 to what you currently have. How do you work around services you depended upon?

Since our contracts with Endeavor Information Systems will soon be next to worthless, I propose, Voyager customers, that we take ownership of our systems. For the price of a full Oracle (or SQL Server? — does Voyager support other RDBMSes?) license (many of us already have this), we can get write permissions to our DB and make our own interfaces. We wouldn’t need to worry about staff clients (for now), since we already have cataloging, circulation, acquisitions, etc. modules that work. When we’re ready for different functionality, however, we can create new middleware (in fact, I’m planning to break ground on this in the next two weeks) to allow for web clients or, even better, piggyback on Evergreen’s staff clients and let somebody else do the hard work. If we had native clients in the new middleware, a library could use any database backend they wanted (just migrate the data from Oracle into something else). The key is write access to the database.

By taking ownership of our ILS, we can push the developments we want, such as NCIP, a ‘Next Gen OPAC’, better link resolver integration, better metasearch integration, etc., without the pain of starting all over again (with potentially the same results; who is to say that whatever you choose as an ILS wouldn’t eventually get bought and killed off, as well?). Putting my money (or lack thereof) where my mouth is, I plan on migrating Fancy Pants to use such a backend (read-only db access for now; we still have a support contract, after all). I’m calling this project ‘Bon Voyage’. After reading Birkin’s post on CODE4LIB, I would like to make a similar service for Voyager that would basically take the place of the Z39.50 server and access to the database. Fancy Pants wouldn’t be integrated into Bon Voyage, it would just be another client (since it was always only meant as a stopgap, anyway).

What we’ll have is a framework for getting at the database backend (it’d be safe to say this will be a Rails project) with APIs to access bib, item, patron, etc. information. Once the models are created, it will be relatively simple to transition to ‘write’ access when that becomes necessary. Making a replacement for WebVoyage would be fairly trivial once the architecture is in place. Web-based staff clients would also be fairly simple. I think EG staff client integration wouldn’t be too hard, since it would just be an issue of outputting our data to something the EG clients want (JSON, I believe) and translating the client’s response. That would need to be investigated more, however (I’m on paternity leave and not doing things like that right now 🙂).

Would anybody find this useful?
It seems the money we spend on an ILS could be better spent elsewhere. I don’t think this would be a product we could distribute outside of the current Voyager customer base (at least, not until it was completely native… maybe not even then; we’d have to work this out with Francisco Partners, I guess), but I think that customer base is big enough to be sustainable on its own.

Library Geeks #4 is out. I’m back (instead of being the ‘Poltergeek’ of episode 3), and Dan set this one up in kind of a neat way. He certainly leads and guides the dialogue, but it’s much more of a roundtable and informal discussion. No doubt this is largely due to the fact we’re all pretty good friends. Still, I learned a lot about Ed that I didn’t know — pretty amazing since we hang out every day and work on so much together.

Bobby McFerrin, eat your heart out.

I have been thinking a lot this week about how libraries are skirting ever closer to a precipice they generally refuse to acknowledge.

While certainly a cynic, I’m not generally a Cassandra, so before proclaiming last rites on the library, I wanted to make sure I had thought about this some. This was probably spurred on by the “Murder MARC” thread on NGC4Lib.

And I got to thinking about travel agents.

Fifteen or twenty years ago, it was nigh onto impossible to book a trip anywhere without a travel agent. My mother worked part-time for a travel agent; travel agents were everywhere. It seemed as much a part of travelling as a real estate agent seems a part of home buying today. It’s technically possible to work without one, but life is going to be a whole lot easier if you don’t. Travel agents held the keys to your vacation; you were at their mercy as to where, when and how much your vacation was to be.

Then, sometime during the dot-com blitz, up popped sites such as Expedia, Travelocity and Orbitz. Suddenly, the consumer controlled his or her own travel. The traveller could waste as much time as he or she wanted looking for the perfect getaway; setting up fare watches to places they might randomly want to go; researching the hotel on the receiving end. All this without the hassle of somebody trying to arrange the trip for them. Sure, you might not have found the best rate or gotten a free meal thrown in at your destination, but the merits outweighed the costs. You began to know fewer and fewer travel agents.

For a long time librarians viewed Amazon.com and the other online booksellers as their primary competition. It made sense; both parties were hawking roughly the same physical wares. For a very long time, booksellers and libraries have coexisted. There was an understanding of the differences between the two models and, despite attempts to make the library catalog act more like Amazon, the market for each seemed fairly well intact.

Search engines, however, were happy to freely give away information indiscriminately. As the Googles and Yahoos indexed more and more content and the information grew freer and freer, libraries began to seem more antiquated and backwards in comparison. With more and more information available — preprints, Open Access, book digitization efforts — the need for specific content began to wane. Users began to lose interest in working with arcane brokers of information when the information they were finding on their own met their needs. Librarians, in response, chastised the users’ choices and maintained that a steady diet of hard work and piety was the only way to achieve scholarly salvation. Dogmatic adherence to preservation of metadata took precedence over improving the user’s ability to navigate and discover. After all, we know better than the user how to find the best fares, right?

I’m not a Cassandra, I’m really not. Disruptive technologies can lead to revolutionary changes, after all.

However, these thoughts came to a boil today. I have gotten a lot of criticism about the way I rely on the search engines to provide a lot of functionality in the Umlaut. How, despite the money we pay for our licensed resources, I still prominently display free web search results.

The fact of the matter is that it’s impossible for me to include our precious resources in anything useful, while including Google, Yahoo and Amazon was a piece of cake. Without considerable work on my part and the librarians’, the best I can hope to give our users from our collection is a generic link to all of our databases or a list of subject guides. With the search engines I can at least narrow down the information enough to be somewhat useful. And if it’s not? They don’t have to use it. There’s a link to the library, and that’s one link away from the databases or subject guides.

At this point it’s important for us to empower the user; not with arcane searching techniques in hard to find resources, but by leveraging our systems so they can be integrated easily into wherever the user is looking and by exposing our content and services in ways that fit into the user’s sphere of control and comfort. If we don’t, the user will still find information that is ‘good enough’ for their needs and more will be added every day. Then, we’ll not only have lost our control, but our relevance as well.

Every library I have worked at has had an uneasy caste system between the faculty and staff. While I understand this to an extent, the delineation is used without rhyme or reason much of the time. The implication is that the librarians are treated as “career professionals” and the staff is merely “the help” (more on this in a minute).

I was pleasantly surprised to see the University of Iowa waive the MLS requirement for their current “Director of Information Technology” job posting (MS Word Document).

The University of Iowa Libraries seeks a creative, experienced professional to lead our information technology (IT) operations. Building on the Libraries’ current capabilities, the Director will provide innovative leadership in the use of technology to deliver information resources and services to the Libraries’ user communities. The Director for Library Information Technology reports directly to the University Librarian and is a member of the Libraries’ Executive Council contributing to overall strategic planning, program development and evaluation, and the allocation of resources in support of the Libraries’ mission.

This is a senior administrative position responsible for IT planning, the development of system-wide policies and procedures, and the coordination of information technology activities throughout the library system. The Director will supervise a department of 11 staff responsible for desktop support and technical training, systems administration and security/rights management, and applications development. Collaborative and advocacy activities with other library administrators and staff as well as members of the IT community on campus, the state, nationally and internationally are key responsibilities of this position. This Director serves as the Libraries’ liaison to CNI, EDUCAUSE, and similar organizations.

Qualifications

Required:

  • Bachelor’s degree

  • Minimum of 9 years of library information technology experience in a university environment

  • Demonstrated knowledge of current trends and best practices in the application of information technology in research libraries and higher education

  • Demonstrated experience promoting and working effectively in a diverse environment

  • Evidence of highly effective interpersonal and communication skills.

  • Evidence of analytical and creative problem-solving skills

  • Library-wide perspective and ability to contribute to planning and system-wide administration of the Libraries

  • Record of active participation in national pertinent professional associations

While I meet the “minimum requirements” for this position, I am not fully qualified… somewhere in this job’s requirements (maybe not actually written in it) is a healthy desire to live in or near Iowa City (which I do not possess). Still, this is very, very progressive for a school of UIowa’s size. They are indicating that if you have spent 9 years of your life working in libraries and have shown the initiative to participate professionally and whatnot, you’re the sort of person they want on their team. The idea being that the MLS is not a terribly good indicator of skills and (for this sort of position, anyway) may actually limit the pool of good potential candidates they may get.

When I was hired as “Web and Application Development Coordinator” at Emory, they stripped any “faculty-ness” from the position, which, in turn, devalued the authority the position had — at least this seemed true in practice.  The position could have been pretty similar to that of Iowa’s (I still would have been miserable at it), but the faculty wasn’t quite able to make the leap to label me a “peer”.

Just like my Emory position, though, Iowa is requiring that the position be a “manager” position to get invited to the “career professional” table. There are a lot of librarians at that table who are not managers.

This gap was made evident earlier this week here at Tech.  I frequently hear about things going on in the library secondhand due to the fact that I am not faculty.  Lord knows how much I never hear.  We have a mailing list (lib-fac) where “career professional” sorts of announcements are made and, on Tuesday, a meeting was held for the faculty only to hear people report back what they had learned at ALA Midwinter.

My beef here is, why the faculty-only distinction?  If they were talking about tenure review or the sorts of things that couldn’t affect me by nature of my employment status, that would be one thing.  They are talking about things that I would like to know about, however, and could possibly contribute a voice to in the discussion.  I am not entirely sure why I should be left out.

It’s possible that I could be included just by asking to be. But why am I (or any other staff that has an interest in the profession, for that matter) being excluded in the first place?

I guess what I’m trying to say is that it’s a little discouraging to work to be treated as a peer and a colleague in a national/international community and not in my own organization.

My intention here is not to single out Georgia Tech; this is prevalent throughout libraries everywhere (at least, from what I’ve seen).  I’m just calling for the possibility of a third caste: those that are making a career in libraries, but have no desire for faculty status (or management).