Archive


There are any number of reasons for Solr’s status as the standard-bearer of faceted full-text searching:  it’s free, fast, works shockingly well out of the box without any tweaking, has a simple and intuitive HTTP API (making it available in the programming language of your choice) and is, by far, the easiest “enterprise-level” application to get up and running.  None of its “competitors” (Sphinx, Xapian, Endeca, etc.), despite any individual advantages they might have, can claim all of these features, which goes a long way towards explaining Solr’s popularity.

The library world has definitely taken a shine to Solr:  from discovery interfaces like VuFind and Primo, to repositories like Fedora, to full-text aggregators like Summon, you can find Solr under the hood of most of the hot products and services available right now.  The fact that a library can install VuFind and, in about an hour, have a slick, jaw-droppingly powerful OPAC replacement that puts their legacy interface to shame is almost entirely a by-product of how amazingly simple Solr is to get up and running.  It’s no wonder so many libraries are adopting it (compare it to SOPAC, which is also built in PHP and about as old, but uses Sphinx for its full-text indexing and is hardly ever seen in the wild).

Without a doubt, Solr is pretty much a no-brainer if you are able to run Jetty (or Tomcat or JBoss or Glassfish or whatever):  with enough hardware, Solr can scale up to pretty much whatever your need might be.  The problem (at least the problem in my mind) is that Solr doesn’t scale down terribly well.  If you host your content on a cheap, shared web hosting provider or a VPS, for example, Solr is either not available or not practical (it doesn’t cope well with small-memory environments).  The hosted Solr options are fairly expensive, and while there are cheap, shared web hosting providers that do provide Java application servers, switching vendors just to provide faceted search for your mid-size Drupal or Omeka site might not be entirely practical or desirable.

I find myself proof-of-concept-ing a lot of hacks to projects like VuFind, Blacklight, Kochief and whatnot, and running these things off of my shared web server.  It’s older, underpowered and only has 1GB of RAM.  Since I’m not running any of these projects in production (really just making things available for others to see), it was really annoying to have Solr gobbling up 20% of the available RAM for these little pet projects.  What I wanted was something that acted more or less like Solr when you pointed an application at it that expected Solr to be there, but with a small enough footprint that it could run (almost) anywhere and more or less disappear when idle.

So it was for this scenario that I wrote CheapSkate: a Solr emulator written in Ruby.  It uses Ferret, the Ruby port of Lucene, as the full-text indexing engine and Sinatra to supply the HTTP API.  Ferret is fast, scales quite well and responds to the same search syntax as Solr, so I knew it could handle the search aspect pretty easily.  Faceting (as can be expected) proved the harder part.  Originally, I was storing the values of fields in an RDBMS and using that to provide the facets.  Read performance was ok, although anything over 5,000 results would start to bog down – the real problem was the write performance, which was simply woeful.  Part of the issue was that this design was completely schemaless:  you could send anything to CheapSkate and facet on any field, regardless of size.  It also tried to maintain the type of the incoming field value:  dates were stored as dates, numbers stored as integers and so on.  Basically the lack of constraints made it wildly inefficient.

Eventually, I dropped the RDBMS component and started playing around with Ferret’s term capabilities.  If you set a particular field to be untokenized, your field values appear exactly as you put them in.  This is perfect for faceting (you don’t want stemming and whatnot applied to your query filters, and your strings aren’t normalized or downcased, so they look right in the UI) and is basically the same thing Solr itself does.  Instead of a schema.xml, CheapSkate has a schema.yml, but it works essentially the same way:  you define your fields, which ones should be tokenized (that is, which fields allow full-text search) and which should not (i.e. facet fields), and what datatype each field should be.

CheapSkate doesn’t support all of the field types that Solr does, but it supports strings, numbers, dates and booleans.
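For a concrete picture of what that tokenized/untokenized distinction looks like at the Ferret level, here is a rough sketch (the field names, values and path are invented, and the exact options may differ from what CheapSkate actually does):

require 'rubygems'
require 'ferret'

# Tokenized fields get full-text analysis; untokenized fields keep their
# values verbatim, which is what lets them double as facets.
field_infos = Ferret::Index::FieldInfos.new(:store => :yes)
field_infos.add_field(:title,    :index => :yes)          # full-text searchable
field_infos.add_field(:format,   :index => :untokenized)  # facet field
field_infos.add_field(:language, :index => :untokenized)  # facet field

index = Ferret::Index::Index.new(:path => './index',
                                 :field_infos => field_infos)
index << { :title    => "The Freewheelin' Bob Dylan",
           :format   => 'Sound Recording',
           :language => 'English' }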

One neat thing about Ferret is that you can pass a Ruby Proc to the search method as a search option.  This proc then has access to the search results as Ferret is finding them.  CheapSkate uses this to find the terms in the untokenized fields for each search hit, throws them in a Hash and generates a hit count for each term.  This is a lot faster than getting all the document ids from the search, looping over them and generating your term hash after the search is completed.  That said, this is still definitely the bottleneck for CheapSkate.  If the search result has more than 10,000-15,000 hits, performance begins to be pretty heavily impacted by grabbing the facets.  I’m not terribly concerned by this, since data sets with search results in the 20,000+ range start to creep into the “you would be better off just using Solr” domain.  For my proofs-of-concept, this has only really reared its head in VuFind when filtering on something like “Book” (with no search terms) for a 50,000 record collection.  What I mean to say is, this happens for fairly non-useful searches.
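I won’t swear this is exactly how CheapSkate does it, but the mechanism described above sounds like Ferret’s :filter_proc search option, and counting facet values that way (continuing the invented :format and :language fields from the sketch above) looks roughly like this:

# Tally facet values as Ferret finds each hit, rather than looping over the
# result set after the fact.
facets = Hash.new { |h, k| h[k] = Hash.new(0) }

count_facets = lambda do |doc_id, score, searcher|
  doc = searcher[doc_id]                    # lazy-loaded stored fields
  [:format, :language].each do |field|
    facets[field][doc[field]] += 1 if doc[field]
  end
  true                                      # keep every hit in the result set
end

top_docs = index.search('freewheelin', :filter_proc => count_facets)
puts "#{top_docs.total_hits} hits"

facets[:format].each { |term, count| puts "#{term} (#{count})" }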

Overall, I’ve been pretty happy with how CheapSkate is working.  For regular searching it does pretty well (although, like I said, I’m not trying to run a production discovery system that pleases both librarians and users).  There’s a very poorly designed “more like this” handler that really needs an overhaul and there is no “did you mean” (spellcheck).  This hasn’t been a huge priority, because I don’t really like the spellcheck in Solr all that much, anyway.  That said, if somebody really wanted this and had an idea of how it would be implemented in Ferret, I’d be happy to add it.

Ideally, I’d like to see something like CheapSkate in PHP using Zend_Search_Lucene, since that would be accessible to virtually everybody, but that’s a project for somebody else.

In the meantime, if you want to see some examples of CheapSkate in action:

One important caveat for projects like VuFind and Blacklight:  CheapSkate doesn’t work with Solrmarc, which requires Solr to return responses in the javabin format (it may be possible to hack out something that looks enough like javabin to fool Solrmarc, I just haven’t figured it out).  My workaround has been to populate a local Solr index with Solrmarc and then just dump all of the documents out of Solr into CheapSkate.
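The dumping step isn’t anything fancy.  Roughly, it looks something like this (the URLs, ports and page size are made up, and I’m assuming CheapSkate will take Solr’s XML update syntax at /update, since that’s the API it emulates):

require 'rubygems'
require 'json'
require 'net/http'
require 'cgi'
require 'uri'

solr       = 'http://localhost:8983/solr'
cheapskate = URI.parse('http://localhost:4567/solr/update')
rows, start = 500, 0

loop do
  # page through everything in the local Solr index
  url  = URI.parse("#{solr}/select?q=*:*&wt=json&rows=#{rows}&start=#{start}")
  docs = JSON.parse(Net::HTTP.get(url))['response']['docs']
  break if docs.empty?

  # re-serialize each document as a Solr-style <add> and post it to CheapSkate
  xml = '<add>' + docs.map { |doc|
    fields = doc.map { |name, value|
      [value].flatten.map { |v|
        "<field name=\"#{name}\">#{CGI.escapeHTML(v.to_s)}</field>"
      }.join
    }.join
    "<doc>#{fields}</doc>"
  }.join + '</add>'

  Net::HTTP.start(cheapskate.host, cheapskate.port) do |http|
    http.post(cheapskate.path, xml, 'Content-Type' => 'text/xml')
  end
  # (a Solr-style <commit/> may or may not be needed, depending on CheapSkate)
  start += rows
end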

A couple of months ago, I hacked up a really simple proof-of-concept Sinatra app that took an LCCN, pulled the MARCXML for that LCCN from the Library of Congress’ LCCN Permalink service and tried to model it as linked data. It was really basic: it only returned RDF/XML and had no persistence layer, so I ran it using Heroku’s free hosting plan.

It worked pretty well, but as I added more and more functionality (looking for matches in Musicbrainz, LinkedMDB, DBpedia, Freebase, etc., especially via inefficient SPARQL regex queries), I kept running into execution timeout errors on Heroku. These are the exact same sorts of problems that the Umlaut ran into years ago, where the solution required complicated threading and, eventually, AJAX requests to offload some of the time spent waiting for synchronous web service requests to return (or time out, or fail).

One thing I began doing with LinkedLCCN was persisting the graph into a Platform store, so that once a resource was stored, any subsequent request for it was quite speedy. The problem was the initial request, the one that gathered all of the data to fill out the graph before it was stored. Quite often this would time out or throw an error (which, given that this was still very much a work in progress, would result in a 500 error), meaning the resource was never saved to the Platform, meaning all of the following requests would have to go through the same process until one of them finally succeeded. Since the freebie access on Heroku lets you run one process at a time, these long-running (and timing-out) requests would cause a backlog, which would throw more errors.

It was becoming the embodiment of the phrase I used during my Code4Lib presentation: “Amateur Hour on the Internet”.

What was obviously needed was some asynchronous mechanism for giving back part of the graph, indicating to the requester that this was a partial response, and firing off a background task to complete the rest of the processing. Because there was no HTML interface, AJAX wasn’t an option. Even if there were an HTML interface (as there is now), AJAX still wouldn’t have been an option, because this is a service intended for web agents following their nose, not human surfers. Even if an agent were satisfied with the HTML response (for instance, once it eventually gets RDFa), curl (and its ilk) doesn’t have javascript, so the background process would never even have a chance to be called.

This meant the only viable solution to this problem was going to be via multiple processes. This also meant that Heroku wasn’t an option anymore (at least, not without a price), so I was going to migrate to my personal web host. In Ruby web frameworks, asynchronous processing comes in one of two forms:

  1. Threads/forks/etc.
  2. Queue schedulers

Based on my experience with the Umlaut, I wanted to avoid #1 if at all possible, or, at the very least, use an existing, packaged solution that could drop fairly painlessly into Sinatra. I found a port of Merb’s run_later, but I could only ever get it to run once; any subsequent request never seemed to fire off the background process.

The queue schedulers generally required their own set of baggage: namely, their own running daemon and an RDBMS. Since almost all of these projects originally started out as Rails plugins, they expect the application to have ActiveRecord and an RDBMS to store their jobs in. I had neither.

I had settled on using Delayed Job, since there was, again, a Sinatra port. It took quite a bit of hacking to get this to work right (mainly around marshaling/unmarshaling objects), and I never could get the job logging to work very well, but it was successfully queuing and executing jobs in the background.

It was hard to manage, though. I use Capistrano for deployment, and it was very difficult to control the Delayed Job daemon so that it would stop and start with the regular web service. Again, it worked, but it felt very fragile: the sort of thing I could see breaking and then having to spend hours trying to figure out how to fix.

Last night, while I was trying to pull together my thoughts and links and whatnot for this post, I ran across Spork, which is a Sinatra port of Spawn. A couple of hours later, LinkedLCCN was refactored to use Spork instead, and that’s how it’s running now.
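To illustrate the flow (this is not LinkedLCCN’s actual code: a plain Process.fork stands in for Spork’s helper, a directory of cached files stands in for the Platform store, and build_basic_graph/enrich_graph! are made up):

require 'rubygems'
require 'sinatra'
require 'fileutils'

STORE = File.expand_path('./graph-cache')
FileUtils.mkdir_p(STORE)

helpers do
  def cached_graph(lccn)
    path = File.join(STORE, lccn)
    File.exist?(path) ? File.read(path) : nil
  end

  def build_basic_graph(lccn)
    "<rdf:RDF><!-- minimal graph for #{lccn} --></rdf:RDF>"
  end

  def enrich_graph!(lccn)
    sleep 30   # pretend to wait on MusicBrainz, DBpedia, slow SPARQL, etc.
    File.open(File.join(STORE, lccn), 'w') do |f|
      f << "<rdf:RDF><!-- full graph for #{lccn} --></rdf:RDF>"
    end
  end
end

get '/:lccn' do
  content_type 'application/rdf+xml'
  lccn = params[:lccn]

  if graph = cached_graph(lccn)
    status 200   # a background job already finished on an earlier visit
    graph
  else
    pid = Process.fork { enrich_graph!(lccn) }   # finish the graph out-of-band
    Process.detach(pid)
    status 206   # partial content: come back later for the rest
    build_basic_graph(lccn)
  end
end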

So, LinkedLCCN now works like this:
$ curl -v -H "accept: application/rdf+xml" http://lccn.lcsubjects.org/93707283#i

* About to connect() to lccn.lcsubjects.org port 80 (#0)
*   Trying 208.83.140.6... connected
* Connected to lccn.lcsubjects.org (208.83.140.6) port 80 (#0)
> GET /93707283#i HTTP/1.1
> User-Agent: curl/7.19.6 (i386-apple-darwin9.8.0) libcurl/7.19.6 zlib/1.2.3
> Host: lccn.lcsubjects.org
> accept: application/rdf+xml
>
< HTTP/1.1 206 Partial Content
< Date: Fri, 12 Mar 2010 20:19:50 GMT
< Server: Apache/2.2.12 (Ubuntu)
< X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 2.2.11
< Content-Length: 6472
< Status: 206
< Content-Type: application/rdf+xml
<
<rdf:RDF xmlns:dcterms="http://purl.org/dc/terms/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:umbel="http://umbel.org/umbel#" xmlns:rda="http://RDVocab.info/Elements/"><rdf:Description rdf:about="http://purl.org/NET/lccn/93707283#i"><rda:placeOfPublication><rdf:Description rdf:about="http://purl.org/NET/marccodes/countries/nyu#location"></rdf:Description></rda:placeOfPublication><rda:titleProper>The freewheelin' Bob Dylan</rda:titleProper><foaf:isPrimaryTopicOf><rdf:Description rdf:about="http://lccn.loc.gov/93707283"></rdf:Description></foaf:isPrimaryTopicOf><bibo:uri>http://hdl.loc.gov/loc.mbrsrs/lp0001.dyln</bibo:uri><bibo:lccn>93707283</bibo:lccn><dcterms:title>The freewheelin' Bob Dylan</dcterms:title><dcterms:creator><rdf:Description rdf:about="http://purl.org/NET/lccn/people/n50030190#i"><owl:sameAs><rdf:Description rdf:about="http://dbpedia.org/resource/Bob_Dylan"></rdf:Description></owl:sameAs><foaf:name>Dylan, Bob, 1941-</foaf:name><umbel:isAbout><rdf:Description rdf:about="http://viaf.org/viaf/46946176.rwo"><foaf:name>Dylan, Bob, pseud</foaf:name><foaf:name>Dylan, Bob, 1941-</foaf:name><foaf:page rdf:resource="http://dbpedia.org/page/Wikipedia:WikiProject_Bob_Dylan" /><foaf:page rdf:resource="http://www.worldcat.org/wcidentities/lccn-n50-030190" /><foaf:page rdf:resource="http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Bob_Dylan" /><skos:altLabel>Thomas, Robert Milkwood,</skos:altLabel><skos:altLabel>Dylan, B.</skos:altLabel><skos:altLabel>Thomas, Robert Milkwood</skos:altLabel><skos:altLabel>Landy, Bob,</skos:altLabel><skos:altLabel>Landy, Bob</skos:altLabel><skos:altLabel>Zimmermann, Robert Allen</skos:altLabel><skos:altLabel>Zimmerman, Roberto Allen</skos:altLabel><skos:altLabel>Zimmerman, Robert,</skos:altLabel><skos:altLabel>Porterhouse, Tedham,</skos:altLabel><skos:altLabel>Petrov, Sergei</skos:altLabel><skos:altLabel>Zimmerman, Robert Allen,</skos:altLabel><skos:altLabel>Blind Boy Grunt,</skos:altLabel><skos:altLabel>Gook, Roosevelt,</skos:altLabel><skos:altLabel>Dylan, Bob, 1941-</skos:altLabel><skos:altLabel>Zimmerman, Robert</skos:altLabel><skos:altLabel>Alias,</skos:altLabel><skos:altLabel>Zimmerman, Roberto Allen,</skos:altLabel><skos:altLabel>Dylan, Bob, pseud</skos:altLabel><skos:altLabel>Zimmerman, Robert Allen</skos:altLabel><skos:inScheme rdf:resource="http://viaf.org/viaf-scheme/#personalNames" /><skos:inScheme rdf:resource="http://viaf.org/viaf-scheme/#concept" /><skos:changeNote xml:lang="en">Modified by agency: OCoLC</skos:changeNote><skos:changeNote xml:lang="en">Transcribed by agency: OCoLC</skos:changeNote><skos:exactMatch rdf:resource="http://viaf.org/viaf/46946176.viaf" /><skos:exactMatch rdf:resource="http://libris.kb.se/auth/184248" /><skos:exactMatch rdf:resource="http://viaf.org/viaf/46946176.m21" /><skos:exactMatch rdf:resource="http://id.loc.gov/authorities/n50030190#concept" /><skos:exactMatch rdf:resource="http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Bob_Dylan" /><skos:exactMatch rdf:resource="http://viaf.org/viaf/46946176.unimarc" /><skos:exactMatch rdf:resource="http://viaf.org/processed/BNF%7C13893566" /><skos:exactMatch rdf:resource="http://viaf.org/processed/NKC%7Cjn20000700458" /><skos:exactMatch rdf:resource="http://d-nb.info/gnd/118528408" /><skos:exactMatch 
rdf:resource="http://catalogo.bne.es/uhtbin/authoritybrowse.cgi?action=display&amp;authority_id=XX821701" /><skos:exactMatch rdf:resource="http://viaf.org/processed/NLA%7C000035052711" /><skos:exactMatch rdf:resource="http://viaf.org/processed/NLIlat%7C000041704" /><skos:exactMatch rdf:resource="http://dbpedia.org/page/Wikipedia:WikiProject_Bob_Dylan" /><skos:exactMatch rdf:resource="http://viaf.org/processed/LAC%7C0008D6165" /><skos:exactMatch rdf:resource="http://viaf.org/processed/PTBNP%7C1270067" /><rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept" /><dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2010-02-22T06:44:10+00:00</dcterms:modified><dcterms:type>person</dcterms:type><dcterms:identitifer>46946176</dcterms:identitifer><dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2009-03-03T12:03:19+00:00</dcterms:created></rdf:Description></umbel:isAbout><rdf:type><rdf:Description rdf:about="http://xmlns.com/foaf/0.1/Person"></rdf:Description></rdf:type></rdf:Description></dcterms:creator><dcterms:language><rdf:Description rdf:about="http://purl.org/NET/marccodes/languages/eng#lang"></rdf:Description></dcterms:language><dcterms:subject><rdf:Description rdf:about="http://id.loc.gov/authorities/sh87003307#concept"><owl:sameAs><rdf:Description rdf:about="info:lc/authorities/sh87003307"></rdf:Description></owl:sameAs><skos:inScheme><rdf:Description rdf:about="http://id.loc.gov/authorities#topicalTerms"></rdf:Description></skos:inScheme><skos:inScheme><rdf:Description rdf:about="http://id.loc.gov/authorities#conceptScheme"></rdf:Description></skos:inScheme><skos:prefLabel>Popular music--1961-1970</skos:prefLabel><rdf:type><rdf:Description rdf:about="http://www.w3.org/2004/02/skos/core#Concept"></rdf:Description></rdf:type><dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">1987-07-14T11:37:03-04:00</dcterms:modified><dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">1987-05-22T00:00:00-04:00</dcterms:created></rdf:Description></dcterms:subject><dcterms:subject><rdf:Description rdf:about="http://id.loc.gov/authorities/sh87003285#concept"><owl:sameAs><rdf:Description rdf:about="info:lc/authorities/sh87003285"></rdf:Description></owl:sameAs><skos:inScheme><rdf:Description rdf:about="http://id.loc.gov/authorities#topicalTerms"></rdf:Description></skos:inScheme><skos:inScheme><rdf:Description rdf:about="http://id.loc.gov/authorities#conceptScheme"></rdf:Description></skos:inScheme><skos:prefLabel>Blues (Music)--1961-1970</skos:prefLabel><rdf:type><rdf:Description rdf:about="http://www.w3.org/2004/02/skos/core#Concept"></rdf:Description></rdf:type><dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">1987-07-14T16:41:41-04:00</dcterms:modified><dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">1987-05-22T00:00:00-04:00</dcterms:created></rdf:Description></dcterms:subject></rdf:Description></rdf:RDF>
* Connection #0 to host lccn.lcsubjects.org left intact
* Closing connection #0

The important thing to note here is the HTTP status code sent back. LinkedLCCN sends back a 206 (Partial Content) because it wants the agent to try again later: “Thank you for waiting, here is some data to get you started. If you come back, it’s possible I might have some more.”

And, indeed, if the agent came back:
$ curl -v -H "accept: application/rdf+xml" http://lccn.lcsubjects.org/93707283#i

* About to connect() to lccn.lcsubjects.org port 80 (#0)
*   Trying 208.83.140.6... connected
* Connected to lccn.lcsubjects.org (208.83.140.6) port 80 (#0)
> GET /93707283#i HTTP/1.1
> User-Agent: curl/7.19.6 (i386-apple-darwin9.8.0) libcurl/7.19.6 zlib/1.2.3
> Host: lccn.lcsubjects.org
> accept: application/rdf+xml
>
< HTTP/1.1 200 OK
< Date: Fri, 12 Mar 2010 20:26:18 GMT
< Server: Apache/2.2.12 (Ubuntu)
< X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 2.2.11
< Content-Length: 40807
< Status: 200
< Content-Type: application/rdf+xml
<
<rdf:RDF xmlns:n0="http://dbtune.org/musicbrainz/resource/vocab/" xmlns:mo="http://purl.org/ontology/mo/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:n1="http://www.holygoat.co.uk/owl/redwood/0.1/tags/" xmlns:n2="http://purl.org/vocab/bio/0.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:skos="http://www.w3.org/2004/02/skos/core#" xmlns:bibo="http://purl.org/ontology/bibo/" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:umbel="http://umbel.org/umbel#" xmlns:rda="http://RDVocab.info/Elements/"><rdf:Description rdf:about="http://purl.org/NET/lccn/93707283#i"><mo:label><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/label/011d1192-6f65-45bd-85c4-0400dd45693e"><n0:label_sortname>Columbia Records</n0:label_sortname><n0:label_name>Columbia Records</n0:label_name><n0:label_labelcode>162</n0:label_labelcode><n0:tag_count>1</n0:tag_count><n0:label_type>4</n0:label_type><n0:alias>Columbia Phonograph Company</n0:alias><n0:alias>Columbia</n0:alias><n0:alias>Colombia Records</n0:alias><n0:alias>Columbia d (Sony BMG)</n0:alias><n0:alias>Columbia Records</n0:alias><n0:alias>Columbia US</n0:alias><rdfs:label>Columbia Records</rdfs:label><n1:taggedWithTag><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/tag/20"></rdf:Description></n1:taggedWithTag><n1:taggedWithTag><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/tag/748"></rdf:Description></n1:taggedWithTag><n1:taggedWithTag><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/tag/1584"></rdf:Description></n1:taggedWithTag><n1:taggedWithTag><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/tag/111"></rdf:Description></n1:taggedWithTag><n1:taggedWithTag><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/tag/273"></rdf:Description></n1:taggedWithTag><n1:taggedWithTag><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/tag/343"></rdf:Description></n1:taggedWithTag><n1:taggedWithTag><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/tag/7179"></rdf:Description></n1:taggedWithTag><n1:taggedWithTag><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/tag/284"></rdf:Description></n1:taggedWithTag><n2:event><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/label/011d1192-6f65-45bd-85c4-0400dd45693e/birth"></rdf:Description></n2:event><foaf:based_near><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/country/US"></rdf:Description></foaf:based_near><dc:description>1931-1990: only USA, Canada&amp;Japan. 
1990 to present: worldwide</dc:description><rdf:type><rdf:Description rdf:about="http://purl.org/ontology/mo/Label"></rdf:Description></rdf:type></rdf:Description></mo:label><mo:catalogue_number>CS 8786</mo:catalogue_number><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/5857b092-93ae-434d-b3f1-3f959396732b"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/98846b10-8951-43bc-ab24-c960e330cec8"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/88621637-8f03-427e-855f-4f52f712e80e"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/6d2b3714-478f-4fb1-9dfb-4a70c266453e"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/b654f8ad-071a-41c1-a1f7-1134de178ee8"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/9b08a5da-da77-4df8-b3ad-dcb481959013"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/f96f9b50-959d-4ef0-adc0-2995d179e6c8"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/2160818f-ce54-4502-a480-535389abef61"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/2f528602-ea36-480a-b1df-f7a5af36598e"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/19f571b4-b396-4113-9858-ab032074a3c7"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/00e37446-2e4c-409a-a8a1-ed94f1b01a57"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/536247b1-a87a-40c7-93e8-dd02fe3f3d54"></rdf:Description></mo:track><mo:track><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/track/67a68273-fa05-4ecf-aa85-648868a91b01"></rdf:Description></mo:track><rda:placeOfPublication><rdf:Description rdf:about="http://purl.org/NET/marccodes/countries/nyu#location"></rdf:Description></rda:placeOfPublication><rda:titleProper>The freewheelin' Bob Dylan</rda:titleProper><foaf:isPrimaryTopicOf><rdf:Description rdf:about="http://lccn.loc.gov/93707283"></rdf:Description></foaf:isPrimaryTopicOf><bibo:uri>http://hdl.loc.gov/loc.mbrsrs/lp0001.dyln</bibo:uri><bibo:lccn>93707283</bibo:lccn><rdf:type><rdf:Description rdf:about="http://purl.org/ontology/mo/Recording"></rdf:Description></rdf:type><dcterms:title>The freewheelin' Bob Dylan</dcterms:title><dcterms:creator><rdf:Description rdf:about="http://purl.org/NET/lccn/people/n50030190#i"><owl:sameAs><rdf:Description rdf:about="http://dbpedia.org/resource/Bob_Dylan"></rdf:Description></owl:sameAs><foaf:name>Dylan, Bob, 1941-</foaf:name><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r64001976#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007568523#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700813#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700816#i"></rdf:Description></foaf:made><foaf:made><rdf:Description 
rdf:about="http://purl.org/NET/lccn/unk84175336#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/95769390#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/99567433#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2002560219#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2002603175#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/72343809#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/78762560#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700762#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700760#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r64001986#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2001036739#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700847#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/76762320#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93727595#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93726950#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/00536083#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/unk84158135#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93727467#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2003636870#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93723725#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/88753098#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2008640899#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700841#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r65000579#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93705362#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700812#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700817#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007642887#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93711197#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700805#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700771#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700763#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/72762075#i"></rdf:Description></foaf:made><foaf:made><rdf:Description 
rdf:about="http://purl.org/NET/lccn/2002042823#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700818#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700774#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/72763265#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/00717921#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/72761611#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93707558#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2002578760#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2010616561#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r68000463#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007657614#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/00725480#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/95776698#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2004304721#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/77760065#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700843#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700775#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/94771432#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700807#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r65001916#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/91759862#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2004593548#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/74217252#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/68128332#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700811#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2003696014#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/91762191#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700770#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007657624#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/71224759#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/71763700#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/74760392#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007659104#i"></rdf:Description></foaf:made><foaf:made><rdf:Description 
rdf:about="http://purl.org/NET/lccn/93726958#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/98028826#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/74760100#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700748#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/unk84196471#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93712912#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/94023183#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700777#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/91759844#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/91759501#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/74762095#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2005045677#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93712878#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2009602196#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/85161136#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/99580439#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700758#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2001536938#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93703756#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700751#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700806#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700756#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700819#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/90750957#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/95701312#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2005434462#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/unk84219598#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/95786124#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93703149#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/99573938#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/79761946#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2002572371#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700804#i"></rdf:Description></foaf:made><foaf:made><rdf:Description 
rdf:about="http://purl.org/NET/lccn/66041425#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93710188#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/78056239#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93712977#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r67001398#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700244#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/66025502#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/86463558#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700815#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700768#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93727732#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/72002339#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93703465#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700844#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2005560717#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700753#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/91759880#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2008643295#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/unk84085959#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/76013692#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700800#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/92776389#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93711016#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93851416#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007657612#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/95129257#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700801#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700840#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700761#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/76760788#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007656198#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/70761002#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2003385743#i"></rdf:Description></foaf:made><foaf:made><rdf:Description 
rdf:about="http://purl.org/NET/lccn/2007700765#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700755#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2008655199#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/73762412#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700242#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/92755348#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r64000162#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700846#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/77275475#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700769#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/80760343#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/97751545#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2006656398#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2008038164#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93715284#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700837#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700764#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700752#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700759#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2006657054#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/94770754#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93712995#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/00584949#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2004590181#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93729841#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/00717773#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2004592553#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2010616562#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2008300626#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/79761354#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93707283#i"><mo:label rdf:resource="http://dbtune.org/musicbrainz/resource/label/011d1192-6f65-45bd-85c4-0400dd45693e" /><mo:catalogue_number>CS 8786</mo:catalogue_number><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/5857b092-93ae-434d-b3f1-3f959396732b" /><mo:track 
rdf:resource="http://dbtune.org/musicbrainz/resource/track/98846b10-8951-43bc-ab24-c960e330cec8" /><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/88621637-8f03-427e-855f-4f52f712e80e" /><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/6d2b3714-478f-4fb1-9dfb-4a70c266453e" /><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/b654f8ad-071a-41c1-a1f7-1134de178ee8" /><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/9b08a5da-da77-4df8-b3ad-dcb481959013" /><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/f96f9b50-959d-4ef0-adc0-2995d179e6c8" /><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/2160818f-ce54-4502-a480-535389abef61" /><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/2f528602-ea36-480a-b1df-f7a5af36598e" /><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/19f571b4-b396-4113-9858-ab032074a3c7" /><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/00e37446-2e4c-409a-a8a1-ed94f1b01a57" /><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/536247b1-a87a-40c7-93e8-dd02fe3f3d54" /><mo:track rdf:resource="http://dbtune.org/musicbrainz/resource/track/67a68273-fa05-4ecf-aa85-648868a91b01" /><rda:placeOfPublication rdf:resource="http://purl.org/NET/marccodes/countries/nyu#location" /><rda:titleProper>The freewheelin' Bob Dylan</rda:titleProper><foaf:isPrimaryTopicOf rdf:resource="http://lccn.loc.gov/93707283" /><bibo:uri>http://hdl.loc.gov/loc.mbrsrs/lp0001.dyln</bibo:uri><bibo:lccn>93707283</bibo:lccn><rdf:type rdf:resource="http://purl.org/ontology/mo/Recording" /><dcterms:title>The freewheelin' Bob Dylan</dcterms:title><dcterms:creator rdf:resource="http://purl.org/NET/lccn/people/n50030190#i" /><dcterms:language rdf:resource="http://purl.org/NET/marccodes/languages/eng#lang" /><dcterms:subject rdf:resource="http://id.loc.gov/authorities/sh87003307#concept" /><dcterms:subject rdf:resource="http://id.loc.gov/authorities/sh87003285#concept" /><dcterms:isVersionOf rdf:resource="http://dbtune.org/musicbrainz/resource/record/942be4b0-12a2-4264-93a3-b45fa94c95c0" /></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/99583697#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93724609#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/75766013#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/66041424#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2008273503#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/96789283#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700776#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700820#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93711987#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700839#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2004585875#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/94162315#i"></rdf:Description></foaf:made><foaf:made><rdf:Description 
rdf:about="http://purl.org/NET/lccn/2007700808#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/80772269#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r68000263#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700754#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2010616563#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2004577731#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/94770756#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2004571741#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/97109960#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r62000368#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/76353124#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93711249#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93037274#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/86753728#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/72760404#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700749#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700772#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/77018965#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700809#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/75762080#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r67001399#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2003643362#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/78531019#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2010617063#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93701439#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r68000262#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2003577486#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700766#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/76762851#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/unk84126692#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/94762678#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007657649#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700243#i"></rdf:Description></foaf:made><foaf:made><rdf:Description 
rdf:about="http://purl.org/NET/lccn/93728350#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/92750713#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/94762887#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/95769054#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/72373613#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/99571326#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/91755140#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/92774846#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/77761257#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93711137#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2003643131#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2009015567#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/76770532#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/91761855#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93721323#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700803#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700810#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2010615421#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/94746592#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/72373606#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/92757783#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2002603171#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93712772#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700767#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2004056454#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r65000580#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/91759726#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93712861#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/81047774#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2005048013#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93702899#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/unk84096809#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700245#i"></rdf:Description></foaf:made><foaf:made><rdf:Description 
rdf:about="http://purl.org/NET/lccn/85040408#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/92754060#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/66052234#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/72251264#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700757#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700845#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/91761627#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700814#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2002556777#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700750#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/87754788#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/94770755#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700773#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/91760389#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r62000369#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2004560796#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/94750639#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2006571042#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r65001917#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/92754813#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700802#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2003574417#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93707133#i"></rdf:Description></foaf:made><foaf:made>
<rdf:Description rdf:about="http://purl.org/NET/lccn/2007700842#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/95769314#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r67001260#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2003573910#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/87752098#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2004462312#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/r64000161#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/unk85066726#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/71763246#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/99567232#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2006530396#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2001545149#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/91761930#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/2007700799#i"></rdf:Description></foaf:made><foaf:made><rdf:Description rdf:about="http://purl.org/NET/lccn/93709800#i"></rdf:Description></foaf:made><umbel:isAbout><rdf:Description rdf:about="http://viaf.org/viaf/46946176.rwo"></rdf:Description></umbel:isAbout><rdf:type><rdf:Description rdf:about="http://xmlns.com/foaf/0.1/Person"></rdf:Description></rdf:type></rdf:Description></dcterms:creator><dcterms:language><rdf:Description rdf:about="http://purl.org/NET/marccodes/languages/eng#lang"></rdf:Description></dcterms:language><dcterms:subject><rdf:Description rdf:about="http://id.loc.gov/authorities/sh87003307#concept"><owl:sameAs><rdf:Description rdf:about="info:lc/authorities/sh87003307"></rdf:Description></owl:sameAs><skos:inScheme><rdf:Description rdf:about="http://id.loc.gov/authorities#conceptScheme"></rdf:Description></skos:inScheme><skos:inScheme><rdf:Description rdf:about="http://id.loc.gov/authorities#topicalTerms"></rdf:Description></skos:inScheme><skos:prefLabel>Popular music--1961-1970</skos:prefLabel><rdf:type><rdf:Description rdf:about="http://www.w3.org/2004/02/skos/core#Concept"></rdf:Description></rdf:type><dcterms:modified rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">1987-07-14T11:37:03-04:00</dcterms:modified><dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">1987-05-22T00:00:00-04:00</dcterms:created></rdf:Description></dcterms:subject><dcterms:subject><rdf:Description rdf:about="http://id.loc.gov/authorities/sh87003285#concept"><owl:sameAs><rdf:Description rdf:about="info:lc/authorities/sh87003285"></rdf:Description></owl:sameAs><skos:inScheme><rdf:Description rdf:about="http://id.loc.gov/authorities#conceptScheme"></rdf:Description></skos:inScheme><skos:inScheme><rdf:Description rdf:about="http://id.loc.gov/authorities#topicalTerms"></rdf:Description></skos:inScheme><skos:prefLabel>Blues (Music)--1961-1970</skos:prefLabel><rdf:type><rdf:Description rdf:about="http://www.w3.org/2004/02/skos/core#Concept"></rdf:Description></rdf:type><dcterms:modified 
rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">1987-07-14T16:41:41-04:00</dcterms:modified><dcterms:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">1987-05-22T00:00:00-04:00</dcterms:created></rdf:Description></dcterms:subject><dcterms:isVersionOf><rdf:Description rdf:about="http://dbtune.org/musicbrainz/resource/record/942be4b0-12a2-4264-93a3-b45fa94c95c0"></rdf:Description></dcterms:isVersionOf></rdf:Description></rdf:RDF>
* Connection #0 to host lccn.lcsubjects.org left intact
* Closing connection #0

There is a much richer graph waiting for it.

Now, I have no idea if this is a valid application of the 206 response. The only references I see to it on the web deal with either cache proxies or range requests, but this seems like the best way to alert the client that they aren’t getting the entire graph on this request.

So try it out and enjoy the new (and very basic) HTML interface.

Any comments, suggestions or criticisms on this approach are extremely welcome.

One of the byproducts of the “Communicat” work I had done at Georgia Tech was a variant of Ed Summers’ ruby-marc that went into more explicit detail regarding the contents of the MARC record (as opposed to ruby-marc, which focuses on its structure).  It had been living for the last couple of years as a branch within ruby-marc, but this was never a particularly ideal approach.  These enhancements were somewhat out of scope for ruby-marc as a general MARC parser/writer, so it’s not as if this branch was ever going to see the light of day as trunk.  As a result, it was a massive pain in the butt for me to use locally:  I couldn’t easily add it as a gem (since it would have replaced the real ruby-marc, which I use far too much to live without), which meant that I would have to explicitly include it in whatever projects I wanted to use it in and update any included paths accordingly.

So as I found myself, yet again, copying the TypedRecords directory into another local project (this one to map MARC records to RDF), I decided it was time to make this its own project.

One of the amazingly wonderful aspects of Ruby is the notion of “opening up an object or class”.  For those not familiar with Ruby, the language allows you to take basically any object or class, redefine it and add your own attributes, methods, etc.  So if you feel that there is some particular functionality missing from a given Ruby object, you can just redefine it, adding or overriding the existing methods, without having to reimplement the entire thing.  So, for example:

class String
  def shout
    "#{self.upcase}!!!!"
  end
end

str = "Hello World"
str.shout
=> "HELLO WORLD!!!!"

And just like that, your String objects gained the ability to get a little louder and a little more obnoxious.

So rather than design the typed records concept as a replacement for ruby-marc, it made more sense to treat it as an extension to ruby-marc.  By monkey patching, the regular MARC parser/writer can remain the same, but if you want to look a little more closely at the contents, it will override the behavior of the original classes and objects and add a whole bunch of new functionality.  For MARC records, it’s analogous to how Facets adds all kinds of convenience methods to String, Fixnum, Array, etc.
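
To make that concrete, here’s a rough illustration of the idea (this is not the actual enhanced_marc source; the method name and logic are just made up for the example): reopen ruby-marc’s MARC::Record and hang a convenience method off of it.

require 'marc'

# Illustration only -- not enhanced_marc's actual code.
# Reopen ruby-marc's MARC::Record and add a convenience method.
module MARC
  class Record
    # true if leader position 6 flags this as language material
    def language_material?
      ['a', 't'].include?(leader[6, 1])
    end
  end
end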

So, now it has its own github project:  enhanced-marc.

If you want to install it:

  gem sources -a http://gems.github.com
  sudo gem install rsinger-enhanced_marc

There are some really simple usage instructions on the project page, and I’ll try to get the rdocs together as soon as I can.  In a nutshell, it works almost exactly like ruby-marc does:

require 'enhanced_marc'

records = []
reader = MARC::Reader.open('marc.dat')
reader.each do |record|
  records << record
end

As it parses each record, it examines the leader to determine what kind of record it is:

  • MARC::BookRecord
  • MARC::SerialRecord
  • MARC::MapRecord
  • MARC::ScoreRecord
  • MARC::SoundRecord
  • MARC::VisualRecord
  • MARC::MixedRecord

and adds a bunch of format-specific methods appropriate for, say, a map.

It’s then possible to extract either the MARC codes or the (English) human-readable strings that the codes represent:

record.class
=> MARC::SerialRecord
record.frequency
=> "d"
record.frequency(true)
=> "Daily"
record.serial_type(true)
=> "Newspaper"
record.is_conference?
=> false

or, say:

record.class
=> MARC::VisualRecord
record.is_govdoc?
=> true
record.audience_level
=> "j"
record.material_type(true)
=> "Videorecording"
record.technique(true)
=> "Animation"

And so on.

There is still quite a bit I need to add.  It pretty much ignores mixed records at the moment; they’re uncommon enough that they’re currently a lower priority, but I’ll need to get to them eventually.  I also need to provide some methods that evaluate the 007 field.  I haven’t gotten to that yet because it’s a ton of tedium, but it would be useful, so I want to get it in there.

If there is interest, it could perhaps be extended to include authority records or holdings records.  It would also be handy to have convenience methods on the data fields:

record.isbn
=> "0977616630"
record.control_number
=> "793456"

Anyway, hopefully somebody might find this to be useful.

While what I’m posting here might be incredibly obvious to anyone who understands Unicode or Ruby better than I do, it was new to me and might be new to you, so I’ll share.

Since Ed already let the cat out of the bag about LCSubjects.org, I can explain the backstory here.  At lcsh.info, Ed made the entire dataset available as N-Triples, so just before he yanked the site, I grabbed the data and have been holding onto it since.  I wrote a simple little N-Triples parser in Ruby to rewrite some of the data before I loaded it into the Platform store I have.  My first pass at this was really buggy:  I wasn’t parsing N-Triples literals well at all and was leaving out quoted text within the literal and whatnot.  I also, inadvertently, was completely ignoring the escaped unicode within the literals and sending it through verbatim.

N-Triples escapes unicode the same way Python string literals do (or at least this is how I’ve understood it), so 7⁰03ʹ43ʺN 151⁰56ʹ25ʺE is serialized into N-Triples as: 7\u207003\u02B943\u02BAN 151\u207056\u02B925\u02BAE.  Try as I might, I could not figure out how to turn that back into unicode.

Jonathan Rochkind recommended that I look at the Ruby JSON library for some guidance, since JSON also encodes this way.  With that, I took a peek in JSON::Pure::Parser and modified parse_string for my needs.  So, if you have escaped unicode strings like this, and want them to be unicode, here’s a simple class to handle it.

# Note: this is Ruby 1.8 code ($KCODE, jcode and Iconv don't survive into modern Rubies).
$KCODE = 'u'
require 'strscan'
require 'iconv'
require 'jcode'

# Adapted from JSON::Pure::Parser#parse_string to unescape \uXXXX sequences.
class UTF8Parser < StringScanner
  STRING = /(([\x0-\x1f]|[\\\/bfnrt]|\\u[0-9a-fA-F]{4}|[\x20-\xff])*)/nx
  UNPARSED = Object.new
  UNESCAPE_MAP = Hash.new { |h, k| h[k] = k.chr }
  UNESCAPE_MAP.update({
    ?"  => '"',
    ?\\ => '\\',
    ?/  => '/',
    ?b  => "\b",
    ?f  => "\f",
    ?n  => "\n",
    ?r  => "\r",
    ?t  => "\t",
    ?u  => nil,
  })
  UTF16toUTF8 = Iconv.new('utf-8', 'utf-16be')
  def initialize(str)
    super(str)
    @string = str
  end
  def parse_string
    if scan(STRING)
      return '' if self[1].empty?
      string = self[1].gsub(%r((?:\\[\\bfnrt"/]|(?:\\u(?:[A-Fa-f\d]{4}))+|\\[\x20-\xff]))n) do |c|
        if u = UNESCAPE_MAP[$&[1]]
          u
        else # \uXXXX
          bytes = ''
          i = 0
          while c[6 * i] == ?\\ && c[6 * i + 1] == ?u
            bytes << c[6 * i + 2, 2].to_i(16) << c[6 * i + 4, 2].to_i(16)
            i += 1
          end
          UTF16toUTF8.iconv(bytes)
        end
      end
      if string.respond_to?(:force_encoding)
        string.force_encoding(Encoding::UTF_8)
      end
      string
    else
      UNPARSED
    end
  rescue Iconv::Failure => e
    # the JSON code this was adapted from raises JSON's GeneratorError here;
    # a plain RuntimeError keeps the class self-contained
    raise RuntimeError, "Caught #{e.class}: #{e}"
  end
end
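
To give an idea of how it gets used, here’s a minimal sketch against the class above (assuming Ruby 1.8, like the class itself):

# the escaped literal from earlier in the post
literal = '7\u207003\u02B943\u02BAN 151\u207056\u02B925\u02BAE'
puts UTF8Parser.new(literal).parse_string
# => 7⁰03ʹ43ʺN 151⁰56ʹ25ʺE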

My relationship with Ruby nowadays is roughly akin to somebody addicted to pain killers.  I know it’s not good for me (since everything I work on nowadays is RDF, XML or both), but I’m able to stay productive, and the pain of quitting, while it would be better for everybody in the long run, just isn’t something I have time for right now.  Maybe someday I’ll make the jump back to Python (since it’s actually pretty good at dealing with both RDF and XML), but for now I’ll just find workarounds to my problems (unlike others, I am completely incapable of juggling more than one language).

I first ran into my big XML-and-Ruby problem a couple of weeks ago while working on the TalisLMS connector for Jangle.  I’ve, of course, run into it before, but it has never been a total show stopper like this.  In order to add the Resource entity (Jangle-ese for bibliographic records) to the TalisLMS connector, I am querying the Platform store the OPAC uses.  I’m using the Platform rather than the Zebra index that comes with Alto (the records are indexed in both places) because the modified date isn’t sortable in Zebra, and that would be an issue when serializing everything to Atom.  The records are transformed into a proprietary RDF format (called BibRDF) when loaded into the Platform (this is for the benefit of Prism, our OPAC).  In order to get the MARC records (there’s no route back to the MARC from BibRDF), I have to pull the UniqueIdentifier field (the mapped 001) out of the BibRDF, throw the values at a Z39.50 client (Ruby/ZOOM) and query the Zebra index.  In order to get enough metadata to create a valid Atom entry, I needed to be able to parse the BibRDF (which comes out of the Platform as RDF/XML), since that is the default record format.

And this is where I’d run into problems.  I have the default number of records returned by Jangle set to 100.  That’s a pretty sweet spot for both the servers handling the load and the clients dealing with the resulting Atom document.  Well, you’d think it was, anyway, except REXML was taking about 10 seconds to parse the Platform response into Ruby objects.

I realize the Rubyists out there are already dismissing this and scrolling down to the comment box to write “well don’t use REXML, you dumbass”, but let me explain.  I generally don’t use REXML (unless it’s something very small and simple), instead opting for Hpricot for parsing XML.  I’ve tended to avoid LibXML in Ruby; when I first tried it, it segfaulted a lot.  That was in the past, though: my reason for avoiding it lately is that I have this stubborn ideal about having things work with JRuby, and that’s just not going to be an option with LibXML (before you scroll down and add another comment about the Ruby/ZOOM requirement, it will eventually be replaced with Ruby-SRU… probably).  Hpricot was falling flat on its face with the BibRDF namespace prefixes, though (j.0:UniqueIdentifier).  It seems to have problems with periods in the prefix, so that was a no go.

So I had REXML and I had horrible performance.  Now what?

Well, JSON is fast in Ruby, so I thought that might be an option.  The Platform has a transform service:  if you pass an argument with the URL of an XSLT stylesheet, it will output the result in the format you want.  Googling turned up several projects that would turn XML into JSON via XSLT (this one seems the best if you have an XSLT 2.0 processor), but they weren’t quite what I needed.  I wanted to preserve the original RDF/XML, since I was just going to be turning around and regurgitating it back to the Jangle server anyway.  I just needed a quick way to grab the UniqueIdentifier, MainAuthor and LastModified fields and shove the rest of the XML into an object attribute.

I have always chafed at the thought of actually doing anything in XSLT.  In retrospect (after using it almost exclusively for a month now), I realize that my opinion was probably the result of the data I was trying to transform (EAD, the metadata format designed to punish technologists) rather than of XSLT itself (that project got sucked into a vortex when I tried working with the EAD directly in Ruby, too).  Still, I had always resisted.  The syntax is weird, variables confused me, and I just never got the hang of it.

But, damn, it’s fast.

And, when I turned the XML into JSON (with XSLT), it was perfect.  Here’s my stylesheet.  Here’s what the output from the Platform looks like.  Here’s the output from the TalisLMS connector.

I wasn’t done, yet, though.  The DLF ILS-DI Adapter for Jangle’s OAI-PMH service was sooooo slow.  Requests were literally taking around 35 seconds each.  This was because I was using FeedTools to parse the Atom documents and Builder::XmlMarkup to generate the OAI-PMH output.  And this was silly.  Atom is a very short hop to OAI-PMH, and there was really no need to manipulate the data itself at all.  However, I did need to add stuff to the final XML output that I wouldn’t know until it was time to render.  So I wrote these two XSLTs.  I have patterns in there which are identified by “##verb##” or “##requestUrl##”, etc.  This way, I can load the XSLT file into my Ruby script, replace the patterns with their real values via regex, and then transform the Atom to OAI-PMH using libxslt-ruby.  Requests are now down to about 5 seconds.  Not bad.
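
In case the pattern-swapping bit sounds odd, here’s roughly what it looks like in code.  This is only a sketch: the file names and the ##verb##/##requestUrl## values are placeholders, and the require lines and Stylesheet#apply call reflect my reading of libxslt-ruby rather than anything canonical.

require 'libxml'
require 'libxslt'

# Swap the placeholders in the stylesheet source before compiling it,
# then apply it to the Atom document from the connector.
xsl_source = File.read('atom_to_oaipmh.xsl')
xsl_source.gsub!('##verb##', 'ListRecords')
xsl_source.gsub!('##requestUrl##', 'http://example.org/jangle/oai')

atom_xml   = File.read('atom_response.xml')  # the Atom feed to transform
stylesheet = LibXSLT::XSLT::Stylesheet.new(LibXML::XML::Document.string(xsl_source))
puts stylesheet.apply(LibXML::XML::Document.string(atom_xml))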

All in all I’m pretty happy with this.  And I don’t have to quit my addiction just yet.

For those of you who noticed that libxslt-ruby doesn’t quite jibe with my JRuby requirement:  well, I guess I’m not very dogmatic at the end of the day (which is right about now).

Something I’ve taken it upon myself to do since I joined Talis is make ActiveRDF a viable client to access the Platform.  While this is mostly selfishness on my part (I want to keep developing in Ruby and there’s basically no RDF support right now, plus this gives me a chance to learn about the RDF/SPARQL-y aspects of the Platform), I also think that libraries like this can only help democratize the Platform.

So far, it’s been pretty ugly.  I haven’t had much time to work on it, granted, but the time I’ve spent on it has made me think that there will be a lot of work to do.  Couple this with some of the things that make the Platform difficult to work with in Ruby anyway (read: Digest Authentication) and this might be a more uphill battle than I’ll ever have time for, but I figure it’s either this or go back to Python and I’m not quite ready to give up on Ruby yet.

Currently, performance is abysmal with ActiveRDF against the Platform, so I’ll need to think of shortcuts to improve that (I’m not even considering write access presently).  Here’s some code (this is as much for my benefit, so I can remember what I’ve done) to work with Ian Davis’ Quotations Book Example store:

require 'time' # Otherwise ActiveRDF starts freaking out about DateTime
require 'active_rdf'

$activerdf_without_xsdtype = true
# less than ideal, but without it, ActiveRDF sends
# ^^<http://www.w3.org/2001/XMLSchema#string> with string literals even if you don't want
# to send the datatype.  I haven't actually tried it with other datatypes to see how this breaks
# down the road.

ConnectionPool.set_data_source(:type => :sparql, :results => :sparql_xml, :engine => :joseki, :url => "http://api.talis.com/stores/iand-dev2/services/sparql")

Namespace.register :foaf, "http://xmlns.com/foaf/0.1/"
Namespace.register :dc, "http://purl.org/dc/elements/1.1/"
Namespace.register :quote, "http://purl.org/vocab/quotation/schema"

QUOTE::Quotations.find_by_dc::creator("Loren, Sophia").each do |quote|

# print the important stuff from each graph

# http://purl.org/vocab/quotation/schema#quote has to be manually added as a predicate
# the “#” seems to cause problems
quote.add_predicate(:quote, QUOTE::quote)
puts quote.quote
puts quote.subject
puts quote.rights
puts quote.isPrimaryTopicOf

end

If you actually try to execute this, you’ll see that it takes a long time to run (God help you if you try it on QUOTE::Quotations.find_by_dc::subject(“Age and Aging”)).  A really long time.

If you set some environment vars before you go into irb:

$ export ACTIVE_RDF_LOG_LEVEL=0
$ export ACTIVE_RDF_LOG=./activerdf.log

then you can tail -f activerdf.log and see what exactly is happening.

After ActiveRDF does its initial SPARQL query (SELECT DISTINCT ?s WHERE { ?s <http://purl.org/dc/elements/1.1/creator> "Loren, Sophia" . }), it does two things for every request in the block:

  1. a SPARQL query for every predicate associated with the URI (http://api.talis.com/stores/iand-dev2/services/sparql?query=SELECT+DISTINCT+%3Fp+WHERE+%7B+%3Chttp%3A%2F%2Fapi.talis.com%2Fstores%2Fiand-dev2%2Fitems%2F1187139384317%3E+%3Fp+%3Fo+.+%7D+)
  2. a SPARQL query for the value of the attribute (predicate):  http://api.talis.com/stores/iand-dev2/services/sparql?query=SELECT+DISTINCT+%3Fo+WHERE+%7B+%3Chttp%3A%2F%2Fapi.talis.com%2Fstores%2Fiand-dev2%2Fitems%2F1187139384317%3E+%3Chttp%3A%2F%2Fpurl.org%2Fdc%2Felements%2F1.1%2Fcreator%3E+%3Fo+.+%7D

for every predicate in the graph.  You can imagine how crazily inefficient this is, since to get every value for a resource, you have to make a different HTTP request for each one.

Obviously this would be a lot easier if it used DESCRIBE rather than SELECT, but without a real RDF library to parse the resulting graph, I’m not sure how ActiveRDF would deal with what the triple store returned.
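
Just to illustrate what I mean, the single-request version would look something like this (a sketch only; the item URI is the one from the queries above, and parsing the returned RDF/XML into something ActiveRDF can use is exactly the part that’s still missing):

require 'net/http'
require 'uri'
require 'cgi'

# Pull the whole graph for one resource in a single round trip with DESCRIBE,
# instead of one SELECT per predicate.
endpoint = 'http://api.talis.com/stores/iand-dev2/services/sparql'
query    = 'DESCRIBE <http://api.talis.com/stores/iand-dev2/items/1187139384317>'
rdfxml   = Net::HTTP.get(URI.parse("#{endpoint}?query=#{CGI.escape(query)}"))
puts rdfxml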

So, anyway, these are some of the hurdles in making ActiveRDF work with the Platform, but I’m not quite ready to throw in the towel, yet.

Sometime in November, I came to the realization that I had horribly misinterpreted the NISO Z39.88/OpenURL 1.0 spec.  I’m on the NISO Advisory Committee for OpenURL (which makes this even more embarrassing) and was reviewing the proposal for the Request Transfer Message Community Profile and its associated metadata formats when it dawned on me that my mental model was completely wrong.  For those of you that have primarily dealt with KEV based OpenURLs (which is 99% of all the OpenURLs in the wild), I would wager that your mental model is probably wrong, too.

A quick primer on OpenURL:

  • OpenURL is a standard for transporting ContextObjects (basically a reference to something, in practice, mostly bibliographic citations)
  • A ContextObject (CTX, for short from now on) is comprised of Entities that help define what it is.  Entities can be one of six kinds:
    • Referent – this is the meat of the CTX, what it’s about, what you’re trying to get context about.  A CTX must have one referent and only one.
    • ReferringEntity – this defines the resource that cited the referent.  This is optional and can only appear once.
    • Referrer – the source of where the CTX came from (i.e. the A&I database).  This is optional and can only appear once.
    • Requester – this is information about who is making the request (i.e. the user’s IP address).  This is optional and can only appear once.
    • ServiceType – this defines what sorts of services are being requested about the referent (i.e. getFullText, document delivery services, etc.).  There can be zero or many ServiceType entities defined in the CTX.
    • Resolver – these are messages specifically to the resolver about the request.  There can be zero or more Resolver entities defined in the CTX.
  • All entities are basically the same in what they can hold:
    • Identifiers (such as DOI or IP Address)
    • By-Value Metadata (the metadata is included in the Entity)
    • By-Reference Metadata (the Entity has a pointer to a URL where you can retrieve the metadata, rather than including it in the CTX itself)
    • Private Data (presumably data, possibly confidential, between the entity and the resolver)
  • A CTX can also contain administrative data, which defines the version of the ContextObject, a timestamp and an identifier for the CTX (all optional)
  • Community Profiles define valid configurations and constraints for a given use case (for instance, scholarly search services are defined differently than document delivery).  Context objects don’t actually specify any community profile they conform to.  This is a rather loose agreement between the resolver and the context object source:   if you provide me with a SAP1, SAP2 or Dublin Core compliant OpenURL, I can return something sensible.
  • There are currently two registered serializations for OpenURL:  Key/Encoded Values (KEV), where all of the values are output in a single string, formatted as key=value and delimited by ampersands (this is what the majority of OpenURLs that currently exist look like), and XML (which is much rarer, but also much more powerful)
  • There is no standard OpenURL ‘response’ format.  Given the nature of OpenURL, it’s highly unlikely that one could be created that would meet all expected needs.  A better alternative would be for a particular community profile to define a response format since the scope would be more realistic and focused.

Looking back on this, I’m not sure how “quick” this is, but hopefully it can bootstrap those of you that have only cursory knowledge of OpenURL (or less).  Another interesting way to look at OpenURL is Jeff Young’s 6 questions approach, which breaks OpenURL down to “who”, “what”, “where”, “when”, “why” and “how”.

One of the great failings of OpenURL (in my mind, at least) is the complete and utter lack of documentation, examples, dialog or tutorials about its use or potential.  In fact, outside of COinS, maybe, there is no notion of “community” to help promote OpenURL or cultivate awareness or adoption.  To be fair, I am as guilty as anybody for this failure:  I had proposed building a community site for OpenURL, but a shift in job responsibilities, then a wholesale change of employers, coupled with the hacking of the server it was to live on, left it by the wayside.  I’m putting this back on my to-do list.

What this lack of direction leads to is that would-be implementors wind up making a lot of assumptions about OpenURL.  The official spec published at NISO is a tough read and is generally discouraged by the “inner core” of the OpenURL universe (the Herbert van de Sompels, the Eric Hellmans, the Karen Coyles, etc.) in favor of the “Implementation Guidelines” documents.  However, only the KEV Guidelines are actually posted there.  The only other real avenue for trying to come to grips with OpenURL is to dissect the behavior of link resolvers.  Again, in almost every instance this means you’re working with KEVs and the downside of KEVs is that they give you a very naive view of OpenURL.

KEVs, by their very nature, are flat and expose next to nothing about the structure of the model of the context object they represent.  Take the following, for example:

url_ver=Z39.88-2004&url_tim=2003-04-11T10%3A09%3A15TZD
&url_ctx_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Actx&ctx_ver=Z39.88-2004
&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&ctx_id=10_8&ctx_tim=2003-04-11T10%3A08%3A30TZD
&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.aulast=Vergnaud
&rft.auinit=J.-R&rft.btitle=D%C3%A9pendances+et+niveaux+de+repr%C3%A9sentation+en+syntaxe
&rft.date=1985&rft.pub=Benjamins&rft.place=Amsterdam%2C+Philadelphia
&rfe_id=urn%3Aisbn%3A0262531283&rfe_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook
&rfe.genre=book&rfe.aulast=Chomsky&rfe.auinit=N&rfe.btitle=Minimalist+Program
&rfe.isbn=0262531283&rfe.date=1995&rfe.pub=The+MIT+Press&rfe.place=Cambridge%2C+Mass
&svc_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Asch_svc&svc.abstract=yes
&rfr_id=info%3Asid%2Febookco.com%3Abookreader

Ugly, I know, but bear with me for a moment.  From this example, let’s focus on the Referent:

rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.aulast=Vergnaud
&rft.auinit=J.-R&rft.btitle=D%C3%A9pendances+et+niveaux+de+repr%C3%A9sentation+en+syntaxe
&rft.date=1985&rft.pub=Benjamins&rft.place=Amsterdam%2C+Philadelphia

and then let’s make this a little more human readable:

rft_val_fmt:  info:ofi/fmt:kev:mtx:book
rft.genre:  book
rft.aulast:  Vergnaud
rft.auinit:  J.-R
rft.btitle:  Dépendances et niveaux de représentation en syntaxe
rft.date:  1985
rft.pub:  Benjamins
rft.place:  Amsterdam, Philadelphia

Looking at this example, it’s certainly easy to draw some conclusions about the referent, the most obvious being that it’s a book.

Actually (and this is where it gets complicated and I begin to look pedantic), it’s really only telling you “I am sending some by-value metadata in the info:ofi/fmt:kev:mtx:book format”, not that the thing is actually a book (although the info:ofi/fmt:kev:mtx:book metadata values do state that; ignore that for a minute, since genre is optional).

The way this actually should be thought of:

ContextObject:
    Referent:
       Metadata by Value:
          Format:  info:ofi/fmt:kev:mtx:book
          Metadata:
             Genre:  book
             Btitle:  Dépendances et niveaux de représentation en syntaxe
             …
    ReferringEntity:
       Identifier:  urn:isbn:0262531283
       Metadata by Value:
           Format:  info:ofi/fmt:kev:mtx:book
           Metadata:
               Genre:  book
               Isbn:  0262531283
                Btitle:  Minimalist Program
               …
    Referrer:
       Identifier:  info:sid/ebookco.com:bookreader
    ServiceType:
       Metadata By Value:
           Format:  info:ofi/fmt:kev:mtx:sch_svc
           Metadata:
               Abstract:  yes

So, this should still seem fairly straightforward, but the hierarchy certainly isn’t evident in the KEV.  It’s a good starting point to begin talking about the complexity of working with OpenURL, though, especially if you’re trying to create a service that consumes OpenURL context objects.

Back to the referent metadata.  The context object didn’t have to send the data in the “metadata by value” stanza.  It could have just sent the identifier “urn:isbn:9027231141” (and note in the above example, it didn’t have an identifier at all).  It could also have sent metadata in the Dublin Core format, MARC21, MODS, ONIX or all of the above (the Metadata By Value element is repeatable) if you wanted to make sure your referent could be parsed by the widest range of resolvers. While all of these are bibliographic formats, in Request Transfer Message context objects (which would be used for document delivery, which got me started down this whole path), you would conceivably have one or more of the aforementioned metadata types plus a Request Transfer Profile Referent type that describes the sorts of interlibrary loan-ish types of data that accompany the referent as well as an ISO Holdings Schema metadata element carrying the actual items a library has, their locations and status.

If you have only run across KEVs describing journal articles or books, this may come as a bit of a surprise.  Instead of saying the above referent is a book, it becomes important to say that the referent contains a metadata package (as Jonathan Rochkind calls it) that is in this (OpenURL-specific) book format.  In this regard, OpenURL is similar to METS.  It wraps other metadata documents and defines the relationships between them.  It is completely agnostic about the data it is transporting and makes no attempt to define it or format it in any way.  The Journal, Book, Patent and Dissertation formats were basically contrived to make compatibility with OpenURL 0.1 easier, but they are not intrinsic to OpenURL and could have just as easily been replaced with, say, BibTeX or RIS (although the fact that they were created alongside Z39.88 and are maintained by the same community makes the distinction difficult to see).

What this means, then, is that in order to know anything about a given entity, you also need to know about the metadata format that is being sent about it.  And since that metadata could literally be in any format, it means there are lot of variables that need to be addressed just to know what a thing is.

For the Umlaut, I wrote an OpenURL library for Ruby as a means to parse and create OpenURLs.  Needless to say, it was originally written with that naive, KEV-based, mental model (plus some other just completely errant assumptions about how context objects worked) and, because of this, I decided to completely rewrite it.  I am still in the process of this, but am struggling with some core architectural concepts and am throwing this out to the larger world as an appeal for ideas or advice.

Overall the design is pretty simple:  there is a ContextObject object that contains a hash of the administrative metadata and then attributes (referent, referrer, requester, etc.) that contain Entity objects.

The Entity object has arrays of identifiers, private data and metadata.
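
In rough Ruby terms, the shape I have in mind looks something like this (illustrative only; these aren’t necessarily the names the library will end up using):

# The rough shape of the design described above -- not actual library code.
class Entity
  attr_reader :identifiers, :private_data, :metadata

  def initialize
    @identifiers  = []
    @private_data = []
    @metadata     = []   # native metadata objects built by a MetadataFactory
  end
end

class ContextObject
  attr_reader :admin, :service_types, :resolvers
  attr_accessor :referent, :referring_entity, :referrer, :requester

  def initialize
    @admin         = {}          # version, timestamp, identifier
    @referent      = Entity.new  # exactly one referent
    @service_types = []          # zero or many
    @resolvers     = []          # zero or many
  end
end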

And then this is where I start to run aground.

The original (and current) plan was to populate the metadata array with native metadata objects that are generated by registering metadata classes in a MetadataFactory class.  The problem, you see, is that I don’t want to get into the business of having to create classes to parse and access every kind of metadata format that gets approved for Z39.88.  For example, Ed Summers’ ruby-marc has already solved the problem of effectively working with MARC in Ruby, so why do I want to reinvent that wheel?  The counter argument is that, by delegating these responsibilities to third-party libraries, there is no consistency of APIs between “metadata packages”.  A method used in format A may very well raise an exception (or, worse, overwrite data) in format B.

There is a secondary problem:  third-party libraries aren’t going to have any idea that they’re in an OpenURL context object, or even know what that is.  This means there would have to be some class that handles functionality like XML serialization (since ruby-marc doesn’t know that Z39.88 refers to it as info:ofi/fmt:xml:xsd:MARC21), although this could be handled by the specific metadata factory class.  This would also be necessary when parsing an incoming OpenURL, since, theoretically, every library could have a different syntax for importing XML, KEVs or whatever other serialization is devised in the future.
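
To show what I mean by the factory idea, here’s a sketch (again, the names, the block-based registration and the ruby-marc delegation are all just my illustration, not anything that exists yet):

require 'marc'
require 'stringio'

# One way a MetadataFactory could delegate to third-party parsers while
# keeping the Z39.88 format identifiers in one place.
class MetadataFactory
  @handlers = {}

  def self.register(format_uri, &handler)
    @handlers[format_uri] = handler
  end

  def self.build(format_uri, raw)
    handler = @handlers[format_uri]
    raise "No handler registered for #{format_uri}" unless handler
    handler.call(raw)
  end
end

# e.g. hand MARC21 XML off to ruby-marc, which neither knows nor cares that
# Z39.88 refers to it as info:ofi/fmt:xml:xsd:MARC21
MetadataFactory.register('info:ofi/fmt:xml:xsd:MARC21') do |raw|
  records = []
  MARC::XMLReader.new(StringIO.new(raw)).each { |rec| records << rec }
  records.first
end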

So I’m looking for advice on how to proceed.  All ideas welcome.

…although LinuX_Xploit_Crew, with all due respect, I think it actually is.

Oh well, we’re back with a new theme (which nobody will see except to read comments, since I’m pretty sure all traffic comes from the code4lib planet) and an updated WordPress install. Look out, world!

So, in the downtime here’s a non-comprehensive rundown of what I’ve been working on:

  1. I’ve written an improved (at least, I think it’s improved) alternative to Docutek’s ERes RSS interface. Frankly, Docutek’s sucked. Maybe we have an outdated version of ERes, but the RSS feeds would give errors: you have to click through a copyright scare page before you can view reserves, and the links in the RSS feed can’t carry you through that form to the item. I wrote a little Ruby/Camping app that takes URLs like http://eres.library.gatech.edu/course/WS-1001-A/Fall/2007 and turns them into a usable feed. I needed the course id/term/year format to show them in Sakai. My favorite part of this project was finding RubyScript2Exe, which lets me bundle just one file (the compiled Camping app) plus a configuration file. Granted, an ASP.NET app would be even easier for sites to install, but I didn’t have time to learn ASP.NET. I have more ideas of what I would like to do with this (such as showing current circ status for physical reserves), but in the chaos that is our library reorg, I haven’t gotten around to even showing anybody what I’ve written so far.
  2. I broke ground on a Metalib X-Server Ruby library. It took me a while to wrap my head around how this needed to be modelled, but I think it’s starting to take shape. It doesn’t actually perform queries yet, but it connects to the server, lets you set the portal association, and lets you find and set the category to search. Quicksets and MySets are all derivations of the same concept, so I don’t think it will take me long to incorporate actual searching. For proof-of-concept, I plan on embedding this library in MemoryHole, our Solr-based discovery app. I’ve actually stopped development of MemoryHole so we can focus on vuFind, since they do functionally the same thing and I’d rather help make vuFind better than replicate everything it does, only in Ruby. The reason I’m doing this proof-of-concept in MemoryHole rather than vuFind is solely due to familiarity and time.

In other news, my last post seems to have caused a bit of a stir. My plan is to write a response, but the short of it is that I feel the arguments for an MLS are extremely classist.

Also my bathroom is finished and it looks great.

ERESidue


Before I left for Guatemala, Ian Davis at Talis asked if I could give him a dump of our MARC records to load into Talis Platform. I had been talking in the #code4lib channel about how I was pushing the idea of using Talis Source to make simple, ad-hoc union catalogs; we could make one for Georgia Tech & Emory (we have joint degree programs) or Arche or Georgia Tech/Atlanta-Fulton Public Library, etc. My thinking was that by utilizing the Talis Platform, we could forgo much of the headache in actually making a union catalog for somewhat marginal use cases (the public library one notwithstanding).

About a week after I got back from Guatemala, I had an email from Richard Wallis with some URLs to play around with to access my Bigfoot store. He showed me the search, facet and augment services. I wasn’t really able to dive into it much at the time, but since I’m working on a total site search project for the library, I thought this would be a good chance to kick the tires a bit and include catalog results.

After two days of poking around, I have made some opinions of it, have some recommendations for it, and wrote a Ruby library to access it.

1) The Item Service

This is certainly the most straightforward and, for many people, the most useful service of the bunch. The easiest way to think of the item service is as an HTTP-based Lucene service (a la Solr or Lucene-WS) over your bib records. It returns something OpenSearch-y (it claims to be an RSS 1.0 document), but it doesn’t validate. That being said, FeedTools happily consumed it (more on that later) and the semantics should be familiar to anyone who has looked at OpenSearch before. Each item node also contains a Dublin Core representation of the record and a link to a MARCXML representation. I’m not sure if there’s a description document for Bigfoot.
Although the query syntax is pure Lucene (title:”The Lexus and the Olive Tree”), the downside is that it’s not documented anywhere what the indexes are and I doubt there would be any way to add new ones (for example, my guess is I wouldn’t be able to get an index for 490/440$v that I use for the Umlaut). I don’t see returning the results as OAI_DC being too much of a problem, since the RSS item includes a title (which would have been tricky between the DC and the marcxml). My Ruby library might not generate valid DC, I haven’t really looked into it.
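
For what it’s worth, hitting the item service from plain Ruby is about this simple.  Treat it strictly as a sketch: the store name is made up and the /items URL pattern is my best recollection of how the Platform lays things out, not something I’ve confirmed in any documentation.

require 'open-uri'
require 'cgi'

# Query the item service directly with a Lucene-style query.
store = 'my-store'  # placeholder store name
query = 'title:"The Lexus and the Olive Tree"'
rss   = open("http://api.talis.com/stores/#{store}/items?query=#{CGI.escape(query)}").read
puts rss  # an OpenSearch-flavored RSS 1.0 document, one item per record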

The docs also mention you can POST items to your Bigfoot store, but they don’t mention what your data needs to look like (MARC?) or what credentials you need to add something (I mean, it must be more than just your store name, right?). My hope is to add this functionality to bigfoot-ruby soon (especially since my data is from a bulk export from last October).

2) The Facet Service

This one is intriguing, definitely, since faceted searching is all the rage right now. The search syntax is basically the same as the Item Service’s, except you also send a comma-delimited list of the fields you would like to query. What you get back is either an XML or XHTML document of your results.

For each field you request, you get back a set of terms (you can specify how many you want, with a default of 5) that appear most frequently in your field. You also get an approximation for how many results you would get in that facet and a url to search on that facet. It’s quite fast, although, realistically, you can’t do much with the output of facet search alone.

Again, it’s difficult to know what you can facet on (subject, creator and date are all useful — I’m sure there are others) and the facet that (for me, at least) held the most promise — type — is too broad to do much with (it uses Leader position 7, but lumps the BKS and SER types all in a label called “text”). I would like to see Talis implement something like my MARC::TypedRecord concept so one could facet on things like government document or conference. You could separate newspapers from journals and globes from maps. Still, the text analysis of the non-fixed fields is powerful and useful and beats the hell out of trying to implement something like that locally.

In bigfoot-ruby, I have provided two ways to do a faceted search: you can just do the search and get back Facet objects containing the terms and search urls or you can facet with items which executes the item searches automatically (in turn getting a definitive number of results for the query, as well). Since I didn’t bother to implement threading, getting facets with items can be pretty slow.

3) The Augment Service

To be honest, I’m having a hard time figuring out useful scenarios for the augment service. The idea is that you give it the URI of an RSS feed, and this service will enhance it with data from your Bigfoot store (at least, that is sort of how I understand it works). Richard’s example for me was to feed it the output of an xISBN query (which isn’t in RSS 1.0, AFAIK, but, for the sake of example…) and the augment service would fill in the data for ISBNs your library holds. The API example page mentions Wikipedia, but I don’t know where other than the Talis Platform that you can get Wikipedia entries formatted properly. I tried sending it the results of an Umlaut2 OpenSearch query, but it didn’t do anything with it. Presumably this RSS 1.0 feed needs the bib data to be sent in a certain way (my guess is in OAI_DC, like the Item Service), but I’m not sure. The only use case I can think of for this service is a much simpler way to check for ISBN concordance (rather than isbn:(123456789X|223456789X|323456789X|etc.))

Overall, I’m really impressed with the Talis API. It is a LOT easier to use than, say, Z39.50 and by using OpenSearch seems more natural to integrate into existing web services than SRU.

Bigfoot-ruby is definitely a work in progress. I think I would like to split the Search class into ItemService and FacetService. I don’t like how results is an Array for items and a Hash for facets. Just seems sloppy. I need to document it, of course and I would like to implement Item POST. This project also made me realize how bloody slow FeedTools is. I am currently using it in both the Umlaut and the Finding Aids to provide OpenSearch, but I think it’s really too sluggish to justify itself.

Thanks, Talis, for getting me started with Bigfoot and giving me the opportunity to play around with it. Also, thanks to Ed Summers for fixing SVN on Code4lib.org. You wouldn’t be able to download it and futz around with it yourself, otherwise.