Let me start this by saying this is not a criticism or a rant against any of the technologies I am about to mention. The problems I am having are pretty specific and the fact that I am intentionally trying to use “off the shelf” commodity projects to accomplish my goal complicates things. I realize when I tweet things like this, it’s not helpful (because there is zero context).
I’ve been in a bit of a rut this week. Things were going ok on Monday, when I got the Jangle connector for Reserves Direct working, announced and started generating some conversation around how to model course reserves (and their ilk) in Jangle. However, this left me without anything specific to work on. I have a general, hand-wavy, project that I am supposed to be working on to provide a simple, general digital asset management framework that can either work on the Platform or with a local repository like Fedora, depending on the institution’s local needs or policies. More on this in some other post. The short of it is, I need to gather some general requirements before I can begin something like this in earnest, which led me to revive an old project.
When Jangle first started, about 15 months ago, Elliot and I felt we needed what we called “the Base Connector”. The idea here was that there were always going to be situations where a developer doesn’t have direct, live access to a system’s data, and they would need a surrogate external database to work with. The Base Connector was an attempt to provide an out of the box application that could simulate the basics of an ILS and be populated with the sorts of data you would get from command-line ‘report’ type applications. The sort of thing you can cron and write out to a file on your ILS server. Updates in catalog records. Changes in user status. Transactions. That sort of thing.
After the amount of interest at Code4lib in Janglifying III Millennium, I decided to revisit the concept of the Base Connector. Millennium’s lack of consistent database access (shared to an extent by Unicorn, and no doubt by others) makes it a good candidate for this duplicate database. I was hoping to take a somewhat different approach to this problem than Elliot and I had originally tried, however. I was hoping to be able to come up with something:
- More generically “Jangle” and less domain specific to ILSes
- Easy to install
- Customizable with a simple interface
- Something, preferably, that could be taken mostly “off the shelf”, where the only real “coding” I had to do was to get the library data in and the connector API out. I was hoping all “data model” and “management” stuff could be taken care of already.
In my mind, I was picturing using a regular CMS for this, although it needed to be able to support customized fields for resources.
Here is the rough scenario I am picturing. Let’s say you have an ILS that you don’t have a lot of access to. For your external ‘repository’, you’ll need it to be able to accommodate a few things.
- Resources will need not just a title, an identifier and the MARC record, but also have fields for ISBN, ISSN, OCLC number, etc. They’ll also need some sort of relationship to the Items and Collections they’re associated with.
- Actors could be simple system user accounts, but they’ll need first names and last names and whatnot.
- Collections, I assume, can probably be contrived via tags and whatnot.
- The data loading would probably need to be doable remotely via some command-line scripting.
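To make that list a bit more concrete, here is a minimal Python sketch of the records the bag of holding would need to hold, plus a loader for one line of an ILS report. Everything in it is an assumption for illustration: the class and field names, the `resource_from_report_line` helper, and the tab-delimited report layout are all made up, and none of it is part of Jangle or any of the CMSes discussed here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Resource:
    """A bibliographic record plus the extra fields we want to search on."""
    identifier: str
    title: str
    marc: bytes = b""  # raw MARC record as exported from the ILS
    isbn: Optional[str] = None
    issn: Optional[str] = None
    oclc_number: Optional[str] = None
    item_ids: List[str] = field(default_factory=list)     # related Items
    collections: List[str] = field(default_factory=list)  # tags standing in for Collections


@dataclass
class Item:
    """A single holding attached to a Resource."""
    identifier: str
    resource_id: str
    status: str = "available"


@dataclass
class Actor:
    """A borrower; could map onto a plain system user account."""
    identifier: str
    first_name: str
    last_name: str


def resource_from_report_line(line: str) -> Resource:
    """Parse one line of a hypothetical tab-delimited report with the columns:
    identifier, title, ISBN, ISSN, OCLC number, comma-separated tags."""
    ident, title, isbn, issn, oclc, tags = line.rstrip("\n").split("\t")
    return Resource(
        identifier=ident,
        title=title,
        isbn=isbn or None,   # empty columns become None
        issn=issn or None,
        oclc_number=oclc or None,
        collections=[t.strip() for t in tags.split(",") if t.strip()],
    )
```

A cron job on the ILS server could write such a report to a file, and a script like this could run anywhere that can reach the data store, which is the decoupling I am after.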
I decided to try three different CMSes to accomplish this: Drupal, Plone and Daisy. I’ll go through each and where I ran into a snag for each. I want to reiterate here that I know next to nothing about any of these. My problems are probably not shortcomings of the projects themselves, but more due to my own ignorance. If you see possible solutions to my issues (or know of other possible projects that fit my need even better) please let me know. This is a cry for help, not a review.
One of the reasons I targeted Drupal is that it’s easy to get running, can run on cheap shared hosting, has quite a bit of traction in libraries and has CCK. I actually got the farthest with Drupal in this capacity. With CCK, I was able to, in the native Drupal interface, build content types for Resources and Items. For Actors, I had just planned on using regular user accounts (since then I could probably piggyback off of the contributed authentication modules and whatnot). Collections would be derived from Taxonomies.
Where things went wrong:
My desire is to decouple the ‘data load’ aspect of the process from the ‘bag of holding’ itself. What I’m saying is that I would prefer that the MARC/borrower/item status/etc. load not have to be built as a Drupal module. Instead, it should be possible to write it in whatever language the developer is comfortable with, with a simple way of getting that data into the bag of holding.
There are only two ways that I can see to use an external program to get data into Drupal:
- Use the XMLRPC interface
- Simulate the content creation forms with a scripted HTTP client.
I’m not above number two, but I would prefer not to if there’s a better way available. The problem is that I can find almost zero documentation on the XMLRPC service. What ‘calls’ are available? How do I create a Resource content type? How do I relate that to a user or an Item? I have no idea where to look. I don’t actually even know if the fields I created will be searchable (which was the whole point of making them).
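One way to start answering the “what calls are available?” question would be to ask the server itself. This is a sketch under an assumption I have not verified: that the Drupal endpoint supports the common XML-RPC introspection convention (`system.listMethods` and `system.methodHelp`). The helper names and the URL in the usage note are hypothetical; `xmlrpc.client.ServerProxy` is the Python standard library client.

```python
import xmlrpc.client


def connect(url):
    """Build a client proxy for an XML-RPC endpoint.
    No network traffic happens until a method is actually called."""
    return xmlrpc.client.ServerProxy(url)


def list_remote_methods(proxy):
    """Return the method names the server advertises, sorted.
    Assumes the server implements the system.listMethods introspection call."""
    return sorted(proxy.system.listMethods())


def describe_method(proxy, name):
    """Return the server's help text for one method.
    Assumes the server implements system.methodHelp."""
    return proxy.system.methodHelp(name)
```

Usage would look something like `proxy = connect("http://localhost/drupal/xmlrpc.php")` (a made-up local URL) followed by `list_remote_methods(proxy)`. If the server does not support introspection, the call raises `xmlrpc.client.Fault`, which at least answers the question in the negative. Note that this only tells you which calls exist; it says nothing about how CCK fields map onto them.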
Drupal seems promising for this, but I don’t know where to go from here.
I really thought Plone was going to be a winner. It’s completely self-contained (framework, webserver and database all rolled into one installer) and based on an OODB. Being Python based, I feel I can fall back on Python to build the scripts to actually do the dirty work of massaging and loading the data. The downside to Plone (and I have looked eye-to-eye with this downside before) is that it and Zope are total voodoo.
It didn’t take me long to run into a brick wall with Plone. I installed version 3.2.1 thanks to the handy OSX installer and got it up and running.
And then I couldn’t figure out what to do next. I think I want Archetypes. I followed the (outdated) instructions to install it. I see Archetypes stuff in the Zope control panel. However, I never see anything in Plone. I Google. Nothing. Feeling that it must be there and I’m just missing something, I follow this tutorial to start building new content types. I build a new content type. It doesn’t show up in the Plone quick installer. Nothing in the logs. I Google. Nothing.
Nothing frustrates me more than software making me feel like a total dumbass.
I am at the point where I think Plone might be up to the task, but I don’t have the interest, time or energy to make it work. At the end of the day, I’m still not entirely sure that it would meet my basic criteria of the ‘content type’ being editable within the native web framework anyway. I also have no idea if my plan of loading the data via an external Python (or, even better, Ruby) script is remotely feasible.
Plone got the brunt of my disgruntled tweeting. This is mainly due to my frustration at seeing how well Plone would fit in my vision and yet getting absolutely nowhere towards realizing that goal.
What, you’ve never heard of it? I have a history with Daisy, and I know, without a doubt, it could serve my needs. The problem with Daisy is that it has a lot of working parts. To do what I want, you need both the data repository and the wiki running, as well as MySQL. On top of that, some external web app would need to actually do the Jangle stuff (and, this would most likely be Ruby/Sinatra) interacting with the HTTP API. This is a lot of running daemons. A lot of daemons that might not be running at any given time which would break everything. Daisy is a lot of things, but it’s not ‘self-contained’.
This is not a criticism. If I was running a CMS, this would be ok. When I was developing the Communicat, this was ok. Those are commitments. Projects that you think, “ok, I’m going to need to invest some thought and energy into this”.
The bag of holding is a stop-gap. “I need to use this until I can figure out a real way to accomplish what I need”. Maybe it’s the ultimate production service. That’s fine, but it needs to scale down as far as it scales up. I literally want something that somebody can easily get running anywhere, quickly and start Jangling.
If anybody has any recommendations on how I can easily get up and running with any of the above projects, please let me know.
Alternately, if anybody knows of something else, a simple, remotely accessible, dynamic, searchable data store, definitely enlighten me! I realize the irony of this plea, given who I work for, but the idea here is for something not cloud based, since I would like for the user to be able to load in their sensitive patron data without having to submit it to some third party service. There’s also the fact that there’s no front end that I can just ‘plug in’ to manage the data.
If I can’t get anything off the shelf working, I think I’ll be reduced to writing something simple in Merb or Sinatra with CouchDB or Solr or something. I was really hoping to have to avoid doing this, though.