For the last couple of weeks I’ve returned to working on Alto Jangle connector, at least part-time.  I had shelved development on it for a while; I had a hard time finding anybody interested in using it and had reached a point where the development database I was working against was making it difficult to know what to expect in a real, live Alto system.  After I got wind of a couple of libraries that might be interested in it, I thought I should at least get it in a usable state.

One of the things that was vexing me prior to my hiatus was how to get Sybase to page through results in a semi-performant way.  I had originally blamed it on Grails, then when I played around with refactoring the connector in PHP (using Quercus, which is pretty slick by the way, to provide Sybase access via JDBC — the easiest way to do it) I realized that paging is just outside of Sybase’s capabilities.

And when you’re so used to MySQL, PostgreSQL and SQLite, this sort of makes your jaw drop (although, in its defense, it appears that this isn’t all that easy in Oracle, either — however, it’s at least possible in Oracle).

There seem to be two ways to do something like getting rows 375,000 – 375,099 from all of the rows in a table:

  1. Use cursors
  2. use SET ROWCOUNT 375100 and loop through and throw out the first 375,000 results.

The first option isn’t really viable.  You need write access to the database and it’s unclear how to make this work in most database abstraction libraries.  I don’t actually know that cursors do anything differently than option 2 besides pushing the looping to the database engine itself.  I was actually using cursors in my first experiments in JRuby using java.sql directly, but since I wasn’t aware of this problem at the time, I didn’t check to see how well it performed.

Option 2 is a mess, but this appears to be how GORM/Hibernate deals with paging in Sybase.  Cursors aren’t available in Quercus’ version of PDO, so it was how I had to deal with paging in my PHP prototypes, as well.  When I realized that PHP was not going to be any faster than Grails, I decided to just stick with Grails (“regular C-PHP” is out — compiling in Sybase support is far too heavy a burden).

This paging thing still needed to be addressed.  Offsets of 400,000 and more were taking more than twelve minutes to return.  How much more, I don’t know — I killed the request at the 12 minute mark.  While some of this might be result of a bad or missing index, any way you cut it, it wasn’t going to be acceptable.

I was kicking around the idea of exporting the “models” of the Jangle entities into a local HSQLDB (or whatever) mirror and then working the paging off of that.  I couldn’t help but think that this was sort of a waste, though — exporting from one RDBMS to another solely for the benefit of paging.  You’d have to keep them in sync somehow and still refer to the original Sybase DB for things like relationships and current item or borrower status.  For somebody that’s generally pretty satisfied with hacking together kludgy solutions to problems, this seemed a little too hack-y… even for my standards.

Instead, I settled on a different solution that could potentially bring a bunch of other features along with it.  Searchable is a Grails plugin for Compass, a project to easily integrate Lucene indexes with your Java domain classes (this would be analogous to Rails’ act_as_ferret).  When your Grails application starts up, Searchable will begin to index whatever models you declared as, well,  searchable.  You can even set options to store all of your attributes, even if they’re not actual database fields, alleviating the need to hit the original database at all, which is nice.  Initial indexing doesn’t take long — our “problem” table that took twelve minutes to respond takes less than five minutes to fully index.  It would probably take considerably less than that if the data was consistent (some of the methods to set the attributes can be pretty slow if the data is wonky — it tries multiple paths to find the actual values of the attribute).

What this then affords us is consistent access times, regardless of the size of the offset:  the 4,000th page is as fast as the second:  between 2.5 and 3.5 seconds (our development database server is extremely underpowered and I access it via the VPN — my guess is that a real, live situation would be much faster).

The first page is a bit slower.  I can’t use the Lucene index for the first page of results because there’s no way for Searchable to know if the WORKS_META table has changed since the last request since these changes wouldn’t be happening through Grails.  Since performance for the first hundred rows out of Sybase isn’t bad, the connector just uses it for the first page, then syncs the Lucene index with the database at the end of the request.  Each additional page then pulls from Lucene.  Since these pages wouldn’t exist until after the Lucene index is created and the Lucene index is recreated every time the Grails app is started, I added a controller method that checks the count of the Sybase table and the count of the Lucene index to confirm that they’re in sync (it’s worth noting that if the Lucene index has already been created once, this will be available right away after Grails starts — the reindexing is still happening, but in a temp location that will be moved to the default location once it’s complete overwriting the old index).

The side benefit to using Searchable is that it will make adding search functionality to Alto connector that much easier.  Building SQL statements from the CQL queries in the OpenBiblio connector was a complete pain the butt.  CQL to Lucene syntax should be considerably easier.  It seems like  it would be possible for these Lucene indexes to potentially alleviate the need for the bundles Zebra index that comes with Alto, eventually, but that’s just me talking, not any sort of strategic goal.

Anyway, thanks to Lucene, Sybase is behaving mostly like a modern RDBMS, which is a refreshing change.

I am not a programmer.

Since I first began writing code, my approach to learning a new language has been to take something that does the sort of thing I am looking for and start fiddling, seeing the results of the fiddling (most likely through some sort of error message) and refiddle until I start seeing the outcome I was looking for.  In mechanical terms, I open the hood, get out my biggest wrench and start whacking at things until the noises stop.  Or, in this case, start.

The arc of languages I primarily worked in at any given time is a pretty good reflection of this approach:  Perl, TCL, PHP, then Ruby with a small foray into Python.  All dynamic, all extremely whackable.  Where whacking doesn’t work, Google (or, previously, Yahoo or, previously, Alta Vista) generally does.  Cut, paste and resume whacking.

The same philosophy applies when it comes to developing new projects.  I know, basically, what I want on the other side, but I have absolutely no idea what it will take to get there.  Generally this means I’ll pick up the nearest tool on hand (usually a red colored wrench) and start whacking until I see what I want.  That the red wrench isn’t the right tool for the job isn’t the point, since I’m only looking for the destination, not the best route there (since I have no idea how to get there in the first place).  The more comfortable I get with a tool, the more likely I am to nest with it, since the detour of finding (and learning how to use) another tool slows me down from reaching the goal.

The perfect example of this was WAG the Dog.  PHP was a ridiculous language to try to use for it, but ping, ping, ping, ping, it worked!

So it stands to reason that I’ve never really taken to Java.  Java is not whacking.  Java is slowly, verbosely and deliberately constructing the specific parts you need to accomplish your goal.  Java is a toolbox full of parts and pieces I do not know the names of, what they do or how they would even do anything, much less the job I’m trying to accomplish.  Java is to my career what a Masters in Mechanical Engineering is to a wrench.  I don’t use Java because I don’t even know the questions to ask to get started in the right direction.

The irony is that when I was hired by Talis, I was ‘assigned’ (that’s a stronger term than really applies) to an entirely Java-based project, Keystone.  To this day, some 15 months later, I have contributed exactly 0.0 lines of code towards Keystone.

I am not a programmer.

However, I am a tinkerer.

In an effort to eat our own dogfood, I had begun to write a Jangle connector for our library management system, Alto.  Alto is built on Sybase and we already had a RESTful SOA interface, the aforementioned Keystone.  It would have been logical for me, were I a real programmer, to take Keystone, add some new models and routes and call it a connector.

But that’s not how I roll.

Initially, I took to using the JangleR Ruby framework to build the Alto connector, since all it would require is to query the appropriate tables and ping, ping, ping, ping things until JRuby and Warbler could give me a .war file.

Sybase, however, does not take well to whacking.  Especially from Ruby.  ActiveRecord-JDBC didn’t work.  Not sure if it was our particular schema or JDBC setup or just ActiveRecord, but no dice.  I couldn’t get Ribs to work at all, which is just as well, probably.  Finally, I had success just using java.sql objects directly in JRuby, but, since I really didn’t know what I was doing, I started worrying about connection pooling and leaving connections open and whatnot.  No need to show off that I have no idea by gobbling up all the resources on some customer’s Alto server.

At one point, on a lark, I decided to try out Grails, Groovy‘s web framework inspired by Rails, to see if I could have more luck interacting with Sybase.  My rationale was, “Keystone uses Hibernate, GORM (Grails’ ORM) uses Hibernate, maybe it will work for me!”.  And, it did.

So here I am, one week into using Groovy.  Just like I used Rails as an introduction to Ruby, Grails serves that purpose with Groovy pretty well.  I can’t say I love the language, but that’s purely my bias; anything that isn’t Ruby, well, isn’t Ruby.  I am certainly doing some thing inefficiently since I am learning the language as I go along.  The fact that there just isn’t much documentation (and the existing documentation isn’t all that great) doesn’t help.

For example, none of my GORM associations work.  I have no idea why.  It could very well be the legacy Sybase schema, or I might be doing something wrong in my Domain Model class.  I don’t have any idea and I don’t have any idea where to look either for an appropriate error or for a fix.  It’s not a huge issue, though, and so far I’ve just worked around it by writing methods that do roughly what I would have needed the associations to do.  Ping, ping, ping.

I also cannot make Services work the way they show in the documentation.  My controllers can’t find them when I do it like the docs, my domain models can’t find them when I do it like the doc…  But it’s no big deal.  I set my methods to be static, call the class directly, and everything works fine.  I’m not doing anything remotely sophisticated with them, so I can keep my hacks for now.

Being able to dynamically load Java classes, iterate over things with a call like foo.each { bar = it.baz } is pretty nice.  I am really amazed at what Groovy offers when it comes to working with Java classes, it’s like being able to audit those M.E. Master’s classes.  I am learning a considerable amount about Java by being able to whack away at objects and classes within the Groovy shell.

I’m not sure that Groovy was really intended for people like me, however.  All of the documentation and even some of the syntax seem to have the expectation that you are a Java developer looking for something dynamic.  It reminds me of a Perl developer going to PHP.  They are syntactically and functionally similar.  In many ways, a Perl developer will find PHP an easier language to use to get a simple, running web application.  And they always have the fallback of Perl, if they need it.  A Python developer that has to use PHP will probably curse a lot.  Groovy seems to have the same sort of relationship to Java.  A Java developer would probably immediately feel comfortable and find it amazingly easy to get up and running.  A Ruby developer (well, this Ruby developer) finds it a little more alien.

Groovy doesn’t have a tremendous amount of native language support for specific tasks, relying instead on the vast amount Java libraries out there to do, basically, anything.  This makes perfect sense and I don’t fault Groovy in the slightest for this choice, but relying on Java for functionality means factories and stream buffers and all the other things Java consists of.  Java developers would feel home.  I find it needs some getting used to.

Also needing to declare your variables.

And I’m sure I’m not really using closures to their fullest potential.

Overall, it’s nice to have this new addition to my toolbox.  Groovy is definitely whackable and development for the Jangle connector has been remarkably fast.  I expect the Aspire (née List) and Prism teams to have something to try out by the end of the month.  And for basically being Java, that ain’t bad.

When and if I rewrite the Alto connector, I’ll probably opt for GroovyRestlet over Grails, but I definitely couldn’t have gotten where I have at this point without the convention and community of Grails.  It’s a really good starting point.

Of course, none of this would have been necessary if it wasn’t for Sybase.  Consider this day one of my campaign to eventually migrate Talis Alto from Sybase to PostgreSQLPing, ping, ping.