Supercharged Ruby MARC

One of the byproducts of the “Communicat” work I had done at Georgia Tech was a variant of Ed Summersruby-marc that went into more explicit detail regarding the contents inside the MARC record (as opposed to ruby-marc which focuses on its structure).  It had been living for the last couple of years as a branch within ruby-marc, but this was never a particularly ideal approach.  These enhancements were sort of out of scope for ruby-marc as a general MARC parser/writer, so it’s not as if this branch was ever going to see the light of day as trunk.  As a result, it was a massive pain in the butt for me to use locally:  I couldn’t easily add it as a gem (since it would have replaced the real ruby-marc, which I use far too much to live without) which meant that I would have to explicitly include it in whatever projects I wanted to use it in and update any paths included accordingly.

So as I found myself, yet again, copying the TypedRecords directory into another local project (this one to map MARC records to RDF), I decided it was time to make this its own project.

One of the amazingly wonderful aspects of Ruby is the notion of “opening up an object or class”.  For those not familiar with Ruby, the language allows you to take basically any object or class, redefine it and add your own attributes, methods, etc.  So if you feel that there is some particular functionality missing from a given Ruby object, you can just redefine it, adding or overriding the existing methods, without having to reimplement the entire thing.  So, for example:

class String
  def shout
    "#{self.upcase}!!!!"
  end
end

str = "Hello World"
str.shout
=> "HELLO WORLD!!!!"

And just like that, your String objects gained the ability to get a little louder and a little more obnoxious.

So rather than design the typed records concept as a replacement for ruby-marc, it made more sense to treat it more as an extension to ruby-marc.  By monkey patching, the regular marc parser/writer can remain the same, but if you want to look a little more closely at the contents, it will override the behavior of the original classes and objects and add a whole bunch of new functionality.  For MARC records, it’s analogous to how Facets adds all kinds of convenience methods to String, Fixnum, Array, etc.

So, now it has its own github project:  enhanced-marc.

If you want to install it:

  gem sources -a http://gems.github.com
  sudo gem install rsinger-enhanced_marc

There’s some really simple usage instructions on the project page and I’ll try to get the rdocs together as soon as I can.  In a nutshell it works almost just like ruby-marc does:

require 'enhanced_marc'

records = []
reader = MARC::Reader.open('marc.dat')
reader.each do | record
  records << record
end

As it parses each record, it examines the leader to determine what kind of record it is:

  • MARC::BookRecord
  • MARC::SerialRecord
  • MARC::MapRecord
  • MARC::ScoreRecord
  • MARC::SoundRecord
  • MARC::VisualRecord
  • MARC::MixedRecord

and adds a bunch of format specific methods appropriate for, say, a map.

It’s possible to then simply extract either the MARC codes or the (English) human readable string that the MARC code represents:

record.class
=> MARC::SerialRecord
record.frequency
=> "d"
record.frequency(true)
=> "Daily"
record.serial_type(true)
=> "Newspaper"
record.is_conference?
=> false

or, say:

record.class
=> MARC::VisualRecord
record.is_govdoc?
=> true
record.audience_level
=> "j"
record.material_type(true)
=> "Videorecording"
record.technique(true)
=> "Animation"

And so on.

There is still quite a bit I still need to add.  It pretty much ignores mixed records at the moment.  It’s something I’ll need to eventually get to, but these are uncommon enough that it’s currently a lower priority.  I also need to provide some methods that evaluate the 007 field.  I haven’t gotten to this yet, just because it’s just a ton of tedium.  It would be useful, though, so I want to get it in there.

If there is interest, it could perhaps be extended to include authority records or holdings records.  It would also be handy to have convenience methods on the data fields:

record.isbn
=> "0977616630"
record.control_number
=> "793456"

Anyway, hopefully somebody might find this to be useful.

2 comments
  1. Nice, this indeed seems quite useful to me.

  2. I have a longstanding project to import some records that are in MARC to Kete and I think this holds promise.

    I’ll be sure to fork it if I have any contributions.

    Good work.

Leave a Reply

Your email address will not be published. Required fields are marked *