Ever since the Functional Requirements for Bibliographic Records (FRBR) first came out, there has been plenty of debate and disagreement over how to actually implement them. The boundary between Work and Expression has long been a disputed zone, and that is without even bringing up the murkiness of adaptations, copies, derivatives, translations and their ilk. It’s a complicated business, and since we’re only now dipping our toes in this water, the debate and disagreement are bound to amplify once we are creating this data in earnest.

And I’m here to argue that we’re (mostly) wasting our time.

To be clear, I am not here to bury FRBR; I’m just asking that we pay less attention to the (Group 1) Entities behind the curtain and focus instead on why we are interested in the FRBR entity model at all (hint: it’s the relationships between things).

As we explore how our metadata might work in a Linked Data/RDF model, there is a lot of hand-wringing over the fact that our legacy data (MARC/AACR2) does not carry much fidelity when it comes to modeling it as FRBR. We know we usually have the makings of a Manifestation in there, and there are probably pieces of a Work (since we likely have creators, contributors and subjects) and bits of an Expression (language, medium). Since RDF is built on the open world assumption, we don’t have to know everything about the Work or Expression to identify them and make assertions about them (granted, the less we know, the harder it is to reconcile them later, but that’s a different issue).

So, by way of example, let’s take a MARC record:

000 01209cam a2200313 i 450
001 344918
005 20090410163635.0
008 760528s1976 nyua j 001 0 eng
906 __ |a 7 |b cbc |c orignew |d 1 |e ocip |f 19 |g y-gencatlg
010 __ |a 76019078
020 __ |a 0671328026 (lib. bdg.) : |c $6.64
035 __ |9 (DLC) 76019078
040 __ |a DLC |c DLC |d DLC
042 __ |a lcac
043 __ |a n-us-la
050 00 |a E356.N5 |b L86
082 00 |a 973.5/239/0924
100 1_ |a Lyons, Grant.
245 10 |a Andy Jackson and the battles for New Orleans / |c by Grant Lyons ; illustrated by Paul Frame.
260 __ |a New York : |b J. Messner, |c c1976.
300 __ |a 96 p. : |b ill. ; |c 22 cm.
500 __ |a Includes index.
520 __ |a Traces the events preceding and during the Battle of New Orleans where Andrew Jackson led the American troops to a decisive victory over the British.
650 _0 |a New Orleans, Battle of, New Orleans, La., 1815 |v Juvenile literature.
600 10 |a Jackson, Andrew, |d 1767-1845 |x Juvenile literature.
650 _1 |a New Orleans, Battle of, New Orleans, La., 1815.
600 11 |a Jackson, Andrew, |d 1767-1845.
700 1_ |a Frame, Paul, |d 1913-1994.
991 __ |b c-GenColl |h E356.N5 |i L86 |t Copy 1 |w BOOKS

(Andy Jackson and the Battle for New Orleans)

and let’s draw out the FRBR Entities in RDF:


@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix frbr: <http://vocab.org/frbr/core#> .
@prefix rdagr1: <http://RDVocab.info/Elements/> .
@prefix bibo: <http://purl.org/ontology/bibo/> .

<http://lccn.loc.gov/76019078>
    a frbr:Manifestation;
    bibo:isbn "0671328026";
    bibo:lccn "76019078";
    bibo:pages "96";
    dcterms:extent "22 cm.";
    rdagr1:statementOfResponsibility "by Grant Lyons ; illustrated by Paul Frame";
    frbr:embodimentOf _:expression1 .

_:expression1
    a frbr:Expression;
    dcterms:language <http://id.loc.gov/vocabulary/languages/eng>;
    dcterms:format "text";
    frbr:embodiment <http://lccn.loc.gov/76019078>;
    frbr:realizationOf _:work1 .

_:work1
    a frbr:Work;
    dcterms:title "Andy Jackson and the battles for New Orleans";
    dcterms:creator <http://viaf.org/viaf/48006744>;
    dcterms:subject <http://id.loc.gov/authorities/subjects/sh85091385>, <http://id.loc.gov/authorities/names/n79088888>;
    frbr:realization _:expression1 .

Note that I’m leaving a lot of things out, some for simple brevity, others because I simply have no idea which entity they belong to (copyright date? contributor, in this case an illustrator?). I am also using Ian Davis’ original interpretation of FRBR in RDF because I can’t figure out how to express the simple “_:ex isExpressionOf _:wo” type of relationship in the RDA vocabularies (if you can figure it out, please leave an example in the comments), and even if I could find the IFLA FRBR vocabulary, the URIs are so opaque that nobody would understand what I’m trying to say anyway.

But this little exercise raises quite a few red flags. For one thing, spreading our properties across resources like this automatically makes it more complex to construct a query to find the thing (although no matter how you cut it, you will have to have some JOINs: the author, for example, should be its own resource). It also requires all of these entities to be present, or the chain breaks. Leave out the Expression and we have no way to link from the Manifestation (where our edition information is) to the Work (where our title, authors and subjects are). Another glaring omission is that we still don’t really know what the thing we’re describing actually is (although, again, that’s most likely my ignorance of FRBR and RDA).
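To make the broken-chain problem concrete, here is a minimal Python sketch. It is not a real triple store: the tuple representation and the `objects` and `title_for_manifestation` helpers are hypothetical names of my own, and only a handful of the triples from the RDF above are included. Each nested loop stands in for one JOIN.

```python
# Triples as (subject, predicate, object) tuples; a stand-in for a real RDF store.
MANIFESTATION = "http://lccn.loc.gov/76019078"

TRIPLES = {
    (MANIFESTATION, "rdf:type", "frbr:Manifestation"),
    (MANIFESTATION, "frbr:embodimentOf", "_:expression1"),
    ("_:expression1", "rdf:type", "frbr:Expression"),
    ("_:expression1", "frbr:realizationOf", "_:work1"),
    ("_:work1", "rdf:type", "frbr:Work"),
    ("_:work1", "dcterms:title", "Andy Jackson and the battles for New Orleans"),
}


def objects(triples, subject, predicate):
    """Return every object asserted for (subject, predicate)."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def title_for_manifestation(triples, manifestation):
    """Hypothetical helper: follow Manifestation -> Expression -> Work -> title."""
    titles = []
    for expression in objects(triples, manifestation, "frbr:embodimentOf"):
        for work in objects(triples, expression, "frbr:realizationOf"):
            titles.extend(objects(triples, work, "dcterms:title"))
    return titles


# With the full chain in place, the title is reachable from the Manifestation.
full_chain = title_for_manifestation(TRIPLES, MANIFESTATION)

# Drop every triple that mentions the Expression: the Work and its title
# still exist in the data, but nothing links the Manifestation to them.
without_expression = {t for t in TRIPLES if "_:expression1" not in (t[0], t[2])}
broken_chain = title_for_manifestation(without_expression, MANIFESTATION)

print(full_chain)
print(broken_chain)
```

The second lookup comes back empty even though the title triple is still sitting in the data, which is exactly the fragility described above.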

Now, I have been told that implementing a model like this isn’t being done (or not done) for the convenience of developers, and that’s fair enough, but there is a larger and more fundamental problem here.

If I’ve been working in or around libraries for over 15 years and, more specifically, with libraries and RDF for 4 years, and I am completely stumped and overwhelmed by trying to model this, how on earth do we really expect anybody else who hasn’t been trained as a cataloger to figure it out? What compounds the problem is that there (currently) aren’t any really good ways to link from the FRBR vocabularies to other simpler and more common vocabularies (like Bibliontology).

And this gets to the heart of my question. Do we really care about the Manifestation or the Expression, or do we just want to know about the thing (a book, say, or an article) and the relationships between things (adaptations, copies, derivations, translations and their ilk)? To put it another way, do we really need to model out all of FRBR to benefit from the data model?

About a year ago, I made a set of properties on open.vocab.org to help work around some of the problems I was having implementing FRBR in some of the work I was doing (modeling the Open Library as RDF, LinkedLCCN, etc.). The background is that I wanted to model, say, a book as a bibo:Book, which could have dcterms:creator, bibo:isbn, dcterms:language, etc. as properties attached to it. But this meant I couldn’t say the book was a frbr:Manifestation, because it also had properties of the Expression and Work on it. At the same time, I wanted to be able to relate different editions of the same work together.
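As a rough illustration of that flat approach, here is a small Python sketch that hangs Work-, Expression- and Manifestation-level properties from the MARC record directly on a single bibo:Book. The tuple representation and the `describe` helper are hypothetical and stand in for a real RDF library.

```python
# Triples as (subject, predicate, object) tuples; a stand-in for a real RDF store.
BOOK = "http://lccn.loc.gov/76019078"

TRIPLES = {
    (BOOK, "rdf:type", "bibo:Book"),
    (BOOK, "bibo:isbn", "0671328026"),
    (BOOK, "dcterms:title", "Andy Jackson and the battles for New Orleans"),
    (BOOK, "dcterms:creator", "http://viaf.org/viaf/48006744"),
    (BOOK, "dcterms:language", "http://id.loc.gov/vocabulary/languages/eng"),
}


def describe(triples, subject):
    """Hypothetical helper: collect every property of one resource, no hops needed."""
    description = {}
    for s, p, o in triples:
        if s == subject:
            description.setdefault(p, []).append(o)
    return description


book = describe(TRIPLES, BOOK)
print(book["dcterms:title"])
print(book["bibo:isbn"])
```

Everything about the book comes back from one resource, but nothing in this data says which properties belong to which WEMI level, which is why the book cannot also be typed as a frbr:Manifestation.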

So I came up with what is now referred to as the commonThing properties:

The idea being that we know that my bibo:Book contains WEMI information, so why can’t we just imply the existence of the FRBR entities and then create our relationships between resources based on that?

For example, if you had _:book1 <ov:commonWork> _:book2, you are saying that these two books both share a common FRBR Work (they may also share other common entities, but the point is that you might not know that). _:book1 and _:book2 can be modeled as anything, that doesn’t matter. And, if the WEMI hierarchy ever does get modeled, you can use commonThing to link from _:book1 to its corresponding Manifestation with ov:commonManifestation.
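Here is a minimal Python sketch of how a consumer might use such a link. The tuple representation and the `common_work_partners` helper are hypothetical, and reading ov:commonWork in both directions is my assumption for this example (sharing a Work is naturally mutual), not something this sketch checks against the vocabulary itself. The two resources are deliberately typed differently to show that their modeling doesn’t matter.

```python
# Triples as (subject, predicate, object) tuples; a stand-in for a real RDF store.
TRIPLES = {
    ("_:book1", "rdf:type", "bibo:Book"),
    ("_:book2", "rdf:type", "dcterms:BibliographicResource"),
    ("_:book1", "ov:commonWork", "_:book2"),
}


def common_work_partners(triples, resource):
    """Hypothetical helper: resources asserted to share a FRBR Work with `resource`.

    Assumption: the link is read in both directions, so it does not matter
    which resource was the subject of the original statement.
    """
    partners = set()
    for s, p, o in triples:
        if p != "ov:commonWork":
            continue
        if s == resource:
            partners.add(o)
        elif o == resource:
            partners.add(s)
    return partners


partners_of_book1 = common_work_partners(TRIPLES, "_:book1")
partners_of_book2 = common_work_partners(TRIPLES, "_:book2")

print(partners_of_book1)
print(partners_of_book2)
```

No frbr:Work resource exists anywhere in this data; its existence is only implied by the relationship, which is the whole point.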

I am not arguing that this is a perfect workaround for the conundrums that FRBR gives us (and Jakob Voss has come up with an alternative that he calls SOBR), but it begins to guide the conversation away from the supremacy of FRBR Group 1 entities as our primary focus (for the record, I’m gung-ho for Group 2 and Group 3 entities) and more towards what we actually intend these entities to do and mean.