Saturday, May 07, 2005

Cataloging sound recordings

There has been a fascinating discussion on the Association for Recorded Sound Collections (ARSC) email list over the last week on cataloging sound recordings (look for threads starting with "database template" and "cataloging," then continuing here). The ARSC community is wonderfully diverse, including audiophiles, librarians, archivists, and others just interested in learning about sound recordings. The thread started out with an announcement of a database template for recording information about sound recordings: someone solving an immediate problem and wanting to share the solution with others. It has since expanded into something of a religious discussion on the relative merits and problems of MARC/AACR2 cataloging.

I can't help but feel that, like a great many discussions of this sort, the participants are talking past each other. One point that has been mentioned, but perhaps not strongly enough, is that the user experience problems with library cataloging are in large part problems of how the search system uses the data and how it presents that data to end-users. Ralph Papakhian, one of the premier music catalogers in the country, whom I like and respect a great deal, has made the point in this thread that the data elements some respondents say they want to record are in fact recordable in MARC. And if anyone would know this and can explain it to others, it's Ralph. But these elements, even though they're there, are often not accessible to users. For example, MARC has fields for the date of composition and the coded instrumentation of a recording or score. But few if any library systems index or display this data. So catalogers rarely enter it, which provides less incentive for systems to use it, which provides less incentive for catalogers to enter it, which provides less incentive for systems to use it...

But I believe systems aren't the only problem. There are lots of little things I think MARC/AACR2 could do better. However, the biggest difference between what MARC does and what some of the other participants in this thread look for in sound recording cataloging, and one that has stayed mostly implicit in this discussion, is the library focus on the carrier over the content. Catalogers discuss this issue frequently, but it hasn't been brought up explicitly in this thread. Audiophiles absolutely are interested in the recording as a whole--its matrix number, sound engineers, etc. But they are equally interested in the musical works on the recording, which personnel are connected with which piece, timings of tracks, etc. MARC has places for these things, but they are relegated to second-class status. Catalogers know and tout the benefits of structure and authority control in information retrieval. But when it comes to the contents of a bibliographic item, we apply none of these principles in the MARC environment. Contents notes are largely unstructured (and what structure is possible is rarely used and keeps changing!), don't make use of name or title authority control, and in many cases aren't indexed in library systems.
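To make the contrast concrete, here is a rough sketch in Python of the same contents recorded two ways: once as a free-text note, and once as structured track entries that point at authority identifiers. Everything in it is made up for illustration (the class names, the "work:" and "name:" identifier scheme, the titles and timings); it is not MARC, and it is not how any particular library system stores data.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# An unstructured contents note: readable by a person, but a system can only
# treat it as one string of keywords. (Placeholder titles and timings.)
contents_note = "First work (2:54) -- Second work (3:10)"


@dataclass(frozen=True)
class Track:
    """One work as it appears on a recording (hypothetical structure)."""
    work_id: str                   # assumed identifier from a title authority file
    title: str
    performer_ids: Tuple[str, ...] # assumed identifiers from a name authority file
    duration_seconds: int


@dataclass
class Recording:
    """The carrier as a whole, plus structured entries for its contents."""
    matrix_number: str
    tracks: List[Track] = field(default_factory=list)


recording = Recording(
    matrix_number="EXAMPLE-001",
    tracks=[
        Track("work:0001", "First work", ("name:0001",), 174),
        Track("work:0002", "Second work", ("name:0001", "name:0002"), 190),
    ],
)


def titles_with_performer(rec: Recording, performer_id: str) -> List[str]:
    """Return the titles of the tracks on which a given performer appears."""
    return [t.title for t in rec.tracks if performer_id in t.performer_ids]


def total_duration(rec: Recording) -> int:
    """Sum the track timings, in seconds."""
    return sum(t.duration_seconds for t in rec.tracks)
```

With the structured form, a question like "which pieces on this disc does performer name:0002 play on?" becomes a simple lookup (`titles_with_performer(recording, "name:0002")`), while the free-text note can only be searched as keywords.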

As pointed out in this thread, creating this content-level information is extremely expensive. But the networked world has the potential to change that. Much of this information has been created in structured form outside of the library environment, by record companies, retailers, and enthusiasts, but we don't make use of it. Right now, it's difficult to make use of it because our systems don't know how to talk to each other. It will take a great many baby steps, but I hope we can start down the road towards changing that.

Matt Snyder of NYPL, whom I met at MLA this year and was extremely impressed with, has made the point in this thread that MARC records (and, by extension, library catalogs) and discographies have different purposes. This is definitely true in today's environment. Library catalogs are primarily for locating things, and discographies have more of a research bent. But I feel strongly, and this email discussion seems to support this view, that the distinction is largely artificial and is becoming less relevant as information retrieval systems continue to evolve. More sharing of data between systems will hopefully result in fewer systems for end-users to consult. That's certainly my goal!

2 comments:

waltc said...

You have it exactly right about the vicious circle of instrumentation codes and similar specialized MARC content for music retrieval. I've been in the position of considering retrieval and display possibilities for coded values, and the message I consistently heard was that the instrumentation codes (etc.) are supplied so rarely by catalogers that we'd be doing more harm than good by using them. Which provides less incentive for future catalogers to apply them. Which....

Anonymous said...

Jenn, you wrote:

"As pointed out in this thread, creating this content-level information is extremely expensive. But the networked world has the potential to change that. Much of this information has been created in structured form outside of the library environment, by record companies, retailers, and enthusiasts, but we don't make use of it. Right now, it's difficult to make use of it because our systems don't know how to talk to each other. It will take a great many baby steps, but I hope we can start down the road towards changing that."

I don't trust many record companies' data entry about musical works further than I can throw it. A lot of that data comes from All Music Guide or Muse. AMG is good, and has good staff doing research (on their classical end, which recently merged with the popular section), but they have two different philosophies behind this data entry. Classical works are indeed analytic, and clickable between all of the elements, but popular works are poorly indexed between the compositions and their instantiations--so it's a big mess. This isn't the kind of data that libraries should have in their catalogs. But it's a good start (at least the classical end). If we could combine that way of thinking with the sort of attention to authority control that catalogers have typically striven to maintain, we'd be in good shape.

Ralph notes in his 2000 article on cataloging that catalogers are expected to add more data with less time and funding.

Thom P.