TECHNICAL SERVICES LAW LIBRARIAN
Volume 24, No. 4 (June 1999)

THE INTERNET
Defining Metadata
Kevin Butterfield
Southern Illinois University
kbutterf@siu.edu

Several years ago I participated in a research group for an NSF-funded digital library project. We were tasked with creating an "ontology containing formal definitions of digital library content, services, and licenses along with a registry including metadata to describe collections and agents based on the ontology." The group, made up of engineering, computer science, and information science faculty and librarians, spent several weeks discussing structures and content from a wide array of perspectives. Our conclusion? Before we could define metadata for our digital library, we needed to define it for ourselves.

Unfortunately, defining metadata has not come easily. Definitions vary depending upon perspective (cataloging vs. computer science) or discipline (humanities text vs. hard sciences). One of the charges of the CC:DA Task Force on Metadata is to devise a definition of "metadata" and to investigate the interoperability of newly emerging metadata schemes with the cataloging rules (AACR2R) and the USMARC format. The task force has identified close to twenty separate definitions to date. These range from the simple "data about data" to the one given by Arlene Taylor, who devotes five chapters of her new book, The Organization of Information, to this topic. Her definition is as follows:

Metadata. An encoded description of an information package (e.g., an AACR2 record encoded with MARC, a Dublin Core record, a GILS record, etc.); the purpose of metadata is to provide an intermediate level at which choices can be made as to which information packages one wishes to view or search, without having to search massive amounts of irrelevant full text. (p. 246)

Taylor extends her definition to include, as it should, not only descriptive information such as that found in traditional retrieval tools, but also information necessary for the management and preservation of the information package being described (p. 77). This can include such things as the Text Encoding Initiative (TEI) Header's Revision Description (<REVDESC>) or Dublin Core tags such as FORM or RIGHTS, which can be used to give basic details about the technical or legal context of a document. Information vital to digital preservation may also be added, so that future systems know exactly how to interpret the document itself or how to migrate the data to a non-obsolete format. The data could also be encapsulated together with all the application and system software required to access it and a description of the original hardware environment.

The Dublin Core is one such metadata scheme. It is being developed as a generic metadata standard for use by libraries, archives, government, and other publishers of information. The standard was intended to be descriptive, rather than evaluative, and deliberately limited to a small set of elements that would be applicable across a range of types of information resources. Those who are trying to implement the Dublin Core standard have raised a number of issues concerning both the semantics of the metadata (rules for the content of the fields) and the syntax (rules for structuring and expressing the fields themselves). For a progress report on the Dublin Core, read Stuart Weibel's article "The State of the Dublin Core Metadata Initiative: April 1999" [http://www.dlib.org/dlib/april99/] in the April issue of D-Lib Magazine. You can also attend program B6: Crosswalks to Information Management: Metadata, at the 1999 annual meeting. Erik Jul will discuss metadata in general and the Dublin Core. Eliot Christian of the USGS will join him and speak on the Government Information Locator Service (GILS).
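The difference between syntax and semantics can be made concrete with a small sketch. The Python below is an illustration only and is not taken from any Dublin Core document: it assumes the fifteen-element core set, treats every element as optional and repeatable, and uses a hypothetical function name (check_record) and a sample record built from this column's own byline and URL.

```python
# Illustrative sketch only: the names and record layout are assumptions,
# not part of any Dublin Core specification or library.

# The fifteen core elements (lowercase labels are an assumption of this sketch).
DC_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def check_record(record):
    """Return a list of structural problems found in a record.

    A record is a dict mapping an element name to a list of string values,
    so every element is optional and may repeat.
    """
    problems = []
    for element, values in record.items():
        if element not in DC_ELEMENTS:
            problems.append(f"unknown element: {element}")
            continue
        is_valid_list = isinstance(values, list) and all(
            isinstance(value, str) and value.strip() for value in values
        )
        if not is_valid_list:
            problems.append(f"{element}: expected a list of non-empty strings")
    return problems


# Hypothetical record describing this column.
sample = {
    "title": ["Defining Metadata"],
    "creator": ["Butterfield, Kevin"],
    "source": ["Technical Services Law Librarian, v. 24, no. 4"],
    "date": ["1999-06"],
    "identifier": ["http://www.aallnet.org/sis/tssis/tsll/24-04/inet.htm"],
}

print(check_record(sample))             # structurally valid record
print(check_record({"author": ["X"]}))  # "author" is not a core element
```

A check like this covers only syntax. Whether the creator should be entered as "Butterfield, Kevin" or "Kevin Butterfield", or which date format to use, is a question of semantics, and it is there that cataloging rules and practice have the most to contribute.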

The issues raised by Dublin Core implementers offer an opportunity for technical services librarians to become involved in the creation of these schemes. A number of communities have begun expanding upon the core set by adding elements or attributes specific to their disciplines and local practice. Why not law? OBS/TS members need to consider how to get involved in formulating a common set of tags and a common format for those tags for people and institutions that are providing access to legal information over the Internet.

Many of the concepts behind metadata should sound familiar to technical services librarians. While the implementations differ, the principles behind the creation of these systems greatly resemble those of cataloging, acquisitions, and preservation. As the methods available for describing information grow beyond MARC, it becomes ever more apparent that we have a role to play as mediators and creators in an increasingly diverse landscape of descriptive methods.

Calls for Comments/Participation

Web Based Ontologies
Netscape is currently building a world ontology to classify web sources. The taxonomy is similar to Yahoo's, with these differences:

1. The taxonomy and its instances are public.
2. They are specified in RDF.
3. Netscape is asking for volunteers as editors for entries in the taxonomy and for building the taxonomy.

More information on the Open Directory Project can be found at: http://directory.netscape.com.

Technical information (including specs in RDF) can be found at: http://dmoz.org/rdf/.

Dublin Core
A first draft of Encoding Dublin Core Metadata in HTML is available for comments. It was written in response to the need to document current practice while discussion moves forward on data models and XML/RDF encoding. It has been the subject of several rounds of review in the Dublin Core Technical Advisory Committee. This document explains how Dublin Core elements are expressed using the META and LINK tags of HTML. You may find it at: http://www.ietf.org/internet-drafts/draft-kunze-dchtml-01.txt.

Comments are welcome.
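
For readers who have not seen this kind of encoding, the following Python sketch shows roughly what it looks like in practice. It is an illustration under stated assumptions, not a rendering of the draft itself: the "DC." name prefix, the "schema.DC" link relation, and the namespace URL reflect common practice as best understood here and should be checked against the draft, and the function name dc_to_html is hypothetical.

```python
from html import escape

# Assumed namespace URL for the element set; consult the draft for the
# value it actually prescribes.
DC_SCHEMA_URL = "http://purl.org/dc/elements/1.1/"


def dc_to_html(record, prefix="DC"):
    """Render a record (element name -> list of values) as HTML head lines.

    Emits one LINK tag identifying the element set, then one META tag
    per value, so repeated elements simply produce repeated META tags.
    """
    lines = [
        f'<link rel="schema.{prefix}" href="{escape(DC_SCHEMA_URL, quote=True)}">'
    ]
    for element, values in record.items():
        for value in values:
            lines.append(
                f'<meta name="{prefix}.{escape(element, quote=True)}" '
                f'content="{escape(value, quote=True)}">'
            )
    return "\n".join(lines)


# Hypothetical record describing this column.
record = {
    "title": ["Defining Metadata"],
    "creator": ["Butterfield, Kevin"],
    "date": ["1999-06"],
}

print(dc_to_html(record))
```

The output is meant to be pasted into the HEAD of an HTML page. Values are escaped so that quotation marks or ampersands in a title cannot break the surrounding markup.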

Universal Preservation Format
An important new standard in the preservation of digital media is nearing the completion of its first iteration. Those for whom this could be an important component of their work are urged to download and comment on the papers referred to, notably the "User and Technical Requirements." There is also a separate bibliography.

You can find these papers at: http://info.wgbh.org/upf/index.html.

Resource Description Framework
The World Wide Web Consortium (W3C) has released the Resource Description Framework (RDF) Model and Syntax specification as a W3C recommendation, representing cross-industry and expert community agreement on a wide range of features for using and providing metadata on the Web. The full press release and links to resources are available at: http://www.w3.org/Press/1999/RDF-REC.html.


Comments to: WebMaster, tssis@law.wuacc.edu
Updated: June 24, 1999.
URL: http://www.aallnet.org/sis/tssis/tsll/24-04/inet.htm