Sunday, April 5, 2009

Metadata Workshop at NIS Camp in June 2009

Self-created digital collections have become an effective means for libraries to create knowledge and to preserve and share archival and research information. To give users easy online access to these research materials, librarians and information professionals use metadata to organize and describe information and to make the resources searchable online. How do we create effective metadata for digital collections? Jin Xiu Guo will offer a workshop at the NITLE Information Services (NIS) Camp at Smith College on June 4, 2009. The workshop is for everyone who wants to learn about metadata and would like to explore knowledge management in the digital age. Anyone interested in the workshop can visit NIS Camp.

Saturday, March 14, 2009

Thesauri for Information Retrieval, Part 1

Thesauri have been widely used in information retrieval in recent years. They are built into software to help users retrieve information from websites and content management systems. Law firms and consulting companies, for example, have integrated thesauri into their websites. Thesauri can also be used to index database content automatically. Most commercial databases, such as STN and WilsonWeb, include thesauri to help users search more effectively. However, there is no standard that thesaurus developers can follow to compose concepts consistently. The same concept can be represented differently in different thesauri built for different purposes. For example, "knowledge management software" could be split into "knowledge", "management" and "software"; it could also be broken down into "knowledge management" and "software". The first case could occur in a general thesaurus, the second in a thesaurus for information science. People who want to integrate such thesauri into their software will ask: which one is more appropriate for my system?
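The effect of the two decompositions on indexing can be sketched in a few lines of Python. This is only an illustration: the greedy longest-match rule, the `segment` function, and the two tiny vocabularies are assumptions made for the example, not part of any thesaurus standard or product.

```python
def segment(phrase, vocabulary):
    """Split a phrase into thesaurus terms using greedy longest match."""
    words = phrase.lower().split()
    vocab = {term.lower() for term in vocabulary}
    terms = []
    i = 0
    while i < len(words):
        # Try the longest candidate that starts at word i first.
        for j in range(len(words), i, -1):
            candidate = " ".join(words[i:j])
            if candidate in vocab:
                terms.append(candidate)
                i = j
                break
        else:
            # No thesaurus term starts here; keep the word as an uncontrolled term.
            terms.append(words[i])
            i += 1
    return terms


# Hypothetical vocabularies standing in for the two thesauri described above.
general_thesaurus = {"knowledge", "management", "software"}
information_science_thesaurus = {"knowledge management", "software"}

phrase = "knowledge management software"
print(segment(phrase, general_thesaurus))
print(segment(phrase, information_science_thesaurus))
```

The same phrase yields three index terms under the first vocabulary and two under the second, so a system that switches thesauri would index and retrieve the same content differently.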

Unless they are built for a specific purpose, thesauri should be interoperable with software so that their benefits can be realized across different domains. Sometimes it takes longer to customize a thesaurus for a local system than to develop one from scratch. In other words, how can we make it easy for current content management systems to adopt available thesauri? We need a standard for the way we create concepts in thesauri.

Wednesday, January 28, 2009

"Author vs. User Tagging" on Journal of Library Metadata

With the increasing application of social tagging technology in Web 2.0, librarians have added social tagging to online library catalogs as an additional search entry point for users. Now scientists, attorneys and technology consultants are starting to tag web content as subject experts. User tagging is becoming an accepted access tool for researchers. But what are the differences between author-supplied metadata (endo-tagging) and user-supplied metadata (exo-tagging)? An interesting article by Heather Lea Moulaison in the Journal of Library Metadata (2008, vol. 8, issue 2, pp. 101-111) offers a critical review of this issue.

The Journal of Library Metadata focuses on emerging issues in all aspects of metadata applications in today's digital libraries. Haworth's Journal of Library Metadata, now published by the Taylor & Francis Group, is seeking a new editor. Interested professionals with sufficient credentials who would like to take on this task can contact Bill Cohen, the publisher, at bcohen7719@aol.com.

Monday, January 26, 2009

What Does a Metadata Librarian Do?

Metadata librarian has become a new professional title in libraries over the past three years. The role requires carrying traditional cataloging knowledge over into digital information management with emerging technologies. Many new metadata librarians are facing new challenges in the digital age, especially as social libraries become an emerging knowledge network. Here is the interview with Inmagic, "Getting the Metadata Experience with Jin Xiu Guo".

Tuesday, January 6, 2009

International Standard Collection Identifier (ISCI)

The International Standard Collection Identifier (ISCI) is now under review. The purpose of the document is to allow collections and fonds to be identified in a systematic way. An identifier is an important metadata element, yet so far there is no standard way to construct one. Different entities have their own ways of creating identifiers.

With the appearance of metasearch engines, identifiers have come to play an important role in locating information. Electronic resources have increased exponentially in the past decade, but their identifiers are usually generated by local practice, and there is no systematic guidance for formulating a consistent identifier. More and more digital libraries have created digital collections, including digital archives, and use descriptive metadata to describe those resources. An identifier is a key element of descriptive metadata. Today, knowledge sharing is an effective learning process. Knowledge sharing cannot survive on its own; metadata sharing and exchange inevitably accompany it.

When organizations use different local identifiers, a metasearch engine has to do duplicate searching because the identifiers are formed irregularly. This irregularity greatly reduces search engine efficiency. To make searching more effective, we need to build standard identifiers for collections and fonds to facilitate global metadata exchange. The proposed form is:

ISIL:Collection identifier string

ISIL is the identifier for the organization; the collection identifier may contain up to 16 characters.

e.g. FI-Ht:Up (Helsinki University Theology Library, Psychology of religion collection; example from ISO/CD 27730)
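A minimal sketch of how software might split and check an identifier of this form follows. It relies only on the two rules stated above (an ISIL, a colon, and a collection string of at most 16 characters). Splitting on the last colon, on the assumption that the collection string itself contains no colon, is a guess and not something taken from the draft standard; the names `parse_isci` and `format_isci` are hypothetical.

```python
from typing import NamedTuple

MAX_COLLECTION_ID_LENGTH = 16  # limit stated in the post


class CollectionIdentifier(NamedTuple):
    isil: str        # identifier of the organization
    collection: str  # collection identifier string


def parse_isci(value: str) -> CollectionIdentifier:
    """Split 'ISIL:collection' into its two parts and check the length limit."""
    # Assumption: the collection string has no colon, so the last colon separates the parts.
    isil, separator, collection = value.rpartition(":")
    if not separator or not isil or not collection:
        raise ValueError(f"expected 'ISIL:collection', got {value!r}")
    if len(collection) > MAX_COLLECTION_ID_LENGTH:
        raise ValueError("collection identifier is longer than 16 characters")
    return CollectionIdentifier(isil, collection)


def format_isci(isil: str, collection: str) -> str:
    """Build the identifier string, reusing the parser for validation."""
    return ":".join(parse_isci(f"{isil}:{collection}"))


print(parse_isci("FI-Ht:Up"))
```

A real implementation would also need to validate the ISIL part against its own standard, which this sketch does not attempt.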

If ISCI becomes a standard, it will greatly reduce the duplicate detection a search engine has to perform, and users will be able to identify collections and fonds through their ISCIs.
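To illustrate why a shared identifier simplifies duplicate detection, here is a rough sketch of merging result lists from two sources. The record layout (dictionaries with an `isci` key), the source names, and the `merge_by_identifier` function are all hypothetical; a real metasearch engine would be far more involved.

```python
def merge_by_identifier(result_lists):
    """Merge result lists, keeping the first record seen for each identifier."""
    merged = {}
    for results in result_lists:
        for record in results:
            # With a standard identifier, an exact key match is enough to spot a duplicate.
            merged.setdefault(record["isci"], record)
    return list(merged.values())


# Hypothetical results returned by two different sources for the same query.
source_a = [{"isci": "FI-Ht:Up", "label": "Psychology of religion collection"}]
source_b = [
    {"isci": "FI-Ht:Up", "label": "Psychology of religion"},
    {"isci": "FI-Ht:Xy", "label": "Another collection (made-up identifier)"},
]

for record in merge_by_identifier([source_a, source_b]):
    print(record["isci"], "-", record["label"])
```

Without a common identifier, the engine would have to compare labels or other descriptive fields, which differ between the two sources even for the same collection.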

Monday, December 8, 2008

Dublin Core One-to-One Principle

In the Dublin Core metadata schema, the one-to-one principle means that one metadata description describes only one resource. For instance, the description of a digital image of the Mona Lisa cannot be treated as the description of the original painting. In practice, however, it is often difficult to draw such a clear line.

When we create metadata to describe a resource, such as a digital image or an analog object, we need to consider users' requirements. We want to give users the information they are looking for, so metadata creators should be able to identify the key information need. For example, when a metadata creator records the date of an image of the Mona Lisa digitized from the original painting, s/he should think about what users really want to know. In most cases, users are interested in the date of the original painting rather than the date of the image. If the metadata creator gives the digitization date of the image, it will not satisfy users' interest as well.

However, in the above example, if the creation date of the original painting is given in the metadata description instead of the digitization date, the description conflicts with the one-to-one principle. Therefore, we need to use our best judgment to create metadata that is meaningful for users, rather than follow rules rigidly and miss the information users need.
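One possible way to reconcile the two goals, offered here only as a sketch, is to keep a separate description for each resource and link the image to the painting through the Dublin Core source element, so that a display layer can show users both dates. The identifiers, the dates, and the `dates_for_display` helper below are illustrative placeholders, not real catalog data.

```python
# Two descriptions, one per resource, as the one-to-one principle asks.
# All values are illustrative placeholders.
painting = {
    "identifier": "example:painting",
    "title": "Mona Lisa",
    "type": "PhysicalObject",
    "date": "c. 1503",  # approximate creation date, for illustration only
}
digital_image = {
    "identifier": "example:image",
    "title": "Mona Lisa (digital image)",
    "type": "StillImage",
    "date": "2008-01-01",          # hypothetical digitization date
    "source": "example:painting",  # the resource this image is derived from
}

catalog = {record["identifier"]: record for record in (painting, digital_image)}


def dates_for_display(record, catalog):
    """Return the record's own date plus the date of its source, if one is linked."""
    dates = {"this resource": record["date"]}
    source_id = record.get("source")
    if source_id in catalog:
        dates["original"] = catalog[source_id]["date"]
    return dates


print(dates_for_display(digital_image, catalog))
```

Each description still covers exactly one resource, while a user looking at the image record can see the date of the original painting as well.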

Monday, November 24, 2008

RDA Constituency Review

RDA is up for review again. People who are interested in RDA can submit comments by February 2, 2009. RDA (Resource Description and Access) will be the general guideline for information professionals to describe electronic resources and provide users with access to online information; it also facilitates metadata quality control and the sharing of metadata between different communities and metadata schemes.