Metadata Librarian Experience: October 2008

Sunday, October 26, 2008

INMAGIC's New Knowledge Management Tool

INMAGIC recently announced a new-generation knowledge management tool. It uses social knowledge networks to inspire innovative insight, share knowledge, preserve organizational intelligence, and convert implicit knowledge into explicit knowledge.

It seems promising that companies have found an effective and efficient way to retain their internal intelligence through communication within a company social network. The intention of this knowledgenet is to link the existing knowledge repository with different organizational groups to generate a sound solution for a particular business or technical problem. These groups could be R & D, marketing, sales, decision makers, production, strategic planning, and the legal department. INMAGIC hopes that the social knowledge network can play a comprehensive role in information organization, publishing, sharing, creation and collaboration.

This is a new information model built upon Amazon.com. Facing such complicated and diverse types of content, I wonder what distinguishes its search engine from other knowledge tools. How will this tool make it easier for users to find the required information or link relevant information to the target problems? How does the tool encourage internal users to contribute more to the knowledgenet? That will be interesting to see.

Monday, October 20, 2008

Ranking Terms in a Thesaurus Database

As I discussed in the previous posting, a thesaurus is a very useful tool for users to search information in a database efficiently. Users look up the thesaurus to find the right concept for their search. Do users really need a ranking scale to indicate the relevance of the term they are looking for?

If the ranking makes sense to users, it is worth doing. For instance, when users search for chemicals in databases such as STN and Dialog, they prefer to look up the term in the thesaurus first. By looking at the terms indexed by the database producer, users learn how to build their search strategy. The ranking scale used by these databases is the number of records linked to the term. A term could be ranked quite differently in one database than in another.
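To make the record-count ranking concrete, here is a minimal sketch. The two "databases" and all the counts are made-up values for illustration, not figures from STN or Dialog; the point is only that the same terms can come out in a different order depending on how many records each database links to them.

```python
# Hypothetical posting counts: term -> number of records indexed under it.
# Both dictionaries are invented for illustration only.
postings_db_a = {"benzene": 1200, "toluene": 450, "xylene": 300}
postings_db_b = {"benzene": 80, "toluene": 950, "xylene": 610}


def rank_terms(postings):
    """Return terms sorted by number of linked records, largest first.

    Ties are broken alphabetically so the order is stable.
    """
    return sorted(postings, key=lambda term: (-postings[term], term))


rank_a = rank_terms(postings_db_a)
rank_b = rank_terms(postings_db_b)

print(rank_a)
print(rank_b)
```

In this sketch "benzene" ranks first in one database and last in the other, which is why a ranking taken from one database cannot be carried over to another.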

If the ranking is based on a word partially matching the indexing term, it would confuse users. For example, if different indexing terms receive the same ranking because of partial word matching, users would conclude that these indexing terms are all equally relevant. That's not true. Some indexing terms are linked to more relevant records than others, so how would the thesaurus help users decide which term to use for the search?
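The problem can be shown with a small sketch. The terms, the counts, and the scoring function below are all hypothetical; this is not how any particular thesaurus database computes its ranking. It assumes a simple score of 1 when the query word appears in the indexing term, and then uses the number of linked records as a second sort key to separate terms that the word match alone cannot tell apart.

```python
# Hypothetical thesaurus: indexing term -> number of linked records.
thesaurus = {
    "acid rain": 520,
    "amino acid": 3100,
    "acid-base equilibrium": 45,
}


def partial_match_score(query, term):
    """Return 1 if the query word is one of the words in the term, else 0."""
    words = term.replace("-", " ").split()
    return 1 if query in words else 0


def rank(query):
    """Rank matching terms by match score, then by linked-record count."""
    matches = [
        (term, partial_match_score(query, term), count)
        for term, count in thesaurus.items()
    ]
    matches = [m for m in matches if m[1] > 0]
    return sorted(matches, key=lambda m: (-m[1], -m[2], m[0]))


results = rank("acid")
for term, score, count in results:
    print(term, score, count)
```

All three terms get the same match score, so the score by itself says nothing about which term to choose. Whether the record count is the right second signal is a judgment call, but it at least gives users something to distinguish the terms.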

I would love to see a new systematic ranking system adopted in a thesaurus database.

Monday, October 13, 2008

WilsonWeb Thesaurus Database

WilsonWeb is a hybrid database covering the humanities, social science, education, business, applied science and technology. Its thesaurus database is a very useful tool for searchers. For instance, if you are not sure which term is indexed in the database, you can search the thesaurus database and get a list of all related subjects and related terms, each with the number of linked records. This preliminary search gives you some hints for searching WilsonWeb with indexing terms.
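The kind of preliminary lookup described above could be sketched as follows. This is an illustration only: the `Entry` structure, the terms, and the record counts are assumptions of mine and do not reflect how WilsonWeb stores or exposes its thesaurus.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One hypothetical thesaurus entry."""
    term: str
    records: int                      # number of records linked to the term
    related: list = field(default_factory=list)


# Invented sample data for illustration.
entries = {
    "metadata": Entry("metadata", 340, ["cataloging", "indexing"]),
    "cataloging": Entry("cataloging", 1250, ["metadata"]),
    "indexing": Entry("indexing", 890, ["metadata"]),
}


def look_up(term):
    """Return the term and its related terms, each with its record count."""
    entry = entries.get(term.lower())
    if entry is None:
        return []
    hits = [entry] + [entries[r] for r in entry.related if r in entries]
    return [(e.term, e.records) for e in hits]


preview = look_up("Metadata")
for term, records in preview:
    print(term, records)
```

A searcher who sees the counts next to each related term can pick the indexing term before running the real search.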

However, because WilsonWeb covers subjects in different disciplines, the same concept could have different meanings in different domains. It would take a huge amount of work to create a hierarchical structure in its thesaurus database (some would call it a taxonomy) that would really narrow down the search terms. Deciding what kind of thesaurus to offer users is always a challenge for database producers, who would likely see cost and effectiveness as the keys to solving this problem.

Today, taxonomy has become a main information technology for making search engines more intelligent. Law firms, R & D departments, and consulting companies have begun building their own taxonomies to enhance the searchability of web search engines, which saves searchers a great deal of time by returning more relevant search results. The problem most people have today is that too much information exists: how are they able to find what they need? Taxonomy could help companies organize information and make it easily searchable for users. The SLA website and Askus.com are good examples of web content that benefits from taxonomy.

Monday, October 6, 2008

Metadata Quality Control

Metadata quality control is becoming more important than ever when we deal with batch imports. Typical problems include variant forms of an author's name, unconventional abbreviations, inappropriate data formats, duplicate records, spelling errors, incomplete punctuation, unrecognized characters, etc. To control the quality of metadata, we need a systematic way to identify and correct those errors instantly; on the other hand, we also need to strengthen metadata creation at the beginning.
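As a rough illustration of what a systematic check might look like, the sketch below flags likely duplicate records in a batch by comparing normalized titles and author names. The record layout, the sample data, and the matching rule (same normalized title, same surname and first initial) are my own assumptions; a real repository would need more careful rules, and this is not the algorithm of any particular product.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def author_key(name):
    """Reduce 'Surname, Given names' to the surname plus first initial."""
    surname, _, given = name.partition(",")
    given = normalize(given)
    initial = given[0] if given else ""
    return f"{normalize(surname)} {initial}".strip()


def find_duplicates(records):
    """Group record ids that share a normalized title and author key."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalize(rec["title"]), author_key(rec["author"]))
        groups[key].append(rec["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Invented sample batch for illustration.
records = [
    {"id": 1, "title": "Metadata Quality Control", "author": "Smith, John A."},
    {"id": 2, "title": "Metadata quality control.", "author": "Smith, J."},
    {"id": 3, "title": "Thesaurus Design", "author": "Müller, Anna"},
]

duplicates = find_duplicates(records)
print(duplicates)
```

Records 1 and 2 differ only in capitalization, trailing punctuation, and the form of the author's name, so they fall into the same group. A rule this loose will also produce false matches (two different authors named "Smith, J." with the same title), so flagged groups should be reviewed by a person rather than merged automatically.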

Recently, DSpace released an add-on for metadata quality control, which has a powerful mass-edit feature along with duplicate detection and resolution algorithms. This is a very promising feature for metadata quality control; at least some systematic methods could be adopted to solve the current dilemma at the time of submission.

Hopefully, functions that are available in traditional Integrated Library Systems, such as authority control and controlled vocabulary, will become available in Institutional Repository software.