An Architecture for Efficient Resource Discovery with Metadat ...

Copyright © 2008, Common Ground Research Networks, All Rights Reserved

Abstract

The profusion of irrelevant information returned for a given Web query explains the pressing need for effective strategies for discovering and retrieving pertinent Web resources. One major requirement for effective document retrieval is carefully encoded metadata. At the same time, the metadata standards used to annotate documents in large collections are quite complex. This is because standardized global metadata cannot represent all the subtle forms of document metadata needed for improved retrieval ranking. In this context, we propose an approach to facilitate document retrieval across multidisciplinary domains, in which the documents belonging to each discipline are indexed in a separate repository instance. This allows document metadata to be customized for each discipline by adding discipline-specific metadata themes. Because the approach retains the standard metadata schema alongside the customized metadata schema, it should result in enhanced resource discovery. The metadata retrieval process will be supported by an extended protocol for metadata harvesting (X-PMH) [1], implemented in each repository. The extended metadata harvesting approach is used to tie together the metadata customizations made at the various repository instances. The proposed framework could be integrated into Open Digital Libraries (ODLs) [2] and would serve as a model that adds value in terms of multidisciplinary metadata simplicity, maintenance, and the availability of descriptive metadata when a repository instance fails. We plan to implement this cost-effective architecture using the PKP-OAI (Public Knowledge Project – Open Archives Initiative) [3,4] harvester on DSpace [5], an open source digital repository platform that natively supports metadata harvesting. Once this is fully achieved, a federated search built upon such repository instances using open source technologies [6] would yield promising results for information retrieval.
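
As a rough illustration of harvesting from one repository instance per discipline, the following Python sketch builds standard OAI-PMH ListRecords request URLs for both the standard Dublin Core schema (oai_dc) and a discipline-specific one. It uses plain OAI-PMH request parameters only and does not model the X-PMH extensions, whose details are not given here. The repository URLs, the custom prefix name physics_ext, and the function names are assumptions made for this example.

```python
from urllib.parse import urlencode

# Hypothetical repository instances, one per discipline (placeholder URLs).
REPOSITORIES = {
    "physics": "https://physics.example.org/oai/request",
    "history": "https://history.example.org/oai/request",
}

# Hypothetical discipline-specific metadata prefixes (assumed names).
CUSTOM_PREFIXES = {"physics": "physics_ext"}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a standard OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        # Optional selective harvesting by datestamp (YYYY-MM-DD).
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def harvest_plan(repositories, custom_prefixes):
    """Map each discipline to the URLs a harvester would request.

    Every instance is asked for the standard schema; instances that
    declare a customized schema are asked for that one as well.
    """
    plan = {}
    for discipline, base_url in repositories.items():
        urls = [list_records_url(base_url)]
        prefix = custom_prefixes.get(discipline)
        if prefix:
            urls.append(list_records_url(base_url, metadata_prefix=prefix))
        plan[discipline] = urls
    return plan


if __name__ == "__main__":
    for discipline, urls in harvest_plan(REPOSITORIES, CUSTOM_PREFIXES).items():
        for url in urls:
            print(discipline, url)
```

Requesting the standard schema from every instance keeps the collection searchable through common fields, while the additional per-discipline request is where customized metadata would be picked up.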