Initial Observations on Query Based Sampling in Distributed CLIR
Author(s)
Shou, XiaoMang; Sanderson, Mark
In:
FGIR 2006: Workshop Information Retrieval 2006 of the Special Interest Group Information Retrieval (FGIR), Hildesheim, 9-11 October 2006. LWA 2006: Lernen - Wissensentdeckung - Adaptivität (workshop, 9-11 October 2006 in Hildesheim) / Martin Schaaf, Klaus-Dieter Althoff [eds.]
Cross Language Information Retrieval (CLIR) enables people to search for information written in languages other than the language of their query. Information can be retrieved either from a single cross-lingual collection or from a variety of distributed cross-lingual sources. This paper presents initial results exploring the effectiveness of distributed CLIR using query-based sampling techniques, which, to the best of our knowledge, has not been investigated before. In distributed retrieval with multiple databases, query-based sampling provides a simple and effective way of acquiring accurate resource descriptions, which help to select which databases to search. Observations from our initial experiments show that the negative impact of query-based sampling on cross-language search may not be as great as it is on monolingual retrieval.
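The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of the general idea behind query-based sampling: probe a database with single-term queries, collect the returned documents, and build a resource description (here, term to document frequency) from that sample. The toy `search` function, the parameter names and defaults, and the random choice of the next probe term are all assumptions made for illustration; they are not taken from the paper, and the cross-lingual step (translating queries or descriptions) is left out.

```python
import random
from collections import Counter


def search(database, term, top_n):
    """Hypothetical toy engine: up to top_n documents containing the term."""
    return [doc for doc in database if term in doc.split()][:top_n]


def query_based_sample(database, seed_term, max_docs=4, top_n=2,
                       max_queries=50, rng=None):
    """Probe a database and return (sampled docs, term -> document frequency)."""
    rng = rng or random.Random(0)   # fixed seed keeps the sketch repeatable
    sampled = []                    # documents collected so far
    description = Counter()         # document frequencies within the sample
    term = seed_term
    for _ in range(max_queries):
        if len(sampled) >= max_docs:
            break
        for doc in search(database, term, top_n):
            if doc not in sampled and len(sampled) < max_docs:
                sampled.append(doc)
                description.update(set(doc.split()))
        if not description:
            break  # the seed matched nothing, so there is no vocabulary to probe with
        # Assumption: draw the next probe term at random from the learned vocabulary.
        term = rng.choice(sorted(description))
    return sampled, description
```

In a distributed setting, a description like this would be built for each database and then compared against the (translated) query to decide which databases to search; how that selection is done is outside the scope of this sketch.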