Information About Information Retrieval




Automated information retrieval (IR) systems are used to reduce information overload. Many universities and public libraries use IR systems to provide access to books, journals, and other documents. An IR system works with two kinds of things: queries and objects. A query is a formal statement of an information need that the user puts to the system. An object is an entity that stores information in a database. The system matches user queries against the objects stored in the database. A document is therefore a data object. Often the documents themselves are not stored directly in the IR system; instead, the system represents them by document surrogates.
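The matching of queries against document surrogates can be illustrated with a minimal sketch. It assumes each surrogate is just a set of index terms and that a document matches when it contains every query term (simple Boolean AND matching). The names `surrogates` and `match` and the sample data are hypothetical, chosen only for illustration; real IR systems use richer surrogates and ranking.

```python
# Hypothetical surrogates: each document is represented by a set of
# index terms rather than by its full text.
surrogates = {
    "doc1": {"information", "retrieval", "library"},
    "doc2": {"database", "query", "language"},
    "doc3": {"information", "overload", "query"},
}


def match(query, surrogates):
    """Return the ids of documents whose surrogate contains every query term."""
    terms = set(query.lower().split())
    return sorted(
        doc_id
        for doc_id, index_terms in surrogates.items()
        if terms <= index_terms  # all query terms appear in the surrogate
    )


print(match("information query", surrogates))  # ['doc3']
```

Only `doc3` is returned because it is the only surrogate containing both query terms.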

In 1992 the US Department of Defense, along with the National Institute of Standards and Technology (NIST), cosponsored the Text Retrieval Conference (TREC) as part of the TIPSTER text program. The aim was to support the information retrieval community by supplying the infrastructure needed for large-scale evaluation of text retrieval methodologies.

Web search engines such as Google, Yahoo Search, or Live.com are the most visible IR applications.


PERFORMANCE MEASURES


There are several measures of the performance of an information retrieval system. The measures rely on a collection of documents and a query for which the relevancy of each document is known. All common measures described here assume a ground-truth notion of relevancy: every document is known to be either relevant or non-relevant to a particular query. In practice, queries may be ill-posed, and there may be different shades of relevancy.


Precision


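The proportion of documents that are both retrieved and relevant to all the documents retrieved:

$$\text{precision} = \frac{|\{\text{relevant documents}\} \cap \{\text{retrieved documents}\}|}{|\{\text{retrieved documents}\}|}$$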
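As a small illustration, precision can be computed from two sets of document ids. This is a sketch under stated assumptions: the function name `precision`, the sample ids, and the choice to return 0.0 for an empty result list are not from the original text.

```python
def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0  # assumed convention for an empty result list
    return len(retrieved & set(relevant)) / len(retrieved)


retrieved = ["d1", "d2", "d3", "d4"]   # what the system returned
relevant = {"d1", "d3", "d7"}          # ground-truth relevant documents
print(precision(retrieved, relevant))  # 0.5
```

Two of the four retrieved documents are relevant, so the precision is 0.5. The relevant document `d7` was never retrieved, which precision alone does not penalize.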








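Average precision rewards rankings that place relevant documents early. It sums the precision at each rank that holds a relevant document and divides by the number of relevant documents:

$$\text{AveP} = \frac{\sum_{r=1}^{N} \left( P(r) \times \mathrm{rel}(r) \right)}{\text{number of relevant documents}}$$

where $r$ is the rank, $N$ the number of documents retrieved, $\mathrm{rel}(r)$ a binary function that is 1 if the document at rank $r$ is relevant and 0 otherwise, and $P(r)$ the precision at cut-off rank $r$.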

If several queries with known relevancies are available, the mean average precision is the mean of the average precisions computed for each query separately.
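The following sketch computes average precision for one ranked list and the mean average precision over several queries. It assumes the common definition in which the precision at every rank holding a relevant document is summed and then divided by the total number of relevant documents, and it assumes a ranking contains no duplicate ids. The function names and sample data are hypothetical.

```python
def average_precision(ranking, relevant):
    """Average precision of one ranked result list against known relevant ids."""
    relevant = set(relevant)
    if not relevant:
        return 0.0  # assumed convention when no document is relevant
    hits = 0
    total = 0.0
    for r, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:      # rel(r) = 1 at this rank
            hits += 1
            total += hits / r       # P(r): precision at cut-off rank r
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of the average precisions over (ranking, relevant) pairs."""
    if not runs:
        return 0.0
    return sum(average_precision(rk, rel) for rk, rel in runs) / len(runs)


runs = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2 = 5/6
    (["d5", "d6"], {"d6"}),                    # AP = (1/2) / 1 = 1/2
]
print(mean_average_precision(runs))            # mean of 5/6 and 1/2, i.e. 2/3
```

Dividing by the number of relevant documents, rather than by the number of relevant documents actually retrieved, means a ranking that misses relevant documents scores lower.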


MODEL TYPES


For successful IR, it is necessary to represent the documents in some way. There are a number of models for this purpose. They can be categorized along two dimensions: the mathematical basis and the properties of the model. (Translated from the German entry; original source: Dominik Kuropka.)


First dimension: mathematical basis





Second dimension: properties of the model





TIMELINE



OPEN SOURCE SYSTEMS




OTHER RETRIEVAL TOOLS



RESEARCH GROUPS (IN NO PARTICULAR ORDER)



MAJOR FIGURES




OTHER FIGURES ASSOCIATED WITH INFORMATION RETRIEVAL


Awards in this field: Tony Kent Strix Award.


ACM SIGIR GERARD SALTON AWARD

1983 - Gerard Salton, Cornell University: "About the future of automatic information retrieval"
1988 - Karen Spärck Jones, University of Cambridge: "A look back and a look forward"
1991 - Cyril Cleverdon, Cranfield Institute of Technology: "The significance of the Cranfield tests on index languages"
1994 - William S. Cooper, University of California, Berkeley: "The formalism of probability theory in IR: a foundation or an encumbrance?"
1997 - Tefko Saracevic, Rutgers University: "Users lost: reflections on the past, future, and limits of information science"
2000 - Stephen E. Robertson, City University, London: "On theoretical argument in information retrieval"
2003 - W. Bruce Croft, University of Massachusetts, Amherst: "Information retrieval and computer science: an evolving relationship"
2006 - C. J. van Rijsbergen, University of Glasgow, UK: "Quantum haystacks"


SEE ALSO



EXTERNAL LINKS