Djoerd Hiemstra

Modeling Expert Finding as an Absorbing Random Walk (2009)

Pavel Serdyukov, Henning Rode, Djoerd Hiemstra

We introduce a novel approach to expert finding based on multi-step relevance propagation from documents to related candidates. Relevance propagation is modeled with an absorbing random walk. The...

A Probabilistic Ranking Framework using Unobservable Binary Events for Video Search (2009)

Robin Aly, Djoerd Hiemstra, Arjen De Vries, Franciska De Jong

Recent content-based video retrieval systems combine output of concept detectors (also known as high-level features) with text obtained through automatic speech recognition. This paper concerns the...

ewi.utwente.nl (2009)

Robin Aly, Djoerd Hiemstra, Roeland Ordelman

Bridging the semantic gap is one of the big challenges in multimedia information retrieval. It exists between the extraction of low-level features of a video and its conceptual...

General Terms (2009)

Henning Rode, Pavel Serdyukov, Djoerd Hiemstra

We study entity ranking on the INEX entity track and propose a simple graph-based ranking approach that makes it possible to combine scores on document and paragraph level. The combined approach improves the...

SIGIR’s 30th anniversary: an analysis of trends in IR research and the topology of its community (2009)

Djoerd Hiemstra, Claudia Hauff, Franciska De Jong, Wessel Kraaij

This paper presents an analysis of all SIGIR proceedings to date in order to summarize what IR researchers discussed over the years, where they are from, and whether subcommunities can be identified,...

Evaluating structured information retrieval and multimedia retrieval using PF/Tijah (2009)

Thijs Westerveld, Henning Rode, Roel Van Os, Djoerd Hiemstra, Vojkan Mihajlović

We used a flexible XML retrieval system for evaluating structured document retrieval and multimedia retrieval tasks in the context of the INEX 2006 benchmarks. We investigated the...

CWI (2009)

Johan List, Vojkan Mihajlovic, Arjen P. Vries, Georgina Ramírez, Djoerd Hiemstra

This paper discusses our participation in INEX (the Initiative for the Evaluation of XML Retrieval) using the TIJAH XML-IR system. TIJAH’s system design follows a ‘standard’ layered database...

Using Parsimonious Language Models on Web Data (2009)

Rianne Kaptein, Rongmei Li, Djoerd Hiemstra, Jaap Kamps

In this paper we explore the use of parsimonious language models for web retrieval. These models are smaller and thus more efficient than the standard language models, and are therefore well suited for...

Information theory (2009)

Rongmei Li, Rianne Kaptein, Djoerd Hiemstra, Jaap Kamps

The main obstacle for providing focused search is the relative opaqueness of search requests: searchers tend to express their complex information needs in only a couple of keywords. Our overall aim...

Distributed Information Retrieval using Keyword Auctions (2009)

Djoerd Hiemstra

This report motivates the need for large-scale distributed approaches to information retrieval, and proposes

Evaluating Relevance Feedback: An Image Retrieval Interface for Children (2009)

Er Bockting, Matthijs Ooms, Djoerd Hiemstra, Theo Huibers

Studies on information retrieval for children are not yet common. As young children possess a limited vocabulary and limited intellectual power, they may experience more difficulty in fulfilling...

University of Twente, (2009)

Djoerd Hiemstra, Stefan Klinger, Henning Rode, Jan Flokstra, Peter Apers

We argue that ranking algorithms for XML should reflect the actual combined content and structure constraints of queries, while at the same time producing equal rankings for queries that are...

Vague Element Selection and Query Rewriting for XML Retrieval (2009)

Vojkan Mihajlović, Djoerd Hiemstra, Henk Ernst Blok

In this paper we present the extension of our prototype three-level database system (TIJAH) developed for structured information retrieval. The extension is aimed at modeling vague search on XML...

Exploiting Sequential Dependencies for Expert Finding (2009)

Pavel Serdyukov, Henning Rode, Djoerd Hiemstra

We propose an expert finding method based on the assumption of sequential dependence between a candidate expert and the query terms in the scope of a document. We assume that the strength of relation of...

Efficient XML and Entity Retrieval with PF/Tijah: CWI and University of Twente at INEX’08 (2009)

Henning Rode, Djoerd Hiemstra, Arjen De Vries, Pavel Serdyukov

PF/Tijah is a research prototype created by the University of Twente and CWI Amsterdam with the goal to create a flexible environment for setting up search

Proceedings of the 9th Dutch-Belgian Information Retrieval Workshop (2009)

Aly, Robin, Hauff, Claudia, Hamer Den, Ida, Hiemstra, Djoerd, Huibers, Theo, Jong De, Franciska

Welcome to the 9th Dutch-Belgian Information Retrieval Workshop (DIR). I very well remember the DIR workshop in 2001 that was also organized in Twente. It took place exactly one day before my PhD...

Concept Detectors: How Good is Good Enough? (2009)

Aly, Robin, Hiemstra, Djoerd

Today, semantic concept based video retrieval systems often show insufficient performance for real-life applications. Clearly, a big share of the reason is the lacking performance of the detectors of...

Combining Information Sources for Video Retrieval: The Lowlands Team at TRECVID 2003 (2008)

Thijs Westerveld, Tzveta Ianeva, Lioudmila Boldareva, Djoerd Hiemstra

The previous video track results demonstrated that it is far from trivial to take advantage of multiple modalities for the video retrieval search task. For almost any query, results on ASR...

Database (2008)

Henk Ernst Blok, Djoerd Hiemstra

1.1 From flat file to XML IR; 1.2 Towards the database approach

Theme “Text search with PFTijah” (2008)

Djoerd Hiemstra, Henning Rode, Jan Flokstra, Roel Van Os, Maurice Van Keulen

This report briefly introduces the PFTijah project and describes project assignments for the course XML & Databases 2 (211086) [7]. Additional assignments can be found in the document

Abstract (2008)

Djoerd Hiemstra, Wessel Kraaij

In this paper we present the language modeling approach to information retrieval as a toolbox to systematically combine information from different sources. Four TREC subtasks (Ad Hoc, Entry Page,...

Parsimonious Language Models for a Terabyte of Text (2008)

Djoerd Hiemstra, Jaap Kamps

The aims of this paper are twofold. Our first aim is to compare results of the earlier Terabyte tracks to the Million Query track. We submitted a number of runs using different document...

Cross-language Retrieval at Twente and TNO. (2008)

Dennis Reidsma, Djoerd Hiemstra, Franciska De Jong, Wessel Kraaij

This paper describes the official runs of the Twenty-One group for CLEF-2002. The Twenty-One group participated in the Dutch and Finnish monolingual and the Dutch bilingual tasks. This paper also...

ABSTRACT (2008)

Pavel Serdyukov, Henning Rode, Djoerd Hiemstra

Modeling relevance propagation for the expert search task

Structured Document Retrieval, Multimedia Retrieval, and Entity Ranking Using PF/Tijah (2008)

Theodora Tsikrika, Pavel Serdyukov, Henning Rode, Thijs Westerveld, Robin Aly, Djoerd Hiemstra, ...

XML retrieval system, to evaluate structured document retrieval, multimedia retrieval, and entity ranking tasks in the context of INEX 2007. For the retrieval of textual and multimedia elements in...

Interactive Retrieval of Video Using Pre-computed Shot-Shot Similarities (2008)

Liudmila Boldareva, Djoerd Hiemstra

A probabilistic framework for content-based interactive video retrieval is described. The developed indexing of video fragments originates from the probability of the user’s positive judgment...

DEFINITION (2008)

Djoerd Hiemstra, Ricardo Baeza-yates

Structured text retrieval models provide a formal definition or mathematical framework for querying semistructured textual databases. A textual database contains both content and structure. The...

General Terms (2008)

Djoerd Hiemstra

We systematically investigate a new approach to estimating the parameters of language models for information retrieval, called parsimonious language models. Parsimonious language models explicitly...

PF/Tijah Documentation 1 Features And Goals PF/Tijah Documentation (2008)

Djoerd Hiemstra, Henning Rode, Jan Flokstra

PF/Tijah (Pathfinder/Tijah, pronounce as "Pee Ef Teeja") is a flexible open source text search system developed at the University of Twente in cooperation with CWI Amsterdam and TU...

General Terms (2008)

Pavel Serdyukov, Djoerd Hiemstra, Maarten Fokkinga

In this paper we address the task of automatically finding an expert within the organization, known as the expert search problem. We present a theoretically-based probabilistic algorithm which...

Exploiting Query Structure and Document Structure to Improve Document Retrieval Effectiveness (2008)

Vojkan Mihajlović, Djoerd Hiemstra, Henk Ernst Blok, Peter M. G. Apers

In this paper we present a systematic analysis of document retrieval using unstructured and structured queries within the score region algebra (SRA) structured retrieval framework. The behavior of...

Abstract (2008)

Djoerd Hiemstra

This paper describes the first large-scale evaluation of information retrieval systems using Dutch documents and queries. We describe in detail the characteristics of the Dutch test data, which is...

Categories and Subject Descriptors: (2008)

Henk Ernst Blok, Vojkan Mihajlović, Georgina Ramírez, Thijs Westerveld, Djoerd Hiemstra

Not many XML information retrieval (IR) systems exist that allow easy addition of and switching between different IR models. Especially for the scientific environment where building a system takes a...

A Tutorial on Information Retrieval Modelling (2008)

Djoerd Hiemstra

Many applications that handle information on the internet would be completely

The Potential of User Feedback through the Iterative Refining of Queries in an Image Retrieval System (2008)

Maher Ben Moussa, Marco Pasch, Djoerd Hiemstra

Inaccurate or ambiguous expressions in queries lead to poor results in information retrieval. We assume that iterative user feedback can improve the quality of queries. To this end we...

Question Answering for Dutch: Simple does it (2008)

Arjen Hoekstra, Djoerd Hiemstra, Paul Vet, Theo Huibers

When people pose questions in natural language to search for information on the web, the role of question answering (QA) systems becomes important. In this paper the QA system simpleQA, capable of...

DEFINITION (2008)

Djoerd Hiemstra

A language model assigns a probability to a piece of unseen text, based on some training data. For example, a language model based on a big English newspaper archive is expected to assign a higher...

Entity Ranking on Graphs: Studies on Expert Finding (2008)

Henning Rode, Pavel Serdyukov, Djoerd Hiemstra, Hugo Zaragoza

Today’s web search engines try to offer services for finding various information in addition to simple web pages, like showing locations or answering simple fact queries. Understanding the association...

Sound ranking algorithms for XML search in PF/Tijah (2008)

Hiemstra, Djoerd, Klinger, Stefan, Rode, Henning, Flokstra, Jan, Apers, Peter

We argue that ranking algorithms for XML should reflect the actual combined content and structure constraints of queries, while at the same time producing equal rankings for queries that are...

08111 Report -- Ranked XML Querying (2008)

Amer-Yahia, Sihem, Hiemstra, Djoerd, Roelleke, Thomas, Srivastava, Divesh, Weikum, Gerhard

This paper is based on a five-day workshop on "Ranked XML Querying" that took place in Schloss Dagstuhl in Germany in March 2008 and was attended by 27 people from three different research...

Additional (2008)

Robin Aly, Claudia Hauff, Willemijn Heeren, Djoerd Hiemstra, Franciska De Jong, Thijs Verschoor, ...

Official runs, listed as type, run, description, MAP: A, UTen, English ASR, 0.0031; A, UTt hs-t2-nm, Top-2 concepts from t hs graph method with neighbor multiply, 0.0137; A, UTwiki-t2-nm, Top-2 Wikipedia concepts with neighbor...

Modeling multi-step relevance propagation for expert finding (2008)

Pavel Serdyukov, Henning Rode, Djoerd Hiemstra

An expert finding system allows a user to type a simple text query and retrieve names and contact information of individuals that possess the expertise expressed in the query. This paper proposes a...

SCHEDULE (2007)

Rik De Busser, Regina Barzilay, Paul Clough, Bruce Croft, Norbert Fuhr, Fredric Gey, ...

In cooperation with: Interdisciplinary Centre for Law and IT Research Group Legal Informatics & Information Retrieval IWT Vlaanderen Proceedings editors

CIRQUID: Complex Information Retrieval QUeries In a Database (2007)

Djoerd Hiemstra, Henk Ernst Blok, Maurice Van Keulen, Willem Jonker, Martin L. Kersten

The CIRQUID project plans to design and build a DBMS that seamlessly integrates relevance-oriented querying of semi-structured data (XML) with traditional querying of this data. The project is funded...

DRUID. The following people have contributed to these results (appearing in alphabetical order): Jan (2007)

Alex Van Ballegooij, Jan Mark Geusenbroek, Jurgen Den Hartog, Djoerd Hiemstra, Thijs Westerveld, Ioannis Patras, ...

This paper describes our participation in the TREC Video Retrieval evaluation. Our approach uses two complementary automatic approaches (the first based on visual content, the other on transcripts),...

y (2007)

Wessel Kraaij, Djoerd Hiemstra

This paper describes the official runs of the Twenty-One group for TREC-8. The Twenty-One group participated in the Ad-hoc, CLIR, Adaptive Filtering and SDR tracks. The main focus of our experiments is...

Creating a Dutch information retrieval test corpus (2007)

Djoerd Hiemstra, David Van Leeuwen

This paper describes the first large-scale evaluation of information retrieval systems using Dutch documents and queries. We describe in detail the characteristics of the Dutch test data, which is...

Creating a Dutch testbed to evaluate the retrieval from textual databases (2007)

Djoerd Hiemstra

This paper describes the first large-scale evaluation of information retrieval systems using Dutch documents and queries. We describe in detail the characteristics of the Dutch test data, which is...

Cross-language Retrieval at the University of Twente and TNO. (2007)

Dennis Reidsma, Djoerd Hiemstra, Franciska De Jong, Wessel Kraaij

This paper describes the official runs of the Twente/TNO group for CLEF 2002. We participated in the Dutch and Finnish monolingual and the Dutch bilingual tasks. In addition this paper reports on an...

Bibliography Published Papers by Dragomir R. Radev References (2007)

Dragomir R. Radev, Alfred Aho, Shih-fu Chang, Kathleen Mckeown, Dragomir Radev, Bruce Croft, ...

[5] Suresh Bhavnani, Karen Drabenstott, and Dragomir Radev. Towards a unified framework of IR tasks and strategies. In 2001 ASIST Annual

The effectiveness of concept based search for video retrieval (2007)

Claudia Hauff, Robin Aly, Djoerd Hiemstra

In this paper we investigate how a small number of high-level concepts derived for video shots, such as Sports, Face, Indoor, etc., can be used effectively for ad hoc search in video material. We...

XML Information Retrieval from Spoken Word Archives (2007)

Aly, Robin, Hiemstra, Djoerd, Ordelman, Roeland, Werff Van Der, Laurens, Jong De, Franciska

This report presents the University of Twente's first cross-language speech retrieval experiments in the Cross-Language Evaluation Forum (CLEF). It describes the issues our contribution was focusing on,...

Using Query Profiles for Clarification (2006)

Henning Rode, Djoerd Hiemstra

The following paper proposes a new kind of relevance feedback. It shows how so-called query profiles can be employed for disambiguation and clarification. Query profiles provide useful...

PF/Tijah: text search in an XML database system (2006)

Djoerd Hiemstra, Henning Rode, Roel Van Os, Jan Flokstra

This paper introduces the PF/Tijah system, a text search system that is integrated with an XML/XQuery database management system. We present examples of its use, we explain some of the system...

The Lowlands' TREC Experiments 2005 (2006)

Henning Rode, Georgina Ramírez, Thijs Westerveld, Djoerd Hiemstra, Arjen P. Vries

This paper describes our participation in the TREC HARD track (High Accuracy Retrieval of Documents) and the TREC Enterprise track. The main goal of our HARD participation is the development and...

Score Region Algebra: Building a Transparent XML-IR Database (2005)

Mihajlovic, Vojkan, Blok, Henk Ernst, Hiemstra, Djoerd, Apers, Peter M.G.

A unified database framework that will enable better comprehension of ranked XML retrieval is still a challenge in the XML database field. We propose a logical algebra, named score region algebra, that...

Score Region Algebra: Building a Transparent XML-IR Database (2005)

Mihajlovic, Vojkan, Blok, Henk Ernst, Hiemstra, Djoerd, Apers, Peter M.G.

A unified database framework that will enable better comprehension of ranked XML retrieval is still a challenge in the XML database field. We propose a logical algebra, named score region algebra, that...

A database approach to information retrieval: The remarkable relationship between language models and region models (2005)

Hiemstra, Djoerd, Mihajlovic, Vojkan

In this report, we unify two quite distinct approaches to information retrieval: region models and language models. Region models were developed for structured document retrieval. They provide a...

Utilizing Structural Knowledge for Information Retrieval in XML Databases (2005)

Mihajlovic, Vojkan, Hiemstra, Djoerd, Blok, Henk Ernst, Apers, Peter M.G.

In this paper we address the problem of immediate translation of eXtensible Mark-up Language (XML) information retrieval (IR) queries to relational database expressions and stress the benefits of...

TIJAH at INEX 2004: Modeling Phrases and Relevance Feedback (2005)

Vojkan Mihajlović, Georgina Ramírez, Djoerd Hiemstra, Henk Ernst Blok

This paper discusses our participation in INEX using the TIJAH XML-IR system. We have enriched the TIJAH system, which follows a standard layered database architecture, with several new...

An integrated approach to text and image retrieval: the Lowlands team at TRECVID 2005 (2005)

Thijs Westerveld, Jan C. Gemert, Roberto Cornacchia, Djoerd Hiemstra, Arjen P. Vries

Our main focus for this year was on setting up a flexible retrieval environment rather than on evaluating novel video retrieval approaches. In this structured abstract the submitted runs are briefly...

The simplest evaluation measures for XML information retrieval that could possibly work (2005)

Djoerd Hiemstra, Vojkan Mihajlović

This paper reviews several evaluation measures developed for evaluating XML information retrieval (IR) systems. We argue that these measures, some of which are currently in use by the INitiative for...

Parsimonious Language Models for Information Retrieval (2004)

Hiemstra, Djoerd, Robertson, Stephen, Zaragoza, Hugo

We systematically investigate a new approach to estimating the parameters of language models for information retrieval, called parsimonious language models. Parsimonious language models explicitly...

Combining information sources for video retrieval: The Lowlands team at TRECVID 2003, in Proceedings of TRECVid 2003 (2004)

Thijs Westerveld, Tzvetanka Ianeva, Lioudmila Boldareva, Djoerd Hiemstra

The previous video track results demonstrated that it is far from trivial to take advantage of multiple modalities for the video retrieval search task. For almost any query, results on ASR...

Probabilistic Approaches to Video Retrieval (2004)

Tzvetanka Ianeva, Lioudmila Boldareva, Thijs Westerveld, Roberto Cornacchia, Djoerd Hiemstra

Our experiments for TRECVID 2004 further investigate the applicability of the so-called “Generative Probabilistic Models to video retrieval”. TRECVID 2003 results demonstrated that mixture models...

Parsimonious Language Models for Information Retrieval (2004)

Djoerd Hiemstra, Stephen Robertson, Hugo Zaragoza

We systematically investigate a new approach to estimating the parameters of language models for information retrieval, called parsimonious language models. Parsimonious language models explicitly...

Combining Information Sources for Video Retrieval: The Lowlands Team at TRECVID 2003 (2004)

Thijs Westerveld, Tzvetanka Ianeva, Lioudmila Boldareva, Djoerd Hiemstra

The previous video track results demonstrated that it is far from trivial to take advantage of multiple modalities for the video retrieval search task. For almost any query, results based on ASR...

Monitoring User-System Performance in Interactive Retrieval (2004)

Liudmila Boldareva, Arjen P. Vries, Djoerd Hiemstra

Monitoring user-system performance in interactive search is a challenging task. Traditional measures of retrieval evaluation, based on recall and precision, are not of any use in real time, for they...

Interactive content-based retrieval using pre-computed object-object similarities (2004)

Liudmila Boldareva, Djoerd Hiemstra

We propose using a truncated object-object similarity matrix as an access structure for interactive video retrieval. The proposed approach offers a scalable solution to retrieval and allows...

An XML-IR-DB Sandwich: Is it Better with an Algebra in Between (2004)

Vojkan Mihajlović, Djoerd Hiemstra, Henk Ernst Blok, Peter M. G. Apers

In this paper we address the problem of immediate translation of XPath+IR queries to relational database expressions and exert the benefits of using an intermediate algebra. Adding an intermediate...

CIRQUID: Complex Information Retrieval Queries in a Database (2003)

Hiemstra, Djoerd, Vries De, Arjen P., Blok, Henk Ernst, Keulen Van, Maurice, Jonker, Willem, Kersten, Martin L.

The CIRQUID project plans to design and build a DBMS that seamlessly integrates relevance-oriented querying of semi-structured data (XML) with traditional querying of this data. The project is funded...

A Probabilistic Multimedia Retrieval Model and Its Evaluation (2003)

Thijs Westerveld, Alex Van Ballegooij, Franciska De Jong, Djoerd Hiemstra

We present a probabilistic model for the retrieval of multimodal documents. The model is based on Bayesian decision theory and combines models for text-based search with models for visual search. The...

A probabilistic multimedia retrieval model and its evaluation (2003)

Thijs Westerveld, Alex Van Ballegooij, Franciska De Jong, Djoerd Hiemstra

In this paper we present a probabilistic model for the retrieval of multimodal documents. The model is based on Bayesian decision theory and combines models for text-based search with...

CIRQuL - Complex Information Retrieval Query Language (2003)

Vojkan Mihajlovic, Djoerd Hiemstra

In this paper we will present a new framework for the retrieval of XML documents. We will describe the extension for existing query languages (XPath and XQuery) geared toward ranked information...

Bayesian Extension to the Language Model for Ad Hoc Information Retrieval (2003)

Hugo Zaragoza, Djoerd Hiemstra, Michael Tipping, Stephen Robertson

We propose a Bayesian extension to the ad-hoc Language Model. Many smoothed estimators used for the multinomial query model in ad-hoc Language Models (including Laplace and Bayes-smoothing) are...

Relevance Feedback in Probabilistic Multimedia Retrieval (2003)

Lioudmila Boldareva, Djoerd Hiemstra, Willem Jonker

In this paper we propose a new method for data organisation in a (multimedia) collection. We use probabilistic approaches to indexing and interactive retrieval which make it possible to fill the semantic gap....

Bayesian extension to the language model for ad hoc information retrieval (2003)

Hugo Zaragoza, Djoerd Hiemstra, Stephen Robertson

We propose a Bayesian extension to the ad-hoc Language Model. Many smoothed estimators used for the multinomial query model in ad-hoc Language Models (including Laplace and Bayes-smoothing) are...

Cross-language Retrieval at the University of Twente and TNO (2003)

Reidsma, Dennis, Hiemstra, Djoerd, Jong De, Franciska, Kraaij, Wessel

This paper describes the official runs of the Twente/TNO group for CLEF 2002. We participated in the Dutch and Finnish monolingual and the Dutch bilingual tasks. In addition this paper reports on an...

A Probabilistic Multimedia Retrieval Model and Its Evaluation (2003)

Djoerd Hiemstra, Franciska De Jong, Alex Van Ballegooij, Thijs Westerveld

We present a probabilistic model for the retrieval of multimodal documents. The model is based on Bayesian decision theory and combines models for text-based search with models for visual search. The...

Creating a Dutch testbed to evaluate the retrieval from textual databases (2002)

Hiemstra, Djoerd, Leeuwen Van, David A.

This paper describes the first large-scale evaluation of information retrieval systems using Dutch documents and queries. We describe in detail the characteristics of the Dutch test data, which is...

A database approach to INEX (2002)

Djoerd Hiemstra

This paper describes a first prototype system for content-based retrieval from XML data. The system's design supports both XPath queries and complex information retrieval queries.

A Scalable and Efficient Content-Based Multimedia Retrieval System (2002)

Lioudmila Boldareva, Djoerd Hiemstra, Willem Jonker

In this work the problem of content-based information retrieval is approached from a new perspective. We look at a probabilistic approach in CBIR from the angle of Bayesian networks. Our data...

Term-Specific Smoothing for the Language Modeling Approach to Information Retrieval: The Importance of a Query Term (2002)

Djoerd Hiemstra

This paper follows a formal approach to information retrieval based on statistical language models. By introducing some simple reformulations of the basic language modeling approach we introduce the...

Predicting the cost-quality trade-off for information retrieval queries: Facilitating database design and query optimisation (2001)

Blok, Henk Ernst, Hiemstra, Djoerd, Choenni, Sunil, Jong De, Franciska, Blanken, Henk M., Apers, Peter M.G.

Efficient, flexible, and scalable integration of full text information retrieval (IR) in a DBMS is not a trivial case. This holds in particular for query optimization in such a context. To facilitate...

Statistical Language Models and Information Retrieval: natural language processing really meets retrieval (2001)

Djoerd Hiemstra, Franciska De Jong

Traditionally, natural language processing techniques for information retrieval have always been studied outside the framework of formal models of information retrieval. In this article, we introduce...

Retrieving Web pages using content, links, URLs and anchors (2001)

Thijs Westerveld, Wessel Kraaij, Djoerd Hiemstra

For this year’s web track, we concentrated on the entry page finding task. For the content-only runs, in both the ad-hoc task and the entry page finding task, we used an information...

Relevance feedback for best match term weighting algorithms in information retrieval (2001)

Djoerd Hiemstra, Stephen Robertson

Personalisation in full text retrieval or full text filtering implies reweighting of the query terms based on some explicit or implicit feedback from the user. Relevance feedback inputs the...

Language models and probability of relevance (2001)

Stephen Robertson, Djoerd Hiemstra

1 A formulation of the Language Model for IR. The basic formula used in several of the papers which take a language modelling approach to IR can be written as follows: P(D, T1, T2, ..., Tn) = P...

Relating the new language models of information retrieval to the traditional retrieval models (2000)

Hiemstra, Djoerd, Vries De, Arjen P.

During the last two years, exciting new approaches to information retrieval were introduced by a number of different research groups that use statistical language models for retrieval. This paper...

AIP). © 2000, The Physics of Flocking. Online. American Institute of Physics. Available: http://www.aip.org/physnews/preview/1998/flocks/text. htm. Last accessed 24 (2000)

Vojkan Mihajlović, Djoerd Hiemstra, Peter Apers

This paper describes some new ideas on developing a logical algebra for databases that manage textual data and support information retrieval functionality. We describe a first prototype of...

Relating the new language models of information retrieval to the traditional retrieval models. CTIT (2000)

Djoerd Hiemstra

During the last two years, exciting new approaches to information retrieval were introduced by a number of different research groups that use statistical language models for retrieval. This paper...

Extracting Bimodal Representations for Language-Based Image Retrieval (2000)

Thijs Westerveld, Djoerd Hiemstra, Franciska De Jong

This paper explores two approaches to multimedia indexing that might contribute to the advancement of text-based conceptual search for pictorial information. Insights from relatively mature...

Relating the New Language Models of Information Retrieval to the Traditional Retrieval Models (2000)

Djoerd Hiemstra

During the last two years, exciting new approaches to information retrieval were introduced by a number of different research groups that use statistical language models for retrieval. This paper...

Extracting Bimodal Representations for Language-Based Image Retrieval (2000)

Thijs Westerveld, Djoerd Hiemstra, Franciska De Jong

This paper explores two approaches to multimedia indexing that might contribute to the advancement of text-based conceptual search for pictorial information. Insights from relatively mature retrieval...

Twenty-One at TREC-7: Ad-hoc and Cross-Language Track (1999)

Djoerd Hiemstra, Wessel Kraaij

This paper describes the official runs of the Twenty-One group for TREC-7. The Twenty-One group participated in the ad-hoc and the cross-language track and made the following accomplishments: We...

Disambiguation strategies for cross-language information retrieval (1999)

Djoerd Hiemstra, Franciska De Jong

This paper gives an overview of tools and methods for Cross-Language Information Retrieval (CLIR) that are developed within the Twenty-One project. The tools and methods are evaluated with...

Disambiguation Strategies for Cross-language Information Retrieval (1999)

Djoerd Hiemstra, Franciska De Jong

This paper gives an overview of tools and methods for Cross-Language Information Retrieval (CLIR) that are developed within the Twenty-One project. The tools and methods are evaluated with the TREC...

Twenty-One at TREC-7: Ad-hoc and Cross-language track (1999)

Djoerd Hiemstra, Wessel Kraaij

This paper describes the official runs of the Twenty-One group for TREC-7. The Twenty-One group participated in the ad-hoc and the cross-language track and made the following accomplishments: We...

Cross-language information retrieval in Twenty-One: Using one, some or all possible translations? (1999)

Djoerd Hiemstra, Franciska De Jong

This paper gives an overview of the tools and methods for Cross-Language Information Retrieval (CLIR) that were developed within the Twenty-One project. The tools and methods are evaluated with the...

Disambiguation strategies for cross-language information retrieval (1999)

Djoerd Hiemstra

Keywords: Cross-Language Information Retrieval, Statistical Machine

Cross Language Retrieval with the Twenty-One system (1998)

Wessel Kraaij, Djoerd Hiemstra

The EU project Twenty-One will support cross language queries in a multilingual document base. A prototype version of the Twenty-One system has been subjected to the Cross Language track tests in...

A Linguistically Motivated Probabilistic Model of Information Retrieval (1998)

Djoerd Hiemstra

This paper presents a new probabilistic model of information retrieval. The most important modeling assumption made is that documents and queries are defined by an ordered sequence of single terms....

Multilingual Domain Modeling in Twenty-One - Automatic Creation of a Bi-directional Translation Lexicon from a Parallel Corpus (1998)

Djoerd Hiemstra

Within the project Twenty-One, which aims at effective dissemination of information on ecology and sustainable development, a system is developed that supports cross-language information retrieval...

Nymble: a high performance learning name-finder (1997)

Djoerd Hiemstra, Vojkan Mihajlović

In this report, we unify two quite distinct approaches to information retrieval: region models and language models. Region models were developed for structured document retrieval. They provide a...

A Domain Specific Lexicon Acquisition Tool for Cross-Language Information Retrieval (1997)

Djoerd Hiemstra, Franciska De Jong, Wessel Kraaij

With the recent enormous increase of information dissemination via the web as incentive there is a growing interest in supporting tools for cross-language retrieval. In this paper we describe a...

The Mirror DBMS at TREC-8

Djoerd Hiemstra

The database group at University of Twente participates in TREC-8 using the Mirror DBMS, a prototype database system especially designed for multimedia and web retrieval. From a database perspective,...

The Mirror DBMS at TREC

Djoerd Hiemstra

The database group at University of Twente participates in TREC-8 using the Mirror DBMS, a prototype database system especially designed for multimedia and web retrieval. From a database perspective,...

A database approach to content-based XML retrieval

Djoerd Hiemstra

This paper describes a first prototype system for content-based retrieval from XML data. The system's design supports both XPath queries and complex information retrieval queries based on a...

A database approach to content-based XML retrieval

Djoerd Hiemstra

This paper describes a first prototype system for content-based retrieval from XML data. The system's design supports both XPath queries and complex information retrieval queries based on a...