Gerhard Weikum

Unbundling Transaction Services in the Cloud (2009)

David Lomet, Alan Fekete, Gerhard Weikum, Mike Zwilling

The traditional architecture for a DBMS engine has the recovery, concurrency control and access method code tightly bound together in a storage engine for records. We propose a different approach,...

Dagstuhl Seminar Organizer Authors (2009)

Cliff Jones, David Lomet, Alexander Romanovsky, Gerhard Weikum, Alan Fekete, Marie-Claude Gaudel, ...

This paper is based on a five-day workshop on “Atomicity in System Design and Execution” that took place in Schloss Dagstuhl in Germany [5] in April 2004 and was attended by 32 people from...

TOB: Timely Ontologies for Business Relations (2009)

Qi Zhang, Fabian M. Suchanek, Lihua Yue, Gerhard Weikum

In this paper we present a suite of methods for extracting temporal relations from semi-structured and textual Web sources. We particularly address the needs for building and maintaining business...

Social Wisdom for Search and Recommendation (2009)

Ralf Schenkel, Tom Crecelius, Mouna Kacimi, Thomas Neumann, Josiane Xavier Parreira, Marc Spaniol, ...

Social-tagging communities offer great potential for smart recommendation and “socially enhanced” search-result ranking. Beyond traditional forms of collaborative recommendation that are based on...

Efficient Time-Travel on Versioned Text Collections (2009)

Klaus Berberich, Srikanta Bedathur, Gerhard Weikum

Abstract: The availability of versioned text collections such as the Internet Archive opens up opportunities for time-aware exploration of their contents. In this paper, we propose time-travel...

EOS 2: Unstoppable Stateful PHP (2009)

German Shegalov, Gerhard Weikum

A growing number of businesses deliver mission-critical applications (stock trading, auctions, etc.) to their customers as Web Services. These applications comprise heterogeneous components...

Good Guys vs. Bad Guys: Countering Cheating in Peer-to-Peer Authority Computations over Social Networks (2009)

Mauro Sozio, Tom Crecelius, Josiane Xavier Parreira, Gerhard Weikum

Eigenvector computations are an important building block for computing authority, trust, and reputation scores in social networks and other graphs. In peer-to-peer networks or other forms of...

Making SENSE: Socially ENhanced Search and Exploration (2009)

Tom Crecelius, Mouna Kacimi, Sebastian Michel, Thomas Neumann, Josiane Xavier Parreira, Ralf Schenkel, ...

Online communities like Flickr, del.icio.us and YouTube have established themselves as very popular and powerful services for publishing and searching contents, but also for identifying other users...

Database and Information-Retrieval Methods for Knowledge Discovery (2009)

Weikum, Gerhard

This article's aim is to advocate for the integration of database systems (DB) and information-retrieval (IR) methods to address applications that are emerging from the ongoing explosion and...

Word Sense Disambiguation (2008)

Speaker Georgiana Ifrim, Supervisors Prof, Gerhard Weikum, Mpi Informatik

● “He who knows not and knows not he knows not, He is a fool    Shun him. ● He who knows not and knows he knows not, He is simple    Teach him. ●...

SELECT * FROM INDEX (2008)

Anja Theobald, Gerhard Weikum

XML is becoming the standard for integrating and exchanging data over the Internet and within intranets, covering the complete spectrum from largely unstructured, ad hoc documents to highly...

Decentralized searc... (2008)

Sebastian Michel, Matthias Bender, Nikos Ntarmos, Peter Triantafillou, Gerhard Weikum, Christian Zimmer

Peer-to-Peer (P2P) search requires intelligent decisions for query routing: selecting the best peers to which a given query, initiated at some peer, should be forwarded for retrieving additional...

ACKNOWLEDGEMENTS (2008)

Light Weight, Workflow Management System, Sai Pradeep Vangala, Prof Dr. -ing, Gerhard Weikum

I confirm under oath that I have written the thesis on my own and that I have not used any media other than the ones mentioned in the thesis.

13 (2008)

Gerhard Goos, Juris Hartmanis, Jan Van Leeuwen, Editorial Board, David Hutchison, Takeo Kanade, ...

concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks....

Dagstuhl Seminar Organizer Authors (2008)

Cliff Jones, David Lomet, Alexander Romanovsky, Gerhard Weikum, Alan Fekete, Marie-Claude Gaudel, ...

This paper is based on a five-day workshop on “Atomicity in System Design and Execution” that took place in Schloss Dagstuhl in Germany [5] in April 2004 and was attended by 32 people from...

On the Utility of Automatically Generated Wordnets (2008)

Gerard De Melo, Gerhard Weikum

Abstract. Lexical resources modelled after the original Princeton WordNet are being compiled for a considerable number of languages; however, most have yet to reach a comparable level of coverage. In...

The LRU-K Page Replacement Algorithm For Database Disk Buffering (2008)

Elizabeth J. O'Neil, Patrick E. O'Neil, Gerhard Weikum

This paper introduces a new approach to database disk buffering, called the LRU-K method. The basic idea of LRU-K is to keep track of the times of the last K references to popular database pages,...

Part I: What Is It All About (2008)

Surajit Chaudhuri, Gerhard Weikum

Motivate and enable students and young scientists to pursue research on the auto-tuning aspect of autonomic computing Complementary to • SIGMOD 02 and VLDB 02 tutorials (Shasha/Bonnet) on tuning...

Node Behavior Prediction for Large-Scale Approximate Information Filtering (2008)

Christian Zimmer, Christos Tryfonopoulos, Klaus Berberich, Gerhard Weikum, Manolis Koubarakis

In this paper we investigate methods that allow us to identify the publishing behavior of individual nodes in large-scale distributed information filtering systems. The work presented here is based...

Using Restrictive Classification and Meta Classification for Junk Elimination (2008)

Stefan Siersdorfer, Gerhard Weikum

Abstract. This paper addresses the problem of performing supervised classification on document collections that also contain junk documents. By “junk documents” we mean documents that do not...

On the usage of global document occurrences in peer-to-peer information systems (2008)

Odysseas Papapetrou, Sebastian Michel, Matthias Bender, Gerhard Weikum

Abstract. There exist a number of approaches for query processing in Peer-to-Peer information systems that efficiently retrieve relevant information from distributed peers. However, very few of them...

and (2008)

Surajit Chaudhuri, Gerhard Weikum

We investigate the problem of ranking the answers to a database query when many tuples are returned. In particular, we present methodologies to tackle the problem for conjunctive and range queries,...

Dagstuhl Seminar (Organizer Authors) (2008)

Cliff Jones, David Lomet, Alexander Romanovsky, Gerhard Weikum, Alan Fekete, Marie-claude Gaudel, ...

Abstract: This paper is a manifesto for future research on “atomicity” in its many guises and is based on a five-day workshop on “Atomicity in System Design and Execution” that took place...

Associate Editors (2008)

Jaideep Srivastava, Thomas M. Niccum, Bhaskar Himatsingka, Leana Golubchik, Richard R. Muntz, Gerhard Weikum, ...

is published quarterly and is distributed to all TC members. Its scope includes the design, implementation, modelling, theory and application of database systems and their technology. Letters,...

P2P Authority Analysis for Social Communities (2008)

Josiane Xavier Parreira, Sebastian Michel, Matthias Bender, Tom Crecelius, Gerhard Weikum

PageRank-style authority analyses of Web graphs are of great importance for Web mining. Such authority analyses also apply to hot “Web 2.0” applications that exhibit a natural graph structure,...

Report on the Second International Workshop on Self-Managing Database Systems (SMDB 2007) (2008)

Anastassia Ailamaki, Surajit Chaudhuri, Sam Lightstone, Guy Lohman, Pat Martin, Ken Salem, ...

Information management systems are growing rapidly in scale and complexity, while skilled database administrators

Unstoppable Stateful PHP Web Services (2008)

German Shegalov, Gerhard Weikum, Klaus Berberich

Abstract. This paper presents the architecture and implementation of the EOS 2 failure-masking framework for composite Web Services. EOS 2 is based on the recently proposed notion of interaction...

TopX – AdHoc and Feedback Tasks (2008)

Martin Theobald, Andreas Broschart, Ralf Schenkel, Silvana Solomon, Gerhard Weikum

Abstract. This paper describes the setup and results of our contributions

Towards Peer-to-Peer Web Search (Extended Abstract) (2008)

Gerhard Weikum, Holger Bast, Geoffrey Canright, David Hales, Christian Schindelhauer, Peter Triantafillou

The peer-to-peer (P2P) computing paradigm is an intriguing alternative to Google-style search engines for querying and ranking Web content. In a network with many thousands or millions of peers the...

Distributed with Scalable File Organization (2008)

Radek Vingralek, Yuri Breitbart, Gerhard Weikum

This paper presents a distributed file organization for record-structured, disk-resident files with key-based exact-match access. The file is organized into buckets that are spread across multiple...

The Web in Ten Years: Challenges and Opportunities for Database Research (2008)

Gerhard Weikum

In order to evolve into a dependable and ubiquitous information infrastructure, the World Wide Web needs comprehensive quality, performance, and availability guarantees for all kinds of E-services...

Database Selection and Result Merging in P2P Web Search (2008)

Matthias Bender, Sergey Chernov, Sebastian Michel, Pavel Serdyukov, Gerhard Weikum, Christian Zimmer

Abstract. Intelligent Web search engines are extremely popular now. Currently, only the commercial centralized search engines like Google can process terabytes of Web data. Alternative search engines...

Dagstuhl Seminar Organizer Authors (2008)

Cliff Jones, David Lomet, Alexander Romanovsky, Gerhard Weikum, Alan Fekete, Marie-Claude Gaudel, ...

This paper is based on a five-day workshop on "Atomicity in System Design and Execution" that took place in Schloss Dagstuhl in Germany [5] in April 2004 and was attended by 32...

von / by (2008)

Prof Dr, Gerhard Weikum

... to have used no sources or aids other than those indicated.

Relevance Feedback for Sketch Retrieval Based on Linear Programming Classification (2008)

Gerhard Goos, Juris Hartmanis, Jan Van Leeuwen, Editorial Board, ...

Abstract. Relevance feedback plays as an important role in sketch retrieval as it does in existing content-based retrieval. This paper presents a method of relevance feedback for sketch retrieval by...

General Terms (2008)

Alessandro Linari, Gerhard Weikum

In this paper we address the query routing problem in peer-to-peer (P2P) information retrieval. Our system builds on the idea of a Semantic Overlay Network (SON), in which each peer becomes...

Computing Trusted Authority Scores in Peer-to-Peer Web Search Networks (2008)

Josiane Xavier Parreira, Debora Donato, Carlos Castillo, Gerhard Weikum

Peer-to-peer (P2P) networks have received great attention for sharing and searching information in large user communities. The open and anonymous nature of P2P networks is one of their main strengths,...

A Time Machine for Text Search (2008)

Klaus Berberich, Srikanta Bedathur, Thomas Neumann, Gerhard Weikum

Text search over temporally versioned document collections such as web archives has received little attention as a research problem. As a consequence, there is no scalable and principled solution to...

New Challenges (2008)

Matthias Bender, Yannis Ioannidis, Donald Kossmann, Henrik Nottelmann, Hans-Jörg Schek, Gerhard Weikum, ...

The peer-to-peer (P2P) paradigm is an intriguing approach for coping with dynamically evolving federations of loosely coupled digital libraries. In addition to the libraries, user agents with...

der Universität des Saarlandes (2008)

Roman Dementiev, Prof Dr. -ing, Gerhard Weikum, Prof Dr. -ing, Gerhard Weikum, Prof Dr, ...

In recent years, the development of theoretically I/O-efficient algorithms and data structures has received considerable attention. However, much less has been done to evaluate their performance, in...

Dagstuhl Seminar Organizer Authors (2008)

Cliff Jones, David Lomet, Alexander Romanovsky, Gerhard Weikum, Alan Fekete, Marie-Claude Gaudel, ...

This paper is based on a five-day workshop on "Atomicity in System Design and Execution" that took place in Schloss Dagstuhl in Germany [5] in April 2004 and was attended by...

Efficient Search and Approximate Information Filtering in a Distributed Peer-to-Peer Environment of Digital Libraries (2008)

Christian Zimmer, Christos Tryfonopoulos, Gerhard Weikum

Abstract. We present a new architecture for efficient search and approximate information filtering in a distributed Peer-to-Peer (P2P) environment of Digital Libraries. The MinervaLight search system...

Web Search with Entities and Binary Relationships (2008)

Gjergji Kasneci, Gerhard Weikum

Searching with entities and relationships (e.g. “vertical search” for products, locations, persons, specific services, or scholarly relationships) is becoming one of the main goals within the...

der Universität des Saarlandes (2008)

Ralf Schenkel, Doktors Der Ingenieurwissenschaften, Fakultät I Prof, Dr. Rainer Schulze-pillot-ziemen, ...

Kurzfassung [Abstract] ...

Towards Self-Organizing Query Routing and Processing for Peer-to-Peer Web Search (2008)

Gerhard Weikum, Holger Bast, Geoffrey Canright, David Hales, Christian Schindelhauer, Peter Triantafillou

peer-to-peer systems, probabilistic and statistical methods, semantic overlay networks, self-organization, Web search. The peer-to-peer computing paradigm is an intriguing alternative to Google-style...

The MINERVA 1 Project: Towards Collaborative Search in Digital Libraries Using Peer-to-Peer Technology (2008)

Matthias Bender, Sebastian Michel, Christian Zimmer, Gerhard Weikum

Abstract. We consider the problem of collaborative search across a large number of digital libraries and query routing strategies in a peer-to-peer (P2P) environment. Both digital libraries and users...

Associate Editors (2008)

Kaushik Chakrabarti, Michael Ortega, Kriengkrai Porkaew, Sharad Mehrotra, Leejay Wu, Christos Faloutsos, ...

The Bulletin of the Technical Committee on Data Engineering is published quarterly and is distributed to all TC members. Its scope includes the design, implementation, modelling, theory and...

Peer-to-Peer Information Search: Semantic, Social, or Spiritual? (2008)

Matthias Bender, Tom Crecelius, Mouna Kacimi, Sebastian Michel, Josiane Xavier Parreira, Gerhard Weikum

We consider the network structure and query processing capabilities of social communities like bookmarks and photo sharing communities such as del.icio.us or flickr. A common feature of all these...

Associate Editors (2008)

Masaru Kitsuregawa, Betty Salzberg, Gonzalo Navarro, Ricardo Baeza-yates, Erkki Sutinen, Jorma Tarhio, ...

Integrating Diverse Information Management Systems: A Brief Survey ...

Systems ” (DELIS). (2008)

Josiane Xavier Parreira, Carlos Castillo, Debora Donato, Sebastian Michel, Gerhard Weikum, ...

Abstract. Link analysis on Web graphs and social networks forms the foundation for authority assessment, search result ranking, and other forms of Web and graph mining. The PageRank (PR) method is the...

Challenges of Distributed Search Across Digital Libraries (2008)

Matthias Bender, Sebastian Michel, Gerhard Weikum, Christian Zimmer

Abstract. We present the MINERVA 1 project that tackles the problem of collaborative search across a large number of digital libraries. The search engine is layered on top of a Chord-style...

P2P Directories for Distributed Web Search: From Each According to His Ability, to Each According to His Needs (2008)

Matthias Bender, Sebastian Michel, Gerhard Weikum

A compelling application of peer-to-peer (P2P) system technology would be distributed Web search, where each peer autonomously runs a search engine on a personalized local corpus (e.g., built from a...

A Reproducible Benchmark for P2P Retrieval (2008)

Thomas Neumann, Matthias Bender, Sebastian Michel, Gerhard Weikum

With the growing popularity of information retrieval (IR) in distributed systems and in particular P2P Web search, a huge number of protocols and prototypes have been introduced in the literature....

On the Usage of Global Document Occurrences in Peer-to-Peer Information Systems (2008)

Odysseas Papapetrou, Sebastian Michel, Matthias Bender, Gerhard Weikum

There exist a number of approaches for query processing in Peer-to-Peer information systems that efficiently retrieve relevant information from distributed peers. However, very few of them take into...

P2P Content Search: Give the Web Back to the People (2008)

Matthias Bender, Sebastian Michel, Peter Triantafillou, Gerhard Weikum, Christian Zimmer

The proliferation of peer-to-peer (P2P) systems has come with various compelling applications including file sharing based on distributed hash tables (DHTs) or other kinds of overlay networks....

TopX: Efficient and versatile top-k query processing for semistructured data (2008)

Martin Theobald, Holger Bast, Debapriyo Majumdar, Ralf Schenkel, Gerhard Weikum

Abstract. Recent IR extensions to XML query languages such as XPath 1.0 Full-Text or the NEXI query language of the INEX benchmark series reflect the emerging interest in IR-style ranked retrieval...

Dedication (2008)

For Topx, Osama Sammodi, Prof Dr, Gerhard Weikum, Dr. Ralf Schenkel, Prof Dr, ...

Hereby I confirm that this thesis is my own work and that I have documented all

TopX: Efficient and versatile top-k query processing for semistructured data (2008)

Martin Theobald, Ralf Schenkel, Gerhard Weikum

Abstract: This paper presents a comprehensive overview of the TopX search engine, an extensive framework for unified indexing and querying large collections of unstructured, semistructured, and...

08111 Report -- Ranked XML Querying (2008)

Amer-Yahia, Sihem, Hiemstra, Djoerd, Roelleke, Thomas, Srivastava, Divesh, Weikum, Gerhard

This paper is based on a five-day workshop on "Ranked XML Querying" that took place in Schloss Dagstuhl in Germany in March 2008 and was attended by 27 people from three different research...

08111 Abstracts Collection -- Ranked XML Querying (2008)

Amer-Yahia, Sihem, Srivastava, Divesh, Weikum, Gerhard

From 09.03. to 14.03.08, the Dagstuhl Seminar 08111 “Ranked XML Querying” was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several...

Exploiting Social Relations for Query Expansion and Result Ranking (2008)

Bender, Matthias, Crecelius, Tom, Kacimi, Mouna, Michel, Sebastian, Neumann, Thomas, Parreira, Josiane Xavier, ...

Online communities have recently become a popular tool for publishing and searching content, as well as for finding and connecting to other users that share common interests. The content is typically...

TopX @ INEX 2007 (2008)

Broschart, Andreas, Schenkel, Ralf, Theobald, Martin, Weikum, Gerhard

This paper describes the setup and results of the Max-Planck-Institut für Informatik's contributions for the INEX 2007 AdHoc Track task. The runs were produced with TopX, a search engine for...

Making SENSE: Socially Enhanced Search and Exploration (2008)

Crecelius, Tom, Kacimi, Mouna, Michel, Sebastian, Neumann, Thomas, Parreira, Josiane Xavier, Schenkel, Ralf, ...

Online communities like Flickr, del.icio.us and YouTube have established themselves as very popular and powerful services for publishing and searching contents, but also for identifying other users...

Mapping Roget's Thesaurus and WordNet to French (2008)

De Melo, Gerard, Weikum, Gerhard

Roget's Thesaurus and WordNet are very widely used lexical reference works. We describe an automatic mapping procedure that effectively produces French translations of the terms in these two...

Language as a Foundation of the Semantic Web (2008)

De Melo, Gerard, Weikum, Gerhard

This paper aims to show how language-related knowledge may serve as a fundamental building block for the Semantic Web. We present a system of URIs for terms, languages, scripts, and characters, which...

Fast logistic regression for text categorization with variable-length n-grams (2008)

Ifrim, Georgiana, Bakir, Goekhan, Weikum, Gerhard

A common representation used in text categorization is the bag of words model (aka. unigram model). Learning with this particular representation involves typically some preprocessing, e.g....

NAGA: Harvesting, Searching and Ranking Knowledge (2008)

Kasneci, Gjergji, Suchanek, Fabian, Ifrim, Georgiana, Elbassuoni, Shady, Ramanath, Maya, Weikum, Gerhard

The presence of encyclopedic Web sources, such as Wikipedia, the Internet Movie Database (IMDB), World Factbook, etc. calls for new querying techniques that are simple and yet more expressive than...

NAGA: Searching and Ranking Knowledge (2008)

Kasneci, Gjergji, Suchanek, Fabian, Ifrim, Georgiana, Ramanath, Maya, Weikum, Gerhard

The Web has the potential to become the world’s largest knowledge base. In order to unleash this potential, the wealth of information available on the Web needs to be extracted and organized. There...

Task-aware Search Personalization (2008)

Luxenburger, Julia, Elbassuoni, Shady, Weikum, Gerhard

Search personalization has been pursued in many ways, in order to provide better result rankings and better overall search experience to individual users. However, blindly applying personalization to...

Optimizing Distributed Top-k Queries (2008)

Neumann, Thomas, Bender, Matthias, Michel, Sebastian, Schenkel, Ralf, Triantafillou, Peter, Weikum, Gerhard

Top-k query processing is a fundamental building block for efficient ranking in a large number of applications. Efficiency is a central issue, especially for distributed settings, when the data is...

RDF-3X: a RISC-style Engine for RDF (2008)

Neumann, Thomas, Weikum, Gerhard

RDF is a data representation format for schema-free structured information that is gaining momentum in the context of Semantic-Web corpora, life sciences, and also Web 2.0 platforms. The...

Fine-Grained Relevance Feedback for XML Retrieval (Demo) (2008)

Pan, Hanglin, Schenkel, Ralf, Weikum, Gerhard

This demonstration presents an XML IR system that allows users to give feedback of different granularities and types, using Dempster-Shafer theory of evidence to compute expanded and reweighted...

The Juxtaposed approximate PageRank method for robust PageRank approximation in a peer-to-peer web search network (2008)

Parreira, Josiane Xavier, Castillo, Carlos, Donato, Debora, Michel, Sebastian, Weikum, Gerhard

We present Juxtaposed approximate PageRank (JXP), a distributed algorithm for computing PageRank-style authority scores of Web pages on a peer-to-peer (P2P) network. Unlike previous...

Efficient Top-k Querying over Social-Tagging Networks (2008)

Schenkel, Ralf, Crecelius, Tom, Kacimi, Mouna, Michel, Sebastian, Neumann, Thomas, Parreira, Josiane Xavier, ...

Online communities have become popular for publishing and searching content, as well as for finding and connecting to other users. User-generated content includes, for example, personal blogs,...

Yago - A Large Ontology from Wikipedia and WordNet (2008)

Suchanek, Fabian, Kasneci, Gjergji, Weikum, Gerhard

This article presents YAGO, a large ontology with high coverage and precision. YAGO has been automatically derived from Wikipedia and WordNet. It comprises entities and relations, and currently...

TopX: Efficient and Versatile Top-k Query Processing for Semistructured Data (2008)

Theobald, Martin, Bast, Holger, Majumdar, Debapriyo, Schenkel, Ralf, Weikum, Gerhard

Recent IR extensions to XML query languages such as XPath 1.0 Full-Text or the NEXI query language of the INEX benchmark series reflect the emerging interest in IR-style ranked retrieval over...

Efficiently Handling Dynamics in Distributed Link Based Authority Analysis (2008)

Xavier Parreira, Josiane, Michel, Sebastian, Weikum, Gerhard

Link-based authority analysis is an important tool for ranking resources in social networks and other graphs. Previous work has presented JXP, a decentralized algorithm for computing PageRank...

P2P Information Retrieval and Filtering with MAPS (Demo) (2008)

Zimmer, Christian, Heinz, Johannes, Tryfonopoulos, Christos, Weikum, Gerhard

In this demonstration paper we present MAPS, a novel system that combines approximate information retrieval and filtering functionality in a peer-to-peer setting. In MAPS, a user is able to submit...

A Machine Learning Approach to Building Aligned Wordnets (2008)

De Melo, Gerard, Weikum, Gerhard

WordNet is a lexical database describing English words and their senses. We propose a method for automatically producing similar resources for new languages by taking advantage of the original...

Social Wisdom for Search and Recommendation (2008)

Schenkel, Ralf, Crecelius, Tom, Kacimi, Mouna, Neumann, Thomas, Parreira, Josiane Xavier, Spaniol, Marc, ...

Social-tagging communities offer great potential for smart recommendation and “socially enhanced” search result ranking. Beyond traditional forms of collaborative recommendation that are based on...

Good Guys vs. Bad Guys: Countering Cheating in Peer-to-Peer Authority Computations over Social Networks (2008)

Sozio, Mauro, Crecelius, Tom, Xavier Parreira, Josiane, Weikum, Gerhard

Eigenvector computations are an important building block for computing authority, trust, and reputation scores in social networks and other graphs. In peer-to-peer networks or other forms of...

TOB: Timely Ontologies for Business Relations (2008)

Zhang, Qi, Suchanek, Fabian, Weikum, Gerhard

In this paper we present a suite of methods for extracting temporal relations from semi-structured and textual Web sources. We particularly address the needs for building and maintaining business...

Associate Editors (2007)

Gerhard Weikum, Arnd Christian, Achim Kraiss, Markus Sinnwell, Gary Valentin, Eric Christensen, ...

is published quarterly and is distributed to all TC members. Its scope includes the design, implementation, modelling, theory and application of database systems and their technology. Letters,...

Associate Editors (2007)

What Neighbours, Think Computing, Reputations Alberto, O. Mendelzon, Davood Rafiei, Andrew Mccallum, ...

The Bulletin of the Technical Committee on Data Engineering is published quarterly and is distributed to all TC members. Its scope includes the design, implementation, modelling, theory and...

XML-enabled Workflow Management for E-Services across Heterogeneous Platforms (2007)

German Shegalov, Michael Gillmann, Gerhard Weikum

Advanced e-services require efficient, flexible, and easy-to-use workflow technology that integrates well with mainstream Internet technologies like XML and Web servers. This paper discusses an...

Associate Editors (2007)

K. Bharat, A. Broder, J. Dean, M. R. Henzinger, Automatically Extracting, Structure Free, ...

The Bulletin of the Technical Committee on Data Engineering is published quarterly and is distributed to all TC members. Its scope includes the design, implementation, modelling, theory and...

Quality of Service Guarantees for Multimedia Digital Libraries and Beyond (2007)

Gerhard Weikum

Servers for multimedia digital libraries have to manage huge amounts of data and pose challenging performance requirements. Most notably, for smooth playback of video and audio on client machines the...

Taming the Tiger: How to Cope with Real Database Products in Transactional Federations for Internet Applications. GI-Workshop InternetDatenbanken 2000 (2007)

Ralf Schenkel, Gerhard Weikum

Data consistency in transactional federations is a key requirement of advanced E-service applications on the Internet, such as electronic auctions or real-estate purchase. Federated concurrency...

Data Engineering (2007)

June Vol No, Adaptive Query, Processing Technology, Evolution Joseph, M. Hellerstein, Michael J. Franklin, ...

As query engines are scaled and federated, they must cope with highly unpredictable and changeable environments. In the Telegraph project, we are attempting to architect and implement a continuously...

Bulletin of the Technical Committee on (2007)

June Vol No, Sirish Ch, Amol Deshp, Kris Hildrum, Sam Madden, Vijayshankar Raman, ...

As query engines are scaled and federated, they must cope with highly unpredictable and changeable environments. In the Telegraph project, we are attempting to architect and implement a continuously...

Performance Assessment and Configuration of Enterprise-Wide Workflow Management Systems (2007)

Michael Gillmann, Jeanine Weissenfels, Gerhard Weikum, Achim Kraiss, Dresdner Bank AG

In this paper, we present an analytic approach that considers the performance as well as the availability of the WFMS in its assessment of the quality of a given configuration of a distributed WFMS. The...

Mentor-lite Customizability: Tailoring a Light-Weight Workflow Management System to Workflow Application and Organizational Needs (2007)

Michael Gillmann Jeanine, Michael Gillmann, Jeanine Weissenfels, German Shegalov, Wolfgang Wonner, Gerhard Weikum

The Mentor-lite prototype has been developed within the research project "Architecture, Configuration, and Administration of Large Workflow Management Systems" funded by the German Science...

Data Engineering (2007)

June Vol No, Letter Special, Gerhard Weikum, Arnd Christian, Achim Kraiss, Markus Sinnwell, ...

Although today's computers provide huge amounts of main memory, the ever-increasing load of large data servers, imposed by resource-intensive decision-support queries and accesses to multimedia...

Cost/Performance Control in SNOWBALL Distributed File Manager (2007)

Radek Vingralek, Yuri Breitbart, Gerhard Weikum

Networks of workstations are an emerging architectural paradigm for high-performance parallel and distributed systems. Exploiting networks of workstations for massive data management poses exciting...

0306-4379(94)00020-4 Printed in Great Britain. All rights reserved (2007)

Gerhard Weikum, Christof Hasse, Axel Mönkeberg, Peter Zabback

Abstract-- This paper reports on results and experiences from the COMFORT automatic tuning project. The objective of the project has been to investigate architectural principles of self-tuning...

Heuristic Optimization of Speedup and Benefit/Cost for Parallel Database Scans on Shared-Memory Multiprocessors (2007)

Michael Rys, Gerhard Weikum

Previous work on parallel database systems has paid little attention to the interaction of asynchronous disk prefetching and processor parallelism. This paper investigates this issue for scan...

Networks of workstations, also known as NOWs [ACP*95], (2007)

Markus Sinnwell, Gerhard Weikum

The paper presents a method for distributed caching to exploit the aggregate memory of networks of workstations in data-intensive applications. In contrast to prior work, the approach is based on a...

Adding Relevance to XML (2007)

Anja Theobald, Gerhard Weikum

XML query languages proposed so far are limited to Boolean retrieval in the sense that query results are sets of qualifying XML elements or subgraphs. This search paradigm is intriguing for...

Announcements and Notices (2007)

Kaushik Chakrabarti, Michael Ortega, Kriengkrai Porkaew, Sharad Mehrotra, Leejay Wu, Christos Faloutsos, ...

TCDE Election Notice and Position Statement.................................................... 50 TCDE Election Ballot.................................................................... back cover...

Auto-Tuned Spline Synopses for Database Statistics Management (2007)

Gerhard Weikum

Data distribution statistics are vital for database systems and other data-mining platforms in order to predict the running time of complex queries for data filtering and extraction. State-of-the-art...

Associate Editors (2007)

Masaru Kitsuregawa, Betty Salzberg, Mary Fern, Atsuyuki Morishima, Dan Suciu, Wang-chiew Tan, ...

The Bulletin of the Technical Committee on Data Engineering is published quarterly and is distributed to all TC members. Its scope includes the design, implementation, modelling, theory and...

Conference and Journal Notices (2007)

Letter Special, Gerhard Weikum, Arnd Christian, Achim Kraiss, Markus Sinnwell, Surajit Chaudhuri, ...

Letter from the Editor-in-Chief ICDE'2000 As many of you perhaps are aware, the deadline for the ICDE'2000 conference, which is the flagship conference of our technical committee, was June...

EntityAuthority: Semantically enriched graph-based authority propagation (2007)

Julia Stoyanovich, Srikanta Bedathur, Klaus Berberich, Gerhard Weikum

This paper pursues the recently emerging paradigm of searching for entities that are embedded in Web pages. We utilize information-extraction techniques to identify entity candidates in documents, map...

FluxCapacitor: Efficient Time-Travel Text Search (2007)

Klaus Berberich, Srikanta Bedathur, Thomas Neumann, Gerhard Weikum

An increasing number of temporally versioned text collections is available today with Web archives being a prime example. Search on such collections, however, is often not satisfactory and ignores...

MinervaDL: An Architecture for Information Retrieval and Filtering (2007)

Christian Zimmer, Christos Tryfonopoulos, Gerhard Weikum

Abstract. We present MinervaDL, a digital library architecture that supports approximate information retrieval and filtering functionality under a single unifying framework. The architecture of...

Architectural Alternatives for Information Filtering in Structured Overlay Networks (2007)

Christos Tryfonopoulos, Christian Zimmer, Manolis Koubarakis, Gerhard Weikum

In this work we discuss how to provide information filtering (pub/sub) functionality over peer-to-peer structured overlay networks by presenting two approaches we developed. Both approaches utilize...

Declaration of Consent (2007)

Prof Dr, Gerhard Weikum, Ulrich Bügel, Fraunhofer Iitb, Prof Dr, Gerhard Weikum

Hereby I confirm that this thesis is my own work and that I have documented all sources used.

Second reviewer (2007)

Silvana Solomon, Prof Dr. -ing, Gerhard Weikum, Dr. Ralf Schenkel, Prof Dr, Christoph Koch

Information retrieval and feedback in XML are rather new fields for researchers; natural questions arise, such as: how good are the feedback algorithms in XML IR? Can they be evaluated with...

Comparing Apples and Oranges: Normalized PageRank for Evolving Graphs WWW (2007)

Klaus Berberich, Srikanta Bedathur, Gerhard Weikum, Michalis Vazirgiannis

PageRank is the best known technique for link-based importance ranking. The computed importance scores, however, are not directly comparable across different snapshots of an evolving graph. We...

Authors’ Addresses (2007)

Gjergji Kasneci, Fabian M, Georgiana Ifrim, Gerhard Weikum, Gjergji Kasneci, Fabian M. Suchanek, ...

The Web has the potential to become the world’s largest knowledge base. In order to unleash this potential, the wealth of information available on the web needs to be extracted and organized. There...

Architectural Alternatives for Information Filtering in Structured Overlay Networks (2007)

Christian Zimmer, Gerhard Weikum, Manolis Koubarakis

Today’s content providers are naturally distributed and produce large amounts of new information every day. Peer-to-peer information filtering is a promising approach that offers scalability,...

Yago: A Large Ontology from Wikipedia and WordNet (2007)

Fabian M. Suchanek, Gjergji Kasneci, Gerhard Weikum

This article presents YAGO, a large ontology with high coverage and precision. YAGO has been automatically derived from Wikipedia and WordNet. It comprises entities and relations, and currently...

Advisors (2007)

Tim Benke, Prof Dr. -ing, Gerhard Weikum, Josiane Xavier Parreira, Sebastian Michel, Prof Dr. -ing, ...

Hereby I confirm that this thesis is my own work and that I have documented all sources used.

A Pocket Guide to Web History (2007)

Klaus Berberich, Srikanta Bedathur, Gerhard Weikum

Abstract. Web archives like the Internet Archive preserve the evolutionary history of large portions of the Web. Access to them, however, is still via rather limited interfaces – a search...

Authors’ Addresses (2007)

Klaus Berberich, Srikanta Bedathur, Thomas Neumann, Gerhard Weikum, Klaus Berberich, Srikanta Bedathur, ...

Text search over temporally versioned document collections such as web archives has received little attention as a research problem. As a consequence, there is no scalable and principled solution to...

Exploiting Community Behavior for Enhanced Link Analysis and Web Search (2007)

Luxenburger, Julia, Weikum, Gerhard

Methods for Web link analysis and authority ranking such as PageRank are based on the assumption that a user endorses a Web page when creating a hyperlink to this page. There is a wealth of...

Efficient Search and Approximate Information Filtering in a Distributed Peer-to-Peer Environment of Digital Libraries (2007)

Zimmer, Christian, Tryfonopoulos, Christos, Weikum, Gerhard

We present a new architecture for efficient search and approximate information filtering in a distributed Peer-to-Peer (P2P) environment of Digital Libraries. The MinervaLight search system...

TopX @ INEX 2007 (2007)

Broschart, Andreas, Schenkel, Ralf, Theobald, Martin, Weikum, Gerhard

This paper describes the setup and results of the Max-Planck-Institut für Informatik's contributions for the INEX 2007 AdHoc Track task. The runs were produced with TopX, a search engine for...

Harvesting and Organizing Knowledge from the Web (2007)

Weikum, Gerhard

Information organization and search on the Web is gaining structure and context awareness and more semantic flavor, for example, in the forms of faceted search, vertical search, entity search, and...

Adaptive Personalization of Web Search (2007)

Elbassuoni, Shady, Luxenburger, Julia, Weikum, Gerhard

In this paper we present a client-side approach towards personalization of web search which adapts the means of personalization to the user need in place. We differentiate three different search...

MinervaDL: An Architecture for Information Retrieval and Filtering in Distributed Digital Libraries (2007)

Zimmer, Christian, Tryfonopoulos, Christos, Weikum, Gerhard

We present MinervaDL, a digital library architecture that supports approximate information retrieval and filtering functionality under a single unifying framework. The architecture of MinervaDL...

Design Alternatives for Large-Scale Web Search: Alexander was Great, Aeneas a Pioneer, and Anakin has the Force (2007)

Bender, Matthias, Michel, Sebastian, Triantafillou, Peter, Weikum, Gerhard

Indexing the Web and meeting the throughput, response-time, and failure-resilience requirements of a search engine requires massive storage and computational resources and a careful system design for...

Report on the Second International Workshop on Self-Managing Database Systems (SMDB 2007) (2007)

Ailamaki, Anastassia, Chaudhuri, Surajit, Lightstone, Sam, Lohman, Guy M., Martin, Patrick, Salem, Kenneth, ...

Information management systems are growing rapidly in scale and complexity, while skilled database administrators are becoming rarer and more expensive. Increasingly, the total cost of ownership of...

P2P Authority Analysis for Social Communities (2007)

Parreira, Josiane Xavier, Michel, Sebastian, Bender, Matthias, Crecelius, Tom, Weikum, Gerhard

PageRank-style authority analyses of Web graphs are of great importance for Web mining. Such authority analyses also apply to hot "Web 2.0" applications that exhibit a natural graph structure,...

Efficient Text Proximity Search (2007)

Schenkel, Ralf, Broschart, Andreas, Hwang, Seungwon, Theobald, Martin, Weikum, Gerhard

In addition to purely occurrence-based relevance models, term proximity has been frequently used to enhance retrieval quality of keyword-oriented retrieval systems. While there have been approaches...

A Pocket Guide to Web History (2007)

Berberich, Klaus, Bedathur, Srikanta, Weikum, Gerhard

Web archives like the Internet Archive preserve the evolutionary history of large portions of the Web. Access to them, however, is still via rather limited interfaces – a search functionality...

EntityAuthority: Semantically Enriched Graph-Based Authority Propagation (2007)

Stoyanovich, Julia, Bedathur, Srikanta, Berberich, Klaus, Weikum, Gerhard

This paper pursues the recently emerging paradigm of searching for entities that are embedded in Web pages. We utilize information extraction techniques to identify entity candidates in documents,...

A Time Machine for Text Search (2007)

Berberich, Klaus, Bedathur, Srikanta, Neumann, Thomas, Weikum, Gerhard

Text search over temporally versioned document collections such as web archives has received little attention as a research problem. As a consequence, there is no scalable and principled solution to...

NAGA: Searching and Ranking Knowledge (2007)

Kasneci, Gjergji, Suchanek, Fabian M., Ifrim, Georgiana, Ramanath, Maya, Weikum, Gerhard

The Web has the potential to become the world's largest knowledge base. In order to unleash this potential, the wealth of information available on the web needs to be extracted and organized. There...

TopX - Adhoc Track and Feedback Task (2007)

Theobald, Martin, Broschart, Andreas, Schenkel, Ralf, Solomon, Silvana, Weikum, Gerhard

This paper describes the setup and results of the Max-Planck-Institut für Informatik’s contributions for the INEX 2006 AdHoc Track and Feedback task. The runs were produced with the TopX...

TopX - Efficient and Versatile Top-k Query Processing for Text, Semistructured, and Structured Data (2007)

Theobald, Martin, Schenkel, Ralf, Weikum, Gerhard

This paper presents a comprehensive overview of the TopX search engine, an extensive framework for unified indexing and querying large collections of unstructured, semistructured, and structured...

Architectural Alternatives for Information Filtering in Structured Overlay Networks (2007)

Tryfonopoulos, Christos, Zimmer, Christian, Koubarakis, Manolis, Weikum, Gerhard

Today's content providers are naturally distributed and produce large amounts of new information every day. Peer-to-peer information filtering is a promising approach that offers scalability,...

A User-interaction model for The European Library Portal (2007)

Luxenburger, Julia, Van Der Meulen, Eric, Weikum, Gerhard

Users are a far too often neglected variable in the design of information-seeking systems. This also holds for digital libraries. In this paper, we study navigational patterns of users within...

Comparing Apples and Oranges: Normalized PageRank for Evolving Graphs (2007)

Berberich, Klaus, Bedathur, Srikanta, Vazirgiannis, Michalis, Weikum, Gerhard

PageRank is the best known technique for link-based importance ranking. The computed importance scores, however, are not directly comparable across different snapshots of an evolving graph. We...

STAR: A System for Tuple and Attribute Ranking of Query Answers (Demo) (2007)

Kapoor, Nishant, Das, Gautam, Hristidis, Vagelis, Sudarshan, S., Weikum, Gerhard

In recent years there has been a great deal of interest in developing effective techniques for ad-hoc search and retrieval in structured repositories such as relational databases - e.g., searching...

Efficient Time-Travel on Versioned Text Collections (2007)

Berberich, Klaus, Bedathur, Srikanta, Weikum, Gerhard

The availability of versioned text collections such as the Internet Archive opens up opportunities for time-aware exploration of their contents. In this paper, we propose time-travel retrieval...

The TopX DB&IR engine (2007)

Theobald, Martin, Schenkel, Ralf, Weikum, Gerhard

This paper proposes a demo of the TopX search engine, an extensive framework for unified indexing, querying, and ranking of large collections of unstructured, semistructured, and structured data....

p2pDating: Real Life Inspired Semantic Overlay Networks for Web Search (2007)

Parreira, Josiane Xavier, Michel, Sebastian, Weikum, Gerhard

We consider a network of autonomous peers forming a logically global but physically distributed search engine, where every peer has its own local collection generated by independently crawling the...

Yago: A Core of Semantic Knowledge - Unifying WordNet and Wikipedia (2007)

Suchanek, Fabian M., Kasneci, Gjergji, Weikum, Gerhard

We present YAGO, a light-weight and extensible ontology with high coverage and quality. YAGO builds on entities and relations and currently contains roughly 900,000 entities and 5,000,000 facts....

A Comparative Study of Pub/Sub Methods in Structured P2P Networks (2007)

Bender, Matthias, Michel, Sebastian, Parkitny, Sebastian, Weikum, Gerhard

Methods for publish/subscribe applications over P2P networks have been a research issue for a long time. Many approaches have been developed and evaluated, but typically each based on different...

Database Selection and Result Merging in P2P Web Search (2007)

Chernov, Sergey, Serdyukov, Pavel, Bender, Matthias, Michel, Sebastian, Weikum, Gerhard, Zimmer, Christian

Intelligent Web search engines are extremely popular now. Currently, only the commercial centralized search engines like Google can process terabytes of Web data. Alternative search engines...

Peer-to-Peer Information Search: Semantic, Social, or Spiritual? (2007)

Bender, Matthias, Crecelius, Tom, Kacimi, Mouna, Michel, Sebastian, Parreira, Josiane Xavier, Weikum, Gerhard

We consider the network structure and query processing capabilities of social communities like bookmarks and photo sharing communities such as del.icio.us or flickr. A common feature of all these...

FluxCapacitor: Efficient Time-Travel Text Search (2007)

Berberich, Klaus, Bedathur, Srikanta, Neumann, Thomas, Weikum, Gerhard

An increasing number of temporally versioned text collections is available today with Web archives being a prime example. Search on such collections, however, is often not satisfactory and ignores...

Computing Trusted Authority Scores in Peer-to-Peer Web Search Networks (2007)

Parreira, Josiane Xavier, Donato, Debora, Castillo, Carlos, Weikum, Gerhard

Peer-to-peer (P2P) networks have received great attention for sharing and searching information in large user communities. The open and anonymous nature of P2P networks is one of its main...

Probabilistic information retrieval approach for ranking of database query results (2006)

Surajit Chaudhuri, Gerhard Weikum

We investigate the problem of ranking the answers to a database query when many tuples are returned. In particular, we present methodologies to tackle the problem for conjunctive and range queries,...

Io-top-k: Index-access optimized top-k query processing (2006)

Holger Bast, Debapriyo Majumdar, Ralf Schenkel, Martin Theobald, Gerhard Weikum

Top-k query processing is an important building block for ranked retrieval, with applications ranging from text and data integration to distributed aggregation of network logs and sensor data. Top-k...

Efficient and decentralized pagerank approximation in a peer-to-peer web search network (2006)

Josiane Xavier Parreira, Debora Donato, Sebastian Michel, Gerhard Weikum

PageRank-style (PR) link analyses are a cornerstone of Web search engines and Web mining, but they are computationally expensive. Recently, various techniques have been proposed for speeding up these...

Transductive learning for text classification using explicit knowledge models (2006)

Georgiana Ifrim, Gerhard Weikum

Abstract. We present a generative model based approach for transductive learning for text classification. Our approach combines three methodological ingredients: learning from background corpora,...

Discovering and exploiting keyword and attribute-value co-occurrences to improve p2p routing indices (2006)

Sebastian Michel, Matthias Bender, Nikos Ntarmos, Peter Triantafillou, Gerhard Weikum, Christian Zimmer

Peer-to-Peer (P2P) search requires intelligent decisions for query routing: selecting the best peers to which a given query, initiated at some peer, should be forwarded for retrieving additional...

Global document frequency estimation in peer-to-peer web search (2006)

Matthias Bender, Sebastian Michel, Peter Triantafillou, Gerhard Weikum

Information retrieval (IR) in peer-to-peer (P2P) networks, where the corpus is spread across many loosely coupled peers, has recently gained importance. In contrast to IR systems on a centralized...

IQN routing: Integrating quality and novelty in p2p querying and ranking (2006)

Sebastian Michel, Matthias Bender, Peter Triantafillou, Gerhard Weikum

Abstract. We consider a collaboration of peers autonomously crawling the Web. A pivotal issue when designing a peer-to-peer (P2P) Web search engine in this environment is query routing: selecting a...

Acknowledgements (2006)

Fabian M. Suchanek, Georgiana Ifrim, Gerhard Weikum, Fabian M. Suchanek, Georgiana Ifrim, Gerhard Weikum, ...

We would like to thank Eugene Agichtein for his caring support with Snowball.

Exploiting community behavior for enhanced link analysis and web search (2006)

Julia Luxenburger, Gerhard Weikum

Abstract. Methods for Web link analysis and authority ranking such as PageRank are based on the assumption that a user endorses a Web page when creating a hyperlink to this page. There is a wealth of...

Authors’ Addresses (2006)

Matthias Bender, Sebastian Michel, Peter Triantafillou Gerhard, Matthias Bender, Sebastian Michel, Gerhard Weikum, ...

Peer-to-Peer (P2P) search engines and other forms of distributed information retrieval (IR) are gaining momentum. Unlike in centralized IR, it is difficult and expensive to compute statistical...

IO-Top-k: Index-Access Optimized Top-k Query Processing (2006)

Bast, Holger, Majumdar, Debapriyo, Schenkel, Ralf, Theobald, Martin, Weikum, Gerhard, Dayal, Umeshwar, ...

Top-k query processing is an important building block for ranked retrieval, with applications ranging from text and data integration to distributed aggregation of network logs and sensor data....

06121 Executive Summary -- Atomicity: A Unifying Concept in Computer Science (2006)

Weikum, Gerhard, Jones, Clifford B., Lomet, David, Romanovsky, Alexander

This seminar was based on and continued the interaction of different computer-science communities that was begun in an earlier Dagstuhl seminar in April 2004. Both seminars have aimed at a deeper...

06121 Abstracts Collection -- Atomicity: A Unifying Concept in Computer Science (2006)

Weikum, Gerhard, Jones, Clifford B., Lomet, David, Romanovsky, Alexander

From 19.03.06 to 24.03.06, the Dagstuhl Seminar 06121 "Atomicity: A Unifying Concept in Computer Science" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl....

Database Selection and Result Merging in P2P Web Search (2006)

Chernov, Sergey, Serdyukov, Pavel, Bender, Matthias, Michel, Sebastian, Weikum, Gerhard, Zimmer, Christian, ...

Intelligent Web search engines are extremely popular now. Currently, only the commercial centralized search engines like Google can process terabytes of Web data. Alternative search engines...

Personalized Query Routing in Peer-to-Peer Federations of Digital Libraries (2006)

Bender, Matthias, Ioannidis, Yannis, Kossmann, Donald, Nottelmann, Henrik, Schek, Hans-Jörg, Weikum, Gerhard, ...

This task explores routing of various types of queries (SQL, XQuery, etc.) over a P2P network where, apart from DLs, user agents with powerful personalized tools may participate as peers as well. It...

"To Infinity and Beyond": P2P Web Search with Minerva and Minerva∞ (2006)

Bender, Matthias, Michel, Sebastian, Triantafillou, Peter, Weikum, Gerhard, Zimmer, Christian, Baldoni, Roberto, ...

Peer-to-peer (P2P) computing is an intriguing paradigm for Web search for several reasons: 1) the computational resources of a huge computer network can facilitate richer mathematical and linguistic...

LEILA: Learning to Extract Information by Linguistic Analysis (2006)

Suchanek, Fabian M., Ifrim, Georgiana, Weikum, Gerhard, Buitelaar, Paul, Cimiano, Philipp, Loos, Berenike

One of the challenging tasks in the context of the Semantic Web is to automatically extract instances of binary relations from Web documents - for example all pairs of a person and the corresponding...

Counting at Large: Efficient Cardinality Estimation in Internet-Scale Data Networks (2006)

Ntarmos, Nikos, Triantafillou, Peter, Weikum, Gerhard, Liu, Ling, Reuter, Andreas, Whang, Kyu-Young, ...

Counting in general, and estimating the cardinality of (multi-) sets in particular, is highly desirable for a large variety of applications, representing a foundational block for the efficient...

Discovering and Exploiting Keyword and Attribute-Value Co-occurrences to Improve P2P Routing Indices (2006)

Michel, Sebastian, Bender, Matthias, Ntarmos, Nikos, Triantafillou, Peter, Weikum, Gerhard, Zimmer, Christian, ...

Peer-to-Peer (P2P) search requires intelligent decisions for query routing: selecting the best peers to which a given query, initiated at some peer, should be forwarded for retrieving...

Graph-based Text Classification: Learn from Your Neighbors (2006)

Angelova, Ralitsa, Weikum, Gerhard, Efthimiadis, Efthimis N., Dumais, Susan T., Hawking, David, Järvelin, Kalervo

Automatic classification of data items, based on training samples, can be boosted by considering the neighborhood of data items in a graph structure (e.g., neighboring documents in a hyperlink...

Unstoppable Stateful PHP Web Services (2006)

Shegalov, German, Weikum, Gerhard, Berberich, Klaus, Aberer, Karl, Peng, Zhiyong, Rundensteiner, Elke A., ...

This paper presents the architecture and implementation of the EOS2 failure-masking framework for composite Web Services. EOS2 is based on the recently proposed notion of interaction contracts (IC),...

Transductive Learning for Text Classification using Explicit Knowledge Models (2006)

Ifrim, Georgiana, Weikum, Gerhard, Fürnkranz, Johannes, Scheffer, Tobias, Spiliopoulou, Myra

We present a generative model based approach for transductive learning for text classification. Our approach combines three methodological ingredients: learning from background corpora, latent...

IO-Top-k: Index-access Optimized Top-k Query Processing (2006)

Bast, Holger, Majumdar, Debapriyo, Schenkel, Ralf, Theobald, Martin, Weikum, Gerhard, Dayal, Umeshwar, ...

Top-k query processing is an important building block for ranked retrieval, with applications ranging from text and data integration to distributed aggregation of network logs and sensor data....

A Reproducible Benchmark for P2P Retrieval (2006)

Neumann, Thomas, Bender, Matthias, Michel, Sebastian, Weikum, Gerhard, Bonnet, Philippe, Manolescu, Ioana

With the growing popularity of information retrieval (IR) in distributed systems and in particular P2P Web search, a huge number of protocols and prototypes have been introduced in the literature....

IQN Routing: Integrating Quality and Novelty in P2P Querying and Ranking (2006)

Michel, Sebastian, Bender, Matthias, Triantafillou, Peter, Weikum, Gerhard, Ioannidis, Yannis, Scholl, Marc H., ...

We consider a collaboration of peers autonomously crawling the Web. A pivotal issue when designing a peer-to-peer (P2P) Web search engine in this environment is query routing: selecting a...

Global Document Frequency Estimation in Peer-to-Peer Web Search (2006)

Bender, Matthias, Michel, Sebastian, Triantafillou, Peter, Weikum, Gerhard, Zhou, Dayou

Information retrieval (IR) in peer-to-peer (P2P) networks, where the corpus is spread across many loosely coupled peers, has recently gained importance. In contrast to IR systems on a centralized...

Exploiting Community Behavior for Enhanced Link Analysis and Web Search (2006)

Luxenburger, Julia, Weikum, Gerhard, Zhou, Dayou

Methods for Web link analysis and authority ranking such as PageRank are based on the assumption that a user endorses a Web page when creating a hyperlink to this page. There is a wealth of...

P2P Directories for Distributed Web Search: From Each According to His Ability, to Each According to His Needs (2006)

Bender, Matthias, Michel, Sebastian, Weikum, Gerhard, Barga, Roger S., Zhou, Xiaofang

A compelling application of peer-to-peer (P2P) system technology would be distributed Web search, where each peer autonomously runs a search engine on a personalized local corpus (e.g., built from a...

IO-Top-k at TREC 2006: Terabyte Track (2006)

Bast, Holger, Majumdar, Debapriyo, Schenkel, Ralf, Theobald, Martin, Weikum, Gerhard, Voorhees, Ellen M., ...

This paper describes the setup and results of our contribution to the TREC 2006 Terabyte Track. Our implementation was based on the algorithms proposed in [IO-Top-k: Index-Access Optimized Top-K...

Combining Linguistic and Statistical Analysis to Extract Relations from Web Documents (2006)

Suchanek, Fabian M., Ifrim, Georgiana, Weikum, Gerhard, Eliassi-Rad, Tina, Ungar, Lyle, Craven, Mark, ...

The World Wide Web provides a nearly endless source of knowledge, which is mostly given in natural language. A first step towards exploiting this data automatically could be to extract...

EOS²: Unstoppable Stateful PHP (demo) (2006)

Shegalov, German, Weikum, Gerhard, Dayal, Umeshwar, Whang, Kyu-Young, Lomet, David B., Alonso, Gustavo, ...

This paper presents the architecture and implementation of the EOS2 failure-masking framework for composite Web Services. EOS2 is based on the recently proposed notion of interaction contracts (IC),...

Foundations of Automated Database Tuning (Tutorial) (2006)

Chaudhuri, Surajit, Weikum, Gerhard, Liu, Ling, Reuter, Andreas, Whang, Kyu-Young, Zhang, Jianjun

Our society is more dependent on information systems than ever before. However, managing the information systems infrastructure in a cost-effective manner is a growing challenge. The total cost of...

Web Search Clickstreams (2006)

Kammenhuber, Nils, Luxenburger, Julia, Feldmann, Anja, Weikum, Gerhard, Almeida, Jussara M., Almeida, Virgílio A. F., ...

Search engines are a vital part of the Web and thus the Internet infrastructure. Therefore understanding the behavior of users searching the Web gives insights into trends, and enables enhancements...

P2P Content Search: Give the Web Back to the People (2006)

Bender, Matthias, Michel, Sebastian, Triantafillou, Peter, Weikum, Gerhard, Zimmer, Christian

The solution that we have developed and advocate in this paper is based on the postulation that we cited above as a motivation for the general direction of P2P search engines: give the Web back...

MAPS: Approximate Publish/Subscribe Functionality in Peer-to-Peer Networks (2006)

Berberich, Klaus, Koubarakis, Manolis, Tryfonopoulos, Christos, Weikum, Gerhard, Zimmer, Christian

Information filtering has been a research issue for years. In an information filtering scenario users' information needs are expressed by user subscriptions, and users are notified about published...

The Database Research Group at the Max-Planck Institute for Informatics (2006)

Weikum, Gerhard

The Max-Planck Institute for Informatics (MPI-INF) is one of 80 institutes of the Max-Planck Society, Germany's premier scientific organization for foundational research with numerous Nobel prizes in...

Efficient peer-to-peer semantic overlay networks based on statistical language models (2006)

Linari, Alessandro, Weikum, Gerhard

In this paper we address the query routing problem in peer-to-peer (P2P) information retrieval. Our system builds upon the idea of a Semantic Overlay Network (SON), in which each peer...

TopX & XXL at INEX 2005 (Ad-Hoc Track) (2006)

Theobald, Martin, Schenkel, Ralf, Weikum, Gerhard, Fuhr, Norbert, Lalmas, Mounia, Malik, Saadia, ...

We participated with two different and independent search engines in this year's INEX round: The XXL Search Engine and the TopX engine. As this is the first participation for TopX, this paper focuses...

TopX - AdHoc Track and Feedback Task (2006)

Theobald, Martin, Broschart, Andreas, Schenkel, Ralf, Solomon, Silvana, Weikum, Gerhard, Fuhr, Norbert, ...

This paper describes the setup and results of our contributions to the INEX 2006 AdHoc and Feedback tasks.

BuzzRank ... and the Trend is Your Friend (2006)

Berberich, Klaus, Bedathur, Srikanta J., Vazirgiannis, Michalis, Weikum, Gerhard

Ranking methods like PageRank assess the importance of Web pages based on the current state of the rapidly evolving Web graph. The dynamics of the resulting importance scores, however, have not been...

Efficient and Decentralized PageRank Approximation in a Peer-to-Peer Web Search Network (2006)

Parreira, Josiane Xavier, Donato, Debora, Michel, Sebastian, Weikum, Gerhard, Dayal, Umeshwar, Whang, Kyu-Young, ...

PageRank-style (PR) link analyses are a cornerstone of Web search engines and Web mining, but they are computationally expensive. Recently, various techniques have been proposed for speeding up these...

Time-Aware Authority Ranking (2006)

Berberich, Klaus, Vazirgiannis, Michalis, Weikum, Gerhard

The link structure of the web is analyzed to measure the authority of pages, which can be taken into account for ranking query results. Due to the enormous dynamics of the web, with millions of pages...

Probabilistic information retrieval approach for ranking of database query results (2006)

Chaudhuri, Surajit, Das, Gautam, Hristidis, Vagelis, Weikum, Gerhard

We investigate the problem of ranking the answers to a database query when many tuples are returned. In particular, we present methodologies to tackle the problem for conjunctive and range queries,...

Data partitioning and load balancing in parallel disk systems (2005)

Scheuermann, Peter, Weikum, Gerhard, Zabback, Peter

Parallel disk systems provide opportunities for exploiting I/O parallelism in two possible ways, namely via inter-request and intra-request parallelism. In this paper we discuss the main issues in...

Time-Aware Authority Rankings (2005)

Berberich, Klaus, Vazirgiannis, Michalis, Weikum, Gerhard

The link structure of the web is analyzed to measure the authority of pages, which can be taken into account for ranking query results. Due to the enormous dynamics of the web, with millions of pages...

Integrating DB and IR technologies: What is the sound of one hand clapping (2005)

Surajit Chaudhuri, Raghu Ramakrishnan, Gerhard Weikum

Databases (DB) and information retrieval (IR) have evolved as separate fields. However, modern applications such as customer support, health care, and digital libraries require capabilities for both...

MINERVA: Collaborative P2P Search (2005)

Matthias Bender, Sebastian Michel, Peter Triantafillou, Gerhard Weikum, Christian Zimmer

This paper proposes the live demonstration of a prototype of MINERVA, a novel P2P Web search engine. The search engine is layered on top of a DHT-based overlay network that connects an a-priori...

Improving Collection Selection with Overlap Awareness in P2P Search Engines (2005)

Matthias Bender, Sebastian Michel, Peter Triantafillou, Gerhard Weikum, Christian Zimmer

Collection selection has been a research issue for years. Typically, in related work, precomputed statistics are employed in order to estimate the expected result quality of each collection, and...

Ontological Reasoning for Natural Language Understanding (2005)

Fabian M. Suchanek (supervisors: Prof. Dr. Gerhard Weikum, Dr. habil. Peter Baumgartner)

This thesis presents OntoNat, a prototypical system for answering Yes/No-questions on natural language sentences. Different from existing systems, OntoNat uses background knowledge from the...

Automatic Generation of Thematically Focused Information Portals from Web Data (2005)

Datenbanken Informationssysteme, Fachrichtung Informatik, Prof. Dr.-Ing. Gerhard Weikum, Sergej Sizov, ...

Finding the desired information on the Web is often a hard and time-consuming task. This thesis presents a methodology for the automatic generation of thematically focused portals from Web data. The...

TopX & XXL at INEX 2005 (2005)

Martin Theobald, Ralf Schenkel, Gerhard Weikum

Abstract. We participated with two different and independent search engines in this year’s INEX round: The XXL Search Engine and the TopX engine. As this is the first participation for TopX, this...

Supervised by (2005)

Prof. Dr.-Ing. Gerhard Weikum

The amount of information in the world is enormous. Millions of documents in electronic libraries, thousands of them on each personal computer waiting for the expert to organize this information,...

Diplom-Informatiker (2005)

German Shegalov, Fakultät I Prof, Dr. Jörg Eschmeier, Vorsitzender Prüfungskommission, ...

Modern Web Services applications encompass multiple distributed interacting components, possibly including millions of lines of code written in...

The SphereSearch Engine for Unified Ranked Retrieval of Heterogeneous XML and Web Documents (2005)

Jens Graupmann, Ralf Schenkel, Gerhard Weikum

This paper presents the novel SphereSearch Engine that provides unified ranked retrieval on heterogeneous XML and Web data. Its search capabilities include vague structure conditions, text content...

An Efficient and Versatile Query Engine for TopX Search (2005)

Martin Theobald, Ralf Schenkel, Gerhard Weikum

This paper presents a novel engine, coined TopX, for efficient ranked retrieval of XML documents over semistructured but nonschematic data collections. The algorithm follows the paradigm of threshold...

Combining Text and Linguistic Document Representations for Authorship Attribution (2005)

Andreas Kaster, Stefan Siersdorfer, Gerhard Weikum

In this paper, we provide several alternatives to the classical Bag-Of-Words model for automatic authorship attribution. To this end, we consider linguistic and writing style information such as...

Exploiting Organizational Information in Enterprise Text Search (2005)

Sebastian Blohm (supervisor: Prof. Dr. Gerhard Weikum)

...written and to have used no sources or aids other than those indicated.

Authors' Addresses (2005)

Stefan Siersdorfer, Gerhard Weikum

This paper addresses the problem of semi-supervised classification on document collections using retraining (also called self-training). A possible application is focused Web crawling which may start...

An efficient and versatile query engine for TopX search (2005)

Martin Theobald, Ralf Schenkel, Gerhard Weikum

This paper presents a novel engine, coined TopX, for efficient ranked retrieval of XML documents over semistructured but nonschematic data collections. The algorithm follows the paradigm of threshold...

The MINERVA project: Database selection in the context of P2P search (2005)

Matthias Bender, Sebastian Michel, Gerhard Weikum, Christian Zimmer

This paper presents the MINERVA project that prototypes a distributed search engine based on P2P techniques. MINERVA is layered on top of a Chord-style overlay network and uses a powerful...

under the guidance of (2005)

Sergey Chernov, Prof. Dr.-Ing. Gerhard Weikum, Christian Zimmer, ...

A tremendous amount of information in the Internet requires powerful search engines. Currently, only the commercial centralized search engines like Google can process terabytes of Web documents. Such...

Towards Self-Organizing Query Routing and Processing for Peer-to-Peer Web Search (2005)

Weikum, Gerhard, Bast, Holger, Canright, Geoffrey, Hales, David, Schindelhauer, Christian, Triantafillou, Peter

The peer-to-peer computing paradigm is an intriguing alternative to Google-style search engines for querying and ranking Web content. In a network with many thousands or millions of peers the storage...

The Atomic Manifesto (2005)

Weikum, Gerhard

This paper is a manifesto for future research on "atomicity" in its many guises and is based on a five-day workshop on "Atomicity in System Design and Execution" that took place in Schloss Dagstuhl...

Efficient and Self-Tuning Incremental Query Expansion for Top-k Query Processing (2005)

Theobald, Martin, Schenkel, Ralf, Weikum, Gerhard, Baeza-Yates, Ricardo A., Ziviani, Nivio, Marchionini, Gary, ...

We present a novel approach for efficient and self-tuning query expansion that is embedded into a top-k query processor with candidate pruning. Traditional query expansion methods select expansion...

Automated Retraining Methods for Document Classification and Their Parameter Tuning (2005)

Siersdorfer, Stefan, Weikum, Gerhard, Ngu, Anne H. H., Kitsuregawa, Masaru, Neuhold, Erich J., Chung, Jen-Yao, ...

This paper addresses the problem of semi-supervised classification on document collections using retraining (also called self-training). A possible application is focused Web crawling which may start...

Using Restrictive Classification and Meta Classification for Junk Elimination (2005)

Siersdorfer, Stefan, Weikum, Gerhard, Losada, David, Fernández-Luna, Juan M.

This paper addresses the problem of performing supervised classification on document collections containing also junk documents. With junk documents we mean documents that do not belong to the topic...

Semantic Similarity Search on Semistructured Data with the XXL Search Engine (2005)

Schenkel, Ralf, Theobald, Anja, Weikum, Gerhard

Query languages for XML such as XPath or XQuery support Boolean retrieval: a query result is a (possibly restructured) subset of XML elements or entire documents that satisfy the search conditions of...

Efficient Creation and Incremental Maintenance of the HOPI Index for Complex XML Document Collections (2005)

Schenkel, Ralf, Theobald, Anja, Weikum, Gerhard

The HOPI index, a connection index for XML documents based on the concept of a 2-hop cover, provides space- and time-efficient reachability tests along the ancestor, descendant, and link axes to...

Untersuchungen zur automatischen Klassifikation von Lamellengraphit mit Hilfe des Stützvektorverfahrens (Examinations on the Automatic Classification of Lamellar Graphite Using the Support Vector Machine) (2005)

Roberts, Kathrin, Mücklich, Frank, Weikum, Gerhard, Portella, P.

The varying graphite morphology in cast iron is essential to the mechanical properties of this material. For this reason, the standard EN ISO 945:1994 defines six general forms for the...

JXP: Global Authority Scores in a P2P Network (2005)

Parreira, Josiane Xavier, Weikum, Gerhard, Doan, AnHai, Neven, Frank, McCann, Robert, Jan Bex, Geert

This document presents the JXP algorithm for dynamically and collaboratively computing PageRank-style authority scores of Web pages distributed in a P2P network. In the architecture that we...

On the Usage of Global Document Occurrences in Peer-to-Peer Information Systems (2005)

Papapetrou, Odysseas, Michel, Sebastian, Bender, Matthias, Weikum, Gerhard, Meersman, Robert, Tari, Zahir, ...

There exist a number of approaches for query processing in Peer-to-Peer information systems that efficiently retrieve relevant information from distributed peers. However, very few of them take into...

KLEE: A Framework for Distributed Top-k Query Algorithms (2005)

Michel, Sebastian, Triantafillou, Peter, Weikum, Gerhard, Böhm, Klemens, Jensen, Christian S., Haas, Laura M., ...

This paper addresses the efficient processing of top-k queries in wide-area distributed data repositories where the index lists for the attribute values (or text terms) of a query are distributed...

MINERVA∞ Infinity: A Scalable Efficient Peer-to-Peer Search Engine (2005)

Michel, Sebastian, Triantafillou, Peter, Weikum, Gerhard, Alonso, Gustavo

The promises inherent in users coming together to form data sharing network communities, bring to the foreground new problems formulated over such dynamic, ever growing, computing, storage, and...

P2P Web Search with MINERVA: How do you want to search tomorrow? (Demo) (2005)

Michel, Sebastian, Bender, Matthias, Triantafillou, Peter, Weikum, Gerhard, Zimmer, Christian, Alonso, Gustavo

MINERVA is a novel approach towards P2P Web search that connects an a-priori unlimited number of peers, each of which maintains a personal local database and a local search facility. Each peer posts...

Combining Text and Linguistic Document Representations for Authorship Attribution (2005)

Kaster, Andreas, Siersdorfer, Stefan, Weikum, Gerhard

In this paper, we provide several alternatives to the classical Bag-Of-Words model for automatic authorship attribution. To this end, we consider linguistic and writing style information such as...

The Atomic Manifesto: a Story in Four Quarks (2005)

Jones, Cliff, Lomet, David, Romanovsky, Alexander, Weikum, Gerhard, Fekete, Alan, Gaudel, Marie-Claude, ...

This paper is based on a five-day workshop on "Atomicity in System Design and Execution" that took place in Schloss Dagstuhl in Germany in April 2004 and was attended by 32 people from different...

Digital Library Information-Technology Infrastructures (2005)

Ioannidis, Yannis, Maier, David, Abiteboul, Serge, Buneman, Peter, Davidson, Susan, Fox, Edward, ...

This paper charts a research agenda on systems-oriented issues in digital libraries. It focuses on the most central and generic system issues, including system architecture, user-level functionality,...

Learning Word-to-Concept Mappings for Automatic Text Classification (2005)

Ifrim, Georgiana, Theobald, Martin, Weikum, Gerhard, De Raedt, Luc, Wrobel, Stefan

For both classification and retrieval of natural language text documents, the standard document representation is a term vector where a term is simply a morphological normal form of the corresponding...

Foundations of Automated Database Tuning (2005)

Chaudhuri, Surajit, Weikum, Gerhard, Widom, Jennifer, Özcan, Fatma, Chirkova, Rada

The Challenge of Total Cost of Ownership. Our society is more dependent on information systems than ever before. However, managing the information systems infrastructure in a cost-effective manner is...

Towards Collaborative Search in Digital Libraries Using Peer-to-Peer Technology (2005)

Bender, Matthias, Michel, Sebastian, Zimmer, Christian, Weikum, Gerhard, Türker, Can, Agosti, Maristella, ...

We consider the problem of collaborative search across a large number of digital libraries and query routing strategies in a peer-to-peer (P2P) environment. Both digital libraries and users are...

The MINERVA Project: Database Selection in the Context of P2P Search (2005)

Bender, Matthias, Michel, Sebastian, Weikum, Gerhard, Zimmer, Christian, Vossen, Gottfried, Leymann, Frank, ...

This paper presents the MINERVA project that prototypes a distributed search engine based on P2P techniques. MINERVA is layered on top of a Chord-style overlay network and uses a powerful crawling,...

TopX & XXL at INEX 2005 (2005)

Theobald, Martin, Schenkel, Ralf, Weikum, Gerhard, Fuhr, Norbert, Lalmas, Mounia, Malik, Saadia, ...

We participated with two different and independent search engines in this year's INEX round: The XXL Search Engine and the TopX engine. As this is the first participation for TopX, this paper focuses...

p2pDating: Real Life Inspired Semantic Overlay Networks for Web Search (2005)

Parreira, Josiane Xavier, Michel, Sebastian, Weikum, Gerhard, Marchionini, Gary, Moffat, Alistair, ...

We consider a network of autonomous peers forming a logically global but physically distributed search engine, where every peer has its own local collection generated by independently crawling the...

Integrating DB and IR Technologies: What is the Sound of One Hand Clapping? (2005)

Chaudhuri, Surajit, Ramakrishnan, Raghu, Weikum, Gerhard, Stonebraker, Michael, Weikum, Gerhard, DeWitt, David

Databases (DB) and information retrieval (IR) have evolved as separate fields. However, modern applications such as customer support, health care, and digital libraries require capabilities for both...

Challenges of Distributed Search Across Digital Libraries (2005)

Bender, Matthias, Michel, Sebastian, Weikum, Gerhard, Zimmer, Christian, Weikum, Gerhard, Ioannidis, Yannis, ...

We present the MINERVA project that tackles the problem of collaborative search across a large number of digital libraries. The search engine is layered on top of a Chord-style peer-to-peer overlay...

Das MINERVA-Projekt: Datenbankselektion für Peer-to-Peer-Websuche (2005)

Bender, Matthias, Michel, Sebastian, Weikum, Gerhard, Zimmer, Christian

This article presents MINERVA, a prototype implementation of a distributed search engine based on a peer-to-peer (P2P) architecture. MINERVA builds on...

MINERVA: Collaborative P2P Search (Demo) (2005)

Bender, Matthias, Michel, Sebastian, Triantafillou, Peter, Weikum, Gerhard, Zimmer, Christian, Böhm, Klemens, ...

This paper proposes the live demonstration of a prototype of MINERVA, a novel P2P Web search engine. The search engine is layered on top of a DHT-based overlay network that connects an a-priori...

Improving Collection Selection with Overlap-Awareness (2005)

Bender, Matthias, Michel, Sebastian, Triantafillou, Peter, Weikum, Gerhard, Zimmer, Christian, Baeza-Yates, Ricardo A., ...

Collection selection has been a research issue for years. Most of the existing literature estimates the expected result quality of a collection, typically using precomputed statistics, and ranks the...

Word Sense Disambiguation for Exploiting Hierarchical Thesauri in Text Classification (2005)

Mavroeidis, Dimitrios, Tsatsaronis, George, Vazirgiannis, Michalis, Theobald, Martin, Weikum, Gerhard, Jorge, Alípio, ...

The introduction of hierarchical thesauri (HT) that contain significant semantic information, has led researchers to investigate their potential for improving performance of the text classification...

The Lowell Database Research Self-Assessment (2005)

Abiteboul, Serge, Agrawal, Rakesh, Bernstein, Philip A., Carey, Michael J., Ceri, Stefano, Croft, W. Bruce, ...

Database needs are changing, driven by the Internet and increasing amounts of scientific and sensor data. In this article, the authors propose research into several important new directions for...

XXL @ INEX 2003 (2004)

Schenkel, Ralf, Theobald, Anja, Weikum, Gerhard

Information retrieval on XML combines retrieval on content data (element and attribute values) with retrieval on structural data (element and attribute names). Standard query languages for XML such...

T-Rank: Time-aware Authority Ranking (2004)

Berberich, Klaus, Vazirgiannis, Michalis, Weikum, Gerhard

Analyzing the link structure of the web for deriving a page's authority and implied importance has deeply affected the way information providers create and link content, the ranking in web search...

Bookmark-driven Query Routing in Peer-to-Peer Web Search (2004)

Bender, Matthias, Michel, Sebastian, Zimmer, Christian, Weikum, Gerhard

We consider the problem of collaborative Web search and query routing strategies in a peer-to-peer (P2P) environment. In our architecture every peer has a full-fledged search engine with a...

Towards a Statistically Semantic Web (2004)

Weikum, Gerhard, Graupmann, Jens, Schenkel, Ralf, Theobald, Martin

The envisioned Semantic Web aims to provide richly annotated and explicitly structured Web pages in XML, RDF, or description logics, based upon underlying ontologies and thesauri. Ideally, this...

An Information System for Material Microstructures (2004)

Roberts, Kathrin, Mücklich, Frank, Schenkel, Ralf, Weikum, Gerhard

This paper presents an information system that supports a materialographic laboratory in classifying material samples based on microstructure images. The system uses database and Web technologies to...

Top-k Query Evaluation with Probabilistic Guarantees (2004)

Theobald, Martin, Weikum, Gerhard, Schenkel, Ralf

Top-k queries based on ranking elements of multidimensional datasets are a fundamental building block for many kinds of information discovery. The best known general-purpose algorithm for evaluating...

Goal-oriented Methods and Meta Methods for Document Classification and their Parameter Tuning (2004)

Sizov, Sergej, Siersdorfer, Stefan, Weikum, Gerhard

Automatic text classification methods come with various calibration parameters such as thresholds for probabilities in Bayesian classifiers or for hyperplane distances in SVM classifiers. In a given...

HOPI: An Efficient Connection Index for Complex XML Document Collections (2004)

Schenkel, Ralf, Theobald, Anja, Weikum, Gerhard

In this paper we present HOPI, a new connection index for XML documents based on the concept of the 2-hop cover of a directed graph introduced by Cohen et al. In contrast to most of the prior...

Query-log based Authority Analysis for Web Information Search (2004)

Luxenburger, Julia, Weikum, Gerhard

The ongoing explosion of web information calls for more intelligent and personalized methods towards better search result quality for advanced queries. Query log and click streams obtained from web...

Probabilistic Ranking of Database Query Results (2004)

Chaudhuri, Surajit, Das, Gautam, Hristidis, Vagelis, Weikum, Gerhard

We investigate the problem of ranking answers to a database query when many tuples are returned. We adapt and apply principles of probabilistic models from Information Retrieval for structured data. Our...

Towards Collaborative Search in Digital Libraries Using Peer-to-Peer Technology (2004)

Bender, Matthias, Michel, Sebastian, Zimmer, Christian, Weikum, Gerhard

We consider the problem of collaborative search across a large number of digital libraries and query routing strategies in a peer-to-peer (P2P) environment. Both digital libraries and users are...

Peer-to-Peer-Technologie für unternehmensweites und organisationsübergreifendes Workflow-Management (2004)

Bender, Matthias, Kraus, Steffen, Kupsch, Florian, Shegalov, German, Weikum, Gerhard, Werth, Dirk, ...

Workflow management is a mature technology; however, its track record when used to control enterprise-wide and cross-organizational business processes is rather...

Recovery Guarantees for Internet Applications (2004)

Barga, Roger, Lomet, David, Shegalov, German, Weikum, Gerhard

Internet-based e-services require application developers to deal explicitly with failures of the underlying software components, e.g. web servers, servlets, browser sessions, etc. This complicates...

XXL @ INEX 2003 (2004)

Schenkel, Ralf, Theobald, Anja, Weikum, Gerhard, Fuhr, Norbert, Lalmas, Mounia, Malik, Saadia

Information retrieval on XML combines retrieval on content data (element and attribute values) with retrieval on structural data (element and attribute names). Standard query languages for XML such...

T-Rank: Time-aware Authority Ranking (2004)

Berberich, Klaus, Vazirgiannis, Michalis, Weikum, Gerhard, Leonardi, Stefano

Analyzing the link structure of the web for deriving a page's authority and implied importance has deeply affected the way information providers create and link content, the ranking in web search...

Bookmark-driven Query Routing in Peer-to-Peer Web Search (2004)

Bender, Matthias, Michel, Sebastian, Zimmer, Christian, Weikum, Gerhard, Callan, Jamie, Fuhr, Norbert, ...

We consider the problem of collaborative Web search and query routing strategies in a peer-to-peer (P2P) environment. In our architecture every peer has a full-fledged search engine with a...

Towards a Statistically Semantic Web (2004)

Weikum, Gerhard, Graupmann, Jens, Schenkel, Ralf, Theobald, Martin, Atzeni, Paolo, Chu, Wesley, ...

The envisioned Semantic Web aims to provide richly annotated and explicitly structured Web pages in XML, RDF, or description logics, based upon underlying ontologies and thesauri. Ideally, this...

An Information System for Material Microstructures (2004)

Roberts, Kathrin, Mücklich, Frank, Schenkel, Ralf, Weikum, Gerhard, Hatzopoulos, Michael, Manolopoulos, Yannis

This paper presents an information system that supports a materialographic laboratory in classifying material samples based on microstructure images. The system uses database and Web technologies to...

Top-k Query Evaluation with Probabilistic Guarantees (2004)

Theobald, Martin, Weikum, Gerhard, Schenkel, Ralf, Nascimento, Mario A., Özsu, M. Tamer, Kossmann, Donald, ...

Top-k queries based on ranking elements of multidimensional datasets are a fundamental building block for many kinds of information discovery. The best known general-purpose algorithm for evaluating...

Goal-oriented Methods and Meta Methods for Document Classification and their Parameter Tuning (2004)

Sizov, Sergej, Siersdorfer, Stefan, Weikum, Gerhard, Evans, David A., Gravano, Luis, Herzog, Otthein, ...

Automatic text classification methods come with various calibration parameters such as thresholds for probabilities in Bayesian classifiers or for hyperplane distances in SVM classifiers. In a given...

HOPI: An Efficient Connection Index for Complex XML Document Collections (2004)

Schenkel, Ralf, Theobald, Anja, Weikum, Gerhard, Bertino, Elisa, Christodoulakis, Stavros, Plexousakis, Dimitris, ...

In this paper we present HOPI, a new connection index for XML documents based on the concept of the 2-hop cover of a directed graph introduced by Cohen et al. In contrast to most of the prior...

Query-log based Authority Analysis for Web Information Search (2004)

Luxenburger, Julia, Weikum, Gerhard, Zhou, Xiaofang, Su, Stanley Y. W., Papazoglou, Mike P., Orlowska, Maria E., ...

The ongoing explosion of web information calls for more intelligent and personalized methods towards better search result quality for advanced queries. Query log and click streams obtained from web...

Probabilistic Ranking of Database Query Results (2004)

Chaudhuri, Surajit, Das, Gautam, Hristidis, Vagelis, Weikum, Gerhard, Nascimento, Mario A., Özsu, M. Tamer, ...

We investigate the problem of ranking answers to a database query when many tuples are returned. We adapt and apply principles of probabilistic models from Information Retrieval for structured data. Our...

Towards Collaborative Search in Digital Libraries Using Peer-to-Peer Technology (2004)

Bender, Matthias, Michel, Sebastian, Zimmer, Christian, Weikum, Gerhard, Agosti, Maristella, Schek, Hans-Jörg, ...

We consider the problem of collaborative search across a large number of digital libraries and query routing strategies in a peer-to-peer (P2P) environment. Both digital libraries and users are...

Peer-to-Peer-Technologie für unternehmensweites und organisationsübergreifendes Workflow-Management (2004)

Bender, Matthias, Kraus, Steffen, Kupsch, Florian, Shegalov, German, Weikum, Gerhard, Werth, Dirk, ...

Workflow management is a mature technology; however, its track record when used to control enterprise-wide and cross-organizational business processes is rather...

Recovery Guarantees for Internet Applications (2004)

Barga, Roger, Lomet, David, Shegalov, German, Weikum, Gerhard

Internet-based e-services require application developers to deal explicitly with failures of the underlying software components, e.g. web servers, servlets, browser sessions, etc. This complicates...

Goal-oriented methods and meta methods for document classification and their parameter tuning (2004)

Stefan Siersdorfer, Sergej Sizov, Gerhard Weikum

Automatic text classification methods come with various calibration parameters such as thresholds for probabilities in Bayesian classifiers or for hyperplane distances in SVM classifiers. In a given...

Towards collaborative search in digital libraries using peer-to-peer technology (2004)

Matthias Bender, Sebastian Michel, Christian Zimmer, Gerhard Weikum

We consider the problem of collaborative search across a large number of digital libraries and query routing strategies in a peer-to-peer (P2P) environment. Both digital libraries and users...

Top-k Query Evaluation with Probabilistic Guarantees (2004)

Martin Theobald, Gerhard Weikum, Ralf Schenkel

Top-k queries based on ranking elements of multidimensional datasets are a fundamental building block for many kinds of information discovery. The best known general-purpose algorithm for evaluating...

Language modeling based passage retrieval for question answering systems (2004)

Munawar Hussain (supervisors: Prof. Dr. Gerhard Weikum, Prof. Dr. Dietrich Klakow), ...

I hereby declare under oath that I have written this master's thesis independently and without outside help. I have used no aids other than those indicated and...

Bookmark-driven query routing in peer-to-peer web search (2004)

Matthias Bender, Sebastian Michel, Gerhard Weikum, Christian Zimmer

We consider the problem of collaborative Web search and query routing strategies in a peer-to-peer (P2P) environment. In our architecture every peer has a full-fledged search engine with a...

Bookmark-driven query routing in peer-to-peer web search (2004)

Matthias Bender, Sebastian Michel, Gerhard Weikum, ...

We consider the problem of collaborative Web search and query routing strategies in a peer-to-peer (P2P) environment. In our architecture every peer has a full-fledged search engine with a...

Bookmark-driven query routing in peer-to-peer web search (2004)

Matthias Bender, Sebastian Michel, Christian Zimmer, Gerhard Weikum

We consider the problem of collaborative Web search and query routing strategies in a peer-to-peer (P2P) environment. In our architecture every peer has a full-fledged search engine with a...

T-rank: Time-aware authority ranking (2004)

Klaus Berberich, Michalis Vazirgiannis, Gerhard Weikum

The link structure of the web is analyzed to measure the authority of pages, which can be taken into account for ranking query results. Due to the enormous dynamics of the web, with...

The Atomic Manifesto: a Story in Four Quarks (2004)

Jones, Cliff, Lomet, David, Romanovsky, Alexander, Weikum, Gerhard, Fekete, Alan, Gaudel, Marie-Claude, ...

This report summarizes the viewpoints and insights gathered in the Dagstuhl Seminar on Atomicity in System Design and Execution, which was attended by 32 people from four different scientific...

Towards Collaborative Search in Digital Libraries Using Peer-to-Peer Technology (2004)

Bender, Matthias, Michel, Sebastian, Zimmer, Christian, Weikum, Gerhard, Agosti, Maristella, Schek, Hans-Jörg, ...

We consider the problem of collaborative search across a large number of digital libraries and query routing strategies in a peer-to-peer (P2P) environment. Both digital libraries and users are...

T-Rank: Time-aware Authority Ranking (2004)

Berberich, Klaus, Vazirgiannis, Michalis, Weikum, Gerhard, Leonardi, Stefano

Analyzing the link structure of the web for deriving a page's authority and implied importance has deeply affected the way information providers create and link content, the ranking in web search...

Probabilistic Ranking of Database Query Results (2004)

Chaudhuri, Surajit, Das, Gautam, Hristidis, Vagelis, Weikum, Gerhard, Nascimento, Mario A., Özsu, M. Tamer, ...

We investigate the problem of ranking answers to a database query when many tuples are returned. We adapt and apply principles of probabilistic models from Information Retrieval for structured data. Our...

Query-log based Authority Analysis for Web Information Search (2004)

Luxenburger, Julia, Weikum, Gerhard, Zhou, Xiaofang, Su, Stanley Y. W., Papazoglou, Mike P., Orlowska, Maria E., ...

The ongoing explosion of web information calls for more intelligent and personalized methods towards better search result quality for advanced queries. Query log and click streams obtained from web...

An Information System for Material Microstructures (2004)

Roberts, Kathrin, Mücklich, Frank, Schenkel, Ralf, Weikum, Gerhard, Hatzopoulos, Michael, Manolopoulos, Yannis

This paper presents an information system that supports a materialographic laboratory in classifying material samples based on microstructure images. The system uses database and Web technologies to...

HOPI: An Efficient Connection Index for Complex XML Document Collections (2004)

Schenkel, Ralf, Theobald, Anja, Weikum, Gerhard, Bertino, Elisa, Christodoulakis, Stavros, Plexousakis, Dimitris, ...

In this paper we present HOPI, a new connection index for XML documents based on the concept of the 2-hop cover of a directed graph introduced by Cohen et al. In contrast to most of the prior...

Goal-oriented Methods and Meta Methods for Document Classification and their Parameter Tuning (2004)

Sizov, Sergej, Siersdorfer, Stefan, Weikum, Gerhard, Evans, David A., Gravano, Luis, Herzog, Otthein, ...

Automatic text classification methods come with various calibration parameters such as thresholds for probabilities in Bayesian classifiers or for hyperplane distances in SVM classifiers. In a given...

Top-k Query Evaluation with Probabilistic Guarantees (2004)

Theobald, Martin, Weikum, Gerhard, Schenkel, Ralf, Nascimento, Mario A., Özsu, M. Tamer, Kossmann, Donald, ...

Top-k queries based on ranking elements of multidimensional datasets are a fundamental building block for many kinds of information discovery. The best known general-purpose algorithm for evaluating...

Towards a Statistically Semantic Web (2004)

Weikum, Gerhard, Graupmann, Jens, Schenkel, Ralf, Theobald, Martin, Atzeni, Paolo, Chu, Wesley, ...

The envisioned Semantic Web aims to provide richly annotated and explicitly structured Web pages in XML, RDF, or description logics, based upon underlying ontologies and thesauri. Ideally, this...

Bookmark-driven Query Routing in Peer-to-Peer Web Search (2004)

Bender, Matthias, Michel, Sebastian, Zimmer, Christian, Weikum, Gerhard, Callan, Jamie, Fuhr, Norbert, ...

We consider the problem of collaborative Web search and query routing strategies in a peer-to-peer (P2P) environment. In our architecture every peer has a full-fledged search engine with a...

XXL @ INEX 2003 (2004)

Schenkel, Ralf, Theobald, Anja, Weikum, Gerhard, Fuhr, Norbert, Lalmas, Mounia, Malik, Saadia

Information retrieval on XML combines retrieval on content data (element and attribute values) with retrieval on structural data (element and attribute names). Standard query languages for XML such...

The Lowell Database Research Self Assessment (2003)

Abiteboul, Serge, Agrawal, Rakesh, Bernstein, Phil, Carey, Mike, Ceri, Stefano, Croft, Bruce, ...

A group of senior database researchers gathers every few years to assess the state of database research and to point out problem areas that deserve additional focus. This report summarizes the...

Konstruktion von Featureräumen und Metaverfahren zur Klassifikation von Webdokumenten (2003)

Siersdorfer, Stefan, Sizov, Sergej, Weikum, Gerhard, Schöning, Harald, Rahm, Erhard

This paper addresses the automatic classification of Web documents into a given taxonomy. We consider vector-based machine learning methods, using the example of...

An Ontology for Domain-oriented Semantic Similarity Search on XML Data (2003)

Theobald, Anja, Weikum, Gerhard, Schöning, Harald, Rahm, Erhard

Query languages for XML such as XPath or XQuery support Boolean retrieval where a query result is a (possibly restructured) subset of XML elements or entire documents that satisfy the search...

The BINGO! system for information portal generation and expert web search (2003)

Sergej Sizov, Michael Biwer, Jens Graupmann, Stefan Siersdorfer, Martin Theobald, Gerhard Weikum, ...

This paper presents the BINGO! focused crawler, an advanced tool for information portal generation and expert Web search. In contrast to standard search engines such as Google which are solely based...

The Lowell Database Research Self Assessment (2003)

Serge Abiteboul, Rakesh Agrawal, Phil Bernstein, Mike Carey, Stefano Ceri, Bruce Croft, ...

This report summarizes the discussion and conclusions of the sixth ad-hoc meeting held May 4-6, 2003 in Lowell, Mass. It observes that information management continues to be a critical component of...

The BINGO! Focused Crawler: From Bookmarks to Archetypes (2002)

Sergej Sizov, Stefan Siersdorfer, Martin Theobald, Gerhard Weikum

Focused crawling is a relatively new, promising approach to improving the recall of expert search on the Web. Consider an advanced Web user, say a researcher or a student, who is looking for the...

The index-based XXL search engine for querying XML data with relevance ranking (2002)

Anja Theobald, Gerhard Weikum

Query languages for XML such as XPath or XQuery support Boolean retrieval: a query result is a (possibly restructured) subset of XML elements or entire documents that satisfy the search...

EOS: Exactly-Once E-Service middleware (2002)

German Shegalov, Gerhard Weikum, Roger Barga, David Lomet

Today's web-based E-services do not handle system failures well. One of the most prominent examples is unintentional purchase of multiple copies of the same item (e.g., a DVD) in an online...

Self-tuning Database Technology and Information Services: From Wishful Thinking To Viable Engineering (2002)

Gerhard Weikum, Axel Moenkeberg, Christof Hasse, Peter Zabback

Automatic tuning has been an elusive goal for database technology for a long time and is becoming a pressing issue for modern E-services. This paper reviews and assesses the advances that have been...

A Framework for the Physical Design Problem for Data Synopses (2002)

Arnd Christian König, Gerhard Weikum

Maintaining statistics on multidimensional data distributions is crucial for predicting the run-time and result size of queries and data analysis tasks with acceptable accuracy. Applications of such...

A framework for the physical design problem for data synopses (2002)

Arnd Christian König, Gerhard Weikum

Maintaining statistics on multidimensional data distributions is crucial for predicting the run-time and result size of queries and data analysis tasks with acceptable accuracy. To this end...

The Web in 2010: Challenges and Opportunities for Database Research (2001)

Gerhard Weikum

The impressive advances in global networking and information technology provide great opportunities for all kinds of Web-based information services, ranging from digital libraries and information...

XML-enabled workflow management for e-services across heterogeneous platforms (2001)

German Shegalov, Michael Gillmann, Gerhard Weikum

Advanced e-services require efficient, flexible, and easy-to-use workflow technology that integrates well with mainstream Internet technologies like XML and Web servers. This paper discusses an...

Rethinking database system architecture: Towards a self-tuning RISC-style database system (2000)

Surajit Chaudhuri, Gerhard Weikum

Database technology is one of the cornerstones for the new millennium’s IT landscape. However, database systems as a unit of code packaging and deployment are at a crossroad: commercial systems...

Benchmarking and Configuration of Workflow Management Systems (2000)

Michael Gillmann, Ralf Mindermann, Gerhard Weikum

Workflow management systems are a cornerstone of mission-critical, possibly cross-organizational business processes. For large-scale applications both their performance and availability are crucial...

Benchmarking and Configuration of Workflow Management Systems (2000)

Michael Gillmann, Ralf Mindermann, Gerhard Weikum

Workflow management systems (WFMS) are a cornerstone of mission-critical, possibly cross-organizational business processes. For large-scale applications both their performance and availability are...

Rethinking Database System Architecture: Towards a Self-tuning RISC-style Database System (2000)

Surajit Chaudhuri, Gerhard Weikum

Database technology is one of the cornerstones for the new millennium's IT landscape. However, database systems as a unit of code packaging and deployment are at a crossroad: commercial systems...

A Goal-driven Auto-Configuration Tool for the Distributed Workflow Management System Mentor-lite (2000)

Michael Gillmann, Jeanine Weissenfels, German Shegalov, Wolfgang Wonner, Gerhard Weikum

s to form so-called "virtual enterprises". A communication manager is responsible for sending and receiving synchronization messages between the engines. In order to guarantee a consistent...

A Goal-driven Auto-Configuration Tool for the Distributed Workflow Management System Mentor-lite (2000)

Michael Gillmann, Jeanine Weissenfels, German Shegalov, Wolfgang Wonner, Gerhard Weikum

The Mentor-lite prototype has been developed within the research project "Architecture, Configuration, and Administration of Large Workflow Management Systems" funded by the German Science...

Data Engineering Special Issue on Adaptive Query Processing, June 2000 (2000)

Alon Levy (Ed.), Joseph M. Hellerstein, Sirish Ch, ...

As query engines are scaled and federated, they must cope with highly unpredictable and changeable environments. In the Telegraph project, we are attempting to architect and implement a continuously...

Performance and Availability Assessment for the Configuration of Distributed Workflow Management Systems (2000)

Michael Gillmann, Jeanine Weissenfels, Gerhard Weikum, Achim Kraiss

Workflow management systems (WFMSs) that are geared for the orchestration of enterprise-wide or even "virtual-enterprise"-style business processes across multiple organizations are complex...

Performance and Availability Assessment for the Configuration of Distributed Workflow Management Systems (2000)

Michael Gillmann, Jeanine Weissenfels, Gerhard Weikum, Achim Kraiss

Workflow management systems (WFMSs) that are geared for the orchestration of enterprise-wide or even "virtual-enterprise"-style business processes across multiple organizations are complex...

Mentor-lite Customizability: Tailoring a Light-Weight Workflow Management System to Workflow Application and Organizational Needs (2000)

Michael Gillmann, Jeanine Weissenfels, German Shegalov, Wolfgang Wonner, Gerhard Weikum

The Mentor-lite prototype has been developed within the research project "Architecture, Configuration, and Administration of Large Workflow Management Systems" funded by the German Science...

Combining histograms and parametric curve fitting for feedback-driven query result-size estimation (1999)

Arnd Christian König, Gerhard Weikum

This paper aims to improve the accuracy of query result-size estimations in query optimizers by leveraging the dynamic feedback obtained from observations on the executed query workload. To...

Towards guaranteed quality and dependability of information systems (1999)

Gerhard Weikum

If I had had more time, I could have written you a shorter letter. (Blaise Pascal) The impressive advances in global networking and information technology provide great opportunities for all kinds of...

Federated transaction management with snapshot isolation (1999)

Ralf Schenkel, Gerhard Weikum, Norbert Weißenberg, Xuequn Wu

Federated transaction management (also known as multidatabase transaction management in the literature) is needed to ensure the consistency of data that is distributed across multiple, largely...

A Performance Model of Mixed-Workload Multimedia Information Servers (1999)

Guido Nerjes, Peter Muth, Gerhard Weikum

Advanced multimedia applications such as digital libraries or teleteaching exhibit a mixed workload with accesses to both "continuous" data (e.g., video) and conventional,...

Combining Histograms and Parametric Curve Fitting for Feedback-Driven Query Result-Size Estimation (1999)

Arnd Christian König, Gerhard Weikum

This paper aims to improve the accuracy of query result-size estimations in query optimizers by leveraging the dynamic feedback obtained from observations on the executed query workload. To this end,...

Towards Self-Tuning Memory Management for Data Servers (1999)

Gerhard Weikum, Arnd Christian König, Achim Kraiss, Markus Sinnwell

Although today's computers provide huge amounts of main memory, the ever-increasing load of large data servers, imposed by resource-intensive decision-support queries and accesses to multimedia...

Integrating Light-Weight Workflow Management Systems within Existing Business Environments (1999)

Peter Muth, Jeanine Weissenfels, Michael Gillmann, Gerhard Weikum

Workflow management systems support the efficient, largely automated execution of business processes. However, using a workflow management system typically requires implementing the...

Experiences with Building a Federated Transaction Manager based on CORBA OTS (1999)

Ralf Schenkel, Gerhard Weikum

Federated transaction management is needed to ensure the consistency of data that is distributed across multiple, largely autonomous, and possibly heterogeneous component databases and accessed by...

An optimality proof of the LRU-K page replacement algorithm (1999)

Elizabeth J. O'Neil, Gerhard Weikum

This paper analyzes a recently published algorithm for page replacement in hierarchical paged memory systems [O’Neil et al. 1993]. The algorithm is called the LRU-K method, and reduces to...

Design, implementation, and performance of the LHAM log-structured history data access method (1998)

Peter Muth, Achim Pick, Gerhard Weikum

Numerous applications such as stock market or medical information systems require that both historical and current data be logically integrated into a temporal database. The underlying access...

SNOWBALL: Scalable Storage on Networks of Workstations with Balanced Load (1998)

Radek Vingralek, Yuri Breitbart, Gerhard Weikum

Networks of workstations are an emerging architectural paradigm for high-performance parallel and distributed systems. Exploiting networks of workstations for massive data management poses exciting...

What Workflow Technology Can Do For Electronic Commerce (1998)

Peter Muth, Jeanine Weissenfels, Gerhard Weikum

Electronic Commerce (EC) is a rapidly growing research and development area of very high practical relevance. A major challenge in successfully designing EC applications is to identify existing...

From Centralized Workflow Specification to Distributed Workflow Execution (1998)

Peter Muth, Dirk Wodtke, Jeanine Weissenfels, Angelika Kotz Dittrich, Gerhard Weikum

Current workflow management systems fall short of supporting large-scale distributed, enterprise-wide applications. We present a scalable, rigorously founded approach to enterprise-wide workflow...

Implementation and Performance of the LHAM Log-Structured History Data Access Method (1998)

Peter Muth, Patrick O'Neil, Gerhard Weikum

Numerous applications such as stock market or medical information systems require that both historical and current data be logically integrated into a temporal database. The underlying access method...

Efficient Transparent Application Recovery In Client-Server Information Systems (1998)

David Lomet, Gerhard Weikum

Database systems recover persistent data, providing high database availability. However, database applications, typically residing on client or "middle-tier" application-server machines,...

On the Ubiquity of Information Services and the Absence of Guaranteed Service Quality (Extended Abstract) (1998)

Gerhard Weikum

We are witnessing the proliferation of the global information society with a sheer explosion of information services on a world-spanning network. This opens up unprecedented opportunities for...

Integrated Document Caching and Prefetching in Storage Hierarchies Based on Markov-Chain Predictions (1998)

Achim Kraiss, Gerhard Weikum

Large multimedia document archives may hold a major fraction of their data in tertiary storage libraries for cost reasons. This paper develops an integrated approach to the...

Design, Implementation, and Performance of the LHAM Log-Structured History Data Access Method (1998)

Peter Muth, Patrick O'Neil, Achim Pick, Gerhard Weikum

Numerous applications such as stock market or medical information systems require that both historical and current data be logically integrated into a temporal database. The underlying access...

Integrated Document Caching and Prefetching in Storage Hierarchies Based on Markov-Chain Predictions (1998)

Achim Kraiss, Gerhard Weikum

Large multimedia document archives may hold a major fraction of their data in tertiary storage libraries for cost reasons. This paper develops an integrated approach to the vertical data...

Data Partitioning and Load Balancing in Parallel Disk Systems (1998)

Peter Scheuermann, Gerhard Weikum, Peter Zabback

Parallel disk systems provide opportunities for exploiting I/O parallelism in two possible ways, namely via inter-request and intra-request parallelism. In this paper, we discuss the main issues in...

Vertical data migration in large near-line document archives based on markov-chain predictions (1997)

Achim Kraiss, Gerhard Weikum

Large multimedia document archives hold most of their data in near-line tertiary storage libraries for cost reasons. This paper develops an integrated approach to the vertical data migration...

Stochastic Service Guarantees for Continuous Data on Multi-Zone Disks (1997)

Peter Muth, Guido Nerjes, Gerhard Weikum

Continuous data types like video and audio require the real-time delivery of data fragments from a server's disks to the client at which the data is displayed. This paper develops a stochastic...

Stochastic Performance Guarantees for Mixed Workloads in a Multimedia Information System (1997)

Guido Nerjes, Peter Muth, Gerhard Weikum

We present an approach to stochastic performance guarantees for multimedia servers with mixed workloads. Advanced multimedia applications such as digital libraries or teleteaching exhibit a mixed...

Vertical Data Migration in Large (1997)

Achim Kraiss, Gerhard Weikum

Large multimedia document archives hold most of their data in near-line tertiary storage libraries for cost reasons. This paper develops an integrated approach to the vertical data migration...

Workbench for Enterprise-wide Workflow Management (1997)

Dirk Wodtke, Jeanine Weissenfels, Gerhard Weikum, Angelika Kotz Dittrich, Peter Muth

workflows according to the organizational responsibilities of the enterprise. For the distributed execution of the partitioned workflow specification, MENTOR relies mostly on standard middleware...

A Formal Foundation for Distributed Workflow Execution Based on State Charts (1997)

Dirk Wodtke, Gerhard Weikum

This paper provides a formal foundation for distributed workflow executions. The state chart formalism is adapted to the needs of a workflow model in order to establish a basis for both correctness...

Vertical Data Migration in Large Near-Line Document Archives Based on Markov-Chain Predictions (1997)

Achim Kraiss, Gerhard Weikum

Large multimedia document archives hold most of their data in near-line tertiary storage libraries for cost reasons. This paper develops an integrated approach to the vertical data migration...

A formal foundation for distributed workflow execution based on state charts (1997)

Dirk Wodtke, Gerhard Weikum

This paper provides a formal foundation for distributed workflow executions. The state chart formalism is adapted to the needs of a workflow model in order to establish a basis for both...

Data partitioning and load balancing in parallel disk systems (1996)

Scheuermann, Peter, Weikum, Gerhard, Zabback, Peter

Parallel disk systems provide opportunities for exploiting I/O parallelism in two possible ways, namely via inter-request and intra-request parallelism. In this paper we discuss the main issues in...

An Optimality Proof of the LRU-K Page Replacement Algorithm (1996)

Elizabeth J. O'Neil, Patrick E. O'Neil, Gerhard Weikum

This paper analyzes a recently published algorithm for page replacement in hierarchical paged memory systems [OOW93]. The algorithm is called the LRU-K method, and reduces to the well-known LRU...

The Mentor Project: Steps Towards Enterprise-Wide Workflow Management (1996)

Dirk Wodtke, Jeanine Weissenfels, Gerhard Weikum, Angelika Kotz Dittrich

Enterprise-wide workflow management, where workflows may span multiple organizational units, requires particular consideration of scalability, heterogeneity, and availability issues. The Mentor project...

LoT: Dynamic Declustering of TSB-Tree Nodes for Parallel Access to Temporal Data (1996)

Peter Muth, Achim Kraiß, Gerhard Weikum

In this paper, we consider the problem of exploiting I/O parallelism for efficient access to transaction-time temporal databases. As temporal databases maintain historical versions of records in...

The Mentor Project: Steps Towards Enterprise-Wide Workflow Management (1996)

Dirk Wodtke, Jeanine Weissenfels, Gerhard Weikum, Angelika Kotz Dittrich

A workflow is the coordinated execution of a collection of computer-supported activities which are run under the responsibility of different, human or automated processing entities. Enterprise-wide...

Load Control in Scalable Distributed File Structures (1995)

Yuri Breitbart, Radek Vingralek, Gerhard Weikum

The paper presents a family of distributed file structures, coined DiFS, for record structured, disk resident files with key based exact or interval match access. The file is organized into buckets...

SNOWBALL: Scalable Storage on Networks of Workstations with Balanced Load (1995)

Radek Vingralek, Yuri Breitbart, Gerhard Weikum

Networks of workstations are an emerging architectural paradigm for high-performance parallel and distributed systems. Exploiting networks of workstations for massive data management poses exciting...

"Disk Cooling" in Parallel Disk Systems (1994)

Peter Scheuermann, Gerhard Weikum, Peter Zabback

Parallel disk systems provide opportunities for high performance I/O by supporting efficiently intra-request and inter-request parallelism. We review briefly the components of an intelligent file...

Unifying Concurrency Control and Recovery of Transactions (1994)

Gustavo Alonso, Radek Vingralek, Divyakant Agrawal, Yuri Breitbart, Amr El Abbadi, Hans-J. Schek, ...

Transaction management in shared databases is generally viewed as a combination of two problems, concurrency control and recovery, which have been considered as orthogonal problems. Consequently, the...

Data Partitioning and Load Balancing in Parallel Disk Systems (1994)

Peter Scheuermann, Gerhard Weikum, Peter Zabback

Parallel disk systems provide opportunities for exploiting I/O parallelism in two possible ways, namely via inter-request and intra-request parallelism. In this paper we discuss the main issues in...

Semantic Concurrency Control in Object-Oriented Database Systems (1993)

Peter Muth, Thomas C. Rakow, Gerhard Weikum, Peter Brössler, Christof Hasse

This paper presents a new locking protocol for object-oriented database systems (OODBSs). The protocol can exploit the semantics of methods invoked on encapsulated objects. Compared to conventional...

The LHAM log-structured history data access method (1993)

Peter Muth, Achim Pick, Gerhard Weikum

Numerous applications such as stock market or medical information systems require that both historical and current data be logically integrated into a temporal database. The underlying access method...

Adaptive load balancing in disk arrays (1993)

Peter Scheuermann, Gerhard Weikum, Peter Zabback

Large arrays of small disks are providing an attractive approach for high performance I/O systems. In order to make effective use of disk arrays and other multi-disk architectures, it is necessary to...

The LRU-K page replacement algorithm for database disk buffering (1993)

Elizabeth J. O'Neil, Patrick E. O'Neil, Gerhard Weikum

This paper introduces a new approach to database disk buffering, called the LRU-K method. The basic idea of LRU-K is to keep track of the times of the last K references to popular database pages,...

Semantics-based Multilevel Transaction Management in Federated Systems (1993)

Andrew Deacon, Hans-Jörg Schek, Gerhard Weikum

A federated database management system (FDBMS) is a special type of distributed database system that enables existing local databases, in a heterogeneous environment, to maintain a high degree of...

Extending Transaction Management To Capture More Consistency With Better Performance (1993)

Gerhard Weikum

This paper surveys recent work on extended transaction management. It focuses on transaction-oriented applications in large distributed and heterogeneous information systems where applications often...

Inter- and Intra-Transaction Parallelism in Database Systems (1993)

Christof Hasse, Gerhard Weikum

This paper presents an approach to improving database performance by combining parallelism of multiple independent transactions and parallelism of multiple subtransactions within a transaction. An...

Towards a Unified Theory of Concurrency Control and Recovery (1993)

Hans-Jörg Schek, Gerhard Weikum, Haiyan Ye

The classical theory of transaction management is based on two different and independent criteria for the correct execution of transactions. The first criterion, serializability, ensures correct...

Semantic Concurrency Control in Object-Oriented Database Systems (1993)

Peter Muth, Thomas C. Rakow, Gerhard Weikum, Peter Brössler, Christof Hasse

This paper presents a new locking protocol for object-oriented database systems (OODBSs). The protocol can exploit the semantics of methods invoked on encapsulated objects. Compared to conventional...

A Log-Structured History Data Access Method (LHAM) (1993)

Patrick O'Neil, Gerhard Weikum

There are numerous applications that require on-line access to a history of business events. Ideally, both historical and current data should be logically integrated into some form of temporal...

Bulletin of the Technical Committee on Data Engineering (June, 1993 Vol. 16 No. 2) (1993)

Rakesh Agrawal, David Lomet, ...

In many real world applications (even in banking), imprecise data is a matter of fact. However, classic database management systems provide little if any help in the management of imprecise data. We...

Semantic Concurrency Control in Object-Oriented Database Systems (1993)

Peter Muth, Thomas C. Rakow, Gerhard Weikum, Christof Hasse

This paper presents a new locking protocol for object-oriented database systems (OODBSs). The protocol can exploit the semantics of methods invoked on encapsulated objects. Compared to conventional...

Semantics-based Multilevel Transaction Management in Federated Systems (1993)

Andrew Deacon, Gerhard Weikum

A federated database management system (FDBMS) is a special type of distributed database system that enables existing local databases, in a heterogeneous environment, to maintain a high degree of...

The LRU-K Page Replacement Algorithm For Database Disk Buffering (1993)

Elizabeth J. O'Neil, Patrick E. O'Neil, Gerhard Weikum

This paper introduces a new approach to database disk buffering, called the LRU-K method. The basic idea of LRU-K is to keep track of the times of the last K references to popular database pages,...

The LRU-K Page Replacement Algorithm For Database Disk Buffering (1993)

Elizabeth J. O'Neil, Patrick E. O'Neil, Gerhard Weikum

This paper introduces a new approach to database disk buffering, called the LRU-K method. The basic idea of LRU-K is to keep track of the times of the last K references to popular database pages,...

The LHAM log-structured history data access method (1993)

Peter Muth, Achim Pick, Gerhard Weikum

Abstract. Numerous applications such as stock market or medical information systems require that both historical and current data be logically integrated into a temporal database. The underlying...

A Log-Structured History Data Access Method (1993)

Gerhard Weikum

There are numerous applications that require on-line access to a history of business events. Ideally, both historical and current data should be logically integrated into some form of temporal...

The LRU-K Page Replacement Algorithm For Database Disk Buffering (1993)

Elizabeth J. O’Neil, Patrick E. O’Neil, Gerhard Weikum

This paper introduces a new approach to database disk buffering, called the LRU-K method. The basic idea of LRU-K is to keep track of the times of the last K references to popular database pages,...

The LRU-K Page Replacement Algorithm For Database Disk Buffering (1993)

Elizabeth J. O’Neil, Patrick E. O’Neil, Gerhard Weikum

This paper introduces a new approach to database disk buffering, called the LRU-K method. The basic idea of LRU-K is to keep track of the times of the last K references to popular database pages,...

Concepts and Applications of Multilevel Transactions and Open Nested Transactions (1992)

Gerhard Weikum

This chapter gives an overview of multilevel transactions and their generalization toward open nested transactions. The main features of these transaction models are the following: first, semantic...

Multi-Level Transaction Management for Complex Objects: Implementation, Performance, Parallelism (1991)

Gerhard Weikum, Christof Hasse

Abstract. Multi-level transactions are a variant of open-nested transactions in which the subtransactions correspond to operations at different levels of a layered system architecture. They allow the...

Multi-Level Recovery (1990)

Gerhard Weikum, Christof Hasse, Peter Broessler, Peter Muth

Multi-level transactions have received considerable attention as a framework for high-performance concurrency control methods. An inherent property of multi-level transactions is the need for...

Mentor-lite: Integrating Light-Weight Workflow Management Systems within Business Environments

Peter Muth, Jeanine Weissenfels, Michael Gillmann, Gerhard Weikum

Peter Muth, Jeanine Weissenfels, Michael Gillmann, Gerhard Weikum. Department of Computer Science, University of the Saarland, P.O. Box 151150, D-66041 Saarbruecken, Germany. Phone: ++49 681 302 4786,...

Workflow History Management in Virtual Enterprises using a Light-Weight Workflow Management System

Peter Muth, Jeanine Weissenfels, Michael Gillmann, Gerhard Weikum

Enterprise-spanning workflows require workflow management systems that can be tailored to specific application needs, as well as enhanced support for interoperability between different workflow...

Recovery Guarantees for Internet Applications

Roger Barga, David Lomet, German Shegalov, Gerhard Weikum

Internet-based e-services require application developers to deal explicitly with failures of the underlying software components, e.g. web servers, servlets, browser sessions, etc. This complicates...