Panagiotis G. Ipeirotis

Deriving the Pricing Power of Product Features by Mining Consumer Reviews, Working Paper CeDER-07-05, New York University (2009)

Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis

The growing pervasiveness of the Internet has changed the way that consumers shop for goods. Increasingly, user-generated product reviews serve as a valuable source of information for customers making...

Efficient Ranked Queries on Sources with Boolean Query Interfaces (2009)

Hristidis, Vagelis, Hu, Yuheng, Ipeirotis, Panagiotis G.

Many online or local data sources provide powerful querying mechanisms but limited ranking capabilities. For instance, PubMed allows users to submit highly expressive Boolean keyword queries, but...

Stay Elsewhere? Improving Local Search for Hotels Using Econometric Modeling and Image Classification (2009)

Beibei Li, Anindya Ghose, Panagiotis G. Ipeirotis

One of the common Web searches that have a strong local component is the search for hotel accommodation. Customers try to identify hotels that satisfy particular criteria, such as service, food...

Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers (2009)

Victor S. Sheng, Foster Provost, Panagiotis G. Ipeirotis

This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated labeling, and focus...

Classification-Aware Hidden-Web Text Database Selection (2009)

Panagiotis G. Ipeirotis, Luis Gravano

Many valuable text databases on the web have non-crawlable contents that are “hidden” behind search interfaces. Metasearchers are helpful tools for searching over multiple such “hidden-web”...

Answering General Time-Sensitive Queries (2009)

Wisam Dakka, Luis Gravano, Panagiotis G. Ipeirotis

Time is an important dimension of relevance for a large number of searches, such as over blogs and news archives. So far, research on searching over such collections has largely focused on locating...

Designing ranking systems for consumer reviews: The impact of review subjectivity on product sales and review quality (2008)

Anindya Ghose, Panagiotis G. Ipeirotis

With the rapid growth of the Internet, users’ ability to publish content has created active electronic communities that provide a wealth of product information. Consumers naturally gravitate to...

Duplicate Record Detection: A Survey (2008)

Ahmed K. Elmagarmid, Panagiotis G. Ipeirotis, Vassilios S. Verykios

Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a...

Extending SDARTS: Extracting Metadata from Web Databases and Interfacing with the Open Archives Initiative (2008)

Panagiotis G. Ipeirotis

SDARTS is a protocol and toolkit designed to facilitate metasearching. SDARTS combines two complementary existing protocols, SDLIP and STARTS, to define a uniform interface that collections should...

Designing Novel Review Ranking Systems: Predicting Usefulness and Impact of Reviews (2008)

Anindya Ghose, Panagiotis G. Ipeirotis

With the rapid growth of the Internet, users’ ability to publish content has created active electronic communities that provide a wealth of product information. Consumers naturally gravitate to...

Reputation Premiums in Electronic Peer-to-Peer Markets: Analyzing Textual Feedback and Network Structure (2008)

Anindya Ghose, Panagiotis G. Ipeirotis, Arun Sundararajan

Web-based systems that establish reputation are central to the viability of many electronic markets. We present theory that identifies the different dimensions of online reputation and characterizes...

Designing Novel Review Ranking Systems: Predicting the Usefulness and Impact of Reviews (2008)

Anindya Ghose, Panagiotis G. Ipeirotis

With the rapid growth of the Internet, users’ ability to publish content has created active electronic communities that provide a wealth of product information. Consumers naturally gravitate to...

Modeling Volatility in Prediction Markets (2008)

Archak, Nikolay, Ipeirotis, Panagiotis G.

Nowadays, there is significant experimental evidence of excellent ex-post predictive accuracy in certain types of prediction markets, such as markets for elections. This evidence shows that...

Estimating the Socio-Economic Impact of Product Reviews: Mining Text and Reviewer Characteristics (2008)

Ghose, Anindya, Ipeirotis, Panagiotis G.

With the rapid growth of the Internet, the ability of users to create and publish content has created active electronic communities that provide a wealth of product information. However, the high...

Searching Digital Libraries (2008)

Panagiotis G. Ipeirotis

Searching digital libraries refers to searching and retrieving information from remote databases of digitized or digital objects. These databases may hold either the metadata for an object of...

Understanding, Estimating, and Incorporating Output Quality Into Join Algorithms For Information Extraction (2008)

Jain, Alpa, Ipeirotis, Panagiotis G., Gravano, Luis, Doan, Anhai

Information extraction (IE) systems are trained to extract specific relations from text databases. Real-world applications often require that the output of multiple IE systems be joined to produce...

The Dimensions of Reputation in Electronic Markets, Working Paper CeDER-06-02, New York University (2008)

Anindya Ghose, Panagiotis G. Ipeirotis, Arun Sundararajan

We analyze how different dimensions of a seller’s reputation affect pricing power in electronic markets. We do so by using text mining techniques to identify and structure dimensions of importance...

Deriving the Pricing Power of Product Features by Mining Consumer Reviews, Working Paper CeDER-07-05, New York University (2008)

Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis

The increasing pervasiveness of the Internet has dramatically changed the way that consumers shop for goods. Consumer-generated product reviews have become a valuable source of information for...

A Quality-Aware Optimizer for Information Extraction (2008)

Jain, Alpa, Ipeirotis, Panagiotis G.

Large amounts of structured information are buried in unstructured text. Information extraction systems can extract structured relations from the documents and enable sophisticated, SQL-like queries...

Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers (2008)

Sheng, Victor, Provost, Foster, Ipeirotis, Panagiotis G.

This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated labeling, and focus...

Automatic Classification of Text Databases Through Query Probing, CUCS-004-00 (2008)

Panagiotis G. Ipeirotis, Luis Gravano, Mehran Sahami

Many text databases on the web are “hidden” behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such search-only...

Modeling and Managing Content Changes in Text Databases (2008)

Panagiotis G. Ipeirotis, Junghoo Cho, Luis Gravano

Large amounts of (often valuable) information are stored in web-accessible text databases. “Metasearchers” provide unified interfaces to query multiple such databases at once. For efficiency,...

Associate Editors (2008)

Masaru Kitsuregawa, Betty Salzberg, Gonzalo Navarro, Ricardo Baeza-Yates, Erkki Sutinen, Jorma Tarhio, ...

Integrating Diverse Information Management Systems: A Brief Survey

String... (2008)

Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Divesh Srivastava

String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data especially for more complex queries...

To Search or to Crawl? Towards a Query Optimizer for Text-Centric Tasks (2008)

Panagiotis G. Ipeirotis, Pranay Jain

Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive structured...

Modeling and Managing Content Changes in Text Databases (2008)

Panagiotis G. Ipeirotis, Alexandros Ntoulas, Junghoo Cho, Luis Gravano

Large amounts of (often valuable) information are stored in web-accessible text databases. "Metasearchers" provide unified interfaces to query multiple such databases at once. For...

Automatic extraction of useful facet hierarchies from text databases (2008)

Wisam Dakka, Panagiotis G. Ipeirotis

Databases of text and text-annotated data constitute a significant fraction of the information available in electronic form. Searching and browsing are the typical ways that users locate...

Text Joins in an RDBMS for Web Data Integration (2007)

Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas, Divesh Srivastava

The integration of data produced and collected across autonomous, heterogeneous web services is an increasingly important and challenging problem. Due to the lack of global identifiers, the same...

Databases (2007)

Luis Gravano, Panagiotis G. Ipeirotis, Mehran Sahami

The contents of many valuable web-accessible databases are only available through search interfaces and are hence invisible to traditional web “crawlers.” Recently, commercial web sites have...

Using q-grams in a DBMS for Approximate String Processing (2007)

Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Lauri Pietarinen, ...

String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data. This is due, for example, to the...

Extending SDARTS: Extracting Metadata from Web Databases and Interfacing with the Open Archives Initiative (2007)

Panagiotis G. Ipeirotis

SDARTS is a protocol and toolkit designed to facilitate metasearching. SDARTS combines two complementary existing protocols, SDLIP and STARTS, to define a uniform interface that collections should...

String... (2007)

Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Divesh Srivastava

String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data especially for more complex queries...

Probe, Count, and Classify: Categorizing Hidden-Web Databases (2007)

Panagiotis G. Ipeirotis

The contents of many valuable web-accessible databases are only accessible through search interfaces and are hence invisible to traditional web “crawlers.” Recent studies have estimated the size...

Deriving the Pricing Power of Product Features by Mining Consumer Reviews (2007)

Archak, Nikolay, Ghose, Anindya, Ipeirotis, Panagiotis G.

The increasing pervasiveness of the Internet has dramatically changed the way that consumers shop for goods. Consumer-generated product reviews have become a valuable source of information for...

Deriving the Pricing Power of Product Features by Mining Consumer Reviews (2007)

Archak, Nikolay, Ghose, Anindya, Ipeirotis, Panagiotis G.

The growing pervasiveness of the Internet has changed the way that consumers shop for goods. Increasingly, user-generated product reviews serve as a valuable source of information for customers...

Opinion mining using econometrics: A case study on reputation systems (2007)

Anindya Ghose, Panagiotis G. Ipeirotis, Arun Sundararajan

Deriving the polarity and strength of opinions is an important research topic, attracting significant attention over the last few years. In this work, to measure the strength and polarity of an...

Towards a query optimizer for text-centric tasks (2007)

Panagiotis G. Ipeirotis, Pranay Jain, Luis Gravano

Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive structured...

Show me the money! Deriving the pricing power of product features by mining consumer reviews (2007)

Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis

The increasing pervasiveness of the Internet has dramatically changed the way that consumers shop for goods. Consumer-generated product reviews have become a valuable source of information for...

Towards a Query Optimizer for Text-Centric Tasks (2006)

Ipeirotis, Panagiotis G., Agichtein, Eugene, Jain, Pranay, Gravano, Luis

Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive structured...

Modeling and Managing Content Changes in Text Databases (2006)

Ipeirotis, Panagiotis G., Ntoulas, Alexandros, Cho, Junghoo, Gravano, Luis

Large amounts of (often valuable) information are stored in web-accessible text databases. ``Metasearchers'' provide unified interfaces to query multiple such databases at once. For efficiency,...

Modeling and Managing Changes in Text Databases (2006)

Ipeirotis, Panagiotis G., Ntoulas, Alexandros, Cho, Junghoo, Gravano, Luis

Large amounts of (often valuable) information are stored in web-accessible text databases. ``Metasearchers'' provide unified interfaces to query multiple such databases at once. For efficiency,...

Duplicate Record Detection: A Survey (2006)

Elmagarmid, Ahmed, Ipeirotis, Panagiotis G., Verykios, Vassilios

Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task....

Classification-Aware Hidden-Web Text Database Selection (2006)

Ipeirotis, Panagiotis G., Gravano, Luis

Many valuable text databases on the web have non-crawlable contents that are “hidden” behind search interfaces. Metasearchers are helpful tools for searching over multiple such...

Classification-Aware Hidden-Web Text Database Selection (2006)

Ipeirotis, Panagiotis G., Gravano, Luis

Many valuable text databases on the web have non-crawlable contents that are “hidden” behind search interfaces. Metasearchers are helpful tools for searching over...

Classification-Aware Hidden-Web Text Database Selection (2006)

Ipeirotis, Panagiotis G., Gravano, Luis

Many valuable text databases on the web have non-crawlable contents that are ``hidden'' behind search interfaces. Metasearchers are helpful tools for searching over multiple such ``hidden-web'' text...

The Dimensions of Reputation in Electronic Markets (2006)

Ghose, Anindya, Ipeirotis, Panagiotis G., Sundararajan, Arun

We present a framework for identifying the different dimensions of online reputation and characterizing their influence on the pricing power of sellers. Our theory predicts that sellers with better...

Towards a Query Optimizer for Text-Centric Tasks (2006)

Panagiotis G. Ipeirotis, Eugene Agichtein, Luis Gravano, Pranay Jain

Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive structured...

Modeling and managing content changes in text databases (2005)

Panagiotis G. Ipeirotis, Alexandros Ntoulas, Junghoo Cho

Large amounts of (often valuable) information are stored in web-accessible text databases. “Metasearchers” provide unified interfaces to query multiple such databases at once. For efficiency,...

Modeling and managing content changes in text databases (2005)

Panagiotis G. Ipeirotis, Alexandros Ntoulas, Junghoo Cho, Luis Gravano

Large amounts of (often valuable) information are stored in web-accessible text databases. “Metasearchers” provide unified interfaces to query multiple such databases at once. For efficiency,...

Modeling and Managing Content Changes in Text Databases (2004)

Ipeirotis, Panagiotis G., Ntoulas, Alexandros, Cho, Junghoo, Gravano, Luis

Large amounts of (often valuable) information are stored in web-accessible text databases. ``Metasearchers'' provide unified interfaces to query multiple such databases at once. For efficiency,...

When one Sample is not Enough: Improving Text Database Selection Using Shrinkage (2004)

Ipeirotis, Panagiotis G., Gravano, Luis

Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the database contents, which...

Classifying and searching hidden-web text databases (2004)

Ipeirotis, Panagiotis G

The World-Wide Web continues to grow rapidly, which makes exploiting all available information a challenge. Search engines such as Google index an unprecedented amount of information, but still do...

Classifying and Searching Hidden-Web Text Databases (2004)

Panagiotis G. Ipeirotis

The World-Wide Web continues to grow rapidly, which makes exploiting all available information a challenge. Search engines such as Google index an unprecedented amount of information, but still do...

When one sample is not enough: Improving text database selection using shrinkage (2004)

Panagiotis G. Ipeirotis

Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the database contents, which...

When one Sample is not Enough: Improving Text Database Selection Using Shrinkage (2004)

Panagiotis G. Ipeirotis, Luis Gravano

Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the database contents, which...

Text Joins in an RDBMS for Web Data Integration (2003)

Gravano, Luis, Ipeirotis, Panagiotis G., Koudas, Nick, Srivastava, Divesh

The integration of data produced and collected across autonomous, heterogeneous web services is an increasingly important and challenging problem. Due to the lack of global identifiers, the same...

QProber: A system for automatic classification of hidden-web databases (2003)

Panagiotis G. Ipeirotis, Luis Gravano, Mehran Sahami

The contents of many valuable web-accessible databases are only available through search interfaces and are hence invisible to traditional web “crawlers.” Recently, commercial web sites have...

QProber: A system for automatic classification of hidden-web databases (2003)

Luis Gravano, Panagiotis G. Ipeirotis

The contents of many valuable Web-accessible databases are only available through search interfaces and are hence invisible to traditional Web “crawlers.” Recently, commercial Web sites have...

Text Joins for Data Cleansing and Integration in an RDBMS (2003)

Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas, Divesh Srivastava

An organization’s data records are often noisy because of transcription errors, incomplete information, lack of standard formats for textual data or combinations thereof. A fundamental task in a...

Approximate String Joins in a Database (Almost) for Free: Erratum (2003)

Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, ...

... case the result returned by the Figure 1 query is incomplete and suffers from "false negatives," in contrast to our claim to the contrary in [GIJ+01b]. In general, the string pairs that are...

Distributed Search over the Hidden Web: Hierarchical Database Sampling and Selection (2002)

Ipeirotis, Panagiotis G., Gravano, Luis

Many valuable text databases on the web have non-crawlable contents that are 'hidden' behind search interfaces. Metasearchers are helpful tools for searching over many such databases at once through...

Extending SDARTS: Extracting Metadata from Web Databases and Interfacing with the Open Archives Initiative (2002)

Ipeirotis, Panagiotis G., Barry, Tom, Gravano, Luis

SDARTS is a protocol and toolkit designed to facilitate metasearching. SDARTS combines two complementary existing protocols, SDLIP and STARTS, to define a uniform interface that collections should...

Distributed search over the hidden web: Hierarchical database sampling and selection (2002)

Panagiotis G. Ipeirotis, Luis Gravano

Many valuable text databases on the web have non-crawlable contents that are “hidden” behind search interfaces. Metasearchers are helpful tools for searching over many such databases at once...

Classification-Aware Hidden-Web Text Database Selection (2002)

Panagiotis G. Ipeirotis, Luis Gravano

Many valuable text databases on the web have non-crawlable contents that are “hidden” behind search interfaces. Metasearchers are helpful tools for searching over multiple such “hidden-web”...

Summarizing and Searching Hidden-Web Databases Hierarchically Using Focused Probes (2001)

Ipeirotis, Panagiotis G., Gravano, Luis

Many valuable text databases on the web have non-crawlable contents that are ``hidden'' behind search interfaces. Metasearchers are helpful tools for searching over many such databases at once...

QProber: A System for Automatic Classification of Hidden-Web Resources (2001)

Ipeirotis, Panagiotis G., Gravano, Luis, Sahami, Mehran

The contents of many valuable web-accessible databases are only available through search interfaces and are hence invisible to traditional web ``crawlers.'' Recently, commercial web sites have...

Approximate string joins in a database (almost) for free (2001)

Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Divesh Srivastava

In [GIJ+01a, GIJ+01b] we described how to use q-grams in an RDBMS to perform approximate string joins. We also showed how to implement the approximate join using plain SQL queries. Specifically,...

Using q-grams in a DBMS for Approximate String Processing (2001)

Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Lauri Pietarinen, ...

String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data. This is due, for example, to the...

PERSIVAL demo: Categorizing hidden-Web resources (2001)

Panagiotis G. Ipeirotis

The information available in electronic form continues to grow at an exponential rate and this trend is expected to continue. Although traditional search engines like AltaVista can address common...

Approximate string joins in a database (almost) for free (2001)

Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Divesh Srivastava

String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data especially for more complex queries...

Automatic classification of text databases through query probing (2000)

Panagiotis G. Ipeirotis

Many text databases on the web are “hidden” behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such search-only...

Automatic Classification of Text Databases through Query Probing (2000)

Panagiotis G. Ipeirotis, Luis Gravano, Mehran Sahami

Many text databases on the web are "hidden" behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such search-only...

Automatic Classification of Text Databases through Query Probing (2000)

Panagiotis G. Ipeirotis, Luis Gravano, Mehran Sahami

Many text databases on the web are "hidden" behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such...

Automatic Classification of Text Databases through Query Probing (2000)

Panagiotis G. Ipeirotis, Luis Gravano, Mehran Sahami

Many text databases on the web are hidden behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such search-only...

Deriving the Pricing Power of Product Features by Mining Consumer Reviews

Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis

The increasing pervasiveness of the Internet has dramatically changed the way that consumers shop for goods. Consumer-generated product reviews have become a valuable source of information for...