Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis
The growing pervasiveness of the Internet has changed the way that consumers shop for goods. Increasingly, user-generated product reviews serve as a valuable source of information for customers making...
Efficient Ranked Queries on Sources with Boolean Query Interfaces (2009)
Hristidis, Vagelis, Hu, Yuheng, Ipeirotis, Panagiotis G.
Many online or local data sources provide powerful querying mechanisms but limited ranking capabilities. For instance, PubMed allows users to submit highly expressive Boolean keyword queries, but...
Beibei Li, Anindya Ghose, Panagiotis G. Ipeirotis
One of the common Web searches that have a strong local component is the search for hotel accommodation. Customers try to identify hotels that satisfy particular criteria, such as service, food...
Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers (2009)
Victor S. Sheng, Foster Provost, Panagiotis G. Ipeirotis
This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated labeling, and focus...
Classification-Aware Hidden-Web Text Database Selection (2009)
Panagiotis G. Ipeirotis, Luis Gravano
Many valuable text databases on the web have non-crawlable contents that are “hidden” behind search interfaces. Metasearchers are helpful tools for searching over multiple such “hidden-web”...
Answering General Time-Sensitive Queries (2009)
Wisam Dakka, Luis Gravano, Panagiotis G. Ipeirotis
Time is an important dimension of relevance for a large number of searches, such as over blogs and news archives. So far, research on searching over such collections has largely focused on locating...
Anindya Ghose, Panagiotis G. Ipeirotis
With the rapid growth of the Internet, users’ ability to publish content has created active electronic communities that provide a wealth of product information. Consumers naturally gravitate to...
Duplicate Record Detection: A Survey (2008)
Ahmed K. Elmagarmid, Panagiotis G. Ipeirotis, Vassilios S. Verykios
Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a...
SDARTS is a protocol and toolkit designed to facilitate metasearching. SDARTS combines two complementary existing protocols, SDLIP and STARTS, to define a uniform interface that collections should...
Designing Novel Review Ranking Systems: Predicting Usefulness and Impact of Reviews (2008)
Anindya Ghose, Panagiotis G. Ipeirotis
With the rapid growth of the Internet, users’ ability to publish content has created active electronic communities that provide a wealth of product information. Consumers naturally gravitate to...
Anindya Ghose, Panagiotis G. Ipeirotis, Arun Sundararajan
Web-based systems that establish reputation are central to the viability of many electronic markets. We present theory that identifies the different dimensions of online reputation and characterizes...
Modeling Volatility in Prediction Markets (2008)
Archak, Nikolay, Ipeirotis, Panagiotis G.
Nowadays, there is significant experimental evidence of excellent ex-post predictive accuracy in certain types of prediction markets, such as markets for elections. This evidence shows that...
Ghose, Anindya, Ipeirotis, Panagiotis G.
With the rapid growth of the Internet, the ability of users to create and publish content has created active electronic communities that provide a wealth of product information. However, the high...
Searching Digital Libraries (2008)
Searching digital libraries refers to searching and retrieving information from remote databases of digitized or digital objects. These databases may hold either the metadata for an object of...
Jain, Alpa, Ipeirotis, Panagiotis G., Gravano, Luis, Doan, Anhai
Information extraction (IE) systems are trained to extract specific relations from text databases. Real-world applications often require that the output of multiple IE systems be joined to produce...
Anindya Ghose, Panagiotis G. Ipeirotis, Arun Sundararajan
We analyze how different dimensions of a seller’s reputation affect pricing power in electronic markets. We do so by using text mining techniques to identify and structure dimensions of importance...
Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis
The increasing pervasiveness of the Internet has dramatically changed the way that consumers shop for goods. Consumer-generated product reviews have become a valuable source of information for...
A Quality-Aware Optimizer for Information Extraction (2008)
Jain, Alpa, Ipeirotis, Panagiotis G.
Large amounts of structured information are buried in unstructured text. Information extraction systems can extract structured relations from the documents and enable sophisticated, SQL-like queries...
Get Another Label? Improving Data Quality and Data Mining Using Multiple, Noisy Labelers (2008)
Sheng, Victor, Provost, Foster, Ipeirotis, Panagiotis G.
This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated labeling, and focus...
CUCS-004-00 Automatic Classification of Text Databases Through Query Probing (2008)
Panagiotis G. Ipeirotis, Luis Gravano, Mehran Sahami
Many text databases on the web are “hidden” behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such search-only...
Microsoft Search Labs and (2008)
Panagiotis G. Ipeirotis, Junghoo Cho, Luis Gravano
Large amounts of (often valuable) information are stored in web-accessible text databases. “Metasearchers” provide unified interfaces to query multiple such databases at once. For efficiency,...
Masaru Kitsuregawa, Betty Salzberg, Gonzalo Navarro, Ricardo Baeza-yates, Erkki Sutinen, Jorma Tarhio, ...
Integrating Diverse Information Management Systems: A Brief Survey
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Divesh Srivastava
String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data especially for more complex queries...
To Search or to Crawl? Towards a Query Optimizer for Text-Centric Tasks (2008)
Panagiotis G. Ipeirotis, Pranay Jain
Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive structured...
Into A Hierarchical, John Hopkins, Aids Service, Heart Matches, Panagiotis G. Ipeirotis, Luis Gravano, ...
{ aids, hiv AND infection, …}
Panagiotis G. Ipeirotis, Alexandros Ntoulas, Junghoo Cho, Luis Gravano
Large amounts of (often valuable) information are stored in web-accessible text databases. “Metasearchers” provide unified interfaces to query multiple such databases at once. For efficiency,...
Modeling and Managing Content Changes in Text Databases (2008)
Panagiotis G. Ipeirotis, Alexandros Ntoulas, Junghoo Cho, Luis Gravano
Large amounts of (often valuable) information are stored in web-accessible text databases. "Metasearchers" provide unified interfaces to query multiple such databases at once. For...
Automatic extraction of useful facet hierarchies from text databases (2008)
Wisam Dakka, Panagiotis G. Ipeirotis
Databases of text and text-annotated data constitute a significant fraction of the information available in electronic form. Searching and browsing are the typical ways that users locate...
Text Joins in an RDBMS for Web Data Integration (2007)
Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas, Divesh Srivastava
The integration of data produced and collected across autonomous, heterogeneous web services is an increasingly important and challenging problem. Due to the lack of global identifiers, the same...
Luis Gravano, Panagiotis G. Ipeirotis, Mehran Sahami
The contents of many valuable web-accessible databases are only available through search interfaces and are hence invisible to traditional web “crawlers.” Recently, commercial web sites have...
Using q-grams in a DBMS for Approximate String Processing (2007)
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Lauri Pietarinen, ...
String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data. This is due, for example, to the...
Probe, Count, and Classify: Categorizing Hidden-Web Databases (2007)
The contents of many valuable web-accessible databases are only accessible through search interfaces and are hence invisible to traditional web “crawlers.” Recent studies have estimated the size...
Deriving the Pricing Power of Product Features by Mining Consumer Reviews (2007)
Archak, Nikolay, Ghose, Anindya, Ipeirotis, Panagiotis G.
The increasing pervasiveness of the Internet has dramatically changed the way that consumers shop for goods. Consumer-generated product reviews have become a valuable source of information for...
Deriving the Pricing Power of Product Features by Mining Consumer Reviews (2007)
Archak, Nikolay, Ghose, Anindya, Ipeirotis, Panagiotis G.
The growing pervasiveness of the Internet has changed the way that consumers shop for goods. Increasingly, user-generated product reviews serve as a valuable source of information for customers...
Opinion mining using econometrics: A case study on reputation systems (2007)
Anindya Ghose, Panagiotis G. Ipeirotis, Arun Sundararajan
Deriving the polarity and strength of opinions is an important research topic, attracting significant attention over the last few years. In this work, to measure the strength and polarity of an...
Towards a query optimizer for text-centric tasks (2007)
Panagiotis G. Ipeirotis, Pranay Jain, Luis Gravano
Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive structured...
Show me the money! Deriving the pricing power of product features by mining consumer reviews (2007)
Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis
The increasing pervasiveness of the Internet has dramatically changed the way that consumers shop for goods. Consumer-generated product reviews have become a valuable source of information for...
Towards a Query Optimizer for Text-Centric Tasks (2006)
Ipeirotis, Panagiotis G., Agichtein, Eugene, Jain, Pranay, Gravano, Luis
Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive structured...
Modeling and Managing Content Changes in Text Databases (2006)
Ipeirotis, Panagiotis G., Ntoulas, Alexandros, Cho, Junghoo, Gravano, Luis
Large amounts of (often valuable) information are stored in web-accessible text databases. ``Metasearchers'' provide unified interfaces to query multiple such databases at once. For efficiency,...
Modeling and Managing Changes in Text Databases (2006)
Ipeirotis, Panagiotis G., Ntoulas, Alexandros, Cho, Junghoo, Gravano, Luis
Large amounts of (often valuable) information are stored in web-accessible text databases. ``Metasearchers'' provide unified interfaces to query multiple such databases at once. For efficiency,...
Duplicate Record Detection: A Survey (2006)
Elmagarmid, Ahmed, Ipeirotis, Panagiotis G., Verykios, Vassilios
Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task....
Classification-Aware Hidden-Web Text Database Selection (2006)
Ipeirotis, Panagiotis G., Gravano, Luis
Many valuable text databases on the web have non-crawlable contents that are "hidden" behind search interfaces. Metasearchers are helpful tools for searching over multiple such...
Classification-Aware Hidden-Web Text Database Selection (2006)
Ipeirotis, Panagiotis G., Gravano, Luis
Many valuable text databases on the web have non-crawlable contents that are "hidden" behind search interfaces. Metasearchers are helpful tools for searching over...
Classification-Aware Hidden-Web Text Database Selection (2006)
Ipeirotis, Panagiotis G., Gravano, Luis
Many valuable text databases on the web have non-crawlable contents that are ``hidden'' behind search interfaces. Metasearchers are helpful tools for searching over multiple such ``hidden-web'' text...
The Dimensions of Reputation in Electronic Markets (2006)
Ghose, Anindya, Ipeirotis, Panagiotis G., Sundararajan, Arun
We present a framework for identifying the different dimensions of online reputation and characterizing their influence on the pricing power of sellers. Our theory predicts that sellers with better...
Panagiotis G. Ipeirotis, Eugene Agichtein, Luis Gravano, Pranay Jain
Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive structured...
Modeling and managing content changes in text databases (2005)
Panagiotis G. Ipeirotis, Alexandros Ntoulas, Junghoo Cho
Large amounts of (often valuable) information are stored in web-accessible text databases. “Metasearchers” provide unified interfaces to query multiple such databases at once. For efficiency,...
Modeling and managing content changes in text databases (2005)
Panagiotis G. Ipeirotis, Alexandros Ntoulas, Junghoo Cho, Luis Gravano
Large amounts of (often valuable) information are stored in web-accessible text databases. “Metasearchers” provide unified interfaces to query multiple such databases at once. For efficiency,...
Modeling and Managing Content Changes in Text Databases (2004)
Ipeirotis, Panagiotis G., Ntoulas, Alexandros, Cho, Junghoo, Gravano, Luis
Large amounts of (often valuable) information are stored in web-accessible text databases. ``Metasearchers'' provide unified interfaces to query multiple such databases at once. For efficiency,...
When one Sample is not Enough: Improving Text Database Selection Using Shrinkage (2004)
Ipeirotis, Panagiotis G., Gravano, Luis
Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the database contents, which...
Classifying and searching hidden-web text databases (2004)
The World-Wide Web continues to grow rapidly, which makes exploiting all available information a challenge. Search engines such as Google index an unprecedented amount of information, but still do...
Classifying and Searching Hidden-Web Text Databases (2004)
Panagiotis G. Ipeirotis
The World-Wide Web continues to grow rapidly, which makes exploiting all available information a challenge. Search engines such as Google index an unprecedented amount of information, but still do...
When one sample is not enough: Improving text database selection using shrinkage (2004)
Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the database contents, which...
When one Sample is not Enough: Improving Text Database Selection Using Shrinkage (2004)
Panagiotis G. Ipeirotis, Luis Gravano
Database selection is an important step when searching over large numbers of distributed text databases. The database selection task relies on statistical summaries of the database contents, which...
Text Joins in an RDBMS for Web Data Integration (2003)
Gravano, Luis, Ipeirotis, Panagiotis G., Koudas, Nick, Srivastava, Divesh
The integration of data produced and collected across autonomous, heterogeneous web services is an increasingly important and challenging problem. Due to the lack of global identifiers, the same...
QProber: A system for automatic classification of hidden-web databases (2003)
Panagiotis G. Ipeirotis, Luis Gravano, Mehran Sahami
The contents of many valuable web-accessible databases are only available through search interfaces and are hence invisible to traditional web “crawlers.” Recently, commercial web sites have...
QProber: A system for automatic classification of hidden-web databases (2003)
Luis Gravano, Panagiotis G. Ipeirotis
The contents of many valuable Web-accessible databases are only available through search interfaces and are hence invisible to traditional Web “crawlers.” Recently, commercial Web sites have...
Text Joins for Data Cleansing and Integration in an RDBMS (2003)
Luis Gravano, Panagiotis G. Ipeirotis, Nick Koudas, Divesh Srivastava
An organization’s data records are often noisy because of transcription errors, incomplete information, lack of standard formats for textual data or combinations thereof. A fundamental task in a...
Approximate String Joins in a Database (Almost) for Free: Erratum (2003)
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, ...
case the result returned by the Figure 1 query is incomplete and suffers from "false negatives," in contrast to our claim to the contrary in [GIJ+01b]. In general, the string pairs that are...
Distributed Search over the Hidden Web: Hierarchical Database Sampling and Selection (2002)
Ipeirotis, Panagiotis G., Gravano, Luis
Many valuable text databases on the web have non-crawlable contents that are 'hidden' behind search interfaces. Metasearchers are helpful tools for searching over many such databases at once through...
Ipeirotis, Panagiotis G., Barry, Tom, Gravano, Luis
SDARTS is a protocol and toolkit designed to facilitate metasearching. SDARTS combines two complementary existing protocols, SDLIP and STARTS, to define a uniform interface that collections should...
Distributed search over the hidden web: Hierarchical database sampling and selection (2002)
Panagiotis G. Ipeirotis, Luis Gravano
Many valuable text databases on the web have non-crawlable contents that are “hidden” behind search interfaces. Metasearchers are helpful tools for searching over many such databases at once...
Classification-Aware Hidden-Web Text Database Selection (2002)
Panagiotis G. Ipeirotis, Luis Gravano
Many valuable text databases on the web have non-crawlable contents that are “hidden” behind search interfaces. Metasearchers are helpful tools for searching over multiple such “hidden-web”...
Summarizing and Searching Hidden-Web Databases Hierarchically Using Focused Probes (2001)
Ipeirotis, Panagiotis G., Gravano, Luis
Many valuable text databases on the web have non-crawlable contents that are ``hidden'' behind search interfaces. Metasearchers are helpful tools for searching over many such databases at once...
QProber: A System for Automatic Classification of Hidden-Web Resources (2001)
Ipeirotis, Panagiotis G., Gravano, Luis, Sahami, Mehran
The contents of many valuable web-accessible databases are only available through search interfaces and are hence invisible to traditional web ``crawlers.'' Recently, commercial web sites have...
Approximate string joins in a database (almost) for free (2001)
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Divesh Srivastava
In [GIJ+01a, GIJ+01b] we described how to use q-grams in an RDBMS to perform approximate string joins. We also showed how to implement the approximate join using plain SQL queries. Specifically,...
Using q-grams in a DBMS for Approximate String Processing (2001)
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Lauri Pietarinen, ...
String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data. This is due, for example, to the...
PERSIVAL demo: Categorizing hidden-Web resources (2001)
The information available in electronic form continues to grow at an exponential rate and this trend is expected to continue. Although traditional search engines like AltaVista can address common...
Approximate string joins in a database (almost) for free (2001)
Luis Gravano, Panagiotis G. Ipeirotis, H. V. Jagadish, Nick Koudas, S. Muthukrishnan, Divesh Srivastava
String data is ubiquitous, and its management has taken on particular importance in the past few years. Approximate queries are very important on string data especially for more complex queries...
Automatic classification of text databases through query probing (2000)
Many text databases on the web are “hidden” behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such search-only...
Automatic Classification of Text Databases through Query Probing (2000)
Panagiotis G. Ipeirotis, Luis Gravano, Mehran Sahami
Many text databases on the web are "hidden" behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such search-only...
Automatic Classification of Text Databases through Query Probing (2000)
Panagiotis G. Ipeirotis, Luis Gravano, Mehran Sahami
Many text databases on the web are "hidden" behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such...
Automatic Classification of Text Databases through Query Probing (2000)
Panagiotis G. Ipeirotis, Luis Gravano, Mehran Sahami
Many text databases on the web are hidden behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such search-only...
Deriving the Pricing Power of Product Features by Mining Consumer Reviews
Nikolay Archak, Anindya Ghose, Panagiotis G. Ipeirotis
The increasing pervasiveness of the Internet has dramatically changed the way that consumers shop for goods. Consumer-generated product reviews have become a valuable source of information for...