Karthik Balakrishnan, Anindya Ghose, Panagiotis Ipeirotis
The Sarbanes-Oxley (SOX) Act of 2002 is one of the, if not the, most important pieces of legislation affecting corporations traded on the U.S. stock exchanges. While SOX does not explicitly address...
Building Query Optimizers for Information Extraction: The SQoUT Project (2009)
Alpa Jain, Panagiotis Ipeirotis, Luis Gravano
Text documents often embed data that is structured in nature. This structured data is increasingly exposed using information extraction systems, which generate structured relations from documents,...
Yin Yang, Nilesh Bansal, Wisam Dakka, Panagiotis Ipeirotis, Nick Koudas, Dimitris Papadias
We are experiencing an unprecedented increase of content contributed by users in forums such as blogs, social networking sites and microblogging services. Such abundance of content complements...
Modeling Query-Based Access to Text Databases (2008)
Eugene Agichtein, Panagiotis Ipeirotis, Luis Gravano
Searchable text databases abound on the web. Applications that require access to such databases often resort to querying to extract relevant documents because of two main reasons. First, some text...
Classification-Aware Hidden-Web Text Database Selection (2008)
Panagiotis Ipeirotis, Luis Gravano
Many valuable text databases on the web have noncrawlable contents that are “hidden” behind search interfaces. Metasearchers are helpful tools for searching over multiple such “hidden-web”...
Get Another Label? Improving Data Quality and Data Mining (2008)
Victor Sheng, Foster Provost, Panagiotis Ipeirotis
This paper addresses the repeated acquisition of labels for data items when the labeling is imperfect. We examine the improvement (or lack thereof) in data quality via repeated labeling, and focus...
Towards a Query Optimizer for Text-Centric Tasks (2007)
Panagiotis Ipeirotis, Eugene Agichtein, Pranay Jain, Luis Gravano
Text is ubiquitous and, not surprisingly, many important applications rely on textual data for a variety of tasks. As a notable example, information extraction applications derive structured...
Modeling and Managing Changes in Text Databases (2007)
Panagiotis Ipeirotis, Alexandros Ntoulas, Junghoo Cho, Luis Gravano
Large amounts of (often valuable) information are stored in web-accessible text databases. “Metasearchers” provide unified interfaces to query multiple such databases at once. For efficiency,...
Classifying and searching hidden-web text databases (2004)
Department: Computer Science.
QProber: A System for Automatic Classification of Hidden-Web Databases (2003)
Panagiotis Ipeirotis, Luis Gravano
The contents of many valuable Web-accessible databases are only available through search interfaces and are hence invisible to traditional Web “crawlers.” Recently, commercial Web sites have...
Modeling query-based access to text databases (2003)
Eugene Agichtein, Panagiotis Ipeirotis, Luis Gravano
Searchable text databases abound on the web. Applications that require access to such databases often resort to querying to extract relevant documents because of two main reasons. First, some text...
Automatic Classification of Text Databases Through Query Probing (2000)
Panagiotis Ipeirotis, Luis Gravano, Mehran Sahami
Many text databases on the web are 'hidden' behind search interfaces, and their documents are only accessible through querying. Search engines typically ignore the contents of such search-only...