AND GIANSALVATORE MECCA (2008)
Abstract. Information extraction from websites is nowadays a relevant problem, usually performed by software modules called wrappers. A key requirement is that the wrapper generation process should...
1 D.I.F.A. - Università della Basilicata (2007)
Paolo Merialdo, Paolo Atzeni, Valter Crescenzi
Web sites are rapidly becoming a world-wide standard platform for information system development. The paper reports on the work conducted in the last few years in the framework of the Araneus project...
Sihem Amer-yahia, Luis Gravano, Sergey Brin, Taher Haveliwala, Jayavel Shanmugasundaram, Maha Abdallah, ...
Automatic annotation of data extracted from large web sites (2003)
Luigi Arlotta, Valter Crescenzi, Giansalvatore Mecca, Paolo Merialdo
Data extraction from web pages is performed by software modules called wrappers. Recently, some systems for the automatic generation of wrappers have been proposed in the literature. These systems...
RoadRunner: Towards automatic data extraction from large Web sites (2001)
Valter Crescenzi, Giansalvatore Mecca, Paolo Merialdo, ...
The paper investigates techniques for extracting data from HTML sites through the use of automatically generated wrappers. To automate the wrapper generation and the data extraction process, the...
The RoadRunner Project: Towards Automatic Extraction of Web Data (2001)
Valter Crescenzi, Giansalvatore Mecca, Paolo Merialdo
ROADRUNNER is a research project that aims at developing solutions for automatically extracting data from large HTML
The (short) Araneus guide to web-site development (1999)
Paolo Merialdo, Paolo Atzeni, Valter Crescenzi
D.I.F.A. - Università della Basilicata 2
Grammars Have Exceptions (1998)
Valter Crescenzi, Giansalvatore Mecca
Extending database-like techniques to semi-structured and Web data sources is becoming a prominent research field. These data sources are essentially collections of textual documents. Hence, in this...