Annotation Graphs: A Foundation for Integrating Tools, Formats and Corpora (2007)

Abstract
In recent work we have presented a formal framework for linguistic annotations using labeled acyclic digraphs. These 'annotation graphs' offer a simple yet powerful method for representing complex annotation structures incorporating hierarchy and overlap. We illustrate some applications to existing discourse-level annotations of text and speech data. Annotation graphs are capable of representing the structure and content of a diverse range of formats, and this opens the door to wide-ranging integration of tools and corpora. We show how the approach facilitates substantive comparison of annotations expressed in different formats and how it permits queries on corpora which have been annotated at multiple levels using different coding standards and tools. Finally, we describe our philosophy on tool development.

1 Annotation Graphs

When we examine the kinds of speech transcription and annotation found in many existing 'communities of practice', we see commonality of abstract form along wi...

Publication details
Download: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.49.3484
Source: http://www.ldc.upenn.edu/sb/papers/discourse99/discourse99.ps.Z
Contributors: CiteSeerX
Repository: CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Type: text
Language: English
Relation: 10.1.1.30.7806, 10.1.1.47.4813, 10.1.1.57.160