Sophia Ananiadou

BioMed Central Research How to make the most of NE dictionaries in statistical NER (2009)

BMC Bioinformatics, Yutaka Sasaki, Yoshimasa Tsuruoka, John McNaught, Sophia Ananiadou, ...

growth of a wide range of repositories of biomedical data and literature. The automatic construction and update of scientific knowledge bases is a major research topic in Biofrom

Construction of an annotated corpus to support biomedical information extraction (2009)

Thompson, Paul, Iqbal, Syed A, McNaught, John, Ananiadou, Sophia

Abstract Background Information Extraction (IE) is a component of text mining that facilitates knowledge discovery by automatically locating instances of interesting biomedical events from huge...

An Annotation Type System for a Data-Driven NLP Pipeline (2009)

Udo Hahn, Ekaterina Buyko, Katrin Tomanek, Scott Piao, John McNaught, Yoshimasa Tsuruoka, ...

We introduce an annotation type system for a data-driven NLP core system. The specifications cover formal document structure and document meta information, as well as the linguistic levels of...

U-Compare: share and compare text mining tools with UIMA (2009)

Kano, Yoshinobu, Baumgartner, William A., McCrohon, Luke, Ananiadou, Sophia, Cohen, K. Bretonnel, Hunter, Lawrence, ...

Summary: Due to the increasing number of text mining resources (tools and corpora) available to biologists, interoperability issues between these resources are becoming significant obstacles to using...

Accelerating the annotation of sparse named entities by dynamic sentence selection (2008)

Tsuruoka, Yoshimasa, Tsujii, Jun'ichi, Ananiadou, Sophia

Abstract Background Previous studies of named entity recognition have shown that a reasonable level of recognition accuracy can be achieved by using machine learning models such as conditional random...

How to make the most of NE dictionaries in statistical NER (2008)

Sasaki, Yutaka, Tsuruoka, Yoshimasa, McNaught, John, Ananiadou, Sophia

Abstract Background When term ambiguity and variability are very high, dictionary-based Named Entity Recognition ( NER ) is not an ideal solution even though large-scale terminological resources are...

BOOTStrep Annotation Scheme – Encoding Information for Text Mining (2008)

Scott Piao, Ekaterina Buyko, Yoshimasa Tsuruoka, Katrin Tomanek, John McNaught, Udo Hahn, ...

Annotation of information in corpora is an important aspect of text mining. It bridges between the information hidden in natural language texts and the semantic search queries for the information...

Text Mining Services to Support E-Research (2008)

Brian Rea, Sophia Ananiadou

In recent years the developments and opportunities created for e-Science infrastructure have promised technological support for the ever growing area of text mining applications and services. The...

Normalizing biomedical terms by minimizing ambiguity and variability (2008)

Tsuruoka, Yoshimasa, McNaught, John, Ananiadou, Sophia

Abstract Background One of the difficulties in mapping biomedical named entities, e.g. genes, proteins, chemicals and diseases, to their concept identifiers stems from the potential variability of...

BIOINFORMATICS Bioinformatics Advance Access published February 22, 2005 MaSTerClass: a case-based reasoning system for the classification of biomedical terms (2008)

Irena Spasic, Sophia Ananiadou, Jun-ichi Tsujii

Motivation: The sheer volume of textually described biomedical knowledge exerts the need for natural language processing (NLP) applications in order to allow flexible and efficient access to relevant...

MaSTerClass: a case-based reasoning system for the classification of biomedical terms (2008)

Irena Spasic, Sophia Ananiadou, Junichi Tsujii

Mining semantically related terms from biomedical literature (2008)

Goran Nenadić, Sophia Ananiadou

Discovering links and relationships is one of the main challenges in biomedical research, as scientists are interested in uncovering entities that have similar functions, take part in the same...

Clustering acronyms in biomedical text for disambiguation (2008)

Naoaki Okazaki, Sophia Ananiadou

Given the increasing number of neologisms in biomedicine (names of genes, diseases, molecules, etc.), the rate of acronyms used in literature also increases. Existing acronym dictionaries cannot keep...

FILLING THE GAPS BETWEEN TOOLS AND USERS: A TOOL COMPARATOR, USING PROTEIN-PROTEIN INTERACTION AS AN EXAMPLE (2008)

Yusuke Miyao, Yoshimasa Tsuruoka, Yuichiro Matsubayashi, Sophia Ananiadou

Recently, several text mining programs have reached a near-practical level of performance. Some systems are already being used by biologists and database curators. However, it has also been...

Reviewed by (2008)

Christian Jacquemin, Sophia Ananiadou

is a much needed and most welcome addition to the field of computational terminology. The central issue of this book is the in-depth examination of term variation, that is, the morphological,...

£53.00 Reviewed by (2008)

Sophia Ananiadou, John McNaught (editors), Nikiforos Karamanis

Text mining is defined by Hearst (1999) as the automatic discovery of new, previously unknown, information from unstructured textual data. This is often seen as comprising three major tasks:...

Abstract (2008)

Classifying Technical Terms, Katerina T. Frantzi, Junichi Tsujii, Sophia Ananiadou

Automating the process of term recognition and classification is important for digital libraries. Automatic Term Recognition (ATR) has many applications in areas related to digital libraries, e.g....

FACTA: a text search engine for finding associated biomedical concepts (2008)

Tsuruoka, Yoshimasa, Tsujii, Jun'ichi, Ananiadou, Sophia

Summary: FACTA is a text search engine for MEDLINE abstracts, which is designed particularly to help users browse biomedical concepts (e.g. genes/proteins, diseases, enzymes and chemical compounds)...

Manchester Metropolitan University (2007)

Katerina T. Frantzi, Sophia Ananiadou

This paper provides an approach to the semi-automatic extraction of collocations from corpora using statistics. The growing availability of large textual corpora, and...

An Integrated Term-Based Corpus Query System (2007)

Irena Spasic, Goran Nenadic, Kostas Manios, Sophia Ananiadou

In this paper we describe the X-TRACT workbench, which enables efficient term-based querying against a domain-specific literature corpus. Its main aim is to aid domain specialists in locating and...

Learning string similarity measures for gene/protein name dictionary look-up using logistic regression (2007)

Tsuruoka, Yoshimasa, McNaught, John, Tsujii, Jun'ichi, Ananiadou, Sophia

Motivation: One of the bottlenecks of biomedical data integration is variation of terms. Exact string matching often fails to associate a name with its biological concept, i.e. ID or accession number...

Organizers (2007)

Demonstration Sessions, Sophia Ananiadou

The demonstrations session was held from the 25th to the 27th of June 2007 in Prague. This year we had 113 submissions, out of which 61 were selected for presentation, resulting in a 54% acceptance rate. The...

Building an abbreviation dictionary using a term recognition approach (2006)

Naoaki Okazaki, Sophia Ananiadou

Bioinformatics, Original Paper, Data and text mining, Vol. 22 no. 24 2006, pages 3089–3095, doi:10.1093/bioinformatics/btl534

Mining Opinion Polarity Relations of Citations (2006)

Scott S. Piao, Sophia Ananiadou, Yoshimasa Tsuruoka, Yutaka Sasaki, John McNaught

Opinion mining has been receiving increasing attention recently, and various approaches have been suggested for mining sentiment information, such as mining attitudes or opinions about a topic or...

Text mining and ontologies in biomedicine: making sense of raw text (2005)

Irena Spasic, Sophia Ananiadou, John McNaught

The volume of biomedical literature is increasing at such a rate that it is becoming difficult to locate, retrieve and manage the reported information without text mining, which aims to automatically...

MaSTerClass: a case-based reasoning system for the classification of biomedical terms (2005)

Spasic, Irena, Ananiadou, Sophia, Tsujii, Junichi

Motivation: The sheer volume of textually described biomedical knowledge exerts the need for natural language processing (NLP) applications in order to allow flexible and efficient access to relevant...

Text mining and ontologies in biomedicine: Making sense of raw text (2005)

Spasic, Irena, Ananiadou, Sophia, McNaught, John, Kumar, Anand

The volume of biomedical literature is increasing at such a rate that it is becoming difficult to locate, retrieve and manage the reported information without text mining, which aims to automatically...

Learning to classify biomedical terms through literature mining and genetic algorithms (2004)

Irena Spasić, Goran Nenadić, Sophia Ananiadou

Abstract. We present an approach to classification of biomedical terms based on the information acquired automatically from the corpus of relevant literature. The learning phase consists of two...

Selecting text features for gene name classification: from documents to terms (2003)

Goran Nenadić, Simon Rice, Irena Spasić, Sophia Ananiadou, Benjamin Stapley

In this paper we discuss the performance of a text-based classification approach by comparing different types of features. We consider the automatic classification of gene names from the molecular...

Terminology-driven mining of biomedical literature (2003)

Nenadic, Goran, Spasic, Irena, Ananiadou, Sophia

Motivation: With an overwhelming amount of textual information in molecular biology and biomedicine, there is a need for effective literature mining techniques that can help biologists to gather and...

Automatic Discovery of Term Similarities Using Pattern Mining (2002)

Goran Nenadić, Irena Spasić, Sophia Ananiadou

Term recognition and clustering are key topics in automatic knowledge acquisition and text mining. In this paper we present a novel approach to the automatic discovery of term similarities, which...

Automatic acronym acquisition and term variation management within domain specific texts (2002)

Goran Nenadić, Irena Spasić, Sophia Ananiadou

In this paper we present a framework for the effective management of terms and their variants that are automatically acquired from domain-specific texts. In our approach, the term variant recognition...

A Methodology for Terminology-based Knowledge Acquisition and Integration (2002)

Hideki Mima, Sophia Ananiadou, Goran Nenadic, Junichi Tsujii

In this paper we propose an integrated knowledge management system in which terminology-based knowledge acquisition, knowledge integration, and XML-based knowledge retrieval are combined using tag...

Terminological acquaintance: the importance of contextual information in terminology (2000)

Diana Maynard, Sophia Ananiadou

This paper examines an idea we call terminological acquaintance, which considers the importance of contextual information for various applications in NLP. The importance of contextual information is...

Identifying Terms by Their Family and Friends (2000)

Diana Maynard, Sophia Ananiadou

Multi-word terms are traditionally identified using statistical techniques or, more recently, using hybrid techniques combining statistics with shallow linguistic information. Approaches to word sense...

A linguistic approach to terminological context clustering (1999)

Diana Maynard, Sophia Ananiadou

Clustering mechanisms are important for many NLP tasks such as knowledge acquisition, term extraction and disambiguation, machine translation and ontology building. Our approach focuses on the...

Identifying Contextual Information for Multi-Word Term Extraction (1999)

Diana Maynard, Sophia Ananiadou

Methods for multi-word term extraction have traditionally involved statistical techniques. More recently, hybrid techniques have been evolving which incorporate some linguistic knowledge. This...

Term Extraction using a Similarity-based Approach (1999)

Diana Maynard, Sophia Ananiadou

Traditional methods of multi-word term extraction have used hybrid methods combining linguistic and statistical information. The linguistic part of these applications is often underexploited and...

Term Sense Disambiguation using a Domain-Specific Thesaurus (1998)

Diana Maynard, Sophia Ananiadou

Term extraction is important for many information systems applications. Although terms should be monoreferential, in reality they exhibit a high degree of ambiguity. Whilst conventional solutions...

Acquiring contextual information for term disambiguation (1998)

Diana Maynard, Sophia Ananiadou

Term extraction is important for many information systems applications. Although terms should be monoreferential, they actually exhibit a high degree of ambiguity. This paper describes a method for...

Term Sense Disambiguation using a Domain-Specific Thesaurus (1998)

Diana Maynard, Sophia Ananiadou

Term extraction is important for many information systems applications. Although terms should be monoreferential, in reality they exhibit a high degree of ambiguity. This paper describes a method for...

Automatic term recognition using contextual cues (1997)

Katerina T. Frantzi, Sophia Ananiadou

In this paper we present an approach for the extraction of multi-word terms from special language corpora. The new element is the incorporation of context information for the evaluation of candidate...
