How to make the most of NE dictionaries in statistical NER (2009)
Yutaka Sasaki, Yoshimasa Tsuruoka, John McNaught, Sophia Ananiadou
BMC Bioinformatics (BioMed Central), Research. ...growth of a wide range of repositories of biomedical data and literature. The automatic construction and update of scientific knowledge bases is a major research topic in Bio...
Construction of an annotated corpus to support biomedical information extraction (2009)
Thompson, Paul, Iqbal, Syed A, McNaught, John, Ananiadou, Sophia
Background: Information Extraction (IE) is a component of text mining that facilitates knowledge discovery by automatically locating instances of interesting biomedical events from huge...
Poster Paper Evaluation of Automatic Term Recognition of Nuclear Receptors from MEDLINE (2009)
Sophia Ananiadou, Sylvie Albert, Dietrich Schuhmann
Keywords: automatic term recognition, C/NC-value approach
An Annotation Type System for a Data-Driven NLP Pipeline (2009)
Udo Hahn, Ekaterina Buyko, Katrin Tomanek, Scott Piao, John McNaught, Yoshimasa Tsuruoka, ...
We introduce an annotation type system for a data-driven NLP core system. The specifications cover formal document structure and document meta information, as well as the linguistic levels of...
U-Compare: share and compare text mining tools with UIMA (2009)
Kano, Yoshinobu, Baumgartner, William A., McCrohon, Luke, Ananiadou, Sophia, Cohen, K. Bretonnel, Hunter, Lawrence, ...
Summary: Due to the increasing number of text mining resources (tools and corpora) available to biologists, interoperability issues between these resources are becoming significant obstacles to using...
Accelerating the annotation of sparse named entities by dynamic sentence selection (2008)
Tsuruoka, Yoshimasa, Tsujii, Jun'ichi, Ananiadou, Sophia
Background: Previous studies of named entity recognition have shown that a reasonable level of recognition accuracy can be achieved by using machine learning models such as conditional random...
How to make the most of NE dictionaries in statistical NER (2008)
Sasaki, Yutaka, Tsuruoka, Yoshimasa, McNaught, John, Ananiadou, Sophia
Background: When term ambiguity and variability are very high, dictionary-based Named Entity Recognition (NER) is not an ideal solution even though large-scale terminological resources are...
Themes in biomedical natural language processing: BioNLP08 (2008)
Demner-Fushman, Dina, Ananiadou, Sophia, Cohen, K Bretonnel, Pestian, John, Tsujii, Jun'ichi, Webber, Bonnie
BOOTStrep Annotation Scheme – Encoding Information for Text Mining (2008)
Scott Piao, Ekaterina Buyko, Yoshimasa Tsuruoka, Katrin Tomanek, John McNaught, Udo Hahn, ...
Annotation of information in corpora is an important aspect of text mining. It bridges between the information hidden in natural language texts and the semantic search queries for the information...
Text Mining Services to Support E-Research (2008)
In recent years the developments and opportunities created for e-Science infrastructure have promised technological support for the ever growing area of text mining applications and services. The...
Normalizing biomedical terms by minimizing ambiguity and variability (2008)
Tsuruoka, Yoshimasa, McNaught, John, Ananiadou, Sophia
Background: One of the difficulties in mapping biomedical named entities, e.g. genes, proteins, chemicals and diseases, to their concept identifiers stems from the potential variability of...
MaSTerClass: a case-based reasoning system for the classification of biomedical terms (2008)
Irena Spasic, Sophia Ananiadou, Junichi Tsujii
Motivation: The sheer volume of textually described biomedical knowledge exerts the need for natural language processing (NLP) applications in order to allow flexible and efficient access to relevant...
Mining semantically related terms from biomedical literature (2008)
Goran Nenadić, Sophia Ananiadou
Discovering links and relationships is one of the main challenges in biomedical research, as scientists are interested in uncovering entities that have similar functions, take part in the same...
Clustering acronyms in biomedical text for disambiguation (2008)
Naoaki Okazaki, Sophia Ananiadou
Given the increasing number of neologisms in biomedicine (names of genes, diseases, molecules, etc.), the rate of acronyms used in literature also increases. Existing acronym dictionaries cannot keep...
Yusuke Miyao, Yoshimasa Tsuruoka, Yuichiro Matsubayashi, Sophia Ananiadou
Recently, several text mining programs have reached a near-practical level of performance. Some systems are already being used by biologists and database curators. However, it has also been...
Christian Jacquemin, Sophia Ananiadou
is a much needed and most welcome addition to the field of computational terminology. The central issue of this book is the in-depth examination of term variation, that is, the morphological,...
Sophia Ananiadou, John McNaught (editors), Nikiforos Karamanis
Text mining is defined by Hearst (1999) as the automatic discovery of new, previously unknown, information from unstructured textual data. This is often seen as comprising three major tasks:...
Classifying Technical Terms, Katerina T. Frantzi, Junichi Tsujii, Sophia Ananiadou
Automating the process of term recognition and classification is important for digital libraries. Automatic Term Recognition (ATR) has many applications in areas related to digital libraries, e.g....
FACTA: a text search engine for finding associated biomedical concepts (2008)
Tsuruoka, Yoshimasa, Tsujii, Jun'ichi, Ananiadou, Sophia
Summary: FACTA is a text search engine for MEDLINE abstracts, which is designed particularly to help users browse biomedical concepts (e.g. genes/proteins, diseases, enzymes and chemical compounds)...
Manchester Metropolitan University (2007)
Katerina T. Frantzi, Sophia Ananiadou
This paper provides an approach to the semi-automatic extraction of collocations from corpora using statistics. The growing availability of large textual corpora, and...
An Integrated Term-Based Corpus Query System (2007)
Irena Spasic, Goran Nenadic, Kostas Manios, Sophia Ananiadou
In this paper we describe the X-TRACT workbench, which enables efficient term-based querying against a domain-specific literature corpus. Its main aim is to aid domain specialists in locating and...
Ex Sharp (chair, Nicoletta Calzolari, Sophia Ananiadou, Subgroup Coordinator, Nuria Bel, Maite Melero Nogues, ...
Tsuruoka, Yoshimasa, McNaught, John, Tsujii, Jun'ichi, Ananiadou, Sophia
Motivation: One of the bottlenecks of biomedical data integration is variation of terms. Exact string matching often fails to associate a name with its biological concept, i.e. ID or accession number...
Demonstration Sessions, Sophia Ananiadou
The demonstrations session was held from the 25th to the 27th of June 2007 in Prague. This year we had 113 submissions, of which 61 were selected for presentation, resulting in a 54% acceptance rate. The...
Building an abbreviation dictionary using a term recognition approach (2006)
Naoaki Okazaki, Sophia Ananiadou
Bioinformatics, Original Paper, Data and text mining, Vol. 22 no. 24 2006, pages 3089–3095, doi:10.1093/bioinformatics/btl534
Mining Opinion Polarity Relations of Citations (2006)
Scott S. Piao, Sophia Ananiadou, Yoshimasa Tsuruoka, Yutaka Sasaki, John McNaught
Opinion mining has been receiving increasing attention recently, and various approaches have been suggested for mining sentiment information, such as mining attitudes or opinions about a topic or...
Text mining and ontologies in biomedicine: making sense of raw text (2005)
Irena Spasic, Sophia Ananiadou, John McNaught
The volume of biomedical literature is increasing at such a rate that it is becoming difficult to locate, retrieve and manage the reported information without text mining, which aims to automatically...
Junichi Tsujii, Sophia Ananiadou
text mining?
MaSTerClass: a case-based reasoning system for the classification of biomedical terms (2005)
Spasic, Irena, Ananiadou, Sophia, Tsujii, Junichi
Motivation: The sheer volume of textually described biomedical knowledge exerts the need for natural language processing (NLP) applications in order to allow flexible and efficient access to relevant...
Text mining and ontologies in biomedicine: Making sense of raw text (2005)
Spasic, Irena, Ananiadou, Sophia, McNaught, John, Kumar, Anand
The volume of biomedical literature is increasing at such a rate that it is becoming difficult to locate, retrieve and manage the reported information without text mining, which aims to automatically...
Learning to classify biomedical terms through literature mining and genetic algorithms (2004)
Irena Spasić, Goran Nenadić, Sophia Ananiadou
We present an approach to classification of biomedical terms based on the information acquired automatically from the corpus of relevant literature. The learning phase consists of two...
Selecting text features for gene name classification: from documents to terms (2003)
Goran Nenadić, Simon Rice, Irena Spasić, Sophia Ananiadou, Benjamin Stapley
In this paper we discuss the performance of a text-based classification approach by comparing different types of features. We consider the automatic classification of gene names from the molecular...
Terminology-driven mining of biomedical literature (2003)
Nenadic, Goran, Spasic, Irena, Ananiadou, Sophia
Motivation: With an overwhelming amount of textual information in molecular biology and biomedicine, there is a need for effective literature mining techniques that can help biologists to gather and...
Automatic Discovery of Term Similarities Using Pattern Mining (2002)
Goran Nenadić, Irena Spasić, Sophia Ananiadou
Term recognition and clustering are key topics in automatic knowledge acquisition and text mining. In this paper we present a novel approach to the automatic discovery of term similarities, which...
Automatic acronym acquisition and term variation management within domain specific texts (2002)
Goran Nenadić, Irena Spasić, Sophia Ananiadou
In this paper we present a framework for the effective management of terms and their variants that are automatically acquired from domain-specific texts. In our approach, the term variant recognition...
A Methodology for Terminology-based Knowledge Acquisition and ... (2002)
Hideki Mima, Sophia Ananiadou, Goran Nenadic, Junichi Tsujii
In this paper we propose an integrated knowledge management system in which terminology-based knowledge acquisition, knowledge integration, and XML-based knowledge retrieval are combined using tag...
Terminological acquaintance: the importance of contextual information in terminology (2000)
Diana Maynard, Sophia Ananiadou
This paper examines an idea we call terminological acquaintance, which considers the importance of contextual information for various applications in NLP. The importance of contextual information is...
Identifying Terms by Their Family and Friends (2000)
Diana Maynard, Sophia Ananiadou
Multi-word terms are traditionally identified using statistical techniques or, more recently, using hybrid techniques combining statistics with shallow linguistic information. Approaches to word sense...
A linguistic approach to terminological context clustering (1999)
Diana Maynard, Sophia Ananiadou
Clustering mechanisms are important for many NLP tasks such as knowledge acquisition, term extraction and disambiguation, machine translation and ontology building. Our approach focuses on the...
Identifying Contextual Information for Multi-Word Term Extraction (1999)
Diana Maynard, Sophia Ananiadou
Methods for multi-word term extraction have traditionally involved statistical techniques. More recently, hybrid techniques have been evolving which incorporate some linguistic knowledge. This...
Term Extraction using a Similarity-based Approach (1999)
Diana Maynard, Sophia Ananiadou
Traditional methods of multi-word term extraction have used hybrid methods combining linguistic and statistical information. The linguistic part of these applications is often underexploited and...
Term Sense Disambiguation using a Domain-Specific Thesaurus (1998)
Diana Maynard, Sophia Ananiadou
Term extraction is important for many information systems applications. Although terms should be monoreferential, in reality they exhibit a high degree of ambiguity. Whilst conventional solutions...
Acquiring contextual information for term disambiguation (1998)
Diana Maynard, Sophia Ananiadou
Term extraction is important for many information systems applications. Although terms should be monoreferential, they actually exhibit a high degree of ambiguity. This paper describes a method for...
Term Sense Disambiguation using a Domain-Specific Thesaurus (1998)
Diana Maynard, Sophia Ananiadou
Term extraction is important for many information systems applications. Although terms should be monoreferential, in reality they exhibit a high degree of ambiguity. This paper describes a method for...
Automatic term recognition using contextual cues (1997)
Katerina T. Frantzi, Sophia Ananiadou
In this paper we present an approach for the extraction of multi-word terms from special language corpora. The new element is the incorporation of context information for the evaluation of candidate...