Regression Canonical Correlation Analysis (2008)
In this paper we present Regression Canonical Correlation Analysis, an extension of Canonical Correlation Analysis, where one of the dimensions is fixed and demonstrate how it can be solved...
OntoGen Semi-automatic Ontology Editor (2008)
Blaz Fortuna, Marko Grobelnik, Dunja Mladenic
The rapid growth of documents, web pages and other types of textual content poses a great challenge to modern content management systems. Ontologies offer an efficient way to reduce the amount of...
OntoGen: Semi-automatic Ontology Editor (2008)
Blaz Fortuna, Marko Grobelnik, Dunja Mladenic
Abstract. In this paper we present a semi-automatic ontology editor as implemented in a new version of OntoGen system. The system integrates machine learning and text mining algorithms into an...
Improving the Classification of Newsgroup Messages through Social Network Analysis (2008)
Newsgroup participants interact with their communities through conversation threads. They may respond to a message to answer a question, debate a topic, support or disagree with another person’s...
Detecting the bias in media with statistical learning methods (2008)
Fortuna, Blaz, Galleguillos, Carolina, Cristianini, Nello
The international media system plays a crucial role both in reflecting public opinion and events, and in shaping them. Understanding the workings of this complex system is of crucial importance for...
Triplet Extraction from Sentences Using SVM (2008)
In this paper we present a machine learning approach to extract subject-predicate-object triplets from English sentences. SVM is used to train a model on human annotated triplets, and the features...
Semantic Graphs Derived from Triplets with Application in Document Summarization (2008)
Rusu, Delia, Fortuna, Blaz, Grobelnik, Marko, Mladenic, Dunja
Information nowadays has become more and more accessible, so much as to give birth to an information overload issue. Yet important decisions have to be made, depending on the available information....
Semantic Modeling, Translation and Matching of QoS (2008)
Moraru, Alexandra, Fortuna, Blaz, Fortuna, Carolina
The variety of access and transport technologies available in modern computer networks poses significant challenges related to compatibility and quality of service (QoS). Applications...
Contextualizing ontologies with ontolight: a pragmatic approach (2008)
Grobelnik, Marko, Brank, Janez, Fortuna, Blaz, Mozetic, Igor
Advancing topic ontology learning through term extraction (2008)
Fortuna, Blaz, Lavrac, Nada, Velardi, Paola
Cross-lingual search over 22 European languages (2008)
Fortuna, Blaz, Rupnik, Jan, Pajntar, Bostjan, Grobelnik, Marko, Mladenić, Dunja
In this paper we present a system for cross-lingual information retrieval, which can handle tens of languages and millions of documents. Functioning of the system is demonstrated on corpus of...
Organization Workshop Organizers (2008)
Stephan Bloehdorn, Marko Grobelnik, Peter Mika, Thanh Tran Duc, Bettina Berendt, Paul Buitelaar, ...
International Workshop located at the
Improving the Classification of Newsgroups Messages through Social Network Analysis (2007)
Fortuna, Blaz, Rodrigues, Eduarda Mendes, Milic-Frayling, Natasa
Newsgroup participants interact with their communities through conversation threads. They may respond to a message to answer a question, debate a topic, support or disagree with another person’s...
Detection of Web Subsites: Concepts, Algorithms, and Evaluation Issues (2007)
Rodrigues, Eduarda Mendes, Milic-Frayling, Natasa, Fortuna, Blaz
Web sites are often organized into several regions, each dedicated to a specific topic or serving a particular function. From a user’s perspective, these regions typically form coherent sets of...
Anomaly detection in computer networks using linear SVMs (2007)
Fortuna, Carolina, Fortuna, Blaz, Mohorcic, Mihael
Modern computer networks are subject to various malicious attacks. Since attacks are becoming more sophisticated and networks are becoming larger there is a need for an efficient intrusion detection...
OntoGen: Semi-automatic Ontology Editor (2007)
Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
In this paper we present a semi-automatic ontology editor as implemented in a new version of OntoGen system. The system integrates machine learning and text mining algorithms into an efficient user...
A Kernel Canonical Correlation Analysis for Learning the Semantics of Text (2007)
Fortuna, Blaz, Cristianini, Nello, Shawe-Taylor, John
User study of ontology generation tool (2007)
Ilijasic Misic, Ivana, Kovacic, Bozidar, Mohoric, Tamara, Mladenić, Dunja, Fortuna, Blaz, Grobelnik, Marko
We present design and results of a user study undertaken in order to evaluate ontology generation process. We have applied our study to an example tool for semi-automatic ontology generation –...
Triplet extraction from sentences (2007)
Rusu, Delia, Dali, Lorand, Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
In this paper we present an approach to extracting subject-predicate-object triplets from English sentences. To begin with, four different well known syntactical parsers for English are used for...
Advancing Topic Ontology Learning through Term Extraction (2007)
Fortuna, Blaz, Lavrac, Nada, Velardi, Paola
This paper presents a novel methodology for topic ontology learning from text documents. The proposed methodology, named OntoTermExtraction is based on OntoGen, a semi-automated tool for topic...
Extracting named entities and relating them over time based on Wikipedia (2007)
Bhole, Abhijit, Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
This paper presents an approach to mining information relating people, places, organizations and events extracted from Wikipedia and linking them on a time scale. The approach consists of two phases:...
From social network to light-weight ontology (2007)
Mladenić, Dunja, Grobelnik, Marko, Fortuna, Blaz
We address the problem of constructing a light-weight ontology from social network data. As an example we use social network of a mid size research institution obtained based on e-mail communication....
Semi-automatic data-driven ontology construction system (2006)
Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
In this paper we present a new version of OntoGen system for semi-automatic data-driven ontology construction. The system is based on a novel ontology learning framework which formalizes and extends...
System for Semi-automatic Ontology construction (2006)
Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
In this paper, we review two techniques for topic discovery in collections of text documents (Latent Semantic Indexing and K-Means clustering) and present how we integrated them into a system for...
Background Knowledge for Ontology Construction (2006)
Fortuna, Blaz, Grobelnik, Marko, Mladenić, Dunja
In this paper we describe a solution for incorporating background knowledge into the OntoGen system for semi-automatic ontology construction. This makes it easier for different users to construct...
Using DMoz for constructing ontology from data stream (2006)
Grobelnik, Marko, Brank, Janez, Mladenić, Dunja, Novak, Blaz, Fortuna, Blaz
This paper presents an approach for constructing an ontology from a stream of documents. Named entities extracted from the documents are used as instances of the ontology. Entities and co-occurring...
Visualization of text document corpus (2005)
Fortuna, Blaz, Mladenić, Dunja, Grobelnik, Marko
From the automated text processing point of view, natural language is very redundant in the sense that many different words share a common or similar meaning. For a computer this can be hard to...
Semi-automatic construction of topic ontology (2005)
Fortuna, Blaz, Mladenić, Dunja, Grobelnik, Marko
In this paper, we review two techniques for topic discovery in collections of text documents (Latent Semantic Indexing and K-Means clustering) and present how we integrated them into a system for...
kNN Versus SVM in the Collaborative Filtering Framework (2005)
Grcar, Miha, Fortuna, Blaz, Mladenić, Dunja
We present experimental results of confronting the k-Nearest Neighbor (kNN) algorithm with Support Vector Machine (SVM) in the collaborative filtering framework using datasets with different...
The use of machine translation tools for cross-lingual text mining (2005)
Fortuna, Blaz, Shawe-Taylor, John
Eigen-analysis such as LSI or KCCA was already successfully applied to cross-lingual information retrieval. This approach has a weakness in that it needs an aligned training set of documents. In this...
Using string kernels for classification of Slovenian Web documents (2005)
Fortuna, Blaz, Mladenić, Dunja
In this paper we present an approach for classifying web pages obtained from the Slovenian Internet directory where the web sites covering different topics are organized into a topic ontology. We...
This paper provides an overview of string kernels. String kernels compare text documents by the substrings they contain. Because of high computational complexity, methods for approximating string...
Kernel Canonical Correlation Analysis with Applications (2004)
This paper provides an overview of Kernel Canonical Correlation Analysis. KCCA is a technique for finding common semantic features between different views of data. Applications on text retrieval,...