Constrained hidden Markov models for population-based haplotyping (2008)
Niels Landwehr, Taneli Mielikäinen, Lauri Eronen, Hannu Toivonen, Heikki Mannila. BMC Bioinformatics.
This is an open access article distributed under the terms of the Creative Commons Attribution License
An Efficient Method for Association Mapping in Phase-unknown Genotype Data (2008)
Petteri Sevon, Päivi Onkamo, Hannu Toivonen, Heikki Mannila
In genetic association analysis a researcher tries to find shared alleles or haplotypes in a group of patients, which would be much rarer in controls. The analysis is based on availability of...
Compressing probabilistic Prolog programs (2008)
Kersting, Kristian, Revoredo, Kate, Toivonen, Hannu
ProbLog is a recently introduced probabilistic extension of Prolog [De Raedt et al. IJCAI 2007]. A ProbLog program defines a distribution over logic programs by specifying for each clause the...
Probabilistic Explanation Based Learning (2008)
Angelika Kimmig, Luc De Raedt, Hannu Toivonen
Explanation based learning produces generalized explanations from examples. These explanations are typically built in a deductive manner and they aim to capture the essential...
Constrained Hidden Markov Models for Population-based Haplotyping (Extended Abstract) (2008)
Niels Landwehr, Taneli Mielikäinen, Lauri Eronen, Hannu Toivonen, Heikki Mannila
Analysis of genetic variation in human populations is critical to the understanding of the genetic basis for complex diseases. Although genomes of several species have been sequenced, it is still too...
ECEM/EAML 2004 Segmentation of paleoecological spatio-temporal count data (2008)
Kari Vasko, Hannu Toivonen, Atte Korhola
Key words: spatio-temporal data analysis, segmentation, analysis of compositional data. Segmentation analysis addresses the following data analysis problem: given a time series, find a partitioning of...
Mining Non-Derivable Association Rules (2008)
Bart Goethals, Juho Muhonen, Hannu Toivonen
Association rule mining typically results in large amounts of redundant rules. We introduce efficient methods for deriving tight bounds for confidences of association rules, given their subrules. If...
Petteri Sevon, Hannu Toivonen, Vesa Ollikainen
We describe TreeDT, a novel association-based gene mapping method. Given a set of disease-associated haplotypes and a set of control haplotypes, TreeDT predicts likely locations of a...
Visualisation of Associations Between Nucleotides in SNP Neighbourhoods (2008)
Kimmo Kulovesi, Juho Muhonen, Pentti T. Riikonen, Mauno Vihinen, Hannu Toivonen, Tomi A. Pasanen
A large number of single nucleotide polymorphisms have been mapped onto the human genome. Mutations are induced through endogenous and exogenous processes, and these procedures have been shown to be...
Combining Phenotypic and Genotypic Data to Discover Multiple Disease Genes (2008)
Hannu Toivonen, Saara Hyvönen, Petteri Sevon
Mapping genes for common, polygenic diseases is challenging due to the number of genes involved. Typical mapping methods search for one gene at a time, but their marginal effects may be too weak to...
Mining Relaxed Graph Properties in Internet (2008)
Wilhelmiina Hämäläinen, Hannu Toivonen, Vladimir Poroshin
Many real world datasets are represented in the form of graphs. The classical graph properties found in the data, like cliques or independent sets, can reveal new interesting information in the data....
A Perspective on Databases and Data Mining (2007)
M. Holsheimer, M. Kersten, H. Mannila, H. Toivonen
Learning, mining, or modeling? - A case study from paleoecology (2007)
Heikki Mannila, Hannu Toivonen, Atte Korhola, Heikki Olander
Exploratory data mining, machine learning, and statistical modeling all have a role in discovery science. We describe a paleoecological reconstruction problem where Bayesian methods are useful and...
Verlag London Ltd. The book will be based on selected (2007)
Jason Wang, Mohammed Zaki, Hannu Toivonen, Mohammed J. Zaki
Bioinformatics provides opportunities for developing novel data mining methods. Some of the grand challenges in bioinformatics include protein structure prediction, homology search, multiple...
Constrained hidden Markov models for population-based haplotyping (2007)
Landwehr, Niels, Mielikäinen, Taneli, Eronen, Lauri, Toivonen, Hannu, Mannila, Heikki
Background. Haplotype Reconstruction is the problem of resolving the hidden phase information in genotype data obtained from laboratory measurements. Solving this problem is an important intermediate...
ProbLog: a probabilistic Prolog and its application in link discovery (2007)
Luc De Raedt, Angelika Kimmig, Hannu Toivonen
We introduce ProbLog, a probabilistic extension of Prolog. A ProbLog program defines a distribution over logic programs by specifying for each clause the probability that it belongs to a randomly...
ProbLog: A probabilistic Prolog and its application in link discovery (2007)
http://www.ijcai.org/papers07/Papers/IJCAI07-396.pdf
HaploRec: efficient and accurate large-scale reconstruction of haplotypes (2006)
Eronen, Lauri, Geerts, Floris, Toivonen, Hannu
Background: Haplotypes extracted from human DNA can be used for gene mapping and other analysis of genetic patterns within and across populations. A fundamental problem is, however, that...
Constrained Hidden Markov Models for Population-based Haplotyping (Extended Abstract) (2006)
Landwehr, Niels, Mielikäinen, Taneli, Eronen, Lauri, Toivonen, Hannu, Mannila, Heikki
We propose a simple haplotype reconstruction method that is based on iterative refinement and regularization of constrained Hidden Markov Models. We show that it gives results comparable to the...
Closed non-derivable itemsets (2006)
Itemset mining typically results in large amounts of redundant itemsets. Several approaches such as closed itemsets, non-derivable itemsets and generators have been suggested for losslessly...
Link discovery in graphs derived from biological databases (2006)
Petteri Sevon, Lauri Eronen, Petteri Hintsanen, Kimmo Kulovesi, Hannu Toivonen
Public biological databases contain vast amounts of rich data that can also be used to create and evaluate new biological hypotheses. We propose a method for link discovery in biological...
Petteri Hintsanen, Petteri Sevon, Päivi Onkamo, Lauri Eronen, Hannu Toivonen, ...
Data mining for gene mapping (2005)
Hannu Toivonen, Päivi Onkamo, Petteri Hintsanen, Evimaria Terzi, Petteri Sevon, Jozef Zurada, ...
Localization of disease susceptibility genes to certain areas in the human genome, or gene mapping, requires careful analysis of genetic marker data. Gene mapping is often carried out using a sample...
A survey of data mining methods for linkage disequilibrium mapping (2005)
Data mining methods are gaining more interest as potential tools in mapping and identification of complex disease loci. The methods are well suited to large numbers of genetic marker loci produced by...
A Markov Chain Approach to Reconstruction of Long Haplotypes (2004)
Eronen, Lauri, Toivonen, Hannu
Haplotypes are important for association based gene mapping, but there are no practical laboratory methods for obtaining them directly from DNA samples. We propose simple Markov models for...
A Markov chain approach to reconstruction of long haplotypes (2004)
Lauri Eronen, Floris Geerts, Hannu Toivonen
Haplotypes are important for association based gene mapping, but there are no practical laboratory methods for obtaining them directly from DNA samples. We propose simple Markov models for...
Adaptive On-Device Location Recognition (2004)
Kari Laasonen, Mika Raento, Hannu Toivonen
Location-awareness is useful for mobile and pervasive computing.
Statistical evaluation of the Predictive Toxicology Challenge 2000-2001 (2003)
Toivonen, Hannu, Srinivasan, Ashwin, King, Ross D., Kramer, Stefan, Helma, Christoph
Motivation: The development of in silico models to predict chemical carcinogenesis from molecular structure would help greatly to prevent environmentally caused cancers. The Predictive Toxicology...
Automated detection of epidemics from the usage logs of a physicians’ reference database (2003)
Epidemics of infectious diseases are usually recognized by an observation of an abnormal cluster of cases. Usually, the recognition is not automated, and relies on the alertness of human...
Discovering All Most Specific Sentences (2003)
Dimitrios Gunopulos, Roni Khardon, Heikki Mannila, Sanjeev Saluja, Hannu Toivonen, Ram Sewak Sharma
Data mining can be viewed, in many cases, as the task of computing a representation of a theory...
Statistical evaluation of the Predictive Toxicology Challenge 2000-2001 (2003)
Hannu Toivonen, Ashwin Srinivasan, Ross D. King, Stefan Kramer
Motivation: The development of in silico models to predict chemical carcinogenesis from molecular structure would help greatly to prevent environmentally caused cancers. The Predictive Toxicology...
Mining for Similarities in Aligned Time Series Using Wavelets (1999)
Ykä Huhtala, Juha Kärkkäinen, Hannu Toivonen
Discovery of non-obvious relationships between time series is an important problem in many domains, such as financial, sensory, and scientific data analysis. We consider data mining in aligned time...
Rule Discovery in Telecommunication Alarm Data (1999)
Mika Klemettinen, Heikki Mannila, Hannu Toivonen
Fault management is an important but difficult area of telecommunication...
Discovery of frequent Datalog patterns (1999)
Luc Dehaspe, Hannu Toivonen, Sašo Džeroski, Nada Lavrač
Discovery of frequent patterns has been studied in a variety of data mining settings. In its simplest form, known from association rule mining, the task is to discover all frequent itemsets, i.e.,...
Tane: An Efficient Algorithm for Discovering Functional and Approximate Dependencies (1999)
Huhtala, Ykä, Kärkkäinen, Juha, Porkka, Pasi, Toivonen, Hannu
The discovery of functional dependencies from relations is an important database analysis technique. We present Tane, an efficient algorithm for finding functional dependencies from large databases....
Finding frequent substructures in chemical compounds (1998)
Luc Dehaspe, Hannu Toivonen, Ross Donald King
The discovery of the relationships between chemical structure and biological function is central to biological science and medicine. In this paper we apply data mining to the problem of predicting...
Efficient Discovery of Functional and Approximate Dependencies Using Partitions (1998)
Ykä Huhtala, Juha Kärkkäinen, Pasi Porkka, Hannu Toivonen
Discovery of functional dependencies from relations has been identified as an important database analysis technique. In this paper, we present a new approach for finding functional dependencies from...
Bassist - a tool for MCMC simulation of statistical models (1998)
Hannu Toivonen, Heikki Mannila, Marko Salmenkivi, Karri-Pekka Laakso
In this paper we give a short overview of MCMC simulation and the Bassist system, and describe some of the applications in which Bassist has been used.
Ykä Huhtala, Juha Kärkkäinen, Pasi Porkka, ...
Discovery of functional dependencies from relations has been identified as an important database analysis technique. In this paper, we present a new approach for finding functional dependencies from...
Data mining, hypergraph transversals, and machine learning (1997)
Dimitrios Gunopulos, Roni Khardon, Heikki Mannila, Hannu Toivonen
Several data mining problems can be formulated as problems of finding maximally specific sentences that are interesting in a database. We first show that this problem has a close relationship with...
Levelwise Search and Borders of Theories in Knowledge Discovery (1997)
Heikki Mannila, Hannu Toivonen
One of the basic problems in knowledge discovery in databases (KDD) is the following: given a data set r, a class L of sentences for defining subgroups of r, and a selection predicate, find all...
Discovery of Frequent Episodes in Event Sequences (1997)
Heikki Mannila, Hannu Toivonen, A. Inkeri Verkamo
Sequences of events describing the behavior and actions of users or systems can be collected in several domains. We consider the problem of discovering frequently occurring episodes in such...
Discovery of Frequent Episodes in Event Sequences (1997)
Heikki Mannila, Hannu Toivonen, A. Inkeri Verkamo
Sequences of events describing the behavior and actions of users or systems can be collected in several domains. An episode is a collection of events that occur relatively close to each other in a...
Discovery of frequent patterns in large data collections (1996)
Dissertation, University of Helsinki (Helsingin yliopisto).
Sampling Large Databases for Association Rules (1996)
Discovery of association rules is an important database mining problem. Current algorithms for finding association rules require several passes over the analyzed database, and obviously the role of I/O...
Multiple uses of frequent sets and condensed representations (Extended Abstract) (1996)
Heikki Mannila, Hannu Toivonen
In interactive data mining it is advantageous to have condensed representations of data that can be used to efficiently answer different queries. In this paper we show how frequent sets can be used...
On an algorithm for finding all interesting sentences (Extended Abstract) (1996)
Heikki Mannila, Hannu Toivonen
Knowledge discovery in databases (KDD), also called data mining, has recently received wide attention from practitioners and researchers. One of the basic problems in KDD is the following: given a...
Discovering Generalized Episodes Using Minimal Occurrences (1996)
Heikki Mannila, Hannu Toivonen
Sequences of events are an important special form of data that arises in several contexts, including telecommunications, user interface studies, and epidemiology. We present a general and flexible...
Sampling Large Databases for Association Rules (1996)
Discovery of association rules is an important database mining problem. Current algorithms for finding association rules require several passes over the analyzed database, and obviously the role of...
Discovery of Frequent Patterns in Large Data Collections (1996)
Hannu Toivonen
Data mining, or knowledge discovery in databases, aims at finding useful regularities in large data sets. Interest in the field is motivated by the growth of computerized data collections and by the...
Rule Discovery in Alarm Databases (1996)
Kimmo Hätönen, Mika Klemettinen, Heikki Mannila, ...
Telecommunication networks produce large amounts of alarm information daily. This data contains potentially valuable knowledge about the network. We present a methodology for the analysis of large...
Mika Klemettinen, Heikki Mannila, Hannu Toivonen
We introduce a methodology for knowledge discovery in databases (KDD) where one first discovers large collections of patterns at once, and then performs interactive retrievals from the collection of...
TASA: Telecommunication Alarm Sequence Analyzer or: How to enjoy faults in your network (1996)
Kimmo Hätönen, Mika Klemettinen, Heikki Mannila, Pirjo Ronkainen, Hannu Toivonen
Today's large and complex telecommunication networks produce large amounts of alarms daily. The sequence of alarms contains valuable knowledge about the behavior of the network, but much of the...
A Perspective on Databases and Data Mining (1995)
Marcel Holsheimer, Martin Kersten, Heikki Mannila, Hannu Toivonen
We discuss the use of database methods for data mining. Recently impressive results have been achieved for some data mining problems using highly specialized and clever data structures. We study how...
Discovering frequent episodes in sequences (Extended Abstract) (1995)
Heikki Mannila, Hannu Toivonen, A. Inkeri Verkamo
Sequences of events describing the behavior and actions of users or systems can be collected in several domains. In this paper we consider the problem of recognizing frequent episodes in such...
Efficient Algorithms for Discovering Association Rules (1994)
Heikki Mannila, Hannu Toivonen, Inkeri Verkamo
Association rules are statements of the form "for 90% of the rows of the relation, if the row has value 1 in the columns in set W, then it has 1 also in column B". Agrawal, Imielinski,...
Finding Interesting Rules from Large Sets of Discovered Association Rules (1994)
Mika Klemettinen, Heikki Mannila, Pirjo Ronkainen, Hannu Toivonen, A. Inkeri Verkamo
Association rules, introduced by Agrawal, Imielinski, and Swami, are rules of the form "for 90% of the rows of the relation, if the row has value 1 in the columns in set W, then it has 1 also...
Improved Methods for Finding Association Rules (1994)
Heikki Mannila, Hannu Toivonen, A. Inkeri Verkamo
Association rules are statements of the form "for 90% of the rows of the relation, if the row has value 1 in the columns in set W, then it has 1 also in column B". Agrawal, Imielinski,...
VITAL Knowledge Representation Language Specification (1991)
Enrico Motta, Arthur Stutt, Kieron O'Hara, Juha Kuusela, Hannu Toivonen, Han Reichgelt, ...
In this document the knowledge representation component of the VITAL workbench is specified. Authors: Enrico Motta, Arthur Stutt, Kieron O'Hara, Juha Kuusela, Hannu Toivonen, Han Reichgelt,...
Thesis (D. Tech.)--Helsinki University of Technology, 1984
Summary part of the doctoral dissertation.
Summary and 6 offprints. Summary published in the series Acta polytechnica Scandinavica. Ch; 156. Diss. 557.
Thesis (doctoral)--Turku, 1981.
Effects of cigarette smoke on the metabolism and action of vasoactive hormones in the rat (1981)
Summary and 8 offprints. Also on the title page: From the Department of Physiology, Biomedical Institute, University of Turku, Finland.