Gregory Kucherov

On maximal repetitions of arbitrary exponent (2009)

Kolpakov, Roman, Kucherov, Gregory, Ochem, Pascal

The first two authors have shown [KK99,KK00] that the sum the exponent (and thus the number) of maximal repetitions of exponent at least 2 (also called runs) is linear in the length of the word. The...

Protein Similarity Search with Subset Seeds on a Dedicated Reconfigurable Hardware (2009)

Pierre Peterlongo, Dominique Lavenier, Gilles Georges, Julien Jacques, Gregory Kucherov, Mathieu Giraud

Abstract. With a sharp increase of available DNA and protein sequence data, new precise and fast similarity search methods are needed for largescale genome and proteome comparisons. Modern seed-based...

G.: Combinatorial search on graphs motivated by bioinformatics applications: A brief survey (2009)

Mathilde Bouvel, Vladimir Grebinski, Gregory Kucherov

Abstract. The goal of this paper is to present a brief survey of a collection of methods and results from the area of combinatorial search [1,8] focusing on graph reconstruction using queries of...

Searching for Gapped Palindromes (2009)

Roman Kolpakov, Gregory Kucherov

Abstract. We study the problem of finding, in a given word, all maximal gapped palindromes verifying two types of constraints, that we call longarmed and length-constrained palindromes. For both...

Structural pattern matching of nonribosomal peptides (2009)

Caboche, Ségolène, Pupin, Maude, Leclère, Valérie, Jacques, Phillipe, Kucherov, Gregory

Abstract Background Nonribosomal peptides (NRPs), bioactive secondary metabolites produced by many microorganisms, show a broad range of important biological activities (e.g. antibiotics,...

On subset seeds for protein alignment (2009)

Roytberg, Mikhail A., Gambin, Anna, Noé, Laurent, Lasota, Slawomir, Furletova, Eugenia, Szczurek, Ewa, ...

We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal...

Multiseed Lossless Filtration (2009)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mikhail A.

We study a method of seed-based lossless filtration for approximate string matching and related bioinformatics applications. The method is based on a simultaneous use of several spaced seeds rather...

On subset seeds for protein alignment (2009)

Roytberg, Mikhail, Gambin, Anna, Noé, Laurent, Lasota, Slawomir, Furletova, Eugenia, Szczurek, Ewa, ...

We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal...

On subset seeds for protein alignment (2009)

Roytberg, Mikhail, Gambin, Anna, Noé, Laurent, Lasota, Slawomir, Furletova, Eugenia, Szczurek, Ewa, ...

We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal...

On subset seeds for protein alignment (2009)

Mikhail Roytberg, Anna Gambin, Laurent Noé, Sl‚awomir Lasota, Eugenia Furletova, Ewa Szczurek, ...

We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal...

Optimal neighborhood indexing for protein similarity search (2008)

Peterlongo, Pierre, Noé, Laurent, Lavenier, Dominique, Nguyen, Van, Kucherov, Gregory, Giraud, Mathieu

Abstract Background Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow...

G.: Combinatorial search on graphs motivated by bioinformatics applications: A brief survey (2008)

Mathilde Bouvel, Vladimir Grebinski, Gregory Kucherov

Abstract. The goal of this paper is to present a brief survey of a collection of methods and results from the area of combinatorial search [1,8] focusing on graph reconstruction using queries of...

Efficient seeding techniques for protein similarity search (2008)

Roytberg, Mihkail, Gambin, Anna, Noé, Laurent, Lasota, Slawomir, Furletova, Eugenia, Szczurek, Ewa, ...

We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal...

Optimal linear arrangement of interval graphs (2008)

Johanne Cohen, Fedor Fomin, Pinar Heggernes, Dieter Kratsch, Gregory Kucherov

Abstract. We study the optimal linear arrangement (OLA) problem on interval graphs. Several linear layout problems that are NP-hard on general graphs are solvable in polynomial time on interval...

Optimal linear arrangement of interval graphs (2008)

Johanne Cohen, Fedor Fomin, Pinar Heggernes, Dieter Kratsch, Gregory Kucherov

Abstract. We study the optimal linear arrangement (OLA) problem on interval graphs. Several linear layout problems that are NP-hard on general graphs are solvable in polynomial time on interval...

A new method of finding similarity regions in DNA sequences (2008)

Laurent Noe, Gregory Kucherov

Identifying similarity regions inside a DNA sequence (repeats), or between two sequences (local alignment), is a fundamental problem in bioinformatics. For this task, many algorithms use a technique...

Improved (2008)

Laurent Noé, Gregory Kucherov

hit criteria for DNA local alignment

Multi-seed lossless filtration (Extended abstract) (2008)

Gregory Kucherov, Laurent Noé, Mikhail Roytberg

Abstract. We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a...

Database and comparison of non ribosomal peptides (2008)

Ségolène Caboche, Valérie Leclère, Maude Pupin, Gregory Kucherov

Peptides are synthesized in bacteria or fungi not only through the classical pathway from the transcription of DNA to the translation of mRNA into peptides, but also through a ribosomeindependent...

SIGffRid: A tool to search for sigma factor binding sites in bacterial genomes using comparative approach and biologically driven statistics (2008)

Touzain, Fabrice, Schbath, Sophie, Debled-Rennesson, Isabelle, Aigle, Bertrand, Kucherov, Gregory, Leblond, Pierre

Abstract Background Many programs have been developed to identify transcription factor binding sites. However, most of them are not able to infer two-word motifs with variable spacer lengths. This...

Efficient seeding techniques for protein similarity search (2008)

Roytberg, Mihkail, Gambin, Anna, Noé, Laurent, Lasota, Slawomir, Furletova, Eugenia, Szczurek, Ewa, ...

We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal...

Efficient seeding techniques for protein similarity search (2008)

Roytberg, Mihkail, Gambin, Anna, Noé, Laurent, Lasota, Slawomir, Furletova, Eugenia, Szczurek, Ewa, ...

We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal...

Optimal neighborhood indexing for protein similarity search (2008)

Peterlongo, Pierre, Noé, Laurent, Lavenier, Dominique, Nguyen, Van Hoa, Kucherov, Gregory, Giraud, Mathieu

Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow involves heuristics with...

Optimal neighborhood indexing for protein similarity search (2008)

Peterlongo, Pierre, Noé, Laurent, Lavenier, Dominique, Nguyen, Van Hoa, Kucherov, Gregory, Giraud, Mathieu

Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow involves heuristics with...

Protein similarity search with subset seeds on a dedicated reconfigurable hardware (2008)

Pierre Peterlongo, Laurent Noé, Dominique Lavenier, Gilles Georges, Julien Jacques, Gregory Kucherov, ...

Genome sequencing of numerous species raises the need of complete genome comparison with precise and fast similarity searches. Today, advanced seed-based techniques (spaced seeds, multiple seeds,...

Optimal neighborhood indexing for protein similarity search (2008)

Pierre Peterlongo, Laurent Noé, Dominique Lavenier, Van Hoa Nguyen, Gregory Kucherov, Mathieu Giraud

Background Similarity inference, one of the main bioinformatics tasks, has to face an exponential growth of the biological data. A classical approach used to cope with this data flow involves...

Efficient seeding techniques for protein similarity search (2008)

Mikhail Roytberg, Anna Gambin, Laurent Noé, Slawomir Lasota, Eugenia Furletova, Ewa Szczurek, ...

We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal...

Protein sequence alignment via anti-translation (2008)

Marta Gîrdea, Gregory Kucherov, Laurent Noé

We propose a new approach to protein sequence alignment, where the basic idea is to find the best pairwise alignment between two DNA sequences, each of which encodes one of the given proteins. A...

Efficient seeding techniques for protein similarity search (2008)

Mikhail Roytberg, Anna Gambin, Laurent Noé, Slawomir Lasota, Eugenia Furletova, Ewa Szczurek, ...

We apply the concept of subset seeds proposed in [1] to similarity search in protein sequences. The main question studied is the design of efficient seed alphabets to construct seeds with optimal...

NORINE: a database of nonribosomal peptides (2008)

Caboche, Ségolène, Pupin, Maude, Leclère, Valérie, Fontaine, Arnaud, Jacques, Philippe, Kucherov, Gregory

Norine is the first database entirely dedicated to nonribosomal peptides (NRPs). In bacteria and fungi, in addition to the traditional ribosomal proteic biosynthesis, an alternative...

BMC Bioinformatics BioMed Central Methodology article (2008)

Fabrice Touzain, Sophie Schbath, Isabelle Debled-rennesson, Gregory Kucherov, Pierre Leblond, Open Access

SIGffRid: A tool to search for sigma factor binding sites in bacterial genomes using comparative approach and biologically driven statistics

On Repetition-Free Binary Words of Minimal Density (Extended Abstract) (2007)

Roman Kolpakov, Gregory Kucherov, Yuri Tarannikov

this paper we continue with the study initiated in [12]. The general problem behind this study can be described as follows. Assume we have specified a set of "prohibited" words P ` A

How to Get Rid of Projection Rules in Context-free Tree Grammars (2007)

Dieter Hofbauer, Maria Huber, Gregory Kucherov

In contrast to the case of words, in context-free tree grammars projection rules cannot always be eliminated completely. In this paper we partly overcome this deficiency by approximating projection...

Maximal repetitions and Application to DNA sequences Mathieu Giraud (2007)

Gregory Kucherov

In this paper we describe an implementation of Main-Kolpakov-Kucherov algorithm [9] of linear-time search for maximal repetitions in sequences. We first present a theoretical background and sketch...

Visualization of Dynamic Automata using Padnon (2007)

Gregory Kucherov

We present an implementation of the string matching algorithm with variable length don't care symbols, described in [6], augmented with visualization features provided by the graph drawing...

Diversity and structure of PIF/Harbinger-like elements in the genome of Medicago truncatula (2007)

Grzebelus, Dariusz, Lasota, Slawomir, Gambin, Tomasz, Kucherov, Gregory, Gambin, Anna

Abstract Background Transposable elements constitute a significant fraction of plant genomes. The PIF/Harbinger superfamily includes DNA transposons (class II elements) carrying terminal inverted...

Subset seed automaton (2007)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mihkail

We study the pattern matching automaton introduced in [KucherovNoeRoytbergJBCB06] for the purpose of seed-based similarity search. We show that our definition provides a compact automaton, much...

Subset seed automaton (2007)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mihkail

We study the pattern matching automaton introduced in [KucherovNoeRoytbergJBCB06] for the purpose of seed-based similarity search. We show that our definition provides a compact automaton, much...

Protein similarity search with subset seeds on a dedicated reconfigurable hardware (2007)

Peterlongo, Pierre, Noé, Laurent, Lavenier, Dominique, Georges, Gilles, Jacques, Julien, Kucherov, Gregory, ...

Genome sequencing of numerous species raises the need of complete genome comparison with precise and fast similarity searches. Today, advanced seed-based techniques (spaced seeds, multiple seeds,...

Protein similarity search with subset seeds on a dedicated reconfigurable hardware (2007)

Peterlongo, Pierre, Noé, Laurent, Lavenier, Dominique, Georges, Gilles, Jacques, Julien, Kucherov, Gregory, ...

Genome sequencing of numerous species raises the need of complete genome comparison with precise and fast similarity searches. Today, advanced seed-based techniques (spaced seeds, multiple seeds,...

Reconsidering the significance of genomic word frequencies (2007)

Miklós Csürös, Laurent Noé, Gregory Kucherov

By conventional wisdom, a feature that occurs too often or too rarely in a genome can indicate a functional element. To infer functionality from frequency, it is crucial to precisely characterize...

Subset seed automaton (2007)

Gregory Kucherov, Laurent Noé, Mikhail Roytberg

We study the pattern matching automaton introduced in [1] for the purpose of seed-based similarity search. We show that our definition provides a compact automaton, much smaller than the one obtained...

Subset seed automaton (2007)

Gregory Kucherov, Laurent Noé, Mikhail Roytberg

We study the pattern matching automaton introduced in [1] for the purpose of seed-based similarity search. We show that our definition provides a compact automaton, much smaller than the one obtained...

Reconsidering the significance of genomic word frequency (2006)

Csűrös, Miklós, Noé, Laurent, Kucherov, Gregory

We propose that the distribution of DNA words in genomic sequences can be primarily characterized by a double Pareto-lognormal distribution, which explains lognormal and power-law features found...

A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract) (2006)

Kucherov, Gregory, Noe, Laurent, Roytberg, Mikhail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem - a set of...

Estimating seed sensitivity on homogeneous alignments (2006)

Kucherov, Gregory, Noe, Laurent, Ponty, Yann

We address the problem of estimating the sensitivity of seed-based similarity search algorithms. In contrast to approaches based on Markov models [18, 6, 3, 4, 10], we study the estimation based on...

A unifying framework for seed sensitivity and its application to subset seeds (2006)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mihkail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem -- a set of...

A unifying framework for seed sensitivity and its application to subset seeds (2006)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mihkail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem -- a set of...

A unifying framework for seed sensitivity and its application to subset seeds (2006)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mihkail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem -- a set of...

A unifying framework for seed sensitivity and its application to subset seeds (2006)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mihkail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem -- a set of...

Optimal Linear Arrangement of Interval Graphs (2006)

Cohen, Johanne, Fomin, Fedor, Heggernes, Pinar, Kratsch, Dieter, Kucherov, Gregory

We study the optimal linear arrangement (OLA) problem on interval graphs. Several linear layout problems that are NP-hard on general graphs are solvable in polynomial time on interval graphs.We prove...

Optimal Linear Arrangement of Interval Graphs (2006)

Cohen, Johanne, Fomin, Fedor, Heggernes, Pinar, Kratsch, Dieter, Kucherov, Gregory

We study the optimal linear arrangement (OLA) problem on interval graphs. Several linear layout problems that are NP-hard on general graphs are solvable in polynomial time on interval graphs.We prove...

A unifying framework for seed sensitivity and its application to subset seeds (2006)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mihkail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem -- a set of...

A unifying framework for seed sensitivity and its application to subset seeds (2006)

Gregory Kucherov, Laurent Noé, Mikhail Roytberg

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem – a...

Recherches de motifs et de similarités en bioinformatique : modélisations, solutions logicielles et matérielles (2005)

Giraud, Mathieu, Noé, Laurent, Kucherov, Gregory, Lavenier, Dominique

Ce tutoriel expose certains problèmes fondamentaux en algorithmique du texte pour la bioinformatique, leurs solutions actuelles ainsi que quelques perspectives de recherche. Après une introduction...

Combinatorial search on graphs motivated by bioinformatics applications: a brief survey (2005)

Bouvel, Mathilde, Grebinski, Vladimir, Kucherov, Gregory

The goal of this paper is to present a brief survey of a collection of methods and results from the area of combinatorial search \cite{aig,du-hwa} focusing on graph reconstruction using queries of...

Combinatorial search on graphs motivated by bioinformatics applications: a brief survey (2005)

Bouvel, Mathilde, Grebinski, Vladimir, Kucherov, Gregory

The goal of this paper is to present a brief survey of a collection of methods and results from the area of combinatorial search \cite{aig,du-hwa} focusing on graph reconstruction using queries of...

SIGffRid : Programme de recherche des sites de fixation des facteurs de transcription par approche comparative (2005)

Touzain, Fabrice, Schbath, Sophie, Debled-Rennesson, Isabelle, Aigle, Bertrand, Leblond, Pierre, Kucherov, Gregory

Notre objectif est la recherche des sites de fixation des sous-unités σ de l'ARN polymérase dans des génomes bactériens, sites généralement composés de deux « boîtes » dites -35 et -10...

YASS: enhancing the sensitivity of DNA similarity search (2005)

Noé, Laurent, Kucherov, Gregory

YASS is a DNA local alignment tool based on an efficient and sensitive filtering algorithm. It applies transition-constrained seeds to specify the most probable conserved motifs between homologous...

YASS: enhancing the sensitivity of DNA similarity search (2005)

Noé, Laurent, Kucherov, Gregory

YASS is a DNA local alignment tool based on an efficient and sensitive filtering algorithm. It applies transition-constrained seeds to specify the most probable conserved motifs between homologous...

Multi-seed lossless filtration (2005)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mikhail

We study a method of seed-based lossless filtration for approximate string matching and related bioinformatics applications. The method is based on a simultaneous use of several spaced seeds rather...

Multi-seed lossless filtration (2005)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mikhail

We study a method of seed-based lossless filtration for approximate string matching and related bioinformatics applications. The method is based on a simultaneous use of several spaced seeds rather...

A Unifying Framework for Seed Sensitivity and Its Application to Subset Seeds (2005)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mihkail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem -- a set of...

A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract) (2005)

Kucherov, Gregory, Noe, Laurent, Roytberg, Mikhail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem - a set of...

Recherches de motifs et de similarités en bioinformatique : modélisations, solutions logicielles et matérielles (2005)

Giraud, Mathieu, Noé, Laurent, Kucherov, Gregory, Lavenier, Dominique

Ce tutoriel expose certains problèmes fondamentaux en algorithmique du texte pour la bioinformatique, leurs solutions actuelles ainsi que quelques perspectives de recherche. Après une introduction...

SIGffRid : Programme de recherche des sites de fixation des facteurs de transcription par approche comparative (2005)

Touzain, Fabrice, Schbath, Sophie, Debled-Rennesson, Isabelle, Aigle, Bertrand, Leblond, Pierre, Kucherov, Gregory

Notre objectif est la recherche des sites de fixation des sous-unités σ de l'ARN polymérase dans des génomes bactériens, sites généralement composés de deux « boîtes » dites -35 et -10...

A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract) (2005)

Kucherov, Gregory, Noe, Laurent, Roytberg, Mikhail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem - a set of...

A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract) (2005)

Kucherov, Gregory, Noe, Laurent, Roytberg, Mikhail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem - a set of...

Recherches de motifs et de similarités en bioinformatique : modélisations, solutions logicielles et matérielles (2005)

Giraud, Mathieu, Noé, Laurent, Kucherov, Gregory, Lavenier, Dominique

Ce tutoriel expose certains problèmes fondamentaux en algorithmique du texte pour la bioinformatique, leurs solutions actuelles ainsi que quelques perspectives de recherche. Après une introduction...

SIGffRid : Programme de recherche des sites de fixation des facteurs de transcription par approche comparative (2005)

Touzain, Fabrice, Schbath, Sophie, Debled-Rennesson, Isabelle, Aigle, Bertrand, Leblond, Pierre, Kucherov, Gregory

Notre objectif est la recherche des sites de fixation des sous-unités σ de l'ARN polymérase dans des génomes bactériens, sites généralement composés de deux « boîtes » dites -35 et -10...

Multiseed Lossless Filtration (2005)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mikhail

We study a method of seed-based lossless filtration for approximate string matching and related bioinformatics applications. The method is based on a simultaneous use of several spaced seeds rather...

Multiseed Lossless Filtration (2005)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mikhail

We study a method of seed-based lossless filtration for approximate string matching and related bioinformatics applications. The method is based on a simultaneous use of several spaced seeds rather...

YASS: enhancing the sensitivity of DNA similarity (2005)

Laurent Noé, Gregory Kucherov

YASS is a DNA local alignment tool based on an efficient and sensitive filtering algorithm. It applies transition-constrained seeds to specify the most probable conserved motifs between homologous...

A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract) (2005)

Gregory Kucherov, Laurent Noé, Mikhail Roytberg

We propose a general approach to compute the seed sensitivity, that can be applied to di#erent definitions of seeds. It treats separately three components of the seed sensitivity problem -- a set of...

A unifying framework for seed sensitivity and its application to subset seeds (Extended abstract) (2005)

Gregory Kucherov, Laurent Noé, Mikhail Roytberg

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem -- a set of...

Multiseed Lossless Filtration (2005)

Gregory Kucherov, Laurent Noé, Mikhail Roytberg

We study a method of seed-based lossless filtration for approximate string matching and related bioinformatics applications. The method is based on a simultaneous use of several spaced seeds rather...

Multiseed lossless filtration (2005)

Gregory Kucherov, Laurent Noé, Mikhail Roytberg

We study a method of seed-based lossless filtration for approximate string matching and related bioinformatics applications. The method is based on a simultaneous use of several spaced seeds rather...

YASS: enhancing the sensitivity of DNA similarity search (2005)

Noé, Laurent, Kucherov, Gregory

YASS is a DNA local alignment tool based on an efficient and sensitive filtering algorithm. It applies transition-constrained seeds to specify the most probable conserved motifs between homologous...

A unifying framework for seed sensitivity and its application to subset seeds (2004)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mikhail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem --  a set...

Improved hit criteria for DNA local alignment (2004)

Noé, Laurent, Kucherov, Gregory

Abstract Background The hit criterion is a key component of heuristic local alignment algorithms. It specifies a class of patterns assumed to witness a potential similarity, and this choice is...

Multi-seed lossless filtration (Extended abstract) (2004)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mikhail

We study a method of seed-based lossless ltration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed...

Estimating seed sensitivity on homogeneous alignments (2004)

Kucherov, Gregory, Noe, Laurent, Ponty, Yann

We address the problem of estimating the sensitivity of seed-based similarity search algorithms. In contrast to approaches based on Markov models [18, 6, 3, 4, 10], we study the estimation based on...

Multi-seed lossless filtration (Extended abstract) (2004)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mikhail

We study a method of seed-based lossless ltration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed...

Estimating seed sensitivity on homogeneous alignments (2004)

Kucherov, Gregory, Noe, Laurent, Ponty, Yann

We address the problem of estimating the sensitivity of seed-based similarity search algorithms. In contrast to approaches based on Markov models [18, 6, 3, 4, 10], we study the estimation based on...

A unifying framework for seed sensitivity and its application to subset seeds (2004)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mikhail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem -- a set of...

A unifying framework for seed sensitivity and its application to subset seeds (2004)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mikhail

We propose a general approach to compute the seed sensitivity, that can be applied to different definitions of seeds. It treats separately three components of the seed sensitivity problem -- a set of...

Estimating seed sensitivity on homogeneous alignments (2004)

Kucherov, Gregory, Noe, Laurent, Ponty, Yann

We address the problem of estimating the sensitivity of seed-based similarity search algorithms. In contrast to approaches based on Markov models [18, 6, 3, 4, 10], we study the estimation based on...

Multi-seed lossless filtration (Extended abstract) (2004)

Kucherov, Gregory, Noé, Laurent, Roytberg, Mikhail

We study a method of seed-based lossless ltration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single seed...

Multi-seed lossless filtration (Extended abstract) (2004)

Gregory Kucherov, Laurent Noé, Mikhail Roytberg

We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single...

Multi-seed lossless filtration (Extended abstract) (2004)

Gregory Kucherov, Laurent Noé, Mikhail Roytberg

We study a method of seed-based lossless filtration for approximate string matching and related applications. The method is based on a simultaneous use of several spaced seeds rather than a single...

Estimating Seed Sensitivity on Homogeneous Alignments (2004)

Gregory Kucherov, Laurent Noé, Yann Ponty

We address the problem of estimating the sensitivity of seed-based similarity search algorithms. In contrast to approaches based on Markov models [18, 6, 3, 4, 10], we study the estimation based on...

Improved hit criteria for DNA local alignment (2004)

Laurent Noé, Gregory Kucherov

The hit criterion is a key component of heuristic local alignment algorithms. It specifies a class of patterns assumed to witness a potential similarity, and this choice is decisive for the...

Improved hit criteria for DNA local alignment (2004)

Laurent Noé, Gregory Kucherov

Background: The hit criterion is a key component of heuristic local alignment algorithms. It specifies a class of patterns assumed to witness a potential similarity, and this choice is decisive for...

Estimating Seed Sensitivity on Homogeneous Alignments (2004)

Gregory Kucherov

We address the problem of estimating the sensitivity of seed-based similarity search algorithms. In contrast to approaches based on Markov models [18, 6, 3, 4, 10], we study the estimation based on...

Improved hit criteria for DNA local alignment (2004)

Laurent Noé, Gregory Kucherov

The hit criterion is a key component of heuristic local alignment algorithms. It specifies a class of patterns assumed to witness a potential similarity, and this choice is decisive for the...

Estimating seed sensitivity on homogeneous alignments (2003)

Kucherov, Gregory, Noé, Laurent, Ponty, Yann

We address the problem of measuring the sensitivity of seed-based similarity search algorithms. In contrast to approaches based on Markov models, we study the measurement based on homogeneous...

YASS: Similarity search in DNA sequences (2003)

Noé, Laurent, Kucherov, Gregory

We describe YASS -- a new tool for finding local similarities in DNA sequences. The YASS algorithm first scans the sequence(s) and creates on the fly groups of (small exact repeats obtained by...

Estimating seed sensitivity on homogeneous alignments (2003)

Kucherov, Gregory, Noé, Laurent, Ponty, Yann

We address the problem of measuring the sensitivity of seed-based similarity search algorithms. In contrast to approaches based on Markov models, we study the measurement based on homogeneous...

YASS: Similarity search in DNA sequences (2003)

Noé, Laurent, Kucherov, Gregory

We describe YASS -- a new tool for finding local similarities in DNA sequences. The YASS algorithm first scans the sequence(s) and creates on the fly groups of (small exact repeats obtained by...

YASS: Similarity search in DNA sequences (2003)

Noé, Laurent, Kucherov, Gregory

We describe YASS -- a new tool for finding local similarities in DNA sequences. The YASS algorithm first scans the sequence(s) and creates on the fly groups of (small exact repeats obtained by...

Estimating seed sensitivity on homogeneous alignments (2003)

Kucherov, Gregory, Noé, Laurent, Ponty, Yann

We address the problem of measuring the sensitivity of seed-based similarity search algorithms. In contrast to approaches based on Markov models, we study the measurement based on homogeneous...

mreps: efficient and flexible detection of tandem repeats in dna (2003)

Roman Kolpakov, Ghizlane Bana, Gregory Kucherov

The presence of repeated sequences is a fundamental feature of genomes. Tandemly repeated DNA appears in both eukaryotic and prokaryotic genomes, it is associated with various regulatory mechanisms...

Identification of Transcription Factor Binding Sites in Streptomyces coelicolor A3(2) by Phylogenetic Comparison (2003)

Touzain Loria Fr, Fabrice Touzain, Patricia Lavigne, Isabelle Debled-rennesson, Pierre Leblond, Gregory Kucherov

Introduction The goal of this work is to identify, through computer methods, nucleotide patterns involved in transcription mechanisms in the bacteria Streptomyces coelicolor. In a preceding work [1],...

YASS: Similarity search in DNA sequences (2003)

Laurent Noé, Gregory Kucherov

We describe YASS – a new tool for finding local similarities in DNA sequences. The YASS algorithm first scans the sequence(s) and creates on the fly groups of seeds (small exact repeats obtained by...

Estimating seed sensitivity on homogeneous alignments (2003)

Gregory Kucherov, Laurent Noé, Yann Ponty

We address the problem of measuring the sensitivity of seed-based similarity search algorithms. In contrast to approaches based on Markov models, we study the measurement based on homogeneous...

How Many Square Occurrences Must a Binary Sequence Contain? (2003)

Gregory Kucherov, Pascal Ochem, Michaël Rao

Every binary word with at least four letters contains a square. A. Fraenkel and J. Simpson showed that three distinct squares are necessary and su#cient to construct an infinite binary word. We study...

mreps: efficient and flexible detection of tandem repeats in DNA (2003)

Kolpakov, Roman, Bana, Ghizlane, Kucherov, Gregory

The presence of repeated sequences is a fundamental feature of genomes. Tandemly repeated DNA appears in both eukaryotic and prokaryotic genomes, it is associated with various regulatory mechanisms...

A new method of finding similarity regions in DNA sequences (2002)

Laurent Noé, Gregory Kucherov

Identifying similarity regions inside a DNA sequence (repeats), or between two sequences (local alignment), is a fundamental problem in bioinformatics. For this task, many algorithms use a technique...

Finding approximate repetitions under Hamming distance (2001)

Kolpakov, Roman, Kucherov, Gregory

The problem of computing tandem repetitions with $K$ possible mismatches is studied. Two main definitions are considered, and for both of them an $O(nK\log K+S)$ algorithm is proposed ($S$ the size...

Finding approximate repetitions under Hamming distance (2001)

Kolpakov, Roman, Kucherov, Gregory

The problem of computing tandem repetitions with $K$ possible mismatches is studied. Two main definitions are considered, and for both of them an $O(nK\log K+S)$ algorithm is proposed ($S$ the size...

Finding approximate repetitions under Hamming distance (2001)

Kolpakov, Roman, Kucherov, Gregory

The problem of computing tandem repetitions with $K$ possible mismatches is studied. Two main definitions are considered, and for both of them an $O(nK\log K+S)$ algorithm is proposed ($S$ the size...

Finding Repeats With Fixed Gap (2000)

Kolpakov, Roman, Kucherov, Gregory

We propose an algorithm for finding in a word all pairs of occurrences of the same subword with a given distance r between them. The obtained complexity is O(n\log r +S), where S is the size of the...

Finding Repeats With Fixed Gap (2000)

Kolpakov, Roman, Kucherov, Gregory

We propose an algorithm for finding in a word all pairs of occurrences of the same subword with a given distance r between them. The obtained complexity is O(n\log r +S), where S is the size of the...

Finding Repeats With Fixed Gap (2000)

Kolpakov, Roman, Kucherov, Gregory

We propose an algorithm for finding in a word all pairs of occurrences of the same subword with a given distance r between them. The obtained complexity is O(n\log r +S), where S is the size of the...

Finding Repeats With Fixed Gap (2000)

Roman Kolpakov, Gregory Kucherov

We propose an algorithm for finding in a word all pairs of occurrences of the same subword within a given distance r. The obtained complexity is O(n log r + S), where S is the size of the output. We...

Maximal repetitions and Application to DNA sequences (2000)

Mathieu Giraud, Gregory Kucherov

In this paper we describe an implementation of Main-Kolpakov-Kucherov algorithm [9] of linear-time search for maximal repetitions in sequences. We first present a theoretical background and sketch...

The Complexity of Some Complementation Problems (1999)

Plaisted, David A., Kucherov, Gregory

We study the computational complexity of the problem of computing a complement representation for a set of terms. Depending on the class of sets considered, some sets are shown to have an exponential...

The Complexity of Some Complementation Problems (1999)

Plaisted, David A., Kucherov, Gregory

We study the computational complexity of the problem of computing a complement representation for a set of terms. Depending on the class of sets considered, some sets are shown to have an exponential...

The Complexity of Some Complementation Problems (1999)

Plaisted, David A., Kucherov, Gregory

We study the computational complexity of the problem of computing a complement representation for a set of terms. Depending on the class of sets considered, some sets are shown to have an exponential...

Patterns in words versus patterns in trees: a brief survey and new results (1999)

Gregory Kucherov

Abstract. In this paper we study some natural problems related to specifying sets of words and trees by patterns. 1

On Maximal Repetitions in Words (1999)

Roman Kolpakov, Gregory Kucherov

A (fractional) repetition in a word w is a subword with the period of at most half of the subword length. We study maximal repetitions occurring in w, that is those for which any extended subword of...

On Repetition-Free Binary Words of Minimal Density (1999)

Roman Kolpakov, Gregory Kucherov, Yuri Tarannikov

We study the minimal proportion (density) of one letter in n-th power-free binary words. First, we introduce and analyse a general notion of minimal letter density for any innite set of words which...

Finding Maximal Repetitions in a Word in Linear Time (1999)

Roman Kolpakov, Gregory Kucherov

A repetition in a word w is a subword with the period of at most half of the subword length. We study maximal repetitions occurring in w, that is those for which any extended subword of w has a...

Reconstructing Set Partitions (1999)

Vladimir Grebinski, Gregory Kucherov

. The algorithm proceeds as follows. 1. Split the set [n] into two disjoint subsets of n=2 elements. 2. Reconstruct recursively the induced partition for each of the two subsets. 3. Choose one...

The Complexity of Some Complementation Problems (1998)

David A. Plaisted, Gregory Kucherov

We study the computational complexity of the problem of computing a complement representation for a set of terms. Depending on the classe of sets considered, some sets are shown to have an...

Maximal Repetitions in Words or How to Find all Squares in Linear Time (1998)

Roman Kolpakov And, Roman Kolpakov, Gregory Kucherov

A (fractional) repetition in a word w is a subword with the period of at most half of the subword length. We study maximal repetitions occurring in w, that is those for which any extended subword of...

Maximal Repetitions in Words or How to Find all Squares in Linear Time (1998)

Roman Kolpakov, Gregory Kucherov

A (fractional) repetition in a word w is a subword with the period of at most half of the subword length. We study maximal repetitions occurring in w, that is those for which any extended subword of...

Optimal Reconstruction of Graphs Under the Additive Model (1997)

Grebinski, Vladimir, Kucherov, Gregory

solution of $O(n^2/\log n)$ queries for general graphs can be obtained.

Optimal Reconstruction of Graphs Under the Additive Model (1997)

Grebinski, Vladimir, Kucherov, Gregory

We study the problem of combinatorial search for graphs under the additive model. The main result concerns the reconstruction of bounded degree graphs, i.e. graphs with the degree of all vertices...

Optimal Reconstruction of Graphs Under the Additive Model (1997)

Grebinski, Vladimir, Kucherov, Gregory

We study the problem of combinatorial search for graphs under the additive model. The main result concerns the reconstruction of bounded degree graphs, i.e. graphs with the degree of all vertices...

Optimal Reconstruction of Graphs Under the Additive Model (1997)

Vladimir Grebinski, Vladimir Grebinski, Gregory Kucherov, Gregory Kucherov

: We study the problem of combinatorial search for graphs under the additive model. The main result concerns the reconstruction of bounded degree graphs, i.e. graphs with the degree of all vertices...

Optimal Query Bounds for Reconstructing a Hamiltonian Cycle in Complete Graphs (extended abstract) (1997)

Vladimir Grebinski, Gregory Kucherov

This paper studies four combinatorial search models of reconstructing a fixed unknown Hamiltonian cycle in the complete graph by means of queries about subgraphs. For each model, an efficient...

Optimal Reconstruction of Graphs Under the Additive Model (1997)

Vladimir Grebinski, Gregory Kucherov

We study the problem of combinatorial search for graphs under the additive model. The main result concerns the reconstruction of bounded degree graphs, i.e. graphs with the degree of all vertices...

Minimal Letter Frequency in N-Th Power-Free Binary Words (1997)

Roman Kolpakov, Gregory Kucherov

We show that the minimal proportion of one letter in an n-th power-free binary word is asymptotically 1=n. We also consider a generalization of n-th power-free words defined through the notion of...

Reconstructing a Hamiltonian Circuit by Querying the Graph: Application to DNA Physical Mapping (1996)

Vladimir Grebinski, Gregory Kucherov

This paper studies three mathematical models of the multiplex PCR method of genome physical mapping described in [12]. The models are expressed as combinatorial group testing problems of finding an...

Matching a Set of Strings with Variable Length Don't Cares (1995)

Gregory Kucherov, Michaël Rusinowitch

Given an alphabet A, a pattern p is a word v 1 @ : : : @vm , where v i 2 A and @ = 2 A is a distinguished symbol called a variable length don't care symbol. Pattern p matches a text t 2 A if t =...

Matching a Set of Strings with Variable Length Don't Cares (1995)

Gregory Kucherov, Michael Rusinowitch

this paper we address the following problem: given a set P of patterns and a text t, test whether one of the patterns of P matches

Undecidability of Ground Reducibility for Word Rewriting Systems With Variables (1995)

Gregory Kucherov, Michael Rusinowitch

this paper we study the following ground reducibility problem: for a given word with variables w, does

The Complexity of Testing Ground Reducibility for Linear Word Rewriting Systems with Variables (1995)

Gregory Kucherov, Michal Rusinowitch

In [9] we proved that for a word rewriting system with variables R and a word with variables w, it is undecidable if w is ground reducible by R, that is if all the instances of w obtained by...

Some Results on Top-context-free Tree Languages (1994)

Dieter Hofbauer, Maria Huber, Gregory Kucherov

. Top-context-free tree languages (called cor#gulier by Arnold and Dauchet [1, 2]) constitute a natural subclass of context-free tree languages. In this paper, we give further evidence for the...

On the ground reducibility problem for word rewriting systems with variables (1993)

Kucherov, Gregory, Rusinowitch, Michaël

Given a word rewriting system with variables R and a word with variables w the question we are interested in is whether all the instances of w obtained by substituting its variables by non-empty...

On the ground reducibility problem for word rewriting systems with variables (1993)

Kucherov, Gregory, Rusinowitch, Michaël

Given a word rewriting system with variables R and a word with variables w the question we are interested in is whether all the instances of w obtained by substituting its variables by non-empty...

Decidability of Regularity and Related Properties of Ground Normal Form Languages (1992)

Gregory Kucherov, Mohamed Tajine, Rue Ren Descartes

We study language-theoretical properties of the set of reducible ground terms and its complement - the set of ground normal forms induced by a given rewriting system. As a tool for our analysis we...

mreps: efficient and flexible detection of tandem repeats in DNA

Kolpakov, Roman, Bana, Ghizlane, Kucherov, Gregory

The presence of repeated sequences is a fundamental feature of genomes. Tandemly repeated DNA appears in both eukaryotic and prokaryotic genomes, it is associated with various regulatory mechanisms...

YASS: enhancing the sensitivity of DNA similarity search

Noé, Laurent, Kucherov, Gregory

YASS is a DNA local alignment tool based on an efficient and sensitive filtering algorithm. It applies transition-constrained seeds to specify the most probable conserved motifs between homologous...

mreps: efficient and flexible detection of tandem repeats in DNA

Kolpakov, Roman, Bana, Ghizlane, Kucherov, Gregory

The presence of repeated sequences is a fundamental feature of genomes. Tandemly repeated DNA appears in both eukaryotic and prokaryotic genomes, it is associated with various regulatory mechanisms...

YASS: enhancing the sensitivity of DNA similarity search

Noé, Laurent, Kucherov, Gregory

YASS is a DNA local alignment tool based on an efficient and sensitive filtering algorithm. It applies transition-constrained seeds to specify the most probable conserved motifs between homologous...

NORINE: a database of nonribosomal peptides

Caboche, Ségolène, Pupin, Maude, Leclère, Valérie, Fontaine, Arnaud, Jacques, Philippe, Kucherov, Gregory

Norine is the first database entirely dedicated to nonribosomal peptides (NRPs). In bacteria and fungi, in addition to the traditional ribosomal proteic biosynthesis, an alternative...