Dennis Shasha

Efficient and Correct Execution of Parallel Programs that Share Memory (2009)

Dennis Shasha, Marc Snir

In this paper we consider an optimization problem that arises in the execution of parallel programs on shared-memory multiple-instruction-stream, multiple-data-stream (MIMD) computers. A program on...

Repeatability & Workability Evaluation of SIGMOD 2009 (2009)

Manegold, Stefan, Manolescu, Ioana, Afanasiev, Loredana, Feng, Jianling, Gou, Gang, Hadjieleftheriou, Marios, ...

This paper reports on the SIGMOD 2009 Repeatability and Workability initiative, an effort to verify the experiments presented in articles accepted to the ACM SIGMOD 2009 conference. We summarize the...

Fast Structural Search in Phylogenetic Databases (2008)

Huiyuan Shan, Dennis Shasha, William H. Piel

As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has...

Secure Untrusted Data Repository (SUNDR) (2008)

Jinyuan Li, Maxwell Krohn, David Mazières, Dennis Shasha

SUNDR is a network file system designed to store data securely on untrusted servers. SUNDR lets clients detect any attempts at unauthorized file modification by malicious server operators or users....

Connection Caching • Additional Performance Enhancing Techniques (2008)

Thomas Kyte, Dennis Shasha

• Constraints: in Java vs. the DBMS • Another look at transactions and locking • Efficient interaction with the database • How to generate numbers in a sequence?

GraphFind: enhancing graph searching by low support data mining techniques (2008)

Ferro, Alfredo, Giugno, Rosalba, Mongiovì, Misael, Pulvirenti, Alfredo, Skripin, Dmitry, Shasha, Dennis

Background: Biomedical and chemical databases are large and rapidly growing in size. Graphs naturally model such kinds of data. To fully exploit the wealth of information in these graph...

An integrated genetic, genomic and systems approach defines gene networks regulated by the interaction of light and carbon signaling pathways in Arabidopsis (2008)

Thum, Karen E, Shin, Michael J, Gutiérrez, Rodrigo A, Mukherjee, Indrani, Katari, Manpreet S, Nero, Damion, ...

Background: Light and carbon are two important interacting signals affecting plant growth and development. The mechanism(s) and/or genes involved in sensing and/or mediating the signaling...

List of Supported Students and Staff (2008)

Dennis Shasha, Guangwei Dai, Rosalba Giugno, ...

techniques and query languages, pattern matching and recognition, scientific data mining Project Summary The goal of this research project is to make it possible to process approximate queries on...

Database (2008)

Alfredo Ferro, Rosalba Giugno, Misael Mongioví, Alfredo Pulvirenti, Dmitry Skripin, Dennis Shasha, ...

(molecules, networks) •Prediction of the functionality of new natural or synthesized compounds •Make a compound Q more active •Find fragment with the same function among different species...

Using a Relational System on Wall Street: The Good, the Bad, the Ugly, and the Ideal (2008)

Edgar H. Sibley, Steve Rozen, Dennis Shasha

Developers of a Wall Street financial application were able to exploit a relational DBMS to advantage for some data management tasks (the good). For others, the relational system was not helpful (the...

Evaluating A Class of Distance-Mapping Algorithms for Data Mining and Clustering (2008)

Jason Tsong-Li Wang, Xiong Wang, Dennis Shasha, Bruce A. Shapiro

A distance-mapping algorithm takes a set of objects and a distance metric and then maps those objects to a Euclidean or pseudo-Euclidean space in such a way that the distances among objects are...

Exact and Approximate Algorithms for Unordered Tree Matching (2008)

Dennis Shasha, Kaizhong Zhang, Frank Y. Shih

We consider the problem of comparison between unordered trees, i.e., trees for which the order among siblings is unimportant. The criterion for comparison is the distance as measured by a...

GhostDB: Hiding Data from Prying Eyes (2008)

Christophe Salperwyck, Nicolas Anciaux, Mehdi Benzine, Luc Bouganim, Dennis Shasha

Imagine that you have been entrusted with private data, such as corporate product information, sensitive government information, or symptom and treatment information about hospital patients. You may...

Homology search for genes, Bioinformatics Vol. 23 (ISMB/ECCB 2007), pages i97–i103, doi:10.1093/bioinformatics/btm225 (2008)

Xuefeng Cui, Broňa Brejová, Dennis Shasha, Ming Li

Motivation: Life science researchers often require an exhaustive list of protein coding genes similar to a given query gene. To find such genes, homology search tools, such as BLAST or PatternHunter,...

MetricMap: An embedding technique for processing distance-based queries in metric spaces (2008)

Xiong Wang, Dennis Shasha, Kaizhong Zhang

In this paper, we present an embedding technique, called MetricMap, which is capable of estimating distances in a pseudometric space. Given a database of objects and a distance function...

Revelation on Demand (2008)

Anciaux, Nicolas, Benzine, Mehdi, Bouganim, Luc, Pucheral, Philippe, Shasha, Dennis

Private data sometimes must be made public. A corporation may keep its customer sales data secret, but reveals totals by sector for marketing reasons. A hospital keeps individual patient data secret,...

BMC Systems Biology BioMed Central (2008)

Karen E Thum, Michael J Shin, Rodrigo A Gutiérrez, Indrani Mukherjee, Manpreet S Katari, Damion Nero, ...

Research article An integrated genetic, genomic and systems approach defines gene networks regulated by the interaction of light and carbon signaling pathways in Arabidopsis

y (2007)

Steve Rozen, Bruce A. Shapiro, Dennis Shasha, Zhiyuan Wang, Maisheng Yin

Key words: algorithms, consensus sequence, pattern matching, tools for computational biology, DNA sequence recognition

Bruce Shapiro x (2007)

Thomas G. Marr, Dennis Shasha, Gung-wei Chirn

We describe a method for discovering active motifs in a set of related protein sequences. The method is an automatic two step process: (1) find candidate motifs in a small sample of the sequences;...

An index structure for data mining and clustering (2007)

Xiong Wang, King-ip Lin, Dennis Shasha, Bruce A. Shapiro, Kaizhong Zhang

In this paper we present an index structure, called MetricMap, that takes a set of objects and a distance metric and then maps those objects to a k-dimensional space in such a way that the...

2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm (2007)

Theodore Johnson, Dennis Shasha

In a path-breaking paper last year Pat and Betty O'Neil and Gerhard Weikum proposed a self-tuning improvement to the Least Recently Used (LRU) buffer management algorithm[15]. Their improvement...

Synthesizing Arbitrary Genomes (2007)

Dennis Shasha, Zasha Weinberg

Suppose a researcher wants a long arbitrary sequence of nucleotides and asks a lab to synthesize it. Oligonucleotides are generally of length less than 100, so it is necessary to resort to...

Bulletin of the Technical Committee on (2007)

Viswanath Poosala, Venkatesh Ganti, Yannis E. Ioannidis, Dennis Shasha, ...

Answering queries approximately has recently been proposed as a way to reduce query response times in on-line decision support systems, when the precise answer is not necessary or early feedback is...

Cv (2007)

Xiong Wang, King-ip Lin, ...

ditor, Combinatorial Pattern Matching, pages 104–117, Lecture Notes in Computer Science, Springer-Verlag, 1998. • Xiong Wang, Jason T.L. Wang, Dennis Shasha, Bruce Shapiro, Sitaram Dikshitulu,...

TREEDIFF: A System for Document Comparison by Structure (2007)

Girish Patel, Liam Relihan, Dennis Shasha, Kaizhong Zhang, ...

rstructure comparison(A, B): to determine whether or not A contains B as an approximate subtree; it returns the closest matching subtree of A and the distance between that subtree and B. 3 Query...

Building secure file systems out of Byzantine storage (2007)

David Mazières, Dennis Shasha

This paper shows how to implement a trusted network file system on an untrusted server. While cryptographic storage techniques exist that allow users to keep data secret from untrusted servers, this...

and (2007)

Kaizhong Zhang, Dennis Shasha (communicated by T. Jiang)

We consider the problem of comparing CUAL graphs (Connected, Undirected, Acyclic graphs with nodes being Labeled). This problem is motivated by the study of information retrieval for bio-chemical and...

Don't Trust Your File Server (2007)

David Mazières, Dennis Shasha

All too often, decisions about whom to trust in computer systems are driven by the needs of system management rather than data security. In particular, data storage is often entrusted to people who...

Francois Llirbat (2007)

Inria Rocquencourt, Inria Rocquencourt, Inria Rocquencourt, H. Arno Jacobsen, Dennis Shasha

Large-scale information dissemination systems for selective information distribution are gaining increasing importance. This is due to the fast proliferation of different communication infrastructures,...

New techniques for extracting features from protein sequences (2007)

Qicheng Ma, Dennis Shasha, Cathy H. Wu

In this paper we propose new techniques to extract features from protein sequences. We then use the features as inputs for a Bayesian neural network (BNN) and apply the BNN to classifying protein...

Secure Untrusted Data Repository (SUNDR) (2007)

Jinyuan Li, Maxwell Krohn, David Mazières, Dennis Shasha

We have implemented a secure network file system called SUNDR that guarantees the integrity of data even when malicious parties control the server. SUNDR splits storage functionality between two...

Homology search for genes (2007)

Cui, Xuefeng, Vinar, Tomás, Brejová, Brona, Shasha, Dennis, Li, Ming

Motivation: Life science researchers often require an exhaustive list of protein coding genes similar to a given query gene. To find such genes, homology search tools, such as BLAST or PatternHunter,...

DNA Hash Pooling and its Applications (2007)

Shasha, Dennis, Amos, Martyn

In this paper we describe a new technique for the comparison of populations of DNA strands. Comparison is vital to the study of ecological systems, at both the micro and macro scales. Existing...

GhostDB: Querying Visible and Hidden Data Without Leaks (2007)

Anciaux, Nicolas, Benzine, Mehdi, Bouganim, Luc, Pucheral, Philippe, Shasha, Dennis

Imagine that you have been entrusted with private data, such as corporate product information, sensitive government information, or symptom and treatment information about hospital patients. You may...

GhostDB: Hiding Data from Prying Eyes (2007)

Salperwyck, Christophe, Anciaux, Nicolas, Benzine, Mehdi, Bouganim, Luc, Pucheral, Philippe, Shasha, Dennis

Imagine that you have been entrusted with private data, such as corporate product information, sensitive government information, or symptom and treatment information about hospital patients. You may...

GhostDB: Hiding Data from Prying Eyes (2007)

Salperwyck, Christophe, Anciaux, Nicolas, Benzine, Mehdi, Bouganim, Luc, Pucheral, Philippe, Shasha, Dennis

corporate product information, sensitive government information, or symptom and treatment information about hospital patients. You may want to issue queries whose result will combine private and...

Querying and Aggregating Visible and Hidden Data Without Leaks (2007)

Anciaux, Nicolas, Benzine, Mehdi, Bouganim, Luc, Pucheral, Philippe, Shasha, Dennis

Imagine that you have been entrusted with private data, such as corporate product information, sensitive government information, or symptom and treatment information about hospital patients. You may...

The Design of Griffin: A Common Prototyping Language (2006)

Dewar, Robert, Goldberg, Benjamin, Harrison, Malcolm, Schonberg, Edmond, Shasha, Dennis

The objective of the Griffin project at NYU is the design of a language, called Griffin, for prototyping large software systems. The success and cost-effectiveness of prototyping depends on, among...

2.3. Desktop Search...... 7 (2006)

Jeffrey Borden, Dennis Shasha, Chris Harrison, Stacey Kuznetsov, ...

I would like to thank Professor Dennis Shasha for his continued support and guidance. I am also grateful to my two co-researchers Chris Harrison and Stacey Kuznetsov.

2.2 User Reliance........ 4 (2006)

Chris Harrison, Dennis Shasha, Stacey Kuznetsov

over the course of the Spring 2006 semester. The text analytics package was developed by Jeff Borden.

Fast Structural Search in Phylogenetic Databases (2005)

Huiyuan Shan, Dennis Shasha, William H. Piel

As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become...

Fast algorithms for time series with applications to finance, physics, music, biology, and other suspects (2004)

Alberto Lerner, Dennis Shasha, Zhihua Wang, Xiaojian Zhao, Yunyue Zhu

Financial time series streams are watched closely by millions of traders. What exactly do they look for and how can we help them do it faster? Physicists study the time series emerging from their...

Unordered Tree Mining with Applications to Phylogeny (2004)

Dennis Shasha

Frequent structure mining (FSM) aims to discover and extract patterns frequently occurring in structural data, such as trees and graphs. FSM finds many applications in bioinformatics, XML processing,...

Activist data mining for computational science: tools and applications (2003)

Dennis Shasha

Classical data mining involves: waiting for data to appear and then mining it. Activist data mining involves: proposing experiments based on algorithmic and application-specific considerations,...

Treerank: A similarity measure for nearest neighbor searching in phylogenetic databases (2003)

Huiyuan Shan, Dennis Shasha, William H. Piel

Phylogenetic trees are unordered labeled trees in which each leaf node has a label and the order among siblings is unimportant. In this paper we propose a new similarity measure, called TreeRank, for...

Algorithmics and Applications of Tree and Graph Searching (2002)

Dennis Shasha, Rosalba Giugno

Modern search engines answer keyword-based queries extremely efficiently. The impressive speed is due to clever inverted index structures, caching, a domain-independent knowledge of strings, and...

Finding approximate patterns in undirected acyclic graphs (2002)

Kaizhong Zhang, George Chang, Dennis Shasha

We consider an approximate pattern matching problem for undirected acyclic graphs. Specifically, let P be a pattern graph, D a data graph and t an integer. We present an algorithm to locate a...

ATreeGrep: Approximate Searching in Unordered Trees (2002)

Dennis Shasha, Huiyuan Shan, Kaizhong Zhang

An unordered labeled tree is a tree in which each node has a string label and the parent-child relationship is significant, but the order among siblings is unimportant. This paper presents an...

Finding Patterns in Three Dimensional Graphs: Algorithms and Applications to Scientific Data Mining (2002)

Xiong Wang, Dennis Shasha, Bruce A. Shapiro, Isidore Rigoutsos, Kaizhong Zhang

This paper presents a method for finding patterns in three dimensional (3D) graphs. Each node in a graph is an undecomposable or atomic unit and has a label. Edges are links between the atomic units....

A Structure-Based Search Engine for Phylogenetic Databases (2002)

Huiyuan Shan, Katherine G. Herbert, William H. Piel, Dennis Shasha

Phylogenetic trees are essential for understanding the relationships among organisms or taxa. Many of the current techniques for searching phylogenetic repositories allow the user to perform a...

Declarative Data Cleaning : Language, Model, and Algorithms (2001)

Galhardas, Helena, Florescu, Daniela, Shasha, Dennis, Simon, Eric, Saita, Cristian

The problem of data cleaning, which consists of removing inconsistencies and errors from original data sets, is well known in the area of decision support systems and data warehouses. However, for...

New techniques for extracting features from protein sequences (2001)

Qicheng Ma, Dennis Shasha, Cathy H. Wu

In this paper we propose new techniques to extract features from protein sequences. We then use the features as inputs for a Bayesian neural network (BNN) and apply the BNN to classifying protein...

DNA sequence classification via an expectation maximization algorithm and neural networks: A case study (2001)

Qicheng Ma, Dennis Shasha, Cathy H. Wu

This paper presents new techniques for biosequence classification, with a focus on recognizing E. Coli promoters in DNA. Specifically, given an unlabeled DNA sequence S, we want to determine whether...

Improving data cleaning quality using a data lineage facility (2001)

Helena Galhardas, Inria Rocquencourt, Daniela Florescu, Dennis Shasha, Eric Simon, Cristian-Augustin Saita

The problem of data cleaning, which consists of removing inconsistencies and errors from original data sets, is well known in the area of decision support systems and data warehouses. However, for...

Declarative Data Cleaning: Language, Model, and Algorithms (2001)

Helena Galhardas, Daniela Florescu, Dennis Shasha

The problem of data cleaning, which consists of removing inconsistencies and errors from original data sets, is well known in the area of decision support systems and data warehouses. This holds...

Filtering algorithms and implementation for very fast publish/subscribe systems (2001)

Françoise Fabret, Arno Jacobsen, François Llirbat, João Pereira, Ken Ross, Dennis Shasha

Publish/Subscribe is the paradigm in which users express long-term interests (subscriptions) and some external agent (perhaps other users) publishes events (e.g., offers). The job of Publish/Subscribe...

Filtering Algorithms and Implementation for Very Fast Publish/Subscribe Systems (2001)

Françoise Fabret, H. Arno Jacobsen, François Llirbat, João Pereira, Kenneth A. Ross, ...

Publish/Subscribe is the paradigm in which users express long-term interests ("subscriptions") and some agent "publishes" events (e.g., offers). The job of Publish/Subscribe software is to...

Efficient Matching Algorithms for Publish and Subscribe Systems (2000)

Fabret, Françoise, Llirbat, François, Jacobsen, Arno, Pereira, João, Ross, Kenneth, Shasha, Dennis

Publish/Subscribe is the paradigm in which users express long-term interests («subscriptions») and some external agent (perhaps other users) «publishes» events (e.g., offers). The job of...

Application of neural networks to biological data mining: A case study in protein sequence classification (2000)

Qicheng Ma, Dennis Shasha, Cathy H. Wu

Biological data mining aims to extract significant information from DNA, RNA and proteins. The significant information may refer to motifs, functional sites, clustering and classification rules. This...

An extensible framework for data cleaning (2000)

Helena Galhardas, Daniela Florescu, Dennis Shasha, Eric Simon

Data integration solutions dealing with large amounts of data have been strongly required in the last few years. Besides the traditional data integration problems (e.g. schema integration, local to...

Efficient Matching for Content-based Publish/Subscribe Systems (2000)

Françoise Fabret, François Llirbat, João Pereira, Inria Rocquencourt, Dennis Shasha

This paper describes the event model, subscription language and the matching algorithms developed for our context-based pub/sub system. The event model and subscription language adopted in our...

Publish/Subscribe on the Web at Extreme Speed (2000)

Françoise Fabret, François Llirbat, João Pereira, Inria Rocquencourt, ...

This demonstration presents Le Subscribe, an event notification system for the Web. It is widely accepted that the majority of human information will be on the Web in ten years. As...

Declaratively cleaning your data using AJAX (2000)

Helena Galhardas, Daniela Florescu, Dennis Shasha, Eric Simon

Data quality concerns arise when correcting anomalies in a single data source, or integrating data coming from multiple sources into a single data repository. The information handled may also need to...

Efficient Matching for Web-Based Publish/subscribe Systems (2000)

João Pereira, Françoise Fabret, François Llirbat, Dennis Shasha, Inria Rocquencourt

There is a need for systems being able to capture the dynamic aspect of the web information by notifying users of interesting events. Content-based publish/subscribe systems are an emerging type of...

An index structure for data mining and clustering (2000)

Xiong Wang, King-ip Lin, Dennis Shasha, Bruce A. Shapiro, Kaizhong Zhang

In this paper we present an index structure, called Metric-Map, that takes a set of objects and a distance metric and then maps those objects to a k-dimensional space in such a way that the...

Making Snapshot Isolation Serializable (2000)

Alan Fekete, Dennis Shasha

Snapshot Isolation (SI) is a multiversion concurrency control algorithm, first described in Berenson et al. [1995]. SI is attractive because it provides an isolation level that avoids many of the...

An Extensible Framework for Data Cleaning (1999)

Galhardas, Helena, Florescu, Daniela, Shasha, Dennis, Simon, Eric

Data integration solutions dealing with large amounts of data have been strongly required in the last few years. Besides the traditional data integration problems (e.g. schema integration, local to...

Evaluating A Class of Distance-Mapping Algorithms for Data Mining and Clustering (1999)

Xiong Wang, King-ip Lin, Dennis Shasha, Bruce A. Shapiro, Kaizhong Zhang

A distance-mapping algorithm takes a set of objects and a distance metric and then maps those objects to a Euclidean or pseudo-Euclidean space in such a way that the distances among objects are...

New techniques for DNA sequence classification (1999)

Steve Rozen, Bruce A. Shapiro, Dennis Shasha, Zhiyuan Wang, Maisheng Yin

DNA sequence classification is the activity of determining whether or not an unlabeled sequence S belongs to an existing class C. This paper proposes two new techniques for DNA sequence...

The Design of Griffin: A Common Prototyping Language. (1998)

Dewar, Robert, Goldberg, Benjamin, Harrison, Malcolm, Schonberg, Edmond, Shasha, Dennis

This project was the first phase of the development of a language for prototyping large software systems, especially those that will ultimately be implemented in Ada. The Department of Defense has...

Secure Untrusted Data Repository (SUNDR) (1998)

Li, Jinyuan, Krohn, Maxwell, Mazières, David, Shasha, Dennis

We have implemented a secure network file system called SUNDR that guarantees the integrity of data even when malicious parties control the server. SUNDR splits storage functionality between two...

An Algorithm for Finding the Largest Approximately Common Substructures of Two Trees (1998)

Bruce A. Shapiro, Dennis Shasha, Kaizhong Zhang, Kathleen M. Currey

Ordered, labeled trees are trees in which each node has a label and the left-to-right order of its children (if it has any) is fixed. Such trees have many applications in vision, pattern recognition,...

Structural matching and discovery in document databases (1997)

Dennis Shasha, Liam Relihan, Kaizhong Zhang, Girish Patel

Structural matching and discovery in documents such as SGML and HTML is important for data warehousing [6], version management [7, 11], hypertext authoring, digital libraries [4] and Internet...

Automated Discovery of Active Motifs in Three Dimensional Molecules (1997)

Xiong Wang, Dennis Shasha, Sitaram Dikshitulu, Isidore Rigoutsos, Kaizhong Zhang

In this paper we present a method for discovering approximately common motifs (also known as active motifs) in three dimensional (3D) molecules. Each node in a molecule is represented by a 3D point...

An approach to fault tolerant parallel processing on intermittently idle, heterogeneous workstations (1997)

Karpjoo Jeong, Dennis Shasha, Surendranath Talla, Peter Wyckoff

We propose a novel approach to harness the idle cycles of workstations connected by LAN/WANs for long running scientific computations and present performance results for our prototype system called...

PLinda User Manual (1997)

Thomas Brown, Karpjoo Jeong, Bin Li, Suren Talla, Peter Wyckoff, ...

this document is organized as follows: section 4 introduces the PLinda system architecture. Section 5 introduces the basic concepts of the PLinda model. Section 6 concerns tuning the execution of...

The dangers of replication and a solution (1996)

Jim Gray, Pat Helland, Dennis Shasha

Update anywhere-anytime-anyway transactional replication has unstable behavior as the workload scales up: a ten-fold increase in nodes and traffic gives a thousand fold increase in...

Skip-Over: Algorithms and Complexity for Overloaded Systems that Allow Skips (1996)

Gilad Koren, Dennis Shasha

In applications ranging from video reception to telecommunications and packet communication to aircraft control, tasks enter periodically and have fixed response time constraints, but missing a...

Hierarchically Split Cube Forests for Decision Support: description and tuned design (1996)

Theodore Johnson, Dennis Shasha

The paradigmatic view of data in decision support consists of a set of dimensions (e.g., location, product, time period, ...), each encoding a hierarchy (e.g., location has hemisphere, country,...

The Dangers of Replication and a Solution (1996)

Jim Gray, Pat Helland, Pat O'Neil, Dennis Shasha

Update anywhere-anytime-anyway transactional replication has unstable behavior as the workload scales up: a ten-fold increase in nodes and traffic gives a thousand fold increase in deadlocks or...

Complementary classification approaches for protein sequences (1996)

Wang, Jason T.L., Marr, Thomas G., Shasha, Dennis, Shapiro, Bruce A., Chirn, Gung-Wei, Lee, T.Y.

We have studied five methods of protein classification and have applied them to the 768 groups of related proteins in the PROSITE catalog. Four of these methods are based on searching a database of...

Transaction Chopping: Algorithms and Performances Studies (1995)

Llirbat, François, Shasha, Dennis, Simon, Eric, Valduriez, Patrick

Chopping transactions into pieces is good for performance but may lead to non-serializable executions. Many researchers have reacted to this fact by either inventing new concurrency control...

Transaction Chopping: Algorithms and Performances Studies (1995)

Dennis Shasha, Francois Llirbat, Eric Simon, ...

Chopping transactions into pieces is good for performance but may lead to non-serializable executions. Many researchers have reacted to this fact by either inventing new concurrency control...

PLinda 2.0: A transactional/checkpointing approach to fault tolerant Linda (1994)

Karpjoo Jeong, Dennis Shasha

Robust parallel computation in Linda requires both tuple space and processes to be resilient to failure. In this paper, we present PLinda 2.0, a set of extensions to Linda to support robust parallel...

Discovering active motifs in sets of related protein sequences and using them for classification (1994)

Thomas G. Marr, Dennis Shasha, Bruce Shapiro, Gung-wei Chirn

We describe a method for discovering active motifs in a set of related protein sequences. The method is an automatic two step process: (1) find candidate motifs in a small sample of the sequences;...

Discovering active motifs in sets of related protein sequences and using them for classification (1994)

Wang, Jason T.L., Marr, Thomas G., Shasha, Dennis, Shapiro, Bruce A., Chirn, Gung-Wei

We describe a method for discovering active motifs in a set of related protein sequences. The method is an automatic two step process: (1) find candidate motifs in a small sample of the sequences;...

Approximate Tree Matching in the Presence of Variable Length Don't Cares (1993)

Kaizhong Zhang, Dennis Shasha

Ordered labeled trees are trees in which the sibling order matters. This paper presents algorithms for three problems having to do with approximate matching for such trees with variable-length...

Competitive Algorithms and Lower Bounds for On-Line Scheduling of Multiprocessor Real-Time Systems (1993)

Gilad Koren, Dennis Shasha

We study competitive on-line scheduling in multi-processor real-time environments. In our model, every task has a deadline and a value that it obtains only if it completes by its deadline. A task can...

D-OVER: an optimal on-line scheduling algorithm for overloaded real-time systems (1992)

Koren, G., Shasha, Dennis

Every task in a real-time system has a deadline by which time it should complete. Each task also has a value that it obtains only if it completes by its deadline. The problem is to design an on-line...

A System for Approximate Tree Matching (1992)

Jason Tsong-li Wang, Kaizhong Zhang, Karpjoo Jeong, Dennis Shasha

Ordered, labeled trees are trees in which each node has a label and the left-to-right order of its children (if it has any) is fixed. Such trees have many applications in vision, pattern recognition,...

Conventional Query Optimization Research: (1992)

Dennis Shasha

Sprinkled with case studies here and there,

An Optimal Scheduling Algorithm with a Competitive Factor for Real-Time Systems (1991)

Gilad Koren, Dennis Shasha

We consider real-time systems in which the value of a task is proportional to its computation time. The system obtains the value of a given task if the task completes by its deadline. Otherwise, the...

Persistent Linda: Linda + Transactions + Query Processing (1991)

Brian Anderson, Dennis Shasha

... this document use a "C" flavor of PLinda.

Optimizing equijoin queries in distributed databases where relations are hash partitioned (1991)

Dennis Shasha, Tsong-li Wang

Consider the class of distributed database systems consisting of a set of nodes connected by a high bandwidth network. Each node consists of a processor, a random access memory, and a slower but much...

Query processing for distance metrics (1990)

Tsong-li Wang, Dennis Shasha

In applications such as vision and molecular biology, a common problem is to find the similar objects to a given target (according to some distance measure) in a large database. This paper presents a...

Tree Locking On Changing Trees (1990)

Vladimir Lanin, Dennis Shasha

The tree locking protocol is a deadlock-free method of concurrency control defined and verified by Silberschatz and Kedem for data organized in a directed tree. Can the tree protocol work for...

New Techniques for Best-Match Retrieval (1990)

Dennis Shasha, Tsong-li Wang

A scheme to answer best-match queries from a file containing a collection of objects is described. A best-match query is to find the objects in the file that are closest (according to some...

An Analytical Model for the Performance of Concurrent B Tree Algorithms (1987)

Dennis Shasha, Vladimir Lanin, Jeanette Schmidt

A dictionary is an abstract data type supporting the actions search, insert, and delete. Search structures are data structures used to implement a dictionary, e.g. B trees, hash structures, grid...

Query Processing in a Symmetric Parallel Environment (1986)

Dennis Shasha

We consider a database machine consisting of n nodes connected by an O(n*processing speed) bandwidth network. Each node consists of a processor, a random access memory, and a slower but much larger...

A Systems Approach Uncovers Restrictions for Signal Interactions Regulating Genome-wide Responses to Nutritional Cues in Arabidopsis

Krouk, Gabriel, Tranchina, Daniel, Lejay, Laurence, Cruikshank, Alexis A., Shasha, Dennis, Coruzzi, Gloria M., ...

As sessile organisms, plants must cope with multiple and combined variations of signals in their environment. However, very few reports have studied the genome-wide effects of systematic signal...

Fast Structural Search in Phylogenetic Databases

Wang, Jason T. L., Shan, Huiyuan, Shasha, Dennis, Piel, William H.

As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become...

GraphClust: A Method for Clustering Database of Graphs

Diego Reforgiato, Rodrigo Gutierrez, Dennis Shasha

Any application that represents data as sets of graphs may benefit from the discovery of relationships among those graphs. To do this in an unsupervised fashion requires the ability to find graphs...