Efficient and Correct Execution of Parallel Programs that Share Memory (2009)
In this paper we consider an optimization problem that arises in the execution of parallel programs on shared-memory multiple-instruction-stream, multiple-data-stream (MIMD) computers. A program on...
Repeatability & Workability Evaluation of SIGMOD 2009 (2009)
Manegold, Stefan, Manolescu, Ioana, Afanasiev, Loredana, Feng, Jianling, Gou, Gang, Hadjieleftheriou, Marios, ...
This paper reports on the SIGMOD 2009 Repeatability and Workability initiative, an effort to verify the experiments presented in articles accepted to the ACM SIGMOD 2009 conference. We summarize the...
FinTime: a financial time series benchmark (2008)
Morgan Stanley, Dean Witter, New York, Dennis Shasha
FinTime
Fast Structural Search in Phylogenetic Databases (2008)
Huiyuan Shan, Dennis Shasha, William H. Piel
Abstract: As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has...
Secure Untrusted Data Repository (SUNDR) (2008)
Jinyuan Li, Maxwell Krohn, David Mazières, Dennis Shasha
SUNDR is a network file system designed to store data securely on untrusted servers. SUNDR lets clients detect any attempts at unauthorized file modification by malicious server operators or users....
Connection Caching • Additional Performance Enhancing Techniques (2008)
Constraints: in Java vs. the DBMS • Another look at transactions and locking • Efficient interaction with the database • How to generate numbers in a sequence?
GraphFind: enhancing graph searching by low support data mining techniques (2008)
Ferro, Alfredo, Giugno, Rosalba, Mongiovì, Misael, Pulvirenti, Alfredo, Skripin, Dmitry, Shasha, Dennis
Abstract Background Biomedical and chemical databases are large and rapidly growing in size. Graphs naturally model such kinds of data. To fully exploit the wealth of information in these graph...
Thum, Karen E, Shin, Michael J, Gutiérrez, Rodrigo A, Mukherjee, Indrani, Katari, Manpreet S, Nero, Damion, ...
Abstract Background Light and carbon are two important interacting signals affecting plant growth and development. The mechanism(s) and/or genes involved in sensing and/or mediating the signaling...
List of Supported Students and Staff (2008)
Dennis Shasha, Guangwei Dai, Rosalba Giugno, ...
techniques and query languages, pattern matching and recognition, scientific data mining Project Summary The goal of this research project is to make it possible to process approximate queries on...
Alfredo Ferro, Rosalba Giugno, Misael Mongioví, Alfredo Pulvirenti, Dmitry Skripin, Dennis Shasha, ...
(molecules, networks) •Prediction of the functionality of new natural or synthesized compounds •Make a compound Q more active •Find fragment with the same function among different species...
Edgar H. Sibley, Steve Rozen, Dennis Shasha
Developers of a Wall Street financial application were able to exploit a relational DBMS to advantage for some data management tasks (the good). For others, the relational system was not helpful (the...
Fuzzy Modeling and Genetic Algorithms for Data Mining and Exploration (2008)
Mamdouh Refaat, Jim Melton, Stephen Buxton, Jiawei Han, Micheline Kamber, Toby J. Teorey, ...
Evaluating A Class of Distance-Mapping Algorithms for Data Mining and Clustering (2008)
Jason Tsong-Li Wang, Xiong Wang, Dennis Shasha, Bruce A. Shapiro
A distance-mapping algorithm takes a set of objects and a distance metric and then maps those objects to a Euclidean or pseudo-Euclidean space in such a way that the distances among objects are...
Exact and Approximate Algorithms for Unordered Tree Matching (2008)
Dennis Shasha, Kaizhong Zhang, Frank Y. Shih
Abstract-We consider the problem of comparison between unordered trees, i.e., trees for which the order among siblings is unimportant. The criterion for comparison is the distance as measured by a...
GhostDB: Hiding Data from Prying Eyes (2008)
Christophe Salperwyck, Nicolas Anciaux, Mehdi Benzine, Luc Bouganim, Dennis Shasha
Imagine that you have been entrusted with private data, such as corporate product information, sensitive government information, or symptom and treatment information about hospital patients. You may...
Xuefeng Cui, Broňa Brejová, Dennis Shasha, Ming Li
Motivation: Life science researchers often require an exhaustive list of protein coding genes similar to a given query gene. To find such genes, homology search tools, such as BLAST or PatternHunter,...
MetricMap: An embedding technique for processing distance-based queries in metric spaces (2008)
Xiong Wang, Dennis Shasha, Kaizhong Zhang
Abstract—In this paper, we present an embedding technique, called MetricMap, which is capable of estimating distances in a pseudometric space. Given a database of objects and a distance function...
Anciaux, Nicolas, Benzine, Mehdi, Bouganim, Luc, Pucheral, Philippe, Shasha, Dennis
Private data sometimes must be made public. A corporation may keep its customer sales data secret, but reveals totals by sector for marketing reasons. A hospital keeps individual patient data secret,...
BMC Systems Biology BioMed Central (2008)
Karen E Thum, Michael J Shin, Rodrigo A Gutiérrez, Indrani Mukherjee, Manpreet S Katari, Damion Nero, ...
Research article An integrated genetic, genomic and systems approach defines gene networks regulated by the interaction of light and carbon signaling pathways in Arabidopsis
Steve Rozen, Bruce A. Shapiro, Dennis Shasha, Zhiyuan Wang, Maisheng Yin
Key words: algorithms, consensus sequence, pattern matching, tools for computational biology, DNA sequence recognition
Thomas G. Marr, Dennis Shasha, Gung-wei Chirn
We describe a method for discovering active motifs in a set of related protein sequences. The method is an automatic two step process: (1) find candidate motifs in a small sample of the sequences;...
Xiong Wang, King-ip Lin, Dennis Shasha, Bruce A. Shapiro, Kaizhong Zhang
Abstract. In this paper we present an index structure, called MetricMap, that takes a set of objects and a distance metric and then maps those objects to a k-dimensional space in such a way that the...
2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm (2007)
Theodore Johnson, Dennis Shasha
In a path-breaking paper last year Pat and Betty O'Neil and Gerhard Weikum proposed a self-tuning improvement to the Least Recently Used (LRU) buffer management algorithm[15]. Their improvement...
Synthesizing Arbitrary Genomes (2007)
Suppose a researcher wants a long arbitrary sequence of nucleotides and asks a lab to synthesize it. Oligonucleotides are generally of length less than 100, so it is necessary to resort to...
Bulletin of the Technical Committee on (2007)
December Vol No, Answering Histograms, Viswanath Poosala, Venkatesh Ganti, Yannis E. Ioannidis, Dennis Shasha, ...
Answering queries approximately has recently been proposed as a way to reduce query response times in on-line decision support systems, when the precise answer is not necessary or early feedback is...
Xiong Wang, King-ip Lin, ...
editor, Combinatorial Pattern Matching, pages 104-117, Lecture Notes in Computer Science, Springer-Verlag, 1998. • Xiong Wang, Jason T.L. Wang, Dennis Shasha, Bruce Shapiro, Sitaram Dikshitulu,...
TREEDIFF: A System for Document Comparison by Structure (2007)
Girish Patel, Liam Relihan, Dennis Shasha, Kaizhong Zhang, ...
structure comparison(A, B): to determine whether or not A contains B as an approximate subtree; it returns the closest matching subtree of A and the distance between that subtree and B. 3 Query...
Building secure file systems out of Byzantine storage (2007)
David Mazières, Dennis Shasha
This paper shows how to implement a trusted network file system on an untrusted server. While cryptographic storage techniques exist that allow users to keep data secret from untrusted servers, this...
Kaizhong Zhang, Dennis Shasha (communicated by T. Jiang)
We consider the problem of comparing CUAL graphs (Connected, Undirected, Acyclic graphs with nodes being Labeled). This problem is motivated by the study of information retrieval for bio-chemical and...
Don't Trust Your File Server (2007)
David Mazières, Dennis Shasha
All too often, decisions about whom to trust in computer systems are driven by the needs of system management rather than data security. In particular, data storage is often entrusted to people who...
Inria Rocquencourt, H. Arno Jacobsen, Dennis Shasha
Large-scale information dissemination systems for selective information distribution are gaining increasing importance. This is due to the fast proliferation of different communication infrastructures,...
Inria Rocquencourt, Radu Preotiuc-Pietro, Kenneth A. Ross, Dennis Shasha
Columbia University
Qicheng Ma, Dennis Shasha, Cathy H. Wu
In this paper we propose new techniques to extract features from protein sequences. We then use the features as inputs for a Bayesian neural network (BNN) and apply the BNN to classifying protein...
Secure Untrusted Data Repository (SUNDR) (2007)
Jinyuan Li, Maxwell Krohn, David Mazières, Dennis Shasha
We have implemented a secure network file system called SUNDR that guarantees the integrity of data even when malicious parties control the server. SUNDR splits storage functionality between two...
Homology search for genes (2007)
Cui, Xuefeng, Vinař, Tomáš, Brejová, Broňa, Shasha, Dennis, Li, Ming
Motivation: Life science researchers often require an exhaustive list of protein coding genes similar to a given query gene. To find such genes, homology search tools, such as BLAST or PatternHunter,...
DNA Hash Pooling and its Applications (2007)
In this paper we describe a new technique for the comparison of populations of DNA strands. Comparison is vital to the study of ecological systems, at both the micro and macro scales. Existing...
GhostDB: Querying Visible and Hidden Data Without Leaks (2007)
Anciaux, Nicolas, Benzine, Mehdi, Bouganim, Luc, Pucheral, Philippe, Shasha, Dennis
Imagine that you have been entrusted with private data, such as corporate product information, sensitive government information, or symptom and treatment information about hospital patients. You may...
GhostDB: Hiding Data from Prying Eyes (2007)
Salperwyck, Christophe, Anciaux, Nicolas, Benzine, Mehdi, Bouganim, Luc, Pucheral, Philippe, Shasha, Dennis
Imagine that you have been entrusted with private data, such as corporate product information, sensitive government information, or symptom and treatment information about hospital patients. You may...
GhostDB: Hiding Data from Prying Eyes (2007)
Salperwyck, Christophe, Anciaux, Nicolas, Benzine, Mehdi, Bouganim, Luc, Pucheral, Philippe, Shasha, Dennis
corporate product information, sensitive government information, or symptom and treatment information about hospital patients. You may want to issue queries whose result will combine private and...
Querying and Aggregating Visible and Hidden Data Without Leaks (2007)
Anciaux, Nicolas, Benzine, Mehdi, Bouganim, Luc, Pucheral, Philippe, Shasha, Dennis
Imagine that you have been entrusted with private data, such as corporate product information, sensitive government information, or symptom and treatment information about hospital patients. You may...
The Design of Griffin: A Common Prototyping Language (2006)
Dewar, Robert, Goldberg, Benjamin, Harrison, Malcolm, Schonberg, Edmond, Shasha, Dennis
The objective of the Griffin project at NYU is the design of a language, called Griffin, for prototyping large software systems. The success and cost-effectiveness of prototyping depends on, among...
2.3. Desktop Search...... 7 (2006)
Jeffrey Borden, Dennis Shasha, Chris Harrison, Stacey Kuznetsov, ...
I would like to thank Professor Dennis Shasha for his continued support and guidance. I am also grateful to my two co-researchers Chris Harrison and Stacey Kuznetsov.
2.2 User Reliance........ 4 (2006)
Chris Harrison, Dennis Shasha, Stacey Kuznetsov
over the course of the Spring 2006 semester. The text analytics package was developed by Jeff Borden.
Fast Structural Search in Phylogenetic Databases (2005)
Huiyuan Shan, Dennis Shasha, William H. Piel
As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become...
Alberto Lerner, Dennis Shasha, Zhihua Wang, Xiaojian Zhao, Yunyue Zhu
Financial time series streams are watched closely by millions of traders. What exactly do they look for and how can we help them do it faster? Physicists study the time series emerging from their...
Unordered Tree Mining with Applications to Phylogeny (2004)
Frequent structure mining (FSM) aims to discover and extract patterns frequently occurring in structural data, such as trees and graphs. FSM finds many applications in bioinformatics, XML processing,...
Activist data mining for computational science: tools and applications (2003)
Classical data mining involves: waiting for data to appear and then mining it. Activist data mining involves: proposing experiments based on algorithmic and application-specific considerations,...
Database Tuning : Principles, Experiments, and Troubleshooting Techniques (2003)
Shasha, Dennis, Bonnet, Philippe
ISBN 1-55860-753-6
Treerank: A similarity measure for nearest neighbor searching in phylogenetic databases (2003)
Huiyuan Shan, Dennis Shasha, William H. Piel
Phylogenetic trees are unordered labeled trees in which each leaf node has a label and the order among siblings is unimportant. In this paper we propose a new similarity measure, called TreeRank, for...
Algorithmics and Applications of Tree and Graph Searching (2002)
Modern search engines answer keyword-based queries extremely efficiently. The impressive speed is due to clever inverted index structures, caching, a domain-independent knowledge of strings, and...
Finding approximate patterns in undirected acyclic graphs (2002)
Kaizhong Zhang, George Chang, Dennis Shasha
We consider an approximate pattern matching problem for undirected acyclic graphs. Specifically, let P be a pattern graph, D a data graph and t an integer. We present an algorithm to locate a...
ATreeGrep: Approximate Searching in Unordered Trees (2002)
Dennis Shasha, Huiyuan Shan, Kaizhong Zhang
An unordered labeled tree is a tree in which each node has a string label and the parent-child relationship is significant, but the order among siblings is unimportant. This paper presents an...
Xiong Wang, Dennis Shasha, Bruce A. Shapiro, Isidore Rigoutsos, Kaizhong Zhang
This paper presents a method for finding patterns in three dimensional (3D) graphs. Each node in a graph is an undecomposable or atomic unit and has a label. Edges are links between the atomic units....
A Structure-Based Search Engine for Phylogenetic Databases (2002)
Huiyuan Shan, Katherine G. Herbert, William H. Piel, Dennis Shasha
Phylogenetic trees are essential for understanding the relationships among organisms or taxa. Many of the current techniques for searching phylogenetic repositories allow the user to perform a...
Declarative Data Cleaning : Language, Model, and Algorithms (2001)
Galhardas, Helena, Florescu, Daniela, Shasha, Dennis, Simon, Eric, Saita, Cristian
The problem of data cleaning, which consists of removing inconsistencies and errors from original data sets, is well known in the area of decision support systems and data warehouses. However, for...
New techniques for extracting features from protein sequences (2001)
Qicheng Ma, Dennis Shasha, Cathy H. Wu
In this paper we propose new techniques to extract features from protein sequences. We then use the features as inputs for a Bayesian neural network (BNN) and apply the BNN to classifying protein...
Qicheng Ma, Dennis Shasha, Cathy H. Wu
This paper presents new techniques for biosequence classification, with a focus on recognizing E. Coli promoters in DNA. Specifically, given an unlabeled DNA sequence S, we want to determine whether...
Improving data cleaning quality using a data lineage facility (2001)
Helena Galhardas, Inria Rocquencourt, Daniela Florescu, Dennis Shasha, Eric Simon, Cristian-Augustin Saita
The problem of data cleaning, which consists of removing inconsistencies and errors from original data sets, is well known in the area of decision support systems and data warehouses. However, for...
Declarative Data Cleaning: Language, Model, and Algorithms (2001)
Helena Galhardas, Daniela Florescu, Dennis Shasha
The problem of data cleaning, which consists of removing inconsistencies and errors from original data sets, is well known in the area of decision support systems and data warehouses. This holds...
Filtering algorithms and implementation for very fast publish/subscribe systems (2001)
Françoise Fabret, Arno Jacobsen, François Llirbat, João Pereira, Ken Ross, Dennis Shasha
Publish/Subscribe is the paradigm in which users express long-term interests (subscriptions) and some external agent (perhaps other users) publishes events (e.g., offers). The job of Publish/Subscribe...
Filtering Algorithms and Implementation for Very Fast Publish/Subscribe Systems (2001)
Françoise Fabret, H. Arno Jacobsen, François Llirbat, João Pereira, Kenneth A. Ross, ...
Publish/Subscribe is the paradigm in which users express long-term interests ("subscriptions") and some agent "publishes" events (e.g., offers). The job of Publish/Subscribe software is to...
Efficient Matching Algorithms for Publish and Subscribe Systems (2000)
Fabret, Françoise, Llirbat, François, Jacobsen, Arno, Pereira, João, Ross, Kenneth, Shasha, Dennis
Publish/Subscribe is the paradigm in which users express long-term interests («subscriptions») and some external agent (perhaps other users) «publishes» events (e.g., offers). The job of...
Qicheng Ma, Dennis Shasha, Cathy H. Wu
Biological data mining aims to extract significant information from DNA, RNA and proteins. The significant information may refer to motifs, functional sites, clustering and classification rules. This...
An extensible framework for data cleaning (2000)
Helena Galhardas, Daniela Florescu, Dennis Shasha, Eric Simon
Data integration solutions dealing with large amounts of data have been strongly required in the last few years. Besides the traditional data integration problems (e.g. schema integration, local to...
Efficient Matching for Content-based Publish/Subscribe Systems (2000)
Françoise Fabret, François Llirbat, João Pereira, Inria Rocquencourt, Dennis Shasha
This paper describes the event model, subscription language and the matching algorithms developed for our context-based pub/sub system. The event model and subscription language adopted in our...
Publish/Subscribe on the Web at Extreme Speed (2000)
Françoise Fabret, François Llirbat, João Pereira, Inria Rocquencourt, ...
This demonstration presents Le Subscribe, an event notification system for the Web. It is widely accepted that the majority of human information will be on the Web in ten years. As...
Declaratively cleaning your data using AJAX (2000)
Helena Galhardas, Daniela Florescu, Dennis Shasha, Eric Simon
Data quality concerns arise when correcting anomalies in a single data source, or integrating data coming from multiple sources into a single data repository. The information handled may also need to...
Efficient Matching for Web-Based Publish/subscribe Systems (2000)
João Pereira, Françoise Fabret, François Llirbat, Dennis Shasha, Inria Rocquencourt
There is a need for systems being able to capture the dynamic aspect of the web information by notifying users of interesting events. Content-based publish/subscribe systems are an emerging type of...
An index structure for data mining and clustering (2000)
Xiong Wang, King-ip Lin, Dennis Shasha, Bruce A. Shapiro, Kaizhong Zhang
Abstract. In this paper we present an index structure, called Metric-Map, that takes a set of objects and a distance metric and then maps those objects to a k-dimensional space in such a way that the...
Making Snapshot Isolation Serializable (2000)
Snapshot Isolation (SI) is a multiversion concurrency control algorithm, first described in Berenson et al. [1995]. SI is attractive because it provides an isolation level that avoids many of the...
An Extensible Framework for Data Cleaning (1999)
Galhardas, Helena, Florescu, Daniela, Shasha, Dennis, Simon, Eric
Data integration solutions dealing with large amounts of data have been strongly required in the last few years. Besides the traditional data integration problems (e.g. schema integration, local to...
Evaluating A Class of Distance-Mapping Algorithms for Data Mining and Clustering (1999)
Xiong Wang, King-ip Lin, Dennis Shasha, Bruce A. Shapiro, Kaizhong Zhang
A distance-mapping algorithm takes a set of objects and a distance metric and then maps those objects to a Euclidean or pseudo-Euclidean space in such a way that the distances among objects are...
New techniques for DNA sequence classification (1999)
Steve Rozen, Bruce A. Shapiro, Dennis Shasha, Zhiyuan Wang, Maisheng Yin
DNA sequence classification is the activity of determining whether or not an unlabeled sequence S belongs to an existing class C. This paper proposes two new techniques for DNA sequence...
The Design of Griffin: A Common Prototyping Language (1998)
Dewar, Robert, Goldberg, Benjamin, Harrison, Malcolm, Schonberg, Edmond, Shasha, Dennis
This project was the first phase of the development of a language for prototyping large software systems, especially those that will ultimately be implemented in Ada. The Department of Defense has...
Secure Untrusted Data Repository (SUNDR) (1998)
Li, Jinyuan, Krohn, Maxwell, Mazières, David, Shasha, Dennis
We have implemented a secure network file system called SUNDR that guarantees the integrity of data even when malicious parties control the server. SUNDR splits storage functionality between two...
An Algorithm for Finding the Largest Approximately Common Substructures of Two Trees (1998)
Bruce A. Shapiro, Dennis Shasha, Kaizhong Zhang, Kathleen M. Currey
Ordered, labeled trees are trees in which each node has a label and the left-to-right order of its children (if it has any) is fixed. Such trees have many applications in vision, pattern recognition,...
Structural matching and discovery in document databases (1997)
Dennis Shasha, Liam Relihan, Kaizhong Zhang, Girish Patel
Structural matching and discovery in documents such as SGML and HTML is important for data warehousing [6], version management [7, 11], hypertext authoring, digital libraries [4] and Internet...
Automated Discovery of Active Motifs in Three Dimensional Molecules (1997)
Xiong Wang, Dennis Shasha, Sitaram Dikshitulu, Isidore Rigoutsos, Kaizhong Zhang
In this paper we present a method for discovering approximately common motifs (also known as active motifs) in three dimensional (3D) molecules. Each node in a molecule is represented by a 3D point...
Karpjoo Jeong, Dennis Shasha, Surendranath Talla, Peter Wyckoff
We propose a novel approach to harness the idle cycles of workstations connected by LAN/WANs for long running scientific computations and present performance results for our prototype system called...
Thomas Brown, Karpjoo Jeong, Bin Li, Suren Talla, Peter Wyckoff, ...
this document is organized as follows: section 4 introduces the PLinda system architecture. Section 5 introduces the basic concepts of the PLinda model. Section 6 concerns tuning the execution of...
The dangers of replication and a solution (1996)
Jim Gray, Pat Helland, Dennis Shasha
Abstract: Update anywhere-anytime-anyway transactional replication has unstable behavior as the workload scales up: a ten-fold increase in nodes and traffic gives a thousand fold increase in...
Skip-Over: Algorithms and Complexity for Overloaded Systems that Allow Skips (1996)
In applications ranging from video reception to telecommunications and packet communication to aircraft control, tasks enter periodically and have fixed response time constraints, but missing a...
Hierarchically Split Cube Forests for Decision Support: description and tuned design (1996)
Theodore Johnson, Dennis Shasha
The paradigmatic view of data in decision support consists of a set of dimensions (e.g., location, product, time period, ...), each encoding a hierarchy (e.g., location has hemisphere, country,...
The Dangers of Replication and a Solution (1996)
Jim Gray, Pat Helland, Pat O'Neil, Dennis Shasha
Update anywhere-anytime-anyway transactional replication has unstable behavior as the workload scales up: a ten-fold increase in nodes and traffic gives a thousand fold increase in deadlocks or...
Complementary classification approaches for protein sequences (1996)
Wang, Jason T.L., Marr, Thomas G., Shasha, Dennis, Shapiro, Bruce A., Chirn, Gung-Wei, Lee, T.Y.
We have studied five methods of protein classification and have applied them to the 768 groups of related proteins in the PROSITE catalog. Four of these methods are based on searching a database of...
Transaction Chopping: Algorithms and Performance Studies (1995)
Llirbat, François, Shasha, Dennis, Simon, Eric, Valduriez, Patrick
Chopping transactions into pieces is good for performance but may lead to non-serializable executions. Many researchers have reacted to this fact by either inventing new concurrency control...
PLinda 2.0: A transactional/checkpointing approach to fault tolerant Linda (1994)
Robust parallel computation in Linda requires both tuple space and processes to be resilient to failure. In this paper, we present PLinda 2.0, a set of extensions to Linda to support robust parallel...
Thomas G. Marr, Dennis Shasha, Bruce Shapiro, Gung-wei Chirn
We describe a method for discovering active motifs in a set of related protein sequences. The method is an automatic two step process: (1) find candidate motifs in a small sample of the sequences;...
Wang, Jason T.L., Marr, Thomas G., Shasha, Dennis, Shapiro, Bruce A., Chirn, Gung-Wei
We describe a method for discovering active motifs in a set of related protein sequences. The method is an automatic two step process: (1) find candidate motifs in a small sample of the sequences;...
Approximate Tree Matching in the Presence of Variable Length Don't Cares (1993)
Ordered labeled trees are trees in which the sibling order matters. This paper presents algorithms for three problems having to do with approximate matching for such trees with variable-length...
Gilad Koren, Dennis Shasha
We study competitive on-line scheduling in multi-processor real-time environments. In our model, every task has a deadline and a value that it obtains only if it completes by its deadline. A task can...
D-OVER: an optimal on-line scheduling algorithm for overloaded real-time systems (1992)
Every task in a real-time system has a deadline by which time it should complete. Each task also has a value that it obtains only if it completes by its deadline. The problem is to design an on-line...
A System for Approximate Tree Matching (1992)
Jason Tsong-Li Wang, Kaizhong Zhang, Karpjoo Jeong, Dennis Shasha
Ordered, labeled trees are trees in which each node has a label and the left-to-right order of its children (if it has any) is fixed. Such trees have many applications in vision, pattern recognition,...
Conventional Query Optimization Research: (1992)
Sprinkled with case studies here and there,
An Optimal Scheduling Algorithm with a Competitive Factor for Real-Time Systems (1991)
We consider real-time systems in which the value of a task is proportional to its computation time. The system obtains the value of a given task if the task completes by its deadline. Otherwise, the...
Persistent Linda: Linda + Transactions + Query Processing (1991)
this document use a "C" flavor of PLinda. 3 Tuple Patterns
Optimizing equijoin queries in distributed databases where relations are hash partitioned (1991)
Consider the class of distributed database systems consisting of a set of nodes connected by a high bandwidth network. Each node consists of a processor, a random access memory, and a slower but much...
Query processing for distance metrics (1990)
In applications such as vision and molecular biology, a common problem is to find the similar objects to a given target (according to some distance measure) in a large database. This paper presents a...
Tree Locking On Changing Trees (1990)
The tree locking protocol is a deadlock-free method of concurrency control defined and verified by Silberschatz and Kedem for data organized in a directed tree. Can the tree protocol work for...
New Techniques for Best-Match Retrieval (1990)
A scheme to answer best-match queries from a file containing a collection of objects is described. A best-match query is to find the objects in the file that are closest (according to some...
An Analytical Model for the Performance of Concurrent B Tree Algorithms (1987)
Dennis Shasha, Vladimir Lanin, Jeanette Schmidt
A dictionary is an abstract data type supporting the actions search, insert, and delete. Search structures are data structures used to implement a dictionary, e.g. B trees, hash structures, grid...
Query Processing in a Symmetric Parallel Environment (1986)
We consider a database machine consisting of n nodes connected by an O(n*processing speed) bandwidth network. Each node consists of a processor, a random access memory, and a slower but much larger...
Thum, Karen E, Shin, Michael J, Gutiérrez, Rodrigo A, Mukherjee, Indrani, Katari, Manpreet S, Nero, Damion, ...
GraphFind: enhancing graph searching by low support data mining techniques
Ferro, Alfredo, Giugno, Rosalba, Mongiovì, Misael, Pulvirenti, Alfredo, Skripin, Dmitry, Shasha, Dennis
Krouk, Gabriel, Tranchina, Daniel, Lejay, Laurence, Cruikshank, Alexis A., Shasha, Dennis, Coruzzi, Gloria M., ...
As sessile organisms, plants must cope with multiple and combined variations of signals in their environment. However, very few reports have studied the genome-wide effects of systematic signal...
Fast Structural Search in Phylogenetic Databases
Wang, Jason T. L., Shan, Huiyuan, Shasha, Dennis, Piel, William H.
As the size of phylogenetic databases grows, the need for efficiently searching these databases arises. Thanks to previous and ongoing research, searching by attribute value and by text has become...
GraphClust: A Method for Clustering Database of Graphs
Diego Reforgiato, Rodrigo Gutierrez, Dennis Shasha
Any application that represents data as sets of graphs may benefit from the discovery of relationships among those graphs. To do this in an unsupervised fashion requires the ability to find graphs...