Masaru Kitsuregawa

Using Hidden Markov Random Fields to Combine Distributional and Pattern-based Word Clustering (2009)

Nobuhiro Kaji, Masaru Kitsuregawa

Word clustering is a conventional and important NLP task, and the literature has suggested two kinds of approaches to this problem. One is based on the distributional similarity and the other relies...

Power-aware Remote Replication for Enterprise-level Disaster Recovery Systems (2009)

Kazuo Goda, Masaru Kitsuregawa

Electric energy consumed in data centers is rapidly growing. Power-aware IT, recently called ‘green IT’, is widely recognized as a significant challenge. Disk storage is a non-negligible energy...

Abstract (2009)

Masaru Kitsuregawa, Yasushi Ogawa

The Super Database Computer (SDC) is a high-performance relational database server for a join-intensive environment under development at the University of Tokyo. SDC is designed to execute a join in a...

Parallel Database Processing on a 100 Node PC Cluster: Cases for Decision Support Query Processing and Data Mining (2009)

Takayuki Tamura, Masato Oguchi, Masaru Kitsuregawa

We developed a PC cluster system consisting of 100 PCs. Each PC employs the 200MHz Pentium Pro CPU and is connected with others through an ATM switch. We picked up two kinds of data intensive...

DataBank: A Blueprint for efficient privacy-preserving personalized user data management worldwide (2008)

Anirban Mondal, Pankaj Garg, Masaru Kitsuregawa

The unprecedented increase in the complexity of user-related data coupled with the dramatically growing user dependence on such data motivates a strong need for a new kind of virtual (electronic)...

Using Predeclaration for Efficient Read-Only Transaction Processing in Wireless Data Broadcast (2008)

Sangkeun Lee, Chong-sun Hwang, Masaru Kitsuregawa

Wireless data broadcast allows a large number of users to retrieve data simultaneously in mobile databases, resulting in an efficient way of using the scarce wireless bandwidth. However,...

DEWS2006 2A-o4 Efficient Large Scale Continuous Selection-Join Queries Based on Multidimensional Index (2008)

Botao Wang, Masaru Kitsuregawa

We consider the problem of a large number of continuous selection-join queries over data streams. As far as we know, in current data stream management systems, events are filtered based on the...

DEWS2006 3A-i6 Finding Thai Web Pages in Foreign Web Spaces (2008)

Kulwadee Somboonviwat, Takayuki Tamura, Masaru Kitsuregawa

While the Web has been increasingly recognized as a culturally valuable social artifact, many nations endeavor to create national Web archives for long term preservation. However, due to its...

WebRelievo: A System for Browsing and Analyzing the (2008)

Mark Levene, Ra Poulovassilis, Judit Bar-ilan, Mark Levene, Mazlita Mat-hassan, Yen-yu Chen, ...

14:50–15:15 Modeling Semantic Web Services with OPM/S A Human and Machine-Interpretable Language

An Effective System for Mining Web Log (2008)

Zhenglu Yang, Yitong Wang, Masaru Kitsuregawa

The WWW provides a simple yet effective medium for users to search, browse, and retrieve information on the Web. Web log mining is a promising tool to study user behaviors, which could...

Design of Data Server for CEOP Data (2008)

Toshihiro Nemoto, Masaru Kitsuregawa

In the CEOP (Coordinated Enhanced Observing Period) project, in order to improve our understanding of water and energy fluxes and reservoirs over land areas, large amounts of data are being...

Dynamic Adaptation Strategies for Long-Term and Short-Term User Profile to Personalize Search (2008)

Lin Li, Zhenglu Yang, Botao Wang, Masaru Kitsuregawa

Recent studies on personalized search have shown that user preferences could be learned implicitly. As far as we know, however, these studies neglect that user preferences are likely to...

Aggregating User-Centered Rankings to Improve Web Search (2008)

Lin Li, Zhenglu Yang, Masaru Kitsuregawa

This paper investigates rank aggregation based on multiple user-centered measures in the context of web search. We introduce a set of techniques to combine ranking lists in order of user...

Load-balancing Remote Spatial Join Queries in a Spatial GRID (2008)

Anirban Mondal, Masaru Kitsuregawa

The explosive growth of spatial data worldwide coupled with the emergence of GRID computing provides a strong motivation for designing a spatial GRID which allows transparent access to...

Associate Editors (2008)

Masaru Kitsuregawa, Betty Salzberg, Gonzalo Navarro, Ricardo Baeza-yates, Erkki Sutinen, Jorma Tarhio, ...

Integrating Diverse Information Management Systems: A Brief Survey

Towards Efficient Dominant Relationship Exploration of the Product Items on the Web (2008)

Zhenglu Yang, Lin Li, Botao Wang, Masaru Kitsuregawa

In recent years, there has been a prevalence of search engines being employed to find useful information in the Web as they efficiently explore hyperlinks between web pages which define a natural...

The Simulation Evaluation of Heat Balancing Strategies for Btree Index over Parallel Shared Nothing Machines, Technical report of IEICE (2007)

Hisham Feelifl, Masaru Kitsuregawa

In shared nothing machines the data are typically declustered across the system processing elements (PEs) to exploit the I/O bandwidth of the PEs. Notwithstanding, the access pattern is inherently...

Performance Analysis for Parallel Generalized Association Rule Mining on a Large Scale PC Cluster (2007)

Takahiko Shintani, Masato Oguchi, Masaru Kitsuregawa

One of the most important problems in data mining is the discovery of association rules in large databases. We have proposed parallel algorithms for mining generalized association rules with classification...

Implementation and Evaluation of the Bucket Flattening Omega Network of the Parallel Relational Database Server SDC-II (2007)

Takayuki Tamura, Masaru Kitsuregawa

This paper presents the implementation and performance evaluation of the Bucket Flattening Omega Network of the SDC-II, the Super Database Computer II. The SDC-II is a highly parallel relational...

Compact Encoding of the Web Graph Exploiting Various Power Laws: Statistical Reason Behind Link Database (2007)

Yasuhito Asano, Tsuyoshi Ito, Hiroshi Imai, Masashi Toyoda, Masaru Kitsuregawa

Compact encodings of the web graph are required in order to keep the graph in main memory and to perform operations on the graph efficiently. Link2, the second version of the Link Database...

DEWS2003 2-B-01 Effective load-balancing of peer-to-peer systems (2007)

Anirban Mondal, Kazuo Goda, Masaru Kitsuregawa

The growing popularity of peer-to-peer (P2P) systems has created the need to manage huge volumes of data efficiently to ensure acceptable user response times. Dynamically changing...

Data Mining on PC Cluster connected with Storage Area Network: Its Preliminary Experimental Results (2007)

Masato Oguchi, Masaru Kitsuregawa

Personal computer/Workstation (PC/WS) clusters have become a hot research topic recently in the field of parallel and distributed computing. They are considered to play an important role...

An Efficient Scheme for Processing Wireless Read-only Transactions in Data Broadcast (2007)

Sangkeun Lee, Masaru Kitsuregawa

This paper addresses the issue of ensuring consistency and currency of data items requested by wireless read-only transactions in data broadcast. To handle an inherent property in wireless data...

Speculative Distributed Transaction Processing (2007)

P. Krishna Reddy, Masaru Kitsuregawa

In this paper, we propose a speculative distributed transaction processing (SDTP) strategy, in which a transaction releases the locks on the data objects immediately after the completion of its...

Preliminary Experimental Results of a Parallel Association Rule Mining on ATM connected PC Clusters (2007)

Masato Oguchi, Takahiko Shintani, Takayuki Tamura, Masaru Kitsuregawa

Until recently, workstations were overwhelmingly superior to personal computers in terms of performance. However, recent PC technology has dramatically increased its CPU, main memory, and cache...

Runtime Data Declustering over SAN-Connected PC Cluster (2007)

Masato Oguchi, Masaru Kitsuregawa

Personal computer/workstation (PC/WS) clusters have come to be studied intensively in the field of parallel and distributed computing. They are considered to play an important role as a large scale computer...

Implementation and Evaluation of Parallel Data Mining on PC Cluster and Optimization of its Execution Environments (2007)

Masato Oguchi, Masaru Kitsuregawa

Personal Computer/Workstation clusters have been studied intensively in the field of parallel and distributed computing. From the viewpoint of applications, data intensive applications...

E-mail: (2007)

Iko Pramudiono, Takahiko Shintani, Katsumi Takahashi, Masaru Kitsuregawa

Rapid growth of internet access from mobile users puts much importance on location specific information on the web. A unique web service called Mobile Info Search (MIS) from NTT Laboratories gathers...

LG Electronics Inc. (2007)

Sangkeun Lee, Masaru Kitsuregawa, Chong-sun Hwang

Wireless data broadcast allows a large number of users to retrieve data simultaneously in mobile databases, resulting in an efficient way of using the scarce wireless bandwidth. The efficiency of...

An approach to build a cyber-community hierarchy (2007)

P. Krishna Reddy, Masaru Kitsuregawa

In this paper we propose an approach to extract community structures in the Web by considering a community structure as a group of content creators that manifests itself as a set of interlinked...

Some Experiences on Large Scale Web Mining (2007)

Masaru Kitsuregawa, Iko Pramudiono, Yusuke Ohura, Masashi Toyoda

Web mining is now a popular term for techniques to analyze data from the World Wide Web (WWW). Here we will report some of our experiences in large scale web mining. The first is the...

2 (2007)

Mong Li Lee, Masaru Kitsuregawa, Beng Chin Ooi, Kian-lee Tan, Anirban Mondal

Parallel database systems are increasingly being deployed to support the performance demands of end-users. While declustering data across multiple nodes facilitates parallelism, existing data...

Visualization System for Earth Environmental Data Base (2007)

Eiji Ikoma, Masaru Kitsuregawa

The earth’s environmental problems have attracted serious attention worldwide. Various kinds of environmental data, such as remote sensing data, have become available for examining. Although this...

1 (2007)

Hisham Feelifl, Masaru Kitsuregawa, Beng-chin Ooi

In shared-nothing environments, data is typically declustered and indexed across the system processing elements (PEs) to achieve efficient processing. However, access patterns are inherently...

Development of an Earth Environmental Database System which Interacts with Application Software (2007)

Eiji Ikoma, Taikan Oki, Masaru Kitsuregawa

With increasing interest in earth environmental issues, an earth environmental database system that can handle various types of relevant data is in high demand. While currently available databases are...

Associate Editors (2007)

Masaru Kitsuregawa, Betty Salzberg, Mary Fern, Atsuyuki Morishima, Dan Suciu, Wang-chiew Tan, ...

The Bulletin of the Technical Committee on Data Engineering is published quarterly and is distributed to all TC members. Its scope includes the design, implementation, modelling, theory and...

Design and Implementation of Scalable Tape Archiver (2007)

Toshihiro Nemoto, Masaru Kitsuregawa, Mikio Takagi

In order to reduce costs, computer manufacturers try to use commodity parts as much as possible. Mainframes using proprietary processors are being replaced by high performance RISC...

Summarizing order statistics over data streams with duplicates (poster) (2007)

Ying Zhang, Xuemin Lin, Yidong Yuan, Masaru Kitsuregawa, Xiaofang Zhou, Jeffrey Xu Yu

A rank query is essentially to find a data element with a given rank against a monotonic order specified on data elements. Rank queries have several equivalent variations

Building Lexicon for Sentiment Analysis from Massive Collection of HTML Documents (2007)

Nobuhiro Kaji, Masaru Kitsuregawa

Recognizing polarity requires a list of polar words and phrases. For the purpose of building such a lexicon automatically, a lot of studies have investigated (semi-)unsupervised methods of learning...

ConQuer: A Peer Group-based Incentive Model for Constraint Querying (2007)

Anirban Mondal, Sanjay Kumar Madria, Masaru Kitsuregawa

In mobile ad-hoc peer-to-peer (M-P2P) networks, economic models become a necessity for enticing non-cooperative mobile peers to provide service. M-P2P users may issue queries with...

Role-based delegation with negative authorization (2006)

Hua Wang, Jinli Cao, David Ross, Xiaofang Zhou, Jianzhong Li, Heng Tao Shen, ...

Role-based delegation model (RBDM) based on role-based access control (RBAC) has proven to be a flexible and useful access control model for information sharing on distributed...

Automatic construction of polarity-tagged corpus from html documents (2006)

Nobuhiro Kaji, Masaru Kitsuregawa

This paper proposes a novel method of building a polarity-tagged corpus from HTML documents. The characteristic of this method is that it is fully automatic and can be applied to arbitrary HTML...

Energy Conserving Transaction Processing in Wireless Data Broadcast (2006)

Sangkeun Lee, Chong-sun Hwang, Masaru Kitsuregawa

Broadcasting in wireless mobile computing environments is an effective technique to disseminate information to a massive number of clients equipped with powerful, battery operated devices....

Mining Communities on the Web Using a Max-Flow and a Site-Oriented Framework (2006)

Yasuhito Asano, Takao Nishizeki, Masashi Toyoda, Masaru Kitsuregawa

There are several methods for mining communities on the Web using hyperlinks. One of the well-known ones is a max-flow based method proposed by Flake et al. The method adopts a page-oriented...

A Formal Ontology Reasoning with Individual Optimization: A Realization of the Semantic Web (2005)

Pakornpong Pothipruk, Guido Governatori, Masaru Kitsuregawa, Erich Neuhold, Anne Ngu

Answering a query over a group of RDF data pages is a trivial process. However, in the Semantic Web, there is a need for ontology technology. Consequently, OWL, a family of web ontology languages...

A system for visualizing and analyzing the evolution of the Web with a time series of graphs (2005)

Masashi Toyoda, Masaru Kitsuregawa

We propose WebRelievo, a system for visualizing and analyzing the evolution of the web structure based on a large Web archive with a series of snapshots. It visualizes the evolution with a time...

Automated Retraining Methods for Document Classification and Their Parameter Tuning (2005)

Stefan Siersdorfer, Gerhard Weikum, Anne H. H. Ngu, Masaru Kitsuregawa, Erich J. Neuhold, Jen-Yao Chung, ...

This paper addresses the problem of semi-supervised classification on document collections using retraining (also called self-training). A possible application is focused Web crawling which may start...

Extracting user behavior by web communities technology on global web logs (2004)

Shingo Otsuka, Masashi Toyoda, Masaru Kitsuregawa

A lot of work has been done on extracting the model of web user behavior. Most of them target server-side logs that cannot track user behavior outside of the server. Recently, a novel way...

On improving the performance dependability of unstructured P2P systems via replication (2004)

Anirban Mondal, Yi Lifu, Masaru Kitsuregawa

The ever-increasing popularity of peer-to-peer (P2P) systems provides a strong motivation for designing a dependable P2P system. Dependability in P2P systems can be viewed from two...

Effective load-balancing via migration and replication in spatial GRIDs (2003)

Anirban Mondal, Kazuo Goda, Masaru Kitsuregawa

The unprecedented growth as well as the growing importance of available spatial data at geographically distributed locations has made efficient networking of such data a necessity for...

Finding Neighbor Communities in the Web Using Inter-Site Graph (2003)

Yasuhito Asano, Hiroshi Imai, Masashi Toyoda, Masaru Kitsuregawa

In recent years, link-based information retrieval methods for the Web have been developed. A framework of these methods is a Web graph using pages as vertices and Web-links as edges. In the last year, the...

Effective load-balancing via migration and replication in spatial GRIDs (2003)

Anirban Mondal, Kazuo Goda, Masaru Kitsuregawa

The unprecedented growth of available spatial data at geographically distributed locations coupled with the emergence of grid computing provides a strong motivation for designing a spatial...

University of Tokyo/RICOH at NTCIR-3 Web Retrieval Task (2002)

Masashi Toyoda, Masaru Kitsuregawa, Hiroko Mano, Hideo Itoh, Yasushi Ogawa

In NTCIR-3 Web Task, we introduced new approaches in (1) similarity retrieval using one known relevant document and pseudo-relevance feedback and (2) topic and target retrieval incorporating link...

Observing evolution of Web communities (2002)

Masashi Toyoda, Masaru Kitsuregawa

We propose a method for observing evolution of web communities. A web community is a set of web pages created by individuals or associations with a common interest on a topic. So far various...

Naviz: User Behavior Visualization of Dynamic Page (2002)

Bowo Prasetyo, Iko Pramudiono, Katsumi Takahashi, Masashi Toyoda, Masaru Kitsuregawa

Navigational behavior of website visitors can be extracted from web access log files with data mining techniques such as sequential pattern mining. Visualization of the discovered patterns is very...

community mining and web log mining: commodity cluster based execution (2002)

Masaru Kitsuregawa, Masashi Toyoda, Iko Pramudiono

The emergence of the WWW has drawn new frontiers for database research. Web mining has become a hot topic since the WWW's rapid expansion rate and chaotic nature have exposed some technical challenges as well...

An approach to relate the Web communities through bipartite graphs (2001)

P. Krishna Reddy, Masaru Kitsuregawa

The Web harbors a large number of community structures. Early detection of community structures has many purposes such as reliable searching and selective advertising. In this paper we investigate...

Creating a Web Community Chart for Navigating Related Communities (2001)

Masashi Toyoda, Masaru Kitsuregawa

Recent research on link analysis has shown the existence of numerous web communities on the Web. A web community is a collection of web pages created by individuals or any kind of associations that...

Link based clustering of Web search results (2001)

Yitong Wang, Masaru Kitsuregawa

With information proliferation on the Web, how to obtain high-quality information from the Web has been one of the hot research topics in many fields like Database, IR as well as AI. Web search...

Parallel Data Mining on ATM-Connected PC Cluster and Optimization of Its Execution Environments (2000)

Masato Oguchi, Masaru Kitsuregawa

In this paper, we have constructed a large scale ATM-connected PC cluster consisting of 100 PCs, implemented a data mining application, and optimized its execution environment. Default...

SQL based association rule mining using commercial RDBMS (2000)

Takeshi Yoshizawa, Iko Pramudiono, Masaru Kitsuregawa

Data mining is becoming increasingly important since the size of databases grows even larger and the need to explore hidden rules from the databases becomes widely recognized. Currently...

Heat Balancing for Btree Indexed Database over Ring Configured Shared Nothing Machines (2000)

Hisham Feelifl, Masaru Kitsuregawa

In shared nothing machines, data are typically declustered and indexed across the system processing elements (PEs) to achieve efficient query and transaction execution. Since the access pattern is...

Applying the Site Information to the Information Retrieval (2000)

Yasuhito Asano, Masashi Toyoda, Masaru Kitsuregawa

In recent years, several information retrieval methods using information about the Web-links have been developed, such as HITS and Trawling. In order to analyze the Web-links dividing into links inside...

Scalable Tape Archiver for Satellite Image Database and its Performance Analysis with Access Logs: Hot Declustering and Hot Replication (1999)

Toshihiro Nemoto, Masaru Kitsuregawa

Recently, global environmental studies have become very important around the world. Repeated observations of wide areas of the earth at the same time make satellite images useful. For understanding...

Parallel SQL Based Association Rule Mining on Large Scale PC Cluster: Performance Comparison with Directly Coded C Implementation (1999)

Iko Pramudiono, Takahiko Shintani, Takayuki Tamura, Masaru Kitsuregawa

Data mining is becoming increasingly important since the size of databases grows even larger and the need to explore hidden rules from the databases becomes widely recognized. Currently database...

A Dynamic Load Balancing Strategy for Parallel Datacube Computation (1999)

Seigo Muto, Masaru Kitsuregawa

important applications in the database industry. In particular, the datacube operation proposed in [5] receives strong attention among researchers as a fundamental research topic in the OLAP...

Reducing the blocking in two-phase commit protocol employing backup sites (1998)

P. Krishna Reddy, Masaru Kitsuregawa

In distributed database systems (DDBSs), a transaction blocks during two-phase commit (2PC) processing if the coordinator site fails and at the same time some participant site has declared itself...

Improving performance in distributed database systems using speculative transaction processing (1998)

P. Krishna Reddy, Masaru Kitsuregawa

In distributed database systems (DDBSs), a transaction acquires the locks on the data objects during the execution and releases them only after the completion of commit processing. In DDBSs, it can...

Parallel Mining Algorithms for Generalized Association Rules with Classification Hierarchy (1998)

Takahiko Shintani, Masaru Kitsuregawa

Association rule mining has recently attracted strong attention. Usually, the classification hierarchy over the data items is available. Users are interested in generalized association rules that span...

Mining Algorithms for Sequential Patterns in Parallel: Hash Based Approach (1998)

Takahiko Shintani, Masaru Kitsuregawa

In this paper, we study the problem of mining sequential patterns in a large database of customer transactions. Since finding sequential patterns has to handle a large amount of customer transaction...

Parallel Database Processing on a 100 Node PC Cluster: Cases for Decision Support Query Processing and Data Mining (1997)

Takayuki Tamura, Masato Oguchi, Masaru Kitsuregawa

We developed a PC cluster system consisting of 100 PCs. Each PC employs the 200MHz Pentium Pro CPU and is connected with others through an ATM switch. We picked up two kinds of data intensive...

Hash Based Parallel Algorithms for Mining Association Rules (1996)

Takahiko Shintani, Masaru Kitsuregawa

In this paper, we propose four parallel algorithms (NPA, SPA, HPA and HPA-ELD) for mining association rules on shared-nothing parallel machines to improve its performance. In NPA, candidate itemsets...

Implementation And Performance Evaluation Of The Parallel Relational Database Server SDC-II (1996)

Takayuki Tamura, Yoshihisa Ogawa, Minoru Nakamura, Masaru Kitsuregawa

This paper presents the implementation and performance evaluation of the SDC-II, the Super Database Computer II. The SDC-II is a highly parallel relational database server, which consists of eight...

Partial migration in an 8mm tape based tertiary storage file system and its performance evaluation through satellite image processing applications (1995)

Kazuhiko Sako, Toshihiro Nemoto, Masaru Kitsuregawa, Mikio Takagi

Recent attention on global environmental changes has stimulated the development of large scale global information systems. Satellite images play a very important role for understanding...

Hot Block Clustering for Disk Arrays with Dynamic Striping (1995)

Kazuhiko Mogi, Masaru Kitsuregawa

RAID5 disk arrays provide high performance and high reliability at reasonable cost. However, RAID5 suffers a performance penalty during block updates. In order to overcome this problem, the use of...