Hideharu Amano

Performance, Cost, and Energy Evaluation of Fat H-Tree: A Cost-Efficient Tree-Based On-Chip Network (2009)

Hiroki Matsutani, Michihiro Koibuchi, Hideharu Amano

Fat H-Tree is a novel tree-based interconnection network providing a torus structure, which is formed by combining two folded H-Tree networks, and is an attractive alternative to tree-based networks...

Performance Evaluation of Deterministic Routings, Multicasts, and Topologies on RHiNET-2 Cluster (2008)

Michihiro Koibuchi, Konosuke Watanabe, Tomohiro Otsuka, Hideharu Amano

arbitrary topologies, have been used to connect nodes in PC/WS clusters or high-performance storage systems. Although deadlock-free routings, multicasts, and topologies for SANs have been widely...

MAPLE chip: a processing element for a static scheduling centric multiprocessor (2008)

Kenta Yasufuku, Riku Ogawa, Keisuke Iwai, Hideharu Amano

A custom processor called MAPLE, which supports static scheduling by automatic parallelizing compilers, is implemented and evaluated. MAPLE has a high performance floating point arithmetic...

Environment for Multiprocessor Simulator Development (2008)

Masaki Wakabayashi, Hideharu Amano

Performance estimation is essential for designing and investigating new architectures, including multiprocessors. Software simulation is one of the most common methods, since there is no limitation...

A dynamically adaptive switching fabric on a multicontext reconfigurable device (2008)

Hideharu Amano, Akiya Jouraku, Kenichiro Anjo

Performance/Cost trade-off evaluation for the DCT implementation on the Dynamically Reconfigurable Processor (2008)

Vu Manh Tuan, Yohei Hasegawa, Naohiro Katsura, Hideharu Amano

The Dynamically Reconfigurable Processor (DRP) developed by NEC Electronics is a coarse grain reconfigurable processor with the capability of changing its hardware functionality within a...

MMLRU Selection Function: A Simple and Efficient Output Selection Function in Adaptive Routing (2008)

Michihiro Koibuchi, Akiya Jouraku, Hideharu Amano

Adaptive routing algorithms, which dynamically select the route of a packet, have been widely studied for interconnection networks in massively parallel computers. An output selection...

Performance and Power Analysis of Time-multiplexed Execution on Dynamically Reconfigurable Processor (2008)

Yohei Hasegawa, Shohei Abe, Shunsuke Kurotaki, Vu Manh Tuan, Naohiro Katsura, Takuro Nakamura, ...

Dynamically Reconfigurable Processor (DRP) developed by NEC Electronics is a coarse grain reconfigurable processor that selects a datapath called a context from the on-chip repository of sixteen...

ISIS: MULTIPROCESSOR SIMULATOR LIBRARY (2008)

Masaki Wakabayashi, Keisuke Inoue, Hideharu Amano

In this paper, an architecture-independent software simulation kit for multiprocessors called ISIS is proposed and designed. It includes various small simulators of hardware devices. All functions are...

-An (2008)

Hideharu Amano, Masayasu Suzuki, Takeshi Inuo, Hirokazu Kami, Taro Fujii

Techniques for virtual hardware on a dynamic

AN ASYNCHRONOUS SWITCHING FABRIC (2008)

Hideharu Amano

A simple switching fabric which works completely in an asynchronous manner without a system clock was implemented. It is equipped with 4 input/output ports, each of which has 4-bit data...

A Parametric Study of Scalable Interconnects on FPGAs (2008)

Daihan Wang, Hiroki Matsutani, Masato Yoshimi, Michihiro Koibuchi, Hideharu Amano

With the constantly increasing gate capacity of FPGAs, a single FPGA chip is able to employ large-scale applications. To connect a large number of computational nodes, Network-On-Chip...

Reducing the configuration loading time of a coarse grain multicontext reconfigurable device (2008)

Toshiro Kitaoka, Hideharu Amano, Kenichiro Anjo

Overview of the JUMP-1, an MPP Prototype for General-Purpose Parallel Computations (2008)

Kei Hiraki, Hideharu Amano, Morihiro Kuga, Toshinori Sueyoshi, Tomohiro Kudoh, Hironori Nakajo, ...

In this paper, we discuss the importance of flexible distributed shared memory in an MPP system for general-purpose computations. The main features of the JUMP-1 memory system are: 1. Flexible...

A Mapping Method for Multi-Process Execution on Dynamically Reconfigurable Processors (2008)

TUAN, Vu Manh, AMANO, Hideharu

The multi-process execution in dynamically reconfigurable processors is a technique to enhance throughput by trying to exploit more inherent parallelism of applications. Basically, a total process...

A Retargetable Compiler Based on Graph Representation for Dynamically Reconfigurable Processor Arrays (2008)

TUNBUNHENG, Vasutan, AMANO, Hideharu

For developing design environments for various Dynamically Reconfigurable Processor Arrays (DRPAs), the Graph with Configuration Information (GCI) is proposed to represent configurable resources in the...

A Preemption Algorithm for a Multitasking Environment on Dynamically Reconfigurable Processors (2008)

TUAN, Vu Manh, AMANO, Hideharu

Task preemption is a critical mechanism for building an effective multi-tasking environment on dynamically reconfigurable processors. When a task is preempted, its necessary state information must be...

Hot spot contention and message combining in the Simple Serial Synchronized Multistage Interconnection Network (2007)

Toshihiro Hanawa, Takashi Fujiwara, Hideharu Amano

Network (MIN) is a novel MIN architecture for connecting processors and memory modules in multiprocessors. Synchronized bit-serial communication simplifies the structure/control, and permits the...

Interconnection Network and Distributed Shared Memory of a Massively Parallel Machine JUMP-1 (2007)

Hideharu Amano, Katsunobu Nishimura, Tomohiro Kudoh, Hiroaki Nishi, Ken'ichiro Anjo

For cache coherent distributed shared memory on a large scale parallel machine, each node processor of JUMP-1 shares a global virtual address space with two-stage TLB implementation. The directory is...

Fault tolerance of the TBSF (Tandem Banyan Switching Fabrics) and PBSF (Piled Banyan Switching Fabrics) (2007)

Akira Funahashi, Toshihiro Hanawa, Hideharu Amano

In this paper, a fault recovery mechanism is attached to these two networks, and the Fault tolerant TBSF (F-TBSF) and Fault tolerant PBSF (F-PBSF) are proposed, respectively. Then, the performance degradation...

The MINC (Multistage Interconnection Network with Cache control mechanism) chip (2007)

Takashi Midorikawa, Takayuki Kamei, Toshihiro Hanawa, Hideharu Amano

Although bus connected multiprocessors have been widely used as high-end workstations or servers, the number of connected processors is strictly limited by the maximum bandwidth of the...

The MINC chip: Multistage Interconnection Network with Cache control mechanism chip (2007)

Takashi Midorikawa, Takayuki Kamei, Toshihiro Hanawa, Hideharu Amano

The Multistage Interconnection Network with Cache control mechanism (MINC) is a hardware mechanism to control cache coherence in switch-connected multiprocessors using a crossbar or Multistage...

The Preliminary Evaluation of MBP-light with Two Protocol Policies for A Massively Parallel Processor -- JUMP-1 -- (2007)

Inoue Hiroaki, Ken-ichiro Anjo, Junji Yamamoto, Jun Tanabe, Masaki Wakabayashi, Mitsuru Sato, ...

A massively parallel processor called JUMP-1 has been developed to build an efficient cache coherent distributed shared memory (DSM) on a large system with more than 1000 processors. Here, the...

SUMMARY (2007)

Xiaoshe Dong, Tomohiro Kudoh, Hideharu Amano

ring is an optical interconnection network for workstation clusters or parallel machines which can easily connect various numbers of nodes using wavelength division multiplexing techniques. However,...

Data Multicasting Procedure for Increasing Configuration Speed of Coarse Grain Reconfigurable Devices (2007)

TUNBUNHENG, Vasutan, SUZUKI, Masayasu, AMANO, Hideharu

A novel configuration method called Row Multicast Configuration (RoMultiC) is proposed for high speed configuration of coarse grain reconfigurable systems. The same configuration data can be...

A Port Combination Methodology for Application-Specific Networks-on-Chip on FPGAs (2007)

WANG, Daihan, MATSUTANI, Hiroki, KOIBUCHI, Michihiro, AMANO, Hideharu

A temporal correlation based port combination algorithm that customizes the router design in Network-on-Chip (NoC) is proposed for reconfigurable systems in order to minimize required hardware...

Non-Minimal Routing Strategy for Application-Specific Networks-on-Chips (2005)

Hiroki Matsutani, Michihiro Koibuchi, Yutaka Yamada, Akiya Jouraku, Hideharu Amano

We propose a deterministic routing strategy called flee which introduces non-minimal paths in order to distribute traffic with a high degree of communication locality in Networks-on-Chips. In the...

Folded Fat H-Tree: An Interconnection Topology for Dynamically Reconfigurable Processor Array (2005)

Yutaka Yamada, Hideharu Amano, Michihiro Koibuchi, Akiya Jouraku, Kenichiro Anjo, Katsunobu Nishimura

Fat H-Tree is a novel on-chip network topology for a dynamically reconfigurable processor array. It includes both fat tree and torus structures, and is suitable for mapping tasks in stream processing....

Implementation of Active Direction-Pass Filter on Dynamically Reconfigurable Processor (2005)

Shunsuke Kurotaki, Noriaki Suzuki, Kazuhiro Nakadai, Hiroshi G. Okuno, Hideharu Amano

In this paper, we report the design and implementation of a sound source separation system using a dynamically reconfigurable device. A robot in real-world environments should have an...

MMLRU Selection Function: A Simple and Efficient Output Selection Function in Adaptive Routing (2005)

KOIBUCHI, Michihiro, JOURAKU, Akiya, AMANO, Hideharu

Adaptive routing algorithms, which dynamically select the route of a packet, have been widely studied for interconnection networks in massively parallel computers. An output selection function (OSF),...

BLACK-BUS: A New Data-Transfer Technique using Local Address on Networks-on-Chips (2004)

Kenichiro Anjo, Yutaka Yamada, Michihiro Koibuchi, Akiya Jouraku, Hideharu Amano

Network-on-a-Chip (NoC) has received attention as a high-performance interconnect, because traditional buses, which can’t transfer more than one data-stream simultaneously, are more likely to...

Descending Layers Routing: A Deadlock-Free Deterministic Routing using Virtual Channels in System Area Networks with Irregular Topologies (2003)

Michihiro Koibuchi, Akiya Jouraku, Konosuke Watanabe, Hideharu Amano

System Area Networks (SANs), which usually accept irregular topologies, have been used to connect nodes in PC/WS clusters or high-performance storage systems. Since wormhole or virtual cut-through...

Routing Algorithms Based on 2D Turn Model for Irregular Networks (2002)

Akiya Jouraku, Michihiro Koibuchi, Hideharu Amano, Akira Funahashi

In order to solve traffic unbalancing caused by up*/down* routing for irregular networks, two-dimensional direction is introduced into a spanning tree, and novel routing algorithms based on...

The impact of path selection algorithm of adaptive routing for implementing deterministic routing (2002)

Michihiro Koibuchi, Akiya Jouraku, Hideharu Amano

In PC clusters or high performance I/O networks including InfiniBand, network topologies often become irregular. Although various adaptive routings for irregular networks have been proposed, most of...

MBP-light: A Processor for Management of Distributed Shared Memory (1998)

Inoue Hiroaki, Katsunobu Nishimura, Mitsuru Satoh, Kei Hiraki, Hideharu Amano

MBP(Memory Based Processor)-light is a dedicated processor for management of cache coherent distributed shared memory (DSM) in a massively parallel processor called JUMP-1. Unlike traditional...

An LSI implementation of the Simple Serial Synchronized Multistage Interconnection Network (1997)

Takayuki Kamei, Masashi Sasahara, Hideharu Amano

After the delay for passing through all stages, the heads of packets come out of the outlets of the MIN. When a conflict occurs, one of the conflicting packets sets the conflict bit and must be routed...

The RDT Router Chip: A versatile router for supporting a distributed shared memory (1997)

Hiroaki Nishi, Ken-ichiro Anjo, Tomohiro Kudoh, Hideharu Amano

JUMP-1 is a massively parallel processor prototype developed by a collaboration between seven Japanese universities [4]. The major goal of this project is to establish techniques for...

Shared vs. Snoop: Evaluation of Cache Structure for Single-chip Multiprocessors (1997)

Toru Kisuki, Masaki Wakabayashi, Junji Yamamoto, Keisuke Inoue, Hideharu Amano

The shared cache structures and snoop cache structures for single-chip multiprocessors are evaluated and compared using an instruction level simulator. Simulation results show that 1-port...

MINC : Multistage Interconnection Network with Cache control mechanism (1996)

Toshihiro Hanawa, Hideki Yasukawa, Katsunobu Nishimura, Hideharu Amano

A novel approach to the cache coherent Multistage Interconnection Network (MIN) called the MINC (MIN with Cache control mechanism) is proposed. In the MINC, the directory is located only on the...

An LSI implementation of the Simple Serial Synchronized Multistage Interconnection Network (1995)

Takayuki Kamei, Masashi Sasahara, Hideharu Amano

Network (MIN) is a novel MIN architecture for connecting processors and memory modules in multiprocessors. Synchronized bit-serial communication simplifies the structure/control, and also solves the...

Hierarchical bit-map directory schemes on the RDT interconnection network for a massively parallel processor JUMP-1 (1995)

Tomohiro Kudoh, Hideharu Amano, Takashi Matsumoto, Kei Hiraki, Yulu Yang, Katsunobu Nishimura, ...

JUMP-1 is currently under development by seven Japanese universities to establish techniques of an efficient distributed shared memory on a massively parallel processor. It provides a memory...

SNAIL: a multiprocessor based on the Simple Serial Synchronized multistage interconnection network architecture (1994)

Masashi Sasahara, Jun Terada, Luo Zhou, Kalidou Gaye, Jun-ichi Yamato, Satoshi Ogura, ...

Simple Serial Synchronized (SSS) Multistage Interconnection Network (MIN) is a novel MIN architecture for connecting processors and memory modules in multiprocessors. Synchronized bit-serial...