A Hybrid Web Server Architecture for Secure e-Business Web Applications (2009)
Vicenç Beltran, David Carrera, Jordi Guitart, Jordi Torres, Eduard Ayguadé
Nowadays the success of many e-commerce applications, such as on-line banking, depends on their reliability, robustness and security. Designing a web server architecture that keeps these properties...
A Study of Implicit Data Distribution Methods for OpenMP Using the SPEC Benchmarks (2009)
Dimitrios S. Nikolopoulos, Eduard Ayguadé
In contrast to the common belief that OpenMP requires data-parallel extensions to scale well on architectures with non-uniform memory access latency, recent work has shown that it is...
Efficient Execution of Parallel Java Applications (2008)
Jordi Guitart, Xavier Martorell, Jordi Torres, Eduard Ayguadé
In this paper we propose mechanisms to improve the performance of parallel Java applications. The proposal is based on the establishment of a dialog between each Java application and the underlying...
Jordi Guitart, Jordi Torres, Eduard Ayguadé, José Oliver, Jesús Labarta
The rapid maturing process of the Java technology is encouraging users to develop portable applications using the Java language. As an important part of the definition of the Java...
A proposal for error handling in OpenMP (2008)
Xavier Martorell, Eduard Ayguadé, Jesús Labarta
OpenMP has focused on performance for numerical applications, but when we try to move this focus to other kinds of applications, like Web servers, we detect one important lack....
Dimitrios S. Nikolopoulos, Eduard Ayguadé, Constantine D. Polychronopoulos
This paper compares data distribution methodologies for scaling the performance of OpenMP on NUMA architectures. We investigate the performance of automatic page placement algorithms implemented in...
Exploiting Memory Affinity in OpenMP through Schedule Reuse (2008)
In this paper we explore the possibility of reusing schedules to improve the scalability of numerical codes in shared–memory architectures with non–uniform memory access. The main objective is to...
Runtime Address Space Computation for SDSM Systems (2008)
Jairo Balart, Marc Gonzàlez, Xavier Martorell, Eduard Ayguadé, Jesús Labarta
This paper explores the benefits and limitations of using an inspector/executor approach for Software Distributed Shared Memory (SDSM) systems. The role of the inspector is to obtain a...
Jordi Guitart, Eduard Ayguadé, Nacho Navarro, Jordi Torres
exploitation of loop-level
[...] Dynamic Page Migration and Manual Data Distribution [...] (2008)
Dimitrios S. Nikolopoulos, Eduard Ayguadé, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta
Scaling Non-Regular Shared-Memory Codes by Reusing Custom Loop Schedules (2008)
Dimitrios S. Nikolopoulos, Ernest Artiaga, Eduard Ayguadé
In this paper we explore the idea of customizing and reusing loop schedules to improve the scalability of non-regular numerical codes in shared-memory architectures with non-uniform memory...
Register constrained modulo scheduling (2008)
Javier Zalamea, Josep Llosa, Eduard Ayguadé, Mateo Valero
Software pipelining is an instruction scheduling technique that exploits the instruction level parallelism (ILP) available in loops by overlapping operations from various successive loop...
Performance and Power Evaluation of Clustered VLIW Processors with Wide Functional Units (2008)
Miquel Pericàs, Eduard Ayguadé, Javier Zalamea, Josep Llosa, Mateo Valero
Architectural resources and program recurrences are the main limitations to the amount of Instruction-Level Parallelism (ILP) exploitable from loops, the most time-consuming part in...
Improving the Performance of Multiprogrammed Parallel Workloads in Origin2000 Systems (2008)
Xavier Martorell, Eduard Ayguadé, Jesús Labarta, Nacho Navarro
In this paper, we present the evaluation of the Nanos parallel execution environment and its comparison with the native SGIMP environment with respect to the execution of multiprogrammed parallel...
Swing Modulo Scheduling: A Lifetime-Sensitive Approach (2008)
Josep Llosa, Antonio González, Eduard Ayguadé, Mateo Valero
This paper presents a novel software pipelining approach, which is called Swing Modulo Scheduling (SMS). It generates schedules that are near optimal in terms of initiation interval, register...
Generating a Periodic Pattern for VLIW (2008)
Cristina Barrado, Jesús Labarta, Eduard Ayguadé, Mateo Valero
Fine-grain parallelism available in VLIW and superscalar processors can be mainly exploited in computationally intensive loops. Aggressive scheduling techniques are required to fully exploit this...
An Instrumentation Tool for Threaded Java Application Servers (2008)
David Carrera, Jordi Guitart, Jordi Torres, Eduard Ayguadé, Jesús Labarta
Rapid development of e-business services has extended the use of application servers in companies. The Java platform has an important presence in this sector because of its portability...
Jordi Guitart, Xavier Martorell, Jordi Torres, Eduard Ayguadé
In this paper we propose mechanisms to improve the performance of parallel Java applications executing on multiprogrammed shared-memory multiprocessors. The proposal is based on a dialog between each...
Synchronized Access to Streams in SIMD Vector Multiprocessors (2008)
Montse Peiron, Mateo Valero, Eduard Ayguadé
The synchronized and simultaneous access to several vectors that form a single stream is typical in SIMD vector multiprocessors as well as in MIMD superscalar multiprocessors with decoupled access....
Memory Access Synchronization in Vector Multiprocessors (2008)
Mateo Valero, Montse Peiron, Eduard Ayguadé
In vector multiprocessor systems, collisions in the interconnection network and conflicts in the memory modules are the main causes of the performance degradation. In this work we propose...
Parallel Execution of Loops with Conditional Statements (2008)
Eduard Ayguadé, Jordi Torres, Jesús Labarta, Josep Llosa
This paper describes an approach to the evaluation of bounds of the execution time and number of processors needed to execute DO-like loops on MIMD systems. In the scope of this paper, we only...
Power–Performance Trade–Offs in Wide and Clustered VLIW Cores for Numerical Codes (2008)
Miquel Pericàs, Eduard Ayguadé, Javier Zalamea, Josep Llosa, Mateo Valero
Instruction-Level Parallelism (ILP) is the main source of performance achievable in numerical applications. Architectural resources and program recurrences are the main limitations to the...
WAS Control Center: An Autonomic Performance-Triggered Tracing Environment for WebSphere (2008)
David Carrera, David Garcia, Jordi Torres, Eduard Ayguadé, Jesús Labarta
Detecting performance problems and their causes on J2EE application servers such as WebSphere requires the use of appropriate tools and environments. The analysis of servers with high-availability...
Daniel Ortega, Iván Martel, Venkata Krishnan, Eduard Ayguadé, Mateo Valero
In this paper we exploit the existence of distant parallelism that future compilers could detect and characterise its performance under simultaneous multithreading architectures. By distant...
HwA1: Reduced Memory Latency For Regular Data Access (2007)
Mateo Valero, Tomás Lang, Montse Peiron, Eduard Ayguadé, J. M. Llaberia, J. J. Navarro
Address transformation schemes, such as skewing and linear transformations, have been proposed to achieve conflict-free access for streams with constant stride. However, this is achieved only for...
Reusing Custom Loop Schedules (2007)
Dimitrios S. Nikolopoulos, Ernest Artiaga, Eduard Ayguadé, Jesús Labarta
In this paper we explore the idea of customizing and reusing loop schedules to improve the scalability of non-regular numerical codes in shared-memory architectures with non-uniform memory...
A Study of Implicit Data Distribution Methods for OpenMP using the SPEC benchmarks (2007)
Dimitrios S. Nikolopoulos, Eduard Ayguadé
In contrast to the common belief that OpenMP requires data-parallel extensions to scale well on architectures with non-uniform memory access latency, recent work has shown that it is possible to...
Jordi Guitart, David Carrera, Vicenç Beltran, Jordi Torres, Eduard Ayguadé
an overload control strategy for secure
Session-Based Adaptive Overload Control for Secure Dynamic Web Applications (2005)
Jordi Guitart, David Carrera, Vicenç Beltran, Jordi Torres, Eduard Ayguadé
As dynamic web content and security capabilities are becoming popular in current web sites, the performance demand on application servers that host the sites is increasing, leading sometimes these...
Tuning dynamic web applications using fine-grain analysis (2005)
Jordi Guitart, David Carrera, Jordi Torres, Eduard Ayguadé, Jesús Labarta
In this paper we present a methodology to analyze the behavior and performance of Java application servers using a performance analysis framework. This framework considers all levels involved in the...
Characterizing secure dynamic web applications scalability (2005)
Jordi Guitart, Vicenç Beltran, David Carrera, Jordi Torres, Eduard Ayguadé
The growing number of users demanding secure dynamic web applications on current sites encourages the use of scalable application servers to host these sites in order to maintain their availability and their...
Performance, power efficiency and scalability of asymmetric cluster chip multiprocessors (2005)
Tomer Y. Morad, Uri C. Weiser, Avinoam Kolodny, Mateo Valero, Eduard Ayguadé
This paper evaluates asymmetric cluster chip multiprocessor (ACCMP) architectures as a mechanism to achieve the highest performance for a given power budget. ACCMPs execute serial phases...
A hybrid web server architecture for e-commerce applications (2005)
David Carrera, Vicenç Beltran, Jordi Torres, Eduard Ayguadé
The performance of an e-commerce application can be measured according to technical metrics but also following business indicators. The revenue obtained by a commercial web application is directly...
Optimizing NANOS OpenMP for the IBM Cyclops multithreaded architecture (2005)
David Ródenas, Xavier Martorell, Eduard Ayguadé, Jesús Labarta
In this paper, we present two approaches to improve the execution of OpenMP applications on the IBM Cyclops multithreaded architecture. Both solutions are independent...
Tuning dynamic web applications using fine-grain analysis (2005)
Jordi Guitart, David Carrera, Jordi Torres, Eduard Ayguadé, Jesús Labarta
In this paper we present a methodology to analyze the behavior and performance of Java application servers using a performance analysis framework. This framework considers all levels...
Experiences parallelizing a web server with OpenMP (2005)
Jairo Balart, Ro Duran, Marc Gonzàlez, Xavier Martorell, Eduard Ayguadé, Jesús Labarta
Multi-threaded web servers are typically parallelized by hand using the pthreads library. OpenMP has rarely been used to parallelize this kind of application, although we foresee that it...
Evaluating the scalability of Java event-driven web servers (2004)
Vicenç Beltran, David Carrera, Jordi Torres, Eduard Ayguadé
The two major strategies used to construct high-performance web servers are thread pools and event-driven architectures. The Java platform is commonly used in web environments but up to the moment it...
Complete Instrumentation Requirements for Performance Analysis of Web based Technologies (2003)
David Carrera, Jordi Guitart, Jordi Torres, Eduard Ayguadé, Jesús Labarta
In this paper we present the eDragon environment, a research platform created to perform complete performance analysis of new Web-based technologies. eDragon enables the understanding of how...
Hierarchical clustered register file organization for VLIW processors (2003)
Javier Zalamea, Josep Llosa, Eduard Ayguadé, Mateo Valero
Technology projections indicate that wire delays will become one of the biggest constraints in future microprocessor designs. To avoid long wire delays and therefore long cycle times, processor cores...
Is the schedule clause really necessary in OpenMP? (2003)
Eduard Ayguadé, Bob Blainey, Ro Duran, Jesús Labarta, Francisco Martínez, Xavier Martorell, ...
Choosing the appropriate assignment of loop iterations to threads is one of the most important decisions that need to be taken when parallelizing loops, the main source of parallelism in...
Evaluation of OpenMP for the Cyclops multithreaded architecture (2003)
George Almasi, Eduard Ayguadé, José Castaños, Jesús Labarta, Francisco Martínez, Xavier Martorell, ...
Multithreaded architectures have the potential of tolerating large memory and functional unit latencies and increase resource utilization. The Blue Gene/Cyclops architecture, being...
Is the schedule clause really necessary in OpenMP? (2003)
Eduard Ayguadé, Bob Blainey, Ro Duran, Jesús Labarta, Xavier Martorell, Raúl Silvera
Choosing the appropriate assignment of loop iterations to threads is one of the most important decisions that need to be taken when parallelizing loops, the main source of parallelism in...
Performance Analysis Tools for Parallel Java (2002)
Jordi Guitart, Jordi Torres, Eduard Ayguadé
In this paper we describe an instrumentation environment for the performance analysis and visualization of parallel applications written in JOMP, an OpenMP-like interface for Java. The environment...
Dual-Level Parallelism Exploitation with OpenMP in Coastal Ocean... (2002)
Marc González, Eduard Ayguadé, Xavier Martorell, Jesús Labarta, Phu V. Luong
Two alternative dual-level parallel implementations of the Multiblock Grid Princeton Ocean Model (MGPOM) are compared in this paper. The first one combines the use of two programming paradigms:...
Dimitrios S. Nikolopoulos, Eduard Ayguadé
automatic page placement algorithms implemented in the operating system, runtime algorithms based on dynamic page migration, runtime algorithms based...
Scaling irregular parallel codes with minimal programming effort (2001)
Dimitrios S. Nikolopoulos, Eduard Ayguadé, Constantine D. Polychronopoulos
The long foreseen goal of parallel programming models is to scale parallel code without significant programming effort. Irregular parallel applications are a particularly challenging application...
A Transparent Runtime Data Distribution Engine for OpenMP (2001)
Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, Eduard Ayguadé
This paper makes two important contributions. First, the paper investigates the performance implications of data placement in OpenMP programs running on modern NUMA multiprocessors. Data locality and...
Dimitrios S. Nikolopoulos, Eduard Ayguadé, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta
Dimitrios S. Nikolopoulos, Constantine D. Polychronopoulos, Theodore S. Papatheodorou, Jesús Labarta, Eduard Ayguadé
experiments presented in this paper were conducted with resources
Towards an efficient exploitation of loop-level parallelism in Java (2000)
José Oliver, Eduard Ayguadé, Nacho Navarro, Jordi Guitart, Jordi Torres
This paper analyzes the overheads incurred in the exploitation of loop-level parallelism using Java Threads and proposes some code transformations that minimize them. Avoiding the intensive use of...
Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, Eduard Ayguadé
We present the design and implementation of UPMLIB, a runtime system that provides transparent facilities for dynamically tuning the memory performance of OpenMP programs on scalable...
A Simulator for SMT Architectures: Evaluating Instruction Cache Topologies (2000)
Ronaldo Gonçalves, Eduard Ayguadé, Mateo Valero
SMT (Simultaneous MultiThreaded) is becoming one of the major trends in the design of future generations of microarchitectures. Its ability to exploit both intra- and interthread parallelism makes it...
Is data distribution necessary in OpenMP (2000)
Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, Eduard Ayguadé
This paper investigates the performance implications of data placement in OpenMP programs running on modern ccNUMA multiprocessors. Data locality and minimization of the rate of remote memory...
A Case for User-Level Dynamic Page Migration (2000)
Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, Eduard Ayguadé
This paper presents user-level dynamic page migration, a runtime technique which transparently enables parallel programs to tune their memory performance on distributed shared memory multiprocessors,...
Java Instrumentation Suite: Accurate Analysis of Java Threaded Applications (2000)
Jordi Guitart, Jordi Torres, Eduard Ayguadé, José Oliver, Jesús Labarta
The rapid maturing process of the Java technology is encouraging the development of portable applications using the Java language. As an important part of the definition of the Java language,...
User-Level Dynamic Page Migration for Multiprogrammed Shared-Memory Multiprocessors (2000)
Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, Eduard Ayguadé
This paper presents algorithms for improving the performance of parallel programs on multiprogrammed shared-memory NUMA multiprocessors, via the use of user-level dynamic page migration. The idea that...
Java Instrumentation Suite: Accurate Analysis of Java Threaded Applications (2000)
Jordi Guitart, Jordi Torres, Eduard Ayguadé, José Oliver, Jesús Labarta
The rapid maturing process of the Java technology is encouraging the development of portable applications using the Java language. As an important part of the definition of the Java...
Applying Interposition Techniques for Performance Analysis of OpenMP Applications (2000)
Marc González, Albert Serra, Xavier Martorell, José Oliver, Eduard Ayguadé, Jesús Labarta, ...
Tuning parallel applications requires the use of effective tools for detecting performance bottlenecks. Along a parallel program execution, many individual situations of performance degradation may...
Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesus Labarta, Eduard Ayguadé
We present the design and implementation of UPMLIB, a runtime system that provides transparent facilities for dynamically tuning the memory performance of OpenMP programs on scalable shared-memory...
Applying Interposition Techniques for Performance Analysis of OpenMP Applications (2000)
Marc González, Albert Serra, Xavier Martorell, Eduard Ayguadé, Jesús Labarta, Nacho Navarro
Tuning parallel applications requires the use of effective tools for detecting performance bottlenecks. Along a parallel program execution, many individual situations of performance degradation may...
Widening Resources: A Cost-Effective Technique for Aggressive ILP Architectures (2000)
David López, Josep Llosa, Mateo Valero, Eduard Ayguadé
The inherent instruction-level parallelism (ILP) of current applications (especially those based on floating point computations) has driven hardware designers and compiler writers to investigate...
User-Level Dynamic Page Migration for Multiprogrammed Shared-Memory Multiprocessors (2000)
Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, Eduard Ayguadé
This paper presents algorithms for improving the performance of parallel programs on multiprogrammed shared-memory NUMA multiprocessors, via the use of user-level dynamic page migration. The idea that...
A Case for User-Level Dynamic Page Migration (2000)
Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, Eduard Ayguadé
Towards an efficient exploitation of loop-level parallelism in Java (2000)
José Oliver, Jordi Guitart, Eduard Ayguadé, Nacho Navarro, Jordi Torres
This paper analyzes the overheads incurred in the exploitation of loop-level parallelism using Java Threads and proposes some code transformations that minimize them. The transformations avoid the...
Thread Fork/Join Techniques for Multi-level Parallelism Exploitation in NUMA Multiprocessors (1999)
Xavier Martorell, Eduard Ayguadé, Nacho Navarro, Julita Corbalán, Marc González, Jesús Labarta
This paper presents some techniques for efficient thread forking and joining in parallel execution environments, taking into consideration the physical structure of NUMA machines and the support for...
Exploiting Multiple Levels of Parallelism in OpenMP: A Case Study (1999)
Eduard Ayguadé, Xavier Martorell, Jesús Labarta, Marc González, Nacho Navarro
Most current shared-memory parallel programming environments are based on thread packages that allow the exploitation of a single level of parallelism. These thread packages do not enable the...
Thread Fork/Join Techniques for Multi-level Parallelism Exploitation (1999)
Xavier Martorell, Eduard Ayguadé, Nacho Navarro, Julita Corbalán, Marc González, Jesús Labarta
This paper presents some techniques for efficient thread forking and joining in parallel execution environments, taking into consideration the physical structure of NUMA machines and the support for...
Graph Traverse Software Pipelining (1998)
Cristina Barrado, Eduard Ayguadé, Jesús Labarta
Software pipelining is becoming widely used as a loop execution model for microprocessors supporting a high instruction level parallelism. In this paper we describe a heuristic method for software...
Resource widening versus replication: limits and performance-cost trade-off (1998)
David López, Josep Llosa, Mateo Valero, Eduard Ayguadé
A balanced increase of memory bandwidth and computational capabilities is going to be one of the trends in the design of near future high-performance microprocessors. Alternative solutions are...
Tools and Techniques for Automatic Data Layout: A Case Study (1998)
Eduard Ayguadé, Jordi Garcia, Ulrich Kremer
Parallel architectures with physically distributed memory providing computing cycles and large amounts of memory are becoming more and more common. To make such architectures truly usable,...
Increasing Performance With Multiply-Add Units and Wide Buses (1997)
David López, Mateo Valero, Josep Llosa, Eduard Ayguadé
A balanced increase of memory bandwidth and computational performance is one of the current trends towards high performance microprocessors. This improvement can be attained either by replicating...
Swing Modulo Scheduling: A Lifetime-Sensitive Approach (1996)
Josep Llosa, Antonio González, Eduard Ayguadé, Mateo Valero
This paper presents a novel software pipelining approach, which is called Swing Modulo Scheduling (SMS). It generates schedules that are near optimal in terms of initiation interval, register...
A Framework for Automatic Dynamic Data Mapping (1996)
Jordi Garcia, Eduard Ayguadé, Jesús Labarta
Data distribution is one of the key aspects that a parallelizing compiler for a distributed memory architecture should consider, to get efficiency from the system. The cost of accessing local and...
Increasing Memory Bandwidth with Wide Buses: Compiler, Hardware and Performance Trade-offs (1996)
David Lopez, Mateo Valero, Josep Llosa, Eduard Ayguadé
Memory latency and lack of bandwidth are the main barriers to achieving high performance from current and future processors, especially in numeric applications. New organizations of the memory subsystem...
Loop Parallelization: Revisiting Framework of Unimodular Transformations (1996)
Jordi Torres, Eduard Ayguadé, Jesus Labarta, Mateo Valero
This paper extends the framework of linear loop transformations by adding a new non-linear step to the transformation process. The current framework of linear loop transformation cannot identify a...
Hypernode Reduction Modulo Scheduling (1995)
Josep Llosa, Mateo Valero, Eduard Ayguadé, Antonio González
Software Pipelining is a loop scheduling technique that extracts parallelism from loops by overlapping the execution of several consecutive iterations. Most prior scheduling research has focused on...
Hypernode Reduction Modulo Scheduling (1995)
Josep Llosa, Mateo Valero, Eduard Ayguadé
Software Pipelining is a loop scheduling technique that extracts parallelism from loops by overlapping the execution of several consecutive iterations. Prior scheduling research has focused on...
A Novel Approach Towards Automatic Data Distribution (1995)
Jordi Garcia, Eduard Ayguadé, Jesús Labarta
Data distribution is one of the key aspects that a parallelizing compiler for a distributed memory architecture should consider, in order to get efficiency from the system. The cost of accessing...
Nano-Threads Library Design, Implementation and Evaluation (1995)
Eduard Ayguadé, Xavier Martorell, Jesús Labarta, Nacho Navarro, ...
In this report we describe the design and implementation of a user-level thread package based on the nano-threads programming model, whose goal is to efficiently manage the application parallelism...
Concurrency: Practice And Experience (1995)
Marc Gonzàlez, Eduard Ayguadé, Xavier Martorell, Jesús Labarta, Nacho Navarro
This paper describes the support provided by the NanosCompiler for nested parallelism in OpenMP. The NanosCompiler is a source-to-source parallelizing compiler implemented around a hierarchical...
Vector multiprocessors with arbitrated memory access (1995)
Montse Peiron, Mateo Valero, Eduard Ayguadé, Tomás Lang
The high latency of memory accesses is one of the factors that contributes most to reducing the performance of current vector supercomputers. The conflicts that can occur in the memory modules plus the...
Using a 0-1 Integer Programming Model for Automatic Static Data Distribution (1995)
Jordi Garcia, Eduard Ayguadé, Jesús Labarta
This paper describes an automatic data distribution method which deals with both the alignment and the distribution problems in a single optimization phase, as opposed to sequentially solving these...
Revisiting Framework of Linear Loop Transformations to Extract Full Loop Parallelism (1995)
Jordi Torres, Eduard Ayguadé, Jesus Labarta, Mateo Valero
This paper extends the framework of linear loop transformations by adding a new non-linear step to the transformation process. The current framework of linear loop transformation cannot identify a...
A novel approach towards automatic data distribution (1995)
Jordi Garcia, Eduard Ayguadé, Jesús Labarta
Abstract: Data distribution is one of the key aspects that a parallelizing compiler for a distributed memory architecture should consider, in order to get efficiency from the system. The cost of...
Data Partitioning Methods: Implementation and Static Evaluation Reports (1994)
Eduard Ayguadé, Jesús Labarta, Jordi Garcia, Mercé Girones
In this report we have described how two methods for automatically determining convenient data distributions out of sequential programs have been implemented in the ParaScope environment. The selected...
Network Synchronization and out-of-order Access to Vectors. Parallel Processing Letters (1994)
Mateo Valero, Eduard Ayguadé, Montse Peiron
In vector multiprocessor systems, collisions in the interconnection network and conflicts in the memory modules are the main causes of the performance degradation. In this work we use a synchronized...
Partitioning the Statement per Iteration Space Using Non-singular Matrices (1993)
In this paper we generalize the framework of linear loop transformations: we consider loop alignment as a new component in the transformation process. The aim is to exploit the additional inherent...
name cause-promise))) Zerdia the country - TA5 (1993)
Montse Peiron, Mateo Valero, Eduard Ayguadé, Tomás Lang
The synchronized and simultaneous access to several vectors that form a single stream occurs in SIMD vector multiprocessors as well as in MIMD superscalar multiprocessors with decoupled access. In...
Increasing the number of strides for conflict-free vector access (1992)
Mateo Valero, Tomás Lang, José M. Llabería, Montse Peiron, Eduard Ayguadé, Juan J. Navarro
Address transformation schemes, such as skewing and linear transformations, have been proposed to achieve conflict-free vector access for some strides in vector processors with multi-module memories....
CYOS: Scheduling in a Continuous Area-Time Design Space (1990)
Jordi Cortadella, Rosa M. Badia, Eduard Ayguadé, ...
Operation scheduling and hardware allocation are the two most important phases in the synthesis of circuits from behavioral descriptions. This paper presents CYOS (CYcle time Optimizer and...
Leveraging transparent data distribution in OpenMP via user-level dynamic page migration (2000)
Dimitrios S. Nikolopoulos, Theodore S. Papatheodorou, Constantine D. Polychronopoulos, Jesús Labarta, Eduard Ayguadé
Abstract. This paper describes transparent mechanisms for emulating some of the data distribution facilities offered by traditional data-parallel programming models, such as High Performance Fortran,...
MIRS: Modulo Scheduling with Integrated Register Spilling
Javier Zalamea, Josep Llosa, Eduard Ayguadé, Mateo Valero
The overlapping of loop iterations in software pipelining techniques imposes high register requirements. The schedule for a loop is valid if it requires at most the number of registers available in...