Design Issues in Parallel Array Languages for Shared Memory ⋆ (2009)
James Brodman, Basilio B. Fraguela, María J. Garzarán, David Padua
Abstract. The Hierarchically Tiled Array (HTA) is a data type that facilitates the definition and manipulation of arrays partitioned into tiles. The data type allows to exploit those tiles to attain...
Optimizing Sorting with Machine Learning Algorithms (2008)
Xiaoming Li, María Jesús Garzarán, David Padua
The growing complexity of modern processors has made the development of highly efficient code increasingly difficult. Manually developing highly efficient code is usually expensive but necessary due...
Diego Andrade, James Brodman, Basilio B. Fraguela, David Padua
Abstract. Multicore systems are now the norm. Programmers can no longer rely on faster clock rates to speed up their applications. Thus, software developers are increasingly forced to face the...
Peng Zhao, Peng Zhao, David Padua, Bruce Cockburn, Jonathan Schaeffer, Duane Szafron
Permission is hereby granted to the University of Alberta Library to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes...
Programming the FlexRAM Intelligent Memory Architecture (2008)
Basilio B. Fraguela, Jose Renau, Paul Feautrier, David Padua
Universit'e de Versailles St-Quentin
Evaluating the Impact of Thread Escape Analysis on a Memory Consistency Model-aware Compiler (2008)
Chi-leung Wong, Zehra Sura, Xing Fang, Kyungwoo Lee, Samuel P. Midkiff, Jaejin Lee, ...
Abstract. The widespread popularity of languages allowing explicitly parallel, multi-threaded programming, e.g. Java and C#, have focused attention on the issue of memory model design. The Pensieve...
HiLO: High Level Optimization of FFTs ⋆ (2008)
Abstract. As computing platforms become more and more complex, the task of optimizing performance critical codes becomes more challenging. Recently, more attention has been focused on automating this...
Bryan Singer, Jianxin Xiong, Jeremy Johnson, David Padua, Manuela Veloso, ...
Abstract The paper describes SPIRAL, a generator of libraries for fast software implementations of signal processing transforms. These libraries are adapted to the computing platform and can be...
Automatic Generation of a Parallel Sorting Algorithm ∗ (2008)
Brian A. Garber, Dan Hoeflinger, Xiaoming Li, María Jesús Garzarán, David Padua
In this paper, we discuss a library generator for parallel sorting routines that examines the input characteristics (and the parameters they affect) to select the best performing algorithm. Our...
Basilio B. Fraguela, Jia Guo, Ganesh Biksh, María J. Garzarán, Gheorghe Almási, José Moreira, ...
In this paper, we show our initial experience with a class of objects, called Hierarchically Tiled Arrays (HTAs), that encapsulate parallelism. HTAs allow the construction of single-threaded parallel...
An advanced compiler framework for non-cache-coherent multiprocessors (2008)
Yunheung Paek, Angeles Navarro, Emilio Zapata, Senior Member, Jay Hoeflinger, David Padua
AbstractÐThe Cray T3D and T3E are non-cache-coherent (NCC) computers with a NUMA structure. They have been shown to exhibit a very stable and scalable performance for a variety of application...
Markus Püschel, Jeremy Johnson, David Padua, Manuela Veloso, Bryan W. Singer, ...
Abstract — Fast changing, increasingly complex, and diverse computing platforms pose central problems in scientific computing: How to achieve, with reasonable effort, portable optimal performance?...
1.1 Hierarchically Tiled Arrays (2008)
Ganesh Biksh, Jia Guo, Christoph Von Praun, Gabriel Tanase, Basilio B. Fraguela, María J. Garzarán, ...
Abstract. Hierarchically Tiled Arrays (HTAs) are data structures that facilitate locality and parallelism of array intensive computations with block-recursive nature. The model underlying HTAs...
Hierarchically Tiled Array Vs. Intel Thread Building Blocks for Multicore Systems Programming (2008)
Diego Andrade, James Brodman, Basilio B. Fraguela, David Padua
Multicore systems are becoming common, while programmers cannot rely on growing clock rate to speed up their application. Thus, software developers are increasingly exposed to the complexity...
Markus Püschel, Jeremy Johnson, David Padua, Manuela Veloso, Bryan W. Singer, ...
Abstract — Fast changing, increasingly complex, and diverse computing platforms pose central problems in scientific computing: How to achieve, with reasonable effort, portable optimal performance?...
HiLO: High Level Optimization of FFTs ⋆ (2008)
Abstract. As computing platforms become more and more complex, the task of optimizing performance critical codes becomes more challenging. Recently, more attention has been focused on automating this...
under Awards ACR-0234293, ITR/NGS-0325687, and SYS-310941. (2008)
Markus Püschel, Jeremy R. Johnson, David Padua, Robert W. Johnson, Nicholas Rizzolo
Fast changing, increasingly complex, and diverse computing platforms pose central problems in scientific computing: How to achieve, with reasonable effort, portable optimal performance? We present...
Automatic derivation and implementation of signal processing algorithms (2008)
We present a computer algebra framework to automatically derive and implement algorithms for digital signal processing. The two main parts of the framework are AREP, a library for symbolic...
1.1 Hierarchically Tiled Arrays (2008)
Ganesh Biksh, Jia Guo, Christoph Von Praun, Gabriel Tanase, Basilio B. Fraguela, María J. Garzarán, ...
Abstract. Hierarchically Tiled Arrays (HTAs) are data structures that facilitate locality and parallelism of array intensive computations with block-recursive nature. The model underlying HTAs...
Hierarchically Tiled Arrays for Parallelism and Locality ∗ (2008)
Jia Guo, Ganesh Biksh, Daniel Hoeflinger, Gheorghe Almasi, Basilio Fraguela, María Jesús Garzarán, ...
Parallel programming is facilitated by constructs which, unlike the widely used SPMD paradigm, provide programmers with a global view of the code and data structures. These constructs could be...
On the Automatic Parallelization of the Perfect Benchmarks R Rudolf Eigenmann y Jay Hoe inger (2008)
This paper presents the results of the Cedar Hand-Parallelization Experiment, conducted
Abstract Gated SSA-Based Demand-Driven Symbolic Analysis for Parallelizing Compilers * (2008)
In this paper, we present a GSA-based technique that performs more efficient and more precise symbolic analysis of predicated assignments, recurrences and index arrays. The efficiency is improved by...
In this paper, we describe the inference mechanism used by the FALCON system to translate MATLABl programs to Fortran 90. FALCON is a programming environment for the development of scientific...
languages. They are Dennis Gannon from Indiana (2008)
Equipment Corporation, David Padua
Introductory paper to the session of this title at the Workshop on Challenges for Parallel Processing, International Conference on Parallel Processing, 1996 In this workshop session, three speakers...
Benchmarking FALCON's MATLAB-to-Fortran 90 Compiler on an SGI Power Challenge (2008)
This paper presents an overview of the FALCON MATLAB-to-Fortran 90 compiler. FALCON is a programming environment for the development of high-performance scientific programs. It combines static and...
Jia Guo, Ganesh Biksh, Basilio B. Fraguela, María J. Garzarán, David Padua
The importance of tiles or blocks in scientific computing cannot be overstated. Many algorithms, both iterative and recursive, can be expressed naturally if tiles are represented explicitly. From the...
Vikram S. Adve, Gul Agha, Matthew I. Frank, María Jesús Garzarán, John C. Hart, ...
Illinois history in parallel computing stretches more than 40 years. From the first academic parallel supercomputer, the ILLIAC IV started in 1964, to today’s work to install the first petascale...
Code Study: Automatic Parallelism Detection in Dyfesm (2007)
Yuan Lin And, Yuan Lin, David Padua
This paper reports the results of our study on the Perfect Benchmarks code DYFESM in order to determine how eective automatic parallelism detection techniques are on this code.
. The property analysis of subscript arrays can be used to facilitate the automatic detection of parallelism in sparse/irregular programs that use indirectly accessed arrays. In order for property...
Benchmarking FALCON's MATLAB-to-Fortran 90 Compiler on an SGI Power Challenge (2007)
This paper presents an overview of the FALCON MATLAB-to-Fortran 90 compiler. FALCON is a programming environment for the development of high-performance scienti c programs. It combines static and...
Jeremy Johnson, Robert W. Johnson, David Padua, Jianxin Xiong
Abstract. This paper discuss an approach to implementing and optimizing fast signal transforms based on a domain specific computer language, called SPL. SPL programs, which are essentially...
Jianxin Xiong, Jeremy Johnson, Robert Johnson, David Padua
We discuss the design and implementation of a compiler that translates formulas representing signal processing transforms into ecient C or Fortran programs. The formulas are represented in a language...
Automatic derivation and implementation of signal processing algorithms (2007)
We present a computer algebra framework to automatically derive and implement algorithms for digital signal processing. The two main parts of the framework are AREP, a library for symbolic...
Yunheung Paek, Jay Hoeflinger, Angeles Navarro, Emilio Zapata, David Padua
The Cray T3D and T3E are non-cache-coherent (NCC) computers with a NUMA structure. They have been shown to exhibit very stable and scalable performance for a variety of application programs....
Analyzing Java Arrays: Combness and Synchness Analysis (2007)
Peng Wu Paul, Paul Feautrier, David Padua
In this paper, we propose two new abstractions, combness and synchness, for analyzing Java arrays, which capture element-wise aliasing relation within an array or between two arrays. combness and...
Benchmarking FALCON's MATLAB-to-Fortran 90 Compiler on an SGI Power Challenge (2007)
Luiz De, Luiz De Rose, David Padua
This paper presents an overview of the FALCON MATLAB-to-Fortran 90 compiler. FALCON is a programming environment for the development of high-performance scientific programs. It combines static and...
Programming the FlexRAM Intelligent Memory Architecture (2007)
Basilio B. Fraguela, Jose Renau, Paul Feautrier, David Padua, Josep Torrellas
Advances in VLSI technology have enabled the implementation of computer architectures based on Processing-in-Memory (PIM) chips. These architectures can help bridge the growing gap between processor...
Points-to Analysis for Java with Applications to Loop Optimizations (2007)
Peng Wu, Paul Feautrier, David Padua, Zehra Sura
If Java is to be used for high performance computing, we must enable classical loop optimizations as well as eliminate those features of Java which are detrimental to good performance, like...
Abstract Gated SSA-Based Demand-Driven Symbolic Analysis for Parallelizing Compilers (2007)
In this paper, we present a GSA-based technique that performs more e cient and more precise symbolic analysis of predicated assignments, recurrences and index arrays. The e ciency is improved by...
On the Automatic Parallelization of the Perfect Benchmarks R Rudolf Eigenmann y Jay Hoe inger (2007)
We present a set of advanced program parallelization techniques that are able to signi cantly improve the performance of application programs. We present evidence for this improvement in terms of the...
On the Automatic Parallelization of (2007)
Rudolf Eigenmann, Jay Hoeflinger, David Padua, Senior Member
transformed the Perfect Benchmarks ® into parallel program versions. In doing so, we used techniques that may be automated in an optimizing compiler. We then ran these programs on the Cedar...
Programming for parallelism and locality with hierarchically tiled arrays (2006)
Ganesh Biksh, Jia Guo, Daniel Hoeflinger, Gheorghe Almasi, Basilio B. Fraguela, María J. Garzarán, ...
Tiling has proven to be an effective mechanism to develop high performance implementations of algorithms. Tiling can be used to organize computations so that communication costs in parallel programs...
Programming for Parallelism and Locality with Hierarchically Tiled Arrays (2006)
Ganesh Bikshandi, Jia Guo, Daniel Hoeflinger, Gheorghe Almasi, Basilio B. Fraguela, Mara J. Garzaran, ...
Tiling has proven to be an effective mechanism to develop high performance implementations of algorithms. Tiling can be used to organize computations so that communication costs in parallel programs...
Is Search Really Necessary to Generate High-Performance BLAS? (2005)
Kamen Yotov, Xiaoming Li, Gang Ren, Maria Garzaran, David Padua, Keshav Pingali, ...
Abstract — A key step in program optimization is the estimation of optimal values for parameters such as tile sizes and loop unrolling factors. Traditional compilers use simple analytical models to...
A Language for the Compact Representation of Multiple Program Versions (2005)
Sebastien Donadio, James Brodman, Thomas Roeder, Kamen Yotov, Denis Barthou, Albert Cohen, ...
As processor complexity increases compilers tend to deliver suboptimal performance. Library generators such as ATLAS, FFTW and SPIRAL overcome this issue by empirically searching in the space of...
Albert Cohen, Sebastien Donadio, Maria-Jesus Garzaran, Christoph Herrmann, Oleg Kiselyov, David Padua
The quality of compiler-optimized code for high-performance applications is far behind what optimization and domain experts can achieve by hand. Although it may seem surprising at first glance, the...
A Language for the Compact Representation of Multiple Program Versions (2005)
Sebastien Donadio, James Brodman, Thomas Roeder, Kamen Yotov, Denis Barthou, Albert Cohen, ...
As processor complexity increases compilers tend to deliver suboptimal performance. Library generators such as ATLAS, FFTW and SPIRAL overcome this issue by empirically searching in the space of...
Analytic Models and Empirical Search: A Hybrid Approach to Code Optimization (2005)
Arkady Epshteyn, Maria Garzaran, Gerald Dejong, David Padua, Gang Ren, Xiaoming Li, ...
Compilers employ system models, sometimes implicitly, to make code optimization decisions. These models are analytic; they reflect their implementor 's understanding and beliefs of the system....
Optimizing Sorting with Genetic Algorithms (2005)
Xiaoming Li Mar, Xiaoming Li, María Jesús Garzarán, David Padua
The growing complexity of modern processors has made the generation of highly efficient code increasingly difficult. Manual code generation is very time consuming, but it is often the only choice...
Is Search Really Necessary to Generate High-Performance BLAS? (2005)
Kamen Yotov, Xiaoming Li, Gang Ren, Maria Jesus Garzaran, David Padua, Member Ieee, ...
this paper was arrived at by studying the ATLAS source code. In case of any discrepancy between this description and how the ATLAS system is actually implemented, the documentation of the ATLAS...
Albert Cohen, Sebastien Donadio, Maria-Jesus Garzaran, Christoph Herrmann, Oleg Kiselyov, David Padua
The quality of compiler-optimized code for high-performance applications is far behind what optimization and domain experts can achieve by hand. Although it may seem surprising at first glance, the...
Parallel mining of closed sequential patterns (2005)
Shengnan Cong, Jiawei Han, David Padua
Discovery of sequential patterns is an essential data mining task with broad applications. Among several variations of sequential patterns, closed sequential pattern is the most useful one since it...
A samplingbased framework for parallel data mining (2005)
Shengnan Cong, Jiawei Han, Jay Hoeflinger, David Padua
The goal of data mining algorithm is to discover useful information embedded in large databases. Frequent itemset mining and sequential pattern mining are two important data mining problems with...
A framework for measuring supercomputer productivity (2004)
Marc Snir, David A. Bader, Marc Snir, David A. Bader, James C. Browne, Brad Chamberlain, ...
We propose a framework for measuring the productivity of high performance computing (HPC) systems, based on common economic definitions of productivity and on utility theory. We discuss how these...
Albert Cohen, Sébastien Donadio, Maria-jesus Garzaran, Christoph Herrmann, David Padua
The quality of compiler-optimized code for high-performance applications lags way behind what optimization and domain experts can achieve by hand. This paper explores in-between solutions, besides...
SPIRAL: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms (2004)
Markus Püschel, Bryan Singer, Jianxin Xiong, Jeremy Johnson, David Padua, ...
SPIRAL is a generator for libraries of fast software implementations of linear signal processing transforms. These libraries are adapted to the computing platform and can be re-optimized as the...
Gengbin Zheng, Terry Wilmarth, Orion Sky Lawlor, Laxmikant V. Kalé, Sarita Adve, David Padua
We present a performance modeling and programming environment for petaflops computers and the Blue Gene machine. It consists of a parallel simulator, BigSim, for predicting performance of machines...
SPIRAL: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms (2004)
Markus Püschel, Bryan Singer, Jianxin Xiong, Jeremy Johnson, David Padua, ...
SPIRAL is a generator for libraries of fast software implementations of linear signal processing transforms. These libraries are adapted to the computing platform and can be re-optimized as the...
SPIRAL: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms (2004)
Markus Püschel, Bryan Singer, Jianxin Xiong, Jeremy Johnson, David Padua, ...
SPIRAL is a generator of libraries of fast software implementations of linear signal processing transforms. These libraries are adapted to the computing platform and can be re-optimized as the...
SPIRAL: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms (2004)
Markus Püschel, Bryan Singer, Jianxin Xiong, Jeremy Johnson, David Padua, ...
SPIRAL is a generator of libraries of fast software implementations of linear signal processing transforms. These libraries are adapted to the computing platform and can be re-optimized as the...
Implementation of Parallel Numerical Algorithms Using Hierarchically Tiled Arrays (2004)
Ganesh Bikshandi, Basilio B. Fraguela, Jia Guo, Maria J. Garzaran, Gheorghe Almasi, Jose Moreira, ...
In this paper, we describe our experience in writing parallel numerical algorithms using Hierarchically Tiled Arrays (HTAs). HTAs are classes of objects that encapsulate parallelism. HTAs allow the...
The Hierarchically Tiled Arrays Programming Approach (2004)
Basilio B. Fraguela, Jia Guo, Ganesh Bikshandi, Maria J. Garzaran, Gheorghe Almasi, Jose Moreira, ...
In this paper, we show our initial experience with a class of objects, called Hierarchically Tiled Arrays (HTAs), that encapsulate parallelism. HTAs allow the construction of single-threaded parallel...
Albert Cohen, Sebastien Donadio, Maria-Jesus Garzaran, Christoph Herrmann, David Padua
The quality of compiler-optimized code for high-performance applications lags way behind what optimization and domain experts can achieve by hand. This paper explores in-between solutions, besides...
A Dynamically Tuned Sorting Library (2004)
Xiaoming Li, Maria Jesus Garzaran, David Padua
Empirical search is a strategy used during the installation of library generators such as ATLAS, FFTW, and SPIRAL to identify the algorithm or the version of an algorithm that delivers the best...
Implementation of Parallel Numerical Algorithms Using Hierarchically Tiled Arrays (2004)
Ganesh Bikshandi, Basilio B. Fraguela, Jia Guo, María J. Garzarán, Gheorghe Almási, José Moreira, ...
Abstract. In this paper, we describe our experience in writing parallel numerical algorithms using Hierarchically Tiled Arrays (HTAs). HTAs are classes of objects that encapsulate parallelism. HTAs...
Programming for locality and parallelism with hierarchically tiled arrays (2003)
Gheorghe Almasi, Luiz De Rose, Jose Moreira, David Padua
This paper introduces a new primitive data type, hierarchically tiled arrays (HTAs), which could be incorporated into conventional languages to facilitate parallel programing and programming for...
The Power of Belady’s Algorithm in Register Allocation for Long Basic Blocks (2003)
Jia Guo, María Jesús Garzarán, David Padua
Abstract. Optimization techniques such as loop-unrolling and trace-scheduling can result in long straight-line codes. It is, however, unclear how well the register allocation algorithms of current...
Programming for locality and parallelism with hierarchically tiled arrays (2003)
Gheorghe Almási, Luiz De Rose, Basilio B. Fraguela, José Moreira, David Padua
Abstract. This paper introduces a new primitive data type, hierarchically tiled arrays (HTAs), which could be incorporated into conventional languages to facilitate parallel programming and...
A preliminary study on the vectorization of multimedia applications for multimedia extensions (2003)
Gang Ren, Peng Wu, David Padua
Abstract. In 1994, the first multimedia extension, MAX-1, was introduced to general-purpose processors by HP. Almost ten years have passed, the present means of accessing the computing power of...
Programming the FlexRAM Parallel Intelligent Memory System (2003)
Basilio B. Fraguela, Jose Renau, Paul Feautrier, David Padua, Josep Torrellas
Intelligent memory architectures enhance the memory chips of a computer with many simple processors. The result is a highly-parallel, heterogeneous machine that is able to exploit computation in...
Programming for locality and parallelism with hierarchically tiled arrays (2003)
Gheorghe Almasi, Luiz De Rose, Jose Moreira, David Padua
This paper introduces a new primitive data type, hierarchically tiled arrays (HTAs), which could be incorporated into conventional languages to facilitate parallel programing and programming for...
A preliminary study on the vectorization of multimedia applications for multimedia extensions (2003)
Gang Ren, Peng Wu, David Padua
Abstract. In 1994, the first multimedia extension, MAX-1, was introduced to general-purpose processors by HP. Almost ten years have passed, the present means of accessing the computing power of...
Programming the FlexRAM Parallel Intelligent Memory System (2003)
Basilio B. Fraguela, Jose Renau, Paul Feautrier, David Padua, Josep Torrellas
In an intelligent memory architecture, the main memory of a computer is enhanced with many simple processors. The result is a highly-parallel, heterogeneous machine that is able to exploit...
Intraprocedural Pointer Analysis for Container-Centric Applications (2001)
Cohen, Albert, Wu, Peng, Padua, David
As programmers look forward to designing high performance applications with object-oriented models, compilers must support higher-level analyses and optimizations. Pointer analysis for...
Intraprocedural Pointer Analysis for Container-Centric Applications (2001)
Cohen, Albert, Wu, Peng, Padua, David
As programmers look forward to designing high performance applications with object-oriented models, compilers must support higher-level analyses and optimizations. Pointer analysis for...
Intraprocedural Pointer Analysis for Container-Centric Applications (2001)
Cohen, Albert, Wu, Peng, Padua, David
As programmers look forward to designing high performance applications with object-oriented models, compilers must support higher-level analyses and optimizations. Pointer analysis for...
Induction variable analysis without idiom recognition: Beyond monotonicity (2001)
Peng Wu, Albert Cohen, David Padua
This work is an extension of the previous induction variable analyses based on monotonic evolution [11]. With the same computational complexity, the new algorithm improves the monotonic...
A compiler for multiple memory models (2001)
Samuel P. Midkiff, Jaejin Lee, David Padua
The design of consistency models for both hardware and software is a difficult task. For programming languages it is particularly difficult because the target audience for a programming language is...
Searching for the Best FFT Formulas with the SPL Compiler (2001)
Jeremy Johnson, Robert W. Johnson, David Padua, Jianxin Xiong
This paper discuss an approach to implementing and optimizing fast signal transforms based on a domain speci c computer language, called SPL. SPL programs, which are essentially mathematical...
Induction Variable Analysis without Idiom Recognition: Beyond Monotonicity (2001)
Peng Wu, Albert Cohen, David Padua
Traditional induction variables (IV) analyses focus on computing the closed form expressions of variables. This paper presents a new IV analysis based on an IV property called distance interval ....
Peng Wu, Albert Cohen, Jay Hoeflinger, Usa Intelamericas, Inc Xkaisoftware, ...
We present a new approach to dependence testing in the presence of induction variables. Instead of looking for closed form expressions, our method computes monotonic evolution which captures the...
Pointer Analysis for Monotonic Container Traversals (2001)
Albert Cohen Peng, Peng Wu, David Padua
As programmers look forward to designing high performance applications with object-oriented models, compilers must support higher-level analyses and optimizations. Pointer analysis for...
Efficient and precise array access analysis (2000)
Yunheung Paek, Jay Hoeflinger, David Padua
A number of existing compiler techniques hinge on the analysis of array accesses in a program. The most important task in array access analysis is to collect the information about array accesses of...
Abstract. Many compiler techniques usually require analysis of array subscripts to determine whether a transformation is legal. Traditional methods require the array subscript expressions to be...
Techniques for the translation of MATLAB programs into Fortran 90 (1999)
This article describes the main techniques developed for FALCON’s MATLAB-to-Fortran 90 compiler. FALCON is a programming environment for the development of high-performance scientific programs. It...
Containers on the Parallelization of General-purpose Java Programs (1999)
Static parallelization of general-purpose programs is still impossible, in general, due to their common use of pointers, irregular data structures, and complex control-flows. One promising strategy...
Access Descriptor based Locality Analysis for Distributed-Shared Memory Multiprocessors (1999)
Angeles G. Navarro, Rafael Asenjo, Emilio L. Zapata, David Padua
Most of today's multiprocessors have a DistributedShared Memory (DSM) organization, which enables scalability while retaining the convenience of the shared-memory programming paradigm. Data...
Containers on the Parallelization of General-purpose Java Programs (1999)
Automatic parallelization of general-purpose programs is still not possible in general in the presence of irregular data structures and complex control-ows. One promising strategy is tread-level data...
On the automatic parallelization of the perfect benchmarks (1998)
Rudolf Eigenmann, Jay Hoeflinger, David Padua
This paper presents the results of the Cedar Hand-Parallelization Experiment, conducted
On the Automatic Parallelization of Sparse and Irregular Fortran Programs (1998)
. Automatic parallelization is usually believed to be less effective at exploiting implicit parallelism in sparse/irregular programs than in their dense/regular counterparts. However, not much is...
Parallelization of Benchmarks for Scalable Shared-Memory Multiprocessors (1998)
Y. Paek, A. Navarro, E.L. Zapata, D. Padua, Yunheung Paek, Angeles Navarro, ...
This work identifies practical compiling techniques for scalable shared memory machines. For this, we have focused on experimental studies using a real machine and representative codes. In the...
Simplification of Array Access Patterns for Compiler Optimizations (1998)
Yunheung Paek, Jay Hoeflinger, David Padua
Existing array region representation techniques are sensitive to the complexity of array subscripts. In general, these techniques are very accurate and efficient for simple subscript expressions, but...
Performance Analysis for Polaris on Distributed-Memory Multiprocessors (1997)
A.G. Navarro, D. Padua, Y. Paek, E.L. Zapata, Angeles G. Navarro, Yunheung Paek, ...
The Polaris restructurer transforms conventional Fortran programs into parallel form for various types of multiprocessor systems. This paper presents the results of a study on strategies to improve...
Compiler Techniques for Effective Communication on Distributed-Memory Multiprocessors (1997)
A.G. Navarro, Y. Paek, E.L. Zapata, D. Padua, Angeles G. Navarro, Yunheung Paek, ...
The Polaris restructurer transforms conventional Fortran programs into parallel form for various types of multiprocessor systems. This paper presents the results of a study on strategies to improve...
On the Automatic Parallelization of Sparse and Irregular Fortran Codes (1997)
Rafael Asenjo, E. Gutierrez, Y. Lin, D. Padua, B. Pottenger, E.L. Zapata, ...
This paper studies how well automatic parallelization techniques work on a collection of real codes with sparse and irregular access patterns. In conducting this work, we have compared existing...
Compiler Techniques for Effective Communication on Distributed-Memory Multiprocessors (1997)
Angeles G. Navarro, Yunheung Paek, Emilio L. Zapata, David Padua
The Polaris restructurer transforms conventionalFortran programs into parallel form for various types of multiprocessor systems. This paper presents the results of a study on strategies to improve...
On the Automatic Parallelization of Sparse and Irregular Fortran Codes (1997)
Rafael Asenjo, Eladio Gutierrez, Yuan Lin, David Padua, Bill Pottenger, Emilio Zapata
This paper studies howwell automatic parallelization techniques work on a collection of real codes with sparse and irregular access patterns. In conducting this work, wehave compared existing...
Access Regions: Towards aPowerful Parallelizing Compiler (1996)
Yunheung Paek, Jay Hoe Inger, David Padua
The bulk of the work within a scienti c program involves processing data stored in arrays. We present a general and e cient means of representing the region of an array accessed by a section of a...
Region-based parallelization using the region test (1996)
Jay Hoe Inger, Yunheung Paek, David Padua
In this paper, we discuss the advantages of using array access region summaries to parallelize programs. We reformulate the de nition of the types of dependence in terms of array regions, and present...
On the Automatic Parallelization of Sparse and Irregular Fortran Codes (1996)
Rafael Asenjo, Eladio Gutierrez, Yuan Lin, David Padua, Bill Pottenger, Emilio Zapata, ...
Irregular memory access patterns have traditionally caused difficulties in the automatic detection of parallelism, and in many cases parallelization is prevented. These problems are nonetheless...
Restructuring programs for high-speed computers with Polaris (1996)
Bill Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jaejin Lee, Tom Lawrence, ...
The ability to automatically parallelize standard programming languages results in program portability across a wide range of machine architectures. It is the goal of the Polaris project to develop a...
Advanced Program Restructuring for High-Performance Computers with Polaris (1996)
William Blume, Ramon Doallo, Rudolf Eigenmann, John Grout, Jay Hoeflinger, Thomas Lawrence, ...
Multiprocessor computers are rapidly becoming the norm. Parallel workstations are widely available today, and it is likely that most PCs in the near future will contain multiple processors. To...
Advanced Program Restructuring for High-Performance Computers with Polaris (1996)
William Blume, Ramon Doallo, Rudolf Eigenmann, John Grout, Jay Hoeflinger, Thomas Lawrence, ...
Multiprocessor computers are rapidly becoming the norm. Parallel workstations are widely available today and it is likely that most PCs in the near future will also be parallel. To accommodate these...
Access Regions: Toward a Powerful Parallelizing Compiler (1996)
Yunheung Paek, Jay Hoeflinger, David Padua
The bulk of the work within a scientific program involves processing data stored in arrays. We present a general and efficient means of representing the region of an array accessed by a section of a...
Advanced Program Restructuring for High-Performance Computers with Polaris (1996)
William Blume, Ramon Doallo, Rudolf Eigenmann, John Grout, Jay Hoeflinger, Thomas Lawrence, ...
Multiprocessor computers are rapidly becoming the norm. Parallel workstations are widely available today and it is likely that most PCs in the near future will also be parallel. To accommodate these...
The Illinois Aggressive Coma Multiprocessor Project (I-ACOMA) (1996)
While scalable shared-memory multiprocessors with hardware-assisted cache coherence are relatively easy to program, if truly high-performance is desired, they still require substantial programmer...
A MATLAB to Fortran 90 Translator and its Effectiveness (1996)
Luiz De Rose, Luiz De Rose, David Padua, David Padua
In this paper, we describe the inference mechanism used by the FALCON system to translate MATLAB 1 programs to Fortran 90. FALCON is a programming environment for the development of scientific...
Restructuring programs for high-speed computers with Polaris (1996)
Bill Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jaejin Lee, Tom Lawrence, ...
The ability to automatically parallelize standard programming languages results in program portability across a wide range of machine architectures. It is the goal of the Polaris project to develop a...
Region-based Parallelization Using the Region Test (1996)
Jay Hoeflinger Yunheung, Jay Hoeflinger, Yunheung Paek, David Padua
In this paper, we discuss the advantages of using array access region summaries to parallelize programs. We reformulate the definition of the types of dependence in terms of array regions, and...
Advanced Program Restructuring for High-Performance Computers with Polaris (1996)
William Blume, Ramon Doallo, Rudolf Eigenmann, John Grout, Jay Hoeflinger, Thomas Lawrence, ...
Multiprocessor computers are rapidly becoming the norm. Parallel workstations are widely available today, and it is likely that most PCs in the near future will contain multiple processors. To...
Advanced Program Restructuring for High-Performance (1996)
Computers With Polaris, William Blume, Ramon Doallo, Rudolf Eigenmann, John Grout, Jay Hoeflinger, ...
Multiprocessor computers are rapidly becoming the norm. Parallel workstations are widely available today and it is likely that most PCs in the near future will also be parallel. To accommodate these...
Parallelizing While Loops for Multiprocessor Systems (1995)
Lawrence Rauchwerger, David Padua
Current parallelizing compilers treat while loops and do loops with conditional exits as sequential constructs because their iteration space is unknown. Motivated by the fact that these types of...
Efficient building and placing of gating functions (1995)
The Gated Single-Assignment (GSA) program representation is an extension of the Static Single Assignment (SSA) representation[CFR+91]. GSA was introduced by Ballance, Maccabe and Ottenstein as a part...
Lawrence Rauchwerger, David Padua
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops...
Parallelizing While Loops for Multiprocessor Systems (1995)
Lawrence Rauchwerger, David Padua
Current parallelizing compilers treat WHILE loops and DO loops with conditional exits as sequential constructs because their iteration space is unknown. Motivated by the fact that these types of...
Lawrence Rauchwerger, David Padua
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops...
Gated SSA-Based Demand-Driven Symbolic Analysis for Parallelizing Compilers (1995)
In this paper, we present a GSA-based technique that performs more efficient and more precise symbolic analysis of predicated assignments, recurrences and index arrays. The efficiency is improved by...
Gated SSA-Based Demand-Driven Symbolic Analysis (1995)
It has become increasingly evident... This paper presents a technique to determine the equality and inequality relations between symbolic expressions. The problem of determining the relationship...
Effective Automatic Parallelization with Polaris (1995)
William Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoeflinger, David Padua, ...
. The Polaris project has delivered a new parallelizing compiler that overcomes severe limitations of current compilers. While available parallelizing compilers may succeed on small kernels, they...
Parallelizing WHILE Loops for Multiprocessor Systems (1995)
Lawrence Rauchwerger, David Padua
Current parallelizing compilers treat while loops and do loops with conditional exits as sequential constructsbecause their iteration space is unknown. Because these types of loops arise frequently...
Efficient Building and Placing of Gating Functions (1995)
Peng Tu And, Peng Tu, David Padua
this paper, we present an almost linear time algorithm to construct the GSA. The new algorithm is more efficient and simpler than the existing algorithms for GSA construction [BMO90, Hav93]. Since...
Eduard Ayguad, Cristina Barrado, Susana Moreno, David Padua, Mateo Valero, Departament Arquitectura De
this paper we describe a strategy that will make it possible, after applying a small number of changes, to represent low-level operations as part of the internal representation of a conventional...
Parallelizing While Loops for Multiprocessor Systems (1995)
Lawrence Rauchwerger, David Padua
Current parallelizing compilers treat while loops and do loops with conditional exits as sequential constructs because their iteration space is unknown. Motivated by the fact that these types of...
Quantitative Analysis of Vector Code (1995)
Roger Espasa, Mateo Valero, David Padua
In this paper we present the results of a detailed simulation study of the execution of vector programs on a single processor of a Convex C3480 machine, using a subset of the Perfect Club benchmarks....
Parallelizing WHILE Loops for Multiprocessor Systems (1995)
Lawrence Rauchwerger, David Padua
Current parallelizing compilers treat WHILE loops and DO loops with conditional exits as sequential constructs because their iteration space is unknown. Motivated by the fact that these types of...
Gated SSA Based Demand-Driven Symbolic Analysis (1995)
Peng Tu And, Peng Tu, David Padua
This paper presents a technique to determine the equality and inequality relations between symbolic expressions. The problem of determining the relationship between symbolic expressions is...
Lawrence Rauchwerger, David Padua
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops...
Parallelizing While Loops for Multiprocessor Systems (1995)
Lawrence Rauchwerger, David Padua
Current parallelizing compilers treat while loops and do loops with conditional exits as sequential constructs because their iteration space is unknown. Motivated by the fact that these types of...
The LRPD Test: Speculative Run--Time Parallelization of Loops (1995)
With Privatization And, Lawrence Rauchwerger, David Padua
Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops...
Polaris: Improving the E ectiveness of Parallelizing Compilers (1994)
William Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoe Inger, David Padua, ...
Abstract. It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small...
Speculative Run-Time Parallelization of Loops (1994)
Lawrence Rauchwerger, David Padua
Current parallelizing compilers cannot identify a signi cant fraction of fully parallel loops because they have complex or statically insu ciently de ned access patterns. Since fully parallel loops...
Polaris: The Next Generation in Parallelizing Compilers (1994)
Bill Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoe, David Padua, ...
It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small kernels,...
Lawrence Rauchwerger, David Padua
Current parallelizing compilers cannot extract a significant fraction of the available parallelism in a loop if it has a complex and/or statically insufficiently defined access pattern. This is an...
The Privatizing DOALL Test: (1994)
Lawrence Rauchwerger, David Padua
Current parallelizing compilers cannot extract a significant fraction of the available parallelism in a loop if it has a complex and/or statically insufficiently defined access pattern. This is an...
Polaris: Improving the Effectiveness of Parallelizing Compilers (1994)
William Blume Rudolf, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoeflinger, David Padua, ...
. It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small...
The Privatizing DOALL Test: (1994)
Lawrence Rauchwerger, David Padua
Current parallelizing compilers cannot identify a significant fraction of fully parallel loops because they have complex or statically insufficiently defined access patterns. For this reason,...
An Inference Mechanism for the Compilation of Interactive Array Languages (1994)
Luiz De Rose, Luiz De Rose, David Padua, David Padua
Interactive array languages are powerful programming tools for the development of programs for numerical computation. They provide an environment that tends to increase productivity in software...
Polaris: The Next Generation in Parallelizing Compilers (1994)
Bill Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoeflinger, David Padua, ...
It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small kernels,...
On the Automatic Parallelization of the Perfect Benchmarks (1994)
Rudolf Eigenmann, Jay Hoeflinger, David Padua
We present a set of advanced program parallelization techniques that are able to significantly improve the performance of application programs. We present evidence for this improvement in terms of...
Lawrence Rauchwerger, David Padua
Current parallelizing compilers cannot identify a significant fraction of fully parallel loops because they have complex or statically insufficiently defined access patterns. For this reason, we have...
Comparing the Performance of the DASH and Cedar Multiprocessors for Scientific Applications (1994)
Josep Torrellas, David Koufaty, David Padua
this paper, we compare the performance of DASH and Cedar under a set of varied parallel scientific loads. We focus on three main differences between the two machines: cache coherence supported in...
Polaris: Improving the Effectiveness of Parallelizing Compilers (1994)
William Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoeflinger, David Padua, ...
. It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small...
Polaris: Improving the Effectiveness of Parallelizing Compilers (1994)
William Blume Rudolf, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoe Inger, David Padua, ...
It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small kernels,...
The Privatizing DOALL Test: (1994)
Lawrence Rauchwerger, David Padua
Current parallelizing compilers cannot extract a significant fraction of the available parallelism in a loop if it has a complex and/or statically insufficiently defined access pattern. This is an...
Speculative Run-Time Parallelization of Loops (1994)
Lawrence Rauchwerger, David Padua
Current parallelizing compilers cannot identify a significant fraction of fully parallel loops because they have complex or statically insufficiently defined access patterns. Since fully parallel...
Comparing the Performance of the DASH and Cedar Multiprocessors for Scientific Applications (1994)
Josep Torrellas, David Koufaty, David Padua
Given all these differences, it is important to comparethe performance of the different machines. Such comparisons can determine the cost-effectiveness of the different choices. As an example, the...
Automatic Detection of Parallelism: A Grand Challenge for High-Performance Computing (1994)
William Blume, Rudolf Eigenmann, Jay Hoe, David Padua, Paul Petersen, Lawrence Rauchwerger, ...
The limited ability of compilers to nd the parallelism in programs is a signi cant barrier to the use of high performance computers. It forces programmers to resort to parallelizing their programs by...
Automatic Detection of Parallelism: A Grand Challenge for High-Performance Computing (1994)
William Blume, Rudolf Eigenmann, Jay Hoe, David Padua, Paul Petersen, Lawrence Rauchwerger, ...
The limited ability of compilers to nd the parallelism in programs is a signi cant barrier to the use of high performance computers. It forces programmers to resort to parallelizing their programs by...
Polaris: The Next Generation in Parallelizing Compilers (1994)
Bill Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoe, David Padua, ...
It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small kernels,...
Automatic array privatization (1993)
Abstract. Array privatization is one of the most e ective transformations for the exploitation of parallelism. In this paper, we present atechnique for automatic array privatization. Our algorithm...
Automatic Array Privatization (1993)
. Array privatization is one of the most effective transformations for the exploitation of parallelism. In this paper, we present a technique for automatic array privatization. Our algorithm uses...
Restructuring Fortran Programs for Cedar (1993)
Rudolf Eigenmann, Jay Hoeflinger, Greg Jaxon, Zhiyuan Li, David Padua
This paper reports on the status of the Fortran translator for the Cedar computer at the end of March, 1991. A brief description of the Cedar Fortran language is followed by a discussion of the...
4th International Workshop on Language and Compilers for Parallel Computing (1992)
Banerjee, Utpal, Gelernter, David, Nicolau, Alexandru, Padua, David
Evaluation Of Parallelizing Compilers (1992)
The recognition and exploitation of parallelism is a difficult problem for restructuring compilers. We present a method for evaluating the effectiveness of parallelizing compilers in general and of...
Experience in the Automatic Parallelization of Four Perfect-Benchmark Programs (1991)
Rudolf Eigenmann, Jay Hoeflinger, Zhiyuan Li, David Padua
. This paper discusses the techniques used to hand-parallelize, for the Alliant FX/80, four Fortran programs from the Perfect-Benchmark suite. The paper also includes the execution times of the...
Restructuring Fortran Programs for Cedar (1991)
Rudolf Eigenmann, Jay Hoe Inger, Greg Jaxon, Zhiyuan Li, David Padua
This paper reports on the status of the Fortran translator for the Cedar computer at the end of March, 1991. A brief description of the Cedar Fortran language is followed by a discussion of the...
Principal Investigators: (1990)
John Bruner, Lloichi Cheong, David Kuck, Er Veidenbawn, Grego O, ...
ti j/t d' ()h,icclive: The objective of the Chief project is to provide an i.tegrated simulation environment for studying and evaluating various issues in designing parallel systems, including...
accepted on the recommendation of (1972)
Christoph Von Praun, Prof Dr, Thomas Gross, Prof Dr, David Padua, ...
ii This dissertation describes an efficient and automated approach to determine synchronization defects in multi-threaded object-oriented programs. The approach is based on the key observation that...
citizen of Germany accepted on the recommendation of (1972)
Prof Dr, Thomas Gross, Prof Dr, David Padua, Prof Dr, Robert Stärk
This dissertation describes an efficient and automated approach to determine synchronization defects in multi-threaded object-oriented programs. The approach is based on the key observation that...
Lung metastasis genes couple breast tumor size and metastatic spread
Minn, Andy J., Gupta, Gaorav P., Padua, David, Bos, Paula, Nguyen, Don X., Nuyten, Dimitry, ...
The association between large tumor size and metastatic risk in a majority of clinical cancers has led to questions as to whether these observations are causally related or whether one is simply a...
A MATLAB to Fortran 90 Translator and its Effectiveness
Luiz De, Luiz De Rose, Luiz De Rose, David Padua, David Padua
In this paper, we describe the inference mechanism used by the FALCON system to translate MATLAB 1 programs to Fortran 90. FALCON is a programming environment for the development of scientific...