David Padua

Design Issues in Parallel Array Languages for Shared Memory ⋆ (2009)

James Brodman, Basilio B. Fraguela, María J. Garzarán, David Padua

Abstract. The Hierarchically Tiled Array (HTA) is a data type that facilitates the definition and manipulation of arrays partitioned into tiles. The data type allows to exploit those tiles to attain...

Optimizing Sorting with Machine Learning Algorithms (2008)

Xiaoming Li, María Jesús Garzarán, David Padua

The growing complexity of modern processors has made the development of highly efficient code increasingly difficult. Manually developing highly efficient code is usually expensive but necessary due...

Hierarchically Tiled Arrays Vs. Intel Threading Building Blocks for Programming Multicore Systems ⋆ (2008)

Diego Andrade, James Brodman, Basilio B. Fraguela, David Padua

Abstract. Multicore systems are now the norm. Programmers can no longer rely on faster clock rates to speed up their applications. Thus, software developers are increasingly forced to face the...

Supervisor (2008)

Peng Zhao, Peng Zhao, David Padua, Bruce Cockburn, Jonathan Schaeffer, Duane Szafron

Permission is hereby granted to the University of Alberta Library to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes...

Evaluating the Impact of Thread Escape Analysis on a Memory Consistency Model-aware Compiler (2008)

Chi-leung Wong, Zehra Sura, Xing Fang, Kyungwoo Lee, Samuel P. Midkiff, Jaejin Lee, ...

Abstract. The widespread popularity of languages allowing explicitly parallel, multi-threaded programming, e.g. Java and C#, have focused attention on the issue of memory model design. The Pensieve...

HiLO: High Level Optimization of FFTs ⋆ (2008)

Nick Rizzolo, David Padua

Abstract. As computing platforms become more and more complex, the task of optimizing performance critical codes becomes more challenging. Recently, more attention has been focused on automating this...

Automatic Generation and Adaptation of High-Performance Implementations of Signal Processing Transforms (2008)

Bryan Singer, Jianxin Xiong, Jeremy Johnson, David Padua, Manuela Veloso, ...

Abstract The paper describes SPIRAL, a generator of libraries for fast software implementations of signal processing transforms. These libraries are adapted to the computing platform and can be...

Automatic Generation of a Parallel Sorting Algorithm ∗ (2008)

Brian A. Garber, Dan Hoeflinger, Xiaoming Li, María Jesús Garzarán, David Padua

In this paper, we discuss a library generator for parallel sorting routines that examines the input characteristics (and the parameters they affect) to select the best performing algorithm. Our...

Languages (2008)

Basilio B. Fraguela, Jia Guo, Ganesh Biksh, María J. Garzarán, Gheorghe Almási, José Moreira, ...

In this paper, we show our initial experience with a class of objects, called Hierarchically Tiled Arrays (HTAs), that encapsulate parallelism. HTAs allow the construction of single-threaded parallel...

An advanced compiler framework for non-cache-coherent multiprocessors (2008)

Yunheung Paek, Angeles Navarro, Emilio Zapata, Senior Member, Jay Hoeflinger, David Padua

AbstractÐThe Cray T3D and T3E are non-cache-coherent (NCC) computers with a NUMA structure. They have been shown to exhibit a very stable and scalable performance for a variety of application...

PROCEEDINGS OF THE IEEE SPECIAL ISSUE ON PROGRAM GENERATION, OPTIMIZATION, AND ADAPTATION 1 SPIRAL: Code Generation for DSP Transforms (2008)

Markus Püschel, Jeremy Johnson, David Padua, Manuela Veloso, Bryan W. Singer, ...

Abstract — Fast changing, increasingly complex, and diverse computing platforms pose central problems in scientific computing: How to achieve, with reasonable effort, portable optimal performance?...

1.1 Hierarchically Tiled Arrays (2008)

Ganesh Biksh, Jia Guo, Christoph Von Praun, Gabriel Tanase, Basilio B. Fraguela, María J. Garzarán, ...

Abstract. Hierarchically Tiled Arrays (HTAs) are data structures that facilitate locality and parallelism of array intensive computations with block-recursive nature. The model underlying HTAs...

Hierarchically Tiled Array Vs. Intel Thread Building Blocks for Multicore Systems Programming (2008)

Diego Andrade, James Brodman, Basilio B. Fraguela, David Padua

Multicore systems are becoming common, while programmers cannot rely on growing clock rate to speed up their application. Thus, software developers are increasingly exposed to the complexity...

PROCEEDINGS OF THE IEEE SPECIAL ISSUE ON PROGRAM GENERATION, OPTIMIZATION, AND ADAPTATION 1 SPIRAL: Code Generation for DSP Transforms (2008)

Markus Püschel, Jeremy Johnson, David Padua, Manuela Veloso, Bryan W. Singer, ...

Abstract — Fast changing, increasingly complex, and diverse computing platforms pose central problems in scientific computing: How to achieve, with reasonable effort, portable optimal performance?...

HiLO: High Level Optimization of FFTs ⋆ (2008)

Nick Rizzolo, David Padua

Abstract. As computing platforms become more and more complex, the task of optimizing performance critical codes becomes more challenging. Recently, more attention has been focused on automating this...

under Awards ACR-0234293, ITR/NGS-0325687, and SYS-310941. (2008)

Markus Püschel, Jeremy R. Johnson, David Padua, Robert W. Johnson, Nicholas Rizzolo

Fast changing, increasingly complex, and diverse computing platforms pose central problems in scientific computing: How to achieve, with reasonable effort, portable optimal performance? We present...

Automatic derivation and implementation of signal processing algorithms (2008)

Sebastian Egner, David Padua

We present a computer algebra framework to automatically derive and implement algorithms for digital signal processing. The two main parts of the framework are AREP, a library for symbolic...

1.1 Hierarchically Tiled Arrays (2008)

Ganesh Biksh, Jia Guo, Christoph Von Praun, Gabriel Tanase, Basilio B. Fraguela, María J. Garzarán, ...

Abstract. Hierarchically Tiled Arrays (HTAs) are data structures that facilitate locality and parallelism of array intensive computations with block-recursive nature. The model underlying HTAs...

Hierarchically Tiled Arrays for Parallelism and Locality ∗ (2008)

Jia Guo, Ganesh Biksh, Daniel Hoeflinger, Gheorghe Almasi, Basilio Fraguela, María Jesús Garzarán, ...

Parallel programming is facilitated by constructs which, unlike the widely used SPMD paradigm, provide programmers with a global view of the code and data structures. These constructs could be...

On the Automatic Parallelization of the Perfect Benchmarks R Rudolf Eigenmann y Jay Hoe inger (2008)

David Padua

This paper presents the results of the Cedar Hand-Parallelization Experiment, conducted

Abstract Gated SSA-Based Demand-Driven Symbolic Analysis for Parallelizing Compilers * (2008)

Peng Tu, David Padua

In this paper, we present a GSA-based technique that performs more efficient and more precise symbolic analysis of predicated assignments, recurrences and index arrays. The efficiency is improved by...

Abstract (2008)

Luiz De Rose, David Padua

In this paper, we describe the inference mechanism used by the FALCON system to translate MATLABl programs to Fortran 90. FALCON is a programming environment for the development of scientific...

languages. They are Dennis Gannon from Indiana (2008)

Equipment Corporation, David Padua

Introductory paper to the session of this title at the Workshop on Challenges for Parallel Processing, International Conference on Parallel Processing, 1996 In this workshop session, three speakers...

Benchmarking FALCON's MATLAB-to-Fortran 90 Compiler on an SGI Power Challenge (2008)

Luiz De Rose, David Padua

This paper presents an overview of the FALCON MATLAB-to-Fortran 90 compiler. FALCON is a programming environment for the development of high-performance scientific programs. It combines static and...

Programming with tiles (2008)

Jia Guo, Ganesh Biksh, Basilio B. Fraguela, María J. Garzarán, David Padua

The importance of tiles or blocks in scientific computing cannot be overstated. Many algorithms, both iterative and recursive, can be expressed naturally if tiles are represented explicitly. From the...

Rakesh Kumar (2008)

Vikram S. Adve, Gul Agha, Matthew I. Frank, María Jesús Garzarán, John C. Hart, ...

Illinois history in parallel computing stretches more than 40 years. From the first academic parallel supercomputer, the ILLIAC IV started in 1964, to today’s work to install the first petascale...

Code Study: Automatic Parallelism Detection in Dyfesm (2007)

Yuan Lin And, Yuan Lin, David Padua

This paper reports the results of our study on the Perfect Benchmarks code DYFESM in order to determine how eective automatic parallelism detection techniques are on this code.

A Simple Framework to Calculate the Reaching Definition of Array References and Its Use in Subscript Array Analysis (2007)

Yuan Lin, David Padua

. The property analysis of subscript arrays can be used to facilitate the automatic detection of parallelism in sparse/irregular programs that use indirectly accessed arrays. In order for property...

Benchmarking FALCON's MATLAB-to-Fortran 90 Compiler on an SGI Power Challenge (2007)

Luiz De Rose, David Padua

This paper presents an overview of the FALCON MATLAB-to-Fortran 90 compiler. FALCON is a programming environment for the development of high-performance scienti c programs. It combines static and...

2 (2007)

Jeremy Johnson, Robert W. Johnson, David Padua, Jianxin Xiong

Abstract. This paper discuss an approach to implementing and optimizing fast signal transforms based on a domain specific computer language, called SPL. SPL programs, which are essentially...

3 (2007)

Jianxin Xiong, Jeremy Johnson, Robert Johnson, David Padua

We discuss the design and implementation of a compiler that translates formulas representing signal processing transforms into ecient C or Fortran programs. The formulas are represented in a language...

Automatic derivation and implementation of signal processing algorithms (2007)

Sebastian Egner, David Padua

We present a computer algebra framework to automatically derive and implement algorithms for digital signal processing. The two main parts of the framework are AREP, a library for symbolic...

Abstract (2007)

Yunheung Paek, Jay Hoeflinger, Angeles Navarro, Emilio Zapata, David Padua

The Cray T3D and T3E are non-cache-coherent (NCC) computers with a NUMA structure. They have been shown to exhibit very stable and scalable performance for a variety of application programs....

Analyzing Java Arrays: Combness and Synchness Analysis (2007)

Peng Wu Paul, Paul Feautrier, David Padua

In this paper, we propose two new abstractions, combness and synchness, for analyzing Java arrays, which capture element-wise aliasing relation within an array or between two arrays. combness and...

Benchmarking FALCON's MATLAB-to-Fortran 90 Compiler on an SGI Power Challenge (2007)

Luiz De, Luiz De Rose, David Padua

This paper presents an overview of the FALCON MATLAB-to-Fortran 90 compiler. FALCON is a programming environment for the development of high-performance scientific programs. It combines static and...

Programming the FlexRAM Intelligent Memory Architecture (2007)

Basilio B. Fraguela, Jose Renau, Paul Feautrier, David Padua, Josep Torrellas

Advances in VLSI technology have enabled the implementation of computer architectures based on Processing-in-Memory (PIM) chips. These architectures can help bridge the growing gap between processor...

Points-to Analysis for Java with Applications to Loop Optimizations (2007)

Peng Wu, Paul Feautrier, David Padua, Zehra Sura

If Java is to be used for high performance computing, we must enable classical loop optimizations as well as eliminate those features of Java which are detrimental to good performance, like...

Abstract Gated SSA-Based Demand-Driven Symbolic Analysis for Parallelizing Compilers (2007)

Peng Tu, David Padua

In this paper, we present a GSA-based technique that performs more e cient and more precise symbolic analysis of predicated assignments, recurrences and index arrays. The e ciency is improved by...

On the Automatic Parallelization of the Perfect Benchmarks R Rudolf Eigenmann y Jay Hoe inger (2007)

David Padua

We present a set of advanced program parallelization techniques that are able to signi cantly improve the performance of application programs. We present evidence for this improvement in terms of the...

On the Automatic Parallelization of (2007)

Rudolf Eigenmann, Jay Hoeflinger, David Padua, Senior Member

transformed the Perfect Benchmarks ® into parallel program versions. In doing so, we used techniques that may be automated in an optimizing compiler. We then ran these programs on the Cedar...

Programming for parallelism and locality with hierarchically tiled arrays (2006)

Ganesh Biksh, Jia Guo, Daniel Hoeflinger, Gheorghe Almasi, Basilio B. Fraguela, María J. Garzarán, ...

Tiling has proven to be an effective mechanism to develop high performance implementations of algorithms. Tiling can be used to organize computations so that communication costs in parallel programs...

Programming for Parallelism and Locality with Hierarchically Tiled Arrays (2006)

Ganesh Bikshandi, Jia Guo, Daniel Hoeflinger, Gheorghe Almasi, Basilio B. Fraguela, Mara J. Garzaran, ...

Tiling has proven to be an effective mechanism to develop high performance implementations of algorithms. Tiling can be used to organize computations so that communication costs in parallel programs...

Is Search Really Necessary to Generate High-Performance BLAS? (2005)

Kamen Yotov, Xiaoming Li, Gang Ren, Maria Garzaran, David Padua, Keshav Pingali, ...

Abstract — A key step in program optimization is the estimation of optimal values for parameters such as tile sizes and loop unrolling factors. Traditional compilers use simple analytical models to...

A Language for the Compact Representation of Multiple Program Versions (2005)

Sebastien Donadio, James Brodman, Thomas Roeder, Kamen Yotov, Denis Barthou, Albert Cohen, ...

As processor complexity increases compilers tend to deliver suboptimal performance. Library generators such as ATLAS, FFTW and SPIRAL overcome this issue by empirically searching in the space of...

In Search of a Program Generator to Implement Generic Transformations for High-performance Computing (2005)

Albert Cohen, Sebastien Donadio, Maria-Jesus Garzaran, Christoph Herrmann, Oleg Kiselyov, David Padua

The quality of compiler-optimized code for high-performance applications is far behind what optimization and domain experts can achieve by hand. Although it may seem surprising at first glance, the...

A Language for the Compact Representation of Multiple Program Versions (2005)

Sebastien Donadio, James Brodman, Thomas Roeder, Kamen Yotov, Denis Barthou, Albert Cohen, ...

As processor complexity increases compilers tend to deliver suboptimal performance. Library generators such as ATLAS, FFTW and SPIRAL overcome this issue by empirically searching in the space of...

Analytic Models and Empirical Search: A Hybrid Approach to Code Optimization (2005)

Arkady Epshteyn, Maria Garzaran, Gerald Dejong, David Padua, Gang Ren, Xiaoming Li, ...

Compilers employ system models, sometimes implicitly, to make code optimization decisions. These models are analytic; they reflect their implementor 's understanding and beliefs of the system....

Optimizing Sorting with Genetic Algorithms (2005)

Xiaoming Li Mar, Xiaoming Li, María Jesús Garzarán, David Padua

The growing complexity of modern processors has made the generation of highly efficient code increasingly difficult. Manual code generation is very time consuming, but it is often the only choice...

Is Search Really Necessary to Generate High-Performance BLAS? (2005)

Kamen Yotov, Xiaoming Li, Gang Ren, Maria Jesus Garzaran, David Padua, Member Ieee, ...

this paper was arrived at by studying the ATLAS source code. In case of any discrepancy between this description and how the ATLAS system is actually implemented, the documentation of the ATLAS...

In Search of a Program Generator to Implement Generic Transformations for High-performance Computing (2005)

Albert Cohen, Sebastien Donadio, Maria-Jesus Garzaran, Christoph Herrmann, Oleg Kiselyov, David Padua

The quality of compiler-optimized code for high-performance applications is far behind what optimization and domain experts can achieve by hand. Although it may seem surprising at first glance, the...

Parallel mining of closed sequential patterns (2005)

Shengnan Cong, Jiawei Han, David Padua

Discovery of sequential patterns is an essential data mining task with broad applications. Among several variations of sequential patterns, closed sequential pattern is the most useful one since it...

A samplingbased framework for parallel data mining (2005)

Shengnan Cong, Jiawei Han, Jay Hoeflinger, David Padua

The goal of data mining algorithm is to discover useful information embedded in large databases. Frequent itemset mining and sequential pattern mining are two important data mining problems with...

A framework for measuring supercomputer productivity (2004)

Marc Snir, David A. Bader, Marc Snir, David A. Bader, James C. Browne, Brad Chamberlain, ...

We propose a framework for measuring the productivity of high performance computing (HPC) systems, based on common economic definitions of productivity and on utility theory. We discuss how these...

In search of a program generator to implement generic transformations for high-performance computing (2004)

Albert Cohen, Sébastien Donadio, Maria-jesus Garzaran, Christoph Herrmann, David Padua

The quality of compiler-optimized code for high-performance applications lags way behind what optimization and domain experts can achieve by hand. This paper explores in-between solutions, besides...

SPIRAL: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms (2004)

Markus Püschel, Bryan Singer, Jianxin Xiong, Jeremy Johnson, David Padua, ...

SPIRAL is a generator for libraries of fast software implementations of linear signal processing transforms. These libraries are adapted to the computing platform and can be re-optimized as the...

Performance modeling and programming environments for petaflops computers and the blue gene machine (2004)

Gengbin Zheng, Terry Wilmarth, Orion Sky Lawlor, Laxmikant V. Kalé, Sarita Adve, David Padua

We present a performance modeling and programming environment for petaflops computers and the Blue Gene machine. It consists of a parallel simulator, BigSim, for predicting performance of machines...

SPIRAL: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms (2004)

Markus Püschel, Bryan Singer, Jianxin Xiong, Jeremy Johnson, David Padua, ...

SPIRAL is a generator for libraries of fast software implementations of linear signal processing transforms. These libraries are adapted to the computing platform and can be re-optimized as the...

SPIRAL: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms (2004)

Markus Püschel, Bryan Singer, Jianxin Xiong, Jeremy Johnson, David Padua, ...

SPIRAL is a generator of libraries of fast software implementations of linear signal processing transforms. These libraries are adapted to the computing platform and can be re-optimized as the...

SPIRAL: A Generator for Platform-Adapted Libraries of Signal Processing Algorithms (2004)

Markus Püschel, Bryan Singer, Jianxin Xiong, Jeremy Johnson, David Padua, ...

SPIRAL is a generator of libraries of fast software implementations of linear signal processing transforms. These libraries are adapted to the computing platform and can be re-optimized as the...

Implementation of Parallel Numerical Algorithms Using Hierarchically Tiled Arrays (2004)

Ganesh Bikshandi, Basilio B. Fraguela, Jia Guo, Maria J. Garzaran, Gheorghe Almasi, Jose Moreira, ...

In this paper, we describe our experience in writing parallel numerical algorithms using Hierarchically Tiled Arrays (HTAs). HTAs are classes of objects that encapsulate parallelism. HTAs allow the...

The Hierarchically Tiled Arrays Programming Approach (2004)

Basilio B. Fraguela, Jia Guo, Ganesh Bikshandi, Maria J. Garzaran, Gheorghe Almasi, Jose Moreira, ...

In this paper, we show our initial experience with a class of objects, called Hierarchically Tiled Arrays (HTAs), that encapsulate parallelism. HTAs allow the construction of single-threaded parallel...

In Search of a Program Generator to Implement Generic Transformations for High-performance Computing (2004)

Albert Cohen, Sebastien Donadio, Maria-Jesus Garzaran, Christoph Herrmann, David Padua

The quality of compiler-optimized code for high-performance applications lags way behind what optimization and domain experts can achieve by hand. This paper explores in-between solutions, besides...

A Dynamically Tuned Sorting Library (2004)

Xiaoming Li, Maria Jesus Garzaran, David Padua

Empirical search is a strategy used during the installation of library generators such as ATLAS, FFTW, and SPIRAL to identify the algorithm or the version of an algorithm that delivers the best...

Implementation of Parallel Numerical Algorithms Using Hierarchically Tiled Arrays (2004)

Ganesh Bikshandi, Basilio B. Fraguela, Jia Guo, María J. Garzarán, Gheorghe Almási, José Moreira, ...

Abstract. In this paper, we describe our experience in writing parallel numerical algorithms using Hierarchically Tiled Arrays (HTAs). HTAs are classes of objects that encapsulate parallelism. HTAs...

Programming for locality and parallelism with hierarchically tiled arrays (2003)

Gheorghe Almasi, Luiz De Rose, Jose Moreira, David Padua

This paper introduces a new primitive data type, hierarchically tiled arrays (HTAs), which could be incorporated into conventional languages to facilitate parallel programing and programming for...

The Power of Belady’s Algorithm in Register Allocation for Long Basic Blocks (2003)

Jia Guo, María Jesús Garzarán, David Padua

Abstract. Optimization techniques such as loop-unrolling and trace-scheduling can result in long straight-line codes. It is, however, unclear how well the register allocation algorithms of current...

Programming for locality and parallelism with hierarchically tiled arrays (2003)

Gheorghe Almási, Luiz De Rose, Basilio B. Fraguela, José Moreira, David Padua

Abstract. This paper introduces a new primitive data type, hierarchically tiled arrays (HTAs), which could be incorporated into conventional languages to facilitate parallel programming and...

A preliminary study on the vectorization of multimedia applications for multimedia extensions (2003)

Gang Ren, Peng Wu, David Padua

Abstract. In 1994, the first multimedia extension, MAX-1, was introduced to general-purpose processors by HP. Almost ten years have passed, the present means of accessing the computing power of...

Programming the FlexRAM Parallel Intelligent Memory System (2003)

Basilio B. Fraguela, Jose Renau, Paul Feautrier, David Padua, Josep Torrellas

Intelligent memory architectures enhance the memory chips of a computer with many simple processors. The result is a highly-parallel, heterogeneous machine that is able to exploit computation in...

Programming for locality and parallelism with hierarchically tiled arrays (2003)

Gheorghe Almasi, Luiz De Rose, Jose Moreira, David Padua

This paper introduces a new primitive data type, hierarchically tiled arrays (HTAs), which could be incorporated into conventional languages to facilitate parallel programing and programming for...

A preliminary study on the vectorization of multimedia applications for multimedia extensions (2003)

Gang Ren, Peng Wu, David Padua

Abstract. In 1994, the first multimedia extension, MAX-1, was introduced to general-purpose processors by HP. Almost ten years have passed, the present means of accessing the computing power of...

Programming the FlexRAM Parallel Intelligent Memory System (2003)

Basilio B. Fraguela, Jose Renau, Paul Feautrier, David Padua, Josep Torrellas

In an intelligent memory architecture, the main memory of a computer is enhanced with many simple processors. The result is a highly-parallel, heterogeneous machine that is able to exploit...

Intraprocedural Pointer Analysis for Container-Centric Applications (2001)

Cohen, Albert, Wu, Peng, Padua, David

As programmers look forward to designing high performance applications with object-oriented models, compilers must support higher-level analyses and optimizations. Pointer analysis for...

Intraprocedural Pointer Analysis for Container-Centric Applications (2001)

Cohen, Albert, Wu, Peng, Padua, David

As programmers look forward to designing high performance applications with object-oriented models, compilers must support higher-level analyses and optimizations. Pointer analysis for...

Intraprocedural Pointer Analysis for Container-Centric Applications (2001)

Cohen, Albert, Wu, Peng, Padua, David

As programmers look forward to designing high performance applications with object-oriented models, compilers must support higher-level analyses and optimizations. Pointer analysis for...

Induction variable analysis without idiom recognition: Beyond monotonicity (2001)

Peng Wu, Albert Cohen, David Padua

This work is an extension of the previous induction variable analyses based on monotonic evolution [11]. With the same computational complexity, the new algorithm improves the monotonic...

A compiler for multiple memory models (2001)

Samuel P. Midkiff, Jaejin Lee, David Padua

The design of consistency models for both hardware and software is a difficult task. For programming languages it is particularly difficult because the target audience for a programming language is...

Searching for the Best FFT Formulas with the SPL Compiler (2001)

Jeremy Johnson, Robert W. Johnson, David Padua, Jianxin Xiong

This paper discuss an approach to implementing and optimizing fast signal transforms based on a domain speci c computer language, called SPL. SPL programs, which are essentially mathematical...

Induction Variable Analysis without Idiom Recognition: Beyond Monotonicity (2001)

Peng Wu, Albert Cohen, David Padua

Traditional induction variables (IV) analyses focus on computing the closed form expressions of variables. This paper presents a new IV analysis based on an IV property called distance interval ....

Monotonic Evolution: an Alternative to Induction Variable Substitution for Dependence Analysis (2001)

Peng Wu, Albert Cohen, Jay Hoeflinger, Usa Intelamericas, Inc Xkaisoftware, ...

We present a new approach to dependence testing in the presence of induction variables. Instead of looking for closed form expressions, our method computes monotonic evolution which captures the...

Pointer Analysis for Monotonic Container Traversals (2001)

Albert Cohen Peng, Peng Wu, David Padua

As programmers look forward to designing high performance applications with object-oriented models, compilers must support higher-level analyses and optimizations. Pointer analysis for...

Efficient and precise array access analysis (2000)

Yunheung Paek, Jay Hoeflinger, David Padua

A number of existing compiler techniques hinge on the analysis of array accesses in a program. The most important task in array access analysis is to collect the information about array accesses of...

Analysis of irregular singleindexed array accesses and its applications in compiler optimizations (2000)

Yuan Lin, David Padua

Abstract. Many compiler techniques usually require analysis of array subscripts to determine whether a transformation is legal. Traditional methods require the array subscript expressions to be...

Techniques for the translation of MATLAB programs into Fortran 90 (1999)

Luiz De Rose, David Padua

This article describes the main techniques developed for FALCON’s MATLAB-to-Fortran 90 compiler. FALCON is a programming environment for the development of high-performance scientific programs. It...

Containers on the Parallelization of General-purpose Java Programs (1999)

Peng Wu, David Padua

Static parallelization of general-purpose programs is still impossible, in general, due to their common use of pointers, irregular data structures, and complex control-flows. One promising strategy...

Access Descriptor based Locality Analysis for Distributed-Shared Memory Multiprocessors (1999)

Angeles G. Navarro, Rafael Asenjo, Emilio L. Zapata, David Padua

Most of today's multiprocessors have a DistributedShared Memory (DSM) organization, which enables scalability while retaining the convenience of the shared-memory programming paradigm. Data...

Containers on the Parallelization of General-purpose Java Programs (1999)

Peng Wu David, David Padua

Automatic parallelization of general-purpose programs is still not possible in general in the presence of irregular data structures and complex control-ows. One promising strategy is tread-level data...

On the automatic parallelization of the perfect benchmarks (1998)

Rudolf Eigenmann, Jay Hoeflinger, David Padua

This paper presents the results of the Cedar Hand-Parallelization Experiment, conducted

On the Automatic Parallelization of Sparse and Irregular Fortran Programs (1998)

Yuan Lin, David Padua

. Automatic parallelization is usually believed to be less effective at exploiting implicit parallelism in sparse/irregular programs than in their dense/regular counterparts. However, not much is...

Parallelization of Benchmarks for Scalable Shared-Memory Multiprocessors (1998)

Y. Paek, A. Navarro, E.L. Zapata, D. Padua, Yunheung Paek, Angeles Navarro, ...

This work identifies practical compiling techniques for scalable shared memory machines. For this, we have focused on experimental studies using a real machine and representative codes. In the...

Simplification of Array Access Patterns for Compiler Optimizations (1998)

Yunheung Paek, Jay Hoeflinger, David Padua

Existing array region representation techniques are sensitive to the complexity of array subscripts. In general, these techniques are very accurate and efficient for simple subscript expressions, but...

Performance Analysis for Polaris on Distributed-Memory Multiprocessors (1997)

A.G. Navarro, D. Padua, Y. Paek, E.L. Zapata, Angeles G. Navarro, Yunheung Paek, ...

The Polaris restructurer transforms conventional Fortran programs into parallel form for various types of multiprocessor systems. This paper presents the results of a study on strategies to improve...

Compiler Techniques for Effective Communication on Distributed-Memory Multiprocessors (1997)

A.G. Navarro, Y. Paek, E.L. Zapata, D. Padua, Angeles G. Navarro, Yunheung Paek, ...

The Polaris restructurer transforms conventional Fortran programs into parallel form for various types of multiprocessor systems. This paper presents the results of a study on strategies to improve...

On the Automatic Parallelization of Sparse and Irregular Fortran Codes (1997)

Rafael Asenjo, E. Gutierrez, Y. Lin, D. Padua, B. Pottenger, E.L. Zapata, ...

This paper studies how well automatic parallelization techniques work on a collection of real codes with sparse and irregular access patterns. In conducting this work, we have compared existing...

Compiler Techniques for Effective Communication on Distributed-Memory Multiprocessors (1997)

Angeles G. Navarro, Yunheung Paek, Emilio L. Zapata, David Padua

The Polaris restructurer transforms conventionalFortran programs into parallel form for various types of multiprocessor systems. This paper presents the results of a study on strategies to improve...

On the Automatic Parallelization of Sparse and Irregular Fortran Codes (1997)

Rafael Asenjo, Eladio Gutierrez, Yuan Lin, David Padua, Bill Pottenger, Emilio Zapata

This paper studies howwell automatic parallelization techniques work on a collection of real codes with sparse and irregular access patterns. In conducting this work, wehave compared existing...

Access Regions: Towards aPowerful Parallelizing Compiler (1996)

Yunheung Paek, Jay Hoe Inger, David Padua

The bulk of the work within a scienti c program involves processing data stored in arrays. We present a general and e cient means of representing the region of an array accessed by a section of a...

Region-based parallelization using the region test (1996)

Jay Hoe Inger, Yunheung Paek, David Padua

In this paper, we discuss the advantages of using array access region summaries to parallelize programs. We reformulate the de nition of the types of dependence in terms of array regions, and present...

On the Automatic Parallelization of Sparse and Irregular Fortran Codes (1996)

Rafael Asenjo, Eladio Gutierrez, Yuan Lin, David Padua, Bill Pottenger, Emilio Zapata, ...

Irregular memory access patterns have traditionally caused difficulties in the automatic detection of parallelism, and in many cases parallelization is prevented. These problems are nonetheless...

Restructuring programs for high-speed computers with Polaris (1996)

Bill Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jaejin Lee, Tom Lawrence, ...

The ability to automatically parallelize standard programming languages results in program portability across a wide range of machine architectures. It is the goal of the Polaris project to develop a...

Advanced Program Restructuring for High-Performance Computers with Polaris (1996)

William Blume, Ramon Doallo, Rudolf Eigenmann, John Grout, Jay Hoeflinger, Thomas Lawrence, ...

Multiprocessor computers are rapidly becoming the norm. Parallel workstations are widely available today, and it is likely that most PCs in the near future will contain multiple processors. To...

Advanced Program Restructuring for High-Performance Computers with Polaris (1996)

William Blume, Ramon Doallo, Rudolf Eigenmann, John Grout, Jay Hoeflinger, Thomas Lawrence, ...

Multiprocessor computers are rapidly becoming the norm. Parallel workstations are widely available today and it is likely that most PCs in the near future will also be parallel. To accommodate these...

Access Regions: Toward a Powerful Parallelizing Compiler (1996)

Yunheung Paek, Jay Hoeflinger, David Padua

The bulk of the work within a scientific program involves processing data stored in arrays. We present a general and efficient means of representing the region of an array accessed by a section of a...

Advanced Program Restructuring for High-Performance Computers with Polaris (1996)

William Blume, Ramon Doallo, Rudolf Eigenmann, John Grout, Jay Hoeflinger, Thomas Lawrence, ...

Multiprocessor computers are rapidly becoming the norm. Parallel workstations are widely available today and it is likely that most PCs in the near future will also be parallel. To accommodate these...

The Illinois Aggressive Coma Multiprocessor Project (I-ACOMA) (1996)

Josep Torrellas, David Padua

While scalable shared-memory multiprocessors with hardware-assisted cache coherence are relatively easy to program, if truly high-performance is desired, they still require substantial programmer...

A MATLAB to Fortran 90 Translator and its Effectiveness (1996)

Luiz De Rose, Luiz De Rose, David Padua, David Padua

In this paper, we describe the inference mechanism used by the FALCON system to translate MATLAB 1 programs to Fortran 90. FALCON is a programming environment for the development of scientific...

Restructuring programs for high-speed computers with Polaris (1996)

Bill Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jaejin Lee, Tom Lawrence, ...

The ability to automatically parallelize standard programming languages results in program portability across a wide range of machine architectures. It is the goal of the Polaris project to develop a...

Region-based Parallelization Using the Region Test (1996)

Jay Hoeflinger Yunheung, Jay Hoeflinger, Yunheung Paek, David Padua

In this paper, we discuss the advantages of using array access region summaries to parallelize programs. We reformulate the definition of the types of dependence in terms of array regions, and...

Advanced Program Restructuring for High-Performance Computers with Polaris (1996)

William Blume, Ramon Doallo, Rudolf Eigenmann, John Grout, Jay Hoeflinger, Thomas Lawrence, ...

Multiprocessor computers are rapidly becoming the norm. Parallel workstations are widely available today, and it is likely that most PCs in the near future will contain multiple processors. To...

Advanced Program Restructuring for High-Performance (1996)

Computers With Polaris, William Blume, Ramon Doallo, Rudolf Eigenmann, John Grout, Jay Hoeflinger, ...

Multiprocessor computers are rapidly becoming the norm. Parallel workstations are widely available today and it is likely that most PCs in the near future will also be parallel. To accommodate these...

Parallelizing While Loops for Multiprocessor Systems (1995)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers treat while loops and do loops with conditional exits as sequential constructs because their iteration space is unknown. Motivated by the fact that these types of...

Efficient building and placing of gating functions (1995)

Peng Tu, David Padua

The Gated Single-Assignment (GSA) program representation is an extension of the Static Single Assignment (SSA) representation[CFR+91]. GSA was introduced by Ballance, Maccabe and Ottenstein as a part...

The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization (1995)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops...

Parallelizing While Loops for Multiprocessor Systems (1995)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers treat WHILE loops and DO loops with conditional exits as sequential constructs because their iteration space is unknown. Motivated by the fact that these types of...

The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization (1995)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops...

Gated SSA-Based Demand-Driven Symbolic Analysis for Parallelizing Compilers (1995)

Peng Tu, David Padua

In this paper, we present a GSA-based technique that performs more efficient and more precise symbolic analysis of predicated assignments, recurrences and index arrays. The efficiency is improved by...

Gated SSA-Based Demand-Driven Symbolic Analysis (1995)

Peng Tu, David Padua

It has become increasingly evident... This paper presents a technique to determine the equality and inequality relations between symbolic expressions. The problem of determining the relationship...

Effective Automatic Parallelization with Polaris (1995)

William Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoeflinger, David Padua, ...

. The Polaris project has delivered a new parallelizing compiler that overcomes severe limitations of current compilers. While available parallelizing compilers may succeed on small kernels, they...

Parallelizing WHILE Loops for Multiprocessor Systems (1995)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers treat while loops and do loops with conditional exits as sequential constructsbecause their iteration space is unknown. Because these types of loops arise frequently...

Efficient Building and Placing of Gating Functions (1995)

Peng Tu And, Peng Tu, David Padua

this paper, we present an almost linear time algorithm to construct the GSA. The new algorithm is more efficient and simpler than the existing algorithms for GSA construction [BMO90, Hav93]. Since...

A Uniform Internal Representation for High-Level and Instruction-Level Transformations Eduard Ayguadé, Cristina Barrado, Jesús Labarta, David López, Susana Moreno, David Padua, and Mateo Valero (1995)

Eduard Ayguad, Cristina Barrado, Susana Moreno, David Padua, Mateo Valero, Departament Arquitectura De

this paper we describe a strategy that will make it possible, after applying a small number of changes, to represent low-level operations as part of the internal representation of a conventional...

Parallelizing While Loops for Multiprocessor Systems (1995)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers treat while loops and do loops with conditional exits as sequential constructs because their iteration space is unknown. Motivated by the fact that these types of...

Quantitative Analysis of Vector Code (1995)

Roger Espasa, Mateo Valero, David Padua

In this paper we present the results of a detailed simulation study of the execution of vector programs on a single processor of a Convex C3480 machine, using a subset of the Perfect Club benchmarks....

Parallelizing WHILE Loops for Multiprocessor Systems (1995)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers treat WHILE loops and DO loops with conditional exits as sequential constructs because their iteration space is unknown. Motivated by the fact that these types of...

Gated SSA Based Demand-Driven Symbolic Analysis (1995)

Peng Tu And, Peng Tu, David Padua

This paper presents a technique to determine the equality and inequality relations between symbolic expressions. The problem of determining the relationship between symbolic expressions is...

The LRPD Test: Speculative Run-Time Parallelization of Loops with Privatization and Reduction Parallelization (1995)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops...

Parallelizing While Loops for Multiprocessor Systems (1995)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers treat while loops and do loops with conditional exits as sequential constructs because their iteration space is unknown. Motivated by the fact that these types of...

The LRPD Test: Speculative Run--Time Parallelization of Loops (1995)

With Privatization And, Lawrence Rauchwerger, David Padua

Current parallelizing compilers cannot identify a significant fraction of parallelizable loops because they have complex or statically insufficiently defined access patterns. As parallelizable loops...

Polaris: Improving the E ectiveness of Parallelizing Compilers (1994)

William Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoe Inger, David Padua, ...

Abstract. It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small...

Speculative Run-Time Parallelization of Loops (1994)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers cannot identify a signi cant fraction of fully parallel loops because they have complex or statically insu ciently de ned access patterns. Since fully parallel loops...

Polaris: The Next Generation in Parallelizing Compilers (1994)

Bill Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoe, David Padua, ...

It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small kernels,...

The Privatizing DOALL Test: A Run-Time Technique for DOALL Loop Identification and Array Privatization (1994)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers cannot extract a significant fraction of the available parallelism in a loop if it has a complex and/or statically insufficiently defined access pattern. This is an...

The Privatizing DOALL Test: (1994)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers cannot extract a significant fraction of the available parallelism in a loop if it has a complex and/or statically insufficiently defined access pattern. This is an...

Polaris: Improving the Effectiveness of Parallelizing Compilers (1994)

William Blume Rudolf, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoeflinger, David Padua, ...

. It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small...

The Privatizing DOALL Test: (1994)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers cannot identify a significant fraction of fully parallel loops because they have complex or statically insufficiently defined access patterns. For this reason,...

An Inference Mechanism for the Compilation of Interactive Array Languages (1994)

Luiz De Rose, Luiz De Rose, David Padua, David Padua

Interactive array languages are powerful programming tools for the development of programs for numerical computation. They provide an environment that tends to increase productivity in software...

Polaris: The Next Generation in Parallelizing Compilers (1994)

Bill Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoeflinger, David Padua, ...

It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small kernels,...

On the Automatic Parallelization of the Perfect Benchmarks (1994)

Rudolf Eigenmann, Jay Hoeflinger, David Padua

We present a set of advanced program parallelization techniques that are able to significantly improve the performance of application programs. We present evidence for this improvement in terms of...

The Privatizing DOALL Test: A Run-Time Technique for DOALL Loop Identification and Array Privatization (1994)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers cannot identify a significant fraction of fully parallel loops because they have complex or statically insufficiently defined access patterns. For this reason, we have...

Comparing the Performance of the DASH and Cedar Multiprocessors for Scientific Applications (1994)

Josep Torrellas, David Koufaty, David Padua

this paper, we compare the performance of DASH and Cedar under a set of varied parallel scientific loads. We focus on three main differences between the two machines: cache coherence supported in...

Polaris: Improving the Effectiveness of Parallelizing Compilers (1994)

William Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoeflinger, David Padua, ...

. It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small...

Polaris: Improving the Effectiveness of Parallelizing Compilers (1994)

William Blume Rudolf, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoe Inger, David Padua, ...

It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small kernels,...

The Privatizing DOALL Test: (1994)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers cannot extract a significant fraction of the available parallelism in a loop if it has a complex and/or statically insufficiently defined access pattern. This is an...

Speculative Run-Time Parallelization of Loops (1994)

Lawrence Rauchwerger, David Padua

Current parallelizing compilers cannot identify a significant fraction of fully parallel loops because they have complex or statically insufficiently defined access patterns. Since fully parallel...

Comparing the Performance of the DASH and Cedar Multiprocessors for Scientific Applications (1994)

Josep Torrellas, David Koufaty, David Padua

Given all these differences, it is important to comparethe performance of the different machines. Such comparisons can determine the cost-effectiveness of the different choices. As an example, the...

Automatic Detection of Parallelism: A Grand Challenge for High-Performance Computing (1994)

William Blume, Rudolf Eigenmann, Jay Hoe, David Padua, Paul Petersen, Lawrence Rauchwerger, ...

The limited ability of compilers to nd the parallelism in programs is a signi cant barrier to the use of high performance computers. It forces programmers to resort to parallelizing their programs by...

Automatic Detection of Parallelism: A Grand Challenge for High-Performance Computing (1994)

William Blume, Rudolf Eigenmann, Jay Hoe, David Padua, Paul Petersen, Lawrence Rauchwerger, ...

The limited ability of compilers to nd the parallelism in programs is a signi cant barrier to the use of high performance computers. It forces programmers to resort to parallelizing their programs by...

Polaris: The Next Generation in Parallelizing Compilers (1994)

Bill Blume, Rudolf Eigenmann, Keith Faigin, John Grout, Jay Hoe, David Padua, ...

It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome limitations of current compilers. While current parallelizing compilers may succeed on small kernels,...

Automatic array privatization (1993)

Peng Tu, David Padua

Abstract. Array privatization is one of the most e ective transformations for the exploitation of parallelism. In this paper, we present atechnique for automatic array privatization. Our algorithm...

Automatic Array Privatization (1993)

Peng Tu, David Padua

. Array privatization is one of the most effective transformations for the exploitation of parallelism. In this paper, we present a technique for automatic array privatization. Our algorithm uses...

Restructuring Fortran Programs for Cedar (1993)

Rudolf Eigenmann, Jay Hoeflinger, Greg Jaxon, Zhiyuan Li, David Padua

This paper reports on the status of the Fortran translator for the Cedar computer at the end of March, 1991. A brief description of the Cedar Fortran language is followed by a discussion of the...

Evaluation Of Parallelizing Compilers (1992)

David Padua, Paul M. Petersen

The recognition and exploitation of parallelism is a difficult problem for restructuring compilers. We present a method for evaluating the effectiveness of parallelizing compilers in general and of...

Experience in the Automatic Parallelization of Four Perfect-Benchmark Programs (1991)

Rudolf Eigenmann, Jay Hoeflinger, Zhiyuan Li, David Padua

. This paper discusses the techniques used to hand-parallelize, for the Alliant FX/80, four Fortran programs from the Perfect-Benchmark suite. The paper also includes the execution times of the...

Restructuring Fortran Programs for Cedar (1991)

Rudolf Eigenmann, Jay Hoe Inger, Greg Jaxon, Zhiyuan Li, David Padua

This paper reports on the status of the Fortran translator for the Cedar computer at the end of March, 1991. A brief description of the Cedar Fortran language is followed by a discussion of the...

Principal Investigators: (1990)

John Bruner, Lloichi Cheong, David Kuck, Er Veidenbawn, Grego O, ...

ti j/t d' ()h,icclive: The objective of the Chief project is to provide an i.tegrated simulation environment for studying and evaluating various issues in designing parallel systems, including...

accepted on the recommendation of (1972)

Christoph Von Praun, Prof Dr, Thomas Gross, Prof Dr, David Padua, ...

ii This dissertation describes an efficient and automated approach to determine synchronization defects in multi-threaded object-oriented programs. The approach is based on the key observation that...

citizen of Germany accepted on the recommendation of (1972)

Prof Dr, Thomas Gross, Prof Dr, David Padua, Prof Dr, Robert Stärk

This dissertation describes an efficient and automated approach to determine synchronization defects in multi-threaded object-oriented programs. The approach is based on the key observation that...

Lung metastasis genes couple breast tumor size and metastatic spread

Minn, Andy J., Gupta, Gaorav P., Padua, David, Bos, Paula, Nguyen, Don X., Nuyten, Dimitry, ...

The association between large tumor size and metastatic risk in a majority of clinical cancers has led to questions as to whether these observations are causally related or whether one is simply a...

A MATLAB to Fortran 90 Translator and its Effectiveness

Luiz De, Luiz De Rose, Luiz De Rose, David Padua, David Padua

In this paper, we describe the inference mechanism used by the FALCON system to translate MATLAB 1 programs to Fortran 90. FALCON is a programming environment for the development of scientific...