David K. Lowenthal

Minimizing Execution Time in MPI Programs on an Energy-Constrained, Power-Scalable Cluster (2009)

Robert Springer, David K. Lowenthal, Barry Rountree

Recently, the high-performance computing community has realized that power is a performance-limiting factor. One reason for this is that supercomputing centers have limited power capacity and...

Implicit Java Array Bounds Checking on 64-bit Architectures (2008)

Chris Bentley, Scott A. Watterson, David K. Lowenthal, Barry Rountree

Interest in using Java for high-performance parallel computing has increased in recent years. One obstacle that has inhibited Java from widespread acceptance in the scientific community is the...

Client-Centered, Energy-Efficient Wireless Communication on IEEE 802.11b Networks (2008)

Haijin Yan, Scott A. Watterson, David K. Lowenthal, Kang Li, Rupa Krishnan, Larry L. Peterson

In mobile devices, the wireless network interface card (WNIC) consumes a significant portion of overall system energy. One way to reduce energy consumed by a device is to transition its WNIC to a...

Research Statement (2008)

David K. Lowenthal

My research spans the areas of parallel and distributed computing, networks, and operating systems. The common thread throughout my work is the design, implementation, and evaluation of software...

Abstract (2008)

Amit Karwande, Xin Yuan, David K. Lowenthal

Compiled communication has recently been proposed to improve communication performance for clusters of workstations. The idea of compiled communication is to apply more aggressive optimizations to...

CV (2007)

David K. Lowenthal

ize at Run-Time in Pipelined Parallel Programs. Submitted to International Journal on Parallel Programming. To appear, International Journal of Parallel Programming. David K. Lowenthal and Vincent W....

Filaments Programmer's Manual (2007)

David K. Lowenthal, Vincent W. Freeh

This report describes how to write architecture-independent parallel programs---programs that are portable and efficient across vastly different parallel machines---using the Filaments package. All...

Hybrid Messaging Passing in Shared-Memory Clusters (2007)

Vincent W. Freeh, Jin Xu, David K. Lowenthal

An increasingly popular choice for high-performance machines is clusters of SMPs. Such clusters are popular in department-wide as well as national laboratory settings. On clusters of SMPs, there are...

CV (2007)

David K. Lowenthal

itecture-Independent Parallelism on Networks of Multiprocessors. International Journal of Parallel and Distributed Systems and Networks, 25(4):272--282 (2003). Karthik Balasubramanian and David K....

Implicit array bounds checking on 64-bit architectures (2006)

Chris Bentley, Scott A. Watterson, David K. Lowenthal, Barry Rountree

Several programming languages guarantee that array subscripts are checked to ensure they are within the bounds of the array. While this guarantee improves the correctness and security of array-based...

Dynamic, power-aware scheduling for mobile clients using a transparent proxy (2004)

Michael Gundlach, Sarah Doster, Haijin Yan, David K. Lowenthal, Scott A. Watterson, Surendar Chandra

Mobile computers consume significant amounts of energy when receiving large files. The wireless network interface card (WNIC) is the primary source of this energy consumption. One way to reduce the...

Client-centered energy and delay analysis for TCP downloads (2004)

Haijin Yan, Rupa Krishnan, Scott A. Watterson, David K. Lowenthal, Kang Li, Larry L. Peterson

In mobile devices, the wireless network interface card (WNIC) consumes a significant portion of overall system energy. One way to reduce energy consumed by a mobile device is to transition its WNIC...

Client-centered energy savings for concurrent http connections (2004)

Haijin Yan, Rupa Krishnan, Scott A. Watterson, David K. Lowenthal

In mobile devices, the wireless network interface card (WNIC) consumes a significant portion of overall system energy. One way to reduce energy consumed by a WNIC is to transition it to a...

Client-Centered Energy Savings for Concurrent HTTP Connections (2004)

Haijin Yan, Rupa Krishnan, Scott A. Watterson, David K. Lowenthal

In mobile devices, the wireless network interface card (WNIC) consumes a significant portion of overall system energy. One way to reduce energy consumed by a WNIC is to transition it to a lower-power...

A comparative analysis of fine-grain threads packages (2003)

Gregory W. Price, David K. Lowenthal

The rising availability of multiprocessing platforms has increased the importance of providing programming models that allow users to express parallelism simply, portably, and efficiently. One popular...

CC-MPI: A Compiled Communication Capable MPI Prototype for Ethernet Switched Clusters (2003)

Amit Karwande, Xin Yuan, David K. Lowenthal

Compiled communication has recently been proposed to improve communication performance for clusters of workstations. The idea of compiled communication is to apply more aggressive optimizations to...

Popularity-Aware Cache Replacement in Streaming Environments (2003)

Haijin Yan, David K. Lowenthal

With the explosion of Internet streaming applications, system-level support for these applications has become increasingly important. One type of support is caching on streaming servers; this can...

A comparative analysis of fine-grain threads packages (2003)

Gregory W. Price, David K. Lowenthal

The rising availability of multiprocessing platforms has increased the importance of providing programming models that allow users to express parallelism simply, portably, and efficiently. One...

Dyn-MPI: Supporting MPI on Non Dedicated Clusters (2003)

D. Brent Weatherly, David K. Lowenthal, Mario Nakazawa, Franklin Lowenthal

Distributing data is one of the fundamental problems in implementing efficient distributed-memory parallel programs. The problem becomes more difficult in environments where the participating nodes...

A Comparison of Array Bounds Checking on Superscalar and VLIW Architectures (2002)

Chris Bentley, Scott A. Watterson, David K. Lowenthal

Several programming languages guarantee that array subscripts are checked to ensure they are within the array bounds. While this guarantee improves the correctness and security of array-based code,...

Efficient Support for Two-Dimensional Data Distributions in Distributed Shared Memory Systems (2002)

David K. Lowenthal, Vincent W. Freeh, David W. Miller

Despite their clear advantage in scalability, two-dimensional data distributions are not efficiently supported by current software distributed shared memory (SDSM) systems. This is because sharing...

SUIF-Adapt: An Integrated Compiler/Run-Time System for Global and Dynamic Data Distributions (2002)

David K. Lowenthal, Donald G. Morris, D. Brent Weatherly, Franklin Lowenthal

Distributing data is one of the key problems in implementing efficient distributed-memory parallel programs. The problem is especially difficult in programs where (1) data redistribution between...

Parallel Implementation of the PHOENIX Generalized Stellar Atmosphere Program. III: A parallel algorithm for direct opacity sampling (2001)

Peter H. Hauschildt, David K. Lowenthal, E. Baron

We describe two parallel algorithms for line opacity calculations based on a local file and on a global file approach. The performance and scalability of both approaches are discussed for different...

Supporting regular data distributions on nondedicated parallel machines (submitted to ICPP '01) (2001)

David K. Lowenthal, Franklin Lowenthal

Distributing data is one of the key problems in implementing efficient parallel programs. Two common distributions are BLOCK and CYCLIC, which are typically efficient for programs with nearest-neighbor and...

Accurate data redistribution cost estimation in software distributed shared memory systems (2001)

David K. Lowenthal

Distributing data is one of the key problems in implementing efficient distributed-memory parallel programs. The problem becomes more difficult in programs where data redistribution between computational...

Efficient support for two-dimensional data distributions in distributed shared memory systems (in preparation) (2001)

David K. Lowenthal, Vincent W. Freeh

Despite their clear advantage in scalability, two-dimensional data distributions are not efficiently supported by current distributed shared memory (DSM) systems. This is because sharing between nodes...

An integrated compiler/run-time system for global data distribution in distributed shared memory systems (2000)

David K. Lowenthal

A software distributed shared memory (DSM) provides the illusion of shared memory on a distributed-memory machine; communication occurs implicitly via page faults. For efficient execution of DSM...

Architecture-independent parallelism for both shared- and distributed-memory machines using the Filaments package (2000)

David K. Lowenthal, Vincent W. Freeh

This paper presents the Filaments package, which can be used to create architecture-independent parallel programs---that is, programs that are portable and efficient across vastly different parallel...

An Integrated Compiler/Run-Time System for Global Data Distribution in Distributed Shared Memory Systems (2000)

David K. Lowenthal

A software distributed shared memory (DSM) provides the illusion of shared memory on a distributed-memory machine; communication occurs implicitly via page faults. For efficient execution of DSM...

An Integrated Compiler/Run-Time System for Global Data Distribution in Distributed Shared Memory Systems (2000)

David K. Lowenthal

Shared memory provides a desirable parallel programming model because communication is implicit. However, to achieve scalability it is often necessary to execute programs on a distributed-memory...

Architecture-Independent Parallelism for Both Shared- and Distributed-Memory Machines using the Filaments Package (2000)

David K. Lowenthal, Vincent W. Freeh

This paper presents the Filaments package, which can be used to create architecture-independent parallel programs---that is, programs that are portable and efficient across vastly different parallel...

Efficient support for pipelining in distributed shared memory systems (1999)

Karthik Balasubramanian, David K. Lowenthal

Though more difficult to program, distributed-memory parallel machines provide greater scalability than their shared-memory counterparts. Distributed Shared Memory (DSM) systems provide the...

Obtaining Efficient Single-Processor Performance From Fine-Grain Threads in Array-Based Parallel Programs (1999)

David K. Lowenthal, Graham C. Greene

A fine-grain parallel program is one in which a thread is created for each logical unit of work. Fine-grain parallelism can help hide latency and balance load, which improves speedup. However, many...

Efficient Support for Pipelining in Distributed Shared Memory Systems (1999)

Karthik Balasubramanian, David K. Lowenthal

Though more difficult to program, distributed-memory parallel machines provide greater scalability than their shared-memory counterparts. Distributed Shared Memory (DSM) systems provide the...

Efficient Support for Fine-Grain Parallelism on Shared-Memory Machines (1999)

David K. Lowenthal, Vincent W. Freeh, Gregory R. Andrews

A coarse-grain parallel program typically has one thread (task) per processor, whereas a fine-grain program has one thread for each independent unit of work. Although there are several advantages to...

Run-Time Selection of Block Size in Pipelined Parallel Programs (1998)

David K. Lowenthal, Michael James

Parallelizing compiler technology has improved in recent years. One area in which compilers have made progress is in handling DOACROSS loops, where cross-processor data dependencies can inhibit...

An Adaptive Approach to Data Placement (1996)

David K. Lowenthal, Gregory R. Andrews

Programming distributed-memory machines requires careful placement of data to balance the computational load among the nodes and minimize excess data movement between the nodes. Most current...

Efficient Support for Fine-Grain Parallelism on Shared-Memory Machines (1996)

Vincent W. Freeh, David K. Lowenthal, Gregory R. Andrews

A coarse-grain parallel program typically has one thread (task) per processor, whereas a fine-grain program has one thread for each independent unit of work. Although there are several advantages to...

Dynamically Controlling False Sharing in Distributed Shared Memory (1996)

Vincent W. Freeh, David K. Lowenthal, Gregory R. Andrews

Distributed shared memory (DSM) alleviates the need to program message passing explicitly on a distributed-memory machine. In order to reduce memory latency, a DSM replicates copies of data. This...

Using Fine-Grain Threads and Run-Time Decision Making in Parallel Computing (1996)

David K. Lowenthal, Vincent W. Freeh, Gregory R. Andrews

Programming distributed-memory multiprocessors and networks of workstations requires deciding what can execute concurrently, how processes communicate, and where data is placed. These decisions can...

Adaptive Data Placement for Distributed-Memory Machines (1995)

David K. Lowenthal, Gregory R. Andrews

Programming distributed-memory machines requires paying careful attention to where the data is placed. This is because for efficiency, it is important to balance the computational load among the...

Adaptive Data Placement for Distributed-Memory Machines (1995)

David K. Lowenthal, Gregory R. Andrews

Programming distributed-memory machines requires careful placement of data on the nodes. This is because achieving efficiency requires balancing the computational load among the nodes and minimizing...

Filaments: Efficient Support for Fine-Grain Parallelism (1994)

Dawson R. Engler, Gregory R. Andrews, David K. Lowenthal

It has long been thought that coarse-grain parallelism is much more efficient than fine-grain parallelism due to the overhead of process (thread) creation, context switching, and synchronization....

Distributed Filaments: Efficient Fine-Grain Parallelism on a Cluster of Workstations (1994)

Vincent Freeh, David K. Lowenthal, Gregory R. Andrews

A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as...

Distributed Filaments: Efficient Fine-Grain Parallelism on a Cluster of Workstations (1994)

Vincent W. Freeh, Gregory R. Andrews, David K. Lowenthal, ...

A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as...

Distributed filaments: Efficient fine-grain parallelism on a cluster of workstations (1994)

Vincent W. Freeh, David K. Lowenthal, Gregory R. Andrews

A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as...

Filaments: Efficient support for fine-grain parallelism (1993)

Dawson R. Engler, Gregory R. Andrews, David K. Lowenthal

It has long been thought that coarse-grain parallelism is much more efficient than fine-grain parallelism due to the overhead of process (thread) creation, context switching, and...

Performance Experiments for the Filaments Package (1993)

David K. Lowenthal, Dawson R. Engler

Ten representative benchmarks were run on two shared-memory multiprocessors using an efficient, fine-grain threads package called Filaments. This paper describes the implementation and performance of...

HyFi: Architecture-independent parallelism on a cluster of multiprocessors (October 2002)

David K. Lowenthal, Ragavan Subramanian

A network of parallel workstations promises cost-effective parallel computing. This paper presents the HyFi (Hybrid Filaments) package, which can be used to create architecture-independent parallel...