Robert Springer, David K. Lowenthal, Barry Rountree
Recently, the high-performance computing community has realized that power is a performance-limiting factor. One reason for this is that supercomputing centers have limited power capacity and...
Implicit Java Array Bounds Checking on 64-bit Architectures (2008)
Chris Bentley, Scott A. Watterson, David K. Lowenthal, Barry Rountree
Interest in using Java for high-performance parallel computing has increased in recent years. One obstacle that has inhibited Java from widespread acceptance in the scientific community is the...
Adaptive, Transparent Frequency and Voltage Scaling of Communication Phases in MPI Programs (2008)
Client-Centered, Energy-Efficient Wireless Communication on IEEE 802.11b Networks (2008)
Haijin Yan, Scott A. Watterson, David K. Lowenthal, Kang Li, Rupa Krishnan, Larry L. Peterson
In mobile devices, the wireless network interface card (WNIC) consumes a significant portion of overall system energy. One way to reduce energy consumed by a device is to transition its WNIC to a...
My research spans the areas of parallel and distributed computing, networks, and operating systems. The common thread throughout my work is the design, implementation, and evaluation of software...
Amit Karwande, Xin Yuan, David K. Lowenthal
Compiled communication has recently been proposed to improve communication performance for clusters of workstations. The idea of compiled communication is to apply more aggressive optimizations to...
ize at Run-Time in Pipelined Parallel Programs. Submitted to International Journal on Parallel Programming. To appear, International Journal of Parallel Programming. David K. Lowenthal and Vincent W....
Filaments Programmer's Manual (2007)
David K. Lowenthal, Vincent W. Freeh
This report describes how to write architecture-independent parallel programs---programs that are portable and efficient across vastly different parallel machines---using the Filaments package. All...
Hybrid Messaging Passing in Shared-Memory Clusters (2007)
Vincent W. Freeh, Jin Xu, David K. Lowenthal
An increasingly popular choice for high-performance machines is clusters of SMPs. Such clusters are popular in department-wide as well as national laboratory settings. On clusters of SMPs, there are...
itecture-Independent Parallelism on Networks of Multiprocessors. International Journal of Parallel and Distributed Systems and Networks, 25(4):272--282 (2003). Karthik Balasubramanian and David K....
Implicit array bounds checking on 64-bit architectures (2006)
Chris Bentley, Scott A. Watterson, David K. Lowenthal, Barry Rountree
Several programming languages guarantee that array subscripts are checked to ensure they are within the bounds of the array. While this guarantee improves the correctness and security of array-based...
Dynamic, power-aware scheduling for mobile clients using a transparent proxy (2004)
Michael Gundlach, Sarah Doster, Haijin Yan, David K. Lowenthal, Scott A. Watterson, Surendar Chandra
Mobile computers consume significant amounts of energy when receiving large files. The wireless network interface card (WNIC) is the primary source of this energy consumption. One way to reduce the...
Client-centered energy and delay analysis for TCP downloads (2004)
Haijin Yan, Rupa Krishnan, Scott A. Watterson, David K. Lowenthal, Kang Li, Larry L. Peterson
In mobile devices, the wireless network interface card (WNIC) consumes a significant portion of overall system energy. One way to reduce energy consumed by a mobile device is to transition its WNIC...
Client-centered energy savings for concurrent http connections (2004)
Haijin Yan, Rupa Krishnan, Scott A. Watterson, David K. Lowenthal
In mobile devices, the wireless network interface card (WNIC) consumes a significant portion of overall system energy. One way to reduce energy consumed by a WNIC is to transition it to a...
Client-Centered Energy Savings for Concurrent HTTP Connections (2004)
Haijin Yan, Rupa Krishnan, Scott A. Watterson, David K. Lowenthal
In mobile devices, the wireless network interface card (WNIC) consumes a significant portion of overall system energy. One way to reduce energy consumed by a WNIC is to transition it to a lower-power...
A comparative analysis of fine-grain threads packages (2003)
Gregory W. Price, David K. Lowenthal
The rising availability of multiprocessing platforms has increased the importance of providing programming models that allow users to express parallelism simply, portably, and efficiently. One popular...
CC-MPI: A Compiled Communication Capable MPI Prototype for Ethernet Switched Clusters (2003)
Amit Karwande, Xin Yuan, David K. Lowenthal
Compiled communication has recently been proposed to improve communication performance for clusters of workstations. The idea of compiled communication is to apply more aggressive optimizations to...
Popularity-Aware Cache Replacement in Streaming Environments (2003)
With the explosion of Internet streaming applications, system-level support for these applications has become increasingly important. One type of support is caching on streaming servers; this can...
A comparative analysis of fine-grain threads packages (2003)
Gregory W. Price, David K. Lowenthal
The rising availability of multiprocessing platforms has increased the importance of providing programming models that allow users to express parallelism simply, portably, and efficiently. One...
Dyn-MPI: Supporting MPI on Non Dedicated Clusters (2003)
D. Brent Weatherly, David K. Lowenthal, Mario Nakazawa, Franklin Lowenthal
Distributing data is one of the fundamental problems in implementing efficient distributed-memory parallel programs. The problem becomes more difficult in environments where the participating nodes...
A Comparison of Array Bounds Checking on Superscalar and VLIW Architectures (2002)
Chris Bentley, Scott A. Watterson, David K. Lowenthal
Several programming languages guarantee that array subscripts are checked to ensure they are within the array bounds. While this guarantee improves the correctness and security of array-based code,...
Efficient Support for Two-Dimensional Data Distributions in Distributed Shared Memory Systems (2002)
David K. Lowenthal, Vincent W. Freeh, David W. Miller
Despite their clear advantage in scalability, two-dimensional data distributions are not efficiently supported by current software distributed shared memory (SDSM) systems. This is because sharing...
SUIF-Adapt: An Integrated Compiler/Run-Time System for Global and Dynamic Data Distributions (2002)
David K. Lowenthal, Donald G. Morris, D. Brent Weatherly, Franklin Lowenthal
Distributing data is one of the key problems in implementing efficient distributed-memory parallel programs. The problem is especially difficult in programs where (1) data redistribution between...
Peter H. Hauschildt, David K. Lowenthal, E. Baron
We describe two parallel algorithms for line opacity calculations based on a local file and on a global file approach. The performance and scalability of both approaches are discussed for different...
David K. Lowenthal, Franklin Lowenthal
Distributing data is one of the key problems in implementing efficient parallel programs. Two common distributions are BLOCK and CYCLIC, which are typically efficient for programs with nearest-neighbor and...
Accurate data redistribution cost estimation in software distributed shared memory systems (2001)
Distributing data is one of the key problems in implementing efficient distributed-memory parallel programs. The problem becomes more difficult in programs where data redistribution between computational...
David K. Lowenthal, Vincent W. Freeh
Despite their clear advantage in scalability, two-dimensional data distributions are not efficiently supported by current distributed shared memory (DSM) systems. This is because sharing between nodes...
A software distributed shared memory (DSM) provides the illusion of shared memory on a distributed-memory machine; communication occurs implicitly via page faults. For efficient execution of DSM...
David K. Lowenthal, Vincent W. Freeh
This paper presents the Filaments package, which can be used to create architecture-independent parallel programs---that is, programs that are portable and efficient across vastly different parallel...
Shared memory provides a desirable parallel programming model because communication is implicit. However, to achieve scalability it is often necessary to execute programs on a distributed-memory...
Efficient support for pipelining in distributed shared memory systems (1999)
Karthik Balasubramanian, David K. Lowenthal
Though more difficult to program, distributed-memory parallel machines provide greater scalability than their shared-memory counterparts. Distributed Shared Memory (DSM) systems provide the...
David K. Lowenthal, Graham C. Greene
A fine-grain parallel program is one in which a thread is created for each logical unit of work. Fine-grain parallelism can help hide latency and balance load, which improves speedup. However, many...
Efficient Support for Pipelining in Distributed Shared Memory Systems (1999)
Karthik Balasubramanian, David K. Lowenthal
Though more difficult to program, distributed-memory parallel machines provide greater scalability than their shared-memory counterparts. Distributed Shared Memory (DSM) systems provide the...
Efficient Support for Fine-Grain Parallelism on Shared-Memory Machines (1999)
David K. Lowenthal, Vincent W. Freeh, Gregory R. Andrews
A coarse-grain parallel program typically has one thread (task) per processor, whereas a fine-grain program has one thread for each independent unit of work. Although there are several advantages to...
Run-Time Selection of Block Size in Pipelined Parallel Programs (1998)
David K. Lowenthal, Michael James
Parallelizing compiler technology has improved in recent years. One area in which compilers have made progress is in handling DOACROSS loops, where cross-processor data dependencies can inhibit...
An Adaptive Approach to Data Placement (1996)
David K. Lowenthal, Gregory R. Andrews
Programming distributed-memory machines requires careful placement of data to balance the computational load among the nodes and minimize excess data movement between the nodes. Most current...
Efficient Support for Fine-Grain Parallelism on Shared-Memory Machines (1996)
Vincent W. Freeh, David K. Lowenthal, Gregory R. Andrews
A coarse-grain parallel program typically has one thread (task) per processor, whereas a fine-grain program has one thread for each independent unit of work. Although there are several advantages to...
Dynamically Controlling False Sharing in Distributed Shared Memory (1996)
Vincent W. Freeh, David K. Lowenthal, Gregory R. Andrews
Distributed shared memory (DSM) alleviates the need to program message passing explicitly on a distributed-memory machine. In order to reduce memory latency, a DSM replicates copies of data. This...
Using Fine-Grain Threads and Run-Time Decision Making in Parallel Computing (1996)
David K. Lowenthal, Vincent W. Freeh, Gregory R. Andrews
Programming distributed-memory multiprocessors and networks of workstations requires deciding what can execute concurrently, how processes communicate, and where data is placed. These decisions can...
Adaptive Data Placement for Distributed-Memory Machines (1995)
David K. Lowenthal, Gregory R. Andrews
Programming distributed-memory machines requires paying careful attention to where the data is placed. This is because for efficiency, it is important to balance the computational load among the...
Adaptive Data Placement for Distributed-Memory Machines (1995)
David K. Lowenthal, Gregory R. Andrews
Programming distributed-memory machines requires careful placement of data on the nodes. This is because achieving efficiency requires balancing the computational load among the nodes and minimizing...
Filaments: Efficient Support for Fine-Grain Parallelism (1994)
Dawson R. Engler, Gregory R. Andrews, David K. Lowenthal
It has long been thought that coarse-grain parallelism is much more efficient than fine-grain parallelism due to the overhead of process (thread) creation, context switching, and synchronization....
Distributed Filaments: Efficient Fine-Grain Parallelism on a Cluster of Workstations (1994)
Vincent W. Freeh, David K. Lowenthal, Gregory R. Andrews
A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as...
Distributed Filaments: Efficient Fine-Grain Parallelism on a Cluster of Workstations (1994)
Vincent W. Freeh, Gregory R. Andrews, David K. Lowenthal, ...
A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as...
Distributed filaments: Efficient fine-grain parallelism on a cluster of workstations (1994)
Vincent W. Freeh, David K. Lowenthal, Gregory R. Andrews
A fine-grain parallel program is one in which processes are typically small, ranging from a few to a few hundred instructions. Fine-grain parallelism arises naturally in many situations, such as...
Filaments: Efficient support for fine-grain parallelism (1993)
Dawson R. Engler, Gregory R. Andrews, David K. Lowenthal
It has long been thought that coarse-grain parallelism is much more efficient than fine-grain parallelism due to the overhead of process (thread) creation, context switching, and...
Performance Experiments for the Filaments Package (1993)
David K. Lowenthal, Dawson R. Engler
Ten representative benchmarks were run on two shared-memory multiprocessors using an efficient, fine-grain threads package called Filaments. This paper describes the implementation and performance of...
David K. Lowenthal, Ragavan Subramanian
A network of parallel workstations promises cost-effective parallel computing. This paper presents the HyFi (Hybrid Filaments) package, which can be used to create architecture-independent parallel...