Mark Horowitz

Publication List Details

Period

1988 - 2009

Number

157

Co-Authors

Semi-Custom High-Speed Datapath Design Using Commercial ASIC Design Tools (2009)

Amir Amirkhany, Metha Jeeradit, Mark Horowitz, Rambus Inc

A semi-custom high-speed datapath design flow is described using commercial ASIC design CAD tools interacting with a simple hierarchical placement Matlab code. The flow leverages the vast automation...

Verification of Chip Multiprocessor Memory Systems Using A Relaxed Scoreboard (2009)

Ofer Shacham, Megan Wachs, Alex Solomatnikov, Amin Firoozshahian, Stephen Richardson, Mark Horowitz

Verification of chip multiprocessor memory systems remains challenging. While formal methods have been used to validate protocols, simulation is still the dominant method used to validate memory...

Towards an explanatory and computational theory of scientific discovery (2009)

Chen, Chaomei, Chen, Yue, Horowitz, Mark, Hou, Haiyan, Liu, Zeyuan, Pellegrino, Don

We propose an explanatory and computational theory of transformative discoveries in science. The theory is derived from a recurring theme found in a diverse range of scientific change, scientific...

Integrated Regulation for Energy-Efficient Digital Circuits (2009)

Elad Alon, Mark Horowitz

Despite their use in analog or mixed-signal applications, the high power overheads of traditional linear regulators (both series and shunt) have precluded their successful adoption in...

Digital Circuit Design Trends (2009)

Mark Horowitz, Donald Stark, Elad Alon

The past 20 years have seen enormous growth in the capability and ubiquity of digital integrated circuits. Today, it sometimes seems difficult to buy any product without them, even greeting cards...

24.4 10GHz Clock Distribution Using Coupled Standing-Wave Oscillators (2009)

C. Patrick Yue, Mark Horowitz, S. Simon Wong

Global clock distribution has become increasingly difficult for multi-GHz microprocessors. Timing uncertainty must reduce with clock period, but skew and jitter for conventional H-trees are...

WA 17.6: A Variable-Frequency Parallel I/O Interface with Adaptive Power Supply Regulation (2009)

Gu-yeon Wei, Jaeha Kim, Dean Liu, Stefanos Sidiropoulos, Mark Horowitz

Adaptive power supply regulation reduces power dissipation in DSP and microprocessor cores [1,2]. A technique extends this concept to a high-performance parallel input/output (I/O) interface. An...

Velio Communications (2008)

Patrick Chiang, William J. Dally, Ramesh Senthinathan, Yangjin Oh, Mark Horowitz

A 20Gb/s transmitter is implemented in 0.13um CMOS technology. Eight 2.5Gb/s data streams are 4:1 multiplexed, sampled, and retimed into two 10Gb/s data streams. A final 20Gb/s 2:1 output...

Impedance Requirements of High-Performance Processors (2008)

Low Power Circuits, Power Delivery, Copyright Ron Ho, Mark Horowitz

• Significant time and resources spent on power distribution network: – ~70 % of package pins just for power – Top 2-3 (thick) metal layers • Why has power delivery become this critical?

An Evaluation of Directory Schemes for Cache Coherence (2008)

Anant Agarwal, Richard Simoni, John Hennessy, Mark Horowitz

The problem of cache coherence in shared-memory multiprocessors has been addressed using two basic approaches: directory schemes and snoopy cache schemes. Directory schemes have been given less...

Dynamic Pointer Allocation for Scalable Cache Coherence Directories (2008)

Richard Simoni, Mark Horowitz

The efficient implementation of cache consistency is one of the primary challenges in building shared memory multiprocessors with hundreds or thousands of processors. While directory-based coherency...

IEEE 2007 Custom Integrated Circuits Conference (CICC) Integrated Regulation for Energy-Efficient Digital Circuits (2008)

Elad Alon, Mark Horowitz

Linear regulation can reduce the effective supply impedance of digital circuits without increasing their total power dissipation. This can be achieved with a push-pull regulator topology...

Equalization of Modal Dispersion in Multimode Fiber using Spatial Light Modulators (2008)

Elad Alon, Vladimir Stojanović, Joseph M. Kahn, Stephen Boyd, Mark Horowitz

Intersymbol interference (ISI) due to modal dispersion is the dominant limitation to the bit rate-distance product in multimode fiber-optic communication systems. If the light launched...

10.1 Adaptive Bandwidth DLLs and PLLs using Regulated Supply CMOS Buffers (2008)

Stefanos Sidiropoulos, Dean Liu, Jaeha Kim, Gu-yeon Wei, Mark Horowitz

A technique for designing DLLs and PLLs using CMOS buffers with a regulated supply is presented. By scaling the charge pump current and the output resistance of the regulating amplifier, the proposed...

A 90 nm CMOS 16 Gb/s Transceiver for Optical Interconnects (2008)

Palermo, Samuel, Emami-Neyestanak, Azita, Horowitz, Mark

Interconnect architectures which leverage high-bandwidth optical channels offer a promising solution to address the increasing chip-to-chip I/O bandwidth demands. This paper describes a dense,...

Robust Energy-Efficient Adder Topologies (2008)

Dinesh Patil, Omid Azizi, Mark Horowitz

In this paper we explore the relationship between adder topology and energy efficiency. We compare the energy-delay tradeoff curves of selected 32-bit adder topologies, to determine how architectural...

B.7.2 [Hardware]: Integrated Circuits – Design Aids (2008)

Alex Solomatnikov, Amin Firoozshahian, Wajahat Qadeer, Ofer Shacham, Kyle Kelley, Megan Wachs, ...

The drive for low-power, high performance computation coupled with the extremely high design costs for ASIC designs, has driven a number of designers to try to create a flexible, universal computing...

Piecewise Linear Models for (2008)

Russell Kao, Mark Horowitz

research relevant to the design and application of high performance scientific computers. We test our ideas by designing, building, and using real systems. The systems we build are research...

Example: Ladner-Fischer 32-bit adder (2008)

Stephen Boyd, Seung-jean Kim, Dinesh Patil, Mark Horowitz

Statistical variation in digital circuits • growing in importance as devices shrink • modeling still open – many sources: environmental, process parameter variation, lithography – intrachip,...

Matthew Footer 2 (2008)

Ren Ng, Andrew Adams, Mark Horowitz

Figure 1: At left is a light field captured by photographing a speck of fluorescent crayon wax through a microscope objective and microlens array. The objective magnification is 16×, and the field...

Veiling glare in high dynamic range imaging (2008)

Eino-ville Talvala, Andrew Adams, Mark Horowitz, Marc Levoy

Figure 1: Two HDR captures of a strongly backlit scene, tonemapped for printing. The camera is a Canon 20D. (a) Backlighting produces veiling glare in the camera body and lens, visible as a loss of...

1.1 The Case for the Analytical Cache Model (2008)

Anant Agarwal, Mark Horowitz, John Hennessy

Trace-driven simulation and hardware measurement are the techniques most often used to obtain accurate performance figures for caches. The former requires a large amount of simulation time to...

Light Field Microscopy (2008)

Mark Horowitz, Ren Ng, Andrew Adams

Figure 1: At left is a light field captured by photographing a speck of fluorescent crayon wax through a microscope objective and microlens array. The magnification is 16×, and the field of view is...

28.9 Clocking and Circuit Design for a Parallel I/O on a First-Generation CELL Processor (2008)

Ken Chang, Sudhakar Pamarti, Kambiz Kaviani, Elad Alon, Xudong Shi, Tj Chin, ...

To maintain high performance across a wide variety of applications, the first-generation CELL processor [1], fabricated in 90nm SOI CMOS, requires hundreds of gigabits of aggregate I/O bandwidth. In...

18.4 Improving CDR Performance via Estimation (2008)

Haechang Lee, Akash Bansal, Yohan Frans, Jared Zerbe, Stefanos Sidiropoulos, Mark Horowitz

An implementation of the semi-digital dual-loop first-order CDR of [1] is shown in Fig. 18.4.1. The CDR (peripheral) loop consists of a bang-bang phase detector, gain (pre_filt), binary accumulator...

A 0.4-μm CMOS 10-Gb/s 4-PAM Pre-Emphasis Serial Link Transmitter (2008)

Ramin Farjad-rad, Mark Horowitz, Thomas Lee

A serial link transmitter fabricated in the LSI 0.4-μm CMOS process uses multilevel signaling (4-PAM) and a 3-tap pre-emphasis filter to reduce intersymbol interference (ISI) caused by channel...

TA 10.3 An Eight Channel 36GSample/s CMOS Timing Analyzer (2008)

Dan Weinlader, Ron Ho, Mark Horowitz

While today’s test systems measure inputs at a single time point each cycle, measuring when the inputs arrive is often more useful for understanding timing and jitter problems. An eight channel...

The Tiny Tera: A Packet Switch Core (2008)

Nick McKeown, Martin Izzard, Adisak Mekkittikul, William Ellersick, Mark Horowitz

In this paper, we present the Tiny Tera: a small packet switch with an aggregate bandwidth of 320Gb/s. The Tiny Tera is a CMOS-based input-queued, fixed-size packet switch suitable for a...

Smart Memories: A Modular Reconfigurable Architecture (2008)

Ken Mai, Tim Paaske, Nuwan Jayasena, Ron Ho, William J. Dally, Mark Horowitz

Trends in VLSI technology scaling demand that future computing devices be narrowly focused to achieve high performance and high efficiency, yet also target the high volumes and low costs of widely...

TA 6.7: Regenerative Feedback Repeaters for Programmable Interconnections (2008)

Ivo Dobbelaere, Mark Horowitz, Abbas El Gamal

A test chip in MOSIS 1.2pm-well CMOS is used to evaluate regenerative feedback repeaters. Figure 5 shows a micrograph of the chip. Several chains are implemented, each consisting of 64 MOS switches...

WA 17.6: A Variable-Frequency Parallel I/O Interface with Adaptive Power Supply Regulation (2008)

Gu-yeon Wei, Jaeha Kim, Dean Liu, Stefanos Sidiropoulos, Mark Horowitz

Adaptive power supply regulation reduces power dissipation in DSP and microprocessor cores [1,2]. A technique extends this concept to a high-performance parallel input/output (I/O) interface. An...

10-2 A 0.6μm CMOS 4Gb/s Transceiver with Data Recovery using Oversampling (2008)

Ramin Farjad-rad, Mark Horowitz

A 4Gb/s serial link transmitter and receiver fabricated in the MOSIS HP 0.6μm CMOS process uses edges tapped from a PLL to multiplex (transmit) and demultiplex (receive) the data. For data...

A 50 Gb/s 32 × 32 CMOS Crossbar Chip using Asymmetric Serial Links (2008)

Shang-tse Chuang, Nick McKeown, Mark Horowitz

A synchronous crossbar chip was designed in a 0.27μm CMOS technology for use in a high-speed network switch [1]. The crossbar chip uses 32 Asymmetric Serial Links [2] [3] to achieve high speed at...

Array-of-arrays Architecture for Parallel Floating Point Multiplication (2008)

Hema Dhanesha, Katayoun Falakshahi, Mark Horowitz

This paper presents a new architecture style for the design of a parallel floating point multiplier. The proposed architecture is a synergy of trees and arrays. Architectural models were designed to...

SA 20.2: A Semi-Digital DLL with Unlimited Phase Shift Capability and 0.08-400MHz Operating (2008)

Stefanos Sidiropoulos, Mark Horowitz

Delay-locked loops are an attractive alternative to VCO-based phase-locked loops due to their simpler design and inherent stability [1-3]. The primary disadvantage of conventional DLLs is limited...

Interconnect Scaling Implications for CAD (2008)

Ron Ho, Ken Mai, Hema Kapadia, Mark Horowitz

Interconnect scaling to deep submicron processes presents many challenges to today’s CAD flows. A recent analysis by Sylvester and Keutzer examined the behavior of average length wires under...

Scalable Circuits for Supply Noise Measurement (2008)

Valentin Abramzon, Elad Alon, Bita Nezamfar, Mark Horowitz

This paper discusses techniques to allow high-resolution supply noise measurements in advanced CMOS technologies without the overhead of voltage references or separate power supplies. In addition to...

Automatic Color Calibration for Large Camera Arrays (2008)

Neel Joshi, Bennett Wilburn, Vaibhav Vaish, Marc Levoy, Mark Horowitz

We present a color calibration pipeline for large camera arrays. We assume static lighting conditions for each camera, such as studio lighting or a stationary array outdoors. We also assume we can...

Array-of-arrays Architecture for Parallel Floating Point Multiplication (2007)

Hema Dhanesha, Katayoun Falakshahi, Mark Horowitz

This paper presents a new architecture style for the design of a parallel floating point multiplier. The proposed architecture is a synergy of trees and arrays. Architectural models were designed to...

A 700 Mbps/pin CMOS Signalling Interface Using Current Integrating Receivers (2007)

Mark Horowitz, Stefanos Sidiropoulos

A high speed CMOS signalling interface for application in multiprocessor interconnection networks has been developed. The interface utilizes 1-V push-pull drivers, a Delay Line PLL and sampling of...

A 1.6Gb/s, 3 mW CMOS receiver for optical communication (2007)

Azita Emami-Neyestanak, Dean Liu, Gordon Keeler, Noah Helman, Mark Horowitz

A 1.6Gb/s receiver for optical communication has been designed and fabricated in a 0.25-μm CMOS process. This receiver has no transimpedance amplifier and uses the parasitic capacitor of the...

A Tracking PLL with an FIR Loop Filter (2007)

Dean Liu, Henrik O. Johansson, Jaeha Kim, Mark Horowitz

To stabilize the feedback loop of a PLL a lead-lag filter is used, implemented by driving the charge-pump current to a series resistor capacitor network [1]. While many designs have created the...

Separating Protection and Resource Management in Operating Systems (2007)

David Lie, Chandramohan A. Thekkath, Mark Horowitz

Traditionally, operating systems have fulfilled the dual roles of enforcing security on computer systems, as well as managing and virtualizing resources for the various applications sharing the...

AND THE COMMITTEE ON GRADUATE STUDIES (2007)

Dr. Mark Horowitz

that I have read this dissertation and that in my opinion it is fully adequate, in scope

The Tiny Tera: A Packet Switch Core (2007)

Nick McKeown, Martin Izzard, Adisak Mekkittikul, William Ellersick, Mark Horowitz

In this paper, we present the Tiny Tera: a small packet switch with an aggregate bandwidth of 320Gb/s. The Tiny Tera is a CMOS-based input-queued, fixed-size packet switch suitable for a...

References [1] Scalable Coherent Interface (SCI). ANSI/IEEE Std 1596- (2007)

Anant Agarwal, Richard Simoni, Mark Horowitz, David Bailey, John Barton, Thomas Lasinski, ...

Mary Vernon, who taught the computational science class out of which this work grew. We also thank Profs. David Wood and Jim Goodman for supporting this work, and finally Babak Falsafi and Alain...

SIGNALING: Overview and Limitations (2007)

Mark Horowitz, Stefanos Sidiropoulos

integration levels increase with advances in fabrication technology, so must off-chip data bandwidth. Although this goal is challenging, circuit design techniques will enable bandwidth to scale....

Limitations (2007)

Mark Horowitz, Stefanos Sidiropoulos, ...

Improving fabrication technology enables not only the scaling of on-chip gate speeds but also the data rate of inter-chip communication interfaces. Simple low latency off-chip interfaces...

Designing Graphics Architectures (2007)

Matthew Eldridge, Pat Hanrahan, Mark Horowitz

Communication forms the backbone of parallel graphics, allowing multiple functional units to cooperate to render images. The cost of this communication, both in system resources and money, is the...

CMOS Transceiver with Baud Rate Clock Recovery for Optical Interconnects (2007)

Samuel Palermo, Hae-chang Lee, Mark Horowitz

An efficient baud rate clock and data recovery architecture is applied to a double sampling/integrating front-end receiver for optical interconnects. Receiver performance is analyzed and projected...

Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors (2007)

Mark Horowitz, Margaret Martonosi, Todd C. Mowry, Michael D. Smith

Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to address this problem...

A 90nm CMOS 16Gb/s transceiver for optical interconnects (2007)

Palermo, Samuel, Emami-Neyestanak, Azita, Horowitz, Mark

An optical interconnect transceiver incorporates a 4-tap FIR TX to reduce VCSEL average current and an integrating/double-sampling RX to eliminate the need for a bit-rate TIA. A dual-loop CDR with...

Comparing memory systems for chip multiprocessors (2007)

Jacob Leverich, Hideho Arakida, Alex Solomatnikov, Amin Firoozshahian, Mark Horowitz, Christos Kozyrakis

There are two basic models for the on-chip memory in CMP systems: hardware-managed coherent caches and software-managed streaming memory. This paper performs a direct comparison of the two models...

Replica compensated linear regulators for supply-regulated phase-locked loops (2006)

Elad Alon, Jaeha Kim, Sudhakar Pamarti, Ken Chang, Mark Horowitz

Supply-regulated phase-locked loops rely upon the VCO voltage regulator to maintain a low sensitivity to supply noise and hence low overall jitter. By analyzing regulator supply rejection,...

Soft Error Resilience of Probabilistic Inference Applications (2006)

Vicky Wong, Mark Horowitz

With shrinking device size and increasing complexity, soft errors are becoming an issue in the reliability of digital systems. To make efficient robust systems, it is important to...

A new method for design of robust digital circuits (2005)

Dinesh Patil, Sunghee Yun, Seung-jean Kim, Alvin Cheung, Mark Horowitz, Stephen Boyd

As technology continues to scale beyond 100nm, there is a significant increase in performance uncertainty of CMOS logic due to process and environmental variations. Traditional circuit optimization...

Dual photography (2005)

Pradeep Sen, Billy Chen, Gaurav Garg, Stephen R. Marschner, Mark Horowitz, Marc Levoy, ...

Figure 1: (a) Conventional photograph of a scene, illuminated by a projector with all its pixels turned on. (b) After measuring the light transport between the projector and the camera using...

Circuits and techniques for high-resolution measurement of on-chip power supply noise (2005)

Elad Alon, Vladimir Stojanović, Mark Horowitz

A technique for characterizing the cyclically time varying statistical properties and spectrum of power supply noise using only two on-chip samplers is presented. The samplers utilize a...

The implementation of a 2-core multi-threaded Itanium family processor (2005)

Samuel Naffziger, Blaine Stackhouse, Tom Grutkowski, Doug Josephson, Jayen Desai, Elad Alon, ...

The design of the high end server processor code named Montecito incorporated several ambitious goals requiring innovation. The most obvious being the incorporation of two legacy cores...

Synthetic aperture focusing using a shear-warp factorization of the viewing transform (2005)

Vaibhav Vaish, Gaurav Garg, Eino-ville Talvala, Emilio Antunez, Bennett Wilburn, Mark Horowitz, ...

Synthetic aperture focusing consists of warping and adding together the images in a 4D light field so that objects lying on a specified surface are aligned and thus in focus, while objects lying off...

CMOS transceiver with baud rate clock recovery for optical interconnects (2004)

Emami-Neyestanak, Azita, Palermo, Samuel, Lee, Hae-Chang, Horowitz, Mark

An efficient baud rate clock and data recovery architecture is applied to a double sampling/integrating front-end receiver for optical interconnects. Receiver performance is analyzed and projected...

High-speed videography using a dense camera array (2004)

Bennett Wilburn, Neel Joshi, Vaibhav Vaish, Marc Levoy, Mark Horowitz

We demonstrate a system for capturing multi-thousand frame-per-second (fps) video using a dense array of cheap 30fps CMOS image sensors. A benefit of using a camera array to capture high-speed video...

Burst Mode Packet Receiver Using a Second Order DLL (2004)

Haechang Lee, Chi Ho Yue, Samuel Palermo, Kenneth W. Mai, Mark Horowitz

This paper describes a CDR that can be used to receive optically switched packets. Rather than using fast phase acquisition to lock onto each packet, it uses a second order delay locked loop to...

Synthetic aperture confocal imaging (2004)

Billy Chen, Vaibhav Vaish, Mark Horowitz, Mark Bolas

Figure 1: The techniques in this paper employ two computer-assisted optical effects: synthetic aperture photography and synthetic aperture illumination. On the left, we aim a camera at an array of...

Architecture and Circuit Techniques for a Reconfigurable Memory Block (2004)

Ken Mai, Ron Ho, Elad Alon, Dean Liu, Younggon Kim, Dinesh Patil, ...

This paper describes the architecture and circuits of a reconfigurable memory block, called a mat, for use in such a memory system, whose architecture is more fully described in [1]. The primary...

Scaling Internet routers using optics (2003)

Isaac Keslassy, Shang-tse Chuang, Kyoungsik Yu, David Miller, Mark Horowitz, Olav Solgaard, ...

Routers built around a single-stage crossbar and a centralized scheduler do not scale, and (in practice) do not provide the throughput guarantees that network operators need to make efficient use of...

Implementing an untrusted operating system on trusted hardware (2003)

David Lie, Chandramohan A. Thekkath, Mark Horowitz

Recently, there has been considerable interest in providing “trusted computing platforms” using hardware, TCPA and Palladium being the most publicly visible examples. In this paper we discuss...

Efficient on-chip global interconnects (2003)

Ron Ho, Ken Mai, Mark Horowitz

We present circuits for a high-efficiency low-swing interconnect scheme suitable for the Smart Memories reconfigurable architecture. By using a separate supply, global clocking, and...

Implementing an Untrusted Operating System on Trusted Hardware (2003)

David Lie, Chandramohan A. Thekkath, Mark Horowitz

Recently, there has been considerable interest in providing "trusted computing platforms" using hardware -- TCPA and Palladium being the most publicly visible examples. In this paper we...

Scaling Internet Routers Using Optics (2003)

Isaac Keslassy, Shang-Tse Chuang, Kyoungsik Yu, David Miller, Mark Horowitz, Olav Solgaard, ...

Routers built around a single-stage crossbar and a centralized scheduler do not scale, and (in practice) do not provide the throughput guarantees that network operators need to make efficient use of...

Scaling Internet routers using optics (extended version) (2003)

Isaac Keslassy, Shang-Tse Chuang, Kyoungsik Yu, David Miller, Mark Horowitz, Olav Solgaard, ...

Routers built around a single-stage crossbar and a centralized scheduler do not scale, and (in practice) do not provide the throughput guarantees that network operators need to make efficient use of...

Scaling Internet Routers Using Optics (extended version), Stanford HPNG (2003)

Isaac Keslassy, Shang-tse Chuang, Kyoungsik Yu, David Miller, Mark Horowitz, Olav Solgaard, ...

This paper is an extended version of [1]. In conjunction with “A load-balanced switch with an arbitrary number of linecards ” [2], it replaces Abstract — Routers built around a single-stage...

A 1.6 Gb/s, 3 mW CMOS receiver for optical communication (2002)

Emami-Neyestanak, Azita, Liu, Dean, Keeler, Gordon, Helman, Noah, Horowitz, Mark

A 1.6 Gb/s receiver for optical communication has been designed and fabricated in a 0.25-μm CMOS process. This receiver has no transimpedance amplifier and uses the parasitic capacitor of the...

On-Chip Instruction Caches for High Performance Processors (2002)

Agarwal, Anant, Chow, Paul, Horowitz, Mark, Acken, John, Salz, Arturo

Continued increases in clock rates of VLSI processors demand a reduction in the frequency of expensive off-chip memory references. Without such a reduction, the chip crossing time and the constraints...

A 20 MIPS Peak Microprocessor with On-Chip Cache (2002)

Horowitz, Mark, Hennessy, John L., Chow, Paul, Gulak, P. G., Acken, John M.

MIPS-X is a 32b microprocessor with an on-chip 16Kb instruction cache. The chip is implemented in a 2 micron drawn channel length, 2-layer metal CMOS technology, contains 150K transistors in an 8mm...

The MIPS-X Microprocessor (2002)

Horowitz, Mark, Chow, Paul

MIPS-X is the successor to the MIPS project at Stanford University. Like its predecessor, it is a single chip VLSI processor that uses a simplified instruction set, pipelining and a software code...

ATUM: A New Technique for Capturing Address Traces Using Microcode (2002)

Agarwal, Anant, Sites, Richard L., Horowitz, Mark

Trace-driven simulation is often used in the design of computer systems, especially caches and translation lookaside buffers. Capturing address traces to drive such simulations has been problematic,...

A 1.6Gb/s, 3mW Integrating CMOS Optical Receiver with Hybrid GaAs Photo-Detectors (2002)

Azita Emami-Neyestanak, Dean Liu, Gordon Keeler, Noah Helman, Mark Horowitz

the bit rate is limited by the sample bandwidth and input sensitivity of the sense amp, and not the gain-bandwidth of an amplifier. Higher data rates can be achieved using more samplers, clocked...

Using Texture Mapping with Mipmapping to Render a VLSI Layout (2001)

Jeff Solomon, Mark Horowitz

This paper presents a method of using texture mapping with mipmapping to render a VLSI layout. Texture mapping is used to save already rasterized areas of the layout from frame to frame, and to take...

Symbolic Simulation Using Automatic Abstraction of Internal Node Values (2001)

James Christopher Wilson, David L. Dill, Mark Horowitz, Randal E. Bryant

In recent years, verification has emerged as a major portion of the effort in designing large, complex chips. Simulation-based methods such as directed and random testing are the most widely used veri...

FLASH vs. (Simulated) FLASH: Closing the Simulation Loop (2000)

Jeff Gibson, Robert Kunz, David Ofelt, Mark Horowitz, John Hennessy, Mark Heinrich

Simulation is the primary method for evaluating computer systems during all phases of the design process. One significant problem with simulation is that it rarely models the system exactly, and...

A 0.3µm CMOS 8-Gb/s 4-PAM Serial Link Transceiver (2000)

Ramin Farjad-rad, Mark Horowitz, Thomas Lee

An 8-Gb/s 0.3-μm CMOS transceiver uses multilevel signaling (4-PAM) and transmit pre-shaping in combination with receive equalization to reduce ISI due to channel lowpass effects. High on-chip...

Timing Analysis Including Clock Skew (1999)

David Harris, Mark Horowitz, Dean Liu

Clock skew is an increasing concern for high-speed circuit designers. Circuit designers use transparent latches and skew-tolerant domino circuits to hide clock skew from the critical path...

GAD: A 12-GS/s CMOS 4-bit A/D Converter for an Equalized Multi-Level Link (1999)

William Ellersick, Mark Horowitz, William Dally

A 4-bit 12-GSample/sec A/D converter (GAD) has been fabricated in a 0.25-μm CMOS process to investigate the design of an equalized multi-level link. Clocked differential amplifiers were used to...

On-Chip Instruction Caches for High Performance Processors (1998)

Agarwal, Anant, Chow, Paul, Horowitz, Mark, Acken, John, Salz, Arturo

Continued increases in clock rates of VLSI processors demand a reduction in the frequency of expensive off-chip memory references. Without such a reduction, the chip crossing time and the constraints...

An Analytical Cache Model (1998)

Agarwal, Anant, Horowitz, Mark, Hennessy, John

Trace driven simulation and hardware measurement are the techniques most often used to obtain accurate performance figures for caches. The former requires a large amount of simulation time to...

Architectural Tradeoffs in the Design of MIPS-X (1998)

Chow, Paul, Horowitz, Mark

The design of a RISC processor requires a careful analysis of the tradeoffs that can be made between hardware complexity and software. As new generations of processors are built to take advantage of...

A Static RAM as a Fault Model Evaluator (1998)

Acken, John M., Horowitz, Mark

This investigation considers the relationship between the physical failures that occur during fabrication and the resulting faulty behavior of the circuit. Fault models are used to describe the...

Generating Incremental VLSI Compaction Spacing Constraints (1998)

Carpenter, Clyde W., Horowitz, Mark

This paper describes using adjacency lists to incrementally generate design rule spacing constraints. The algorithm generates the smallest complete set of constraints for a design, yielding fast...

REDS: Resistance Extraction for Digital Simulation (1998)

Stark, Don, Horowitz, Mark

This paper describes an extractor designed to produce resistance values for use in digital circuit simulation. REDS avoids resistance extraction on most nets in a design using a simple filter based...

Using Texture Mapping With Mipmapping to Render a VLSI Layout (1998)

Solomon, Jeff, Horowitz, Mark

This paper presents a method of using texture mapping with mip-mapping to render a VLSI layout. Texture mapping is used to save already rasterized areas of the layout from frame to frame, and to take...

Interconnect Scaling Implications for CAD (1998)

Ho, Ron, Mai, Ken, Kapadia, Hema, Horowitz, Mark

Interconnect scaling to deep submicron processes presents many challenges to today's CAD flows. A recent analysis by Sylvester and Keutzer examined the behavior of average length wires under scaling,...

The Future of Wires (1998)

Horowitz, Mark, Ho, Ron, Mai, Ken

This chapter examines wire scaling and the capabilities of future wiring systems in more detail to better understand the constraints of these systems. If an existing circuit is scaled to a new...

Applications of On-Chip Samplers for Test and Measurement of Integrated Circuits (1998)

Ho, Ron, Amrutur, Bharadwaj, Mai, Ken, Wilburn, Bennett, Mori, Toshihiko, Horowitz, Mark

Displaying the real-time behavior of critical signals on VLSI chips is difficult and can require expensive test equipment. The authors present a simple sampling technique to display the analog...

Smart Memory Systems: Polymorphous Computing Architectures (1998)

Horowitz, Mark

We describe a new universal computing element for future embedded applications. Our polymorphic architecture contains coarse-grain reconfigurable processors, memory, and network. Each application is...

The Tiny Tera: A Packet Switch Core (1998)

McKeown, Nick, Izzard, Martin, Mekkittikul, Adisak, Ellersick, Bill, Horowitz, Mark

The objective is to design and build a small, high-bandwidth switch.

Informing Memory Operations: Memory Performance Feedback Mechanisms and Their Applications (1998)

Mark Horowitz, Margaret Martonosi, Todd C. Mowry, Michael D. Smith

Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to address this problem...

A 2 Gb/s/pin CMOS asymmetric serial link (1998)

William Ellersick, Shang-tse Chuang, Stefanos Sidiropoulos, Mark Horowitz

The design of an asymmetric serial link poses a number of tradeoffs for the designer. This paper describes measurements from a 0.25μm CMOS test chip which show that a properly designed asymmetric...

Applications of on-chip samplers for test and measurement of integrated circuits (1998)

Ron Ho, Bharadwaj Amrutur, Ken Mai, Bennett Wilburn, Toshihiko Mori, Mark Horowitz

Displaying the real-time behavior of critical signals on VLSI chips is difficult and can require expensive test equipment. We present a simple sampling technique to display the analog waveforms of...

An Equalization Scheme for 10Gb/s 4-PAM Signalling over Long (1997)

Ramin Farjad-rad, Kevin Yu, Bill Ellersick, Mark Horowitz, Thomas H. Lee

In this paper, we present simulation results for a 10Gb/s serial link. The serial link is composed of a transmitter, copper coaxial cable, and a receiver. Multilevel signalling (4-PAM) is...

A 700-Mb/s/pin CMOS Signaling Interface Using Current Integrating Receivers (1997)

Stefanos Sidiropoulos, Mark Horowitz

Abstract — A high speed CMOS signaling interface for application in multiprocessor interconnection networks has been developed. The interface utilizes 1-V push–pull drivers, a delay line...

A Semi-Digital Dual Delay Locked Loop (1997)

Stefanos Sidiropoulos, Mark Horowitz

This paper describes a dual Delay Locked Loop architecture which achieves low jitter, unlimited (modulo 2π) phase shift and large operating range. The architecture employs a core loop to generate...

Hardware/Software Codesign of the Stanford FLASH Multiprocessor (1997)

Mark Heinrich, David Ofelt, Mark Horowitz, John Hennessy

Hardware/software codesign is a methodology for solving design problems in systems with processors or embedded controllers where the design requirements mandate a functionality and performance level...

An Equalization Scheme for 10Gb/s 4-PAM Signaling over Long Cables (1997)

Kevin Yu, Bill Ellersick, Mark Horowitz, Thomas H. Lee

In this paper, we present simulation results for a 10Gb/s serial link. The serial link is composed of a transmitter, copper coaxial cable, and a receiver. Multilevel signalling (4-PAM) is used along...

The Tiny Tera: A Packet Switch Core (1997)

Nick McKeown, Martin Izzard, Adisak Mekkittikul, William Ellersick, Mark Horowitz

In this paper, we present the Tiny Tera: a small packet switch with an aggregate bandwidth of 320Gb/s. The Tiny Tera is a CMOS-based input-queued, fixed-size packet switch suitable for a wide range...

Hardware Fault Containment in Scalable Shared-Memory Multiprocessors (1997)

Dan Teodosiu, Joel Baxter, Kinshuk Govil, John Chapin, Mendel Rosenblum, Mark Horowitz

Current shared-memory multiprocessors are inherently vulnerable to faults: any significant hardware or system software fault causes the entire system to fail. Unless provisions are made to limit the...

Energy dissipation in general purpose processors (1996)

Ricardo Gonzalez, Mark Horowitz

In this paper we investigate how super-scalar issue and pipelining affect the energy-delay product of general purpose processors. We show that for idealized machines pipelining gives approximately a...

Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors (1996)

Mark Horowitz, Margaret Martonosi, Todd C. Mowry, Michael D. Smith

Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to address this problem...

Regenerative Feedback Repeaters for Programmable Interconnections, ISSCC (1995)

Ivo Dobbelaere, Mark Horowitz, Abbas El Gamal

Abstract—The use of regenerative feedback repeaters to reduce the delay in programmable interconnections is described. A static, complementary regenerative feedback (CRF) repeater is proposed. This...

Informing Loads: Enabling Software To Observe And React To Memory Behavior (1995)

Mark Horowitz, Margaret Martonosi, Todd C. Mowry, ...

Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to address this problem...

Interleaving: A multithreading technique targeting multiprocessors and workstations (1994)

James Laudon, Anoop Gupta, Mark Horowitz

There is an increasing trend to use commodity microprocessors as the compute engines in large-scale multiprocessors. However, given that the majority of the microprocessors are sold in the...

Low-Power Digital Design (1994)

Mark Horowitz, Thomas Indermaur, Ricardo Gonzalez

Recently there has been a surge of interest in low-power devices and design techniques. While many papers have been published describing power-saving techniques for use in digital systems, trade-offs...

The Stanford FLASH Multiprocessor (1994)

Jeffrey Kuskin, David Ofelt, Mark Heinrich, John Heinlein, Richard Simoni, ...

The FLASH multiprocessor efficiently integrates support for cache-coherent shared memory and high-performance message passing, while minimizing both the hardware and software overhead. Each node in...

Mark Horowitz, Thomas Indermaur, and Ricardo Gonzalez (1994)

Mark Horowitz, Thomas Indermaur, Ricardo Gonzalez

this paper. voltage regulation, is an obvious design decision if low power is an objective. Removing circuits that dissipate static power and powering down inactive blocks are other examples of how...

Evaluation of Charge Recovery Circuits and Adiabatic Switching for Low Power CMOS Design (1994)

Thomas Indermaur, Mark Horowitz

A technique called charge recovery or adiabatic switching has been proposed to trade speed for energy consumption in CMOS circuits. We compare the speed/power of charge recovery to standard CMOS...

The Performance Impact of Flexibility in the Stanford FLASH Multiprocessor (1994)

Mark Heinrich, Jeffrey Kuskin, David Ofelt, John Heinlein, Jaswinder Pal Singh, Richard Simoni, ...

Several multiprocessors have been proposed that offer programmable implementations of scalable cache coherence as well as support for message passing. In the FLASH machine, flexibility is obtained by...

The design of a high-performance cache controller: A case study in asynchronous synthesis (1993)

Steven M. Nowick, Mark E. Dean, David L. Dill, Mark Horowitz

Because of ever-increasing demands on digital system performance, there is a need for new architectures which fully exploit the capabilities of contemporary VLSI technology. Asynchronous or...

Piecewise Linear Models for Rsim (1993)

Russell Kao, Mark Horowitz

Rsim is a switch-level simulator which can simulate large digital MOS integrated circuits with speedups of over 3 orders of magnitude over SPICE. Unfortunately, Rsim's simple switched-resistor...

Architectural and Implementation Tradeoffs in the Design of Multiple-Context Processors (1992)

James Laudon, Anoop Gupta, Mark Horowitz

Multiple-context processors have been proposed as an architectural technique to mitigate the effects of large memory latency in multiprocessors. In this paper, we examine two schemes for implementing...

Efficient Superscalar Performance Through Boosting (1992)

Michael D. Smith, Mark Horowitz, Monica S. Lam

The foremost goal of superscalar processor design is to increase performance through the exploitation of instruction-level parallelism (ILP). Previous studies have shown that speculative execution is...

Self-Timed Logic Using Current-Sensing Completion Detection (CSCD) (1991)

Mark E. Dean, David L. Dill, Mark Horowitz

Abstract. This article proposes a completion-detection method for efficiently implementing Boolean functions as self-timed logic structures. Current-Sensing Completion Detection, CSCD, allows...

Dynamic Pointer Allocation for Scalable Cache Coherence Directories (1991)

Richard Simoni, Mark Horowitz

The efficient implementation of cache consistency is one of the primary challenges in building shared memory multiprocessors with hundreds or thousands of processors. While directory-based coherency...

An Evaluation of Directory Schemes for Cache Coherence (1988)

Anant Agarwal, Richard Simoni, John Hennessy, Mark Horowitz

The problem of cache coherence in shared-memory multiprocessors has been addressed using two basic approaches: directory schemes and snoopy cache schemes. Directory schemes have been given less...

Architecture and inherent robustness of a bacterial cell-cycle control system

Shen, Xiling, Collier, Justine, Dill, David, Shapiro, Lucy, Horowitz, Mark, McAdams, Harley H.

A closed-loop control system drives progression of the coupled stalked and swarmer cell cycles of the bacterium Caulobacter crescentus in a near-mechanical step-like fashion. The cell-cycle control...

Osteoimmunology: Interactions of the Bone and Immune System

Lorenzo, Joseph, Horowitz, Mark, Choi, Yongwon

Bone and the immune system are both complex tissues that respectively regulate the skeleton and the body’s response to invading pathogens. It has now become clear that these organ systems often...