James C. French

Exploiting a Controlled Vocabulary to Improve Collection Selection and Retrieval Effectiveness (2008)

James C. French

Vocabulary incompatibilities arise when the terms used to index a document collection are largely unknown, or at least not well-known to the users who eventually search the collection. No matter how...

Multiple viewpoints: A strategy for searching multimedia content (2008)

James C. French, A. C. Chapin, W. N. Martin

Multiple viewpoint systems are an approach to information retrieval that takes advantage of having more than one source of judgments on a body of information. Multiple viewpoints can...

Using Query Mediators for Distributed Searching in Federated Digital Libraries (2008)

Naomi Dushay, James C. French, Carl Lagoze

We describe an architecture and investigate the characteristics of distributed searching in federated digital libraries. We introduce the notion of a query mediator as a digital library service...

Data Engineering (2007)

Judith Bayard Cushing, David Hansen, David Maier, Calton Pu, James C. French, ...

Scientific applications and databases rarely interoperate easily. That is, scientific researchers who use computers expend significant time and effort writing special procedures to use their program...

A Software Toolkit for Prototyping Distributed Applications (Preliminary Report) (2007)

James C. French, Charles L. Viles

We describe a set of software tools for rapid prototyping of distributed applications. The toolkit is based upon a model of computational agents that are distributed in a developer specified manner...

An Archive Service with Persistent Naming for Objects (2007)

John Jones, James C. French

Wide-area systems for information storage and retrieval are rapidly gaining in popularity. Examples include FTP (File Transfer Protocol), Gopher, and World Wide Web (WWW) archives of many types of...

Using N-grams to Process Hindi Queries with Transliteration Variations (2007)

Anand Natrajan, Allison L. Powell, James C. French

Retrieval systems based on N-grams have been used as alternatives to word-based systems. N-grams offer a language-independent technique that allows retrieval based on portions of words. A query that...

Applications of Approximate Word Matching in Information Retrieval (2007)

James C. French, Allison L. Powell, Eric Schulman

As more online databases are integrated into digital libraries, the issue of quality control of the data becomes increasingly important, especially as it relates to the effective retrieval of...

ap owell @ cnri.rest on.va.us (2007)

James C. French, Fredric Gey, Natalia Perelman

Vocabulary incompatibilities arise when the terms used to index a document collection are largely unknown, or at least not well-known to the users who eventually search the collection. No matter how...

Extending the Vocabulary Available for Cross-Disciplinary Searching of Earth Science Data (2007)

James C. French, Worthy N. Martin, Lola M. Olsen

This paper discusses the development of a prototype search assistant designed to aid cross-disciplinary searching of Earth science data. The goal of the project is to provide aids to help searchers...

Ensuring Retrieval Effectiveness in Distributed Digital Libraries (2007)

James C. French, L. Viles

We find that dissemination of collection-wide information (CWI) in a distributed collection of documents is needed to...

The impact of database selection on distributed searching (2007)

Allison L. Powell, James C. French, Margaret Connell

The proliferation of online information resources increases the importance of effective and efficient distributed searching. Distributed searching is cast in three parts – database...

Flycasting: On the Fly Broadcasting (2007)

James C. French, David B. Hauver

In recent years, the popularity of online radio has exploded. This new entertainment medium affords an opportunity not available to conventional broadcast radio: the instantaneous listening audience...

Inverse Document Frequency and Web Search Engines (2007)

Kevin Prey, James C. French, Allison L. Powell, Charles L. Viles

Full text searching over a database of moderate size often uses the inverse document frequency, idf = log(N/df), as a component in term weighting functions used for document...

Integrating Operational Specification and Performance Modeling for Digital-System Design. Dissertation presented to the faculty of the School of Engineering and Applied Science at the University of Virginia (2007)

W. Miksad, James C. French, James H. Aylor, ...

Words do not suffice to express my gratitude towards my advisors: Jim Cohoon and Ron Waxman. Thanks for your support, especially when it counted the most. Thanks for your patience, understanding and...

Using Clustering Strategies for Creating Authority Files (2006)

James C. French, Allison L. Powell, Eric Schulman

As more online databases are integrated into digital libraries, the issue of quality control of the data becomes increasingly important, especially as it relates to the effective retrieval of...

Content Locality in Distributed Digital Libraries (2006)

Charles L. Viles, James C. French

This paper introduces the notion of content locality in distributed document collections. Content locality is the degree to which content-similar documents are colocated in a distributed collection....

An empirical investigation of the scalability of a multiple viewpoint CBIR system (2004)

James C. French, Xiangyu Jin, W. N. Martin

Our work in content-based image retrieval (CBIR) relies on content-analysis of multiple representations of an image which we term multiple viewpoints or channels. The conceptual idea is to...

Multiple Viewpoints as an Approach to Digital Library Interfaces (2004)

James C. French, A. C. Chapin, Worthy N. Martin

We introduce a framework of multiple viewpoint systems for describing and designing systems that use more than one representation or set of relevance judgments on the same collection. A viewpoint is...

Using Multiple Image Representations to Improve the Quality of Content-Based Image Retrieval (2003)

James C. French, Worthy N. Martin, Xiangyu Jin

Content-based image retrieval (CBIR)... considerable study since the early 90's. Much effort has gone into characterizing the "content" of an image for the purpose of subsequent...

Integrating Multiple Multi-Channel CBIR Systems (Extended Abstract) (2003)

James C. French, Xiangyu Jin, W. N. Martin

James C. French James V. S. Watson Xiangyu Jin W. N. Martin Department of Computer Science University of Virginia Charlottesville, VA 01-434-982-2213 {french,jvw3n,xj3a,wnm}@cs.virginia.edu This work...

Comparing the performance of collection selection algorithms (2003)

Allison L. Powell, James C. French

The proliferation of online information resources increases the importance of effective and efficient information retrieval in a multicollection environment. Multicollection searching is cast in...

An exogenous approach for adding multiple image representations to content-based image retrieval systems (2003)

James C. French, X. Jin, W. N. Martin

Content-based image retrieval (CBIR) uses features that can be extracted from the images themselves. In previous work we have shown that using more than one representation of the images in a...

Obtaining language models of web collections using query-based sampling techniques (2002)

Gary A. Monroe, James C. French, Allison L. Powell

In the context of information retrieval, traditional collection selection algorithms have been widely studied. These algorithms utilize language models, a representation of the contents of each text...

A Qualitative Examination of Content-Based Image Retrieval Behavior using Systematically Modified Test Images (2002)

James C. French, Worthy N. Martin

We describe the outcome of an effort to understand the behavior of content-based image retrieval (CBIR) technology by examining the behavior of a CBIR system in response to carefully constructed...

Network-Aided Concurrency Control in Distributed Databases (2002)

W. Miksad, James C. French, Jack W. Davidson, Ronald D. Williams, Rashmi Srinivasa, ...

Concurrency control is an integral part of a database system. Devising a concurrency control technique that has a low lost opportunity cost and a low restart cost is a hard problem. The...

Exploiting Manual Indexing to Improve Collection Selection and Retrieval Effectiveness (2002)

James C. French, Allison L. Powell, Fredric Gey, Natalia Perelman

Vocabulary incompatibilities arise when the terms used to index a document collection are largely unknown, or at least not well-known to the users who eventually search the collection.

Flycasting: Using Collaborative Filtering to Generate a Playlist for Online Radio (2001)

David B. Hauver, James C. French

In recent years, the popularity of online radio has exploded. This new entertainment medium affords an opportunity not available to conventional broadcast radio: the instantaneous listening audience...

Determining Stopping Criteria in the Generation of Web-Derived Language Models (2000)

Gary A. Monroe, David R. Mikesell, James C. French

In this work, we present a small-scale evaluation of two query-based sampling techniques for building language models, using a database comprised of world-wide web documents. We propose a metric by...

The impact of database selection on distributed searching (2000)

Allison L. Powell, James C. French, Margaret Connell

The proliferation of online information resources increases the importance of effective and efficient distributed searching. Distributed searching is cast in three parts – database...

The Impact of Database Selection on Distributed Searching (2000)

Allison L. Powell, James C. French, Jamie Callan, Margaret Connell, Charles L. Viles

The proliferation of online information resources increases the importance of effective and efficient distributed searching. Distributed searching is cast in three parts -- database selection, query...

Growth and Server Availability of the NCSTRL Digital Library (2000)

Allison Powell, James C. French

This paper reports on measurements of the NCSTRL digital library taken over a two-year period. We report the growth of the system along two dimensions: number of participating institutions and number...

Using Clustering Strategies for Creating Authority Files (2000)

James C. French, Allison L. Powell, Eric Schulman

As more online databases are integrated into digital libraries, the issue of quality control of the data becomes increasingly important, especially as it relates to the effective retrieval of...

Evaluating Astronomical Institutional Productivity (2000)

Eric Schulman, James C. French, Allison L. Powell

We used the Astrophysics Data System (ADS) to measure the productivity of the 38 institutions studied by Abt (1993, PASP, 105, 794) during the period 1985 to 1994. The ADS database contains 84,822...

Predicting Indexer Performance in a Distributed Digital Library (1999)

Naomi Dushay, James C. French, Carl Lagoze

Resource discovery in a distributed digital library poses many challenges, one of which is how to choose search engines for query distribution, given a query and a set of search engines. This paper...

Using Query Mediators for Distributed Searching in Federated Digital Libraries (1999)

Naomi Dushay, James C. French, Carl Lagoze

We describe an architecture and investigate the characteristics of distributed searching in federated digital libraries. We introduce the notion of a query mediator as a digital library service...

A Characterization Study of NCSTRL Distributed Searching (1999)

Naomi Dushay, James C. French, Carl Lagoze

NCSTRL, the Networked Computer Science Technical Reference Library, is a federated digital library based on the Dienst architecture. One aspect of this architecture is distributed searching, with...

Effective and Efficient Automatic Database Selection (1999)

James C. French, Allison L. Powell, Jamie Callan

We examine a class of database selection algorithms that require only document frequency information. The CORI algorithm is an instance of this class of algorithms. In previous work, we showed that...

Comparing the Performance of Database Selection Algorithms (1999)

James C. French, Allison L. Powell, Jamie Callan, Charles L. Viles, Travis Emmitt, Kevin J. Prey, ...

We compare the performance of two database selection algorithms reported in the literature. Their performance is compared using a common testbed designed specifically for database selection...

Metrics for Evaluating Database Selection Techniques (1999)

James C. French, Allison L. Powell

The increasing availability of online databases and other information resources in digital libraries has created the need for efficient and effective algorithms for selecting databases to search. A...

Personalized Information Environments: An Architecture for Customizable Access to Distributed Digital Libraries (1999)

James C. French, Charles L. Viles

We describe the conceptual architecture of a Personalized Information Environment or "PIE". A PIE allows unified, highly customizable access to distributed information resources by...

Predicting Indexer Performance in a Distributed Digital Library (1999)

Naomi Dushay, James C. French, Carl Lagoze

Resource discovery in a distributed digital library poses many challenges, one of which is how to choose search engines for query distribution, given a query and a set of search engines. This paper...

Comparing the performance of database selection algorithms (1999)

James C. French, Allison L. Powell, Jamie Callan, Charles L. Viles, Travis Emmitt, Kevin J. Prey, ...

We compare the performance of two database selection algorithms reported in the literature. Their performance is compared using a common testbed designed specifically for database selection...

Personalized Information Environments: An Architecture for Customizable Access to Distributed Digital Libraries (1999)

James C. French, Charles L. Viles

We describe the conceptual architecture of a Personalized Information Environment or "PIE". A PIE allows unified, highly customizable access to distributed information resources by providing...

Evaluating Database Selection Techniques: A Testbed and Experiment (1998)

James C. French, Allison L. Powell, Charles L. Viles, Travis Emmitt, Kevin J. Prey

We describe a testbed for database selection techniques and an experiment conducted using this testbed. The testbed is a decomposition of the TREC/TIPSTER data that allows analysis of the data along...

The Potential to Improve Retrieval Effectiveness with Multiple Viewpoints (1998)

Allison Powell, James C. French

We propose that providing multiple viewpoints of a document collection and allowing users to move among these viewpoints during a search or browse session will facilitate the location of useful...

Scalable, parallel, scientific databases (1998)

John L. Pfaltz, Russell F. Haddleton, James C. French

Large scientific applications which rely on highly parallel computational analysis require highly parallel data access. We describe an object-oriented, scientific database system that...

Scalable, Parallel, Scientific Databases (1998)

John L. Pfaltz, Russell F. Haddleton, James C. French

Large scientific applications which rely on highly parallel computational analysis require highly parallel data access. We describe an object-oriented, scientific database system that achieves nearly...

Applications of Approximate Word Matching in Information Retrieval (1997)

James C. French, Allison L. Powell, Eric Schulman

As more online databases are integrated into digital libraries, the issue of quality control of the data becomes increasingly important, especially as it relates to the effective retrieval of...

Selection of Distance Metrics and Feature Subsets for k-Nearest Neighbor Classifiers (1997)

Allen L. Barker, Donald E. Brown, John L. Pfaltz, ...

The k-nearest neighbor (kNN) classifier is a popular and effective method for associating a feature vector with a unique element in a known, finite set of classes. A common choice for the distance...

The Sociology of Astronomical Publication Using ADS and ADAMS (1997)

Eric Schulman, James C. French, Allison L. Powell, Stephen S. Murray, Guenther Eichhorn, Michael J. Kurtz

We use the NASA Astrophysics Data System database of astronomical abstracts in seven major astronomy journals to study trends in astronomical publication over the last twenty years. Two of the most...

Automating the Construction of Authority Files in Digital Libraries: A Case Study (1997)

James C. French, Allison L. Powell, Eric Schulman, John L. Pfaltz

The issue of quality control has become increasingly important as more online databases are integrated into digital libraries. This can have a dramatic effect on the search effectiveness of an...

Applying Hypertext Structures To Software Documentation (1997)

James C. French, John C. Knight, Allison L. Powell

Software documentation represents a critical resource to the successful functioning of many enterprises. However, because it is static, documentation often fails to meet the needs of the many...

Applying hypertext structures to software documentation (1997)

James C. French, John C. Knight, Allison L. Powell

Software documentation represents a critical resource to the successful functioning of many enterprises. However, because it is static, documentation often fails to meet the needs of the...

A Classification Approach to Boolean Query Reformulation (1997)

James C. French, Donald E. Brown, Nam-ho Kim

One of the difficulties in using current Boolean-based information retrieval systems is that it is hard for a user, especially a novice, to...

Using N-grams to Process Hindi Queries with Transliteration Variations (1997)

Allison L. Powell, James C. French

Retrieval systems based on N-grams have been used as alternatives to word-based systems. N-grams offer a language-independent technique that allows retrieval based on portions of words. A query that...

Library Access, Search and Retrieval (LASR) Pilot- (Final Report) (1996)

James C. French, Glen L. Bull, Martha R. Tarrant, Allison L. Powell, ...

The Global Change Data and Information System (GCDIS) is a cooperative effort among eight United States government agencies and other organizations to provide public Internet access to...

Exploiting Coauthorship to Infer Topicality in a Digital Library of Computer Science Technical Reports (1996)

James C. French, Charles L. Viles

We propose a method of mapping the topical content of distributed digital libraries and demonstrate the technique using data from the Networked Computer Science Technical Report Library (NCSTRL)...

TREC-4 Experiments using Drift (1996)

Charles L. Viles, James C. French

Drift is a prototype, vector space based, information retrieval system in development at the University of Virginia. The system is designed to do experiments in distributed, dynamic information...

A systematic approach to creating and maintaining software documentation (1996)

Allison L. Powell, James C. French, John C. Knight

Current software documentation is difficult to write and seldom meets the varying needs of its users. We propose that by considering different users and applying information retrieval...

APPROVAL SHEET (1995)

Ambar Sarkar, James C. French, James H. Aylor, Dean Richard, ...

Jim Cohoon and Ron Waxman. Thanks for your support, especially when it counted the most. Thanks for your patience, understanding and constructive criticisms. Thanks to my committee members: John...

Dissemination of Collection Wide Information in a Distributed Information Retrieval System (1995)

Charles L. Viles, James C. French

We find that dissemination of collection wide information (CWI) in a distributed collection of documents is needed to achieve retrieval effectiveness comparable to a centralized collection. Complete...

On the Update of Term Weights in Dynamic Information Retrieval Systems (1995)

Charles Viles, James C. French

Using the vector space information retrieval model, we show that the update of term weights under document insertions is computationally expensive for weighting schemes that use collection statistics...

TREC-4 Experiments using DRIFT (1995)

Charles L. Viles, James C. French

Drift is a prototype, vector space based, information retrieval system in development at the University of Virginia. The system is designed to do experiments in distributed, dynamic information...

Dissemination of Collection Wide Information in a Distributed Information Retrieval System (1995)

Charles Viles, James C. French

We find that dissemination of collection wide information (CWI) in a distributed collection of documents is needed to achieve retrieval effectiveness comparable to a centralized collection. Complete...

Legion: The Next Logical Step Toward a Nationwide Virtual Computer (1994)

Andrew S. Grimshaw, William A. Wulf, James C. French, ...

The coming of giga-bit networks makes possible the realization of a single nationwide virtual computer comprised of a variety of geographically distributed high-performance machines and workstations....

A Synopsis of the Legion Project (1994)

Andrew S. Grimshaw, William A. Wulf, James C. French, ...

The coming of giga-bit networks makes possible the realization of a single nationwide virtual computer comprised of a variety of geographically distributed high-performance machines and workstations....

John F. Karpovich Andrew S. Grimshaw James C. French July 22, 1994 Object-Oriented Programming Systems, Languages, and Applications, (1994)

John F. Karpovich, Andrew S. Grimshaw, James C. French

Scientific applications often manipulate very large sets of persistent data. Over the past decade, advances in disk storage device performance have consistently been outpaced by advances in the...

High Performance Access to Radio Astronomy Data: A Case Study (1994)

John Karpovich, James C. French, Andrew S. Grimshaw

As CPU performance has rapidly improved, increased pressure has been placed on the performance of accessing external data in order to keep up with demand. Increasingly often the I/O subsystem and...

Extensible file systems (ELFS): An object-oriented approach to high performance file I/O (1994)

John F. Karpovich, Andrew S. Grimshaw, James C. French

Scientific applications often manipulate very large sets of persistent data. Over the past decade, advances in disk storage device performance have consistently been outpaced by advances in the...

A Synopsis of the Legion Project: e pluribus unum -- one out of many (1994)

Andrew S. Grimshaw, William A. Wulf, James C. French, Alfred C. Weaver, Paul F. Reynolds, ...

The coming of giga-bit networks makes possible the realization of a single nationwide virtual computer comprised of a variety of geographically distributed high-performance machines and workstations....

Legion: The next logical step toward a nationwide virtual computer (1994)

Andrew S. Grimshaw, William A. Wulf, James C. French, ...

The coming of giga-bit networks makes possible the realization of a single nationwide virtual computer comprised of a variety of geographically distributed high-performance machines and workstations....

High Performance Access to Radio Astronomy Data: A Case Study (1994)

John F. Karpovich, James C. French, Andrew S. Grimshaw

As CPU performance has rapidly improved, increased pressure has been placed on the performance of accessing external data in order to keep up with demand. Increasingly often the I/O subsystem and...

Multiple Inheritance and the Closure of Set Operators in Class Hierarchies (1992)

John L. Pfaltz, James C. French

In this report, we establish essential closures in class hierarchies of database systems that support set operations, such as union and intersection, in their query language. In particular, we...

Performance Measurement of a Parallel Input/Output System for the Intel iPSC/2 Hypercube (1991)

James C. French, Terrence W. Pratt, Mriganka Das

The Intel Concurrent File System (CFS) for the iPSC/2 hypercube is one of the first production file systems to utilize the declustering of large files across numbers of disks to improve I/O...

Scientific Database Management (1990)

John L. Pfaltz, Michael J. Carey, James C. French, Anita K. Jones, ...

On March 12-13, 1990, the National Science Foundation sponsored a two day workshop, hosted by the University of Virginia, at which representatives from the earth, life, and space sciences gathered...

The ADAMS Database Language (1989)

John L. Pfaltz, James C. French, Andrew Grimshaw, Sang H. Son, Paul Baron, Stanley Janet, ...

ADAMS provides a mechanism for applications programs, written in many languages, to define and access common persistent databases. The basic constructs are element, class, set, map, attribute, and...

Implementation of the ADAMS Database System (1989)

James C. French, John L. Pfaltz, Andrew Grimshaw, ...

ADAMS provides a mechanism for applications programs, written in many languages, to define and access common persistent databases. The basic constructs are element, class, set, map, attribute, and...

Scoping Persistent Name Spaces in ADAMS (1988)

John L. Pfaltz, James C. French, Jenona L. Whitlatch

ADAMS is based on five primitive concepts: Attribute, co-domain, element, map, and set. Each instance of these primitives is a named entity to which user references resolve. Thus, the concept of...

Breaking the I/O Bottleneck at the National Radio Astronomy Observatory (NRAO)

John F. Karpovich, Andrew S. Grimshaw, James C. French

This paper discusses our approach and the current NRAO environment in more detail and then presents the details of phase one of the project, including a brief discussion of the file structure chosen,...