C. Lee Giles

Objective Author Bias Acquisition models Results Summary (2009)

Vaclav Petricek, Ingemar J. Cox, Hui Han, Isaac G. Councill, C. Lee Giles, ...

online citation databases with different acquisition methods. The database entries in DBLP are inserted manually while the CiteSeer entries are obtained autonomously. There are advantages and...

Finding a Haystack in Haystacks – Simultaneous Identification of Concepts in Large Bio-Medical Corpora (2009)

Ying Liu, Lucian V. Lita, R. Stefan Niculescu, Prasenjit Mitra, C. Lee Giles

Since nearly all information is now created digitally, large text databases have become more prevalent than ever. Automatically mining information from these databases proves to be a challenge due to...

Probabilistic Models for Discovering E-Communities (2009)

Ding Zhou, Eren Manavoglu, Jia Li, C. Lee Giles, Hongyuan Zha

The increasing amount of communication between individuals in e-formats (e.g. email, Instant messaging and the Web) has motivated computational research in social network analysis (SNA). Previous...

Digital Libraries and Autonomous Citation Indexing (2009)

Steve Lawrence, C. Lee Giles, Kurt Bollacker

The Web is revolutionizing the way researchers access scientific literature; however, scientific literature on the Web is largely disorganized. Autonomous citation indexing can help organize the...

Topic Segmentation with Shared Topic Detection and Alignment of Multiple Documents (2008)

Bingjun Sun, Prasenjit Mitra, Hongyuan Zha, C. Lee Giles, John Yen

Topic detection and tracking [26] and topic segmentation [15] play an important role in capturing the local and sequential information of documents. Previous work in this area usually focuses on...

Activity awareness in collaboratories (2008)

Umer Farooq, Craig H. Ganoe, John M. Carroll, C. Lee Giles

We are investigating the support for activity awareness in the CiteSeer collaboratory. Activity awareness is awareness of collaborators' work that supports performance in complex tasks...

Digital Libraries and Autonomous Citation Indexing (2008)

Steve Lawrence, C. Lee Giles, Kurt Bollacker

The Web is revolutionizing the way researchers access scientific literature; however, scientific literature on the Web is largely disorganized. Autonomous citation indexing can help organize the...

Finding a Haystack in Haystacks – Simultaneous Identification of Concepts in Large Bio-Medical Corpora (2008)

Ying Liu, Lucian V. Lita, R. Stefan Niculescu, Prasenjit Mitra, C. Lee Giles

Since nearly all information is now created digitally, large text databases have become more prevalent than ever. Automatically mining information from these databases proves to be a challenge due to...

ChemXSeer: A Web Search Engine and Repository for e-Chemistry (2008)

C. Lee Giles, Prasenjit Mitra, Karl Mueller, James Z. Wang, Bingjun Sun, Levent Bolelli, ...

Cyberinfrastructure or e-science has become crucial for scientific progress and open source systems have greatly facilitated design and implementation. In chemistry, the growth of data has been...

Network Flow for Collaborative Ranking (2008)

Ziming Zhuang, Silviu Cucerzan, C. Lee Giles

In query-based Web search, a significant percentage of user queries are underspecified, most likely by naive users. Collaborative ranking helps the naive user by exploiting the collective...

ChemXSeer: An e-chemistry web search engine and repository (2008)

C. Lee Giles, Prasenjit Mitra, Karl Mueller, James Z. Wang, Bingjun Sun, Levent Bolelli, ...

With the amount of scientific digital content available online constantly increasing, the community of science has been increasing its efforts towards automatically collecting and organizing such...

1.1 Predicting Noisy Time Series Data (2008)

C. Lee Giles, Steve Lawrence, Ah Chung Tsoi

Financial forecasting is an example of a signal processing problem which is challenging due to small sample sizes, high noise, non-stationarity, and non-linearity. Neural networks have been very...

Analysis of lexical signatures for improving information persistence on the World Wide Web (2008)

David M. Pennock, C. Lee Giles, Robert Krovetz

A lexical signature (LS) consisting of several key words from a Web document is often sufficient information for finding the document later, even if its URL has changed. We conduct a large-scale...

Modern (Computational) Scientometrics & Next Generation CiteSeer (2008)

C. Lee Giles

• Intelligent search and search engines – Automatic search engine creation • Computational methods for knowledge extraction – Specialty & academic search engines • CiteSeer (computer...

Extraction and Search of Chemical Formulae in Text Documents on the Web, WWW 2007, Track: E*-Applications, Session: E-Commerce and E-Content (2008)

Bingjun Sun, Qingzhao Tan, Prasenjit Mitra, C. Lee Giles

Often scientists seek to search for articles on the Web related to a particular chemical. When a scientist searches for a chemical formula using a search engine today, she gets articles where the...

Popularity Weighted Ranking for Academic Digital Libraries (2008)

Yang Sun, C. Lee Giles

We propose a popularity weighted ranking algorithm for academic digital libraries that uses the popularity factor of a publication venue, overcoming the limitations of impact factors. We...

Bayesian Classification and Feature Selection from Finite Data Sets (2007)

Frans M. Coetzee, C. Lee Giles, ...

Feature selection aims to select the smallest subset of features for a specified level of performance. The optimal achievable classification performance on a feature subset is summarized by its Receiver...

2 (2007)

Eric J. Glover, Steve Lawrence, William P. Birmingham, Andries Kruger, C. Lee Giles, David Pennock

A user searching for documents within a specific category using a general purpose search engine might have a difficult time finding valuable documents. To improve category specific search, we show...

2 (2007)

Alexandrin Popescul, Steve Lawrence, Lyle H. Ungar, C. Lee Giles

We introduce a simple and efficient method for clustering and identifying temporal trends in hyper-linked document databases. Our method can scale to large datasets because it exploits the underlying...

Equivalence in Knowledge Representation: Automata, Recurrent Neural Networks, and Dynamical Fuzzy Systems (2007)

C. Lee Giles, Christian W. Omlin, K. K. Thornber

Neuro-fuzzy systems - the combination of artificial neural networks with fuzzy logic - have become useful in many application domains. However, conventional neuro-fuzzy models usually need enhanced...

Binary Feature Selection and Integration in Specialized Search Engines (2007)

Frans Coetzee, Andries Kruger, C. Lee Giles, Steve Lawrence, Christian W. Omlin

We present a methodology for rapid implementation of specialized search engines. To catalog data, these search engines interpret and classify the content of web material to identify different...

1 (2007)

Hui Han, Eren Manavoglu, C. Lee Giles, Hongyuan Zha

This paper introduces a rule-based, context-dependent word clustering method, with the rules derived from various domain databases and the word text orthographic properties. Besides significant...

1 (2007)

Gary W. Flake, Eric J. Glover, Steve Lawrence, C. Lee Giles

When searching the WWW, users often desire results restricted to a particular document category. Ideally, a user would be able to filter results with a text classifier to minimize false positive...

;y (2007)

David M. Pennock, Sandip Debnath, Eric J. Glover, C. Lee Giles

We develop a model of how information flows into a market, and derive algorithms for automatically detecting and explaining relevant events. We analyze data from twenty-two “political stock markets...

2 (2007)

Paat Rusmevichientong, David M. Pennock, Steve Lawrence, C. Lee Giles

We present two new algorithms for generating uniformly random samples of pages from the World Wide Web, building upon recent work by Henzinger et al. (Henzinger et al. 2000) and Bar-Yossef et al....

c a (2007)

C. Lee Giles, Mark W. Goudreau

There has been much interest in using optics to implement computer interconnection networks. However, there has been little discussion of any routing methodologies besides those already used in...

a (2007)

Christian W. Omlin, Karvel K. Thornber, C. Lee Giles

There has been an increased interest in combining fuzzy systems with neural networks because fuzzy neural systems merge the advantages of both paradigms. On the one hand, parameters in fuzzy systems...

Winners don't take all: A model of web link accumulation (2007)

David M. Pennock, C. Lee Giles, Gary W. Flake, Steve Lawrence, Eric Glover

Several studies show that the distribution of the number of links per web page follows a power law in the limit for large numbers of links. The same power law scaling appears in the connectivity...

y (2007)

Peter Tino, Bill G. Horne, C. Lee Giles

The position, number and stability types of fixed points of a two-neuron recurrent network with nonzero weights are investigated. Using simple geometrical arguments in the space of derivatives of...

1 (2007)

Tsungnan Lin, Bill G. Horne, C. Lee Giles

Learning long-term temporal dependencies with recurrent neural networks can be a difficult problem. It has recently been shown that a class of recurrent neural networks called NARX networks perform...

1.1 Predicting Noisy Time Series Data (2007)

Steve Lawrence, Ah Chung Tsoi, C. Lee Giles

Financial forecasting is an example of a signal processing problem which is challenging due to small sample sizes, high noise, non-stationarity, and non-linearity. Neural networks have been very...

On the Distribution of Performance from Multiple Neural-Network Trials (2007)

Steve Lawrence, Andrew D. Back, Ah Chung Tsoi, C. Lee Giles

The performance of neural-network simulations is often reported in terms of the mean and standard deviation of a number of simulations performed with different starting conditions. However,...

Informatics and Mathematical (2007)

David M. Pennock, Steve Lawrence, Finn Arup Nielsen, C. Lee Giles

Game sites on the World Wide Web draw people from around the world with specialized interests, skills, and knowledge. Data from the games often reflects the players' expertise and will to win....

2 (2007)

David M, Eric J. Glover, Gary W. Flake, Steve Lawrence, ...

Users looking for documents within specific categories may have a difficult time locating valuable documents using general purpose search engines. We present an automated method for learning query...

y (2007)

Andrew D. Back, Ah Chung Tsoi, Bill G. Horne, C. Lee Giles

The shift operator, defined as q x(t) = x(t+1), is the basis for almost all discrete-time models. It has been shown, however, that linear models based on the shift operator suffer problems when used...

Panorama: Extending Digital Libraries with Topical Crawlers (2007)

Gautam Pant, Kostas Tsioutsiouliklis, Judy Johnson, C. Lee Giles

A large number of research, technical and professional documents are available today in digital formats. Digital libraries are created to facilitate search and retrieval of information supplied by...

Extraction and Search of Chemical Formulae in Text Documents on the Web (2007)

Bingjun Sun, Qingzhao Tan, Prasenjit Mitra, C. Lee Giles

Often scientists seek to search for articles on the Web related to a particular chemical. When a scientist searches for a chemical formula using a search engine today, she gets articles where the...

Adaptive Sorted Neighborhood Methods for Efficient Record Linkage (2007)

Su Yan, Dongwon Lee, Min-yen Kan, C. Lee Giles

Traditionally, record linkage algorithms have played an important role in maintaining digital libraries - i.e., identifying matching citations or authors for consolidation in updating or integrating...

A Clustering Method For Web Data With Multi-Type Interrelated Components (2007)

Levent Bolelli, Seyda Ertekin, Ding Zhou, C. Lee Giles

Traditional clustering algorithms work on "flat" data, making the assumption that the data instances can only be represented by a set of homogeneous and uniform features. Many real world...

Generative Models for Name Disambiguation (2007)

Yang Song, Jian Huang, Isaac G. Councill, Jia Li, C. Lee Giles

Name ambiguity is a special case of identity uncertainty where one person can be referenced by multiple name variations in different situations or even share the same name with other people. In this...

Learning User Clicks in Web Search (2007)

Ding Zhou, Levent Bolelli, Jia Li, C. Lee Giles, Hongyuan Zha

Machine learning for predicting user clicks in Web-based search offers automated explanation of user activity. We address click prediction in the Web search scenario by introducing a method for click...

Are Your Citations Clean? (2007)

Dongwon Lee, Jaewoo Kang, Prasenjit Mitra, C. Lee Giles, Byung-Won On

If they are, only one can refer to a distinct document; if not, many can refer to the same document.

Deriving Knowledge from Figures for Digital Libraries (2007)

Xiaonan Lu, James Z. Wang, Prasenjit Mitra, C. Lee Giles

Figures in digital documents contain important information. Current digital libraries do not summarize and index information available within figures for document retrieval. We present our system on...

Automatic Extraction of Data from 2-D Plots in Documents (2007)

Xiaonan Lu, James Z. Wang, Prasenjit Mitra, C. Lee Giles

Two-dimensional (2-D) plots in digital documents contain important information. Often, the results of scientific experiments and performance of businesses are summarized using plots. Although 2-D...

Co-Ranking Authors and Documents in a Heterogeneous Network (2007)

Ding Zhou, Sergey A. Orshanskiy, Hongyuan Zha, C. Lee Giles

The problem of evaluating scientific publications and their authors is important, and as such has attracted increasing attention. Recent graph-theoretic ranking approaches have demonstrated...

Measuring Conference Quality by Mining Program Committee Characteristics (2007)

Ziming Zhuang, Ergin Elmacioglu, Dongwon Lee, C. Lee Giles

- Digital Libraries provide effective recommendation and filtering tools - Computer Science is unique in its publication practice: often values conferences > journals - constantly increasing number of...

Tableseer: Automatic table metadata extraction and searching in digital libraries (2007)

Ying Liu, Kun Bai, Prasenjit Mitra, C. Lee Giles

Tables are ubiquitous in digital libraries. In scientific documents, tables are widely used to present experimental results or statistical data in a condensed fashion. However, current search engines...

Popularity Weighted Ranking for Academic Digital Libraries (2007)

Yang Sun, C. Lee Giles

We propose a popularity weighted ranking algorithm for academic digital libraries that uses the popularity factor of a publication venue, overcoming the limitations of impact factors. We compare our...

An LDA-based community structure discovery approach for large-scale social networks (2007)

Haizheng Zhang, Baojun Qiu, C. Lee Giles, Henry C. Foley, John Yen

Community discovery has drawn significant research interest among researchers from many disciplines for its increasing application in multiple, disparate areas, including computer...

Determining bias to search engines from robots.txt (2007)

Yang Sun, Ziming Zhuang, Isaac G. Councill, C. Lee Giles

Search engines largely rely on robots (i.e., crawlers or spiders) to collect information from the Web. Such crawling activities can be regulated from the server side by deploying the Robots Exclusion...

Probabilistic community discovery using hierarchical latent Gaussian mixture model (2007)

Haizheng Zhang, C. Lee Giles, Henry C. Foley, John Yen

Complex networks exist in a wide array of diverse domains, including biology, sociology, and computer science. These real-world networks, while disparate in nature, often consist of a set of...

Efficient Multiclass Boosting Classification with Active Learning (2007)

Jian Huang, Seyda Ertekin, Yang Song, Hongyuan Zha, C. Lee Giles

We propose a novel multiclass classification algorithm, Gentle Adaptive Multiclass Boosting Learning (GAMBLE). The algorithm naturally extends the two-class Gentle AdaBoost algorithm to multiclass...

Detecting Research Topics via the Correlation between Graphs and Texts (2007)

Yookyung Jo, Carl Lagoze, C. Lee Giles

In this paper we address the problem of detecting topics in large-scale linked document collections. Recently, topic detection has become a very active area of research due to its utility for...

Group-Linking Method: A Unified Benchmark for Machine Learning with Recurrent Neural Network (2007)

Tsungnan Lin, C. Lee Giles

This paper proposes a method (Group-Linking Method) that has control over the complexity of the sequential function to construct Finite Memory Machines with minimal order — the machines have the...

Efficient name disambiguation for large-scale databases (2006)

Jian Huang, C. Lee Giles

Name disambiguation can occur when one is seeking a list of publications of an author who has used different name variations and when there are multiple other authors with the same name. We...

Automatic extraction of table metadata from digital documents (2006)

Ying Liu, Prasenjit Mitra, C. Lee Giles, Kun Bai

Tables are used to present, list, summarize, and structure important data in documents. In scholarly articles, they are often used to present the relationships among data and highlight a collection...

Probabilistic models for discovering e-communities (2006)

Ding Zhou, Eren Manavoglu, Jia Li, C. Lee Giles, Hongyuan Zha

The increasing amount of communication between individuals in e-formats (e.g. email, Instant messaging and the Web) has motivated computational research in social network analysis (SNA). Previous...

Network Flow for Collaborative Ranking (2006)

Ziming Zhuang, Silviu Cucerzan, C. Lee Giles

In query-based Web search, a significant percentage of user queries are underspecified, most likely by naive users. Collaborative ranking helps the naive user by exploiting the collective expertise....

Learning Metadata from the Evidence in an On-Line Citation Matching Scheme (2006)

Isaac G. Councill, Huajing Li, Ziming Zhuang, Sandip Debnath, Levent Bolelli, Wang-chien Lee, ...

Citation matching, or the automatic grouping of bibliographic references that refer to the same document, is a data management problem faced by automatic digital libraries for scientific literature...

CiteSeerX: an Architecture and Web Service Design for an Academic Document Search Engine (2006)

Huajing Li, Isaac Councill, Wang-chien Lee, C. Lee Giles

CiteSeer is a scientific literature digital library and search engine which automatically crawls and indexes scientific documents in the field of computer and information science. After serving as a...

Boosting the Feature Space: Text Classification for Unstructured Data on the Web (2006)

Yang Song, Ding Zhou, Jian Huang, Isaac G. Councill, Hongyuan Zha, C. Lee Giles

The issue of seeking efficient and effective methods for classifying unstructured text in large document corpora has received much attention in recent years. Traditional document representation like...

Clustering scientific literature using sparse citation graph analysis (2006)

Levent Bolelli, Seyda Ertekin, C. Lee Giles

It is well known that connectivity analysis of linked documents provides significant information about the structure of the document space for unsupervised learning tasks. However, the...

An Architecture for Creating Collaborative Semantically Capable Scientific Data Sharing Infrastructures (2006)

Anuj R. Jaiswal, C. Lee Giles, Prasenjit Mitra, James Z. Wang

Increasingly, scientists are seeking to collaborate and share data among themselves. Such sharing can be readily done by publishing data on the World-Wide Web. Meaningful querying and searching on...

CiteSeerX - a scalable autonomous scientific digital library (2006)

Huajing Li, Isaac G. Councill, Levent Bolelli, Ding Zhou, Yang Song, Wang-chien Lee, ...

CiteSeer is a scientific literature digital library and search engine which automatically crawls and indexes scientific documents in the fields of computer and information science. Since its...

Topic Evolution and Social Interactions: How Authors Effect Research (2006)

Ding Zhou, Xiang Ji, Hongyuan Zha, C. Lee Giles

We propose a method for discovering the dependency relationships between the topics of documents shared in social networks using the latent social interactions, attempting to answer the question:...

Learning metadata from the evidence in an on-line citation matching scheme (2006)

Isaac G. Councill, Huajing Li, Ziming Zhuang, Sandip Debnath, Levent Bolelli, Wang-chien Lee, ...

Citation matching, or the automatic grouping of bibliographic references that refer to the same document, is a data management problem faced by automatic digital libraries for scientific literature...

Automatic Acknowledgement Indexing: Expanding the Semantics of Contribution in the CiteSeer Digital Library (2005)

Isaac G. Councill, C. Lee Giles, Hui Han, Eren Manavoglu

Acknowledgements in research publications, like citations, indicate influential contributions to scientific work; however, large-scale acknowledgement analyses have traditionally been impractical due...

A Comparison of On-line Computer Science Citation Databases (2005)

Vaclav Petricek, Ingemar J. Cox, Hui Han, Isaac G. Councill, C. Lee Giles

This paper examines the difference and similarities between the two on-line computer science citation databases DBLP and CiteSeer. The database entries in DBLP are inserted manually while the...

Modeling the Author Bias between Two On-line Computer Science Citation Databases (2005)

Vaclav Petricek, Ingemar J. Cox, Hui Han, Isaac G. Councill, C. Lee Giles

We examine the difference and similarities between two online computer science citation databases DBLP and CiteSeer. The database entries in DBLP are inserted manually while the CiteSeer entries are...

Knowledge Discovery in Web-Directories: Finding Term-Relations to Build a Business Ontology (2005)

Sandip Debnath, Tracy Mullen, Arun Upneja, C. Lee Giles

The Web continues to grow at a tremendous rate. Search engines find it increasingly difficult to provide useful results. To manage this explosively large number of Web documents, automatic clustering...

A comparison of on-line computer science citation databases (2005)

Vaclav Petricek, Ingemar J. Cox, Isaac G. Councill, C. Lee Giles

This paper examines the difference and similarities between the two on-line computer science citation databases DBLP and CiteSeer. The database entries in DBLP are inserted manually while...

Modeling the author bias between two on-line computer science citation databases (2005)

Vaclav Petricek, Ingemar J. Cox, Hui Han, Isaac G. Councill, C. Lee Giles

We examine the difference and similarities between two online computer science citation databases DBLP and CiteSeer. The database entries in DBLP are inserted manually while the CiteSeer entries are...

Name disambiguation in author citations using a K-way spectral clustering method (2005)

Hui Han, Hongyuan Zha, C. Lee Giles

An author may have multiple names and multiple authors may share the same name simply due to name abbreviations, identical names, or name misspellings in publications or bibliographies. This can...

Automatic identification of informative sections of web pages (2005)

Sandip Debnath, Prasenjit Mitra, Nirmal Pal, C. Lee Giles

Web-pages – especially dynamically generated ones – contain several items that cannot be classified as the “primary content”, e.g., navigation sidebars, advertisements, copyright notices,...

Automatic extraction of informative blocks from webpages (2005)

Sandip Debnath, Prasenjit Mitra, C. Lee Giles

Search engines crawl and index webpages depending upon their informative content. However, webpages — especially dynamically generated ones — contain items that cannot be classified as the...

Automatic identification of informative sections of web pages (2005)

Sandip Debnath, Prasenjit Mitra, Nirmal Pal, C. Lee Giles

Web pages, especially dynamically generated ones, contain several items that cannot be classified as the “primary content,” e.g., navigation sidebars, advertisements, copyright...

Identifying content blocks from web documents (2005)

Sandip Debnath, Prasenjit Mitra, C. Lee Giles

Intelligent information processing systems, such as digital libraries or search engines, index web-pages according to their informative content. However, web-pages contain several...

A hierarchical naive Bayes mixture model for name disambiguation in author citations (2005)

Hui Han, Wei Xu, Hongyuan Zha, C. Lee Giles

Because of name variations, an author may have multiple names and multiple authors may share the same name. Such name ambiguity affects the performance of document retrieval, web search, database...

Enabling Interoperability For Autonomous Digital Libraries : An API To CiteSeer Services (2004)

Yves Petinot, C. Lee Giles, Vivek Bhatnagar, Pradeep B. Teregowda, Hui Han

We introduce CiteSeer-API, a public API to CiteSeer-like services. CiteSeer-API is SOAP/WSDL based and allows for easy programmatic access to all the specific functionalities offered by CiteSeer...

CiteSeer-API: Towards Seamless Resource Location and Interlinking for Digital Libraries (2004)

Yves Petinot, C. Lee Giles, Vivek Bhatnagar, Pradeep B. Teregowda, Hui Han, ...

We introduce CiteSeer-API, a public API to CiteSeer-like services. CiteSeer-API is SOAP/WSDL based and allows for easy programmatic access to all the specific functionalities offered by CiteSeer...

A Service-Oriented Architecture for Digital Libraries (2004)

Yves Petinot, C. Lee Giles, Vivek Bhatnagar, Pradeep B. Teregowda, Hui Han, ...

CiteSeer is currently a very large source of meta-data information on the World Wide Web (WWW). This meta-data is the key material for the Semantic Web. Still, CiteSeer is not yet a Semantic-enabled...

Collaborative Filtering with Maximum Entropy (2004)

Dmitry Pavlov, Eren Manavoglu, David Pennock, C. Lee Giles

We describe a novel maximum entropy (maxent) approach for generating online recommendations as a user navigates through a collection of documents. We show how to handle high-dimensional sparse data...

Offering collaborative-like recommendations when data is sparse: The case of attraction-weighted information filtering (2004)

Arnaud De Bruyn, C. Lee Giles, David M. Pennock

We propose a low-dimensional weighting scheme to map information filtering recommendations into more relevant, collaborative filtering-like recommendations. Similarly to content-based systems, the...

Comparing static and dynamic measurements and models of the internet’s topology (2004)

Seung-taek Park, David M. Pennock, C. Lee Giles

Capturing a precise snapshot of the Internet’s topology is nearly impossible. Recent efforts have produced autonomous-system (AS) level topologies with noticeably divergent...

Collaborative Filtering with Maximum Entropy (2004)

Dmitry Pavlov, Eren Manavoglu, David M. Pennock, C. Lee Giles

As users navigate through online document collections on high-volume Web servers, they depend on good recommendations. The authors present a novel maximum-entropy algorithm...

A Service-Oriented Architecture for Digital Libraries (2004)

Yves Petinot, C. Lee Giles, Vivek Bhatnagar, Pradeep B. Teregowda, Hui Han, Isaac Councill

CiteSeer is currently a very large source of meta-data information on the World Wide Web (WWW). This meta-data is the key material for the Semantic Web. Still, CiteSeer is not yet a Semantic-enabled...

eBizSearch: An OAI-Compliant Digital Library for eBusiness (2003)

Yves Petinot, Pradeep B. Teregowda, Hui Han, C. Lee Giles, Steve Lawrence

Niche search engines offer an efficient alternative to traditional search engines ...

A Model-based K-means Algorithm for Name Disambiguation (2003)

Hui Han, Hongyuan Zha, C. Lee Giles

Unambiguous identities of resources are an important aspect of the Semantic Web. This paper addresses the personal identity issue in the context of bibliographies. Because of abbreviations or misspelling...

Static and dynamic analysis of the Internet's susceptibility to faults and attacks (2003)

Seung-Taek Park, Alexy Khrabrov, David M. Pennock, Steve Lawrence, C. Lee Giles, Lyle H. Ungar

We analyze the susceptibility of the Internet to random faults, malicious attacks, and mixtures of faults and attacks. We analyze actual Internet data, as well as simulated data created with network...

Probabilistic user behavior models (2003)

Eren Manavoglu, Dmitry Pavlov, C. Lee Giles

We present a mixture model based approach for generating individualized behavior models for the Web users. We investigate the use of maximum entropy and Markov mixture models for generating...

Collaborative filtering with maximum entropy (2003)

Dmitry Pavlov, Eren Manavoglu, David M. Pennock, C. Lee Giles

We describe a novel maximum entropy (maxent) approach for generating online recommendations as a user navigates through a collection of documents. We show how to handle high-dimensional...

Information incorporation in online in-game sports betting markets (2003)

Sandip Debnath, David M. Pennock, C. Lee Giles, Steve Lawrence

We analyze data from 52 online in-game sports betting markets (where betting is allowed continuously throughout a game), including 34 markets based on soccer (European football) games from the 2002...

The Role of Search in Ubiquitous Computing (2003)

C. Lee Giles, Sandeep Purao

Search is a natural and everyday aspect of human activity, whether we are looking for our keys or for some information on the web. Searching is defined here as the process of seeking relevant...

Information Incorporation in Online In-Game Sports Betting Markets (2003)

Sandip Debnath, David M. Pennock, C. Lee Giles, Steve Lawrence

We analyze data from 52 online in-game sports betting markets (where betting is allowed continuously throughout a game), including 34 markets based on soccer (European football) games from the 2002...

eBizSearch: A Niche Search Engine for e-Business (2003)

C. Lee Giles, Yves Petinot, Pradeep B. Teregowda, Hui Han, Steve Lawrence, Arvind Rangaswamy, ...

Niche Search Engines offer an efficient alternative to traditional search engines when the results returned by general-purpose search engines do not provide a sufficient degree of relevance. By...

What’s the code? automatic classification of source code archives (2002)

Secil Ugurel, Robert Krovetz, C. Lee Giles, David M. Pennock, Eric J. Glover, Hongyuan Zha

There are various source code archives on the World Wide Web. These archives are usually organized by application categories and programming languages. However, manually organizing source code...

Self-organization and identification of web communities (2002)

Gary William Flake, Steve Lawrence, C. Lee Giles, Frans M. Coetzee

Despite the decentralized and unorganized nature of the web, we show that the web self-organizes such that communities of highly related pages can be efficiently identified based purely on...

Winners don’t take all: Characterizing the competition for links on the web (2002)

David M. Pennock, Gary W. Flake, Steve Lawrence, Eric J. Glover, C. Lee Giles

As a whole, the World Wide Web displays a striking "rich get richer" behavior, with a relatively small number of sites receiving a disproportionately large share of hyperlink...

Analysis of Lexical Signatures for Finding Lost or Related Documents (2002)

Seung-taek Park, David M. Pennock, C. Lee Giles, Robert Krovetz

A lexical signature of a web page is often sufficient for finding the page, even if its URL has changed. We conduct a large-scale empirical study of eight methods for generating lexical signatures,...

Self-organization and identification of Web communities (2002)

Gary William Flake, Steve Lawrence, C. Lee Giles, Frans M. Coetzee

The vast improvement in information access is not the only advantage resulting from the increasing percentage of hyperlinked human knowledge available on the Web. Additionally, much potential exists...

Learning Communication for Multi-agent Systems (2002)

C. Lee Giles, Kam-chuen Jim

We analyze a general model of multi-agent communication in which all agents communicate simultaneously to a message board. A genetic algorithm is used to learn multi-agent languages for the predator...

Winners don’t take all: Characterizing the competition for links on the web (2002)

David M. Pennock, Gary W. Flake, Steve Lawrence, Eric J. Glover, C. Lee Giles

As a whole, the World Wide Web displays a striking “rich get richer” behavior, with a relatively small number of sites receiving a disproportionately large share of hyperlink references and...

What’s the code? automatic classification of source code archives (2002)

Secil Ugurel, Robert Krovetz, C. Lee Giles, David M. Pennock, Eric Glover, Hongyuan Zha

There are various source code archives on the World Wide Web. These archives are usually organized by application categories and programming languages. However, manually organizing source code...

Feature selection in web applications by ROC inflections and powerset pruning (2001)

Frans M. Coetzee, Eric Glover, Steve Lawrence, C. Lee Giles

A basic problem of information processing is selecting enough features to ensure that events are accurately represented for classification problems, while...

Attractive periodic sets in discrete time recurrent networks (with emphasis on fixed point stability and bifurcations in two-neuron networks) (2001)

Bill G. Horne, C. Lee Giles

We perform a detailed fixed-point analysis of two-unit recurrent neural networks with sigmoid-shaped transfer functions. Using geometrical arguments in the space of...

Methods for sampling pages uniformly from the world wide web (2001)

Paat Rusmevichientong, David M. Pennock, Steve Lawrence, C. Lee Giles

We present two new algorithms for generating uniformly random samples of pages from the World Wide Web, building upon recent work by Henzinger et al. (Henzinger et al. 2000) and Bar-Yossef et al....

Computer Science Literature and the World Wide Web. Available online at http://www.neci.nec.com/~lawrence/papers/cs-web01/cs-web01.pdf (2001)

Abby A. Goodrum, Katherine W. McCain, Steve Lawrence, C. Lee Giles

We analyze the computer science literature on the web and compare it to the literature indexed in the Science Citation Index (SCI). The web contains articles from throughout the research timeline,...

Noisy time series prediction using a recurrent neural network and grammatical inference (2001)

C. Lee Giles, Steve Lawrence

Financial forecasting is an example of a signal processing problem which is challenging due to small sample sizes, high noise, non-stationarity, and non-linearity. Neural networks have been very...

Feature Selection in Web Applications By ROC Inflections and Powerset Pruning (2001)

Frans Coetzee, Eric Glover, Steve Lawrence, C. Lee Giles

A basic problem of information processing is selecting enough features to ensure that events are accurately represented for classification problems, while simultaneously minimizing storage and...

Improving Category Specific Web Search by Learning Query Modifications (2001)

Eric J. Glover, Gary W. Flake, Steve Lawrence, ...

Users looking for documents within specific categories may have a difficult time locating valuable documents using general purpose search engines. We present an automated method for learning query...

Insertion of Prior Knowledge (2001)

Paolo Frasconi, C. Lee Giles, Marco Gori, Christian Omlin

In this chapter we focus on methods for injecting prior knowledge (represented in the form of finite automata) into adaptive recurrent networks. Several algorithms and architectures are described,...

Understanding and Explaining DRN Behavior (2001)

C. Lee Giles, Christian Omlin

In this chapter we discuss how symbolic knowledge in the form of DFAs can be extracted from trained DRNs. We give an overview of various methods that have been proposed for DFA extraction and give a...

Representation of Discrete States (2001)

C. Lee Giles, Christian Omlin

Recurrent neural networks are appropriate tools for modeling time-varying systems (e.g. financial markets, physical dynamical systems, speech recognition, etc.). They can be used to recognize pattern...

Attractive Periodic Sets in Discrete Time Recurrent Networks (with Emphasis on Fixed Point Stability and Bifurcations in Two-Neuron Networks) (2001)

Peter Tino, Bill G. Horne, C. Lee Giles

The position, number and stability types of fixed points of a two-neuron recurrent network with nonzero weights are investigated. Using geometrical arguments in the space of derivatives of the...

Persistence of Web References in Scientific Research (2001)

Steve Lawrence, David M. Pennock, Gary William Flake, Robert Krovetz, Frans M. Coetzee, Eric Glover, ...

The lack of persistence of Web references has called into question the increasingly common practice of citing URLs in scientific papers. It is argued that although few critical resources have been...

Feature selection in web applications by ROC inflections and powerset pruning (2001)

Frans M. Coetzee, Eric Glover, Steve Lawrence, C. Lee Giles

A basic problem of information processing is selecting enough features to ensure that events are accurately represented for classification problems, while simultaneously minimizing storage and...

Noisy time series prediction using a recurrent neural network and grammatical inference (2001)

C. Lee Giles

Editors: Colin de la Higuera and Vasant Honavar. Financial forecasting is an example of a signal processing problem which is challenging due to small sample sizes, high noise,...

Noisy Time Series Prediction using a Recurrent Neural Network and Grammatical Inference (2001)

C. Lee Giles, Steve Lawrence, Ah Chung Tsoi

Financial forecasting is an example of a signal processing problem which is challenging due to small sample sizes, high noise, non-stationarity, and non-linearity. Neural networks have been very...

Efficient Identification of Web Communities (2000)

Gary William Flake, Steve Lawrence, C. Lee Giles

We define a community on the web as a set of sites that have more links (in either direction) to members of the community than to non-members. Members of such a community can be efficiently identified in a...

Talking helps: Evolving communicating agents for the predator-prey pursuit problem (2000)

Kam-chuen Jim, C. Lee Giles

We analyze a general model of multi-agent communication in which all agents communicate simultaneously to a message board. A genetic algorithm is used to evolve multi-agent languages for the predator...

Overfitting and neural networks: Conjugate gradient and backpropagation (2000)

Steve Lawrence, C. Lee Giles

Methods for controlling the bias/variance tradeoff typically assume that overfitting or overtraining is a global phenomenon. For multi-layer perceptron (MLP) neural networks, global parameters such...

The power of play: Efficiency and forecast accuracy in web market games (2000)

David M. Pennock, Steve Lawrence, C. Lee Giles, Finn Arup Nielsen

We analyze the efficiency and forecast accuracy of two market games on the World Wide Web: the Hollywood Stock Exchange (HSX) and the Foresight Exchange (FX). We quantify the degree of arbitrage...

A normative examination of ensemble learning algorithms (2000)

David M. Pennock, C. Lee Giles, Eric Horvitz

Ensemble learning algorithms combine the results of several classifiers to yield an aggregate classification. We present a normative evaluation of combination methods, applying and extending existing...

DEADLINER: Building a New Niche Search Engine (2000)

Andries Kruger, C. Lee Giles, Frans Coetzee, Eric Glover, Gary W. Flake, Steve Lawrence, ...

We present DEADLINER, a search engine that catalogs conference and workshop announcements, and ultimately will monitor and extract a wide range of academic convocation material from the web. The...

Efficient Identification of Web Communities (2000)

Gary William Flake, Steve Lawrence, C. Lee Giles

We define a community on the web as a set of sites that have more links (in either direction) to members of the community than to non-members. Members of such a community can be efficiently...

Social Choice Theory and Recommender Systems: Analysis of the Axiomatic Foundations of Collaborative Filtering (2000)

David M. Pennock, Eric Horvitz, C. Lee Giles

The growth of Internet commerce has stimulated the use of collaborative filtering (CF) algorithms as recommender systems. Such systems leverage knowledge about the behavior of multiple users to...

Feature Selection in Web Applications Using ROC Inflections and Power Set Pruning (2000)

Frans M. Coetzee, Eric Glover, ...

A basic problem of information processing is selecting enough features to ensure that events are accurately represented for classification problems, while simultaneously minimizing storage and...

Clustering and Identifying Temporal Trends in Document Databases (2000)

Alexandrin Popescul, Gary Flake, Steve Lawrence, Lyle H. Ungar, C. Lee Giles

We introduce a simple and efficient method for clustering and identifying temporal trends in hyper-linked document databases. Our method can scale to large datasets because it exploits the underlying...

DEADLINER: Building a New Niche Search Engine (2000)

Andries Kruger, C. Lee Giles, Frans M. Coetzee, Eric Glover, Gary W. Flake, Steve Lawrence, ...

We present DEADLINER, a search engine that catalogs conference and workshop announcements, and ultimately will monitor and extract a wide range of academic convocation material from the web. The...

Collaborative Filtering by Personality Diagnosis: A Hybrid Memory- and Model-Based Approach (2000)

David Pennock, Eric Horvitz, Steve Lawrence, C. Lee Giles

The growth of Internet commerce has stimulated the use of collaborative filtering (CF) algorithms as recommender systems. Such systems leverage knowledge about the known preferences of multiple users...

Web Search -- Your Way (2000)

Eric J. Glover, Steve Lawrence, Michael D. Gordon, William P. Birmingham, C. Lee Giles

We describe a metasearch engine architecture, in use at NEC Research Institute, that allows users to provide preferences in the form of an information need category. This extra information is used to...

Bayesian Classification and Feature Selection from Finite Data Sets (2000)

Frans M. Coetzee, Steve Lawrence, C. Lee Giles

Feature selection aims to select the smallest subset of features for a specified level of performance. The optimal achievable classification performance on a feature subset is summarized by its...

Symbolic Knowledge Representation in Recurrent Neural Networks: Insights from Theoretical Models of Computation (2000)

Christian W. Omlin, C. Lee Giles

We give an overview of some of the fundamental issues found in the realm of recurrent neural networks. We use theoretical models of computation to characterize the representational, computational,...

Noisy Time Series Prediction using a Recurrent Neural Network and Grammatical Inference (2000)

C. Lee Giles, Steve Lawrence, Ah Chung Tsoi

Financial forecasting is an example of a signal processing problem which is challenging due to small sample sizes, high noise, non-stationarity, and non-linearity. Neural networks have been very...

Clustering and identifying temporal trends in document databases (2000)

Alexandrin Popescul, Gary William Flake, Steve Lawrence, Lyle H. Ungar, C. Lee Giles

We introduce a simple and efficient method for clustering and identifying temporal trends in hyper-linked document databases. Our method can scale to large datasets because it...

Natural Language Grammatical Inference with Recurrent Neural Networks (2000)

Steve Lawrence, C. Lee Giles, Sandiway Fong

This paper examines the inductive inference of a complex grammar with neural networks -- specifically, the task considered is that of training a network to classify natural language sentences as...

Discovering relevant scientific literature on the web (2000)

Kurt D. Bollacker, Steve Lawrence, C. Lee Giles

boon to scientific publication. It lets researchers disseminate their reports faster and at lower cost than ever before, greatly increasing the number and diversity of easily available publications....

Learning Chaotic Attractors by Neural Networks (2000)

Rembrandt Bakker, Jaap C. Schouten, C. Lee Giles, Floris Takens

An algorithm is introduced that trains a neural network to identify chaotic dynamics from a single measured time series. During training, the algorithm learns to short-term predict the time series....

Clustering and identifying temporal trends in document databases (2000)

Alexandrin Popescul, Gary William Flake, Steve Lawrence, Lyle H. Ungar, C. Lee Giles

We introduce a simple and efficient method for clustering and identifying temporal trends in hyper-linked document databases. Our method can scale to large datasets because it exploits the underlying...

Clustering and identifying temporal trends in document databases (2000)

C. Lee Giles, Alexandrin Popescul, Gary William Flake, ...

We introduce a simple and efficient method for clustering and identifying temporal trends in hyper-linked document databases. Our method can scale to large datasets because it exploits the underlying...

Attractive Periodic Sets in Discrete Time Recurrent Networks (with Emphasis on Fixed Point Stability and Bifurcations in Two-Neuron Networks) (2000)

Peter Tino, Bill Horne, C. Lee Giles

We perform a detailed fixed-point analysis of two-unit recurrent neural networks with sigmoid-shaped transfer functions. Using geometrical arguments in the space of transfer function derivatives, we...

Talking helps: Evolving communicating agents for the predator-prey pursuit problem (2000)

Kam-chuen Jim, C. Lee Giles

We analyze a general model of multi-agent communication in which all agents communicate simultaneously to a message board. A genetic algorithm is used to evolve multi-agent languages for the...

Searching the web: General and scientific information access (1999)

Steve Lawrence, C. Lee Giles

The World Wide Web has revolutionized the way that people access information, and has opened up new possibilities in areas such as digital libraries, general and scientific information dissemination...

Searching the web: General and scientific information access (1999)

Steve Lawrence, C. Lee Giles

The World Wide Web is revolutionizing the way people access information, and has opened up new possibilities in areas such as digital libraries, general and scientific information dissemination and...

Digital libraries and autonomous citation indexing (1999)

Steve Lawrence, C. Lee Giles, Kurt Bollacker

The World Wide Web is revolutionizing the way that researchers access scientific information. Articles are increasingly being made available on the homepages of authors or institutions, at journal...

Autonomous citation matching (1999)

Steve Lawrence, C. Lee Giles, Kurt D. Bollacker

Advances in computational resources and the communications infrastructure, and the rapid rise of the World Wide Web, have led to the increasingly widespread availability of scientific papers in...

Distributed Error Correction (1999)

Steve Lawrence, Kurt Bollacker, C. Lee Giles

We propose distributed error correction for digital libraries, where individual users can correct information in a database in real-time. Distributed error correction is used in the ResearchIndex...

Indexing and retrieval of scientific literature (1999)

Steve Lawrence, Kurt Bollacker, C. Lee Giles

The web has greatly improved access to scientific literature. However, scientific articles on the web are largely disorganized, with research articles being spread across archive sites, institution...

Searching the Web: General and Scientific Information Access (1999)

Steve Lawrence, C. Lee Giles

The World Wide Web has revolutionized the way that people access information, and has opened up new possibilities in areas such as digital libraries, general and scientific information dissemination...

Alternative Discrete-Time Operators: An Algorithm for Optimal Selection of Parameters (1999)

Andrew D. Back, Bill G. Horne, Ah Chung Tsoi, C. Lee Giles

In this note, we consider the issue of parameter sensitivity in models based on alternative discrete time operators (ADTOs). A generic first order ADTO is proposed which encompasses all the known...

Flexible User Profiles for Large Scale Data Delivery (1999)

Ugur Çetintemel, Michael J. Franklin, C. Lee Giles

Push-based data delivery requires knowledge of user interests for making scheduling, bandwidth allocation, and routing decisions. Such information is maintained as user profiles. We propose a new...

Indexing and Retrieval of Scientific Literature (1999)

Steve Lawrence, Kurt Bollacker, C. Lee Giles

The web has greatly improved access to scientific literature. However, scientific articles on the web are largely disorganized, with research articles being spread across archive sites, institution...

Recommending Web Documents Based on User Preferences (1999)

Eric J. Glover, Steve Lawrence, Michael D. Gordon, William P. Birmingham, C. Lee Giles

Making recommendations requires treating users as individuals. In this paper, we describe a metasearch engine available at NEC Research Institute that allows individual search strategies to be used....

A System For Automatic Personalized Tracking of Scientific Literature on the Web (1999)

Kurt Bollacker, Steve Lawrence, C. Lee Giles

We introduce a system as part of the CiteSeer digital library project for automatic tracking of scientific literature that is relevant to a user's research interests. Unlike previous systems...

Architecture of a Metasearch Engine that Supports User Information Needs (1999)

Eric Glover, Steve Lawrence, William P. Birmingham, C. Lee Giles

When a query is submitted to a metasearch engine, decisions are made with respect to the underlying search engines to be used, what modifications will be made to the query, and how to score the...

A System For Automatic Personalized Tracking of Scientific Literature on the Web (1999)

Kurt D. Bollacker, Steve Lawrence, C. Lee Giles

We introduce a system as part of the CiteSeer digital library project for automatic tracking of scientific literature that is relevant to a user's research interests. Unlike previous systems...

Text and Image Metasearch on the Web (1999)

Steve Lawrence, C. Lee Giles

As the Web continues to increase in size, the relative coverage of Web search engines is decreasing, and search tools that combine the results of multiple search engines are becoming more valuable....

Digital Libraries and Autonomous Citation Indexing (1999)

Steve Lawrence, C. Lee Giles, Kurt Bollacker

The Web is revolutionizing the way researchers access scientific literature, however scientific literature on the Web is largely disorganized. Autonomous citation indexing can help organize the...

Digital Libraries and Autonomous Citation Indexing (1999)

Steve Lawrence, C. Lee Giles, Kurt Bollacker

The World Wide Web is revolutionizing the way that researchers access scientific information. Articles are increasingly being made available on the homepages of authors or institutions, at journal...

Digital Libraries and Autonomous Citation Indexing (1999)

Steve Lawrence, C. Lee Giles, Kurt Bollacker

Scientific literature on the Web is largely disorganized. Autonomous citation indexing can help organize the literature by automating the construction of citation indices. ACI aims to improve the...

Inquirus, the NECI meta search engine (1998)

Steve Lawrence, C. Lee Giles

World Wide Web (WWW) search engines (e.g. AltaVista, Infoseek, HotBot, etc.) have a number of deficiencies including: periods of downtime, low...

Evaluating Answer Quality/Efficiency Tradeoffs (1998)

Ugur Cetintemel, Björn T. Jónsson, Michael J. Franklin, C. Lee Giles, Divesh Srivastava

For many emerging applications and environments, information systems designers and implementers must consider the tradeoffs between efficiency and the quality of query answers. This flexibility,...

CiteSeer: an automatic citation indexing system (1998)

C. Lee Giles, Kurt D. Bollacker, Steve Lawrence

We present CiteSeer: an autonomous citation indexing system which indexes academic literature in electronic format (e.g. Postscript files on the Web). CiteSeer understands how to parse citations,...

Natural Language Grammatical Inference with Recurrent Neural Networks (1998)

Steve Lawrence, C. Lee Giles, Sandiway Fong

This paper examines the inductive inference of a complex grammar with neural networks -- specifically, the task considered is that of training a network to classify natural language sentences as...

Searching the world wide web (1998)

Steve Lawrence, C. Lee Giles

The coverage and recency of the major World Wide Web search engines was analyzed, yielding some surprising results. The coverage of any one engine is significantly limited: No single engine indexes...

CiteSeer: An autonomous web agent for automatic retrieval and identification of interesting publications (1998)

Kurt D. Bollacker, C. Lee Giles

Published research papers available on the World Wide Web (WWW or Web) are often poorly organized, often exist in non-text form (e.g. Postscript) documents, and increase in quantity daily....

Neural Learning of Chaotic Dynamics: The Error Propagation Algorithm (1998)

Rembrandt Bakker, T Bakker, C. Lee Giles, Jaap C. Schouten

An algorithm is introduced that trains a neural network to identify chaotic dynamics from a single measured time series. The algorithm has four special features: 1. The state of the system is...

Natural Language Grammatical Inference with Recurrent Neural Networks (1998)

Steve Lawrence, C. Lee Giles, Sandiway Fong

This paper examines the inductive inference of a complex grammar with neural networks -- specifically, the task considered is that of training a network to classify natural language sentences as...

Neural Network Classification and Prior Class Probabilities (1998)

Steve Lawrence, Ian Burns, Andrew Back, Ah Chung Tsoi, C. Lee Giles

A commonly encountered problem in MLP (multi-layer perceptron) classification problems is related to the prior probabilities of the individual classes - if the number of training examples that...

Equivalence in Knowledge Representation: Automata, Recurrent Neural Networks, and Dynamical Fuzzy Systems (1998)

Christian W. Omlin, C. Lee Giles, K.K. Thornber

Neuro-fuzzy systems - the combination of artificial neural networks with fuzzy logic - are becoming increasingly popular. However, neuro-fuzzy systems need to be extended for applications which...

CiteSeer: An Autonomous Web Agent for Automatic Retrieval and Identification of Interesting Publications (1998)

Kurt Bollacker, Steve Lawrence, C. Lee Giles

Published research papers available on the World Wide Web (WWW or Web) are often poorly organized, often exist in non-text form (e.g. Postscript) documents, and increase in quantity daily....

Context and Page Analysis for Improved Web Search (1998)

Steve Lawrence, C. Lee Giles

Several popular and useful search engines, such as AltaVista, Excite, HotBot, Infoseek, Lycos and Northern Light, attempt to maintain full-text indexes of the World Wide Web. However, relying on a...

CiteSeer: An Automatic Citation Indexing System (1998)

C. Lee Giles, Kurt D. Bollacker, Steve Lawrence

We present CiteSeer: an autonomous citation indexing system which indexes academic literature in electronic format (e.g. Postscript files on the Web). CiteSeer understands how to parse citations,...

Finite State Machines and Recurrent Neural Networks -- Automata and Dynamical Systems Approaches (1998)

Bill G. Horne, C. Lee Giles, Pete C. Collingwood, Peter Tino, ...

We present two approaches to the analysis of the relationship between a recurrent neural network (RNN) and the finite state machine M the network is able to exactly mimic. First, the network is...

Fuzzy Finite-State Automata Can Be Deterministically Encoded Into Recurrent Neural Networks (1998)

Christian Omlin, Karvel K. Thornber, C. Lee Giles

There has been an increased interest in combining fuzzy systems with neural networks because fuzzy neural systems merge the advantages of both paradigms. On the one hand, parameters in fuzzy systems...

Neural Network Classification and Prior Class Probabilities (1998)

Steve Lawrence, Ian Burns, Andrew Back, Ah Chung Tsoi, C. Lee Giles

A commonly encountered problem in MLP (multi-layer perceptron) classification problems is related to the prior probabilities of the individual classes - if the number of training examples that...

Reconfigurable Processor Employing Optical Channels (1998)

Majd Sakr, Steven P. Levitan, C. Lee Giles, Donald M. Chiarulli

We describe a reconfigurable computing architecture that exploits parallel optical channels to support fast reconfiguration and compare our architecture to configuration cache based designs. Keywords:...

Natural Language Grammatical Inference with Recurrent Neural Networks (1998)

Steve Lawrence, C. Lee Giles, Sandiway Fong

This paper examines the inductive inference of a complex grammar with neural networks -- specifically, the task considered is that of training a network to classify natural language sentences as...

Inquirus, the NECI meta search engine (1998)

Steve Lawrence, C. Lee Giles

World Wide Web (WWW) search engines (e.g. AltaVista, Infoseek, HotBot, etc.) have a number of deficiencies including: periods of downtime, low coverage of the WWW, inconsistent and inefficient user...

Evaluating Answer Quality/Efficiency Tradeoffs (1998)

Ugur Cetintemel, Björn T. Jónsson, Michael J. Franklin, C. Lee Giles, Divesh Srivastava

For many emerging applications and environments, information systems designers and implementors must consider the tradeoffs between efficiency and the quality of query answers. This flexibility,...

Context and Page Analysis for Improved Web Search (1998)

Steve Lawrence, C. Lee Giles

NEC Research Institute has developed a metasearch engine that improves the efficiency of Web searches by downloading and analyzing each document and then displaying results that show the query terms...

Recurrent Neural Networks Learn Deterministic Representations of Fuzzy Finite-State Automata (1998)

Christian W. Omlin, C. Lee Giles

The paradigm of deterministic finite-state automata (DFAs) and their corresponding regular languages have been shown to be very useful for addressing fundamental issues in recurrent neural networks....

Evaluating Answer Quality/Efficiency Tradeoffs (1998)

Ugur Cetintemel, Björn T. Jónsson, Michael J. Franklin, C. Lee Giles, Divesh Srivastava

For many emerging applications and environments, information systems designers and implementers must consider the tradeoffs between efficiency and the quality of query answers. This flexibility,...

Equivalence in Knowledge Representation: Automata, Recurrent Neural Networks, and Dynamical Fuzzy Systems (1998)

Christian Omlin, C. Lee Giles, K. K. Thornber

Neuro-fuzzy systems - the combination of artificial neural networks with fuzzy logic - have become useful in many application domains. However, conventional neuro-fuzzy models can have inadequate...

CiteSeer: An Autonomous Web Agent for Automatic Retrieval and Identification of Interesting Publications (1998)

Kurt D. Bollacker, Steve Lawrence, C. Lee Giles

Published research papers available on the World Wide Web (WWW or Web) are often poorly organized, often exist in non-text form (e.g. Postscript) documents, and increase in quantity daily....

Fuzzy Finite-state Automata Can Be Deterministically Encoded into Recurrent Neural Networks (1998)

Christian W. Omlin, Karvel K. Thornber, C. Lee Giles

There has been an increased interest in combining fuzzy systems with neural networks because fuzzy neural systems merge the advantages of both paradigms. On the one hand, parameters in fuzzy systems...

Searching the World Wide Web (1998)

Steve Lawrence, C. Lee Giles

The coverage and recency of the major World Wide Web search engines was analyzed, yielding some surprising results. The coverage of any one engine is significantly limited: No single engine indexes...

CiteSeer: an automatic citation indexing system (1998)

C. Lee Giles, Kurt D. Bollacker, Steve Lawrence

We present CiteSeer: an autonomous citation indexing system which indexes academic literature in electronic format (e.g. Postscript files on the Web). CiteSeer understands how to parse citations,...

Context and page analysis for improved web search (1998)

Steve Lawrence, C. Lee Giles

NEC Research Institute has developed a metasearch engine that improves the efficiency of Web searches by downloading and analyzing each document and then displaying results that show the query terms...

Presenting and analyzing the results of AI experiments: Data averaging and data snooping (1997)

C. Lee Giles, Steve Lawrence

Experimental results reported in the machine learning AI literature can be misleading. This paper investigates the common processes of data averaging (reporting results in terms of the mean and...

Face Recognition: A Convolutional Neural Network Approach (1997)

Steve Lawrence, C. Lee Giles, Ah Chung Tsoi, Andrew D. Back

Faces represent complex, multidimensional, meaningful visual stimuli and developing a computational model for face recognition is difficult [43]. We present a hybrid neural network solution which...

Lessons in Neural Network Training: Overfitting May be Harder than Expected (1997)

Steve Lawrence, C. Lee Giles, Ah Chung Tsoi

For many reasons, neural networks have become very popular AI machine learning models. Two of the most important aspects of machine learning models are how well the model generalizes to unseen data,...

On the Distribution of Performance from Multiple Neural-Network Trials (1997)

Steve Lawrence, Andrew D. Back, Ah Chung Tsoi, C. Lee Giles

The performance of neural-network simulations is often reported in terms of the mean and standard deviation of a number of simulations performed with different starting conditions. However, in many...

The Gamma MLP -- Using Multiple Temporal Resolutions for Improved Classification (1997)

Steve Lawrence, Andrew D. Back, Ah Chung Tsoi, C. Lee Giles

We have previously introduced the Gamma MLP which is defined as an MLP with the usual synaptic weights replaced by gamma filters and associated gain terms throughout all layers. In this paper we...

On the Distribution of Performance from Multiple Neural Network Trials (1997)

Steve Lawrence, Andrew D. Back, Ah Chung Tsoi, C. Lee Giles

The performance of neural network simulations is often reported in terms of the mean and standard deviation of a number of simulations performed with different starting conditions. However, in many...

Rule Inference for Financial Prediction using Recurrent Neural Networks (1997)

C. Lee Giles, Steve Lawrence, Ah Chung Tsoi

This paper considers the prediction of noisy time series data, specifically, the prediction of foreign exchange rate data. A novel hybrid neural network algorithm for noisy time series prediction is...

Noisy Time Series Prediction using Symbolic Representation and Recurrent Neural Network Grammatical Inference (1997)

Steve Lawrence, Ah Chung Tsoi, C. Lee Giles

Financial forecasting is an example of a signal processing problem which is challenging due to small sample sizes, high noise, non-stationarity, and non-linearity. Neural networks have been very...

Computational capabilities of recurrent NARX neural networks (1997)

Hava Siegelmann, Bill G. Horne, C. Lee Giles

Recently, fully connected recurrent neural networks have been proven to be computationally rich, at least as powerful as Turing machines. This work focuses on another network which is popular in...

Alternative Discrete-Time Operators and Their Application to Nonlinear Models (1997)

Andrew D. Back, Ah Chung Tsoi, Bill G. Horne, C. Lee Giles

The shift operator, defined as q x(t) = x(t+1), is the basis for almost all discrete-time models. It has been shown however, that linear models based on the shift operator suffer problems when used...

Time-Delay Neural Networks: Representation and Induction of Finite State Machines (1997)

Daniel S. Clouse, C. Lee Giles, Bill G. Horne, Garrison W. Cottrell

In this work, we characterize and contrast the capabilities of the general class of time-delay neural networks (TDNN), with input delay neural networks (IDNN), the subclass of TDNNs with delays...

Remembering the Past: The Role of Embedded Memory in Recurrent Neural Network Architectures (1997)

C. Lee Giles, Tsungnan Lin, Bill G. Horne

There has been much interest in learning long-term temporal dependencies with neural networks. Adequately learning such long-term information can be useful in many problems in signal processing,...

Neural Learning of Chaotic Dynamics: The Error Propagation Algorithm (1997)

Rembrandt Bakker, T Bakker, C. Lee Giles, Jaap C. Schouten

An algorithm is introduced that trains a neural network to identify chaotic dynamics from a single measured timeseries. The algorithm has four special features: 1. The state of the system is...

Computational capabilities of recurrent NARX neural networks (1997)

Hava T. Siegelmann, Bill G. Horne, C. Lee Giles

Recently, fully connected recurrent neural networks have been proven to be computationally rich, at least as powerful as Turing machines. This work focuses on another network which is popular in...

Lessons in neural network training: Overfitting may be harder than expected (1997)

Steve Lawrence, C. Lee Giles, Ah Chung Tsoi

For many reasons, neural networks have become very popular AI machine learning models. Two of the most important aspects of machine learning models are how well the model generalizes to unseen data,...

What size neural network gives optimal generalization? Convergence properties of backpropagation (1996)

Steve Lawrence, C. Lee Giles, Ah Chung Tsoi

One of the most important aspects of any machine learning paradigm is how it scales according to problem size and complexity. Using a task with known optimal training error, and a pre-specified...

Constructing deterministic finite-state automata in recurrent neural networks (1996)

Christian W. Omlin, C. Lee Giles

Recurrent neural networks that are trained to behave like deterministic finite-state automata (DFAs) can show deteriorating performance when tested on long strings. This deteriorating performance can...

Stable encoding of large finite-state automata in recurrent neural networks with sigmoid discriminants (1996)

Christian W. Omlin, C. Lee Giles

We propose an algorithm for encoding deterministic finite-state automata (DFAs) in second-order recurrent neural networks with sigmoidal discriminant function and we prove that the languages accepted...

Correctness, Efficiency, Extendability and Maintainability in Neural Network Simulation (1996)

Steve Lawrence, Ah Chung Tsoi, C. Lee Giles

A large number of neural network simulators are publicly available to researchers, many free of charge [11]. However, when a new paradigm is being developed, as is often the case, the advantages of...

Constructing Deterministic Finite-State Automata in Recurrent Neural Networks (1996)

Christian W. Omlin, C. Lee Giles

Recurrent neural networks that are trained to behave like deterministic finite-state automata (DFA's) can show deteriorating performance when tested on long strings. This deteriorating...

What Size Neural Network Gives Optimal Generalization? Convergence Properties of Backpropagation (1996)

Steve Lawrence, C. Lee Giles, Ah Chung Tsoi

One of the most important aspects of any machine learning paradigm is how it scales according to problem size and complexity. Using a task with known optimal training error, and a pre-specified...

Learning long-term dependencies in NARX recurrent neural networks (1996)

Tsungnan Lin, Bill G. Horne, Peter Tino, C. Lee Giles

It has recently been shown that gradient-descent learning algorithms for recurrent neural networks can perform poorly on tasks that involve long-term dependencies, i.e. those problems for which the...

Constructing Deterministic Finite-State Automata in Recurrent Neural Networks (1996)

Christian W. Omlin, C. Lee Giles

Recurrent neural networks that are trained to behave like deterministic finite-state automata (DFAs) can show deteriorating performance when tested on long strings. This deteriorating performance can...

Can Recurrent Neural Networks Learn Natural Language Grammars? (1996)

Steve Lawrence, C. Lee Giles, Sandiway Fong

Recurrent neural networks are complex parametric dynamic systems that can exhibit a wide range of different behavior. We consider the task of grammatical inference with recurrent neural networks....

How Embedded Memory in Recurrent Neural Network Architectures Helps Learning Long-term Temporal Dependencies (1996)

Tsungnan Lin, Bill G. Horne, C. Lee Giles

Learning long-term temporal dependencies with recurrent neural networks can be a difficult problem. It has recently been shown that a class of recurrent neural networks called NARX networks perform...

An Analysis of Noise in Recurrent Neural Networks: Convergence and Generalization (1996)

Kam-chuen Jim, C. Lee Giles, Bill G. Horne

There has been much interest in applying noise to feedforward neural networks in order to observe their effect on network performance. We extend these results by introducing and analyzing various...

On the Applicability of Neural Network and Machine Learning Methodologies to Natural Language Processing (1996)

Steve Lawrence, Sandiway Fong, C. Lee Giles

How can we apply neural network and machine learning methodologies to natural language processing? In this paper we consider the task of training a neural network to classify natural language...

Constructing Deterministic Finite-State Automata in Recurrent Neural Networks (1996)

Christian W. Omlin, C. Lee Giles

Recurrent neural networks that are trained to behave like deterministic finite-state automata (DFAs) can show deteriorating performance when tested on long strings. This deteriorating performance...

Noisy Time Series Prediction using Symbolic Representation and Recurrent Neural Network Grammatical Inference (1996)

Steve Lawrence, Ah Chung Tsoi, C. Lee Giles

Financial forecasting is an example of a signal processing problem which is challenging due to small sample sizes, high noise, non-stationarity, and non-linearity. Neural networks have been very...

Stable Encoding of Large Finite-State Automata in Recurrent Neural Networks with Sigmoid Discriminants (1996)

Christian W. Omlin, C. Lee Giles

We propose an algorithm for encoding deterministic finite-state automata (DFAs) in second-order recurrent neural networks with sigmoidal discriminant function and we prove that the languages accepted...

Natural Language Grammatical Inference: A Comparison of Recurrent Neural Networks and Machine Learning Methods (1996)

Steve Lawrence, Sandiway Fong, C. Lee Giles

We consider the task of training a neural network to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the...

Face Recognition: A Hybrid Neural Network Approach (1996)

Steve Lawrence, C. Lee Giles, Ah Chung Tsoi, Andrew D. Back

Faces represent complex, multidimensional, meaningful visual stimuli and developing a computational model for face recognition is difficult (Turk and Pentland, 1991). We present a hybrid neural...

Representation of Fuzzy Finite State Automata in Continuous Recurrent Neural Networks (1996)

Christian W. Omlin, Karvel K. Thornber, C. Lee Giles

Based on previous work on encoding deterministic finite-state automata (DFAs) in discrete-time, second-order recurrent neural networks with sigmoidal discriminant functions, we propose an algorithm...

An Analysis of Noise in Recurrent Neural Networks: Convergence and Generalization (1996)

Kam Jim, C. Lee Giles, Bill G. Horne

There has been much interest in applying noise to feedforward neural networks in order to observe their effect on network performance. We extend these results by introducing and analyzing various...

Can Recurrent Neural Networks Learn Natural Language Grammars? (1996)

Steve Lawrence, C. Lee Giles, Sandiway Fong

Recurrent neural networks are complex parametric dynamic systems that can exhibit a wide range of different behavior. We consider the task of grammatical inference with recurrent neural networks....

Natural language grammatical inference: A comparison of recurrent neural networks and machine learning methods (1996)

Steve Lawrence, Sandiway Fong, C. Lee Giles

We consider the task of training a neural network to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the...

Constructing deterministic finite-state automata in recurrent neural networks (1996)

Christian W. Omlin, C. Lee Giles

Recurrent neural networks that are trained to behave like deterministic finite-state automata (DFAs) can show deteriorating performance when tested on long strings. This deteriorating...

On the Applicability of Neural Network and Machine Learning Methodologies to Natural Language Processing (1995)

Steve Lawrence, C. Lee Giles, Sandiway Fong

We examine the inductive inference of a complex grammar - specifically, we consider the task of training a model to classify natural language sentences as grammatical or ungrammatical, thereby...

Product Unit Learning (1995)

Laurens R. Leerink, C. Lee Giles, Bill G. Horne, Marwan A. Jabri

Product units provide a method of automatically learning the higher-order input combinations required for the efficient synthesis of Boolean logic functions by neural networks. Product units also...

An Experimental Comparison of Recurrent Neural Networks (1995)

Bill G. Horne, C. Lee Giles

Many different discrete-time recurrent neural network architectures have been proposed. However, there has been virtually no effort to compare these architectures experimentally. In this paper we...

Learning a Class of Large Finite State Machines with a Recurrent Neural Network (1995)

C. Lee Giles, B. G. Horne, T. Lin

One of the issues in any learning model is how it scales with problem size. Neural networks have not been immune to scaling issues. We show that a dynamically-driven discrete-time recurrent network...

Predictive Control of Opto-Electronic Reconfigurable Interconnection Networks Using Neural Networks (1995)

Majd Sakr, Steven P. Levitan, C. Lee Giles, Bill G. Horne, Marco Maggini, Donald M. Chiarulli

Opto-electronic reconfigurable interconnection networks are limited by significant control latency when used in large multiprocessor systems. This latency is the time required to analyze the current...

On the Applicability of Neural Network and Machine Learning Methodologies to Natural Language Processing (1995)

Steve Lawrence, Sandiway Fong, C. Lee Giles

How can we apply neural network and machine learning methodologies to natural language processing? In this paper we consider the task of training a neural network to classify natural language...

Fixed Points in Two-Neuron Discrete Time Recurrent Networks: Stability and Bifurcation Considerations (1995)

Peter Tino, Bill G. Horne, C. Lee Giles

The position, number and stability types of fixed points of a two-neuron recurrent network with nonzero weights are investigated. Using simple geometrical arguments in the space of derivatives of...

Using Recurrent Neural Networks to Learn the Structure of Interconnection Networks (1995)

Mark W. Goudreau, C. Lee Giles

A modified Recurrent Neural Network (RNN) is used to learn a Self-Routing Interconnection Network (SRIN) from a set of routing examples. The RNN is modified so that it has several distinct initial...

Routing in Optical Multistage Interconnection Networks: a Neural Network Solution (1995)

C. Lee Giles, Mark W. Goudreau

There has been much interest in using optics to implement computer interconnection networks. However, there has been little discussion of any routing methodologies besides those already used in...

Effects of Noise on Convergence and Generalization in Recurrent Networks (1995)

Kam Jim, Bill G. Horne, C. Lee Giles

We introduce and study methods of inserting synaptic noise into dynamically-driven recurrent neural networks and show that applying a controlled amount of noise during training may improve...

Learning, Representation, and Synthesis of Discrete Dynamical Systems in Continuous Recurrent Neural Networks (1995)

C. Lee Giles, Christian W. Omlin

This paper gives an overview on learning and representation of discrete-time, discrete-space dynamical systems in discrete-time, continuous-space recurrent neural networks. We limit our discussion to...

Learning Large DeBruijn Automata with Feed-Forward Neural Networks (1994)

Daniel S. Clouse, C. Lee Giles, Bill G. Horne, Garrison W. Cottrell

In this paper we argue that a class of finite state machines (FSMs) which is representable by the NNFIR (Neural Network Finite Impulse Response) architecture is equivalent to the definite memory...

First-Order vs. Second-Order Single Layer Recurrent Neural Networks (1994)

Mark Goudreau, C. Lee Giles, Srimat T. Chakradhar, D. Chen

We examine the representational capabilities of first-order and second-order Single Layer Recurrent Neural Networks (SLRNNs) with hard-limiting neurons. We show that a second-order SLRNN is strictly...

Using Prior Knowledge in an NNPDA to Learn Context-Free Languages (1993)

Sreerupa Das, C. Lee Giles, Guo-zheng Sun

Although considerable interest has been shown in language inference and automata induction using recurrent neural networks, success of these models has mostly been limited to regular languages. We...

Learning Context-free Grammars: Capabilities and Limitations of a Recurrent Neural Network with an External Stack Memory (1992)

Sreerupa Das, C. Lee Giles, Guo-zheng Sun

This work describes an approach for inferring Deterministic Context-free (DCF) Grammars in a Connectionist paradigm using a Recurrent Neural Network Pushdown Automaton (NNPDA). The NNPDA consists of...

The Gamma model - a new neural network for temporal processing (1992)

Steve Lawrence, Andrew D. Back, Ah Chung Tsoi, C. Lee Giles

We have previously introduced the Gamma MLP which is defined as an MLP with the usual synaptic weights replaced by gamma filters and associated gain terms throughout all layers. In this paper we...

Using Prior Knowledge in an NNPDA to Learn Context-Free Languages (1992)

Sreerupa Das, C. Lee Giles, Guo-zheng Sun

Although considerable interest has been shown in language inference and automata induction using recurrent neural networks, success of these models has mostly been limited to regular languages. We...

Training Second-Order Recurrent Neural Networks using Hints (1992)

Christian W. Omlin, C. Lee Giles

We investigate a method for inserting rules into discrete-time second-order recurrent neural networks which are trained to recognize regular languages. The rules defining regular languages can be...

Natural Language Grammatical Inference: A Comparison of Recurrent Neural Networks and Machine Learning Methods (1991)

Steve Lawrence, Sandiway Fong, C. Lee Giles

We consider the task of training a neural network to classify natural language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the...

A Delay Damage Model Selection Algorithm for NARX Neural Networks

Tsungnan Lin, C. Lee Giles, Bill G. Horne, S. Y. Kung

Recurrent neural networks have become popular models for system identification and time series prediction. NARX (Nonlinear AutoRegressive models with eXogenous inputs) neural network models are a...

Block-Suffix Shifting: Fast, Simultaneous Medical Concept Set Identification in Large Medical Record Corpora

Ying Liu, Lucian Vlad Lita, Radu Stefan Niculescu, Prasenjit Mitra, C. Lee Giles

Owing to new advances in computer hardware, large text databases have become more prevalent than ever. Automatically mining information from these databases proves to be a challenge due to slow...