Publication View

Both authors are affiliated with the Machine Learning Systems Group (2007)

Abstract
With hardware advances in scientific instruments and data-gathering techniques comes the inevitable flood of data that can render traditional approaches to science data analysis severely inadequate. The traditional approach of manual and exhaustive analysis of a data set is no longer feasible for many tasks ranging from remote sensing, astronomy, and atmospherics to medicine, molecular biology, and biochemistry. In this paper we present our views as practitioners engaged in building computational systems to help scientists deal with large data sets. We focus on what we view as challenges and shortcomings of the current state of the art in data analysis in view of the massive data sets that are still awaiting analysis. The presentation is grounded in applications in astronomy, planetary sciences, solar physics, and atmospherics that are currently driving much of our work at JPL.

Publication details
Download http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.1606
Source ftp://ftp.research.microsoft.com/pub/dtg/fayyad/massive-datasets/fayyad-massive96.ps
Contributors CiteSeerX
Repository CiteSeerX - Scientific Literature Digital Library and Search Engine (United States)
Keywords science data analysis, limitations of current methods, challenges for massive data sets
Type text
Language English
Relation 10.1.1.18.4267, 10.1.1.35.4844, 10.1.1.31.2250