Book contents
- Frontmatter
- Dedication
- Contents
- Figures
- Tables
- Boxes
- Preface
- 1 What is Data Science?
- 2 Little Data, Big Data
- 3 The Process of Data Science
- 4 Tools for Data Analysis
- 5 Clustering and Social Network Analysis
- 6 Predictions and Forecasts
- 7 Text Analysis and Mining
- 8 The Future of Data Science and Information Professionals
- References
- Appendix – Programming Concepts for Data Science
- Index
2 - Little Data, Big Data
Published online by Cambridge University Press: 14 August 2020
Summary
Today there are vast quantities and varieties of data, with new data sources emerging all the time. This makes any attempt to provide an overview of data a futile task that would quickly be outdated. Therefore, this chapter considers some of the data that is available according to three features: whether it is ‘big data’, its format, and its source.
‘Big data’ is typically used to describe the large quantities of data now being generated and the complex infrastructure needed for their collection and analysis. It can be contrasted with small data sets that can be gathered and analysed simply on a desktop. The section on data format gives an overview of some of the ways data can appear, from tables of data in documents through to APIs and linked data. Finally, the section on data sources looks more closely at some of the sources of data that are currently publicly available, as well as some of the additional data a library or information professional may have access to.
Big data
The term ‘big data’ can be traced back to the mid-1990s (Kitchin and McArdle, 2016), although it really entered the public consciousness in 2012. The New York Times ran articles with titles such as ‘The Age of Big Data’ (Lohr, 2012a) and ‘How Big Data Became So Big’ (Lohr, 2012b); the Guardian had ‘Why Big Data is Now Such a Big Deal’ (Naughton, 2012b) and ‘Big Data: revolution by numbers’ (Naughton, 2012a); and Big Data, Big Impact (WEF, 2012) was a topic at Davos in 2012. Google Trends shows there was a rapid rise in online searches for the term over the year, and since then there has been a raft of popular science and business publications devoted to the subject: Big Data (Mayer-Schönberger and Cukier, 2013); Big Data (Marr, 2015); Big Data for Small Business for Dummies (Marr, 2016); and Big Data: does size matter? (Harkness, 2016).
Despite the rapid growth in the popularity of big data, pinning down what is meant by the term is difficult.
- Type: Chapter
- Information: Practical Data Science for Information Professionals, pp. 17-38
- Publisher: Facet
- Print publication year: 2020