Book contents
- Frontmatter
- Dedication
- Contents
- PREFACE
- 1 Introduction
- 2 Workload Data
- 3 Statistical Distributions
- 4 Fitting Distributions to Data
- 5 Heavy Tails
- 6 Correlations in Workloads
- 7 Self-Similarity and Long-Range Dependence
- 8 Hierarchical Generative Models
- 9 Case Studies
- 10 Summary and Outlook
- Appendix Data Sources
- Bibliography
- Index
PREFACE
Published online by Cambridge University Press: 05 March 2015
Summary
In 1994 I wrote a long survey about parallel job scheduling [229]. Backed by 638 references, this work described and classified the scheduling schemes of 76 systems, as well as many others that were proposed but never implemented. In retrospect, one of the things that struck me was that practically every paper that proposed a new scheme also proved it to be better than competing schemes. On reflection, my conclusion was that the source of the problem lay in differing assumptions and mindsets, including assumptions about the properties of the workloads that would run on these systems. The operational conclusion was that it may be more important to understand the workloads than to design new scheduling schemes.
At about the same time, in work on parallel I/O, I was exposed to the Charisma I/O traces collected by David Kotz and Nils Nieuwejaar [516]. Among the voluminous data on I/O operations were a few records about the jobs to which they belonged. This led to an interaction with Bill Nitzberg, who provided me with data on three months of jobs from the NASA Ames iPSC/860 system, and then to the publication of the first analysis of such a workload log [244]. Several years later, this log became one of the first to be included in the Parallel Workloads Archive [533]. This archive has been instrumental in facilitating research based on real data rather than on baseless assumptions.
Fast forward to 2014. It is now widely accepted that workload characterization and workload modeling are very important for reliable performance evaluations of computer systems. If the workload is wrong, the results will be wrong too – not in the mathematical sense, but in the sense that they will not apply to the situation at hand. Regrettably, workloads are sometimes (and maybe often) still treated as an afterthought, despite the considerable work that has been done on this topic.
- Type: Chapter
- Information: Workload Modeling for Computer Systems Performance Evaluation, pp. xiii-xvi
- Publisher: Cambridge University Press
- Print publication year: 2015