Summary and Outlook

Dror G. Feitelson

doi:10.1017/CBO9781139939690.011

Developing a model of a nontrivial system is itself nontrivial. There is no simple recipe that can be applied that promises good results. Instead, model building is usually an iterative and interactive process, involving three recurring steps: model formulation, model estimation, and model validation [121, sect. 4.8]. Most books, including this one, devote most of their attention to model estimation. This is the activity of matching a specific piece of a model to a given feature of the data. But one must not forget the big picture.

From Workload Data to Workload Model

In previous chapters we have described and compared many workload models in various domains. Here we want to summarize recurring principles and draw them together.

To recap, there are three main approaches to using workload data:

Find the simplest abstract mathematical model that captures a desired feature.
Use raw data, as when driving simulations directly from traces or using empirical distributions.
Create a generative model that could plausibly give rise to the observed data.

Perhaps the most entrenched and commonly used approach in workload modeling is to use a mathematical abstraction in the form of a statistical model. For example, the method of moments can be used to fit a marginal distribution, and an autocorrelation function is used to characterize the dependence structure and fit a long-range dependent fARIMA model. When a new workload feature is recognized as being important, mathematical modeling is often the first approach used to evaluate its effect. And doing so often leads to great advances in understanding the effect of the new feature.
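As a rough illustration of the two fitting steps mentioned above, the following sketch applies the method of moments to a marginal distribution and computes a sample autocorrelation. It is only a minimal sketch: the choice of a gamma distribution, the synthetic data, and the function names fit_gamma_moments and autocorrelation are assumptions made for this example, not taken from the text, and fitting a fARIMA model is not attempted.

```python
import random
import statistics


def fit_gamma_moments(samples):
    """Method-of-moments fit of a gamma distribution.

    Matches the sample mean and variance to the gamma mean (shape * scale)
    and variance (shape * scale**2). Returns (shape, scale).
    """
    mean = statistics.fmean(samples)
    var = statistics.pvariance(samples, mu=mean)
    return mean * mean / var, var / mean


def autocorrelation(series, lag):
    """Sample autocorrelation of a series at the given non-negative lag."""
    n = len(series)
    mean = statistics.fmean(series)
    denom = sum((x - mean) ** 2 for x in series)
    numer = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return numer / denom


# Synthetic stand-in for measured workload data (an assumption of this
# sketch): independent gamma samples with shape 2 and scale 3.
rng = random.Random(42)
data = [rng.gammavariate(2.0, 3.0) for _ in range(20000)]

shape, scale = fit_gamma_moments(data)
print(f"fitted shape={shape:.3f}, scale={scale:.3f}")

# Independent samples should show near-zero autocorrelation at lags > 0.
for lag in (1, 10, 100):
    print(f"lag {lag}: autocorrelation={autocorrelation(data, lag):.4f}")
```

With real workload data, an autocorrelation that decays slowly over many lags would be the kind of evidence that motivates a long-range dependent model.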

But such abstractions can also miss out on important issues. Distributions with the correct moments can still have the wrong shape and taint detailed analysis. Moreover, descriptive mathematical models may actually lead to conclusions that do not really reflect the workload. For example, consider a study of a communication network that finds a negative correlation between packet sizes and the subsequent interval to the next packet.
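The earlier remark that distributions with the correct moments can still have the wrong shape can be checked numerically. The sketch below compares an exponential and a lognormal distribution that share the same mean and variance, and prints their tail probabilities. The choice of these two distributions and the helper names are assumptions made for illustration only.

```python
import math


def lognormal_params(mean, var):
    """Return (mu, sigma) of a lognormal with the given mean and variance."""
    sigma2 = math.log(1.0 + var / mean**2)
    mu = math.log(mean) - sigma2 / 2.0
    return mu, math.sqrt(sigma2)


def exponential_tail(x, mean):
    """P(X > x) for an exponential distribution with the given mean."""
    return math.exp(-x / mean)


def lognormal_tail(x, mu, sigma):
    """P(X > x) for a lognormal distribution, via the normal survival function."""
    z = (math.log(x) - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# An exponential with mean 1 has variance 1; build a lognormal that
# matches both moments.
mean, var = 1.0, 1.0
mu, sigma = lognormal_params(mean, var)

for x in (2.0, 5.0, 10.0):
    print(
        f"x={x:4.1f}  exponential tail={exponential_tail(x, mean):.2e}  "
        f"lognormal tail={lognormal_tail(x, mu, sigma):.2e}"
    )
```

Although the first two moments agree, the lognormal places more than ten times as much probability beyond x = 10, which is the kind of difference that can distort an analysis sensitive to large values.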
