Published online by Cambridge University Press: 30 September 2013
A test for evaluating wine judge performance is developed. The test is based on the premise that an expert wine judge will award similar scores to an identical wine. The definition of "similar" is parameterized to include varying numbers of adjacent awards on an ordinal scale, from No Award to Gold. For each index of similarity, a probability distribution is developed to determine the likelihood that a judge might pass the test by chance alone. When the test is applied to the results from a major wine competition, few judges pass the test. Of greater interest is that many judges who fail the test have vast professional experience in the wine industry. This leads us to question the basic premise that experts are able to provide consistent evaluations in wine competitions and, hence, to question whether wine competitions provide reliable recommendations of wine quality. (JEL Classifications: C02, C12, D81)
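As a rough illustration of the chance-passing probability described in the abstract, the sketch below enumerates outcomes for a judge who scores replicate pours of the same wine uniformly at random on an ordinal scale, and counts how often all scores fall within a given number of adjacent awards. The uniform-random null, the number of categories, the number of replicates, and the spread threshold here are all illustrative assumptions, not the distribution actually developed in the paper.

```python
from itertools import product

def pass_probability(num_categories: int, num_replicates: int, max_spread: int) -> float:
    """Probability that scores assigned uniformly at random to replicate
    samples all fall within `max_spread` adjacent categories (i.e., a
    judge "passes" the consistency test by chance alone)."""
    outcomes = list(product(range(num_categories), repeat=num_replicates))
    passes = sum(1 for scores in outcomes if max(scores) - min(scores) <= max_spread)
    return passes / len(outcomes)

# Hypothetical example: a 10-step award scale, 4 replicate pours,
# and a pass defined as all scores within one adjacent step.
print(pass_probability(num_categories=10, num_replicates=4, max_spread=1))
```

Tightening the similarity index (smaller `max_spread`) or adding replicates shrinks the chance-passing probability, which is the intuition behind parameterizing "similar" over varying numbers of adjacent awards.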
The Advisory Board that oversees the conduct of the California State Fair Commercial Wine Competition deserves acknowledgment for sustaining this study of judge performance over the past decade. It is the only study of its kind, and the board should be commended for allowing the results to be offered in the public domain. In particular, G.M. "Pooch" Pucilowski, chief judge, should be commended for initiating and supporting this work. Analysis of the data would not have been possible without the help of Aaron Kidder, chief programmer for the competition, who implemented replicate sampling into the flights and supplied a concise format for data analysis. Finally, the authors thank the anonymous reviewers for helpful suggestions that improved this paper.