Election Fraud: A Latent Class Framework for Digit-Based Tests

Published online by Cambridge University Press: 04 January 2017

Juraj Medzihorsky*
Affiliation:
Department of Political Science, Central European University, Nador u. 9., 1051 Budapest, Hungary
*e-mail: juraj.medzihorsky@gmail.com (corresponding author)

Abstract

Digit-based election forensics (DBEF) typically relies on null hypothesis significance testing, with undesirable effects on substantive conclusions. This article proposes an alternative free of this problem. It rests on decomposing the observed numeral distribution into the “no fraud” and “fraud” latent classes, by finding the smallest fraction of numerals that needs to be either removed or reallocated to achieve a perfect fit of the “no fraud” model. The size of this fraction can be interpreted as a measure of fraudulence. Both alternatives are special cases of measures of model fit—the π∗ mixture index of fit and the Δ dissimilarity index, respectively. Furthermore, independently of the latent class framework, the distributional assumptions of DBEF can be relaxed in some contexts. Independently or jointly, the latent class framework and the relaxed distributional assumptions allow us to dissect the observed distributions using models more flexible than those of existing DBEF. Reanalysis of Beber and Scacco's (2012) data shows that the approach can lead to new substantive conclusions.
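
The two measures named in the abstract can be illustrated with a minimal sketch. It assumes the simplest possible setting: a fully specified "no fraud" model, here a uniform distribution over the ten last digits. Under that assumption the Δ dissimilarity index is half the sum of absolute differences between observed and expected shares (the smallest fraction that must be reallocated), and the π* index reduces to one minus the smallest ratio of observed to expected share (the smallest fraction that must be removed). The function names and the digit counts below are hypothetical and are not taken from the article or from the pistar package; for models with free parameters, π* generally requires numerical optimization rather than this closed form.

```python
from typing import List, Sequence


def _shares(counts: Sequence[int]) -> List[float]:
    """Convert raw digit counts into observed shares."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts]


def dissimilarity_index(counts: Sequence[int], null_probs: Sequence[float]) -> float:
    """Delta: smallest fraction of numerals to reallocate for a perfect fit."""
    p = _shares(counts)
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, null_probs))


def pi_star_fixed_null(counts: Sequence[int], null_probs: Sequence[float]) -> float:
    """pi*: smallest fraction of numerals to remove for a perfect fit.

    Valid only for a fully specified null (an assumption of this sketch):
    the largest weight w with w * q_i <= p_i for every digit i is
    min(p_i / q_i), and pi* = 1 - w.
    """
    p = _shares(counts)
    retained = min(pi / qi for pi, qi in zip(p, null_probs) if qi > 0)
    return 1.0 - retained


# Hypothetical last-digit counts for digits 0..9 (illustrative, not real data).
uniform = [0.1] * 10
counts = [14, 10, 10, 10, 10, 10, 10, 10, 10, 6]

delta = dissimilarity_index(counts, uniform)
pi_star = pi_star_fixed_null(counts, uniform)
print(f"Delta = {delta:.2f}, pi* = {pi_star:.2f}")
```

In this hypothetical example Δ is about 0.04 while π* is about 0.40. Moving four percent of the numerals from digit 0 to digit 9 restores uniformity, but if numerals may only be removed, the scarcity of 9s forces a much larger fraction out. This is why the two indices can rank the same data quite differently.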

Type
Articles
Copyright
Copyright © The Author 2015. Published by Oxford University Press on behalf of the Society for Political Methodology 

Footnotes

Author's note: I am grateful to Tamás Rudas, Gábor Tóka, Levente Littvay, Zoltán Fazekas, Daniela Širinić, Pavol Hardos, two anonymous reviewers, and the editors for helpful comments and suggestions, and the members of the Political Behavior Research Group at CEU for helpful discussion. Replication materials are available online as Medzihorsky, Juraj, 2015, “Replication Data for: Election Fraud: A Latent Class Framework for Digit-Based Tests”, http://dx.doi.org/10.7910/DVN/1FYXUJ, Harvard Dataverse, V2 (Medzihorsky 2015b), and include the version of the R package pistar (Medzihorsky 2015a) used in the analysis. The article uses data from Beber and Scacco (2012), which is available online also as Beber and Scacco (2011). Supplementary materials for this article are available on the Political Analysis Web site.

References

Agresti, A. 2002. Categorical data analysis, 2nd ed. Hoboken, N.J.: John Wiley & Sons.
Alvarez, R. M., Hall, T. E., and Hyde, S. D. 2009. Election fraud: Detecting and deterring electoral manipulation. Washington, D.C.: Brookings Institution Press.
Alvarez, R. M., Atkeson, L. R., and Hall, T. E. 2012. Evaluating elections: A handbook of methods and standards. Cambridge [England]; New York: Cambridge University Press.
Beber, B., and Scacco, A. 2011. Replication Data for: What the Numbers Say: A Digit-Based Test for Election Fraud. Harvard Dataverse, V2. http://hdl.handle.net/1902.1/17151 (accessed April 26, 2014).
Beber, B., and Scacco, A. 2012. What the numbers say: A digit-based test for election fraud. Political Analysis 20(2): 211–34.
Benford, F. 1938. The law of anomalous numbers. Proceedings of the American Philosophical Society 78:551–72.
Breunig, C., and Goerres, A. 2011. Searching for electoral irregularities in an established democracy: Applying Benford's law tests to Bundestag elections in unified Germany. Electoral Studies 30(3): 534–45.
Buttorf, G. 2008. Detecting fraud in America's Gilded Age. Unpublished manuscript, University of Iowa.
Cantú, F., and Saiegh, S. M. 2011. Fraudulent democracy? An analysis of Argentina's infamous decade using supervised machine learning. Political Analysis 19(4): 409–33.
Clogg, C., Rudas, T., and Xi, L. 1995. A new index of structure for the analysis of models for mobility tables and other cross-classifications. Sociological Methodology 25:197–222.
Clogg, C. C., Rudas, T., and Matthews, S. 1997. Analysis of contingency tables using graphical displays based on the mixture index of fit. In Visualization of categorical data, eds. Blasius, J. and Greenacre, M., 425–39. San Diego: Academic Press.
Dayton, C. M. 2003. Applications and computational strategies for the two-point mixture index of fit. British Journal of Mathematical and Statistical Psychology 56(1): 1–13.
Deckert, J., Myagkov, M., and Ordeshook, P. C. 2011. Benford's law and the detection of election fraud. Political Analysis 19(3): 245–68.
Formann, A. K. 2000. Rater agreement and the generalized Rudas-Clogg-Lindsay index of fit. Statistics in Medicine 19(14): 1881–8.
Formann, A. K. 2003a. Latent class model diagnosis from a frequentist point of view. Biometrics 59(1): 189–96.
Formann, A. K. 2003b. Latent class model diagnostics: A review and some proposals. Computational Statistics & Data Analysis 41(3): 549–59.
Formann, A. K. 2006. Testing the Rasch model by means of the mixture fit index. British Journal of Mathematical and Statistical Psychology 59(1): 89–95.
Giles, D. E. 2007. Benford's law and naturally occurring prices in certain eBay auctions. Applied Economics Letters 14(3): 157–61.
Gini, C. 1914. Di una misura della dissomiglianza tra due gruppi di quantità e delle sue applicazioni allo studio delle relazione statistiche. Atti del Reale Instituto Veneto di Scienze, Lettere ed Arti (Series 8) 74:185–213.
Hernández, J. M., Rubio, V. J., Revuelta, J., and Santacreu, J. 2006. A procedure for estimating intrasubject behavior consistency. Educational and Psychological Measurement 66(3): 417–34.
Hill, T. P. 1995. A statistical derivation of the significant-digit law. Statistical Science 10(4): 354–63.
Ispány, M., and Verdes, E. 2014. On the robustness of mixture index of fit. Journal of Mathematical Sciences 200(4): 432–40.
Jiménez, R., and Hidalgo, M. 2014. Forensic analysis of Venezuelan elections during the Chávez presidency. PLoS One 9(6):e100884.
Judge, G., and Schechter, L. 2009. Detecting problems in survey data using Benford's law. Journal of Human Resources 44(1): 1–24.
Leemann, L., and Bochsler, D. 2014. A systematic approach to study electoral fraud. Electoral Studies 35:33–47.
Leemis, L. M., Schmeiser, B. W., and Evans, D. L. 2000. Survival distributions satisfying Benford's law. American Statistician 54(4): 236–41.
Mebane, W. R. 2006a. Election forensics: The second-digit Benford's law test and recent American presidential elections. In Prepared for delivery at the Election Fraud Conference. September 29–30, Salt Lake City, Utah.
Mebane, W. R. 2006b. Election forensics: Vote counts and Benford's law. Summer Meeting of the Political Methodology Society, UC-Davis, July.
Mebane, W. R. 2007. Election forensics: Statistical interventions in election controversies. Prepared for presentation at the 2007 Annual Meeting of the American Political Science Association, Chicago, Aug 30-Sep 2.
Mebane, W. R. 2008. Election forensics: Outlier and digit tests in America and Russia. American Electoral Process Conference, Center for the Study of Democratic Politics, Princeton University.
Mebane, W. R. 2010a. Election fraud or strategic voting? Can second-digit tests tell the difference? Summer Meeting of the Political Methodology Society, University of Iowa.
Mebane, W. R. 2010b. Fraud in the 2009 presidential election in Iran? Chance 23(1): 6–15.
Mebane, W. R. 2011. Comment on “Benford's law and the detection of election fraud.” Political Analysis 19(3): 269–72.
Mebane, W. R., and Kalinin, K. 2009. Comparative election fraud detection. Prepared for presentation at the 2009 Annual Meeting of the American Political Science Association, Toronto, Canada, Sept 3–6.
Medzihorsky, J. 2015a. pistar: Rudas, Clogg and Lindsay mixture index of fit. R package version 0.5.2.5. https://github.com/jmedzihorsky/pistar.
Medzihorsky, J. 2015b. Replication Data for: Election Fraud: A Latent Class Framework for Digit-Based Tests. Harvard Dataverse, V2 [UNF:6:FIWHvsHNzZgPStT0+kgbsQ==]. http://dx.doi.org/10.7910/DVN/1FYXUJ.
Newcomb, S. 1881. Note on the frequency of use of the different digits in natural numbers. American Journal of Mathematics 4(1): 39–40.
Nickerson, R. S. 2002. The production and perception of randomness. Psychological Review 109(2): 330.
Norris, P., Frank, R. W., and Coma, F. M. I. 2014. Advancing electoral integrity. Oxford: Oxford University Press.
Pericchi, L., and Torres, D. 2011. Quick anomaly detection by the Newcomb-Benford law, with applications to electoral processes data from the USA, Puerto Rico, and Venezuela. Statistical Science 26(4): 502–16.
Revuelta, J. 2008. Estimating the π* goodness of fit index for finite mixtures of item response models. British Journal of Mathematical and Statistical Psychology 61(1): 93–113.
Rudas, T. 1998. The mixture index of fit. In Advances in methodology, data analysis, and statistics, ed. Ferligoj, A., 15–22. Ljubljana: FDV.
Rudas, T. 1999. The mixture index of fit and minimax regression. Metrika 50(2): 163–72.
Rudas, T. 2002. A latent class approach to measuring the fit of a statistical model. In Applied latent class analysis, eds. Hagenaars, J. A. and McCutcheon, A. L., 345–65. Cambridge: Cambridge University Press.
Rudas, T. 2005. Mixture models of missing data. Quality & Quantity 39(1): 19–36.
Rudas, T., Clogg, C., and Lindsay, B. 1994. A new index of fit based on mixture methods for the analysis of contingency tables. Journal of the Royal Statistical Society. Series B (Methodological) 56(4): 623–39.
Rudas, T., and Verdes, E. 2015. Model-based analysis of incomplete data using the mixture index of fit. In Advances in latent class analysis: A Festschrift in Honor of C. Mitchell Dayton, eds. Hancock, G. R. and Macready, G. B. Charlotte, NC: Information Age Publishing.
Rudas, T., and Zwick, R. 1997. Estimating the importance of differential item functioning. Journal of Educational and Behavioral Statistics 22(1): 31–45.
Tam Cho, W. K., and Gaines, B. J. 2007. Breaking the (Benford) law: Statistical fraud detection in campaign finance. American Statistician 61(3): 218–23.
Verdes, E., and Rudas, T. 2003. The π* index as a new alternative for assessing goodness of fit of logistic regression. In Foundations of Statistical Inference: Proceedings of the Shoresh Conference 2000, eds. Haitovsky, Y. and Ritov, Y., 167–77. Berlin and Heidelberg: Springer.
Ziliak, S. T., and McCloskey, D. N. 2008. The cult of statistical significance: How the standard error costs us jobs, justice, and lives. Ann Arbor: University of Michigan Press.