Modeling Context-Dependent Latent Effect Heterogeneity

Diogo Ferrari

doi:10.1017/pan.2019.13

Published online by Cambridge University Press: 20 May 2019

Affiliation: Department of Political Science, University of Michigan, 505 South State Street, 5700 Haven Hall, Ann Arbor, MI 48104, USA. Email: diogoferrari@gmail.com

Abstract

Classical generalized linear models assume that marginal effects are homogeneous in the population given the observed covariates. Researchers can never be sure a priori whether that assumption is adequate. Recent literature in statistics and political science has proposed models that use Dirichlet process priors to deal with the possibility of latent heterogeneity in the covariate effects. In this paper, we extend and generalize those approaches and propose a hierarchical Dirichlet process of generalized linear models in which the latent heterogeneity can depend on context-level features. Such a model is important in comparative analyses when the data come from different countries and the latent heterogeneity can be a function of country-level features. We provide a Gibbs sampler for the general model, a special Gibbs sampler for Gaussian outcome variables, and a Hamiltonian Monte Carlo within Gibbs sampler to handle discrete outcome variables. We demonstrate the importance of accounting for latent heterogeneity with a Monte Carlo exercise and with two applications that replicate recent scholarly work. We show how Simpson’s paradox can emerge in empirical analysis if latent heterogeneity is ignored and how the proposed model can be used to estimate heterogeneity in the effect of covariates.
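To make the Simpson's paradox point concrete, the following is a minimal illustrative sketch in Python. It is not the paper's model, sampler, or Monte Carlo design: the two latent groups, their intercepts and slopes, the sample size, and the helper names `simulate` and `ols_slope` are all assumptions chosen for illustration. The covariate has a positive effect inside each latent group, yet a pooled regression that ignores the grouping recovers a negative slope.

```python
import numpy as np


def ols_slope(x, y):
    """Return the slope from a least-squares regression of y on x (with intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[1]


def simulate(n_per_group=500, seed=0):
    """Simulate data with a latent binary group z (illustrative values only).

    Group 0: x centered at 0, y = 5 + 1.0 * x + noise
    Group 1: x centered at 4, y = -3 + 0.5 * x + noise
    Both within-group slopes are positive, but the group with larger x
    has a much lower baseline, which drives the pooled slope negative.
    """
    rng = np.random.default_rng(seed)
    z = np.repeat([0, 1], n_per_group)
    x = rng.normal(loc=np.where(z == 0, 0.0, 4.0), scale=1.0)
    intercept = np.where(z == 0, 5.0, -3.0)
    slope = np.where(z == 0, 1.0, 0.5)
    y = intercept + slope * x + rng.normal(scale=0.5, size=x.size)
    return x, y, z


if __name__ == "__main__":
    x, y, z = simulate()
    print("pooled slope (latent group ignored):", ols_slope(x, y))
    print("slope within group 0:", ols_slope(x[z == 0], y[z == 0]))
    print("slope within group 1:", ols_slope(x[z == 1], y[z == 1]))
```

In this sketch the group label `z` is known to the simulation. In the setting the abstract describes, the grouping is latent and has to be inferred from the data, which is the role the Dirichlet process prior plays.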

Keywords

Bayesian nonparametric model; latent variables; heterogeneous effects; generalized linear models; semiparametric mixture modeling; Dirichlet regression

Type: Articles
Information: Political Analysis, Volume 28, Issue 1, January 2020, pp. 20–46

DOI: https://doi.org/10.1017/pan.2019.13
Copyright: Copyright © The Author(s) 2019. Published by Cambridge University Press on behalf of the Society for Political Methodology.

Footnotes

Author’s note: The author is thankful to Robert Franzese, Walter Mebane, Kevin Quinn, and Long Nguyen, as well as participants of the 2018 Polmeth and 2018 APSA Annual Meeting, for helpful comments on previous versions of this manuscript. The author also thanks the editor Jeff Gill and two anonymous reviewers for their invaluable suggestions. Replication materials are publicly available on the Political Analysis Harvard Dataverse (Ferrari 2018) as well as on the author’s website.

Contributing Editor: Jeff Gill

References

Aakvik, A., Heckman, J. J., and Vytlacil, E. J. 2005. “Estimating Treatment Effects for Discrete Outcomes When Responses to Treatment Vary: An Application to Norwegian Vocational Rehabilitation Programs.” Journal of Econometrics 125(1–2):15–51.

Alesina, A., and Angeletos, G.-M. 2005. “Fairness and Redistribution.” The American Economic Review 95(4):960–980.

Alesina, A., and Giuliano, P. 2010. “Preferences for Redistribution.” In Handbook of Social Economics, edited by Benhabib, J., Bisin, A., and Jackson, M. O., 93–131. Amsterdam: Elsevier.

Arts, W., and Gelissen, J. 2001. “Welfare States, Solidarity and Justice Principles: Does the Type Really Matter?” Acta Sociologica 44(4):283–299.

Bechtel, M. M., Hainmueller, J., and Margalit, Y. 2014. “Preferences for International Redistribution: The Divide Over the Eurozone Bailouts.” American Journal of Political Science 58(4):835–856.

Beramendi, P., and Rehm, P. 2016. “Who Gives, Who Gains? Progressivity and Preferences.” Comparative Political Studies 49(4):529–563.

Blei, D. M., and Jordan, M. I. et al. 2006. “Variational Inference for Dirichlet Process Mixtures.” Bayesian Analysis 1(1):121–143.

Blyth, C. R. 1972. “On Simpson’s Paradox and the Sure-Thing Principle.” Journal of the American Statistical Association 67(338):364–366.

Brooks, S. P., and Gelman, A. 1998. “General Methods for Monitoring Convergence of Iterative Simulations.” Journal of Computational and Graphical Statistics 7(4):434–455.

Calin, O., and Chang, D.-C. 2006. Geometric Mechanics on Riemannian Manifolds: Applications to Partial Differential Equations. Birkhauser: Springer Science & Business Media.

Carlin, B. P., and Louis, T. A. 2000. Bayes and Empirical Bayes Methods for Data Analysis, 2nd edn. Boca Raton, FL: Chapman & Hall/CRC.

Carota, C., and Parmigiani, G. 2002. “Semiparametric Regression for Count Data.” Biometrika 89(2):265–281.

Chen, X. 2007. “Large Sample Sieve Estimation of Semi-Nonparametric Models.” Handbook of Econometrics 6:5549–5632.

Cowles, M. K., and Carlin, B. P. 1996. “Markov Chain Monte Carlo Convergence Diagnostics: A Comparative Review.” Journal of the American Statistical Association 91(434):883–904.

De Iorio, M., Müller, P., Rosner, G. L., and MacEachern, S. N. 2004. “An ANOVA Model for Dependent Random Measures.” Journal of the American Statistical Association 99(465):205–215.

De la Cruz-Mesía, R., Quintana, F. A., and Marshall, G. 2008. “Model-Based Clustering for Longitudinal Data.” Computational Statistics & Data Analysis 52(3):1441–1457.

Diaconis, P., and Freedman, D. 1986. “On the Consistency of Bayes Estimates.” Annals of Statistics 14(1):1–26.

Dorazio, R. M., Mukherjee, B., Zhang, L., Ghosh, M., Jelks, H. L., and Jordan, F. 2008. “Modeling Unobserved Sources of Heterogeneity in Animal Abundance Using a Dirichlet Process Prior.” Biometrics 64(2):635–644.

Duane, S., Kennedy, A. D., Pendleton, B. J., and Roweth, D. 1987. “Hybrid Monte Carlo.” Physics Letters B 195(2):216–222.

Ebbes, P., Wedel, M., and Böckenholt, U. 2009. “Frugal IV Alternatives to Identify the Parameter for an Endogenous Regressor.” Journal of Applied Econometrics 24(3):446–468.

Ebbes, P., Wedel, M., Böckenholt, U., and Steerneman, T. 2005. “Solving and Testing for Regressor-Error (in) Dependence When No Instrumental Variables are Available: With New Evidence for the Effect of Education on Income.” Quantitative Marketing and Economics 3(4):365–392.

Ebbes, P., Böckenholt, U., and Wedel, M. 2004. “Regressor and Random-Effects Dependencies in Multilevel Models.” Statistica Neerlandica 58(2):161–178.

Ferrari, D. 2018. “Replication Data for: Modeling Context-Dependent Latent Effect Heterogeneity.” https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/WB9XLZ.

Flegal, J. M., Haran, M., and Jones, G. L. 2008. “Markov Chain Monte Carlo: Can We Trust the Third Significant Figure?” Statistical Science 23(2):250–260.

Flegal, J. M. 2008. “Monte Carlo Standard Errors for Markov Chain Monte Carlo.” PhD thesis, University of Minnesota.

Gaffney, S. 2003. “Curve Clustering with Random Effects Regression Mixtures.” In Ninth International Workshop on Artificial Intelligence and Statistics, AISTATS. Key West, Florida.

Gelman, A., and Hill, J. 2007. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press.

Geweke, J. 1992. “Evaluating the Accuracy of Sampling-Based Approaches to the Calculation of Posterior Moments.” In Bayesian Statistics, 4th edn., 169–193. Oxford: Oxford University Press.

Ghosal, S., Ghosh, J. K., and Ramamoorthi, R. V. 1999. “Consistent Semiparametric Bayesian Inference about a Location Parameter.” Journal of Statistical Planning and Inference 77(2):181–193.

Gill, J., and Casella, G. 2009. “Nonparametric Priors for Ordinal Bayesian Social Science Models: Specification and Estimation.” Journal of the American Statistical Association 104(486):453–454.

Girolami, M., and Calderhead, B. 2011. “Riemann Manifold Langevin and Hamiltonian Monte Carlo Methods.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73(2):123–214.

Grimmer, J. 2009. “A Bayesian Hierarchical Topic Model for Political Texts: Measuring Expressed Agendas in Senate Press Releases.” Political Analysis 18(1):1–35.

Hannah, L. A., Blei, D. M., and Powell, W. B. 2011. “Dirichlet Process Mixtures of Generalized Linear Models.” Journal of Machine Learning Research 12(Jun):1923–1953.

Hayashi, F. 2000. Econometrics, vol. 1. Princeton, NJ: Princeton University Press.

Heckman, J. J., and Vytlacil, E. J. 2007. “Econometric Evaluation of Social Programs, Part II: Using the Marginal Treatment Effect to Organize Alternative Econometric Estimators to Evaluate Social Programs, and to Forecast their Effects in New Environments.” Handbook of Econometrics 6:4875–5143.

Heinzl, F., and Tutz, G. 2013. “Clustering in Linear Mixed Models with Approximate Dirichlet Process Mixtures Using EM Algorithm.” Statistical Modelling 13(1):41–67.

Hernán, M. A., Clayton, D., and Keiding, N. 2011. “The Simpson’s Paradox Unraveled.” International Journal of Epidemiology 40(3):780–785.

Ichimura, H., and Todd, P. E. 2007. “Implementing Nonparametric and Semiparametric Estimators.” Handbook of Econometrics 6:5369–5468.

Ishwaran, H., and James, L. F. 2001. “Gibbs Sampling Methods for Stick-Breaking Priors.” Journal of the American Statistical Association 96(453):161–173.

Ishwaran, H., and Zarepour, M. 2000. “Markov Chain Monte Carlo in Approximate Dirichlet and Beta Two-Parameter Process Hierarchical Models.” Biometrika 87(2):371–390.

Johnston, R., Banting, K., Kymlicka, W., and Soroka, S. 2010. “National Identity and Support for the Welfare State.” Canadian Journal of Political Science 43(02):349–377.

Kievit, R., Frankenhuis, W. E., Waldorp, L., and Borsboom, D. 2013. “Simpson’s Paradox in Psychological Science: A Practical Guide.” Frontiers in Psychology 4(513):1–14.

Kleinman, K. P., and Ibrahim, J. G. 1998a. “A Semi-Parametric Bayesian Approach to Generalized Linear Mixed Models.” Statistics in Medicine 17(22):2579–2596.

Kleinman, K. P., and Ibrahim, J. G. 1998b. “A Semiparametric Bayesian Approach to the Random Effects Model.” Biometrics 54(3):921–938.

Kyung, M., Gill, J., and Casella, G. et al. 2010. “Estimation in Dirichlet Random Effects Models.” The Annals of Statistics 38(2):979–1009.

Lenk, P. J., and DeSarbo, W. S. 2000. “Bayesian Inference for Finite Mixtures of Generalized Linear Models with Random Effects.” Psychometrika 65(1):93–119.

Little, R. et al. 2011. “Calibrated Bayes, for Statistics in General, and Missing Data in Particular.” Statistical Science 26(2):162–174.

Liu, J. S. 2008. Monte Carlo Strategies in Scientific Computing. New York: Springer Science & Business Media.

Mallick, B. K., and Walker, S. G. 1997. “Combining Information from Several Experiments with Nonparametric Priors.” Biometrika 84(3):697–706.

Matzkin, R. L. 2007. “Nonparametric Identification.” Handbook of Econometrics 6:5307–5368.

Mukhopadhyay, S., and Gelfand, A. E. 1997. “Dirichlet Process Mixed Generalized Linear Models.” Journal of the American Statistical Association 92(438):633–639.

Müller, P., and Mitra, R. 2013. “Bayesian Nonparametric Inference-Why and How.” Bayesian Analysis 8(2):269–302.

Müller, P., Quintana, F., and Rosner, G. 2004. “A Method for Combining Inference Across Related Nonparametric Bayesian Models.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 66(3):735–749.

Neal, R. M. 2000. “Markov Chain Sampling Methods for Dirichlet Process Mixture Models.” Journal of Computational and Graphical Statistics 9(2):249–265.

Neal, R. M. et al. 2011. MCMC Using Hamiltonian Dynamics, vol. 2. New York, NY: CRC Press.

Newman, B. J., Johnston, C. D., and Lown, P. L. 2015. “False Consciousness or Class Awareness? Local Income Inequality, Personal Economic Position, and Belief in American Meritocracy.” American Journal of Political Science 59(2):326–340.

Ng, S.-K., McLachlan, G. J., Wang, K., Ben-Tovim Jones, L., and Ng, S.-W. 2006. “A Mixture Model with Random-Effects Components for Clustering Correlated Gene-Expression Profiles.” Bioinformatics 22(14):1745–1752.

Pearl, J. 2011. “Simpson’s Paradox: An Anatomy.” Technical Report UCLA: Department of Statistics Los Angeles, California. https://escholarship.org/uc/item/3s62r0d6.

Pearl, J. 2014. “Comment: Understanding Simpson’s Paradox.” The American Statistician 68(1):8–13.

Pearson, K., Lee, A., and Bramley-Moore, L. 1899. “Mathematical Contributions to the Theory of Evolution. VI. Genetic (Reproductive) Selection: Inheritance of Fertility in Man, and of Fecundity in Thoroughbred Racehorses.” Philosophical Transactions of the Royal Society of London Series A 192:257–330.

Przeworski, A. 2007. “Is the Science of Comparative Politics Possible?” In The Oxford Handbook of Comparative Politics, edited by Boix, C., and Stokes, S. C. Oxford Handbooks Online.

Rehm, P. 2009. “Risks and Redistribution: An Individual-Level Analysis.” Comparative Political Studies 42(7):855–881.

Rossi, P. 2014. Bayesian Non- and Semi-Parametric Methods and Applications. Princeton, NJ: Princeton University Press.

Rossi, P. E., Allenby, G. M., and McCulloch, R. 2006. Bayesian Statistics and Marketing. Chichester: John Wiley & Sons.

Rueda, D., and Stegmueller, D. 2016. “The Externalities of Inequality: Fear of Crime and Preferences for Redistribution in Western Europe.” American Journal of Political Science 60(2):472–489.

Samuels, M. L. 1993. “Simpson’s Paradox and Related Phenomena.” Journal of the American Statistical Association 88(421):81–88.

Sethuraman, J. 1994. “A Constructive Definition of Dirichlet Priors.” Statistica Sinica 4:639–650.

Shahbaba, B., and Neal, R. M. 2009. “Nonlinear Models Using Dirichlet Process Mixtures.” Journal of Machine Learning Research 10(Aug):1829–1850.

Shayo, M. 2009. “A Model of Social Identity with an Application to Political Economy: Nation, Class, and Redistribution.” American Political Science Review 103(02):147–174.

Simpson, E. H. 1951. “The Interpretation of Interaction in Contingency Tables.” Journal of the Royal Statistical Society. Series B (Methodological) 13(2):238–241.

Spirling, A., and Quinn, K. 2010. “Identifying Intraparty Voting Blocs in the UK House of Commons.” Journal of the American Statistical Association 105(490):447–457.

Stegmueller, D. 2013. “Modeling Dynamic Preferences: A Bayesian Robust Dynamic Latent Ordered Probit Model.” Political Analysis 21(3):314–333.

Stokes, S. C. 2014. “A Defense of Observational Research.” In Field Experiments and their Critics: Essays on the Uses and Abuses of Experimentation in the Social Sciences, edited by Teele, D. L., 33–57. New Haven, CT: Yale University Press.

Svallfors, S. 1997. “Worlds of Welfare and Attitudes to Redistribution: A Comparison of Eight Western Nations.” European Sociological Review 13(3):283–304.

Teh, Y. W., Jordan, M. I., Beal, M. J., and Blei, D. M. 2006. “Hierarchical Dirichlet Processes.” Journal of the American Statistical Association 101:1566–1581.

Tokdar, S. T. 2006. “Posterior Consistency of Dirichlet Location-Scale Mixture of Normals in Density Estimation and Regression.” Sankhyā: The Indian Journal of Statistics 68(1):90–110.

Traunmuller, R., Murr, A., and Gill, J. 2015. “Modeling Latent Information in Voting Data with Dirichlet Process Priors.” Political Analysis 23(1):1, http://dx.doi.org/10.1093/pan/mpu018.

Verbeke, G., and Lesaffre, E. 1997. “The Effect of Misspecifying the Random-Effects Distribution in Linear Mixed Models for Longitudinal Data.” Computational Statistics & Data Analysis 23(4):541–556.

Villarroel, L., Marshall, G., and Barón, A. E. 2009. “Cluster Analysis Using Multivariate Mixed Effects Models.” Statistics in Medicine 28(20):2552–2565.

Walker, S. G. 2007. “Sampling the Dirichlet Mixture Model with Slices.” Communications in Statistics - Simulation and Computation 36(1):45–54.

Wooldridge, J. M. 2002. Econometric Analysis of Cross-Sectional and Panel Data. Cambridge and London: MIT Press.

Yule, G. U. 1903. “Notes on the Theory of Association of Attributes in Statistics.” Biometrika 2(2):121–134.
