We Have to Be Discrete About This: A Non-Parametric Imputation Technique for Missing Categorical Data
Gill, Jeff, and Skyler J. Cranmer. “We Have to Be Discrete About This: A Non-Parametric Imputation Technique for Missing Categorical Data.” British Journal of Political Science 43, no. 2 (2013): 425-449.
Missing values are a frequent problem in empirical political science research. Surprisingly, the match between the measurement of the missing values and the correcting algorithms applied is seldom studied. While multiple imputation is a vast improvement over the deletion of cases with missing values, it is often unsuitable for imputing highly non-granular discrete data. We develop a simple technique for imputing missing values in such situations, which is a variant of hot deck imputation, drawing from the conditional distribution of the variable with missing values to preserve the discrete measure of the variable. This method is tested against existing techniques using Monte Carlo analysis and then applied to real data on democratization and modernization theory. Software for our imputation technique is provided in a free, easy-to-use package for the R statistical environment.
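To make the hot deck idea concrete, here is a minimal Python sketch of a generic random hot deck. It is not the authors' algorithm and not the interface of their R package. The function name `hot_deck_impute`, the toy data, the exact-match donor rule, and the fallback to all observed values are assumptions made purely for illustration. The property it demonstrates is the one the abstract emphasizes: every imputed value is copied from an observed case, so the variable keeps its discrete categories instead of receiving fractional values.

```python
import random
from collections import defaultdict


def hot_deck_impute(rows, target, match_on, rng=None):
    """Return a copy of `rows` in which missing (None) values of `target`
    are replaced by the observed value of a randomly drawn donor row that
    has the same values on the `match_on` variables.

    Illustrative sketch only: exact matching and the global fallback pool
    are simplifying assumptions, not the published method.
    """
    rng = rng or random.Random()

    # Pool the observed target values by covariate cell.
    donors = defaultdict(list)
    for row in rows:
        if row[target] is not None:
            cell = tuple(row[k] for k in match_on)
            donors[cell].append(row[target])

    # Fallback pool for cells that contain no observed donor (assumption).
    all_observed = [v for pool in donors.values() for v in pool]

    completed = []
    for row in rows:
        new_row = dict(row)  # leave the input rows untouched
        if new_row[target] is None:
            cell = tuple(row[k] for k in match_on)
            pool = donors.get(cell) or all_observed
            if pool:  # if nothing at all is observed, the value stays None
                new_row[target] = rng.choice(pool)
        completed.append(new_row)
    return completed


# Hypothetical toy data: a three-category survey item with gaps.
data = [
    {"region": "A", "trust": 1},
    {"region": "A", "trust": 1},
    {"region": "A", "trust": None},
    {"region": "B", "trust": 3},
    {"region": "B", "trust": None},
    {"region": "C", "trust": None},  # no donor in this cell
]

completed = hot_deck_impute(data, "trust", ["region"], rng=random.Random(0))
```

Calling the function several times with different random seeds would yield several completed data sets, which is one simple way to carry imputation uncertainty into later analysis. Drawing from donors, rather than predicting from a continuous model, is what keeps each imputed value inside the observed category set.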