A closer examination of three small-sample approximations to the multiple-imputation degrees of freedom
In addition to varying the imputation method, we varied the number of imputations (m = 5, 10, 20, 100) used to obtain the combined estimates and standard errors, with results averaged over 500,000 replications, for a linear model that regressed the log price of a home on its age (years) and size (square feet) in a sample of 25 observations. In each replication, six age values were randomly set to missing. As assessed by absolute and relative percentage bias, the two imputation approaches (chained equations and the multivariate normal model) performed similarly. The absolute bias of the regression coefficients for age and size was roughly −0.1% across the levels of m for both approaches; the absolute bias for the constant was 0.6% for the chained-equations approach and 1.0% for the multivariate normal model. The absolute biases of the standard errors for age, size, and the constant were 0.2%, 0.3%, and 1.2%, respectively. In general, the relative percentage bias was slightly smaller for the chained-equations approach. Graphical and numerical inspection of the empirical sampling distributions for the three t statistics suggested that the area from the shoulder to the tail was reasonably well approximated by a t distribution and that the small-sample approximations to the multiple-imputation degrees of freedom proposed by Barnard and Rubin and by Reiter performed satisfactorily.
Authors: David A. Wagstaff, Ofer Harel
Keywords: missing data, multiple imputation, small-sample degrees of freedom
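The combining step and the Barnard and Rubin small-sample degrees of freedom referred to in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `pool_rubin` and the example numbers are hypothetical, and only the standard combining rules and the Barnard and Rubin adjustment are shown (Reiter's approximation is not). For the model described above, the complete-data degrees of freedom would be 25 − 3 = 22, that is, 25 observations less the three coefficients for age, size, and the constant.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances, df_complete):
    """Pool m completed-data results for one scalar coefficient.

    estimates   -- the m point estimates, one per imputed data set
    variances   -- the m squared standard errors, one per imputed data set
    df_complete -- residual degrees of freedom had no data been missing
    Returns (pooled estimate, pooled standard error, Barnard-Rubin df).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need m >= 2 estimates and one variance per estimate")

    q_bar = mean(estimates)           # pooled point estimate
    u_bar = mean(variances)           # within-imputation variance
    b = variance(estimates)           # between-imputation variance (divisor m - 1)
    t = u_bar + (1 + 1 / m) * b       # total variance

    if b == 0:
        # No between-imputation variability: fall back to the complete-data df.
        return q_bar, t ** 0.5, float(df_complete)

    gamma = (1 + 1 / m) * b / t       # share of total variance due to missingness
    df_large = (m - 1) / gamma ** 2   # large-sample (Rubin) degrees of freedom
    df_obs = (df_complete + 1) / (df_complete + 3) * df_complete * (1 - gamma)
    df_br = 1 / (1 / df_large + 1 / df_obs)  # Barnard-Rubin adjusted df
    return q_bar, t ** 0.5, df_br


# Hypothetical inputs for illustration only; these are not values from the study.
est, se, df = pool_rubin([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], df_complete=22)
print(est, se, df)
```

The adjusted degrees of freedom can never exceed the complete-data value, which is the property that matters in a sample as small as the 25 observations considered here.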