Implementing Rubin's alternative multiple-imputation method for statistical matching in Stata
Anil Alpman
Paris School of Economics
Université Paris 1 Panthéon–Sorbonne
Paris, France
[email protected]
Abstract. This article introduces two new commands, smpc and smmatch, that
implement the statistical matching procedure proposed by Rubin (1986,
Journal of Business and Economic Statistics 4: 87–94). The purpose
of statistical matching in Rubin’s procedure is to generate a single dataset
from various datasets, where each dataset contains a specific variable of
interest and all contain some variables in common. For two variables of
interest that are not observed jointly for any unit, smpc generates the
predicted values of each as a function of the other variable of interest and a
set of control variables, assuming a user-specified partial correlation between
the two variables of interest. (Other statistical matching procedures assume
instead that the two are conditionally independent given the control
variables.) The smmatch command then matches observations from different
datasets according to their predicted values (using a minimum-distance
criterion), conditional on a set of control variables, and imputes the matched
observation's observed value for the missing value.
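The minimum-distance matching step can be illustrated with a minimal sketch. This is not the smmatch implementation: it is written in Python rather than Stata, the function name match_min_distance and the toy data are hypothetical, and it ignores the conditioning on control variables that smmatch applies. It only shows the idea of imputing, for each unit with a missing value, the observed value of the donor whose predicted value is closest.

```python
def match_min_distance(recipients, donors):
    """Impute by nearest predicted value (illustrative sketch only).

    recipients: predicted values for units whose variable is missing.
    donors: (predicted, observed) pairs for units whose variable is observed.
    Returns one imputed value per recipient: the observed value of the
    donor whose predicted value is closest in absolute distance.
    Ties go to the donor listed first.
    """
    imputed = []
    for pred in recipients:
        # Pick the donor minimizing |donor prediction - recipient prediction|.
        best = min(donors, key=lambda d: abs(d[0] - pred))
        imputed.append(best[1])
    return imputed


# Hypothetical toy data: (predicted, observed) for donors.
donors = [(1.0, 10), (2.0, 20), (3.5, 35)]
recipients = [1.2, 3.0, 2.1]

imputed = match_min_distance(recipients, donors)
print(imputed)
```

In this sketch every donor can be reused for several recipients; whether and how donors are reused in smmatch is not stated in the abstract.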
Keywords: smmatch, smpc, data combination, missing data, multiple imputation, statistical matching