
The Stata Journal
Volume 16 Number 3: pp. 717-739




Implementing Rubin's alternative multiple-imputation method for statistical matching in Stata

Anil Alpman
Paris School of Economics
Université Paris 1 Panthéon–Sorbonne
Paris, France
Abstract.  This article introduces two new commands, smpc and smmatch, that implement the statistical matching procedure proposed by Rubin (1986, Journal of Business and Economic Statistics 4: 87–94). The purpose of statistical matching in Rubin’s procedure is to generate a single dataset from several datasets, each of which contains a specific variable of interest and all of which share some common variables. For two variables of interest that are never observed jointly for any unit, smpc generates the predicted values of each as a function of the other variable of interest and a set of control variables, assuming a user-specified partial correlation between the two variables of interest. (Other statistical matching procedures assume that the two variables are conditionally independent given the control variables.) The smmatch command then matches observations from different datasets according to their predicted values, using a minimum-distance criterion conditional on a set of control variables, and imputes the match’s observed value for the missing value.
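The matching step described in the abstract can be illustrated with a minimal sketch. It is written in Python purely for illustration; it is not the smmatch implementation and does not reproduce its syntax or options. The function name `match_min_distance` and the toy data are hypothetical, the distance is assumed to be the absolute difference between predicted values, and the conditioning on control variables is omitted for brevity.

```python
def match_min_distance(recipient_pred, donor_pred, donor_obs):
    """For each recipient, find the donor whose predicted value is closest
    (absolute distance, an assumption of this sketch) and return that
    donor's observed value as the imputation."""
    imputed = []
    for p in recipient_pred:
        # Index of the donor with the smallest distance; ties go to the first donor.
        j = min(range(len(donor_pred)), key=lambda k: abs(donor_pred[k] - p))
        imputed.append(donor_obs[j])
    return imputed


# Hypothetical toy data: predicted values for units missing the variable,
# and predicted plus observed values for units where it is observed.
recipient_pred = [1.1, 2.9, 2.0]
donor_pred = [1.0, 2.0, 3.0]
donor_obs = [10.0, 20.5, 29.0]

imputed = match_min_distance(recipient_pred, donor_pred, donor_obs)
print(imputed)
```

The key point is that the imputed values are observed values borrowed from matched donors rather than the predictions themselves, which is what distinguishes this kind of matching from plain regression imputation.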

Keywords: smmatch, smpc, data combination, missing data, multiple imputation, statistical matching
