Three Essays on Identification and Estimation using Sample Combination in a Missing Data Context



Felt, Marie-Helene




This thesis is concerned with identification and estimation in two different contexts of missing data. In both situations, identification is achieved by employing data combination strategies. The first chapter uses panel data to understand the impact of retail payment innovations on cash usage while accounting for unobserved heterogeneity. The challenging feature of the data is the high rate (about 50 percent) of non-ignorable attrition. This missing data problem is addressed by the use of refreshment samples, which allow one to correct for potential attrition bias due to general forms of attrition instead of relying on rather restrictive assumptions. The methodological contribution is to provide identification of a three-period attrition probability function, and to discuss how to control simultaneously for non-ignorable attrition and item nonresponse. The following chapters deal with the common missing data case in which the variables of interest to a research question are not all available in one single data set, and data combination is required. In contrast with most sample combination methods, the identification strategy I propose in the second chapter does not rely on either units or variables in common across the samples to be combined. Rather, I exploit the availability of a third sample where an aggregate distribution of the variables of interest, for example the distribution of their sum, is observed. Using deconvolution methods, I establish non-parametric identification of the joint distribution of interest by combining marginal distributions. While the identification framework considered in Chapter 2 might seem quite specific, it is encountered in practice in multiple situations, and many potential applications exist. One of them occurs when individual and household surveys provide, for a given population, independent samples where the same variable is measured at the individual level in one case and at the household level in the other.
This is the data setting considered in the third chapter, which presents an empirical application of the identification procedure in the context of payment behavior analysis. In particular, in the absence of intra-household data, I investigate intra-household payment behaviors by combining individual-level and aggregated household-level payment survey data.






Carleton University


co-author (first chapter): 
Chen, Heng
co-author (first chapter): 
Huynh, Kim P.
Executive Editor, Journal of the Royal Statistical Society: 
Owen Martin

Thesis Degree Name: 

Doctor of Philosophy

Thesis Degree Level: 


Thesis Degree Discipline: 


Parent Collection: 

Theses and Dissertations

Items in CURVE are protected by copyright, with all rights reserved, unless otherwise indicated. They are made available with permission from the author(s).