In this thesis, we study statistical inference in the presence of missing data. In Chapters 2-4, we obtain asymptotically valid imputed estimators for the population mean, distribution function and correlation coefficient, and propose adjustments to the Shao and Sitter (1996) bootstrap confidence intervals under imputation for missing data. We show that the adjusted bootstrap estimators should be used with bootstrap data obtained by imitating the process used to impute the original data set.
In Chapter 5, we develop a goodness-of-fit test for longitudinal data with missing at random (MAR) observations by combining weighted generalized estimating equations (Robins et al., 1995) with the score test statistic for goodness-of-fit (Hosmer and Lemeshow, 1980; Horton et al., 1999). We show that the proposed goodness-of-fit method, which incorporates the missingness process, should be used when dealing with intermittent missingness.
In Chapter 6, we study a conditional model for a mixture of correlated discrete and continuous outcomes and apply the likelihood method to MAR data. We conduct a simulation study to compare the performance of estimators resulting from the joint model with that of estimators based on separate models for the binary and continuous outcomes. We show that when all data are observed, adopting the joint model does not lead to notable improvements; in contrast, under a scenario with binary MAR data, the joint model performs significantly better.