Graduate Student Seminar
Wednesday, April 29, 2015 · 11 AM - 12 PM
Session Chair: Gregory Haber
Discussant: Dr. Malinovsky
Speaker 1: Mina Hosseini
- Title
- Generalized Linear Models for Data with Direct and Proxy Observations
- Abstract
- In this project, we review three approaches to the statistical analysis of data sets with monotone missing data patterns. Such patterns occur in epidemiological data sets when patients become unable to respond for themselves as their conditions grow more severe. In these cases it is common to use proxy responses from a relative or caregiver. We investigate statistical models that can incorporate patient observations, proxy observations, or in some cases both, so that the relevant parameters and their standard errors can be estimated in a single framework. We also compare our own weighted GEE code with an experimental SAS procedure, PROC GEE, and discuss possible improvements.
Speaker 2: April Albertine
- Title
- Comparison of several types of confidence intervals based on multiply imputed synthetic data under a normal model
- Abstract
- Releasing synthetic data sets is one approach to protecting confidentiality when the original survey microdata cannot be released because of privacy concerns. The goal is to preserve important summary features, so that outside researchers can still analyze the data, while disguising individual responses that could be used to identify the subjects. Either a single synthetic data set can be made public (single imputation), or multiple synthetic data sets can be generated from the same raw data (multiple imputation). We consider multiply imputed synthetic data in which each set is imputed via posterior predictive sampling, assuming that the original data are normally distributed. We evaluate several methods of combining the summary statistics of the synthetic data sets for inference on the parameters of the underlying normal model. The methods are compared according to the expected length of the confidence intervals for the two parameters of the normal model. We present an application using household income data from the United States Current Population Survey.