Statistics Colloquium
Dr. Andrew Raim, U.S. Census Bureau
Title: An Extension of Generalized Linear Models to Finite Mixture Outcomes
Abstract: Finite mixture distributions arise in sampling a heterogeneous
population, and data drawn from such a population will exhibit extra
variability relative to any single subpopulation. Statistical models
based on finite mixtures can assist in the analysis of categorical and
count data, where standard generalized linear models (GLMs) often do
not adequately explain large variability observed in the data. We
propose an extension of the GLM where the response is assumed to follow
a finite mixture distribution, but the regression of interest is linked
to the mixture's mean. This approach may be preferred over a finite
mixture of regressions when the population mean is the quantity of
interest; here, only a single regression function must be specified and
interpreted in the analysis. A technical challenge is that the mean of a
finite mixture is a composite parameter which does not appear explicitly
in the distribution. The proposed model is completely likelihood-based,
and maintains the link to the regression through a certain random
effects structure. We consider typical GLM cases where means are either
real-valued, constrained to be positive, or constrained to be on the
unit interval. In a Bayesian setting, good Markov chain Monte Carlo
performance is possible by avoiding direct computation of the
likelihood. Benefits of acknowledging the extra variation include
improved residual plots and appropriately widened prediction intervals.
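
The extra variability mentioned in the abstract can be illustrated with a small simulation. This is only a sketch under assumed values, not the speaker's model: the two-component binomial mixture, the weights, the success probabilities, and the names draw_binomial and simulate are all hypothetical choices made for illustration. It compares counts drawn from the mixture with counts drawn from a single binomial that has the same mean.

```python
import random
import statistics


def draw_binomial(m, p, rng):
    """Draw one Binomial(m, p) count as a sum of m Bernoulli trials."""
    return sum(rng.random() < p for _ in range(m))


def simulate(n=5000, m=20, weights=(0.5, 0.5), probs=(0.2, 0.6), seed=1):
    """Draw n counts from a binomial mixture and n from a single binomial.

    All default values are illustrative assumptions. The single binomial
    uses the mixture's mean success probability, sum_j weights[j] * probs[j],
    which is the composite quantity a mean regression would target.
    """
    rng = random.Random(seed)
    mean_p = sum(w * p for w, p in zip(weights, probs))

    mixture = []
    for _ in range(n):
        # Pick a subpopulation, then draw a count from it.
        p = rng.choices(probs, weights=weights)[0]
        mixture.append(draw_binomial(m, p, rng))

    single = [draw_binomial(m, mean_p, rng) for _ in range(n)]
    return mean_p, mixture, single


if __name__ == "__main__":
    mean_p, mixture, single = simulate()
    print("mixture mean success probability:", mean_p)
    print("sample means:", statistics.mean(mixture), statistics.mean(single))
    print("sample variances:",
          statistics.variance(mixture), statistics.variance(single))
```

Under these assumed values both samples have theoretical mean 8, but the single binomial has theoretical variance 20(0.4)(0.6) = 4.8, while the mixture has variance 4 + 16 = 20 by the law of total variance. The mean probability 0.4 is not a parameter of either component; it appears only as a weighted combination of them, which is the sense in which the mixture mean is a composite parameter.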