Doctoral Dissertation Defense: Amanda Plunkett
Advisor: Dr. Junyong Park
Friday, July 17, 2015 · 10 AM - 12 PM
Title: Analysis and testing of sparse high dimensional discrete data
Abstract:
High-dimensional data analysis has been one of the most challenging problems in statistics and related areas for the last two decades. High-dimensional data arise in many applications where computers can capture large amounts of information about a collected sample. Applications include genetic research, image processing, natural language processing, and signal processing, to name a few. We focus on the problem of two-sample hypothesis testing for two cases: 1) sparse high-dimensional multinomial data, and 2) sparse high-dimensional binary data. We propose new statistical tests for each, prove their theoretical validity, and evaluate their performance in various scenarios through simulations and analysis of applied problems. Additionally, we perform follow-up analysis of these datasets using statistical classification methods.