Statistics Colloquium: Dr. Gabor Szekely
National Science Foundation
Title: The Energy of Data
Abstract: The energy of data is the value of a real function of distances between data in metric spaces. The name energy derives from Newton's gravitational potential energy, which is also a function of distances between physical objects. One of the advantages of working with energy functions (energy statistics) is that even if the observations/data are complex objects, like functions or graphs, we can use their real-valued distances for inference. Other advantages will be illustrated and discussed in the talk. Concrete examples include energy clustering, distance correlation, energy testing for normality, and energy testing for symmetry. Applications include genome and brain studies, and an MCMC-type approach for Bayesian computation. One of the ingredients of data energy is the Riesz energy of measures. We also plan to revisit Rényi's axioms of dependence measures.
Data energy was introduced by the speaker three decades ago, but it became popular in statistical inference only after distance correlation was introduced ten years ago. Since then, more than 3,000 papers have applied data energy.