GSO Spring Speaker 2016
Probabilistic Modeling of Big Tables and Networks
David B. Dunson
Arts and Sciences Distinguished Professor
Departments of Statistical Science, Mathematics, and Electrical & Computer Engineering, Duke University
Venue: BRNG 2290
Abstract:
In many applications, the data are discrete, high-dimensional, complex, and highly structured. Our focus here is on high-dimensional unordered categorical data, which arise in epidemiology, social surveys, and brain connectomics. In the first part of the talk, I will focus on data that can be arranged as a multiway contingency table but otherwise have no obvious structure a priori. For such problems, we rely on probabilistic tensor factorizations: I will introduce new classes of factorizations, discuss their relationships with sparse log-linear models, sketch theory on rates of convergence, and consider applications in social science surveys and genomics. In the second part of the talk, I will focus on the case in which the categorical data consist of indicators of connections between pairs of nodes in a network, motivated in particular by brain connectomic studies. The probability distribution for such network-valued random variables can be conveniently represented via a hierarchical latent space representation. We propose a Bayesian approach to inference and show exciting results on inferring how brain structure differs across phenotypes.