Michael I. Jordan, University of California, Berkeley
Friday, September 2 at 10:30 a.m. EDT
Statistical decisions are often given meaning in the context of other decisions, particularly when there are scarce resources to be shared. Managing such sharing is one of the classical goals of microeconomics, and it is given new relevance in the modern setting of large, human-focused datasets, and in data-analytic contexts such as classifiers and recommendation systems. I'll discuss several recent projects that aim to explore the interface between machine learning and microeconomics, including leader/follower dynamics in strategic classification, a Lyapunov theory for matching markets with transfers, and the use of contract theory as a way to design mechanisms that perform statistical inference.
Emmanuel Candès, Stanford University
Wednesday, September 7 at 11:30 a.m. EDT
Recent progress in machine learning provides us with many potentially effective tools to learn from datasets of ever-increasing sizes and make useful predictions. How do we know that these tools can be trusted in critical and highly sensitive domains? If a learning algorithm predicts the GPA of a prospective college applicant, what guarantees do we have concerning the accuracy of this prediction? How do we know that it is not biased against certain groups of applicants? To address questions of this kind, this talk reviews a wonderful field of research known under the name of conformal inference/prediction, pioneered by Vladimir Vovk and his colleagues 20 years ago. After reviewing some of the basic ideas underlying distribution-free predictive inference, we shall survey recent progress in the field, touching upon several issues: (1) efficiency: how can we provide tighter predictions? (2) data reuse: what do we do when data is scarce? (3) algorithmic fairness: how do we make sure that learned models apply to individuals in an equitable manner? and (4) causal inference: can we predict the counterfactual response to a treatment given that the patient was not treated?
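The distribution-free guarantee at the heart of conformal prediction can be illustrated with the simplest variant, split conformal: fit any predictor on one half of the data, compute absolute residuals on the held-out half, and widen predictions by an empirical quantile of those residuals. The sketch below uses synthetic data and a least-squares fit purely for illustration; nothing here is specific to the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (illustrative): y = x + Gaussian noise.
n = 2000
x = rng.uniform(0, 10, size=n)
y = x + rng.normal(0, 1, size=n)

# Split: fit a point predictor on one half, calibrate on the other.
x_fit, y_fit = x[:1000], y[:1000]
x_cal, y_cal = x[1000:], y[1000:]

# Any predictor works; here, ordinary least squares.
slope, intercept = np.polyfit(x_fit, y_fit, 1)
predict = lambda t: slope * t + intercept

# Calibration scores: absolute residuals on held-out data.
scores = np.abs(y_cal - predict(x_cal))

# For miscoverage alpha, take the ceil((n_cal + 1)(1 - alpha)) / n_cal
# empirical quantile of the scores (the finite-sample correction).
alpha = 0.1
n_cal = len(scores)
level = np.ceil((n_cal + 1) * (1 - alpha)) / n_cal
q = np.quantile(scores, level, method="higher")

# Prediction interval for a new point: point prediction plus/minus q.
x_new = 5.0
interval = (predict(x_new) - q, predict(x_new) + q)
```

Under exchangeability of the data, the interval covers the true response with probability at least 1 - alpha, no matter how poor the fitted model is; a bad model simply yields wider intervals.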
Xiao-Li Meng, Harvard University
Friday, September 16 at 10:30 a.m. EDT
Non-probability samples are deprived of the powerful design probability for randomization-based inference. This deprivation, however, encourages us to take advantage of a natural divine probability that comes with any finite population. A key metric from this perspective is the data defect correlation (ddc), which is the model-free finite-population correlation between the individual's sample inclusion indicator and the individual's attribute being sampled. A data generating mechanism is equivalent to a probability sampling, in terms of design effect, if and only if its corresponding ddc is of N^{-1/2} (stochastic) order, where N is the population size (Meng, 2018, AOAS). Consequently, existing valid linear estimation methods for non-probability samples can be recast as various strategies to miniaturize the ddc down to the order of N^{-1/2}. The quasi design-based methods accomplish this task by diminishing the variability among the N inclusion propensities via weighting. The super-population model-based approach achieves the same goal through reducing the variability of the N individual attributes by replacing them with their residuals from a regression model. The double robust estimators enjoy their celebrated property because a correlation is zero whenever one of the variables being correlated is constant, regardless of which one. Understanding the commonality of these methods through ddc reveals the possibility of “double-plus robustness”: a valid estimation without relying on the full validity of either the super-population model or the estimated inclusion propensity, neither of which is guaranteed because both rely on device probability.
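The role of the ddc can be seen in the exact error identity from Meng (2018, AOAS): the sample-mean error decomposes as (sample mean − population mean) = ddc × sqrt((1 − f)/f) × σ_Y, where f = n/N is the sampling fraction and σ_Y is the finite-population standard deviation. The numerical check below uses a synthetic population and a deliberately biased inclusion mechanism, both invented here for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# A synthetic finite population of size N (illustrative only).
N = 100_000
Y = rng.normal(50, 10, size=N)

# A biased, non-probability inclusion mechanism: units with larger Y
# are more likely to enter the sample (a self-selection effect).
p = 1 / (1 + np.exp(-(Y - 50) / 10))   # inclusion propensities
R = rng.random(N) < 0.1 * p            # inclusion indicators

# Data defect correlation: finite-population correlation between R and Y.
ddc = np.corrcoef(R, Y)[0, 1]

# Meng's identity: error = ddc * sqrt((1 - f) / f) * sigma_Y,
# with f = n/N and sigma_Y the population std (denominator N).
f = R.mean()
sigma_Y = Y.std()                      # ddof=0, matching the identity
error_identity = ddc * np.sqrt((1 - f) / f) * sigma_Y

# Direct computation of the same error for comparison.
error_direct = Y[R].mean() - Y.mean()
```

Because the mechanism favors large Y, the ddc is positive and far above the N^{-1/2} order, and the identity shows exactly how that correlation inflates the error of the naive sample mean.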
Rina Barber, University of Chicago
Friday, September 23 at 10:30 a.m. EDT
Conformal prediction is a popular, modern technique for providing valid predictive inference for arbitrary machine learning models. Its validity relies on the assumptions of exchangeability of the data and symmetry of the given model-fitting algorithm as a function of the data. However, exchangeability is often violated when predictive models are deployed in practice. For example, if the data distribution drifts over time, then the data points are no longer exchangeable; moreover, in such settings, we might want to use an algorithm that treats recent observations as more relevant, which would violate the assumption that data points are treated symmetrically. This talk presents new methodology to deal with both aspects: we use weighted quantiles to introduce robustness against distribution drift, and design a new technique to allow for algorithms that do not treat data points symmetrically, with theoretical results verifying coverage guarantees that are robust to violations of exchangeability.
This work is joint with Emmanuel Candès, Aaditya Ramdas, and Ryan Tibshirani.
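The weighted-quantile idea can be sketched as follows: assign each calibration score a weight that decays with its age, reserve a share of the total weight for the test point, and calibrate at a weighted rather than ordinary empirical quantile. The geometric decay rate below is an arbitrary tuning choice made for this illustration, and the sketch covers only the weighted-quantile aspect, not the treatment of asymmetric algorithms.

```python
import numpy as np

def weighted_quantile(scores, weights, level):
    """Smallest score s whose cumulative (already-normalized) weight
    reaches `level`; returns np.inf if no finite score qualifies."""
    order = np.argsort(scores)
    s, w = scores[order], weights[order]
    cum = np.cumsum(w)
    idx = np.searchsorted(cum, level)
    return s[idx] if idx < len(s) else np.inf

rng = np.random.default_rng(2)

# Calibration scores from a drifting distribution: older points noisier.
n = 500
age = np.arange(n)[::-1]                    # age[i] = how old point i is
scores = np.abs(rng.normal(0, 1 + 0.002 * age))

# Geometrically decaying weights: recent points count more
# (rho is an illustrative tuning choice, not prescribed by the talk).
rho = 0.995
w = rho ** age

# Normalize so that one unit of weight is reserved for the test point,
# which conceptually carries a score of +infinity.
w_norm = w / (w.sum() + 1)

alpha = 0.1
q_weighted = weighted_quantile(scores, w_norm, 1 - alpha)
```

Downweighting stale scores lets the calibration threshold track the current noise level under drift, at the price of a coverage gap that the paper's theory bounds in terms of how non-exchangeable the data actually are.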