Judith Rousseau, Oxford University
Bayesian measures of uncertainty
Abstract: The renowned Bernstein-von Mises theorem for regular finite-dimensional models has numerous interesting consequences. In particular, it implies that a large class of credible regions are also asymptotically confidence regions, which in turn implies that different priors lead to the same credible regions to first order. Unfortunately, the Bernstein-von Mises theorem does not necessarily hold in high- or infinite-dimensional models, and understanding the asymptotic behaviour of credible regions is much more involved. In this talk I will describe the advances obtained over the last 8 years or so in our understanding, or lack of understanding, of credible regions in semiparametric and nonparametric models. I will discuss some interesting phenomena that have been exhibited in high-dimensional models for certain families of priors. We can show that in a significant number of cases these priors tend to over-penalize, leading to only partially robust confidence statements. I will also discuss the advances obtained in the context of nonparametric and semiparametric mixture models.
Genevera Allen, Rice University
Data Integration: Data-Driven Discovery from Diverse Data Sources.
Abstract: Data integration, or the strategic analysis of multiple sources of data simultaneously, can often lead to discoveries that may be hidden in individual analyses of a single data source. In this talk, we present several new techniques for data integration of mixed, multi-view data where multiple sets of features are measured for the same set of samples. This type of data is common in healthcare, biomedicine, and online advertising, among others. We specifically highlight how mixed graphical models and new feature selection techniques for mixed, multi-view data allow us to explore relationships amongst features from different domains. Next, we present new frameworks for integrated principal components analysis and integrated generalized convex clustering that leverage diverse data sources to discover joint patterns amongst the samples. We apply these techniques to integrative genomic studies in cancer and neurodegenerative diseases.
Aad Van Der Vaart, University of Leiden
Nonparametric Bayes: review and challenges.
Abstract: Nonparametric Bayesian methods have seen great development in the past decades. They are ordinary Bayesian methods using a prior distribution on an infinite-dimensional or high-dimensional parameter, resulting in a posterior distribution that gives the plausibility of this parameter given the data. Nonparametric Bayesian methods are now routinely applied in many areas. Besides statisticians, they attract the attention of computer scientists and mathematical analysts. Theory has been developed for classical nonparametric smoothing problems, sparse high-dimensional models and, increasingly, inverse problems, and it addresses a great variety of priors. The theory covers rates of contraction of the posterior to a true parameter, distributional approximations of the posterior distribution of smooth functionals, and most recently the coverage of Bayesian credible sets. In this talk we present some examples of success stories, and point to open questions.
Victor M. Panaretos, Ecole Polytechnique Federale de Lausanne
Amplitude and Phase Variation of Random Processes.
Abstract: The amplitude variation of a random process consists of random oscillations in its range space (the "y-axis"), typically encapsulated by its (co)variation around a mean level. In contrast, phase variation refers to fluctuations in its domain (the "x-axis"), often caused by random time changes or spatial deformations. Many types of processes manifest both types of variation, and confounding them can seriously skew statistical inferences. We will consider some of the statistical challenges related to empirically separating these two forms of variation. Our approach will largely rely on the tools and geometry of optimal (multi)transport, and will borrow from connections to notions from shape theory. The approach will also highlight an intriguing aspect of this problem: it lies at the confluence of functional data analysis, where the data are elements of infinite-dimensional vector spaces, and geometrical statistics, where the data are elements of differentiable manifolds.
Gilles Blanchard, University of Potsdam
Sketched learning using random moments. (with R. Gribonval, N. Keriven and Y. Traonmilin).
Abstract: We introduce and analyze a general framework for resource-efficient large-scale statistical learning by data sketching: a training data collection is compressed in one pass into a low-dimensional sketch (a vector of random empirical generalized moments) that should capture the information relevant to the considered estimation task. The estimation target is the minimizer of the population risk for a given loss function. An approximate minimizer of the empirical risk is computed from the sketch alone, using a constrained moment matching principle. Sketch sizes sufficient to control the statistical error of this procedure are investigated. This principle is applied to different setups: PCA, clustering, and Gaussian mixture modeling.
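Illustration (not part of the abstract): the one-pass compression step described above can be sketched as follows. This is a minimal sketch under stated assumptions: the generalized moments are taken to be random Fourier features with Gaussian frequencies, which is one possible choice and not necessarily the one used in the talk. The function names draw_frequencies and sketch_stream are hypothetical, and the moment-matching step that recovers an estimator from the sketch is not shown.

```python
import numpy as np

def draw_frequencies(dim, sketch_size, scale=1.0, seed=0):
    """Draw random frequency vectors (assumption: i.i.d. Gaussian)."""
    rng = np.random.default_rng(seed)
    return rng.normal(scale=scale, size=(sketch_size, dim))

def sketch_stream(batches, omega):
    """Compress a stream of data batches into one vector of empirical
    generalized moments, here the averages of exp(i * <omega_j, x>)."""
    total = np.zeros(omega.shape[0], dtype=complex)
    count = 0
    for batch in batches:  # each data point is visited exactly once
        batch = np.asarray(batch, dtype=float)
        total += np.exp(1j * batch @ omega.T).sum(axis=0)
        count += batch.shape[0]
    return total / count

# Synthetic data for illustration only.
rng = np.random.default_rng(1)
data = rng.normal(size=(1000, 2))
omega = draw_frequencies(dim=2, sketch_size=16)

# Sketching ten batches in sequence matches sketching the full data set.
one_pass = sketch_stream(np.array_split(data, 10), omega)
full = sketch_stream([data], omega)
```

Because the sketch is an average, it can be accumulated batch by batch, and its size depends on the number of frequencies rather than on the number of data points.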
John Lafferty, Yale University
Computational perspectives on some statistical problems.
Abstract: We present some variations on classical statistical problems that take a computational, machine learning perspective. First, we study nonparametric regression when the data are distributed across multiple machines. We place limits on the number of bits that each machine can use to transmit information to the central machine. Second, we investigate the use of machine learning algorithms that are required to obey natural shape constraints suggested by domain knowledge. We develop methods for high-dimensional shape-constrained regression and classification. Finally, we study optimization procedures to minimize the empirical risk functional for certain families of deep neural networks. We develop an approach that optimizes a sequence of objective functions using network parameters obtained during different stages of the learning process. This is evaluated with deep generative networks used as a replacement for sparsity in compressed sensing and approximation.