Triple Header (Next Week at the Statistics / Machine Learning Seminars)
Three-Toed Sloth 2013-12-20
Attention conservation notice: Only relevant if you (1) really care about statistics, and (2) will be in Pittsburgh on Monday.
Through a fortuitous concourse of calendars, we will have three outstanding talks on Monday, 14 October 2013. In chronological order:
- Michael I. Jordan, "On the Computational and Statistical Interface and 'Big Data'" (special joint statistics/ML seminar)
- Abstract: The rapid growth in the size and scope of datasets in
science and technology has created a need for novel foundational perspectives
on data analysis that blend the statistical and computational sciences. That
classical perspectives from these fields are not adequate to address emerging
problems in "Big Data" is apparent from their sharply divergent nature at an
elementary level---in computer science, the growth of the number of data points
is a source of "complexity" that must be tamed via algorithms or hardware,
whereas in statistics, the growth of the number of data points is a source of
"simplicity" in that inferences are generally stronger and asymptotic results
can be invoked. Indeed, if data are a data analyst's principal resource, why
should more data be burdensome in some sense? Shouldn't it be possible to
exploit the increasing inferential strength of data at scale to keep
computational complexity at bay? I present three research vignettes that pursue
this theme, the first involving the deployment of resampling methods such as
the bootstrap on parallel and distributed computing platforms, the second
involving large-scale matrix completion, and the third introducing a
methodology of "algorithmic weakening," whereby hierarchies of convex
relaxations are used to control statistical risk as data accrue.
- (Joint work with Venkat Chandrasekaran, Ariel Kleiner, Lester Mackey, Purna Sarkar, and Ameet Talwalkar.)
- Time and place: Noon, Rangos 2, University Center
- David Choi, "Testing for Coordination and Peer Influence in Network Data" (machine learning and the social sciences seminar)
- Abstract: Many tests have been proposed for the detection of
"viral" peer influence in observational studies involving social network
data. However, these tests typically make strong (and sometimes unstated)
modeling assumptions on participant behavior. We propose a test which holds
under less restrictive assumptions, and which controls for unobserved homophily
variables that are unaccounted for in existing methods. We discuss conditions
under which the test is valid, and give preliminary results on its
effectiveness.
- Time and place: 3 pm in Gates Hall 4405
- Genevera Allen, "Sparse and Functional Principal Components Analysis"
- Abstract: Regularized principal components analysis, especially
Sparse PCA and Functional PCA, has become widely used for dimension reduction
in high-dimensional settings. Many examples of massive data, however, may
benefit from estimating both sparse AND functional factors. These include
neuroimaging data where there are discrete brain regions of activation
(sparsity) but these regions tend to be smooth spatially (functional). Here,
we introduce an optimization framework that can encourage both sparsity and
smoothness of the row and/or column PCA factors. This framework generalizes
many of the existing approaches to Sparse PCA, Functional PCA and two-way
Sparse PCA and Functional PCA, as these are all special cases of our method.
In particular, our method permits flexible combinations of sparsity and
smoothness that lead to improvements in feature selection and signal recovery
as well as more interpretable PCA factors. We demonstrate our method on
simulated data and a neuroimaging example of EEG data. This work provides a
unified optimization framework for regularized PCA that can form the foundation
for a cohesive approach to regularization in high-dimensional multivariate
analysis.
- Time and place: 4 pm in Doherty Hall 1212
As always, the talks are free and open to the public.