Imagine a futuristic version of Google Street View that could dial up any possible place in the world, at any possible time. Effectively, such a service would be a recording of the plenoptic function—the hypothetical function described by Adelson and Bergen that captures all light rays passing through space at all times. While the plenoptic function is completely impractical to capture in its totality, every photo ever taken represents a sample of this function. I will present recent methods we've developed to reconstruct the plenoptic function from sparse space-time samples of photos—including Street View itself, as well as tourist photos of famous landmarks. The results of this work include the ability to take a single photo and synthesize a full dawn-to-dusk timelapse video, as well as compelling 4D view synthesis capabilities where a scene can simultaneously be explored in space and time.
One of the most striking characteristics of human behavior, in contrast to that of all other animals, is that we show extraordinary variability across populations. Human cultural diversity is a biological oddity. More specifically, we propose that what makes humans unique is the nature of the individual ontogenetic process that results in this unparalleled cultural diversity. Hence, our central question is: How is human ontogeny adapted to cultural diversity, and how does it contribute to it? This question is critical because cultural diversity is not only our predominant mode of adaptation to local ecologies, but is also key to the construction of our cognitive architecture. The colors we see, the tones we hear, the memories we form, and the norms we adhere to are all the consequence of an interaction between our emerging cognitive system and our lived experiences. While psychologists make careers measuring cognitive systems, we are terrible at measuring experience, as are anthropologists, sociologists, and others. The standard methods all face insurmountable limitations. In our department, we hope to apply Machine Learning, Deep Learning and Computer Vision to automatically extract developmentally important indicators of humans’ daily experience. Just as modern sequencing technologies allow us to study the human genotype at scale, applying AI methods to reliably quantify humans’ lived experience would allow us to study the human behavioral phenotype at scale, and fundamentally alter the science of human behavior and its application in education, mental health and medicine: the phenotyping revolution.
Organizers: Timo Bolkart
I will survey our work on tracking and measurement, waypoints on the path to activity recognition and understanding, in sports video, highlighting some of our recent work on rectification and player tracking, not just in hockey but more recently in basketball, where we have addressed player identification both in a fully supervised and semi-supervised manner.
Methods for visual recognition have made dramatic strides in recent years on various online benchmarks, but performance in the real world still often falters. Classic gradient-histogram models make overly simplistic assumptions regarding image appearance statistics, both locally and globally. Recent progress suggests that new learning-based representations can improve recognition by devices that are embedded in a physical world.
I'll review new methods for domain adaptation which capture the visual domain shift between environments, and improve recognition of objects in specific places when trained from generic online sources. I'll discuss methods for cross-modal semi-supervised learning, which can leverage additional unlabeled modalities in a test environment.
Finally, as time permits, I'll present recent results on learning hierarchical local image representations based on recursive probabilistic topic models, on learning strong object color models from sets of uncalibrated views using a new multi-view color constancy paradigm, and/or on monocular estimation of grasp affordances.
In the first part of the talk, I will describe methods that learn a single family of detectors for object classes that exhibit large within-class variation. One common solution is to use a divide-and-conquer strategy, where the space of possible within-class variations is partitioned, and different detectors are trained for different partitions.
However, these discrete partitions tend to be arbitrary in continuous spaces, and the classifiers have limited power when there are too few training samples in each subclass. To address this shortcoming, explicit feature sharing has been proposed, but it also makes training more expensive. We show that foreground-background classification (detection) and within-class classification of the foreground class (pose estimation) can be solved jointly using a product of two kernel functions. One kernel measures similarity for foreground-background classification. The other kernel accounts for latent factors that control within-class variation. With this multiplicative formulation, feature sharing among foreground training samples is implicit: the solution for the optimal sharing is a byproduct of SVM learning.
The resulting detector family is tuned to specific variations in the foreground. The effectiveness of this framework is demonstrated in experiments that involve detection, tracking, and pose estimation of human hands, faces, and vehicles in video.
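The multiplicative form described above can be illustrated with a minimal sketch. It assumes Gaussian RBF kernels for both factors, toy random data, and hypothetical names (`rbf`, `multiplicative_kernel`, `detector_weights`); the abstract does not state which kernels or features are actually used. The sketch builds the product kernel over (appearance, pose) pairs and shows how fixing the pose parameter reweights the training samples, which is one way to read "a detector family tuned to specific variations".

```python
import numpy as np

def rbf(a, b, gamma):
    """Gaussian RBF kernel matrix between the rows of a and the rows of b."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def multiplicative_kernel(x1, t1, x2, t2, gamma_x=0.5, gamma_t=2.0):
    """K((x, t), (x', t')) = k_x(x, x') * k_t(t, t').

    k_x compares appearance features; k_t compares the within-class
    parameter (for example a pose angle). Both choices are assumptions.
    """
    return rbf(x1, x2, gamma_x) * rbf(t1, t2, gamma_t)

def detector_weights(alpha, t_train, theta, gamma_t=2.0):
    """Per-sample weights of the detector obtained by fixing the pose theta.

    alpha stands in for signed SVM dual coefficients (hypothetical here).
    Samples whose pose is far from theta are down-weighted smoothly.
    """
    return alpha * rbf(t_train, np.atleast_2d(theta), gamma_t)[:, 0]

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))    # toy appearance features
T = rng.uniform(size=(6, 1))   # toy within-class parameter per sample
K = multiplicative_kernel(X, T, X, T)

alpha = rng.normal(size=6)     # placeholder coefficients, not trained
w_theta = detector_weights(alpha, T, 0.25)
```

Because the elementwise product of two positive semidefinite Gram matrices is itself positive semidefinite, `K` can be passed to any kernel method that accepts a precomputed Gram matrix.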
Beginning with a seminal paper of Diaconis (1988), the aim of so-called "probabilistic numerics" is to compute probabilistic solutions to deterministic problems arising in numerical analysis by casting them as statistical inference problems. For example, numerical integration of a deterministic function can be seen as the integration of an unknown/random function, with evaluations of the integrand at the integration nodes providing partial information about the integrand. Advantages offered by this viewpoint include: access to the Bayesian representation of prior and posterior uncertainties; better propagation of uncertainty through hierarchical systems than simple worst-case error bounds; and appropriate accounting for numerical truncation and round-off error in inverse problems, so that the replicability of deterministic simulations is not mistaken for accuracy, a confusion that would yield an inappropriately concentrated Bayesian posterior. This talk will describe recent work on probabilistic numerical solvers for ordinary and partial differential equations, including their theoretical construction, convergence rates, and applications to forward and inverse problems. Joint work with Andrew Stuart (Warwick).
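The integration example in the abstract can be made concrete with a small sketch of Bayesian quadrature. This is not the ODE/PDE solver construction the talk covers; it only illustrates the simpler quadrature case, under stated assumptions: a zero-mean Gaussian process prior with a unit-amplitude RBF kernel, a fixed lengthscale, the uniform measure on [0, 1], and hypothetical function names. The result is a posterior mean and variance for the integral rather than a single number.

```python
import math
import numpy as np

def rbf_gram(x, ell):
    """Gram matrix of k(x, x') = exp(-(x - x')^2 / (2 ell^2))."""
    d = x[:, None] - x[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def kernel_mean(x, ell):
    """z_i = integral over [0, 1] of k(x, x_i) dx, in closed form via erf."""
    c = ell * math.sqrt(math.pi / 2.0)
    s = math.sqrt(2.0) * ell
    return np.array([c * (math.erf((1.0 - xi) / s) + math.erf(xi / s))
                     for xi in x])

def prior_integral_variance(ell):
    """Double integral of k over [0, 1]^2: prior variance of the integral."""
    c = ell * math.sqrt(math.pi / 2.0)
    return 2.0 * (c * math.erf(1.0 / (math.sqrt(2.0) * ell))
                  - ell**2 * (1.0 - math.exp(-0.5 / ell**2)))

def bayesian_quadrature(f, nodes, ell=0.3, jitter=1e-10):
    """Posterior mean and variance of the integral of f over [0, 1]."""
    y = f(nodes)
    K = rbf_gram(nodes, ell) + jitter * np.eye(len(nodes))  # jitter for stability
    z = kernel_mean(nodes, ell)
    w = np.linalg.solve(K, z)          # quadrature weights K^{-1} z
    mean = float(w @ y)
    # Clamp at zero to guard against small negative values from round-off.
    var = max(prior_integral_variance(ell) - float(w @ z), 0.0)
    return mean, var

nodes = np.linspace(0.0, 1.0, 8)
mean, var = bayesian_quadrature(lambda x: np.sin(3.0 * x), nodes)
exact = (1.0 - math.cos(3.0)) / 3.0    # closed-form value for comparison
```

Comparing `mean` with `exact`, and reading `var` as the solver's own statement of uncertainty, shows the viewpoint in miniature: the function evaluations act as data, and the remaining numerical error is represented as a posterior distribution rather than a worst-case bound.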
Organizers: Philipp Hennig