Applying data-driven approaches to non-rigid 3D reconstruction has been difficult, which we believe can be attributed to the lack of a large-scale training corpus. One recent approach proposes self-supervision based on non-rigid reconstruction. Unfortunately, this method fails for important cases such as highly non-rigid deformations. We first address this problem of lack of data by introducing a novel semi-supervised strategy to obtain dense interframe correspondences from a sparse set of annotations. This way, we obtain a large dataset of 400 scenes, over 390,000 RGB-D frames, and 2,537 densely aligned frame pairs; in addition, we provide a test set along with several metrics for evaluation. Based on this corpus, we introduce a data-driven non-rigid feature matching approach, which we integrate into an optimization-based reconstruction pipeline. Here, we propose a new neural network that operates on RGB-D frames, while maintaining robustness under large non-rigid deformations and producing accurate predictions. Our approach significantly outperforms both existing non-rigid reconstruction methods that do not use learned data terms, as well as learning-based approaches that only use self-supervision.
Organizers: Vassilis Choutas
How can we tell that a video is playing backwards? People's motions look wrong when the video is played backwards--can we develop an algorithm to distinguish forward from backward video? Similarly, can we tell if a video is sped-up? We have developed algorithms to distinguish forwards from backwards video, and fast from slow. Training algorithms for these tasks provides a self-supervised task that facilitates human activity recognition. We'll show these results, and applications of these unsupervised video learning tasks. We also present a method to retime people in videos --- manipulating and editing the time over which the motions of individuals occurs. Our model not only disentangles the motions of each person in the video, but it also correlates each person with the scene changes they generate, and thus re-times the corresponding shadows, reflections, and motion of loose clothing appropriately.
Organizers: Yinghao Huang
In recent years, commodity 3D sensors have become widely available, spawning significant interest in both offline and real-time 3D reconstruction. While state-of-the-art reconstruction results from commodity RGB-D sensors are visually appealing, they are far from usable in practical computer graphics applications since they do not match the high quality of artist-modeled 3D graphics content. One of the biggest challenges in this context is that obtained 3D scans suffer from occlusions, thus resulting in incomplete 3D models. In this talk, I will present a data-driven approach towards generating high quality 3D models from commodity scan data, and the use of these geometrically complete 3D models towards semantic and texture understanding of real-world environments.
Organizers: Yinghao Huang
Much existing work in reinforcement learning involves environments that are either intentionally neutral, lacking a role for cooperation and competition, or intentionally simple, when agents need imagine nothing more than that they are playing versions of themselves. Richer game theoretic notions become important as these constraints are relaxed. For humans, this encompasses issues that concern utility, such as envy and guilt, and that concern inference, such as recursive modeling of other players, I will discuss studies treating a paradigmatic game of trust as an interactive partially-observable Markov decision process, and will illustrate the solution concepts with evidence from interactions between various groups of subjects, including those diagnosed with borderline and anti-social personality disorders.
In this talk, I will present my understanding on 3D face reconstruction, modelling and applications from a deep learning perspective. In the first part of my talk, I will discuss the relationship between representations (point clouds, meshes, etc) and network layers (CNN, GCN, etc) on face reconstruction task, then present my ECCV work PRN which proposed a new representation to help achieve state-of-the-art performance on face reconstruction and dense alignment tasks. I will also introduce my open source project face3d that provides examples for generating different 3D face representations. In the second part of the talk, I will talk some publications in integrating 3D techniques into deep networks, then introduce my upcoming work which implements this. In the third part, I will present how related tasks could promote each other in deep learning, including face recognition for face reconstruction task and face reconstruction for face anti-spoofing task. Finally, with such understanding of these three parts, I will present my plans on 3D face modelling and applications.
Organizers: Timo Bolkart
The past few years with the advent of Deep Convolutional Neural Networks (DCNNs), as well as the availability of visual data it was shown that it is possible to produce excellent results in very challenging tasks, such as visual object recognition, detection, tracking etc. Nevertheless, in certain tasks such as fine-grain object recognition (e.g., face recognition) it is very difficult to collect the amount of data that are needed. In this talk, I will show how, using DCNNs, we can generate highly realistic faces and heads and use them for training algorithms such as face and facial expression recognition. Next, I will reverse the problem and demonstrate how by having trained a very powerful face recognition network it can be used to perform very accurate 3D shape and texture reconstruction of faces from a single image. Finally, I will demonstrate how to create very lightweight networks for representing 3D face texture and shape structure by capitalising upon intrinsic mesh convolutions.
Organizers: Dimitrios Tzionas
Understanding objects and their behavior from images and videos is a difficult inverse problem. It requires learning a metric in image space that reflects object relations in real world. This metric learning problem calls for large volumes of training data. While images and videos are easily available, labels are not, thus motivating self-supervised metric and representation learning. Furthermore, I will present a widely applicable strategy based on deep reinforcement learning to improve the surrogate tasks underlying self-supervision. Thereafter, the talk will cover the learning of disentangled representations that explicitly separate different object characteristics. Our approach is based on an analysis-by-synthesis paradigm and can generate novel object instances with flexible changes to individual characteristics such as their appearance and pose. It nicely addresses diverse applications in human and animal behavior analysis, a topic we have intensive collaboration on with neuroscientists. Time permitting, I will discuss the disentangling of representations from a wider perspective including novel strategies to image stylization and new strategies for regularization of the latent space of generator networks.
Organizers: Joel Janai
Human pose stability analysis is the key to understanding locomotion and control of body equilibrium, with numerous applications in the fields of Kinesiology, Medicine and Robotics. We propose and validate a novel approach to learn dynamics from kinematics of a human body to aid stability analysis. More specifically, we propose an end-to-end deep learning architecture to regress foot pressure from a human pose derived from video. We have collected and utilized a set of long (5min +) choreographed Taiji (Tai Chi) sequences of multiple subjects with synchronized motion capture, foot pressure and video data. The derived human pose data and corresponding foot pressure maps are used jointly in training a convolutional neural network with residual architecture, named “PressNET”. Cross validation results show promising performance of PressNet, significantly outperforming the baseline method under reasonable sensor noise ranges.
Organizers: Nadine Rueegg
Animals and humans are excellent in conceiving of solutions to physical and geometric problems, for instance in using tools, coming up with creative constructions, or eventually inventing novel mechanisms and machines. Cognitive scientists coined the term intuitive physics in this context. It is a shame we do not yet have good computational models of such capabilities. A main stream of current robotics research focusses on training robots for narrow manipulation skills - often using massive data from physical simulators. Complementary to that we should also try to understand how basic principles underlying physics can directly be used to enable general purpose physical reasoning in robots, rather than sampling data from physical simulations. In this talk I will discuss an approach called Logic-Geometric Programming, which builds a bridge between control theory, AI planning and robot manipulation. It demonstrates strong performance on sequential manipulation problems, but also raises a number of highly interesting fundamental problems, including its probabilistic formulation, reactive execution and learning.
The state-of-the-art robotic systems adopting magnetically actuated ferromagnetic bodies or even whole miniature robots have recently become a fast advancing technological field, especially at the nano and microscale. The mesoscale and above all multiscale magnetically guided robotic systems appear to be the advanced field of study, where it is difficult to reflect different forces, precision and also energy demands. The major goal of our talk is to discuss the challenges in the field of magnetically guided mesoscale and multiscale actuation, followed by the results of our research in the field of magnetic positioning systems and the magnetic soft-robotic grippers.
Organizers: Metin Sitti
Recognition of pain in horses and other animals is important, because pain is a manifestation of disease and decreases animal welfare. Pain diagnostics for humans typically includes self-evaluation and location of the pain with the help of standardized forms, and labeling of the pain by an clinical expert using pain scales. However, animals cannot verbalize their pain as humans can, and the use of standardized pain scales is challenged by the fact that animals as horses and cattle, being prey animals, display subtle and less obvious pain behavior - it is simply beneficial for a prey animal to appear healthy, in order lower the interest from predators. We work together with veterinarians to develop methods for automatic video-based recognition of pain in horses. These methods are typically trained with video examples of behavioral traits labeled with pain level and pain characteristics. This automated, user independent system for recognition of pain behavior in horses will be the first of its kind in the world. A successful system might change the concept for how we monitor and care for our animals.
A dominant trend in manufacturing is the move toward small production volumes and high product variability. It is thus anticipated that future manufacturing automation systems will be characterized by a high degree of autonomy, and must be able to learn new behaviors without explicit programming. Robot Learning, and more generic, Autonomous Manufacturing, is an exciting research field at the intersection of Machine Learning and Automation. The combination of "traditional" control techniques with data-driven algorithms holds the promise of allowing robots to learn new behaviors through experience. This talk introduces selected Siemens research projects in the area of Autonomous Manufacturing.
Active motion of biological and artificial microswimmers is relevant in the real world, in microfluidics, and biological applications but also poses fundamental questions in non-equi- librium statistical physics. Mechanisms of single microswimmers either designed by nature or in the lab need to be understood and a detailed modeling of microorganisms helps to explore their complex cell design and their behavior. It also motivates biomimetic approaches. The emergent collective motion of microswimmers generates appealing dynamic patterns as a consequence of the non-equilibrium.