Institute Talks

Learning Non-rigid Optimization

  • 10 July 2020 • 15:00—16:00
  • Matthias Nießner
  • Remote talk on Zoom

Applying data-driven approaches to non-rigid 3D reconstruction has been difficult, which we believe can be attributed to the lack of a large-scale training corpus. One recent approach proposes self-supervision based on non-rigid reconstruction. Unfortunately, this method fails for important cases such as highly non-rigid deformations. We first address this lack of data by introducing a novel semi-supervised strategy to obtain dense interframe correspondences from a sparse set of annotations. This way, we obtain a large dataset of 400 scenes, over 390,000 RGB-D frames, and 2,537 densely aligned frame pairs; in addition, we provide a test set along with several metrics for evaluation. Based on this corpus, we introduce a data-driven non-rigid feature matching approach, which we integrate into an optimization-based reconstruction pipeline. Here, we propose a new neural network that operates on RGB-D frames, while maintaining robustness under large non-rigid deformations and producing accurate predictions. Our approach significantly outperforms both existing non-rigid reconstruction methods that do not use learned data terms and learning-based approaches that only use self-supervision.

Organizers: Vassilis Choutas

Learning from videos played forwards, backwards, fast, and slow

  • 13 July 2020 • 16:00—17:30
  • William T. Freeman

How can we tell that a video is playing backwards? People's motions look wrong when the video is played backwards. Can we develop an algorithm to distinguish forward from backward video? Similarly, can we tell if a video is sped up? We have developed algorithms to distinguish forwards from backwards video, and fast from slow. Training algorithms for these tasks provides a self-supervised task that facilitates human activity recognition. We'll show these results, and applications of these unsupervised video learning tasks. We also present a method to retime people in videos: manipulating and editing the time over which the motions of individuals occur. Our model not only disentangles the motions of each person in the video, but it also correlates each person with the scene changes they generate, and thus re-times the corresponding shadows, reflections, and motion of loose clothing appropriately.
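
The self-supervised setup described in this abstract needs no manual labels: the playback direction or speed of a clip is itself the label. The sketch below shows one way such training pairs could be generated. The function names and the representation of a clip as a plain list of frames are assumptions made for illustration, not the speaker's pipeline.

```python
import random


def direction_samples(clips, seed=0):
    """Turn unlabeled clips into (clip, label) pairs: 1 = forward, 0 = reversed."""
    rng = random.Random(seed)
    samples = []
    for clip in clips:
        if rng.random() < 0.5:
            samples.append((list(clip), 1))            # keep original frame order
        else:
            samples.append((list(reversed(clip)), 0))  # play the clip backwards
    return samples


def speed_samples(clip, length, speedup=2):
    """Return a normal-speed and a sped-up window of equal length from one clip."""
    normal = list(clip[:length])          # consecutive frames, label 0
    fast = list(clip[::speedup][:length])  # every `speedup`-th frame, label 1
    return [(normal, 0), (fast, 1)]
```

A classifier trained on such pairs has to pick up motion cues to succeed, which is why these tasks can serve as pretraining for activity recognition.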

Organizers: Yinghao Huang

Towards Commodity 3D Scanning for Content Creation

  • 16 July 2020 • 16:00—17:30
  • Angela Dai

In recent years, commodity 3D sensors have become widely available, spawning significant interest in both offline and real-time 3D reconstruction. While state-of-the-art reconstruction results from commodity RGB-D sensors are visually appealing, they are far from usable in practical computer graphics applications since they do not match the high quality of artist-modeled 3D graphics content. One of the biggest challenges in this context is that obtained 3D scans suffer from occlusions, thus resulting in incomplete 3D models. In this talk, I will present a data-driven approach towards generating high quality 3D models from commodity scan data, and the use of these geometrically complete 3D models towards semantic and texture understanding of real-world environments.

Organizers: Yinghao Huang

A Dynamical Systems Perspective on Optimization with Momentum

  • 19 September 2019 • 14:00—15:00
  • Dr. Michael Muehlebach
  • MPI-IS Stuttgart, Heisenbergstr. 3, seminar room 2P4

My talk will be divided into two parts. In the first part, I will analyze Nesterov's accelerated gradient method from a dynamical systems point of view. More precisely, I will derive the accelerated gradient method by discretizing an ordinary differential equation with a semi-implicit Euler integration scheme. I will analyze both the ordinary differential equation and the discretization for obtaining insights into the phenomenon of acceleration. In particular, geometric properties of the dynamics, such as asymptotic stability, time-reversibility, and phase-space volume contraction are shown to be preserved through the discretization. In the second part, I will show that these geometric properties are enough for characterizing the convergence rate. The results therefore provide criteria that are easily verifiable for the accelerated convergence of any momentum-based optimization algorithm. The results also yield guidance for the design of new optimization algorithms. The talk will focus on unconstrained optimization problems with smooth and strongly-convex objective functions, even though the analysis potentially generalizes to non-convex or non-Euclidean settings, or when the decision variables are constrained to a smooth manifold.
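
As a rough illustration of the first part of this abstract, the constant-momentum form of Nesterov's method for smooth, strongly convex objectives can be written as a velocity update followed by a position update that uses the new velocity, which is the pattern of a semi-implicit Euler step with unit step size. The sketch below assumes the standard step size 1/L and momentum (sqrt(kappa) - 1)/(sqrt(kappa) + 1); the function names and the test problem are illustrative and not taken from the talk.

```python
import math


def nesterov_semi_implicit(grad, x0, L, mu, steps):
    """Nesterov's method written as a semi-implicit Euler step (unit step size)."""
    kappa = L / mu
    beta = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    s = 1.0 / L
    x = list(x0)
    v = [0.0] * len(x)
    for _ in range(steps):
        # Gradient is evaluated at the look-ahead point x + beta * v.
        look = [xi + beta * vi for xi, vi in zip(x, v)]
        g = grad(look)
        # Velocity is updated first (damping plus gradient force) ...
        v = [beta * vi - s * gi for vi, gi in zip(v, g)]
        # ... and the position then uses the NEW velocity.
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


def gradient_descent(grad, x0, L, steps):
    """Plain gradient descent with step size 1/L, for comparison."""
    x = list(x0)
    for _ in range(steps):
        x = [xi - gi / L for xi, gi in zip(x, grad(x))]
    return x


def quadratic_grad(diag):
    """Gradient of f(x) = 0.5 * sum(d_i * x_i^2); minimizer is the origin."""
    return lambda x: [d * xi for d, xi in zip(diag, x)]


def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))
```

On an ill-conditioned quadratic such as quadratic_grad([1.0, 100.0]) with L = 100 and mu = 1, the accelerated iteration approaches the minimizer in far fewer steps than plain gradient descent.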

Organizers: Sebastian Trimpe

  • Ernest (Ted) Gomez, MD, MTR
  • MPI-IS Stuttgart, Heisenbergstr. 3, Room 2P4

Surgery is a demanding activity that places a human life in the hands of others. However, innovations in minimally invasive surgery have physically separated surgeons' hands from their patients, creating the need for surgeons and their tools to develop both natural and artificial haptic intelligence. This lecture examines the essential role of haptic intelligence in skill development for laparoscopic and robotic surgery.

Organizers: Katherine J. Kuchenbecker

Feedback-Control for Self-Adaptive Predictable Computing

  • 10 September 2019 • 14:00—15:00
  • Prof. Martina Maggio
  • MPI-IS Stuttgart, Heisenbergstr. 3, seminar room 2P4

Cloud computing gives the illusion of infinite computational capacity and allows for on-demand resource provisioning. As a result, over the last few years, the cloud computing model has experienced widespread industrial adoption, and companies like Netflix have offloaded their entire infrastructure to the cloud. However, with even the largest datacenter being of a finite size, cloud infrastructures have experienced overload due to overbooking or transient failures. In essence, this is an excellent opportunity for the design of control solutions that tackle the problem of mitigating overload peaks using feedback from the infrastructure. These solutions can then exploit control-theoretical principles and take advantage of the knowledge and the analysis capabilities of control tools to provide formal guarantees on the predictability of the infrastructure behavior. This talk introduces recent research advances on feedback control in the cloud computing domain, together with my research agenda for enhancing predictability and formal guarantees for cloud computing.
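
To make the idea of mitigating overload with feedback concrete, here is a toy sketch of an integral controller that adjusts the fraction of admitted requests until measured utilization settles at a target. The plant model (utilization = admitted fraction x arrival rate / capacity), the gain, and all names are hypothetical choices for illustration, not the methods presented in the talk.

```python
def simulate_admission_control(arrival_rate, capacity, target_util, gain, steps):
    """Integral feedback on the admitted fraction of requests.

    Returns the final admitted fraction and the utilization history.
    """
    admit = 1.0  # start by admitting everything
    history = []
    for _ in range(steps):
        # Feedback signal: utilization measured from the (toy) infrastructure.
        util = admit * arrival_rate / capacity
        history.append(util)
        # Integral action: move the admitted fraction against the error.
        admit += gain * (target_util - util)
        # The admitted fraction is a probability, so clamp it to [0, 1].
        admit = min(1.0, max(0.0, admit))
    return admit, history
```

For this toy model the loop is stable when 0 < gain * arrival_rate / capacity < 2, which is the kind of property a control-theoretic analysis can state as a formal guarantee.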

Organizers: Sebastian Trimpe

A New Framework to Understanding Biological Vision

IS Colloquium
  • 03 September 2019 • 11:00—12:00
  • Zhaoping Li
  • MPI-IS Stuttgart, Heisenbergstr. 3, Room 2P4

Visual attention selects a tiny amount of information that can be deeply processed by the brain, and gaze shifts bring the selected visual object to the fovea, the center of the visual field, for better visual decoding or recognition of the selected objects. Therefore, central and peripheral vision should differ qualitatively in visual decoding, rather than just quantitatively in visual acuity.

Organizers: Katherine J. Kuchenbecker

  • Björn Browatzki
  • PS-Aquarium

Current solutions to discriminative and generative tasks in computer vision exist separately and often lack interpretability and explainability. Using faces as our application domain, here we present an architecture that is based around two core ideas that address these issues: first, our framework learns an unsupervised, low-dimensional embedding of faces using an adversarial autoencoder that is able to synthesize high-quality face images. Second, a supervised disentanglement splits the low-dimensional embedding vector into four sub-vectors, each of which contains separated information about one of four major face attributes (pose, identity, expression, and style) that can be used both for discriminative tasks and for manipulating all four attributes in an explicit manner. The resulting architecture achieves state-of-the-art image quality, good discrimination and face retrieval results on each of the four attributes, and supports various face editing tasks using a face representation of only 99 dimensions. Finally, we apply the architecture's robust image synthesis capabilities to visually debug label-quality issues in an existing face dataset.

Organizers: Timo Bolkart

  • Gunhyuk Park
  • MPI-IS Stuttgart, Heisenbergstr. 3, Room 2P4

Many hapticians have designed and implemented haptic effects for various user interactions. For several decades, hapticians have shown that haptic feedback can improve multiple facets of user experience, including task performance, analyzing and utilizing user perception, and substituting other sensory modalities. Among them, this talk introduces two representative rendering methods to provide vibrotactile effects to users: 2D phantom sensation, which makes a user perceive an illusory tactile sensation by using multiple real vibrotactile actuators, and vibrotactile dimensional reduction, which reduces 3D acceleration data from real interactions to 1D vibrations while maximizing realism and similarity.

Organizers: Katherine J. Kuchenbecker

  • Yoshihiro Kanamori
  • PS-Aquarium

Relighting of human images has various applications in image synthesis. For relighting, we must infer albedo, shape, and illumination from a human portrait. Previous techniques rely on human faces for this inference, based on spherical harmonics (SH) lighting. However, because they often ignore light occlusion, inferred shapes are biased and relit images are unnaturally bright, particularly at hollowed regions such as armpits, crotches, or garment wrinkles. This paper introduces the first attempt to infer light occlusion in the SH formulation directly. Based on supervised learning using convolutional neural networks (CNNs), we infer not only an albedo map and illumination but also a light transport map that encodes occlusion as nine SH coefficients per pixel. The main difficulty in this inference is the lack of training datasets compared to unlimited variations of human portraits. Surprisingly, geometric information including occlusion can be inferred plausibly even with a small dataset of synthesized human figures, by carefully preparing the dataset so that the CNNs can exploit the data coherency. Our method accomplishes more realistic relighting than the occlusion-ignored formulation.
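
A small sketch may help show how the three inferred quantities could combine at relighting time. It assumes the usual SH shading model, in which the shading at a pixel is the dot product of that pixel's nine transport coefficients with the nine lighting coefficients of each color channel, multiplied by the albedo. The function name and data layout are assumptions for illustration, not the authors' code.

```python
def relight_pixel(albedo, transport, light):
    """Relight one pixel under SH lighting.

    albedo:    3 floats (RGB reflectance)
    transport: 9 floats (per-pixel SH light transport, encodes occlusion)
    light:     9 rows of 3 floats (SH lighting coefficients per RGB channel)
    """
    if len(transport) != 9 or len(light) != 9:
        raise ValueError("expected nine SH coefficients")
    # Shading per channel: dot product of transport and lighting coefficients.
    shading = [sum(t * row[c] for t, row in zip(transport, light)) for c in range(3)]
    # Final color: albedo modulated by shading.
    return [a * s for a, s in zip(albedo, shading)]
```

Swapping in a different set of lighting coefficients while keeping albedo and transport fixed relights the pixel; occluded regions stay darker because their transport coefficients are smaller.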

Organizers: Senya Polikovsky, Jinlong Yang

Self-supervised 3D hand pose estimation

  • 23 July 2019 • 11:00—12:00
  • Chengde Wan
  • PS-Aquarium

Deep learning has significantly advanced the state of the art in 3D hand pose estimation, whose accuracy can be improved with increased amounts of labelled data. However, acquiring 3D hand pose labels can be extremely difficult. In this talk, I will present our two recent works on leveraging self-supervised learning techniques for hand pose estimation from depth maps. In both works, we incorporate a differentiable renderer into the network and formulate the training loss as a model fitting error to update the network parameters. In the first part of the talk, I will present our earlier work, which approximates the hand surface with a set of spheres. We then model the pose prior as a variational lower bound with a variational auto-encoder (VAE). In the second part, I will present our latest work on regressing the vertex coordinates of a hand mesh model with a 2D fully convolutional network (FCN) in a single forward pass. In the first stage, the network estimates a dense correspondence field for every pixel on the image grid to the mesh grid. In the second stage, we design a differentiable operator to map features learned from the previous stage and regress a 3D coordinate map on the mesh grid. Finally, we sample from the mesh grid to recover the mesh vertices, and fit an articulated template mesh to them in closed form. Without any human annotation, both works can perform competitively with strongly supervised methods. The latter work will also be extended to be compatible with the MANO model.

Organizers: Dimitrios Tzionas

An introduction to bladder cancer & challenges for translational research

  • 22 July 2019 • 10:30—11:30
  • Richard T Bryan
  • 2P4

  • Christoph Keplinger
  • MPI-IS Stuttgart, Room 2R04 / MPI-IS Tübingen, Room N0.002 (Broadcast)

Robots today rely on rigid components and electric motors based on metal and magnets, making them heavy, unsafe near humans, expensive and ill-suited for unpredictable environments. Nature, in contrast, makes extensive use of soft materials and has produced organisms that drastically outperform robots in terms of agility, dexterity, and adaptability. The Keplinger Lab aims to fundamentally challenge current limitations of robotic hardware, using an interdisciplinary approach that synergizes concepts from soft matter physics and chemistry with advanced engineering technologies to introduce robotic materials – material systems that integrate actuation, sensing and even computation – for a new generation of intelligent systems. This talk gives an overview of fundamental research questions that inspire current and future research directions. One major theme of research is the development of new classes of actuators – a key component of all robotic systems – that replicate the sweeping success of biological muscle, a masterpiece of evolution featuring astonishing all-around actuation performance, the ability to self-heal after damage, and seamless integration with sensing. A second theme of research is functional polymers with unusual combinations of properties, such as electrical conductivity paired with stretchability, transparency, biocompatibility and the ability to self-heal from mechanical and electrical damage. A third theme of research is the discovery of new energy capture principles that can provide power to intelligent autonomous systems, as well as – on larger scales – enable sustainable solutions for the use of waste heat from industrial processes or the use of untapped sources of renewable energy, such as ocean waves.