Wang, Yujiang; Luo, Bingnan; Shen, Jie; Pantic, Maja
Published in
International Journal of Computer Vision
Inspired by the recent development of deep network-based methods in semantic image segmentation, we introduce an end-to-end trainable model for face mask extraction in video sequences. Compared to landmark-based sparse face shape representations, our method can produce the segmentation masks of individual facial components, which can better reflect ...
Ding, Keyan; Ma, Kede; Wang, Shiqi; Simoncelli, Eero P.
The performance of objective image quality assessment (IQA) models has been evaluated primarily by comparing model predictions to human quality judgments. Perceptual datasets gathered for this purpose have provided useful benchmarks for improving IQA methods, but their heavy use creates a risk of overfitting. Here, we perform a large-scale comparis...
Hahne, Christopher; Aggoun, Amar; Velisavljevic, Vladan; Fiebig, Susanne; Pesch, Matthias
Published in
International Journal of Computer Vision
In this paper, we demonstrate light field triangulation to determine depth distances and baselines in a plenoptic camera. Advances in micro lenses and image sensors have enabled plenoptic cameras to capture a scene from different viewpoints with sufficient spatial resolution. While object distances can be inferred from disparities in a stereo viewp...
Zaharescu, Andrei; Horaud, Radu
Published in
International Journal of Computer Vision
In this paper we address the problem of building a class of robust factorization algorithms that solve for the shape and motion parameters with both affine (weak perspective) and perspective camera models. We introduce a Gaussian/uniform mixture model and its associated EM algorithm. This allows us to address parameter estimation within a data clus...
Wu, Jiajun; Xue, Tianfan; Lim, Joseph J.; Tian, Yuandong; Tenenbaum, Joshua B.; Torralba, Antonio; Freeman, William T.
Published in
International Journal of Computer Vision
Understanding 3D object structure from a single image is an important but challenging task in computer vision, mostly due to the lack of 3D object annotations for real images. Previous research tackled this problem by either searching for a 3D shape that best explains 2D annotations, or training purely on synthetic data with ground truth 3D informat...
Luiten, Jonathon; Osep, Aljosa; Dendorfer, Patrick; Torr, Philip; Geiger, Andreas; Leal-Taixe, Laura; Leibe, Bastian
Multi-Object Tracking (MOT) has been notoriously difficult to evaluate. Previous metrics overemphasize the importance of either detection or association. To address this, we present a novel MOT evaluation metric, HOTA (Higher Order Tracking Accuracy), which explicitly balances the effect of performing accurate detection, association and localizatio...
Sun, Rémy; Lampert, Christoph H.
Published in
International Journal of Computer Vision
We study the problem of automatically detecting if a given multi-class classifier operates outside of its specifications (out-of-specs), i.e. on input data from a different distribution than what it was trained for. This is an important problem to solve on the road towards creating reliable computer vision systems for real-world applications, becau...
Chadebecq, François; Vasconcelos, Francisco; Lacher, René; Maneas, Efthymios; Desjardins, Adrien; Ourselin, Sébastien; Vercauteren, Tom; Stoyanov, Danail
Published in
International Journal of Computer Vision
Recovering 3D geometry from cameras in underwater applications involves the Refractive Structure-from-Motion problem where the non-linear distortion of light induced by a change of medium density invalidates the single viewpoint assumption. The pinhole-plus-distortion camera projection model suffers from a systematic geometric bias since refractive...
Gehrig, Daniel; Rebecq, Henri; Gallego, Guillermo; Scaramuzza, Davide
Published in
International Journal of Computer Vision
We present EKLT, a feature tracking method that leverages the complementarity of event cameras and standard cameras to track visual features with high temporal resolution. Event cameras are novel sensors that output pixel-level brightness changes, called “events”. They offer significant advantages over standard cameras, namely a very high dynamic r...
Wu, A.; Piergiovanni, A. J.; Ryoo, M. S.
Published in
International Journal of Computer Vision
We present a visual imitation learning framework that enables learning of robot action policies solely based on expert samples without any robot trials. Robot exploration and on-policy trials in a real-world environment can often be expensive or dangerous. We present a new approach to address this problem by learning a future scene prediction model ...