Real-Time 3D Head Pose Tracking Through 2.5D Constrained Local Models with Local Neural Fields

Authors
  • Ackland, Stephen (1)
  • Chiclana, Francisco (1)
  • Istance, Howell (2)
  • Coupland, Simon (1)
  • (1) De Montfort University, Leicester, United Kingdom
  • (2) University of Tampere, Tampere, Finland
Type
Published Article
Journal
International Journal of Computer Vision
Publisher
Springer-Verlag
Publication Date
Mar 04, 2019
Volume
127
Issue
6-7
Pages
579–598
Identifiers
DOI: 10.1007/s11263-019-01152-w
Source
Springer Nature

Abstract

Tracking the head in a video stream is a recurring theme in the computer vision literature and supplies the research community with many challenging and interesting problems. Head pose estimation from monocular cameras is often treated as an extended application, carried out after the face tracking task has already been performed. This typically involves passing the resulting 2D data through a simpler algorithm that fits the data to a static 3D model to determine the 3D pose estimate. This work describes the 2.5D constrained local model, which combines a deformable 3D shape point model with 2D texture information to estimate the pose parameters directly, avoiding the need for additional optimization strategies. It achieves this through an analytical derivation of a Jacobian matrix that describes how changes in the model parameters change the shape within the image under a full-perspective camera model. In addition, the model has very low computational complexity and can run in real time on modern mobile devices such as tablets and laptops. The point distribution model of the face is built in a way that minimizes the effect of changes in facial expression on the estimated head pose, making the solution more robust. Finally, the texture information is trained via local neural fields, a deep learning approach that uses small discriminative patches to exploit spatial relationships between pixels and provide strong peaks at the optimal locations.
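
As a rough illustration of the Jacobian idea mentioned in the abstract, the sketch below computes the analytic derivative of a full-perspective (pinhole) projection with respect to a 3D point in camera coordinates. It is not the paper's derivation, which also covers rotation and shape-deformation parameters. The function names, the focal length `f`, and the principal point `c` are assumptions made for this example. Because a head translation adds the same offset to every model point, the same matrix also gives the derivative of the image position with respect to translation.

```python
import numpy as np


def project(points, f, c):
    """Pinhole projection of camera-frame points (N, 3) to pixels (N, 2).

    f is an assumed focal length in pixels, c an assumed principal point (2,).
    """
    points = np.asarray(points, dtype=float)
    return f * points[:, :2] / points[:, 2:3] + c


def projection_jacobian(points, f):
    """Analytic d(u, v) / d(X, Y, Z) for each point, shape (N, 2, 3)."""
    points = np.asarray(points, dtype=float)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    jac = np.zeros((points.shape[0], 2, 3))
    jac[:, 0, 0] = f / z              # du/dX
    jac[:, 0, 2] = -f * x / z**2      # du/dZ
    jac[:, 1, 1] = f / z              # dv/dY
    jac[:, 1, 2] = -f * y / z**2      # dv/dZ
    return jac


# Hypothetical example values: two points in front of the camera.
pts = np.array([[0.1, -0.2, 2.0], [0.0, 0.0, 1.0]])
focal = 500.0
centre = np.array([320.0, 240.0])
J = projection_jacobian(pts, focal)
```

In an iterative fitting loop, a matrix of this kind is what lets a change in the 2D landmark positions be mapped back to a change in the 3D parameters without a separate 2D-to-3D fitting stage.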
