Face Mask Extraction in Video Sequence

Authors
  • Wang, Yujiang1
  • Luo, Bingnan1
  • Shen, Jie1
  • Pantic, Maja1
  • 1 Imperial College London, Department of Computing, 180 Queens Gate, London (United Kingdom)
Type
Published Article
Journal
International Journal of Computer Vision
Publisher
Springer-Verlag
Publication Date
Nov 16, 2018
Volume
127
Issue
6-7
Pages
625–641
Identifiers
DOI: 10.1007/s11263-018-1130-2
Source
Springer Nature
License
Green

Abstract

Inspired by recent developments in deep network-based methods for semantic image segmentation, we introduce an end-to-end trainable model for face mask extraction in video sequences. Compared with landmark-based sparse face shape representations, our method produces segmentation masks of individual facial components, which better reflect their detailed shape variations. By integrating convolutional LSTMs (ConvLSTM) with fully convolutional networks (FCN), our ConvLSTM-FCN model works on a per-sequence basis and exploits the temporal correlation in video clips. In addition, we propose a novel loss function, called segmentation loss, to directly optimise intersection-over-union (IoU) performance. In practice, to further increase segmentation accuracy, one primary model and two additional models were trained to focus on the face, eyes, and mouth regions, respectively. Our experiments show that the proposed method achieves a 16.99% relative improvement (from 54.50 to 63.76% mean IoU) over the baseline FCN model on the 300 Videos in the Wild (300VW) dataset.
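The abstract does not give the exact form of the paper's segmentation loss; a common way to optimise IoU directly, sketched here as an assumption, is a differentiable "soft IoU" that replaces the hard binary prediction with foreground probabilities (the function name `soft_iou_loss` is illustrative, not from the paper):

```python
import numpy as np

def soft_iou_loss(pred, target, eps=1e-6):
    """Differentiable soft-IoU loss (illustrative sketch).

    pred:   predicted foreground probabilities in [0, 1]
    target: binary ground-truth mask of the same shape
    """
    pred = pred.astype(np.float64)
    target = target.astype(np.float64)
    # Soft intersection and union computed from probabilities,
    # so the loss stays differentiable w.r.t. the predictions.
    intersection = (pred * target).sum()
    union = pred.sum() + target.sum() - intersection
    return 1.0 - intersection / (union + eps)

mask = np.array([[1, 0], [0, 1]])
# A perfect prediction gives a loss near 0; predicting all
# background against a non-empty mask gives a loss near 1.
print(soft_iou_loss(mask, mask))
print(soft_iou_loss(np.zeros((2, 2)), mask))
```

Because the loss is 1 minus the (soft) IoU, minimising it with gradient descent pushes the network's mean IoU up directly, rather than optimising a proxy such as per-pixel cross-entropy.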
