Hierarchical Attention for Part-Aware Face Detection

Authors
  • Wu, Shuzhe (1, 2)
  • Kan, Meina (1)
  • Shan, Shiguang (1, 2, 3)
  • Chen, Xilin (1, 3)
  • (1) Institute of Computing Technology (ICT), CAS, Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Beijing, 100190, China
  • (2) University of Chinese Academy of Sciences (UCAS), Beijing, 100049, China
  • (3) CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai, 200031, China
Type
Published Article
Journal
International Journal of Computer Vision
Publisher
Springer-Verlag
Publication Date
Mar 02, 2019
Volume
127
Issue
6-7
Pages
560–578
Identifiers
DOI: 10.1007/s11263-019-01157-5
Source
Springer Nature
License
Yellow

Abstract

Expressive representations that characterize face appearance are essential for accurate face detection. Because of differences in pose, scale, illumination, occlusion, etc., face appearances vary substantially, and the content of each local region (facial part) differs from one face to another. Current detectors, however, particularly those based on convolutional neural networks, apply identical operations (e.g. convolution or pooling) to all local regions of each face for feature aggregation (in a generic sliding-window configuration), and treat all local features as equally effective for the detection task. In such methods, each local feature is suboptimal because region-wise distinctions are ignored, and the overall face representations are semantically inconsistent. To address this issue, we design a hierarchical attention mechanism that allows adaptive exploration of local features. Given a face proposal, part-specific attention, modeled as learnable Gaussian kernels, searches for proper positions and scales of local regions in order to extract consistent and informative features of facial parts. Face-specific attention, predicted with an LSTM, then models the relations between the local parts and adjusts their contributions to the detection task. This hierarchical attention yields a part-aware face detector that forms more expressive and semantically consistent face representations. Extensive experiments on three challenging face detection datasets demonstrate the effectiveness of the hierarchical attention and compare it with state-of-the-art methods.
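
The two attention levels described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`gaussian_attention_pool`, `combine_parts`), the single-channel feature map, and the isotropic kernel are assumptions made for illustration. In the paper the kernel position and scale are learned and the part weights are predicted by an LSTM; here the kernel parameters are passed in as fixed numbers and a plain softmax over given scores stands in for the LSTM output.

```python
import math


def gaussian_attention_pool(feature_map, mu_y, mu_x, sigma):
    """Pool a 2-D feature map with a normalized isotropic Gaussian kernel.

    (mu_y, mu_x) is the kernel center in pixel coordinates and sigma > 0 is
    its scale. Both are fixed inputs here (an assumption); in a trained
    model they would be learned or predicted.
    """
    weights = [
        [
            math.exp(-((y - mu_y) ** 2 + (x - mu_x) ** 2) / (2.0 * sigma ** 2))
            for x in range(len(row))
        ]
        for y, row in enumerate(feature_map)
    ]
    total = sum(sum(row) for row in weights)
    # Weighted average of the features under the normalized kernel.
    return sum(
        w * f
        for w_row, f_row in zip(weights, feature_map)
        for w, f in zip(w_row, f_row)
    ) / total


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def combine_parts(part_features, part_scores):
    """Weight per-part features by softmax-normalized scores.

    The scores are hypothetical stand-ins for the face-specific attention
    that the paper predicts with an LSTM.
    """
    weights = softmax(part_scores)
    return sum(w * f for w, f in zip(weights, part_features))


if __name__ == "__main__":
    fmap = [[0.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 0.0]]
    # Two hypothetical "parts": one kernel on the center, one on a corner.
    center = gaussian_attention_pool(fmap, 1.0, 1.0, 0.7)
    corner = gaussian_attention_pool(fmap, 0.0, 0.0, 0.7)
    print(combine_parts([center, corner], [1.0, 0.0]))
```

A small `sigma` concentrates the pooled feature on the pixel nearest the kernel center, while a large `sigma` approaches a plain average over the map, which is the sense in which a kernel of this kind can select both the position and the scale of a local region.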
