In this paper, we address the problem of extreme head pose estimation from intensity images in a monocular setup. We introduce a novel fusion pipeline that integrates, within a dedicated Kalman filter, the pose estimated by a tracking scheme in the prediction stage and the pose estimated by a detection scheme in the correction stage. To that end, the measurement covariance of the Kalman filter is updated in every frame. The tracking scheme is performed using a set of keypoints extracted in the head region along with a simple 3D geometric model. The detection scheme, on the other hand, relies on the alignment of facial landmarks in each frame combined with 3D features extracted from a head mesh. In each scheme, the head pose is estimated by minimizing the reprojection error of the 3D-2D correspondences. By combining both frameworks, we extend the applicability of head pose estimation from facial landmarks to cases where these features are no longer visible. We compare the proposed method to related approaches and show that it achieves state-of-the-art performance. We also demonstrate that our approach handles extreme head rotations and (self-)occlusions while remaining suitable for real-time applications.
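The fusion described above can be illustrated with a minimal sketch (not the authors' implementation): a linear Kalman filter whose prediction stage is driven by the tracking-based pose, whose correction stage consumes the detection-based pose as a measurement, and whose measurement covariance `R` is supplied anew for every frame. The 6-DoF state layout (3 rotation + 3 translation components) and the identity measurement model are illustrative assumptions.

```python
import numpy as np

class PoseFusionKF:
    """Minimal sketch of the tracking/detection fusion in a Kalman filter.

    Assumptions (not from the paper): 6-DoF pose state (3 rotation,
    3 translation), identity transition and measurement models.
    """

    def __init__(self, dim=6, q=1e-4):
        self.dim = dim
        self.x = np.zeros(dim)       # fused pose estimate
        self.P = np.eye(dim)         # state covariance
        self.Q = q * np.eye(dim)     # process noise covariance

    def predict(self, tracked_pose):
        # Prediction stage: the pose from the keypoint-tracking scheme
        # drives the state forward; uncertainty grows by Q.
        self.x = np.asarray(tracked_pose, dtype=float)
        self.P = self.P + self.Q
        return self.x

    def correct(self, detected_pose, R):
        # Correction stage: the pose from landmark-based detection is the
        # measurement; R is re-estimated in every frame (e.g. from
        # landmark alignment confidence).
        K = self.P @ np.linalg.inv(self.P + R)        # Kalman gain (H = I)
        self.x = self.x + K @ (np.asarray(detected_pose, dtype=float) - self.x)
        self.P = (np.eye(self.dim) - K) @ self.P
        return self.x

kf = PoseFusionKF()
kf.predict(np.array([0.10, 0.0, 0.0, 0.0, 0.0, 1.00]))
fused = kf.correct(np.array([0.12, 0.0, 0.0, 0.0, 0.0, 1.02]),
                   R=1e-2 * np.eye(6))
```

With a small `R` (a confident detection), the fused pose is pulled close to the measurement; a large `R` (e.g. occluded landmarks) leaves the tracked prediction nearly untouched, which mirrors how the per-frame covariance update arbitrates between the two schemes.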