TY - GEN
T1 - Real-time monocular 6-DOF head pose estimation from salient 2D points
AU - Diaz Barros, Jilliam Maria
AU - Garcia, Frederic
AU - Mirbach, Bruno
AU - Stricker, Didier
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
N2 - We propose a real-time and robust approach to estimate the full 3D head pose, including extreme head poses, using a monocular system. To this end, we first model the head with a simple geometric shape initialized from facial landmarks, i.e., the eye corners, extracted from the face. Next, 2D salient points are detected within the region defined by the projection of the visible surface of the geometric head model onto the image, and are projected back onto the head model to generate the corresponding 3D features. Optical flow is used to find the respective 2D correspondences in the next video frame. Assuming the monocular system is calibrated, the head pose can then be recovered by solving the Perspective-n-Point (PnP) problem given the set of 3D features on the geometric model surface and their 2D correspondences from optical flow in the next frame. The experimental evaluation shows that the proposed approach matches, and in some cases improves on, state-of-the-art performance, with the major advantage of not requiring facial landmarks (except for initialization). As a result, our method also applies to real scenarios in which facial-landmark-based methods fail due to self-occlusions.
AB - We propose a real-time and robust approach to estimate the full 3D head pose, including extreme head poses, using a monocular system. To this end, we first model the head with a simple geometric shape initialized from facial landmarks, i.e., the eye corners, extracted from the face. Next, 2D salient points are detected within the region defined by the projection of the visible surface of the geometric head model onto the image, and are projected back onto the head model to generate the corresponding 3D features. Optical flow is used to find the respective 2D correspondences in the next video frame. Assuming the monocular system is calibrated, the head pose can then be recovered by solving the Perspective-n-Point (PnP) problem given the set of 3D features on the geometric model surface and their 2D correspondences from optical flow in the next frame. The experimental evaluation shows that the proposed approach matches, and in some cases improves on, state-of-the-art performance, with the major advantage of not requiring facial landmarks (except for initialization). As a result, our method also applies to real scenarios in which facial-landmark-based methods fail due to self-occlusions.
KW - Head pose estimation
KW - Monocular system
KW - Perspective-n-point
UR - http://www.scopus.com/inward/record.url?scp=85045326549&partnerID=8YFLogxK
U2 - 10.1109/ICIP.2017.8296255
DO - 10.1109/ICIP.2017.8296255
M3 - Conference contribution
AN - SCOPUS:85045326549
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - 121
EP - 125
BT - 2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings
PB - IEEE Computer Society
T2 - 24th IEEE International Conference on Image Processing, ICIP 2017
Y2 - 17 September 2017 through 20 September 2017
ER -
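
The core step described in the abstract, recovering the head pose by solving the Perspective-n-Point problem from 3D points on a geometric head model and their 2D optical-flow correspondences, can be illustrated with a minimal sketch. This is not the authors' implementation; the use of OpenCV's solvePnP, the point values, and the camera intrinsics below are assumptions for illustration only.

# Minimal sketch (illustrative, not the paper's code): head pose from PnP,
# given 3D points on a geometric head model and their tracked 2D positions.
import numpy as np
import cv2

# 3D salient points on the head-model surface (model coordinates; placeholder values).
model_points_3d = np.array([
    [0.00,  0.00, 0.10],
    [0.03,  0.02, 0.09],
    [-0.03, 0.02, 0.09],
    [0.02, -0.04, 0.08],
    [-0.02, -0.04, 0.08],
], dtype=np.float64)

# Their 2D correspondences in the next frame, e.g. tracked with
# cv2.calcOpticalFlowPyrLK (pixel coordinates; placeholder values).
image_points_2d = np.array([
    [320.0, 240.0],
    [345.0, 225.0],
    [295.0, 225.0],
    [335.0, 275.0],
    [305.0, 275.0],
], dtype=np.float64)

# Intrinsics of the calibrated monocular camera (placeholder values).
camera_matrix = np.array([
    [800.0,   0.0, 320.0],
    [  0.0, 800.0, 240.0],
    [  0.0,   0.0,   1.0],
], dtype=np.float64)
dist_coeffs = np.zeros(5)  # assume lens distortion is already corrected

# Solve PnP: find the rotation and translation of the head model that best
# explain the observed 2D points in the calibrated camera.
ok, rvec, tvec = cv2.solvePnP(
    model_points_3d, image_points_2d, camera_matrix, dist_coeffs,
    flags=cv2.SOLVEPNP_ITERATIVE)

if ok:
    rotation_matrix, _ = cv2.Rodrigues(rvec)  # 3x3 head orientation
    print("Rotation:\n", rotation_matrix)
    print("Translation:\n", tvec.ravel())

In a tracking loop, the recovered pose would be used to re-project the visible model surface into the next frame before detecting and tracking new salient points, as outlined in the abstract.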