Elsevier

Neurocomputing

Volume 458, 11 October 2021, Pages 297-307
Towards Reading Beyond Faces for Sparsity-aware 3D/4D Affect Recognition

https://doi.org/10.1016/j.neucom.2021.06.023

Open access under a Creative Commons license.

Abstract

In this paper, we present a sparsity-aware deep network for automatic 3D/4D facial expression recognition (FER). We first propose a novel augmentation method to combat the data limitation problem for deep learning, specifically given 3D/4D face meshes. This is achieved by projecting the input data into RGB and depth map images and then iteratively performing randomized channel concatenation. Using the given 3D landmarks, we also introduce an effective way to capture facial muscle movements from three orthogonal planes (TOP), the TOP-landmarks over multi-views. Importantly, we then present a sparsity-aware deep network to compute sparse representations of the convolutional features over multi-views. This is not only effective for higher recognition accuracy but is also computationally convenient. For training, the TOP-landmarks and sparse representations are used to train a long short-term memory (LSTM) network for 4D data, and a pre-trained network for 3D data. The refined predictions are achieved when the learned features collaborate over multi-views. Extensive experimental results on the Bosphorus, BU-3DFE, BU-4DFE and BP4D-Spontaneous datasets show the superiority of our method over state-of-the-art methods and demonstrate its effectiveness by reaching a promising accuracy of 99.69% on BU-4DFE for 4D FER.
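The augmentation step described above (projecting 3D/4D meshes into RGB and depth map images, then randomly concatenating channels) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name, the sampling-with-replacement choice, and the channel pool `{R, G, B, D}` are assumptions made for the sketch.

```python
import numpy as np

def randomized_channel_concat(rgb, depth, rng=None):
    """Hypothetical sketch of the randomized channel concatenation
    augmentation: given an RGB projection (H, W, 3) and a depth map
    (H, W) of the same face mesh, build a new 3-channel image whose
    channels are drawn at random from the pool {R, G, B, D}.
    """
    rng = rng or np.random.default_rng()
    # Candidate channel pool: the three RGB channels plus the depth map.
    pool = [rgb[..., 0], rgb[..., 1], rgb[..., 2], depth]
    # Draw 3 channel indices (with replacement, an assumption) and stack.
    idx = rng.integers(0, len(pool), size=3)
    return np.stack([pool[i] for i in idx], axis=-1)

# Toy usage: a 4x4 projected "face" image and matching depth map.
rgb = np.random.rand(4, 4, 3)
depth = np.random.rand(4, 4)
aug = randomized_channel_concat(rgb, depth)   # shape (4, 4, 3)
```

Repeating this draw yields many channel-shuffled variants of each mesh projection, which is how such a scheme multiplies limited 3D/4D training data.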

Keywords

Affect
Augmentation
Deep learning
3D/4D facial expression recognition
Landmarks

Muzammil Behzad received his B.S. degree with distinctions (double medalist and valedictorian) from COMSATS University Islamabad (CUI), Pakistan, and his fully-funded M.S. degree from King Fahd University of Petroleum and Minerals, Saudi Arabia, both in Electrical Engineering. Currently, he is working as a Ph.D. Researcher at the Center for Machine Vision and Signal Analysis (CMVS) at the University of Oulu, Finland. He was a research scholar at Brown University from February to May 2019. Before joining CMVS, he worked as a full-time Researcher at Pukyong National University (PKNU), South Korea, and as a Research Associate at CUI. He is the winner of the prestigious three-minute Ph.D. thesis competition at the IEEE International Conference on Image Processing 2020. His research interests lie in signal and image processing, computer vision, deep learning, and their applications.

Nhat Vo received his B.S. degree in Information Technology from the University of Science-VNUHCM, Vietnam, in 2010, and his M.S. and Ph.D. degrees in Electronics and Computer Engineering from Chonnam National University, Republic of Korea, in 2017. He worked as a postdoctoral researcher at the Center for Machine Vision and Signal Analysis, University of Oulu, Finland. He is currently working as an AI scientist at Silo.AI. His research interests are multimedia and image processing, facial expression analysis, and pattern recognition.

Xiaobai Li received her B.Sc. degree in Psychology from Peking University, M.Sc. degree in Biophysics from the Chinese Academy of Sciences, and Ph.D. degree in Computer Science from the University of Oulu. She is currently an assistant professor at the Center for Machine Vision and Signal Analysis of the University of Oulu. Her research interests include spontaneous vs. posed facial expression comparison, micro-expressions and deceitful behaviors, and heart rate measurement from facial videos.

Guoying Zhao (IEEE Senior Member 2012, IAPR Fellow) received the Ph.D. degree in computer science from the Chinese Academy of Sciences, Beijing, China, in 2005. She then worked as a senior researcher from 2005 and as an Associate Professor from 2014 with the Center for Machine Vision and Signal Analysis, University of Oulu, Finland. She is currently a full professor with the University of Oulu, Finland, and a visiting professor with Northwest University, China. In 2020, she was selected for the prestigious Academy Professor position. She was a Nokia visiting professor in 2016. She has authored or co-authored more than 260 papers in journals and conferences. Her papers currently have over 15690 citations in Google Scholar (h-index 57). She is co-program chair of the ACM International Conference on Multimodal Interaction (ICMI 2021), was co-publicity chair of FG2018, General Chair of the 3rd International Conference on Biometric Engineering and Applications (ICBEA 2019), and Late Breaking Results Co-Chair of the 21st ACM International Conference on Multimodal Interaction (ICMI 2019). She has served as an area chair for several conferences and is an associate editor for the Pattern Recognition, IEEE Transactions on Circuits and Systems for Video Technology, and Image and Vision Computing journals. She has lectured tutorials at ICPR 2006, ICCV 2009, SCIA 2013 and FG 2018, and authored/edited three books and eight special issues in journals. Dr. Zhao was a Co-Chair of many International Workshops at ICCV, CVPR, ECCV, ACCV and BMVC. Her current research interests include image and video descriptors, facial-expression and micro-expression recognition, emotional gesture analysis, affective computing, and biometrics. Her research has been reported by Finnish TV programs, newspapers and MIT Technology Review.