Real-Time Emotion Recognition Framework Based on Convolution Neural Network

  • Hanting Yang
  • Guangzhe Zhao
  • Lei Zhang
  • Na Zhu
  • Yanqing He
  • Chunxiao Zhao
Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 157)


Efficient analysis of emotional states will enable machines to understand humans better and will facilitate the development of applications that involve human–machine interaction. Deep learning methods have recently become popular owing to their generalization ability, but their computational cost makes it difficult to meet real-time requirements. This paper proposes an emotion recognition framework based on a convolutional neural network with comparatively few parameters. To verify the proposed framework, we train a network on a large number of facial expression images and then use the pretrained model to predict emotions in image frames captured from a single camera. The experiment shows that, compared with VGG13, our network reduces the number of parameters by a factor of 147.
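To make the parameter comparison concrete, the following is a minimal sketch of how trainable parameters are counted for convolutional and fully connected layers, and how a reduction factor is computed from two totals. The layer shapes, the seven-class output, and the helper names are illustrative assumptions; they are not the architecture described in the paper, and the resulting ratio is not the paper's reported figure.

```python
def conv2d_params(kernel, c_in, c_out, bias=True):
    """Trainable parameters of a standard 2D convolution with a square kernel."""
    return kernel * kernel * c_in * c_out + (c_out if bias else 0)


def dense_params(n_in, n_out, bias=True):
    """Trainable parameters of a fully connected layer."""
    return n_in * n_out + (n_out if bias else 0)


def reduction_factor(baseline, compact):
    """How many times fewer parameters the compact model has."""
    return baseline / compact


# Hypothetical "heavy" stack: two conv layers followed by large dense layers.
heavy = (
    conv2d_params(3, 1, 64)
    + conv2d_params(3, 64, 128)
    + dense_params(128 * 12 * 12, 1024)  # assumed 12x12 feature map
    + dense_params(1024, 7)              # assumed 7 emotion classes
)

# Hypothetical "light" stack: narrow conv layers only; the final conv emits
# one channel per class and would be followed by parameter-free pooling.
light = (
    conv2d_params(3, 1, 16)
    + conv2d_params(3, 16, 32)
    + conv2d_params(3, 32, 7)
)

print(f"heavy: {heavy} parameters")
print(f"light: {light} parameters")
print(f"reduction factor: {reduction_factor(heavy, light):.1f}x")
```

In this toy comparison almost all of the heavy model's parameters sit in the first dense layer, which is one common reason compact designs avoid large fully connected layers.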


CNN, Emotion recognition, Image processing



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  • Hanting Yang
    • 1
  • Guangzhe Zhao
    • 1
  • Lei Zhang
    • 1
  • Na Zhu
    • 1
  • Yanqing He
    • 1
  • Chunxiao Zhao
    • 1
  1. Beijing University of Civil Engineering and Architecture, Beijing, China
