When Siri Knows How You Feel: Study of Machine Learning in Automatic Sentiment Recognition from Human Speech

Zhang, L.; Ng, E. Y. K.

doi:10.1007/978-3-030-03405-4_41

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 887)

Included in the following conference series: Future of Information and Communication Conference

Abstract

Opinions and sentiments are essential to human activities and have a wide variety of applications. As many decision makers turn to social media because of the large volume of opinion data available, efficient and accurate sentiment analysis is necessary to extract those data. Hence, text sentiment analysis has recently become a popular field and has attracted many researchers. However, extracting sentiments from audio speech remains a challenge. This project explored the possibility of applying supervised Machine Learning to recognize sentiments in English utterances at the sentence level. The project also aimed to examine the effect of combining acoustic and linguistic features on classification accuracy. Six audio tracks were randomly selected as training data from 40 YouTube monologue videos with a strong presence of sentiment. Speakers expressed sentiments towards products, films, or political events. These sentiments were manually labelled as negative or positive based on the independent judgment of three experimenters. A wide range of acoustic and linguistic features was then analyzed and extracted using sound editing and text mining tools, respectively. A novel approach was proposed, which used a simplified sentiment score to integrate linguistic features and estimate sentiment valence. This approach improved negation analysis and hence increased overall accuracy. Results showed that using both linguistic and acoustic features significantly improved the accuracy of sentiment recognition, and that each of the four classifiers trained (kNN, SVM, Neural Network, and Naïve Bayes) achieved excellent prediction. Possible sources of error and inherent challenges of audio sentiment analysis were discussed to provide potential directions for future research.
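
The abstract does not specify how the simplified sentiment score is computed, so the following Python sketch only illustrates the general idea of a lexicon-based score with negation handling, appended to a vector of acoustic features before classification. The word lists, the negation window, and the function names `sentiment_score` and `combine_features` are all assumptions made for illustration and are not taken from the chapter.

```python
# Illustrative only: the lexicon, negator list, and window size below are
# assumptions, not values reported in the chapter.
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "terrible", "hate", "boring"}
NEGATORS = {"not", "no", "never", "n't"}
NEGATION_WINDOW = 3  # hypothetical: a negator flips the next 3 tokens


def sentiment_score(sentence: str) -> float:
    """Return a score in [-1, 1]; negated sentiment words are flipped."""
    # Split contractions such as "isn't" into "is" + "n't".
    tokens = sentence.lower().replace("n't", " n't").split()
    total, hits, flip_left = 0, 0, 0
    for raw in tokens:
        word = raw.strip(".,!?;:\"")
        if word in NEGATORS:
            flip_left = NEGATION_WINDOW
            continue
        # +1 for a positive word, -1 for a negative word, 0 otherwise.
        polarity = (word in POSITIVE) - (word in NEGATIVE)
        if polarity:
            total += -polarity if flip_left > 0 else polarity
            hits += 1
        flip_left = max(0, flip_left - 1)
    return total / hits if hits else 0.0


def combine_features(acoustic: list[float], sentence: str) -> list[float]:
    """Append the linguistic score to a vector of acoustic features."""
    return [*acoustic, sentiment_score(sentence)]
```

Under these assumptions, `combine_features([0.5, 120.0], "This film is not good")` yields a three-element vector whose last entry is the negated (negative) score. A vector of this form could then be passed to any of the classifiers named in the abstract.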

L. Zhang—Yale-NUS College, Singapore

Notes

1. A word or lexical unit that has multiple meanings.
2. Multilinguality is a characteristic of tasks that involve the use of more than one natural language (Kay, n.d.).
3. Exophoric reference refers to a situation or entities outside the text (University of Pennsylvania, 2006).

Acknowledgment

I would like to express my sincere gratitude to my project supervisor, Professor Eddie Ng, for being open-minded about my project topic; without this I would not have been able to delve deep into my field of interest. His insightful suggestions and unwavering support have guided me through doubts and difficulties.

Author information

Authors and Affiliations

Anglo-Chinese Junior College, Singapore, Singapore
L. Zhang
College of Engineering, Nanyang Technological University (NTU), Singapore, Singapore
E. Y. K. Ng

Corresponding author

Correspondence to L. Zhang.

Editor information

Editors and Affiliations

Faculty of Science and Engineering, Saga University, Saga, Japan
Kohei Arai
The Science and Information (SAI) Organization, Bradford, West Yorkshire, UK
Supriya Kapoor
The Science and Information (SAI) Organization, New Delhi, India
Rahul Bhatia

About this paper

Cite this paper

Zhang, L., Ng, E.Y.K. (2019). When Siri Knows How You Feel: Study of Machine Learning in Automatic Sentiment Recognition from Human Speech. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Advances in Information and Communication Networks. FICC 2018. Advances in Intelligent Systems and Computing, vol 887. Springer, Cham. https://doi.org/10.1007/978-3-030-03405-4_41

DOI: https://doi.org/10.1007/978-3-030-03405-4_41
Published: 27 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03404-7
Online ISBN: 978-3-030-03405-4
eBook Packages: Engineering, Engineering (R0)
