Continuous Speech Recognition Technologies—A Review

Bhatt, Shobha; Jain, Anurag; Dev, Amita

doi:10.1007/978-981-15-5776-7_8

Part of the book series: Lecture Notes in Mechanical Engineering (LNME)

Abstract

Speech recognition is an active and growing field of research, since speech is a natural way of communication. This paper reviews the technologies used for continuous speech recognition. It describes the structure of a speech recognition system and its stages, and examines feature extraction techniques along with their merits and demerits. Because language modeling plays a vital role in speech recognition, several of its aspects are presented. Widely used classification techniques for building speech recognition systems are discussed, and the importance of the speech corpus in the recognition process is described. Tools for the analysis and development of speech recognition systems are explored, and the parameters used to test such systems are discussed. Finally, a comparative study of the different technological aspects of speech recognition is provided.

Acknowledgments

The authors would like to acknowledge the Ministry of Electronics and Information Technology (MeitY), Government of India, for providing financial assistance for this research work through “Visvesvaraya Ph.D. Scheme for Electronics and IT”.

Author information

Authors and Affiliations

U.S.I.C.T, Guru Gobind Singh Indraprastha University, Sector-16, Dwarka, Delhi, India
Shobha Bhatt & Anurag Jain
Indira Gandhi Delhi Technical University for Women, New Delhi, India
Amita Dev

Corresponding author

Correspondence to Shobha Bhatt.

Editor information

Editors and Affiliations

Department of Physico-Mechanical Metrology, CSIR-National Physical Laboratory, New Delhi, Delhi, India
Mahavir Singh
Mechanical Engineering Department, Aligarh Muslim University, Aligarh, Uttar Pradesh, India
Yasser Rafat

About this paper

Cite this paper

Bhatt, S., Jain, A., Dev, A. (2021). Continuous Speech Recognition Technologies—A Review. In: Singh, M., Rafat, Y. (eds) Recent Developments in Acoustics. Lecture Notes in Mechanical Engineering. Springer, Singapore. https://doi.org/10.1007/978-981-15-5776-7_8

DOI: https://doi.org/10.1007/978-981-15-5776-7_8
Published: 20 September 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-5775-0
Online ISBN: 978-981-15-5776-7
eBook Packages: Engineering, Engineering (R0)
