Efficient Language Model Generation Algorithm for Mobile Voice Commands

Yaeger, Daniel; Bubeck, Christopher

doi:10.1007/978-3-319-60011-6_11

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 598)

Included in the following conference series:

International Conference on Applied Human Factors and Ergonomics

Abstract

The Single Multimodal Android Service for HCI (SMASH) framework implements an automated language data generation algorithm to support high-accuracy, efficient, always-listening voice command recognition using the Carnegie Mellon University (CMU) PocketSphinx n-gram speech recognizer. SMASH injects additional language data into the language model generation process to augment the orthographies extracted from the input voice command grammar. This additional data allows for a larger variety of potential outcomes, and greater phonetic distance between outcomes, within the generated language model, resulting in more consistent probability scores for in-grammar utterances, and fewer false positives from out-of-grammar (OOG) utterances.
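
The idea of augmenting grammar-derived text with additional language data can be illustrated with a minimal sketch. This is not the SMASH implementation, which is not described in this abstract. The function names, the sample commands, the filler words, and the choice to treat each filler word as its own one-word sentence are all assumptions made for illustration. The sketch only shows how adding extra words to the training text enlarges the vocabulary and spreads unigram probability mass across more possible outcomes.

```python
from collections import Counter


def build_corpus(command_phrases, filler_words):
    """Combine in-grammar phrases with filler words (hypothetical helper).

    Each command phrase becomes one tokenized sentence. Each filler word
    becomes its own one-word sentence, which is an assumption of this sketch.
    """
    sentences = [phrase.lower().split() for phrase in command_phrases]
    sentences += [[word.lower()] for word in filler_words]
    return sentences


def unigram_probabilities(sentences):
    """Maximum-likelihood unigram estimates over the tokenized corpus."""
    counts = Counter(word for sentence in sentences for word in sentence)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


# Illustrative data only; not taken from the chapter.
commands = ["open map", "close map", "take photo"]
fillers = ["banana", "window", "yesterday", "purple"]

without_fillers = unigram_probabilities(build_corpus(commands, []))
with_fillers = unigram_probabilities(build_corpus(commands, fillers))

print(len(without_fillers), len(with_fillers))
print(without_fillers["map"], with_fillers["map"])
```

In a real pipeline the combined text would be passed to a language modeling toolkit to produce an ARPA-format n-gram model for the recognizer; that step is omitted here.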

Author information

Authors and Affiliations

US Army, Aberdeen Proving Ground, USA
Daniel Yaeger & Christopher Bubeck

Authors

Daniel Yaeger
Christopher Bubeck

Corresponding author

Correspondence to Daniel Yaeger.

Editor information

Editors and Affiliations

University of Central Florida, Orlando, Florida, USA
Tareq Ahram
University of Central Florida, Winter Park, Florida, USA
Waldemar Karwowski

About this paper

Cite this paper

Yaeger, D., Bubeck, C. (2018). Efficient Language Model Generation Algorithm for Mobile Voice Commands. In: Ahram, T., Karwowski, W. (eds) Advances in Human Factors, Software, and Systems Engineering. AHFE 2017. Advances in Intelligent Systems and Computing, vol 598. Springer, Cham. https://doi.org/10.1007/978-3-319-60011-6_11

DOI: https://doi.org/10.1007/978-3-319-60011-6_11
Published: 11 June 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60010-9
Online ISBN: 978-3-319-60011-6
eBook Packages: Engineering, Engineering (R0)
