Efficient Language Model Generation Algorithm for Mobile Voice Commands

  • Conference paper
Advances in Human Factors, Software, and Systems Engineering (AHFE 2017)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 598)

Abstract

The Single Multimodal Android Service for HCI (SMASH) framework implements an automated language data generation algorithm to support high-accuracy, efficient, always-listening voice command recognition using the Carnegie Mellon University (CMU) PocketSphinx n-gram speech recognizer. SMASH injects additional language data into the language model generation process to augment the orthographies extracted from the input voice command grammar. This additional data allows for a larger variety of potential outcomes, and greater phonetic distance between outcomes, within the generated language model, resulting in more consistent probability scores for in-grammar utterances, and fewer false positives from out-of-grammar (OOG) utterances.
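The effect described in the abstract can be illustrated with a toy example. The sketch below is not the SMASH algorithm, whose details are not given here; it only assumes that the "additional language data" takes the form of extra filler words added to the training text alongside the command phrases, and it uses unsmoothed maximum-likelihood bigram estimates rather than the output of a real language modeling toolkit. The function names `build_corpus` and `bigram_model`, and the sample commands and fillers, are hypothetical.

```python
from collections import Counter


def build_corpus(command_phrases, filler_words):
    """Return training sentences: every command phrase, plus one
    single-word sentence per filler word (the assumed extra data)."""
    sentences = [phrase.lower().split() for phrase in command_phrases]
    sentences += [[word.lower()] for word in filler_words]
    return sentences


def bigram_model(sentences):
    """Unsmoothed maximum-likelihood estimates of P(w2 | w1),
    using <s> and </s> as sentence boundary markers."""
    histories = Counter()
    bigrams = Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        histories.update(tokens[:-1])          # every token except </s> is a history
        bigrams.update(zip(tokens, tokens[1:]))
    return {pair: n / histories[pair[0]] for pair, n in bigrams.items()}


# Hypothetical grammar expansions and filler words.
commands = ["call home", "play music"]
fillers = ["window", "garden"]

plain = bigram_model(build_corpus(commands, []))
augmented = bigram_model(build_corpus(commands, fillers))
```

In this toy setup the commands-only model splits all sentence-initial probability between "call" and "play", so any utterance must resolve to one of them. In the augmented model the filler words take a share of that probability, which gives the recognizer alternative outcomes for speech that matches no command, while within-command transitions such as "call" followed by "home" are unchanged.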

Author information

Corresponding author

Correspondence to Daniel Yaeger.

Copyright information

© 2018 Springer International Publishing AG (outside the USA)

About this paper

Cite this paper

Yaeger, D., Bubeck, C. (2018). Efficient Language Model Generation Algorithm for Mobile Voice Commands. In: Ahram, T., Karwowski, W. (eds) Advances in Human Factors, Software, and Systems Engineering. AHFE 2017. Advances in Intelligent Systems and Computing, vol 598. Springer, Cham. https://doi.org/10.1007/978-3-319-60011-6_11

  • DOI: https://doi.org/10.1007/978-3-319-60011-6_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60010-9

  • Online ISBN: 978-3-319-60011-6

  • eBook Packages: Engineering, Engineering (R0)