Sphinx: Developing Speech-Enabled Applications Training Course

Overview

Speech technology is increasingly used to build highly interactive, voice-activated applications. From voice control and smart assistants to speech transcription and translation, closed captioning, and language learning, the improved accuracy and processing speed of this technology are raising application quality and delivering better user experiences.

In this course we use the open-source Sphinx toolkit (also known as CMU Sphinx) to demonstrate and model various types of speech-enabled applications. By the end of the course, participants should have a solid grasp of the tools and techniques needed to apply speech technology to their own applications. Sphinx 4 will be the basis for this training; however, coverage of Sphinx 3 can also be arranged.

Audience

  • Software developers and programmers

Format of the course

  • Part lecture, part discussion, heavy hands-on practice, occasional tests to gauge understanding

Requirements

  • An understanding of the fundamentals of speech technology
  • Programming experience, especially Java for Sphinx 4 and C for PocketSphinx

Course Outline

Introduction to speech and speech technology

Downloading and building Sphinx

Preparing packages and models

Eclipse IDE setup

Overview of the CMU Sphinx toolkit

Building an application with Sphinx 4

Building the dictionary

Building the language model

Adapting an existing acoustic model

Building an acoustic model

Building an Android application with PocketSphinx

Performance tuning

Summary and conclusion
