Integrated Engineering Department Publications

Symbolic Phonetic Features for Modeling of Pronunciation Variation

Rebecca Bates, Minnesota State University, Mankato
Mari Ostendorf, University of Washington - Seattle Campus
Richard A. Wright, University of Washington - Seattle Campus

Document Type

Article

Publication Date

2-2007

Abstract

A significant source of variation in spontaneous speech is due to intra-speaker pronunciation changes, often realized as small feature changes, e.g., nasalized vowels or affricated stops, rather than full phone transformations. Previous computational modeling of pronunciation variation has typically involved transformations from one phone to another, in part because most speech processing systems use phone-based units. Here, a phonetic-feature-based prediction model is presented where phones are represented by a vector of symbolic features that can be on, off, unspecified or unused. Feature interaction is examined using different groupings of possibly dependent features, and a hierarchical grouping with conditional dependencies led to the best results. Feature-based models are shown to be more efficient than phone-based models, in the sense of requiring fewer parameters to predict variation while giving smaller distance and perplexity values when comparing predictions to the hand-labeled reference. A parsimonious model is better suited to incorporating new conditioning factors, and this work investigates high-level information sources, including both text (syntax, discourse) and prosody cues. Experiments show that feature-based models benefit from prosody cues, but not text, and that phone-based models do not benefit from any of the high-level cues explored here.
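
The feature-vector representation described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not list the feature inventory, so the three feature names are hypothetical, and the simple mismatch count stands in for whatever distance measure the article actually uses. Only the four possible feature values (on, off, unspecified, unused) come from the abstract.

```python
from enum import Enum


class FeatureValue(Enum):
    """The four symbolic values a feature can take, per the abstract."""
    ON = "on"
    OFF = "off"
    UNSPECIFIED = "unspecified"
    UNUSED = "unused"


# Hypothetical feature inventory, chosen only for illustration.
FEATURES = ("nasal", "voiced", "continuant")


def make_phone(**values):
    """Build a feature vector; any feature not supplied defaults to UNUSED."""
    return tuple(values.get(name, FeatureValue.UNUSED) for name in FEATURES)


def feature_distance(a, b):
    """Count the positions at which two feature vectors differ.

    This is a plain mismatch count (Hamming distance), assumed here as a
    stand-in; it is not claimed to be the article's distance measure.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(x is not y for x, y in zip(a, b))


# Illustrative phones: a plain vowel, its nasalized variant, a voiceless stop.
plain_vowel = make_phone(
    nasal=FeatureValue.OFF, voiced=FeatureValue.ON, continuant=FeatureValue.ON
)
nasalized_vowel = make_phone(
    nasal=FeatureValue.ON, voiced=FeatureValue.ON, continuant=FeatureValue.ON
)
voiceless_stop = make_phone(
    nasal=FeatureValue.OFF, voiced=FeatureValue.OFF, continuant=FeatureValue.OFF
)

print(feature_distance(plain_vowel, nasalized_vowel))  # 1: only "nasal" differs
print(feature_distance(plain_vowel, voiceless_stop))   # 2: "voiced", "continuant"
```

Under these assumptions, a nasalized vowel sits one feature change away from its plain counterpart, which is the kind of small variation a phone-to-phone substitution model would have to treat as a full phone transformation.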

Department

Integrated Engineering

Publication Title

Speech Communication

Recommended Citation

Bates, R., Ostendorf, M., & Wright, R. (2007). Symbolic phonetic features for modeling of pronunciation variation. Speech Communication, 49(2), 83-97. doi:10.1016/j.specom.2006.10.007

DOI

10.1016/j.specom.2006.10.007

Link to Publisher Version (DOI)

https://doi.org/10.1016/j.specom.2006.10.007

Publisher's Copyright and Source

Copyright © 2006 Elsevier B.V. Article published by Elsevier in Speech Communication, volume 49, issue number 2, February 2007, pages 83-97. Available online at https://doi.org/10.1016/j.specom.2006.10.007.
