Document Type : Full Research Paper

Authors

1 M.Sc., Bioelectric Department, Faculty of Biomedical Engineering, Amirkabir University of Technology

2 Associate Professor, Bioelectric Department, Faculty of Biomedical Engineering, Amirkabir University of Technology

3 Associate Professor, Bioelectric Department, Faculty of Biomedical Engineering, Amirkabir University of Technology

10.22041/ijbme.2013.13121

Abstract

This work studies isolated-word recognition. The aim is to improve the performance of a children’s speech recognizer using Vocal Tract Length Normalization (VTLN). The recognizer was built as part of a speech therapy software package whose goals are to distinguish correct from incorrect pronunciation and to help children improve their pronunciation through feedback. In the test phase, speech data covering correct and incorrect pronunciations of 47 words were used. Four baseline models were trained: one for children, one combined model (females and children), and two for adults (using a Persian database). The children’s model was trained and tested on data collected from 38 children aged 5 to 8 years. The experiments were implemented with the HTK toolkit. VTLN improved the initially poor recognition performance, and the improvement was larger for the adult models than for the children’s model.
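
As an illustration of the kind of frequency warping that VTLN relies on, the following is a minimal sketch of a piecewise-linear warp. It is not taken from the paper or from HTK: the function name `warp_frequency`, the default band edge `f_max_hz`, the break point `f_cut_hz`, and the convention that the warp factor `alpha` scales the low-frequency region are all assumptions made for this example.

```python
def warp_frequency(f_hz, alpha, f_max_hz=8000.0, f_cut_hz=6000.0):
    """Piecewise-linear frequency warp (illustrative, hypothetical helper).

    Below f_cut_hz the frequency is scaled by alpha. Above it, a second
    linear segment joins (f_cut_hz, alpha * f_cut_hz) to (f_max_hz, f_max_hz)
    so that the warped axis still ends at f_max_hz.
    """
    if not 0.0 <= f_hz <= f_max_hz:
        raise ValueError("f_hz must lie in [0, f_max_hz]")
    if not 0.0 < f_cut_hz < f_max_hz:
        raise ValueError("f_cut_hz must lie strictly between 0 and f_max_hz")
    if alpha <= 0.0 or alpha * f_cut_hz >= f_max_hz:
        raise ValueError("alpha is out of range for this break point")

    if f_hz <= f_cut_hz:
        # Lower segment: plain scaling by the speaker-specific factor.
        return alpha * f_hz

    # Upper segment: slope chosen so the warp reaches f_max_hz exactly.
    slope = (f_max_hz - alpha * f_cut_hz) / (f_max_hz - f_cut_hz)
    return alpha * f_cut_hz + slope * (f_hz - f_cut_hz)


def pick_warp_factor(score_fn, candidates):
    """Return the candidate alpha with the highest score.

    score_fn is an assumed callable that maps alpha to a goodness value,
    for example the likelihood of a speaker's data under an acoustic model.
    """
    return max(candidates, key=score_fn)
```

In a typical setup, a warp factor would be chosen per speaker from a small grid of candidates (for instance with `pick_warp_factor`), and the warped frequency axis would then be used when placing the filterbank during feature extraction. The grid, the scoring function, and the break point would all need to be set for the data at hand.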

Keywords

Main Subjects
