Iranian Journal of Biomedical Engineering (IJBME)

Recognition of Persian Words Based on Facial Electromyogram Signals

Article Type: Full Research Paper

Authors

1 Ph.D. Student, Department of Electrical Engineering, Faculty of Electrical and Computer Engineering, Semnan University, Semnan, Iran

2 Associate Professor, Department of Electrical Engineering, Faculty of Electrical and Computer Engineering, Semnan University, Semnan, Iran

3 Associate Professor, Department of Biomedical Engineering, Semnan University, Semnan, Iran

Abstract
Loss of the voice and larynx is a major problem for people with speech disorders. It has serious negative consequences for the personal and social quality of life of these individuals, especially in work environments. Developing an intelligent system based on electromyogram signals that can recognize speech without using sound could open a window of hope for people who have lost their larynx and voice to cancer. Although research in this area has been carried out for various languages, no study has addressed Persian. In this paper, for the first time, Persian word recognition is performed using the electromyogram of facial muscles. For this purpose, sEMG signals were collected from 8 facial muscles of 6 volunteers while they uttered 12 Persian words. The MFL, VAR, DAMV, LTKE, IQR, and Cardinality features were then extracted from each channel and each window of the signal, and the 432 features obtained from each signal were reduced to 49 features using principal component analysis. Finally, to recognize the 12 Persian words, the features were fed to SVM, KNN, and RF classifiers, yielding average classification accuracies of 83.16%, 81.91%, and 78.97%, respectively. These results indicate that, using EMG signals, a limited vocabulary of Persian words can be recognized with good accuracy.
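The per-window feature extraction described in the abstract can be sketched as follows. The exact definitions used in the paper are not given here, so the formulas below follow common choices in the EMG literature (log bases for MFL/LTKE, the rounding threshold used for Cardinality), and the 9-window segmentation is an inference: 8 channels × 6 features × 9 windows reproduces the reported 432 features per signal.

```python
import numpy as np

def window_features(w):
    """Six time-domain features from one analysis window of one sEMG channel.
    Definitions follow common EMG practice; parameter choices are assumptions,
    not taken from the paper."""
    d = np.diff(w)
    var = np.var(w, ddof=1)                            # VAR: sample variance
    iqr = np.percentile(w, 75) - np.percentile(w, 25)  # IQR: interquartile range
    damv = np.mean(np.abs(d))                          # DAMV: difference absolute mean value
    mfl = np.log10(np.sqrt(np.sum(d ** 2)))            # MFL: maximum fractal length
    tke = w[1:-1] ** 2 - w[:-2] * w[2:]                # Teager-Kaiser energy operator
    ltke = np.log(np.mean(np.abs(tke)) + 1e-12)        # LTKE: log of mean |TKE|
    card = len(np.unique(np.round(w, 2)))              # Cardinality (rounding step assumed)
    return [mfl, var, damv, ltke, iqr, card]

def signal_features(sig, n_windows=9):
    """sig: (n_channels, n_samples) array.
    8 channels x 9 windows x 6 features = 432 features per signal."""
    feats = []
    for ch in sig:
        for w in np.array_split(ch, n_windows):
            feats.extend(window_features(w))
    return np.array(feats)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 1800))   # synthetic stand-in for one sEMG recording
print(signal_features(x).shape)      # -> (432,)
```

With real recordings, `x` would hold the band-pass-filtered sEMG from the 8 facial electrodes for one spoken word.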


Article Title (English)

Persian Words Recognition based on Facial Electromyogram Signals

Authors (English)

Pooria Sharifi 1
Hadi Soltanizadeh 2
Ali Maleki 3
1 Ph.D. Student, Electrical and Computer Engineering Department, Semnan University, Semnan, Iran
2 Associate Professor, Electrical and Computer Engineering Department, Semnan University, Semnan, Iran
3 Associate Professor, Biomedical Engineering Department, Semnan University, Semnan, Iran
Abstract (English)

Loss of the voice and larynx is a major problem for people with speech disorders. It has serious negative consequences for the personal and social quality of life of these people, especially in working environments. The development of an intelligent system based on electromyogram signals with the ability to recognize speech (without using sound) can be a window of hope for people who have lost their larynx and voice due to cancer. Although studies in this field are growing for different languages, they have not been done for the Persian language. In this article, for the first time, recognition of Persian words was performed using the electromyogram of facial muscles. For this purpose, sEMG signals were collected from eight facial muscles of six volunteers while they spoke twelve Persian words. Then, MFL, VAR, DAMV, LTKE, IQR, and Cardinality features were extracted from each channel and each window of the signal, and the 432 features from each signal were reduced to 33 features using principal component analysis (PCA). Finally, in order to recognize the twelve Persian words, the features were given to SVM, KNN, and RF classifiers. The average classification accuracies were 83.16%, 81.91%, and 78.97%, respectively. Our evaluation suggests that, by using EMG signals, it is possible to recognize a limited vocabulary of Persian words.
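The dimensionality-reduction and classification stage can be sketched with scikit-learn. The data below is a synthetic stand-in (the paper's sEMG recordings are not available here), the classifier hyperparameters are library defaults rather than the authors' settings, and the retained-component count is taken from the abstract (the Persian version reports 49, the English version 33).

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 12 word classes, 20 repetitions each, 432 features per signal.
rng = np.random.default_rng(42)
y = np.repeat(np.arange(12), 20)
X = rng.standard_normal((y.size, 432)) + 0.3 * y[:, None]  # separable toy data

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

accs = {}
for name, clf in [("SVM", SVC()),
                  ("KNN", KNeighborsClassifier()),
                  ("RF", RandomForestClassifier(random_state=0))]:
    # Standardize, project onto the leading principal components, then classify.
    pipe = make_pipeline(StandardScaler(), PCA(n_components=49), clf)
    accs[name] = pipe.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {accs[name]:.2%}")
```

On the toy data all three classifiers score near 100%; the abstract's 78-84% figures reflect the much harder real sEMG task.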

Keywords (English)

Facial Electromyogram
Persian Words
Silent Speech Interface
Word Recognition
Volume 16, Issue 3
Autumn 2022
Pages 231-244

  • Received: 22 August 2022 (31 Mordad 1401)
  • Revised: 26 January 2023 (6 Bahman 1401)
  • Accepted: 7 February 2023 (18 Bahman 1401)