Automatic Speech Recognition Enhanced Self-Instructional Learning Material to Improve English Pronunciation for Indonesian Primary School Students

Muh Shofiyuddin; Erna Zumrotun; Noor Azizah; Muh Muhaimin; Yudie Irawan

Keywords:

self-instructional material; Automatic Speech Recognition; English Pronunciation; Young Learner; Primary School

Abstract

Young learners' English pronunciation is often limited by insufficient exposure to the target language, which affects their self-confidence and communication skills. This study aimed to analyze teachers' needs in teaching pronunciation and to develop interactive, systematic, and child-specific self-instructional material based on Automatic Speech Recognition (ASR) to improve pronunciation mastery. The research followed a Research and Development approach using the ADDIE model (analysis, design, development, implementation, and evaluation), but it was limited to the analysis, design, and prototype development stages, without field testing. Two English language education experts validated the material using a questionnaire with a 1–4 scale and open-ended comments, assessing material content, language suitability, visual design, ASR function, illustrations and audio recordings, and repeated practice. The validation results indicated that the material was of good to excellent quality, with ASR integration enabling repeated independent practice and real-time feedback and supporting motivation and learning independence. These findings suggest that the material can support pronunciation practice systematically, although the study was limited to a prototype and a small number of validators. The contribution of this research lies in providing valid, practical, and adaptive ASR-based interactive learning media that address the limitations of teachers and media in supporting pronunciation practice in elementary schools. The material can help teachers in classroom instruction and students in self-directed learning through the ASR feature.

https://doi.org/10.26803/ijlter.25.2.37
