Automatic Speech Recognition Enhanced Self-Instructional Learning Material to Improve English Pronunciation for Indonesian Primary School Students
Keywords:
self-instructional material; Automatic Speech Recognition; English Pronunciation; Young Learner; Primary School
Abstract
Young learners' English pronunciation is often limited by insufficient exposure to the target language, which in turn affects their self-confidence and communication skills. This study aimed to analyze teachers' needs in teaching pronunciation and to develop interactive, systematic, child-specific self-instructional material based on Automatic Speech Recognition (ASR) to improve pronunciation mastery. The research followed a Research and Development approach using the ADDIE model (analysis, design, development, implementation, and evaluation), but was limited to the analysis, design, and prototype development stages, without field testing. Two English language education experts validated the material using a questionnaire with a 1–4 scale and open-ended comments, assessing material content, language suitability, visual design, ASR function, illustrations and audio recordings, and repeated practice. The validators rated the material as good to excellent and judged that ASR integration enables repeated independent practice and real-time feedback and can increase motivation and learning independence. These findings indicate that the material can support pronunciation learning systematically, although the evidence is limited to a prototype and a small number of validators. The contribution of this research lies in providing valid, practical, and adaptive ASR-based interactive learning media that address the limitations of teachers and media in supporting pronunciation practice in elementary schools. The material can support teachers in classroom instruction and students in self-directed learning through its ASR feature.
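To make the idea of ASR-supported repeated practice concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that some ASR engine (unspecified in the abstract) has already turned each spoken attempt into a text transcript. The function names, the character-level similarity measure, and the 0.8 pass threshold are all hypothetical choices made for illustration only.

```python
from difflib import SequenceMatcher


def pronunciation_feedback(target: str, recognized: str, threshold: float = 0.8) -> dict:
    """Compare an ASR transcript with the target word or phrase.

    Assumption: `recognized` is the text returned by an ASR engine.
    The character-level similarity is only a stand-in for a real
    pronunciation score, and the threshold is arbitrary.
    """
    # Normalize case and surrounding whitespace before comparing.
    t = target.strip().lower()
    r = recognized.strip().lower()
    score = SequenceMatcher(None, t, r).ratio()
    return {
        "target": t,
        "recognized": r,
        "score": round(score, 2),
        "passed": score >= threshold,
    }


def practice(target: str, transcripts: list, threshold: float = 0.8) -> list:
    """Return feedback for each attempt, stopping at the first attempt that passes."""
    history = []
    for recognized in transcripts:
        feedback = pronunciation_feedback(target, recognized, threshold)
        history.append(feedback)
        if feedback["passed"]:
            break
    return history


if __name__ == "__main__":
    # Hypothetical transcripts of three attempts at the word "apple".
    for attempt in practice("apple", ["able", "apple", "apple"]):
        print(attempt)
```

In this sketch the learner keeps trying until the transcript is close enough to the target, which mirrors the repeated, independent practice with immediate feedback described above. A production system would likely score pronunciation at the phoneme level rather than by transcript similarity.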
https://doi.org/10.26803/ijlter.25.2.37
Copyright (c) 2026 Muh Shofiyuddin, Erna Zumrotun, Noor Azizah, Muh Muhaimin, Yudie Irawan

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
All articles published by IJLTER are licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License (CC BY-NC-ND 4.0).