Efficient and Explainable Transformer-Based Models for Low-Resource Language Understanding in Code-Switched Contexts

Authors

  • Eric Kiriinya, Research Scholar
  • Brahmaleen Kaur Sidhu

Keywords:

Code-switching, Lexical ambiguity, Low-Resource Languages (LRLs), Model compression, Model efficiency, Natural language processing, Pruning, Quantization, Transformer models

Abstract

Transformer models deliver state-of-the-art performance across Natural Language Processing (NLP). In Low-Resource Languages (LRLs) and code-switched contexts, however, they face scarce training data, complex linguistic structures, and tight computational budgets. Code-switching, in which speakers blend multiple languages within a single conversation, sharply increases lexical ambiguity and disrupts syntactic expectations, because the language can change at unpredictable points. Multilingual transformer models such as mBERT and XLM-RoBERTa tend to underperform in these settings because they are pretrained largely on web-scraped corpora that do not reflect the fluidity of code-switched text. This paper addresses these issues with a focus on both model efficiency and explainability. Model compression through pruning, quantization, and knowledge distillation reduces model size while preserving NLP quality. Explainability techniques such as attention visualization, gradient-based attribution, and rationalization are used to interpret and clarify model decisions, increasing trust when these models are applied in sensitive contexts. The paper also analyzes socio-political issues of language inequity and argues that NLP should be designed for LRLs to help close the digital divide. Finally, a conceptual design is offered for building efficient, explainable transformer architectures for low-resource, code-switched languages, centered on transfer learning, data augmentation, and cross-lingual transfer. This research promotes inclusivity and transparency in NLP technologies, particularly for underserved multilingual regions.
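As a minimal illustration of the compression techniques named above (a sketch, not drawn from the paper itself), the Python example below applies magnitude pruning and post-training dynamic quantization to a multilingual transformer using PyTorch and Hugging Face Transformers. The model name, pruning ratio, and label count are illustrative assumptions.

# Illustrative sketch only (not from the paper): compressing a multilingual
# transformer with magnitude pruning and post-training dynamic quantization.
# Model name, pruning ratio, and label count are assumptions for demonstration.
import torch
import torch.nn.utils.prune as prune
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2
)

# 1) Unstructured L1 (magnitude) pruning: zero out the 30% smallest weights
#    in every linear layer, then make the sparsity permanent.
for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.3)
        prune.remove(module, "weight")

# 2) Post-training dynamic quantization: store linear-layer weights in int8
#    and quantize activations on the fly, reducing memory use and speeding
#    up CPU inference.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

print(quantized)  # compressed model ready for CPU inference on code-switched text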
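Likewise, the sketch below illustrates one of the explainability techniques mentioned above, gradient-based attribution, as token-level input-gradient saliency on a code-switched sentence. The model, the Swahili-English example sentence, and the label count are illustrative assumptions, not details taken from the paper.

# Illustrative sketch only (not from the paper): gradient-based token attribution
# (input-gradient saliency) for a code-switched sentence.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
model.eval()

text = "Nimependa hiyo movie, it was awesome"  # assumed Swahili-English example
enc = tokenizer(text, return_tensors="pt")

# Embed the tokens ourselves so gradients can be taken with respect to them.
embeds = model.get_input_embeddings()(enc["input_ids"]).detach()
embeds.requires_grad_(True)

logits = model(inputs_embeds=embeds, attention_mask=enc["attention_mask"]).logits
pred = logits.argmax(dim=-1).item()
score = logits[0, pred]  # logit of the predicted class
score.backward()

# Saliency per token: L2 norm of the gradient of the predicted logit with
# respect to that token's embedding; larger values suggest more influence.
saliency = embeds.grad.norm(dim=-1).squeeze(0)
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
for tok, s in zip(tokens, saliency.tolist()):
    print(f"{tok:>12s}  {s:.4f}")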


Published

2025-07-22

How to Cite

Kiriinya, E., & Sidhu, B. K. (2025). Efficient and Explainable Transformer-Based Models for Low-Resource Language Understanding in Code-Switched Contexts. Journal of Data Mining and Management, 10(2), 37–44. Retrieved from https://matjournals.net/engineering/index.php/JoDMM/article/view/2208

Section

Articles