AI Voice-Based Automated Form Filling System
Keywords:
Artificial Intelligence (AI), Automatic Speech Recognition (ASR), Banking automation, Natural Language Processing (NLP), PDF generation, Speech recognition, Voice-based form filling
Abstract
Manual completion of banking forms is time-consuming and error-prone, and it is especially difficult for elderly, visually impaired, and physically challenged individuals. To address these issues, an AI Voice-Based Automated Form Filling System is proposed that lets users complete banking forms through voice interaction. The system asks the user questions about personal and banking details, captures each spoken response with Automatic Speech Recognition (ASR), and converts it to text, which reduces the need for manual typing. Natural Language Processing (NLP) techniques then extract and organize the relevant information, and validation checks confirm the correctness and consistency of fields such as names, phone numbers, and identification details. After successful validation, the collected data is entered automatically into the selected banking form, and a completed PDF document is generated. The proposed system improves form-filling efficiency, reduces human error, and makes banking forms more accessible to users with limited technical skills. Experimental observations indicate that the system significantly reduces the time required to complete a form compared with traditional manual methods. The integration of AI, speech recognition, and NLP provides a user-friendly and efficient solution for modern banking applications.
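The validation step described in the abstract can be illustrated with a minimal sketch. It is not the system's actual implementation: the function names, the spoken-digit table, the 10-digit phone length, and the Latin-alphabet name rule are all assumptions made for illustration. It takes text that an ASR component has already produced, normalizes it, and reports which fields the system would need to ask about again.

```python
import re

# Hypothetical spoken-digit table; a real ASR transcript may need more cases
# (for example "double five"), which this sketch does not handle.
_DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def normalize_digits(transcript):
    """Turn a spoken number such as 'nine eight 7' into the digit string '987'."""
    tokens = re.findall(r"[a-z]+|\d", transcript.lower())
    # Unknown words are dropped; literal digits pass through unchanged.
    return "".join(
        _DIGIT_WORDS.get(tok, tok if tok.isdigit() else "") for tok in tokens
    )


def validate_phone(transcript, length=10):
    """Return the digit string, or None if it is not `length` digits long.

    The default of 10 digits is an assumption, not a rule from the paper.
    """
    digits = normalize_digits(transcript)
    return digits if len(digits) == length else None


def validate_name(transcript):
    """Return a tidied name, or None if it contains unexpected characters.

    Assumes Latin-alphabet names; other scripts would need a different rule.
    """
    name = " ".join(transcript.split()).title()
    if not re.fullmatch(r"[A-Za-z][A-Za-z .'-]*", name):
        return None
    return name


# Hypothetical field names mapped to their validators.
VALIDATORS = {"name": validate_name, "phone": validate_phone}


def fill_fields(transcripts):
    """Validate each transcribed answer.

    Returns (filled, retry): accepted values keyed by field, and the list of
    fields whose answers failed validation and should be asked again.
    """
    filled, retry = {}, []
    for field, text in transcripts.items():
        value = VALIDATORS[field](text)
        if value is None:
            retry.append(field)
        else:
            filled[field] = value
    return filled, retry
```

In a sketch like this, only the entries in `filled` would be written to the form, and the PDF would be generated once `retry` is empty.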