Generative Artificial Intelligence for Visual Applications: Architectures, Applications, and Challenges

Priyanka Dinesh Patil; Sai Takawale; Prasad Bhosle

Keywords:

Deep learning, Diffusion models, Generative adversarial networks (GANs), Generative artificial intelligence, Variational autoencoders (VAEs), Visual applications

Abstract

Generative artificial intelligence (generative AI) is one of the most transformative recent advances in computing, particularly for visual applications. Its ability to generate, reconstruct, and enhance visual content has redefined the boundaries of creativity, automation, and perception. This paper presents a systematic review of generative AI in visual domains, focusing on three dominant architectures: generative adversarial networks (GANs), variational autoencoders (VAEs), and diffusion models. The methodology comprises a structured literature search across major databases, screening of studies against defined inclusion criteria, and synthesis of the results into thematic insights. The review identifies core technical principles, major datasets, evaluation metrics, and application areas such as image synthesis, video generation, and 3D content creation. It also discusses open challenges in reproducibility, computational cost, and interpretability, along with ethical concerns such as bias and misinformation. The paper concludes with emerging trends, including controllable generation, multimodal fusion, and sustainability-oriented generative modeling, with the aim of guiding future research toward responsible and transparent visual AI.
