Self-Supervised Learning in Computer Vision: Enhancing Model Accuracy with Limited Labeled Data

Authors

  • Samarth Jadhav
  • Manish Dhatrak
  • S. K. Gupta
  • N. Y. Siddiqui

Abstract

Recent advances in deep learning have transformed computer vision tasks like object detection, segmentation, and classification. Still, they rely heavily on large, labeled datasets that are expensive and time-consuming to gather. Self-Supervised Learning (SSL) offers a promising alternative by enabling models to learn meaningful features from unlabeled data, reducing the need for extensive labeling. This paper presents a novel SSL architecture combining contrastive learning and Vision Transformers (ViTs) to improve model accuracy with minimal labeled data. Contrastive learning enables the model to better distinguish similar images from dissimilar ones, while Vision Transformers capture both local and global image patterns. Our approach is evaluated on standard datasets like CIFAR-10 and ImageNet, where it achieves competitive accuracy even with minimal labeled data, significantly closing the gap with fully supervised models. We provide detailed quantitative and qualitative analyses through performance graphs, accuracy tables, and visualizations of learned representations, demonstrating the effectiveness of SSL in reducing data dependency while maintaining high performance in computer vision tasks.
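The abstract does not give the exact loss formulation, but the contrastive objective it describes is commonly implemented as an NT-Xent (normalized temperature-scaled cross-entropy) loss over two augmented views of each image, where matching views form positive pairs and all other images in the batch serve as negatives. The sketch below is an illustrative NumPy implementation of that standard loss, not the authors' code; the function name, temperature value, and batch layout are assumptions.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """Illustrative NT-Xent contrastive loss (not the paper's exact code).

    z1, z2: (N, D) embeddings of two augmented views of the same N images.
    Positive pairs are (z1[i], z2[i]); every other embedding in the batch
    acts as a negative.
    """
    z = np.concatenate([z1, z2], axis=0)               # (2N, D) stacked views
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize rows
    sim = z @ z.T / temperature                        # scaled cosine similarities
    n = z1.shape[0]
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    # Index of the positive partner for each row: i pairs with i + n (mod 2N).
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    # Row-wise log-softmax, then pick out the positive pair's log-probability.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

When the two views of each image embed close together (and far from other images), the loss is low; misaligned pairs drive it up, which is the signal the encoder is trained on.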

Published

2024-12-06

How to Cite

Jadhav, S., Dhatrak, M., Gupta, S. K., & Siddiqui, N. Y. (2024). Self-Supervised Learning in Computer Vision: Enhancing Model Accuracy with Limited Labeled Data. Journal of Data Engineering and Knowledge Discovery, 1(3), 30–46. Retrieved from https://matjournals.net/engineering/index.php/JoDEKD/article/view/1167

Section

Articles