Design and Implementation of an AI-Based 3D Crime Scene Reconstruction from Visual Evidence

Authors

  • Pragna C. P
  • Keerthana B
  • Lekhana K
  • Sunayana Anuja
  • Priyanka H. V

Keywords

3D visualization, Computer vision, Crime scene reconstruction, Object detection, Open3D, YOLOv5

Abstract

Artificial Intelligence (AI) and computer vision have significantly improved the performance of object detection systems. However, most existing approaches produce only two-dimensional (2D) outputs, which limits the understanding of spatial relationships between objects. This paper presents a system that addresses this limitation by integrating deep learning-based object detection with three-dimensional (3D) visualization. The proposed method uses the YOLOv5 model for object detection and the Open3D library to construct an interactive 3D environment. Detected objects are mapped onto textured planes and arranged within a 3D space, so users can explore the scene from different perspectives. This makes the visual data easier to interpret than traditional 2D representations. The system is designed to be simple and efficient, requiring only image input to generate a 3D visualization. Experimental results indicate improved understanding and presentation of object detection outcomes. Overall, the proposed approach provides a more intuitive way to analyse visual data, with potential applications in surveillance, research, and smart monitoring systems.
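
The mapping from detections to textured planes can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Detection`, `TexturedQuad`, `box_to_quad`, and `arrange_scene` are hypothetical, the rule of one depth layer per detection is a placeholder assumption (the abstract does not say how objects are positioned in depth), and the texture coordinates assume an origin at the top-left of the image. The sketch only computes plane geometry from pixel bounding boxes of the kind a detector such as YOLOv5 reports.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical container for one detected box, in pixel coordinates."""
    label: str
    x1: float  # left
    y1: float  # top
    x2: float  # right
    y2: float  # bottom


@dataclass
class TexturedQuad:
    """Geometry for one upright plane carrying a crop of the source image."""
    label: str
    vertices: List[Tuple[float, float, float]]  # four corners in scene units
    triangles: List[Tuple[int, int, int]]       # two triangles over the corners
    uvs: List[Tuple[float, float]]              # one texture coordinate per corner


def box_to_quad(det: Detection, img_w: int, img_h: int,
                depth: float, scale: float = 1.0) -> TexturedQuad:
    """Place one detection as a quad facing the viewer at z = depth."""

    def to_scene(px: float, py: float) -> Tuple[float, float, float]:
        # Centre the image on the origin and flip y so that "up" is positive.
        # Both axes are divided by the image width to preserve aspect ratio.
        x = (px - img_w / 2.0) / img_w * scale
        y = (img_h / 2.0 - py) / img_w * scale
        return (x, y, depth)

    # Corner order: bottom-left, bottom-right, top-right, top-left.
    corners = [(det.x1, det.y2), (det.x2, det.y2),
               (det.x2, det.y1), (det.x1, det.y1)]
    vertices = [to_scene(px, py) for px, py in corners]
    # Assumption: (0, 0) is the top-left of the image; flip v if the
    # renderer expects a bottom-left origin.
    uvs = [(px / img_w, py / img_h) for px, py in corners]
    triangles = [(0, 1, 2), (0, 2, 3)]
    return TexturedQuad(det.label, vertices, triangles, uvs)


def arrange_scene(dets: List[Detection], img_w: int, img_h: int,
                  layer_gap: float = 0.1) -> List[TexturedQuad]:
    """Placeholder layout: each detection gets its own depth layer."""
    return [box_to_quad(d, img_w, img_h, depth=i * layer_gap)
            for i, d in enumerate(dets)]


if __name__ == "__main__":
    boxes = [Detection("object_a", 100, 50, 300, 150),
             Detection("object_b", 20, 20, 80, 120)]
    for quad in arrange_scene(boxes, img_w=400, img_h=200):
        print(quad.label, quad.vertices)
```

Geometry in this form (vertices, triangle indices, and per-corner texture coordinates) could then be passed to a mesh type in a 3D library such as Open3D for interactive viewing. That step is left out here because the abstract does not describe it.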

Published

2026-04-13