Prediction of House Prices using Correlation and Simple Linear Regression

Authors

  • Dipali Tupe
  • Shruti Patil
  • Khushboo Khachane
  • Ruchi Gothankar
  • Ankita Yamagar
  • Divam Machhi

Keywords

CRISP-DM, GrLivArea, Linear regression, Modelling, OverallQual, YearBuilt

Abstract

Accurate house price prediction matters for real estate, city planning, and financial decision-making. This study uses two basic statistical tools, correlation and multiple linear regression, to understand and estimate residential property prices. The main aim is to examine how factors such as house size, number of bedrooms, building age, distance from the city centre, and neighbourhood quality affect the price of a house. First, correlation analysis measures how strongly each factor is related to house price, which identifies the variables with a meaningful association and those that contribute little. The analysis shows that larger houses and better neighbourhoods are usually linked with higher prices, while older houses and properties located farther from business areas tend to have lower prices. These results clarify what drives property values in the market. After the important factors are identified, a multiple linear regression model is developed to predict house prices. The model estimates how much the price changes with each factor while the other factors are held constant. Model performance is assessed using R-squared and residual analysis to check how accurate and reliable the predictions are. Overall, the study shows that regression models, supported by correlation analysis, can explain a large portion of the variation in house prices. Although linear regression cannot capture every complex market behaviour, it is simple, easy to interpret, and efficient for estimating property values. The findings highlight that both property features and location play a major role in determining house prices, and they show how statistical methods can support better real estate decisions.
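The workflow described in the abstract (correlation screening, then a linear regression checked with R-squared and residuals) can be illustrated with a minimal sketch. The data below are synthetic and are not the study's dataset; the column names GrLivArea, OverallQual, and YearBuilt are taken from the keywords, while the coefficients, noise level, and sample size are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # assumed sample size, for illustration only

# Synthetic stand-in features (NOT the study's data); names follow the keywords.
gr_liv_area = rng.uniform(600, 3500, n)                # living area
overall_qual = rng.integers(1, 11, n).astype(float)    # quality score, 1 to 10
year_built = rng.integers(1900, 2011, n).astype(float)

# Assumed linear price process with Gaussian noise (hypothetical coefficients).
sale_price = (
    20_000
    + 60 * gr_liv_area
    + 15_000 * overall_qual
    + 300 * (year_built - 1900)
    + rng.normal(0, 20_000, n)
)

features = {
    "GrLivArea": gr_liv_area,
    "OverallQual": overall_qual,
    "YearBuilt": year_built,
}

# Step 1: Pearson correlation of each feature with price.
correlations = {
    name: float(np.corrcoef(x, sale_price)[0, 1]) for name, x in features.items()
}

# Step 2: ordinary least squares with an intercept column.
X = np.column_stack([np.ones(n)] + list(features.values()))
coef, *_ = np.linalg.lstsq(X, sale_price, rcond=None)

# Step 3: R-squared and residuals as basic fit diagnostics.
predicted = X @ coef
residuals = sale_price - predicted
ss_res = float(np.sum(residuals ** 2))
ss_tot = float(np.sum((sale_price - sale_price.mean()) ** 2))
r_squared = 1.0 - ss_res / ss_tot

for name, r in correlations.items():
    print(f"corr({name}, SalePrice) = {r:.3f}")
for name, b in zip(["intercept", *features], coef):
    print(f"{name}: {b:.2f}")
print(f"R-squared = {r_squared:.3f}")
print(f"mean residual = {residuals.mean():.6f}")
```

Each fitted coefficient is read as the change in predicted price per unit change in that feature with the other features held constant, which is the interpretation the abstract describes. On real data, a plot of residuals against predicted values would be the usual next check for non-linearity or unequal variance.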

Published

2026-06-10

How to Cite

Tupe, D., Patil, S., Khachane, K., Gothankar, R., Yamagar, A., & Machhi, D. (2026). Prediction of House Prices using Correlation and Simple Linear Regression. Journal of Business Analytics and Data Visualization (e-ISSN: 2584-1637), 14–28. Retrieved from https://matjournals.net/engineering/index.php/JBADV/article/view/3696