Price Prediction and Determination of the Affecting Variables of the Real Estate by Using X-Means Clustering and CART Decision Trees


YÜCEBAŞ S. C., Yalpir S., GENÇ L., Dogan M.

Journal of Universal Computer Science, vol.30, no.4, pp.531-560, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 30, Issue: 4
  • Publication Date: 2024
  • DOI: 10.3897/jucs.98733
  • Journal Name: Journal of Universal Computer Science
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Directory of Open Access Journals
  • Page Numbers: pp.531-560
  • Keywords: Classification and regression tree, Machine learning, prediction methods, Real estate, X-Means clustering
  • Affiliated with Çanakkale Onsekiz Mart University: Yes

Abstract

The use of machine learning in real estate is relatively new. When the study area is large, the factors affecting price may vary with geographical region and socioeconomic conditions. A model that reflects these differences is expected to predict prices better than a single general model. Unsupervised learning methods can be used both to improve performance and to show how the factors affecting price vary across regions. To this end, this study established a hybrid model of X-Means clustering and CART decision trees. The model successfully learned the geographical and physical variables that affect price. Its prediction performance was compared with the direct capitalization method, which is the gold standard in the domain. The hybrid model outperformed direct capitalization in terms of mean square error, root mean square error and adjusted R-squared. The scores were 72.86, 0.0057 and 0.978, respectively. The effect of clustering was also examined: clustering increased prediction performance by 36%.
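
The three metrics named in the abstract can be illustrated with a minimal sketch using their standard textbook definitions. This is not the authors' evaluation code: the abstract does not state the scale or units on which each score was computed, and the function names, the toy price values and the feature count `n_features=2` below are assumptions made purely for illustration.

```python
import math


def mse(y_true, y_pred):
    """Mean square error: average of the squared residuals."""
    n = len(y_true)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n


def rmse(y_true, y_pred):
    """Root mean square error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def adjusted_r2(y_true, y_pred, n_features):
    """Adjusted R-squared: R-squared penalized for the number of predictors."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - n_features - 1)


# Hypothetical observed and predicted prices (illustrative values only).
y_true = [100.0, 150.0, 200.0, 250.0, 300.0]
y_pred = [110.0, 140.0, 210.0, 240.0, 310.0]

print(mse(y_true, y_pred))
print(rmse(y_true, y_pred))
print(adjusted_r2(y_true, y_pred, n_features=2))
```

Lower MSE and RMSE and a higher adjusted R-squared indicate a better fit, so a comparison such as the one against direct capitalization would evaluate both methods on the same held-out observations.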