International Journal of Housing Markets and Analysis, 2025 (ESCI)
Purpose: This study compares popular and effective machine learning models for housing price prediction. Its primary objective is to develop a hybrid Stacking Regressor model that combines multiple regression algorithms through a meta-model, leveraging their individual strengths to enhance prediction accuracy.

Design/methodology/approach: Widely used machine learning algorithms, including CatBoost, XGBoost, Random Forest, Extra Trees, Hist Gradient Boosting and Gradient Boosting, were evaluated on housing price prediction using various error metrics. Feature engineering and parameter optimization were applied to improve model performance, yielding significant improvements, particularly for Random Forest and Extra Trees. A Stacking Regressor model was then constructed by integrating multiple regression algorithms to capitalize on their collective predictive capabilities.

Findings: CatBoost achieved the lowest error rates among the individual models evaluated. Random Forest and XGBoost performed comparably, whereas Gradient Boosting exhibited higher error rates. The hybrid Stacking Regressor model outperformed all individual algorithms, demonstrating superior predictive accuracy. These findings underscore the potential of integrating machine learning models to handle complex data sets and improve overall model performance.

Originality/value: This study emphasizes data preprocessing and feature engineering, which are often overlooked in prior research but are critical to the success of machine learning models. It also contributes to the field by proposing a hybrid model, the Stacking Regressor, which combines multiple regression algorithms and uses a meta-model to integrate the strengths of the base models, thereby aiming to improve prediction accuracy.
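The stacking approach described in the abstract can be illustrated with a minimal sketch using scikit-learn's `StackingRegressor`. This is not the study's implementation. The synthetic data, the three base learners, the `RidgeCV` meta-model, the hyperparameters and the choice of MAE and RMSE as error metrics are all assumptions made for illustration. CatBoost and XGBoost are left out only to keep the example free of extra dependencies.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    ExtraTreesRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import RidgeCV
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a housing data set (assumption: not the study's data).
X, y = make_regression(n_samples=400, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Base regressors; the selection and settings here are illustrative only.
base_models = [
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ("et", ExtraTreesRegressor(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingRegressor(random_state=0)),
]

# The meta-model (RidgeCV, an assumption) is trained on out-of-fold
# predictions of the base models, produced with 5-fold cross-validation.
stack = StackingRegressor(estimators=base_models, final_estimator=RidgeCV(), cv=5)
stack.fit(X_train, y_train)


def evaluate(model):
    """Return (MAE, RMSE) of a fitted model on the held-out test split."""
    pred = model.predict(X_test)
    mae = mean_absolute_error(y_test, pred)
    rmse = mean_squared_error(y_test, pred) ** 0.5
    return mae, rmse


# Score the stacked model and each fitted base model on the same test split.
scores = {"stack": evaluate(stack)}
for (name, _), fitted in zip(base_models, stack.estimators_):
    scores[name] = evaluate(fitted)

for name, (mae, rmse) in scores.items():
    print(f"{name:>5}: MAE={mae:.2f}  RMSE={rmse:.2f}")
```

Whether the stacked model beats each base model depends on the data and tuning, so the ranking printed by this sketch on synthetic data should not be read as a reproduction of the study's findings.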