Construction Schedule Prediction for Condominium Projects in Addis Ababa: A Comparative Analysis of Random Forest and Alternative Approaches
Abstract
Accurate construction duration estimation is critical for scheduling and resource allocation, yet traditional heuristic techniques often result in severe schedule overruns. This weakness is especially evident in Addis Ababa’s public housing sector (the 20/80 and 40/60 condominium schemes), where over 38,000 units face chronic, multi-fold delays attributed to data deficiencies and rigid planning tools. This study develops an interpretable, context-specific, Python-based machine learning model for duration prediction, using historical data from 2013 to 2023 on 595 building blocks compiled by the Addis Ababa Housing Development and Administration Bureau (AAHDB). A hybrid feature selection pipeline combining embedded feature selection with correlation matrix analysis reduced twelve initial variables to seven predictors, which integrate key physical building attributes with operational and seasonal metrics. Four machine learning algorithms were benchmarked: Random Forest (RF), Support Vector Machine (SVM), Multiple Linear Regression (MLR), and an Artificial Neural Network (ANN). The results yielded a clear performance ranking: RF > SVM > MLR > ANN. The Random Forest model achieved a coefficient of determination (R²) of 0.999 with minimal errors (RMSE = 0.081 months, MAE = 0.015 months). The non-linear SVM was the closest competitor, capturing complex multidimensional patterns with an R² of 0.976 (RMSE = 1.09 months), whereas MLR (R² = 0.822) and the data-constrained ANN (R² = 0.665) underperformed. Given the very high accuracy reported for the ensemble tree and kernel models, independent k-fold cross-validation is recommended before implementation to confirm that the results generalize.
Ultimately, this research addresses long-standing ‘black-box’ concerns by balancing model interpretability with data-driven precision, offering public planners and construction practitioners an effective management tool to mitigate local housing project delays.
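The evaluation metrics quoted in the abstract (R², RMSE, MAE) and the recommended k-fold cross-validation can be illustrated with a minimal sketch. This is not the authors' code: the function names `regression_metrics` and `k_fold_indices` are hypothetical, the sample durations are invented for illustration and are not drawn from the AAHDB dataset, and the choice of five folds is an assumption. Only the Python standard library is used.

```python
import math
import random


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for observed and predicted durations."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for a shuffled k-fold split."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# Invented durations in months, for illustration only
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
r2, rmse, mae = regression_metrics(observed, predicted)
print(f"R2={r2:.3f}  RMSE={rmse:.3f} months  MAE={mae:.3f} months")

# Fold sizes for a dataset the size of the one described (595 blocks)
splits = list(k_fold_indices(595, k=5))
print([len(test) for _, test in splits])
```

In a cross-validated benchmark, each candidate model would be refit on every `train_idx` subset and scored on the matching `test_idx` subset, and the metrics would be averaged across folds. A large gap between the fold-averaged scores and a single-split R² near 1 would suggest that the single-split figure does not generalize.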

This work is licensed under a Creative Commons Attribution 4.0 International License.