Integration of the Machine Learning Algorithms and I-MR Statistical Process Control for Solar Energy


Atalan Y. A., Atalan A.

Sustainability (Switzerland), vol. 15, no. 18, 2023 (SCI-Expanded)

  • Publication Type: Article
  • Volume: 15 Issue: 18
  • Publication Date: 2023
  • DOI Number: 10.3390/su151813782
  • Journal Name: Sustainability (Switzerland)
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Social Sciences Citation Index (SSCI), Scopus, Aerospace Database, Agricultural & Environmental Science Database, CAB Abstracts, Communication Abstracts, Food Science & Technology Abstracts, Geobase, INSPEC, Metadex, Veterinary Science Database, Directory of Open Access Journals, Civil Engineering Abstracts
  • Keywords: AdaBoost, gradient boosting, I-MR control chart, linear regression, machine learning, random forest, solar energy, statistical process control
  • Çanakkale Onsekiz Mart University Affiliated: No

Abstract

The importance of solar power generation facilities, one of the renewable energy types, is increasing daily. This study proposes a two-way validation approach that verifies forecasts of solar energy production by integrating machine learning (ML) with I-MR statistical process control (SPC) charts. Estimates of the amount of solar energy produced were obtained with four ML algorithms: random forest (RF), linear regression (LR), gradient boosting (GB), and adaptive boosting, or AdaBoost (AB). Eight independent variables covering environmental and geographical factors were used. The data set covers 636 days of solar energy production, approximately two years. The study consisted of three stages. First, descriptive statistics and analysis of variance tests were performed on the dependent and independent variables. Second, estimates of the amount of solar energy production, the dependent variable, were obtained from the AB, RF, GB, and LR algorithms. The AB algorithm performed best among the ML models, with the lowest RMSE, MSE, and MAE values and the highest R² value for the forecast data. In the estimation phase, the RMSE, MSE, MAE, and R² values of the AB algorithm were 0.328, 0.107, 0.134, and 0.909, respectively. The RF algorithm performed worst on the prediction data, with RMSE, MSE, MAE, and R² values of 0.685, 0.469, 0.503, and 0.623, respectively. In the last stage, the estimates were tested with I-MR control charts, a statistical process control tool. Across all three stages, the study aimed to validate the results by integrating the two techniques. It therefore offers a critical perspective by demonstrating a two-way verification approach for assessing whether a system’s forecast data are under control for the future.
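
The last stage described above, checking forecast values against an I-MR chart, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`imr_limits`, `out_of_control`) and the sample values are hypothetical, and the limits use the standard control chart constants for moving ranges of two consecutive observations (d2 = 1.128, D4 = 3.267).

```python
from statistics import mean

# Standard control chart constants for moving ranges of span 2.
D2 = 1.128  # converts the average moving range into a sigma estimate
D4 = 3.267  # upper-limit factor for the moving range (MR) chart


def imr_limits(values):
    """Compute center lines and 3-sigma limits for an I-MR chart.

    `values` is a sequence of individual observations, for example
    daily forecast values (hypothetical input, not the paper's data).
    """
    if len(values) < 2:
        raise ValueError("an I-MR chart needs at least two observations")

    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    sigma_hat = mr_bar / D2  # short-term sigma estimate

    return {
        "i_center": x_bar,
        "i_ucl": x_bar + 3 * sigma_hat,
        "i_lcl": x_bar - 3 * sigma_hat,
        "mr_center": mr_bar,
        "mr_ucl": D4 * mr_bar,
        "mr_lcl": 0.0,  # the lower MR limit is zero for span 2
    }


def out_of_control(values, limits):
    """Return indices of observations outside the individuals (I) limits."""
    return [
        i
        for i, v in enumerate(values)
        if v > limits["i_ucl"] or v < limits["i_lcl"]
    ]


if __name__ == "__main__":
    # Hypothetical baseline forecasts used to set the limits.
    baseline = [10, 12, 11, 13, 12]
    limits = imr_limits(baseline)
    print(limits)

    # Hypothetical new forecasts checked against the baseline limits.
    print(out_of_control([10, 20, 11], limits))
```

In this sketch the limits come from a baseline series and are then applied to new values; points outside the individuals limits would be flagged as signals that the forecast process is not under statistical control.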