RIDGE REGRESSION IN MACHINE LEARNING AND AN APPLICATION


Söküt Açar T.

3rd International Conference On Applied Sciences, Cape-Town, South Africa, 24 - 29 January 2023, pp. 1

  • Publication Type: Conference Paper / Abstract
  • City of Publication: Cape-Town
  • Country of Publication: South Africa
  • Pages: pp. 1
  • Affiliated with Çanakkale Onsekiz Mart University: Yes

Abstract

Machine learning (ML) is the field that uses data and algorithms to imitate the way humans learn. With rapidly advancing technology and the associated growth of data, ML is one of the fastest-growing technical domains, and it lies at the intersection of computer science and statistics. Supervised, unsupervised, semi-supervised, and reinforcement learning are the main learning paradigms in ML. In supervised machine learning, the data set is split into training and test data. The training data are labeled, and the computer learns a function from these labeled data. This fitted function is then used to predict the target variable for unlabeled test data. If the target variable is quantitative, this process is based on regression analysis. Ridge regression, also known as L2-regularized regression, is one of the regression techniques used in machine learning. In the statistical literature, the biased Ridge estimator was proposed as an alternative to the ordinary least-squares (OLS) estimator for multiple regression in which the independent variables are linearly dependent or nearly so, because OLS can produce high-variance coefficient estimates when multicollinearity is present. As in OLS, the purpose of Ridge regression is to estimate the regression coefficients that minimize the sum of squared errors. In this approach, however, an L2 penalty on the coefficients is added to the objective being minimized. Data on air pollution were used in the application. The target variable was the air quality index, while the independent variables were PM2.5, PM10, SO2, NO, NOX, NO2, O3, CO, and NH3. The correlations between the independent variables were found to be statistically significant. Models were fitted with various regularization parameters, and MSE, RMSE, MAE, and MAPE were used to assess model performance.
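The workflow described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation and does not use the air pollution data: the data below are synthetic, with two deliberately correlated predictors, and the function names (`ridge_fit`, `predict`, `metrics`), the train/test split, and the list of regularization values are all assumptions chosen for illustration. The sketch standardizes the predictors, solves the penalized least-squares problem in closed form, and reports the four error measures named in the abstract.

```python
import numpy as np


def ridge_fit(X, y, alpha):
    """Closed-form ridge fit on standardized predictors (hypothetical helper).

    Solves (Xs'Xs + alpha * I) b = Xs'(y - mean(y)), then maps the
    coefficients back to the original scale. alpha = 0 reproduces OLS.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    Xs = (X - mu) / sigma
    y_mean = y.mean()
    p = Xs.shape[1]
    coef_std = np.linalg.solve(Xs.T @ Xs + alpha * np.eye(p), Xs.T @ (y - y_mean))
    coef = coef_std / sigma            # back to original units
    intercept = y_mean - mu @ coef     # intercept is not penalized
    return coef, intercept


def predict(X, coef, intercept):
    return np.asarray(X, dtype=float) @ coef + intercept


def metrics(y_true, y_pred):
    """MSE, RMSE, MAE and MAPE (in percent); assumes y_true has no zeros."""
    err = y_true - y_pred
    mse = np.mean(err ** 2)
    return {
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        "MAPE": 100.0 * np.mean(np.abs(err / y_true)),
    }


# Synthetic data (an assumption): x1 and x2 are almost collinear.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
x1 = z + 0.05 * rng.normal(size=n)
x2 = z + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 50.0 + 3.0 * x1 + 2.0 * x2 + 1.0 * x3 + rng.normal(size=n)

# Simple hold-out split: first 150 rows for training, the rest for testing.
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Fit with several illustrative regularization parameters and compare.
for alpha in [0.0, 1.0, 10.0, 100.0]:
    coef, intercept = ridge_fit(X_train, y_train, alpha)
    scores = metrics(y_test, predict(X_test, coef, intercept))
    print(alpha, np.round(coef, 3), {k: round(float(v), 3) for k, v in scores.items()})
```

Standardizing before fitting is a common design choice because the L2 penalty depends on the scale of each predictor; in practice the regularization parameter would typically be chosen by cross-validation rather than from a fixed list.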