3rd International Conference On Applied Sciences, Cape-Town, South Africa, 24 - 29 January 2023, pp.1
Machine learning (ML) is the science of using data and algorithms
to imitate the way humans learn. With rapidly advancing
technology and the associated growth of data, ML is one of
the fastest-growing technical fields, and it lies at the intersection
of computer science and statistics. Supervised,
unsupervised, semi-supervised, and reinforcement learning are the main types
of learning used in ML. The data set is split into training and test data in
supervised machine learning. The training data are labeled, and the computer
fits a function to these labeled data. The target variable is then
predicted for the unlabeled test data using this fitted function. If the target variable is quantitative, this process
is based on regression analysis. Ridge regression,
also known as L2-regularized regression, is one of the various regression techniques
used in machine learning. In the statistical literature, the biased
Ridge estimator was proposed as an alternative to the ordinary least-squares
(OLS) estimator for multiple regression when the independent variables are
linearly dependent or nearly so, because OLS can produce high-variance
coefficient estimates when multicollinearity
is present. Similar to OLS, the purpose of Ridge regression is to estimate the
regression coefficients that minimize the sum of squared errors. In this
approach, however, an L2 penalty on the coefficients is added to the objective
being minimized. Air pollution data were used in the application. The target
variable was the air quality index, while the independent variables were PM2.5,
PM10, SO2, NO, NOX, NO2, O3, CO, and NH3. It was discovered that the
correlations between the independent variables were statistically significant. MSE, RMSE, MAE, and MAPE were used to assess model
performance after models were fitted with various regularization parameters.
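
The workflow summarized above (fit Ridge for several regularization parameters, then compare MSE, RMSE, MAE, and MAPE on held-out data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the data are synthetic, with two nearly collinear predictors standing in for the pollutant measurements, and the helper names (solve_linear_system, ridge_fit, predict, error_metrics) and the chosen regularization values are hypothetical. The fit uses the standard closed-form Ridge solution on centered data, so the intercept is not penalized.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left unchanged.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(X, y, lam):
    """Return (intercept, beta) minimizing SSE + lam * sum(beta_j ** 2)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Center predictors and target so the intercept is not penalized.
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations with the L2 penalty: (Xc'Xc + lam * I) beta = Xc'yc
    gram = [[sum(r[i] * r[j] for r in Xc) + (lam if i == j else 0.0)
             for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * t for r, t in zip(Xc, yc)) for i in range(p)]
    beta = solve_linear_system(gram, rhs)
    intercept = y_mean - sum(b * mu for b, mu in zip(beta, x_mean))
    return intercept, beta


def predict(X, intercept, beta):
    return [intercept + sum(b * v for b, v in zip(beta, row)) for row in X]


def error_metrics(y_true, y_pred):
    """MSE, RMSE, MAE, and MAPE (in percent); y_true must be nonzero."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "MAPE": mape}


# Synthetic data (an assumption, not the study's air pollution data).
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    x1 = rng.uniform(10.0, 100.0)
    x2 = 1.5 * x1 + rng.gauss(0.0, 1.0)  # nearly collinear with x1
    X.append([x1, x2])
    y.append(50.0 + 0.8 * x1 + 0.4 * x2 + rng.gauss(0.0, 5.0))

# Hold out the last 50 rows as test data.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

results = {}
for lam in (0.0, 1.0, 10.0, 100.0):  # lam = 0 reduces to OLS
    intercept, beta = ridge_fit(X_train, y_train, lam)
    metrics = error_metrics(y_test, predict(X_test, intercept, beta))
    results[lam] = (beta, metrics)
    print(lam, [round(b, 3) for b in beta], metrics)
```

Each printed line shows the regularization parameter, the fitted coefficients, and the four test-set error measures, so the effect of the penalty on both the coefficients and the prediction error can be compared across parameter values. In practice a library routine would normally replace the hand-written solver; it is written out here only to keep the sketch self-contained.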