Prediction of Total Anthocyanin Content in Single-Kernel Maize Using Spectral and Color Space Data Coupled with AutoML

Songur, Umut; Fidan, Sertuğ; ALACA YILDIRIM, EZGİ; KAHRIMAN, FATİH; TİRYAKİ, ALİ

doi:10.3390/s26030805

Songur U., Fidan S., ALACA YILDIRIM E., KAHRIMAN F., TİRYAKİ A. M.

Sensors, vol.26, no.3, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 26 Issue: 3
Publication Date: 2026
DOI Number: 10.3390/s26030805
Journal Name: Sensors
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, MEDLINE, Directory of Open Access Journals
Keywords: machine learning, near infrared reflectance, plant pigments, Zea mays
Çanakkale Onsekiz Mart University Affiliated: Yes

Abstract

The non-destructive and chemical-free determination of anthocyanin content in single maize kernels is of great importance for plant-breeding programs. Previous studies have mainly relied on Near-Infrared Reflectance (NIR) spectroscopy and color-based approaches, often using conventional or randomly selected modeling techniques. In this study, an Automated Machine Learning (AutoML) framework was employed to predict anthocyanin content using spectral and digital image data obtained from individual maize kernels measured in two orientations (embryo-up and embryo-down). Forty colored maize genotypes representing diverse phenotypic characteristics were analyzed. Digital images were acquired in RGB, HSV, and LAB color spaces, together with NIR spectral data, from a total of 200 kernels. Reference anthocyanin content was determined using a colorimetric method. Ten datasets were constructed by combining different color space and spectral features and were grouped according to kernel orientation. AutoML was used to evaluate nine machine learning algorithms, while Partial Least Squares Regression (PLSR) served as a classical benchmark method, resulting in the development of 1918 predictive models. Kernel orientation had a notable effect on model performance and outlier detection. The best predictions were obtained from the RGB dataset for embryo-up kernels and from the combined RGB+HSV+LAB+NIR dataset for embryo-down kernels. Overall, AutoML outperformed conventional modeling by automatically identifying optimal algorithms for specific data structures, demonstrating its potential as an efficient screening tool for anthocyanin content at the single-kernel level.
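
The dataset construction described in the abstract (combining color space and spectral feature blocks such as RGB+HSV+LAB+NIR) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the helper names `rgb_to_hsv_features` and `build_dataset`, the per-kernel mean RGB input, and all numeric values are assumptions made for illustration only. LAB conversion is left out because the Python standard library does not provide it, and the abstract does not list which ten combinations were used.

```python
import colorsys


def rgb_to_hsv_features(r, g, b):
    """Convert mean 8-bit RGB values of one kernel image to HSV (each in 0..1)."""
    return list(colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0))


def build_dataset(blocks, name):
    """Concatenate the feature blocks named in a '+'-joined string, e.g. 'RGB+HSV'."""
    features = []
    for key in name.split("+"):
        features.extend(blocks[key])
    return features


# Hypothetical values for a single kernel; none of these numbers come from the study.
mean_rgb = [128, 0, 64]
blocks = {
    "RGB": [v / 255.0 for v in mean_rgb],
    "HSV": rgb_to_hsv_features(*mean_rgb),
    "NIR": [0.41, 0.39, 0.37],  # placeholder reflectance values, not real spectra
}

# One combined feature vector, analogous in spirit to a merged dataset row.
combined = build_dataset(blocks, "RGB+HSV+NIR")
print(len(combined))
```

In a setup like this, each named combination would yield one feature table per kernel orientation, which could then be passed to whichever modeling tool is being compared.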