DSpace Repository

Comparing Machine Learning Algorithms for Large-scale Crop Yield Prediction Using Agroecological Parameters and Pesticide Usage


dc.contributor.author Lykhovyd, P.
dc.contributor.author Hranovska, L.
dc.contributor.author Averchev, O.
dc.contributor.author Zhuikov, O.
dc.contributor.author Karashchuk, G.
dc.contributor.author Maksymov, D.
dc.date.accessioned 2026-02-02T06:26:45Z
dc.date.available 2026-02-02T06:26:45Z
dc.date.issued 2026-01-28
dc.identifier.citation P. Lykhovyd, L. Hranovska, O. Averchev, O. Zhuikov, G. Karashchuk, and D. Maksymov, “Comparing Machine Learning Algorithms for Large-scale Crop Yield Prediction Using Agroecological Parameters and Pesticide Usage”, C. R. Acad. Bulg. Sci., vol. 79, no. 1, pp. 136–144, Jan. 2026. ru
dc.identifier.uri http://hdl.handle.net/123456789/11846
dc.description.abstract Accurate crop yield prediction is essential to ensure global food security under changing climatic, environmental and agro-industrial conditions. This study presents a comprehensive comparative analysis of machine learning algorithms for large-scale yield prediction using agroecological parameters and pesticide usage as key explanatory variables. A crop yield prediction dataset downloaded from Kaggle, comprising 28 242 records across multiple countries and crops, was preprocessed and modelled with 17 algorithms, including tree-based, regression-based, support vector, neural network, boosting, and ensemble approaches. Model performance was assessed using root mean square error (RMSE) and coefficient of determination (R²). Modelling and visualization were performed in Python 3 using corresponding external modules. Among global models, Extra Trees achieved the highest accuracy (R² = 0.991, RMSE = 8282.92 hg/ha), outperforming both gradient boosting and neural network approaches. Ensemble techniques, particularly stacking ensembles, provided comparable accuracy (R² = 0.990), confirming the robustness of tree-based methods. Feature importance analysis highlighted crop type (0.609) as the dominant predictor, while pesticides (0.110), average temperature (0.108), and rainfall (0.087) emerged as the most influential agro-environmental factors. Country-specific models achieved near-perfect predictive power (R² ≈ 1.0) for India, Brazil, Pakistan, Mexico, and Turkey, while Ukraine's best-performing model (XGBoost, R² = 0.980) revealed yields averaging 43.4% below the global mean. Crop-level analysis identified potatoes, cassava, and sweet potatoes as the highest-yielding crops globally. These results demonstrate the superiority of tree-based and ensemble models for yield forecasting and emphasize the value of localized modelling strategies. The findings provide actionable insights for optimizing agricultural practices and guiding sustainable intensification policies. ru
dc.language.iso en ru
dc.publisher Comptes rendus de l'Académie bulgare des Sciences ru
dc.subject data-driven agriculture ru
dc.subject feature importance analysis ru
dc.subject regression ru
dc.subject ensemble modeling ru
dc.subject sustainability ru
dc.title Comparing Machine Learning Algorithms for Large-scale Crop Yield Prediction Using Agroecological Parameters and Pesticide Usage ru
dc.type Article ru
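
The abstract reports model quality as RMSE (in hg/ha) and R². As a reading aid, the minimal sketch below shows how these two metrics are conventionally computed. It is not the authors' code: the function names `rmse` and `r2` and the sample yields are hypothetical, chosen only for illustration, and the record does not say which external modules the study actually used.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the target (hg/ha here)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical yields in hg/ha, made up for illustration only
# (not taken from the study's dataset).
observed = [30000.0, 50000.0, 70000.0, 90000.0]
predicted = [32000.0, 48000.0, 71000.0, 89000.0]

print(f"RMSE = {rmse(observed, predicted):.2f} hg/ha")
print(f"R2   = {r2(observed, predicted):.3f}")
```

Because RMSE keeps the units of the target, a value such as 8282.92 hg/ha is only meaningful relative to the typical yield magnitude, whereas R² is unitless and can be compared across crops and countries.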

