Document Type: Research Article
Authors
1 MSc Student, Productivity Management Department, School of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran
2 Productivity Management Department, School of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran
3 System and e-Commerce Engineering Department, School of Industrial Engineering, Iran University of Science and Technology, Tehran, Iran
Abstract
Introduction
Sugarcane is a strategic agricultural product, and increasing productivity and self-sufficiency in its production is of special importance. Its most important product is sugar. Factors such as climate and management conditions affect both cane yield and recoverable sugar. Crop yield forecasting is one of the most important topics in precision agriculture; it is used to estimate yield, match product supply with demand, and manage the crop to increase productivity. The purpose of this study is to model the factors affecting the sugar extracted from sugarcane (recoverable sugar) and to predict it for the farms of the Amir-Kabir Sugarcane Agro-Industry Company in Khuzestan province using machine learning methods.
Materials and Methods
This study used data from the Amir-Kabir Sugarcane Agro-Industry Company in Khuzestan province covering 2010 to 2017. The data set has 3223 records drawn from four groups of data: climate, soil, crop, and farm management. It includes both discrete and continuous variables. The discrete variables are production management, soil type, farm, variety, age (cane class), month of harvest, and number of irrigations. The continuous variables are area, chemical fertilizer consumption, water consumption per hectare, total water consumption, drain, crop season duration, yield (cane yield), soil EC, purity, time interval from drying off to harvest, precipitation, minimum and maximum temperature, minimum and maximum relative humidity, wind speed, and evaporation. Recoverable sugar is the target variable and is divided into two classes: values greater than or equal to 9 fall in the optimal class, and values less than 9 fall in the undesirable class. All other variables are treated as predictors. For modeling, the holdout method was used to divide the data randomly into two independent sets: 70% of the data (2256 records) for training and 30% (967 records) for testing. Modeling was performed in Python 3.8.6 in the Jupyter Notebook environment using the Random Forest, AdaBoost, XGBoost, and SVM (support vector machine) algorithms.
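The target binarization and holdout split described above can be sketched as follows. This is a minimal illustration, not the study's code: it assumes scikit-learn, uses randomly generated stand-in data rather than the Amir-Kabir records, and all variable names, predictor counts, and hyperparameters are hypothetical. XGBoost is left out to keep the sketch dependent on scikit-learn only; it could be added to the same dictionary through the `xgboost` package's `XGBClassifier`.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the 3223 farm records (NOT the study's data).
n_records = 3223
X = rng.normal(size=(n_records, 5))  # five made-up continuous predictors
recoverable_sugar = 9.0 + X[:, 0] + 0.5 * rng.normal(size=n_records)

# Binarize the target: >= 9 is the optimal class (1), < 9 the undesirable class (0).
y = (recoverable_sugar >= 9.0).astype(int)

# Holdout method: a random 70% / 30% split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
print(len(X_train), len(X_test))

# Three of the four algorithm families named in the text, with
# illustrative (untuned) settings. The SVM is wrapped in a scaling
# pipeline, which is an assumption of this sketch.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC()),
}

test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)
    print(f"{name}: test accuracy = {test_accuracy[name]:.3f}")
```

With 3223 records and `test_size=0.3`, the split yields 2256 training and 967 test records, matching the counts reported in the text.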
Results and Discussion
The models were evaluated using accuracy, precision, recall, and F1 score, together with k-fold cross-validation. By accuracy, the XGBoost model performed best on the training set (94.8%) and the AdaBoost model performed best on the test set (92.4%). By precision and recall, the AdaBoost model (87% precision) and the SVM model (87% recall) outperformed the other models. Under repeated stratified 10-fold cross-validation with two repeats, the SVM model was the best, with 92.3% accuracy. Purity, the time interval from drying off to harvest, and crop season duration are the most important variables for predicting recoverable sugar.
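The evaluation procedure named above (holdout metrics plus repeated stratified 10-fold cross-validation with two repeats) might look like the following sketch. It assumes scikit-learn and synthetic data; the model settings, sample size, and variable names are hypothetical, and the numbers it prints are unrelated to the results reported in this study.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import (
    RepeatedStratifiedKFold,
    cross_val_score,
    train_test_split,
)
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(1)

# Synthetic two-class data (NOT the study's data).
X = rng.normal(size=(600, 4))
y = (X[:, 0] + 0.5 * rng.normal(size=600) >= 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# One illustrative model: an SVM with feature scaling.
model = make_pipeline(StandardScaler(), SVC())
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

# Holdout metrics computed on the test set.
holdout_metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
}
for metric, value in holdout_metrics.items():
    print(f"{metric}: {value:.3f}")

# Repeated stratified 10-fold cross-validation with two repeats:
# 10 folds x 2 repeats = 20 accuracy scores, each fold preserving
# the class proportions of y.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=2, random_state=1)
cv_scores = cross_val_score(model, X, y, scoring="accuracy", cv=cv)
print(f"mean CV accuracy: {cv_scores.mean():.3f} (std {cv_scores.std():.3f})")
```

Reporting the mean and spread over the 20 folds gives a less split-dependent estimate than a single holdout score, which is one reason the two rankings of models can differ.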
Conclusion
This study presented a new approach based on machine learning methods for predicting recoverable sugar from sugarcane. Its most important innovation is the simultaneous consideration of management and climatic factors, along with other factors such as soil and crop characteristics, in modeling and classifying the recoverable sugar percentage of sugarcane. The results show that the performance of all models is acceptable and that machine learning methods, including ensemble learning algorithms, can be used to predict crop yield. The results of this study, together with an analysis of the rules obtained from the decision trees built in the random forest model, can help managers of agro-industries determine appropriate strategies and prepare the conditions needed to achieve optimal production.
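As a rough illustration of how variable importance and tree rules can be read out of a random forest, the sketch below uses scikit-learn's `feature_importances_` attribute and `export_text` function. The data are synthetic, the three feature names are borrowed from the variables named in the abstract purely as labels, and the forest settings are hypothetical; the rules it prints are not the study's rules.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import export_text

rng = np.random.default_rng(2)

# Hypothetical labels for three synthetic predictors (NOT the study's data).
feature_names = ["purity", "drying_off_interval", "crop_season_duration"]
X = rng.normal(size=(500, 3))
# The synthetic class depends mostly on the first column by construction.
y = (X[:, 0] + 0.3 * rng.normal(size=500) >= 0).astype(int)

# Shallow trees keep the extracted rules short enough to read.
forest = RandomForestClassifier(n_estimators=50, max_depth=3, random_state=2)
forest.fit(X, y)

# Impurity-based importances, averaged over the trees in the forest.
ranking = sorted(
    zip(feature_names, forest.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")

# Text rendering of the if/else rules of one tree in the ensemble.
rules = export_text(forest.estimators_[0], feature_names=feature_names)
print(rules)
```

Each tree in `forest.estimators_` yields its own rule set, so an analysis of this kind would typically look for thresholds that recur across many trees rather than rely on a single one.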
For future research, as well as for policy making and decision making at the Amir-Kabir Sugarcane Agro-Industry Company, the following suggestions are offered. More samples can be used to obtain more reliable results. Deep learning methods, time series analysis, and image processing can also be applied. IoT equipment can be used to collect and process data in real time on the Amir-Kabir Sugarcane Agro-Industry farms.
Open Access
©2021 The author(s). This article is licensed under Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source.