with the collaboration of Iranian Society of Mechanical Engineers (ISME)

Document Type: Research Article


1 Ph.D. Student, Faculty of Physics, Shahid Bahonar University of Kerman, Kerman, Iran

2 Faculty of Physics, Shahid Bahonar University of Kerman, Kerman, Iran


Maize is one of the most important cereal crops and a staple food worldwide. Counting maize tassels provides essential information for yield prediction, growth-status assessment, and plant phenotyping, but traditional manual counting is expensive and time-consuming. Recent technological developments, including high-resolution RGB imagery acquired by unmanned aerial vehicles (UAVs) and advanced machine-learning techniques such as deep learning (DL), have been used to analyze genotypes, phenotypes, and crops.
In this study, we modified YOLOv5s, a single-stage object detection technique based on a deep convolutional neural network, and named the result MYOLOv5s. We incorporated BottleneckCSP structures, the Hardswish activation function, and two-dimensional spatial dropout layers to increase tassel detection accuracy and reduce overfitting. The method's performance was compared with three state-of-the-art algorithms: TasselNetv2+, RetinaNet, and Faster R-CNN. The results demonstrate the effectiveness of MYOLOv5s in detecting and counting maize tassels.
Materials and Methods
The High-Intensity Phenotyping Site (HIPS) dataset was collected from a large field at the Agronomy Center for Research and Education (ACRE) of Purdue University in West Lafayette, Indiana, USA, during the 2020 growing season. A Sony Alpha 7R-III RGB camera mounted on a UAV flying at an altitude of 20 m captured high-resolution orthophotos with a pixel resolution of 0.25 cm. The dataset consisted of two replications of 22 entries each for hybrids and inbreds, planted on May 12 in a two-row segment plot layout with a plant population of 30,000 per acre. The hybrids and inbreds had varying flowering dates, with a span of 20 days between the first and last variety.
This article uses orthophotos taken on July 20th and 24th to train and test the proposed deep network, MYOLOv5s. For each date, the orthophotos were divided into 15 images (3670 × 2150) and then cropped to obtain 150 images (608 × 2048). Three modifications were applied to the original YOLOv5s to form MYOLOv5s: BottleneckCSP structures were added to the neck of YOLOv5s, replacing some C3 modules; two-dimensional spatial dropout layers were used in the detection layer; and the Hardswish activation function was used in the convolution structures. These modifications improved tassel detection accuracy. MYOLOv5s was implemented in the PyTorch framework and optimized with the Adam algorithm. Hyperparameters such as the number of epochs, batch size, and learning rate were also tuned to increase tassel detection accuracy.
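Two of the modifications named above, the Hardswish activation and two-dimensional spatial dropout, can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the dropout probability `p`, the nested-list layout of the feature maps, and the function names are assumptions made for illustration. In PyTorch the corresponding built-in modules are `torch.nn.Hardswish` and `torch.nn.Dropout2d`.

```python
import random


def hardswish(x: float) -> float:
    """Hardswish activation: x * ReLU6(x + 3) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def spatial_dropout_2d(feature_maps, p, rng):
    """Zero out entire channels with probability p (training-time behavior).

    feature_maps is a list of channels, each a list of rows of floats.
    Surviving channels are rescaled by 1 / (1 - p) so the expected
    activation stays unchanged. The value of p is an assumption; the
    abstract does not report the one used in MYOLOv5s.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    scale = 1.0 / (1.0 - p)
    out = []
    for channel in feature_maps:
        # One keep/drop decision per channel, not per pixel.
        factor = scale if rng.random() >= p else 0.0
        out.append([[value * factor for value in row] for row in channel])
    return out


if __name__ == "__main__":
    print([round(hardswish(v), 4) for v in (-4.0, -1.0, 0.0, 1.0, 3.0)])
    maps = [[[1.0, 2.0], [3.0, 4.0]] for _ in range(4)]
    print(spatial_dropout_2d(maps, p=0.5, rng=random.Random(0)))
```

Dropping whole channels rather than individual pixels is what distinguishes spatial dropout from ordinary dropout: neighboring pixels in a feature map are strongly correlated, so removing single values regularizes little.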
Results and Discussion
In this study, we first compared the original and modified YOLOv5s techniques; MYOLOv5s improved tassel detection accuracy by approximately 2.80%. We then compared MYOLOv5s with the counting-based approach TasselNetv2+ and two detection-based techniques, Faster R-CNN and RetinaNet. MYOLOv5s was superior in both accuracy and inference time. The proposed method achieved an AP value of 95.30% and an RMSE of 1.9% at 84 FPS, making it about 1.4 times faster than the other techniques. Additionally, MYOLOv5s correctly detected the highest number of maize tassels and showed at least a 17.64% improvement in AP value over Faster R-CNN and RetinaNet. Furthermore, our technique had the lowest false positive and false negative values. The regression plots show that MYOLOv5s provided slightly higher-fidelity counts than the other methods.
Finally, we investigated the effect of score values on the performance of detection-based models and calculated the optimal values of hyperparameters.
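The effect of the score value on a detection-based model can be sketched as a simple threshold sweep. This is a minimal illustration, not the procedure used in the study. It assumes each detection has already been matched against the ground truth (for example by an IoU rule) and is given as a hypothetical `(score, is_true_positive)` pair. Selecting the threshold by F1 score is also an assumption, since the abstract does not state the selection criterion.

```python
def sweep_score_threshold(detections, n_ground_truth, thresholds):
    """Return precision, recall, and F1 for each candidate score threshold.

    detections: list of (score, is_true_positive) pairs (hypothetical format).
    n_ground_truth: number of annotated tassels in the evaluated images.
    """
    rows = []
    for t in thresholds:
        kept = [is_tp for score, is_tp in detections if score >= t]
        tp = sum(kept)                  # kept detections that match a tassel
        fp = len(kept) - tp             # kept detections that match nothing
        fn = n_ground_truth - tp        # annotated tassels that were missed
        precision = tp / len(kept) if kept else 0.0
        recall = tp / n_ground_truth if n_ground_truth else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append({"threshold": t, "tp": tp, "fp": fp, "fn": fn,
                     "precision": precision, "recall": recall, "f1": f1})
    return rows


def best_threshold(rows):
    """Pick the threshold with the highest F1 (an assumed criterion)."""
    return max(rows, key=lambda r: r["f1"])["threshold"]


if __name__ == "__main__":
    # Toy detections, invented for illustration only.
    toy = [(0.95, True), (0.90, True), (0.80, False),
           (0.60, True), (0.40, False), (0.30, True)]
    for row in sweep_score_threshold(toy, n_ground_truth=5,
                                     thresholds=[0.25, 0.5, 0.75]):
        print(row)
```

Raising the threshold discards low-confidence boxes, which typically reduces false positives at the cost of more false negatives; the sweep makes that trade-off explicit.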
The MYOLOv5s technique outperformed other state-of-the-art models in detecting maize tassels, achieving the highest precision, recall, and average precision (AP) values.
The MYOLOv5s method had the lowest root mean square error (RMSE) among the compared methods for the counting-error metric, demonstrating its accuracy in detecting and counting maize tassels.
We evaluated the correlation between predicted and ground-truth tassel counts using the R² score; for the MYOLOv5s method, the R² score was approximately 99.28%, indicating a strong correlation between predicted and actual values.
The MYOLOv5s method performed exceptionally well in detecting tassels, even in highly overlapping areas. It accurately distinguished and detected tassels, regardless of their proximity or overlap with other objects.
When compared to the counting-based approach TasselNetv2+, our proposed MYOLOv5s method showed faster inference times. This suggests that the MYOLOv5s method is computationally efficient while maintaining accurate tassel detection capabilities.
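The two counting metrics mentioned above, RMSE and the R² score, can be computed from per-image counts as in the following minimal sketch. It assumes the standard definitions (root of the mean squared count error, and the coefficient of determination). The abstract reports both figures as percentages and does not say how they were normalized, so the sketch works on raw counts. The sample counts are invented for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between ground-truth and predicted counts."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical per-image tassel counts, not data from the study.
    ground_truth = [10, 20, 30, 40]
    predicted = [12, 18, 30, 41]
    print("RMSE:", rmse(ground_truth, predicted))
    print("R2:", r2_score(ground_truth, predicted))
```

RMSE measures the typical size of the count error per image, while R² measures how much of the variation in the true counts the predictions explain, so the two are complementary.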


©2023 The author(s). This article is licensed under Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source.
