Maize is one of the most important cereal crops worldwide and a staple food for much of the global population. Counting maize tassels provides essential information for yield prediction, growth-status monitoring, and plant phenotyping, but traditional manual counting is expensive and time-consuming. Recent technological developments, including high-resolution RGB imagery acquired by unmanned aerial vehicles (UAVs) and advanced machine-learning techniques such as deep learning (DL), have been used to analyze genotypes, phenotypes, and crops.
In this study, we modified YOLOv5s, a single-stage object detection technique based on a deep convolutional neural network, and named the result MYOLOv5s. We incorporated BottleneckCSP structures, the Hardswish activation function, and two-dimensional spatial dropout layers to increase tassel detection accuracy and reduce overfitting. The method's performance was compared with three state-of-the-art algorithms: TasselNetv2+, RetinaNet, and Faster R-CNN. The results demonstrate the effectiveness of MYOLOv5s in detecting and counting maize tassels.
Materials and Methods
The High-Intensity Phenotyping Site (HIPS) dataset was collected from a large field at the Agronomy Center for Research and Education (ACRE) of Purdue University in West Lafayette, Indiana, USA, during the 2020 growing season. A Sony Alpha 7R-III RGB camera mounted on a UAV flying at an altitude of 20 m captured high-resolution orthophotos with a pixel resolution of 0.25 cm. The dataset consisted of two replications of 22 entries each for hybrids and inbreds, planted on May 12 in a two-row segment plot layout with a plant population of 30,000 per acre. The hybrids and inbreds had varying flowering dates, with 20 days between the first and last variety to flower.
This article uses orthophotos taken on July 20 and 24 to train and test the proposed deep network, MYOLOv5s. These orthophotos were divided into 15 images (3670 × 2150 pixels) and then cropped to obtain 150 images (608 × 2048 pixels) for each date. Three modifications were applied to the original YOLOv5s to form MYOLOv5s: BottleneckCSP structures were added to the neck of YOLOv5s, replacing some C3 modules; two-dimensional spatial dropout layers were used in the detection layer; and the Hardswish activation function was used in the convolution structures. These modifications improved tassel detection accuracy. MYOLOv5s was implemented in the PyTorch framework and optimized with the Adam algorithm. Hyperparameters such as the number of epochs, batch size, and learning rate were also tuned to increase tassel detection accuracy.
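To make two of the modifications above more concrete, the following is a minimal, framework-free sketch of what the Hardswish activation and two-dimensional spatial dropout compute. It is not the MYOLOv5s implementation: the toy feature maps, the dropout probability of 0.5, and the function names are assumptions made for illustration. In PyTorch, layers of this kind are available as `torch.nn.Hardswish` and `torch.nn.Dropout2d`.

```python
import random


def hardswish(x: float) -> float:
    """Hardswish activation: x * ReLU6(x + 3) / 6."""
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0


def spatial_dropout_2d(feature_maps, p, rng):
    """Zero entire channels with probability p and rescale the survivors.

    feature_maps is a list of channels; each channel is a list of rows.
    Unlike element-wise dropout, a whole 2D feature map is kept or dropped.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    scale = 1.0 / (1.0 - p)  # keeps the expected activation unchanged
    out = []
    for channel in feature_maps:
        if rng.random() < p:
            out.append([[0.0] * len(row) for row in channel])
        else:
            out.append([[v * scale for v in row] for row in channel])
    return out


# Toy input (assumed values): 4 channels of 2x2 activations.
maps = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(4)]
activated = [[[hardswish(v) for v in row] for row in ch] for ch in maps]
dropped = spatial_dropout_2d(activated, p=0.5, rng=random.Random(0))
```

Dropping whole channels rather than individual pixels matters in convolutional layers because neighboring pixels within a feature map are strongly correlated, so element-wise dropout regularizes them only weakly.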
Results and Discussion
We first compared the original and modified YOLOv5s techniques; MYOLOv5s improved tassel detection accuracy by approximately 2.80%. We then compared the performance of MYOLOv5s with the counting-based approach TasselNetv2+ and two detection-based techniques, Faster R-CNN and RetinaNet. The results demonstrated the superiority of MYOLOv5s in both accuracy and inference time. The proposed method achieved an AP value of 95.30% and an RMSE of 1.9% at 84 FPS, making it about 1.4 times faster than the other techniques. Additionally, MYOLOv5s correctly detected the highest number of maize tassels and showed an improvement in AP value of at least 17.64% over Faster R-CNN and RetinaNet. Furthermore, our technique had the lowest false positive and false negative values. The regression plots show that MYOLOv5s provided slightly higher-fidelity counts than the other methods.
Finally, we investigated the effect of score values on the performance of detection-based models and calculated the optimal values of hyperparameters.
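The effect of the score value on a detection-based counter can be illustrated with a small sweep over candidate thresholds. This is a sketch under stated assumptions, not the procedure used in the study: the confidence scores, ground-truth counts, candidate thresholds, and the choice of mean absolute count error as the selection criterion are all hypothetical.

```python
def count_at_threshold(scores, threshold):
    """Count detections whose confidence score is at least the threshold."""
    return sum(1 for s in scores if s >= threshold)


def best_threshold(per_image_scores, true_counts, candidates):
    """Return (threshold, error) with the lowest mean absolute count error."""
    best = None
    for t in candidates:
        errors = [
            abs(count_at_threshold(scores, t) - n)
            for scores, n in zip(per_image_scores, true_counts)
        ]
        mae = sum(errors) / len(errors)
        if best is None or mae < best[1]:
            best = (t, mae)
    return best


# Hypothetical detector outputs: one list of confidence scores per image.
per_image_scores = [
    [0.95, 0.90, 0.62, 0.30, 0.12],
    [0.88, 0.71, 0.45, 0.20],
    [0.97, 0.93, 0.85, 0.55, 0.41, 0.08],
]
true_counts = [3, 2, 4]  # hypothetical ground-truth tassel counts
candidates = [0.1, 0.3, 0.5, 0.7, 0.9]

threshold, mae = best_threshold(per_image_scores, true_counts, candidates)
print(threshold, mae)  # 0.5 0.0
```

A low threshold admits false positives and inflates the count, while a high threshold discards true tassels, so the counting error is typically smallest at an intermediate value.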
The MYOLOv5s technique outperformed other state-of-the-art models in detecting maize tassels, achieving the highest precision, recall, and average precision (AP) values.
The MYOLOv5s method had the lowest root mean square error (RMSE) among the compared methods on the counting error metric, demonstrating its accuracy in detecting and counting maize tassels.
We evaluated the agreement between predicted and ground-truth maize tassel counts using the R² score. For the MYOLOv5s method, the R² score was approximately 99.28%, indicating a strong correlation between predicted and actual values.
The MYOLOv5s method performed exceptionally well in detecting tassels, even in highly overlapping areas. It accurately distinguished and detected tassels, regardless of their proximity or overlap with other objects.
When compared to the counting-based approach TasselNetv2+, our proposed MYOLOv5s method showed faster inference times. This suggests that the MYOLOv5s method is computationally efficient while maintaining accurate tassel detection capabilities.
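The two counting metrics mentioned above, RMSE and the R² score, can be computed from per-image counts as in the following sketch. The counts are made-up values, and R² is taken here to be the coefficient of determination, which is an assumption since the text does not state the exact definition used.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between predicted and ground-truth counts."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def r2_score(predicted, actual):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical per-image tassel counts.
actual = [10, 20, 30, 40]
predicted = [12, 18, 30, 40]

print(round(rmse(predicted, actual), 4))      # 1.4142
print(round(r2_score(predicted, actual), 4))  # 0.984
```

RMSE is expressed in the same unit as the counts (tassels per image), whereas R² is unitless and equals 1 only when every predicted count matches its ground truth.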
©2023 The author(s). This article is licensed under Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source.