Abstract
Weed detection plays a crucial role in improving cotton productivity. However, detection is complicated by the diversity of target scales and by the loss of leaf symmetry when leaves shade or occlude one another. This research therefore presents an enhanced model, EY8-MFEM, for detecting weeds in cotton fields. Firstly, the ALGA module is proposed, which combines the local and global information of feature maps through weighting operations to focus better on their spatial information. Building on it, the C2F-ALGA module is developed to strengthen the feature extraction capability of the backbone network. Secondly, the MDPM module is proposed, which generates attention matrices by capturing the horizontal and vertical information of feature maps, reducing duplicate information in them. Finally, the upsampling module of YOLOv8 is replaced with the CARAFE module to provide better upsampling performance. Extensive experiments on two publicly available datasets show that, compared with the baseline model, the F1, mAP50 and mAP75 metrics improve by 1.2%, 5.1% and 2.9% on the first dataset and by 3.8%, 1.3% and 2.2% on the second. This study demonstrates the algorithm's potential for practical weed detection in cotton fields and supports the further development of artificial intelligence in agriculture.
Introduction
Cotton is a crucial economic crop that contributes significantly to the global economy and is recognized as one of the foremost textile raw materials worldwide. As the global population grows and urbanization progresses, the land available for cultivation keeps shrinking, and smart agriculture is becoming increasingly important [1]. Precision agriculture [2] refers to the application of information technology in agriculture to achieve refined management and decision support in agricultural production [3]. Weed detection in cotton fields is one such application, and it has a significant impact on cotton production.
Weeds are one of the main causes of reduced cotton yield: they compete with cotton for growing space and resources, leaving the crop with insufficient nutrients [4]. At present, weeds in farmland are mainly controlled by chemical and mechanical means. Chemical control leaves pesticide residues in the air, in the soil and on crop surfaces. These residues can cause respiratory irritation, skin allergies and poisoning in people who come into contact with them, and they also threaten the safety of surrounding water sources and thus public health. In contrast, mechanical weeding is environmentally friendly and highly controllable. However, when crops are closely spaced it can damage seedlings, and because weeds may regrow afterwards, multiple weeding passes are required [5]. One way to address these problems is to use deep learning algorithms to locate weeds accurately in the field and agricultural robots to remove them precisely [6], thereby reducing the use of chemical pesticides and indirectly benefiting human health.
Weed detection poses several challenges. Firstly, crops and weeds span diverse target scales: the same crop differs in morphology and appearance across growth stages, and plants at the same growth stage may differ in size because nutrients are unevenly distributed. Object detection must therefore account for changes in target scale across growth stages. Secondly, the leaves of crops and weeds occlude one another, which is a prevalent problem throughout the growth cycle. As plants grow, their leaves may cross or cover each other, leaving the target partially or completely hidden. A complete leaf has a high degree of symmetry that a machine can recognize easily, but this symmetry is lost when the leaf is occluded, which makes detection more difficult. Thirdly, weeds and crops are morphologically similar, and their similar appearance makes them hard to distinguish.
In response to these issues, researchers have adopted methods such as data augmentation, vegetation index features, multi-scale feature fusion, and attention mechanisms to detect weeds in different crops. Eide et al. [7] used thermal and multispectral remote sensing images obtained by drones, combined with the normalized vegetation index and synthesized wavelength maps, to distinguish weed populations. Chen et al. [8] employed a support vector machine classifier with fused feature combinations to detect diverse weed types and corn seedlings precisely. Li et al. [9] used color index features and the Otsu threshold algorithm to segment vegetation and weeds, and trained the PSPNet model on the processed dataset to segment areas under high weed pressure accurately. Moazzam et al. [10] introduced a convolutional neural network named VGG Beet for classifying multispectral datasets. This method reduces the three-class pixel classification problem to two classes to improve classification accuracy. Wang et al. [11] introduced an enhanced YOLO model tailored for the precise detection of sunflower plants. This approach splits high-resolution images into sub-images according to a calculated overlap rate and employs multi-scale training, attaining accuracy and recall rates of 0.9465 and 0.9017, respectively. To address the inadequate focus on crucial target features and the weak suppression of noise features in the YOLOv5 feature extraction network, Wang et al. [12] introduced the C3 Host bottleneck module and integrated an attention mechanism to improve the network's emphasis on relevant features. However, this model may mistakenly identify some wheat seedlings as weeds, so the feature extraction network can still be improved.
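To make the color-index-plus-Otsu idea mentioned above more concrete, the following is a minimal sketch and not the pipeline of Li et al. [9]. It assumes the Excess Green index (ExG = 2g - r - b on normalized channels) as the color index, which the text does not specify, and the function names `excess_green`, `otsu_threshold` and `segment_vegetation` are hypothetical. It uses only the Python standard library and represents an image as rows of `(r, g, b)` tuples.

```python
def excess_green(pixel):
    """Excess Green index on chromaticity-normalized channels (assumed index)."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:
        return 0.0  # black pixel: no color information
    return (2 * g - r - b) / total


def otsu_threshold(values, bins=64):
    """Return the threshold that maximizes the between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # uniform input: nothing to separate
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0.0
    for i, h in enumerate(hist):
        w0 += h            # pixel count of the "low" class (bins 0..i)
        sum0 += i * h
        w1 = total - w0    # pixel count of the "high" class
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_bin = between_var, i
    # Use the upper edge of the best bin as the threshold.
    return lo + (best_bin + 1) * width


def segment_vegetation(image):
    """Return a boolean mask that is True where a pixel looks like vegetation."""
    exg = [[excess_green(p) for p in row] for row in image]
    threshold = otsu_threshold([v for row in exg for v in row])
    return [[v > threshold for v in row] for row in exg]
```

In a real pipeline, the resulting mask would only pre-process the images; separating weeds from crop within the vegetation still requires a trained model such as the segmentation network described above.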
YOLO series models are widely used in weed detection because of their significant advantages in real-time performance, accuracy, and ease of deployment, and this study builds on YOLOv8. Firstly, drawing on the ideas of the attention mechanism [13] and symmetry [14], we design the ALGA module to strengthen the focus on spatial information within feature maps. To bolster the feature extraction capacity of the backbone network and address the difficulty of distinguishing crops from weeds, we further propose the C2F-ALGA module. Secondly, we observed that the feature maps produced by the SPPF structure may contain similar or repetitive information after feature fusion. To mitigate this feature redundancy and the associated computational complexity, we propose the MDPM. This module captures a broader range of contextual information along both the horizontal and vertical directions of the feature map and adjusts the relationships between channels through operations in different directions, thereby generating an attention matrix that improves the expressive capacity of the features. Finally, this study improves the upsampling module of YOLOv8 by introducing CARAFE [15]. By reusing fine-grained features and adapting to the feature content, the CARAFE module adjusts its upsampling behavior automatically to provide better upsampling results.
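The idea of building an attention matrix from horizontal and vertical context can be illustrated with a small sketch. This is not the MDPM itself, whose internal layers are not described in this section. It is a simplified, parameter-free example in the spirit of directional pooling attention, assuming a single-channel feature map stored as a list of rows, and the function name `directional_attention` is hypothetical. A real module would also include learned convolutions and operate across channels.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def directional_attention(fmap):
    """Reweight an H x W feature map using row-wise and column-wise context.

    Returns (attention, output), both H x W lists of lists.
    """
    h, w = len(fmap), len(fmap[0])

    # Horizontal context: average each row over its width (one value per row).
    row_ctx = [sum(row) / w for row in fmap]
    # Vertical context: average each column over its height (one value per column).
    col_ctx = [sum(fmap[i][j] for i in range(h)) / h for j in range(w)]

    # Squash each context vector into (0, 1) gates.
    row_gate = [sigmoid(v) for v in row_ctx]
    col_gate = [sigmoid(v) for v in col_ctx]

    # The outer product of the two gates gives an H x W attention matrix.
    attention = [[row_gate[i] * col_gate[j] for j in range(w)] for i in range(h)]
    output = [[fmap[i][j] * attention[i][j] for j in range(w)] for i in range(h)]
    return attention, output
```

Positions whose row and column both carry strong responses receive the largest weights, so the map is emphasized where horizontal and vertical evidence agree and damped elsewhere.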