Introduction

The camouflaged object segmentation model (COSM) aims to identify objects that exhibit various forms of camouflage. This field has a wide range of real-world applications, including search-and-rescue operations, the discovery of rare species, healthcare (such as automated diagnosis of colorectal polyps [1] and lung lesions [2], and medical image fusion [3]), agriculture (including pest identification [4], fruit ripeness assessment [5], and biological disease diagnosis [6]), and content creation (such as recreational art [7]). Figure 1 depicts different categories of camouflaged objects [8]: items (1)–(4) represent natural camouflage, while (5) and (6) showcase artificial camouflage. More specifically, (1) features a terrestrial camouflaged creature, (2) an aquatic camouflaged creature, (3) a flying camouflaged creature, and (4) a camouflaged reptile. In contrast, (5) displays camouflaged soldiers, while (6) exhibits a human body-painting camouflage.

Fig. 1

Camouflaged object segmentation task, where the goal is to detect objects that have a similar pattern (e.g., edge, texture, or color) to their natural habitat [8]

In recent years, this field has seen remarkable advancements, largely attributed to the availability of benchmark datasets such as COD10K [9, 10] and NC4K [11], together with the rapid evolution of deep learning techniques. From SINet [9] in 2020 to POPNet [27] in 2023, the accuracy results on the COD10K test set are shown in Fig. 2: the E-measure [50] improved from 0.864 to 0.897, the S-measure [49] improved from 0.776 to 0.827, the weighted F-measure [51] improved from 0.631 to 0.789, and the mean absolute error (MAE) decreased from 0.043 to 0.031. Evidently, the accuracy of these models has increased. However, research on the security of COSM against adversarial example attacks is still in its infancy, and it remains unclear whether COSM can withstand adversarial attacks. This raises doubts about deploying COSM in safety-critical applications (such as intelligent grid systems [12, 13] and autonomous driving systems [14]), since the network could inexplicably misclassify a natural input that is almost identical to examples it has previously classified correctly. Therefore, it is crucial to study the robustness of COSM in order to reduce the security risks associated with these models and ultimately promote their widespread application.

Fig. 2

Accuracy statistics of the camouflaged object segmentation models

In this paper, our work is the first investigation into how to launch adversarial attacks against COSM. We employ standard practices found in popular adversarial attack methods, such as the fast gradient sign method (FGSM) attack [17] and the projected gradient descent (PGD) attack [18]. COSM distinguishes itself from traditional image recognition (segmentation) models in two key aspects: (1) it produces masks without making label predictions, and (2) the objects detected by the model closely resemble the background.

For this purpose, we propose a framework named Attack-COSM, which is designed to launch attacks on COSM through the task of mask prediction for camouflaged targets. Specifically, our objective is to mislead COSM, causing the model to reverse its predictions for masked objects and backgrounds by increasing the COSM loss. The experimental results indicate that adversarial attacks can effectively reduce the accuracy of COSM, implying that COSM is susceptible to adversarial examples. We also conducted experiments to evaluate the transferability of adversarial examples and found that adversarial examples generated on one COSM can effectively attack other COSMs. In addition to the primary objective of reversing the predictions for masked objects and backgrounds, we consider whether adversarial examples can be used to manipulate COSM into generating any desired mask. To achieve this, we structure the task in two settings: (1) assigning a manually designed mask to a random position and (2) generating a mask taken from another image. We find that the desired mask can be generated in most cases, underscoring the susceptibility of COSM to adversarial attacks. Overall, the contributions of this work are summarized as follows.

  1.

    We conduct the first investigation of attacking COSM with adversarial examples and present a framework for attacking COSM with the objective of reversing its predictions for masked objects and backgrounds.

  2.

    We uncover that COSM is susceptible to adversarial attacks in a complete white-box setting. Furthermore, we demonstrate that the adversary can target the model without prior knowledge of its parameters. In other words, COSM can be partially compromised by adversarial examples in a cross-model setup.

  3.

    Going beyond the primary objective of reversing the predictions for masked objects and backgrounds, we further demonstrate that COSM can be manipulated into generating any desired mask, which further underscores the vulnerability of COSM.

The remainder of this work is structured as follows. “Related work” section provides an overview of related research, summarizing the advancements in COSM and different attack methodologies.

In “Framework for Attack-COSM” section, we present the framework for Attack-COSM, formulating the objective as reversing the predictions of masked objects and backgrounds. In “Main results of Attack-COSM” section, we begin by presenting the results of Attack-COSM in the white-box setting. We then explore transfer-based attacks on COSM. Finally, we investigate techniques for manipulating COSM to generate masks according to specific requirements. “Discussion” section discusses the relationship between attacking label predictions and attacking mask predictions, as well as the limitations of our work.

Related work

In this section, we separately outline the characteristics and classifications of COSM and typical adversarial attack algorithms.

COSM

Numerous projects and papers have explored this topic from various perspectives, and they can be broadly categorized into two groups: single-task and multitask learning. (a) Single-task learning is the most commonly employed paradigm in COS, focusing solely on the segmentation of concealed targets. Within this paradigm, most recent works [9, 10, 19] concentrate on developing attention modules for identifying target regions. (b) Multitask learning introduces an auxiliary task to complement the segmentation task, enhancing the robustness of COS learning. These multitask frameworks can be implemented through various methods, including confidence estimation [20,21,22,23], localization/ranking [11, 24], category prediction [25], depth learning [26, 27], and the boundary [28,29,30,31,32,33] and texture [16, 34] cues of camouflaged objects.

Adversarial attacks

Deep neural networks, including CNNs [26, 35, 36] and vision transformers (ViTs) [37,38,39,40], are well recognized for their vulnerability to adversarial examples. This susceptibility has spurred numerous studies examining model robustness under various types of adversarial attacks. Adversarial attack methods fall into two settings: the white-box setting [17, 41, 42], which allows full access to the target model, and the black-box setting [43,44,45,46,47,48], which primarily relies on the transferability of adversarial examples. Attacks can also be classified as untargeted or targeted. In the context of image recognition (classification), an attack is deemed successful under the untargeted setting if the predicted label differs from the ground-truth label. In the more stringent targeted setting, the attack is considered a failure unless the predicted label matches the predetermined target label. The prior works mentioned above have primarily concentrated on manipulating image-level label predictions for image classification tasks. In contrast, our work considers attacking COSM for the task of predicting the masks of camouflaged targets. Attacking COSM also differs from attacking semantic segmentation models, as the generated masks lack semantic labels. It remains uncertain whether COSM can withstand adversarial attacks.

As shown in Table 1, we summarize and analyze the current typical adversarial example generation methods in terms of attack type, attack target, attack frequency, advantages, and limitations. Single-step denotes a single iteration, while iterative denotes multiple iterations. W stands for white-box attack, B for black-box attack, T for targeted attack, and NT for non-targeted attack.

Table 1 Summary of typical adversarial attack

Framework for Attack-COSM

Inspired by attack methods for image classification, we analyze the differences between the image classification task and the COS task and then derive the attack pipeline for the COS task.

Preliminaries

Mask prediction

As illustrated in Fig. 1, we treat COSM as a class-independent, pixel-wise segmentation task. Formally, let \({\text{I}}\in {\mathbb{R}}^{{\text{H}}\times {\text{W}}\times 3}\) and \({\text{C}}\in {\mathbb{R}}^{{\text{H}}\times {\text{W}}\times 1}\) denote the input image and the output camouflage map, respectively. Given a large collection of such pairs \(\left\{ {{\text{I}}_{\text{i}} ,{\text{C}}_{\text{i}} } \right\}_{{\text{i}} = 1}^{\text{N}}\), the task is to learn a mapping function \({\mathcal{F}}_{\Theta }\), parameterized by weights \(\Theta \), that correctly maps a novel input to its corresponding camouflage map. For each pixel (position) \({\uprho }_{{\text{o}}}\in [1,{\text{H}}\times {\text{W}}]\), the estimated score \(c_{\rho_o } \in [0,1]\) reflects the model's prediction, where a score close to “1” indicates that the pixel belongs to the camouflaged object, and vice versa. Note that for each pixel \({\uprho }_{{\text{o}}}\), COSM produces an intermediate predicted value (logit) \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\), which is passed through a sigmoid function to obtain \({{\text{c}}}_{{\uprho }_{{\text{o}}}}\). Namely, when \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) is positive, \({{\text{c}}}_{{\uprho }_{{\text{o}}}}\) exceeds 0.5 and the pixel is predicted as camouflaged, and vice versa.
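As a concrete illustration, the following minimal PyTorch sketch (the model interface and tensor shapes are illustrative assumptions, not the paper's implementation) shows how the intermediate logits \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) are turned into the camouflage map and a binary mask:

```python
import torch

def predict_mask(model, image):
    """Sketch of COSM inference: image -> logits y -> camouflage map c -> binary mask.

    `model` is assumed to return per-pixel logits of shape (1, 1, H, W).
    """
    y = model(image)                 # intermediate predicted values y_rho (logits)
    c = torch.sigmoid(y)             # camouflage map, each c_rho in [0, 1]
    binary_mask = (y > 0).float()    # positive logit <=> c_rho > 0.5 <=> camouflaged pixel
    return c, binary_mask
```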

Common attack methods

Before introducing Attack-COSM, we begin by revisiting the attack methods commonly used in traditional classification tasks. We define \(f(\cdot ,\uptheta )\) as the target model to be attacked, parameterized by \(\uptheta \). With \(({{\text{X}}}_{{\text{c}}},{{\text{Y}}}_{{\text{c}}} )\) as a data pair from the original dataset, the adversarial image \({{\text{X}}}_{{\text{adv}}}\) is defined as \({{\text{X}}}_{{\text{c}}}+{\updelta }^{*}\), where \({\updelta }^{*}\) is optimized in Eq. 1. More specifically, the attack algorithm is designed to generate the optimal \({\updelta }^{*}\). The set \({\mathbb{S}}\) in the formula represents the range of the perturbation limits. In the context of the classification task, \({{\text{Y}}}_{{\text{c}}}\) typically signifies the class label, and \({{\text{J}}}_{{\text{c}}}\) is the loss function, usually the cross-entropy loss.

$$\updelta^{*} = \underset{\updelta \in \mathbb{S}}{\arg\max}\; J_{c}\left(f(X_{c}+\updelta;\uptheta),\, Y_{c}\right)$$
(1)

The typical attack algorithms listed in Table 1 are used to solve the above equation.
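As a reference point, Eq. 1 can be approximately solved with a single FGSM step in the classification setting; the sketch below is a minimal illustration (the model, data names, and the L-infinity perturbation set \(\mathbb{S}\) with bound \(\epsilon\) are assumed conventions):

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x_clean, y_label, epsilon=8 / 255):
    """One-step FGSM sketch for Eq. 1: take a gradient-sign step that increases J_c."""
    x_adv = x_clean.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y_label)   # J_c(f(X_c + delta; theta), Y_c)
    loss.backward()
    delta = epsilon * x_adv.grad.sign()             # optimal single step under the L-inf budget
    return torch.clamp(x_clean + delta, 0, 1).detach()
```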

Attack-COSM

In typical adversarial attacks targeting image recognition models, the objective is to manipulate the predicted labels at the image level, thereby causing the model to produce inaccurate predictions. From Fig. 3, it can be observed that in adversarial attacks targeting COSM, the objective is to manipulate predicted labels at the pixel level. Additionally, due to the intrinsic similarity between camouflaged objects and background, COSM introduces new detection modules.

Fig. 3

COSM adversarial example generation process

Task definition

Since the masks generated by COSM lack semantic labels, a direct approach to successfully attack COSM is to reverse the predictions for masked objects and backgrounds. In this work, we consider this reversal as the fundamental objective of adversarial attacks on COSM. As described in “Preliminaries” section, a pixel \({\uprho }_{{\text{o}}}\) is classified as masked when the intermediate predicted value \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) is positive; the attack on such a pixel is therefore deemed successful when \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) becomes negative. Conversely, a pixel \({\uprho }_{{\text{o}}}\) is classified as background when \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) is negative, and the attack is considered successful when \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) turns positive.

Loss design

To reverse the predictions of masked objects and background by attacking COSM, the loss should be designed to decrease the predicted values \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) in the masked region until they become negative and to increase the predicted values \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) in the background region until they become positive. As shown in Eq. 2, we achieve this objective by directly increasing the loss value of COSM, thereby degrading the model's prediction accuracy. The loss function \({{\text{J}}}_{{\text{s}}}\) used is BCEWithLogitsLoss. For a dataset used to attack COS, we seek a perturbation \(\updelta \) that maximizes the loss, i.e.,

$$\updelta^{*} = \underset{\updelta \in \mathbb{S}}{\arg\max}\; \sum_{i=1}^{N} J_{s}\left(\mathcal{F}_{\Theta}\left(I_{i}+\updelta_{i};\Theta\right),\, C_{i}\right)$$
(2)

As shown in Eq. 3, unlike in traditional classification tasks, we aggregate the loss values over all pixels to calculate the overall image loss. The loss for each image is:

$$J_{s}\left(\mathcal{F}_{\Theta}\left(I_{i}+\updelta_{i};\Theta\right),\, C_{i}\right) = -\sum_{\uprho_{o}\in\Omega}\left( C_{i}^{\uprho_{o}}\ln\left(c_{\uprho_{o}}\right) + \left(1-C_{i}^{\uprho_{o}}\right)\ln\left(1-c_{\uprho_{o}}\right)\right)$$
(3)

Here, \(\Omega \) represents the spatial composition of all pixels for each image, and \({{\text{C}}}_{{\text{i}}}^{{\uprho }_{{\text{o}}}}\) represents the ground truth at pixel position \({\uprho }_{{\text{o}}}\) for the ith image.
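In code, Eq. 3 corresponds to a binary cross-entropy computed on the logits and summed over all pixels; a minimal PyTorch sketch (tensor names and shapes are illustrative) is:

```python
import torch.nn as nn

# BCEWithLogitsLoss applies the sigmoid internally, so it operates directly on the
# intermediate predicted values y rather than on the camouflage map c.
bce_sum = nn.BCEWithLogitsLoss(reduction="sum")

def cos_attack_loss(logits, gt_mask):
    """Per-image loss J_s of Eq. 3; `logits` and `gt_mask` have shape (1, 1, H, W)."""
    # Maximizing this value pushes masked pixels toward negative logits and
    # background pixels toward positive logits, i.e., the reversal objective.
    return bce_sum(logits, gt_mask)
```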

Attack details

FGSM [17] and PGD [18] are two widely employed methods for assessing model robustness, chosen for their simplicity and effectiveness. FGSM is a single-step attack based on the gradient of the model with respect to the input image. PGD is a multi-step attack; PGD with N iterations is denoted as PGD-N.

We therefore employ the FGSM [17] and PGD [18] attacks, which are commonly used for assessing model robustness in prior studies. Following the established practice of attacking vision models in a white-box scenario, the default maximum perturbation magnitude is set to 8/255. We use a step size of 8/255 for the FGSM attack and 2/255 for the PGD attack. Unless a specific attack method is stated, we default to the PGD-40 attack, where “40” indicates that the attack runs for 40 iterations.
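With these hyperparameters, the PGD-N attack on a COS model can be sketched as follows (a minimal PyTorch illustration reusing the `cos_attack_loss` sketch above; the model interface is an assumption). FGSM corresponds to the special case of a single step with step size equal to the budget:

```python
import torch

def pgd_attack_cosm(model, image, gt_mask, eps=8 / 255, step=2 / 255, n_iter=40):
    """PGD-N sketch for Attack-COSM: ascend the summed BCE loss of Eqs. 2-3
    while projecting the perturbation back into the L-infinity ball of radius eps."""
    x_adv = image.clone().detach()
    for _ in range(n_iter):
        x_adv.requires_grad_(True)
        loss = cos_attack_loss(model(x_adv), gt_mask)              # Eq. 3 on current logits
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step * grad.sign()                     # gradient-sign ascent step
            x_adv = image + torch.clamp(x_adv - image, -eps, eps)  # project onto S
            x_adv = torch.clamp(x_adv, 0, 1)                       # keep a valid image
    return x_adv.detach()
```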

Attack process

Using the PGD attack algorithm as an example, the algorithmic process for attacking a camouflaged object detection model is as follows (Table 2):

Table 2 Flow table of ATTACK-COSM algorithm

Main results of Attack-COSM

We first describe the experimental setup in detail and then evaluate the attack in both the white-box and black-box settings, with reversing the predictions of masked objects and background as the attack objective. Finally, we use our attack framework to enlarge the mask and to generate masks of specified shapes.

Experimental setup

The experiments were conducted on a platform with an Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40 GHz × 96 and an RTX 5000 GPU.

COSM

In the white-box setting, we used the FGSM [17] and PGD [18] attack methods to generate adversarial examples against the SINet model. Subsequently, we performed transferability tests (black-box testing) on six representative COSM algorithms, which are listed in Table 3.

Table 3 Introduction of the six representative camouflaged object detection models

Dataset

We conducted evaluations on two benchmark datasets: CAMO [25] and COD10K [9]. CAMO comprises 1250 camouflaged images spanning various categories, with 1000 images designated for training and 250 for testing. On the other hand, COD10K is presently the largest benchmark dataset, featuring 5066 camouflaged images sourced from multiple photography websites. It includes 3040 images for training and 2026 for testing, covering 5 superclasses and 69 subclasses. In line with prior research [9], we utilized the combined training sets of CAMO and COD10K, totaling 4040 images, and the testing set of COD10K for our evaluations.

Evaluation metrics

In the experiment, we employed four well-established evaluation metrics.

Structure measure (\({S}_{\alpha }\)) [49] is used to measure the structural similarity between a non-binary prediction map Y and a ground-truth mask C:

$${S}_{\alpha }=(1-{\upalpha }){S}_{o}({\text{Y}},{\text{C}})+{\upalpha }{S}_{r}({\text{Y}},{\text{C}})$$
(5)

where \({\upalpha }\) balances the object-aware similarity \({S}_{o}\) and the region-aware similarity \({S}_{r}\); we set \(\alpha = 0.5\).

MAE (mean absolute error, M) is a conventional pixel-wise measure, which is defined as:

$$ M = \frac{1}{W \times H}\sum_x^W {\sum_y^H {\left| {Y(x,y) - C(x,y)} \right|} } $$
(6)

where (x, y) are pixel coordinates in C.
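As an example, MAE is straightforward to compute from the prediction and ground-truth maps (a minimal sketch, assuming both are float tensors in [0, 1] of shape (H, W)):

```python
import torch

def mae(pred, gt):
    # Eq. 6: mean absolute error over all W x H pixels.
    return (pred - gt).abs().mean().item()
```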

Enhanced-alignment measure (\(E_\phi\)) [50] is a recently proposed binary foreground evaluation metric, which considers both local and global similarity between two binary maps. Its formulation is defined as follows:

$$ E_\phi = \frac{1}{W \times H}\sum_x^W {\sum_y^H {\varphi [Y(x,y),C(x,y)]} } $$
(7)

where \(\varphi\) is the enhanced-alignment matrix.

Weighted F-measure (\(F_\beta^\omega\)) [51] can be defined as:

$$ F_\beta^\omega = (1 + \beta^2 )\frac{P^\omega \cdot R^\omega }{{\beta^2 \cdot P^\omega + R^\omega }} $$
(8)

\(P^\omega\) represents weighted Precision, which measures exactness, while \(R^\omega\) denotes weighted Recall, measuring completeness. \(\beta\) indicates the effectiveness of detection concerning a user who assigns \(\beta\) times as much importance to \(R^\omega\) as to \(P^\omega\).

Main results under white-box settings

As part of the basic setup, we initially attacked COSM with the objective of reversing the predictions for masked objects and the background, as discussed in “Attack-COSM” section. The attack is considered successful if the precision of \({{\text{Mask}}}_{{\text{adv}}}\) is significantly smaller than the precision of \({{\text{Mask}}}_{{\text{clean}}}\).

Qualitative results under white-box settings

For our white-box attack testing, we selected the first deep learning-based COSM algorithm, the SINet model. We present partial visualization results of adversarial images and predicted masks in Fig. 4. The attack produces adversarial images with imperceptible perturbations under both FGSM and PGD (refer to Fig. 4b, c). While COSM generates a high-quality \({{\text{Mask}}}_{{\text{clean}}}\) in Fig. 4e, both the FGSM and PGD attacks effectively reverse the predictions for masked objects and the background, as shown in particular by the extensive white area of \({{\text{Mask}}}_{{\text{pgd}}}\) in Fig. 4g. Figure 4 demonstrates that COSM is susceptible to adversarial attacks, with the PGD attack outperforming the FGSM attack in the context of the COS task. In the experiment, we also observed a phenomenon visible in Fig. 4g: the attack effectively flips background predictions into foreground, but is less successful at flipping foreground predictions into background. By examining the values of \({Mask}_{clean}\), we found that the output value \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) for the foreground is approximately 20, while the output value \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) for the background is approximately − 5. Therefore, under iterative attacks, background pixels can be pushed into being predicted as foreground much more quickly.

Fig. 4

Attacking the SINet model to reverse the predictions of masked objects and background. a represents the clean image; b, c show adversarial images generated by the FGSM and PGD attacks, respectively; e–g represent masks predicted by COSM based on the images shown in a–c, respectively

Quantitative results under white-box settings

We present the evaluation metric results following the various attacks in Table 4. With the proposed loss function, the detection accuracy of the SINet model drops significantly after the PGD attack (e.g., \({{\text{E}}}_{{\upvarphi }}\) drops from 0.817 to 0.279). Although the FGSM attack also causes a noticeable decrease in detection accuracy compared to the unattacked model, it is less effective than the PGD attack due to its weaker attack strength. This indicates that it is difficult for the FGSM attack to cause a significant change in the predicted \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) and the resulting label value within a single attack step. This result is consistent with the visualization in Fig. 4.

Table 4 Results of the change in detection accuracy after attacking the SINet model. Both the FGSM and PGD-40 attacks result in significantly lower detection accuracy compared to the setting with no attack, and PGD-40 results in the lowest detection accuracy

Main results under black-box settings

“Main results under white-box settings” section demonstrates that COSM is vulnerable to adversarial attacks in a full white-box setting. This naturally raises the question: is COSM resilient to transfer-based attacks? In the black-box setting, the attacker does not have access to all the necessary information about the target model. In this section, we use the adversarial examples generated by attacking the SINet model in “Main results under white-box settings” section to attack the other six COSM algorithms and assess the performance of transfer-based attacks.

Qualitative results under black-box settings

As shown in Fig. 5, even though \({{\text{x}}}_{{\text{fgsm}}}\) and \({{\text{x}}}_{{\text{pgd}}}\) are generated by attacking the SINet model, they can still successfully attack other models. This can be observed by comparing \({{\text{Mask}}}_{{\text{fgsm}}}\) and \({{\text{Mask}}}_{{\text{pgd}}}\) in Fig. 5d, e to \({{\text{Mask}}}_{{\text{clean}}}\) in Fig. 5c. Comparing \({{\text{Mask}}}_{{\text{pgd}}}\) in Fig. 5e to \({{\text{Mask}}}_{{\text{fgsm}}}\) in Fig. 5d, we found that in the COS task, PGD’s transfer attack performance is better than that of FGSM.

Fig. 5

Masks predicted in the cross-model transfer task

Quantitative results under black-box settings

We present the evaluation metric results under black-box settings in Table 5. After using the two categories of adversarial examples, \({{\text{x}}}_{{\text{fgsm}}}\) and \({{\text{x}}}_{{\text{pgd}}}\), generated in “Qualitative results under white-box settings” section to attack the six representative camouflaged object detection models, the accuracy of the COSM algorithms decreased. This indicates that adversarial examples have transfer attack capabilities in the COS task. Once more, in the black-box setting for COS, the PGD attack is more effective than the FGSM attack (e.g., in the last row, \({{\text{E}}}_{{\upvarphi }}\) drops from 0.910 to 0.746 and 0.698 when the input is \({{\text{x}}}_{{\text{fgsm}}}\) and \({{\text{x}}}_{{\text{pgd}}}\), respectively). This result is consistent with the visualization in Fig. 5.

Table 5 Results of the change in detection accuracy under black-box settings. Both the FGSM and PGD-40 attacks result in significantly lower detection accuracy compared to the setting with no attack, and PGD-40 results in the lowest detection accuracy

Beyond reversing the predictions of masked objects and background

In the above sections, our primary focus was on reversing the predictions of masked objects and background. Here, we consider a more intriguing scenario, which involves using adversarial examples to generate any desired masks. Conceptually, the goal is to create entirely new masks rather than simply reversing the predictions of masked objects and background, as discussed above.

Mask enlargement

After investigating the mask reversal attack, it is natural to ask whether it is possible to add new masks to a segmentation map. In our preliminary investigation, we begin by attempting to enlarge the mask area without considering the shape or size of the original mask. To enlarge masks through an attack on COSM, the loss should be designed to increase the predicted values \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) until they become positive. To mitigate randomness, the goal is to ensure that the predicted values \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) are significantly higher than zero rather than only slightly higher. To achieve this, the mean squared error (MSE) loss with a positive threshold is a suitable choice.

We define \({\mathcal{H}}_{\Theta }\left({{\text{I}}}_{{\text{i}}};\Theta \right)={\text{y}}\), where y represents the output value \({{\text{y}}}_{{\uprho }_{{\text{o}}}}\) for all pixels of each image and \({\text{Sigmoid}}({\mathcal{H}}_{\Theta }\left({{\text{I}}}_{{\text{i}}};\Theta \right))={\mathcal{F}}_{\Theta }\left({{\text{I}}}_{{\text{i}}};\Theta \right)\). As shown in Eq. 9, the predicted value \({\mathcal{H}}_{\Theta }\left({{\text{I}}}_{{\text{i}}}+\updelta ;\Theta \right)\) is optimized to be close to a positive threshold \({{\text{P}}}_{t}\) after the attack. In the extreme case where \({\mathcal{H}}_{\Theta }\left({{\text{I}}}_{{\text{i}}}+\updelta ;\Theta \right)={{\text{P}}}_{t}\) for all predicted values y, the MSE loss reaches its minimum: zero.

$$\updelta^{*} = \underset{\updelta \in \mathbb{S}}{\arg\min}\; \left\| \mathcal{H}_{\Theta}\left(I_{i}+\updelta;\Theta\right) - P_{t} \right\|^{2}$$
(9)

The parameter \({{\text{P}}}_{{\text{t}}}\) is set to 20 in this experiment. We visualize the result of mask enlargement in Fig. 6. The experimental results show that the mask of the adversarial image, \({Mask}_{pgd-160}\), is much larger than \({Mask}_{clean}\) (compare Fig. 6g with Fig. 6c). This indicates that the adversarial attack is capable not only of reversing the predictions of the mask and background but also of enlarging the mask. This motivates us to explore attacking COSM to generate any desired mask.
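A minimal sketch of this enlargement objective (Eq. 9) is shown below, assuming access to the pre-sigmoid map \(\mathcal{H}_{\Theta}\); note that, unlike the reversal attack, this loss is minimized, so the PGD update follows the negative gradient sign:

```python
import torch
import torch.nn.functional as F

def enlargement_loss(logits, p_t=20.0):
    # Eq. 9: pull every pixel's logit toward the positive threshold P_t so that the
    # entire map is predicted as foreground with high confidence.
    target = torch.full_like(logits, p_t)
    return F.mse_loss(logits, target)
```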

Fig. 6

Results of the mask enlargement attack. \({Mask}_{clean}\) in c and \({Mask}_{adv}\) in d–g are generated on \({x}_{clean}\) and \({x}_{pgd}\) in a, b, respectively. The results demonstrate that the mask predicted by COSM can be enlarged through the adversarial attack. With the number of iterations increasing from 40 to 160, the effectiveness of the attack improves

Generating any desired mask

Setting 1: Manually designed mask at a random position

In this setting, we explore whether an adversarial attack can generate manually designed masks at random positions. To maintain generality, we design masks in the form of geometric shapes, including circles and squares. Figure 7 illustrates that this goal can be achieved by setting the mask target to a circle or a square at a random position when generating \({x}_{pgd}^{circle}\) and \({x}_{pgd}^{square}\) in Fig. 7a, b, respectively. Although the clean input image yields a fish mask, as in \({Mask}_{clean}\) of Fig. 7c, the desired circle and square masks are obtained in \({Mask}_{adv}^{circle}\) and \({Mask}_{adv}^{square}\). Manually designing masks more complex than circles or squares can be challenging. Therefore, we further explore using the real object masks generated by COSM as the target masks to attack COSM (see Setting 2).
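The target masks for Setting 1 can be built procedurally; the sketch below constructs a filled circle at a random position (the shape, radius, and tensor layout are illustrative choices), which then replaces the ground-truth map \(C_i\) in a targeted, loss-minimizing attack:

```python
import torch

def circle_target_mask(h, w, radius=40):
    """Binary target mask with a filled circle at a random position (Setting 1 sketch)."""
    cy = torch.randint(radius, h - radius, (1,)).item()
    cx = torch.randint(radius, w - radius, (1,)).item()
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    mask = ((ys - cy) ** 2 + (xs - cx) ** 2 <= radius ** 2).float()
    return mask.view(1, 1, h, w)     # same layout as the model's logit map
```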

Fig. 7

Generating any desired masks (Setting 1). \({Mask}_{adv}^{circle}\) and \({Mask}_{adv}^{square}\) in e, g are generated from \({x}_{pgd}^{circle}\) and \({x}_{pgd}^{square}\) in a, b, respectively. With the adversarial attack, the manually designed masks in d, f can be generated at random positions

Setting 2: A mask generated on a different image

We investigate this setting with two example images in Fig. 8. Taking the first row of Fig. 8 as an example, a tuna mask in Fig. 8c is predicted from the clean image \({x}_{clean}\) in Fig. 8a. We take the batfish mask in Fig. 8d from the second row as the mask target and attack the \({x}_{clean}\) image of the tuna in Fig. 8a; the resulting adversarial image \({x}_{adv}\) of the tuna is shown in Fig. 8b. Interestingly, a batfish mask, \({Mask}_{adv}\), is predicted in Fig. 8e from \({x}_{adv}\) of the tuna image in Fig. 8b. A similar observation can be made in the second row of Fig. 8, that is, a tuna mask is predicted in \({Mask}_{adv}\) in Fig. 8e from \({x}_{adv}\) of the batfish image in Fig. 8b.
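Setting 2 can be sketched by first predicting a mask on a different image and then running a targeted (loss-minimizing) PGD toward that mask; the following is a minimal illustration under the same assumptions as the earlier sketches:

```python
import torch
import torch.nn as nn

bce_sum = nn.BCEWithLogitsLoss(reduction="sum")

def targeted_mask_attack(model, image, target_image, eps=8 / 255, step=2 / 255, n_iter=40):
    """Setting-2 sketch: drive `image` toward the mask predicted on `target_image`."""
    with torch.no_grad():
        target_mask = (model(target_image) > 0).float()        # e.g., the batfish mask
    x_adv = image.clone().detach()
    for _ in range(n_iter):
        x_adv.requires_grad_(True)
        loss = bce_sum(model(x_adv), target_mask)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv - step * grad.sign()                 # descend: targeted attack
            x_adv = image + torch.clamp(x_adv - image, -eps, eps)
            x_adv = torch.clamp(x_adv, 0, 1)
    return x_adv.detach()
```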

Fig. 8

Generating any desired masks (Setting 2)

Discussion

Attack goals: label prediction versus mask prediction

In contrast to existing works that mainly focus on attacking models to change label predictions, our work investigates how to attack COSM to alter the mask predictions of camouflaged targets. Our investigation of reversing the predictions of masked objects and background and of generating any desired mask is conceptually similar to a targeted attack setting.

Limitations

We propose an adversarial example attack framework tailored for the COS task. Leveraging this framework, we conducted numerous experimental setups. The results of these experiments confirm the susceptibility of COSM to adversarial example attacks, thereby highlighting its weak robustness. Some of the experimental results also point the way for future research. For example, in the field of COS, PGD has a stronger attack capability than FGSM in both the white-box and black-box settings. Subsequent research should aim to uncover the internal mechanisms behind this, laying the groundwork for the development of more potent attack methods. In some challenging scenarios, such as the mask enlargement task, success is only partial when the number of iterations is less than 160. However, increasing the number of iterations requires more time. To address this issue, future research could explore ways to enhance attack performance by designing a more effective loss function. Furthermore, we have only explored the robustness of COSM in the digital realm and have not investigated its robustness in the physical world, where factors such as lighting, weather, and sensor influences come into play. In our next steps, we will conduct further research in the domain of physical adversarial attacks.

Conclusion

Our work represents the first investigation into attacking COSM with adversarial examples. In the full white-box setting, we discovered that COSM is vulnerable, as we successfully reversed the predictions of masked objects and background. We also experimented with cross-model transferability and found that the adversarial examples generated by attacking the SINet model can successfully attack other models. Beyond the fundamental goal of reversing the predictions of masked objects and background, we also generated arbitrary desired masks with an overall satisfactory level of success. Our primary aim is not to discover the most potent method for attacking COSM. Instead, we concentrate on adapting common attack methods, transitioning from attacking label predictions to targeting mask predictions, to assess the robustness of COSM against adversarial examples. The discovery that COSM is susceptible to adversarial examples underscores the importance of investigating the security implications of deploying COSM in safety-critical applications. In the future, we will continue to explore the following aspects: (1) the attack method in this paper does not explicitly optimize for transferability, so stronger black-box attack capabilities will be studied; (2) the impact of attack parameters on attack performance will be explored in depth; (3) defense techniques will be researched to improve the robustness of COSM; and (4) the robustness of COSM against physical-world attacks will be studied.