Domain-Agnostic Priors for Semantic Segmentation Under Unsupervised Domain Adaptation and Domain Generalization
International Journal of Computer Vision (IF 19.5), Pub Date: 2024-04-27, DOI: 10.1007/s11263-024-02041-7
Xinyue Huo, Lingxi Xie, Hengtong Hu, Wengang Zhou, Houqiang Li, Qi Tian

In computer vision, an important challenge to deep neural networks comes from adapting to the varying properties of different image domains. To study this problem, researchers have been investigating a practical setting in which the deep neural networks are trained on a labeled source domain and then transferred to an unlabeled or even unseen target domain. The major difficulty lies in the potential domain gap, which essentially arises from overfitting to the source domain. Hence, it is important to introduce generalized priors to alleviate the issue. From this perspective, this paper presents a novel framework that forces visual features to align with domain-agnostic priors (DAP). Specifically, we study two kinds of priors, (i) language-guided embedding and (ii) class-level relationship, and we believe that more such priors can be constructed. Our framework, referred to as DAP, is evaluated on both unsupervised domain adaptation (UDA) and domain generalization (DG), where the target domain is unlabeled and unseen, respectively. We use the standard benchmark that transfers semantic segmentation from synthesized datasets (i.e., GTAv and SYNTHIA) to a real dataset (i.e., Cityscapes). Experiments validate the effectiveness of DAP with competitive accuracy in all tasks. In particular, language-guided priors work sufficiently well for UDA, while class-level priors serve as useful complements for DG. The proposed framework suggests that domain transfer benefits from better proxies, possibly from other modalities.
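
The abstract does not give the loss that DAP uses, so the following is only a minimal sketch of what "aligning visual features with fixed class priors" could look like, not the authors' implementation. It assumes a softmax cross-entropy over cosine similarities between per-pixel features and one fixed embedding per class. The function name `alignment_loss`, the `temperature` parameter, and the one-hot `priors` matrix (a stand-in for language-derived embeddings) are all hypothetical.

```python
import numpy as np

def alignment_loss(features, labels, class_priors, temperature=0.1):
    """Cross-entropy over cosine similarities to fixed class priors.

    features:     (N, D) array, one feature vector per pixel
    labels:       (N,) integer class ids for those pixels
    class_priors: (C, D) array of fixed, non-trainable class embeddings
                  (assumed stand-in for language-guided embeddings)
    """
    # L2-normalize so the dot product below is a cosine similarity.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = class_priors / np.linalg.norm(class_priors, axis=1, keepdims=True)
    logits = f @ p.T / temperature                      # shape (N, C)
    # Numerically stable log-softmax over the class axis.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Mean negative log-likelihood of the labeled class.
    return float(-log_probs[np.arange(len(labels)), labels].mean())

# Toy setup: 3 classes with mutually orthogonal priors in an 8-D space.
priors = np.eye(3, 8)
labels = np.array([0, 1, 2, 1])

# Features that coincide with their class prior versus a wrong class prior.
aligned_loss = alignment_loss(priors[labels], labels, priors)
misaligned_loss = alignment_loss(priors[(labels + 1) % 3], labels, priors)
```

In this toy setup the loss is small when each feature points along its own class prior and large when it points along another class's prior. Because the priors stay fixed and do not depend on any image domain, minimizing such a loss on source data pulls features toward domain-independent targets, which is the intuition the abstract describes.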




Updated: 2024-04-27