Domain-Agnostic Priors for Semantic Segmentation Under Unsupervised Domain Adaptation and Domain Generalization

International Journal of Computer Vision

Abstract

In computer vision, an important challenge for deep neural networks is adapting to the varying properties of different image domains. To study this problem, researchers have investigated a practical setting in which a deep neural network is trained on a labeled source domain and then transferred to an unlabeled or even unseen target domain. The major difficulty lies in the potential domain gap, which essentially arises from overfitting to the source domain. Hence, it is important to introduce generalized priors to alleviate the issue. From this perspective, this paper presents a novel framework that forces visual features to align with domain-agnostic priors (DAP). Specifically, we study two kinds of priors, (i) language-guided embeddings and (ii) class-level relationships, and we believe that more such priors can be constructed. Our framework, referred to as DAP, is evaluated on both unsupervised domain adaptation (UDA) and domain generalization (DG), where the target domain is unlabeled and unseen, respectively. We use the standard benchmark that transfers semantic segmentation from synthesized datasets (i.e., GTAv and SYNTHIA) to a real dataset (i.e., Cityscapes). Experiments validate the effectiveness of DAP, which achieves competitive accuracy in all tasks. In particular, language-guided priors work sufficiently well for UDA, while class-level priors serve as useful complements for DG. These results suggest that domain transfer benefits from better proxies, possibly from other modalities.
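To make the idea of aligning visual features with a language-guided prior more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each class has a fixed embedding vector (for example, one produced by a text encoder) and that each pixel has a feature vector of the same dimension. The function names `cosine` and `prior_alignment_loss`, the `temperature` parameter, and the toy vectors are all hypothetical choices made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def prior_alignment_loss(pixel_features, labels, class_priors, temperature=0.1):
    """Mean cross-entropy that pulls each pixel feature toward the fixed
    prior embedding of its labeled class (illustrative assumption only)."""
    total = 0.0
    for feat, label in zip(pixel_features, labels):
        # Similarity of this pixel to every class prior, scaled by temperature.
        logits = [cosine(feat, prior) / temperature for prior in class_priors]
        # Numerically stable log-sum-exp.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_sum - logits[label]
    return total / len(pixel_features)

# Toy example: two classes with hypothetical 2-D prior embeddings.
class_priors = [[1.0, 0.0], [0.0, 1.0]]
features = [[0.9, 0.1], [0.2, 0.8]]
aligned = prior_alignment_loss(features, [0, 1], class_priors)
misaligned = prior_alignment_loss(features, [1, 0], class_priors)
```

In this toy setting the loss is lower when the pixel features point toward the priors of their labeled classes than when the labels are swapped, which is the behavior an alignment objective of this kind is meant to encourage. Because the priors are fixed and do not depend on any image domain, the same targets apply to source and target features alike.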

Data Availability

The datasets used in this work are publicly available and can be downloaded by following the papers that introduce them, which are listed in Sect. 4.1.

Code Availability

The code is available at this URL: https://github.com/xinyuehuo/DAP.

Notes

  1. https://platform.openai.com/tokenizer.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China under Contract 62021001, and in part by the Fundamental Research Funds for the Central Universities under Contract WK3490000007. It was also supported by the GPU cluster built by the MCC Lab of USTC.

Author information

Corresponding author

Correspondence to Lingxi Xie.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article.

Additional information

Communicated by Oliver Zendel.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huo, X., Xie, L., Hu, H. et al. Domain-Agnostic Priors for Semantic Segmentation Under Unsupervised Domain Adaptation and Domain Generalization. Int J Comput Vis (2024). https://doi.org/10.1007/s11263-024-02041-7

  • DOI: https://doi.org/10.1007/s11263-024-02041-7
