Imbalance-Aware Discriminative Clustering for Unsupervised Semantic Segmentation, International Journal of Computer Vision


International Journal of Computer Vision (IF 19.5), Pub Date: 2024-05-14, DOI: 10.1007/s11263-024-02083-x
Mingyuan Liu, Jicong Zhang, Wei Tang

Unsupervised semantic segmentation (USS) aims at partitioning an image into semantically meaningful segments by learning from a collection of unlabeled images. The effectiveness of current approaches is plagued by difficulties in coordinating representation learning and pixel clustering, modeling the varying feature distributions of different classes, handling outliers and noise, and addressing the pixel class imbalance problem. This paper introduces a novel approach, termed Imbalance-Aware Dense Discriminative Clustering (IDDC), for USS, which addresses all these difficulties in a unified framework. Different from existing approaches, which learn USS in two stages (i.e., generating and updating pseudo masks, or refining and clustering embeddings), IDDC learns pixel-wise feature representation and dense discriminative clustering in an end-to-end and self-supervised manner, through a novel objective function that transfers the manifold structure of pixels in the embedding space of a vision Transformer (ViT) to the label space while tolerating the noise in pixel affinities. During inference, the trained model directly outputs the classification probability of each pixel conditioned on the image. In addition, this paper proposes a new regularizer, based on the Weibull function, to handle pixel class imbalance and cluster degeneration in a single shot. Experimental results demonstrate that IDDC significantly outperforms all previous USS methods on three real-world datasets, COCO-Stuff-27, COCO-Stuff-171, and Cityscapes. Extensive ablation studies validate the effectiveness of each design. Our code is available at https://github.com/MY-LIU100101/IDDC.
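The abstract mentions three ingredients that a toy example can make concrete: pixel affinities in an embedding space, per-pixel classification probabilities at inference, and pixel class imbalance. The sketch below is not the IDDC objective or its Weibull-based regularizer, neither of which is specified in the abstract. It assumes cosine similarity as the affinity measure and a softmax over per-pixel cluster logits, and every function name and number in it is hypothetical.

```python
import math


def cosine_affinity(embeddings):
    """Pairwise cosine similarity between pixel embeddings (assumed affinity measure)."""
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # guard against zero vectors
        normed.append([v / norm for v in vec])
    return [[sum(a * b for a, b in zip(u, w)) for w in normed] for u in normed]


def softmax(logits):
    """Numerically stable softmax over one pixel's cluster logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cluster_proportions(probs):
    """Average soft assignment per cluster; skewed values indicate class imbalance."""
    n = len(probs)
    k = len(probs[0])
    return [sum(p[c] for p in probs) / n for c in range(k)]


# Toy data (hypothetical): 4 pixels, 2-D embeddings, 3 candidate clusters.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
logits = [[2.0, 0.1, 0.0], [1.8, 0.2, 0.1], [0.0, 2.1, 0.1], [1.5, 0.3, 0.2]]

affinity = cosine_affinity(embeddings)          # which pixels lie close in embedding space
probs = [softmax(row) for row in logits]        # per-pixel classification probabilities
labels = [max(range(len(p)), key=p.__getitem__) for p in probs]  # hard segmentation labels
proportions = cluster_proportions(probs)        # soft cluster sizes
```

In this toy setup the third cluster receives almost no probability mass. That is the kind of imbalance or degenerate cluster a regularizer on cluster proportions would need to account for.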


Updated: 2024-05-14
