-
Deep Boosting Learning: A Brand-new Cooperative Approach for Image-Text Matching IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-05-07 Haiwen Diao, Ying Zhang, Shang Gao, Xiang Ruan, Huchuan Lu
-
Tensorized Multi-View Low-Rank Approximation Based Robust Hand-Print Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-05-06 Shuping Zhao, Lunke Fei, Bob Zhang, Jie Wen, Pengyang Zhao
Hand-print recognition, i.e., palmprint, finger-knuckle-print (FKP), and hand-vein recognition, offers significant advantages in user convenience and hygiene, and has therefore attracted increasing enthusiasm from researchers. Seeking to handle the long-standing interference factors in hand-print images, i.e., noise, rotation, and shadow, multi-view hand-print representation has been proposed to enhance the feature expression
-
LEAPSE: Learning Environment Affordances for 3D Human Pose and Shape Estimation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-05-06 Fangzheng Tian, Sungchan Kim
We live in a 3D world where people interact with each other in the environment. Learning 3D posed humans therefore requires us to perceive and interpret these interactions. This paper proposes LEAPSE, a novel method that learns salient instance affordances for estimating a posed body from a single RGB image in a non-parametric manner. Existing methods mostly ignore the environment and estimate the
-
LSSVC: A Learned Spatially Scalable Video Coding Scheme IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-05-06 Yifan Bian, Xihua Sheng, Li Li, Dong Liu
Traditional block-based spatially scalable video coding has been studied for over twenty years. While significant advancements have been made, the scope for further improvement in compression performance is limited. Inspired by the success of learned video coding, we propose an end-to-end learned spatially scalable video coding scheme, LSSVC, which provides a new solution for scalable video coding
-
Multi-View Time-Series Hypergraph Neural Network for Action Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-05-03 Nan Ma, Zhixuan Wu, Yifan Feng, Cheng Wang, Yue Gao
Recently, action recognition has attracted considerable attention in the field of computer vision. In dynamic circumstances and against complicated backgrounds, problems such as object occlusion, insufficient light, and weak correlation of human body joints can make the accuracy of skeleton-based human action recognition very low. To address this issue, we propose a Multi-View Time-Series
-
Multi-Stage Image-Language Cross-Generative Fusion Network for Video-Based Referring Expression Comprehension IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-05-02 Yujia Zhang, Qianzhong Li, Yi Pan, Xiaoguang Zhao, Min Tan
Video-based referring expression comprehension is a challenging task that requires locating the referred object in each video frame of a given video. While many existing approaches treat this task as an object-tracking problem, their performance is heavily reliant on the quality of the tracking templates. Furthermore, when there is not enough annotation data to assist in template selection, the tracking
-
Relationship Learning From Multisource Images via Spatial-Spectral Perception Network IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-05-02 Yunhao Gao, Wei Li, Junjie Wang, Mengmeng Zhang, Ran Tao
Advances in multisource remote sensing have enabled more comprehensive observation. The adoption of deep convolutional neural networks (CNNs) naturally incorporates spatial-spectral information and has achieved promising performance in multisource data classification. However, challenges remain in extracting spatial distributions and spectral relationships, which
-
Deep Feature Statistics Mapping for Generalized Screen Content Image Quality Assessment IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-05-01 Baoliang Chen, Hanwei Zhu, Lingyu Zhu, Shiqi Wang, Sam Kwong
The statistical regularities of natural images, referred to as natural scene statistics, play an important role in no-reference image quality assessment. However, it has been widely acknowledged that screen content images (SCIs), which are typically computer generated, do not hold such statistics. Here we make the first attempt to learn the statistics of SCIs, based upon which the quality of SCIs can
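The natural-scene-statistics features this abstract builds on are commonly computed as mean-subtracted contrast-normalized (MSCN) coefficients, as in BRISQUE-style no-reference IQA. A minimal sketch, assuming a box window in place of the usual Gaussian (window size and the stabilizing constant `C` are illustrative choices, not taken from the paper):

```python
import numpy as np

def mscn(image, win=7, C=1.0):
    """Mean-subtracted contrast-normalized coefficients of a grayscale image.

    A box window stands in for the Gaussian window used in BRISQUE-style
    NSS features; both yield near-Gaussian MSCN maps on natural images.
    """
    img = image.astype(np.float64)
    pad = win // 2
    padded = np.pad(img, pad, mode="reflect")
    kernel = np.ones(win) / win
    # local mean via a separable box filter
    mu = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    mu = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, mu)
    # local variance from E[x^2] - E[x]^2
    sq = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded ** 2)
    sq = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, sq)
    sigma = np.sqrt(np.maximum(sq - mu ** 2, 0.0))
    return (img - mu) / (sigma + C)
```

On natural images the resulting coefficients are approximately Gaussian; the abstract's point is that computer-generated screen content breaks this regularity.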
-
Occlusion-Aware Transformer With Second-Order Attention for Person Re-Identification IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-30 Yanping Li, Yizhang Liu, Hongyun Zhang, Cairong Zhao, Zhihua Wei, Duoqian Miao
Person re-identification (ReID) typically encounters varying degrees of occlusion in real-world scenarios. While previous methods have addressed this using handcrafted partitions or external cues, they often compromise semantic information or increase network complexity. In this paper, we propose a new method from a novel perspective, termed OAT. Specifically, we first use a Transformer backbone
-
QueryTrack: Joint-Modality Query Fusion Network for RGBT Tracking IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-30 Huijie Fan, Zhencheng Yu, Qiang Wang, Baojie Fan, Yandong Tang
Existing RGB-Thermal trackers usually treat intra-modal feature extraction and inter-modal feature fusion as two separate processes, so the mutual promotion of extraction and fusion is neglected. As a result, the complementary advantages of RGB-T fusion are not fully exploited, and the independent feature extraction is not adaptive to modal quality fluctuations during tracking. To address the limitations
-
Learning to Recover Spectral Reflectance From RGB Images IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-30 Dong Huo, Jian Wang, Yiming Qian, Yee-Hong Yang
This paper tackles spectral reflectance recovery (SRR) from RGB images. Since capturing ground-truth spectral reflectance and camera spectral sensitivity is challenging and costly, most existing approaches are trained on synthetic images and use the same parameters for all unseen testing images, which is suboptimal, especially when the trained models are tested on real images because they never
-
Quality-Aware Selective Fusion Network for V-D-T Salient Object Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-30 Liuxin Bao, Xiaofei Zhou, Xiankai Lu, Yaoqi Sun, Haibing Yin, Zhenghui Hu, Jiyong Zhang, Chenggang Yan
Depth images and thermal images contain the spatial geometry information and surface temperature information, which can act as complementary information for the RGB modality. However, the quality of the depth and thermal images is often unreliable in some challenging scenarios, which will result in the performance degradation of the two-modal based salient object detection (SOD). Meanwhile, some researchers
-
Anisotropic Scale-Invariant Ellipse Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-29 Zikai Wang, Baojiang Zhong, Kai-Kuang Ma
Detecting ellipses is a challenging low-level task indispensable to many image analysis applications. Existing ellipse detection methods commonly encounter two fundamental issues. First, detection accuracy can be lower for a small ellipse than for a large one; this introduces the scale issue. Second, detection accuracy can be lower along the minor axis than along the
-
Multi-Label Action Anticipation for Real-World Videos With Scene Understanding IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-25 Yuqi Zhang, Xiucheng Li, Hao Xie, Weijun Zhuang, Shihui Guo, Zhijun Li
With human action anticipation becoming an essential tool for many practical applications, there has been an increasing trend in developing more accurate anticipation models in recent years. Most of the existing methods target standard action anticipation datasets, in which they could produce promising results by learning action-level contextual patterns. However, the over-simplified scenarios of standard
-
Fine-Grained Recognition With Learnable Semantic Data Augmentation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-25 Yifan Pu, Yizeng Han, Yulin Wang, Junlan Feng, Chao Deng, Gao Huang
Fine-grained image recognition is a longstanding computer vision challenge that focuses on differentiating objects belonging to multiple subordinate categories within the same meta-category. Since images belonging to the same meta-category usually share similar visual appearances, mining discriminative visual cues is the key to distinguishing fine-grained categories. Although commonly used image-level
-
Mitigating Search Interference With Task-Aware Nested Search IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-24 Jiho Lee, Eunwoo Kim
Neural Architecture Search (NAS) has emerged as a promising tool in the field of AutoML for designing more accurate and efficient architectures. The majority of NAS works employ a weight-sharing technique to reduce the search cost by sharing the weights of a supernet, which is a composite of all architectures produced from the search space. Nonetheless, this method has a significant drawback in that
-
CS2DIPs: Unsupervised HSI Super-Resolution Using Coupled Spatial and Spectral DIPs IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-24 Yuan Fang, Yipeng Liu, Chong-Yung Chi, Zhen Long, Ce Zhu
In recent years, fusing high spatial resolution multispectral images (HR-MSIs) and low spatial resolution hyperspectral images (LR-HSIs) has become a widely used approach to hyperspectral image super-resolution (HSI-SR). Various unsupervised HSI-SR methods based on the deep image prior (DIP) have gained wide popularity because they require no pre-training. However, DIP-based methods often demonstrate
-
Multi-Stage Network With Geometric Semantic Attention for Two-View Correspondence Learning IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-24 Shuyuan Lin, Xiao Chen, Guobao Xiao, Hanzi Wang, Feiran Huang, Jian Weng
The removal of outliers is crucial for establishing correspondence between two images. However, when the proportion of outliers reaches nearly 90%, the task becomes highly challenging. Existing methods face limitations in effectively utilizing geometric transformation consistency (GTC) information and incorporating geometric semantic neighboring information. To address these challenges, we propose
-
Model-Based Explainable Deep Learning for Light-Field Microscopy Imaging IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-24 Pingfan Song, Herman Verinaz Jadan, Carmel L. Howe, Amanda J. Foust, Pier Luigi Dragotti
In modern neuroscience, observing the dynamics of large populations of neurons is a critical step of understanding how networks of neurons process information. Light-field microscopy (LFM) has emerged as a type of scanless, high-speed, three-dimensional (3D) imaging tool, particularly attractive for this purpose. Imaging neuronal activity using LFM calls for the development of novel computational approaches
-
Graph-Represented Distribution Similarity Index for Full-Reference Image Quality Assessment IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-24 Wenhao Shen, Mingliang Zhou, Jun Luo, Zhengguo Li, Sam Kwong
In this paper, we propose a graph-represented image distribution similarity (GRIDS) index for full-reference (FR) image quality assessment (IQA), which can measure the perceptual distance between distorted and reference images by assessing the disparities between their distribution patterns under a graph-based representation. First, we transform the input image into a graph-based representation, which
-
Learning Contrast-Enhanced Shape-Biased Representations for Infrared Small Target Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-24 Fanzhao Lin, Kexin Bao, Yong Li, Dan Zeng, Shiming Ge
Detecting infrared small targets under cluttered background is mainly challenged by dim textures, low contrast and varying shapes. This paper proposes an approach to facilitate infrared small target detection by learning contrast-enhanced shape-biased representations. The approach cascades a contrast-shape encoder and a shape-reconstructable decoder to learn discriminative representations that can
-
Fine-Grained Essential Tensor Learning for Robust Multi-View Spectral Clustering IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-24 Chong Peng, Kehan Kang, Yongyong Chen, Zhao Kang, Chenglizhao Chen, Qiang Cheng
Multi-view subspace clustering (MVSC) has drawn significant attention in recent studies. In this paper, we propose a novel approach to MVSC. First, the new method is capable of preserving high-order neighbor information of the data, which provides essential and complicated underlying relationships of the data that are not straightforwardly preserved by the first-order neighbors. Second, we design log-based
-
Multi-Granularity Contrastive Cross-Modal Collaborative Generation for End-to-End Long-Term Video Question Answering IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-24 Ting Yu, Kunhao Fu, Jian Zhang, Qingming Huang, Jun Yu
Long-term Video Question Answering (VideoQA) is a challenging vision-and-language bridging task focusing on semantic understanding of untrimmed long-term videos and diverse free-form questions, simultaneously emphasizing comprehensive cross-modal reasoning to yield precise answers. The canonical approaches often rely on off-the-shelf feature extractors to sidestep the expensive computation overhead,
-
Exploring Video Denoising in Thermal Infrared Imaging: Physics-inspired Noise Generator, Dataset and Model IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-23 Lijing Cai, Xiangyu Dong, Kailai Zhou, Xun Cao
-
Accurate 3D Measurement of Complex Texture Objects by Height Compensation Using a Dual-Projector Structure IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-22 Pengcheng Yao, Yuchong Chen, Shaoyan Gai, Feipeng Da
Fringe projection profilometry is a widely used technique for 3D measurement due to its high accuracy and speed. However, the accuracy significantly decreases when measuring complex texture objects, especially in the junction of different colors. This paper analyzes the causes of errors resulting from complex textures and proposes a height compensation method to revise the error by employing a dual-projector
-
Classification of Small Drones Using Low-Uncertainty Micro-Doppler Signature Images and Ultra-Lightweight Convolutional Neural Network IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-19 Junhyeong Park, Jun-Sung Park
Many studies have attempted to classify small drones in response to threats posed by the technical progress of small drones. Recently, small drones have been classified utilizing convolutional neural networks (CNNs) with micro-Doppler signature (MDS) images generated from frequency-modulated continuous-wave (FMCW) radars. This study proposes a comprehensive method for classifying small drones in real-time
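A micro-Doppler signature image is, at its core, a spectrogram of the radar return. A minimal short-time Fourier transform sketch (frame length, hop, and window are illustrative choices, not the paper's settings):

```python
import numpy as np

def spectrogram(signal, frame_len=128, hop=64):
    """Magnitude STFT: the raw material of a micro-Doppler signature image."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (time, frequency)

# a pure 125 Hz tone sampled at 1 kHz should peak in bin 125 * 128 / 1000 = 16
fs, f0 = 1000, 125
t = np.arange(2 * fs) / fs
spec = spectrogram(np.sin(2 * np.pi * f0 * t))
```

For a drone return, the time-varying blade-flash frequencies trace characteristic patterns in this time-frequency image, which a CNN then classifies.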
-
Image Reconstruction for Accelerated MR Scan With Faster Fourier Convolutional Neural Networks IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-19 Xiaohan Liu, Yanwei Pang, Xuebin Sun, Yiming Liu, Yonghong Hou, Zhenchang Wang, Xuelong Li
High-quality image reconstruction from undersampled k-space data is key to accelerating MR scanning. Current deep learning methods are limited by the small receptive fields of reconstruction networks, which restrict the exploitation of long-range information and impede the mitigation of full-image artifacts, particularly in 3D reconstruction tasks. Additionally, the substantial computational
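The simplest baseline for reconstruction from undersampled k-space is zero-filling: keep the sampled frequencies, zero the rest, and inverse-transform. A minimal sketch, assuming a random sampling mask for illustration (not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((64, 64))          # stand-in for an MR slice

kspace = np.fft.fft2(image)                    # fully sampled k-space
mask = rng.random(kspace.shape) < 0.3          # keep ~30% of samples
mask[:4, :4] = True                            # retain some low frequencies (FFT corner)
zero_filled = np.real(np.fft.ifft2(kspace * mask))
```

The aliasing artifacts in `zero_filled` are what learned reconstruction networks are trained to remove.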
-
Fast Continual Multi-View Clustering With Incomplete Views IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-19 Xinhang Wan, Bin Xiao, Xinwang Liu, Jiyuan Liu, Weixuan Liang, En Zhu
Multi-view clustering (MVC) has attracted broad attention due to its capacity to exploit consistent and complementary information across views. This paper focuses on a challenging issue in MVC called the incomplete continual data problem (ICDP). Specifically, most existing algorithms assume that views are available in advance and overlook the scenarios where data observations of views are accumulated
-
Multi-Relational Deep Hashing for Cross-Modal Search IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-16 Xiao Liang, Erkun Yang, Yanhua Yang, Cheng Deng
Deep cross-modal hashing retrieval has recently made significant progress. However, existing methods generally learn hash functions with pairwise or triplet supervisions, which involves learning the relevant information by splicing partial similarity between data pairs; notably, this approach only captures the data similarity locally and incompletely, resulting in sub-optimal retrieval performance
-
GLPanoDepth: Global-to-Local Panoramic Depth Estimation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-15 Jiayang Bai, Haoyu Qin, Shuichang Lai, Jie Guo, Yanwen Guo
Depth estimation is a fundamental task in many vision applications. With the popularity of omnidirectional cameras, it becomes a new trend to tackle this problem in the spherical space. In this paper, we propose a learning-based method for predicting dense depth values of a scene from a monocular omnidirectional image. An omnidirectional image has a full field-of-view, providing much more complete
-
ISTR: Mask-Embedding-Based Instance Segmentation Transformer IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Jie Hu, Yao Lu, Shengchuan Zhang, Liujuan Cao
Transformer-based instance-level recognition has recently attracted increasing research attention due to its superior performance. However, although attempts have been made to encode masks as embeddings in Transformer-based frameworks, how to combine mask embeddings and spatial information in a transformer-based approach is still not fully explored. In this paper, we revisit the design of mask-embedding-based
-
Deep Variation Prior: Joint Image Denoising and Noise Variance Estimation Without Clean Data IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Rihuan Ke
With recent deep learning based approaches showing promising results in removing noise from images, the best denoising performance has been reported in a supervised learning setup that requires a large set of paired noisy images and ground-truth data for training. The strong data requirement can be mitigated by unsupervised learning techniques; however, accurate modelling of images or noise variances
-
Saliency Guided Deep Neural Network for Color Transfer With Light Optimization IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-12 Yuming Fang, Pengwei Yuan, Chenlei Lv, Chen Peng, Jiebin Yan, Weisi Lin
Color transfer aims to change the color information of the target image according to the reference one. Many studies propose color transfer methods by analysis of color distribution and semantic relevance, which do not take the perceptual characteristics for visual quality into consideration. In this study, we propose a novel color transfer method based on the saliency information with brightness optimization
-
Single Stage Adaptive Multi-Attention Network for Image Restoration IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Anas Zafar, Danyal Aftab, Rizwan Qureshi, Xinqi Fan, Pingjun Chen, Jia Wu, Hazrat Ali, Shah Nawaz, Sheheryar Khan, Mubarak Shah
Recently attention-based networks have been successful for image restoration tasks. However, existing methods are either computationally expensive or have limited receptive fields, adding constraints to the model. They are also less resilient in spatial and contextual aspects and lack pixel-to-pixel correspondence, which may degrade feature representations. In this paper, we propose a novel and computationally
-
High-Quality and Diverse Few-Shot Image Generation via Masked Discrimination IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Jingyuan Zhu, Huimin Ma, Jiansheng Chen, Jian Yuan
Few-shot image generation aims to generate images of high quality and great diversity with limited data. However, it is difficult for modern GANs to avoid overfitting when trained on only a few images. The discriminator can easily remember all the training samples and guide the generator to replicate them, leading to severe diversity degradation. Several methods have been proposed to relieve overfitting
-
RefQSR: Reference-Based Quantization for Image Super-Resolution Networks IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Hongjae Lee, Jun-Sang Yoo, Seung-Won Jung
Single image super-resolution (SISR) aims to reconstruct a high-resolution image from its low-resolution observation. Recent deep learning-based SISR models show high performance at the expense of increased computational costs, limiting their use in resource-constrained environments. As a promising solution for computationally efficient network design, network quantization has been extensively studied
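The network quantization this abstract refers to ultimately maps float weights onto a small integer grid. A minimal symmetric uniform-quantization sketch (bit width and scaling rule are generic choices, not RefQSR's scheme):

```python
import numpy as np

def quantize_dequantize(w, bits=8):
    """Symmetric uniform quantization: float -> integer grid -> float."""
    qmax = 2 ** (bits - 1) - 1                      # e.g. 127 for int8
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int32)
    return q * scale, scale                         # dequantized weights, step size

rng = np.random.default_rng(0)
w = rng.standard_normal(1000)
w_hat, scale = quantize_dequantize(w)
```

The round-trip error is bounded by half the quantization step; the research question is how to keep this error from degrading super-resolved image quality.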
-
Nonconvex Robust High-Order Tensor Completion Using Randomized Low-Rank Approximation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-10 Wenjin Qin, Hailin Wang, Feng Zhang, Weijun Ma, Jianjun Wang, Tingwen Huang
Within the tensor singular value decomposition (T-SVD) framework, existing robust low-rank tensor completion approaches have made great achievements in various areas of science and engineering. Nevertheless, these methods involve the T-SVD based low-rank approximation, which suffers from high computational costs when dealing with large-scale tensor data. Moreover, most of them are only applicable to
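Randomized low-rank approximation of the kind named in the title typically follows the Halko-Martinsson-Tropp range-finder recipe: sketch the column space with a Gaussian test matrix, then SVD a small projected matrix. A minimal sketch for a plain matrix (the paper works with tensors under T-SVD; this shows only the underlying idea):

```python
import numpy as np

def randomized_lowrank(A, rank, oversample=10, rng=None):
    """Rank-`rank` approximation of A via a randomized range finder."""
    rng = rng or np.random.default_rng(0)
    # sketch the column space of A with a Gaussian test matrix
    Y = A @ rng.standard_normal((A.shape[1], rank + oversample))
    Q, _ = np.linalg.qr(Y)
    # project A onto the sketched subspace and SVD the small matrix
    U_small, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    U = Q @ U_small
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]     # low-rank reconstruction
```

The cost is dominated by two thin matrix products instead of a full SVD, which is the speedup the abstract's motivation rests on.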
-
Source-Guided Target Feature Reconstruction for Cross-Domain Classification and Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-09 Yifan Jiao, Hantao Yao, Bing-Kun Bao, Changsheng Xu
Existing cross-domain classification and detection methods usually apply a consistency constraint between the target sample and its self-augmentation for unsupervised learning without considering the essential source knowledge. In this paper, we propose a Source-guided Target Feature Reconstruction (STFR) module for cross-domain visual tasks, which applies source visual words to reconstruct the target
-
Relationship-Incremental Scene Graph Generation by a Divide-and-Conquer Pipeline with Feature Adapter IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-08 Xuewei Li, Guangcong Zheng, Yunlong Yu, Naye Ji, Xi Li
-
DriftRec: Adapting Diffusion Models to Blind JPEG Restoration IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-05 Simon Welker, Henry N. Chapman, Timo Gerkmann
In this work, we utilize the high-fidelity generation abilities of diffusion models to solve blind JPEG restoration at high compression levels. We propose an elegant modification of the forward stochastic differential equation of diffusion models to adapt them to this restoration task and name our method DriftRec. Comparing DriftRec against an L2 regression baseline with the same network architecture
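As a generic illustration of a forward SDE in diffusion models (not DriftRec's actual modification), one Euler-Maruyama step of a mean-reverting Ornstein-Uhlenbeck process that drifts the state toward a target image can be sketched as:

```python
import numpy as np

def forward_ou_step(x, target, theta=1.0, sigma=0.5, dt=0.01, rng=None):
    """One Euler-Maruyama step of dx = theta * (target - x) dt + sigma dW.

    A mean-reverting forward process: `x` drifts toward `target` while
    Gaussian noise is injected (a generic sketch, not DriftRec's SDE).
    """
    rng = rng or np.random.default_rng(0)
    noise = rng.standard_normal(x.shape)
    return x + theta * (target - x) * dt + sigma * np.sqrt(dt) * noise

# with sigma=0 the process contracts deterministically toward the target
x = np.zeros((8, 8))
target = np.ones((8, 8))
for _ in range(1000):
    x = forward_ou_step(x, target, sigma=0.0)
```

A restoration-oriented diffusion model learns to reverse such a process, mapping a corrupted endpoint back toward the clean image.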
-
Generalizing to Out-of-Sample Degradations via Model Reprogramming IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-05 Runhua Jiang, Yahong Han
Existing image restoration models are typically designed for specific tasks and struggle to generalize to out-of-sample degradations not encountered during training. While zero-shot methods can address this limitation by fine-tuning model parameters on testing samples, their effectiveness relies on predefined natural priors and physical models of specific degradations. Nevertheless, determining out-of-sample
-
Shared Manifold Regularized Joint Feature Selection for Joint Classification and Regression in Alzheimer’s Disease Diagnosis IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-04 Zhi Chen, Yongguo Liu, Yun Zhang, Jiajing Zhu, Qiaoqin Li, Xindong Wu
In Alzheimer’s disease (AD) diagnosis, joint feature selection for predicting disease labels (classification) and estimating cognitive scores (regression) with neuroimaging data has received increasing attention. In this paper, we propose a model named Shared Manifold regularized Joint Feature Selection (SMJFS) that performs classification and regression in a unified framework for AD diagnosis. For
-
Orthogonal Spatial Binary Coding Method for High-Speed 3D Measurement IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-01 Haitao Wu, Yiping Cao, Yongbo Dai, Zhimi Wei
Temporal phase unwrapping based on a single auxiliary binary coded pattern has been proven effective for high-speed 3D measurement. However, traditional spatial binary coding often leads to an imbalance between the number of periodic divisions and the number of codewords. To meet this challenge, an orthogonal spatial binary coding method with a large number of codewords is proposed in this paper. By expanding spatial multiplexing
-
Hierarchical Perceptual Noise Injection for Social Media Fingerprint Privacy Protection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-04-01 Simin Li, Huangxinxin Xu, Jiakai Wang, Ruixiao Xu, Aishan Liu, Fazhi He, Xianglong Liu, Dacheng Tao
Billions of people share images from their daily lives on social media every day. However, their biometric information (e.g., fingerprints) could be easily stolen from these images. The threat of fingerprint leakage from social media has created a strong desire to anonymize shared images while maintaining image quality, since fingerprints act as a lifelong individual biometric password. To guard the
-
Bilateral Context Modeling for Residual Coding in Lossless 3D Medical Image Compression IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-25 Xiangrui Liu, Meng Wang, Shiqi Wang, Sam Kwong
Residual coding has gained prevalence in lossless compression, where a lossy layer is initially employed and the reconstruction errors (i.e., residues) are then losslessly compressed. The underlying principle of the residual coding revolves around the exploration of priors based on context modeling. Herein, we propose a residual coding framework for 3D medical images, involving the off-the-shelf video
-
Anomaly Detection for Medical Images Using Heterogeneous Auto-Encoder IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-29 Shuai Lu, Weihang Zhang, He Zhao, Hanruo Liu, Ningli Wang, Huiqi Li
Anomaly detection is an important task for medical image analysis, which can alleviate the reliance of supervised methods on large labelled datasets. Most existing methods use a pixel-wise self-reconstruction framework for anomaly detection. However, these studies face two challenges: 1) they tend to overfit by learning an identity mapping between the input and output, which leads to failure in
-
Region Aware Video Object Segmentation With Deep Motion Modeling IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-29 Bo Miao, Mohammed Bennamoun, Yongsheng Gao, Ajmal Mian
Current semi-supervised video object segmentation (VOS) methods often employ the entire features of one frame to predict object masks and update memory. This introduces significant redundant computations. To reduce redundancy, we introduce a Region Aware Video Object Segmentation (RAVOS) approach, which predicts regions of interest (ROIs) for efficient object segmentation and memory storage. RAVOS
-
Knowledge-Augmented Visual Question Answering With Natural Language Explanation IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-28 Jiayuan Xie, Yi Cai, Jiali Chen, Ruohang Xu, Jiexin Wang, Qing Li
Visual question answering with natural language explanation (VQA-NLE) is a challenging task that requires models to not only generate accurate answers but also to provide explanations that justify the relevant decision-making processes. This task is accomplished by generating natural language sentences based on the given question-image pair. However, existing methods often struggle to ensure consistency
-
Robust Fine-Grained Visual Recognition With Neighbor-Attention Label Correction IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-28 Shunan Mao, Shiliang Zhang
Existing deep learning methods for fine-grained visual recognition often rely on large-scale, well-annotated training data. Obtaining fine-grained annotations in the wild typically requires concentration and expertise, such as fine category annotation for species recognition, instance annotation for person re-identification (re-id) and dense annotation for segmentation, which inevitably leads to label
-
Label-Aware Calibration and Relation-Preserving in Visual Intention Understanding IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 QingHongYa Shi, Mang Ye, Wenke Huang, Weijian Ruan, Bo Du
Visual intention understanding is a challenging task that explores the hidden intention behind the images of publishers in social media. Visual intention represents implicit semantics, whose ambiguous definition inevitably leads to label shifting and label blemish. The former indicates that the same image delivers intention discrepancies under different data augmentations, while the latter represents
-
Weakly-Supervised Contrastive Learning for Unsupervised Object Discovery IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Yunqiu Lv, Jing Zhang, Nick Barnes, Yuchao Dai
Unsupervised object discovery (UOD) refers to the task of discriminating the whole region of objects from the background within a scene without relying on labeled datasets, which benefits the task of bounding-box-level localization and pixel-level segmentation. This task is promising due to its ability to discover objects in a generic manner. We roughly categorize existing techniques into two main
-
Temporal Feature Fusion for 3D Detection in Monocular Video IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Haoran Cheng, Liang Peng, Zheng Yang, Binbin Lin, Xiaofei He, Boxi Wu
Previous monocular 3D detection works focus on the single frame input in both training and inference. In real-world applications, temporal and motion information naturally exists in monocular video. It is valuable for 3D detection but under-explored in monocular works. In this paper, we propose a straightforward and effective method for temporal feature fusion, which exhibits low computation cost and
-
Instance-Specific Semantic Augmentation for Long-Tailed Image Classification IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Jiahao Chen, Bing Su
Recent long-tailed classification methods generally adopt the two-stage pipeline and focus on learning the classifier to tackle the imbalanced data in the second stage via re-sampling or re-weighting, but the classifier is easily prone to overconfidence in head classes. Data augmentation is a natural way to tackle this issue. Existing augmentation methods either perform low-level transformations or
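Semantic data augmentation of the kind discussed here perturbs deep features along class-conditional covariance directions instead of transforming pixels. A minimal sketch in the spirit of ISDA (the covariance estimate and strength `lam` are illustrative, not the paper's instance-specific scheme):

```python
import numpy as np

def semantic_augment(features, labels, lam=0.5, rng=None):
    """Sample augmented features from N(f, lam * Sigma_class).

    Sigma_class is the per-class feature covariance, so perturbations
    follow directions of intra-class semantic variation.
    """
    rng = rng or np.random.default_rng(0)
    out = features.copy()
    for c in np.unique(labels):
        idx = labels == c
        cov = np.cov(features[idx], rowvar=False) + 1e-6 * np.eye(features.shape[1])
        out[idx] += rng.multivariate_normal(
            np.zeros(features.shape[1]), lam * cov, size=idx.sum())
    return out
```

For tail classes with few samples, the covariance estimate is poor, which is precisely the problem instance-specific schemes aim to address.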
-
BadCM: Invisible Backdoor Attack Against Cross-Modal Learning IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Zheng Zhang, Xu Yuan, Lei Zhu, Jingkuan Song, Liqiang Nie
Despite remarkable successes in unimodal learning tasks, backdoor attacks against cross-modal learning are still underexplored due to the limited generalization and inferior stealthiness when involving multiple modalities. Notably, since works in this area mainly inherit ideas from unimodal visual attacks, they struggle with dealing with diverse cross-modal attack circumstances and manipulating imperceptible
-
Toward Accurate Human Parsing Through Edge Guided Diffusion IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Ting Liu, Hongkun Zhu, Yunchao Wei, Shikui Wei, Yao Zhao, Yanning Zhang
Existing human parsing frameworks commonly employ joint learning of semantic edge detection and human parsing to facilitate the localization around boundary regions. Nevertheless, the parsing prediction within the interior of the part contour may still exhibit inconsistencies due to the inherent ambiguity of fine-grained semantics. In contrast, binary edge detection does not suffer from such fine-grained
-
In Defense of Clip-Based Video Relation Detection IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Meng Wei, Long Chen, Wei Ji, Xiaoyu Yue, Roger Zimmermann
Video Visual Relation Detection (VidVRD) aims to detect visual relationship triplets in videos using spatial bounding boxes and temporal boundaries. Existing VidVRD methods can be broadly categorized into bottom-up and top-down paradigms, depending on their approach to classifying relations. Bottom-up methods follow a clip-based approach where they classify relations of short clip tubelet pairs and
-
Cross-Layer Contrastive Learning of Latent Semantics for Facial Expression Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Weicheng Xie, Zhibin Peng, Linlin Shen, Wenya Lu, Yang Zhang, Siyang Song
Convolutional neural networks (CNNs) have achieved significant improvement for the task of facial expression recognition. However, current training still suffers from the inconsistent learning intensities among different layers, i.e., the feature representations in the shallow layers are not sufficiently learned compared with those in deep layers. To this end, this work proposes a contrastive learning
-
Single-Image-Based Deep Learning for Segmentation of Early Esophageal Cancer Lesions IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-27 Haipeng Li, Dingrui Liu, Yu Zeng, Shuaicheng Liu, Tao Gan, Nini Rao, Jinlin Yang, Bing Zeng
Accurate segmentation of lesions is crucial for diagnosis and treatment of early esophageal cancer (EEC). However, neither traditional nor deep learning-based methods up to today can meet the clinical requirements, with the mean Dice score - the most important metric in medical image analysis - hardly exceeding 0.75. In this paper, we present a novel deep learning approach for segmenting EEC lesions
-
DeGCN: Deformable Graph Convolutional Networks for Skeleton-Based Action Recognition IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-25 Woomin Myung, Nan Su, Jing-Hao Xue, Guijin Wang
Graph convolutional networks (GCN) have recently been studied to exploit the graph topology of the human body for skeleton-based action recognition. However, most of these methods unfortunately aggregate messages via an inflexible pattern for various action samples, lacking the awareness of intra-class variety and the suitableness for skeleton sequences, which often contain redundant or even detrimental
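The building block of skeleton-based GCNs is a normalized neighborhood aggregation, H' = D^{-1/2} (A + I) D^{-1/2} H W in the Kipf-Welling form. A minimal sketch on a toy three-joint chain:

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph convolution: normalized adjacency aggregation + projection."""
    A_hat = A + np.eye(A.shape[0])                # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W

# toy 3-joint chain (e.g. shoulder - elbow - wrist)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
H = np.arange(6, dtype=float).reshape(3, 2)       # 2-d feature per joint
W = np.eye(2)                                     # identity projection
out = gcn_layer(H, A, W)
```

The fixed adjacency `A` is exactly the "inflexible pattern" the abstract criticizes; deformable variants let the aggregation pattern adapt per sample.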
-
Cross-Modal Retrieval With Noisy Correspondence via Consistency Refining and Mining IEEE Trans. Image Process. (IF 10.6) Pub Date : 2024-03-25 Xinran Ma, Mouxing Yang, Yunfan Li, Peng Hu, Jiancheng Lv, Xi Peng
The success of existing cross-modal retrieval (CMR) methods relies heavily on the assumption that the annotated cross-modal correspondence is faultless. In practice, however, the correspondence of some pairs is inevitably contaminated during data collection or annotation, leading to the so-called Noisy Correspondence (NC) problem. To alleviate the influence of NC, we propose a novel method