An Adaptive Target Augmentation Strategy for Long-Tailed Visual Recognition

Date

2025

Abstract

The long-tail problem in supervised learning arises from the inherent imbalance of real-world datasets: a few classes (the "head") account for most of the data, while the majority of classes (the "tail") have significantly fewer examples. This poses a challenge for traditional supervised learning algorithms, which tend to optimize performance on the frequent head classes at the expense of the rare tail classes. Among recently proposed approaches, data augmentation methods are widely used to address the long-tail problem by synthesizing more diverse training examples. Two common examples are MixUp, which interpolates two images, and CutMix, which pastes a cut-out region from one image onto another. However, to our knowledge, no prior work has explicitly investigated which images should be paired or combined to achieve the best results. To address this gap, we propose Feature-Aware Score-Based Selection (FASS), a novel strategy that dynamically selects and pairs images based on their feature representations before MixUp or CutMix is applied. Unlike traditional augmentation methods, which focus primarily on enriching minority-class samples, FASS dynamically identifies feature-relevant target classes to improve the model's ability to distinguish closely related features. When integrated with other advanced methods, FASS demonstrates superior performance on benchmarks such as CIFAR-100 and ImageNet-LT, achieving state-of-the-art results.
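To make the two augmentations named in the abstract concrete, the following is a minimal NumPy sketch of MixUp and CutMix on a single pair of images with one-hot labels. It is an illustration under stated assumptions, not the thesis implementation. The helper `pick_partner_class` is hypothetical: the abstract does not specify how FASS scores candidates, so the sketch assumes, purely for illustration, that the partner is the other class whose feature centroid is most cosine-similar to the sample's feature vector.

```python
import numpy as np


def mixup(x_a, y_a, x_b, y_b, lam):
    """Interpolate two images and their one-hot labels with weight lam in [0, 1]."""
    x = lam * x_a + (1.0 - lam) * x_b
    y = lam * y_a + (1.0 - lam) * y_b
    return x, y


def cutmix(x_a, y_a, x_b, y_b, lam, rng):
    """Paste a rectangular patch of x_b onto x_a; images are (H, W, C) arrays."""
    h, w = x_a.shape[:2]
    # The patch is sized to cover roughly (1 - lam) of the image area.
    cut_h = int(h * np.sqrt(1.0 - lam))
    cut_w = int(w * np.sqrt(1.0 - lam))
    cy, cx = rng.integers(h), rng.integers(w)
    top, bottom = max(cy - cut_h // 2, 0), min(cy + cut_h // 2, h)
    left, right = max(cx - cut_w // 2, 0), min(cx + cut_w // 2, w)
    x = x_a.copy()
    x[top:bottom, left:right] = x_b[top:bottom, left:right]
    # Recompute the label weight from the area actually pasted,
    # since clipping at the border can shrink the patch.
    lam_adj = 1.0 - (bottom - top) * (right - left) / (h * w)
    y = lam_adj * y_a + (1.0 - lam_adj) * y_b
    return x, y


def pick_partner_class(feature, class_centroids, own_class):
    """Hypothetical pairing rule (an assumption, not the FASS score):
    return the other class whose centroid is most cosine-similar to feature."""
    f = feature / np.linalg.norm(feature)
    c = class_centroids / np.linalg.norm(class_centroids, axis=1, keepdims=True)
    scores = c @ f
    scores[own_class] = -np.inf  # never pair a sample with its own class
    return int(np.argmax(scores))
```

In such a setup, the partner class would be chosen first and an image drawn from it, and only then would `mixup` or `cutmix` be applied to the pair. The actual selection score used by FASS may differ from the cosine-similarity rule assumed here.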

Keywords

long-tail distribution, data augmentation
