This is an official implementation for Finer-CAM: Spotting the Difference Reveals Finer Details for Visual Explanation.

Imageomics/Finer-CAM

Finer-CAM : Spotting the Difference Reveals Finer Details for Visual Explanation

Official implementation of "Finer-CAM [arxiv]".

CAM methods highlight image regions influencing predictions but often struggle in fine-grained tasks due to shared feature activation across similar classes. We propose Finer-CAM, which explicitly compares the target class with similar ones, suppressing shared features and emphasizing unique, discriminative details.

Finer-CAM retains CAM’s efficiency, offers precise localization, and adapts to multi-modal zero-shot models, accurately activating object parts or attributes. It enhances explainability in fine-grained tasks without increasing complexity.
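The comparison idea can be illustrated with a minimal NumPy sketch. It assumes a linear classification head on top of spatial feature maps, so that a classic CAM is a weighted sum of channels. The function names, the `alpha` scaling factor, and the toy data are illustrative assumptions, not the code in this repository.

```python
import numpy as np


def linear_cam(features, class_weights):
    """CAM for a linear head: weight each channel by the class weight and sum."""
    # features: (C, H, W) feature maps; class_weights: (C,) weights of one class.
    cam = np.tensordot(class_weights, features, axes=([0], [0]))  # -> (H, W)
    return np.maximum(cam, 0.0)  # keep only positive evidence


def finer_cam_sketch(features, head_weights, target, similar, alpha=1.0):
    """Explain the difference (target - alpha * similar) instead of the target alone.

    alpha is a hypothetical knob controlling how strongly shared evidence is suppressed.
    """
    diff_weights = head_weights[target] - alpha * head_weights[similar]
    return linear_cam(features, diff_weights)


# Toy data: channel 0 is shared by both classes, channel 1 is unique to the target.
features = np.zeros((2, 2, 2))
features[0, 0, 0] = 1.0  # shared feature fires at the top-left location
features[1, 1, 1] = 1.0  # discriminative feature fires at the bottom-right location
head_weights = np.array([
    [1.0, 1.0],  # class 0 (target) relies on both channels
    [1.0, 0.0],  # class 1 (similar class) relies only on the shared channel
])

baseline = linear_cam(features, head_weights[0])
finer = finer_cam_sketch(features, head_weights, target=0, similar=1)
```

In this toy case the baseline map responds at both locations, while the difference-based map keeps only the location driven by the channel that separates the two classes.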

Demo

Experience the power of Finer-CAM with our interactive demos! Witness accurate localization of discriminative features.

  • Try the multi-modal demo and see how Finer-CAM activates detailed and relevant regions for diverse concepts: Open In Colab
  • Test the classifier demo to explore class-specific activation maps with enhanced explainability: Open In Colab

Requirements

# create conda env
conda create -n finer-cam python=3.9 -y
conda activate finer-cam

# install packages
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install numpy opencv-python ftfy regex tqdm ttach tensorboard lxml cython scikit-learn matplotlib

Preparing Datasets

Stanford Cars

  1. Download the dataset using the following command:

    curl -L -o datasets/stanford_cars.zip \
    https://www.kaggle.com/api/v1/datasets/download/cyizhuo/stanford-cars-by-classes-folder
    
    
  2. Unzip the downloaded file:

    unzip datasets/stanford_cars.zip -d datasets/
    
  3. The structure of datasets/ should be organized as follows:

datasets/
├── train/
│   ├── Acura Integra Type R 2001/
│   │   ├── 000405.jpg
│   │   ├── 000406.jpg
│   │   └── ...
│   ├── Acura RL Sedan 2012/
│   │   ├── 000090.jpg
│   │   ├── 000091.jpg
│   │   └── ...
│   └── ...
└── test/
    ├── Acura Integra Type R 2001/
    │   ├── 000450.jpg
    │   ├── 000451.jpg
    │   └── ...
    ├── Acura RL Sedan 2012/
    │   ├── 000122.jpg
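After unzipping, a quick check can confirm that the layout matches the tree above. The following is a sketch using only the standard library; the helper names and the set of accepted image suffixes are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: images use one of these suffixes (the tree above shows .jpg).
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def summarize_split(split_dir):
    """Return {class_name: image_count} for one split directory."""
    counts = {}
    for class_dir in sorted(p for p in Path(split_dir).iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


def check_dataset(root="datasets"):
    """Summarize train/ and test/ and list classes present in only one split."""
    root = Path(root)
    summary = {}
    for split in ("train", "test"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split directory: {split_dir}")
        summary[split] = summarize_split(split_dir)
    mismatched = sorted(set(summary["train"]) ^ set(summary["test"]))
    return summary, mismatched


if __name__ == "__main__":
    summary, mismatched = check_dataset("datasets")
    for split, counts in summary.items():
        print(f"{split}: {len(counts)} classes, {sum(counts.values())} images")
    if mismatched:
        print("classes found in only one split:", mismatched)
```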

Preparing pre-trained model

Download the DINOv2 pre-trained [ViT-B/14] model here and place it in pretrained_models/dinov2.

Usage

Step 1. Generate CAMs for Validation Set

Run the Script:

  • Execute the generate_cam.py script with the appropriate arguments using the following command:
     python generate_cam.py \
         --classifier_path <path_to_classifier_model> \
         --model_path <path_to_dino_model> \
         --dataset_path <path_to_dataset_or_image_list> \
         --save_path <path_to_save_results>

Step 2. Visualize Results

Run the Script:

  • Execute the visualize.py script with the appropriate arguments using the following command:
    python visualize.py --dataset_path <path_to_dataset_directory> \
                        --cams_path <path_to_cams_directory> \
                        --save_path <path_to_save_visualizations>
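The two steps can also be chained from Python. The sketch below only assembles the command lines from the flags documented above and runs them in order. It assumes that the directory passed as --save_path in Step 1 is the one to pass as --cams_path in Step 2, and that the same dataset directory is used for both steps. The helper names are hypothetical.

```python
import subprocess
import sys


def build_commands(classifier_path, model_path, dataset_path, cams_dir, vis_dir):
    """Build the Step 1 and Step 2 command lines as argument lists."""
    generate = [
        sys.executable, "generate_cam.py",
        "--classifier_path", classifier_path,
        "--model_path", model_path,
        "--dataset_path", dataset_path,
        "--save_path", cams_dir,
    ]
    visualize = [
        sys.executable, "visualize.py",
        "--dataset_path", dataset_path,
        "--cams_path", cams_dir,  # assumption: reuse the Step 1 output directory
        "--save_path", vis_dir,
    ]
    return [generate, visualize]


def run_pipeline(commands, dry_run=True):
    """Print the commands, or execute them in order when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop at the first failing step
```

Calling `run_pipeline(build_commands(...))` with the default `dry_run=True` prints both commands so the paths can be checked before anything is executed.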

Acknowledgement

We utilized code from:

We thank the authors for their wonderful work.

Citation

If you find this repository useful, please consider citing our work 📝 and giving a star 🌟 :

@article{zhang2025finer,
  title={Finer-CAM: Fine-grained Visual Interpretability through Class-Specific Gradient Refinements},
  author={Ziheng Zhang and Jianyang Gu and Arpita Chowdhury and Zheda Mai and David Carlyn and Tanya Berger-Wolf and Yu Su and Wei-Lun Chao},
  journal={arXiv preprint arXiv:2501.11309},
  year={2025},
}
