Official implementation of "Finer-CAM: Fine-grained Visual Interpretability through Class-Specific Gradient Refinements" (arXiv:2501.11309).
CAM methods highlight image regions influencing predictions but often struggle in fine-grained tasks due to shared feature activation across similar classes. We propose Finer-CAM, which explicitly compares the target class with similar ones, suppressing shared features and emphasizing unique, discriminative details.
Finer-CAM retains CAM’s efficiency, offers precise localization, and adapts to multi-modal zero-shot models, accurately activating object parts or attributes. It enhances explainability in fine-grained tasks without increasing complexity.
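The comparison idea can be illustrated with a minimal sketch. It assumes a linear classifier head applied to spatial features, as in the original CAM formulation, so that subtracting a similar class's weights from the target class's weights cancels the channels both classes rely on. The function names and the `gamma` weighting parameter are hypothetical and chosen for illustration; the actual logic in `generate_cam.py` may differ.

```python
def linear_cam(features, weights):
    """Classic CAM for a linear head: weighted sum of channels at each location.

    features: H x W x K nested lists, weights: length-K list.
    """
    return [[sum(w * f for w, f in zip(weights, cell)) for cell in row]
            for row in features]


def finer_cam_sketch(features, w_target, w_similar, gamma=1.0):
    """Illustrative only: CAM computed from the weight difference target - gamma * similar."""
    diff = [wt - gamma * ws for wt, ws in zip(w_target, w_similar)]
    # ReLU keeps only locations that favor the target over the similar class.
    return [[max(v, 0.0) for v in row] for row in linear_cam(features, diff)]


# Toy 1 x 2 feature map with two channels:
# channel 0 is shared by both classes, channel 1 is unique to the target.
features = [[[1.0, 0.0], [0.0, 1.0]]]
w_target = [1.0, 1.0]
w_similar = [1.0, 0.0]

plain = linear_cam(features, w_target)
finer = finer_cam_sketch(features, w_target, w_similar)
print(plain)  # both locations are activated
print(finer)  # only the location with the unique feature remains
```

In this toy case the plain map activates both locations equally, while the comparison-based map keeps only the location driven by the channel that the similar class does not use.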
Experience the power of Finer-CAM with our interactive demos! Witness accurate localization of discriminative features.
- Try the multi-modal demo to see how Finer-CAM activates detailed and relevant regions for diverse concepts.
- Test the classifier demo to explore class-specific activation maps with enhanced explainability.
# create conda env
conda create -n finer-cam python=3.9 -y
conda activate finer-cam
# install packages
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install numpy opencv-python ftfy regex tqdm ttach tensorboard lxml cython scikit-learn matplotlib
- Download the dataset using the following command:
curl -L -o datasets/stanford_cars.zip \
  https://www.kaggle.com/api/v1/datasets/download/cyizhuo/stanford-cars-by-classes-folder
- Unzip the downloaded file:
unzip datasets/stanford_cars.zip -d datasets/
- The structure of datasets/ should be organized as follows:
datasets/
├── train/
│ ├── Acura Integra Type R 2001/
│ │ ├── 000405.jpg
│ │ ├── 000406.jpg
│ │ └── ...
│ ├── Acura RL Sedan 2012/
│ │ ├── 000090.jpg
│ │ ├── 000091.jpg
│ │ └── ...
│ └── ...
└── test/
  ├── Acura Integra Type R 2001/
  │ ├── 000450.jpg
  │ ├── 000451.jpg
  │ └── ...
  ├── Acura RL Sedan 2012/
  │ ├── 000122.jpg
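Before generating CAMs, it can help to confirm that the unzipped data matches this layout. The following is a minimal sketch, not part of the repository: `count_images_per_class` and `check_layout` are hypothetical helper names, and the sketch assumes images use the `.jpg` extension as in the tree above.

```python
from pathlib import Path


def count_images_per_class(split_dir):
    """Map each class folder name under split_dir to its number of .jpg files."""
    split_dir = Path(split_dir)
    return {
        class_dir.name: sum(
            1 for p in class_dir.iterdir() if p.suffix.lower() == ".jpg"
        )
        for class_dir in sorted(split_dir.iterdir())
        if class_dir.is_dir()
    }


def check_layout(root="datasets"):
    """Return per-split image counts, raising if train/ or test/ is missing."""
    root = Path(root)
    summary = {}
    for split in ("train", "test"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split folder: {split_dir}")
        summary[split] = count_images_per_class(split_dir)
    return summary
```

Calling `check_layout("datasets")` returns a dictionary keyed by split and class name, which makes empty or misplaced class folders easy to spot.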
Download the pre-trained DINOv2 [ViT-B/14] checkpoint and place it in pretrained_models/dinov2.
Run the Script:
- Execute the generate_cam.py script with the appropriate arguments using the following command:
python generate_cam.py \
  --classifier_path <path_to_classifier_model> \
  --model_path <path_to_dino_model> \
  --dataset_path <path_to_dataset_or_image_list> \
  --save_path <path_to_save_results>
Run the Script:
- Execute the visualize.py script with the appropriate arguments using the following command:
python visualize.py \
  --dataset_path <path_to_dataset_directory> \
  --cams_path <path_to_cams_directory> \
  --save_path <path_to_save_visualizations>
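The two steps can also be chained from Python. The sketch below only assembles the command lines from the flags shown above; the helper names and the example paths are hypothetical placeholders. It also assumes, without confirmation from the repository, that the directory passed to `visualize.py` as `--cams_path` is the one `generate_cam.py` wrote to via `--save_path`.

```python
import shlex


def build_generate_cam_cmd(classifier_path, model_path, dataset_path, save_path):
    """Argument list for generate_cam.py, using only the flags documented above."""
    return [
        "python", "generate_cam.py",
        "--classifier_path", str(classifier_path),
        "--model_path", str(model_path),
        "--dataset_path", str(dataset_path),
        "--save_path", str(save_path),
    ]


def build_visualize_cmd(dataset_path, cams_path, save_path):
    """Argument list for visualize.py, using only the flags documented above."""
    return [
        "python", "visualize.py",
        "--dataset_path", str(dataset_path),
        "--cams_path", str(cams_path),
        "--save_path", str(save_path),
    ]


# Hypothetical placeholder paths; replace them with your own.
cams_dir = "path/to/cams"
gen_cmd = build_generate_cam_cmd(
    "path/to/classifier.pth", "path/to/dino_model.pth", "datasets/test", cams_dir
)
# Assumption: the CAMs saved in step one are the input of step two.
vis_cmd = build_visualize_cmd("datasets/test", cams_dir, "path/to/visualizations")

print(shlex.join(gen_cmd))
print(shlex.join(vis_cmd))
```

To actually run the steps, pass each list to `subprocess.run(cmd, check=True)` from the repository root so that a failure in the first step stops the second.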
We utilized code from:
We thank the authors for their wonderful work.
If you find this repository useful, please consider citing our work 📝 and giving a star 🌟 :
@article{zhang2025finer,
title={Finer-CAM: Fine-grained Visual Interpretability through Class-Specific Gradient Refinements},
author={Ziheng Zhang and Jianyang Gu and Arpita Chowdhury and Zheda Mai and David Carlyn and Tanya Berger-Wolf and Yu Su and Wei-Lun Chao},
journal={arXiv preprint arXiv:2501.11309},
year={2025},
}