Introduction
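- Fine-grained images differ from general (generic) images in that their categories are defined at a much finer granularity, which is also what makes them difficult. Fine-grained image analysis is an active direction in computer vision and covers classification, retrieval, and image generation.
- The main difficulties and challenges of fine-grained image recognition are:
- Small inter-class variance: all classes are sub-categories of the same meta-category (for example, different species of birds)
- Large intra-class variance: caused by pose, scale, rotation, and similar factors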
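To make the two challenges above concrete, here is a minimal sketch that measures both quantities on toy features. The class names, the 2-D feature values, and the helper functions are all hypothetical and chosen only for illustration; real fine-grained features would come from a trained network.

```python
from statistics import mean

def centroid(points):
    """Mean vector of a list of equal-length feature vectors."""
    return [mean(dim) for dim in zip(*points)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def intra_class_variance(classes):
    """Average squared distance of each sample to its own class centroid."""
    dists = []
    for pts in classes.values():
        c = centroid(pts)
        dists.extend(sq_dist(p, c) for p in pts)
    return mean(dists)

def inter_class_variance(classes):
    """Average squared distance of each class centroid to the mean of all centroids."""
    cents = [centroid(pts) for pts in classes.values()]
    overall = centroid(cents)
    return mean([sq_dist(c, overall) for c in cents])

# Hypothetical 2-D features for two visually similar sub-categories.
toy = {
    "gull_a": [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]],
    "gull_b": [[1.0, 0.0], [3.0, 0.0], [1.0, 2.0], [3.0, 2.0]],
}
intra = intra_class_variance(toy)  # spread inside each class
inter = inter_class_variance(toy)  # spread between class centroids
print(intra, inter)
```

In this toy setting the spread within each class is larger than the spread between the class centroids, which is the situation the two bullets describe.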
Tutorials
-
Fine-Grained Image Analysis. Xiu-Shen Wei, and Jianxin Wu. Pacific Rim International Conference on Artificial Intelligence (PRICAI), 2018.
-
Fine-Grained Image Analysis. Xiu-Shen Wei. IEEE International Conference on Multimedia and Expo (ICME), 2019.
Survey papers
-
Deep Learning for Fine-Grained Image Analysis: A Survey. Xiu-Shen Wei, Jianxin Wu, and Quan Cui. arXiv: 1907.03069, 2019.
-
A Survey on Deep Learning-based Fine-Grained Object Classification and Semantic Segmentation. Bo Zhao, Jiashi Feng, Xiao Wu, and Shuicheng Yan. International Journal of Automation and Computing, 2017.
Benchmark datasets
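The table below lists 12 benchmark datasets. BBox means the dataset provides object bounding boxes; Part annotation means it provides the locations of key parts; HRCHY means hierarchical labels; ATR means attribute labels (such as wing color); Texts means text descriptions of the images are provided.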
Dataset name | Year | Meta-class | Images | Categories | BBox | Part annotation | HRCHY | ATR | Texts |
---|---|---|---|---|---|---|---|---|---|
Oxford flower | 2008 | Flowers | 8,189 | 102 | | | | | ✓ |
CUB200 | 2011 | Birds | 11,788 | 200 | ✓ | ✓ | | ✓ | ✓ |
Stanford Dog | 2011 | Dogs | 20,580 | 120 | ✓ | | | | |
Stanford Car | 2013 | Cars | 16,185 | 196 | ✓ | | | | |
FGVC Aircraft | 2013 | Aircrafts | 10,000 | 100 | ✓ | | ✓ | | |
Birdsnap | 2014 | Birds | 49,829 | 500 | ✓ | ✓ | | ✓ | |
NABirds | 2015 | Birds | 48,562 | 555 | ✓ | ✓ | | | |
DeepFashion | 2016 | Clothes | 800,000 | 1,050 | ✓ | ✓ | | ✓ | |
Fru92 | 2017 | Fruits | 69,614 | 92 | | | ✓ | | |
Veg200 | 2017 | Vegetable | 91,117 | 200 | | | ✓ | | |
iNat2017 | 2017 | Plants & Animals | 859,000 | 5,089 | ✓ | | ✓ | | |
RPC | 2019 | Retail products | 83,739 | 200 | ✓ | | ✓ | | |
Fine-grained image recognition
Fine-grained recognition by localization-classification subnetworks
Methods based on localization-classification networks
Classical State-of-the-arts
-
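Mask-CNN: Localizing Parts and Selecting Descriptors for Fine-Grained Image Recognition
The method has two modules: part localization, followed by feature learning on global and local image patches.
-
Mask-CNN uses an FCN to learn a three-class segmentation model (head, torso, and background). The ground-truth masks are the minimal bounding rectangles of the head and torso, derived from the part annotations.
-
Once the FCN is trained, it localizes parts in the fine-grained test images fairly accurately and outputs part masks; their union is the object mask. The masks are used for both part localization and useful feature selection.
-
Each part is fed to a CNN sub-network that outputs a feature map. The part masks and the object mask obtained above serve as weights and are multiplied element-wise with the corresponding spatial positions. Max pooling and average pooling are then applied separately, and the two pooled features are concatenated into the sub-network's final feature vector. Finally, the feature vectors of the three sub-networks are concatenated to represent the whole image.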
-
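The mask-weighted pooling described in the Mask-CNN notes above can be sketched in plain Python. This is a simplified illustration, not the authors' implementation: the function names, the toy values, and the reuse of a single feature map for all three streams are assumptions (in the method each sub-network produces its own feature map), and pooling is taken over all spatial positions after masking.

```python
def masked_pool(feature_map, mask):
    """Weight a C x H x W feature map by an H x W mask, then return
    [global max pool per channel] + [global average pool per channel].
    Assumes non-negative (post-ReLU) activations, so masked-out zeros
    never win the max unless the whole region is zero."""
    max_feats, avg_feats = [], []
    for channel in feature_map:
        weighted = [v * m
                    for row, mask_row in zip(channel, mask)
                    for v, m in zip(row, mask_row)]
        max_feats.append(max(weighted))
        avg_feats.append(sum(weighted) / len(weighted))
    return max_feats + avg_feats

def image_representation(streams):
    """Concatenate the pooled vectors of all (feature_map, mask) streams."""
    vec = []
    for fmap, mask in streams:
        vec.extend(masked_pool(fmap, mask))
    return vec

# Toy input: 2 channels on a 2x2 grid (hypothetical values).
fmap = [[[1.0, 2.0], [3.0, 4.0]],
        [[4.0, 3.0], [2.0, 1.0]]]
head_mask = [[1, 0], [0, 0]]
torso_mask = [[0, 0], [1, 1]]
# The object mask is the union of the part masks.
object_mask = [[max(h, t) for h, t in zip(h_row, t_row)]
               for h_row, t_row in zip(head_mask, torso_mask)]

head_vec = masked_pool(fmap, head_mask)
full_vec = image_representation(
    [(fmap, head_mask), (fmap, torso_mask), (fmap, object_mask)])
print(head_vec, len(full_vec))
```

Each stream contributes 2 x C values (max plus average), so with three streams and C = 2 channels the final descriptor has 12 entries.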
-
Selective Sparse Sampling for Fine-Grained Image Recognition
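Proposes a simple and effective framework that captures fine-grained features without losing contextual information.
-
Class peak responses are used: local maxima are located in the class response maps to form sparse attention, which usually corresponds to subtle, fine-grained image parts.
- Two parallel sampling branches are defined to resample the image:
- Discriminative branch: extracts discriminative features
- Complementary branch: extracts complementary features
- The three outputs (from the original image and the two resampled branches) are concatenated and passed through an FC layer for the final classification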
-
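The peak-localization step that the sparse attention in S3N builds on can be illustrated with a small local-maximum search over a 2-D response map. This is only a sketch under stated assumptions: `find_peaks`, its `radius` and `threshold` parameters, and the toy response values are hypothetical, and the later assignment of peaks to the discriminative and complementary branches is not modelled here.

```python
def find_peaks(response_map, radius=1, threshold=0.0):
    """Return (row, col, value) for every position that is strictly larger
    than all neighbours within `radius`, strongest peak first."""
    height, width = len(response_map), len(response_map[0])
    peaks = []
    for r in range(height):
        for c in range(width):
            value = response_map[r][c]
            if value <= threshold:
                continue  # ignore weak responses
            neighbours = [
                response_map[i][j]
                for i in range(max(0, r - radius), min(height, r + radius + 1))
                for j in range(max(0, c - radius), min(width, c + radius + 1))
                if (i, j) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return sorted(peaks, key=lambda p: -p[2])

# Toy class response map (hypothetical values).
response = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.3, 0.1],
    [0.1, 0.3, 0.2, 0.4],
    [0.0, 0.1, 0.4, 0.7],
]
peaks = find_peaks(response)
print(peaks)
```

Each returned peak marks one sparse attention location; a sampler could then allocate more pixels around these locations when resampling the image.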
Related Works
-
Part-based R-CNNs for Fine-Grained Category Detection. Ning Zhang, Jeff Donahue, Ross Girshick, and Trevor Darrell. ECCV, 2014.
[code]
-
Fine-Grained Recognition without Part Annotations. Jonathan Krause, Hailin Jin, Jianchao Yang, and Li Fei-Fei. CVPR, 2015.
[code]
-
The Application of Two-level Attention Models in Deep Convolutional Neural Network for Fine-grained Image Classification. Tianjun Xiao, Yichong Xu, Kuiyuan Yang, Jiaxing Zhang, Yuxin Peng, and Zheng Zhang. CVPR, 2015.
-
Deep LAC: Deep Localization, Alignment and Classification for Fine-grained Recognition. Di Lin, Xiaoyong Shen, Cewu Lu, and Jiaya Jia. CVPR, 2015.
-
Spatial Transformer Networks. Max Jaderberg, Karen Simonyan, Andrew Zisserman, and Koray Kavukcuoglu. NeurIPS, 2015.
[code]
-
Part-Stacked CNN for Fine-Grained Visual Categorization. Shaoli Huang, Zhe Xu, Dacheng Tao, and Ya Zhang. CVPR, 2016.
-
Mining Discriminative Triplets of Patches for Fine-Grained Classification. Yaming Wang, Jonghyun Choi, Vlad I. Morariu, and Larry S. Davis. CVPR, 2016.
-
SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-grained Recognition. Han Zhang, Tao Xu, Mohamed Elhoseiny, Xiaolei Huang, Shaoting Zhang, Ahmed Elgammal, and Dimitris Metaxas. CVPR, 2016.
-
Picking Deep Filter Responses for Fine-grained Image Recognition. Xiaopeng Zhang, Hongkai Xiong, Wengang Zhou, Weiyao Lin, and Qi Tian. CVPR, 2016.
-
Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-Grained Image Recognition. Jianlong Fu, Heliang Zheng, and Tao Mei. CVPR, 2017.
-
Fine-Grained Recognition as HSnet Search for Informative Image Parts. Michael Lam, Behrooz Mahasseni, and Sinisa Todorovic. CVPR, 2017.
-
Learning Multi-attention Convolutional Neural Network for Fine-Grained Image Recognition. Heliang Zheng, Jianlong Fu, Tao Mei, and Jiebo Luo. ICCV, 2017.
[code]
-
Weakly Supervised Learning of Part Selection Model with Spatial Constraints for Fine-Grained Image Classification. Xiangteng He, and Yuxin Peng. AAAI, 2017.
-
Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition. Xiao Liu, Jiang Wang, Shilei Wen, Errui Ding, and Yuanqing Lin. AAAI, 2017.
-
Learning to Navigate for Fine-grained Classification. Ze Yang, Tiange Luo, Dong Wang, Zhiqiang Hu, Jun Gao, and Liwei Wang. ECCV, 2018.
[code]
-
Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition. Ming Sun, Yuchen Yuan, Feng Zhou, and Errui Ding. ECCV, 2018.
[code]
-
Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification From the Bottom Up. Weifeng Ge, Xiangru Lin, and Yizhou Yu. CVPR, 2019.
-
Selective Sparse Sampling for Fine-grained Image Recognition. Yao Ding, Yanzhao Zhou, Yi Zhu, Qixiang Ye, and Jianbin Jiao. ICCV, 2019.
[code]
-
Interpretable and Accurate Fine-grained Recognition via Region Grouping. Zixuan Huang, and Yin Li. CVPR, 2020.
-
Weakly Supervised Fine-grained Image Classification via Guassian Mixture Model Oriented Discriminative Learning. Zhihui Wang, Shijie Wang, Shuhui Yang, Haojie Li, Jianjun Li, and Zezhou Li. CVPR, 2020.
-
Graph-Propagation Based Correlation Learning for Weakly Supervised Fine-Grained Image Classification. Zhihui Wang, Shijie Wang, Haojie Li, Zhi Dou, and Jianjun Li. AAAI, 2020.
-
Filtration and Distillation: Enhancing Region Attention for Fine-Grained Visual Categorization. Chuanbin Liu, Hongtao Xie, Zheng-Jun Zha, Lingfeng Ma, Lingyun Yu, and Yongdong Zhang. AAAI, 2020.
Fine-grained recognition by end-to-end feature encoding
Methods based on end-to-end feature encoding
Classical State-of-the-arts
Related Works
-
Fine-Grained Visual Categorization via Multi-stage Metric Learning. Qi Qian, Rong Jin, Shenghuo Zhu, and Yuanqing Lin. CVPR, 2015.
-
Hyper-Class Augmented and Regularized Deep Learning for Fine-Grained Image Classification. Saining Xie, Tianbao Yang, Xiaoyu Wang, and Yuanqing Lin. CVPR, 2015.
-
Subset Feature Learning for Fine-Grained Category Classification. ZongYuan Ge, Christopher McCool, Conrad Sanderson, and Peter Corke. CVPR, 2015.
-
Bilinear CNN Models for Fine-grained Visual Recognition. Tsung-Yu Lin, Aruni RoyChowdhury, and Subhransu Maji. ICCV, 2015.
[code]
-
Multiple Granularity Descriptors for Fine-Grained Categorization. Dequan Wang, Zhiqiang Shen, Jie Shao, Wei Zhang, Xiangyang Xue, and Zheng Zhang. ICCV, 2015.
-
Compact Bilinear Pooling. Yang Gao, Oscar Beijbom, Ning Zhang, and Trevor Darrell. CVPR, 2016.
[code]
-
Fine-Grained Image Classification by Exploring Bipartite-Graph Labels. Feng Zhou, and Yuanqing Lin. CVPR, 2016.
[project page]
-
Kernel Pooling for Convolutional Neural Networks. Yin Cui, Feng Zhou, Jiang Wang, Xiao Liu, Yuanqing Lin, and Serge Belongie. CVPR, 2017.
-
Low-rank Bilinear Pooling for Fine-Grained Classification. Shu Kong, and Charless Fowlkes. CVPR, 2017.
[code]
-
Higher-order Integration of Hierarchical Convolutional Activations for Fine-Grained Visual Categorization. Sijia Cai, Wangmeng Zuo, and Lei Zhang. ICCV, 2017.
[code]
-
Learning a Discriminative Filter Bank within a CNN for Fine-grained Recognition. Yaming Wang, Vlad I. Morariu, and Larry S. Davis. CVPR, 2018.
[code]
-
Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization. Peihua Li, Jiangtao Xie, Qilong Wang, and Zilin Gao. CVPR, 2018.
[code]
-
Maximum-Entropy Fine Grained Classification. Abhimanyu Dubey, Otkrist Gupta, Ramesh Raskar, and Nikhil Naik. NeurIPS, 2018.
-
Pairwise Confusion for Fine-Grained Visual Classification. Abhimanyu Dubey, Otkrist Gupta, Pei Guo, Ramesh Raskar, Ryan Farrell, and Nikhil Naik. ECCV, 2018.
[code]
-
DeepKSPD: Learning Kernel-matrix-based SPD Representation for Fine-Grained Image Recognition. Melih Engin, Lei Wang, Luping Zhou, and Xinwang Liu. ECCV, 2018.
-
Hierarchical Bilinear Pooling for Fine-Grained Visual Recognition. Chaojian Yu, Xinyi Zhao, Qi Zheng, Peng Zhang, and Xinge You. ECCV, 2018.
[code]
-
Grassmann Pooling as Compact Homogeneous Bilinear Pooling for Fine-Grained Visual Classification. Xing Wei, Yue Zhang, Yihong Gong, Jiawei Zhang, and Nanning Zheng. ECCV, 2018.
-
Learning Deep Bilinear Transformation for Fine-grained Image Representation. Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, and Jiebo Luo. NeurIPS, 2019.
-
Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition. Heliang Zheng, Jianlong Fu, Zheng-Jun Zha, and Jiebo Luo. CVPR, 2019.
[code]
-
Destruction and Construction Learning for Fine-grained Image Recognition. Yue Chen, Yalong Bai, Wei Zhang, and Tao Mei. CVPR, 2019.
[code]
-
Learning a Mixture of Granularity-Specific Experts for Fine-Grained Categorization. Lianbo Zhang, Shaoli Huang, Wei Liu, and Dacheng Tao. ICCV, 2019.
-
Cross-X Learning for Fine-Grained Visual Categorization. Wei Luo, Xiong Yang, Xianjie Mo, Yuheng Lu, Larry S. Davis, Jun Li, Jian Yang, and Ser-Nam Lim. ICCV, 2019.
[code]
-
Fine-grained Image-to-Image Transformation towards Visual Recognition. Wei Xiong, Yutong He, Yixuan Zhang, Wenhan Luo, Lin Ma, and Jiebo Luo. CVPR, 2020.
-
Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization. Ruyi Ji, Longyin Wen, Libo Zhang, Dawei Du, Yanjun Wu, Chen Zhao, Xianglong Liu, and Feiyue Huang. CVPR, 2020.
[code]
-
Fine-Grained Visual Classification via Progressive Multi-Granularity Training of Jigsaw Patches. Ruoyi Du, Dongliang Chang, Ayan Kumar Bhunia, Jiyang Xie, Yi-Zhe Song, Zhanyu Ma, and Jun Guo. ECCV, 2020.
[code]
-
Channel Interaction Networks for Fine-Grained Image Categorization. Yu Gao, Xintong Han, Xun Wang, Weilin Huang, and Matthew R. Scott. AAAI, 2020.
-
Learning Attentive Pairwise Interaction for Fine-Grained Classification. Peiqin Zhuang, Yali Wang, and Yu Qiao. AAAI, 2020.
-
Fine-grained Recognition: Accounting for Subtle Differences between Similar Classes. Guolei Sun, Hisham Cholakkal, Salman Khan, Fahad Shahbaz Khan, and Ling Shao. AAAI, 2020.
Fine-grained recognition by leveraging attention mechanisms
Methods that leverage attention mechanisms
Classical State-of-the-arts
Fine-grained recognition by contrastive learning
Methods that leverage contrastive learning
Classical State-of-the-arts
-
Learning Attentive Pairwise Interaction for Fine-Grained Classification
-
Channel Interaction Networks for Fine-Grained Image Categorization
Fine-grained recognition with external information
Uses external information to reduce annotation cost
Fine-grained recognition with web data / auxiliary data
Web data and auxiliary data are noisy, so a model is needed to denoise them before use.
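One simple way to picture model-based denoising is confidence filtering: a classifier trained on clean data scores each web image, and images whose web label the classifier does not support are dropped. The sketch below is a generic illustration and not the method of any paper listed here; the function name, the `min_confidence` threshold, the sample IDs, and the probabilities are all hypothetical.

```python
def filter_noisy_samples(samples, min_confidence=0.5):
    """Split web samples into kept and dropped IDs.

    samples: list of (sample_id, web_label, class_probs), where class_probs
    maps each label to the probability predicted by a model trained on
    clean data. A sample is kept when the model assigns at least
    `min_confidence` to the label that came with the web image."""
    kept, dropped = [], []
    for sample_id, web_label, class_probs in samples:
        if class_probs.get(web_label, 0.0) >= min_confidence:
            kept.append(sample_id)
        else:
            dropped.append(sample_id)
    return kept, dropped

# Hypothetical web-crawled samples with model predictions.
samples = [
    ("img_001", "laysan_albatross", {"laysan_albatross": 0.82, "sooty_albatross": 0.18}),
    ("img_002", "laysan_albatross", {"laysan_albatross": 0.07, "sooty_albatross": 0.93}),
    ("img_003", "sooty_albatross", {"laysan_albatross": 0.45, "sooty_albatross": 0.55}),
]
kept, dropped = filter_noisy_samples(samples, min_confidence=0.5)
print(kept, dropped)
```

A fixed threshold is the crudest choice; the papers below use more elaborate strategies such as sample selection, domain adaptation, or adversarial training.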
-
Augmenting Strong Supervision Using Web Data for Fine-Grained Categorization. Zhe Xu, Shaoli Huang, Ya Zhang, and Dacheng Tao. ICCV, 2015.
-
Webly Supervised Learning Meets Zero-shot Learning: A Hybrid Approach for Fine-grained Classification. Li Niu, Ashok Veeraraghavan, and Ashutosh Sabharwal. CVPR, 2018.
-
Fine-Grained Visual Categorization using Meta-Learning Optimization with Sample Selection of Auxiliary Data. Yabin Zhang, Hui Tang, and Kui Jia. ECCV, 2018.
[code]
-
Webly-Supervised Fine-Grained Visual Categorization via Deep Domain Adaptation. Zhe Xu, Shaoli Huang, Ya Zhang, and Dacheng Tao. IEEE TPAMI, 2018.
-
Learning from Web Data using Adversarial Discriminative Neural Networks for Fine-Grained Classification. Xiaoxiao Sun, Liyi Chen, and Jufeng Yang. AAAI, 2019.
-
Web-Supervised Network with Softly Update-Drop Training for Fine-Grained Visual Classification. Chuanyi Zhang, Yazhou Yao, Huafeng Liu, Guo-Sen Xie, Xiangbo Shu, Tianfei Zhou, Zheng Zhang, Fumin Shen, and Zhenmin Tang. AAAI, 2020.
Fine-grained recognition with multi-modality data
-
Fine-Grained Image Classification via Combining Vision and Language. Xiangteng He, and Yuxin Peng. CVPR, 2017.
-
Audio Visual Attribute Discovery for Fine-Grained Object Recognition. Hua Zhang, Xiaochun Cao, and Rui Wang. AAAI, 2018.
-
Fine-grained Image Classification by Visual-Semantic Embedding. Huapeng Xu, Guilin Qi, Jingjing Li, Meng Wang, Kang Xu, and Huan Gao. IJCAI, 2018.
-
Knowledge-Embedded Representation Learning for Fine-Grained Image Recognition. Tianshui Chen, Liang Lin, Riquan Chen, Yang Wu, and Xiannan Luo. IJCAI, 2018.
-
Bi-Modal Progressive Mask Attention for Fine-Grained Recognition. Kaitao Song, Xiu-Shen Wei, Xiangbo Shu, Ren-Jie Song, and Jianfeng Lu. IEEE TIP, 2020.
Fine-grained recognition with humans in the loop
-
Fine-grained Categorization and Dataset Bootstrapping using Deep Metric Learning with Humans in the Loop. Yin Cui, Feng Zhou, Yuanqing Lin, and Serge Belongie. CVPR, 2016.
-
Leveraging the Wisdom of the Crowd for Fine-Grained Recognition. Jia Deng, Jonathan Krause, Michael Stark, and Li Fei-Fei. IEEE TPAMI, 2016.
Fine-grained image recognition with limited data
Applications of few-shot learning to fine-grained recognition
Classical State-of-the-arts
-
Piecewise Classifier Mappings: Learning Fine-Grained Learners for Novel Categories with Few Examples
-
Multi-Attention Meta Learning for Few-Shot Fine-Grained Image Recognition
-
Revisiting Pose-Normalization for Fine-Grained Few-Shot Recognition
Fine-grained image retrieval
Unsupervised with pre-trained models
- Selective Convolutional Descriptor Aggregation for Fine-Grained Image Retrieval.
Xiu-Shen Wei, Jian-Hao Luo, Jianxin Wu, and Zhi-Hua Zhou. IEEE TIP, 2017.
[project page]
Supervised with metric learning
-
Centralized Ranking Loss with Weakly Supervised Localization for Fine-Grained Object Retrieval. Xiawu Zheng, Rongrong Ji, Xiaoshuai Sun, Yongjian Wu, Feiyue Huang, and Yanhua Yang. IJCAI, 2018.
-
Towards Optimal Fine Grained Retrieval via Decorrelated Centralized Loss with Normalize-Scale layer. Xiawu Zheng, Rongrong Ji, Xiaoshuai Sun, Baochang Zhang, Yongjian Wu, and Feiyue Huang. AAAI, 2019.
Fine-grained image generation
Generating from fine-grained image distributions
-
CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training. Jianmin Bao, Dong Chen, Fang Wen, Houqiang Li, and Gang Hua. ICCV, 2017.
[code]
-
FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery. Krishna Kumar Singh, Utkarsh Ojha, and Yong Jae Lee. CVPR, 2019.
[code]
Generating from text descriptions
- AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks.
Tao Xu, Pengchuan Zhang, Qiuyuan Huang, Han Zhang, Zhe Gan, Xiaolei Huang, and Xiaodong He. CVPR, 2018.
[code]
Future directions of FGIA
Fine-grained few shot learning
-
Piecewise classifier mappings: Learning fine-grained learners for novel categories with few examples. Xiu-Shen Wei, Peng Wang, Lingqiao Liu, Chunhua Shen, and Jianxin Wu. IEEE TIP, 2019.
-
Meta-Reinforced Synthetic Data for One-Shot Fine-Grained Visual Recognition. Satoshi Tsutsui, Yanwei Fu, and David Crandall. NeurIPS, 2019.
-
Revisiting Pose-Normalization for Fine-Grained Few-Shot Recognition. Luming Tang, Davis Wertheimer, and Bharath Hariharan. CVPR, 2020.
[code]
-
Multi-attention Meta Learning for Few-shot Fine-grained Image Recognition. Yaohui Zhu, Chenlong Liu, and Shuqiang Jiang. IJCAI, 2020.
Fine-grained hashing
- ExchNet: A Unified Hashing Network for Large-Scale Fine-Grained Image Retrieval. Quan Cui, Qing-Yuan Jiang, Xiu-Shen Wei, Wu-Jun Li, and Osamu Yoshie. ECCV, 2020.
Fine-grained domain adaptation
-
Fine-grained Recognition in the Wild: A Multi-Task Domain Adaptation Approach. Timnit Gebru, Judy Hoffman, and Li Fei-Fei. ICCV, 2017.
-
Progressive Adversarial Networks for Fine-Grained Domain Adaptation. Sinan Wang, Xinyang Chen, Yunbo Wang, Mingsheng Long, and Jianmin Wang. CVPR, 2020.
-
An Adversarial Domain Adaptation Network for Cross-Domain Fine-Grained Recognition. Yimu Wang, Ren-Jie Song, Xiu-Shen Wei, and Lijun Zhang. WACV, 2020.
FGIA within more realistic settings
-
The iNaturalist Species Classification and Detection Dataset. Grant Van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, and Serge Belongie. CVPR, 2018.
-
RPC: A Large-Scale Retail Product Checkout Dataset. Xiu-Shen Wei, Quan Cui, Lei Yang, Peng Wang, and Lingqiao Liu. arXiv: 1901.07249, 2019.
[project page]
-
Presence-Only Geographical Priors for Fine-Grained Image Classification. Oisin Mac Aodha, Elijah Cole, and Pietro Perona. ICCV, 2019.
Recognition leaderboard
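Test accuracy on the CUB200-2011 dataset. For each of the current best methods, the table lists whether it uses annotations (bounding boxes or parts), whether it uses external data, the base network, the input image resolution, and the classification accuracy.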
Method | Publication | BBox? | Part? | External information? | Base model | Image resolution | Accuracy |
---|---|---|---|---|---|---|---|
PB R-CNN | ECCV 2014 | | | | Alex-Net | 224x224 | 73.9% |
MaxEnt | NeurIPS 2018 | | | | GoogLeNet | TBD | 74.4% |
PB R-CNN | ECCV 2014 | ✓ | | | Alex-Net | 224x224 | 76.4% |
PS-CNN | CVPR 2016 | ✓ | ✓ | | CaffeNet | 454x454 | 76.6% |
MaxEnt | NeurIPS 2018 | | | | VGG-16 | TBD | 77.0% |
Mask-CNN | PR 2018 | | ✓ | | Alex-Net | 448x448 | 78.6% |
PC | ECCV 2018 | | | | ResNet-50 | TBD | 80.2% |
DeepLAC | CVPR 2015 | ✓ | ✓ | | Alex-Net | 227x227 | 80.3% |
MaxEnt | NeurIPS 2018 | | | | ResNet-50 | TBD | 80.4% |
Triplet-A | CVPR 2016 | ✓ | | Manual labour | GoogLeNet | TBD | 80.7% |
Multi-grained | ICCV 2015 | | | WordNet etc. | VGG-19 | 224x224 | 81.7% |
Krause et al. | CVPR 2015 | ✓ | | | CaffeNet | TBD | 82.0% |
Multi-grained | ICCV 2015 | ✓ | | WordNet etc. | VGG-19 | 224x224 | 83.0% |
TS | CVPR 2016 | | | | VGGD+VGGM | 448x448 | 84.0% |
Bilinear CNN | ICCV 2015 | | | | VGGD+VGGM | 448x448 | 84.1% |
STN | NeurIPS 2015 | | | | GoogLeNet+BN | 448x448 | 84.1% |
LRBP | CVPR 2017 | | | | VGG-16 | 224x224 | 84.2% |
PDFS | CVPR 2016 | | | | VGG-16 | TBD | 84.5% |
Xu et al. | ICCV 2015 | ✓ | ✓ | Web data | CaffeNet | 224x224 | 84.6% |
Cai et al. | ICCV 2017 | | | | VGG-16 | 448x448 | 85.3% |
RA-CNN | CVPR 2017 | | | | VGG-19 | 448x448 | 85.3% |
MaxEnt | NeurIPS 2018 | | | | Bilinear CNN | TBD | 85.3% |
PC | ECCV 2018 | | | | Bilinear CNN | TBD | 85.6% |
CVL | CVPR 2017 | | | Texts | VGG | TBD | 85.6% |
Mask-CNN | PR 2018 | | ✓ | | VGG-16 | 448x448 | 85.7% |
GP-256 | ECCV 2018 | | | | VGG-16 | 448x448 | 85.8% |
KP | CVPR 2017 | | | | VGG-16 | 224x224 | 86.2% |
T-CNN | IJCAI 2018 | | | | ResNet | 224x224 | 86.2% |
MA-CNN | ICCV 2017 | | | | VGG-19 | 448x448 | 86.5% |
MaxEnt | NeurIPS 2018 | | | | DenseNet-161 | TBD | 86.5% |
DeepKSPD | ECCV 2018 | | | | VGG-19 | 448x448 | 86.5% |
OSME+MAMC | ECCV 2018 | | | | ResNet-101 | 448x448 | 86.5% |
StackDRL | IJCAI 2018 | | | | VGG-19 | 224x224 | 86.6% |
DFL-CNN | CVPR 2018 | | | | VGG-16 | 448x448 | 86.7% |
Bi-Modal PMA | IEEE TIP 2020 | | | | VGG-16 | 448x448 | 86.8% |
PC | ECCV 2018 | | | | DenseNet-161 | TBD | 86.9% |
KERL | IJCAI 2018 | | | Attributes | VGG-16 | 224x224 | 87.0% |
HBP | ECCV 2018 | | | | VGG-16 | 448x448 | 87.1% |
Mask-CNN | PR 2018 | | ✓ | | ResNet-50 | 448x448 | 87.3% |
DFL-CNN | CVPR 2018 | | | | ResNet-50 | 448x448 | 87.4% |
NTS-Net | ECCV 2018 | | | | ResNet-50 | 448x448 | 87.5% |
HSnet | CVPR 2017 | ✓ | ✓ | | GoogLeNet+BN | TBD | 87.5% |
Bi-Modal PMA | IEEE TIP 2020 | | | | ResNet-50 | 448x448 | 87.5% |
CIN | AAAI 2020 | | | | ResNet-50 | 448x448 | 87.5% |
MetaFGNet | ECCV 2018 | | | Auxiliary data | ResNet-34 | TBD | 87.6% |
Cross-X | ICCV 2019 | | | | ResNet-50 | 448x448 | 87.7% |
DCL | CVPR 2019 | | | | ResNet-50 | 448x448 | 87.8% |
ACNet | CVPR 2020 | | | | VGG-16 | 448x448 | 87.8% |
TASN | CVPR 2019 | | | | ResNet-50 | 448x448 | 87.9% |
ACNet | CVPR 2020 | | | | ResNet-50 | 448x448 | 88.1% |
CIN | AAAI 2020 | | | | ResNet-101 | 448x448 | 88.1% |
DBTNet-101 | NeurIPS 2019 | | | | ResNet-101 | 448x448 | 88.1% |
Bi-Modal PMA | IEEE TIP 2020 | | | Texts | VGG-16 | 448x448 | 88.2% |
GCL | AAAI 2020 | | | | ResNet-50 | 448x448 | 88.3% |
S3N | ICCV 2019 | | | | ResNet-50 | 448x448 | 88.5% |
Sun et al. | AAAI 2020 | | | | ResNet-50 | 448x448 | 88.6% |
FDL | AAAI 2020 | | | | ResNet-50 | 448x448 | 88.6% |
Bi-Modal PMA | IEEE TIP 2020 | | | Texts | ResNet-50 | 448x448 | 88.7% |
DF-GMM | CVPR 2020 | | | | ResNet-50 | 448x448 | 88.8% |
PMG | ECCV 2020 | | | | VGG-16 | 550x550 | 88.8% |
FDL | AAAI 2020 | | | | DenseNet-161 | 448x448 | 89.1% |
PMG | ECCV 2020 | | | | ResNet-50 | 550x550 | 89.6% |
API-Net | AAAI 2020 | | | | DenseNet-161 | 512x512 | 90.0% |
Ge et al. | CVPR 2019 | | | | GoogLeNet+BN | Shorter side is 800 px | 90.4% |