Class-Difficulty Based Methods for Long-Tailed Visual Recognition


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2022-08-18

AUTHORS

Saptarshi Sinha, Hiroki Ohashi, Katsuyuki Nakamura

ABSTRACT

Long-tailed datasets are frequently encountered in real-world use cases, where a few classes or categories (known as majority or head classes) have a higher number of data samples than the other classes (known as minority or tail classes). Training deep neural networks on such datasets gives results biased towards the head classes. So far, researchers have come up with multiple weighted-loss and data re-sampling techniques in an effort to reduce this bias. However, most of these techniques assume that the tail classes are always the most difficult classes to learn and therefore need more weightage or attention. Here, we argue that this assumption might not always hold true. Therefore, we propose a novel approach to dynamically measure the instantaneous difficulty of each class during the training phase of the model. Further, we use the difficulty measures of each class to design a novel weighted loss technique called ‘class-wise difficulty based weighted (CDB-W) loss’ and a novel data sampling technique called ‘class-wise difficulty based sampling (CDB-S)’. To verify the wide-scale usability of our CDB methods, we conducted extensive experiments on multiple tasks such as image classification, object detection, instance segmentation and video-action classification. Results verified that CDB-W loss and CDB-S could achieve state-of-the-art results on many class-imbalanced datasets that resemble real-world use cases, such as ImageNet-LT, LVIS and EGTEA.
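The abstract only describes the CDB idea at a high level. The following is a minimal, illustrative PyTorch sketch of class-wise difficulty based weighting and sampling, assuming that difficulty is estimated as one minus each class's running accuracy during training and that weights scale with difficulty raised to a power tau; the function names, the accuracy-based difficulty estimate and the exponent are assumptions made here for illustration, not the paper's exact formulation.

import torch
import torch.nn.functional as F
from torch.utils.data import WeightedRandomSampler

def class_difficulty(correct_per_class, total_per_class, eps=1e-8):
    # Assumed difficulty estimate: 1 minus the running accuracy of each class,
    # recomputed during training so it tracks the model's current state.
    accuracy = correct_per_class.float() / (total_per_class.float() + eps)
    return 1.0 - accuracy

def cdb_weights(difficulty, tau=1.0):
    # Harder classes (higher difficulty) get larger weights; tau is an assumed
    # knob controlling how sharply the weighting reacts to difficulty.
    w = difficulty ** tau
    return w * len(w) / (w.sum() + 1e-8)  # normalise so the mean weight is ~1

def cdb_w_loss(logits, targets, class_weights):
    # CDB-W style loss (sketch): per-sample cross-entropy scaled by the weight
    # of each sample's ground-truth class.
    ce = F.cross_entropy(logits, targets, reduction="none")
    return (class_weights[targets] * ce).mean()

def cdb_sampler(labels, class_weights):
    # CDB-S style sampling (sketch): draw training samples with probability
    # proportional to the difficulty-based weight of their class. 'labels' is
    # a hypothetical 1-D LongTensor with the class index of every sample.
    sample_weights = class_weights[labels]
    return WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)

In actual training the weights would be recomputed periodically (e.g., every epoch) from per-class accuracy; consult the paper for the exact difficulty measure and update schedule.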

PAGES

2517-2531

References to SciGraph publications

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s11263-022-01643-3

DOI

http://dx.doi.org/10.1007/s11263-022-01643-3

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1150326039


Indexing Status: check whether this publication has been indexed by Scopus and Web of Science using the SN Indexing Status Tool.
Incoming Citations: browse incoming citations for this publication using opencitations.net.

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service such as the JSON-LD Playground or Google SDTT.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Intelligent Vision Research Department, Hitachi Ltd., 185-8601, Kokubunji, Tokyo, Japan", 
          "id": "http://www.grid.ac/institutes/grid.417547.4", 
          "name": [
            "Intelligent Vision Research Department, Hitachi Ltd., 185-8601, Kokubunji, Tokyo, Japan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Sinha", 
        "givenName": "Saptarshi", 
        "id": "sg:person.016025532327.19", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016025532327.19"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Intelligent Vision Research Department, Hitachi Ltd., 185-8601, Kokubunji, Tokyo, Japan", 
          "id": "http://www.grid.ac/institutes/grid.417547.4", 
          "name": [
            "Intelligent Vision Research Department, Hitachi Ltd., 185-8601, Kokubunji, Tokyo, Japan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ohashi", 
        "givenName": "Hiroki", 
        "id": "sg:person.016550503100.94", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016550503100.94"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "R &D Group, Hitachi Ltd., 185-8601, Kokubunji, Tokyo, Japan", 
          "id": "http://www.grid.ac/institutes/grid.417547.4", 
          "name": [
            "R &D Group, Hitachi Ltd., 185-8601, Kokubunji, Tokyo, Japan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Nakamura", 
        "givenName": "Katsuyuki", 
        "id": "sg:person.010643012300.54", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010643012300.54"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/978-3-030-69544-6_33", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1135743780", 
          "https://doi.org/10.1007/978-3-030-69544-6_33"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-030-58568-6_43", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1132574890", 
          "https://doi.org/10.1007/978-3-030-58568-6_43"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/11538059_91", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022715535", 
          "https://doi.org/10.1007/11538059_91"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-319-98074-4", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1108232028", 
          "https://doi.org/10.1007/978-3-319-98074-4"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-030-01228-1_38", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1107463274", 
          "https://doi.org/10.1007/978-3-030-01228-1_38"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-319-46466-4_37", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009862037", 
          "https://doi.org/10.1007/978-3-319-46466-4_37"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2022-08-18", 
    "datePublishedReg": "2022-08-18", 
    "description": "Long-tailed datasets are very frequently encountered in real-world use cases where few classes or categories (known as majority or head classes) have higher number of data samples compared to the other classes (known as minority or tail classes). Training deep neural networks on such datasets gives results biased towards the head classes. So far, researchers have come up with multiple weighted loss and data re-sampling techniques in efforts to reduce the bias. However, most of such techniques assume that the tail classes are always the most difficult classes to learn and therefore need more weightage or attention. Here, we argue that the assumption might not always hold true. Therefore, we propose a novel approach to dynamically measure the instantaneous difficulty of each class during the training phase of the model. Further, we use the difficulty measures of each class to design a novel weighted loss technique called \u2018class-wise difficulty based weighted (CDB-W) loss\u2019 and a novel data sampling technique called \u2018class-wise difficulty based sampling (CDB-S)\u2019. To verify the wide-scale usability of our CDB methods, we conducted extensive experiments on multiple tasks such as image classification, object detection, instance segmentation and video-action classification. Results verified that CDB-W loss and CDB-S could achieve state-of-the-art results on many class-imbalanced datasets such as ImageNet-LT, LVIS and EGTEA, that resemble real-world use cases.", 
    "genre": "article", 
    "id": "sg:pub.10.1007/s11263-022-01643-3", 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1032807", 
        "issn": [
          "0920-5691", 
          "1573-1405"
        ], 
        "name": "International Journal of Computer Vision", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "10", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "130"
      }
    ], 
    "keywords": [
      "real-world use cases", 
      "use cases", 
      "video action classification", 
      "deep neural networks", 
      "class-imbalanced datasets", 
      "instance segmentation", 
      "image classification", 
      "Extensive experiments", 
      "art results", 
      "neural network", 
      "re-sampling techniques", 
      "tail classes", 
      "multiple tasks", 
      "such datasets", 
      "training phase", 
      "visual recognition", 
      "head classes", 
      "difficult class", 
      "dataset", 
      "Based Method", 
      "data samples", 
      "such techniques", 
      "ImageNet-LT", 
      "more weightage", 
      "novel approach", 
      "difficulty measures", 
      "classification", 
      "EGTEA", 
      "usability", 
      "segmentation", 
      "network", 
      "technique", 
      "task", 
      "recognition", 
      "class", 
      "novel data", 
      "difficulties", 
      "weightage", 
      "method", 
      "detection", 
      "researchers", 
      "results", 
      "model", 
      "data", 
      "experiments", 
      "efforts", 
      "number", 
      "attention", 
      "categories", 
      "assumption", 
      "state", 
      "higher number", 
      "CDB", 
      "cases", 
      "sampling", 
      "measures", 
      "loss", 
      "LVI", 
      "phase", 
      "bias", 
      "Long", 
      "loss technique", 
      "samples", 
      "approach"
    ], 
    "name": "Class-Difficulty Based Methods for Long-Tailed Visual Recognition", 
    "pagination": "2517-2531", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1150326039"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s11263-022-01643-3"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s11263-022-01643-3", 
      "https://app.dimensions.ai/details/publication/pub.1150326039"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-12-01T06:44", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/article/article_921.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1007/s11263-022-01643-3"
  }
]
 

Download the RDF metadata as: json-ld, nt, turtle, or xml (license info available).

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s11263-022-01643-3'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s11263-022-01643-3'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s11263-022-01643-3'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s11263-022-01643-3'
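As an illustrative alternative to curl, the same record can be fetched from Python. This is a minimal sketch assuming the third-party 'requests' package and assuming the response body matches the JSON-LD shown earlier on this page (a single-element list); the field names used are those visible in that record.

import requests

URL = "https://scigraph.springernature.com/pub.10.1007/s11263-022-01643-3"

# Ask for the JSON-LD representation, mirroring the first curl example above.
response = requests.get(URL, headers={"Accept": "application/ld+json"})
response.raise_for_status()
record = response.json()

# The record shown on this page is a one-element list; unwrap it defensively.
article = record[0] if isinstance(record, list) else record
print(article["name"])             # Class-Difficulty Based Methods for Long-Tailed Visual Recognition
print(article["datePublished"])    # 2022-08-18
print([a["familyName"] for a in article["author"]])   # ['Sinha', 'Ohashi', 'Nakamura']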


 

This table displays all metadata directly associated with this object as RDF triples.

161 TRIPLES      21 PREDICATES      94 URIs      80 LITERALS      6 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/s11263-022-01643-3 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N052ef566eccb4164a31eef05feb7d20d
4 schema:citation sg:pub.10.1007/11538059_91
5 sg:pub.10.1007/978-3-030-01228-1_38
6 sg:pub.10.1007/978-3-030-58568-6_43
7 sg:pub.10.1007/978-3-030-69544-6_33
8 sg:pub.10.1007/978-3-319-46466-4_37
9 sg:pub.10.1007/978-3-319-98074-4
10 schema:datePublished 2022-08-18
11 schema:datePublishedReg 2022-08-18
12 schema:description Long-tailed datasets are very frequently encountered in real-world use cases where few classes or categories (known as majority or head classes) have higher number of data samples compared to the other classes (known as minority or tail classes). Training deep neural networks on such datasets gives results biased towards the head classes. So far, researchers have come up with multiple weighted loss and data re-sampling techniques in efforts to reduce the bias. However, most of such techniques assume that the tail classes are always the most difficult classes to learn and therefore need more weightage or attention. Here, we argue that the assumption might not always hold true. Therefore, we propose a novel approach to dynamically measure the instantaneous difficulty of each class during the training phase of the model. Further, we use the difficulty measures of each class to design a novel weighted loss technique called ‘class-wise difficulty based weighted (CDB-W) loss’ and a novel data sampling technique called ‘class-wise difficulty based sampling (CDB-S)’. To verify the wide-scale usability of our CDB methods, we conducted extensive experiments on multiple tasks such as image classification, object detection, instance segmentation and video-action classification. Results verified that CDB-W loss and CDB-S could achieve state-of-the-art results on many class-imbalanced datasets such as ImageNet-LT, LVIS and EGTEA, that resemble real-world use cases.
13 schema:genre article
14 schema:isAccessibleForFree true
15 schema:isPartOf Nf91669841b1c47da8cb86c1c6f258079
16 Nf97f3927d7394f72ac2d40fb8b354191
17 sg:journal.1032807
18 schema:keywords Based Method
19 CDB
20 EGTEA
21 Extensive experiments
22 ImageNet-LT
23 LVI
24 Long
25 approach
26 art results
27 assumption
28 attention
29 bias
30 cases
31 categories
32 class
33 class-imbalanced datasets
34 classification
35 data
36 data samples
37 dataset
38 deep neural networks
39 detection
40 difficult class
41 difficulties
42 difficulty measures
43 efforts
44 experiments
45 head classes
46 higher number
47 image classification
48 instance segmentation
49 loss
50 loss technique
51 measures
52 method
53 model
54 more weightage
55 multiple tasks
56 network
57 neural network
58 novel approach
59 novel data
60 number
61 phase
62 re-sampling techniques
63 real-world use cases
64 recognition
65 researchers
66 results
67 samples
68 sampling
69 segmentation
70 state
71 such datasets
72 such techniques
73 tail classes
74 task
75 technique
76 training phase
77 usability
78 use cases
79 video action classification
80 visual recognition
81 weightage
82 schema:name Class-Difficulty Based Methods for Long-Tailed Visual Recognition
83 schema:pagination 2517-2531
84 schema:productId N1e8dbc56fc864ab5bf6e2fcb6734c968
85 Nc566e35887c340f7891204b61f02e4f0
86 schema:sameAs https://app.dimensions.ai/details/publication/pub.1150326039
87 https://doi.org/10.1007/s11263-022-01643-3
88 schema:sdDatePublished 2022-12-01T06:44
89 schema:sdLicense https://scigraph.springernature.com/explorer/license/
90 schema:sdPublisher N8aa14fb67fb442ffa9115d90c422bd1f
91 schema:url https://doi.org/10.1007/s11263-022-01643-3
92 sgo:license sg:explorer/license/
93 sgo:sdDataset articles
94 rdf:type schema:ScholarlyArticle
95 N052ef566eccb4164a31eef05feb7d20d rdf:first sg:person.016025532327.19
96 rdf:rest Na4867c3be3ae4e4c8f549e0d73ad951e
97 N1e8dbc56fc864ab5bf6e2fcb6734c968 schema:name dimensions_id
98 schema:value pub.1150326039
99 rdf:type schema:PropertyValue
100 N8aa14fb67fb442ffa9115d90c422bd1f schema:name Springer Nature - SN SciGraph project
101 rdf:type schema:Organization
102 Na4867c3be3ae4e4c8f549e0d73ad951e rdf:first sg:person.016550503100.94
103 rdf:rest Nc4702bd5a03049c6bb44f9af79aa7bec
104 Nc4702bd5a03049c6bb44f9af79aa7bec rdf:first sg:person.010643012300.54
105 rdf:rest rdf:nil
106 Nc566e35887c340f7891204b61f02e4f0 schema:name doi
107 schema:value 10.1007/s11263-022-01643-3
108 rdf:type schema:PropertyValue
109 Nf91669841b1c47da8cb86c1c6f258079 schema:issueNumber 10
110 rdf:type schema:PublicationIssue
111 Nf97f3927d7394f72ac2d40fb8b354191 schema:volumeNumber 130
112 rdf:type schema:PublicationVolume
113 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
114 schema:name Information and Computing Sciences
115 rdf:type schema:DefinedTerm
116 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
117 schema:name Artificial Intelligence and Image Processing
118 rdf:type schema:DefinedTerm
119 sg:journal.1032807 schema:issn 0920-5691
120 1573-1405
121 schema:name International Journal of Computer Vision
122 schema:publisher Springer Nature
123 rdf:type schema:Periodical
124 sg:person.010643012300.54 schema:affiliation grid-institutes:grid.417547.4
125 schema:familyName Nakamura
126 schema:givenName Katsuyuki
127 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010643012300.54
128 rdf:type schema:Person
129 sg:person.016025532327.19 schema:affiliation grid-institutes:grid.417547.4
130 schema:familyName Sinha
131 schema:givenName Saptarshi
132 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016025532327.19
133 rdf:type schema:Person
134 sg:person.016550503100.94 schema:affiliation grid-institutes:grid.417547.4
135 schema:familyName Ohashi
136 schema:givenName Hiroki
137 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016550503100.94
138 rdf:type schema:Person
139 sg:pub.10.1007/11538059_91 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022715535
140 https://doi.org/10.1007/11538059_91
141 rdf:type schema:CreativeWork
142 sg:pub.10.1007/978-3-030-01228-1_38 schema:sameAs https://app.dimensions.ai/details/publication/pub.1107463274
143 https://doi.org/10.1007/978-3-030-01228-1_38
144 rdf:type schema:CreativeWork
145 sg:pub.10.1007/978-3-030-58568-6_43 schema:sameAs https://app.dimensions.ai/details/publication/pub.1132574890
146 https://doi.org/10.1007/978-3-030-58568-6_43
147 rdf:type schema:CreativeWork
148 sg:pub.10.1007/978-3-030-69544-6_33 schema:sameAs https://app.dimensions.ai/details/publication/pub.1135743780
149 https://doi.org/10.1007/978-3-030-69544-6_33
150 rdf:type schema:CreativeWork
151 sg:pub.10.1007/978-3-319-46466-4_37 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009862037
152 https://doi.org/10.1007/978-3-319-46466-4_37
153 rdf:type schema:CreativeWork
154 sg:pub.10.1007/978-3-319-98074-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1108232028
155 https://doi.org/10.1007/978-3-319-98074-4
156 rdf:type schema:CreativeWork
157 grid-institutes:grid.417547.4 schema:alternateName Intelligent Vision Research Department, Hitachi Ltd., 185-8601, Kokubunji, Tokyo, Japan
158 R&D Group, Hitachi Ltd., 185-8601, Kokubunji, Tokyo, Japan
159 schema:name Intelligent Vision Research Department, Hitachi Ltd., 185-8601, Kokubunji, Tokyo, Japan
160 R&D Group, Hitachi Ltd., 185-8601, Kokubunji, Tokyo, Japan
161 rdf:type schema:Organization
 



