Semantic Bottlenecks: Quantifying and Improving Inspectability of Deep Representations


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2021-09-14

AUTHORS

Max Losch, Mario Fritz, Bernt Schiele

ABSTRACT

Today’s deep learning systems deliver high performance based on end-to-end training but are notoriously hard to inspect. We argue that there are at least two reasons making inspectability challenging: (i) representations are distributed across hundreds of channels and (ii) a unifying metric quantifying inspectability is lacking. In this paper, we address both issues by proposing Semantic Bottlenecks (SB), which can be integrated into pretrained networks, to align channel outputs with individual visual concepts and introduce the model agnostic Area Under inspectability Curve (AUiC) metric to measure the alignment. We present a case study on semantic segmentation to demonstrate that SBs improve the AUiC up to six-fold over regular network outputs. We explore two types of SB-layers in this work. First, concept-supervised SB-layers (SSB), which offer inspectability w.r.t. predefined concepts that the model is demanded to rely on. And second, unsupervised SBs (USB), which offer equally strong AUiC improvements by restricting distributedness of representations across channels. Importantly, for both SB types, we can recover state of the art segmentation performance across two different models despite a drastic dimensionality reduction from 1000s of non aligned channels to 10s of semantics-aligned channels that all downstream results are based on.

PAGES

3136-3153

References to SciGraph publications

  • 2014. Visualizing and Understanding Convolutional Networks in COMPUTER VISION – ECCV 2014
  • 2019-06-02. Semantic Bottleneck for Computer Vision Tasks in COMPUTER VISION – ACCV 2018
  • 2018-10-06. Unified Perceptual Parsing for Scene Understanding in COMPUTER VISION – ECCV 2018
  • 2020-11-07. Object-Contextual Representations for Semantic Segmentation in COMPUTER VISION – ECCV 2020
Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/s11263-021-01498-0

    DOI

    http://dx.doi.org/10.1007/s11263-021-01498-0

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1141079444


    Indexing Status: check whether this publication has been indexed by Scopus and Web of Science using the SN Indexing Status Tool.
    Incoming Citations: browse incoming citations for this publication using opencitations.net.

    JSON-LD is the canonical representation for SciGraph data.

    TIP: You can open this SciGraph record using an external JSON-LD service such as the JSON-LD Playground or Google SDTT.

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbr\u00fccken, Germany", 
              "id": "http://www.grid.ac/institutes/grid.419528.3", 
              "name": [
                "Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbr\u00fccken, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Losch", 
            "givenName": "Max", 
            "id": "sg:person.013504324635.08", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013504324635.08"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "CISPA Helmholtz Center for Information Security, Saarbr\u00fccken, Germany", 
              "id": "http://www.grid.ac/institutes/grid.507511.7", 
              "name": [
                "CISPA Helmholtz Center for Information Security, Saarbr\u00fccken, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Fritz", 
            "givenName": "Mario", 
            "id": "sg:person.013361072755.17", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013361072755.17"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbr\u00fccken, Germany", 
              "id": "http://www.grid.ac/institutes/grid.419528.3", 
              "name": [
                "Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbr\u00fccken, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Schiele", 
            "givenName": "Bernt", 
            "id": "sg:person.01174260421.90", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01174260421.90"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/978-3-030-01228-1_26", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1107463261", 
              "https://doi.org/10.1007/978-3-030-01228-1_26"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-030-20890-5_44", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1116240831", 
              "https://doi.org/10.1007/978-3-030-20890-5_44"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-030-58539-6_11", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1132404102", 
              "https://doi.org/10.1007/978-3-030-58539-6_11"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-319-10590-1_53", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1032233097", 
              "https://doi.org/10.1007/978-3-319-10590-1_53"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2021-09-14", 
        "datePublishedReg": "2021-09-14", 
        "description": "Today\u2019s deep learning systems deliver high performance based on end-to-end training but are notoriously hard to inspect. We argue that there are at least two reasons making inspectability challenging: (i) representations are distributed across hundreds of channels and (ii) a unifying metric quantifying inspectability is lacking. In this paper, we address both issues by proposing Semantic Bottlenecks (SB), which can be integrated into pretrained networks, to align channel outputs with individual visual concepts and introduce the model agnostic Area Under inspectability Curve (AUiC) metric to measure the alignment. We present a case study on semantic segmentation to demonstrate that SBs improve the AUiC up to six-fold over regular network outputs. We explore two types of SB-layers in this work. First, concept-supervised SB-layers (SSB), which offer inspectability w.r.t. predefined concepts that the model is demanded to rely on. And second, unsupervised SBs (USB), which offer equally strong AUiC improvements by restricting distributedness of representations across channels. Importantly, for both SB types, we can recover state of the art segmentation performance across two different models despite a drastic dimensionality reduction from 1000s of non aligned channels to 10s of semantics-aligned channels that all downstream results are based on.", 
        "genre": "article", 
        "id": "sg:pub.10.1007/s11263-021-01498-0", 
        "isAccessibleForFree": true, 
        "isPartOf": [
          {
            "id": "sg:journal.1032807", 
            "issn": [
              "0920-5691", 
              "1573-1405"
            ], 
            "name": "International Journal of Computer Vision", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "11", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "129"
          }
        ], 
        "keywords": [
          "deep learning system", 
          "semantic bottleneck", 
          "art segmentation performance", 
          "hundreds of channels", 
          "drastic dimensionality reduction", 
          "semantic segmentation", 
          "deep representation", 
          "end training", 
          "segmentation performance", 
          "visual concepts", 
          "learning system", 
          "network output", 
          "dimensionality reduction", 
          "high performance", 
          "channel output", 
          "representation", 
          "inspectability", 
          "distributedness", 
          "segmentation", 
          "case study", 
          "network", 
          "performance", 
          "bottleneck", 
          "different models", 
          "downstream results", 
          "concept", 
          "output", 
          "channels", 
          "model", 
          "system", 
          "issues", 
          "training", 
          "hundreds", 
          "alignment", 
          "work", 
          "SB type", 
          "improvement", 
          "quantifying", 
          "end", 
          "types", 
          "state", 
          "results", 
          "area", 
          "reasons", 
          "curves", 
          "AUIC", 
          "reduction", 
          "study", 
          "Sb layer", 
          "paper"
        ], 
        "name": "Semantic Bottlenecks: Quantifying and Improving Inspectability of Deep Representations", 
        "pagination": "3136-3153", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1141079444"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/s11263-021-01498-0"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/s11263-021-01498-0", 
          "https://app.dimensions.ai/details/publication/pub.1141079444"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-12-01T06:42", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/article/article_879.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1007/s11263-021-01498-0"
      }
    ]
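
    As a rough illustration of working with this canonical JSON-LD, the following minimal Python sketch (standard library only; the file name sg_record.jsonld is a hypothetical saved copy of the record above) extracts the title, DOI, and author names:

    import json

    # Load a local copy of the JSON-LD record shown above
    # (file name is hypothetical; save the record as sg_record.jsonld first).
    with open("sg_record.jsonld") as f:
        record = json.load(f)[0]  # the record is a one-element list

    title = record["name"]
    doi = next(p["value"][0] for p in record["productId"] if p["name"] == "doi")
    authors = [a["givenName"] + " " + a["familyName"] for a in record["author"]]

    print(title)    # Semantic Bottlenecks: Quantifying and Improving ...
    print(doi)      # 10.1007/s11263-021-01498-0
    print(authors)  # ['Max Losch', 'Mario Fritz', 'Bernt Schiele']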
     

    Download the RDF metadata as: json-ld, nt, turtle, or xml.

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s11263-021-01498-0'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s11263-021-01498-0'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s11263-021-01498-0'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s11263-021-01498-0'
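
    The same HTTP content negotiation works from any client; as a minimal sketch (assuming the third-party requests package is installed), the JSON-LD can be fetched in Python:

    import requests

    URL = "https://scigraph.springernature.com/pub.10.1007/s11263-021-01498-0"

    # Request JSON-LD via the Accept header, mirroring the curl examples above.
    resp = requests.get(URL, headers={"Accept": "application/ld+json"})
    resp.raise_for_status()
    record = resp.json()[0]
    print(record["name"], "-", record["url"])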


     

    This table displays all metadata directly associated with this object as RDF triples.

    140 TRIPLES      21 PREDICATES      78 URIs      66 LITERALS      6 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/s11263-021-01498-0 schema:about anzsrc-for:08
    2 anzsrc-for:0801
    3 schema:author N376694ad1bd545d7bcc974ef084b42da
    4 schema:citation sg:pub.10.1007/978-3-030-01228-1_26
    5 sg:pub.10.1007/978-3-030-20890-5_44
    6 sg:pub.10.1007/978-3-030-58539-6_11
    7 sg:pub.10.1007/978-3-319-10590-1_53
    8 schema:datePublished 2021-09-14
    9 schema:datePublishedReg 2021-09-14
    10 schema:description Today’s deep learning systems deliver high performance based on end-to-end training but are notoriously hard to inspect. We argue that there are at least two reasons making inspectability challenging: (i) representations are distributed across hundreds of channels and (ii) a unifying metric quantifying inspectability is lacking. In this paper, we address both issues by proposing Semantic Bottlenecks (SB), which can be integrated into pretrained networks, to align channel outputs with individual visual concepts and introduce the model agnostic Area Under inspectability Curve (AUiC) metric to measure the alignment. We present a case study on semantic segmentation to demonstrate that SBs improve the AUiC up to six-fold over regular network outputs. We explore two types of SB-layers in this work. First, concept-supervised SB-layers (SSB), which offer inspectability w.r.t. predefined concepts that the model is demanded to rely on. And second, unsupervised SBs (USB), which offer equally strong AUiC improvements by restricting distributedness of representations across channels. Importantly, for both SB types, we can recover state of the art segmentation performance across two different models despite a drastic dimensionality reduction from 1000s of non aligned channels to 10s of semantics-aligned channels that all downstream results are based on.
    11 schema:genre article
    12 schema:isAccessibleForFree true
    13 schema:isPartOf N4050739cede64c0295c1ff07a4db4f9d
    14 Nde0da39c0bd542b18a44df6ef95b92aa
    15 sg:journal.1032807
    16 schema:keywords AUIC
    17 SB type
    18 Sb layer
    19 alignment
    20 area
    21 art segmentation performance
    22 bottleneck
    23 case study
    24 channel output
    25 channels
    26 concept
    27 curves
    28 deep learning system
    29 deep representation
    30 different models
    31 dimensionality reduction
    32 distributedness
    33 downstream results
    34 drastic dimensionality reduction
    35 end
    36 end training
    37 high performance
    38 hundreds
    39 hundreds of channels
    40 improvement
    41 inspectability
    42 issues
    43 learning system
    44 model
    45 network
    46 network output
    47 output
    48 paper
    49 performance
    50 quantifying
    51 reasons
    52 reduction
    53 representation
    54 results
    55 segmentation
    56 segmentation performance
    57 semantic bottleneck
    58 semantic segmentation
    59 state
    60 study
    61 system
    62 training
    63 types
    64 visual concepts
    65 work
    66 schema:name Semantic Bottlenecks: Quantifying and Improving Inspectability of Deep Representations
    67 schema:pagination 3136-3153
    68 schema:productId N255e034391cd4b30aed4489a9d1f45dd
    69 Nd57b409d8d1e416387e58c4d76e3fe97
    70 schema:sameAs https://app.dimensions.ai/details/publication/pub.1141079444
    71 https://doi.org/10.1007/s11263-021-01498-0
    72 schema:sdDatePublished 2022-12-01T06:42
    73 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    74 schema:sdPublisher N544217948deb46e189a2b3c217a04da7
    75 schema:url https://doi.org/10.1007/s11263-021-01498-0
    76 sgo:license sg:explorer/license/
    77 sgo:sdDataset articles
    78 rdf:type schema:ScholarlyArticle
    79 N255e034391cd4b30aed4489a9d1f45dd schema:name doi
    80 schema:value 10.1007/s11263-021-01498-0
    81 rdf:type schema:PropertyValue
    82 N376694ad1bd545d7bcc974ef084b42da rdf:first sg:person.013504324635.08
    83 rdf:rest Nf824d8b668cf4c339db8a615ca6bd4e1
    84 N4050739cede64c0295c1ff07a4db4f9d schema:issueNumber 11
    85 rdf:type schema:PublicationIssue
    86 N544217948deb46e189a2b3c217a04da7 schema:name Springer Nature - SN SciGraph project
    87 rdf:type schema:Organization
    88 Na10c2b84c6894c3fbf6d214d538d99f8 rdf:first sg:person.01174260421.90
    89 rdf:rest rdf:nil
    90 Nd57b409d8d1e416387e58c4d76e3fe97 schema:name dimensions_id
    91 schema:value pub.1141079444
    92 rdf:type schema:PropertyValue
    93 Nde0da39c0bd542b18a44df6ef95b92aa schema:volumeNumber 129
    94 rdf:type schema:PublicationVolume
    95 Nf824d8b668cf4c339db8a615ca6bd4e1 rdf:first sg:person.013361072755.17
    96 rdf:rest Na10c2b84c6894c3fbf6d214d538d99f8
    97 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    98 schema:name Information and Computing Sciences
    99 rdf:type schema:DefinedTerm
    100 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    101 schema:name Artificial Intelligence and Image Processing
    102 rdf:type schema:DefinedTerm
    103 sg:journal.1032807 schema:issn 0920-5691
    104 1573-1405
    105 schema:name International Journal of Computer Vision
    106 schema:publisher Springer Nature
    107 rdf:type schema:Periodical
    108 sg:person.01174260421.90 schema:affiliation grid-institutes:grid.419528.3
    109 schema:familyName Schiele
    110 schema:givenName Bernt
    111 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01174260421.90
    112 rdf:type schema:Person
    113 sg:person.013361072755.17 schema:affiliation grid-institutes:grid.507511.7
    114 schema:familyName Fritz
    115 schema:givenName Mario
    116 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013361072755.17
    117 rdf:type schema:Person
    118 sg:person.013504324635.08 schema:affiliation grid-institutes:grid.419528.3
    119 schema:familyName Losch
    120 schema:givenName Max
    121 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013504324635.08
    122 rdf:type schema:Person
    123 sg:pub.10.1007/978-3-030-01228-1_26 schema:sameAs https://app.dimensions.ai/details/publication/pub.1107463261
    124 https://doi.org/10.1007/978-3-030-01228-1_26
    125 rdf:type schema:CreativeWork
    126 sg:pub.10.1007/978-3-030-20890-5_44 schema:sameAs https://app.dimensions.ai/details/publication/pub.1116240831
    127 https://doi.org/10.1007/978-3-030-20890-5_44
    128 rdf:type schema:CreativeWork
    129 sg:pub.10.1007/978-3-030-58539-6_11 schema:sameAs https://app.dimensions.ai/details/publication/pub.1132404102
    130 https://doi.org/10.1007/978-3-030-58539-6_11
    131 rdf:type schema:CreativeWork
    132 sg:pub.10.1007/978-3-319-10590-1_53 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032233097
    133 https://doi.org/10.1007/978-3-319-10590-1_53
    134 rdf:type schema:CreativeWork
    135 grid-institutes:grid.419528.3 schema:alternateName Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany
    136 schema:name Max Planck Institute for Informatics, Saarland Informatics Campus, Saarbrücken, Germany
    137 rdf:type schema:Organization
    138 grid-institutes:grid.507511.7 schema:alternateName CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
    139 schema:name CISPA Helmholtz Center for Information Security, Saarbrücken, Germany
    140 rdf:type schema:Organization
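
    To work with these triples programmatically, one convenient option (an assumption; rdflib is a third-party Python package, not part of SciGraph) is to parse the downloadable Turtle or N-Triples serialization:

    import rdflib

    g = rdflib.Graph()
    # Parse a local copy of the Turtle serialization fetched as shown above
    # (file name record.ttl is hypothetical).
    g.parse("record.ttl", format="turtle")

    print(len(g))  # should match the 140 triples listed above

    # Enumerate the distinct predicates, mirroring the "21 PREDICATES" summary.
    for pred in sorted(set(g.predicates())):
        print(pred)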
     





