A Generic Model to Compose Vision Modules for Holistic Scene Understanding


Ontology type: schema:Chapter     


Chapter Info

DATE

2012

AUTHORS

Congcong Li , Adarsh Kowdle , Ashutosh Saxena , Tsuhan Chen

ABSTRACT

The problem of holistic scene understanding involves many vision tasks such as depth estimation, scene categorization, event categorization, etc. Each of these tasks explores some aspects of the scene but, these tasks are related in that, they represent attributes of the same scene. An intuition is that one task can provide meaningful attributes to aid the learning process of another task. In this work, we propose a generic model (together with learning and inference techniques) for connecting different vision tasks in the form of a 2-layer cascade. Our model considers the first layer as a hidden layer, where the latent variables are inferred by feedback from the second layer. The feedback mechanism allows the first layer classifiers to focus on more important image modes, and draws their output towards “attributes” rather than the original “labels”. Our model also automatically discovers sparse connections between the learned attributes on the first layer and the target task on the second layer. Note that in our model, the same vision tasks can act as attribute learners as well as target tasks, while being set up on different layers. In extensive experiments, we show that the same proposed model improves the performance in all the tasks we consider: single image depth estimation, scene categorization, saliency detection and event categorization.

PAGES

70-85

Book

TITLE

Trends and Topics in Computer Vision

ISBN

978-3-642-35748-0
978-3-642-35749-7

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-642-35749-7_6

DOI

http://dx.doi.org/10.1007/978-3-642-35749-7_6

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1020487843



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Cornell University", 
          "id": "https://www.grid.ac/institutes/grid.5386.8", 
          "name": [
            "School of Electrical & Computer Engineering, Cornell University, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Li", 
        "givenName": "Congcong", 
        "id": "sg:person.0773655267.44", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0773655267.44"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Cornell University", 
          "id": "https://www.grid.ac/institutes/grid.5386.8", 
          "name": [
            "School of Electrical & Computer Engineering, Cornell University, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Kowdle", 
        "givenName": "Adarsh", 
        "id": "sg:person.010315021706.95", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010315021706.95"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Cornell University", 
          "id": "https://www.grid.ac/institutes/grid.5386.8", 
          "name": [
            "Department of Computer Science, Cornell University, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Saxena", 
        "givenName": "Ashutosh", 
        "id": "sg:person.012720014203.07", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012720014203.07"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Cornell University", 
          "id": "https://www.grid.ac/institutes/grid.5386.8", 
          "name": [
            "School of Electrical & Computer Engineering, Cornell University, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chen", 
        "givenName": "Tsuhan", 
        "id": "sg:person.012245072625.31", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012245072625.31"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1006/jcss.1997.1504", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004338842"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s11263-007-0071-y", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014656451", 
          "https://doi.org/10.1007/s11263-007-0071-y"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1011139631724", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019562355", 
          "https://doi.org/10.1023/a:1011139631724"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-540-88690-7_4", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1049900285", 
          "https://doi.org/10.1007/978-3-540-88690-7_4"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1037/0033-295x.113.4.766", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1053285721"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/72.883477", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061219512"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tpami.2008.132", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061743494"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/iccv.2009.5459194", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1093497050"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/cvpr.2009.5206596", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1093755850"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/cvpr.2009.5206772", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1094211775"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/cvpr.2009.5206594", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1094500840"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/iccv.2005.9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1095480350"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2012", 
    "datePublishedReg": "2012-01-01", 
    "description": "The problem of holistic scene understanding involves many vision tasks such as depth estimation, scene categorization, event categorization, etc. Each of these tasks explores some aspects of the scene but, these tasks are related in that, they represent attributes of the same scene. An intuition is that one task can provide meaningful attributes to aid the learning process of another task. In this work, we propose a generic model (together with learning and inference techniques) for connecting different vision tasks in the form of a 2-layer cascade. Our model considers the first layer as a hidden layer, where the latent variables are inferred by feedback from the second layer. The feedback mechanism allows the first layer classifiers to focus on more important image modes, and draws their output towards \u201cattributes\u201d rather than the original \u201clabels\u201d. Our model also automatically discovers sparse connections between the learned attributes on the first layer and the target task on the second layer. Note that in our model, the same vision tasks can act as attribute learners as well as target tasks, while being set up on different layers. In extensive experiments, we show that the same proposed model improves the performance in all the tasks we consider: single image depth estimation, scene categorization, saliency detection and event categorization.", 
    "editor": [
      {
        "familyName": "Kutulakos", 
        "givenName": "Kiriakos N.", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-642-35749-7_6", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-642-35748-0", 
        "978-3-642-35749-7"
      ], 
      "name": "Trends and Topics in Computer Vision", 
      "type": "Book"
    }, 
    "name": "A Generic Model to Compose Vision Modules for Holistic Scene Understanding", 
    "pagination": "70-85", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-642-35749-7_6"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "619f15697f195df11f7f9b7f4f7c2f96a41ed9b43cd15946dcb5d21d03dcd87b"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1020487843"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-642-35749-7_6", 
      "https://app.dimensions.ai/details/publication/pub.1020487843"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T19:08", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8684_00000255.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/978-3-642-35749-7_6"
  }
]
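
As a minimal sketch of how a record like the one above could be read, the Python snippet below parses an abridged copy of the JSON-LD and pulls out the title, the author names, and the identifiers listed under "productId". The function name `summarize` and the abridged `RECORD_JSON` string are illustrative assumptions, not part of SciGraph; only field names that appear in the record are used.

```python
import json

# Abridged copy of the record above: only the fields read below are kept.
RECORD_JSON = """
[
  {
    "name": "A Generic Model to Compose Vision Modules for Holistic Scene Understanding",
    "pagination": "70-85",
    "author": [
      {"familyName": "Li", "givenName": "Congcong"},
      {"familyName": "Kowdle", "givenName": "Adarsh"},
      {"familyName": "Saxena", "givenName": "Ashutosh"},
      {"familyName": "Chen", "givenName": "Tsuhan"}
    ],
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/978-3-642-35749-7_6"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1020487843"]}
    ]
  }
]
"""


def summarize(record):
    """Return the title, author names, and identifiers of one record (hypothetical helper)."""
    authors = [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]
    # Each productId entry stores its value as a one-element list.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", []) if p.get("value")}
    return {"title": record.get("name"), "authors": authors, "ids": ids}


# The top-level JSON value is a list holding a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary["title"])
print(", ".join(summary["authors"]))
print(summary["ids"]["doi"])
```

Because JSON-LD is plain JSON, the standard `json` module is enough for this kind of field lookup; a dedicated JSON-LD processor would only be needed to expand terms against the "@context".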
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-35749-7_6'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-35749-7_6'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-35749-7_6'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-35749-7_6'
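
The four curl commands above differ only in the Accept header, so the same content negotiation could be sketched in Python with the standard library. The URL and media types are copied from those commands; the short format labels, `build_request`, and `fetch` are hypothetical names chosen for this example, and whether the endpoint still responds is not something this sketch can guarantee.

```python
import urllib.request

# Record URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-642-35749-7_6"

# Media types taken from the curl commands above; the short keys are
# illustrative labels, not names defined by the service.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request that asks for the given serialization via the Accept header."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call, commented out because it depends on the remote service:
# turtle_text = fetch("turtle")
```

Keeping request construction separate from the network call makes the header logic easy to check offline.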


 

This table displays all metadata directly associated to this object as RDF triples.

126 TRIPLES      23 PREDICATES      39 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-642-35749-7_6 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N0cfcd5130c994d49be0afbbb2a8a849b
4 schema:citation sg:pub.10.1007/978-3-540-88690-7_4
5 sg:pub.10.1007/s11263-007-0071-y
6 sg:pub.10.1023/a:1011139631724
7 https://doi.org/10.1006/jcss.1997.1504
8 https://doi.org/10.1037/0033-295x.113.4.766
9 https://doi.org/10.1109/72.883477
10 https://doi.org/10.1109/cvpr.2009.5206594
11 https://doi.org/10.1109/cvpr.2009.5206596
12 https://doi.org/10.1109/cvpr.2009.5206772
13 https://doi.org/10.1109/iccv.2005.9
14 https://doi.org/10.1109/iccv.2009.5459194
15 https://doi.org/10.1109/tpami.2008.132
16 schema:datePublished 2012
17 schema:datePublishedReg 2012-01-01
18 schema:description The problem of holistic scene understanding involves many vision tasks such as depth estimation, scene categorization, event categorization, etc. Each of these tasks explores some aspects of the scene but, these tasks are related in that, they represent attributes of the same scene. An intuition is that one task can provide meaningful attributes to aid the learning process of another task. In this work, we propose a generic model (together with learning and inference techniques) for connecting different vision tasks in the form of a 2-layer cascade. Our model considers the first layer as a hidden layer, where the latent variables are inferred by feedback from the second layer. The feedback mechanism allows the first layer classifiers to focus on more important image modes, and draws their output towards “attributes” rather than the original “labels”. Our model also automatically discovers sparse connections between the learned attributes on the first layer and the target task on the second layer. Note that in our model, the same vision tasks can act as attribute learners as well as target tasks, while being set up on different layers. In extensive experiments, we show that the same proposed model improves the performance in all the tasks we consider: single image depth estimation, scene categorization, saliency detection and event categorization.
19 schema:editor Nffa734ce1aed4242915df3829e726c1a
20 schema:genre chapter
21 schema:inLanguage en
22 schema:isAccessibleForFree false
23 schema:isPartOf N85493b04778d4c0e930f94f637c12b3b
24 schema:name A Generic Model to Compose Vision Modules for Holistic Scene Understanding
25 schema:pagination 70-85
26 schema:productId N256d6f44a69e41929b18a1f7dbf7e10a
27 N4f5c897296cd409c942aa0dfbe078a09
28 N6a4a7013c88f42aa83769801799545c2
29 schema:publisher N14920519d8794d849820b1e756011a01
30 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020487843
31 https://doi.org/10.1007/978-3-642-35749-7_6
32 schema:sdDatePublished 2019-04-15T19:08
33 schema:sdLicense https://scigraph.springernature.com/explorer/license/
34 schema:sdPublisher N1ccdf376620942398b78fa78b5fa6ffe
35 schema:url http://link.springer.com/10.1007/978-3-642-35749-7_6
36 sgo:license sg:explorer/license/
37 sgo:sdDataset chapters
38 rdf:type schema:Chapter
39 N0cfcd5130c994d49be0afbbb2a8a849b rdf:first sg:person.0773655267.44
40 rdf:rest N9b6bb04935c74bdfa6d9667e67527b18
41 N14920519d8794d849820b1e756011a01 schema:location Berlin, Heidelberg
42 schema:name Springer Berlin Heidelberg
43 rdf:type schema:Organisation
44 N1ccdf376620942398b78fa78b5fa6ffe schema:name Springer Nature - SN SciGraph project
45 rdf:type schema:Organization
46 N256d6f44a69e41929b18a1f7dbf7e10a schema:name dimensions_id
47 schema:value pub.1020487843
48 rdf:type schema:PropertyValue
49 N4f5c897296cd409c942aa0dfbe078a09 schema:name readcube_id
50 schema:value 619f15697f195df11f7f9b7f4f7c2f96a41ed9b43cd15946dcb5d21d03dcd87b
51 rdf:type schema:PropertyValue
52 N6a4a7013c88f42aa83769801799545c2 schema:name doi
53 schema:value 10.1007/978-3-642-35749-7_6
54 rdf:type schema:PropertyValue
55 N85493b04778d4c0e930f94f637c12b3b schema:isbn 978-3-642-35748-0
56 978-3-642-35749-7
57 schema:name Trends and Topics in Computer Vision
58 rdf:type schema:Book
59 N9b6bb04935c74bdfa6d9667e67527b18 rdf:first sg:person.010315021706.95
60 rdf:rest Ne63410896296436f908c40af266f8bba
61 Na5eb1e07498549b0a14bf6fad5e79d04 schema:familyName Kutulakos
62 schema:givenName Kiriakos N.
63 rdf:type schema:Person
64 Ne63410896296436f908c40af266f8bba rdf:first sg:person.012720014203.07
65 rdf:rest Ned8c6d3e40494e448097b61df530eafc
66 Ned8c6d3e40494e448097b61df530eafc rdf:first sg:person.012245072625.31
67 rdf:rest rdf:nil
68 Nffa734ce1aed4242915df3829e726c1a rdf:first Na5eb1e07498549b0a14bf6fad5e79d04
69 rdf:rest rdf:nil
70 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
71 schema:name Information and Computing Sciences
72 rdf:type schema:DefinedTerm
73 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
74 schema:name Artificial Intelligence and Image Processing
75 rdf:type schema:DefinedTerm
76 sg:person.010315021706.95 schema:affiliation https://www.grid.ac/institutes/grid.5386.8
77 schema:familyName Kowdle
78 schema:givenName Adarsh
79 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010315021706.95
80 rdf:type schema:Person
81 sg:person.012245072625.31 schema:affiliation https://www.grid.ac/institutes/grid.5386.8
82 schema:familyName Chen
83 schema:givenName Tsuhan
84 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012245072625.31
85 rdf:type schema:Person
86 sg:person.012720014203.07 schema:affiliation https://www.grid.ac/institutes/grid.5386.8
87 schema:familyName Saxena
88 schema:givenName Ashutosh
89 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012720014203.07
90 rdf:type schema:Person
91 sg:person.0773655267.44 schema:affiliation https://www.grid.ac/institutes/grid.5386.8
92 schema:familyName Li
93 schema:givenName Congcong
94 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0773655267.44
95 rdf:type schema:Person
96 sg:pub.10.1007/978-3-540-88690-7_4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049900285
97 https://doi.org/10.1007/978-3-540-88690-7_4
98 rdf:type schema:CreativeWork
99 sg:pub.10.1007/s11263-007-0071-y schema:sameAs https://app.dimensions.ai/details/publication/pub.1014656451
100 https://doi.org/10.1007/s11263-007-0071-y
101 rdf:type schema:CreativeWork
102 sg:pub.10.1023/a:1011139631724 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019562355
103 https://doi.org/10.1023/a:1011139631724
104 rdf:type schema:CreativeWork
105 https://doi.org/10.1006/jcss.1997.1504 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004338842
106 rdf:type schema:CreativeWork
107 https://doi.org/10.1037/0033-295x.113.4.766 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053285721
108 rdf:type schema:CreativeWork
109 https://doi.org/10.1109/72.883477 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061219512
110 rdf:type schema:CreativeWork
111 https://doi.org/10.1109/cvpr.2009.5206594 schema:sameAs https://app.dimensions.ai/details/publication/pub.1094500840
112 rdf:type schema:CreativeWork
113 https://doi.org/10.1109/cvpr.2009.5206596 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093755850
114 rdf:type schema:CreativeWork
115 https://doi.org/10.1109/cvpr.2009.5206772 schema:sameAs https://app.dimensions.ai/details/publication/pub.1094211775
116 rdf:type schema:CreativeWork
117 https://doi.org/10.1109/iccv.2005.9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1095480350
118 rdf:type schema:CreativeWork
119 https://doi.org/10.1109/iccv.2009.5459194 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093497050
120 rdf:type schema:CreativeWork
121 https://doi.org/10.1109/tpami.2008.132 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061743494
122 rdf:type schema:CreativeWork
123 https://www.grid.ac/institutes/grid.5386.8 schema:alternateName Cornell University
124 schema:name Department of Computer Science, Cornell University, USA
125 School of Electrical & Computer Engineering, Cornell University, USA
126 rdf:type schema:Organization
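
In the listing above, the author order is stored as an RDF list: each blank node carries an rdf:first value (one author) and an rdf:rest pointer to the next blank node, ending at rdf:nil. The following sketch, which assumes the cells are held in a plain Python dictionary rather than an RDF library, walks that chain starting from the schema:author blank node in row 3. The names `CELLS` and `walk_list` are hypothetical.

```python
RDF_NIL = "rdf:nil"

# List cells copied from rows 39-40, 59-60, and 64-67 of the table above.
# Each blank node maps to (rdf:first, rdf:rest).
CELLS = {
    "N0cfcd5130c994d49be0afbbb2a8a849b": (
        "sg:person.0773655267.44", "N9b6bb04935c74bdfa6d9667e67527b18"),
    "N9b6bb04935c74bdfa6d9667e67527b18": (
        "sg:person.010315021706.95", "Ne63410896296436f908c40af266f8bba"),
    "Ne63410896296436f908c40af266f8bba": (
        "sg:person.012720014203.07", "Ned8c6d3e40494e448097b61df530eafc"),
    "Ned8c6d3e40494e448097b61df530eafc": (
        "sg:person.012245072625.31", RDF_NIL),
}


def walk_list(head, cells):
    """Follow rdf:rest pointers from head and collect the rdf:first values in order."""
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:
            # Guard against malformed data that loops back on itself.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


# Row 3 of the table gives the head of the author list.
authors = walk_list("N0cfcd5130c994d49be0afbbb2a8a849b", CELLS)
for person in authors:
    print(person)
```

The resulting order (Li, Kowdle, Saxena, Chen) matches the author array in the JSON-LD record.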
 



