Efficiently Scaling Up Video Annotation with Crowdsourced Marketplaces


Ontology type: schema:Chapter     


Chapter Info

DATE

2010

AUTHORS

Carl Vondrick , Deva Ramanan , Donald Patterson

ABSTRACT

Accurately annotating entities in video is labor intensive and expensive. As the quantity of online video grows, traditional solutions to this task are unable to scale to meet the needs of researchers with limited budgets. Current practice provides a temporary solution by paying dedicated workers to label a fraction of the total frames and otherwise settling for linear interpolation. As budgets and scale require sparser key frames, the assumption of linearity fails and labels become inaccurate. To address this problem we have created a public framework for dividing the work of labeling video data into micro-tasks that can be completed by huge labor pools available through crowdsourced marketplaces. By extracting pixel-based features from manually labeled entities, we are able to leverage more sophisticated interpolation between key frames to maximize performance given a budget. Finally, by validating the power of our framework on difficult, real-world data sets we demonstrate an inherent trade-off between the mix of human and cloud computing used vs. the accuracy and cost of the labeling.

PAGES

610-623

Book

TITLE

Computer Vision – ECCV 2010

ISBN

978-3-642-15560-4
978-3-642-15561-1

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-642-15561-1_44

DOI

http://dx.doi.org/10.1007/978-3-642-15561-1_44

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1030274604



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department of Computer Science, University of California, Irvine, USA", 
          "id": "http://www.grid.ac/institutes/grid.266093.8", 
          "name": [
            "Department of Computer Science, University of California, Irvine, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Vondrick", 
        "givenName": "Carl", 
        "id": "sg:person.011554751423.38", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011554751423.38"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Computer Science, University of California, Irvine, USA", 
          "id": "http://www.grid.ac/institutes/grid.266093.8", 
          "name": [
            "Department of Computer Science, University of California, Irvine, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ramanan", 
        "givenName": "Deva", 
        "id": "sg:person.010041565404.26", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010041565404.26"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Computer Science, University of California, Irvine, USA", 
          "id": "http://www.grid.ac/institutes/grid.266093.8", 
          "name": [
            "Department of Computer Science, University of California, Irvine, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Patterson", 
        "givenName": "Donald", 
        "id": "sg:person.016065540770.63", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016065540770.63"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2010", 
    "datePublishedReg": "2010-01-01", 
    "description": "Accurately annotating entities in video is labor intensive and expensive. As the quantity of online video grows, traditional solutions to this task are unable to scale to meet the needs of researchers with limited budgets. Current practice provides a temporary solution by paying dedicated workers to label a fraction of the total frames and otherwise settling for linear interpolation. As budgets and scale require sparser key frames, the assumption of linearity fails and labels become inaccurate. To address this problem we have created a public framework for dividing the work of labeling video data into micro-tasks that can be completed by huge labor pools available through crowdsourced marketplaces. By extracting pixel-based features from manually labeled entities, we are able to leverage more sophisticated interpolation between key frames to maximize performance given a budget. Finally, by validating the power of our framework on difficult, real-world data sets we demonstrate an inherent trade-off between the mix of human and cloud computing used vs. the accuracy and cost of the labeling.", 
    "editor": [
      {
        "familyName": "Daniilidis", 
        "givenName": "Kostas", 
        "type": "Person"
      }, 
      {
        "familyName": "Maragos", 
        "givenName": "Petros", 
        "type": "Person"
      }, 
      {
        "familyName": "Paragios", 
        "givenName": "Nikos", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-642-15561-1_44", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-642-15560-4", 
        "978-3-642-15561-1"
      ], 
      "name": "Computer Vision \u2013 ECCV 2010", 
      "type": "Book"
    }, 
    "keywords": [
      "key frames", 
      "real-world data sets", 
      "pixel-based features", 
      "cloud computing", 
      "video annotation", 
      "video data", 
      "crowdsourced marketplaces", 
      "online videos", 
      "traditional solutions", 
      "needs of researchers", 
      "video", 
      "sophisticated interpolation", 
      "data sets", 
      "limited budget", 
      "total frame", 
      "linear interpolation", 
      "public framework", 
      "dedicated workers", 
      "computing", 
      "framework", 
      "frame", 
      "interpolation", 
      "annotation", 
      "marketplace", 
      "task", 
      "entities", 
      "solution", 
      "accuracy", 
      "set", 
      "labels", 
      "performance", 
      "cost", 
      "budget", 
      "features", 
      "researchers", 
      "current practice", 
      "work", 
      "temporary solution", 
      "need", 
      "assumption of linearity", 
      "data", 
      "power", 
      "labor pool", 
      "assumption", 
      "practice", 
      "mix", 
      "labeling", 
      "quantity", 
      "pool", 
      "labor", 
      "workers", 
      "linearity", 
      "fraction", 
      "problem", 
      "sparser key frames", 
      "huge labor pools"
    ], 
    "name": "Efficiently Scaling Up Video Annotation with Crowdsourced Marketplaces", 
    "pagination": "610-623", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1030274604"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-642-15561-1_44"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-642-15561-1_44", 
      "https://app.dimensions.ai/details/publication/pub.1030274604"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2021-12-01T20:03", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20211201/entities/gbq_results/chapter/chapter_281.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-642-15561-1_44"
  }
]
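
As a rough illustration of how the JSON-LD record above might be consumed, the sketch below parses a trimmed copy of it with Python's standard `json` module and pulls out the author names and the DOI. The helper names `author_names` and `product_id` are hypothetical and are not part of any SciGraph tooling; the embedded record keeps only the fields the helpers read.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Vondrick", "givenName": "Carl"},
      {"familyName": "Ramanan", "givenName": "Deva"},
      {"familyName": "Patterson", "givenName": "Donald"}
    ],
    "name": "Efficiently Scaling Up Video Annotation with Crowdsourced Marketplaces",
    "pagination": "610-623",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1030274604"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-642-15561-1_44"]}
    ]
  }
]
"""


def author_names(record):
    """Return 'Given Family' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def product_id(record, name):
    """Return the first value of the productId entry called `name`, or None."""
    for entry in record.get("productId", []):
        if entry.get("name") == name:
            values = entry.get("value", [])
            return values[0] if values else None
    return None


# The top level of the document is a one-element JSON array.
record = json.loads(RECORD_JSON)[0]
print(author_names(record))
print(product_id(record, "doi"))
```

Because JSON-LD is plain JSON, no RDF library is needed for simple lookups like these; a dedicated JSON-LD processor would only be required to expand the `@context` into full IRIs.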
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-15561-1_44'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-15561-1_44'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-15561-1_44'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-15561-1_44'
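
The four curl commands above differ only in their `Accept` header, which is ordinary HTTP content negotiation. A minimal Python sketch of the same idea follows, using only the standard library. The names `MEDIA_TYPES`, `build_request`, and `fetch` are hypothetical, and the short format labels (`json-ld`, `nt`, `turtle`, `xml`) are chosen for this example. Whether the endpoint still responds is not verified here, so `fetch` is defined but never called.

```python
from urllib.request import Request, urlopen

BASE_URL = "https://scigraph.springernature.com/"

# Format label (an assumption of this sketch) -> media type from the curl examples.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt):
    """Build a GET request asking for `record_id` in the given RDF format."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return Request(BASE_URL + record_id, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(record_id, fmt, timeout=10.0):
    """Send the request and return the body as text (performs network I/O)."""
    with urlopen(build_request(record_id, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_request("pub.10.1007/978-3-642-15561-1_44", "turtle")
print(request.full_url, request.get_header("Accept"))
```

Separating `build_request` from `fetch` keeps the negotiation logic checkable without touching the network.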


 

This table displays all metadata directly associated with this object as RDF triples.

140 TRIPLES, 23 PREDICATES, 82 URIs, 75 LITERALS, 7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-642-15561-1_44 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N93cd3eff719b4f558c320e032dc1fc37
4 schema:datePublished 2010
5 schema:datePublishedReg 2010-01-01
6 schema:description Accurately annotating entities in video is labor intensive and expensive. As the quantity of online video grows, traditional solutions to this task are unable to scale to meet the needs of researchers with limited budgets. Current practice provides a temporary solution by paying dedicated workers to label a fraction of the total frames and otherwise settling for linear interpolation. As budgets and scale require sparser key frames, the assumption of linearity fails and labels become inaccurate. To address this problem we have created a public framework for dividing the work of labeling video data into micro-tasks that can be completed by huge labor pools available through crowdsourced marketplaces. By extracting pixel-based features from manually labeled entities, we are able to leverage more sophisticated interpolation between key frames to maximize performance given a budget. Finally, by validating the power of our framework on difficult, real-world data sets we demonstrate an inherent trade-off between the mix of human and cloud computing used vs. the accuracy and cost of the labeling.
7 schema:editor Na08adf19af8e493caff00071d2f40536
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree false
11 schema:isPartOf N1770d9cfe1094a8f820539f0a175bbd1
12 schema:keywords accuracy
13 annotation
14 assumption
15 assumption of linearity
16 budget
17 cloud computing
18 computing
19 cost
20 crowdsourced marketplaces
21 current practice
22 data
23 data sets
24 dedicated workers
25 entities
26 features
27 fraction
28 frame
29 framework
30 huge labor pools
31 interpolation
32 key frames
33 labeling
34 labels
35 labor
36 labor pool
37 limited budget
38 linear interpolation
39 linearity
40 marketplace
41 mix
42 need
43 needs of researchers
44 online videos
45 performance
46 pixel-based features
47 pool
48 power
49 practice
50 problem
51 public framework
52 quantity
53 real-world data sets
54 researchers
55 set
56 solution
57 sophisticated interpolation
58 sparser key frames
59 task
60 temporary solution
61 total frame
62 traditional solutions
63 video
64 video annotation
65 video data
66 work
67 workers
68 schema:name Efficiently Scaling Up Video Annotation with Crowdsourced Marketplaces
69 schema:pagination 610-623
70 schema:productId N646e5f948f184a48a4a4d0706e0afa3b
71 N6d846ed1073c45c390cda4ce6c28389c
72 schema:publisher N169626f9292445f29d15a4d30bb44abd
73 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030274604
74 https://doi.org/10.1007/978-3-642-15561-1_44
75 schema:sdDatePublished 2021-12-01T20:03
76 schema:sdLicense https://scigraph.springernature.com/explorer/license/
77 schema:sdPublisher N12bc4bcc8b694d8db72137c4a798056e
78 schema:url https://doi.org/10.1007/978-3-642-15561-1_44
79 sgo:license sg:explorer/license/
80 sgo:sdDataset chapters
81 rdf:type schema:Chapter
82 N12bc4bcc8b694d8db72137c4a798056e schema:name Springer Nature - SN SciGraph project
83 rdf:type schema:Organization
84 N169626f9292445f29d15a4d30bb44abd schema:name Springer Nature
85 rdf:type schema:Organisation
86 N1770d9cfe1094a8f820539f0a175bbd1 schema:isbn 978-3-642-15560-4
87 978-3-642-15561-1
88 schema:name Computer Vision – ECCV 2010
89 rdf:type schema:Book
90 N1c5ad3a85b6443059e7dfddfa5191212 schema:familyName Daniilidis
91 schema:givenName Kostas
92 rdf:type schema:Person
93 N288bd35d58634b93b6dca69c88a09bd2 schema:familyName Paragios
94 schema:givenName Nikos
95 rdf:type schema:Person
96 N4407f35c97474fe286bde4ffc00f6a8c rdf:first N47b1584764c24083b9e9b19127eab20e
97 rdf:rest Nedea152a3e284de2a0815f4299fef88b
98 N47b1584764c24083b9e9b19127eab20e schema:familyName Maragos
99 schema:givenName Petros
100 rdf:type schema:Person
101 N646e5f948f184a48a4a4d0706e0afa3b schema:name doi
102 schema:value 10.1007/978-3-642-15561-1_44
103 rdf:type schema:PropertyValue
104 N6d846ed1073c45c390cda4ce6c28389c schema:name dimensions_id
105 schema:value pub.1030274604
106 rdf:type schema:PropertyValue
107 N6dbc2d325344497da7fa207ac2a769ac rdf:first sg:person.016065540770.63
108 rdf:rest rdf:nil
109 N8d9beca931424fdb982e6056227f1389 rdf:first sg:person.010041565404.26
110 rdf:rest N6dbc2d325344497da7fa207ac2a769ac
111 N93cd3eff719b4f558c320e032dc1fc37 rdf:first sg:person.011554751423.38
112 rdf:rest N8d9beca931424fdb982e6056227f1389
113 Na08adf19af8e493caff00071d2f40536 rdf:first N1c5ad3a85b6443059e7dfddfa5191212
114 rdf:rest N4407f35c97474fe286bde4ffc00f6a8c
115 Nedea152a3e284de2a0815f4299fef88b rdf:first N288bd35d58634b93b6dca69c88a09bd2
116 rdf:rest rdf:nil
117 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
118 schema:name Information and Computing Sciences
119 rdf:type schema:DefinedTerm
120 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
121 schema:name Artificial Intelligence and Image Processing
122 rdf:type schema:DefinedTerm
123 sg:person.010041565404.26 schema:affiliation grid-institutes:grid.266093.8
124 schema:familyName Ramanan
125 schema:givenName Deva
126 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010041565404.26
127 rdf:type schema:Person
128 sg:person.011554751423.38 schema:affiliation grid-institutes:grid.266093.8
129 schema:familyName Vondrick
130 schema:givenName Carl
131 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011554751423.38
132 rdf:type schema:Person
133 sg:person.016065540770.63 schema:affiliation grid-institutes:grid.266093.8
134 schema:familyName Patterson
135 schema:givenName Donald
136 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016065540770.63
137 rdf:type schema:Person
138 grid-institutes:grid.266093.8 schema:alternateName Department of Computer Science, University of California, Irvine, USA
139 schema:name Department of Computer Science, University of California, Irvine, USA
140 rdf:type schema:Organization
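
In the table above, the author order is not stored directly: `schema:author` points at a blank node (row 3), and the ordered list is encoded as an RDF collection through `rdf:first` / `rdf:rest` links that end in `rdf:nil` (rows 107 to 112). The sketch below walks such a chain to recover the order. It is an illustration only: the function name `read_rdf_list` is hypothetical, and the triples are copied by hand from the table rather than parsed from a real RDF serialization.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate, object) rows copied from rows 107-112 of the table above.
TRIPLES = [
    ("N6dbc2d325344497da7fa207ac2a769ac", "rdf:first", "sg:person.016065540770.63"),
    ("N6dbc2d325344497da7fa207ac2a769ac", "rdf:rest", "rdf:nil"),
    ("N8d9beca931424fdb982e6056227f1389", "rdf:first", "sg:person.010041565404.26"),
    ("N8d9beca931424fdb982e6056227f1389", "rdf:rest", "N6dbc2d325344497da7fa207ac2a769ac"),
    ("N93cd3eff719b4f558c320e032dc1fc37", "rdf:first", "sg:person.011554751423.38"),
    ("N93cd3eff719b4f558c320e032dc1fc37", "rdf:rest", "N8d9beca931424fdb982e6056227f1389"),
]

# Object of the schema:author triple in row 3: the head of the list.
AUTHOR_LIST_HEAD = "N93cd3eff719b4f558c320e032dc1fc37"


def read_rdf_list(triples, head):
    """Follow rdf:first / rdf:rest from `head` and return the items in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(first[node])  # the element stored at this list cell
        node = rest[node]          # move on to the next cell
    return items


authors = read_rdf_list(TRIPLES, AUTHOR_LIST_HEAD)
print(authors)
```

The same walk applies to the editor list, whose head is the object of `schema:editor` in row 7.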
 



