Active Learning in Multi-armed Bandits View Full Text


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2008

AUTHORS

András Antos , Varun Grover , Csaba Szepesvári

ABSTRACT

In this paper we consider the problem of actively learning the mean values of distributions associated with a finite number of options (arms). The algorithms can select which option to generate the next sample from in order to produce estimates with equally good precision for all the distributions. When an algorithm uses sample means to estimate the unknown values then the optimal solution, assuming full knowledge of the distributions, is to sample each option proportional to its variance. In this paper we propose an incremental algorithm that asymptotically achieves the same loss as an optimal rule. We prove that the excess loss suffered by this algorithm, apart from logarithmic factors, scales as n− 3/2, which we conjecture to be the optimal rate. The performance of the algorithm is illustrated in a simple problem. More... »

PAGES

287-302

References to SciGraph publications

  • 2010-09. Adaptive Optimal Allocation in Stratified Sampling Methods in METHODOLOGY AND COMPUTING IN APPLIED PROBABILITY
  • 1996. A Probabilistic Theory of Pattern Recognition in NONE
  • 2002-05. Finite-time Analysis of the Multiarmed Bandit Problem in MACHINE LEARNING
  • Book

    TITLE

    Algorithmic Learning Theory

    ISBN

    978-3-540-87986-2
    978-3-540-87987-9

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-3-540-87987-9_25

    DOI

    http://dx.doi.org/10.1007/978-3-540-87987-9_25

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1005167171


    Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
    Incoming Citations Browse incoming citations for this publication using opencitations.net

    JSON-LD is the canonical representation for SciGraph data.

    TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Statistics", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Mathematical Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "MTA Institute for Computer Science and Control", 
              "id": "https://www.grid.ac/institutes/grid.4836.9", 
              "name": [
                "Computer and Automation Research Institute, of the Hungarian Academy of Sciences, Kende u. 13-17, 1111, Budapest, Hungary"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Antos", 
            "givenName": "Andr\u00e1s", 
            "id": "sg:person.012122503615.61", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012122503615.61"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of Alberta", 
              "id": "https://www.grid.ac/institutes/grid.17089.37", 
              "name": [
                "Department of Computing Science, University of Alberta, T6G 2E8, Edmonton, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Grover", 
            "givenName": "Varun", 
            "id": "sg:person.015404616621.24", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015404616621.24"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of Alberta", 
              "id": "https://www.grid.ac/institutes/grid.17089.37", 
              "name": [
                "Computer and Automation Research Institute, of the Hungarian Academy of Sciences, Kende u. 13-17, 1111, Budapest, Hungary", 
                "Department of Computing Science, University of Alberta, T6G 2E8, Edmonton, Canada"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Szepesv\u00e1ri", 
            "givenName": "Csaba", 
            "id": "sg:person.016202177221.23", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016202177221.23"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/s11009-008-9108-0", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008316375", 
              "https://doi.org/10.1007/s11009-008-9108-0"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0196-8858(85)90002-8", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1033673111"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1023/a:1013689704352", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1039349898", 
              "https://doi.org/10.1023/a:1013689704352"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1045010725", 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-1-4612-0711-5", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045010725", 
              "https://doi.org/10.1007/978-1-4612-0711-5"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-1-4612-0711-5", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045010725", 
              "https://doi.org/10.1007/978-1-4612-0711-5"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2008", 
        "datePublishedReg": "2008-01-01", 
        "description": "In this paper we consider the problem of actively learning the mean values of distributions associated with a finite number of options (arms). The algorithms can select which option to generate the next sample from in order to produce estimates with equally good precision for all the distributions. When an algorithm uses sample means to estimate the unknown values then the optimal solution, assuming full knowledge of the distributions, is to sample each option proportional to its variance. In this paper we propose an incremental algorithm that asymptotically achieves the same loss as an optimal rule. We prove that the excess loss suffered by this algorithm, apart from logarithmic factors, scales as n\u2212 3/2, which we conjecture to be the optimal rate. The performance of the algorithm is illustrated in a simple problem.", 
        "editor": [
          {
            "familyName": "Freund", 
            "givenName": "Yoav", 
            "type": "Person"
          }, 
          {
            "familyName": "Gy\u00f6rfi", 
            "givenName": "L\u00e1szl\u00f3", 
            "type": "Person"
          }, 
          {
            "familyName": "Tur\u00e1n", 
            "givenName": "Gy\u00f6rgy", 
            "type": "Person"
          }, 
          {
            "familyName": "Zeugmann", 
            "givenName": "Thomas", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-3-540-87987-9_25", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isPartOf": {
          "isbn": [
            "978-3-540-87986-2", 
            "978-3-540-87987-9"
          ], 
          "name": "Algorithmic Learning Theory", 
          "type": "Book"
        }, 
        "name": "Active Learning in Multi-armed Bandits", 
        "pagination": "287-302", 
        "productId": [
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-3-540-87987-9_25"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "189e24af16360cf2f6be70ed0a0036920234664a50e31dcb372229c3f8eb535a"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1005167171"
            ]
          }
        ], 
        "publisher": {
          "location": "Berlin, Heidelberg", 
          "name": "Springer Berlin Heidelberg", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-3-540-87987-9_25", 
          "https://app.dimensions.ai/details/publication/pub.1005167171"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-16T06:11", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000350_0000000350/records_77586_00000000.jsonl", 
        "type": "Chapter", 
        "url": "https://link.springer.com/10.1007%2F978-3-540-87987-9_25"
      }
    ]
     

    Download the RDF metadata as:  json-ld nt turtle xml License info

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-87987-9_25'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-87987-9_25'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-87987-9_25'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-87987-9_25'


     

    This table displays all metadata directly associated to this object as RDF triples.

    115 TRIPLES      23 PREDICATES      32 URIs      20 LITERALS      8 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/978-3-540-87987-9_25 schema:about anzsrc-for:01
    2 anzsrc-for:0104
    3 schema:author N40a599ba4b4540888a5e2f081dad2ea5
    4 schema:citation sg:pub.10.1007/978-1-4612-0711-5
    5 sg:pub.10.1007/s11009-008-9108-0
    6 sg:pub.10.1023/a:1013689704352
    7 https://app.dimensions.ai/details/publication/pub.1045010725
    8 https://doi.org/10.1016/0196-8858(85)90002-8
    9 schema:datePublished 2008
    10 schema:datePublishedReg 2008-01-01
    11 schema:description In this paper we consider the problem of actively learning the mean values of distributions associated with a finite number of options (arms). The algorithms can select which option to generate the next sample from in order to produce estimates with equally good precision for all the distributions. When an algorithm uses sample means to estimate the unknown values then the optimal solution, assuming full knowledge of the distributions, is to sample each option proportional to its variance. In this paper we propose an incremental algorithm that asymptotically achieves the same loss as an optimal rule. We prove that the excess loss suffered by this algorithm, apart from logarithmic factors, scales as n− 3/2, which we conjecture to be the optimal rate. The performance of the algorithm is illustrated in a simple problem.
    12 schema:editor N056a5075d5cd450d8d5dfc62e7ad723c
    13 schema:genre chapter
    14 schema:inLanguage en
    15 schema:isAccessibleForFree true
    16 schema:isPartOf Nd3cf2449bb0a4ed2acdccee149a68227
    17 schema:name Active Learning in Multi-armed Bandits
    18 schema:pagination 287-302
    19 schema:productId N9caca7ec0fd241ff9392fa9f3765493b
    20 Nbd1755667990439ea6371e121dda24b6
    21 Ne0755da168884de1a88e2c6e83f309f0
    22 schema:publisher N5d90d99ce2874e0ea7d6086ecc91b6f7
    23 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005167171
    24 https://doi.org/10.1007/978-3-540-87987-9_25
    25 schema:sdDatePublished 2019-04-16T06:11
    26 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    27 schema:sdPublisher N6a19df4c8b86425998d83012d3fca44f
    28 schema:url https://link.springer.com/10.1007%2F978-3-540-87987-9_25
    29 sgo:license sg:explorer/license/
    30 sgo:sdDataset chapters
    31 rdf:type schema:Chapter
    32 N001a16cfdbab4d139f8b9f7d2c5bace6 rdf:first sg:person.016202177221.23
    33 rdf:rest rdf:nil
    34 N045fac31dd4a44b7b706f29e13a981f3 schema:familyName Turán
    35 schema:givenName György
    36 rdf:type schema:Person
    37 N056a5075d5cd450d8d5dfc62e7ad723c rdf:first N5f6416e3ca3144d687a2735b0375b87a
    38 rdf:rest N0a2f26a7a9234dc8b72a1284222c9f64
    39 N0a2f26a7a9234dc8b72a1284222c9f64 rdf:first N2241ee18ae5e4554a7b7dbb3ea43b160
    40 rdf:rest N48a96086c804491c8af6c3c733e743d0
    41 N20b8708d3b7546f182aedd91fa3148c9 schema:familyName Zeugmann
    42 schema:givenName Thomas
    43 rdf:type schema:Person
    44 N2241ee18ae5e4554a7b7dbb3ea43b160 schema:familyName Györfi
    45 schema:givenName László
    46 rdf:type schema:Person
    47 N40a599ba4b4540888a5e2f081dad2ea5 rdf:first sg:person.012122503615.61
    48 rdf:rest Na3120cbba6f2472e947614d494e29c9d
    49 N48a96086c804491c8af6c3c733e743d0 rdf:first N045fac31dd4a44b7b706f29e13a981f3
    50 rdf:rest Nd27aa18c902741378d1458fc489f94a4
    51 N5d90d99ce2874e0ea7d6086ecc91b6f7 schema:location Berlin, Heidelberg
    52 schema:name Springer Berlin Heidelberg
    53 rdf:type schema:Organisation
    54 N5f6416e3ca3144d687a2735b0375b87a schema:familyName Freund
    55 schema:givenName Yoav
    56 rdf:type schema:Person
    57 N6a19df4c8b86425998d83012d3fca44f schema:name Springer Nature - SN SciGraph project
    58 rdf:type schema:Organization
    59 N9caca7ec0fd241ff9392fa9f3765493b schema:name readcube_id
    60 schema:value 189e24af16360cf2f6be70ed0a0036920234664a50e31dcb372229c3f8eb535a
    61 rdf:type schema:PropertyValue
    62 Na3120cbba6f2472e947614d494e29c9d rdf:first sg:person.015404616621.24
    63 rdf:rest N001a16cfdbab4d139f8b9f7d2c5bace6
    64 Nbd1755667990439ea6371e121dda24b6 schema:name dimensions_id
    65 schema:value pub.1005167171
    66 rdf:type schema:PropertyValue
    67 Nd27aa18c902741378d1458fc489f94a4 rdf:first N20b8708d3b7546f182aedd91fa3148c9
    68 rdf:rest rdf:nil
    69 Nd3cf2449bb0a4ed2acdccee149a68227 schema:isbn 978-3-540-87986-2
    70 978-3-540-87987-9
    71 schema:name Algorithmic Learning Theory
    72 rdf:type schema:Book
    73 Ne0755da168884de1a88e2c6e83f309f0 schema:name doi
    74 schema:value 10.1007/978-3-540-87987-9_25
    75 rdf:type schema:PropertyValue
    76 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
    77 schema:name Mathematical Sciences
    78 rdf:type schema:DefinedTerm
    79 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
    80 schema:name Statistics
    81 rdf:type schema:DefinedTerm
    82 sg:person.012122503615.61 schema:affiliation https://www.grid.ac/institutes/grid.4836.9
    83 schema:familyName Antos
    84 schema:givenName András
    85 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012122503615.61
    86 rdf:type schema:Person
    87 sg:person.015404616621.24 schema:affiliation https://www.grid.ac/institutes/grid.17089.37
    88 schema:familyName Grover
    89 schema:givenName Varun
    90 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015404616621.24
    91 rdf:type schema:Person
    92 sg:person.016202177221.23 schema:affiliation https://www.grid.ac/institutes/grid.17089.37
    93 schema:familyName Szepesvári
    94 schema:givenName Csaba
    95 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016202177221.23
    96 rdf:type schema:Person
    97 sg:pub.10.1007/978-1-4612-0711-5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045010725
    98 https://doi.org/10.1007/978-1-4612-0711-5
    99 rdf:type schema:CreativeWork
    100 sg:pub.10.1007/s11009-008-9108-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008316375
    101 https://doi.org/10.1007/s11009-008-9108-0
    102 rdf:type schema:CreativeWork
    103 sg:pub.10.1023/a:1013689704352 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039349898
    104 https://doi.org/10.1023/a:1013689704352
    105 rdf:type schema:CreativeWork
    106 https://app.dimensions.ai/details/publication/pub.1045010725 schema:CreativeWork
    107 https://doi.org/10.1016/0196-8858(85)90002-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033673111
    108 rdf:type schema:CreativeWork
    109 https://www.grid.ac/institutes/grid.17089.37 schema:alternateName University of Alberta
    110 schema:name Computer and Automation Research Institute, of the Hungarian Academy of Sciences, Kende u. 13-17, 1111, Budapest, Hungary
    111 Department of Computing Science, University of Alberta, T6G 2E8, Edmonton, Canada
    112 rdf:type schema:Organization
    113 https://www.grid.ac/institutes/grid.4836.9 schema:alternateName MTA Institute for Computer Science and Control
    114 schema:name Computer and Automation Research Institute, of the Hungarian Academy of Sciences, Kende u. 13-17, 1111, Budapest, Hungary
    115 rdf:type schema:Organization
     




    Preview window. Press ESC to close (or click here)


    ...