Ontology type: schema:Chapter
Open Access: True
2011
AUTHORS: Antoine Salomon, Jean-Yves Audibert
ABSTRACT: This paper studies the deviations of the regret in a stochastic multi-armed bandit problem. When the total number of plays n is known beforehand by the agent, Audibert et al. (2009) exhibit a policy such that with probability at least 1-1/n, the regret of the policy is of order log n. They have also shown that such a property is not shared by the popular ucb1 policy of Auer et al. (2002). This work first answers an open question: it extends this negative result to any anytime policy. The second contribution of this paper is to design anytime robust policies for specific multi-armed bandit problems in which some restrictions are put on the set of possible distributions of the different arms.
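For readers unfamiliar with the ucb1 policy named in the abstract, the following is a minimal sketch of it, not code from the chapter. It assumes Bernoulli arms; the arm means, horizon, and seed are illustrative values chosen here. The policy is anytime in the sense that its index never uses the total number of plays n, only the current round t.

```python
import math
import random


def ucb1(arm_means, n_rounds, seed=0):
    """Run UCB1 on Bernoulli arms; return (pull counts, pseudo-regret).

    arm_means, n_rounds and seed are illustrative parameters (assumptions).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k    # number of times each arm has been played
    sums = [0.0] * k    # cumulative reward collected from each arm

    for t in range(1, n_rounds + 1):
        if t <= k:
            arm = t - 1  # initialisation: play every arm once
        else:
            # UCB1 index: empirical mean plus bonus sqrt(2 ln t / n_i)
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward

    best = max(arm_means)
    # Pseudo-regret: expected reward lost by not always playing the best arm.
    regret = sum((best - mu) * c for mu, c in zip(arm_means, counts))
    return counts, regret


counts, regret = ucb1([0.2, 0.8], n_rounds=2000)
print(counts, round(regret, 1))
```

A single run like this shows only one realisation of the regret; the chapter is concerned with how far such realisations can deviate from their typical logarithmic order.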
PAGES: 159-173
Algorithmic Learning Theory
ISBN: 978-3-642-24411-7, 978-3-642-24412-4
http://scigraph.springernature.com/pub.10.1007/978-3-642-24412-4_15
DOI: http://dx.doi.org/10.1007/978-3-642-24412-4_15
DIMENSIONS: https://app.dimensions.ai/details/publication/pub.1050961054
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/1605",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Policy and Administration",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/16",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Studies in Human Society",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "Laboratoire d'Informatique Gaspard-Monge",
"id": "https://www.grid.ac/institutes/grid.462940.d",
"name": [
"Imagine, LIGM, \u00c9cole des Ponts ParisTech Universit\u00e9 Paris Est, France"
],
"type": "Organization"
},
"familyName": "Salomon",
"givenName": "Antoine",
"type": "Person"
},
{
"affiliation": {
"alternateName": "French National Centre for Scientific Research",
"id": "https://www.grid.ac/institutes/grid.4444.0",
"name": [
"Imagine, LIGM, \u00c9cole des Ponts ParisTech Universit\u00e9 Paris Est, France",
"Sierra, CNRS/ENS/INRIA, Paris, France"
],
"type": "Organization"
},
"familyName": "Audibert",
"givenName": "Jean-Yves",
"type": "Person"
}
],
"citation": [
{
"id": "https://doi.org/10.1145/1566374.1566386",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1001774469"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/1374376.1374475",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1004451711"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1057/978-1-349-95121-5_2386-1",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1021542300",
"https://doi.org/10.1057/978-1-349-95121-5_2386-1"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1016/j.tcs.2009.01.016",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1032851363"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1016/0196-8858(85)90002-8",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1033673111"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/11871842_29",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1037233025",
"https://doi.org/10.1007/11871842_29"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1090/s0002-9904-1952-09620-8",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1037264252"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1145/1566374.1566388",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1037453164"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1023/a:1013689704352",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1039349898",
"https://doi.org/10.1023/a:1013689704352"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1214/105051604000000350",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1064391735"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1214/aop/1176990746",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1064403995"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.2307/1427934",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1069490858"
],
"type": "CreativeWork"
},
{
"id": "https://app.dimensions.ai/details/publication/pub.1111345088",
"type": "CreativeWork"
}
],
"datePublished": "2011",
"datePublishedReg": "2011-01-01",
"description": "This paper studies the deviations of the regret in a stochastic multi-armed bandit problem. When the total number of plays n is known beforehand by the agent, Audibert et al. (2009) exhibit a policy such that with probability at least 1-1/n, the regret of the policy is of order logn. They have also shown that such a property is not shared by the popular ucb1 policy of Auer et al. (2002). This work first answers an open question: it extends this negative result to any anytime policy. The second contribution of this paper is to design anytime robust policies for specific multi-armed bandit problems in which some restrictions are put on the set of possible distributions of the different arms.",
"editor": [
{
"familyName": "Kivinen",
"givenName": "Jyrki",
"type": "Person"
},
{
"familyName": "Szepesv\u00e1ri",
"givenName": "Csaba",
"type": "Person"
},
{
"familyName": "Ukkonen",
"givenName": "Esko",
"type": "Person"
},
{
"familyName": "Zeugmann",
"givenName": "Thomas",
"type": "Person"
}
],
"genre": "chapter",
"id": "sg:pub.10.1007/978-3-642-24412-4_15",
"inLanguage": [
"en"
],
"isAccessibleForFree": true,
"isPartOf": {
"isbn": [
"978-3-642-24411-7",
"978-3-642-24412-4"
],
"name": "Algorithmic Learning Theory",
"type": "Book"
},
"name": "Deviations of Stochastic Bandit Regret",
"pagination": "159-173",
"productId": [
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1050961054"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1007/978-3-642-24412-4_15"
]
},
{
"name": "readcube_id",
"type": "PropertyValue",
"value": [
"f2c2af5f609f5f3f9575b5dd53a287601f4da4b8f2d51f35d330b7a96940f579"
]
}
],
"publisher": {
"location": "Berlin, Heidelberg",
"name": "Springer Berlin Heidelberg",
"type": "Organisation"
},
"sameAs": [
"https://doi.org/10.1007/978-3-642-24412-4_15",
"https://app.dimensions.ai/details/publication/pub.1050961054"
],
"sdDataset": "chapters",
"sdDatePublished": "2019-04-16T09:45",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000375_0000000375/records_91453_00000001.jsonl",
"type": "Chapter",
"url": "https://link.springer.com/10.1007%2F978-3-642-24412-4_15"
}
]
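A record of this shape can be read with any JSON parser. The sketch below is one way to pull out the title, author names, DOI, and citation ids; it works on an abbreviated inline sample that reuses field names and values from the record, and the helper name `summarize` is hypothetical. The sample deliberately repeats one citation entry to exercise the de-duplication step.

```python
import json

# Abbreviated sample using field names from the record (not the full record).
RECORD = """
[
  {
    "author": [
      {"familyName": "Salomon", "givenName": "Antoine", "type": "Person"},
      {"familyName": "Audibert", "givenName": "Jean-Yves", "type": "Person"}
    ],
    "citation": [
      {"id": "sg:pub.10.1007/11871842_29", "type": "CreativeWork"},
      {"id": "sg:pub.10.1007/11871842_29", "type": "CreativeWork"},
      {"id": "https://doi.org/10.2307/1427934", "type": "CreativeWork"}
    ],
    "name": "Deviations of Stochastic Bandit Regret",
    "pagination": "159-173",
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/978-3-642-24412-4_15"]}
    ]
  }
]
"""


def summarize(raw):
    """Return a small dict of bibliographic fields (hypothetical helper)."""
    node = json.loads(raw)[0]  # the document is a one-element JSON array
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in node.get("author", [])
    ]
    # productId is a list of name/value pairs; index it by name.
    product_ids = {p["name"]: p["value"][0] for p in node.get("productId", [])}
    # dict.fromkeys keeps first-seen order while dropping repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in node.get("citation", [])))
    return {
        "title": node.get("name"),
        "authors": authors,
        "doi": product_ids.get("doi"),
        "pages": node.get("pagination"),
        "citations": citations,
    }


summary = summarize(RECORD)
print(summary["title"], "by", ", ".join(summary["authors"]))
```

This treats the JSON-LD as plain JSON and ignores the `@context`; a dedicated JSON-LD processor would be needed to expand terms into full IRIs.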
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-24412-4_15'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-24412-4_15'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-24412-4_15'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-24412-4_15'
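The four curl commands differ only in their Accept header, so the same content negotiation can be sketched in Python with the standard library. The short format keys and the function names `build_request` and `fetch` are assumptions made for this illustration, and whether the endpoint still responds is not checked here.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-642-24412-4_15"

# Media types taken from the curl commands; the short keys are illustrative.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Build a GET request asking the server for the given RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=BASE_URL, timeout=10):
    """Perform the request and return the body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building a request does not touch the network; fetch("turtle") would.
req = build_request("turtle")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Keeping request construction separate from the network call makes the header logic easy to check offline.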
The metadata directly associated with this object, expressed as RDF triples, comprises 130 triples, 23 predicates, 40 URIs, 20 literals, and 8 blank nodes.