Parallel Reinforcement Learning for Weighted Multi-criteria Model with Adaptive Margin


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2008

AUTHORS

Kazuyuki Hiraoka , Manabu Yoshida , Taketoshi Mishima

ABSTRACT

Reinforcement learning (RL) for a linear family of tasks is studied in this paper. The key of our discussion is nonlinearity of the optimal solution even if the task family is linear; we cannot obtain the optimal policy by a naive approach. Though there exists an algorithm for calculating the equivalent result to Q-learning for each task all together, it has a problem with explosion of set sizes. We introduce adaptive margins to overcome this difficulty.
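The nonlinearity the abstract mentions can be seen in a toy setting. The sketch below is not the chapter's algorithm. It assumes a one-step task with two hypothetical actions, each paying a fixed two-criterion reward vector, and a weight (beta, 1 - beta) that scalarizes the two criteria. The names REWARD_VECTORS, action_value, and optimal_value are illustrative only.

```python
# Toy illustration only; the reward vectors are made-up values, not data
# from the chapter.
REWARD_VECTORS = {"a": (1.0, 0.0), "b": (0.0, 1.0)}


def action_value(action, beta):
    """Scalarized reward of one action under the weight (beta, 1 - beta)."""
    r1, r2 = REWARD_VECTORS[action]
    return beta * r1 + (1.0 - beta) * r2


def optimal_value(beta):
    """Best achievable scalarized reward for the weight beta."""
    return max(action_value(action, beta) for action in REWARD_VECTORS)


# Each action's value is linear in beta, but the maximum over actions is not:
# interpolating the optimal values at beta = 0 and beta = 1 overestimates
# the optimal value at beta = 0.5.
interpolated = 0.5 * (optimal_value(0.0) + optimal_value(1.0))
print(interpolated, optimal_value(0.5))  # prints "1.0 0.5"
```

Under these assumptions the optimal value is a maximum of linear functions of the weight, so it is piecewise linear rather than linear, which is consistent with the abstract's remark that a naive linear combination does not yield the optimal policy.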

PAGES

487-496

Book

TITLE

Neural Information Processing

ISBN

978-3-540-69154-9
978-3-540-69158-7

Author Affiliations

Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama-shi, Japan (all three authors)

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-540-69158-7_51

DOI

http://dx.doi.org/10.1007/978-3-540-69158-7_51

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1009083778



JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record in an external JSON-LD service such as the JSON-LD Playground or the Google Structured Data Testing Tool (SDTT).

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/1701", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Psychology", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/17", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Psychology and Cognitive Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Saitama University", 
          "id": "https://www.grid.ac/institutes/grid.263023.6", 
          "name": [
            "Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama-shi, Japan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hiraoka", 
        "givenName": "Kazuyuki", 
        "id": "sg:person.010465371205.51", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010465371205.51"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Saitama University", 
          "id": "https://www.grid.ac/institutes/grid.263023.6", 
          "name": [
            "Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama-shi, Japan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yoshida", 
        "givenName": "Manabu", 
        "id": "sg:person.01065256234.76", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065256234.76"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Saitama University", 
          "id": "https://www.grid.ac/institutes/grid.263023.6", 
          "name": [
            "Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama-shi, Japan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Mishima", 
        "givenName": "Taketoshi", 
        "id": "sg:person.013106042101.51", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013106042101.51"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2008", 
    "datePublishedReg": "2008-01-01", 
    "description": "Reinforcement learning (RL) for a linear family of tasks is studied in this paper. The key of our discussion is nonlinearity of the optimal solution even if the task family is linear; we cannot obtain the optimal policy by a naive approach. Though there exists an algorithm for calculating the equivalent result to Q-learning for each task all together, it has a problem with explosion of set sizes. We introduce adaptive margins to overcome this difficulty.", 
    "editor": [
      {
        "familyName": "Ishikawa", 
        "givenName": "Masumi", 
        "type": "Person"
      }, 
      {
        "familyName": "Doya", 
        "givenName": "Kenji", 
        "type": "Person"
      }, 
      {
        "familyName": "Miyamoto", 
        "givenName": "Hiroyuki", 
        "type": "Person"
      }, 
      {
        "familyName": "Yamakawa", 
        "givenName": "Takeshi", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-540-69158-7_51", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-69154-9", 
        "978-3-540-69158-7"
      ], 
      "name": "Neural Information Processing", 
      "type": "Book"
    }, 
    "name": "Parallel Reinforcement Learning for Weighted Multi-criteria Model with Adaptive Margin", 
    "pagination": "487-496", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-540-69158-7_51"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "2b46db34bba80b926818e1e5cefa372663f766faae70e6766f0060780c13e0b0"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1009083778"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-540-69158-7_51", 
      "https://app.dimensions.ai/details/publication/pub.1009083778"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T14:09", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8669_00000015.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/978-3-540-69158-7_51"
  }
]
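
Because the record is ordinary JSON, it can be read with a standard JSON parser. The following is a minimal sketch, assuming only the field names visible in the record above. RECORD_JSON is a trimmed copy kept short for illustration, and the summarize helper is a hypothetical name, not part of any SciGraph tooling.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Hiraoka", "givenName": "Kazuyuki"},
      {"familyName": "Yoshida", "givenName": "Manabu"},
      {"familyName": "Mishima", "givenName": "Taketoshi"}
    ],
    "name": "Parallel Reinforcement Learning for Weighted Multi-criteria Model with Adaptive Margin",
    "pagination": "487-496",
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/978-3-540-69158-7_51"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1009083778"]}
    ]
  }
]
"""


def summarize(record):
    """Collect title, author names, pages, and DOI from one record."""
    authors = [f'{a["givenName"]} {a["familyName"]}'
               for a in record.get("author", [])]
    # productId is a list of name/value pairs; index it by name.
    identifiers = {p["name"]: p["value"][0]
                   for p in record.get("productId", [])}
    return {
        "title": record["name"],
        "authors": authors,
        "pages": record.get("pagination"),
        "doi": identifiers.get("doi"),
    }


# The top-level JSON value is a list holding a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary["authors"], summary["doi"])
```

Indexing productId by its name field avoids relying on the order of the identifiers in the list.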
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-69158-7_51'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-69158-7_51'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-69158-7_51'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-69158-7_51'
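
The four curl commands differ only in the Accept header, so the same content negotiation can be sketched with Python's standard library. This is an illustration under assumptions: the URL and media types are copied from the commands above, the names ACCEPT_TYPES, build_request, and fetch are hypothetical, and the endpoint may not be reachable, so the example only builds a request and does not send it.

```python
import urllib.request

RECORD_URL = ("https://scigraph.springernature.com/"
              "pub.10.1007/978-3-540-69158-7_51")

# Media types taken from the curl commands above; the short keys are
# labels chosen for this sketch.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt):
    """Build a GET request asking for the record in the given format."""
    return urllib.request.Request(
        RECORD_URL, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, timeout=10.0):
    """Send the request and return the body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network.
request = build_request("turtle")
print(request.get_header("Accept"))  # prints "text/turtle"
```

Calling fetch("json-ld") would perform the actual download, assuming the service still answers at that address.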


 

This table displays all metadata directly associated to this object as RDF triples.

94 TRIPLES      22 PREDICATES      27 URIs      20 LITERALS      8 BLANK NODES

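Row Subject Predicate Object (a row that omits the subject, or the subject and the predicate, repeats them from the row above)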
1 sg:pub.10.1007/978-3-540-69158-7_51 schema:about anzsrc-for:17
2 anzsrc-for:1701
3 schema:author N36a869429a544d3584bdf68fc6a42a1b
4 schema:datePublished 2008
5 schema:datePublishedReg 2008-01-01
6 schema:description Reinforcement learning (RL) for a linear family of tasks is studied in this paper. The key of our discussion is nonlinearity of the optimal solution even if the task family is linear; we cannot obtain the optimal policy by a naive approach. Though there exists an algorithm for calculating the equivalent result to Q-learning for each task all together, it has a problem with explosion of set sizes. We introduce adaptive margins to overcome this difficulty.
7 schema:editor N0be31bcb0f44492db2e51a763c53734e
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree true
11 schema:isPartOf N299e41d195a842a38233387b644a5fc6
12 schema:name Parallel Reinforcement Learning for Weighted Multi-criteria Model with Adaptive Margin
13 schema:pagination 487-496
14 schema:productId N81b0a40e1ab34e34a9fe0d838e789239
15 Nc4c7e647de264ed38ddbfba3608d4bf7
16 Ne37de13318604806b793bcf4230559c5
17 schema:publisher N46a5a8af86104a5aa048c32f6d378f26
18 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009083778
19 https://doi.org/10.1007/978-3-540-69158-7_51
20 schema:sdDatePublished 2019-04-15T14:09
21 schema:sdLicense https://scigraph.springernature.com/explorer/license/
22 schema:sdPublisher Nac3276e4fa75489a890648abe056f39d
23 schema:url http://link.springer.com/10.1007/978-3-540-69158-7_51
24 sgo:license sg:explorer/license/
25 sgo:sdDataset chapters
26 rdf:type schema:Chapter
27 N032e094fd156479880971872051e3ca0 rdf:first Ne38174411d564ddba16c62d6200c7894
28 rdf:rest Nbf21bc56cc9e49d4a597d02219d703a7
29 N0be31bcb0f44492db2e51a763c53734e rdf:first N11141dea7d6049f18a2673038f4488a9
30 rdf:rest N891086921eeb4da98c922cd94c857c8a
31 N11141dea7d6049f18a2673038f4488a9 schema:familyName Ishikawa
32 schema:givenName Masumi
33 rdf:type schema:Person
34 N1556ed1718794d5cbc71ba2517395dd8 rdf:first sg:person.013106042101.51
35 rdf:rest rdf:nil
36 N299e41d195a842a38233387b644a5fc6 schema:isbn 978-3-540-69154-9
37 978-3-540-69158-7
38 schema:name Neural Information Processing
39 rdf:type schema:Book
40 N369a071e820f48e3916c069f541acc2a schema:familyName Doya
41 schema:givenName Kenji
42 rdf:type schema:Person
43 N36a869429a544d3584bdf68fc6a42a1b rdf:first sg:person.010465371205.51
44 rdf:rest N5ecd6dca325443e4942036a9d6f91b85
45 N3fee719ada22446cb8e78bea7f5539c9 schema:familyName Yamakawa
46 schema:givenName Takeshi
47 rdf:type schema:Person
48 N46a5a8af86104a5aa048c32f6d378f26 schema:location Berlin, Heidelberg
49 schema:name Springer Berlin Heidelberg
50 rdf:type schema:Organisation
51 N5ecd6dca325443e4942036a9d6f91b85 rdf:first sg:person.01065256234.76
52 rdf:rest N1556ed1718794d5cbc71ba2517395dd8
53 N81b0a40e1ab34e34a9fe0d838e789239 schema:name doi
54 schema:value 10.1007/978-3-540-69158-7_51
55 rdf:type schema:PropertyValue
56 N891086921eeb4da98c922cd94c857c8a rdf:first N369a071e820f48e3916c069f541acc2a
57 rdf:rest N032e094fd156479880971872051e3ca0
58 Nac3276e4fa75489a890648abe056f39d schema:name Springer Nature - SN SciGraph project
59 rdf:type schema:Organization
60 Nbf21bc56cc9e49d4a597d02219d703a7 rdf:first N3fee719ada22446cb8e78bea7f5539c9
61 rdf:rest rdf:nil
62 Nc4c7e647de264ed38ddbfba3608d4bf7 schema:name readcube_id
63 schema:value 2b46db34bba80b926818e1e5cefa372663f766faae70e6766f0060780c13e0b0
64 rdf:type schema:PropertyValue
65 Ne37de13318604806b793bcf4230559c5 schema:name dimensions_id
66 schema:value pub.1009083778
67 rdf:type schema:PropertyValue
68 Ne38174411d564ddba16c62d6200c7894 schema:familyName Miyamoto
69 schema:givenName Hiroyuki
70 rdf:type schema:Person
71 anzsrc-for:17 schema:inDefinedTermSet anzsrc-for:
72 schema:name Psychology and Cognitive Sciences
73 rdf:type schema:DefinedTerm
74 anzsrc-for:1701 schema:inDefinedTermSet anzsrc-for:
75 schema:name Psychology
76 rdf:type schema:DefinedTerm
77 sg:person.010465371205.51 schema:affiliation https://www.grid.ac/institutes/grid.263023.6
78 schema:familyName Hiraoka
79 schema:givenName Kazuyuki
80 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010465371205.51
81 rdf:type schema:Person
82 sg:person.01065256234.76 schema:affiliation https://www.grid.ac/institutes/grid.263023.6
83 schema:familyName Yoshida
84 schema:givenName Manabu
85 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065256234.76
86 rdf:type schema:Person
87 sg:person.013106042101.51 schema:affiliation https://www.grid.ac/institutes/grid.263023.6
88 schema:familyName Mishima
89 schema:givenName Taketoshi
90 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013106042101.51
91 rdf:type schema:Person
92 https://www.grid.ac/institutes/grid.263023.6 schema:alternateName Saitama University
93 schema:name Saitama University, 255 Shimo-Okubo, Sakura-ku, Saitama-shi, Japan
94 rdf:type schema:Organization
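
In the table above, the author order is stored as an RDF list: each blank node has an rdf:first item and an rdf:rest pointer, and the chain ends at rdf:nil. The sketch below walks that chain starting from the schema:author node in row 3. The node identifiers are copied from rows 34-35, 43-44, and 51-52; the dictionary layout and the walk_rdf_list helper are assumptions made for illustration.

```python
RDF_NIL = "rdf:nil"

# Blank node -> (rdf:first, rdf:rest), copied from the triples table.
LIST_NODES = {
    "N36a869429a544d3584bdf68fc6a42a1b":
        ("sg:person.010465371205.51", "N5ecd6dca325443e4942036a9d6f91b85"),
    "N5ecd6dca325443e4942036a9d6f91b85":
        ("sg:person.01065256234.76", "N1556ed1718794d5cbc71ba2517395dd8"),
    "N1556ed1718794d5cbc71ba2517395dd8":
        ("sg:person.013106042101.51", RDF_NIL),
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head and return the rdf:first items."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            # Guard against a malformed, cyclic list.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


authors_in_order = walk_rdf_list(
    "N36a869429a544d3584bdf68fc6a42a1b", LIST_NODES)
print(authors_in_order)
```

The resulting order (Hiraoka, Yoshida, Mishima) matches the author array in the JSON-LD record.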
 



