Hierarchical End-to-end Control Policy for Multi-degree-of-freedom Manipulators


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2022-08-27

AUTHORS

Cheol-Hui Min, Jae-Bok Song

ABSTRACT

In recent years, several control policies for a multi-degree-of-freedom (DOF) manipulator using deep reinforcement learning have been proposed. To avoid complexity, previous studies have applied a number of constraints on the high-dimensional state-action space, thus hindering generalized policy function learning. In this study, the control problem is addressed by introducing a hierarchical reinforcement learning method that can learn the end-to-end control policy of a multi-DOF manipulator without any constraints on the state-action space. The proposed method learns a hierarchical policy using two off-policy methods. Using human demonstration data and a newly proposed data-correction method, controlling the multi-DOF manipulator in an end-to-end manner is shown to outperform non-hierarchical deep reinforcement learning methods.

PAGES

3296-3311

References to SciGraph publications

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s12555-021-0511-4

DOI

http://dx.doi.org/10.1007/s12555-021-0511-4

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1150553855



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/09", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Engineering", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0906", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Electrical and Electronic Engineering", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0910", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Manufacturing Engineering", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0913", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mechanical Engineering", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "School of Mechanical Engineering, Korea University, 145, Anam-ro, Seongbuk-gu, Seoul, Korea", 
          "id": "http://www.grid.ac/institutes/grid.222754.4", 
          "name": [
            "School of Mechanical Engineering, Korea University, 145, Anam-ro, Seongbuk-gu, Seoul, Korea"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Min", 
        "givenName": "Cheol-Hui", 
        "id": "sg:person.012142644074.08", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012142644074.08"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "School of Mechanical Engineering, Korea University, 145, Anam-ro, Seongbuk-gu, Seoul, Korea", 
          "id": "http://www.grid.ac/institutes/grid.222754.4", 
          "name": [
            "School of Mechanical Engineering, Korea University, 145, Anam-ro, Seongbuk-gu, Seoul, Korea"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Song", 
        "givenName": "Jae-Bok", 
        "id": "sg:person.014564140207.01", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014564140207.01"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1038/nature14236", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030517994", 
          "https://doi.org/10.1038/nature14236"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2022-08-27", 
    "datePublishedReg": "2022-08-27", 
    "description": "In recent years, several control policies for a multi-degree-of-freedom (DOF) manipulator using deep reinforcement learning have been proposed. To avoid complexity, previous studies have applied a number of constraints on the high-dimensional state-action space, thus hindering generalized policy function learning. In this study, the control problem is addressed by introducing a hierarchical reinforcement learning method that can learn the end-to-end control policy of a multi-DOF manipulator without any constraints on the state-action space. The proposed method learns a hierarchical policy using two off-policy methods. Using human demonstration data and a newly proposed data-correction method, controlling the multi-DOF manipulator in an end-to-end manner is shown to outperform non-hierarchical deep reinforcement learning methods.", 
    "genre": "article", 
    "id": "sg:pub.10.1007/s12555-021-0511-4", 
    "isAccessibleForFree": false, 
    "isPartOf": [
      {
        "id": "sg:journal.1041911", 
        "issn": [
          "1598-6446", 
          "2005-4092"
        ], 
        "name": "International Journal of Control, Automation and Systems", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "10", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "20"
      }
    ], 
    "keywords": [
      "state-action space", 
      "freedom manipulator", 
      "number of constraints", 
      "control problem", 
      "control policies", 
      "high-dimensional state-action space", 
      "reinforcement learning method", 
      "data correction method", 
      "policy function", 
      "human demonstration data", 
      "hierarchical end", 
      "manipulator", 
      "learning methods", 
      "space", 
      "constraints", 
      "demonstration data", 
      "deep reinforcement learning method", 
      "reinforcement learning", 
      "problem", 
      "deep reinforcement learning", 
      "complexity", 
      "end manner", 
      "policy methods", 
      "function", 
      "recent years", 
      "number", 
      "end", 
      "data", 
      "learning", 
      "manner", 
      "previous studies", 
      "study", 
      "policy", 
      "hierarchical policy", 
      "years", 
      "method", 
      "end control policies", 
      "hierarchical reinforcement learning method"
    ], 
    "name": "Hierarchical End-to-end Control Policy for Multi-degree-of-freedom Manipulators", 
    "pagination": "3296-3311", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1150553855"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s12555-021-0511-4"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s12555-021-0511-4", 
      "https://app.dimensions.ai/details/publication/pub.1150553855"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-12-01T06:44", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/article/article_952.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1007/s12555-021-0511-4"
  }
]
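
As a sketch of working with a record like the one above, the following Python snippet pulls a few common fields (title, publication date, authors) out of a SciGraph-style JSON-LD array. The trimmed `record_json` stand-in and the `summarize` helper are illustrative, not part of the SciGraph API; field names follow the schema.org usage shown in this record.

```python
import json

# A trimmed stand-in for the SciGraph JSON-LD record shown above.
record_json = """
[
  {
    "name": "Hierarchical End-to-end Control Policy for Multi-degree-of-freedom Manipulators",
    "datePublished": "2022-08-27",
    "pagination": "3296-3311",
    "author": [
      {"familyName": "Min", "givenName": "Cheol-Hui"},
      {"familyName": "Song", "givenName": "Jae-Bok"}
    ],
    "sameAs": ["https://doi.org/10.1007/s12555-021-0511-4"]
  }
]
"""

def summarize(jsonld_text):
    """Return (title, date, authors) from a SciGraph-style JSON-LD array."""
    record = json.loads(jsonld_text)[0]
    authors = [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]
    return record["name"], record["datePublished"], authors

title, date, authors = summarize(record_json)
print(title)
print(date, "-", ", ".join(authors))
```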
 

Download the RDF metadata as: JSON-LD, N-Triples, Turtle, or RDF/XML.

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s12555-021-0511-4'
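
The same content negotiation can be done from Python with the standard library. This is a minimal sketch: the `build_request` helper is hypothetical, and only the URL and `Accept` header values come from the curl examples here. The actual fetch is left commented out since it requires network access.

```python
import urllib.request

SCIGRAPH_URL = "https://scigraph.springernature.com/pub.10.1007/s12555-021-0511-4"

def build_request(url, mime="application/ld+json"):
    """Build a content-negotiated request for a SciGraph record.

    Pass a different MIME type (e.g. "text/turtle") to request
    another serialization, mirroring the curl examples.
    """
    return urllib.request.Request(url, headers={"Accept": mime})

req = build_request(SCIGRAPH_URL)

# To actually fetch the record (requires network access):
# with urllib.request.urlopen(req) as resp:
#     data = resp.read().decode("utf-8")
```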

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s12555-021-0511-4'
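
Because N-Triples is one statement per line, batch processing can be as simple as a per-line regex. The sketch below handles only URI subjects/predicates with URI or quoted-literal objects; a real parser (e.g. one covering blank nodes and typed literals) would be needed for the full format. The example triple is illustrative, using this record's DOI and pagination.

```python
import re

# Minimal N-Triples line pattern: <subj> <pred> object .
# Covers URI subjects/predicates and URI or quoted-literal objects only;
# blank nodes and typed literals are out of scope for this sketch.
TRIPLE_RE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+(.+?)\s*\.\s*$')

def parse_line(line):
    """Return (subject, predicate, object) for a simple N-Triples line, else None."""
    m = TRIPLE_RE.match(line)
    return m.groups() if m else None

line = ('<https://doi.org/10.1007/s12555-021-0511-4> '
        '<http://schema.org/pagination> "3296-3311" .')
print(parse_line(line))
```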

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s12555-021-0511-4'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s12555-021-0511-4'


 

This table displays all metadata directly associated to this object as RDF triples.

114 TRIPLES      21 PREDICATES      65 URIs      54 LITERALS      6 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/s12555-021-0511-4 schema:about anzsrc-for:09
2 anzsrc-for:0906
3 anzsrc-for:0910
4 anzsrc-for:0913
5 schema:author N32714abfcab246b7bf0d94763250638a
6 schema:citation sg:pub.10.1038/nature14236
7 schema:datePublished 2022-08-27
8 schema:datePublishedReg 2022-08-27
9 schema:description In recent years, several control policies for a multi-degree-of-freedom (DOF) manipulator using deep reinforcement learning have been proposed. To avoid complexity, previous studies have applied a number of constraints on the high-dimensional state-action space, thus hindering generalized policy function learning. In this study, the control problem is addressed by introducing a hierarchical reinforcement learning method that can learn the end-to-end control policy of a multi-DOF manipulator without any constraints on the state-action space. The proposed method learns a hierarchical policy using two off-policy methods. Using human demonstration data and a newly proposed data-correction method, controlling the multi-DOF manipulator in an end-to-end manner is shown to outperform non-hierarchical deep reinforcement learning methods.
10 schema:genre article
11 schema:isAccessibleForFree false
12 schema:isPartOf N5c4274718dbc469aa206d13cefccca07
13 Nd1685bccf8184279b7eb8e1892a78247
14 sg:journal.1041911
15 schema:keywords complexity
16 constraints
17 control policies
18 control problem
19 data
20 data correction method
21 deep reinforcement learning
22 deep reinforcement learning method
23 demonstration data
24 end
25 end control policies
26 end manner
27 freedom manipulator
28 function
29 hierarchical end
30 hierarchical policy
31 hierarchical reinforcement learning method
32 high-dimensional state-action space
33 human demonstration data
34 learning
35 learning methods
36 manipulator
37 manner
38 method
39 number
40 number of constraints
41 policy
42 policy function
43 policy methods
44 previous studies
45 problem
46 recent years
47 reinforcement learning
48 reinforcement learning method
49 space
50 state-action space
51 study
52 years
53 schema:name Hierarchical End-to-end Control Policy for Multi-degree-of-freedom Manipulators
54 schema:pagination 3296-3311
55 schema:productId N9b1e5d890092445bb3a3c6dbc96775fe
56 Nc932f96bb1b94e9fb9acbffe0d58edc8
57 schema:sameAs https://app.dimensions.ai/details/publication/pub.1150553855
58 https://doi.org/10.1007/s12555-021-0511-4
59 schema:sdDatePublished 2022-12-01T06:44
60 schema:sdLicense https://scigraph.springernature.com/explorer/license/
61 schema:sdPublisher Nb3da377f36ee4ad992aa813e5ee05c58
62 schema:url https://doi.org/10.1007/s12555-021-0511-4
63 sgo:license sg:explorer/license/
64 sgo:sdDataset articles
65 rdf:type schema:ScholarlyArticle
66 N32714abfcab246b7bf0d94763250638a rdf:first sg:person.012142644074.08
67 rdf:rest Nc12da2566e5c45b390c245cbc543b886
68 N5c4274718dbc469aa206d13cefccca07 schema:volumeNumber 20
69 rdf:type schema:PublicationVolume
70 N9b1e5d890092445bb3a3c6dbc96775fe schema:name dimensions_id
71 schema:value pub.1150553855
72 rdf:type schema:PropertyValue
73 Nb3da377f36ee4ad992aa813e5ee05c58 schema:name Springer Nature - SN SciGraph project
74 rdf:type schema:Organization
75 Nc12da2566e5c45b390c245cbc543b886 rdf:first sg:person.014564140207.01
76 rdf:rest rdf:nil
77 Nc932f96bb1b94e9fb9acbffe0d58edc8 schema:name doi
78 schema:value 10.1007/s12555-021-0511-4
79 rdf:type schema:PropertyValue
80 Nd1685bccf8184279b7eb8e1892a78247 schema:issueNumber 10
81 rdf:type schema:PublicationIssue
82 anzsrc-for:09 schema:inDefinedTermSet anzsrc-for:
83 schema:name Engineering
84 rdf:type schema:DefinedTerm
85 anzsrc-for:0906 schema:inDefinedTermSet anzsrc-for:
86 schema:name Electrical and Electronic Engineering
87 rdf:type schema:DefinedTerm
88 anzsrc-for:0910 schema:inDefinedTermSet anzsrc-for:
89 schema:name Manufacturing Engineering
90 rdf:type schema:DefinedTerm
91 anzsrc-for:0913 schema:inDefinedTermSet anzsrc-for:
92 schema:name Mechanical Engineering
93 rdf:type schema:DefinedTerm
94 sg:journal.1041911 schema:issn 1598-6446
95 2005-4092
96 schema:name International Journal of Control, Automation and Systems
97 schema:publisher Springer Nature
98 rdf:type schema:Periodical
99 sg:person.012142644074.08 schema:affiliation grid-institutes:grid.222754.4
100 schema:familyName Min
101 schema:givenName Cheol-Hui
102 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012142644074.08
103 rdf:type schema:Person
104 sg:person.014564140207.01 schema:affiliation grid-institutes:grid.222754.4
105 schema:familyName Song
106 schema:givenName Jae-Bok
107 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014564140207.01
108 rdf:type schema:Person
109 sg:pub.10.1038/nature14236 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030517994
110 https://doi.org/10.1038/nature14236
111 rdf:type schema:CreativeWork
112 grid-institutes:grid.222754.4 schema:alternateName School of Mechanical Engineering, Korea University, 145, Anam-ro, Seongbuk-gu, Seoul, Korea
113 schema:name School of Mechanical Engineering, Korea University, 145, Anam-ro, Seongbuk-gu, Seoul, Korea
114 rdf:type schema:Organization
 



