Reinforcement learning in a continuum of agents


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2018-03

AUTHORS

Adrian Šošić, Abdelhak M. Zoubir, Heinz Koeppl

ABSTRACT

We present a decision-making framework for modeling the collective behavior of large groups of cooperatively interacting agents based on a continuum description of the agents’ joint state. The continuum model is derived from an agent-based system of locally coupled stochastic differential equations, taking into account that each agent in the group is only partially informed about the global system state. The usefulness of the proposed framework is twofold: (i) for multi-agent scenarios, it provides a computational approach to handling large-scale distributed decision-making problems and learning decentralized control policies. (ii) For single-agent systems, it offers an alternative approximation scheme for evaluating expectations of state distributions. We demonstrate our framework on a variant of the Kuramoto model using a variety of distributed control tasks, such as positioning and aggregation. As part of our experiments, we compare the effectiveness of the controllers learned by the continuum model and agent-based systems of different sizes, and we analyze how the degree of observability in the system affects the learning process.

PAGES

23-51

References to SciGraph publications

  • 2013-03. Swarm robotics: a review from the swarm engineering perspective in SWARM INTELLIGENCE
  • 2005. Programming an Amorphous Computational Medium in UNCONVENTIONAL PROGRAMMING PARADIGMS
  • 2005. A Review of Probabilistic Macroscopic Models for Swarm Robotic Systems in SWARM ROBOTICS
  • 2003. Distributed Sensor Networks, A Multiagent Perspective
  • 2006. System Identification of Self-Organizing Robotic Swarms in DISTRIBUTED AUTONOMOUS ROBOTIC SYSTEMS 7
  • 1975. Self-entrainment of a population of coupled non-linear oscillators in INTERNATIONAL SYMPOSIUM ON MATHEMATICAL PROBLEMS IN THEORETICAL PHYSICS
  • 1989. The Fokker-Planck Equation, Methods of Solution and Applications
  • 2008-12. A framework of space–time continuous models for algorithm design in swarm robotics in SWARM INTELLIGENCE
  • 2002. How Many Robots? Group Size and Efficiency in Collective Search Tasks in DISTRIBUTED AUTONOMOUS ROBOTIC SYSTEMS 5
  • 2007-03. Mean field games in JAPANESE JOURNAL OF MATHEMATICS
  • 2000-07. On sequential Monte Carlo sampling methods for Bayesian filtering in STATISTICS AND COMPUTING
  • 2012. Bayesian Nonparametric Inverse Reinforcement Learning in MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES
  • 1998. Brownian Motion and Stochastic Calculus
Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/s11721-017-0142-9

    DOI

    http://dx.doi.org/10.1007/s11721-017-0142-9

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1092212192



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Technical University of Darmstadt", 
              "id": "https://www.grid.ac/institutes/grid.6546.1", 
              "name": [
                "Department of Electrical Engineering and Information Technology, Technische Universit\u00e4t Darmstadt, 64283, Darmstadt, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "\u0160o\u0161i\u0107", 
            "givenName": "Adrian", 
            "id": "sg:person.014746645213.02", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014746645213.02"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Technical University of Darmstadt", 
              "id": "https://www.grid.ac/institutes/grid.6546.1", 
              "name": [
                "Department of Electrical Engineering and Information Technology, Technische Universit\u00e4t Darmstadt, 64283, Darmstadt, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Zoubir", 
            "givenName": "Abdelhak M.", 
            "id": "sg:person.013316510015.38", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013316510015.38"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Technical University of Darmstadt", 
              "id": "https://www.grid.ac/institutes/grid.6546.1", 
              "name": [
                "Department of Electrical Engineering and Information Technology, Technische Universit\u00e4t Darmstadt, 64283, Darmstadt, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Koeppl", 
            "givenName": "Heinz", 
            "id": "sg:person.0613146061.09", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0613146061.09"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "https://doi.org/10.1103/physrevlett.75.1226", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1003549718"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/4-431-35881-1_4", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1003994643", 
              "https://doi.org/10.1007/4-431-35881-1_4"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1126/science.1070821", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1005920915"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/s0921-8890(99)00038-x", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006122019"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1166/jctn.2005.001", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006750943"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-33486-3_10", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1010786031", 
              "https://doi.org/10.1007/978-3-642-33486-3_10"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1146/annurev-conmatphys-070909-104101", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1014391404"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/bfb0013365", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1019275746", 
              "https://doi.org/10.1007/bfb0013365"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/s0004-3702(98)00023-x", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1020908447"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1073/pnas.92.23.10742", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1023964182"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1006/jtbi.2002.3065", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1026503384"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s11721-008-0015-3", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1026568685", 
              "https://doi.org/10.1007/s11721-008-0015-3"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s11721-012-0075-2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1027362954", 
              "https://doi.org/10.1007/s11721-012-0075-2"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s11537-007-0657-8", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1027618526", 
              "https://doi.org/10.1007/s11537-007-0657-8"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-61544-3", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1027903074", 
              "https://doi.org/10.1007/978-3-642-61544-3"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-540-30552-1_12", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1028649317", 
              "https://doi.org/10.1007/978-3-540-30552-1_12"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/11527800_10", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1030482076", 
              "https://doi.org/10.1007/11527800_10"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1088/0305-4470/29/24/001", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031608831"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1006/jtbi.1993.1007", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1032088373"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/j.neunet.2009.12.004", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1035041723"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1023/a:1008935410038", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1035894305", 
              "https://doi.org/10.1023/a:1008935410038"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-4-431-65941-9_29", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1043812321", 
              "https://doi.org/10.1007/978-4-431-65941-9_29"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-1-4615-0363-7", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1047450729", 
              "https://doi.org/10.1007/978-1-4615-0363-7"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1048154783", 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-1-4612-0949-2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1048154783", 
              "https://doi.org/10.1007/978-1-4612-0949-2"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/332833.332842", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1052419132"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1103/physreve.91.022115", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1053406798"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1017/s0962492914000130", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1054905308"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1103/physrevlett.74.5148", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1060811316"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/2.774914", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061106138"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/tac.2014.2368731", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061479358"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/tsmcc.2012.2218595", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061798463"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1142/s0219477505002641", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062994374"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1143/jpsj.77.044002", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1063123508"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1287/moor.27.4.819.297", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1064724360"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1561/2300000021", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1068001431"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.2307/1913732", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1069640922"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.7551/978-0-262-32621-6-ch055", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1099448393"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1002/9780470316962", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1109489376"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1109489376", 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2018-03", 
        "datePublishedReg": "2018-03-01", 
        "description": "We present a decision-making framework for modeling the collective behavior of large groups of cooperatively interacting agents based on a continuum description of the agents\u2019 joint state. The continuum model is derived from an agent-based system of locally coupled stochastic differential equations, taking into account that each agent in the group is only partially informed about the global system state. The usefulness of the proposed framework is twofold: (i) for multi-agent scenarios, it provides a computational approach to handling large-scale distributed decision-making problems and learning decentralized control policies. (ii) For single-agent systems, it offers an alternative approximation scheme for evaluating expectations of state distributions. We demonstrate our framework on a variant of the Kuramoto model using a variety of distributed control tasks, such as positioning and aggregation. As part of our experiments, we compare the effectiveness of the controllers learned by the continuum model and agent-based systems of different sizes, and we analyze how the degree of observability in the system affects the learning process.", 
        "genre": "research_article", 
        "id": "sg:pub.10.1007/s11721-017-0142-9", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": false, 
        "isPartOf": [
          {
            "id": "sg:journal.1136777", 
            "issn": [
              "1935-3812", 
              "1935-3820"
            ], 
            "name": "Swarm Intelligence", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "1", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "12"
          }
        ], 
        "name": "Reinforcement learning in a continuum of agents", 
        "pagination": "23-51", 
        "productId": [
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "65c624db7dfc10aa801203c03a4318edbf148986896d6737bcdbf2cab1a69abe"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/s11721-017-0142-9"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1092212192"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/s11721-017-0142-9", 
          "https://app.dimensions.ai/details/publication/pub.1092212192"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2019-04-10T22:47", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8690_00000601.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://link.springer.com/10.1007%2Fs11721-017-0142-9"
      }
    ]
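
    A record shaped like the JSON-LD above can be read with any ordinary JSON parser. The following is a minimal Python sketch, not part of SciGraph itself: SAMPLE is an abridged copy of the record that keeps only the fields used here, and the function name summarize is a hypothetical helper introduced for illustration. It collects the title and author names, and keeps each citation id once in case an id appears more than once.

```python
import json

# Abridged copy of the record above; only the fields used below are kept.
# The raw string preserves the JSON \u escapes so json.loads decodes them.
SAMPLE = r"""
[
  {
    "id": "sg:pub.10.1007/s11721-017-0142-9",
    "name": "Reinforcement learning in a continuum of agents",
    "author": [
      {"familyName": "\u0160o\u0161i\u0107", "givenName": "Adrian"},
      {"familyName": "Zoubir", "givenName": "Abdelhak M."},
      {"familyName": "Koeppl", "givenName": "Heinz"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1103/physrevlett.75.1226"},
      {"id": "https://doi.org/10.1103/physrevlett.75.1226"},
      {"id": "sg:pub.10.1007/4-431-35881-1_4"}
    ]
  }
]
"""


def summarize(record_text):
    """Return the title, author names, and de-duplicated citation ids.

    Hypothetical helper: assumes the top level is a one-element array
    holding an object with "name", "author", and "citation" keys.
    """
    record = json.loads(record_text)[0]
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # dict.fromkeys keeps first-seen order while dropping repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in record["citation"]))
    return {"title": record["name"], "authors": authors, "citations": citations}


if __name__ == "__main__":
    summary = summarize(SAMPLE)
    print(summary["title"])
    print(", ".join(summary["authors"]))
    print(len(summary["citations"]), "distinct citation ids")
```

    Keeping first-seen order means the de-duplicated list still follows the order of the original citation array.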
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data that is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s11721-017-0142-9'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s11721-017-0142-9'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s11721-017-0142-9'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s11721-017-0142-9'
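
    The four curl commands above differ only in the Accept header, so the same content negotiation can be scripted. The sketch below uses Python's standard urllib.request and makes several assumptions: the short format keys in ACCEPT and the function names build_request and fetch are illustrative labels rather than part of any SciGraph API, and fetch only works if the endpoint still responds as the curl commands describe.

```python
import urllib.request

# URL and media types are copied from the curl commands above.
BASE = "https://scigraph.springernature.com/pub.10.1007/s11721-017-0142-9"

# The short keys are illustrative labels chosen for this sketch.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE):
    """Build a GET request whose Accept header selects the RDF format."""
    if fmt not in ACCEPT:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(ACCEPT)}")
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=BASE, timeout=10):
    """Send the request and return the response body as text.

    Assumes the server is reachable and answers in UTF-8.
    """
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Inspect the request without touching the network.
    req = build_request("turtle")
    print(req.full_url)
    print(req.get_header("Accept"))
```

    Separating build_request from fetch lets the header logic be checked offline; only fetch touches the network.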


     

    This table displays all metadata directly associated with this object as RDF triples.

    206 TRIPLES      21 PREDICATES      67 URIs      19 LITERALS      7 BLANK NODES
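
    In the table that follows, the ordered author list is not stored directly on the publication. The schema:author predicate points at a blank node, and the authors are chained through rdf:first / rdf:rest links until rdf:nil, which is the standard RDF collection pattern. The Python sketch below shows one way to recover the order. The triples are copied from the table, reading rows that omit the subject as continuing the subject of the row above (an interpretation of the table layout), and walk_rdf_list is a hypothetical helper name.

```python
# Author-list triples copied from the table; rows without a subject are
# assumed to inherit the subject of the preceding row.
TRIPLES = [
    ("sg:pub.10.1007/s11721-017-0142-9", "schema:author",
     "N5b5936080cb6490c9519bb42fe89ea56"),
    ("N50dfe253146e48388388a318a4c23318", "rdf:first", "sg:person.013316510015.38"),
    ("N50dfe253146e48388388a318a4c23318", "rdf:rest",
     "N58c340d3351644aebcd99e29b43c67cd"),
    ("N58c340d3351644aebcd99e29b43c67cd", "rdf:first", "sg:person.0613146061.09"),
    ("N58c340d3351644aebcd99e29b43c67cd", "rdf:rest", "rdf:nil"),
    ("N5b5936080cb6490c9519bb42fe89ea56", "rdf:first", "sg:person.014746645213.02"),
    ("N5b5936080cb6490c9519bb42fe89ea56", "rdf:rest",
     "N50dfe253146e48388388a318a4c23318"),
]


def walk_rdf_list(triples, head):
    """Follow rdf:first / rdf:rest links from head until rdf:nil."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, node, seen = [], head, set()
    while node != "rdf:nil":
        # Guard against cycles and dangling nodes in malformed data.
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"malformed RDF list at node {node!r}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


if __name__ == "__main__":
    head = next(o for s, p, o in TRIPLES if p == "schema:author")
    for person in walk_rdf_list(TRIPLES, head):
        print(person)
```

    The recovered order matches the author order given in the Article Info section: Šošić, Zoubir, Koeppl.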

    Subject Predicate Object
    1 sg:pub.10.1007/s11721-017-0142-9 schema:about anzsrc-for:08
    2 anzsrc-for:0801
    3 schema:author N5b5936080cb6490c9519bb42fe89ea56
    4 schema:citation sg:pub.10.1007/11527800_10
    5 sg:pub.10.1007/4-431-35881-1_4
    6 sg:pub.10.1007/978-1-4612-0949-2
    7 sg:pub.10.1007/978-1-4615-0363-7
    8 sg:pub.10.1007/978-3-540-30552-1_12
    9 sg:pub.10.1007/978-3-642-33486-3_10
    10 sg:pub.10.1007/978-3-642-61544-3
    11 sg:pub.10.1007/978-4-431-65941-9_29
    12 sg:pub.10.1007/bfb0013365
    13 sg:pub.10.1007/s11537-007-0657-8
    14 sg:pub.10.1007/s11721-008-0015-3
    15 sg:pub.10.1007/s11721-012-0075-2
    16 sg:pub.10.1023/a:1008935410038
    17 https://app.dimensions.ai/details/publication/pub.1048154783
    18 https://app.dimensions.ai/details/publication/pub.1109489376
    19 https://doi.org/10.1002/9780470316962
    20 https://doi.org/10.1006/jtbi.1993.1007
    21 https://doi.org/10.1006/jtbi.2002.3065
    22 https://doi.org/10.1016/j.neunet.2009.12.004
    23 https://doi.org/10.1016/s0004-3702(98)00023-x
    24 https://doi.org/10.1016/s0921-8890(99)00038-x
    25 https://doi.org/10.1017/s0962492914000130
    26 https://doi.org/10.1073/pnas.92.23.10742
    27 https://doi.org/10.1088/0305-4470/29/24/001
    28 https://doi.org/10.1103/physreve.91.022115
    29 https://doi.org/10.1103/physrevlett.74.5148
    30 https://doi.org/10.1103/physrevlett.75.1226
    31 https://doi.org/10.1109/2.774914
    32 https://doi.org/10.1109/tac.2014.2368731
    33 https://doi.org/10.1109/tsmcc.2012.2218595
    34 https://doi.org/10.1126/science.1070821
    35 https://doi.org/10.1142/s0219477505002641
    36 https://doi.org/10.1143/jpsj.77.044002
    37 https://doi.org/10.1145/332833.332842
    38 https://doi.org/10.1146/annurev-conmatphys-070909-104101
    39 https://doi.org/10.1166/jctn.2005.001
    40 https://doi.org/10.1287/moor.27.4.819.297
    41 https://doi.org/10.1561/2300000021
    42 https://doi.org/10.2307/1913732
    43 https://doi.org/10.7551/978-0-262-32621-6-ch055
    44 schema:datePublished 2018-03
    45 schema:datePublishedReg 2018-03-01
    46 schema:description We present a decision-making framework for modeling the collective behavior of large groups of cooperatively interacting agents based on a continuum description of the agents’ joint state. The continuum model is derived from an agent-based system of locally coupled stochastic differential equations, taking into account that each agent in the group is only partially informed about the global system state. The usefulness of the proposed framework is twofold: (i) for multi-agent scenarios, it provides a computational approach to handling large-scale distributed decision-making problems and learning decentralized control policies. (ii) For single-agent systems, it offers an alternative approximation scheme for evaluating expectations of state distributions. We demonstrate our framework on a variant of the Kuramoto model using a variety of distributed control tasks, such as positioning and aggregation. As part of our experiments, we compare the effectiveness of the controllers learned by the continuum model and agent-based systems of different sizes, and we analyze how the degree of observability in the system affects the learning process.
    47 schema:genre research_article
    48 schema:inLanguage en
    49 schema:isAccessibleForFree false
    50 schema:isPartOf Na759cc6c531a48698e3490b472eed00a
    51 Nbcdd3aef550d400eb5f1e089bd37119b
    52 sg:journal.1136777
    53 schema:name Reinforcement learning in a continuum of agents
    54 schema:pagination 23-51
    55 schema:productId N56938c2b57b247c18244cb0ea9fb820b
    56 Nbcddf8b74ec9419ea36fd87fdd219f26
    57 Ne4267b374ea14c17be6dd898af3f13d2
    58 schema:sameAs https://app.dimensions.ai/details/publication/pub.1092212192
    59 https://doi.org/10.1007/s11721-017-0142-9
    60 schema:sdDatePublished 2019-04-10T22:47
    61 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    62 schema:sdPublisher N6c76fd5dfacf43afb61fd8e5eb1331d7
    63 schema:url https://link.springer.com/10.1007%2Fs11721-017-0142-9
    64 sgo:license sg:explorer/license/
    65 sgo:sdDataset articles
    66 rdf:type schema:ScholarlyArticle
    67 N50dfe253146e48388388a318a4c23318 rdf:first sg:person.013316510015.38
    68 rdf:rest N58c340d3351644aebcd99e29b43c67cd
    69 N56938c2b57b247c18244cb0ea9fb820b schema:name readcube_id
    70 schema:value 65c624db7dfc10aa801203c03a4318edbf148986896d6737bcdbf2cab1a69abe
    71 rdf:type schema:PropertyValue
    72 N58c340d3351644aebcd99e29b43c67cd rdf:first sg:person.0613146061.09
    73 rdf:rest rdf:nil
    74 N5b5936080cb6490c9519bb42fe89ea56 rdf:first sg:person.014746645213.02
    75 rdf:rest N50dfe253146e48388388a318a4c23318
    76 N6c76fd5dfacf43afb61fd8e5eb1331d7 schema:name Springer Nature - SN SciGraph project
    77 rdf:type schema:Organization
    78 Na759cc6c531a48698e3490b472eed00a schema:volumeNumber 12
    79 rdf:type schema:PublicationVolume
    80 Nbcdd3aef550d400eb5f1e089bd37119b schema:issueNumber 1
    81 rdf:type schema:PublicationIssue
    82 Nbcddf8b74ec9419ea36fd87fdd219f26 schema:name doi
    83 schema:value 10.1007/s11721-017-0142-9
    84 rdf:type schema:PropertyValue
    85 Ne4267b374ea14c17be6dd898af3f13d2 schema:name dimensions_id
    86 schema:value pub.1092212192
    87 rdf:type schema:PropertyValue
    88 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    89 schema:name Information and Computing Sciences
    90 rdf:type schema:DefinedTerm
    91 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    92 schema:name Artificial Intelligence and Image Processing
    93 rdf:type schema:DefinedTerm
    94 sg:journal.1136777 schema:issn 1935-3812
    95 1935-3820
    96 schema:name Swarm Intelligence
    97 rdf:type schema:Periodical
    98 sg:person.013316510015.38 schema:affiliation https://www.grid.ac/institutes/grid.6546.1
    99 schema:familyName Zoubir
    100 schema:givenName Abdelhak M.
    101 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013316510015.38
    102 rdf:type schema:Person
    103 sg:person.014746645213.02 schema:affiliation https://www.grid.ac/institutes/grid.6546.1
    104 schema:familyName Šošić
    105 schema:givenName Adrian
    106 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014746645213.02
    107 rdf:type schema:Person
    108 sg:person.0613146061.09 schema:affiliation https://www.grid.ac/institutes/grid.6546.1
    109 schema:familyName Koeppl
    110 schema:givenName Heinz
    111 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0613146061.09
    112 rdf:type schema:Person
    113 sg:pub.10.1007/11527800_10 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030482076
    114 https://doi.org/10.1007/11527800_10
    115 rdf:type schema:CreativeWork
    116 sg:pub.10.1007/4-431-35881-1_4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003994643
    117 https://doi.org/10.1007/4-431-35881-1_4
    118 rdf:type schema:CreativeWork
    119 sg:pub.10.1007/978-1-4612-0949-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048154783
    120 https://doi.org/10.1007/978-1-4612-0949-2
    121 rdf:type schema:CreativeWork
    122 sg:pub.10.1007/978-1-4615-0363-7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047450729
    123 https://doi.org/10.1007/978-1-4615-0363-7
    124 rdf:type schema:CreativeWork
    125 sg:pub.10.1007/978-3-540-30552-1_12 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028649317
    126 https://doi.org/10.1007/978-3-540-30552-1_12
    127 rdf:type schema:CreativeWork
    128 sg:pub.10.1007/978-3-642-33486-3_10 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010786031
    129 https://doi.org/10.1007/978-3-642-33486-3_10
    130 rdf:type schema:CreativeWork
    131 sg:pub.10.1007/978-3-642-61544-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027903074
    132 https://doi.org/10.1007/978-3-642-61544-3
    133 rdf:type schema:CreativeWork
    134 sg:pub.10.1007/978-4-431-65941-9_29 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043812321
    135 https://doi.org/10.1007/978-4-431-65941-9_29
    136 rdf:type schema:CreativeWork
    137 sg:pub.10.1007/bfb0013365 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019275746
    138 https://doi.org/10.1007/bfb0013365
    139 rdf:type schema:CreativeWork
    140 sg:pub.10.1007/s11537-007-0657-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027618526
    141 https://doi.org/10.1007/s11537-007-0657-8
    142 rdf:type schema:CreativeWork
    143 sg:pub.10.1007/s11721-008-0015-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1026568685
    144 https://doi.org/10.1007/s11721-008-0015-3
    145 rdf:type schema:CreativeWork
    146 sg:pub.10.1007/s11721-012-0075-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027362954
    147 https://doi.org/10.1007/s11721-012-0075-2
    148 rdf:type schema:CreativeWork
    149 sg:pub.10.1023/a:1008935410038 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035894305
    150 https://doi.org/10.1023/a:1008935410038
    151 rdf:type schema:CreativeWork
    152 https://app.dimensions.ai/details/publication/pub.1048154783 schema:CreativeWork
    153 https://app.dimensions.ai/details/publication/pub.1109489376 schema:CreativeWork
    154 https://doi.org/10.1002/9780470316962 schema:sameAs https://app.dimensions.ai/details/publication/pub.1109489376
    155 rdf:type schema:CreativeWork
    156 https://doi.org/10.1006/jtbi.1993.1007 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032088373
    157 rdf:type schema:CreativeWork
    158 https://doi.org/10.1006/jtbi.2002.3065 schema:sameAs https://app.dimensions.ai/details/publication/pub.1026503384
    159 rdf:type schema:CreativeWork
    160 https://doi.org/10.1016/j.neunet.2009.12.004 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035041723
    161 rdf:type schema:CreativeWork
    162 https://doi.org/10.1016/s0004-3702(98)00023-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1020908447
    163 rdf:type schema:CreativeWork
    164 https://doi.org/10.1016/s0921-8890(99)00038-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1006122019
    165 rdf:type schema:CreativeWork
    166 https://doi.org/10.1017/s0962492914000130 schema:sameAs https://app.dimensions.ai/details/publication/pub.1054905308
    167 rdf:type schema:CreativeWork
    168 https://doi.org/10.1073/pnas.92.23.10742 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023964182
    169 rdf:type schema:CreativeWork
    170 https://doi.org/10.1088/0305-4470/29/24/001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031608831
    171 rdf:type schema:CreativeWork
    172 https://doi.org/10.1103/physreve.91.022115 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053406798
    173 rdf:type schema:CreativeWork
    174 https://doi.org/10.1103/physrevlett.74.5148 schema:sameAs https://app.dimensions.ai/details/publication/pub.1060811316
    175 rdf:type schema:CreativeWork
    176 https://doi.org/10.1103/physrevlett.75.1226 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003549718
    177 rdf:type schema:CreativeWork
    178 https://doi.org/10.1109/2.774914 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061106138
    179 rdf:type schema:CreativeWork
    180 https://doi.org/10.1109/tac.2014.2368731 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061479358
    181 rdf:type schema:CreativeWork
    182 https://doi.org/10.1109/tsmcc.2012.2218595 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061798463
    183 rdf:type schema:CreativeWork
    184 https://doi.org/10.1126/science.1070821 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005920915
    185 rdf:type schema:CreativeWork
    186 https://doi.org/10.1142/s0219477505002641 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062994374
    187 rdf:type schema:CreativeWork
    188 https://doi.org/10.1143/jpsj.77.044002 schema:sameAs https://app.dimensions.ai/details/publication/pub.1063123508
    189 rdf:type schema:CreativeWork
    190 https://doi.org/10.1145/332833.332842 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052419132
    191 rdf:type schema:CreativeWork
    192 https://doi.org/10.1146/annurev-conmatphys-070909-104101 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014391404
    193 rdf:type schema:CreativeWork
    194 https://doi.org/10.1166/jctn.2005.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006750943
    195 rdf:type schema:CreativeWork
    196 https://doi.org/10.1287/moor.27.4.819.297 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064724360
    197 rdf:type schema:CreativeWork
    198 https://doi.org/10.1561/2300000021 schema:sameAs https://app.dimensions.ai/details/publication/pub.1068001431
    199 rdf:type schema:CreativeWork
    200 https://doi.org/10.2307/1913732 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069640922
    201 rdf:type schema:CreativeWork
    202 https://doi.org/10.7551/978-0-262-32621-6-ch055 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099448393
    203 rdf:type schema:CreativeWork
    204 https://www.grid.ac/institutes/grid.6546.1 schema:alternateName Technical University of Darmstadt
    205 schema:name Department of Electrical Engineering and Information Technology, Technische Universität Darmstadt, 64283, Darmstadt, Germany
    206 rdf:type schema:Organization
     



