Co-Evolution in the Successful Learning of Backgammon Strategy


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

1998-09

AUTHORS

Jordan B. Pollack, Alan D. Blair

ABSTRACT

Following Tesauro's work on TD-Gammon, we used a 4,000 parameter feedforward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and selection of the position with the highest evaluation. However, no backpropagation, reinforcement or temporal difference learning methods were employed. Instead we apply simple hillclimbing in a relative fitness environment. We start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger and changing weights if the challenger wins. Surprisingly, this worked rather well. We investigate how the peculiar dynamics of this domain enabled a previously discarded weak method to succeed, by preventing suboptimal equilibria in a “meta-game” of self-learning.
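The champion-versus-challenger loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `play` callback, the mutation scale `sigma`, the toy "closest to a hidden target wins" game, and the choice to adopt the winning challenger outright are all assumptions made here so that the example runs without a backgammon engine.

```python
import random


def hillclimb(play, n_weights, generations, sigma=0.05, seed=0):
    """Champion-versus-mutant hill climbing in a relative fitness setting.

    play(challenger, champion, rng) is a hypothetical callback that
    returns True when the challenger wins the match.
    """
    rng = random.Random(seed)
    champion = [0.0] * n_weights  # initial champion: all-zero weights
    for _ in range(generations):
        # Slightly mutated challenger: champion plus small Gaussian noise.
        challenger = [w + rng.gauss(0.0, sigma) for w in champion]
        if play(challenger, champion, rng):
            # Assumption: the winning challenger simply becomes the champion.
            champion = challenger
    return champion


# Toy stand-in for backgammon (an assumption, not the paper's game):
# whichever weight vector lies closer to a hidden target wins.
TARGET = [0.3, -0.2, 0.5]


def distance(weights):
    """Euclidean distance from a weight vector to the hidden target."""
    return sum((w - t) ** 2 for w, t in zip(weights, TARGET)) ** 0.5


def toy_play(challenger, champion, rng):
    """Deterministic toy match; real backgammon outcomes depend on dice."""
    return distance(challenger) < distance(champion)


if __name__ == "__main__":
    final = hillclimb(toy_play, n_weights=len(TARGET), generations=500)
    print("distance before:", distance([0.0] * len(TARGET)))
    print("distance after: ", distance(final))
```

Note that fitness here is only ever relative: no weight vector receives an absolute score, and the only signal is which of two players won.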

PAGES

225-240

Identifiers

URI

http://scigraph.springernature.com/pub.10.1023/a:1007417214905

DOI

http://dx.doi.org/10.1023/a:1007417214905

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1021109064


JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Brandeis University", 
          "id": "https://www.grid.ac/institutes/grid.253264.4", 
          "name": [
            "Computer Science Department, Volen Center for Complex Systems, Brandeis University, 02254. E-mail: Email, Waltham, MA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Pollack", 
        "givenName": "Jordan B.", 
        "id": "sg:person.010041771163.31", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010041771163.31"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Brandeis University", 
          "id": "https://www.grid.ac/institutes/grid.253264.4", 
          "name": [
            "Computer Science Department, Volen Center for Complex Systems, Brandeis University, 02254. E-mail: Email, Waltham, MA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Blair", 
        "givenName": "Alan D.", 
        "id": "sg:person.013652324101.09", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013652324101.09"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1038/323533a0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018367015", 
          "https://doi.org/10.1038/323533a0"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00115009", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1028679285", 
          "https://doi.org/10.1007/bf00115009"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00992697", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1029271228", 
          "https://doi.org/10.1007/bf00992697"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1162/artl.1994.1.4.353", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038487925"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00993346", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044433028", 
          "https://doi.org/10.1007/bf00993346"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/3-540-59496-5_300", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048714750", 
          "https://doi.org/10.1007/3-540-59496-5_300"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/203330.203343", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052366953"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "1998-09", 
    "datePublishedReg": "1998-09-01", 
    "description": "Following Tesauro's work on TD-Gammon, we used a 4,000 parameter feedforward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and selection of the position with the highest evaluation. However, no backpropagation, reinforcement or temporal difference learning methods were employed. Instead we apply simple hillclimbing in a relative fitness environment. We start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger and changing weights if the challenger wins. Surprisingly, this worked rather well. We investigate how the peculiar dynamics of this domain enabled a previously discarded weak method to succeed, by preventing suboptimal equilibria in a \u201cmeta-game\u201d of self-learning.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1023/a:1007417214905", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1125588", 
        "issn": [
          "0885-6125", 
          "1573-0565"
        ], 
        "name": "Machine Learning", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "3", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "32"
      }
    ], 
    "name": "Co-Evolution in the Successful Learning of Backgammon Strategy", 
    "pagination": "225-240", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "6492dffb13e88af6354288e48acc8da2b768605f57b2e33f03d9fe63efb9498c"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1023/a:1007417214905"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1021109064"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1023/a:1007417214905", 
      "https://app.dimensions.ai/details/publication/pub.1021109064"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T16:39", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8669_00000499.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1023/A:1007417214905"
  }
]
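As a rough illustration of how a consumer might read this record, the sketch below pulls the citation fields out of a trimmed copy of the JSON-LD above. The function name `summarize` and the output keys are hypothetical; the field names (`author`, `isPartOf`, `productId`, and so on) are taken from the record itself.

```python
import json

# Trimmed copy of the record above; only the fields used here are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Pollack", "givenName": "Jordan B."},
      {"familyName": "Blair", "givenName": "Alan D."}
    ],
    "datePublished": "1998-09",
    "isPartOf": [
      {"name": "Machine Learning", "type": "Periodical"},
      {"issueNumber": "3", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "32"}
    ],
    "name": "Co-Evolution in the Successful Learning of Backgammon Strategy",
    "pagination": "225-240",
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.1023/a:1007417214905"]}
    ]
  }
]
"""


def summarize(record):
    """Collect citation fields from one SciGraph-style JSON-LD object."""
    # isPartOf mixes journal, issue and volume; index the entries by type.
    parts = {p["type"]: p for p in record["isPartOf"]}
    # productId entries hold their value in a one-element list.
    ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {
        "authors": [f'{a["givenName"]} {a["familyName"]}' for a in record["author"]],
        "title": record["name"],
        "journal": parts["Periodical"]["name"],
        "volume": parts["PublicationVolume"]["volumeNumber"],
        "issue": parts["PublicationIssue"]["issueNumber"],
        "pages": record["pagination"],
        "date": record["datePublished"],
        "doi": ids["doi"],
    }


if __name__ == "__main__":
    record = json.loads(RECORD_JSON)[0]  # the payload is a one-element list
    for key, value in summarize(record).items():
        print(f"{key}: {value}")
```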
 

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1023/a:1007417214905'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1023/a:1007417214905'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1023/a:1007417214905'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1023/a:1007417214905'
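The four curl calls differ only in their Accept header, so the same content negotiation could be sketched in Python with the standard library. The short format labels, `build_request`, and `fetch` are hypothetical names chosen for this example; the URL and MIME types are copied from the commands above, and whether the endpoint still responds has not been verified here.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# MIME types taken from the curl commands; the short keys are
# labels invented for this sketch.
MIME_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(pub_id, fmt="json-ld"):
    """Build (but do not send) a GET request asking for one RDF format."""
    return urllib.request.Request(
        BASE_URL + pub_id,
        headers={"Accept": MIME_TYPES[fmt]},
    )


def fetch(pub_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(pub_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Requires network access.
    print(fetch("pub.10.1023/a:1007417214905", "turtle"))
```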


 

This table displays all metadata directly associated to this object as RDF triples.

94 TRIPLES      21 PREDICATES      34 URIs      19 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1023/a:1007417214905 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author Nc6c6d9e736814a889dcc0954f870e7a1
4 schema:citation sg:pub.10.1007/3-540-59496-5_300
5 sg:pub.10.1007/bf00115009
6 sg:pub.10.1007/bf00992697
7 sg:pub.10.1007/bf00993346
8 sg:pub.10.1038/323533a0
9 https://doi.org/10.1145/203330.203343
10 https://doi.org/10.1162/artl.1994.1.4.353
11 schema:datePublished 1998-09
12 schema:datePublishedReg 1998-09-01
13 schema:description Following Tesauro's work on TD-Gammon, we used a 4,000 parameter feedforward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of the dice, application of the network to all legal moves, and selection of the position with the highest evaluation. However, no backpropagation, reinforcement or temporal difference learning methods were employed. Instead we apply simple hillclimbing in a relative fitness environment. We start with an initial champion of all zero weights and proceed simply by playing the current champion network against a slightly mutated challenger and changing weights if the challenger wins. Surprisingly, this worked rather well. We investigate how the peculiar dynamics of this domain enabled a previously discarded weak method to succeed, by preventing suboptimal equilibria in a “meta-game” of self-learning.
14 schema:genre research_article
15 schema:inLanguage en
16 schema:isAccessibleForFree true
17 schema:isPartOf Ne5bf8b297ebf49f7b08d754be61aec17
18 Nedc49a3d392b4a8d90b075a6ae0cc811
19 sg:journal.1125588
20 schema:name Co-Evolution in the Successful Learning of Backgammon Strategy
21 schema:pagination 225-240
22 schema:productId N1b7dd02ef0e34ab9aa15a95eaca0e942
23 N8b16f2958eb342b79ccc62a00b7e7d22
24 Nf880f2b974c44a238968abf972bed6f2
25 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021109064
26 https://doi.org/10.1023/a:1007417214905
27 schema:sdDatePublished 2019-04-10T16:39
28 schema:sdLicense https://scigraph.springernature.com/explorer/license/
29 schema:sdPublisher N6bb3247e2e084496a859df206ae1c6cb
30 schema:url http://link.springer.com/10.1023/A:1007417214905
31 sgo:license sg:explorer/license/
32 sgo:sdDataset articles
33 rdf:type schema:ScholarlyArticle
34 N1b7dd02ef0e34ab9aa15a95eaca0e942 schema:name readcube_id
35 schema:value 6492dffb13e88af6354288e48acc8da2b768605f57b2e33f03d9fe63efb9498c
36 rdf:type schema:PropertyValue
37 N6bb3247e2e084496a859df206ae1c6cb schema:name Springer Nature - SN SciGraph project
38 rdf:type schema:Organization
39 N8b16f2958eb342b79ccc62a00b7e7d22 schema:name doi
40 schema:value 10.1023/a:1007417214905
41 rdf:type schema:PropertyValue
42 Nc3f3ee7d2b554c18985619448de446d4 rdf:first sg:person.013652324101.09
43 rdf:rest rdf:nil
44 Nc6c6d9e736814a889dcc0954f870e7a1 rdf:first sg:person.010041771163.31
45 rdf:rest Nc3f3ee7d2b554c18985619448de446d4
46 Ne5bf8b297ebf49f7b08d754be61aec17 schema:issueNumber 3
47 rdf:type schema:PublicationIssue
48 Nedc49a3d392b4a8d90b075a6ae0cc811 schema:volumeNumber 32
49 rdf:type schema:PublicationVolume
50 Nf880f2b974c44a238968abf972bed6f2 schema:name dimensions_id
51 schema:value pub.1021109064
52 rdf:type schema:PropertyValue
53 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
54 schema:name Information and Computing Sciences
55 rdf:type schema:DefinedTerm
56 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
57 schema:name Artificial Intelligence and Image Processing
58 rdf:type schema:DefinedTerm
59 sg:journal.1125588 schema:issn 0885-6125
60 1573-0565
61 schema:name Machine Learning
62 rdf:type schema:Periodical
63 sg:person.010041771163.31 schema:affiliation https://www.grid.ac/institutes/grid.253264.4
64 schema:familyName Pollack
65 schema:givenName Jordan B.
66 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010041771163.31
67 rdf:type schema:Person
68 sg:person.013652324101.09 schema:affiliation https://www.grid.ac/institutes/grid.253264.4
69 schema:familyName Blair
70 schema:givenName Alan D.
71 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013652324101.09
72 rdf:type schema:Person
73 sg:pub.10.1007/3-540-59496-5_300 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048714750
74 https://doi.org/10.1007/3-540-59496-5_300
75 rdf:type schema:CreativeWork
76 sg:pub.10.1007/bf00115009 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028679285
77 https://doi.org/10.1007/bf00115009
78 rdf:type schema:CreativeWork
79 sg:pub.10.1007/bf00992697 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029271228
80 https://doi.org/10.1007/bf00992697
81 rdf:type schema:CreativeWork
82 sg:pub.10.1007/bf00993346 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044433028
83 https://doi.org/10.1007/bf00993346
84 rdf:type schema:CreativeWork
85 sg:pub.10.1038/323533a0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018367015
86 https://doi.org/10.1038/323533a0
87 rdf:type schema:CreativeWork
88 https://doi.org/10.1145/203330.203343 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052366953
89 rdf:type schema:CreativeWork
90 https://doi.org/10.1162/artl.1994.1.4.353 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038487925
91 rdf:type schema:CreativeWork
92 https://www.grid.ac/institutes/grid.253264.4 schema:alternateName Brandeis University
93 schema:name Computer Science Department, Volen Center for Complex Systems, Brandeis University, 02254. E-mail: Email, Waltham, MA
94 rdf:type schema:Organization
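Rows 42 to 45 of the table store the author order as an RDF collection: each blank node carries one item in rdf:first and a pointer to the next node in rdf:rest, ending at rdf:nil. A minimal sketch of recovering the ordered list follows; the dictionary layout and the `walk_rdf_list` helper are assumptions for illustration, while the node identifiers are copied from the table.

```python
RDF_NIL = "rdf:nil"

# Rows 42 to 45 of the table, keyed by blank node identifier.
LIST_NODES = {
    "Nc6c6d9e736814a889dcc0954f870e7a1": {
        "rdf:first": "sg:person.010041771163.31",
        "rdf:rest": "Nc3f3ee7d2b554c18985619448de446d4",
    },
    "Nc3f3ee7d2b554c18985619448de446d4": {
        "rdf:first": "sg:person.013652324101.09",
        "rdf:rest": RDF_NIL,
    },
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head, collecting each rdf:first item."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(nodes[node]["rdf:first"])
        node = nodes[node]["rdf:rest"]
    return items


if __name__ == "__main__":
    # Row 3 gives the head of the list as the object of schema:author.
    for person in walk_rdf_list("Nc6c6d9e736814a889dcc0954f870e7a1", LIST_NODES):
        print(person)
```

The traversal yields the identifier for Pollack first and Blair second, matching the author order in the JSON-LD record.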
 



