Evaluating the Markov assumption in Markov Decision Processes for spoken dialogue management


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2006-02

AUTHORS

Tim Paek, David Maxwell Chickering

ABSTRACT

The goal of dialogue management in a spoken dialogue system is to take actions based on observations and inferred beliefs. To ensure that the actions optimize the performance or robustness of the system, researchers have turned to reinforcement learning methods to learn policies for action selection. To derive an optimal policy from data, the dynamics of the system is often represented as a Markov Decision Process (MDP), which assumes that the state of the dialogue depends only on the previous state and action. In this article, we investigate whether constraining the state space by the Markov assumption, especially when the structure of the state space may be unknown, truly affords the highest reward. In simulation experiments conducted in the context of a dialogue system for interacting with a speech-enabled web browser, models under the Markov assumption did not perform as well as an alternative model which classifies the total reward with accumulating features. We discuss the implications of the study as well as its limitations.

PAGES

47-66

References to SciGraph publications

  • 1992-05. Q-learning in MACHINE LEARNING
  • 2007-03. Personalizing influence diagrams: applying online learning strategies to dialogue management in USER MODELING AND USER-ADAPTED INTERACTION

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/s10579-006-9008-2

    DOI

    http://dx.doi.org/10.1007/s10579-006-9008-2

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1033822871



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Microsoft (United States)", 
              "id": "https://www.grid.ac/institutes/grid.419815.0", 
              "name": [
                "Microsoft Research, One Microsoft Way, Redmond, WA98052, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Paek", 
            "givenName": "Tim", 
            "id": "sg:person.012601701553.65", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012601701553.65"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Microsoft (United States)", 
              "id": "https://www.grid.ac/institutes/grid.419815.0", 
              "name": [
                "Microsoft Research, One Microsoft Way, Redmond, WA98052, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Chickering", 
            "givenName": "David Maxwell", 
            "id": "sg:person.011240332636.47", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011240332636.47"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "https://doi.org/10.1098/rsta.2000.0593", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1003146152"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/bf00992698", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1033088958", 
              "https://doi.org/10.1007/bf00992698"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s11257-006-9020-7", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1041445852", 
              "https://doi.org/10.1007/s11257-006-9020-7"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/21.52548", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061122306"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/89.817450", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061242562"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/tnn.1998.712192", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061716400"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1287/deca.1050.0020", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1064706028"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1287/mnsc.47.9.1235.9779", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1064722162"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1287/opre.36.4.589", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1064729937"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/asru.2005.1566498", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1093171072"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1017/cbo9780511620539", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1098706397"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.3115/1075218.1075231", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1099236137"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.3115/1073012.1073078", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1099239585"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1613/jair.301", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1105538429"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1613/jair.859", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1105579535"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2006-02", 
        "datePublishedReg": "2006-02-01", 
        "description": "The goal of dialogue management in a spoken dialogue system is to take actions based on observations and inferred beliefs. To ensure that the actions optimize the performance or robustness of the system, researchers have turned to reinforcement learning methods to learn policies for action selection. To derive an optimal policy from data, the dynamics of the system is often represented as a Markov Decision Process (MDP), which assumes that the state of the dialogue depends only on the previous state and action. In this article, we investigate whether constraining the state space by the Markov assumption, especially when the structure of the state space may be unknown, truly affords the highest reward. In simulation experiments conducted in the context of a dialogue system for interacting with a speech-enabled web browser, models under the Markov assumption did not perform as well as an alternative model which classifies the total reward with accumulating features. We discuss the implications of the study as well as its limitations.", 
        "genre": "research_article", 
        "id": "sg:pub.10.1007/s10579-006-9008-2", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": false, 
        "isPartOf": [
          {
            "id": "sg:journal.1294988", 
            "issn": [
              "1574-020X", 
              "1574-0218"
            ], 
            "name": "Language Resources and Evaluation", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "1", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "40"
          }
        ], 
        "name": "Evaluating the Markov assumption in Markov Decision Processes for spoken dialogue management", 
        "pagination": "47-66", 
        "productId": [
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "a11f0f02659642f6d2a33a56dae7d738d5256487ffd8ff43236051e676c3c4ae"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/s10579-006-9008-2"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1033822871"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/s10579-006-9008-2", 
          "https://app.dimensions.ai/details/publication/pub.1033822871"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2019-04-11T14:30", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000373_0000000373/records_13090_00000001.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "http://link.springer.com/10.1007/s10579-006-9008-2"
      }
    ]
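As a rough illustration of how the JSON-LD record above can be consumed, the following Python sketch pulls out a few bibliographic fields. The `RECORD` string is a trimmed copy of the record (only the keys used here are kept), and the `summarize` helper is a hypothetical name introduced for this example, not part of any SciGraph tooling.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD = """
[
  {
    "author": [
      {"familyName": "Paek", "givenName": "Tim", "type": "Person"},
      {"familyName": "Chickering", "givenName": "David Maxwell", "type": "Person"}
    ],
    "isPartOf": [
      {"name": "Language Resources and Evaluation", "type": "Periodical"},
      {"issueNumber": "1", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "40"}
    ],
    "name": "Evaluating the Markov assumption in Markov Decision Processes for spoken dialogue management",
    "pagination": "47-66",
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/s10579-006-9008-2"]}
    ]
  }
]
"""


def summarize(record_text: str) -> dict:
    """Return a flat summary of the first article in a SciGraph-style JSON-LD list."""
    article = json.loads(record_text)[0]
    # isPartOf mixes journal, issue, and volume objects; index them by "type".
    parts = {p["type"]: p for p in article["isPartOf"]}
    # productId holds named identifiers, each with a one-element "value" list.
    ids = {p["name"]: p["value"][0] for p in article["productId"]}
    return {
        "title": article["name"],
        "authors": [f'{a["givenName"]} {a["familyName"]}' for a in article["author"]],
        "journal": parts["Periodical"]["name"],
        "volume": parts["PublicationVolume"]["volumeNumber"],
        "issue": parts["PublicationIssue"]["issueNumber"],
        "pages": article["pagination"],
        "doi": ids["doi"],
    }


summary = summarize(RECORD)
print(summary)
```

Indexing `isPartOf` by `type` avoids relying on the order of the journal, issue, and volume objects, which the record does not guarantee.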
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10579-006-9008-2'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10579-006-9008-2'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10579-006-9008-2'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10579-006-9008-2'
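The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The media types and base URL are taken from the commands above; the short format keys and the `build_request` helper are hypothetical names for this example, and the sketch assumes the endpoint still answers as described.

```python
from urllib.request import Request

# Media types copied from the curl commands; the dictionary keys are
# hypothetical short names chosen for this example.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}

BASE_URL = "https://scigraph.springernature.com/"


def build_request(record_id: str, fmt: str) -> Request:
    """Build (but do not send) a GET request asking for one serialization."""
    return Request(BASE_URL + record_id, headers={"Accept": MEDIA_TYPES[fmt]})


req = build_request("pub.10.1007/s10579-006-9008-2", "turtle")
print(req.full_url, req.get_header("Accept"))
# To actually fetch the data (requires network access):
#   from urllib.request import urlopen
#   body = urlopen(req).read().decode("utf-8")
```

Building the request separately from sending it keeps the header logic easy to check without touching the network.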


     

    This table displays all metadata directly associated to this object as RDF triples.

    115 TRIPLES      21 PREDICATES      42 URIs      19 LITERALS      7 BLANK NODES

    Row Subject Predicate Object (rows that omit the subject or predicate reuse the one from the row above)
    1 sg:pub.10.1007/s10579-006-9008-2 schema:about anzsrc-for:08
    2 anzsrc-for:0801
    3 schema:author N186e3885eb734b77b9f80efa5ce6dcd3
    4 schema:citation sg:pub.10.1007/bf00992698
    5 sg:pub.10.1007/s11257-006-9020-7
    6 https://doi.org/10.1017/cbo9780511620539
    7 https://doi.org/10.1098/rsta.2000.0593
    8 https://doi.org/10.1109/21.52548
    9 https://doi.org/10.1109/89.817450
    10 https://doi.org/10.1109/asru.2005.1566498
    11 https://doi.org/10.1109/tnn.1998.712192
    12 https://doi.org/10.1287/deca.1050.0020
    13 https://doi.org/10.1287/mnsc.47.9.1235.9779
    14 https://doi.org/10.1287/opre.36.4.589
    15 https://doi.org/10.1613/jair.301
    16 https://doi.org/10.1613/jair.859
    17 https://doi.org/10.3115/1073012.1073078
    18 https://doi.org/10.3115/1075218.1075231
    19 schema:datePublished 2006-02
    20 schema:datePublishedReg 2006-02-01
    21 schema:description The goal of dialogue management in a spoken dialogue system is to take actions based on observations and inferred beliefs. To ensure that the actions optimize the performance or robustness of the system, researchers have turned to reinforcement learning methods to learn policies for action selection. To derive an optimal policy from data, the dynamics of the system is often represented as a Markov Decision Process (MDP), which assumes that the state of the dialogue depends only on the previous state and action. In this article, we investigate whether constraining the state space by the Markov assumption, especially when the structure of the state space may be unknown, truly affords the highest reward. In simulation experiments conducted in the context of a dialogue system for interacting with a speech-enabled web browser, models under the Markov assumption did not perform as well as an alternative model which classifies the total reward with accumulating features. We discuss the implications of the study as well as its limitations.
    22 schema:genre research_article
    23 schema:inLanguage en
    24 schema:isAccessibleForFree false
    25 schema:isPartOf N813ec9e54a1e40adbc8249988b2f7860
    26 Ne5b8d04ff184450480d4b1a4f0bddcf9
    27 sg:journal.1294988
    28 schema:name Evaluating the Markov assumption in Markov Decision Processes for spoken dialogue management
    29 schema:pagination 47-66
    30 schema:productId N1e045971313742e9bcf29ae3194cc4e1
    31 N58d53e0562ed4da4a68185ce4e4b20ef
    32 N8d2190bebc224e648b26f809c3bd9215
    33 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033822871
    34 https://doi.org/10.1007/s10579-006-9008-2
    35 schema:sdDatePublished 2019-04-11T14:30
    36 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    37 schema:sdPublisher Nc13a595b50054b9ba58910b0b9772e0e
    38 schema:url http://link.springer.com/10.1007/s10579-006-9008-2
    39 sgo:license sg:explorer/license/
    40 sgo:sdDataset articles
    41 rdf:type schema:ScholarlyArticle
    42 N186e3885eb734b77b9f80efa5ce6dcd3 rdf:first sg:person.012601701553.65
    43 rdf:rest Naae8950f644c4d3f919f98ce9af142e4
    44 N1e045971313742e9bcf29ae3194cc4e1 schema:name dimensions_id
    45 schema:value pub.1033822871
    46 rdf:type schema:PropertyValue
    47 N58d53e0562ed4da4a68185ce4e4b20ef schema:name readcube_id
    48 schema:value a11f0f02659642f6d2a33a56dae7d738d5256487ffd8ff43236051e676c3c4ae
    49 rdf:type schema:PropertyValue
    50 N813ec9e54a1e40adbc8249988b2f7860 schema:volumeNumber 40
    51 rdf:type schema:PublicationVolume
    52 N8d2190bebc224e648b26f809c3bd9215 schema:name doi
    53 schema:value 10.1007/s10579-006-9008-2
    54 rdf:type schema:PropertyValue
    55 Naae8950f644c4d3f919f98ce9af142e4 rdf:first sg:person.011240332636.47
    56 rdf:rest rdf:nil
    57 Nc13a595b50054b9ba58910b0b9772e0e schema:name Springer Nature - SN SciGraph project
    58 rdf:type schema:Organization
    59 Ne5b8d04ff184450480d4b1a4f0bddcf9 schema:issueNumber 1
    60 rdf:type schema:PublicationIssue
    61 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    62 schema:name Information and Computing Sciences
    63 rdf:type schema:DefinedTerm
    64 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    65 schema:name Artificial Intelligence and Image Processing
    66 rdf:type schema:DefinedTerm
    67 sg:journal.1294988 schema:issn 1574-020X
    68 1574-0218
    69 schema:name Language Resources and Evaluation
    70 rdf:type schema:Periodical
    71 sg:person.011240332636.47 schema:affiliation https://www.grid.ac/institutes/grid.419815.0
    72 schema:familyName Chickering
    73 schema:givenName David Maxwell
    74 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011240332636.47
    75 rdf:type schema:Person
    76 sg:person.012601701553.65 schema:affiliation https://www.grid.ac/institutes/grid.419815.0
    77 schema:familyName Paek
    78 schema:givenName Tim
    79 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012601701553.65
    80 rdf:type schema:Person
    81 sg:pub.10.1007/bf00992698 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033088958
    82 https://doi.org/10.1007/bf00992698
    83 rdf:type schema:CreativeWork
    84 sg:pub.10.1007/s11257-006-9020-7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041445852
    85 https://doi.org/10.1007/s11257-006-9020-7
    86 rdf:type schema:CreativeWork
    87 https://doi.org/10.1017/cbo9780511620539 schema:sameAs https://app.dimensions.ai/details/publication/pub.1098706397
    88 rdf:type schema:CreativeWork
    89 https://doi.org/10.1098/rsta.2000.0593 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003146152
    90 rdf:type schema:CreativeWork
    91 https://doi.org/10.1109/21.52548 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061122306
    92 rdf:type schema:CreativeWork
    93 https://doi.org/10.1109/89.817450 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061242562
    94 rdf:type schema:CreativeWork
    95 https://doi.org/10.1109/asru.2005.1566498 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093171072
    96 rdf:type schema:CreativeWork
    97 https://doi.org/10.1109/tnn.1998.712192 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061716400
    98 rdf:type schema:CreativeWork
    99 https://doi.org/10.1287/deca.1050.0020 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064706028
    100 rdf:type schema:CreativeWork
    101 https://doi.org/10.1287/mnsc.47.9.1235.9779 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064722162
    102 rdf:type schema:CreativeWork
    103 https://doi.org/10.1287/opre.36.4.589 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064729937
    104 rdf:type schema:CreativeWork
    105 https://doi.org/10.1613/jair.301 schema:sameAs https://app.dimensions.ai/details/publication/pub.1105538429
    106 rdf:type schema:CreativeWork
    107 https://doi.org/10.1613/jair.859 schema:sameAs https://app.dimensions.ai/details/publication/pub.1105579535
    108 rdf:type schema:CreativeWork
    109 https://doi.org/10.3115/1073012.1073078 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099239585
    110 rdf:type schema:CreativeWork
    111 https://doi.org/10.3115/1075218.1075231 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099236137
    112 rdf:type schema:CreativeWork
    113 https://www.grid.ac/institutes/grid.419815.0 schema:alternateName Microsoft (United States)
    114 schema:name Microsoft Research, One Microsoft Way, Redmond, WA98052, USA
    115 rdf:type schema:Organization
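In the table above, the author order is stored as an RDF collection: `schema:author` points to a blank node, and each `rdf:first` / `rdf:rest` pair links one author to the remainder of the list until `rdf:nil`. A minimal Python sketch of walking that list follows. The triples are copied from the table; the `walk_list` helper is a hypothetical name, and a real application would more likely use an RDF library than hand-built tuples.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

# (subject, predicate, object) rows copied from the triples table.
TRIPLES = [
    ("sg:pub.10.1007/s10579-006-9008-2", "schema:author", "N186e3885eb734b77b9f80efa5ce6dcd3"),
    ("N186e3885eb734b77b9f80efa5ce6dcd3", RDF_FIRST, "sg:person.012601701553.65"),
    ("N186e3885eb734b77b9f80efa5ce6dcd3", RDF_REST, "Naae8950f644c4d3f919f98ce9af142e4"),
    ("Naae8950f644c4d3f919f98ce9af142e4", RDF_FIRST, "sg:person.011240332636.47"),
    ("Naae8950f644c4d3f919f98ce9af142e4", RDF_REST, RDF_NIL),
    ("sg:person.012601701553.65", "schema:familyName", "Paek"),
    ("sg:person.011240332636.47", "schema:familyName", "Chickering"),
]


def walk_list(triples, head):
    """Follow rdf:first / rdf:rest links from head and return the items in order."""
    index = {(s, p): o for s, p, o in triples}
    items = []
    node = head
    while node != RDF_NIL:
        items.append(index[(node, RDF_FIRST)])  # the element stored at this cell
        node = index[(node, RDF_REST)]          # move to the next cell
    return items


index = {(s, p): o for s, p, o in TRIPLES}
head = index[("sg:pub.10.1007/s10579-006-9008-2", "schema:author")]
authors = [index[(person, "schema:familyName")] for person in walk_list(TRIPLES, head)]
print(authors)
```

The dictionary lookup assumes each (subject, predicate) pair occurs once, which holds for the list cells shown here but not for multi-valued predicates such as `schema:citation`.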
     



