A comparison of scientific and engineering criteria for Bayesian model selection


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2000-01

AUTHORS

David Maxwell Chickering, David Heckerman

ABSTRACT

Given a set of possible models for variables X and a set of possible parameters for each model, the Bayesian “estimate” of the probability distribution for X given observed data is obtained by averaging over the possible models and their parameters. An often-used approximation for this estimate is obtained by selecting a single model and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzaferri (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the scientific criterion (SC) and engineering criterion (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, the predictive performance of the model average can be significantly better than those of single models selected by either criterion, and that differences between models selected by the two criteria can be substantial.
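The ordering stated in the abstract (model average at least as good as EC, which is at least as good as SC) can be illustrated with a small numerical sketch. Everything below is an assumption made for illustration and is not taken from the paper: the three toy models, their posterior probabilities, their predictive distributions over a binary next observation, and the use of the expected logarithmic score as the measure of predictive performance.

```python
import math

def model_average(posteriors, predictives):
    """Mix the per-model predictive distributions, weighted by model posterior."""
    outcomes = range(len(predictives[0]))
    return [sum(w * p[x] for w, p in zip(posteriors, predictives)) for x in outcomes]

def expected_log_score(reference, candidate):
    """Expected log probability that `candidate` assigns to an outcome drawn from `reference`."""
    return sum(r * math.log(c) for r, c in zip(reference, candidate) if r > 0)

# Hypothetical inputs: posterior p(m | D) and predictive p(x | D, m) for three models.
posteriors = [0.5, 0.3, 0.2]
predictives = [[0.9, 0.1], [0.2, 0.8], [0.3, 0.7]]

averaged = model_average(posteriors, predictives)
models = range(len(posteriors))

# SC: pick the model with the highest posterior probability.
sc_choice = max(models, key=lambda m: posteriors[m])
# EC (as assumed here): pick the model with the best expected log score
# for the next observation, with the expectation taken under the model average.
ec_choice = max(models, key=lambda m: expected_log_score(averaged, predictives[m]))

avg_score = expected_log_score(averaged, averaged)
ec_score = expected_log_score(averaged, predictives[ec_choice])
sc_score = expected_log_score(averaged, predictives[sc_choice])

print("SC selects model", sc_choice, "with score", round(sc_score, 4))
print("EC selects model", ec_choice, "with score", round(ec_score, 4))
print("Model average score", round(avg_score, 4))
```

In this toy setting the two criteria select different models, and the scores respect the ordering from the abstract. Under the log score that ordering always holds: the averaged distribution maximizes its own expected log score, and EC maximizes over a set of models that includes the SC choice.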

PAGES

55-62

Journal

TITLE

Statistics and Computing

ISSUE

1

VOLUME

10

Identifiers

URI

http://scigraph.springernature.com/pub.10.1023/a:1008936501289

DOI

http://dx.doi.org/10.1023/a:1008936501289

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1047713127



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Microsoft (United States)", 
          "id": "https://www.grid.ac/institutes/grid.419815.0", 
          "name": [
            "Microsoft Research, 98052-6399, Redmond, WA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chickering", 
        "givenName": "David Maxwell", 
        "id": "sg:person.011240332636.47", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011240332636.47"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Microsoft (United States)", 
          "id": "https://www.grid.ac/institutes/grid.419815.0", 
          "name": [
            "Microsoft Research, 98052-6399, Redmond, WA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Heckerman", 
        "givenName": "David", 
        "id": "sg:person.01134362461.98", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01134362461.98"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1016/0304-4076(81)90073-7", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001885768"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00994016", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035524560", 
          "https://doi.org/10.1007/bf00994016"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00994110", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046316965", 
          "https://doi.org/10.1007/bf00994110"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1086/224530", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1058544028"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/biomet/82.4.669", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1059420600"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1214/aos/1176344689", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1064407540"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2000-01", 
    "datePublishedReg": "2000-01-01", 
    "description": "Given a set of possible models for variables X and a set of possible parameters for each model, the Bayesian \u201cestimate\u201d of the probability distribution for X given observed data is obtained by averaging over the possible models and their parameters. An often-used approximation for this estimate is obtained by selecting a single model and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzafari (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the scientific criterion (SC) and engineering criterion (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, the predictive performance of the model average can be significantly better than those of single models selected by either criterion, and that differences between models selected by the two criterion can be substantial.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1023/a:1008936501289", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": [
      {
        "id": "sg:journal.1327447", 
        "issn": [
          "0960-3174", 
          "1573-1375"
        ], 
        "name": "Statistics and Computing", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "10"
      }
    ], 
    "name": "A comparison of scientific and engineering criteria for Bayesian model selection", 
    "pagination": "55-62", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "e34b10686277933d744500f2c55ed1d0066df96aafea7be26f5b5e6aebce6476"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1023/a:1008936501289"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1047713127"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1023/a:1008936501289", 
      "https://app.dimensions.ai/details/publication/pub.1047713127"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T17:29", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8672_00000501.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1023/A:1008936501289"
  }
]
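Because JSON-LD is plain JSON, a record like the one above can be read with any JSON parser. The sketch below is one possible way to pull out a few fields; it works on a trimmed excerpt of the record (only the keys it needs), and the helper name `summarize` is hypothetical.

```python
import json

# Trimmed excerpt of the record above; keys and values are copied from it.
RECORD_EXCERPT = """
[
  {
    "name": "A comparison of scientific and engineering criteria for Bayesian model selection",
    "author": [
      {"familyName": "Chickering", "givenName": "David Maxwell", "type": "Person"},
      {"familyName": "Heckerman", "givenName": "David", "type": "Person"}
    ],
    "pagination": "55-62",
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.1023/a:1008936501289"]},
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1047713127"]}
    ]
  }
]
"""

def summarize(record):
    """Collect title, author names, DOI and page range from one record object."""
    authors = [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]
    # productId is a list of name/value pairs; index it by name for lookup.
    identifiers = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    return {
        "title": record["name"],
        "authors": authors,
        "doi": identifiers.get("doi"),
        "pages": record.get("pagination"),
    }

# The top-level JSON value is a list containing a single record object.
summary = summarize(json.loads(RECORD_EXCERPT)[0])
print(summary)
```

The same function should work on the full record, since it only reads keys that are present there and ignores the rest.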
 

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1023/a:1008936501289'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1023/a:1008936501289'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1023/a:1008936501289'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1023/a:1008936501289'
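The four curl commands differ only in the Accept header, so the same content negotiation can be scripted. The following is a minimal sketch using Python's standard library; the names `ACCEPT_TYPES`, `build_request` and `fetch` are hypothetical, and it assumes the endpoint still responds as described above.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1023/a:1008936501289"

# Media types taken from the curl commands above; the short keys are made up here.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}

def build_request(fmt, url=RECORD_URL):
    """Create a GET request that asks the server for the chosen serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})

def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")

# Building a request does not contact the server, so this is safe to run offline.
request = build_request("turtle")
print(request.full_url, request.get_header("Accept"))
```

Calling `fetch("turtle")` would then perform the actual download, equivalent to the Turtle curl command.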


 

This table displays all metadata directly associated with this object as RDF triples.

88 triples, 21 predicates, 33 URIs, 19 literals, 7 blank nodes

Subject Predicate Object
1 sg:pub.10.1023/a:1008936501289 schema:about anzsrc-for:01
2 anzsrc-for:0104
3 schema:author N95f0bdae8a37474087f4c17f05165122
4 schema:citation sg:pub.10.1007/bf00994016
5 sg:pub.10.1007/bf00994110
6 https://doi.org/10.1016/0304-4076(81)90073-7
7 https://doi.org/10.1086/224530
8 https://doi.org/10.1093/biomet/82.4.669
9 https://doi.org/10.1214/aos/1176344689
10 schema:datePublished 2000-01
11 schema:datePublishedReg 2000-01-01
12 schema:description Given a set of possible models for variables X and a set of possible parameters for each model, the Bayesian “estimate” of the probability distribution for X given observed data is obtained by averaging over the possible models and their parameters. An often-used approximation for this estimate is obtained by selecting a single model and averaging over its parameters. The approximation is useful because it is computationally efficient, and because it provides a model that facilitates understanding of the domain. A common criterion for model selection is the posterior probability of the model. Another criterion for model selection, proposed by San Martini and Spezzafari (1984), is the predictive performance of a model for the next observation to be seen. From the standpoint of domain understanding, both criteria are useful, because one identifies the model that is most likely, whereas the other identifies the model that is the best predictor of the next observation. To highlight the difference, we refer to the posterior-probability and alternative criteria as the scientific criterion (SC) and engineering criterion (EC), respectively. When we are interested in predicting the next observation, the model-averaged estimate is at least as good as that produced by EC, which itself is at least as good as the estimate produced by SC. We show experimentally that, for Bayesian-network models containing discrete variables only, the predictive performance of the model average can be significantly better than those of single models selected by either criterion, and that differences between models selected by the two criterion can be substantial.
13 schema:genre research_article
14 schema:inLanguage en
15 schema:isAccessibleForFree false
16 schema:isPartOf N430a42ee3d564d0593d5b97251efc3e8
17 Nfd97f334f909485984184b4646af4545
18 sg:journal.1327447
19 schema:name A comparison of scientific and engineering criteria for Bayesian model selection
20 schema:pagination 55-62
21 schema:productId N12744527a6c84a61b5e85bcde66764e4
22 N4f21952f2fd74796ae4a1bd689e6beff
23 N9f6e98881c5f4b49a64ff96a013ca035
24 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047713127
25 https://doi.org/10.1023/a:1008936501289
26 schema:sdDatePublished 2019-04-10T17:29
27 schema:sdLicense https://scigraph.springernature.com/explorer/license/
28 schema:sdPublisher Nf68ce715f21745d29fb8d663a2aec576
29 schema:url http://link.springer.com/10.1023/A:1008936501289
30 sgo:license sg:explorer/license/
31 sgo:sdDataset articles
32 rdf:type schema:ScholarlyArticle
33 N12744527a6c84a61b5e85bcde66764e4 schema:name doi
34 schema:value 10.1023/a:1008936501289
35 rdf:type schema:PropertyValue
36 N430a42ee3d564d0593d5b97251efc3e8 schema:issueNumber 1
37 rdf:type schema:PublicationIssue
38 N4f21952f2fd74796ae4a1bd689e6beff schema:name readcube_id
39 schema:value e34b10686277933d744500f2c55ed1d0066df96aafea7be26f5b5e6aebce6476
40 rdf:type schema:PropertyValue
41 N95f0bdae8a37474087f4c17f05165122 rdf:first sg:person.011240332636.47
42 rdf:rest Neb5f741fbdd54cc892e8c9d4956ee4d6
43 N9f6e98881c5f4b49a64ff96a013ca035 schema:name dimensions_id
44 schema:value pub.1047713127
45 rdf:type schema:PropertyValue
46 Neb5f741fbdd54cc892e8c9d4956ee4d6 rdf:first sg:person.01134362461.98
47 rdf:rest rdf:nil
48 Nf68ce715f21745d29fb8d663a2aec576 schema:name Springer Nature - SN SciGraph project
49 rdf:type schema:Organization
50 Nfd97f334f909485984184b4646af4545 schema:volumeNumber 10
51 rdf:type schema:PublicationVolume
52 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
53 schema:name Mathematical Sciences
54 rdf:type schema:DefinedTerm
55 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
56 schema:name Statistics
57 rdf:type schema:DefinedTerm
58 sg:journal.1327447 schema:issn 0960-3174
59 1573-1375
60 schema:name Statistics and Computing
61 rdf:type schema:Periodical
62 sg:person.011240332636.47 schema:affiliation https://www.grid.ac/institutes/grid.419815.0
63 schema:familyName Chickering
64 schema:givenName David Maxwell
65 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011240332636.47
66 rdf:type schema:Person
67 sg:person.01134362461.98 schema:affiliation https://www.grid.ac/institutes/grid.419815.0
68 schema:familyName Heckerman
69 schema:givenName David
70 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01134362461.98
71 rdf:type schema:Person
72 sg:pub.10.1007/bf00994016 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035524560
73 https://doi.org/10.1007/bf00994016
74 rdf:type schema:CreativeWork
75 sg:pub.10.1007/bf00994110 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046316965
76 https://doi.org/10.1007/bf00994110
77 rdf:type schema:CreativeWork
78 https://doi.org/10.1016/0304-4076(81)90073-7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001885768
79 rdf:type schema:CreativeWork
80 https://doi.org/10.1086/224530 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058544028
81 rdf:type schema:CreativeWork
82 https://doi.org/10.1093/biomet/82.4.669 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059420600
83 rdf:type schema:CreativeWork
84 https://doi.org/10.1214/aos/1176344689 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064407540
85 rdf:type schema:CreativeWork
86 https://www.grid.ac/institutes/grid.419815.0 schema:alternateName Microsoft (United States)
87 schema:name Microsoft Research, 98052-6399, Redmond, WA, USA
88 rdf:type schema:Organization
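In the table above, a row that omits the subject (or the subject and predicate) appears to inherit them from the preceding row. That reading is an inference, although it agrees with the JSON-LD record, which lists two ISSNs and two subject categories. The sketch below expands a few rows into full triples under that assumption; the prefix list used to recognise predicates is a heuristic chosen for this table, and `parse_rows` is a hypothetical helper.

```python
# Prefixes that mark a token as a predicate in this particular table (heuristic).
PREDICATE_PREFIXES = ("schema:", "rdf:", "sgo:")

def parse_rows(rows):
    """Expand numbered table rows into (subject, predicate, object) triples."""
    triples = []
    subject = predicate = None
    for row in rows:
        _, _, rest = row.partition(" ")  # drop the leading row number
        tokens = rest.split(" ", 2)
        if tokens[0].startswith(PREDICATE_PREFIXES):
            # Row gives predicate and object; the subject carries over.
            predicate = tokens[0]
            obj = rest[len(tokens[0]):].strip()
        elif len(tokens) > 1 and tokens[1].startswith(PREDICATE_PREFIXES):
            # Row gives subject, predicate and object.
            subject, predicate = tokens[0], tokens[1]
            obj = tokens[2] if len(tokens) > 2 else ""
        else:
            # Row gives only an object; subject and predicate carry over.
            obj = rest
        triples.append((subject, predicate, obj))
    return triples

# Rows copied from the table above.
sample_rows = [
    "1 sg:pub.10.1023/a:1008936501289 schema:about anzsrc-for:01",
    "2 anzsrc-for:0104",
    "3 schema:author N95f0bdae8a37474087f4c17f05165122",
    "58 sg:journal.1327447 schema:issn 0960-3174",
    "59 1573-1375",
    "60 schema:name Statistics and Computing",
    "61 rdf:type schema:Periodical",
]

triples = parse_rows(sample_rows)
for triple in triples:
    print(triple)
```

The heuristic would misread an object-only row whose text happens to contain one of the predicate prefixes as its first or second token, which does not occur in this table.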
 



