An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2000-08

AUTHORS

Thomas G. Dietterich

ABSTRACT

Bagging and boosting are methods that generate a diverse ensemble of classifiers by manipulating the training data given to a “base” learning algorithm. Breiman has pointed out that they rely for their effectiveness on the instability of the base learning algorithm. An alternative approach to generating an ensemble is to randomize the internal decisions made by the base algorithm. This general approach has been studied previously by Ali and Pazzani and by Dietterich and Kong. This paper compares the effectiveness of randomization, bagging, and boosting for improving the performance of the decision-tree algorithm C4.5. The experiments show that in situations with little or no classification noise, randomization is competitive with (and perhaps slightly superior to) bagging but not as accurate as boosting. In situations with substantial classification noise, bagging is much better than boosting, and sometimes better than randomization.

PAGES

139-157

Identifiers

URI

http://scigraph.springernature.com/pub.10.1023/a:1007607513941

DOI

http://dx.doi.org/10.1023/a:1007607513941

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1041829946



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Oregon State University", 
          "id": "https://www.grid.ac/institutes/grid.4391.f", 
          "name": [
            "Department of Computer Science, Oregon State University, 97331, Corvallis, OR, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Dietterich", 
        "givenName": "Thomas G.", 
        "id": "sg:person.01324347170.02", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01324347170.02"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/bf00058655", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1002929950", 
          "https://doi.org/10.1007/bf00058655"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1007515423169", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017116781", 
          "https://doi.org/10.1023/a:1007515423169"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00058611", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045654538", 
          "https://doi.org/10.1007/bf00058611"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1162/089976698300017197", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1053132543"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1142/s021821309700027x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062965255"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2000-08", 
    "datePublishedReg": "2000-08-01", 
    "description": "Bagging and boosting are methods that generate a diverse ensemble of classifiers by manipulating the training data given to a \u201cbase\u201d learning algorithm. Breiman has pointed out that they rely for their effectiveness on the instability of the base learning algorithm. An alternative approach to generating an ensemble is to randomize the internal decisions made by the base algorithm. This general approach has been studied previously by Ali and Pazzani and by Dietterich and Kong. This paper compares the effectiveness of randomization, bagging, and boosting for improving the performance of the decision-tree algorithm C4.5. The experiments show that in situations with little or no classification noise, randomization is competitive with (and perhaps slightly superior to) bagging but not as accurate as boosting. In situations with substantial classification noise, bagging is much better than boosting, and sometimes better than randomization.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1023/a:1007607513941", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1125588", 
        "issn": [
          "0885-6125", 
          "1573-0565"
        ], 
        "name": "Machine Learning", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "2", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "40"
      }
    ], 
    "name": "An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization", 
    "pagination": "139-157", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "c7423ca013885e8ee352a31da1083e871d2a45aaf5139c4637bd30c9525cebb6"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1023/a:1007607513941"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1041829946"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1023/a:1007607513941", 
      "https://app.dimensions.ai/details/publication/pub.1041829946"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T19:06", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8678_00000500.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1023/A:1007607513941"
  }
]
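
As a rough illustration of how the JSON-LD record above can be consumed, the following Python sketch pulls a few citation fields out of it. The embedded string is an abridged copy of the record (only the fields used are kept), and the names `RECORD_JSON` and `summarize` are hypothetical helpers introduced for this example, not part of SciGraph.

```python
import json

# Abridged copy of the record shown above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "id": "sg:pub.10.1023/a:1007607513941",
    "name": "An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization",
    "datePublished": "2000-08",
    "pagination": "139-157",
    "author": [
      {"familyName": "Dietterich", "givenName": "Thomas G.", "type": "Person"}
    ],
    "isPartOf": [
      {"name": "Machine Learning", "type": "Periodical"},
      {"issueNumber": "2", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "40"}
    ],
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.1023/a:1007607513941"]}
    ]
  }
]
"""


def summarize(record):
    """Collect a few citation fields from one SciGraph-style record dict."""
    # isPartOf mixes journal, issue, and volume nodes; index them by "type".
    parts = {part["type"]: part for part in record.get("isPartOf", [])}
    # productId entries carry identifiers as single-element "value" lists.
    ids = {pid["name"]: pid["value"][0] for pid in record.get("productId", [])}
    authors = [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]
    return {
        "title": record["name"],
        "authors": authors,
        "journal": parts["Periodical"]["name"],
        "volume": parts["PublicationVolume"]["volumeNumber"],
        "issue": parts["PublicationIssue"]["issueNumber"],
        "pages": record["pagination"],
        "doi": ids["doi"],
    }


if __name__ == "__main__":
    # The top-level JSON value is a list holding a single record.
    summary = summarize(json.loads(RECORD_JSON)[0])
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Indexing `isPartOf` by `type` is one simple way to cope with the journal, issue, and volume arriving as separate nodes; a full JSON-LD processor would be the more robust choice if the `@context` needs to be resolved.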
 

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1023/a:1007607513941'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1023/a:1007607513941'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1023/a:1007607513941'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1023/a:1007607513941'
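
The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is a minimal sketch: the names `RECORD_URL`, `ACCEPT_TYPES`, and `build_request` are hypothetical, and the code only builds the requests rather than sending them, since it is an assumption that the endpoint is still reachable.

```python
from urllib.request import Request

# Record URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1023/a:1007607513941"

# Format label -> media type, as listed in the curl examples.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for the given RDF format."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


if __name__ == "__main__":
    for fmt in ACCEPT_TYPES:
        req = build_request(fmt)
        print(fmt, "->", req.get_header("Accept"))
```

To actually fetch the data, a request built this way could be passed to `urllib.request.urlopen`, with the usual error handling for network failures.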


 

This table displays all metadata directly associated with this object as RDF triples.

79 TRIPLES      21 PREDICATES      32 URIs      19 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1023/a:1007607513941 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author Nf7342ea0d1544d2b8de4cb96c647437f
4 schema:citation sg:pub.10.1007/bf00058611
5 sg:pub.10.1007/bf00058655
6 sg:pub.10.1023/a:1007515423169
7 https://doi.org/10.1142/s021821309700027x
8 https://doi.org/10.1162/089976698300017197
9 schema:datePublished 2000-08
10 schema:datePublishedReg 2000-08-01
11 schema:description Bagging and boosting are methods that generate a diverse ensemble of classifiers by manipulating the training data given to a “base” learning algorithm. Breiman has pointed out that they rely for their effectiveness on the instability of the base learning algorithm. An alternative approach to generating an ensemble is to randomize the internal decisions made by the base algorithm. This general approach has been studied previously by Ali and Pazzani and by Dietterich and Kong. This paper compares the effectiveness of randomization, bagging, and boosting for improving the performance of the decision-tree algorithm C4.5. The experiments show that in situations with little or no classification noise, randomization is competitive with (and perhaps slightly superior to) bagging but not as accurate as boosting. In situations with substantial classification noise, bagging is much better than boosting, and sometimes better than randomization.
12 schema:genre research_article
13 schema:inLanguage en
14 schema:isAccessibleForFree true
15 schema:isPartOf N3843fcd9b2b6455fbb128371b6cb40e2
16 N8bc9897f2f714606b4b4a403fa7dfc29
17 sg:journal.1125588
18 schema:name An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization
19 schema:pagination 139-157
20 schema:productId N1f6a38569ea04cbcbe58291627809ee9
21 Nb89d3ce584ad4ced8016f92d27e59437
22 Nbedb6558925746078152a13a6b90a947
23 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041829946
24 https://doi.org/10.1023/a:1007607513941
25 schema:sdDatePublished 2019-04-10T19:06
26 schema:sdLicense https://scigraph.springernature.com/explorer/license/
27 schema:sdPublisher N5a097ab21fd543c4b03f2d0262cd60ea
28 schema:url http://link.springer.com/10.1023/A:1007607513941
29 sgo:license sg:explorer/license/
30 sgo:sdDataset articles
31 rdf:type schema:ScholarlyArticle
32 N1f6a38569ea04cbcbe58291627809ee9 schema:name doi
33 schema:value 10.1023/a:1007607513941
34 rdf:type schema:PropertyValue
35 N3843fcd9b2b6455fbb128371b6cb40e2 schema:volumeNumber 40
36 rdf:type schema:PublicationVolume
37 N5a097ab21fd543c4b03f2d0262cd60ea schema:name Springer Nature - SN SciGraph project
38 rdf:type schema:Organization
39 N8bc9897f2f714606b4b4a403fa7dfc29 schema:issueNumber 2
40 rdf:type schema:PublicationIssue
41 Nb89d3ce584ad4ced8016f92d27e59437 schema:name readcube_id
42 schema:value c7423ca013885e8ee352a31da1083e871d2a45aaf5139c4637bd30c9525cebb6
43 rdf:type schema:PropertyValue
44 Nbedb6558925746078152a13a6b90a947 schema:name dimensions_id
45 schema:value pub.1041829946
46 rdf:type schema:PropertyValue
47 Nf7342ea0d1544d2b8de4cb96c647437f rdf:first sg:person.01324347170.02
48 rdf:rest rdf:nil
49 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
50 schema:name Information and Computing Sciences
51 rdf:type schema:DefinedTerm
52 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
53 schema:name Artificial Intelligence and Image Processing
54 rdf:type schema:DefinedTerm
55 sg:journal.1125588 schema:issn 0885-6125
56 1573-0565
57 schema:name Machine Learning
58 rdf:type schema:Periodical
59 sg:person.01324347170.02 schema:affiliation https://www.grid.ac/institutes/grid.4391.f
60 schema:familyName Dietterich
61 schema:givenName Thomas G.
62 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01324347170.02
63 rdf:type schema:Person
64 sg:pub.10.1007/bf00058611 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045654538
65 https://doi.org/10.1007/bf00058611
66 rdf:type schema:CreativeWork
67 sg:pub.10.1007/bf00058655 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002929950
68 https://doi.org/10.1007/bf00058655
69 rdf:type schema:CreativeWork
70 sg:pub.10.1023/a:1007515423169 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017116781
71 https://doi.org/10.1023/a:1007515423169
72 rdf:type schema:CreativeWork
73 https://doi.org/10.1142/s021821309700027x schema:sameAs https://app.dimensions.ai/details/publication/pub.1062965255
74 rdf:type schema:CreativeWork
75 https://doi.org/10.1162/089976698300017197 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053132543
76 rdf:type schema:CreativeWork
77 https://www.grid.ac/institutes/grid.4391.f schema:alternateName Oregon State University
78 schema:name Department of Computer Science, Oregon State University, 97331, Corvallis, OR, USA
79 rdf:type schema:Organization
 



