An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2000-08

AUTHORS

Thomas G. Dietterich

ABSTRACT

Bagging and boosting are methods that generate a diverse ensemble of classifiers by manipulating the training data given to a “base” learning algorithm. Breiman has pointed out that they rely for their effectiveness on the instability of the base learning algorithm. An alternative approach to generating an ensemble is to randomize the internal decisions made by the base algorithm. This general approach has been studied previously by Ali and Pazzani and by Dietterich and Kong. This paper compares the effectiveness of randomization, bagging, and boosting for improving the performance of the decision-tree algorithm C4.5. The experiments show that in situations with little or no classification noise, randomization is competitive with (and perhaps slightly superior to) bagging but not as accurate as boosting. In situations with substantial classification noise, bagging is much better than boosting, and sometimes better than randomization.
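
To make the abstract's description of bagging concrete, here is a minimal Python sketch: each ensemble member is trained on a bootstrap replicate of the training data, and predictions are combined by majority vote. This is not the paper's implementation. The one-dimensional threshold stump `train_stump` is a hypothetical toy stand-in for C4.5, and the data set, function names, and ensemble size are illustrative assumptions.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) examples from data uniformly, with replacement."""
    return [rng.choice(data) for _ in data]


def train_stump(sample):
    """Toy base learner (assumption, not C4.5): best single threshold on one feature."""
    best = None
    for threshold, _ in sample:
        # Try both label orientations around the candidate threshold.
        for low, high in ((0, 1), (1, 0)):
            errors = sum((low if x <= threshold else high) != y for x, y in sample)
            if best is None or errors < best[0]:
                best = (errors, threshold, low, high)
    _, threshold, low, high = best
    return lambda x: low if x <= threshold else high


def bag(data, n_models, seed=0):
    """Train n_models stumps, each on its own bootstrap replicate."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, rng)) for _ in range(n_models)]


def predict(ensemble, x):
    """Combine the members' predictions by unweighted majority vote."""
    votes = Counter(model(x) for model in ensemble)
    return votes.most_common(1)[0][0]


# Illustrative data: (feature, label) pairs, separable near 0.5.
DATA = [(0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (0.6, 1), (0.7, 1), (0.8, 1), (0.9, 1)]

ensemble = bag(DATA, n_models=25)
print(predict(ensemble, 0.05), predict(ensemble, 0.95))
```

The randomization approach the abstract contrasts with this would instead keep the training data fixed and inject randomness into the learner's internal choices, for example which candidate split it selects; that variant is not shown here.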

PAGES

139-157


Identifiers

URI

http://scigraph.springernature.com/pub.10.1023/a:1007607513941

DOI

http://dx.doi.org/10.1023/a:1007607513941

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1041829946



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Oregon State University", 
          "id": "https://www.grid.ac/institutes/grid.4391.f", 
          "name": [
            "Department of Computer Science, Oregon State University, 97331, Corvallis, OR, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Dietterich", 
        "givenName": "Thomas G.", 
        "id": "sg:person.01324347170.02", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01324347170.02"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/bf00058655", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1002929950", 
          "https://doi.org/10.1007/bf00058655"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1007515423169", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017116781", 
          "https://doi.org/10.1023/a:1007515423169"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00058611", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045654538", 
          "https://doi.org/10.1007/bf00058611"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1162/089976698300017197", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1053132543"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1142/s021821309700027x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062965255"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2000-08", 
    "datePublishedReg": "2000-08-01", 
    "description": "Bagging and boosting are methods that generate a diverse ensemble of classifiers by manipulating the training data given to a \u201cbase\u201d learning algorithm. Breiman has pointed out that they rely for their effectiveness on the instability of the base learning algorithm. An alternative approach to generating an ensemble is to randomize the internal decisions made by the base algorithm. This general approach has been studied previously by Ali and Pazzani and by Dietterich and Kong. This paper compares the effectiveness of randomization, bagging, and boosting for improving the performance of the decision-tree algorithm C4.5. The experiments show that in situations with little or no classification noise, randomization is competitive with (and perhaps slightly superior to) bagging but not as accurate as boosting. In situations with substantial classification noise, bagging is much better than boosting, and sometimes better than randomization.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1023/a:1007607513941", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1125588", 
        "issn": [
          "0885-6125", 
          "1573-0565"
        ], 
        "name": "Machine Learning", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "2", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "40"
      }
    ], 
    "name": "An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization", 
    "pagination": "139-157", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "c7423ca013885e8ee352a31da1083e871d2a45aaf5139c4637bd30c9525cebb6"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1023/a:1007607513941"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1041829946"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1023/a:1007607513941", 
      "https://app.dimensions.ai/details/publication/pub.1041829946"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T19:06", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8678_00000500.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1023/A:1007607513941"
  }
]
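
Because the record above is ordinary JSON, a standard JSON parser is enough to pull out the bibliographic fields. The following Python sketch assumes the record is available as a string; `RECORD` is an abbreviated copy holding only the fields used here, and `summarize` is a hypothetical helper name, not part of any SciGraph tooling.

```python
import json

# Abbreviated copy of the record above: only the fields used below are kept.
RECORD = """
[
  {
    "id": "sg:pub.10.1023/a:1007607513941",
    "name": "An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization",
    "pagination": "139-157",
    "isPartOf": [
      {"id": "sg:journal.1125588", "name": "Machine Learning", "type": "Periodical"},
      {"issueNumber": "2", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "40"}
    ],
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.1023/a:1007607513941"]}
    ]
  }
]
"""


def summarize(record_text):
    """Return a flat dict of bibliographic fields from a SciGraph-style record."""
    article = json.loads(record_text)[0]
    # isPartOf mixes journal, issue, and volume; index the entries by their type.
    parts = {part["type"]: part for part in article["isPartOf"]}
    # productId is a list of name/value pairs; each value is a one-element list.
    ids = {item["name"]: item["value"][0] for item in article["productId"]}
    return {
        "title": article["name"],
        "journal": parts["Periodical"]["name"],
        "volume": parts["PublicationVolume"]["volumeNumber"],
        "issue": parts["PublicationIssue"]["issueNumber"],
        "pages": article["pagination"],
        "doi": ids["doi"],
    }


summary = summarize(RECORD)
print(summary)
```

Indexing `isPartOf` by `type` avoids relying on the order of the journal, issue, and volume entries, which the record does not guarantee.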
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data that is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1023/a:1007607513941'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1023/a:1007607513941'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1023/a:1007607513941'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1023/a:1007607513941'
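
The four curl commands above differ only in their `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The names `ACCEPT`, `build_request`, and `fetch` are hypothetical, and whether the endpoint still answers these requests is not verified here.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/pub.10.1023/a:1007607513941"

# Media types taken from the curl commands above; the short keys are assumptions.
ACCEPT = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Build (but do not send) a GET request asking for the given RDF format."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=BASE_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_request("turtle")
print(request.full_url, request.get_header("Accept"))
```

Separating `build_request` from `fetch` keeps the header logic checkable without touching the network; calling `fetch("json-ld")` would perform the equivalent of the first curl command.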


 

This table displays all metadata directly associated with this object as RDF triples.

79 TRIPLES      21 PREDICATES      32 URIs      19 LITERALS      7 BLANK NODES

Row Subject Predicate Object (a row that omits the subject or the predicate repeats the one from the row above)
1 sg:pub.10.1023/a:1007607513941 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N02ed8448de9b4e979e6d97164758de48
4 schema:citation sg:pub.10.1007/bf00058611
5 sg:pub.10.1007/bf00058655
6 sg:pub.10.1023/a:1007515423169
7 https://doi.org/10.1142/s021821309700027x
8 https://doi.org/10.1162/089976698300017197
9 schema:datePublished 2000-08
10 schema:datePublishedReg 2000-08-01
11 schema:description Bagging and boosting are methods that generate a diverse ensemble of classifiers by manipulating the training data given to a “base” learning algorithm. Breiman has pointed out that they rely for their effectiveness on the instability of the base learning algorithm. An alternative approach to generating an ensemble is to randomize the internal decisions made by the base algorithm. This general approach has been studied previously by Ali and Pazzani and by Dietterich and Kong. This paper compares the effectiveness of randomization, bagging, and boosting for improving the performance of the decision-tree algorithm C4.5. The experiments show that in situations with little or no classification noise, randomization is competitive with (and perhaps slightly superior to) bagging but not as accurate as boosting. In situations with substantial classification noise, bagging is much better than boosting, and sometimes better than randomization.
12 schema:genre research_article
13 schema:inLanguage en
14 schema:isAccessibleForFree true
15 schema:isPartOf N2dec0fd0fd11429a81efa65760b8bb77
16 N3b1aeaabe9164c0284e290a9233a110e
17 sg:journal.1125588
18 schema:name An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization
19 schema:pagination 139-157
20 schema:productId N2d93e4b82598475b8fc145f709dfd6a5
21 Ncccbe65dd0bc4908b1ec9a528d410a22
22 Nfda69ea6490248ee84f704139ddba29a
23 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041829946
24 https://doi.org/10.1023/a:1007607513941
25 schema:sdDatePublished 2019-04-10T19:06
26 schema:sdLicense https://scigraph.springernature.com/explorer/license/
27 schema:sdPublisher Nbbd319fd0e7240f4a571a0804bf445f8
28 schema:url http://link.springer.com/10.1023/A:1007607513941
29 sgo:license sg:explorer/license/
30 sgo:sdDataset articles
31 rdf:type schema:ScholarlyArticle
32 N02ed8448de9b4e979e6d97164758de48 rdf:first sg:person.01324347170.02
33 rdf:rest rdf:nil
34 N2d93e4b82598475b8fc145f709dfd6a5 schema:name dimensions_id
35 schema:value pub.1041829946
36 rdf:type schema:PropertyValue
37 N2dec0fd0fd11429a81efa65760b8bb77 schema:issueNumber 2
38 rdf:type schema:PublicationIssue
39 N3b1aeaabe9164c0284e290a9233a110e schema:volumeNumber 40
40 rdf:type schema:PublicationVolume
41 Nbbd319fd0e7240f4a571a0804bf445f8 schema:name Springer Nature - SN SciGraph project
42 rdf:type schema:Organization
43 Ncccbe65dd0bc4908b1ec9a528d410a22 schema:name readcube_id
44 schema:value c7423ca013885e8ee352a31da1083e871d2a45aaf5139c4637bd30c9525cebb6
45 rdf:type schema:PropertyValue
46 Nfda69ea6490248ee84f704139ddba29a schema:name doi
47 schema:value 10.1023/a:1007607513941
48 rdf:type schema:PropertyValue
49 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
50 schema:name Information and Computing Sciences
51 rdf:type schema:DefinedTerm
52 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
53 schema:name Artificial Intelligence and Image Processing
54 rdf:type schema:DefinedTerm
55 sg:journal.1125588 schema:issn 0885-6125
56 1573-0565
57 schema:name Machine Learning
58 rdf:type schema:Periodical
59 sg:person.01324347170.02 schema:affiliation https://www.grid.ac/institutes/grid.4391.f
60 schema:familyName Dietterich
61 schema:givenName Thomas G.
62 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01324347170.02
63 rdf:type schema:Person
64 sg:pub.10.1007/bf00058611 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045654538
65 https://doi.org/10.1007/bf00058611
66 rdf:type schema:CreativeWork
67 sg:pub.10.1007/bf00058655 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002929950
68 https://doi.org/10.1007/bf00058655
69 rdf:type schema:CreativeWork
70 sg:pub.10.1023/a:1007515423169 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017116781
71 https://doi.org/10.1023/a:1007515423169
72 rdf:type schema:CreativeWork
73 https://doi.org/10.1142/s021821309700027x schema:sameAs https://app.dimensions.ai/details/publication/pub.1062965255
74 rdf:type schema:CreativeWork
75 https://doi.org/10.1162/089976698300017197 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053132543
76 rdf:type schema:CreativeWork
77 https://www.grid.ac/institutes/grid.4391.f schema:alternateName Oregon State University
78 schema:name Department of Computer Science, Oregon State University, 97331, Corvallis, OR, USA
79 rdf:type schema:Organization
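
Rows in the table above that leave out the subject, or both the subject and the predicate, carry them over from the preceding row, which is how 79 rows yield 79 triples. A minimal Python sketch of that fill-down step, using the first eight rows as already-split tuples; `fill_down` is a hypothetical helper, and splitting the raw text into columns is assumed to have been done beforehand.

```python
def fill_down(rows):
    """Expand abbreviated rows into full (subject, predicate, object) triples."""
    subject = predicate = None
    triples = []
    for row in rows:
        if len(row) == 3:      # full row: new subject and predicate
            subject, predicate, obj = row
        elif len(row) == 2:    # subject omitted: reuse the previous subject
            predicate, obj = row
        else:                  # only the object given: reuse subject and predicate
            (obj,) = row
        triples.append((subject, predicate, obj))
    return triples


# Rows 1 to 8 of the table, with the row numbers dropped.
ROWS = [
    ("sg:pub.10.1023/a:1007607513941", "schema:about", "anzsrc-for:08"),
    ("anzsrc-for:0801",),
    ("schema:author", "N02ed8448de9b4e979e6d97164758de48"),
    ("schema:citation", "sg:pub.10.1007/bf00058611"),
    ("sg:pub.10.1007/bf00058655",),
    ("sg:pub.10.1023/a:1007515423169",),
    ("https://doi.org/10.1142/s021821309700027x",),
    ("https://doi.org/10.1162/089976698300017197",),
]

triples = fill_down(ROWS)
for triple in triples:
    print(*triple)
```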
 



