Bagging predictors


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

1996-08

AUTHORS

Leo Breiman

ABSTRACT

Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.
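The procedure the abstract describes can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the base learner `fit_stump` (a one-split regression stump on one-dimensional data) is a hypothetical stand-in for the classification and regression trees used in the paper, and the default of 25 replicates and the fixed seed are arbitrary assumptions.

```python
import random
from collections import Counter
from statistics import mean


def bootstrap_replicate(learning_set, rng):
    """Draw len(learning_set) cases from the learning set, with replacement."""
    n = len(learning_set)
    return [learning_set[rng.randrange(n)] for _ in range(n)]


def fit_stump(data):
    """Hypothetical base learner: a regression stump on (x, y) pairs.

    It picks the single threshold that minimises the squared error and
    predicts the mean of y on each side of that threshold.
    """
    xs = sorted({x for x, _ in data})
    overall = mean(y for _, y in data)
    best = (float("inf"), None, overall, overall)
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in data if x <= t]
        right = [y for x, y in data if x > t]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if sse < best[0]:
            best = (sse, t, ml, mr)
    _, t, ml, mr = best
    if t is None:  # the replicate held a single distinct x value
        return lambda x: overall
    return lambda x: ml if x <= t else mr


def bag(learning_set, fit, n_replicates=25, seed=0):
    """Fit one predictor per bootstrap replicate of the learning set."""
    rng = random.Random(seed)
    return [fit(bootstrap_replicate(learning_set, rng)) for _ in range(n_replicates)]


def predict_numeric(predictors, x):
    """Aggregate a numerical outcome by averaging over the versions."""
    return mean(p(x) for p in predictors)


def predict_class(predictors, x):
    """Aggregate a class prediction by plurality vote over the versions."""
    return Counter(p(x) for p in predictors).most_common(1)[0][0]


# Toy learning set (an assumption for illustration): a step function in x.
learning_set = [(x, 0.0 if x < 5 else 1.0) for x in range(10)]
predictors = bag(learning_set, fit_stump)
```

Calling `predict_numeric(predictors, 4.5)` then averages the 25 stump outputs at that point. The stump is a reasonable toy choice because its split point moves when the learning set is perturbed, which is the kind of instability the abstract identifies as the condition under which bagging can help.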

PAGES

123-140

References to SciGraph publications

  • 1993. Learning classification trees in ARTIFICIAL INTELLIGENCE FRONTIERS IN STATISTICS
  • 1993. An Introduction to the Bootstrap
  • Journal

    TITLE

    Machine Learning

    ISSUE

    2

    VOLUME

    24

    Related Patents

  • Detecting Return-Oriented Programming Payloads By Evaluating Data For A Gadget Address Space Address And Determining Whether Operations Associated With Instructions Beginning At The Address Indicate A Return-Oriented Programming Payload
  • System And Software For Creation And Modification Of Software
  • Method And System To Safely Guide Interventions In Procedures The Substrate Whereof Is Neuronal Plasticity
  • Traffic Simulation To Identify Malicious Activity
  • Methods, Media, And Systems For Detecting An Anomalous Sequence Of Function Calls
  • Method And System For Network-Based Detecting Of Malware From Behavioral Clustering
  • Performing Vocabulary-Based Visual Search Using Multi-Resolution Feature Descriptors
  • Systems, Methods, And Media For Generating Sanitized Data, Sanitizing Anomaly Detection Models, And/Or Generating Sanitized Anomaly Detection Models
  • System And Method For Detecting Central Pulmonary Embolism In Ct Pulmonary Angiography Images
  • Multivariate Responses Using Classification And Regression Trees Systems And Methods
  • Single-Pass Distributed Sampling From Block-Partitioned Matrices
  • Systems And Methods For Distributed Rules Processing
  • Selective Sharing For Collaborative Application Usage
  • Ensemble Learning System And Method
  • Generating Training Documents
  • Historical Analysis To Identify Malicious Activity
  • System And Method For Pain Monitoring Using A Multidimensional Analysis Of Physiological Signals
  • Method And System For Detecting Dga-Based Malware
  • Method And System For Meshing Human And Computer Competencies For Object Categorization
  • Methods And Apparatus For User Interface Optimization
  • An Improved Stacking Schema For Classification Tasks
  • System And Method For Image Sequence Processing
  • Data Mining To Identify Malicious Activity
  • Measuring, Categorizing, And/Or Mitigating Malware Distribution Paths
  • Method And System For Detecting Malicious And/Or Botnet-Related Domain Names
  • System And Method Of Designing Models In A Feedback Loop
  • Methods And Systems For Network Flow Analysis
  • Connecting Graphical Shapes Using Gestures
  • Binary Tree For Complex Supervised Learning
  • Method And Apparatus For User Interface Non-Conformance Detection And Correction
  • Method For Screening And Treating Patients At Risk Of Medical Disorders
  • System And Method For Updating Or Modifying An Application Without Manual Coding
  • Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/bf00058655

    DOI

    http://dx.doi.org/10.1007/bf00058655

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1002929950



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Statistics Department, University of California, 94720, Berkeley, CA", 
              "id": "http://www.grid.ac/institutes/grid.47840.3f", 
              "name": [
                "Statistics Department, University of California, 94720, Berkeley, CA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Breiman", 
            "givenName": "Leo", 
            "id": "sg:person.01275565034.02", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01275565034.02"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/978-1-4899-4537-2_15", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1089746414", 
              "https://doi.org/10.1007/978-1-4899-4537-2_15"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-1-4899-4541-9", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1109705929", 
              "https://doi.org/10.1007/978-1-4899-4541-9"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "1996-08", 
        "datePublishedReg": "1996-08-01", 
        "description": "Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.", 
        "genre": "article", 
        "id": "sg:pub.10.1007/bf00058655", 
        "inLanguage": "en", 
        "isAccessibleForFree": true, 
        "isPartOf": [
          {
            "id": "sg:journal.1125588", 
            "issn": [
              "0885-6125", 
              "1573-0565"
            ], 
            "name": "Machine Learning", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "2", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "24"
          }
        ], 
        "keywords": [
          "multiple versions", 
          "bagging predictors", 
          "learning set", 
          "subset selection", 
          "simulated data sets", 
          "data sets", 
          "regression trees", 
          "new learning sets", 
          "bagging", 
          "prediction method", 
          "set", 
          "substantial gains", 
          "plurality vote", 
          "accuracy", 
          "version", 
          "learning", 
          "classification", 
          "vital element", 
          "method", 
          "trees", 
          "selection", 
          "vote", 
          "linear regression", 
          "class", 
          "bootstrap replicates", 
          "gain", 
          "elements", 
          "regression", 
          "numerical outcomes", 
          "average", 
          "test", 
          "predictors", 
          "changes", 
          "outcomes", 
          "instability", 
          "significant changes", 
          "replicates"
        ], 
        "name": "Bagging predictors", 
        "pagination": "123-140", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1002929950"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/bf00058655"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/bf00058655", 
          "https://app.dimensions.ai/details/publication/pub.1002929950"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-05-20T07:20", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/article/article_286.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1007/bf00058655"
      }
    ]
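As an illustration of how the record above is laid out, the following Python sketch assembles a one-line citation from it. The embedded JSON is an abridged copy of the record, keeping only the fields used here; the helper names `part_of` and `format_citation` and the citation layout are assumptions made for this example, not part of SciGraph.

```python
import json

# Abridged copy of the JSON-LD record above (only the fields used below).
RECORD_JSON = """
[
  {
    "author": [{"familyName": "Breiman", "givenName": "Leo", "type": "Person"}],
    "datePublished": "1996-08",
    "isPartOf": [
      {"name": "Machine Learning", "type": "Periodical"},
      {"issueNumber": "2", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "24"}
    ],
    "name": "Bagging predictors",
    "pagination": "123-140",
    "url": "https://doi.org/10.1007/bf00058655"
  }
]
"""


def part_of(record, part_type, key):
    """Return `key` from the first isPartOf entry of the given type, or None."""
    for part in record.get("isPartOf", []):
        if part.get("type") == part_type:
            return part.get(key)
    return None


def format_citation(record):
    """Build a citation string from the journal, volume, issue and page fields."""
    authors = "; ".join(
        f'{a["familyName"]}, {a["givenName"]}' for a in record["author"]
    )
    year = record["datePublished"][:4]
    journal = part_of(record, "Periodical", "name")
    volume = part_of(record, "PublicationVolume", "volumeNumber")
    issue = part_of(record, "PublicationIssue", "issueNumber")
    return (
        f'{authors} ({year}). {record["name"]}. '
        f'{journal} {volume}({issue}), {record["pagination"]}. {record["url"]}'
    )


# The top level of the record is a list holding a single article object.
record = json.loads(RECORD_JSON)[0]
citation = format_citation(record)
print(citation)
```

Note that the journal name, issue and volume sit in three separate `isPartOf` entries distinguished by their `type`, which is why the sketch looks them up by type rather than by position.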
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/bf00058655'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/bf00058655'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/bf00058655'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/bf00058655'
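The same content negotiation can be sketched in Python with the standard library. The URL and the four media types are taken from the curl commands above; the short format labels, the function names and the timeout are assumptions made for this example, and the sketch assumes the endpoint still responds as described.

```python
import urllib.request

# Record URL and media types as used in the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/bf00058655"

# The dictionary keys are shorthand labels chosen for this sketch.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request whose Accept header selects the RDF serialisation."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, `fetch("turtle")` would request the Turtle serialisation. Only the `Accept` header changes between formats, so the URL stays the same for all four.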


     

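    This table lists all metadata directly associated with this object as RDF triples. Each row starts with its row number; a row that omits the subject or the predicate repeats the one from the row above.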

    103 TRIPLES      22 PREDICATES      65 URIs      55 LITERALS      6 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/bf00058655 schema:about anzsrc-for:08
    2 anzsrc-for:0801
    3 schema:author N1660a4f3e5ff4ffeba15ed79684faf61
    4 schema:citation sg:pub.10.1007/978-1-4899-4537-2_15
    5 sg:pub.10.1007/978-1-4899-4541-9
    6 schema:datePublished 1996-08
    7 schema:datePublishedReg 1996-08-01
    8 schema:description Bagging predictors is a method for generating multiple versions of a predictor and using these to get an aggregated predictor. The aggregation averages over the versions when predicting a numerical outcome and does a plurality vote when predicting a class. The multiple versions are formed by making bootstrap replicates of the learning set and using these as new learning sets. Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy. The vital element is the instability of the prediction method. If perturbing the learning set can cause significant changes in the predictor constructed, then bagging can improve accuracy.
    9 schema:genre article
    10 schema:inLanguage en
    11 schema:isAccessibleForFree true
    12 schema:isPartOf N618fd9ad02844f8a978b5c7320550645
    13 N6bf4d5e331684dd48ac35aa46c2f9b8a
    14 sg:journal.1125588
    15 schema:keywords accuracy
    16 average
    17 bagging
    18 bagging predictors
    19 bootstrap replicates
    20 changes
    21 class
    22 classification
    23 data sets
    24 elements
    25 gain
    26 instability
    27 learning
    28 learning set
    29 linear regression
    30 method
    31 multiple versions
    32 new learning sets
    33 numerical outcomes
    34 outcomes
    35 plurality vote
    36 prediction method
    37 predictors
    38 regression
    39 regression trees
    40 replicates
    41 selection
    42 set
    43 significant changes
    44 simulated data sets
    45 subset selection
    46 substantial gains
    47 test
    48 trees
    49 version
    50 vital element
    51 vote
    52 schema:name Bagging predictors
    53 schema:pagination 123-140
    54 schema:productId N666ee9c56fdd4f23afdfbdf7fbc68dbb
    55 Nb594fdaa991645b08ab783d010bb0fac
    56 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002929950
    57 https://doi.org/10.1007/bf00058655
    58 schema:sdDatePublished 2022-05-20T07:20
    59 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    60 schema:sdPublisher N96283e70afc040788ec64055baac9bb2
    61 schema:url https://doi.org/10.1007/bf00058655
    62 sgo:license sg:explorer/license/
    63 sgo:sdDataset articles
    64 rdf:type schema:ScholarlyArticle
    65 N1660a4f3e5ff4ffeba15ed79684faf61 rdf:first sg:person.01275565034.02
    66 rdf:rest rdf:nil
    67 N618fd9ad02844f8a978b5c7320550645 schema:volumeNumber 24
    68 rdf:type schema:PublicationVolume
    69 N666ee9c56fdd4f23afdfbdf7fbc68dbb schema:name doi
    70 schema:value 10.1007/bf00058655
    71 rdf:type schema:PropertyValue
    72 N6bf4d5e331684dd48ac35aa46c2f9b8a schema:issueNumber 2
    73 rdf:type schema:PublicationIssue
    74 N96283e70afc040788ec64055baac9bb2 schema:name Springer Nature - SN SciGraph project
    75 rdf:type schema:Organization
    76 Nb594fdaa991645b08ab783d010bb0fac schema:name dimensions_id
    77 schema:value pub.1002929950
    78 rdf:type schema:PropertyValue
    79 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    80 schema:name Information and Computing Sciences
    81 rdf:type schema:DefinedTerm
    82 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    83 schema:name Artificial Intelligence and Image Processing
    84 rdf:type schema:DefinedTerm
    85 sg:journal.1125588 schema:issn 0885-6125
    86 1573-0565
    87 schema:name Machine Learning
    88 schema:publisher Springer Nature
    89 rdf:type schema:Periodical
    90 sg:person.01275565034.02 schema:affiliation grid-institutes:grid.47840.3f
    91 schema:familyName Breiman
    92 schema:givenName Leo
    93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01275565034.02
    94 rdf:type schema:Person
    95 sg:pub.10.1007/978-1-4899-4537-2_15 schema:sameAs https://app.dimensions.ai/details/publication/pub.1089746414
    96 https://doi.org/10.1007/978-1-4899-4537-2_15
    97 rdf:type schema:CreativeWork
    98 sg:pub.10.1007/978-1-4899-4541-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1109705929
    99 https://doi.org/10.1007/978-1-4899-4541-9
    100 rdf:type schema:CreativeWork
    101 grid-institutes:grid.47840.3f schema:alternateName Statistics Department, University of California, 94720, Berkeley, CA
    102 schema:name Statistics Department, University of California, 94720, Berkeley, CA
    103 rdf:type schema:Organization
     



