An Algorithm for Positive-Breakdown Regression Based on Concentration Steps View Full Text


Ontology type: schema:Chapter     


Chapter Info

DATE

2000

AUTHORS

Peter J. Rousseeuw , Katrien Van Driessen

ABSTRACT

Positive-breakdown regression is able to extract previously unknown patterns or substructures from the data. Here we will focus on least trimmed squares (LTS) regression, which is based on the subset of h cases (out of n) whose least squares fit possesses the smallest sum of squared residuals. The coverage h may be set between n/2 and n. The computation time of existing LTS algorithms grows too much with the size of the data set. In this paper we develop a new algorithm called FAST-LTS. The basic idea is the ‘concentration step’, which is based on a new inequality involving order statistics and sums of squared residuals. Further reductions of the computation time are obtained by techniques which we call ‘selective iteration’ and ‘nested extensions’. We also use an intercept adjustment technique to improve the precision. For small data sets FAST-LTS typically finds the exact LTS, whereas for larger data sets it gives more accurate results than existing algorithms for LTS and is faster by orders of magnitude. This allows us to apply FAST-LTS to large datasets. More... »

PAGES

335-346

References to SciGraph publications

  • 1985. Multivariate Estimation with High Breakdown Point in MATHEMATICAL STATISTICS AND APPLICATIONS
  • 1998-09. Extensions to the k-Means Algorithm for Clustering Large Data Sets with Categorical Values in DATA MINING AND KNOWLEDGE DISCOVERY
  • 1997-06. BIRCH: A New Data Clustering Algorithm and Its Applications in DATA MINING AND KNOWLEDGE DISCOVERY
  • Book

    TITLE

    Data Analysis

    ISBN

    978-3-540-67731-4
    978-3-642-58250-9

    Author Affiliations

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-3-642-58250-9_27

    DOI

    http://dx.doi.org/10.1007/978-3-642-58250-9_27

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1043348595


    Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
    Incoming Citations Browse incoming citations for this publication using opencitations.net

    JSON-LD is the canonical representation for SciGraph data.

    TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "University of Antwerp", 
              "id": "https://www.grid.ac/institutes/grid.5284.b", 
              "name": [
                "Department of Mathematics and Computer Science, University of Antwerp (UIA), B-2610, Wilrijk, Belgium"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Rousseeuw", 
            "givenName": "Peter J.", 
            "id": "sg:person.0775337371.63", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0775337371.63"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of Antwerp", 
              "id": "https://www.grid.ac/institutes/grid.5284.b", 
              "name": [
                "Faculty of Applied Economics, University of Antwerp (UFSIA), B-2000, Antwerp, Belgium"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Van Driessen", 
            "givenName": "Katrien", 
            "id": "sg:person.016127315362.74", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016127315362.74"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1023/a:1009783824328", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000292060", 
              "https://doi.org/10.1023/a:1009783824328"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0167-9473(92)00070-8", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1003170913"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1023/a:1009769707641", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1027035492", 
              "https://doi.org/10.1023/a:1009769707641"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/b978-0-444-87877-9.50039-x", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1043141706"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/s0167-9473(98)00082-6", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1049486548"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1080/00401706.1999.10485670", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1058287776"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1080/01621459.1984.10477105", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1058302950"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1080/01621459.1990.10474920", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1058303860"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1080/01621459.1994.10476821", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1058304685"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-94-009-5438-0_20", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1089521729", 
              "https://doi.org/10.1007/978-94-009-5438-0_20"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-94-009-5438-0_20", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1089521729", 
              "https://doi.org/10.1007/978-94-009-5438-0_20"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1107763504", 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1002/0471725382", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1107763504"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1002/9780470316801", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1109496256"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1109496256", 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2000", 
        "datePublishedReg": "2000-01-01", 
        "description": "Positive-breakdown regression is able to extract previously unknown patterns or substructures from the data. Here we will focus on least trimmed squares (LTS) regression, which is based on the subset of h cases (out of n) whose least squares fit possesses the smallest sum of squared residuals. The coverage h may be set between n/2 and n. The computation time of existing LTS algorithms grows too much with the size of the data set. In this paper we develop a new algorithm called FAST-LTS. The basic idea is the \u2018concentration step\u2019, which is based on a new inequality involving order statistics and sums of squared residuals. Further reductions of the computation time are obtained by techniques which we call \u2018selective iteration\u2019 and \u2018nested extensions\u2019. We also use an intercept adjustment technique to improve the precision. For small data sets FAST-LTS typically finds the exact LTS, whereas for larger data sets it gives more accurate results than existing algorithms for LTS and is faster by orders of magnitude. This allows us to apply FAST-LTS to large datasets.", 
        "editor": [
          {
            "familyName": "Gaul", 
            "givenName": "Wolfgang", 
            "type": "Person"
          }, 
          {
            "familyName": "Opitz", 
            "givenName": "Otto", 
            "type": "Person"
          }, 
          {
            "familyName": "Schader", 
            "givenName": "Martin", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-3-642-58250-9_27", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": false, 
        "isPartOf": {
          "isbn": [
            "978-3-540-67731-4", 
            "978-3-642-58250-9"
          ], 
          "name": "Data Analysis", 
          "type": "Book"
        }, 
        "name": "An Algorithm for Positive-Breakdown Regression Based on Concentration Steps", 
        "pagination": "335-346", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1043348595"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-3-642-58250-9_27"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "ce7c88b22ef519d8ff4addd69b5ce7fe9a370423ed1a4c6c463364304b498a0e"
            ]
          }
        ], 
        "publisher": {
          "location": "Berlin, Heidelberg", 
          "name": "Springer Berlin Heidelberg", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-3-642-58250-9_27", 
          "https://app.dimensions.ai/details/publication/pub.1043348595"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-16T09:46", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000375_0000000375/records_91459_00000000.jsonl", 
        "type": "Chapter", 
        "url": "https://link.springer.com/10.1007%2F978-3-642-58250-9_27"
      }
    ]
     

    Download the RDF metadata as:  json-ld nt turtle xml License info

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-58250-9_27'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-58250-9_27'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-58250-9_27'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-58250-9_27'


     

    This table displays all metadata directly associated to this object as RDF triples.

    126 TRIPLES      23 PREDICATES      41 URIs      20 LITERALS      8 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/978-3-642-58250-9_27 schema:about anzsrc-for:08
    2 anzsrc-for:0801
    3 schema:author N84ff27147683414680ea9044edfb2c66
    4 schema:citation sg:pub.10.1007/978-94-009-5438-0_20
    5 sg:pub.10.1023/a:1009769707641
    6 sg:pub.10.1023/a:1009783824328
    7 https://app.dimensions.ai/details/publication/pub.1107763504
    8 https://app.dimensions.ai/details/publication/pub.1109496256
    9 https://doi.org/10.1002/0471725382
    10 https://doi.org/10.1002/9780470316801
    11 https://doi.org/10.1016/0167-9473(92)00070-8
    12 https://doi.org/10.1016/b978-0-444-87877-9.50039-x
    13 https://doi.org/10.1016/s0167-9473(98)00082-6
    14 https://doi.org/10.1080/00401706.1999.10485670
    15 https://doi.org/10.1080/01621459.1984.10477105
    16 https://doi.org/10.1080/01621459.1990.10474920
    17 https://doi.org/10.1080/01621459.1994.10476821
    18 schema:datePublished 2000
    19 schema:datePublishedReg 2000-01-01
    20 schema:description Positive-breakdown regression is able to extract previously unknown patterns or substructures from the data. Here we will focus on least trimmed squares (LTS) regression, which is based on the subset of h cases (out of n) whose least squares fit possesses the smallest sum of squared residuals. The coverage h may be set between n/2 and n. The computation time of existing LTS algorithms grows too much with the size of the data set. In this paper we develop a new algorithm called FAST-LTS. The basic idea is the ‘concentration step’, which is based on a new inequality involving order statistics and sums of squared residuals. Further reductions of the computation time are obtained by techniques which we call ‘selective iteration’ and ‘nested extensions’. We also use an intercept adjustment technique to improve the precision. For small data sets FAST-LTS typically finds the exact LTS, whereas for larger data sets it gives more accurate results than existing algorithms for LTS and is faster by orders of magnitude. This allows us to apply FAST-LTS to large datasets.
    21 schema:editor Nfe54e606a26746798de15ba28c557b2a
    22 schema:genre chapter
    23 schema:inLanguage en
    24 schema:isAccessibleForFree false
    25 schema:isPartOf N156ce43da5da4c1d87b9d2670b475cc3
    26 schema:name An Algorithm for Positive-Breakdown Regression Based on Concentration Steps
    27 schema:pagination 335-346
    28 schema:productId N3ac1d43aac9748be9fe639411c377245
    29 N7307ae82d7754ba7a72d56faf5a63053
    30 Ncec57b03fb1b4fa49a1924bb5a149452
    31 schema:publisher N366f55cdd6474599b76d7af2eece281a
    32 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043348595
    33 https://doi.org/10.1007/978-3-642-58250-9_27
    34 schema:sdDatePublished 2019-04-16T09:46
    35 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    36 schema:sdPublisher N51dfe06a0f7c4135bc4fbdad84790839
    37 schema:url https://link.springer.com/10.1007%2F978-3-642-58250-9_27
    38 sgo:license sg:explorer/license/
    39 sgo:sdDataset chapters
    40 rdf:type schema:Chapter
    41 N156ce43da5da4c1d87b9d2670b475cc3 schema:isbn 978-3-540-67731-4
    42 978-3-642-58250-9
    43 schema:name Data Analysis
    44 rdf:type schema:Book
    45 N1fadf523a81749d1a76e8d91d1d20653 rdf:first Ncddc25b7756c437796dbf2c63d949486
    46 rdf:rest Nda37a9ba77a94a0788eb7260741b377c
    47 N2cf18f046fd345e7820dd3cea0f612bf schema:familyName Gaul
    48 schema:givenName Wolfgang
    49 rdf:type schema:Person
    50 N2e29d43621c24755bb76ac1cc3de90b1 rdf:first sg:person.016127315362.74
    51 rdf:rest rdf:nil
    52 N366f55cdd6474599b76d7af2eece281a schema:location Berlin, Heidelberg
    53 schema:name Springer Berlin Heidelberg
    54 rdf:type schema:Organisation
    55 N3ac1d43aac9748be9fe639411c377245 schema:name readcube_id
    56 schema:value ce7c88b22ef519d8ff4addd69b5ce7fe9a370423ed1a4c6c463364304b498a0e
    57 rdf:type schema:PropertyValue
    58 N51dfe06a0f7c4135bc4fbdad84790839 schema:name Springer Nature - SN SciGraph project
    59 rdf:type schema:Organization
    60 N7307ae82d7754ba7a72d56faf5a63053 schema:name dimensions_id
    61 schema:value pub.1043348595
    62 rdf:type schema:PropertyValue
    63 N7c5c94bb3f83456f9a32bb775e17fc76 schema:familyName Schader
    64 schema:givenName Martin
    65 rdf:type schema:Person
    66 N84ff27147683414680ea9044edfb2c66 rdf:first sg:person.0775337371.63
    67 rdf:rest N2e29d43621c24755bb76ac1cc3de90b1
    68 Ncddc25b7756c437796dbf2c63d949486 schema:familyName Opitz
    69 schema:givenName Otto
    70 rdf:type schema:Person
    71 Ncec57b03fb1b4fa49a1924bb5a149452 schema:name doi
    72 schema:value 10.1007/978-3-642-58250-9_27
    73 rdf:type schema:PropertyValue
    74 Nda37a9ba77a94a0788eb7260741b377c rdf:first N7c5c94bb3f83456f9a32bb775e17fc76
    75 rdf:rest rdf:nil
    76 Nfe54e606a26746798de15ba28c557b2a rdf:first N2cf18f046fd345e7820dd3cea0f612bf
    77 rdf:rest N1fadf523a81749d1a76e8d91d1d20653
    78 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    79 schema:name Information and Computing Sciences
    80 rdf:type schema:DefinedTerm
    81 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    82 schema:name Artificial Intelligence and Image Processing
    83 rdf:type schema:DefinedTerm
    84 sg:person.016127315362.74 schema:affiliation https://www.grid.ac/institutes/grid.5284.b
    85 schema:familyName Van Driessen
    86 schema:givenName Katrien
    87 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016127315362.74
    88 rdf:type schema:Person
    89 sg:person.0775337371.63 schema:affiliation https://www.grid.ac/institutes/grid.5284.b
    90 schema:familyName Rousseeuw
    91 schema:givenName Peter J.
    92 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0775337371.63
    93 rdf:type schema:Person
    94 sg:pub.10.1007/978-94-009-5438-0_20 schema:sameAs https://app.dimensions.ai/details/publication/pub.1089521729
    95 https://doi.org/10.1007/978-94-009-5438-0_20
    96 rdf:type schema:CreativeWork
    97 sg:pub.10.1023/a:1009769707641 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027035492
    98 https://doi.org/10.1023/a:1009769707641
    99 rdf:type schema:CreativeWork
    100 sg:pub.10.1023/a:1009783824328 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000292060
    101 https://doi.org/10.1023/a:1009783824328
    102 rdf:type schema:CreativeWork
    103 https://app.dimensions.ai/details/publication/pub.1107763504 schema:CreativeWork
    104 https://app.dimensions.ai/details/publication/pub.1109496256 schema:CreativeWork
    105 https://doi.org/10.1002/0471725382 schema:sameAs https://app.dimensions.ai/details/publication/pub.1107763504
    106 rdf:type schema:CreativeWork
    107 https://doi.org/10.1002/9780470316801 schema:sameAs https://app.dimensions.ai/details/publication/pub.1109496256
    108 rdf:type schema:CreativeWork
    109 https://doi.org/10.1016/0167-9473(92)00070-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003170913
    110 rdf:type schema:CreativeWork
    111 https://doi.org/10.1016/b978-0-444-87877-9.50039-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1043141706
    112 rdf:type schema:CreativeWork
    113 https://doi.org/10.1016/s0167-9473(98)00082-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049486548
    114 rdf:type schema:CreativeWork
    115 https://doi.org/10.1080/00401706.1999.10485670 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058287776
    116 rdf:type schema:CreativeWork
    117 https://doi.org/10.1080/01621459.1984.10477105 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058302950
    118 rdf:type schema:CreativeWork
    119 https://doi.org/10.1080/01621459.1990.10474920 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058303860
    120 rdf:type schema:CreativeWork
    121 https://doi.org/10.1080/01621459.1994.10476821 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058304685
    122 rdf:type schema:CreativeWork
    123 https://www.grid.ac/institutes/grid.5284.b schema:alternateName University of Antwerp
    124 schema:name Department of Mathematics and Computer Science, University of Antwerp (UIA), B-2610, Wilrijk, Belgium
    125 Faculty of Applied Economics, University of Antwerp (UFSIA), B-2000, Antwerp, Belgium
    126 rdf:type schema:Organization
     




    Preview window. Press ESC to close (or click here)


    ...