Outlier Detection under Interval Uncertainty: Algorithmic Solvability and Computational Complexity View Full Text


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2005-02

AUTHORS

Vladik Kreinovich, Luc Longpré, Praveen Patangay, Scott Ferson, Lev Ginzburg

ABSTRACT

In many application areas,it is important to detect outliers. The traditional engineering approach to outlier detection is that we start with some “normal” values x1, ...,xn, compute the sample average E, the sample standard variation σ, and then mark a value x as an outlier if x is outside the k0-sigma interval [E – k0 ⋅ σ, E + k0 ⋅ σ] (for some pre-selected parameter k0).In real life,we often have only interval ranges [\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}${\underline x}_i, {\bar x}_i $\end{document}] for the normal values x1, ...,xn. In this case,we only have intervals of possible values for the bounds \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$E - k_0 \cdot \sigma$\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$E + k_0 \cdot \sigma$\end{document}. We can therefore identify outliers as values that are outside all k0-sigma intervals.Once we identify a value as an outlier for a fixed k0, it is also desirable to find out to what degree this value is an outlier, i.e., what is the largest value k0 for which this value is an outlier.In this paper,we analyze the computational complexity of these outlier detection problems, provide efficient algorithms that solve some of these problems (under reasonable conditions), and list related open problems. More... »

PAGES

59-76

References to SciGraph publications

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s11155-005-5943-7

DOI

http://dx.doi.org/10.1007/s11155-005-5943-7

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1050829705


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0802", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Computation Theory and Mathematics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Computer Science Department, University of Texas at El Paso, 79968, El Paso, TX, USA", 
          "id": "http://www.grid.ac/institutes/grid.267324.6", 
          "name": [
            "Computer Science Department, University of Texas at El Paso, 79968, El Paso, TX, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Kreinovich", 
        "givenName": "Vladik", 
        "id": "sg:person.012602771355.27", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012602771355.27"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Computer Science Department, University of Texas at El Paso, 79968, El Paso, TX, USA", 
          "id": "http://www.grid.ac/institutes/grid.267324.6", 
          "name": [
            "Computer Science Department, University of Texas at El Paso, 79968, El Paso, TX, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Longpr\u00e9", 
        "givenName": "Luc", 
        "id": "sg:person.010474226755.58", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010474226755.58"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Computer Science Department, University of Texas at El Paso, 79968, El Paso, TX, USA", 
          "id": "http://www.grid.ac/institutes/grid.267324.6", 
          "name": [
            "Computer Science Department, University of Texas at El Paso, 79968, El Paso, TX, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Patangay", 
        "givenName": "Praveen", 
        "id": "sg:person.016401717555.50", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016401717555.50"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Applied Biomathematics, 100 North Country Road, Setauket11733, NY, USA", 
          "id": "http://www.grid.ac/institutes/grid.422751.7", 
          "name": [
            "Applied Biomathematics, 100 North Country Road, Setauket11733, NY, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ferson", 
        "givenName": "Scott", 
        "id": "sg:person.012115026643.10", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012115026643.10"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Applied Biomathematics, 100 North Country Road, Setauket11733, NY, USA", 
          "id": "http://www.grid.ac/institutes/grid.422751.7", 
          "name": [
            "Applied Biomathematics, 100 North Country Road, Setauket11733, NY, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ginzburg", 
        "givenName": "Lev", 
        "id": "sg:person.015102730643.26", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015102730643.26"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1023/a:1011415408313", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1027297358", 
          "https://doi.org/10.1023/a:1011415408313"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-540-24588-9_26", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1005640002", 
          "https://doi.org/10.1007/978-3-540-24588-9_26"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2005-02", 
    "datePublishedReg": "2005-02-01", 
    "description": "In many application areas,it is important to detect outliers. The traditional engineering approach to outlier detection is that we start with some \u201cnormal\u201d values x1, ...,xn, compute the sample average E, the sample standard variation \u03c3, and then mark a value x as an outlier if x is outside the k0-sigma interval [E \u2013 k0 \u22c5 \u03c3, E + k0 \u22c5 \u03c3] (for some pre-selected parameter k0).In real life,we often have only interval ranges [\\documentclass[12pt]{minimal}\n\t\t\t\t\\usepackage{amsmath}\n\t\t\t\t\\usepackage{wasysym}\n\t\t\t\t\\usepackage{amsfonts}\n\t\t\t\t\\usepackage{amssymb}\n\t\t\t\t\\usepackage{amsbsy}\n\t\t\t\t\\usepackage{mathrsfs}\n\t\t\t\t\\usepackage{upgreek}\n\t\t\t\t\\setlength{\\oddsidemargin}{-69pt}\n\t\t\t\t\\begin{document}${\\underline x}_i, {\\bar x}_i $\\end{document}] for the normal values x1, ...,xn. In this case,we only have intervals of possible values for the bounds \\documentclass[12pt]{minimal}\n\t\t\t\t\\usepackage{amsmath}\n\t\t\t\t\\usepackage{wasysym}\n\t\t\t\t\\usepackage{amsfonts}\n\t\t\t\t\\usepackage{amssymb}\n\t\t\t\t\\usepackage{amsbsy}\n\t\t\t\t\\usepackage{mathrsfs}\n\t\t\t\t\\usepackage{upgreek}\n\t\t\t\t\\setlength{\\oddsidemargin}{-69pt}\n\t\t\t\t\\begin{document}$E - k_0 \\cdot \\sigma$\\end{document} and \\documentclass[12pt]{minimal}\n\t\t\t\t\\usepackage{amsmath}\n\t\t\t\t\\usepackage{wasysym}\n\t\t\t\t\\usepackage{amsfonts}\n\t\t\t\t\\usepackage{amssymb}\n\t\t\t\t\\usepackage{amsbsy}\n\t\t\t\t\\usepackage{mathrsfs}\n\t\t\t\t\\usepackage{upgreek}\n\t\t\t\t\\setlength{\\oddsidemargin}{-69pt}\n\t\t\t\t\\begin{document}$E + k_0 \\cdot \\sigma$\\end{document}. We can therefore identify outliers as values that are outside all k0-sigma intervals.Once we identify a value as an outlier for a fixed k0, it is also desirable to find out to what degree this value is an outlier, i.e., what is the largest value k0 for which this value is an outlier.In this paper,we analyze the computational complexity of these outlier detection problems, provide efficient algorithms that solve some of these problems (under reasonable conditions), and list related open problems.", 
    "genre": "article", 
    "id": "sg:pub.10.1007/s11155-005-5943-7", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1140838", 
        "issn": [
          "1385-3139", 
          "1573-1340"
        ], 
        "name": "Reliable Computing", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "11"
      }
    ], 
    "keywords": [
      "traditional engineering approaches", 
      "engineering approach", 
      "application areas", 
      "computational complexity", 
      "interval uncertainty", 
      "area", 
      "outliers", 
      "approach", 
      "outlier detection", 
      "detection", 
      "real life", 
      "life", 
      "interval range", 
      "range", 
      "possible values", 
      "values", 
      "K0", 
      "complexity", 
      "detection problem", 
      "problem", 
      "efficient algorithm", 
      "algorithm", 
      "uncertainty", 
      "X1", 
      "samples", 
      "value x", 
      "intervals", 
      "cases", 
      "bounds", 
      "degree", 
      "paper", 
      "list", 
      "open problem", 
      "algorithmic solvability", 
      "solvability", 
      "Xn", 
      "outlier detection problem", 
      "values x1", 
      "sample standard variation \u03c3", 
      "standard variation \u03c3", 
      "variation \u03c3", 
      "k0-sigma interval", 
      "normal values x1", 
      "largest value k0", 
      "value k0"
    ], 
    "name": "Outlier Detection under Interval Uncertainty: Algorithmic Solvability and Computational Complexity", 
    "pagination": "59-76", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1050829705"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s11155-005-5943-7"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s11155-005-5943-7", 
      "https://app.dimensions.ai/details/publication/pub.1050829705"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2021-11-01T18:08", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20211101/entities/gbq_results/article/article_395.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1007/s11155-005-5943-7"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s11155-005-5943-7'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s11155-005-5943-7'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s11155-005-5943-7'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s11155-005-5943-7'


 

This table displays all metadata directly associated to this object as RDF triples.

142 TRIPLES      22 PREDICATES      73 URIs      63 LITERALS      6 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/s11155-005-5943-7 schema:about anzsrc-for:08
2 anzsrc-for:0802
3 schema:author N12650ff78aba488189138478308d332f
4 schema:citation sg:pub.10.1007/978-3-540-24588-9_26
5 sg:pub.10.1023/a:1011415408313
6 schema:datePublished 2005-02
7 schema:datePublishedReg 2005-02-01
8 schema:description In many application areas,it is important to detect outliers. The traditional engineering approach to outlier detection is that we start with some “normal” values x1, ...,xn, compute the sample average E, the sample standard variation σ, and then mark a value x as an outlier if x is outside the k0-sigma interval [E – k0 ⋅ σ, E + k0 ⋅ σ] (for some pre-selected parameter k0).In real life,we often have only interval ranges [\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}${\underline x}_i, {\bar x}_i $\end{document}] for the normal values x1, ...,xn. In this case,we only have intervals of possible values for the bounds \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$E - k_0 \cdot \sigma$\end{document} and \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$E + k_0 \cdot \sigma$\end{document}. We can therefore identify outliers as values that are outside all k0-sigma intervals.Once we identify a value as an outlier for a fixed k0, it is also desirable to find out to what degree this value is an outlier, i.e., what is the largest value k0 for which this value is an outlier.In this paper,we analyze the computational complexity of these outlier detection problems, provide efficient algorithms that solve some of these problems (under reasonable conditions), and list related open problems.
9 schema:genre article
10 schema:inLanguage en
11 schema:isAccessibleForFree true
12 schema:isPartOf Nd00b3a1d20eb4d06ba4b12b1ab40632c
13 Neae94146fdc14ddead49e4d436e3514a
14 sg:journal.1140838
15 schema:keywords K0
16 X1
17 Xn
18 algorithm
19 algorithmic solvability
20 application areas
21 approach
22 area
23 bounds
24 cases
25 complexity
26 computational complexity
27 degree
28 detection
29 detection problem
30 efficient algorithm
31 engineering approach
32 interval range
33 interval uncertainty
34 intervals
35 k0-sigma interval
36 largest value k0
37 life
38 list
39 normal values x1
40 open problem
41 outlier detection
42 outlier detection problem
43 outliers
44 paper
45 possible values
46 problem
47 range
48 real life
49 sample standard variation σ
50 samples
51 solvability
52 standard variation σ
53 traditional engineering approaches
54 uncertainty
55 value k0
56 value x
57 values
58 values x1
59 variation σ
60 schema:name Outlier Detection under Interval Uncertainty: Algorithmic Solvability and Computational Complexity
61 schema:pagination 59-76
62 schema:productId N52e9f70b3ffc41fba199a073e7d5a0c8
63 Nc79435023df540b1a158867a26a71e90
64 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050829705
65 https://doi.org/10.1007/s11155-005-5943-7
66 schema:sdDatePublished 2021-11-01T18:08
67 schema:sdLicense https://scigraph.springernature.com/explorer/license/
68 schema:sdPublisher N4c4c359594864eab9ae072cf7192660e
69 schema:url https://doi.org/10.1007/s11155-005-5943-7
70 sgo:license sg:explorer/license/
71 sgo:sdDataset articles
72 rdf:type schema:ScholarlyArticle
73 N0d56a8d64ae54e10aff48382a902aa2c rdf:first sg:person.010474226755.58
74 rdf:rest N1861e36dd6944ee4bcd908f707b7aa30
75 N12650ff78aba488189138478308d332f rdf:first sg:person.012602771355.27
76 rdf:rest N0d56a8d64ae54e10aff48382a902aa2c
77 N1861e36dd6944ee4bcd908f707b7aa30 rdf:first sg:person.016401717555.50
78 rdf:rest N3c4700195a0f4de69f19afca0a7bcde6
79 N2cf862a40fcb4709bb7b9a5845869064 rdf:first sg:person.015102730643.26
80 rdf:rest rdf:nil
81 N3c4700195a0f4de69f19afca0a7bcde6 rdf:first sg:person.012115026643.10
82 rdf:rest N2cf862a40fcb4709bb7b9a5845869064
83 N4c4c359594864eab9ae072cf7192660e schema:name Springer Nature - SN SciGraph project
84 rdf:type schema:Organization
85 N52e9f70b3ffc41fba199a073e7d5a0c8 schema:name dimensions_id
86 schema:value pub.1050829705
87 rdf:type schema:PropertyValue
88 Nc79435023df540b1a158867a26a71e90 schema:name doi
89 schema:value 10.1007/s11155-005-5943-7
90 rdf:type schema:PropertyValue
91 Nd00b3a1d20eb4d06ba4b12b1ab40632c schema:volumeNumber 11
92 rdf:type schema:PublicationVolume
93 Neae94146fdc14ddead49e4d436e3514a schema:issueNumber 1
94 rdf:type schema:PublicationIssue
95 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
96 schema:name Information and Computing Sciences
97 rdf:type schema:DefinedTerm
98 anzsrc-for:0802 schema:inDefinedTermSet anzsrc-for:
99 schema:name Computation Theory and Mathematics
100 rdf:type schema:DefinedTerm
101 sg:journal.1140838 schema:issn 1385-3139
102 1573-1340
103 schema:name Reliable Computing
104 schema:publisher Springer Nature
105 rdf:type schema:Periodical
106 sg:person.010474226755.58 schema:affiliation grid-institutes:grid.267324.6
107 schema:familyName Longpré
108 schema:givenName Luc
109 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010474226755.58
110 rdf:type schema:Person
111 sg:person.012115026643.10 schema:affiliation grid-institutes:grid.422751.7
112 schema:familyName Ferson
113 schema:givenName Scott
114 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012115026643.10
115 rdf:type schema:Person
116 sg:person.012602771355.27 schema:affiliation grid-institutes:grid.267324.6
117 schema:familyName Kreinovich
118 schema:givenName Vladik
119 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012602771355.27
120 rdf:type schema:Person
121 sg:person.015102730643.26 schema:affiliation grid-institutes:grid.422751.7
122 schema:familyName Ginzburg
123 schema:givenName Lev
124 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015102730643.26
125 rdf:type schema:Person
126 sg:person.016401717555.50 schema:affiliation grid-institutes:grid.267324.6
127 schema:familyName Patangay
128 schema:givenName Praveen
129 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016401717555.50
130 rdf:type schema:Person
131 sg:pub.10.1007/978-3-540-24588-9_26 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005640002
132 https://doi.org/10.1007/978-3-540-24588-9_26
133 rdf:type schema:CreativeWork
134 sg:pub.10.1023/a:1011415408313 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027297358
135 https://doi.org/10.1023/a:1011415408313
136 rdf:type schema:CreativeWork
137 grid-institutes:grid.267324.6 schema:alternateName Computer Science Department, University of Texas at El Paso, 79968, El Paso, TX, USA
138 schema:name Computer Science Department, University of Texas at El Paso, 79968, El Paso, TX, USA
139 rdf:type schema:Organization
140 grid-institutes:grid.422751.7 schema:alternateName Applied Biomathematics, 100 North Country Road, Setauket11733, NY, USA
141 schema:name Applied Biomathematics, 100 North Country Road, Setauket11733, NY, USA
142 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...