Testing for Multivariate Outliers in the Presence of Missing Data


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2002-02

AUTHORS

W. A. Woodward, S. R. Sain, H. L. Gray, B. Zhao, M. D. Fisk

ABSTRACT

We consider the problem of multivariate outlier testing for purposes of distinguishing seismic signals of underground nuclear events from training samples based on non-nuclear seismic events when certain data are missing. We consider the case in which the training data follow a multivariate normal distribution. Assume a potential outlier is observed on which k features of interest are measured. Assume further that the available training set of n observations on these k features is available but that some of the observations in the training data have missing features. The approach currently used in practice is to perform the outlier testing using a generalized likelihood ratio test procedure based only on the data vectors in the training data with complete data. When there is a substantial amount of missing data within the training set, use of this strategy may lead to a loss of valuable information. An alternative procedure is to incorporate all n of the data vectors in the training data using the EM algorithm to appropriately handle the missing data in the training set. Resampling methods are used to find appropriate critical regions. We use simulation results and analysis of models fit to Pg/Lg ratios for the WMQ station in China to compare these two strategies for dealing with missing data.
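The abstract names the EM algorithm for a multivariate normal training set with missing features but gives no formulas. The sketch below is a generic textbook EM for that setting, not the authors' implementation. It assumes missing entries are coded as NaN, and the names `em_mvn` and `mahalanobis_sq` are hypothetical. The squared Mahalanobis distance stands in for the paper's generalized likelihood ratio statistic, whose exact form is not stated here, and the resampling step that sets the critical region is not shown.

```python
import numpy as np


def em_mvn(X, n_iter=200, tol=1e-8):
    """EM estimates of the mean and covariance of a multivariate normal.

    X is an (n, k) array in which NaN marks a missing feature (an assumption
    of this sketch). Returns (mu, sigma) with the maximum-likelihood
    (divide-by-n) covariance.
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    miss = np.isnan(X)
    # Start from per-feature means and variances of the observed entries.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        filled = np.where(miss, mu, X)   # conditional means go here
        extra = np.zeros((k, k))         # summed conditional covariances
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed: the conditional law is the marginal one.
                extra += sigma
                continue
            # Regression of the missing block on the observed block.
            coef = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
            filled[i, m] = mu[m] + coef @ (X[i, o] - mu[o])
            extra[np.ix_(m, m)] += sigma[np.ix_(m, m)] - coef @ sigma[np.ix_(o, m)]
        # M-step: update the moments from the completed sufficient statistics.
        new_mu = filled.mean(axis=0)
        centered = filled - new_mu
        new_sigma = (centered.T @ centered + extra) / n
        done = (np.max(np.abs(new_mu - mu)) < tol
                and np.max(np.abs(new_sigma - sigma)) < tol)
        mu, sigma = new_mu, new_sigma
        if done:
            break
    return mu, sigma


def mahalanobis_sq(x, mu, sigma):
    """Squared Mahalanobis distance of a fully observed vector x."""
    d = np.asarray(x, dtype=float) - mu
    return float(d @ np.linalg.solve(sigma, d))
```

With no missing entries the loop reduces to the ordinary sample mean and maximum-likelihood covariance, which gives a quick sanity check. A large `mahalanobis_sq` value for a new event would then be compared against a threshold obtained by resampling, as the abstract describes.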

PAGES

889-903

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s00024-002-8663-5

DOI

http://dx.doi.org/10.1007/s00024-002-8663-5

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1017269672



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Southern Methodist University, Department of Geological Sciences, Dallas, TX 75275, U.S.A., US", 
          "id": "http://www.grid.ac/institutes/grid.263864.d", 
          "name": [
            "Southern Methodist University, Department of Geological Sciences, Dallas, TX 75275, U.S.A., US"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Woodward", 
        "givenName": "W. A.", 
        "id": "sg:person.013707166007.62", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013707166007.62"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Southern Methodist University, Department of Geological Sciences, Dallas, TX 75275, U.S.A., US", 
          "id": "http://www.grid.ac/institutes/grid.263864.d", 
          "name": [
            "Southern Methodist University, Department of Geological Sciences, Dallas, TX 75275, U.S.A., US"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Sain", 
        "givenName": "S. R.", 
        "id": "sg:person.016621675366.88", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016621675366.88"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Southern Methodist University, Department of Geological Sciences, Dallas, TX 75275, U.S.A., US", 
          "id": "http://www.grid.ac/institutes/grid.263864.d", 
          "name": [
            "Southern Methodist University, Department of Geological Sciences, Dallas, TX 75275, U.S.A., US"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Gray", 
        "givenName": "H. L.", 
        "id": "sg:person.011651455247.77", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011651455247.77"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Southern Methodist University, Department of Geological Sciences, Dallas, TX 75275, U.S.A., US", 
          "id": "http://www.grid.ac/institutes/grid.263864.d", 
          "name": [
            "Southern Methodist University, Department of Geological Sciences, Dallas, TX 75275, U.S.A., US"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhao", 
        "givenName": "B.", 
        "id": "sg:person.013702106566.63", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013702106566.63"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Mission Research Corporation, 8560 Cinderbed Road, Suite 700, Newington, VA, 22122, U.S.A., US", 
          "id": "http://www.grid.ac/institutes/grid.421454.2", 
          "name": [
            "Mission Research Corporation, 8560 Cinderbed Road, Suite 700, Newington, VA, 22122, U.S.A., US"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Fisk", 
        "givenName": "M. D.", 
        "id": "sg:person.01134353065.48", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01134353065.48"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2002-02", 
    "datePublishedReg": "2002-02-01", 
    "description": "Abstract\u200a\u2014\u200aWe consider the problem of multivariate outlier testing for purposes of distinguishing seismic signals of underground nuclear events from training samples based on non-nuclear seismic events when certain data are missing. We consider the case in which the training data follow a multivariate normal distribution. Assume a potential outlier is observed on which k features of interest are measured. Assume further that the available training set of n observations on these k features is available but that some of the observations in the training data have missing features. The approach currently used in practice is to perform the outlier testing using a generalized likelihood ratio test procedure based only on the data vectors in the training data with complete data. When there is a substantial amount of missing data within the training set, use of this strategy may lead to a loss of valuable information. An alternative procedure is to incorporate all n of the data vectors in the training data using the EM algorithm to appropriately handle the missing data in the training set. Resampling methods are used to find appropriate critical regions. We use simulation results and analysis of models fit to Pg/Lg ratios for the WMQ station in China to compare these two strategies for dealing with missing data.", 
    "genre": "article", 
    "id": "sg:pub.10.1007/s00024-002-8663-5", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": [
      {
        "id": "sg:journal.1136817", 
        "issn": [
          "0033-4553", 
          "1420-9136"
        ], 
        "name": "Pure and Applied Geophysics", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "4", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "159"
      }
    ], 
    "keywords": [
      "multivariate normal distribution", 
      "data vectors", 
      "likelihood ratio test procedure", 
      "outlier testing", 
      "ratio test procedure", 
      "analysis of models", 
      "EM algorithm", 
      "multivariate outliers", 
      "Missing Data", 
      "normal distribution", 
      "resampling method", 
      "training data", 
      "available training set", 
      "potential outliers", 
      "simulation results", 
      "seismic signals", 
      "training set", 
      "critical region", 
      "outliers", 
      "features of interest", 
      "set", 
      "training samples", 
      "certain data", 
      "vector", 
      "alternative procedure", 
      "algorithm", 
      "problem", 
      "seismic events", 
      "model", 
      "distribution", 
      "features", 
      "observations", 
      "procedure", 
      "data", 
      "approach", 
      "signals", 
      "test procedure", 
      "stations", 
      "valuable information", 
      "cases", 
      "results", 
      "interest", 
      "complete data", 
      "information", 
      "analysis", 
      "strategies", 
      "region", 
      "ratio", 
      "purpose", 
      "use", 
      "presence", 
      "testing", 
      "events", 
      "samples", 
      "substantial amount", 
      "amount", 
      "loss", 
      "nuclear events", 
      "practice", 
      "China", 
      "method", 
      "underground nuclear events", 
      "Lg ratios"
    ], 
    "name": "Testing for Multivariate Outliers in the Presence of Missing Data", 
    "pagination": "889-903", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1017269672"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s00024-002-8663-5"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s00024-002-8663-5", 
      "https://app.dimensions.ai/details/publication/pub.1017269672"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-05-20T07:22", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/article/article_354.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1007/s00024-002-8663-5"
  }
]
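As a rough illustration of how the record above can be consumed, the following sketch builds a one-line citation from it. It embeds a trimmed copy of the JSON-LD (only the fields it reads) so that it runs offline. The function name `format_citation` and the citation layout are assumptions made for this example, not part of SciGraph.

```python
import json

# Trimmed copy of the record above: only the fields used below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Woodward", "givenName": "W. A."},
      {"familyName": "Sain", "givenName": "S. R."},
      {"familyName": "Gray", "givenName": "H. L."},
      {"familyName": "Zhao", "givenName": "B."},
      {"familyName": "Fisk", "givenName": "M. D."}
    ],
    "datePublished": "2002-02",
    "isPartOf": [
      {"name": "Pure and Applied Geophysics", "type": "Periodical"},
      {"issueNumber": "4", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "159"}
    ],
    "name": "Testing for Multivariate Outliers in the Presence of Missing Data",
    "pagination": "889-903"
  }
]
"""


def format_citation(record):
    """Return a single-line citation for one article object."""
    authors = ", ".join(
        f"{a['givenName']} {a['familyName']}" for a in record["author"]
    )
    # Index the isPartOf entries by their type so the order does not matter.
    parts = {p["type"]: p for p in record["isPartOf"]}
    journal = parts["Periodical"]["name"]
    volume = parts["PublicationVolume"]["volumeNumber"]
    issue = parts["PublicationIssue"]["issueNumber"]
    year = record["datePublished"][:4]
    return (
        f"{authors} ({year}). {record['name']}. "
        f"{journal}, {volume}({issue}), {record['pagination']}."
    )


article = json.loads(RECORD_JSON)[0]  # the document is a one-element array
print(format_citation(article))
```

Because JSON-LD is plain JSON, the standard `json` module is enough for this kind of field extraction; a dedicated JSON-LD processor is only needed when the `@context` has to be expanded.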
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s00024-002-8663-5'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s00024-002-8663-5'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s00024-002-8663-5'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s00024-002-8663-5'
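The four curl calls differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. The URL and media types are taken from the commands above. The short format labels and the helper names `build_request` and `fetch` are assumptions of this sketch, and whether the endpoint still answers is not something the sketch can guarantee.

```python
from urllib.request import Request, urlopen

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/s00024-002-8663-5"

# Media types copied from the curl examples; the keys are illustrative labels.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking for the given serialization."""
    return Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Download the record as text; needs network access."""
    with urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    for fmt in MEDIA_TYPES:
        req = build_request(fmt)
        print(fmt, "->", req.get_header("Accept"))
```

Calling `fetch("turtle")` would issue the same request as the Turtle curl command; the `__main__` block only prints the headers so the script can be run without a network connection.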


 

This table displays all metadata directly associated to this object as RDF triples.

152 TRIPLES      21 PREDICATES      89 URIs      81 LITERALS      6 BLANK NODES

Row Subject Predicate Object (a blank Subject or Predicate repeats the value from the row above)
1 sg:pub.10.1007/s00024-002-8663-5 schema:about anzsrc-for:01
2 anzsrc-for:0104
3 schema:author Nb6bdd3ecbb924fe8bc3265baddffc375
4 schema:datePublished 2002-02
5 schema:datePublishedReg 2002-02-01
6 schema:description Abstract — We consider the problem of multivariate outlier testing for purposes of distinguishing seismic signals of underground nuclear events from training samples based on non-nuclear seismic events when certain data are missing. We consider the case in which the training data follow a multivariate normal distribution. Assume a potential outlier is observed on which k features of interest are measured. Assume further that the available training set of n observations on these k features is available but that some of the observations in the training data have missing features. The approach currently used in practice is to perform the outlier testing using a generalized likelihood ratio test procedure based only on the data vectors in the training data with complete data. When there is a substantial amount of missing data within the training set, use of this strategy may lead to a loss of valuable information. An alternative procedure is to incorporate all n of the data vectors in the training data using the EM algorithm to appropriately handle the missing data in the training set. Resampling methods are used to find appropriate critical regions. We use simulation results and analysis of models fit to Pg/Lg ratios for the WMQ station in China to compare these two strategies for dealing with missing data.
7 schema:genre article
8 schema:inLanguage en
9 schema:isAccessibleForFree false
10 schema:isPartOf N60be508e1d3c4ed6b4de0bf89c59f441
11 Nfb76277fee56468d87b21c81e086a552
12 sg:journal.1136817
13 schema:keywords China
14 EM algorithm
15 Lg ratios
16 Missing Data
17 algorithm
18 alternative procedure
19 amount
20 analysis
21 analysis of models
22 approach
23 available training set
24 cases
25 certain data
26 complete data
27 critical region
28 data
29 data vectors
30 distribution
31 events
32 features
33 features of interest
34 information
35 interest
36 likelihood ratio test procedure
37 loss
38 method
39 model
40 multivariate normal distribution
41 multivariate outliers
42 normal distribution
43 nuclear events
44 observations
45 outlier testing
46 outliers
47 potential outliers
48 practice
49 presence
50 problem
51 procedure
52 purpose
53 ratio
54 ratio test procedure
55 region
56 resampling method
57 results
58 samples
59 seismic events
60 seismic signals
61 set
62 signals
63 simulation results
64 stations
65 strategies
66 substantial amount
67 test procedure
68 testing
69 training data
70 training samples
71 training set
72 underground nuclear events
73 use
74 valuable information
75 vector
76 schema:name Testing for Multivariate Outliers in the Presence of Missing Data
77 schema:pagination 889-903
78 schema:productId N608ef60c98234c26869e07a1421799b8
79 Nb45d15ecdf874c2b91ca123eb69950c1
80 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017269672
81 https://doi.org/10.1007/s00024-002-8663-5
82 schema:sdDatePublished 2022-05-20T07:22
83 schema:sdLicense https://scigraph.springernature.com/explorer/license/
84 schema:sdPublisher Nb29cf077019a4dc1824692d6f4cfea5d
85 schema:url https://doi.org/10.1007/s00024-002-8663-5
86 sgo:license sg:explorer/license/
87 sgo:sdDataset articles
88 rdf:type schema:ScholarlyArticle
89 N4a50f34a07a54576af62cfe03c919cdf rdf:first sg:person.013702106566.63
90 rdf:rest N9b7e7b2114ce46e6a8ab1d49c434e515
91 N608ef60c98234c26869e07a1421799b8 schema:name dimensions_id
92 schema:value pub.1017269672
93 rdf:type schema:PropertyValue
94 N60be508e1d3c4ed6b4de0bf89c59f441 schema:issueNumber 4
95 rdf:type schema:PublicationIssue
96 N9b7e7b2114ce46e6a8ab1d49c434e515 rdf:first sg:person.01134353065.48
97 rdf:rest rdf:nil
98 Nb29cf077019a4dc1824692d6f4cfea5d schema:name Springer Nature - SN SciGraph project
99 rdf:type schema:Organization
100 Nb45d15ecdf874c2b91ca123eb69950c1 schema:name doi
101 schema:value 10.1007/s00024-002-8663-5
102 rdf:type schema:PropertyValue
103 Nb6bdd3ecbb924fe8bc3265baddffc375 rdf:first sg:person.013707166007.62
104 rdf:rest Nba5c432e73db455e9321405cdaafba45
105 Nba5c432e73db455e9321405cdaafba45 rdf:first sg:person.016621675366.88
106 rdf:rest Ned027a02d5a64538bb5c497424ddd908
107 Ned027a02d5a64538bb5c497424ddd908 rdf:first sg:person.011651455247.77
108 rdf:rest N4a50f34a07a54576af62cfe03c919cdf
109 Nfb76277fee56468d87b21c81e086a552 schema:volumeNumber 159
110 rdf:type schema:PublicationVolume
111 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
112 schema:name Mathematical Sciences
113 rdf:type schema:DefinedTerm
114 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
115 schema:name Statistics
116 rdf:type schema:DefinedTerm
117 sg:journal.1136817 schema:issn 0033-4553
118 1420-9136
119 schema:name Pure and Applied Geophysics
120 schema:publisher Springer Nature
121 rdf:type schema:Periodical
122 sg:person.01134353065.48 schema:affiliation grid-institutes:grid.421454.2
123 schema:familyName Fisk
124 schema:givenName M. D.
125 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01134353065.48
126 rdf:type schema:Person
127 sg:person.011651455247.77 schema:affiliation grid-institutes:grid.263864.d
128 schema:familyName Gray
129 schema:givenName H. L.
130 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011651455247.77
131 rdf:type schema:Person
132 sg:person.013702106566.63 schema:affiliation grid-institutes:grid.263864.d
133 schema:familyName Zhao
134 schema:givenName B.
135 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013702106566.63
136 rdf:type schema:Person
137 sg:person.013707166007.62 schema:affiliation grid-institutes:grid.263864.d
138 schema:familyName Woodward
139 schema:givenName W. A.
140 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013707166007.62
141 rdf:type schema:Person
142 sg:person.016621675366.88 schema:affiliation grid-institutes:grid.263864.d
143 schema:familyName Sain
144 schema:givenName S. R.
145 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016621675366.88
146 rdf:type schema:Person
147 grid-institutes:grid.263864.d schema:alternateName Southern Methodist University, Department of Geological Sciences, Dallas, TX 75275, U.S.A., US
148 schema:name Southern Methodist University, Department of Geological Sciences, Dallas, TX 75275, U.S.A., US
149 rdf:type schema:Organization
150 grid-institutes:grid.421454.2 schema:alternateName Mission Research Corporation, 8560 Cinderbed Road, Suite 700, Newington, VA, 22122, U.S.A., US
151 schema:name Mission Research Corporation, 8560 Cinderbed Road, Suite 700, Newington, VA, 22122, U.S.A., US
152 rdf:type schema:Organization
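In the triples above, the author order is stored as an RDF collection: `schema:author` points at a blank node, and each blank node carries one `rdf:first` (the item) and one `rdf:rest` (the next node) until `rdf:nil`. The sketch below walks that chain using the relevant triples copied from the table. The tuple layout and the function name `rdf_list` are assumptions made for illustration.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate, object) rows copied from the table above.
TRIPLES = [
    ("N4a50f34a07a54576af62cfe03c919cdf", "rdf:first", "sg:person.013702106566.63"),
    ("N4a50f34a07a54576af62cfe03c919cdf", "rdf:rest", "N9b7e7b2114ce46e6a8ab1d49c434e515"),
    ("N9b7e7b2114ce46e6a8ab1d49c434e515", "rdf:first", "sg:person.01134353065.48"),
    ("N9b7e7b2114ce46e6a8ab1d49c434e515", "rdf:rest", "rdf:nil"),
    ("Nb6bdd3ecbb924fe8bc3265baddffc375", "rdf:first", "sg:person.013707166007.62"),
    ("Nb6bdd3ecbb924fe8bc3265baddffc375", "rdf:rest", "Nba5c432e73db455e9321405cdaafba45"),
    ("Nba5c432e73db455e9321405cdaafba45", "rdf:first", "sg:person.016621675366.88"),
    ("Nba5c432e73db455e9321405cdaafba45", "rdf:rest", "Ned027a02d5a64538bb5c497424ddd908"),
    ("Ned027a02d5a64538bb5c497424ddd908", "rdf:first", "sg:person.011651455247.77"),
    ("Ned027a02d5a64538bb5c497424ddd908", "rdf:rest", "N4a50f34a07a54576af62cfe03c919cdf"),
]

# Family names from the schema:familyName rows, used only for display.
FAMILY_NAMES = {
    "sg:person.013707166007.62": "Woodward",
    "sg:person.016621675366.88": "Sain",
    "sg:person.011651455247.77": "Gray",
    "sg:person.013702106566.63": "Zhao",
    "sg:person.01134353065.48": "Fisk",
}


def rdf_list(head, triples):
    """Return the items of the RDF collection that starts at node `head`."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# The head node is the object of the schema:author row (row 3).
authors = rdf_list("Nb6bdd3ecbb924fe8bc3265baddffc375", TRIPLES)
print([FAMILY_NAMES[a] for a in authors])
```

The table lists blank nodes in alphabetical order of their identifiers, so the author sequence is only recoverable by following `rdf:rest` links rather than by reading the rows top to bottom.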
 



