Testing for Multivariate Outliers in the Presence of Missing Data


Ontology type: schema:Chapter     


Chapter Info

DATE

2002

AUTHORS

Wayne A. Woodward, Stephen R. Sain, H. L. Gray, Bojuan Zhao, Mark D. Fisk

ABSTRACT

We consider the problem of multivariate outlier testing for purposes of distinguishing seismic signals of underground nuclear events from training samples based on non-nuclear seismic events when certain data are missing. We consider the case in which the training data follow a multivariate normal distribution. Assume a potential outlier is observed on which k features of interest are measured. Assume further that the available training set of n observations on these k features is available but that some of the observations in the training data have missing features. The approach currently used in practice is to perform the outlier testing using a generalized likelihood ratio test procedure based only on the data vectors in the training data with complete data. When there is a substantial amount of missing data within the training set, use of this strategy may lead to a loss of valuable information. An alternative procedure is to incorporate all n of the data vectors in the training data using the EM algorithm to appropriately handle the missing data in the training set. Resampling methods are used to find appropriate critical regions. We use simulation results and analysis of models fit to Pg/Lg ratios for the WMQ station in China to compare these two strategies for dealing with missing data.
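The abstract contrasts a complete-case analysis with an EM-based fit that uses every training vector. This record does not give the chapter's test statistic or its resampling scheme, so the following is only a minimal sketch of the textbook EM iteration for a multivariate normal with missing entries (marked as NaN), followed by a squared Mahalanobis distance for a candidate observation. The names `em_mvn` and `mahalanobis_sq`, the NaN convention, and the simulated data are assumptions made for illustration; critical values would still have to come from a resampling step that is not shown here.

```python
import numpy as np


def em_mvn(X, n_iter=200, tol=1e-8):
    """EM estimates of the mean and covariance of a multivariate normal
    from rows that may contain NaN (missing) entries."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    miss = np.isnan(X)
    # Start from per-feature means and variances of the observed entries.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        filled = np.where(miss, 0.0, X)
        extra = np.zeros((k, k))  # summed conditional covariances of imputed parts
        for i in range(n):
            m = miss[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed: the conditional law is the current fit.
                filled[i] = mu
                extra += sigma
                continue
            # E-step: regress the missing features on the observed ones.
            B = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
            filled[i, m] = mu[m] + B @ (X[i, o] - mu[o])
            extra[np.ix_(m, m)] += sigma[np.ix_(m, m)] - B @ sigma[np.ix_(o, m)]
        # M-step: maximum-likelihood updates (divisor n, not n - 1).
        new_mu = filled.mean(axis=0)
        centered = filled - new_mu
        new_sigma = (centered.T @ centered + extra) / n
        change = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if change < tol:
            break
    return mu, sigma


def mahalanobis_sq(x, mu, sigma):
    """Squared Mahalanobis distance of x from a fitted normal."""
    d = np.asarray(x, dtype=float) - mu
    return float(d @ np.linalg.solve(sigma, d))


# Simulated training set (illustrative only) with about 25% of entries missing.
rng = np.random.default_rng(0)
train = rng.multivariate_normal([0.0, 0.0, 0.0], np.eye(3) * 0.5 + 0.5, size=60)
train[rng.random(train.shape) < 0.25] = np.nan
complete = train[~np.isnan(train).any(axis=1)]

mu_em, sigma_em = em_mvn(train)
candidate = np.array([3.0, -3.0, 3.0])
print(len(complete), "complete rows out of", len(train))
print("EM-based squared distance:", mahalanobis_sq(candidate, mu_em, sigma_em))
```

The complete-case alternative would estimate the mean and covariance from `complete` alone; comparing the two fits on the same data shows how many training vectors that approach discards.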

PAGES

889-903

Book

TITLE

Monitoring the Comprehensive Nuclear-Test-Ban Treaty: Seismic Event Discrimination and Identification

ISBN

978-3-7643-6675-9
978-3-0348-8169-2

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-0348-8169-2_14

DOI

http://dx.doi.org/10.1007/978-3-0348-8169-2_14

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1012605182



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Southern Methodist University, USA", 
          "id": "http://www.grid.ac/institutes/grid.263864.d", 
          "name": [
            "Southern Methodist University, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Woodward", 
        "givenName": "Wayne A.", 
        "id": "sg:person.013707166007.62", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013707166007.62"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Southern Methodist University, USA", 
          "id": "http://www.grid.ac/institutes/grid.263864.d", 
          "name": [
            "Southern Methodist University, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Sain", 
        "givenName": "Stephen R.", 
        "id": "sg:person.016621675366.88", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016621675366.88"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Southern Methodist University, USA", 
          "id": "http://www.grid.ac/institutes/grid.263864.d", 
          "name": [
            "Southern Methodist University, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Gray", 
        "givenName": "H. L.", 
        "id": "sg:person.011651455247.77", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011651455247.77"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Southern Methodist University, USA", 
          "id": "http://www.grid.ac/institutes/grid.263864.d", 
          "name": [
            "Southern Methodist University, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhao", 
        "givenName": "Bojuan", 
        "id": "sg:person.013702106566.63", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013702106566.63"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Mission Research Corporation, USA", 
          "id": "http://www.grid.ac/institutes/grid.421454.2", 
          "name": [
            "Mission Research Corporation, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Fisk", 
        "givenName": "Mark D.", 
        "id": "sg:person.01134353065.48", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01134353065.48"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2002", 
    "datePublishedReg": "2002-01-01", 
    "description": "We consider the problem of multivariate outlier testing for purposes of distinguishing seismic signals of underground nuclear events from training samples based on non-nuclear seismic events when certain data are missing. We consider the case in which the training data follow a multivariate normal distribution. Assume a potential outlier is observed on which k features of interest are measured. Assume further that the available training set of n observations on these k features is available but that some of the observations in the training data have missing features. The approach currently used in practice is to perform the outlier testing using a generalized likelihood ratio test procedure based only on the data vectors in the training data with complete data. When there is a substantial amount of missing data within the training set, use of this strategy may lead to a loss of valuable information. An alternative procedure is to incorporate all n of the data vectors in the training data using the EM algorithm to appropriately handle the missing data in the training set. Resampling methods are used to find appropriate critical regions. We use simulation results and analysis of models fit to Pg/Lg ratios for the WMQ station in China to compare these two strategies for dealing with missing data.", 
    "editor": [
      {
        "familyName": "Walter", 
        "givenName": "William R.", 
        "type": "Person"
      }, 
      {
        "familyName": "Hartse", 
        "givenName": "Hans E.", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-0348-8169-2_14", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-7643-6675-9", 
        "978-3-0348-8169-2"
      ], 
      "name": "Monitoring the Comprehensive Nuclear-Test-Ban Treaty: Seismic Event Discrimination and Identification", 
      "type": "Book"
    }, 
    "keywords": [
      "multivariate normal distribution", 
      "data vectors", 
      "likelihood ratio test procedure", 
      "outlier testing", 
      "ratio test procedure", 
      "analysis of models", 
      "EM algorithm", 
      "multivariate outliers", 
      "Missing Data", 
      "normal distribution", 
      "resampling method", 
      "training data", 
      "available training set", 
      "potential outliers", 
      "simulation results", 
      "seismic signals", 
      "training set", 
      "critical region", 
      "outliers", 
      "features of interest", 
      "set", 
      "training samples", 
      "certain data", 
      "vector", 
      "alternative procedure", 
      "algorithm", 
      "problem", 
      "seismic events", 
      "model", 
      "distribution", 
      "observations", 
      "features", 
      "approach", 
      "procedure", 
      "data", 
      "signals", 
      "test procedure", 
      "stations", 
      "cases", 
      "valuable information", 
      "interest", 
      "results", 
      "complete data", 
      "analysis", 
      "information", 
      "strategies", 
      "region", 
      "ratio", 
      "purpose", 
      "use", 
      "presence", 
      "events", 
      "samples", 
      "substantial amount", 
      "testing", 
      "amount", 
      "loss", 
      "nuclear events", 
      "practice", 
      "China", 
      "method", 
      "underground nuclear events", 
      "Lg ratios"
    ], 
    "name": "Testing for Multivariate Outliers in the Presence of Missing Data", 
    "pagination": "889-903", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1012605182"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-0348-8169-2_14"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-0348-8169-2_14", 
      "https://app.dimensions.ai/details/publication/pub.1012605182"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-05-20T07:43", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/chapter/chapter_184.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-0348-8169-2_14"
  }
]
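As a rough illustration of how the JSON-LD record can be consumed, the sketch below reads a trimmed copy of it with Python's standard `json` module and pulls out the title, year, pages, DOI, and author names. The helper name `summarize` and the trimmed `RECORD` string are assumptions for this example; only fields that appear in the record above are used.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD = """
[
  {
    "author": [
      {"familyName": "Woodward", "givenName": "Wayne A."},
      {"familyName": "Sain", "givenName": "Stephen R."},
      {"familyName": "Gray", "givenName": "H. L."},
      {"familyName": "Zhao", "givenName": "Bojuan"},
      {"familyName": "Fisk", "givenName": "Mark D."}
    ],
    "datePublished": "2002",
    "name": "Testing for Multivariate Outliers in the Presence of Missing Data",
    "pagination": "889-903",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1012605182"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-0348-8169-2_14"]}
    ]
  }
]
"""


def summarize(record_json):
    """Return a small citation-style summary of the first entity in the record."""
    chapter = json.loads(record_json)[0]
    # productId is a list of name/value pairs; index it by name.
    ids = {p["name"]: p["value"][0] for p in chapter.get("productId", [])}
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in chapter.get("author", [])]
    return {
        "title": chapter["name"],
        "year": chapter["datePublished"],
        "pages": chapter["pagination"],
        "doi": ids.get("doi"),
        "authors": authors,
    }


summary = summarize(RECORD)
print(", ".join(summary["authors"]))
print(summary["title"], summary["year"], summary["pages"], summary["doi"])
```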
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular linked data format that is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-0348-8169-2_14'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-0348-8169-2_14'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-0348-8169-2_14'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-0348-8169-2_14'
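The four curl commands differ only in the Accept header, so the same content negotiation can be sketched with Python's standard `urllib.request`. The helper names `build_request` and `fetch` and the `MEDIA_TYPES` table are assumptions for this example, and whether the endpoint still answers as described on this page is not verified here.

```python
import urllib.request

BASE = "https://scigraph.springernature.com/pub.10.1007/978-3-0348-8169-2_14"

# Short format names mapped to the media types used in the curl commands above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE):
    """Build a GET request that asks for one RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=BASE, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building a request does not touch the network; fetch("turtle") would.
req = build_request("turtle")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```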


 

This table displays all metadata directly associated with this object as RDF triples.

159 TRIPLES      23 PREDICATES      89 URIs      82 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-0348-8169-2_14 schema:about anzsrc-for:01
2 anzsrc-for:0104
3 schema:author N8c0791844cb04082ba6a7bf1a130aa19
4 schema:datePublished 2002
5 schema:datePublishedReg 2002-01-01
6 schema:description We consider the problem of multivariate outlier testing for purposes of distinguishing seismic signals of underground nuclear events from training samples based on non-nuclear seismic events when certain data are missing. We consider the case in which the training data follow a multivariate normal distribution. Assume a potential outlier is observed on which k features of interest are measured. Assume further that the available training set of n observations on these k features is available but that some of the observations in the training data have missing features. The approach currently used in practice is to perform the outlier testing using a generalized likelihood ratio test procedure based only on the data vectors in the training data with complete data. When there is a substantial amount of missing data within the training set, use of this strategy may lead to a loss of valuable information. An alternative procedure is to incorporate all n of the data vectors in the training data using the EM algorithm to appropriately handle the missing data in the training set. Resampling methods are used to find appropriate critical regions. We use simulation results and analysis of models fit to Pg/Lg ratios for the WMQ station in China to compare these two strategies for dealing with missing data.
7 schema:editor N02c2f46c2eac43d5a0e9e9384533c88e
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree false
11 schema:isPartOf N47220be9e494422281573aec90f10a49
12 schema:keywords China
13 EM algorithm
14 Lg ratios
15 Missing Data
16 algorithm
17 alternative procedure
18 amount
19 analysis
20 analysis of models
21 approach
22 available training set
23 cases
24 certain data
25 complete data
26 critical region
27 data
28 data vectors
29 distribution
30 events
31 features
32 features of interest
33 information
34 interest
35 likelihood ratio test procedure
36 loss
37 method
38 model
39 multivariate normal distribution
40 multivariate outliers
41 normal distribution
42 nuclear events
43 observations
44 outlier testing
45 outliers
46 potential outliers
47 practice
48 presence
49 problem
50 procedure
51 purpose
52 ratio
53 ratio test procedure
54 region
55 resampling method
56 results
57 samples
58 seismic events
59 seismic signals
60 set
61 signals
62 simulation results
63 stations
64 strategies
65 substantial amount
66 test procedure
67 testing
68 training data
69 training samples
70 training set
71 underground nuclear events
72 use
73 valuable information
74 vector
75 schema:name Testing for Multivariate Outliers in the Presence of Missing Data
76 schema:pagination 889-903
77 schema:productId N5ff0198d9a37400caa5980bd6c6c26af
78 N79a821412e6c43219d09a1ff03468e9a
79 schema:publisher Nbbdfcdcfb6044911a19d89900263eaa2
80 schema:sameAs https://app.dimensions.ai/details/publication/pub.1012605182
81 https://doi.org/10.1007/978-3-0348-8169-2_14
82 schema:sdDatePublished 2022-05-20T07:43
83 schema:sdLicense https://scigraph.springernature.com/explorer/license/
84 schema:sdPublisher N8c966227388944cebd72d74f85ffe78b
85 schema:url https://doi.org/10.1007/978-3-0348-8169-2_14
86 sgo:license sg:explorer/license/
87 sgo:sdDataset chapters
88 rdf:type schema:Chapter
89 N02c2f46c2eac43d5a0e9e9384533c88e rdf:first Ndc28b1f030ab40d09cbb2c0cef730064
90 rdf:rest N05979cf4de8141d78936dc9614530e62
91 N05979cf4de8141d78936dc9614530e62 rdf:first N8c53197d8a354931a790a0286fcb779f
92 rdf:rest rdf:nil
93 N140090019f724b3abbb3b8ff6dacd74b rdf:first sg:person.013702106566.63
94 rdf:rest N665f0ab3c69a4f5e85a369d255cff5e2
95 N33a330fb03d74b199907ff61fbca4471 rdf:first sg:person.016621675366.88
96 rdf:rest Ne096bed209214c60b8da068a2f2dada7
97 N47220be9e494422281573aec90f10a49 schema:isbn 978-3-0348-8169-2
98 978-3-7643-6675-9
99 schema:name Monitoring the Comprehensive Nuclear-Test-Ban Treaty: Seismic Event Discrimination and Identification
100 rdf:type schema:Book
101 N5ff0198d9a37400caa5980bd6c6c26af schema:name doi
102 schema:value 10.1007/978-3-0348-8169-2_14
103 rdf:type schema:PropertyValue
104 N665f0ab3c69a4f5e85a369d255cff5e2 rdf:first sg:person.01134353065.48
105 rdf:rest rdf:nil
106 N79a821412e6c43219d09a1ff03468e9a schema:name dimensions_id
107 schema:value pub.1012605182
108 rdf:type schema:PropertyValue
109 N8c0791844cb04082ba6a7bf1a130aa19 rdf:first sg:person.013707166007.62
110 rdf:rest N33a330fb03d74b199907ff61fbca4471
111 N8c53197d8a354931a790a0286fcb779f schema:familyName Hartse
112 schema:givenName Hans E.
113 rdf:type schema:Person
114 N8c966227388944cebd72d74f85ffe78b schema:name Springer Nature - SN SciGraph project
115 rdf:type schema:Organization
116 Nbbdfcdcfb6044911a19d89900263eaa2 schema:name Springer Nature
117 rdf:type schema:Organisation
118 Ndc28b1f030ab40d09cbb2c0cef730064 schema:familyName Walter
119 schema:givenName William R.
120 rdf:type schema:Person
121 Ne096bed209214c60b8da068a2f2dada7 rdf:first sg:person.011651455247.77
122 rdf:rest N140090019f724b3abbb3b8ff6dacd74b
123 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
124 schema:name Mathematical Sciences
125 rdf:type schema:DefinedTerm
126 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
127 schema:name Statistics
128 rdf:type schema:DefinedTerm
129 sg:person.01134353065.48 schema:affiliation grid-institutes:grid.421454.2
130 schema:familyName Fisk
131 schema:givenName Mark D.
132 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01134353065.48
133 rdf:type schema:Person
134 sg:person.011651455247.77 schema:affiliation grid-institutes:grid.263864.d
135 schema:familyName Gray
136 schema:givenName H. L.
137 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011651455247.77
138 rdf:type schema:Person
139 sg:person.013702106566.63 schema:affiliation grid-institutes:grid.263864.d
140 schema:familyName Zhao
141 schema:givenName Bojuan
142 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013702106566.63
143 rdf:type schema:Person
144 sg:person.013707166007.62 schema:affiliation grid-institutes:grid.263864.d
145 schema:familyName Woodward
146 schema:givenName Wayne A.
147 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013707166007.62
148 rdf:type schema:Person
149 sg:person.016621675366.88 schema:affiliation grid-institutes:grid.263864.d
150 schema:familyName Sain
151 schema:givenName Stephen R.
152 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016621675366.88
153 rdf:type schema:Person
154 grid-institutes:grid.263864.d schema:alternateName Southern Methodist University, USA
155 schema:name Southern Methodist University, USA
156 rdf:type schema:Organization
157 grid-institutes:grid.421454.2 schema:alternateName Mission Research Corporation, USA
158 schema:name Mission Research Corporation, USA
159 rdf:type schema:Organization
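In the table, the author order is not stored on the chapter itself: schema:author points at a blank node, and the sequence is encoded as an RDF list chained through rdf:first and rdf:rest until rdf:nil. The sketch below walks that chain using the rows copied from the table, reading rows without a subject as continuing the previous subject (an interpretation of the flattened layout). The function name `walk_list` and the dictionary representation are assumptions for this example, not part of any SciGraph tooling.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate) -> object, copied from the author-list rows of the table.
TRIPLES = {
    ("N8c0791844cb04082ba6a7bf1a130aa19", "rdf:first"): "sg:person.013707166007.62",
    ("N8c0791844cb04082ba6a7bf1a130aa19", "rdf:rest"): "N33a330fb03d74b199907ff61fbca4471",
    ("N33a330fb03d74b199907ff61fbca4471", "rdf:first"): "sg:person.016621675366.88",
    ("N33a330fb03d74b199907ff61fbca4471", "rdf:rest"): "Ne096bed209214c60b8da068a2f2dada7",
    ("Ne096bed209214c60b8da068a2f2dada7", "rdf:first"): "sg:person.011651455247.77",
    ("Ne096bed209214c60b8da068a2f2dada7", "rdf:rest"): "N140090019f724b3abbb3b8ff6dacd74b",
    ("N140090019f724b3abbb3b8ff6dacd74b", "rdf:first"): "sg:person.013702106566.63",
    ("N140090019f724b3abbb3b8ff6dacd74b", "rdf:rest"): "N665f0ab3c69a4f5e85a369d255cff5e2",
    ("N665f0ab3c69a4f5e85a369d255cff5e2", "rdf:first"): "sg:person.01134353065.48",
    ("N665f0ab3c69a4f5e85a369d255cff5e2", "rdf:rest"): RDF_NIL,
}

# schema:familyName rows for the five person nodes.
FAMILY_NAME = {
    "sg:person.013707166007.62": "Woodward",
    "sg:person.016621675366.88": "Sain",
    "sg:person.011651455247.77": "Gray",
    "sg:person.013702106566.63": "Zhao",
    "sg:person.01134353065.48": "Fisk",
}


def walk_list(head, triples):
    """Follow rdf:first / rdf:rest from head until rdf:nil, returning the items in order."""
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError("cycle in RDF list at " + node)
        seen.add(node)
        items.append(triples[(node, "rdf:first")])
        node = triples[(node, "rdf:rest")]
    return items


# The head node is the object of schema:author in row 3 of the table.
authors = [FAMILY_NAME[p] for p in walk_list("N8c0791844cb04082ba6a7bf1a130aa19", TRIPLES)]
print(authors)
```

The editor list (schema:editor in row 7) is encoded the same way, so the same walk would apply to it once its rows are added to the dictionary.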
 



