Nonparametric estimation of Shannon’s index of diversity when there are unseen species in sample


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2003-12

AUTHORS

Anne Chao, Tsung-Jen Shen

ABSTRACT

A biological community usually has a large number of species with relatively small abundances. When a random sample of individuals is selected and each individual is classified according to species identity, some rare species may not be discovered. This paper is concerned with the estimation of Shannon’s index of diversity when the number of species and the species abundances are unknown. The traditional estimator that ignores the missing species underestimates when there is a non-negligible number of unseen species. We provide a different approach based on unequal probability sampling theory because species have different probabilities of being discovered in the sample. No parametric forms are assumed for the species abundances. The proposed estimation procedure combines the Horvitz–Thompson (1952) adjustment for missing species and the concept of sample coverage, which is used to properly estimate the relative abundances of species discovered in the sample. Simulation results show that the proposed estimator works well under various abundance models even when a relatively large fraction of the species is missing. Three real data sets, two from biology and the other one from numismatics, are given for illustration.
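The procedure described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it assumes the sample coverage is estimated by the Good–Turing form 1 - f1/n (f1 being the number of species seen exactly once), which the abstract does not spell out, and the function names `shannon_plugin` and `shannon_coverage_ht` are hypothetical.

```python
import math


def shannon_plugin(counts):
    """Traditional plug-in estimator: uses raw sample proportions only."""
    counts = [x for x in counts if x > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("sample is empty")
    return -sum((x / n) * math.log(x / n) for x in counts)


def shannon_coverage_ht(counts):
    """Coverage-adjusted estimator with a Horvitz-Thompson correction.

    Assumption: coverage is estimated as 1 - f1/n (Good-Turing form).
    """
    counts = [x for x in counts if x > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("sample is empty")
    f1 = sum(1 for x in counts if x == 1)  # number of singletons
    if f1 == n:
        f1 = n - 1  # guard: keep the estimated coverage above zero
    coverage = 1.0 - f1 / n
    total = 0.0
    for x in counts:
        p = coverage * x / n               # coverage-adjusted abundance
        inclusion = 1.0 - (1.0 - p) ** n   # chance the species is seen at all
        total -= p * math.log(p) / inclusion
    return total


sample = [10, 5, 3, 1, 1, 1]  # made-up abundance counts for illustration
print(shannon_plugin(sample), shannon_coverage_ht(sample))
```

Dividing each term by its inclusion probability is what gives rare species extra weight: a species that was unlikely to appear in the sample stands in for similar species that were missed.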

PAGES

429-443

Identifiers

URI

http://scigraph.springernature.com/pub.10.1023/a:1026096204727

DOI

http://dx.doi.org/10.1023/a:1026096204727

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1024775421



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "National Tsing Hua University", 
          "id": "https://www.grid.ac/institutes/grid.38348.34", 
          "name": [
            "Institute of Statistics, National Tsing Hua University, 30043, Hsin-Chu, TAIWAN"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chao", 
        "givenName": "Anne", 
        "id": "sg:person.011612166521.45", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011612166521.45"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "National Tsing Hua University", 
          "id": "https://www.grid.ac/institutes/grid.38348.34", 
          "name": [
            "Institute of Statistics, National Tsing Hua University, 30043, Hsin-Chu, TAIWAN"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Shen", 
        "givenName": "Tsung-Jen", 
        "id": "sg:person.01304441055.07", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01304441055.07"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1023/a:1009659922745", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023855607", 
          "https://doi.org/10.1023/a:1009659922745"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.43.3.293", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030325064"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1098/rstb.1994.0091", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033931755"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1146/annurev.es.05.110174.001441", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040179280"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1080/03610910008813661", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042559689"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1080/757584397", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045151635"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1080/01621459.1952.10483446", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1058299008"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1080/01621459.1992.10475194", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1058304223"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1080/01621459.1998.10473807", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1058305446"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/biomet/80.1.193", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1059420367"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1137/1104033", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062864263"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1214/aos/1176350066", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1064409041"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2307/1935358", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1069659500"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2307/1935359", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1069659501"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2307/1936227", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1069660286"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2307/2529778", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1069975385"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2307/5493", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1070640962"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1074258104", 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2003-12", 
    "datePublishedReg": "2003-12-01", 
    "description": "A biological community usually has a large number of species with relatively small abundances. When a random sample of individuals is selected and each individual is classified according to species identity, some rare species may not be discovered. This paper is concerned with the estimation of Shannon\u2019s index of diversity when the number of species and the species abundances are unknown. The traditional estimator that ignores the missing species underestimates when there is a non-negligible number of unseen species. We provide a different approach based on unequal probability sampling theory because species have different probabilities of being discovered in the sample. No parametric forms are assumed for the species abundances. The proposed estimation procedure combines the Horvitz\u2013Thompson (1952) adjustment for missing species and the concept of sample coverage, which is used to properly estimate the relative abundances of species discovered in the sample. Simulation results show that the proposed estimator works well under various abundance models even when a relatively large fraction of the species is missing. Three real data sets, two from biology and the other one from numismatics, are given for illustration.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1023/a:1026096204727", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1356907", 
        "issn": [
          "1573-3009", 
          "1352-8505"
        ], 
        "name": "Environmental and Ecological Statistics", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "4", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "10"
      }
    ], 
    "name": "Nonparametric estimation of Shannon\u2019s index of diversity when there are unseen species in sample", 
    "pagination": "429-443", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "7607e73bdb50a8c4a71cf4d77b1719389d1ae2d0ecb939990be204334d66aa29"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1023/a:1026096204727"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1024775421"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1023/a:1026096204727", 
      "https://app.dimensions.ai/details/publication/pub.1024775421"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T14:08", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8660_00000505.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1023%2FA%3A1026096204727"
  }
]
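As a rough illustration of how a record of this shape can be consumed, the sketch below pulls the bibliographic fields out of the JSON-LD. The embedded record is a trimmed copy of the one above (only the keys that are read), and the helper name `summarize` is hypothetical rather than part of any SciGraph tooling.

```python
import json

# Trimmed copy of the record above: only the keys read below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Chao", "givenName": "Anne"},
      {"familyName": "Shen", "givenName": "Tsung-Jen"}
    ],
    "datePublished": "2003-12",
    "isPartOf": [
      {"name": "Environmental and Ecological Statistics", "type": "Periodical"},
      {"issueNumber": "4", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "10"}
    ],
    "pagination": "429-443"
  }
]
"""


def summarize(record):
    """Collect journal, volume, issue, pages, year and author names."""
    # isPartOf mixes three node types; index them by their "type" value.
    parts = {part["type"]: part for part in record.get("isPartOf", [])}
    return {
        "authors": [
            f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])
        ],
        "journal": parts.get("Periodical", {}).get("name"),
        "volume": parts.get("PublicationVolume", {}).get("volumeNumber"),
        "issue": parts.get("PublicationIssue", {}).get("issueNumber"),
        "pages": record.get("pagination"),
        "year": record.get("datePublished", "")[:4],
    }


info = summarize(json.loads(RECORD_JSON)[0])
print(info)
```

The top-level value is a JSON array, so the record itself is its first element.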

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1023/a:1026096204727'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1023/a:1026096204727'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1023/a:1026096204727'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1023/a:1026096204727'
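The same content negotiation can be done from Python's standard library. This is a sketch under the assumption that the endpoint still answers as the curl commands above describe; the names `build_request`, `fetch`, and `MIME_TYPES` are illustrative and not part of any SciGraph client.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Accept headers taken from the four curl commands above.
MIME_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Prepare (but do not send) a GET request for one serialization."""
    return urllib.request.Request(
        BASE_URL + record_id, headers={"Accept": MIME_TYPES[fmt]}
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call (performs a network request, so it is left commented out):
# print(fetch("pub.10.1023/a:1026096204727", "turtle"))
```

Keeping request construction separate from sending makes the header logic easy to check without touching the network.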


 

This table displays all metadata directly associated with this object as RDF triples.

122 TRIPLES      21 PREDICATES      45 URIs      19 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1023/a:1026096204727 schema:about anzsrc-for:01
2 anzsrc-for:0104
3 schema:author N3b3b8e7bcf784957811c2fd369806f5a
4 schema:citation sg:pub.10.1023/a:1009659922745
5 https://app.dimensions.ai/details/publication/pub.1074258104
6 https://doi.org/10.1073/pnas.43.3.293
7 https://doi.org/10.1080/01621459.1952.10483446
8 https://doi.org/10.1080/01621459.1992.10475194
9 https://doi.org/10.1080/01621459.1998.10473807
10 https://doi.org/10.1080/03610910008813661
11 https://doi.org/10.1080/757584397
12 https://doi.org/10.1093/biomet/80.1.193
13 https://doi.org/10.1098/rstb.1994.0091
14 https://doi.org/10.1137/1104033
15 https://doi.org/10.1146/annurev.es.05.110174.001441
16 https://doi.org/10.1214/aos/1176350066
17 https://doi.org/10.2307/1935358
18 https://doi.org/10.2307/1935359
19 https://doi.org/10.2307/1936227
20 https://doi.org/10.2307/2529778
21 https://doi.org/10.2307/5493
22 schema:datePublished 2003-12
23 schema:datePublishedReg 2003-12-01
24 schema:description A biological community usually has a large number of species with relatively small abundances. When a random sample of individuals is selected and each individual is classified according to species identity, some rare species may not be discovered. This paper is concerned with the estimation of Shannon’s index of diversity when the number of species and the species abundances are unknown. The traditional estimator that ignores the missing species underestimates when there is a non-negligible number of unseen species. We provide a different approach based on unequal probability sampling theory because species have different probabilities of being discovered in the sample. No parametric forms are assumed for the species abundances. The proposed estimation procedure combines the Horvitz–Thompson (1952) adjustment for missing species and the concept of sample coverage, which is used to properly estimate the relative abundances of species discovered in the sample. Simulation results show that the proposed estimator works well under various abundance models even when a relatively large fraction of the species is missing. Three real data sets, two from biology and the other one from numismatics, are given for illustration.
25 schema:genre research_article
26 schema:inLanguage en
27 schema:isAccessibleForFree true
28 schema:isPartOf N32300918778842dfaad9d02090684265
29 N827c4e8d7b09437c96c6f288ae535af3
30 sg:journal.1356907
31 schema:name Nonparametric estimation of Shannon’s index of diversity when there are unseen species in sample
32 schema:pagination 429-443
33 schema:productId N851c844013894d168fda9e457ca29693
34 Nb405df58f4db474db3d31e80e9a2d5f3
35 Ncafa061bccff4cb49c324123b75ecb04
36 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024775421
37 https://doi.org/10.1023/a:1026096204727
38 schema:sdDatePublished 2019-04-10T14:08
39 schema:sdLicense https://scigraph.springernature.com/explorer/license/
40 schema:sdPublisher N1fec130eed024916b4c580d8faf0076b
41 schema:url http://link.springer.com/10.1023%2FA%3A1026096204727
42 sgo:license sg:explorer/license/
43 sgo:sdDataset articles
44 rdf:type schema:ScholarlyArticle
45 N1fec130eed024916b4c580d8faf0076b schema:name Springer Nature - SN SciGraph project
46 rdf:type schema:Organization
47 N32300918778842dfaad9d02090684265 schema:issueNumber 4
48 rdf:type schema:PublicationIssue
49 N3b3b8e7bcf784957811c2fd369806f5a rdf:first sg:person.011612166521.45
50 rdf:rest N7a37fb0da35b4930a3b185b8c267c7f8
51 N7a37fb0da35b4930a3b185b8c267c7f8 rdf:first sg:person.01304441055.07
52 rdf:rest rdf:nil
53 N827c4e8d7b09437c96c6f288ae535af3 schema:volumeNumber 10
54 rdf:type schema:PublicationVolume
55 N851c844013894d168fda9e457ca29693 schema:name dimensions_id
56 schema:value pub.1024775421
57 rdf:type schema:PropertyValue
58 Nb405df58f4db474db3d31e80e9a2d5f3 schema:name doi
59 schema:value 10.1023/a:1026096204727
60 rdf:type schema:PropertyValue
61 Ncafa061bccff4cb49c324123b75ecb04 schema:name readcube_id
62 schema:value 7607e73bdb50a8c4a71cf4d77b1719389d1ae2d0ecb939990be204334d66aa29
63 rdf:type schema:PropertyValue
64 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
65 schema:name Mathematical Sciences
66 rdf:type schema:DefinedTerm
67 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
68 schema:name Statistics
69 rdf:type schema:DefinedTerm
70 sg:journal.1356907 schema:issn 1352-8505
71 1573-3009
72 schema:name Environmental and Ecological Statistics
73 rdf:type schema:Periodical
74 sg:person.011612166521.45 schema:affiliation https://www.grid.ac/institutes/grid.38348.34
75 schema:familyName Chao
76 schema:givenName Anne
77 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011612166521.45
78 rdf:type schema:Person
79 sg:person.01304441055.07 schema:affiliation https://www.grid.ac/institutes/grid.38348.34
80 schema:familyName Shen
81 schema:givenName Tsung-Jen
82 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01304441055.07
83 rdf:type schema:Person
84 sg:pub.10.1023/a:1009659922745 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023855607
85 https://doi.org/10.1023/a:1009659922745
86 rdf:type schema:CreativeWork
87 https://app.dimensions.ai/details/publication/pub.1074258104 rdf:type schema:CreativeWork
88 https://doi.org/10.1073/pnas.43.3.293 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030325064
89 rdf:type schema:CreativeWork
90 https://doi.org/10.1080/01621459.1952.10483446 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058299008
91 rdf:type schema:CreativeWork
92 https://doi.org/10.1080/01621459.1992.10475194 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058304223
93 rdf:type schema:CreativeWork
94 https://doi.org/10.1080/01621459.1998.10473807 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058305446
95 rdf:type schema:CreativeWork
96 https://doi.org/10.1080/03610910008813661 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042559689
97 rdf:type schema:CreativeWork
98 https://doi.org/10.1080/757584397 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045151635
99 rdf:type schema:CreativeWork
100 https://doi.org/10.1093/biomet/80.1.193 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059420367
101 rdf:type schema:CreativeWork
102 https://doi.org/10.1098/rstb.1994.0091 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033931755
103 rdf:type schema:CreativeWork
104 https://doi.org/10.1137/1104033 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062864263
105 rdf:type schema:CreativeWork
106 https://doi.org/10.1146/annurev.es.05.110174.001441 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040179280
107 rdf:type schema:CreativeWork
108 https://doi.org/10.1214/aos/1176350066 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064409041
109 rdf:type schema:CreativeWork
110 https://doi.org/10.2307/1935358 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069659500
111 rdf:type schema:CreativeWork
112 https://doi.org/10.2307/1935359 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069659501
113 rdf:type schema:CreativeWork
114 https://doi.org/10.2307/1936227 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069660286
115 rdf:type schema:CreativeWork
116 https://doi.org/10.2307/2529778 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069975385
117 rdf:type schema:CreativeWork
118 https://doi.org/10.2307/5493 schema:sameAs https://app.dimensions.ai/details/publication/pub.1070640962
119 rdf:type schema:CreativeWork
120 https://www.grid.ac/institutes/grid.38348.34 schema:alternateName National Tsing Hua University
121 schema:name Institute of Statistics, National Tsing Hua University, 30043, Hsin-Chu, TAIWAN
122 rdf:type schema:Organization
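Rows 49 to 52 of the table encode the ordered author list as an RDF collection: each blank node points to one author through rdf:first and to the next node through rdf:rest, ending at rdf:nil. The sketch below walks that chain using only the four triples copied from those rows; the dictionary layout and the name `walk_rdf_list` are illustrative choices.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate) -> object, copied from rows 49-52 of the table.
TRIPLES = {
    ("N3b3b8e7bcf784957811c2fd369806f5a", "rdf:first"): "sg:person.011612166521.45",
    ("N3b3b8e7bcf784957811c2fd369806f5a", "rdf:rest"): "N7a37fb0da35b4930a3b185b8c267c7f8",
    ("N7a37fb0da35b4930a3b185b8c267c7f8", "rdf:first"): "sg:person.01304441055.07",
    ("N7a37fb0da35b4930a3b185b8c267c7f8", "rdf:rest"): RDF_NIL,
}


def walk_rdf_list(head, triples):
    """Follow rdf:first / rdf:rest links from head until rdf:nil."""
    items = []
    node = head
    while node != RDF_NIL:
        items.append(triples[(node, "rdf:first")])
        node = triples[(node, "rdf:rest")]
    return items


# The head node is the object of schema:author in row 3.
authors = walk_rdf_list("N3b3b8e7bcf784957811c2fd369806f5a", TRIPLES)
print(authors)
```

The linked-list encoding is what preserves author order, since a plain set of schema:author triples would be unordered.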