From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2019-03

AUTHORS

Sylvia Frühwirth-Schnatter, Gertraud Malsiner-Walli

ABSTRACT

In model-based clustering, mixture models are used to group data points into clusters. A useful concept introduced for Gaussian mixtures by Malsiner-Walli et al. (Stat Comput 26:303–324, 2016) is that of sparse finite mixtures, where the prior distribution on the weight distribution of a mixture with K components is chosen in such a way that a priori the number of clusters in the data is random and is allowed to be smaller than K with high probability. The number of clusters is then inferred a posteriori from the data. The present paper makes the following contributions in the context of sparse finite mixture modelling. First, it is illustrated that the concept of sparse finite mixture is very generic and easily extended to cluster various types of non-Gaussian data, in particular discrete data and continuous multivariate data arising from non-Gaussian clusters. Second, sparse finite mixtures are compared to Dirichlet process mixtures with respect to their ability to identify the number of clusters. For both model classes, a random hyper prior is considered for the parameters determining the weight distribution. By suitable matching of these priors, it is shown that the choice of this hyper prior is far more influential on the cluster solution than whether a sparse finite mixture or a Dirichlet process mixture is taken into consideration.

PAGES

1-32

References to SciGraph publications

  • 1996-09. A general maximum likelihood analysis of overdispersion in generalized linear models in STATISTICS AND COMPUTING
  • 2013-11. Model-based clustering and classification with non-normal mixture distributions in STATISTICAL METHODS & APPLICATIONS
  • 2009-12. Improved auxiliary mixture sampling for hierarchical models of non-Gaussian data in STATISTICS AND COMPUTING
  • 2006-01. Multivariate mixtures of normals with unknown number of components in STATISTICS AND COMPUTING
  • 1985-12. Comparing partitions in JOURNAL OF CLASSIFICATION
  • 2016-01. Model-based clustering based on sparse finite Gaussian mixtures in STATISTICS AND COMPUTING
  • 1997-03. Inference in model-based cluster analysis in STATISTICS AND COMPUTING
  • 2011-01. Slice sampling mixture models in STATISTICS AND COMPUTING
  • 1998. Computing Nonparametric Hierarchical Models in PRACTICAL NONPARAMETRIC AND SEMIPARAMETRIC BAYESIAN STATISTICS

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/s11634-018-0329-y

    DOI

    http://dx.doi.org/10.1007/s11634-018-0329-y

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1106333352



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Statistics", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Mathematical Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Vienna University of Economics and Business", 
              "id": "https://www.grid.ac/institutes/grid.15788.33", 
              "name": [
                "Institute for Statistics and Mathematics, Vienna University of Economics and Business (WU), Welthandelsplatz 1, 1020, Vienna, Austria"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Fr\u00fchwirth-Schnatter", 
            "givenName": "Sylvia", 
            "id": "sg:person.0702362777.46", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0702362777.46"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Vienna University of Economics and Business", 
              "id": "https://www.grid.ac/institutes/grid.15788.33", 
              "name": [
                "Institute for Statistics and Mathematics, Vienna University of Economics and Business (WU), Welthandelsplatz 1, 1020, Vienna, Austria"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Malsiner-Walli", 
            "givenName": "Gertraud", 
            "id": "sg:person.01320037026.26", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01320037026.26"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/s11222-006-5338-6", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000275112", 
              "https://doi.org/10.1007/s11222-006-5338-6"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s11222-014-9500-2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1003696325", 
              "https://doi.org/10.1007/s11222-014-9500-2"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/bf00140869", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1004597255", 
              "https://doi.org/10.1007/bf00140869"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1111/j.1368-423x.2004.00125.x", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1005446821"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-1-4612-1732-9_1", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1007108393", 
              "https://doi.org/10.1007/978-1-4612-1732-9_1"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1111/j.1467-9868.2011.00781.x", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1007193880"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/biomet/83.4.715", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008663110"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/j.csda.2008.03.028", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1009988938"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1111/1467-9868.00095", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1011703011"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1080/10485250211383", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1013876418"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1002/9781119995678.ch10", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1014810147"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s11222-008-9109-4", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1017712707", 
              "https://doi.org/10.1007/s11222-008-9109-4"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/biostatistics/kxp062", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1020133500"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/bf01908075", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1022323983", 
              "https://doi.org/10.1007/bf01908075"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s10260-013-0237-4", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1025298908", 
              "https://doi.org/10.1007/s10260-013-0237-4"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1111/1467-9469.00242", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1025772916"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1111/1467-9868.00391", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1025959631"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/b978-0-12-589320-6.50018-6", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031555271"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1159/000087446", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1034731359"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s11222-009-9150-y", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1034800568", 
              "https://doi.org/10.1007/s11222-009-9150-y"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1023/a:1018510926151", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1036767102", 
              "https://doi.org/10.1023/a:1018510926151"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1111/1467-9868.00402", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1037524261"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1214/aos/1176342360", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1040652855"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1080/10618600.2016.1200472", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1041645640"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/bth068", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1042662743"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1371/journal.pone.0131739", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1050428796"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.2333/bhmk.21.1", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1050511218"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1080/01621459.1984.10477093", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1058302938"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1080/01621459.1995.10476550", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1058304833"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1080/01621459.2013.829001", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1058306096"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1080/01621459.2016.1255636", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1058306653"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/biomet/61.2.215", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059418327"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/biomet/asm086", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059421632"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/34.865189", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061157115"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1198/016214501750332758", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1064197802"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1198/106186007x238855", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1064199598"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1214/009053604000000788", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1064388755"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1214/06-ba122", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1064389488"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1214/13-ba811", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1064394069"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1214/aos/1176342752", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1064406948"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.18637/jss.v042.i10", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1068672632"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.2307/2532201", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1069977629"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1177/1471082x17739058", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1100433455"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1109491899", 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2019-03", 
        "datePublishedReg": "2019-03-01", 
        "description": "In model-based clustering mixture models are used to group data points into clusters. A useful concept introduced for Gaussian mixtures by Malsiner Walli et al. (Stat Comput 26:303\u2013324, 2016) are sparse finite mixtures, where the prior distribution on the weight distribution of a mixture with K components is chosen in such a way that a priori the number of clusters in the data is random and is allowed to be smaller than K with high probability. The number of clusters is then inferred a posteriori from the data. The present paper makes the following contributions in the context of sparse finite mixture modelling. First, it is illustrated that the concept of sparse finite mixture is very generic and easily extended to cluster various types of non-Gaussian data, in particular discrete data and continuous multivariate data arising from non-Gaussian clusters. Second, sparse finite mixtures are compared to Dirichlet process mixtures with respect to their ability to identify the number of clusters. For both model classes, a random hyper prior is considered for the parameters determining the weight distribution. By suitable matching of these priors, it is shown that the choice of this hyper prior is far more influential on the cluster solution than whether a sparse finite mixture or a Dirichlet process mixture is taken into consideration.", 
        "genre": "research_article", 
        "id": "sg:pub.10.1007/s11634-018-0329-y", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.7578981", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1045303", 
            "issn": [
              "1862-5347", 
              "1862-5355"
            ], 
            "name": "Advances in Data Analysis and Classification", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "1", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "13"
          }
        ], 
        "name": "From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering", 
        "pagination": "1-32", 
        "productId": [
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "8e130c67aea04d6bd3339216aadc79cc423583459b63b2399c582376a0af71c7"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/s11634-018-0329-y"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1106333352"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/s11634-018-0329-y", 
          "https://app.dimensions.ai/details/publication/pub.1106333352"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2019-04-11T13:59", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000371_0000000371/records_130823_00000006.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://link.springer.com/10.1007%2Fs11634-018-0329-y"
      }
    ]
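
    Because the record is plain JSON, it can be read with any JSON parser. The following is a minimal sketch in Python, assuming the record has already been downloaded; `record_text` holds a hand-trimmed excerpt of the fields shown above, and the helper name `summarize` is hypothetical (it is not part of SciGraph).

```python
import json

# Hand-trimmed excerpt of the record above (assumption: only a few fields
# are kept here to keep the example short; the full record has many more).
record_text = r"""
[
  {
    "id": "sg:pub.10.1007/s11634-018-0329-y",
    "name": "From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering",
    "datePublished": "2019-03",
    "pagination": "1-32",
    "author": [
      {"familyName": "Fr\u00fchwirth-Schnatter", "givenName": "Sylvia"},
      {"familyName": "Malsiner-Walli", "givenName": "Gertraud"}
    ],
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/s11634-018-0329-y"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1106333352"]}
    ]
  }
]
"""


def summarize(record):
    """Collect title, authors, date and identifiers from one record object."""
    authors = [
        f'{person["givenName"]} {person["familyName"]}'
        for person in record.get("author", [])
    ]
    # Each productId entry carries a name and a one-element list of values.
    identifiers = {
        entry["name"]: entry["value"][0]
        for entry in record.get("productId", [])
    }
    return {
        "title": record["name"],
        "authors": authors,
        "date": record.get("datePublished"),
        "doi": identifiers.get("doi"),
    }


# The top level of the document is a list holding a single record object.
record = json.loads(record_text)[0]
summary = summarize(record)
print(summary["authors"], summary["doi"])
```

    The `\u00fc` escape in the family name is decoded by the JSON parser, so the author string comes out with the umlaut intact.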
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s11634-018-0329-y'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s11634-018-0329-y'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s11634-018-0329-y'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s11634-018-0329-y'
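
    The four curl calls above differ only in the Accept header, so the same content negotiation can be sketched with Python's standard library. This is an illustration under assumptions: the short format keys (`json-ld`, `nt`, `turtle`, `xml`) and the function names `build_request` and `fetch` are made up for this example, and the sketch does not check whether the endpoint is still reachable.

```python
from urllib.request import Request, urlopen

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/s11634-018-0329-y"

# Media types taken from the curl commands above; the keys are hypothetical
# short names chosen for this sketch.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a request asking for the given RDF format."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30.0):
    """Send the request and return the response body as text."""
    with urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request needs no network access; fetch("turtle") would.
request = build_request("turtle")
print(request.full_url, request.get_header("Accept"))
```

    Keeping request construction separate from sending makes the header logic easy to inspect without contacting the server.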


     

    This table displays all metadata directly associated with this object as RDF triples.

    210 TRIPLES      21 PREDICATES      71 URIs      19 LITERALS      7 BLANK NODES
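
    In the table below, a row that lists only an object reuses the subject and predicate of the row before it, and a row that lists a predicate and an object reuses only the subject. A minimal sketch of that expansion follows, assuming each row has already been split into its cells; the function name `expand_rows` is hypothetical and the sample rows are copied from the start of the table (row numbers dropped).

```python
def expand_rows(rows):
    """Fill in omitted subject/predicate cells from the preceding row."""
    triples = []
    subject = predicate = None
    for row in rows:
        if len(row) == 3:
            # Full row: subject, predicate and object are all given.
            subject, predicate, obj = row
        elif len(row) == 2:
            # Subject omitted: carry it over from the previous row.
            predicate, obj = row
        elif len(row) == 1:
            # Subject and predicate omitted: carry both over.
            (obj,) = row
        else:
            raise ValueError(f"unexpected row shape: {row!r}")
        if subject is None or predicate is None:
            raise ValueError("the first row must name subject and predicate")
        triples.append((subject, predicate, obj))
    return triples


rows = [
    ("sg:pub.10.1007/s11634-018-0329-y", "schema:about", "anzsrc-for:01"),
    ("anzsrc-for:0104",),
    ("schema:author", "N0c32284d42124b6ab121d9386e2a5604"),
    ("schema:datePublished", "2019-03"),
]

triples = expand_rows(rows)
for triple in triples:
    print(" ".join(triple))
```

    Splitting the raw text into cells is left out on purpose: literal values such as the description contain spaces, so a plain whitespace split would not be reliable.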

    Subject Predicate Object
    1 sg:pub.10.1007/s11634-018-0329-y schema:about anzsrc-for:01
    2 anzsrc-for:0104
    3 schema:author N0c32284d42124b6ab121d9386e2a5604
    4 schema:citation sg:pub.10.1007/978-1-4612-1732-9_1
    5 sg:pub.10.1007/bf00140869
    6 sg:pub.10.1007/bf01908075
    7 sg:pub.10.1007/s10260-013-0237-4
    8 sg:pub.10.1007/s11222-006-5338-6
    9 sg:pub.10.1007/s11222-008-9109-4
    10 sg:pub.10.1007/s11222-009-9150-y
    11 sg:pub.10.1007/s11222-014-9500-2
    12 sg:pub.10.1023/a:1018510926151
    13 https://app.dimensions.ai/details/publication/pub.1109491899
    14 https://doi.org/10.1002/9781119995678.ch10
    15 https://doi.org/10.1016/b978-0-12-589320-6.50018-6
    16 https://doi.org/10.1016/j.csda.2008.03.028
    17 https://doi.org/10.1080/01621459.1984.10477093
    18 https://doi.org/10.1080/01621459.1995.10476550
    19 https://doi.org/10.1080/01621459.2013.829001
    20 https://doi.org/10.1080/01621459.2016.1255636
    21 https://doi.org/10.1080/10485250211383
    22 https://doi.org/10.1080/10618600.2016.1200472
    23 https://doi.org/10.1093/bioinformatics/bth068
    24 https://doi.org/10.1093/biomet/61.2.215
    25 https://doi.org/10.1093/biomet/83.4.715
    26 https://doi.org/10.1093/biomet/asm086
    27 https://doi.org/10.1093/biostatistics/kxp062
    28 https://doi.org/10.1109/34.865189
    29 https://doi.org/10.1111/1467-9469.00242
    30 https://doi.org/10.1111/1467-9868.00095
    31 https://doi.org/10.1111/1467-9868.00391
    32 https://doi.org/10.1111/1467-9868.00402
    33 https://doi.org/10.1111/j.1368-423x.2004.00125.x
    34 https://doi.org/10.1111/j.1467-9868.2011.00781.x
    35 https://doi.org/10.1159/000087446
    36 https://doi.org/10.1177/1471082x17739058
    37 https://doi.org/10.1198/016214501750332758
    38 https://doi.org/10.1198/106186007x238855
    39 https://doi.org/10.1214/009053604000000788
    40 https://doi.org/10.1214/06-ba122
    41 https://doi.org/10.1214/13-ba811
    42 https://doi.org/10.1214/aos/1176342360
    43 https://doi.org/10.1214/aos/1176342752
    44 https://doi.org/10.1371/journal.pone.0131739
    45 https://doi.org/10.18637/jss.v042.i10
    46 https://doi.org/10.2307/2532201
    47 https://doi.org/10.2333/bhmk.21.1
    48 schema:datePublished 2019-03
    49 schema:datePublishedReg 2019-03-01
    50 schema:description In model-based clustering mixture models are used to group data points into clusters. A useful concept introduced for Gaussian mixtures by Malsiner Walli et al. (Stat Comput 26:303–324, 2016) are sparse finite mixtures, where the prior distribution on the weight distribution of a mixture with K components is chosen in such a way that a priori the number of clusters in the data is random and is allowed to be smaller than K with high probability. The number of clusters is then inferred a posteriori from the data. The present paper makes the following contributions in the context of sparse finite mixture modelling. First, it is illustrated that the concept of sparse finite mixture is very generic and easily extended to cluster various types of non-Gaussian data, in particular discrete data and continuous multivariate data arising from non-Gaussian clusters. Second, sparse finite mixtures are compared to Dirichlet process mixtures with respect to their ability to identify the number of clusters. For both model classes, a random hyper prior is considered for the parameters determining the weight distribution. By suitable matching of these priors, it is shown that the choice of this hyper prior is far more influential on the cluster solution than whether a sparse finite mixture or a Dirichlet process mixture is taken into consideration.
    51 schema:genre research_article
    52 schema:inLanguage en
    53 schema:isAccessibleForFree true
    54 schema:isPartOf N4fbd0d87b94f48bc9c6dc15d97cb4bd7
    55 N6fb82675da034fa7bca2e555c67b35ae
    56 sg:journal.1045303
    57 schema:name From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering
    58 schema:pagination 1-32
    59 schema:productId N24542b127019463bb22f8a47b5eeaaa9
    60 N502e000d6d8c4184b96ebea50a0a1a03
    61 Na0e74f4f5fdb4364be095e296b767bc4
    62 schema:sameAs https://app.dimensions.ai/details/publication/pub.1106333352
    63 https://doi.org/10.1007/s11634-018-0329-y
    64 schema:sdDatePublished 2019-04-11T13:59
    65 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    66 schema:sdPublisher Nf1dfbb629b5f4755baa5f99824f30783
    67 schema:url https://link.springer.com/10.1007%2Fs11634-018-0329-y
    68 sgo:license sg:explorer/license/
    69 sgo:sdDataset articles
    70 rdf:type schema:ScholarlyArticle
    71 N0c32284d42124b6ab121d9386e2a5604 rdf:first sg:person.0702362777.46
    72 rdf:rest N3d3359e82b1b444282f06ea1a5dea6ef
    73 N24542b127019463bb22f8a47b5eeaaa9 schema:name doi
    74 schema:value 10.1007/s11634-018-0329-y
    75 rdf:type schema:PropertyValue
    76 N3d3359e82b1b444282f06ea1a5dea6ef rdf:first sg:person.01320037026.26
    77 rdf:rest rdf:nil
    78 N4fbd0d87b94f48bc9c6dc15d97cb4bd7 schema:issueNumber 1
    79 rdf:type schema:PublicationIssue
    80 N502e000d6d8c4184b96ebea50a0a1a03 schema:name readcube_id
    81 schema:value 8e130c67aea04d6bd3339216aadc79cc423583459b63b2399c582376a0af71c7
    82 rdf:type schema:PropertyValue
    83 N6fb82675da034fa7bca2e555c67b35ae schema:volumeNumber 13
    84 rdf:type schema:PublicationVolume
    85 Na0e74f4f5fdb4364be095e296b767bc4 schema:name dimensions_id
    86 schema:value pub.1106333352
    87 rdf:type schema:PropertyValue
    88 Nf1dfbb629b5f4755baa5f99824f30783 schema:name Springer Nature - SN SciGraph project
    89 rdf:type schema:Organization
    90 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
    91 schema:name Mathematical Sciences
    92 rdf:type schema:DefinedTerm
    93 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
    94 schema:name Statistics
    95 rdf:type schema:DefinedTerm
    96 sg:grant.7578981 http://pending.schema.org/fundedItem sg:pub.10.1007/s11634-018-0329-y
    97 rdf:type schema:MonetaryGrant
    98 sg:journal.1045303 schema:issn 1862-5347
    99 1862-5355
    100 schema:name Advances in Data Analysis and Classification
    101 rdf:type schema:Periodical
    102 sg:person.01320037026.26 schema:affiliation https://www.grid.ac/institutes/grid.15788.33
    103 schema:familyName Malsiner-Walli
    104 schema:givenName Gertraud
    105 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01320037026.26
    106 rdf:type schema:Person
    107 sg:person.0702362777.46 schema:affiliation https://www.grid.ac/institutes/grid.15788.33
    108 schema:familyName Frühwirth-Schnatter
    109 schema:givenName Sylvia
    110 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0702362777.46
    111 rdf:type schema:Person
    112 sg:pub.10.1007/978-1-4612-1732-9_1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007108393
    113 https://doi.org/10.1007/978-1-4612-1732-9_1
    114 rdf:type schema:CreativeWork
    115 sg:pub.10.1007/bf00140869 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004597255
    116 https://doi.org/10.1007/bf00140869
    117 rdf:type schema:CreativeWork
    118 sg:pub.10.1007/bf01908075 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022323983
    119 https://doi.org/10.1007/bf01908075
    120 rdf:type schema:CreativeWork
    121 sg:pub.10.1007/s10260-013-0237-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025298908
    122 https://doi.org/10.1007/s10260-013-0237-4
    123 rdf:type schema:CreativeWork
    124 sg:pub.10.1007/s11222-006-5338-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000275112
    125 https://doi.org/10.1007/s11222-006-5338-6
    126 rdf:type schema:CreativeWork
    127 sg:pub.10.1007/s11222-008-9109-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017712707
    128 https://doi.org/10.1007/s11222-008-9109-4
    129 rdf:type schema:CreativeWork
    130 sg:pub.10.1007/s11222-009-9150-y schema:sameAs https://app.dimensions.ai/details/publication/pub.1034800568
    131 https://doi.org/10.1007/s11222-009-9150-y
    132 rdf:type schema:CreativeWork
    133 sg:pub.10.1007/s11222-014-9500-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003696325
    134 https://doi.org/10.1007/s11222-014-9500-2
    135 rdf:type schema:CreativeWork
    136 sg:pub.10.1023/a:1018510926151 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036767102
    137 https://doi.org/10.1023/a:1018510926151
    138 rdf:type schema:CreativeWork
    139 https://app.dimensions.ai/details/publication/pub.1109491899 rdf:type schema:CreativeWork
    140 https://doi.org/10.1002/9781119995678.ch10 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014810147
    141 rdf:type schema:CreativeWork
    142 https://doi.org/10.1016/b978-0-12-589320-6.50018-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031555271
    143 rdf:type schema:CreativeWork
    144 https://doi.org/10.1016/j.csda.2008.03.028 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009988938
    145 rdf:type schema:CreativeWork
    146 https://doi.org/10.1080/01621459.1984.10477093 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058302938
    147 rdf:type schema:CreativeWork
    148 https://doi.org/10.1080/01621459.1995.10476550 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058304833
    149 rdf:type schema:CreativeWork
    150 https://doi.org/10.1080/01621459.2013.829001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058306096
    151 rdf:type schema:CreativeWork
    152 https://doi.org/10.1080/01621459.2016.1255636 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058306653
    153 rdf:type schema:CreativeWork
    154 https://doi.org/10.1080/10485250211383 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013876418
    155 rdf:type schema:CreativeWork
    156 https://doi.org/10.1080/10618600.2016.1200472 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041645640
    157 rdf:type schema:CreativeWork
    158 https://doi.org/10.1093/bioinformatics/bth068 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042662743
    159 rdf:type schema:CreativeWork
    160 https://doi.org/10.1093/biomet/61.2.215 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059418327
    161 rdf:type schema:CreativeWork
    162 https://doi.org/10.1093/biomet/83.4.715 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008663110
    163 rdf:type schema:CreativeWork
    164 https://doi.org/10.1093/biomet/asm086 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059421632
    165 rdf:type schema:CreativeWork
    166 https://doi.org/10.1093/biostatistics/kxp062 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020133500
    167 rdf:type schema:CreativeWork
    168 https://doi.org/10.1109/34.865189 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061157115
    169 rdf:type schema:CreativeWork
    170 https://doi.org/10.1111/1467-9469.00242 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025772916
    171 rdf:type schema:CreativeWork
    172 https://doi.org/10.1111/1467-9868.00095 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011703011
    173 rdf:type schema:CreativeWork
    174 https://doi.org/10.1111/1467-9868.00391 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025959631
    175 rdf:type schema:CreativeWork
    176 https://doi.org/10.1111/1467-9868.00402 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037524261
    177 rdf:type schema:CreativeWork
    178 https://doi.org/10.1111/j.1368-423x.2004.00125.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1005446821
    179 rdf:type schema:CreativeWork
    180 https://doi.org/10.1111/j.1467-9868.2011.00781.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1007193880
    181 rdf:type schema:CreativeWork
    182 https://doi.org/10.1159/000087446 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034731359
    183 rdf:type schema:CreativeWork
    184 https://doi.org/10.1177/1471082x17739058 schema:sameAs https://app.dimensions.ai/details/publication/pub.1100433455
    185 rdf:type schema:CreativeWork
    186 https://doi.org/10.1198/016214501750332758 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064197802
    187 rdf:type schema:CreativeWork
    188 https://doi.org/10.1198/106186007x238855 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064199598
    189 rdf:type schema:CreativeWork
    190 https://doi.org/10.1214/009053604000000788 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064388755
    191 rdf:type schema:CreativeWork
    192 https://doi.org/10.1214/06-ba122 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064389488
    193 rdf:type schema:CreativeWork
    194 https://doi.org/10.1214/13-ba811 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064394069
    195 rdf:type schema:CreativeWork
    196 https://doi.org/10.1214/aos/1176342360 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040652855
    197 rdf:type schema:CreativeWork
    198 https://doi.org/10.1214/aos/1176342752 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064406948
    199 rdf:type schema:CreativeWork
    200 https://doi.org/10.1371/journal.pone.0131739 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050428796
    201 rdf:type schema:CreativeWork
    202 https://doi.org/10.18637/jss.v042.i10 schema:sameAs https://app.dimensions.ai/details/publication/pub.1068672632
    203 rdf:type schema:CreativeWork
    204 https://doi.org/10.2307/2532201 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069977629
    205 rdf:type schema:CreativeWork
    206 https://doi.org/10.2333/bhmk.21.1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050511218
    207 rdf:type schema:CreativeWork
    208 https://www.grid.ac/institutes/grid.15788.33 schema:alternateName Vienna University of Economics and Business
    209 schema:name Institute for Statistics and Mathematics, Vienna University of Economics and Business (WU), Welthandelsplatz 1, 1020, Vienna, Austria
    210 rdf:type schema:Organization
     



