An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization View Full Text


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2019-03-19

AUTHORS

Reshad Hosseini, Suvrit Sra

ABSTRACT

We consider maximum likelihood estimation for Gaussian Mixture Models (Gmm s). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which is its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM grounded in the Riemannian geometry of positive definite matrices, using which we cast Gmm parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry, and derive a better formulation that is much more amenable to Riemannian optimization. We then develop Riemannian batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods. More... »

PAGES

1-37

References to SciGraph publications

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s10107-019-01381-4

DOI

http://dx.doi.org/10.1007/s10107-019-01381-4

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1112877718


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Institute for Research in Fundamental Sciences", 
          "id": "https://www.grid.ac/institutes/grid.418744.a", 
          "name": [
            "School of ECE, College of Engineering, University of Tehran, Tehran, Iran", 
            "School of Computer Science, Institute of Research in Fundamental Sciences (IPM), Tehran, Iran"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hosseini", 
        "givenName": "Reshad", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Massachusetts Institute of Technology", 
          "id": "https://www.grid.ac/institutes/grid.116068.8", 
          "name": [
            "Massachusetts Institute of Technology, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Sra", 
        "givenName": "Suvrit", 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1016/j.laa.2011.10.029", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021543635"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1162/neco.1994.6.2.181", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1028049770"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1162/089976600300014764", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033608189"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-1-4419-9982-5", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034115258", 
          "https://doi.org/10.1007/978-1-4419-9982-5"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-1-4419-9982-5", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034115258", 
          "https://doi.org/10.1007/978-1-4419-9982-5"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1006/dspr.1999.0361", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035295221"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.laa.2009.01.025", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035774209"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1162/neco.1996.8.1.129", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036460383"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-94-015-8390-9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052397284", 
          "https://doi.org/10.1007/978-94-015-8390-9"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-94-015-8390-9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052397284", 
          "https://doi.org/10.1007/978-94-015-8390-9"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tac.2013.2254619", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061478702"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tsp.2012.2218241", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061803530"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1137/080731359", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062855007"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1137/1026034", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062861996"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1137/11082885x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062864769"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1137/110845768", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062865262"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1137/120880811", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062869556"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1137/140978168", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062872544"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1137/s0363012902419977", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062880669"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/sffcs.1999.814639", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1093792530"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/focs.2010.15", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1095406261"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1109491899", 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1109491899", 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.2517-6161.1977.tb01600.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1110458051"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.2517-6161.1977.tb01600.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1110458051"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2019-03-19", 
    "datePublishedReg": "2019-03-19", 
    "description": "We consider maximum likelihood estimation for Gaussian Mixture Models (Gmm s). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which is its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM grounded in the Riemannian geometry of positive definite matrices, using which we cast Gmm parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry, and derive a better formulation that is much more amenable to Riemannian optimization. We then develop Riemannian batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1007/s10107-019-01381-4", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.3849014", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1047630", 
        "issn": [
          "0025-5610", 
          "1436-4646"
        ], 
        "name": "Mathematical Programming", 
        "type": "Periodical"
      }
    ], 
    "name": "An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization", 
    "pagination": "1-37", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "9740bc7942113f64d45f2ba4957db95e0a49e49e751eec38ac0ed01bee428848"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s10107-019-01381-4"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1112877718"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s10107-019-01381-4", 
      "https://app.dimensions.ai/details/publication/pub.1112877718"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-11T12:21", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000362_0000000362/records_87078_00000002.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://link.springer.com/10.1007%2Fs10107-019-01381-4"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10107-019-01381-4'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10107-019-01381-4'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10107-019-01381-4'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10107-019-01381-4'


 

This table displays all metadata directly associated to this object as RDF triples.

130 TRIPLES      21 PREDICATES      45 URIs      16 LITERALS      5 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/s10107-019-01381-4 schema:about anzsrc-for:01
2 anzsrc-for:0104
3 schema:author N6361a1af45c542c7ad2fb00670df3dbc
4 schema:citation sg:pub.10.1007/978-1-4419-9982-5
5 sg:pub.10.1007/978-94-015-8390-9
6 https://app.dimensions.ai/details/publication/pub.1109491899
7 https://doi.org/10.1006/dspr.1999.0361
8 https://doi.org/10.1016/j.laa.2009.01.025
9 https://doi.org/10.1016/j.laa.2011.10.029
10 https://doi.org/10.1109/focs.2010.15
11 https://doi.org/10.1109/sffcs.1999.814639
12 https://doi.org/10.1109/tac.2013.2254619
13 https://doi.org/10.1109/tsp.2012.2218241
14 https://doi.org/10.1111/j.2517-6161.1977.tb01600.x
15 https://doi.org/10.1137/080731359
16 https://doi.org/10.1137/1026034
17 https://doi.org/10.1137/11082885x
18 https://doi.org/10.1137/110845768
19 https://doi.org/10.1137/120880811
20 https://doi.org/10.1137/140978168
21 https://doi.org/10.1137/s0363012902419977
22 https://doi.org/10.1162/089976600300014764
23 https://doi.org/10.1162/neco.1994.6.2.181
24 https://doi.org/10.1162/neco.1996.8.1.129
25 schema:datePublished 2019-03-19
26 schema:datePublishedReg 2019-03-19
27 schema:description We consider maximum likelihood estimation for Gaussian Mixture Models (Gmm s). This task is almost invariably solved (in theory and practice) via the Expectation Maximization (EM) algorithm. EM owes its success to various factors, of which is its ability to fulfill positive definiteness constraints in closed form is of key importance. We propose an alternative to EM grounded in the Riemannian geometry of positive definite matrices, using which we cast Gmm parameter estimation as a Riemannian optimization problem. Surprisingly, such an out-of-the-box Riemannian formulation completely fails and proves much inferior to EM. This motivates us to take a closer look at the problem geometry, and derive a better formulation that is much more amenable to Riemannian optimization. We then develop Riemannian batch and stochastic gradient algorithms that outperform EM, often substantially. We provide a non-asymptotic convergence analysis for our stochastic method, which is also the first (to our knowledge) such global analysis for Riemannian stochastic gradient. Numerous empirical results are included to demonstrate the effectiveness of our methods.
28 schema:genre research_article
29 schema:inLanguage en
30 schema:isAccessibleForFree false
31 schema:isPartOf sg:journal.1047630
32 schema:name An alternative to EM for Gaussian mixture models: batch and stochastic Riemannian optimization
33 schema:pagination 1-37
34 schema:productId N1d7bc7fd55314ebaa4ba1385626352f7
35 N5dd07f93822c456991132779f67ba854
36 Nc4a101be9ef440c493a538c0799595e4
37 schema:sameAs https://app.dimensions.ai/details/publication/pub.1112877718
38 https://doi.org/10.1007/s10107-019-01381-4
39 schema:sdDatePublished 2019-04-11T12:21
40 schema:sdLicense https://scigraph.springernature.com/explorer/license/
41 schema:sdPublisher N0de5aeeedd9548c2942ad8e07cfa302e
42 schema:url https://link.springer.com/10.1007%2Fs10107-019-01381-4
43 sgo:license sg:explorer/license/
44 sgo:sdDataset articles
45 rdf:type schema:ScholarlyArticle
46 N0de5aeeedd9548c2942ad8e07cfa302e schema:name Springer Nature - SN SciGraph project
47 rdf:type schema:Organization
48 N1d7bc7fd55314ebaa4ba1385626352f7 schema:name dimensions_id
49 schema:value pub.1112877718
50 rdf:type schema:PropertyValue
51 N5d3352d0397942ebbdfce0dd44ba0251 schema:affiliation https://www.grid.ac/institutes/grid.116068.8
52 schema:familyName Sra
53 schema:givenName Suvrit
54 rdf:type schema:Person
55 N5dd07f93822c456991132779f67ba854 schema:name readcube_id
56 schema:value 9740bc7942113f64d45f2ba4957db95e0a49e49e751eec38ac0ed01bee428848
57 rdf:type schema:PropertyValue
58 N6361a1af45c542c7ad2fb00670df3dbc rdf:first Na2c4b2d000ba417da9a7808ee9ec4445
59 rdf:rest N7e2a3ee45c684d818347b94ca03c7f22
60 N7e2a3ee45c684d818347b94ca03c7f22 rdf:first N5d3352d0397942ebbdfce0dd44ba0251
61 rdf:rest rdf:nil
62 Na2c4b2d000ba417da9a7808ee9ec4445 schema:affiliation https://www.grid.ac/institutes/grid.418744.a
63 schema:familyName Hosseini
64 schema:givenName Reshad
65 rdf:type schema:Person
66 Nc4a101be9ef440c493a538c0799595e4 schema:name doi
67 schema:value 10.1007/s10107-019-01381-4
68 rdf:type schema:PropertyValue
69 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
70 schema:name Mathematical Sciences
71 rdf:type schema:DefinedTerm
72 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
73 schema:name Statistics
74 rdf:type schema:DefinedTerm
75 sg:grant.3849014 http://pending.schema.org/fundedItem sg:pub.10.1007/s10107-019-01381-4
76 rdf:type schema:MonetaryGrant
77 sg:journal.1047630 schema:issn 0025-5610
78 1436-4646
79 schema:name Mathematical Programming
80 rdf:type schema:Periodical
81 sg:pub.10.1007/978-1-4419-9982-5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034115258
82 https://doi.org/10.1007/978-1-4419-9982-5
83 rdf:type schema:CreativeWork
84 sg:pub.10.1007/978-94-015-8390-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052397284
85 https://doi.org/10.1007/978-94-015-8390-9
86 rdf:type schema:CreativeWork
87 https://app.dimensions.ai/details/publication/pub.1109491899 schema:CreativeWork
88 https://doi.org/10.1006/dspr.1999.0361 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035295221
89 rdf:type schema:CreativeWork
90 https://doi.org/10.1016/j.laa.2009.01.025 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035774209
91 rdf:type schema:CreativeWork
92 https://doi.org/10.1016/j.laa.2011.10.029 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021543635
93 rdf:type schema:CreativeWork
94 https://doi.org/10.1109/focs.2010.15 schema:sameAs https://app.dimensions.ai/details/publication/pub.1095406261
95 rdf:type schema:CreativeWork
96 https://doi.org/10.1109/sffcs.1999.814639 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093792530
97 rdf:type schema:CreativeWork
98 https://doi.org/10.1109/tac.2013.2254619 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061478702
99 rdf:type schema:CreativeWork
100 https://doi.org/10.1109/tsp.2012.2218241 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061803530
101 rdf:type schema:CreativeWork
102 https://doi.org/10.1111/j.2517-6161.1977.tb01600.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1110458051
103 rdf:type schema:CreativeWork
104 https://doi.org/10.1137/080731359 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062855007
105 rdf:type schema:CreativeWork
106 https://doi.org/10.1137/1026034 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062861996
107 rdf:type schema:CreativeWork
108 https://doi.org/10.1137/11082885x schema:sameAs https://app.dimensions.ai/details/publication/pub.1062864769
109 rdf:type schema:CreativeWork
110 https://doi.org/10.1137/110845768 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062865262
111 rdf:type schema:CreativeWork
112 https://doi.org/10.1137/120880811 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062869556
113 rdf:type schema:CreativeWork
114 https://doi.org/10.1137/140978168 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062872544
115 rdf:type schema:CreativeWork
116 https://doi.org/10.1137/s0363012902419977 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062880669
117 rdf:type schema:CreativeWork
118 https://doi.org/10.1162/089976600300014764 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033608189
119 rdf:type schema:CreativeWork
120 https://doi.org/10.1162/neco.1994.6.2.181 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028049770
121 rdf:type schema:CreativeWork
122 https://doi.org/10.1162/neco.1996.8.1.129 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036460383
123 rdf:type schema:CreativeWork
124 https://www.grid.ac/institutes/grid.116068.8 schema:alternateName Massachusetts Institute of Technology
125 schema:name Massachusetts Institute of Technology, Cambridge, MA, USA
126 rdf:type schema:Organization
127 https://www.grid.ac/institutes/grid.418744.a schema:alternateName Institute for Research in Fundamental Sciences
128 schema:name School of Computer Science, Institute of Research in Fundamental Sciences (IPM), Tehran, Iran
129 School of ECE, College of Engineering, University of Tehran, Tehran, Iran
130 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...