A Stochastic Subgradient Method for Distributionally Robust Non-convex and Non-smooth Learning


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2022-07-08

AUTHORS

Mert Gürbüzbalaban, Andrzej Ruszczyński, Landi Zhu

ABSTRACT

We consider a distributionally robust formulation of stochastic optimization problems arising in statistical learning, where robustness is with respect to ambiguity in the underlying data distribution. Our formulation builds on risk-averse optimization techniques and the theory of coherent risk measures. It uses mean–semideviation risk for quantifying uncertainty, allowing us to compute solutions that are robust against perturbations in the population data distribution. We consider a broad class of generalized differentiable loss functions that can be non-convex and non-smooth, involving upward and downward cusps, and we develop an efficient stochastic subgradient method for distributionally robust problems with such functions. We prove that it converges to a point satisfying the optimality conditions. To our knowledge, this is the first method with rigorous convergence guarantees in the context of generalized differentiable non-convex and non-smooth distributionally robust stochastic optimization. Our method allows for the control of the desired level of robustness with little extra computational cost compared to population risk minimization with stochastic gradient methods. We also illustrate the performance of our algorithm on real datasets arising in convex and non-convex supervised learning problems.
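The abstract names mean–semideviation risk without stating a formula. As a rough illustration only, the sketch below evaluates the standard first-order mean-upper-semideviation, rho[Z] = E[Z] + kappa * E[max(0, Z - E[Z])] with kappa in [0, 1], on a finite sample of losses. The choice of the first-order form, the parameter name kappa, and the function name are assumptions; the article itself may use a different order or notation.

```python
from statistics import fmean


def mean_semideviation(losses, kappa):
    """Empirical first-order mean-upper-semideviation of a loss sample.

    Assumed form: mean + kappa * mean(max(0, loss - mean)).
    kappa = 0 reduces to the plain average loss (no risk aversion).
    """
    if not 0.0 <= kappa <= 1.0:
        # For the first-order measure, coherence requires kappa in [0, 1].
        raise ValueError("kappa must lie in [0, 1]")
    losses = list(losses)
    mean = fmean(losses)
    # Only losses above the mean are penalised (upper semideviation).
    upper_semideviation = fmean([max(0.0, z - mean) for z in losses])
    return mean + kappa * upper_semideviation


if __name__ == "__main__":
    sample = [1.0, 2.0, 3.0]
    print(mean_semideviation(sample, 0.0))
    print(mean_semideviation(sample, 0.5))
```

Raising kappa puts more weight on above-average losses, which is the sense in which such a parameter would control the level of robustness.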

PAGES

1014-1041

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s10957-022-02063-6

DOI

http://dx.doi.org/10.1007/s10957-022-02063-6

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1149336045



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0102", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Applied Mathematics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0103", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Numerical and Computational Mathematics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Rutgers University, 08550, Piscataway, NJ, USA", 
          "id": "http://www.grid.ac/institutes/grid.430387.b", 
          "name": [
            "Rutgers University, 08550, Piscataway, NJ, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "G\u00fcrb\u00fczbalaban", 
        "givenName": "Mert", 
        "id": "sg:person.07375135557.45", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07375135557.45"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Rutgers University, 08550, Piscataway, NJ, USA", 
          "id": "http://www.grid.ac/institutes/grid.430387.b", 
          "name": [
            "Rutgers University, 08550, Piscataway, NJ, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ruszczy\u0144ski", 
        "givenName": "Andrzej", 
        "id": "sg:person.013275445013.36", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013275445013.36"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Rutgers University, 08550, Piscataway, NJ, USA", 
          "id": "http://www.grid.ac/institutes/grid.430387.b", 
          "name": [
            "Rutgers University, 08550, Piscataway, NJ, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhu", 
        "givenName": "Landi", 
        "id": "sg:person.012757542543.48", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012757542543.48"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/pl00011396", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046930638", 
          "https://doi.org/10.1007/pl00011396"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10463-016-0559-8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017939837", 
          "https://doi.org/10.1007/s10463-016-0559-8"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10107-013-0681-9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018210674", 
          "https://doi.org/10.1007/s10107-013-0681-9"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf01099354", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1028538768", 
          "https://doi.org/10.1007/bf01099354"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10107-017-1172-1", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1090376560", 
          "https://doi.org/10.1007/s10107-017-1172-1"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s11590-020-01537-8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1124234453", 
          "https://doi.org/10.1007/s11590-020-01537-8"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10107-016-1017-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039419824", 
          "https://doi.org/10.1007/s10107-016-1017-3"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2022-07-08", 
    "datePublishedReg": "2022-07-08", 
    "description": "We consider a distributionally robust formulation of stochastic optimization problems arising in statistical learning, where robustness is with respect to ambiguity in the underlying data distribution. Our formulation builds on risk-averse optimization techniques and the theory of coherent risk measures. It uses mean\u2013semideviation risk for quantifying uncertainty, allowing us to compute solutions that are robust against perturbations in the population data distribution. We consider a broad class of generalized differentiable loss functions that can be non-convex and non-smooth, involving upward and downward cusps, and we develop an efficient stochastic subgradient method for distributionally robust problems with such functions. We prove that it converges to a point satisfying the optimality conditions. To our knowledge, this is the first method with rigorous convergence guarantees in the context of generalized differentiable non-convex and non-smooth distributionally robust stochastic optimization. Our method allows for the control of the desired level of robustness with little extra computational cost compared to population risk minimization with stochastic gradient methods. We also illustrate the performance of our algorithm on real datasets arising in convex and non-convex supervised learning problems.", 
    "genre": "article", 
    "id": "sg:pub.10.1007/s10957-022-02063-6", 
    "isAccessibleForFree": false, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.9621398", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.7671894", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1044187", 
        "issn": [
          "0022-3239", 
          "1573-2878"
        ], 
        "name": "Journal of Optimization Theory and Applications", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "3", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "194"
      }
    ], 
    "keywords": [
      "stochastic subgradient method", 
      "subgradient method", 
      "rigorous convergence guarantees", 
      "robust stochastic optimization", 
      "stochastic optimization problem", 
      "stochastic gradient method", 
      "coherent risk measures", 
      "extra computational cost", 
      "stochastic optimization", 
      "differentiable loss function", 
      "optimality conditions", 
      "optimization problem", 
      "convergence guarantees", 
      "robust problem", 
      "level of robustness", 
      "little extra computational cost", 
      "robust formulation", 
      "optimization techniques", 
      "gradient method", 
      "quantifying uncertainty", 
      "statistical learning", 
      "broad class", 
      "computational cost", 
      "data distribution", 
      "risk minimization", 
      "risk measures", 
      "loss function", 
      "such functions", 
      "downward cusp", 
      "real datasets", 
      "first method", 
      "learning problem", 
      "problem", 
      "robustness", 
      "formulation", 
      "convex", 
      "minimization", 
      "optimization", 
      "perturbations", 
      "theory", 
      "distribution", 
      "algorithm", 
      "uncertainty", 
      "guarantees", 
      "solution", 
      "function", 
      "class", 
      "cusp", 
      "point", 
      "technique", 
      "respect", 
      "performance", 
      "dataset", 
      "ambiguity", 
      "conditions", 
      "control", 
      "cost", 
      "learning", 
      "measures", 
      "context", 
      "knowledge", 
      "levels", 
      "risk", 
      "method"
    ], 
    "name": "A Stochastic Subgradient Method for Distributionally Robust Non-convex and Non-smooth Learning", 
    "pagination": "1014-1041", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1149336045"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s10957-022-02063-6"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s10957-022-02063-6", 
      "https://app.dimensions.ai/details/publication/pub.1149336045"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-11-24T21:10", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221124/entities/gbq_results/article/article_953.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1007/s10957-022-02063-6"
  }
]
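A record of this shape can be reduced to a short citation with the standard library alone. The sketch below is a minimal illustration on a trimmed copy of the record above (only the fields it reads are kept); the function name `summarize` is hypothetical, and the code assumes that the entries of "isPartOf" can be told apart by their "type" value, as they can here.

```python
import json

# Trimmed copy of the JSON-LD record above; all other fields are omitted.
RECORD_JSON = r"""
[
  {
    "name": "A Stochastic Subgradient Method for Distributionally Robust Non-convex and Non-smooth Learning",
    "author": [
      {"familyName": "G\u00fcrb\u00fczbalaban", "givenName": "Mert"},
      {"familyName": "Ruszczy\u0144ski", "givenName": "Andrzej"},
      {"familyName": "Zhu", "givenName": "Landi"}
    ],
    "isPartOf": [
      {"name": "Journal of Optimization Theory and Applications", "type": "Periodical"},
      {"issueNumber": "3", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "194"}
    ],
    "pagination": "1014-1041"
  }
]
"""


def summarize(record):
    """Collect title, authors, journal, volume, issue and pages from one record."""
    # Index the isPartOf entries by their "type" so each can be looked up directly.
    parts = {part["type"]: part for part in record.get("isPartOf", [])}
    return {
        "title": record["name"],
        "authors": [f'{a["givenName"]} {a["familyName"]}' for a in record["author"]],
        "journal": parts.get("Periodical", {}).get("name"),
        "volume": parts.get("PublicationVolume", {}).get("volumeNumber"),
        "issue": parts.get("PublicationIssue", {}).get("issueNumber"),
        "pages": record.get("pagination"),
    }


if __name__ == "__main__":
    # The top level of the document is a list holding a single record.
    summary = summarize(json.loads(RECORD_JSON)[0])
    print(summary)
```

Note that volume and issue numbers are strings in the record, so they are returned as strings rather than converted to integers.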
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10957-022-02063-6'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10957-022-02063-6'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10957-022-02063-6'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10957-022-02063-6'
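The same content negotiation can be done from Python. The sketch below mirrors the curl calls above using only the standard library; the helper names `build_request` and `fetch_jsonld` are hypothetical, and whether the endpoint is still reachable has not been verified here.

```python
import json
import urllib.request

# URL prefix taken from the curl examples above.
BASE_URL = "https://scigraph.springernature.com/pub."


def build_request(doi, mime_type="application/ld+json"):
    """Build a GET request whose Accept header selects the RDF serialisation."""
    return urllib.request.Request(BASE_URL + doi, headers={"Accept": mime_type})


def fetch_jsonld(doi, timeout=30):
    """Download and parse the JSON-LD form of a record (performs network I/O)."""
    with urllib.request.urlopen(build_request(doi), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    request = build_request("10.1007/s10957-022-02063-6", "text/turtle")
    print(request.full_url, request.get_header("Accept"))
```

Passing "application/n-triples", "text/turtle" or "application/rdf+xml" as the second argument corresponds to the other three curl commands.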


 

This table displays all metadata directly associated to this object as RDF triples.

175 TRIPLES      21 PREDICATES      97 URIs      80 LITERALS      6 BLANK NODES

Row Subject Predicate Object (a row that omits the subject or predicate repeats the one from the row above)
1 sg:pub.10.1007/s10957-022-02063-6 schema:about anzsrc-for:01
2 anzsrc-for:0102
3 anzsrc-for:0103
4 anzsrc-for:0104
5 schema:author Nb86b9e9afe0744aa8f5fc43e7d37ad02
6 schema:citation sg:pub.10.1007/bf01099354
7 sg:pub.10.1007/pl00011396
8 sg:pub.10.1007/s10107-013-0681-9
9 sg:pub.10.1007/s10107-016-1017-3
10 sg:pub.10.1007/s10107-017-1172-1
11 sg:pub.10.1007/s10463-016-0559-8
12 sg:pub.10.1007/s11590-020-01537-8
13 schema:datePublished 2022-07-08
14 schema:datePublishedReg 2022-07-08
15 schema:description We consider a distributionally robust formulation of stochastic optimization problems arising in statistical learning, where robustness is with respect to ambiguity in the underlying data distribution. Our formulation builds on risk-averse optimization techniques and the theory of coherent risk measures. It uses mean–semideviation risk for quantifying uncertainty, allowing us to compute solutions that are robust against perturbations in the population data distribution. We consider a broad class of generalized differentiable loss functions that can be non-convex and non-smooth, involving upward and downward cusps, and we develop an efficient stochastic subgradient method for distributionally robust problems with such functions. We prove that it converges to a point satisfying the optimality conditions. To our knowledge, this is the first method with rigorous convergence guarantees in the context of generalized differentiable non-convex and non-smooth distributionally robust stochastic optimization. Our method allows for the control of the desired level of robustness with little extra computational cost compared to population risk minimization with stochastic gradient methods. We also illustrate the performance of our algorithm on real datasets arising in convex and non-convex supervised learning problems.
16 schema:genre article
17 schema:isAccessibleForFree false
18 schema:isPartOf N35feb7668b004b0e9688dfa496fee145
19 Ndbfae89776b44e22a994515d7d861d8d
20 sg:journal.1044187
21 schema:keywords algorithm
22 ambiguity
23 broad class
24 class
25 coherent risk measures
26 computational cost
27 conditions
28 context
29 control
30 convergence guarantees
31 convex
32 cost
33 cusp
34 data distribution
35 dataset
36 differentiable loss function
37 distribution
38 downward cusp
39 extra computational cost
40 first method
41 formulation
42 function
43 gradient method
44 guarantees
45 knowledge
46 learning
47 learning problem
48 level of robustness
49 levels
50 little extra computational cost
51 loss function
52 measures
53 method
54 minimization
55 optimality conditions
56 optimization
57 optimization problem
58 optimization techniques
59 performance
60 perturbations
61 point
62 problem
63 quantifying uncertainty
64 real datasets
65 respect
66 rigorous convergence guarantees
67 risk
68 risk measures
69 risk minimization
70 robust formulation
71 robust problem
72 robust stochastic optimization
73 robustness
74 solution
75 statistical learning
76 stochastic gradient method
77 stochastic optimization
78 stochastic optimization problem
79 stochastic subgradient method
80 subgradient method
81 such functions
82 technique
83 theory
84 uncertainty
85 schema:name A Stochastic Subgradient Method for Distributionally Robust Non-convex and Non-smooth Learning
86 schema:pagination 1014-1041
87 schema:productId N87a090b9ebd14e4a81efec15641179c5
88 Nb7f5b31c899142fa9dd63c39424467fb
89 schema:sameAs https://app.dimensions.ai/details/publication/pub.1149336045
90 https://doi.org/10.1007/s10957-022-02063-6
91 schema:sdDatePublished 2022-11-24T21:10
92 schema:sdLicense https://scigraph.springernature.com/explorer/license/
93 schema:sdPublisher Nbf031f8a25db403cb6fda9f033df3966
94 schema:url https://doi.org/10.1007/s10957-022-02063-6
95 sgo:license sg:explorer/license/
96 sgo:sdDataset articles
97 rdf:type schema:ScholarlyArticle
98 N35feb7668b004b0e9688dfa496fee145 schema:volumeNumber 194
99 rdf:type schema:PublicationVolume
100 N6ff79af63be14adfaca71d0c31101e58 rdf:first sg:person.013275445013.36
101 rdf:rest Ne7ac94a5383445e489a78ecfc4f1d3b5
102 N87a090b9ebd14e4a81efec15641179c5 schema:name doi
103 schema:value 10.1007/s10957-022-02063-6
104 rdf:type schema:PropertyValue
105 Nb7f5b31c899142fa9dd63c39424467fb schema:name dimensions_id
106 schema:value pub.1149336045
107 rdf:type schema:PropertyValue
108 Nb86b9e9afe0744aa8f5fc43e7d37ad02 rdf:first sg:person.07375135557.45
109 rdf:rest N6ff79af63be14adfaca71d0c31101e58
110 Nbf031f8a25db403cb6fda9f033df3966 schema:name Springer Nature - SN SciGraph project
111 rdf:type schema:Organization
112 Ndbfae89776b44e22a994515d7d861d8d schema:issueNumber 3
113 rdf:type schema:PublicationIssue
114 Ne7ac94a5383445e489a78ecfc4f1d3b5 rdf:first sg:person.012757542543.48
115 rdf:rest rdf:nil
116 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
117 schema:name Mathematical Sciences
118 rdf:type schema:DefinedTerm
119 anzsrc-for:0102 schema:inDefinedTermSet anzsrc-for:
120 schema:name Applied Mathematics
121 rdf:type schema:DefinedTerm
122 anzsrc-for:0103 schema:inDefinedTermSet anzsrc-for:
123 schema:name Numerical and Computational Mathematics
124 rdf:type schema:DefinedTerm
125 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
126 schema:name Statistics
127 rdf:type schema:DefinedTerm
128 sg:grant.7671894 http://pending.schema.org/fundedItem sg:pub.10.1007/s10957-022-02063-6
129 rdf:type schema:MonetaryGrant
130 sg:grant.9621398 http://pending.schema.org/fundedItem sg:pub.10.1007/s10957-022-02063-6
131 rdf:type schema:MonetaryGrant
132 sg:journal.1044187 schema:issn 0022-3239
133 1573-2878
134 schema:name Journal of Optimization Theory and Applications
135 schema:publisher Springer Nature
136 rdf:type schema:Periodical
137 sg:person.012757542543.48 schema:affiliation grid-institutes:grid.430387.b
138 schema:familyName Zhu
139 schema:givenName Landi
140 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012757542543.48
141 rdf:type schema:Person
142 sg:person.013275445013.36 schema:affiliation grid-institutes:grid.430387.b
143 schema:familyName Ruszczyński
144 schema:givenName Andrzej
145 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013275445013.36
146 rdf:type schema:Person
147 sg:person.07375135557.45 schema:affiliation grid-institutes:grid.430387.b
148 schema:familyName Gürbüzbalaban
149 schema:givenName Mert
150 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07375135557.45
151 rdf:type schema:Person
152 sg:pub.10.1007/bf01099354 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028538768
153 https://doi.org/10.1007/bf01099354
154 rdf:type schema:CreativeWork
155 sg:pub.10.1007/pl00011396 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046930638
156 https://doi.org/10.1007/pl00011396
157 rdf:type schema:CreativeWork
158 sg:pub.10.1007/s10107-013-0681-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018210674
159 https://doi.org/10.1007/s10107-013-0681-9
160 rdf:type schema:CreativeWork
161 sg:pub.10.1007/s10107-016-1017-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039419824
162 https://doi.org/10.1007/s10107-016-1017-3
163 rdf:type schema:CreativeWork
164 sg:pub.10.1007/s10107-017-1172-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1090376560
165 https://doi.org/10.1007/s10107-017-1172-1
166 rdf:type schema:CreativeWork
167 sg:pub.10.1007/s10463-016-0559-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017939837
168 https://doi.org/10.1007/s10463-016-0559-8
169 rdf:type schema:CreativeWork
170 sg:pub.10.1007/s11590-020-01537-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1124234453
171 https://doi.org/10.1007/s11590-020-01537-8
172 rdf:type schema:CreativeWork
173 grid-institutes:grid.430387.b schema:alternateName Rutgers University, 08550, Piscataway, NJ, USA
174 schema:name Rutgers University, 08550, Piscataway, NJ, USA
175 rdf:type schema:Organization
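In the table, author order is not stored on the article itself: schema:author points at a blank node, and the rdf:first / rdf:rest rows chain the authors together until rdf:nil. The sketch below walks that chain to recover the order. It is a minimal illustration using plain tuples copied from rows 5, 100-101, 108-109 and 114-115; the function name `rdf_list` is hypothetical and no RDF library is involved.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate, object) tuples copied from the table above.
TRIPLES = [
    ("sg:pub.10.1007/s10957-022-02063-6", "schema:author", "Nb86b9e9afe0744aa8f5fc43e7d37ad02"),
    ("N6ff79af63be14adfaca71d0c31101e58", "rdf:first", "sg:person.013275445013.36"),
    ("N6ff79af63be14adfaca71d0c31101e58", "rdf:rest", "Ne7ac94a5383445e489a78ecfc4f1d3b5"),
    ("Nb86b9e9afe0744aa8f5fc43e7d37ad02", "rdf:first", "sg:person.07375135557.45"),
    ("Nb86b9e9afe0744aa8f5fc43e7d37ad02", "rdf:rest", "N6ff79af63be14adfaca71d0c31101e58"),
    ("Ne7ac94a5383445e489a78ecfc4f1d3b5", "rdf:first", "sg:person.012757542543.48"),
    ("Ne7ac94a5383445e489a78ecfc4f1d3b5", "rdf:rest", RDF_NIL),
]


def rdf_list(triples, head):
    """Return the items of the RDF collection that starts at node `head`."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        # Guard against cycles and against nodes missing either link.
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"malformed RDF list at node {node!r}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


if __name__ == "__main__":
    # The list head is the object of the article's schema:author triple.
    head = next(o for s, p, o in TRIPLES if p == "schema:author")
    for person in rdf_list(TRIPLES, head):
        print(person)
```

The recovered order matches the author order of the JSON-LD block, which lists the same three person identifiers in an ordinary array.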
 



