A Stochastic Subgradient Method for Distributionally Robust Non-convex and Non-smooth Learning


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2022-07-08

AUTHORS

Mert Gürbüzbalaban, Andrzej Ruszczyński, Landi Zhu

ABSTRACT

We consider a distributionally robust formulation of stochastic optimization problems arising in statistical learning, where robustness is with respect to ambiguity in the underlying data distribution. Our formulation builds on risk-averse optimization techniques and the theory of coherent risk measures. It uses mean–semideviation risk for quantifying uncertainty, allowing us to compute solutions that are robust against perturbations in the population data distribution. We consider a broad class of generalized differentiable loss functions that can be non-convex and non-smooth, involving upward and downward cusps, and we develop an efficient stochastic subgradient method for distributionally robust problems with such functions. We prove that it converges to a point satisfying the optimality conditions. To our knowledge, this is the first method with rigorous convergence guarantees in the context of generalized differentiable non-convex and non-smooth distributionally robust stochastic optimization. Our method allows for the control of the desired level of robustness with little extra computational cost compared to population risk minimization with stochastic gradient methods. We also illustrate the performance of our algorithm on real datasets arising in convex and non-convex supervised learning problems.

PAGES

1014-1041

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s10957-022-02063-6

DOI

http://dx.doi.org/10.1007/s10957-022-02063-6

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1149336045



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0102", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Applied Mathematics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0103", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Numerical and Computational Mathematics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Rutgers University, 08550, Piscataway, NJ, USA", 
          "id": "http://www.grid.ac/institutes/grid.430387.b", 
          "name": [
            "Rutgers University, 08550, Piscataway, NJ, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "G\u00fcrb\u00fczbalaban", 
        "givenName": "Mert", 
        "id": "sg:person.07375135557.45", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07375135557.45"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Rutgers University, 08550, Piscataway, NJ, USA", 
          "id": "http://www.grid.ac/institutes/grid.430387.b", 
          "name": [
            "Rutgers University, 08550, Piscataway, NJ, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ruszczy\u0144ski", 
        "givenName": "Andrzej", 
        "id": "sg:person.013275445013.36", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013275445013.36"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Rutgers University, 08550, Piscataway, NJ, USA", 
          "id": "http://www.grid.ac/institutes/grid.430387.b", 
          "name": [
            "Rutgers University, 08550, Piscataway, NJ, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhu", 
        "givenName": "Landi", 
        "id": "sg:person.012757542543.48", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012757542543.48"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/s10463-016-0559-8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017939837", 
          "https://doi.org/10.1007/s10463-016-0559-8"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10107-013-0681-9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018210674", 
          "https://doi.org/10.1007/s10107-013-0681-9"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/pl00011396", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046930638", 
          "https://doi.org/10.1007/pl00011396"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s11590-020-01537-8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1124234453", 
          "https://doi.org/10.1007/s11590-020-01537-8"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10107-017-1172-1", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1090376560", 
          "https://doi.org/10.1007/s10107-017-1172-1"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10107-016-1017-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039419824", 
          "https://doi.org/10.1007/s10107-016-1017-3"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf01099354", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1028538768", 
          "https://doi.org/10.1007/bf01099354"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2022-07-08", 
    "datePublishedReg": "2022-07-08", 
    "description": "We consider a distributionally robust formulation of stochastic optimization problems arising in statistical learning, where robustness is with respect to ambiguity in the underlying data distribution. Our formulation builds on risk-averse optimization techniques and the theory of coherent risk measures. It uses mean\u2013semideviation risk for quantifying uncertainty, allowing us to compute solutions that are robust against perturbations in the population data distribution. We consider a broad class of generalized differentiable loss functions that can be non-convex and non-smooth, involving upward and downward cusps, and we develop an efficient stochastic subgradient method for distributionally robust problems with such functions. We prove that it converges to a point satisfying the optimality conditions. To our knowledge, this is the first method with rigorous convergence guarantees in the context of generalized differentiable non-convex and non-smooth distributionally robust stochastic optimization. Our method allows for the control of the desired level of robustness with little extra computational cost compared to population risk minimization with stochastic gradient methods. We also illustrate the performance of our algorithm on real datasets arising in convex and non-convex supervised learning problems.", 
    "genre": "article", 
    "id": "sg:pub.10.1007/s10957-022-02063-6", 
    "isAccessibleForFree": false, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.9621398", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.7671894", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1044187", 
        "issn": [
          "0022-3239", 
          "1573-2878"
        ], 
        "name": "Journal of Optimization Theory and Applications", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "3", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "194"
      }
    ], 
    "keywords": [
      "stochastic subgradient method", 
      "subgradient method", 
      "rigorous convergence guarantees", 
      "robust stochastic optimization", 
      "stochastic optimization problem", 
      "stochastic gradient method", 
      "coherent risk measures", 
      "extra computational cost", 
      "stochastic optimization", 
      "differentiable loss function", 
      "optimality conditions", 
      "optimization problem", 
      "convergence guarantees", 
      "robust problem", 
      "level of robustness", 
      "little extra computational cost", 
      "robust formulation", 
      "optimization techniques", 
      "gradient method", 
      "quantifying uncertainty", 
      "statistical learning", 
      "broad class", 
      "computational cost", 
      "data distribution", 
      "risk minimization", 
      "risk measures", 
      "loss function", 
      "such functions", 
      "downward cusp", 
      "real datasets", 
      "first method", 
      "learning problem", 
      "problem", 
      "robustness", 
      "formulation", 
      "convex", 
      "minimization", 
      "optimization", 
      "perturbations", 
      "theory", 
      "distribution", 
      "algorithm", 
      "uncertainty", 
      "guarantees", 
      "solution", 
      "function", 
      "class", 
      "cusp", 
      "point", 
      "technique", 
      "respect", 
      "performance", 
      "dataset", 
      "ambiguity", 
      "conditions", 
      "control", 
      "cost", 
      "learning", 
      "measures", 
      "context", 
      "knowledge", 
      "levels", 
      "risk", 
      "method"
    ], 
    "name": "A Stochastic Subgradient Method for Distributionally Robust Non-convex and Non-smooth Learning", 
    "pagination": "1014-1041", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1149336045"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s10957-022-02063-6"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s10957-022-02063-6", 
      "https://app.dimensions.ai/details/publication/pub.1149336045"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-09-02T16:08", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220902/entities/gbq_results/article/article_940.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1007/s10957-022-02063-6"
  }
]
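Because the record is plain JSON, it can be read with any JSON parser. The following is a minimal sketch, not an official SciGraph client: `record_text` is a hand-trimmed copy of the record above that keeps only the fields used here, and `summarize` is a hypothetical helper name introduced for illustration.

```python
import json

# Hand-trimmed copy of the record above (assumption: only these fields are needed).
# A raw string keeps the \uXXXX escapes intact so the JSON parser decodes them.
record_text = r"""
[
  {
    "name": "A Stochastic Subgradient Method for Distributionally Robust Non-convex and Non-smooth Learning",
    "pagination": "1014-1041",
    "author": [
      {"familyName": "G\u00fcrb\u00fczbalaban", "givenName": "Mert"},
      {"familyName": "Ruszczy\u0144ski", "givenName": "Andrzej"},
      {"familyName": "Zhu", "givenName": "Landi"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1149336045"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/s10957-022-02063-6"]}
    ]
  }
]
"""


def summarize(text):
    """Return title, author names, DOI, and page range from a SciGraph article record."""
    # The page wraps the record in a one-element JSON array.
    record = json.loads(text)[0]
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # Each productId entry stores its value as a one-element list.
    ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {
        "title": record["name"],
        "authors": authors,
        "doi": ids.get("doi"),
        "pages": record["pagination"],
    }


summary = summarize(record_text)
print(summary["authors"])
print(summary["doi"])
```

The sketch treats the document as ordinary JSON and ignores the `@context`; a full JSON-LD processor would be needed to expand the short property names into IRIs.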
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10957-022-02063-6'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10957-022-02063-6'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10957-022-02063-6'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10957-022-02063-6'
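The four commands above differ only in the `Accept` header, which is ordinary HTTP content negotiation. As a rough sketch of the same idea in Python, the snippet below builds (but does not send) a request with the standard library; the `ACCEPT` table and the `build_request` helper are illustrative names, not part of any SciGraph API.

```python
from urllib.request import Request

# Record URL taken from the curl commands above.
BASE = "https://scigraph.springernature.com/pub.10.1007/s10957-022-02063-6"

# Format label -> media type, mirroring the four curl examples.
# The labels on the left are arbitrary names chosen for this sketch.
ACCEPT = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE):
    """Build a GET request that asks the server for the given RDF serialization."""
    if fmt not in ACCEPT:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(ACCEPT)}")
    return Request(url, headers={"Accept": ACCEPT[fmt]})


req = build_request("turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual download; whether the endpoint still responds is not something this sketch can guarantee.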


 

This table displays all metadata directly associated with this object as RDF triples.

175 triples, 21 predicates, 97 URIs, 80 literals, 6 blank nodes

Subject Predicate Object
1 sg:pub.10.1007/s10957-022-02063-6 schema:about anzsrc-for:01
2 anzsrc-for:0102
3 anzsrc-for:0103
4 anzsrc-for:0104
5 schema:author N129d79d765474b349566532947fed91f
6 schema:citation sg:pub.10.1007/bf01099354
7 sg:pub.10.1007/pl00011396
8 sg:pub.10.1007/s10107-013-0681-9
9 sg:pub.10.1007/s10107-016-1017-3
10 sg:pub.10.1007/s10107-017-1172-1
11 sg:pub.10.1007/s10463-016-0559-8
12 sg:pub.10.1007/s11590-020-01537-8
13 schema:datePublished 2022-07-08
14 schema:datePublishedReg 2022-07-08
15 schema:description We consider a distributionally robust formulation of stochastic optimization problems arising in statistical learning, where robustness is with respect to ambiguity in the underlying data distribution. Our formulation builds on risk-averse optimization techniques and the theory of coherent risk measures. It uses mean–semideviation risk for quantifying uncertainty, allowing us to compute solutions that are robust against perturbations in the population data distribution. We consider a broad class of generalized differentiable loss functions that can be non-convex and non-smooth, involving upward and downward cusps, and we develop an efficient stochastic subgradient method for distributionally robust problems with such functions. We prove that it converges to a point satisfying the optimality conditions. To our knowledge, this is the first method with rigorous convergence guarantees in the context of generalized differentiable non-convex and non-smooth distributionally robust stochastic optimization. Our method allows for the control of the desired level of robustness with little extra computational cost compared to population risk minimization with stochastic gradient methods. We also illustrate the performance of our algorithm on real datasets arising in convex and non-convex supervised learning problems.
16 schema:genre article
17 schema:isAccessibleForFree false
18 schema:isPartOf Nd0bd80585d504c1887e8d5a7c7b87e00
19 Ndf5b783f2f97411ebb552dc589ebd2a7
20 sg:journal.1044187
21 schema:keywords algorithm
22 ambiguity
23 broad class
24 class
25 coherent risk measures
26 computational cost
27 conditions
28 context
29 control
30 convergence guarantees
31 convex
32 cost
33 cusp
34 data distribution
35 dataset
36 differentiable loss function
37 distribution
38 downward cusp
39 extra computational cost
40 first method
41 formulation
42 function
43 gradient method
44 guarantees
45 knowledge
46 learning
47 learning problem
48 level of robustness
49 levels
50 little extra computational cost
51 loss function
52 measures
53 method
54 minimization
55 optimality conditions
56 optimization
57 optimization problem
58 optimization techniques
59 performance
60 perturbations
61 point
62 problem
63 quantifying uncertainty
64 real datasets
65 respect
66 rigorous convergence guarantees
67 risk
68 risk measures
69 risk minimization
70 robust formulation
71 robust problem
72 robust stochastic optimization
73 robustness
74 solution
75 statistical learning
76 stochastic gradient method
77 stochastic optimization
78 stochastic optimization problem
79 stochastic subgradient method
80 subgradient method
81 such functions
82 technique
83 theory
84 uncertainty
85 schema:name A Stochastic Subgradient Method for Distributionally Robust Non-convex and Non-smooth Learning
86 schema:pagination 1014-1041
87 schema:productId Na2f45fb6e9c44dd8addb3c331875ad1e
88 Nbfe721a0dacb465398978ae55d40eec1
89 schema:sameAs https://app.dimensions.ai/details/publication/pub.1149336045
90 https://doi.org/10.1007/s10957-022-02063-6
91 schema:sdDatePublished 2022-09-02T16:08
92 schema:sdLicense https://scigraph.springernature.com/explorer/license/
93 schema:sdPublisher N4d2ef2d290e748e3951c5a44454a3216
94 schema:url https://doi.org/10.1007/s10957-022-02063-6
95 sgo:license sg:explorer/license/
96 sgo:sdDataset articles
97 rdf:type schema:ScholarlyArticle
98 N129d79d765474b349566532947fed91f rdf:first sg:person.07375135557.45
99 rdf:rest N6a380c95ee284a59a027ceefe7ff2906
100 N4d2ef2d290e748e3951c5a44454a3216 schema:name Springer Nature - SN SciGraph project
101 rdf:type schema:Organization
102 N6a380c95ee284a59a027ceefe7ff2906 rdf:first sg:person.013275445013.36
103 rdf:rest Nf5729365b3664e4f96a6c80e3ccb373c
104 Na2f45fb6e9c44dd8addb3c331875ad1e schema:name dimensions_id
105 schema:value pub.1149336045
106 rdf:type schema:PropertyValue
107 Nbfe721a0dacb465398978ae55d40eec1 schema:name doi
108 schema:value 10.1007/s10957-022-02063-6
109 rdf:type schema:PropertyValue
110 Nd0bd80585d504c1887e8d5a7c7b87e00 schema:issueNumber 3
111 rdf:type schema:PublicationIssue
112 Ndf5b783f2f97411ebb552dc589ebd2a7 schema:volumeNumber 194
113 rdf:type schema:PublicationVolume
114 Nf5729365b3664e4f96a6c80e3ccb373c rdf:first sg:person.012757542543.48
115 rdf:rest rdf:nil
116 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
117 schema:name Mathematical Sciences
118 rdf:type schema:DefinedTerm
119 anzsrc-for:0102 schema:inDefinedTermSet anzsrc-for:
120 schema:name Applied Mathematics
121 rdf:type schema:DefinedTerm
122 anzsrc-for:0103 schema:inDefinedTermSet anzsrc-for:
123 schema:name Numerical and Computational Mathematics
124 rdf:type schema:DefinedTerm
125 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
126 schema:name Statistics
127 rdf:type schema:DefinedTerm
128 sg:grant.7671894 http://pending.schema.org/fundedItem sg:pub.10.1007/s10957-022-02063-6
129 rdf:type schema:MonetaryGrant
130 sg:grant.9621398 http://pending.schema.org/fundedItem sg:pub.10.1007/s10957-022-02063-6
131 rdf:type schema:MonetaryGrant
132 sg:journal.1044187 schema:issn 0022-3239
133 1573-2878
134 schema:name Journal of Optimization Theory and Applications
135 schema:publisher Springer Nature
136 rdf:type schema:Periodical
137 sg:person.012757542543.48 schema:affiliation grid-institutes:grid.430387.b
138 schema:familyName Zhu
139 schema:givenName Landi
140 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012757542543.48
141 rdf:type schema:Person
142 sg:person.013275445013.36 schema:affiliation grid-institutes:grid.430387.b
143 schema:familyName Ruszczyński
144 schema:givenName Andrzej
145 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013275445013.36
146 rdf:type schema:Person
147 sg:person.07375135557.45 schema:affiliation grid-institutes:grid.430387.b
148 schema:familyName Gürbüzbalaban
149 schema:givenName Mert
150 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07375135557.45
151 rdf:type schema:Person
152 sg:pub.10.1007/bf01099354 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028538768
153 https://doi.org/10.1007/bf01099354
154 rdf:type schema:CreativeWork
155 sg:pub.10.1007/pl00011396 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046930638
156 https://doi.org/10.1007/pl00011396
157 rdf:type schema:CreativeWork
158 sg:pub.10.1007/s10107-013-0681-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018210674
159 https://doi.org/10.1007/s10107-013-0681-9
160 rdf:type schema:CreativeWork
161 sg:pub.10.1007/s10107-016-1017-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039419824
162 https://doi.org/10.1007/s10107-016-1017-3
163 rdf:type schema:CreativeWork
164 sg:pub.10.1007/s10107-017-1172-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1090376560
165 https://doi.org/10.1007/s10107-017-1172-1
166 rdf:type schema:CreativeWork
167 sg:pub.10.1007/s10463-016-0559-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017939837
168 https://doi.org/10.1007/s10463-016-0559-8
169 rdf:type schema:CreativeWork
170 sg:pub.10.1007/s11590-020-01537-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1124234453
171 https://doi.org/10.1007/s11590-020-01537-8
172 rdf:type schema:CreativeWork
173 grid-institutes:grid.430387.b schema:alternateName Rutgers University, 08550, Piscataway, NJ, USA
174 schema:name Rutgers University, 08550, Piscataway, NJ, USA
175 rdf:type schema:Organization
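In the table above, the `schema:author` value (row 5) is a blank node rather than a person: it is the head of an RDF collection, a linked list in which each node carries one item in `rdf:first` and a pointer to the next node in `rdf:rest`, ending at `rdf:nil`. The sketch below recovers the author order by walking that list. The triples are copied by hand from rows 98-99, 102-103, and 114-115; the dictionary layout and the `walk_rdf_list` helper are assumptions made for illustration, not the output of an RDF library.

```python
# (subject, predicate) -> object, copied from the rdf:first / rdf:rest rows of the table.
LIST_TRIPLES = {
    ("N129d79d765474b349566532947fed91f", "rdf:first"): "sg:person.07375135557.45",
    ("N129d79d765474b349566532947fed91f", "rdf:rest"): "N6a380c95ee284a59a027ceefe7ff2906",
    ("N6a380c95ee284a59a027ceefe7ff2906", "rdf:first"): "sg:person.013275445013.36",
    ("N6a380c95ee284a59a027ceefe7ff2906", "rdf:rest"): "Nf5729365b3664e4f96a6c80e3ccb373c",
    ("Nf5729365b3664e4f96a6c80e3ccb373c", "rdf:first"): "sg:person.012757542543.48",
    ("Nf5729365b3664e4f96a6c80e3ccb373c", "rdf:rest"): "rdf:nil",
}


def walk_rdf_list(head, triples):
    """Follow rdf:rest links from `head` and collect each rdf:first item in order."""
    items = []
    seen = set()
    node = head
    while node != "rdf:nil":
        if node in seen:
            # Guard against malformed data that loops back on itself.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(triples[(node, "rdf:first")])
        node = triples[(node, "rdf:rest")]
    return items


authors = walk_rdf_list("N129d79d765474b349566532947fed91f", LIST_TRIPLES)
print(authors)
```

The resulting order matches the `author` array in the JSON-LD record, which is why the collection form is used: plain RDF triples are unordered, so author order has to be encoded explicitly.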
 



