Robust distributed estimation and variable selection for massive datasets via rank regression


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2021-06-20

AUTHORS

Jiaming Luan, Hongwei Wang, Kangning Wang, Benle Zhang

ABSTRACT

Rank regression is a robust modeling tool, but it is challenging to implement for distributed massive data owing to memory constraints. In practice, massive data may be distributed heterogeneously from machine to machine, and how to incorporate this heterogeneity is also an interesting issue. This paper proposes a distributed rank regression ($\mathrm{DR}^{2}$), which can be implemented on the master machine by solving a weighted least-squares problem and is adaptive when the data are heterogeneous. Theoretically, we prove that the resulting estimator is statistically as efficient as the global rank regression estimator. Furthermore, based on the adaptive LASSO and a newly defined distributed BIC-type tuning parameter selector, we propose a distributed regularized rank regression ($\mathrm{DR}^{3}$), which achieves consistent variable selection and can also be easily implemented by using the LARS algorithm on the master machine. Simulation results and real data analysis are included to validate our method.

PAGES

1-16

References to SciGraph publications

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s10463-021-00803-5

DOI

http://dx.doi.org/10.1007/s10463-021-00803-5

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1139022363



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Shandong Technology and Business University, No. 191, Binhai Middle Road, 264005, Laishan District, Yantai, China", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Shandong Technology and Business University, No. 191, Binhai Middle Road, 264005, Laishan District, Yantai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Luan", 
        "givenName": "Jiaming", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shandong Technology and Business University, No. 191, Binhai Middle Road, 264005, Laishan District, Yantai, China", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Shandong Technology and Business University, No. 191, Binhai Middle Road, 264005, Laishan District, Yantai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Wang", 
        "givenName": "Hongwei", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shandong Technology and Business University, No. 191, Binhai Middle Road, 264005, Laishan District, Yantai, China", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Shandong Technology and Business University, No. 191, Binhai Middle Road, 264005, Laishan District, Yantai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Wang", 
        "givenName": "Kangning", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shandong Technology and Business University, No. 191, Binhai Middle Road, 264005, Laishan District, Yantai, China", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Shandong Technology and Business University, No. 191, Binhai Middle Road, 264005, Laishan District, Yantai, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhang", 
        "givenName": "Benle", 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/s00184-014-0491-y", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1005004152", 
          "https://doi.org/10.1007/s00184-014-0491-y"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-1-4757-2769-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1108501739", 
          "https://doi.org/10.1007/978-1-4757-2769-2"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2021-06-20", 
    "datePublishedReg": "2021-06-20", 
    "description": "Rank regression is a robust modeling tool; it is challenging to implement it for the distributed massive data owing to memory constraints. In practice, the massive data may be distributed heterogeneously from machine to machine; how to incorporate the heterogeneity is also an interesting issue. This paper proposes a distributed rank regression ($\\mathrm{DR}^{2}$), which can be implemented in the master machine by solving a weighted least-squares and adaptive when the data are heterogeneous. Theoretically, we prove that the resulting estimator is statistically as efficient as the global rank regression estimator. Furthermore, based on the adaptive LASSO and a newly defined distributed BIC-type tuning parameter selector, we propose a distributed regularized rank regression ($\\mathrm{DR}^{3}$), which can make consistent variable selection and can also be easily implemented by using the LARS algorithm on the master machine. Simulation results and real data analysis are included to validate our method.", 
    "genre": "article", 
    "id": "sg:pub.10.1007/s10463-021-00803-5", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": [
      {
        "id": "sg:journal.1041657", 
        "issn": [
          "0020-3157", 
          "1572-9052"
        ], 
        "name": "Annals of the Institute of Statistical Mathematics", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }
    ], 
    "keywords": [
      "massive data", 
      "master machine", 
      "massive datasets", 
      "memory constraints", 
      "variable selection", 
      "robust modeling tools", 
      "modeling tools", 
      "machine", 
      "LARS algorithm", 
      "simulation results", 
      "interesting issues", 
      "data analysis", 
      "real data analysis", 
      "parameter selector", 
      "consistent variable selection", 
      "algorithm", 
      "dataset", 
      "adaptive lasso", 
      "adaptive", 
      "selection", 
      "constraints", 
      "data", 
      "Lasso", 
      "tool", 
      "tuning parameter selector", 
      "issues", 
      "estimation", 
      "estimator", 
      "selector", 
      "rank regression", 
      "method", 
      "regression", 
      "regression estimator", 
      "results", 
      "practice", 
      "analysis", 
      "heterogeneity", 
      "paper", 
      "rank regression estimator", 
      "global rank regression estimator", 
      "BIC-type tuning parameter selector"
    ], 
    "name": "Robust distributed estimation and variable selection for massive datasets via rank regression", 
    "pagination": "1-16", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1139022363"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s10463-021-00803-5"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s10463-021-00803-5", 
      "https://app.dimensions.ai/details/publication/pub.1139022363"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-01-01T18:58", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/article/article_890.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1007/s10463-021-00803-5"
  }
]
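
As a rough illustration of how a record shaped like the one above can be consumed, the following Python sketch reads an abbreviated copy of it with the standard `json` module and pulls out the title, authors, date, and DOI. The function name `summarize` and the trimmed `RECORD` string are assumptions made for this example; only field names that appear in the record itself are used.

```python
import json

# Abbreviated copy of the record shown above; only the fields used below are kept.
RECORD = """
[
  {
    "author": [
      {"familyName": "Luan", "givenName": "Jiaming", "type": "Person"},
      {"familyName": "Wang", "givenName": "Hongwei", "type": "Person"},
      {"familyName": "Wang", "givenName": "Kangning", "type": "Person"},
      {"familyName": "Zhang", "givenName": "Benle", "type": "Person"}
    ],
    "datePublished": "2021-06-20",
    "name": "Robust distributed estimation and variable selection for massive datasets via rank regression",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1139022363"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/s10463-021-00803-5"]}
    ],
    "type": "ScholarlyArticle"
  }
]
"""


def summarize(record_text):
    """Return title, authors, date and DOI from a record with this layout."""
    article = json.loads(record_text)[0]  # the top level is a one-element list
    authors = [f"{a['givenName']} {a['familyName']}" for a in article["author"]]
    # productId is a list of name/value pairs; index it by name for lookup.
    ids = {p["name"]: p["value"][0] for p in article["productId"]}
    return {
        "title": article["name"],
        "authors": authors,
        "date": article["datePublished"],
        "doi": ids.get("doi"),
    }


summary = summarize(RECORD)
print(summary["authors"], summary["doi"])
```

Indexing `productId` by its `name` field avoids relying on the order in which the identifiers happen to be listed.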
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10463-021-00803-5'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10463-021-00803-5'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10463-021-00803-5'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10463-021-00803-5'
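
The four curl commands above differ only in the Accept header. A minimal Python sketch of the same content negotiation, using only the standard library, might look like the following. The dictionary keys and the function name `build_request` are labels chosen for this example, not part of any SciGraph API, and the sketch assumes the endpoint still answers as described above. It only builds the request and does not send it.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/s10463-021-00803-5"

# Short labels (chosen for this example) mapped to the media types used in the
# curl commands above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for one serialization."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


req = build_request("turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing the built request to `urllib.request.urlopen` would perform the actual download; that step is left out here so the sketch runs without network access.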


 

This table displays all metadata directly associated to this object as RDF triples.

118 TRIPLES      22 PREDICATES      66 URIs      56 LITERALS      4 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/s10463-021-00803-5 schema:about anzsrc-for:01
2 anzsrc-for:0104
3 schema:author Nf3ea9425600a41e7a04e67271befa565
4 schema:citation sg:pub.10.1007/978-1-4757-2769-2
5 sg:pub.10.1007/s00184-014-0491-y
6 schema:datePublished 2021-06-20
7 schema:datePublishedReg 2021-06-20
8 schema:description Rank regression is a robust modeling tool; it is challenging to implement it for the distributed massive data owing to memory constraints. In practice, the massive data may be distributed heterogeneously from machine to machine; how to incorporate the heterogeneity is also an interesting issue. This paper proposes a distributed rank regression ($\mathrm{DR}^{2}$), which can be implemented in the master machine by solving a weighted least-squares and adaptive when the data are heterogeneous. Theoretically, we prove that the resulting estimator is statistically as efficient as the global rank regression estimator. Furthermore, based on the adaptive LASSO and a newly defined distributed BIC-type tuning parameter selector, we propose a distributed regularized rank regression ($\mathrm{DR}^{3}$), which can make consistent variable selection and can also be easily implemented by using the LARS algorithm on the master machine. Simulation results and real data analysis are included to validate our method.
9 schema:genre article
10 schema:inLanguage en
11 schema:isAccessibleForFree false
12 schema:isPartOf sg:journal.1041657
13 schema:keywords BIC-type tuning parameter selector
14 LARS algorithm
15 Lasso
16 adaptive
17 adaptive lasso
18 algorithm
19 analysis
20 consistent variable selection
21 constraints
22 data
23 data analysis
24 dataset
25 estimation
26 estimator
27 global rank regression estimator
28 heterogeneity
29 interesting issues
30 issues
31 machine
32 massive data
33 massive datasets
34 master machine
35 memory constraints
36 method
37 modeling tools
38 paper
39 parameter selector
40 practice
41 rank regression
42 rank regression estimator
43 real data analysis
44 regression
45 regression estimator
46 results
47 robust modeling tools
48 selection
49 selector
50 simulation results
51 tool
52 tuning parameter selector
53 variable selection
54 schema:name Robust distributed estimation and variable selection for massive datasets via rank regression
55 schema:pagination 1-16
56 schema:productId Nf988ea240dc8402681033adb477c6198
57 Nfe3fe35f964a4e79bacc55a995ac9490
58 schema:sameAs https://app.dimensions.ai/details/publication/pub.1139022363
59 https://doi.org/10.1007/s10463-021-00803-5
60 schema:sdDatePublished 2022-01-01T18:58
61 schema:sdLicense https://scigraph.springernature.com/explorer/license/
62 schema:sdPublisher N39f1be6196b341ba9837dabe461ee29e
63 schema:url https://doi.org/10.1007/s10463-021-00803-5
64 sgo:license sg:explorer/license/
65 sgo:sdDataset articles
66 rdf:type schema:ScholarlyArticle
67 N13db3d4da2ef4eedaf85359e472acf3a rdf:first N2078748f092347779b29736d147e6df9
68 rdf:rest N20fad9c4fb3f42a49d80f4a3f41603a4
69 N2078748f092347779b29736d147e6df9 schema:affiliation grid-institutes:None
70 schema:familyName Wang
71 schema:givenName Kangning
72 rdf:type schema:Person
73 N20fad9c4fb3f42a49d80f4a3f41603a4 rdf:first Ncd2f1e9f2d71416cbbeb08f2e340c9e2
74 rdf:rest rdf:nil
75 N39f1be6196b341ba9837dabe461ee29e schema:name Springer Nature - SN SciGraph project
76 rdf:type schema:Organization
77 N557ff5b20dcb4518b71add55a22813e3 schema:affiliation grid-institutes:None
78 schema:familyName Luan
79 schema:givenName Jiaming
80 rdf:type schema:Person
81 N973a63c79d964815b9a1b7c68a1e69b3 rdf:first Nafad70e9f82640fb98ab27b9605f9305
82 rdf:rest N13db3d4da2ef4eedaf85359e472acf3a
83 Nafad70e9f82640fb98ab27b9605f9305 schema:affiliation grid-institutes:None
84 schema:familyName Wang
85 schema:givenName Hongwei
86 rdf:type schema:Person
87 Ncd2f1e9f2d71416cbbeb08f2e340c9e2 schema:affiliation grid-institutes:None
88 schema:familyName Zhang
89 schema:givenName Benle
90 rdf:type schema:Person
91 Nf3ea9425600a41e7a04e67271befa565 rdf:first N557ff5b20dcb4518b71add55a22813e3
92 rdf:rest N973a63c79d964815b9a1b7c68a1e69b3
93 Nf988ea240dc8402681033adb477c6198 schema:name dimensions_id
94 schema:value pub.1139022363
95 rdf:type schema:PropertyValue
96 Nfe3fe35f964a4e79bacc55a995ac9490 schema:name doi
97 schema:value 10.1007/s10463-021-00803-5
98 rdf:type schema:PropertyValue
99 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
100 schema:name Mathematical Sciences
101 rdf:type schema:DefinedTerm
102 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
103 schema:name Statistics
104 rdf:type schema:DefinedTerm
105 sg:journal.1041657 schema:issn 0020-3157
106 1572-9052
107 schema:name Annals of the Institute of Statistical Mathematics
108 schema:publisher Springer Nature
109 rdf:type schema:Periodical
110 sg:pub.10.1007/978-1-4757-2769-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1108501739
111 https://doi.org/10.1007/978-1-4757-2769-2
112 rdf:type schema:CreativeWork
113 sg:pub.10.1007/s00184-014-0491-y schema:sameAs https://app.dimensions.ai/details/publication/pub.1005004152
114 https://doi.org/10.1007/s00184-014-0491-y
115 rdf:type schema:CreativeWork
116 grid-institutes:None schema:alternateName Shandong Technology and Business University, No. 191, Binhai Middle Road, 264005, Laishan District, Yantai, China
117 schema:name Shandong Technology and Business University, No. 191, Binhai Middle Road, 264005, Laishan District, Yantai, China
118 rdf:type schema:Organization
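
In the table above, the author order is encoded as an RDF list: each list node points to one person via rdf:first and to the next node via rdf:rest, ending at rdf:nil. The Python sketch below walks that chain to recover the order. The triples are copied from the table; the helper names `index_by_subject` and `walk_list` are assumptions made for this example, and the index keeps a single object per predicate, which is enough for these rows but not for RDF in general.

```python
# (subject, predicate, object) rows copied from the table above.
TRIPLES = [
    ("sg:pub.10.1007/s10463-021-00803-5", "schema:author", "Nf3ea9425600a41e7a04e67271befa565"),
    ("Nf3ea9425600a41e7a04e67271befa565", "rdf:first", "N557ff5b20dcb4518b71add55a22813e3"),
    ("Nf3ea9425600a41e7a04e67271befa565", "rdf:rest", "N973a63c79d964815b9a1b7c68a1e69b3"),
    ("N973a63c79d964815b9a1b7c68a1e69b3", "rdf:first", "Nafad70e9f82640fb98ab27b9605f9305"),
    ("N973a63c79d964815b9a1b7c68a1e69b3", "rdf:rest", "N13db3d4da2ef4eedaf85359e472acf3a"),
    ("N13db3d4da2ef4eedaf85359e472acf3a", "rdf:first", "N2078748f092347779b29736d147e6df9"),
    ("N13db3d4da2ef4eedaf85359e472acf3a", "rdf:rest", "N20fad9c4fb3f42a49d80f4a3f41603a4"),
    ("N20fad9c4fb3f42a49d80f4a3f41603a4", "rdf:first", "Ncd2f1e9f2d71416cbbeb08f2e340c9e2"),
    ("N20fad9c4fb3f42a49d80f4a3f41603a4", "rdf:rest", "rdf:nil"),
    ("N557ff5b20dcb4518b71add55a22813e3", "schema:givenName", "Jiaming"),
    ("N557ff5b20dcb4518b71add55a22813e3", "schema:familyName", "Luan"),
    ("Nafad70e9f82640fb98ab27b9605f9305", "schema:givenName", "Hongwei"),
    ("Nafad70e9f82640fb98ab27b9605f9305", "schema:familyName", "Wang"),
    ("N2078748f092347779b29736d147e6df9", "schema:givenName", "Kangning"),
    ("N2078748f092347779b29736d147e6df9", "schema:familyName", "Wang"),
    ("Ncd2f1e9f2d71416cbbeb08f2e340c9e2", "schema:givenName", "Benle"),
    ("Ncd2f1e9f2d71416cbbeb08f2e340c9e2", "schema:familyName", "Zhang"),
]


def index_by_subject(triples):
    """Map subject -> {predicate: object}; one object per predicate is kept."""
    index = {}
    for subject, predicate, obj in triples:
        index.setdefault(subject, {})[predicate] = obj
    return index


def walk_list(head, by_subject):
    """Follow rdf:first / rdf:rest from the head node until rdf:nil."""
    items = []
    node = head
    while node != "rdf:nil":
        items.append(by_subject[node]["rdf:first"])
        node = by_subject[node]["rdf:rest"]
    return items


by_subject = index_by_subject(TRIPLES)
head = by_subject["sg:pub.10.1007/s10463-021-00803-5"]["schema:author"]
authors = [
    f"{by_subject[n]['schema:givenName']} {by_subject[n]['schema:familyName']}"
    for n in walk_list(head, by_subject)
]
print(authors)
```

The recovered order matches the author line at the top of the record.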
 



