A high-dimensional M-estimator framework for bi-level variable selection


Ontology type: schema:ScholarlyArticle
Open Access: True


Article Info

DATE

2021-09-09

AUTHORS

Bin Luo, Xiaoli Gao

ABSTRACT

In high-dimensional data analysis, bi-level sparsity is often assumed when covariates function group-wisely and sparsity can appear either at the group level or within certain groups. In such cases, an ideal model should be able to encourage the bi-level variable selection consistently. Bi-level variable selection has become even more challenging when data have heavy-tailed distribution or outliers exist in random errors and covariates. In this paper, we study a framework of high-dimensional M-estimation for bi-level variable selection. This framework encourages bi-level sparsity through a computationally efficient two-stage procedure. In theory, we provide sufficient conditions under which our two-stage penalized M-estimator possesses simultaneous local estimation consistency and the bi-level variable selection consistency if certain non-convex penalty functions are used at the group level. Both our simulation studies and real data analysis demonstrate satisfactory finite sample performance of the proposed estimators under different irregular settings.

PAGES

1-21

References to SciGraph publications

  • 1996-12. Oncogenic regulation and function of keratins 8 and 18 in CANCER AND METASTASIS REVIEWS
  • 2012-12-21. Gradient methods for minimizing composite functions in MATHEMATICAL PROGRAMMING

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/s10463-021-00809-z

    DOI

    http://dx.doi.org/10.1007/s10463-021-00809-z

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1141002982



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Mathematical Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Statistics", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Department of Biostatistics and Bioinformatics, Duke University, 2424 Erwin Road, 27705, Durham, NC, United States", 
              "id": "http://www.grid.ac/institutes/grid.26009.3d", 
              "name": [
                "Department of Biostatistics and Bioinformatics, Duke University, 2424 Erwin Road, 27705, Durham, NC, United States"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Luo", 
            "givenName": "Bin", 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Mathematics and Statistics, The University of North Carolina at Greensboro, 116 Petty Building, 27402, Greensboro, NC, United States", 
              "id": "http://www.grid.ac/institutes/grid.266860.c", 
              "name": [
                "Department of Mathematics and Statistics, The University of North Carolina at Greensboro, 116 Petty Building, 27402, Greensboro, NC, United States"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Gao", 
            "givenName": "Xiaoli", 
            "id": "sg:person.011504317322.11", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011504317322.11"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/s10107-012-0629-5", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000563802", 
              "https://doi.org/10.1007/s10107-012-0629-5"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/bf00054012", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1004733785", 
              "https://doi.org/10.1007/bf00054012"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2021-09-09", 
        "datePublishedReg": "2021-09-09", 
        "description": "In high-dimensional data analysis, bi-level sparsity is often assumed when covariates function group-wisely and sparsity can appear either at the group level or within certain groups. In such cases, an ideal model should be able to encourage the bi-level variable selection consistently. Bi-level variable selection has become even more challenging when data have heavy-tailed distribution or outliers exist in random errors and covariates. In this paper, we study a framework of high-dimensional M-estimation for bi-level variable selection. This framework encourages bi-level sparsity through a computationally efficient two-stage procedure. In theory, we provide sufficient conditions under which our two-stage penalized M-estimator possesses simultaneous local estimation consistency and the bi-level variable selection consistency if certain non-convex penalty functions are used at the group level. Both our simulation studies and real data analysis demonstrate satisfactory finite sample performance of the proposed estimators under different irregular settings.", 
        "genre": "article", 
        "id": "sg:pub.10.1007/s10463-021-00809-z", 
        "inLanguage": "en", 
        "isAccessibleForFree": true, 
        "isPartOf": [
          {
            "id": "sg:journal.1041657", 
            "issn": [
              "0020-3157", 
              "1572-9052"
            ], 
            "name": "Annals of the Institute of Statistical Mathematics", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }
        ], 
        "keywords": [
          "bi-level variable selection", 
          "variable selection", 
          "high-dimensional data analysis", 
          "non-convex penalty functions", 
          "satisfactory finite sample performance", 
          "variable selection consistency", 
          "finite sample performance", 
          "real data analysis", 
          "selection consistency", 
          "estimation consistency", 
          "sufficient conditions", 
          "sample performance", 
          "penalty function", 
          "tailed distribution", 
          "random errors", 
          "simulation study", 
          "estimator framework", 
          "sparsity", 
          "estimator", 
          "data analysis", 
          "two-stage procedure", 
          "two-stage", 
          "framework", 
          "irregular settings", 
          "theory", 
          "estimation", 
          "outliers", 
          "error", 
          "covariates", 
          "model", 
          "distribution", 
          "such cases", 
          "selection", 
          "consistency", 
          "function", 
          "ideal model", 
          "analysis", 
          "performance", 
          "cases", 
          "certain groups", 
          "procedure", 
          "conditions", 
          "data", 
          "group level", 
          "setting", 
          "levels", 
          "study", 
          "group", 
          "paper", 
          "bi-level sparsity", 
          "simultaneous local estimation consistency", 
          "local estimation consistency", 
          "bi-level variable selection consistency", 
          "certain non-convex penalty functions", 
          "different irregular settings"
        ], 
        "name": "A high-dimensional M-estimator framework for bi-level variable selection", 
        "pagination": "1-21", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1141002982"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/s10463-021-00809-z"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/s10463-021-00809-z", 
          "https://app.dimensions.ai/details/publication/pub.1141002982"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-01-01T19:00", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/article/article_881.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1007/s10463-021-00809-z"
      }
    ]
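
As a rough illustration of how the record above can be consumed, the following Python sketch pulls the main bibliographic fields out of the JSON-LD. It embeds a trimmed copy of the record (only the fields it reads), and the helper name `summarize` is hypothetical, not part of any SciGraph tooling.

```python
import json

# Trimmed copy of the record above; only the fields read below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Luo", "givenName": "Bin", "type": "Person"},
      {"familyName": "Gao", "givenName": "Xiaoli",
       "id": "sg:person.011504317322.11", "type": "Person"}
    ],
    "datePublished": "2021-09-09",
    "isPartOf": [
      {"name": "Annals of the Institute of Statistical Mathematics",
       "type": "Periodical"}
    ],
    "name": "A high-dimensional M-estimator framework for bi-level variable selection",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1141002982"]},
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/s10463-021-00809-z"]}
    ]
  }
]
"""


def summarize(record):
    """Collect the main bibliographic fields of one article record (hypothetical helper)."""
    # productId is a list of name/value pairs; index it by name for easy lookup.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    return {
        "title": record["name"],
        "authors": [f'{a["givenName"]} {a["familyName"]}'
                    for a in record.get("author", [])],
        "journal": record["isPartOf"][0]["name"],
        "date": record["datePublished"],
        "doi": ids.get("doi"),
    }


# The top-level JSON value is a one-element list, so take its first item.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary)
```

Indexing `productId` by `name` avoids relying on the order of its entries, which the record does not appear to guarantee.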
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10463-021-00809-z'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10463-021-00809-z'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10463-021-00809-z'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10463-021-00809-z'
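
The four curl commands above differ only in their `Accept` header, so the same requests can be sketched in Python with the standard library. This is a minimal sketch that assumes the endpoint still answers content-negotiated requests as described here; the names `MEDIA_TYPES`, `build_request` and `fetch` are illustrative, not an official client.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types copied from the curl commands above; the dictionary keys are
# labels chosen for this sketch.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Return a GET request for a record, asking for one RDF serialization."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": MEDIA_TYPES[fmt]},
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Download the record and return the body as text (needs network access)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network; only fetch() does.
req = build_request("pub.10.1007/s10463-021-00809-z", "turtle")
print(req.full_url, req.get_header("Accept"))
```

Calling `fetch("pub.10.1007/s10463-021-00809-z")` would then be the counterpart of the first curl command, provided the service is reachable.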


     

    This table displays all metadata directly associated to this object as RDF triples.

    124 TRIPLES      22 PREDICATES      80 URIs      70 LITERALS      4 BLANK NODES

    Row Subject Predicate Object
    1 sg:pub.10.1007/s10463-021-00809-z schema:about anzsrc-for:01
    2 anzsrc-for:0104
    3 schema:author Nf80b3a4a548142f89dfd4bc688815f0d
    4 schema:citation sg:pub.10.1007/bf00054012
    5 sg:pub.10.1007/s10107-012-0629-5
    6 schema:datePublished 2021-09-09
    7 schema:datePublishedReg 2021-09-09
    8 schema:description In high-dimensional data analysis, bi-level sparsity is often assumed when covariates function group-wisely and sparsity can appear either at the group level or within certain groups. In such cases, an ideal model should be able to encourage the bi-level variable selection consistently. Bi-level variable selection has become even more challenging when data have heavy-tailed distribution or outliers exist in random errors and covariates. In this paper, we study a framework of high-dimensional M-estimation for bi-level variable selection. This framework encourages bi-level sparsity through a computationally efficient two-stage procedure. In theory, we provide sufficient conditions under which our two-stage penalized M-estimator possesses simultaneous local estimation consistency and the bi-level variable selection consistency if certain non-convex penalty functions are used at the group level. Both our simulation studies and real data analysis demonstrate satisfactory finite sample performance of the proposed estimators under different irregular settings.
    9 schema:genre article
    10 schema:inLanguage en
    11 schema:isAccessibleForFree true
    12 schema:isPartOf sg:journal.1041657
    13 schema:keywords analysis
    14 bi-level sparsity
    15 bi-level variable selection
    16 bi-level variable selection consistency
    17 cases
    18 certain groups
    19 certain non-convex penalty functions
    20 conditions
    21 consistency
    22 covariates
    23 data
    24 data analysis
    25 different irregular settings
    26 distribution
    27 error
    28 estimation
    29 estimation consistency
    30 estimator
    31 estimator framework
    32 finite sample performance
    33 framework
    34 function
    35 group
    36 group level
    37 high-dimensional data analysis
    38 ideal model
    39 irregular settings
    40 levels
    41 local estimation consistency
    42 model
    43 non-convex penalty functions
    44 outliers
    45 paper
    46 penalty function
    47 performance
    48 procedure
    49 random errors
    50 real data analysis
    51 sample performance
    52 satisfactory finite sample performance
    53 selection
    54 selection consistency
    55 setting
    56 simulation study
    57 simultaneous local estimation consistency
    58 sparsity
    59 study
    60 such cases
    61 sufficient conditions
    62 tailed distribution
    63 theory
    64 two-stage
    65 two-stage procedure
    66 variable selection
    67 variable selection consistency
    68 schema:name A high-dimensional M-estimator framework for bi-level variable selection
    69 schema:pagination 1-21
    70 schema:productId N3147ec2773e345cbb02cd08cfbdd9080
    71 N4ec3bed5198441a5bd3b3d8c3e61a832
    72 schema:sameAs https://app.dimensions.ai/details/publication/pub.1141002982
    73 https://doi.org/10.1007/s10463-021-00809-z
    74 schema:sdDatePublished 2022-01-01T19:00
    75 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    76 schema:sdPublisher Nc47f7acb0c4944c1b2fa5779688659c9
    77 schema:url https://doi.org/10.1007/s10463-021-00809-z
    78 sgo:license sg:explorer/license/
    79 sgo:sdDataset articles
    80 rdf:type schema:ScholarlyArticle
    81 N3147ec2773e345cbb02cd08cfbdd9080 schema:name dimensions_id
    82 schema:value pub.1141002982
    83 rdf:type schema:PropertyValue
    84 N4ec3bed5198441a5bd3b3d8c3e61a832 schema:name doi
    85 schema:value 10.1007/s10463-021-00809-z
    86 rdf:type schema:PropertyValue
    87 Na43c30865f794420ad805d4da317a5f9 schema:affiliation grid-institutes:grid.26009.3d
    88 schema:familyName Luo
    89 schema:givenName Bin
    90 rdf:type schema:Person
    91 Nc47f7acb0c4944c1b2fa5779688659c9 schema:name Springer Nature - SN SciGraph project
    92 rdf:type schema:Organization
    93 Nf4d86286f1f24743b7d65fb7ee851249 rdf:first sg:person.011504317322.11
    94 rdf:rest rdf:nil
    95 Nf80b3a4a548142f89dfd4bc688815f0d rdf:first Na43c30865f794420ad805d4da317a5f9
    96 rdf:rest Nf4d86286f1f24743b7d65fb7ee851249
    97 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
    98 schema:name Mathematical Sciences
    99 rdf:type schema:DefinedTerm
    100 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
    101 schema:name Statistics
    102 rdf:type schema:DefinedTerm
    103 sg:journal.1041657 schema:issn 0020-3157
    104 1572-9052
    105 schema:name Annals of the Institute of Statistical Mathematics
    106 schema:publisher Springer Nature
    107 rdf:type schema:Periodical
    108 sg:person.011504317322.11 schema:affiliation grid-institutes:grid.266860.c
    109 schema:familyName Gao
    110 schema:givenName Xiaoli
    111 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011504317322.11
    112 rdf:type schema:Person
    113 sg:pub.10.1007/bf00054012 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004733785
    114 https://doi.org/10.1007/bf00054012
    115 rdf:type schema:CreativeWork
    116 sg:pub.10.1007/s10107-012-0629-5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000563802
    117 https://doi.org/10.1007/s10107-012-0629-5
    118 rdf:type schema:CreativeWork
    119 grid-institutes:grid.26009.3d schema:alternateName Department of Biostatistics and Bioinformatics, Duke University, 2424 Erwin Road, 27705, Durham, NC, United States
    120 schema:name Department of Biostatistics and Bioinformatics, Duke University, 2424 Erwin Road, 27705, Durham, NC, United States
    121 rdf:type schema:Organization
    122 grid-institutes:grid.266860.c schema:alternateName Department of Mathematics and Statistics, The University of North Carolina at Greensboro, 116 Petty Building, 27402, Greensboro, NC, United States
    123 schema:name Department of Mathematics and Statistics, The University of North Carolina at Greensboro, 116 Petty Building, 27402, Greensboro, NC, United States
    124 rdf:type schema:Organization
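
In the table above, the author order is encoded as an RDF list: row 3 points `schema:author` at a blank node, and rows 93 to 96 chain `rdf:first` / `rdf:rest` until `rdf:nil`. The Python sketch below walks that chain using values copied from those rows; the dictionary layout and the function name `walk_list` are assumptions made for illustration, not how SciGraph stores the data.

```python
# Rows 93-96 of the table, keyed by (subject, predicate).
TRIPLES = {
    ("Nf80b3a4a548142f89dfd4bc688815f0d", "rdf:first"): "Na43c30865f794420ad805d4da317a5f9",
    ("Nf80b3a4a548142f89dfd4bc688815f0d", "rdf:rest"): "Nf4d86286f1f24743b7d65fb7ee851249",
    ("Nf4d86286f1f24743b7d65fb7ee851249", "rdf:first"): "sg:person.011504317322.11",
    ("Nf4d86286f1f24743b7d65fb7ee851249", "rdf:rest"): "rdf:nil",
}

# Given and family names from rows 88-89 and 109-110.
NAMES = {
    "Na43c30865f794420ad805d4da317a5f9": ("Bin", "Luo"),
    "sg:person.011504317322.11": ("Xiaoli", "Gao"),
}


def walk_list(head, triples):
    """Follow rdf:first / rdf:rest links from head until rdf:nil."""
    items = []
    node = head
    while node != "rdf:nil":
        items.append(triples[(node, "rdf:first")])  # element stored at this cell
        node = triples[(node, "rdf:rest")]          # move to the next cell
    return items


# Row 3: schema:author points at the head of the list.
authors = walk_list("Nf80b3a4a548142f89dfd4bc688815f0d", TRIPLES)
print([" ".join(NAMES[a]) for a in authors])
```

Because the order lives in the `rdf:rest` links rather than in the row order, the authors come out as Bin Luo followed by Xiaoli Gao regardless of how the triples are sorted.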
     



