Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods View Full Text


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2006

AUTHORS

Petros Drineas , Michael W. Mahoney , S. Muthukrishnan

ABSTRACT

Given an m ×n matrix A and an integer k less than the rank of A, the “best” rank k approximation to A that minimizes the error with respect to the Frobenius norm is Ak, which is obtained by projecting A on the top k left singular vectors of A. While Ak is routinely used in data analysis, it is difficult to interpret and understand it in terms of the original data, namely the columns and rows of A. For example, these columns and rows often come from some application domain, whereas the singular vectors are linear combinations of (up to all) the columns or rows of A. We address the problem of obtaining low-rank approximations that are directly interpretable in terms of the original columns or rows of A. Our main results are two polynomial time randomized algorithms that take as input a matrix A and return as output a matrix C, consisting of a “small” (i.e., a low-degree polynomial in k, 1/ε, and log(1/δ)) number of actual columns of A such that ||A–CC + A||F ≤(1+ε) ||A–Ak||F with probability at least 1–δ. Our algorithms are simple, and they take time of the order of the time needed to compute the top k right singular vectors of A. In addition, they sample the columns of A via the method of “subspace sampling,” so-named since the sampling probabilities depend on the lengths of the rows of the top singular vectors and since they ensure that we capture entirely a certain subspace of interest. More... »

PAGES

316-326

References to SciGraph publications

  • 2006. Adaptive Sampling and Fast Low-Rank Matrix Approximation in APPROXIMATION, RANDOMIZATION, AND COMBINATORIAL OPTIMIZATION. ALGORITHMS AND TECHNIQUES
  • 1997. Matrix Analysis in NONE
  • Book

    TITLE

    Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques

    ISBN

    978-3-540-38044-3
    978-3-540-38045-0

    Author Affiliations

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/11830924_30

    DOI

    http://dx.doi.org/10.1007/11830924_30

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1027396901


    Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
    Incoming Citations Browse incoming citations for this publication using opencitations.net

    JSON-LD is the canonical representation for SciGraph data.

    TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Statistics", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Mathematical Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "name": [
                "Department of Computer Science, RPI"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Drineas", 
            "givenName": "Petros", 
            "id": "sg:person.011256317073.58", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011256317073.58"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "name": [
                "Yahoo Research Labs"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Mahoney", 
            "givenName": "Michael W.", 
            "id": "sg:person.010375731441.46", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010375731441.46"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Rutgers University", 
              "id": "https://www.grid.ac/institutes/grid.430387.b", 
              "name": [
                "Department of Computer Science, Rutgers University"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Muthukrishnan", 
            "givenName": "S.", 
            "id": "sg:person.0673515550.33", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0673515550.33"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/978-1-4612-0653-8", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000441999", 
              "https://doi.org/10.1007/978-1-4612-0653-8"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-1-4612-0653-8", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000441999", 
              "https://doi.org/10.1007/978-1-4612-0653-8"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/11830924_28", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1012998535", 
              "https://doi.org/10.1007/11830924_28"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/11830924_28", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1012998535", 
              "https://doi.org/10.1007/11830924_28"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/1039488.1039494", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1015066517"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/1109557.1109681", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1023446975"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/1109557.1109682", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1033072080"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.5741407", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1044500248"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1086/425587", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1058714278"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1137/s0097539704442684", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062879531"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1137/s0097539704442696", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062879532"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1137/s0097539704442702", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062879533"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/sfcs.1998.743487", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1093416036"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1017/cbo9780511810817", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1098732013"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2006", 
        "datePublishedReg": "2006-01-01", 
        "description": "Given an m \u00d7n matrix A and an integer k less than the rank of A, the \u201cbest\u201d rank k approximation to A that minimizes the error with respect to the Frobenius norm is Ak, which is obtained by projecting A on the top k left singular vectors of A. While Ak is routinely used in data analysis, it is difficult to interpret and understand it in terms of the original data, namely the columns and rows of A. For example, these columns and rows often come from some application domain, whereas the singular vectors are linear combinations of (up to all) the columns or rows of A. We address the problem of obtaining low-rank approximations that are directly interpretable in terms of the original columns or rows of A. Our main results are two polynomial time randomized algorithms that take as input a matrix A and return as output a matrix C, consisting of a \u201csmall\u201d (i.e., a low-degree polynomial in k, 1/\u03b5, and log(1/\u03b4)) number of actual columns of A such that ||A\u2013CC + A||F \u2264(1+\u03b5) ||A\u2013Ak||F with probability at least 1\u2013\u03b4. Our algorithms are simple, and they take time of the order of the time needed to compute the top k right singular vectors of A. In addition, they sample the columns of A via the method of \u201csubspace sampling,\u201d so-named since the sampling probabilities depend on the lengths of the rows of the top singular vectors and since they ensure that we capture entirely a certain subspace of interest.", 
        "editor": [
          {
            "familyName": "D\u00edaz", 
            "givenName": "Josep", 
            "type": "Person"
          }, 
          {
            "familyName": "Jansen", 
            "givenName": "Klaus", 
            "type": "Person"
          }, 
          {
            "familyName": "Rolim", 
            "givenName": "Jos\u00e9 D. P.", 
            "type": "Person"
          }, 
          {
            "familyName": "Zwick", 
            "givenName": "Uri", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/11830924_30", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isPartOf": {
          "isbn": [
            "978-3-540-38044-3", 
            "978-3-540-38045-0"
          ], 
          "name": "Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques", 
          "type": "Book"
        }, 
        "name": "Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods", 
        "pagination": "316-326", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1027396901"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/11830924_30"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "82828a87f0a4453ddc8cadc44aca9b4f9f76ad30c5cee146351a078879e44730"
            ]
          }
        ], 
        "publisher": {
          "location": "Berlin, Heidelberg", 
          "name": "Springer Berlin Heidelberg", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/11830924_30", 
          "https://app.dimensions.ai/details/publication/pub.1027396901"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-16T07:32", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000356_0000000356/records_57900_00000000.jsonl", 
        "type": "Chapter", 
        "url": "https://link.springer.com/10.1007%2F11830924_30"
      }
    ]
     

    Download the RDF metadata as:  json-ld nt turtle xml License info

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/11830924_30'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/11830924_30'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/11830924_30'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/11830924_30'


     

    This table displays all metadata directly associated to this object as RDF triples.

    136 TRIPLES      23 PREDICATES      39 URIs      20 LITERALS      8 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/11830924_30 schema:about anzsrc-for:01
    2 anzsrc-for:0104
    3 schema:author N67fe3af98c6a40ee9b2131729b490a4d
    4 schema:citation sg:pub.10.1007/11830924_28
    5 sg:pub.10.1007/978-1-4612-0653-8
    6 https://doi.org/10.1017/cbo9780511810817
    7 https://doi.org/10.1086/425587
    8 https://doi.org/10.1101/gr.5741407
    9 https://doi.org/10.1109/sfcs.1998.743487
    10 https://doi.org/10.1137/s0097539704442684
    11 https://doi.org/10.1137/s0097539704442696
    12 https://doi.org/10.1137/s0097539704442702
    13 https://doi.org/10.1145/1039488.1039494
    14 https://doi.org/10.1145/1109557.1109681
    15 https://doi.org/10.1145/1109557.1109682
    16 schema:datePublished 2006
    17 schema:datePublishedReg 2006-01-01
    18 schema:description Given an m ×n matrix A and an integer k less than the rank of A, the “best” rank k approximation to A that minimizes the error with respect to the Frobenius norm is Ak, which is obtained by projecting A on the top k left singular vectors of A. While Ak is routinely used in data analysis, it is difficult to interpret and understand it in terms of the original data, namely the columns and rows of A. For example, these columns and rows often come from some application domain, whereas the singular vectors are linear combinations of (up to all) the columns or rows of A. We address the problem of obtaining low-rank approximations that are directly interpretable in terms of the original columns or rows of A. Our main results are two polynomial time randomized algorithms that take as input a matrix A and return as output a matrix C, consisting of a “small” (i.e., a low-degree polynomial in k, 1/ε, and log(1/δ)) number of actual columns of A such that ||A–CC + A||F ≤(1+ε) ||A–Ak||F with probability at least 1–δ. Our algorithms are simple, and they take time of the order of the time needed to compute the top k right singular vectors of A. In addition, they sample the columns of A via the method of “subspace sampling,” so-named since the sampling probabilities depend on the lengths of the rows of the top singular vectors and since they ensure that we capture entirely a certain subspace of interest.
    19 schema:editor N5d50714297d546dbbc433439cb0429a2
    20 schema:genre chapter
    21 schema:inLanguage en
    22 schema:isAccessibleForFree true
    23 schema:isPartOf Nbebfc889cda74a2eb9b3a2787af8484e
    24 schema:name Subspace Sampling and Relative-Error Matrix Approximation: Column-Based Methods
    25 schema:pagination 316-326
    26 schema:productId N31510028177a4b4ca105198b2ad5421e
    27 N734b99540682496aa36129d460cbd1ee
    28 N8bb11f646a654aa2a5331bc7b535761f
    29 schema:publisher N8550c1cf628542a9afaae1c85d06eca4
    30 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027396901
    31 https://doi.org/10.1007/11830924_30
    32 schema:sdDatePublished 2019-04-16T07:32
    33 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    34 schema:sdPublisher N23d3395dfcb44b52bf52342cabbe9e0d
    35 schema:url https://link.springer.com/10.1007%2F11830924_30
    36 sgo:license sg:explorer/license/
    37 sgo:sdDataset chapters
    38 rdf:type schema:Chapter
    39 N078540a647c7443492caa36f0e38d3d5 rdf:first N63aeaea6f4f74af990562882e6581b58
    40 rdf:rest N403a51722b304520a22d2e47554cbd6b
    41 N23d3395dfcb44b52bf52342cabbe9e0d schema:name Springer Nature - SN SciGraph project
    42 rdf:type schema:Organization
    43 N25b6f7f494d946b694fcb39f1c5b1735 rdf:first sg:person.010375731441.46
    44 rdf:rest Nb56811ed2033469d864fd1a5487b1a6d
    45 N31510028177a4b4ca105198b2ad5421e schema:name doi
    46 schema:value 10.1007/11830924_30
    47 rdf:type schema:PropertyValue
    48 N353c2fe3283f48df90d797318c1f4498 schema:familyName Díaz
    49 schema:givenName Josep
    50 rdf:type schema:Person
    51 N403a51722b304520a22d2e47554cbd6b rdf:first Na74f129bbba04497b8108bb1d92b6b63
    52 rdf:rest rdf:nil
    53 N5d50714297d546dbbc433439cb0429a2 rdf:first N353c2fe3283f48df90d797318c1f4498
    54 rdf:rest N7108b3e94a6244b88b497dd6b4af97f8
    55 N63aeaea6f4f74af990562882e6581b58 schema:familyName Rolim
    56 schema:givenName José D. P.
    57 rdf:type schema:Person
    58 N67fe3af98c6a40ee9b2131729b490a4d rdf:first sg:person.011256317073.58
    59 rdf:rest N25b6f7f494d946b694fcb39f1c5b1735
    60 N7108b3e94a6244b88b497dd6b4af97f8 rdf:first Nf78e10db309741818ae7aa73eb9e43ca
    61 rdf:rest N078540a647c7443492caa36f0e38d3d5
    62 N734b99540682496aa36129d460cbd1ee schema:name dimensions_id
    63 schema:value pub.1027396901
    64 rdf:type schema:PropertyValue
    65 N82aee4eaa8294d9a804e419836f0da9e schema:name Department of Computer Science, RPI
    66 rdf:type schema:Organization
    67 N8550c1cf628542a9afaae1c85d06eca4 schema:location Berlin, Heidelberg
    68 schema:name Springer Berlin Heidelberg
    69 rdf:type schema:Organisation
    70 N8bb11f646a654aa2a5331bc7b535761f schema:name readcube_id
    71 schema:value 82828a87f0a4453ddc8cadc44aca9b4f9f76ad30c5cee146351a078879e44730
    72 rdf:type schema:PropertyValue
    73 Na74f129bbba04497b8108bb1d92b6b63 schema:familyName Zwick
    74 schema:givenName Uri
    75 rdf:type schema:Person
    76 Nb56811ed2033469d864fd1a5487b1a6d rdf:first sg:person.0673515550.33
    77 rdf:rest rdf:nil
    78 Nbebfc889cda74a2eb9b3a2787af8484e schema:isbn 978-3-540-38044-3
    79 978-3-540-38045-0
    80 schema:name Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques
    81 rdf:type schema:Book
    82 Ne69616b2c0234a6db681423929c3e60c schema:name Yahoo Research Labs
    83 rdf:type schema:Organization
    84 Nf78e10db309741818ae7aa73eb9e43ca schema:familyName Jansen
    85 schema:givenName Klaus
    86 rdf:type schema:Person
    87 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
    88 schema:name Mathematical Sciences
    89 rdf:type schema:DefinedTerm
    90 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
    91 schema:name Statistics
    92 rdf:type schema:DefinedTerm
    93 sg:person.010375731441.46 schema:affiliation Ne69616b2c0234a6db681423929c3e60c
    94 schema:familyName Mahoney
    95 schema:givenName Michael W.
    96 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010375731441.46
    97 rdf:type schema:Person
    98 sg:person.011256317073.58 schema:affiliation N82aee4eaa8294d9a804e419836f0da9e
    99 schema:familyName Drineas
    100 schema:givenName Petros
    101 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011256317073.58
    102 rdf:type schema:Person
    103 sg:person.0673515550.33 schema:affiliation https://www.grid.ac/institutes/grid.430387.b
    104 schema:familyName Muthukrishnan
    105 schema:givenName S.
    106 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0673515550.33
    107 rdf:type schema:Person
    108 sg:pub.10.1007/11830924_28 schema:sameAs https://app.dimensions.ai/details/publication/pub.1012998535
    109 https://doi.org/10.1007/11830924_28
    110 rdf:type schema:CreativeWork
    111 sg:pub.10.1007/978-1-4612-0653-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000441999
    112 https://doi.org/10.1007/978-1-4612-0653-8
    113 rdf:type schema:CreativeWork
    114 https://doi.org/10.1017/cbo9780511810817 schema:sameAs https://app.dimensions.ai/details/publication/pub.1098732013
    115 rdf:type schema:CreativeWork
    116 https://doi.org/10.1086/425587 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058714278
    117 rdf:type schema:CreativeWork
    118 https://doi.org/10.1101/gr.5741407 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044500248
    119 rdf:type schema:CreativeWork
    120 https://doi.org/10.1109/sfcs.1998.743487 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093416036
    121 rdf:type schema:CreativeWork
    122 https://doi.org/10.1137/s0097539704442684 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062879531
    123 rdf:type schema:CreativeWork
    124 https://doi.org/10.1137/s0097539704442696 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062879532
    125 rdf:type schema:CreativeWork
    126 https://doi.org/10.1137/s0097539704442702 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062879533
    127 rdf:type schema:CreativeWork
    128 https://doi.org/10.1145/1039488.1039494 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015066517
    129 rdf:type schema:CreativeWork
    130 https://doi.org/10.1145/1109557.1109681 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023446975
    131 rdf:type schema:CreativeWork
    132 https://doi.org/10.1145/1109557.1109682 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033072080
    133 rdf:type schema:CreativeWork
    134 https://www.grid.ac/institutes/grid.430387.b schema:alternateName Rutgers University
    135 schema:name Department of Computer Science, Rutgers University
    136 rdf:type schema:Organization
     




    Preview window. Press ESC to close (or click here)


    ...