Fast and robust adjustment of cell mixtures in epigenome-wide association studies with SmartSVA


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2017-05-26

AUTHORS

Jun Chen, Ehsan Behnam, Jinyan Huang, Miriam F. Moffatt, Daniel J. Schaid, Liming Liang, Xihong Lin

ABSTRACT

Background: One problem that plagues epigenome-wide association studies is the potential confounding due to cell mixtures when purified target cells are not available. Reference-free adjustment of cell mixtures has become increasingly popular due to its flexibility and simplicity. However, existing methods are still not optimal: increased false positive rates and reduced statistical power have been observed in many scenarios.

Methods: We develop SmartSVA, an optimized surrogate variable analysis (SVA) method, for fast and robust reference-free adjustment of cell mixtures. SmartSVA corrects the limitation of traditional SVA under highly confounded scenarios by imposing an explicit convergence criterion and improves the computational efficiency for large datasets.

Results: Compared to traditional SVA, SmartSVA achieves an order-of-magnitude speedup and better false positive control. It protects the signals when capturing the cell mixtures, resulting in significant power increase while controlling for false positives. Through extensive simulations and real data applications, we demonstrate a better performance of SmartSVA than the existing methods.

Conclusions: SmartSVA is a fast and robust method for reference-free adjustment of cell mixtures for epigenome-wide association studies. As a general method, SmartSVA can be applied to other genomic studies to capture unknown sources of variability.

PAGES

413

References to SciGraph publications

  • 2015-01-30. DNA methylation age of blood predicts all-cause mortality in later life in GENOME BIOLOGY
  • 2016-05-03. An evaluation of methods correcting for cell-type heterogeneity in DNA methylation studies in GENOME BIOLOGY
  • 2014-01-26. Epigenome-wide association studies without the need for cell-type composition in NATURE METHODS
  • 2011-03-16. Genomic inflation factors under polygenic inheritance in EUROPEAN JOURNAL OF HUMAN GENETICS
  • 2016-01-11. Weakly supervised learning of biomedical information extraction from curated data in BMC BIOINFORMATICS
  • 2012-05-08. DNA methylation arrays as surrogate measures of cell mixture distribution in BMC BIOINFORMATICS
  • 2016-06-29. Reference-free deconvolution of DNA methylation data and mediation by cell composition effects in BMC BIOINFORMATICS
  • 2010-01-14. Biomarker discovery in heterogeneous tissue samples -taking the in-silico deconfounding approach in BMC BIOINFORMATICS
  • 2013-01-20. Epigenome-wide association data implicate DNA methylation as an intermediary of genetic risk in rheumatoid arthritis in NATURE BIOTECHNOLOGY
  • 2016-03-28. Sparse PCA corrects for cell type heterogeneity in epigenome-wide association studies in NATURE METHODS
  • 2015-02-18. An epigenome-wide association study of total serum immunoglobulin E concentration in NATURE
  • 2010-01-01. Detection of gene pathways with predictive power for breast cancer prognosis in BMC BIOINFORMATICS
  • 2016-06-28. Pubertal development in healthy children is mirrored by DNA methylation patterns in peripheral blood in SCIENTIFIC REPORTS
  • 2014-11-18. Age-related variations in the methylome associated with gene expression in human monocytes and T cells in NATURE COMMUNICATIONS
  • 2013-09-27. Recommendations for the design and analysis of epigenome-wide association studies in NATURE METHODS
  • 2010-11-30. Comparison of Beta-value and M-value methods for quantifying methylation levels by microarray analysis in BMC BIOINFORMATICS
  • 2015-08-19. Age-related profiling of DNA methylation in CD8+ T cells reveals changes in immune response and transcriptional regulator genes in SCIENTIFIC REPORTS
  • 2010-09-14. Tackling the widespread and critical impact of batch effects in high-throughput data in NATURE REVIEWS GENETICS

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1186/s12864-017-3808-1

    DOI

    http://dx.doi.org/10.1186/s12864-017-3808-1

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1085598099

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/28549425



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Biological Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/11", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Medical and Health Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Algorithms", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Epigenomics", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Genome-Wide Association Study", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Division of Biomedical Statistics and Informatics, Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, 200 1st St SW, 55905, Rochester, MN, USA", 
              "id": "http://www.grid.ac/institutes/grid.66875.3a", 
              "name": [
                "Division of Biomedical Statistics and Informatics, Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, 200 1st St SW, 55905, Rochester, MN, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Chen", 
            "givenName": "Jun", 
            "id": "sg:person.01023412627.52", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01023412627.52"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Division of Biomedical Statistics and Informatics, Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, 200 1st St SW, 55905, Rochester, MN, USA", 
              "id": "http://www.grid.ac/institutes/grid.66875.3a", 
              "name": [
                "Division of Biomedical Statistics and Informatics, Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, 200 1st St SW, 55905, Rochester, MN, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Behnam", 
            "givenName": "Ehsan", 
            "id": "sg:person.0773314513.84", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0773314513.84"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "State Key Laboratory of Medical Genomics, Rui-jin Hospital & Shanghai Jiao Tong University School of Medicine, 197 Rui Jin Er Road, 200025, Shanghai, China", 
              "id": "http://www.grid.ac/institutes/grid.16821.3c", 
              "name": [
                "State Key Laboratory of Medical Genomics, Rui-jin Hospital & Shanghai Jiao Tong University School of Medicine, 197 Rui Jin Er Road, 200025, Shanghai, China"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Huang", 
            "givenName": "Jinyan", 
            "id": "sg:person.01363744764.18", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01363744764.18"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Faculty of Medicine, National Heart & Lung Institute, Imperial College London, Dovehouse St, SW3 6LY, London, UK", 
              "id": "http://www.grid.ac/institutes/grid.7445.2", 
              "name": [
                "Faculty of Medicine, National Heart & Lung Institute, Imperial College London, Dovehouse St, SW3 6LY, London, UK"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Moffatt", 
            "givenName": "Miriam F.", 
            "id": "sg:person.01062415655.03", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01062415655.03"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Division of Biomedical Statistics and Informatics, Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, 200 1st St SW, 55905, Rochester, MN, USA", 
              "id": "http://www.grid.ac/institutes/grid.66875.3a", 
              "name": [
                "Division of Biomedical Statistics and Informatics, Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, 200 1st St SW, 55905, Rochester, MN, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Schaid", 
            "givenName": "Daniel J.", 
            "id": "sg:person.01142463656.42", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01142463656.42"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Biostatistics, Harvard T.H. School of Public Health, 677 Huntington Ave, 02115, Boston, MA, USA", 
              "id": "http://www.grid.ac/institutes/None", 
              "name": [
                "Department of Epidemiology, Harvard T.H. School of Public Health, Boston, 677 Huntington Ave, 02115, Boston, MA, USA", 
                "Department of Biostatistics, Harvard T.H. School of Public Health, 677 Huntington Ave, 02115, Boston, MA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Liang", 
            "givenName": "Liming", 
            "id": "sg:person.0725506330.81", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0725506330.81"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Biostatistics, Harvard T.H. School of Public Health, 677 Huntington Ave, 02115, Boston, MA, USA", 
              "id": "http://www.grid.ac/institutes/None", 
              "name": [
                "Department of Biostatistics, Harvard T.H. School of Public Health, 677 Huntington Ave, 02115, Boston, MA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Lin", 
            "givenName": "Xihong", 
            "id": "sg:person.011306525067.25", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011306525067.25"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1186/s13059-016-0935-y", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1014176746", 
              "https://doi.org/10.1186/s13059-016-0935-y"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-11-27", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1037859675", 
              "https://doi.org/10.1186/1471-2105-11-27"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/s12859-015-0844-1", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1018189285", 
              "https://doi.org/10.1186/s12859-015-0844-1"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-11-587", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1025448148", 
              "https://doi.org/10.1186/1471-2105-11-587"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nmeth.3809", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006355713", 
              "https://doi.org/10.1038/nmeth.3809"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-11-1", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1012858905", 
              "https://doi.org/10.1186/1471-2105-11-1"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-13-86", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1047392734", 
              "https://doi.org/10.1186/1471-2105-13-86"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/srep13107", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045248336", 
              "https://doi.org/10.1038/srep13107"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/ejhg.2011.39", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1051554988", 
              "https://doi.org/10.1038/ejhg.2011.39"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/s13059-015-0584-6", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1024223648", 
              "https://doi.org/10.1186/s13059-015-0584-6"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nbt.2487", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1001445372", 
              "https://doi.org/10.1038/nbt.2487"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/ncomms6366", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1017750448", 
              "https://doi.org/10.1038/ncomms6366"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nmeth.2632", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1035683421", 
              "https://doi.org/10.1038/nmeth.2632"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/srep28657", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1030402693", 
              "https://doi.org/10.1038/srep28657"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nmeth.2815", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1015462245", 
              "https://doi.org/10.1038/nmeth.2815"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/s12859-016-1140-4", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1034385270", 
              "https://doi.org/10.1186/s12859-016-1140-4"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nrg2825", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1037809833", 
              "https://doi.org/10.1038/nrg2825"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nature14125", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1037675737", 
              "https://doi.org/10.1038/nature14125"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2017-05-26", 
        "datePublishedReg": "2017-05-26", 
        "description": "BackgroundOne problem that plagues epigenome-wide association studies is the potential confounding due to cell mixtures when purified target cells are not available. Reference-free adjustment of cell mixtures has become increasingly popular due to its flexibility and simplicity. However, existing methods are still not optimal: increased false positive rates and reduced statistical power have been observed in many scenarios.MethodsWe develop SmartSVA, an optimized surrogate variable analysis (SVA) method, for fast and robust reference-free adjustment of cell mixtures. SmartSVA corrects the limitation of traditional SVA under highly confounded scenarios by imposing an explicit convergence criterion and improves the computational efficiency for large datasets.ResultsCompared to traditional SVA, SmartSVA achieves an order-of-magnitude speedup and better false positive control. It protects the signals when capturing the cell mixtures, resulting in significant power increase while controlling for false positives. Through extensive simulations and real data applications, we demonstrate a better performance of SmartSVA than the existing methods.ConclusionsSmartSVA is a fast and robust method for reference-free adjustment of cell mixtures for epigenome-wide association studies. As a general method, SmartSVA can be applied to other genomic studies to capture unknown sources of variability.", 
        "genre": "article", 
        "id": "sg:pub.10.1186/s12864-017-3808-1", 
        "isAccessibleForFree": true, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.2634384", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.2435852", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.4243063", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.2503842", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.3639828", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.2787176", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.2504165", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.2517451", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1023790", 
            "issn": [
              "1471-2164"
            ], 
            "name": "BMC Genomics", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "1", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "18"
          }
        ], 
        "keywords": [
          "real data application", 
          "false positive control", 
          "explicit convergence criterion", 
          "computational efficiency", 
          "convergence criteria", 
          "magnitude speedup", 
          "general method", 
          "data applications", 
          "extensive simulations", 
          "statistical power", 
          "robust method", 
          "robust adjustment", 
          "analysis method", 
          "large datasets", 
          "better performance", 
          "power increases", 
          "unknown source", 
          "speedup", 
          "simulations", 
          "problem", 
          "simplicity", 
          "scenarios", 
          "significant power increase", 
          "applications", 
          "false positive rate", 
          "power", 
          "false positives", 
          "order", 
          "signals", 
          "association studies", 
          "epigenome-wide association studies", 
          "performance", 
          "flexibility", 
          "efficiency", 
          "dataset", 
          "criteria", 
          "control", 
          "mixture", 
          "limitations", 
          "adjustment", 
          "confounding", 
          "source", 
          "positive rate", 
          "study", 
          "variability", 
          "positives", 
          "rate", 
          "increase", 
          "cell mixtures", 
          "potential confounding", 
          "SVA", 
          "genomic studies", 
          "cells", 
          "method", 
          "target cells", 
          "positive control", 
          "MethodsWe", 
          "ResultsCompared"
        ], 
        "name": "Fast and robust adjustment of cell mixtures in epigenome-wide association studies with SmartSVA", 
        "pagination": "413", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1085598099"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1186/s12864-017-3808-1"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "28549425"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1186/s12864-017-3808-1", 
          "https://app.dimensions.ai/details/publication/pub.1085598099"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-09-02T16:00", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20220902/entities/gbq_results/article/article_737.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1186/s12864-017-3808-1"
      }
    ]
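
The JSON-LD record above is ordinary JSON, so it can be read with a standard JSON parser. The following is a minimal sketch, assuming the record has the shape shown above (a one-element list whose item carries "author" and "citation" arrays). The helper names `author_names` and `cited_dois` and the trimmed `SAMPLE` string are illustrative and not part of SciGraph.

```python
import json

# Trimmed copy of the record above, kept small for illustration only.
SAMPLE = """
[
  {
    "author": [
      {"familyName": "Chen", "givenName": "Jun", "type": "Person"},
      {"familyName": "Lin", "givenName": "Xihong", "type": "Person"}
    ],
    "citation": [
      {
        "id": "sg:pub.10.1038/nmeth.3809",
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1006355713",
          "https://doi.org/10.1038/nmeth.3809"
        ],
        "type": "CreativeWork"
      }
    ]
  }
]
"""

DOI_PREFIX = "https://doi.org/"


def author_names(record):
    """Return 'Given Family' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def cited_dois(record):
    """Collect the DOI of every cited work that has a doi.org link in 'sameAs'."""
    dois = []
    for work in record.get("citation", []):
        for url in work.get("sameAs", []):
            if url.startswith(DOI_PREFIX):
                dois.append(url[len(DOI_PREFIX):])
    return dois


# The top level of the dump is a list holding a single record.
record = json.loads(SAMPLE)[0]
print(author_names(record))
print(cited_dois(record))
```

Reading the DOIs from "sameAs" rather than from "id" avoids depending on the `sg:pub.` prefix convention.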
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/s12864-017-3808-1'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/s12864-017-3808-1'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/s12864-017-3808-1'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/s12864-017-3808-1'
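
The four curl commands differ only in the Accept header, so the same content negotiation can be wrapped in a small script. This is a sketch using only the Python standard library. The names `build_request`, `fetch`, and `MEDIA_TYPES` are illustrative, and the short format keys ("json-ld", "nt", "turtle", "xml") are labels chosen here rather than anything defined by the service.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl examples above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build (but do not send) a request asking for one RDF serialization."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": MEDIA_TYPES[fmt]},
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Network call: fetches the Turtle serialization of this record.
    print(fetch("pub.10.1186/s12864-017-3808-1", "turtle"))
```

Keeping `build_request` separate from `fetch` lets the URL and header be checked without any network access.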


     

    This table displays all metadata directly associated to this object as RDF triples.

    274 TRIPLES      21 PREDICATES      105 URIs      78 LITERALS      10 BLANK NODES
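
A summary of this kind can be approximated from the N-Triples serialization, which holds one triple per line. The sketch below is an assumption about how such counts might be derived, not the page's own method: it counts distinct predicates, distinct URIs in any position, literal objects, and distinct blank nodes, so its totals may not match the figures above. The function name `summarize` is illustrative, and the line splitting is deliberately naive (it is not a full N-Triples parser).

```python
def summarize(ntriples_text):
    """Count triples and term kinds in N-Triples text (one triple per line)."""
    triples = 0
    literals = 0
    predicates, uris, blank_nodes = set(), set(), set()

    for line in ntriples_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments

        # Subject and predicate never contain spaces; the object may.
        subject, predicate, rest = line.split(None, 2)
        obj = rest.rstrip()
        if obj.endswith("."):
            obj = obj[:-1].rstrip()  # drop the statement terminator

        triples += 1
        predicates.add(predicate)
        for term in (subject, predicate, obj):
            if term.startswith("<"):
                uris.add(term)
            elif term.startswith("_:"):
                blank_nodes.add(term)
            else:
                literals += 1  # only objects can be literals

    return {
        "triples": triples,
        "predicates": len(predicates),
        "uris": len(uris),
        "literals": literals,
        "blank_nodes": len(blank_nodes),
    }
```

Passing the text returned for the `application/n-triples` media type to `summarize` gives counts that can be compared with the summary line.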

    Subject Predicate Object
    1 sg:pub.10.1186/s12864-017-3808-1 schema:about N0cc7b5ec575048b3bd540162789905ad
    2 N14d05fd8d9084379a30ba726c5038414
    3 N172a2114be554da3a2cd5227308cf188
    4 anzsrc-for:06
    5 anzsrc-for:08
    6 anzsrc-for:11
    7 schema:author N43bbb973a6dc4d8f884bf73b2a78f0f4
    8 schema:citation sg:pub.10.1038/ejhg.2011.39
    9 sg:pub.10.1038/nature14125
    10 sg:pub.10.1038/nbt.2487
    11 sg:pub.10.1038/ncomms6366
    12 sg:pub.10.1038/nmeth.2632
    13 sg:pub.10.1038/nmeth.2815
    14 sg:pub.10.1038/nmeth.3809
    15 sg:pub.10.1038/nrg2825
    16 sg:pub.10.1038/srep13107
    17 sg:pub.10.1038/srep28657
    18 sg:pub.10.1186/1471-2105-11-1
    19 sg:pub.10.1186/1471-2105-11-27
    20 sg:pub.10.1186/1471-2105-11-587
    21 sg:pub.10.1186/1471-2105-13-86
    22 sg:pub.10.1186/s12859-015-0844-1
    23 sg:pub.10.1186/s12859-016-1140-4
    24 sg:pub.10.1186/s13059-015-0584-6
    25 sg:pub.10.1186/s13059-016-0935-y
    26 schema:datePublished 2017-05-26
    27 schema:datePublishedReg 2017-05-26
    28 schema:description BackgroundOne problem that plagues epigenome-wide association studies is the potential confounding due to cell mixtures when purified target cells are not available. Reference-free adjustment of cell mixtures has become increasingly popular due to its flexibility and simplicity. However, existing methods are still not optimal: increased false positive rates and reduced statistical power have been observed in many scenarios.MethodsWe develop SmartSVA, an optimized surrogate variable analysis (SVA) method, for fast and robust reference-free adjustment of cell mixtures. SmartSVA corrects the limitation of traditional SVA under highly confounded scenarios by imposing an explicit convergence criterion and improves the computational efficiency for large datasets.ResultsCompared to traditional SVA, SmartSVA achieves an order-of-magnitude speedup and better false positive control. It protects the signals when capturing the cell mixtures, resulting in significant power increase while controlling for false positives. Through extensive simulations and real data applications, we demonstrate a better performance of SmartSVA than the existing methods.ConclusionsSmartSVA is a fast and robust method for reference-free adjustment of cell mixtures for epigenome-wide association studies. As a general method, SmartSVA can be applied to other genomic studies to capture unknown sources of variability.
    29 schema:genre article
    30 schema:isAccessibleForFree true
    31 schema:isPartOf Nf41ecc4f2eec42709d56054a7482f3a1
    32 Nf6b87119a6e24402ad19a5ea13b0906a
    33 sg:journal.1023790
    34 schema:keywords MethodsWe
    35 ResultsCompared
    36 SVA
    37 adjustment
    38 analysis method
    39 applications
    40 association studies
    41 better performance
    42 cell mixtures
    43 cells
    44 computational efficiency
    45 confounding
    46 control
    47 convergence criteria
    48 criteria
    49 data applications
    50 dataset
    51 efficiency
    52 epigenome-wide association studies
    53 explicit convergence criterion
    54 extensive simulations
    55 false positive control
    56 false positive rate
    57 false positives
    58 flexibility
    59 general method
    60 genomic studies
    61 increase
    62 large datasets
    63 limitations
    64 magnitude speedup
    65 method
    66 mixture
    67 order
    68 performance
    69 positive control
    70 positive rate
    71 positives
    72 potential confounding
    73 power
    74 power increases
    75 problem
    76 rate
    77 real data application
    78 robust adjustment
    79 robust method
    80 scenarios
    81 signals
    82 significant power increase
    83 simplicity
    84 simulations
    85 source
    86 speedup
    87 statistical power
    88 study
    89 target cells
    90 unknown source
    91 variability
    92 schema:name Fast and robust adjustment of cell mixtures in epigenome-wide association studies with SmartSVA
    93 schema:pagination 413
    94 schema:productId N1caa78ddbaff4a6db55da037087d33ad
    95 N95e7d8f75540452c8a3ca1494e0e027d
    96 Ne1159fd0643b4e10b28b25a9573bdbb1
    97 schema:sameAs https://app.dimensions.ai/details/publication/pub.1085598099
    98 https://doi.org/10.1186/s12864-017-3808-1
    99 schema:sdDatePublished 2022-09-02T16:00
    100 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    101 schema:sdPublisher N8917781324414bfb85d66a5fa5911446
    102 schema:url https://doi.org/10.1186/s12864-017-3808-1
    103 sgo:license sg:explorer/license/
    104 sgo:sdDataset articles
    105 rdf:type schema:ScholarlyArticle
    106 N0cc7b5ec575048b3bd540162789905ad schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    107 schema:name Epigenomics
    108 rdf:type schema:DefinedTerm
    109 N14d05fd8d9084379a30ba726c5038414 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    110 schema:name Algorithms
    111 rdf:type schema:DefinedTerm
    112 N15540e00a86343f4abf460e8b66707b9 rdf:first sg:person.01363744764.18
    113 rdf:rest N8e2b2c42ae424d53911d109c13885120
    114 N172a2114be554da3a2cd5227308cf188 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    115 schema:name Genome-Wide Association Study
    116 rdf:type schema:DefinedTerm
    117 N1caa78ddbaff4a6db55da037087d33ad schema:name dimensions_id
    118 schema:value pub.1085598099
    119 rdf:type schema:PropertyValue
    120 N1e92713aaccc4aea9d92ca4b001bf5f2 rdf:first sg:person.011306525067.25
    121 rdf:rest rdf:nil
    122 N43bbb973a6dc4d8f884bf73b2a78f0f4 rdf:first sg:person.01023412627.52
    123 rdf:rest Nc20471c36de4458cbd2f9496c512e489
    124 N63aac43f95b34c8daa2a489e40796b8f rdf:first sg:person.01142463656.42
    125 rdf:rest Na086208951354717b64f48427ab3c3db
    126 N8917781324414bfb85d66a5fa5911446 schema:name Springer Nature - SN SciGraph project
    127 rdf:type schema:Organization
    128 N8e2b2c42ae424d53911d109c13885120 rdf:first sg:person.01062415655.03
    129 rdf:rest N63aac43f95b34c8daa2a489e40796b8f
    130 N95e7d8f75540452c8a3ca1494e0e027d schema:name pubmed_id
    131 schema:value 28549425
    132 rdf:type schema:PropertyValue
    133 Na086208951354717b64f48427ab3c3db rdf:first sg:person.0725506330.81
    134 rdf:rest N1e92713aaccc4aea9d92ca4b001bf5f2
    135 Nc20471c36de4458cbd2f9496c512e489 rdf:first sg:person.0773314513.84
    136 rdf:rest N15540e00a86343f4abf460e8b66707b9
    137 Ne1159fd0643b4e10b28b25a9573bdbb1 schema:name doi
    138 schema:value 10.1186/s12864-017-3808-1
    139 rdf:type schema:PropertyValue
    140 Nf41ecc4f2eec42709d56054a7482f3a1 schema:volumeNumber 18
    141 rdf:type schema:PublicationVolume
    142 Nf6b87119a6e24402ad19a5ea13b0906a schema:issueNumber 1
    143 rdf:type schema:PublicationIssue
    144 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
    145 schema:name Biological Sciences
    146 rdf:type schema:DefinedTerm
    147 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    148 schema:name Information and Computing Sciences
    149 rdf:type schema:DefinedTerm
    150 anzsrc-for:11 schema:inDefinedTermSet anzsrc-for:
    151 schema:name Medical and Health Sciences
    152 rdf:type schema:DefinedTerm
    153 sg:grant.2435852 http://pending.schema.org/fundedItem sg:pub.10.1186/s12864-017-3808-1
    154 rdf:type schema:MonetaryGrant
    155 sg:grant.2503842 http://pending.schema.org/fundedItem sg:pub.10.1186/s12864-017-3808-1
    156 rdf:type schema:MonetaryGrant
    157 sg:grant.2504165 http://pending.schema.org/fundedItem sg:pub.10.1186/s12864-017-3808-1
    158 rdf:type schema:MonetaryGrant
    159 sg:grant.2517451 http://pending.schema.org/fundedItem sg:pub.10.1186/s12864-017-3808-1
    160 rdf:type schema:MonetaryGrant
    161 sg:grant.2634384 http://pending.schema.org/fundedItem sg:pub.10.1186/s12864-017-3808-1
    162 rdf:type schema:MonetaryGrant
    163 sg:grant.2787176 http://pending.schema.org/fundedItem sg:pub.10.1186/s12864-017-3808-1
    164 rdf:type schema:MonetaryGrant
    165 sg:grant.3639828 http://pending.schema.org/fundedItem sg:pub.10.1186/s12864-017-3808-1
    166 rdf:type schema:MonetaryGrant
    167 sg:grant.4243063 http://pending.schema.org/fundedItem sg:pub.10.1186/s12864-017-3808-1
    168 rdf:type schema:MonetaryGrant
    169 sg:journal.1023790 schema:issn 1471-2164
    170 schema:name BMC Genomics
    171 schema:publisher Springer Nature
    172 rdf:type schema:Periodical
    173 sg:person.01023412627.52 schema:affiliation grid-institutes:grid.66875.3a
    174 schema:familyName Chen
    175 schema:givenName Jun
    176 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01023412627.52
    177 rdf:type schema:Person
    178 sg:person.01062415655.03 schema:affiliation grid-institutes:grid.7445.2
    179 schema:familyName Moffatt
    180 schema:givenName Miriam F.
    181 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01062415655.03
    182 rdf:type schema:Person
    183 sg:person.011306525067.25 schema:affiliation grid-institutes:None
    184 schema:familyName Lin
    185 schema:givenName Xihong
    186 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011306525067.25
    187 rdf:type schema:Person
    188 sg:person.01142463656.42 schema:affiliation grid-institutes:grid.66875.3a
    189 schema:familyName Schaid
    190 schema:givenName Daniel J.
    191 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01142463656.42
    192 rdf:type schema:Person
    193 sg:person.01363744764.18 schema:affiliation grid-institutes:grid.16821.3c
    194 schema:familyName Huang
    195 schema:givenName Jinyan
    196 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01363744764.18
    197 rdf:type schema:Person
    198 sg:person.0725506330.81 schema:affiliation grid-institutes:None
    199 schema:familyName Liang
    200 schema:givenName Liming
    201 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0725506330.81
    202 rdf:type schema:Person
    203 sg:person.0773314513.84 schema:affiliation grid-institutes:grid.66875.3a
    204 schema:familyName Behnam
    205 schema:givenName Ehsan
    206 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0773314513.84
    207 rdf:type schema:Person
    208 sg:pub.10.1038/ejhg.2011.39 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051554988
    209 https://doi.org/10.1038/ejhg.2011.39
    210 rdf:type schema:CreativeWork
    211 sg:pub.10.1038/nature14125 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037675737
    212 https://doi.org/10.1038/nature14125
    213 rdf:type schema:CreativeWork
    214 sg:pub.10.1038/nbt.2487 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001445372
    215 https://doi.org/10.1038/nbt.2487
    216 rdf:type schema:CreativeWork
    217 sg:pub.10.1038/ncomms6366 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017750448
    218 https://doi.org/10.1038/ncomms6366
    219 rdf:type schema:CreativeWork
    220 sg:pub.10.1038/nmeth.2632 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035683421
    221 https://doi.org/10.1038/nmeth.2632
    222 rdf:type schema:CreativeWork
    223 sg:pub.10.1038/nmeth.2815 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015462245
    224 https://doi.org/10.1038/nmeth.2815
    225 rdf:type schema:CreativeWork
    226 sg:pub.10.1038/nmeth.3809 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006355713
    227 https://doi.org/10.1038/nmeth.3809
    228 rdf:type schema:CreativeWork
    229 sg:pub.10.1038/nrg2825 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037809833
    230 https://doi.org/10.1038/nrg2825
    231 rdf:type schema:CreativeWork
    232 sg:pub.10.1038/srep13107 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045248336
    233 https://doi.org/10.1038/srep13107
    234 rdf:type schema:CreativeWork
    235 sg:pub.10.1038/srep28657 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030402693
    236 https://doi.org/10.1038/srep28657
    237 rdf:type schema:CreativeWork
    238 sg:pub.10.1186/1471-2105-11-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1012858905
    239 https://doi.org/10.1186/1471-2105-11-1
    240 rdf:type schema:CreativeWork
    241 sg:pub.10.1186/1471-2105-11-27 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037859675
    242 https://doi.org/10.1186/1471-2105-11-27
    243 rdf:type schema:CreativeWork
    244 sg:pub.10.1186/1471-2105-11-587 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025448148
    245 https://doi.org/10.1186/1471-2105-11-587
    246 rdf:type schema:CreativeWork
    247 sg:pub.10.1186/1471-2105-13-86 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047392734
    248 https://doi.org/10.1186/1471-2105-13-86
    249 rdf:type schema:CreativeWork
    250 sg:pub.10.1186/s12859-015-0844-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018189285
    251 https://doi.org/10.1186/s12859-015-0844-1
    252 rdf:type schema:CreativeWork
    253 sg:pub.10.1186/s12859-016-1140-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034385270
    254 https://doi.org/10.1186/s12859-016-1140-4
    255 rdf:type schema:CreativeWork
    256 sg:pub.10.1186/s13059-015-0584-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024223648
    257 https://doi.org/10.1186/s13059-015-0584-6
    258 rdf:type schema:CreativeWork
    259 sg:pub.10.1186/s13059-016-0935-y schema:sameAs https://app.dimensions.ai/details/publication/pub.1014176746
    260 https://doi.org/10.1186/s13059-016-0935-y
    261 rdf:type schema:CreativeWork
    262 grid-institutes:None schema:alternateName Department of Biostatistics, Harvard T.H. School of Public Health, 677 Huntington Ave, 02115, Boston, MA, USA
    263 schema:name Department of Biostatistics, Harvard T.H. School of Public Health, 677 Huntington Ave, 02115, Boston, MA, USA
    264 Department of Epidemiology, Harvard T.H. School of Public Health, Boston, 677 Huntington Ave, 02115, Boston, MA, USA
    265 rdf:type schema:Organization
    266 grid-institutes:grid.16821.3c schema:alternateName State Key Laboratory of Medical Genomics, Rui-jin Hospital & Shanghai Jiao Tong University School of Medicine, 197 Rui Jin Er Road, 200025, Shanghai, China
    267 schema:name State Key Laboratory of Medical Genomics, Rui-jin Hospital & Shanghai Jiao Tong University School of Medicine, 197 Rui Jin Er Road, 200025, Shanghai, China
    268 rdf:type schema:Organization
    269 grid-institutes:grid.66875.3a schema:alternateName Division of Biomedical Statistics and Informatics, Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, 200 1st St SW, 55905, Rochester, MN, USA
    270 schema:name Division of Biomedical Statistics and Informatics, Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, 200 1st St SW, 55905, Rochester, MN, USA
    271 rdf:type schema:Organization
    272 grid-institutes:grid.7445.2 schema:alternateName Faculty of Medicine, National Heart & Lung Institute, Imperial College London, Dovehouse St, SW3 6LY, London, UK
    273 schema:name Faculty of Medicine, National Heart & Lung Institute, Imperial College London, Dovehouse St, SW3 6LY, London, UK
    274 rdf:type schema:Organization
     



