Count-based differential expression analysis of RNA sequencing data using R and Bioconductor


Ontology type: schema:ScholarlyArticle
Open Access: True


Article Info

DATE

2013-08-22

AUTHORS

Simon Anders, Davis J McCarthy, Yunshun Chen, Michal Okoniewski, Gordon K Smyth, Wolfgang Huber, Mark D Robinson

ABSTRACT

RNA sequencing (RNA-seq) has been rapidly adopted for the profiling of transcriptomes in many areas of biology, including studies into gene regulation, development and disease. Of particular interest is the discovery of differentially expressed genes across different conditions (e.g., tissues, perturbations) while optionally adjusting for other systematic factors that affect the data-collection process. There are a number of subtle yet crucial aspects of these analyses, such as read counting, appropriate treatment of biological variability, quality control checks and appropriate setup of statistical modeling. Several variations have been presented in the literature, and there is a need for guidance on current best practices. This protocol presents a state-of-the-art computational and statistical RNA-seq differential expression analysis workflow largely based on the free open-source R language and Bioconductor software and, in particular, on two widely used tools, DESeq and edgeR. Hands-on time for typical small experiments (e.g., 4–10 samples) can be <1 h, with computation time <1 d using a standard desktop PC.

PAGES

1765-1786

References to SciGraph publications

  • 2010-03-02. A scaling normalization method for differential expression analysis of RNA-seq data in GENOME BIOLOGY
  • 2002. Sweave: Dynamic Generation of Statistical Reports Using Literate Data Analysis in COMPSTAT
  • 2010-05-02. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation in NATURE BIOTECHNOLOGY
  • 2012-12-09. Epigenetic expansion of VHL-HIF signal output drives multiorgan metastasis in renal cancer in NATURE MEDICINE
  • 2010-08-10. baySeq: Empirical Bayesian methods for identifying differential expression in sequence count data in BMC BIOINFORMATICS
  • 2009-05. How to map billions of short reads onto genomes in NATURE BIOTECHNOLOGY
  • 2011-12-17. GC-Content Normalization for RNA-Seq Data in BMC BIOINFORMATICS
  • 2008-05-30. Mapping and quantifying mammalian transcriptomes by RNA-Seq in NATURE METHODS
  • 2005-01-01. limma: Linear Models for Microarray Data in BIOINFORMATICS AND COMPUTATIONAL BIOLOGY SOLUTIONS USING R AND BIOCONDUCTOR
  • 2013-06-02. Rev-Erbs repress macrophage gene expression by inhibiting enhancer-directed transcription in NATURE
  • 2010-10-27. Differential expression analysis for sequence count data in GENOME BIOLOGY
  • 2013-03-09. A comparison of methods for differential expression analysis of RNA-seq data in BMC BIOINFORMATICS
  • 2010-09-14. Tackling the widespread and critical impact of batch effects in high-throughput data in NATURE REVIEWS GENETICS
  • 2009-01. RNA-Seq: a revolutionary tool for transcriptomics in NATURE REVIEWS GENETICS
  • 2012-01-04. Differential oestrogen receptor binding is associated with clinical outcome in breast cancer in NATURE
  • 2010-02-18. Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments in BMC BIOINFORMATICS
  • 2012-03-01. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks in NATURE PROTOCOLS
  • 2007-03-14. Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements in NATURE
  • 2011-07-11. Sequencing technology does not eliminate biological variability in NATURE BIOTECHNOLOGY
  • 2011-05-15. Full-length transcriptome assembly from RNA-Seq data without a reference genome in NATURE BIOTECHNOLOGY
  • 2004-09-15. Bioconductor: open software development for computational biology and bioinformatics in GENOME BIOLOGY

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1038/nprot.2013.099

    DOI

    http://dx.doi.org/10.1038/nprot.2013.099

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1033430059

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/23975260



    JSON-LD is the canonical representation for SciGraph data.

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Biological Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Genetics", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Base Sequence", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Computational Biology", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Gene Expression Profiling", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Sequence Analysis, RNA", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Software", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Workflow", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany", 
              "id": "http://www.grid.ac/institutes/grid.4709.a", 
              "name": [
                "Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Anders", 
            "givenName": "Simon", 
            "id": "sg:person.0626036202.10", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0626036202.10"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK", 
              "id": "http://www.grid.ac/institutes/grid.270683.8", 
              "name": [
                "Department of Statistics, University of Oxford, Oxford, UK", 
                "Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK"
              ], 
              "type": "Organization"
            }, 
            "familyName": "McCarthy", 
            "givenName": "Davis J", 
            "id": "sg:person.01345633021.62", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01345633021.62"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Medical Biology, University of Melbourne, Melbourne, Victoria, Australia", 
              "id": "http://www.grid.ac/institutes/grid.1008.9", 
              "name": [
                "Bioinformatics Division, Walter and Eliza Hall Institute, Parkville, Victoria, Australia", 
                "Department of Medical Biology, University of Melbourne, Melbourne, Victoria, Australia"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Chen", 
            "givenName": "Yunshun", 
            "id": "sg:person.0641317761.98", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0641317761.98"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Functional Genomics Center UNI ETH, Zurich, Switzerland", 
              "id": "http://www.grid.ac/institutes/grid.5801.c", 
              "name": [
                "Functional Genomics Center UNI ETH, Zurich, Switzerland"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Okoniewski", 
            "givenName": "Michal", 
            "id": "sg:person.01364364513.44", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01364364513.44"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Mathematics and Statistics, University of Melbourne, Melbourne, Victoria, Australia", 
              "id": "http://www.grid.ac/institutes/grid.1008.9", 
              "name": [
                "Bioinformatics Division, Walter and Eliza Hall Institute, Parkville, Victoria, Australia", 
                "Department of Mathematics and Statistics, University of Melbourne, Melbourne, Victoria, Australia"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Smyth", 
            "givenName": "Gordon K", 
            "id": "sg:person.0665226271.44", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0665226271.44"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany", 
              "id": "http://www.grid.ac/institutes/grid.4709.a", 
              "name": [
                "Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Huber", 
            "givenName": "Wolfgang", 
            "id": "sg:person.0750614167.42", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0750614167.42"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland", 
              "id": "http://www.grid.ac/institutes/grid.7400.3", 
              "name": [
                "Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland", 
                "SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Robinson", 
            "givenName": "Mark D", 
            "id": "sg:person.01161667677.19", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01161667677.19"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/978-3-642-57489-4_89", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1042406449", 
              "https://doi.org/10.1007/978-3-642-57489-4_89"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/0-387-29362-0_23", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1025432622", 
              "https://doi.org/10.1007/0-387-29362-0_23"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nature12209", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1023040655", 
              "https://doi.org/10.1038/nature12209"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nature05676", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1030633547", 
              "https://doi.org/10.1038/nature05676"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2010-11-10-r106", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031289083", 
              "https://doi.org/10.1186/gb-2010-11-10-r106"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nm.3029", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1009981708", 
              "https://doi.org/10.1038/nm.3029"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-12-480", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1048338371", 
              "https://doi.org/10.1186/1471-2105-12-480"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nrg2484", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1030687647", 
              "https://doi.org/10.1038/nrg2484"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nbt.1883", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1015803168", 
              "https://doi.org/10.1038/nbt.1883"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2004-5-10-r80", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1018457673", 
              "https://doi.org/10.1186/gb-2004-5-10-r80"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-14-91", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1016675314", 
              "https://doi.org/10.1186/1471-2105-14-91"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-11-94", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1053091615", 
              "https://doi.org/10.1186/1471-2105-11-94"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-11-422", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1047456674", 
              "https://doi.org/10.1186/1471-2105-11-422"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nmeth.1226", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045381177", 
              "https://doi.org/10.1038/nmeth.1226"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nbt0509-455", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1043194556", 
              "https://doi.org/10.1038/nbt0509-455"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nbt.1910", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1030579983", 
              "https://doi.org/10.1038/nbt.1910"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2010-11-3-r25", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1050509557", 
              "https://doi.org/10.1186/gb-2010-11-3-r25"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nrg2825", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1037809833", 
              "https://doi.org/10.1038/nrg2825"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nprot.2012.016", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1030124536", 
              "https://doi.org/10.1038/nprot.2012.016"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nbt.1621", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031035095", 
              "https://doi.org/10.1038/nbt.1621"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nature10730", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1005577478", 
              "https://doi.org/10.1038/nature10730"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2013-08-22", 
        "datePublishedReg": "2013-08-22", 
        "description": "RNA sequencing (RNA-seq) has been rapidly adopted for the profiling of transcriptomes in many areas of biology, including studies into gene regulation, development and disease. Of particular interest is the discovery of differentially expressed genes across different conditions (e.g., tissues, perturbations) while optionally adjusting for other systematic factors that affect the data-collection process. There are a number of subtle yet crucial aspects of these analyses, such as read counting, appropriate treatment of biological variability, quality control checks and appropriate setup of statistical modeling. Several variations have been presented in the literature, and there is a need for guidance on current best practices. This protocol presents a state-of-the-art computational and statistical RNA-seq differential expression analysis workflow largely based on the free open-source R language and Bioconductor software and, in particular, on two widely used tools, DESeq and edgeR. Hands-on time for typical small experiments (e.g., 4\u201310 samples) can be <1 h, with computation time <1 d using a standard desktop PC.", 
        "genre": "article", 
        "id": "sg:pub.10.1038/nprot.2013.099", 
        "isAccessibleForFree": true, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.5237961", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.3800145", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.6736593", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1037502", 
            "issn": [
              "1754-2189", 
              "1750-2799"
            ], 
            "name": "Nature Protocols", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "9", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "8"
          }
        ], 
        "keywords": [
          "differential expression analysis", 
          "expression analysis", 
          "profiling of transcriptomes", 
          "RNA sequencing data", 
          "areas of biology", 
          "gene regulation", 
          "RNA-seq differential expression analysis", 
          "RNA sequencing", 
          "sequencing data", 
          "open-source R language", 
          "Bioconductor software", 
          "transcriptome", 
          "biological variability", 
          "R language", 
          "genes", 
          "DESeq", 
          "biology", 
          "quality control checks", 
          "sequencing", 
          "Bioconductor", 
          "profiling", 
          "regulation", 
          "control checks", 
          "particular interest", 
          "discovery", 
          "analysis", 
          "variation", 
          "different conditions", 
          "crucial aspect", 
          "variability", 
          "development", 
          "factors", 
          "disease", 
          "number", 
          "process", 
          "experiments", 
          "small experiment", 
          "tool", 
          "conditions", 
          "study", 
          "standard desktop PC", 
          "aspects", 
          "data", 
          "area", 
          "statistical modeling", 
          "time", 
          "treatment", 
          "interest", 
          "protocol", 
          "count", 
          "systematic factors", 
          "state", 
          "hand", 
          "appropriate setup", 
          "current best practice", 
          "modeling", 
          "need", 
          "check", 
          "PC", 
          "guidance", 
          "software", 
          "literature", 
          "practice", 
          "setup", 
          "best practices", 
          "art", 
          "computation time", 
          "desktop PC", 
          "appropriate treatment", 
          "data collection process", 
          "language"
        ], 
        "name": "Count-based differential expression analysis of RNA sequencing data using R and Bioconductor", 
        "pagination": "1765-1786", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1033430059"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1038/nprot.2013.099"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "23975260"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1038/nprot.2013.099", 
          "https://app.dimensions.ai/details/publication/pub.1033430059"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-10-01T06:39", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20221001/entities/gbq_results/article/article_610.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1038/nprot.2013.099"
      }
    ]
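
A minimal sketch of reading a record like the one above in Python. The excerpt is cut down from the record to two of the seven authors and two of the three identifiers; the function name `summarize` and the keys of the returned dictionary are illustrative choices, not part of SciGraph.

```python
import json

# Cut-down excerpt of the JSON-LD record shown above (two authors only).
RECORD_EXCERPT = """
[
  {
    "name": "Count-based differential expression analysis of RNA sequencing data using R and Bioconductor",
    "datePublished": "2013-08-22",
    "pagination": "1765-1786",
    "author": [
      {"familyName": "Anders", "givenName": "Simon", "type": "Person"},
      {"familyName": "McCarthy", "givenName": "Davis J", "type": "Person"}
    ],
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.1038/nprot.2013.099"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["23975260"]}
    ]
  }
]
"""


def summarize(record_json):
    """Pull title, date, authors and identifiers out of a SciGraph-style record."""
    # The top level is a one-element array wrapping the article object.
    record = json.loads(record_json)[0]
    authors = [
        f"{person['givenName']} {person['familyName']}"
        for person in record.get("author", [])
    ]
    # Each productId entry holds its value as a one-element list.
    identifiers = {
        entry["name"]: entry["value"][0] for entry in record.get("productId", [])
    }
    return {
        "title": record["name"],
        "date": record["datePublished"],
        "authors": authors,
        "doi": identifiers.get("doi"),
        "pubmed_id": identifiers.get("pubmed_id"),
    }


summary = summarize(RECORD_EXCERPT)
print(summary["authors"], summary["doi"])
```

Because JSON-LD is plain JSON, the standard `json` module is enough for this kind of field lookup; a dedicated RDF library would only be needed to resolve the `@context` or to merge the record with other linked data.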
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/nprot.2013.099'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/nprot.2013.099'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/nprot.2013.099'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/nprot.2013.099'
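
The four curl calls above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The helper names `build_request` and `fetch`, the short format keys, and the UTF-8 decoding are assumptions made for illustration; whether the endpoint still answers is not something this sketch can guarantee.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl examples above; the dictionary keys are
# illustrative labels, not names used by SciGraph.
ACCEPT_HEADERS = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(record_id, fmt):
    """Return a GET request asking for `record_id` in serialization `fmt`."""
    if fmt not in ACCEPT_HEADERS:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": ACCEPT_HEADERS[fmt]},
    )


def fetch(record_id, fmt, timeout=30):
    """Send the request and return the body as text (requires network access)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        # Assumes the server responds in UTF-8.
        return resp.read().decode("utf-8")


# Building the request does not touch the network; fetch() would.
request = build_request("pub.10.1038/nprot.2013.099", "turtle")
print(request.full_url, request.get_header("Accept"))
```

Keeping request construction separate from sending makes the header logic easy to check offline before any call to `fetch`.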


     

    This table displays all metadata directly associated with this object as RDF triples.

    305 TRIPLES      21 PREDICATES      123 URIs      94 LITERALS      13 BLANK NODES
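
In the triples listed in this table, the author order is stored as an RDF collection: `schema:author` points at a blank node, and each blank node carries an `rdf:first` (one author) and an `rdf:rest` (the next node, or `rdf:nil` at the end). The sketch below walks that chain in Python. The blank-node labels are abbreviated to their first five characters for readability, and the names `NODES` and `walk_collection` are hypothetical.

```python
RDF_NIL = "rdf:nil"

# node label -> (rdf:first, rdf:rest), transcribed from the table with
# blank-node labels shortened to five characters (an abbreviation made here).
NODES = {
    "N719f": ("sg:person.0750614167.42", "Ne138"),
    "N7d64": ("sg:person.0626036202.10", "Ndf6e"),
    "N8a1b": ("sg:person.0665226271.44", "N719f"),
    "N97ce": ("sg:person.01364364513.44", "N8a1b"),
    "Nd493": ("sg:person.0641317761.98", "N97ce"),
    "Ndf6e": ("sg:person.01345633021.62", "Nd493"),
    "Ne138": ("sg:person.01161667677.19", RDF_NIL),
}


def walk_collection(head, nodes):
    """Follow rdf:rest links from `head` and collect each rdf:first in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# schema:author on the article points at the node abbreviated here as N7d64.
author_ids = walk_collection("N7d64", NODES)
for position, person_id in enumerate(author_ids, start=1):
    print(position, person_id)
```

The table itself sorts rows by subject label, so the order of authors is only recoverable by following the links rather than by reading the rows top to bottom.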

    Subject Predicate Object
    1 sg:pub.10.1038/nprot.2013.099 schema:about N06a655f6a4984e5a950d1a3b70949bf0
    2 N34a2c041132743ae8dd5a8e7468a3c0d
    3 N351e3d335cdf4821be92a83a86e778f5
    4 Ncb6c8598746f4fe19665457818a9d3f2
    5 Ne21fc9412f3a4a59b2002676987413f4
    6 Neadfe32ceb2f423e889c9a26c44c6c17
    7 anzsrc-for:06
    8 anzsrc-for:0604
    9 schema:author N7d64f34e604d4bedaf41465001c7a04e
    10 schema:citation sg:pub.10.1007/0-387-29362-0_23
    11 sg:pub.10.1007/978-3-642-57489-4_89
    12 sg:pub.10.1038/nature05676
    13 sg:pub.10.1038/nature10730
    14 sg:pub.10.1038/nature12209
    15 sg:pub.10.1038/nbt.1621
    16 sg:pub.10.1038/nbt.1883
    17 sg:pub.10.1038/nbt.1910
    18 sg:pub.10.1038/nbt0509-455
    19 sg:pub.10.1038/nm.3029
    20 sg:pub.10.1038/nmeth.1226
    21 sg:pub.10.1038/nprot.2012.016
    22 sg:pub.10.1038/nrg2484
    23 sg:pub.10.1038/nrg2825
    24 sg:pub.10.1186/1471-2105-11-422
    25 sg:pub.10.1186/1471-2105-11-94
    26 sg:pub.10.1186/1471-2105-12-480
    27 sg:pub.10.1186/1471-2105-14-91
    28 sg:pub.10.1186/gb-2004-5-10-r80
    29 sg:pub.10.1186/gb-2010-11-10-r106
    30 sg:pub.10.1186/gb-2010-11-3-r25
    31 schema:datePublished 2013-08-22
    32 schema:datePublishedReg 2013-08-22
    33 schema:description RNA sequencing (RNA-seq) has been rapidly adopted for the profiling of transcriptomes in many areas of biology, including studies into gene regulation, development and disease. Of particular interest is the discovery of differentially expressed genes across different conditions (e.g., tissues, perturbations) while optionally adjusting for other systematic factors that affect the data-collection process. There are a number of subtle yet crucial aspects of these analyses, such as read counting, appropriate treatment of biological variability, quality control checks and appropriate setup of statistical modeling. Several variations have been presented in the literature, and there is a need for guidance on current best practices. This protocol presents a state-of-the-art computational and statistical RNA-seq differential expression analysis workflow largely based on the free open-source R language and Bioconductor software and, in particular, on two widely used tools, DESeq and edgeR. Hands-on time for typical small experiments (e.g., 4–10 samples) can be <1 h, with computation time <1 d using a standard desktop PC.
    34 schema:genre article
    35 schema:isAccessibleForFree true
    36 schema:isPartOf N424cc846a065491496a08b5b7dd51c66
    37 N820c64e1ba0e4e0e8cb6ea3ce4461664
    38 sg:journal.1037502
    39 schema:keywords Bioconductor
    40 Bioconductor software
    41 DESeq
    42 PC
    43 R language
    44 RNA sequencing
    45 RNA sequencing data
    46 RNA-seq differential expression analysis
    47 analysis
    48 appropriate setup
    49 appropriate treatment
    50 area
    51 areas of biology
    52 art
    53 aspects
    54 best practices
    55 biological variability
    56 biology
    57 check
    58 computation time
    59 conditions
    60 control checks
    61 count
    62 crucial aspect
    63 current best practice
    64 data
    65 data collection process
    66 desktop PC
    67 development
    68 different conditions
    69 differential expression analysis
    70 discovery
    71 disease
    72 experiments
    73 expression analysis
    74 factors
    75 gene regulation
    76 genes
    77 guidance
    78 hand
    79 interest
    80 language
    81 literature
    82 modeling
    83 need
    84 number
    85 open-source R language
    86 particular interest
    87 practice
    88 process
    89 profiling
    90 profiling of transcriptomes
    91 protocol
    92 quality control checks
    93 regulation
    94 sequencing
    95 sequencing data
    96 setup
    97 small experiment
    98 software
    99 standard desktop PC
    100 state
    101 statistical modeling
    102 study
    103 systematic factors
    104 time
    105 tool
    106 transcriptome
    107 treatment
    108 variability
    109 variation
    110 schema:name Count-based differential expression analysis of RNA sequencing data using R and Bioconductor
    111 schema:pagination 1765-1786
    112 schema:productId N96a29b6d73cd4d15b130ba480b3bcd06
    113 Naefa61be4f244f29b66fe99295e58068
    114 Ncf7103d40ad84881a56e474c6f0a9c61
    115 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033430059
    116 https://doi.org/10.1038/nprot.2013.099
    117 schema:sdDatePublished 2022-10-01T06:39
    118 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    119 schema:sdPublisher N584760be8c15497f98790047c64803b6
    120 schema:url https://doi.org/10.1038/nprot.2013.099
    121 sgo:license sg:explorer/license/
    122 sgo:sdDataset articles
    123 rdf:type schema:ScholarlyArticle
    124 N06a655f6a4984e5a950d1a3b70949bf0 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    125 schema:name Workflow
    126 rdf:type schema:DefinedTerm
    127 N34a2c041132743ae8dd5a8e7468a3c0d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    128 schema:name Gene Expression Profiling
    129 rdf:type schema:DefinedTerm
    130 N351e3d335cdf4821be92a83a86e778f5 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    131 schema:name Base Sequence
    132 rdf:type schema:DefinedTerm
    133 N424cc846a065491496a08b5b7dd51c66 schema:issueNumber 9
    134 rdf:type schema:PublicationIssue
    135 N584760be8c15497f98790047c64803b6 schema:name Springer Nature - SN SciGraph project
    136 rdf:type schema:Organization
    137 N719fa2883ff44e8f8c334a5fd2d68d80 rdf:first sg:person.0750614167.42
    138 rdf:rest Ne138e18df8ef455fad1d5b6f1f4565bd
    139 N7d64f34e604d4bedaf41465001c7a04e rdf:first sg:person.0626036202.10
    140 rdf:rest Ndf6e8e9643604283bd97f2206b78d0bb
    141 N820c64e1ba0e4e0e8cb6ea3ce4461664 schema:volumeNumber 8
    142 rdf:type schema:PublicationVolume
    143 N8a1b6ca410484e65b52ada776443c22a rdf:first sg:person.0665226271.44
    144 rdf:rest N719fa2883ff44e8f8c334a5fd2d68d80
    145 N96a29b6d73cd4d15b130ba480b3bcd06 schema:name pubmed_id
    146 schema:value 23975260
    147 rdf:type schema:PropertyValue
    148 N97cecd85f8084e7b9531291ca5a072d1 rdf:first sg:person.01364364513.44
    149 rdf:rest N8a1b6ca410484e65b52ada776443c22a
    150 Naefa61be4f244f29b66fe99295e58068 schema:name dimensions_id
    151 schema:value pub.1033430059
    152 rdf:type schema:PropertyValue
    153 Ncb6c8598746f4fe19665457818a9d3f2 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    154 schema:name Computational Biology
    155 rdf:type schema:DefinedTerm
    156 Ncf7103d40ad84881a56e474c6f0a9c61 schema:name doi
    157 schema:value 10.1038/nprot.2013.099
    158 rdf:type schema:PropertyValue
    159 Nd493709e8b9f45ef99a3ff50569e9946 rdf:first sg:person.0641317761.98
    160 rdf:rest N97cecd85f8084e7b9531291ca5a072d1
    161 Ndf6e8e9643604283bd97f2206b78d0bb rdf:first sg:person.01345633021.62
    162 rdf:rest Nd493709e8b9f45ef99a3ff50569e9946
    163 Ne138e18df8ef455fad1d5b6f1f4565bd rdf:first sg:person.01161667677.19
    164 rdf:rest rdf:nil
    165 Ne21fc9412f3a4a59b2002676987413f4 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    166 schema:name Software
    167 rdf:type schema:DefinedTerm
    168 Neadfe32ceb2f423e889c9a26c44c6c17 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    169 schema:name Sequence Analysis, RNA
    170 rdf:type schema:DefinedTerm
    171 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
    172 schema:name Biological Sciences
    173 rdf:type schema:DefinedTerm
    174 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
    175 schema:name Genetics
    176 rdf:type schema:DefinedTerm
    177 sg:grant.3800145 http://pending.schema.org/fundedItem sg:pub.10.1038/nprot.2013.099
    178 rdf:type schema:MonetaryGrant
    179 sg:grant.5237961 http://pending.schema.org/fundedItem sg:pub.10.1038/nprot.2013.099
    180 rdf:type schema:MonetaryGrant
    181 sg:grant.6736593 http://pending.schema.org/fundedItem sg:pub.10.1038/nprot.2013.099
    182 rdf:type schema:MonetaryGrant
    183 sg:journal.1037502 schema:issn 1750-2799
    184 1754-2189
    185 schema:name Nature Protocols
    186 schema:publisher Springer Nature
    187 rdf:type schema:Periodical
    188 sg:person.01161667677.19 schema:affiliation grid-institutes:grid.7400.3
    189 schema:familyName Robinson
    190 schema:givenName Mark D
    191 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01161667677.19
    192 rdf:type schema:Person
    193 sg:person.01345633021.62 schema:affiliation grid-institutes:grid.270683.8
    194 schema:familyName McCarthy
    195 schema:givenName Davis J
    196 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01345633021.62
    197 rdf:type schema:Person
    198 sg:person.01364364513.44 schema:affiliation grid-institutes:grid.5801.c
    199 schema:familyName Okoniewski
    200 schema:givenName Michal
    201 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01364364513.44
    202 rdf:type schema:Person
    203 sg:person.0626036202.10 schema:affiliation grid-institutes:grid.4709.a
    204 schema:familyName Anders
    205 schema:givenName Simon
    206 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0626036202.10
    207 rdf:type schema:Person
    208 sg:person.0641317761.98 schema:affiliation grid-institutes:grid.1008.9
    209 schema:familyName Chen
    210 schema:givenName Yunshun
    211 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0641317761.98
    212 rdf:type schema:Person
    213 sg:person.0665226271.44 schema:affiliation grid-institutes:grid.1008.9
    214 schema:familyName Smyth
    215 schema:givenName Gordon K
    216 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0665226271.44
    217 rdf:type schema:Person
    218 sg:person.0750614167.42 schema:affiliation grid-institutes:grid.4709.a
    219 schema:familyName Huber
    220 schema:givenName Wolfgang
    221 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0750614167.42
    222 rdf:type schema:Person
    223 sg:pub.10.1007/0-387-29362-0_23 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025432622
    224 https://doi.org/10.1007/0-387-29362-0_23
    225 rdf:type schema:CreativeWork
    226 sg:pub.10.1007/978-3-642-57489-4_89 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042406449
    227 https://doi.org/10.1007/978-3-642-57489-4_89
    228 rdf:type schema:CreativeWork
    229 sg:pub.10.1038/nature05676 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030633547
    230 https://doi.org/10.1038/nature05676
    231 rdf:type schema:CreativeWork
    232 sg:pub.10.1038/nature10730 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005577478
    233 https://doi.org/10.1038/nature10730
    234 rdf:type schema:CreativeWork
    235 sg:pub.10.1038/nature12209 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023040655
    236 https://doi.org/10.1038/nature12209
    237 rdf:type schema:CreativeWork
    238 sg:pub.10.1038/nbt.1621 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031035095
    239 https://doi.org/10.1038/nbt.1621
    240 rdf:type schema:CreativeWork
    241 sg:pub.10.1038/nbt.1883 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015803168
    242 https://doi.org/10.1038/nbt.1883
    243 rdf:type schema:CreativeWork
    244 sg:pub.10.1038/nbt.1910 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030579983
    245 https://doi.org/10.1038/nbt.1910
    246 rdf:type schema:CreativeWork
    247 sg:pub.10.1038/nbt0509-455 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043194556
    248 https://doi.org/10.1038/nbt0509-455
    249 rdf:type schema:CreativeWork
    250 sg:pub.10.1038/nm.3029 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009981708
    251 https://doi.org/10.1038/nm.3029
    252 rdf:type schema:CreativeWork
    253 sg:pub.10.1038/nmeth.1226 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045381177
    254 https://doi.org/10.1038/nmeth.1226
    255 rdf:type schema:CreativeWork
    256 sg:pub.10.1038/nprot.2012.016 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030124536
    257 https://doi.org/10.1038/nprot.2012.016
    258 rdf:type schema:CreativeWork
    259 sg:pub.10.1038/nrg2484 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030687647
    260 https://doi.org/10.1038/nrg2484
    261 rdf:type schema:CreativeWork
    262 sg:pub.10.1038/nrg2825 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037809833
    263 https://doi.org/10.1038/nrg2825
    264 rdf:type schema:CreativeWork
    265 sg:pub.10.1186/1471-2105-11-422 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047456674
    266 https://doi.org/10.1186/1471-2105-11-422
    267 rdf:type schema:CreativeWork
    268 sg:pub.10.1186/1471-2105-11-94 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053091615
    269 https://doi.org/10.1186/1471-2105-11-94
    270 rdf:type schema:CreativeWork
    271 sg:pub.10.1186/1471-2105-12-480 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048338371
    272 https://doi.org/10.1186/1471-2105-12-480
    273 rdf:type schema:CreativeWork
    274 sg:pub.10.1186/1471-2105-14-91 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016675314
    275 https://doi.org/10.1186/1471-2105-14-91
    276 rdf:type schema:CreativeWork
    277 sg:pub.10.1186/gb-2004-5-10-r80 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018457673
    278 https://doi.org/10.1186/gb-2004-5-10-r80
    279 rdf:type schema:CreativeWork
    280 sg:pub.10.1186/gb-2010-11-10-r106 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031289083
    281 https://doi.org/10.1186/gb-2010-11-10-r106
    282 rdf:type schema:CreativeWork
    283 sg:pub.10.1186/gb-2010-11-3-r25 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050509557
    284 https://doi.org/10.1186/gb-2010-11-3-r25
    285 rdf:type schema:CreativeWork
    286 grid-institutes:grid.1008.9 schema:alternateName Department of Mathematics and Statistics, University of Melbourne, Melbourne, Victoria, Australia
    287 Department of Medical Biology, University of Melbourne, Melbourne, Victoria, Australia
    288 schema:name Bioinformatics Division, Walter and Eliza Hall Institute, Parkville, Victoria, Australia
    289 Department of Mathematics and Statistics, University of Melbourne, Melbourne, Victoria, Australia
    290 Department of Medical Biology, University of Melbourne, Melbourne, Victoria, Australia
    291 rdf:type schema:Organization
    292 grid-institutes:grid.270683.8 schema:alternateName Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
    293 schema:name Department of Statistics, University of Oxford, Oxford, UK
    294 Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
    295 rdf:type schema:Organization
    296 grid-institutes:grid.4709.a schema:alternateName Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
    297 schema:name Genome Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
    298 rdf:type schema:Organization
    299 grid-institutes:grid.5801.c schema:alternateName Functional Genomics Center UNI ETH, Zurich, Switzerland
    300 schema:name Functional Genomics Center UNI ETH, Zurich, Switzerland
    301 rdf:type schema:Organization
    302 grid-institutes:grid.7400.3 schema:alternateName SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
    303 schema:name Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
    304 SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
    305 rdf:type schema:Organization
     



