Combining 16S rRNA gene variable regions enables high-resolution microbial community profiling


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2018-01-26

AUTHORS

Garold Fuks, Michael Elgart, Amnon Amir, Amit Zeisel, Peter J. Turnbaugh, Yoav Soen, Noam Shental

ABSTRACT

Background: Most of our knowledge about the remarkable microbial diversity on Earth comes from sequencing the 16S rRNA gene. The use of next-generation sequencing methods has increased sample number and sequencing depth, but the read length of the most widely used sequencing platforms today is quite short, requiring the researcher to choose a subset of the gene to sequence (typically 16–33% of the total length). Thus, many bacteria may share the same amplified region, and the resolution of profiling is inherently limited. Platforms that offer ultra-long read lengths, whole genome shotgun sequencing approaches, and computational frameworks formerly suggested by us and by others all allow different ways to circumvent this problem yet suffer various shortcomings. There is a need for a simple and low-cost 16S rRNA gene-based profiling approach that harnesses the short read length to provide a much larger coverage of the gene to allow for high resolution, even in harsh conditions of low bacterial biomass and fragmented DNA.

Results: This manuscript suggests Short MUltiple Regions Framework (SMURF), a method to combine sequencing results from different PCR-amplified regions to provide one coherent profiling. The de facto amplicon length is the total length of all amplified regions, thus providing much higher resolution compared to current techniques. Computationally, the method solves a convex optimization problem that allows extremely fast reconstruction and requires only moderate memory. We demonstrate the increase in resolution by in silico simulations and by profiling two mock mixtures and real-world biological samples. Reanalyzing a mock mixture from the Human Microbiome Project achieved about twofold improvement in resolution when combining two independent regions. Using a custom set of six primer pairs spanning about 1200 bp (80%) of the 16S rRNA gene, we were able to achieve ~100-fold improvement in resolution compared to a single region, over a mock mixture of common human gut bacterial isolates. Finally, the profiling of a Drosophila melanogaster microbiome using the set of six primer pairs provided a ~100-fold increase in resolution, thus enabling efficient downstream analysis.

Conclusions: SMURF enables the identification of near full-length 16S rRNA gene sequences in microbial communities, with resolution superior to current techniques. It may be applied to standard sample preparation protocols with very few modifications. SMURF also paves the way to high-resolution profiling of low-biomass and fragmented DNA, e.g., in the case of formalin-fixed and paraffin-embedded samples, fossil-derived DNA, or DNA exposed to other degrading conditions. The approach is not restricted to combining amplicons of the 16S rRNA gene and may be applied to any set of amplicons, e.g., in multilocus sequence typing (MLST).
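
The abstract's observation that many bacteria may share the same amplified region, and that combining regions raises resolution, can be illustrated with a small toy sketch. The taxon names, region names, and sequences below are invented for illustration only. The grouping function is not SMURF's reconstruction, which the abstract describes as a convex optimization problem; it only shows why a longer combined amplicon separates taxa that a single region cannot.

```python
from collections import defaultdict

# Hypothetical toy reference: three taxa, two amplified regions each.
# All names and sequences are made up for this illustration.
REFERENCE = {
    "taxon_A": {"region_1": "ACGT", "region_2": "GGCA"},
    "taxon_B": {"region_1": "ACGT", "region_2": "GGTT"},
    "taxon_C": {"region_1": "TTGA", "region_2": "GGTT"},
}


def indistinguishable_groups(reference, regions):
    """Group taxa whose sequences are identical over all the given regions.

    Taxa that land in the same group cannot be told apart by profiling
    those regions alone.
    """
    groups = defaultdict(list)
    for taxon, seqs in reference.items():
        # The combined "amplicon" is the tuple of per-region sequences.
        key = tuple(seqs[r] for r in regions)
        groups[key].append(taxon)
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    for regions in (["region_1"], ["region_2"], ["region_1", "region_2"]):
        print(regions, indistinguishable_groups(REFERENCE, regions))
```

In this toy reference, each single region leaves one pair of taxa merged, while the two regions together separate all three.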

PAGES

17

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/s40168-017-0396-x

DOI

http://dx.doi.org/10.1186/s40168-017-0396-x

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1100667179

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/29373999



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0605", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Microbiology", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Algorithms", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Animals", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Bacteria", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Computer Simulation", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "DNA Probes", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "DNA, Bacterial", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Drosophila melanogaster", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Microbiota", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Phylogeny", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Polymerase Chain Reaction", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "RNA, Ribosomal, 16S", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Sequence Analysis, DNA", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Departments of Physics of Complex Systems, Weizmann Institute of Science, 7610001, Rehovot, Israel", 
          "id": "http://www.grid.ac/institutes/grid.13992.30", 
          "name": [
            "Departments of Physics of Complex Systems, Weizmann Institute of Science, 7610001, Rehovot, Israel"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Fuks", 
        "givenName": "Garold", 
        "id": "sg:person.01141151565.63", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01141151565.63"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Biomolecular Sciences, Weizmann Institute of Science, 7610001, Rehovot, Israel", 
          "id": "http://www.grid.ac/institutes/grid.13992.30", 
          "name": [
            "Department of Biomolecular Sciences, Weizmann Institute of Science, 7610001, Rehovot, Israel"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Elgart", 
        "givenName": "Michael", 
        "id": "sg:person.01353371540.37", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01353371540.37"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Pediatrics, University of California San Diego, La Jolla, 92093, CA, USA", 
          "id": "http://www.grid.ac/institutes/grid.266100.3", 
          "name": [
            "Department of Pediatrics, University of California San Diego, La Jolla, 92093, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Amir", 
        "givenName": "Amnon", 
        "id": "sg:person.01163102774.24", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01163102774.24"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Division of Molecular Neurobiology, Department of Medical Biochemistry and Biophysics, 10 Karolinska Institutet, S-171 77, Stockholm, Sweden", 
          "id": "http://www.grid.ac/institutes/grid.4714.6", 
          "name": [
            "Division of Molecular Neurobiology, Department of Medical Biochemistry and Biophysics, 10 Karolinska Institutet, S-171 77, Stockholm, Sweden"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zeisel", 
        "givenName": "Amit", 
        "id": "sg:person.01106137556.69", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01106137556.69"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Microbiology and Immunology, University of California San Francisco, 94143, San Francisco, CA, USA", 
          "id": "http://www.grid.ac/institutes/grid.266102.1", 
          "name": [
            "Department of Microbiology and Immunology, University of California San Francisco, 94143, San Francisco, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Turnbaugh", 
        "givenName": "Peter J.", 
        "id": "sg:person.01065252767.89", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065252767.89"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Biomolecular Sciences, Weizmann Institute of Science, 7610001, Rehovot, Israel", 
          "id": "http://www.grid.ac/institutes/grid.13992.30", 
          "name": [
            "Department of Biomolecular Sciences, Weizmann Institute of Science, 7610001, Rehovot, Israel"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Soen", 
        "givenName": "Yoav", 
        "id": "sg:person.01012327240.40", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01012327240.40"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Computer Science, The Open University of Israel, 43107, Ra\u2019anana, Israel", 
          "id": "http://www.grid.ac/institutes/grid.412512.1", 
          "name": [
            "Department of Computer Science, The Open University of Israel, 43107, Ra\u2019anana, Israel"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Shental", 
        "givenName": "Noam", 
        "id": "sg:person.01360371605.17", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01360371605.17"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1186/gb-2011-12-5-r44", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000326175", 
          "https://doi.org/10.1186/gb-2011-12-5-r44"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/s12866-016-0738-z", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021671098", 
          "https://doi.org/10.1186/s12866-016-0738-z"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature13793", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003490001", 
          "https://doi.org/10.1038/nature13793"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nbt.3601", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1002490773", 
          "https://doi.org/10.1038/nbt.3601"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/ismej.2011.188", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022891428", 
          "https://doi.org/10.1038/ismej.2011.188"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2013-14-5-r51", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034665395", 
          "https://doi.org/10.1186/gb-2013-14-5-r51"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/srep02620", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043515512", 
          "https://doi.org/10.1038/srep02620"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2180-12-66", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035882309", 
          "https://doi.org/10.1186/1471-2180-12-66"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nmeth.3869", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1016631324", 
          "https://doi.org/10.1038/nmeth.3869"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2009-10-3-r25", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1049583368", 
          "https://doi.org/10.1186/gb-2009-10-3-r25"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s00403-011-1189-x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035564942", 
          "https://doi.org/10.1007/s00403-011-1189-x"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2010-11-s1-p3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043272763", 
          "https://doi.org/10.1186/gb-2010-11-s1-p3"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2018-01-26", 
    "datePublishedReg": "2018-01-26", 
    "description": "BackgroundMost of our knowledge about the remarkable microbial diversity on Earth comes from sequencing the 16S rRNA gene. The use of next-generation sequencing methods has increased sample number and sequencing depth, but the read length of the most widely used sequencing platforms today is quite short, requiring the researcher to choose a subset of the gene to sequence (typically 16\u201333% of the total length). Thus, many bacteria may share the same amplified region, and the resolution of profiling is inherently limited. Platforms that offer ultra-long read lengths, whole genome shotgun sequencing approaches, and computational frameworks formerly suggested by us and by others all allow different ways to circumvent this problem yet suffer various shortcomings. There is a need for a simple and low-cost 16S rRNA gene-based profiling approach that harnesses the short read length to provide a much larger coverage of the gene to allow for high resolution, even in harsh conditions of low bacterial biomass and fragmented DNA.ResultsThis manuscript suggests Short MUltiple Regions Framework (SMURF), a method to combine sequencing results from different PCR-amplified regions to provide one coherent profiling. The de facto amplicon length is the total length of all amplified regions, thus providing much higher resolution compared to current techniques. Computationally, the method solves a convex optimization problem that allows extremely fast reconstruction and requires only moderate memory. We demonstrate the increase in resolution by in silico simulations and by profiling two mock mixtures and real-world biological samples. Reanalyzing a mock mixture from the Human Microbiome Project achieved about twofold improvement in resolution when combing two independent regions. 
Using a custom set of six primer pairs spanning about 1200\u00a0bp (80%) of the 16S rRNA gene, we were able to achieve ~\u2009100-fold improvement in resolution compared to a single region, over a mock mixture of common human gut bacterial isolates. Finally, the profiling of a Drosophila melanogaster microbiome using the set of six primer pairs provided a ~\u2009100-fold increase in resolution and thus enabling efficient downstream analysis.ConclusionsSMURF enables the identification of near full-length 16S rRNA gene sequences in microbial communities, having resolution superior compared to current techniques. It may be applied to standard sample preparation protocols with very little modifications. SMURF also paves the way to high-resolution profiling of low-biomass and fragmented DNA, e.g., in the case of formalin-fixed and paraffin-embedded samples, fossil-derived DNA, or DNA exposed to other degrading conditions. The approach is not restricted to combining amplicons of the 16S rRNA gene and may be applied to any set of amplicons, e.g., in multilocus sequence typing (MLST).", 
    "genre": "article", 
    "id": "sg:pub.10.1186/s40168-017-0396-x", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.4898633", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1048878", 
        "issn": [
          "2049-2618"
        ], 
        "name": "Microbiome", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "6"
      }
    ], 
    "keywords": [
      "standard sample preparation protocols", 
      "real-world biological samples", 
      "large coverage", 
      "biological samples", 
      "sample preparation protocol", 
      "high resolution", 
      "twofold improvement", 
      "primer pairs", 
      "harsh conditions", 
      "current techniques", 
      "platforms today", 
      "preparation protocol", 
      "convex optimization problem", 
      "amplicon length", 
      "read length", 
      "custom set", 
      "different PCR", 
      "rRNA gene", 
      "DNA", 
      "amplicons", 
      "platform", 
      "protocol", 
      "little modification", 
      "resolution", 
      "technique", 
      "gene sequences", 
      "optimization problem", 
      "downstream analysis", 
      "paraffin-embedded samples", 
      "method", 
      "bacterial biomass", 
      "microbial communities", 
      "coverage", 
      "improvement", 
      "modification", 
      "mixture", 
      "bacterial isolates", 
      "microbial community profiling", 
      "region framework", 
      "gut bacterial isolates", 
      "rRNA gene sequences", 
      "samples", 
      "silico simulations", 
      "length", 
      "fragmented DNA", 
      "profiling", 
      "approach", 
      "genes", 
      "sequencing methods", 
      "bacteria", 
      "PCR", 
      "community profiling", 
      "variable regions", 
      "biomass", 
      "microbial diversity", 
      "pairs", 
      "simulations", 
      "high-resolution profiling", 
      "way", 
      "sequencing results", 
      "conditions", 
      "memory", 
      "shortcomings", 
      "sample number", 
      "cases of formalin", 
      "next-generation sequencing methods", 
      "problem", 
      "formalin", 
      "use", 
      "increase", 
      "sequence", 
      "results", 
      "today", 
      "region", 
      "framework", 
      "researchers", 
      "BP", 
      "manuscript", 
      "profiling approach", 
      "fast reconstruction", 
      "independent regions", 
      "need", 
      "Computationally", 
      "diversity", 
      "number", 
      "analysis", 
      "identification", 
      "sequencing approach", 
      "set", 
      "different ways", 
      "depth", 
      "isolates", 
      "Earth", 
      "total length", 
      "typing", 
      "reconstruction", 
      "project", 
      "knowledge", 
      "cases", 
      "sequencing depth", 
      "community", 
      "BackgroundMost", 
      "microbiome", 
      "Human Microbiome Project", 
      "single region", 
      "subset", 
      "computational framework", 
      "Microbiome Project", 
      "sequence typing", 
      "multilocus sequence typing", 
      "whole-genome shotgun sequencing approach", 
      "remarkable microbial diversity", 
      "shotgun sequencing approach", 
      "mock mixtures", 
      "rRNA gene variable regions", 
      "low bacterial biomass", 
      "efficient downstream analysis", 
      "set of amplicons", 
      "moderate memory"
    ], 
    "name": "Combining 16S rRNA gene variable regions enables high-resolution microbial community profiling", 
    "pagination": "17", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1100667179"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/s40168-017-0396-x"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "29373999"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/s40168-017-0396-x", 
      "https://app.dimensions.ai/details/publication/pub.1100667179"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-05-10T10:22", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220509/entities/gbq_results/article/article_763.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1186/s40168-017-0396-x"
  }
]
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/s40168-017-0396-x'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/s40168-017-0396-x'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/s40168-017-0396-x'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/s40168-017-0396-x'
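
The same content negotiation can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the base URL and Accept values are copied from the curl commands above, the function names (`build_request`, `fetch_jsonld`, `summarize`) are hypothetical, and whether the endpoint still responds is not verified here. The `SAMPLE` list is a small excerpt of the JSON-LD record shown earlier, so the parsing step can be tried offline.

```python
import json
import urllib.request

# Base URL taken from the curl commands above (availability is an assumption).
SCIGRAPH_BASE = "https://scigraph.springernature.com/"


def build_request(pub_id, mime="application/ld+json"):
    """Build, but do not send, a GET request with the Accept header set."""
    return urllib.request.Request(SCIGRAPH_BASE + pub_id, headers={"Accept": mime})


def fetch_jsonld(pub_id):
    """Send the request and parse the JSON-LD response (requires network access)."""
    with urllib.request.urlopen(build_request(pub_id), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(records):
    """Pull a few fields out of a parsed JSON-LD record list."""
    rec = records[0]
    authors = [f"{a['givenName']} {a['familyName']}" for a in rec.get("author", [])]
    # The DOI is stored as a PropertyValue entry under "productId".
    doi = next(
        (p["value"][0] for p in rec.get("productId", []) if p.get("name") == "doi"),
        None,
    )
    return {
        "title": rec.get("name"),
        "date": rec.get("datePublished"),
        "authors": authors,
        "doi": doi,
    }


# Offline excerpt of the record above, trimmed to the fields used here.
SAMPLE = [
    {
        "name": "Combining 16S rRNA gene variable regions enables "
                "high-resolution microbial community profiling",
        "datePublished": "2018-01-26",
        "author": [
            {"givenName": "Garold", "familyName": "Fuks"},
            {"givenName": "Noam", "familyName": "Shental"},
        ],
        "productId": [
            {"name": "dimensions_id", "value": ["pub.1100667179"]},
            {"name": "doi", "value": ["10.1186/s40168-017-0396-x"]},
        ],
    }
]

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Calling `summarize(fetch_jsonld("pub.10.1186/s40168-017-0396-x"))` would run the same extraction against the live record, assuming the service is reachable.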


 

This table displays all metadata directly associated with this object as RDF triples.

338 TRIPLES      22 PREDICATES      170 URIs      149 LITERALS      19 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/s40168-017-0396-x schema:about N1e464ad0a3ff4e5bb3be44d6f489524b
2 N212d954e646a4bffac670605d5411af6
3 N2c929006907d4d758a0c6e723ce7b0ec
4 N3af57d9115bf49b79591d9ef0958d39b
5 N6986c7a641934b27a566282306ed4adf
6 N8a37c17d7a794efcb8e85ac51271b0c1
7 N9c2ee6245e5f4d8ba1eac1b68626923b
8 Nbf6e260732b94aa6bb1b4b8284e776d8
9 Nd5ea640bcf114d92a2434561ed2fccf1
10 Ne1db71a8e8ea400ab969c07231b2c5f8
11 Nea2a145a81ea4a6b9c6c13378a8352fe
12 Nf1d158ba4df94cda899677b0c3040eb4
13 anzsrc-for:06
14 anzsrc-for:0604
15 anzsrc-for:0605
16 schema:author Nc1d41e08f6c34a409aa63eb8a8afc96c
17 schema:citation sg:pub.10.1007/s00403-011-1189-x
18 sg:pub.10.1038/ismej.2011.188
19 sg:pub.10.1038/nature13793
20 sg:pub.10.1038/nbt.3601
21 sg:pub.10.1038/nmeth.3869
22 sg:pub.10.1038/srep02620
23 sg:pub.10.1186/1471-2180-12-66
24 sg:pub.10.1186/gb-2009-10-3-r25
25 sg:pub.10.1186/gb-2010-11-s1-p3
26 sg:pub.10.1186/gb-2011-12-5-r44
27 sg:pub.10.1186/gb-2013-14-5-r51
28 sg:pub.10.1186/s12866-016-0738-z
29 schema:datePublished 2018-01-26
30 schema:datePublishedReg 2018-01-26
31 schema:description BackgroundMost of our knowledge about the remarkable microbial diversity on Earth comes from sequencing the 16S rRNA gene. The use of next-generation sequencing methods has increased sample number and sequencing depth, but the read length of the most widely used sequencing platforms today is quite short, requiring the researcher to choose a subset of the gene to sequence (typically 16–33% of the total length). Thus, many bacteria may share the same amplified region, and the resolution of profiling is inherently limited. Platforms that offer ultra-long read lengths, whole genome shotgun sequencing approaches, and computational frameworks formerly suggested by us and by others all allow different ways to circumvent this problem yet suffer various shortcomings. There is a need for a simple and low-cost 16S rRNA gene-based profiling approach that harnesses the short read length to provide a much larger coverage of the gene to allow for high resolution, even in harsh conditions of low bacterial biomass and fragmented DNA.ResultsThis manuscript suggests Short MUltiple Regions Framework (SMURF), a method to combine sequencing results from different PCR-amplified regions to provide one coherent profiling. The de facto amplicon length is the total length of all amplified regions, thus providing much higher resolution compared to current techniques. Computationally, the method solves a convex optimization problem that allows extremely fast reconstruction and requires only moderate memory. We demonstrate the increase in resolution by in silico simulations and by profiling two mock mixtures and real-world biological samples. Reanalyzing a mock mixture from the Human Microbiome Project achieved about twofold improvement in resolution when combing two independent regions. 
Using a custom set of six primer pairs spanning about 1200 bp (80%) of the 16S rRNA gene, we were able to achieve ~ 100-fold improvement in resolution compared to a single region, over a mock mixture of common human gut bacterial isolates. Finally, the profiling of a Drosophila melanogaster microbiome using the set of six primer pairs provided a ~ 100-fold increase in resolution and thus enabling efficient downstream analysis.ConclusionsSMURF enables the identification of near full-length 16S rRNA gene sequences in microbial communities, having resolution superior compared to current techniques. It may be applied to standard sample preparation protocols with very little modifications. SMURF also paves the way to high-resolution profiling of low-biomass and fragmented DNA, e.g., in the case of formalin-fixed and paraffin-embedded samples, fossil-derived DNA, or DNA exposed to other degrading conditions. The approach is not restricted to combining amplicons of the 16S rRNA gene and may be applied to any set of amplicons, e.g., in multilocus sequence typing (MLST).
32 schema:genre article
33 schema:inLanguage en
34 schema:isAccessibleForFree true
35 schema:isPartOf N946fc253cfd04c56b4678f2ae1267d8e
36 Nc2e4e5808e644c8396c2d40f27bbfcd8
37 sg:journal.1048878
38 schema:keywords BP
39 BackgroundMost
40 Computationally
41 DNA
42 Earth
43 Human Microbiome Project
44 Microbiome Project
45 PCR
46 amplicon length
47 amplicons
48 analysis
49 approach
50 bacteria
51 bacterial biomass
52 bacterial isolates
53 biological samples
54 biomass
55 cases
56 cases of formalin
57 community
58 community profiling
59 computational framework
60 conditions
61 convex optimization problem
62 coverage
63 current techniques
64 custom set
65 depth
66 different PCR
67 different ways
68 diversity
69 downstream analysis
70 efficient downstream analysis
71 fast reconstruction
72 formalin
73 fragmented DNA
74 framework
75 gene sequences
76 genes
77 gut bacterial isolates
78 harsh conditions
79 high resolution
80 high-resolution profiling
81 identification
82 improvement
83 increase
84 independent regions
85 isolates
86 knowledge
87 large coverage
88 length
89 little modification
90 low bacterial biomass
91 manuscript
92 memory
93 method
94 microbial communities
95 microbial community profiling
96 microbial diversity
97 microbiome
98 mixture
99 mock mixtures
100 moderate memory
101 modification
102 multilocus sequence typing
103 need
104 next-generation sequencing methods
105 number
106 optimization problem
107 pairs
108 paraffin-embedded samples
109 platform
110 platforms today
111 preparation protocol
112 primer pairs
113 problem
114 profiling
115 profiling approach
116 project
117 protocol
118 rRNA gene
119 rRNA gene sequences
120 rRNA gene variable regions
121 read length
122 real-world biological samples
123 reconstruction
124 region
125 region framework
126 remarkable microbial diversity
127 researchers
128 resolution
129 results
130 sample number
131 sample preparation protocol
132 samples
133 sequence
134 sequence typing
135 sequencing approach
136 sequencing depth
137 sequencing methods
138 sequencing results
139 set
140 set of amplicons
141 shortcomings
142 shotgun sequencing approach
143 silico simulations
144 simulations
145 single region
146 standard sample preparation protocols
147 subset
148 technique
149 today
150 total length
151 twofold improvement
152 typing
153 use
154 variable regions
155 way
156 whole-genome shotgun sequencing approach
157 schema:name Combining 16S rRNA gene variable regions enables high-resolution microbial community profiling
158 schema:pagination 17
159 schema:productId N9dd857c6aa2749688fb0c66dbb5b47a9
160 Nb7028a156e4c444a87ac1951f6e55cd6
161 Nd2c16a32315e41b7931e33c293d9ecaa
162 schema:sameAs https://app.dimensions.ai/details/publication/pub.1100667179
163 https://doi.org/10.1186/s40168-017-0396-x
164 schema:sdDatePublished 2022-05-10T10:22
165 schema:sdLicense https://scigraph.springernature.com/explorer/license/
166 schema:sdPublisher N9e99e9a84c514056902a58870082fb0a
167 schema:url https://doi.org/10.1186/s40168-017-0396-x
168 sgo:license sg:explorer/license/
169 sgo:sdDataset articles
170 rdf:type schema:ScholarlyArticle
171 N1e464ad0a3ff4e5bb3be44d6f489524b schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
172 schema:name Drosophila melanogaster
173 rdf:type schema:DefinedTerm
174 N212d954e646a4bffac670605d5411af6 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
175 schema:name DNA Probes
176 rdf:type schema:DefinedTerm
177 N2c929006907d4d758a0c6e723ce7b0ec schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
178 schema:name Algorithms
179 rdf:type schema:DefinedTerm
180 N3af57d9115bf49b79591d9ef0958d39b schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
181 schema:name DNA, Bacterial
182 rdf:type schema:DefinedTerm
183 N4e5b915d45134f7097e58f2e7a4281c4 rdf:first sg:person.01065252767.89
184 rdf:rest N889854dd40234ccf9800000cc29e4b7f
185 N58bc86b9ca0041a09c1476f9c9cdc0e2 rdf:first sg:person.01360371605.17
186 rdf:rest rdf:nil
187 N5f8bef3e46b8475ca4523d535a93e457 rdf:first sg:person.01163102774.24
188 rdf:rest Ne5df1bd01f9949f090e151ccbcc9e4ba
189 N6986c7a641934b27a566282306ed4adf schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
190 schema:name Animals
191 rdf:type schema:DefinedTerm
192 N889854dd40234ccf9800000cc29e4b7f rdf:first sg:person.01012327240.40
193 rdf:rest N58bc86b9ca0041a09c1476f9c9cdc0e2
194 N8a37c17d7a794efcb8e85ac51271b0c1 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
195 schema:name Phylogeny
196 rdf:type schema:DefinedTerm
197 N946fc253cfd04c56b4678f2ae1267d8e schema:volumeNumber 6
198 rdf:type schema:PublicationVolume
199 N9c2ee6245e5f4d8ba1eac1b68626923b schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
200 schema:name RNA, Ribosomal, 16S
201 rdf:type schema:DefinedTerm
202 N9dd857c6aa2749688fb0c66dbb5b47a9 schema:name doi
203 schema:value 10.1186/s40168-017-0396-x
204 rdf:type schema:PropertyValue
205 N9e99e9a84c514056902a58870082fb0a schema:name Springer Nature - SN SciGraph project
206 rdf:type schema:Organization
207 Nb7028a156e4c444a87ac1951f6e55cd6 schema:name pubmed_id
208 schema:value 29373999
209 rdf:type schema:PropertyValue
210 Nbf6e260732b94aa6bb1b4b8284e776d8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
211 schema:name Polymerase Chain Reaction
212 rdf:type schema:DefinedTerm
213 Nc1d41e08f6c34a409aa63eb8a8afc96c rdf:first sg:person.01141151565.63
214 rdf:rest Ncdecb66b88b2409c96a9d749cff3f2e7
215 Nc2e4e5808e644c8396c2d40f27bbfcd8 schema:issueNumber 1
216 rdf:type schema:PublicationIssue
217 Ncdecb66b88b2409c96a9d749cff3f2e7 rdf:first sg:person.01353371540.37
218 rdf:rest N5f8bef3e46b8475ca4523d535a93e457
219 Nd2c16a32315e41b7931e33c293d9ecaa schema:name dimensions_id
220 schema:value pub.1100667179
221 rdf:type schema:PropertyValue
222 Nd5ea640bcf114d92a2434561ed2fccf1 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
223 schema:name Bacteria
224 rdf:type schema:DefinedTerm
225 Ne1db71a8e8ea400ab969c07231b2c5f8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
226 schema:name Computer Simulation
227 rdf:type schema:DefinedTerm
228 Ne5df1bd01f9949f090e151ccbcc9e4ba rdf:first sg:person.01106137556.69
229 rdf:rest N4e5b915d45134f7097e58f2e7a4281c4
230 Nea2a145a81ea4a6b9c6c13378a8352fe schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
231 schema:name Microbiota
232 rdf:type schema:DefinedTerm
233 Nf1d158ba4df94cda899677b0c3040eb4 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
234 schema:name Sequence Analysis, DNA
235 rdf:type schema:DefinedTerm
236 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
237 schema:name Biological Sciences
238 rdf:type schema:DefinedTerm
239 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
240 schema:name Genetics
241 rdf:type schema:DefinedTerm
242 anzsrc-for:0605 schema:inDefinedTermSet anzsrc-for:
243 schema:name Microbiology
244 rdf:type schema:DefinedTerm
245 sg:grant.4898633 http://pending.schema.org/fundedItem sg:pub.10.1186/s40168-017-0396-x
246 rdf:type schema:MonetaryGrant
247 sg:journal.1048878 schema:issn 2049-2618
248 schema:name Microbiome
249 schema:publisher Springer Nature
250 rdf:type schema:Periodical
251 sg:person.01012327240.40 schema:affiliation grid-institutes:grid.13992.30
252 schema:familyName Soen
253 schema:givenName Yoav
254 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01012327240.40
255 rdf:type schema:Person
256 sg:person.01065252767.89 schema:affiliation grid-institutes:grid.266102.1
257 schema:familyName Turnbaugh
258 schema:givenName Peter J.
259 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065252767.89
260 rdf:type schema:Person
261 sg:person.01106137556.69 schema:affiliation grid-institutes:grid.4714.6
262 schema:familyName Zeisel
263 schema:givenName Amit
264 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01106137556.69
265 rdf:type schema:Person
266 sg:person.01141151565.63 schema:affiliation grid-institutes:grid.13992.30
267 schema:familyName Fuks
268 schema:givenName Garold
269 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01141151565.63
270 rdf:type schema:Person
271 sg:person.01163102774.24 schema:affiliation grid-institutes:grid.266100.3
272 schema:familyName Amir
273 schema:givenName Amnon
274 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01163102774.24
275 rdf:type schema:Person
276 sg:person.01353371540.37 schema:affiliation grid-institutes:grid.13992.30
277 schema:familyName Elgart
278 schema:givenName Michael
279 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01353371540.37
280 rdf:type schema:Person
281 sg:person.01360371605.17 schema:affiliation grid-institutes:grid.412512.1
282 schema:familyName Shental
283 schema:givenName Noam
284 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01360371605.17
285 rdf:type schema:Person
286 sg:pub.10.1007/s00403-011-1189-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1035564942
287 https://doi.org/10.1007/s00403-011-1189-x
288 rdf:type schema:CreativeWork
289 sg:pub.10.1038/ismej.2011.188 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022891428
290 https://doi.org/10.1038/ismej.2011.188
291 rdf:type schema:CreativeWork
292 sg:pub.10.1038/nature13793 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003490001
293 https://doi.org/10.1038/nature13793
294 rdf:type schema:CreativeWork
295 sg:pub.10.1038/nbt.3601 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002490773
296 https://doi.org/10.1038/nbt.3601
297 rdf:type schema:CreativeWork
298 sg:pub.10.1038/nmeth.3869 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016631324
299 https://doi.org/10.1038/nmeth.3869
300 rdf:type schema:CreativeWork
301 sg:pub.10.1038/srep02620 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043515512
302 https://doi.org/10.1038/srep02620
303 rdf:type schema:CreativeWork
304 sg:pub.10.1186/1471-2180-12-66 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035882309
305 https://doi.org/10.1186/1471-2180-12-66
306 rdf:type schema:CreativeWork
307 sg:pub.10.1186/gb-2009-10-3-r25 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049583368
308 https://doi.org/10.1186/gb-2009-10-3-r25
309 rdf:type schema:CreativeWork
310 sg:pub.10.1186/gb-2010-11-s1-p3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043272763
311 https://doi.org/10.1186/gb-2010-11-s1-p3
312 rdf:type schema:CreativeWork
313 sg:pub.10.1186/gb-2011-12-5-r44 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000326175
314 https://doi.org/10.1186/gb-2011-12-5-r44
315 rdf:type schema:CreativeWork
316 sg:pub.10.1186/gb-2013-14-5-r51 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034665395
317 https://doi.org/10.1186/gb-2013-14-5-r51
318 rdf:type schema:CreativeWork
319 sg:pub.10.1186/s12866-016-0738-z schema:sameAs https://app.dimensions.ai/details/publication/pub.1021671098
320 https://doi.org/10.1186/s12866-016-0738-z
321 rdf:type schema:CreativeWork
322 grid-institutes:grid.13992.30 schema:alternateName Department of Biomolecular Sciences, Weizmann Institute of Science, 7610001, Rehovot, Israel
323 Departments of Physics of Complex Systems, Weizmann Institute of Science, 7610001, Rehovot, Israel
324 schema:name Department of Biomolecular Sciences, Weizmann Institute of Science, 7610001, Rehovot, Israel
325 Departments of Physics of Complex Systems, Weizmann Institute of Science, 7610001, Rehovot, Israel
326 rdf:type schema:Organization
327 grid-institutes:grid.266100.3 schema:alternateName Department of Pediatrics, University of California San Diego, La Jolla, 92093, CA, USA
328 schema:name Department of Pediatrics, University of California San Diego, La Jolla, 92093, CA, USA
329 rdf:type schema:Organization
330 grid-institutes:grid.266102.1 schema:alternateName Department of Microbiology and Immunology, University of California San Francisco, 94143, San Francisco, CA, USA
331 schema:name Department of Microbiology and Immunology, University of California San Francisco, 94143, San Francisco, CA, USA
332 rdf:type schema:Organization
333 grid-institutes:grid.412512.1 schema:alternateName Department of Computer Science, The Open University of Israel, 43107, Ra’anana, Israel
334 schema:name Department of Computer Science, The Open University of Israel, 43107, Ra’anana, Israel
335 rdf:type schema:Organization
336 grid-institutes:grid.4714.6 schema:alternateName Division of Molecular Neurobiology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, S-171 77, Stockholm, Sweden
337 schema:name Division of Molecular Neurobiology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, S-171 77, Stockholm, Sweden
338 rdf:type schema:Organization
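
The author order is encoded in the rdf:first / rdf:rest blank-node cells scattered through the listing above, which makes it hard to read off directly. The following is a minimal sketch, in plain Python rather than an RDF library, of how that chain can be followed to recover the order. It assumes the head of the list is the one cell that no other cell names in rdf:rest; the triple that attaches the list to the article lies outside this part of the listing. The variable names and the helper list_order are illustrative only.

```python
# Blank-node identifiers of the list cells, copied from the listing.
C_FUKS = "Nc1d41e08f6c34a409aa63eb8a8afc96c"
C_ELGART = "Ncdecb66b88b2409c96a9d749cff3f2e7"
C_AMIR = "N5f8bef3e46b8475ca4523d535a93e457"
C_ZEISEL = "Ne5df1bd01f9949f090e151ccbcc9e4ba"
C_TURNBAUGH = "N4e5b915d45134f7097e58f2e7a4281c4"
C_SOEN = "N889854dd40234ccf9800000cc29e4b7f"
C_SHENTAL = "N58bc86b9ca0041a09c1476f9c9cdc0e2"

# Each cell maps to (rdf:first, rdf:rest), as given in the listing.
cells = {
    C_TURNBAUGH: ("sg:person.01065252767.89", C_SOEN),
    C_SHENTAL: ("sg:person.01360371605.17", "rdf:nil"),
    C_AMIR: ("sg:person.01163102774.24", C_ZEISEL),
    C_SOEN: ("sg:person.01012327240.40", C_SHENTAL),
    C_FUKS: ("sg:person.01141151565.63", C_ELGART),
    C_ELGART: ("sg:person.01353371540.37", C_AMIR),
    C_ZEISEL: ("sg:person.01106137556.69", C_TURNBAUGH),
}

# schema:familyName for each person node, as given in the listing.
family_names = {
    "sg:person.01012327240.40": "Soen",
    "sg:person.01065252767.89": "Turnbaugh",
    "sg:person.01106137556.69": "Zeisel",
    "sg:person.01141151565.63": "Fuks",
    "sg:person.01163102774.24": "Amir",
    "sg:person.01353371540.37": "Elgart",
    "sg:person.01360371605.17": "Shental",
}


def list_order(cells):
    """Return the rdf:first values in list order.

    Assumption: the head is the only cell that never appears as an
    rdf:rest target; the chain ends at rdf:nil.
    """
    rest_targets = {rest for _, rest in cells.values()}
    heads = [node for node in cells if node not in rest_targets]
    if len(heads) != 1:
        raise ValueError("expected exactly one head cell")
    order, seen, node = [], set(), heads[0]
    while node != "rdf:nil":
        if node in seen:
            raise ValueError("cycle in rdf:rest chain")
        seen.add(node)
        first, node = cells[node]
        order.append(first)
    return order


authors = [family_names[person] for person in list_order(cells)]
print(authors)
```

The recovered order can be checked against the AUTHORS line at the top of the record.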
 




...