454 antibody sequencing - error characterization and correction


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2011-12

AUTHORS

Ponraj Prabakaran, Emily Streaker, Weizao Chen, Dimiter S Dimitrov

ABSTRACT

BACKGROUND: 454 sequencing is currently the method of choice for sequencing of antibody repertoires and libraries containing large numbers (10^6 to 10^12) of different molecules with similar frameworks and variable regions which poses significant challenges for identifying sequencing errors. Identification and correction of sequencing errors in such mixtures is especially important for the exploration of complex maturation pathways and identification of putative germline predecessors of highly somatically mutated antibodies. To quantify and correct errors incorporated in 454 antibody sequencing, we sequenced six antibodies at different known concentrations twice over and compared them with the corresponding known sequences as determined by standard Sanger sequencing.

RESULTS: We found that 454 antibody sequencing could lead to approximately 20% incorrect reads due to insertions that were mostly found at shorter homopolymer regions of 2-3 nucleotide length, and less so by insertions, deletions and other variants at random sites. Correction of errors might reduce this population of erroneous reads down to 5-10%. However, there are a certain number of errors accounting for 4-8% of the total reads that could not be corrected unless several repeated sequencing is performed, although this may not be possible for large diverse libraries and repertoires including complete sets of antibodies (antibodyomes).

CONCLUSIONS: The experimental test procedure carried out for assessing 454 antibody sequencing errors reveals high (up to 20%) incorrect reads; the errors can be reduced down to 5-10% but not less which suggests the use of caution to avoid false discovery of antibody variants and diversity.

PAGES

404

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1756-0500-4-404

DOI

http://dx.doi.org/10.1186/1756-0500-4-404

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1028760559

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/21992227



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/1107", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Immunology", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/11", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Medical and Health Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "National Institutes of Health", 
          "id": "https://www.grid.ac/institutes/grid.94365.3d", 
          "name": [
            "Protein Interactions Group, Center for Cancer Research Nanobiology Program, National Cancer Institute (NCI)-Frederick, National Institutes of Health (NIH), 21702-1201, Frederick, MD, USA", 
            "Basic Research Program, Science Applications International Corporation-Frederick, Inc, MD 21702, NCI-Frederick, Frederick, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Prabakaran", 
        "givenName": "Ponraj", 
        "id": "sg:person.01030646403.51", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01030646403.51"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "National Institutes of Health", 
          "id": "https://www.grid.ac/institutes/grid.94365.3d", 
          "name": [
            "Protein Interactions Group, Center for Cancer Research Nanobiology Program, National Cancer Institute (NCI)-Frederick, National Institutes of Health (NIH), 21702-1201, Frederick, MD, USA", 
            "Basic Research Program, Science Applications International Corporation-Frederick, Inc, MD 21702, NCI-Frederick, Frederick, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Streaker", 
        "givenName": "Emily", 
        "id": "sg:person.01370437776.45", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01370437776.45"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "National Institutes of Health", 
          "id": "https://www.grid.ac/institutes/grid.94365.3d", 
          "name": [
            "Protein Interactions Group, Center for Cancer Research Nanobiology Program, National Cancer Institute (NCI)-Frederick, National Institutes of Health (NIH), 21702-1201, Frederick, MD, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chen", 
        "givenName": "Weizao", 
        "id": "sg:person.0762165246.36", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0762165246.36"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "National Institutes of Health", 
          "id": "https://www.grid.ac/institutes/grid.94365.3d", 
          "name": [
            "Protein Interactions Group, Center for Cancer Research Nanobiology Program, National Cancer Institute (NCI)-Frederick, National Institutes of Health (NIH), 21702-1201, Frederick, MD, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Dimitrov", 
        "givenName": "Dimiter S", 
        "id": "sg:person.012247636637.59", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012247636637.59"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1186/1471-2164-12-106", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004430249", 
          "https://doi.org/10.1186/1471-2164-12-106"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.4161/mabs.2.3.11779", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010469742"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/bies.200900181", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010604039"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.4049/jimmunol.1000445", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017371685"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btq614", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022615178"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0022-2836(81)90087-5", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1024589839"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btq653", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030919191"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.0909775106", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042221024"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/scitranslmed.3000540", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048321753"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkn316", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1051706958"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2011-12", 
    "datePublishedReg": "2011-12-01", 
    "description": "BACKGROUND: 454 sequencing is currently the method of choice for sequencing of antibody repertoires and libraries containing large numbers (106 to 1012) of different molecules with similar frameworks and variable regions which poses significant challenges for identifying sequencing errors. Identification and correction of sequencing errors in such mixtures is especially important for the exploration of complex maturation pathways and identification of putative germline predecessors of highly somatically mutated antibodies. To quantify and correct errors incorporated in 454 antibody sequencing, we sequenced six antibodies at different known concentrations twice over and compared them with the corresponding known sequences as determined by standard Sanger sequencing.\nRESULTS: We found that 454 antibody sequencing could lead to approximately 20% incorrect reads due to insertions that were mostly found at shorter homopolymer regions of 2-3 nucleotide length, and less so by insertions, deletions and other variants at random sites. Correction of errors might reduce this population of erroneous reads down to 5-10%. However, there are a certain number of errors accounting for 4-8% of the total reads that could not be corrected unless several repeated sequencing is performed, although this may not be possible for large diverse libraries and repertoires including complete sets of antibodies (antibodyomes).\nCONCLUSIONS: The experimental test procedure carried out for assessing 454 antibody sequencing errors reveals high (up to 20%) incorrect reads; the errors can be reduced down to 5-10% but not less which suggests the use of caution to avoid false discovery of antibody variants and diversity.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/1756-0500-4-404", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.2724082", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2723791", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2724081", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.5246872", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1039457", 
        "issn": [
          "1756-0500"
        ], 
        "name": "BMC Research Notes", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "4"
      }
    ], 
    "name": "454 antibody sequencing - error characterization and correction", 
    "pagination": "404", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "8901dc5b57815f35c8e1d66c5405e3db835a2fed96120d8fef73f20b6ea265e2"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "21992227"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "101462768"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1756-0500-4-404"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1028760559"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1756-0500-4-404", 
      "https://app.dimensions.ai/details/publication/pub.1028760559"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-11T00:16", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8695_00000513.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1186%2F1756-0500-4-404"
  }
]
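
A record like the one above can be read with any JSON parser. The following is a minimal Python sketch, not part of SciGraph itself: the function name summarize_record and the shortened SAMPLE text are assumptions made for illustration, and only the keys "name", "author", "citation", "givenName", "familyName" and "id" are taken from the record shown. It also drops repeated citation ids, in case an export lists one more than once.

```python
import json


def summarize_record(jsonld_text):
    """Return title, author names and cited ids from a SciGraph-style JSON-LD array."""
    records = json.loads(jsonld_text)
    record = records[0]  # the export wraps a single article in a list

    # Join given and family names in the order the record lists the authors.
    authors = [
        f"{a.get('givenName', '')} {a.get('familyName', '')}".strip()
        for a in record.get("author", [])
    ]

    # Keep each cited id once, preserving the original order.
    seen = set()
    citations = []
    for entry in record.get("citation", []):
        cid = entry.get("id")
        if cid and cid not in seen:
            seen.add(cid)
            citations.append(cid)

    return {"title": record.get("name"), "authors": authors, "citations": citations}


# Shortened, hand-made excerpt of the record above (illustration only).
SAMPLE = """
[
  {
    "name": "454 antibody sequencing - error characterization and correction",
    "author": [
      {"familyName": "Prabakaran", "givenName": "Ponraj"},
      {"familyName": "Dimitrov", "givenName": "Dimiter S"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1002/bies.200900181"},
      {"id": "https://doi.org/10.1002/bies.200900181"},
      {"id": "https://doi.org/10.1093/nar/gkn316"}
    ]
  }
]
"""

summary = summarize_record(SAMPLE)
print(summary["title"])
print(summary["authors"])
print(summary["citations"])
```

Working on the parsed dictionary keeps the sketch independent of any JSON-LD library; a full JSON-LD processor would be needed only if the "@context" had to be expanded.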
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1756-0500-4-404'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1756-0500-4-404'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1756-0500-4-404'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1756-0500-4-404'
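
The four curl calls above differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. This is an illustrative sketch under assumptions: the names MIME_TYPES, build_request and fetch, and the short format labels used as dictionary keys, are hypothetical and not part of any SciGraph client, and the sketch does not verify that the endpoint still answers.

```python
from urllib.request import Request, urlopen

# URL and MIME types copied from the curl examples above.
BASE = "https://scigraph.springernature.com/pub.10.1186/1756-0500-4-404"
MIME_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE):
    """Create (but do not send) a request asking for one RDF serialization."""
    return Request(url, headers={"Accept": MIME_TYPES[fmt]})


def fetch(fmt, url=BASE, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building a request is a local operation; nothing is sent until fetch() is called.
req = build_request("turtle")
print(req.full_url, req.get_header("Accept"))
```

Separating build_request from fetch makes it possible to inspect the headers without touching the network.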


 

This table displays all metadata directly associated to this object as RDF triples.

129 TRIPLES      21 PREDICATES      39 URIs      21 LITERALS      9 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1756-0500-4-404 schema:about anzsrc-for:11
2 anzsrc-for:1107
3 schema:author N32c8ab3c49b74944878ed3043dd88dc4
4 schema:citation sg:pub.10.1186/1471-2164-12-106
5 https://doi.org/10.1002/bies.200900181
6 https://doi.org/10.1016/0022-2836(81)90087-5
7 https://doi.org/10.1073/pnas.0909775106
8 https://doi.org/10.1093/bioinformatics/btq614
9 https://doi.org/10.1093/bioinformatics/btq653
10 https://doi.org/10.1093/nar/gkn316
11 https://doi.org/10.1126/scitranslmed.3000540
12 https://doi.org/10.4049/jimmunol.1000445
13 https://doi.org/10.4161/mabs.2.3.11779
14 schema:datePublished 2011-12
15 schema:datePublishedReg 2011-12-01
16 schema:description BACKGROUND: 454 sequencing is currently the method of choice for sequencing of antibody repertoires and libraries containing large numbers (106 to 1012) of different molecules with similar frameworks and variable regions which poses significant challenges for identifying sequencing errors. Identification and correction of sequencing errors in such mixtures is especially important for the exploration of complex maturation pathways and identification of putative germline predecessors of highly somatically mutated antibodies. To quantify and correct errors incorporated in 454 antibody sequencing, we sequenced six antibodies at different known concentrations twice over and compared them with the corresponding known sequences as determined by standard Sanger sequencing. RESULTS: We found that 454 antibody sequencing could lead to approximately 20% incorrect reads due to insertions that were mostly found at shorter homopolymer regions of 2-3 nucleotide length, and less so by insertions, deletions and other variants at random sites. Correction of errors might reduce this population of erroneous reads down to 5-10%. However, there are a certain number of errors accounting for 4-8% of the total reads that could not be corrected unless several repeated sequencing is performed, although this may not be possible for large diverse libraries and repertoires including complete sets of antibodies (antibodyomes). CONCLUSIONS: The experimental test procedure carried out for assessing 454 antibody sequencing errors reveals high (up to 20%) incorrect reads; the errors can be reduced down to 5-10% but not less which suggests the use of caution to avoid false discovery of antibody variants and diversity.
17 schema:genre research_article
18 schema:inLanguage en
19 schema:isAccessibleForFree true
20 schema:isPartOf N2d9d8eae98de4607babed1262ce070ae
21 N670b49ae63a14da09307ced4761bc77a
22 sg:journal.1039457
23 schema:name 454 antibody sequencing - error characterization and correction
24 schema:pagination 404
25 schema:productId Na117d16c9fc54615af4ccb639863faf5
26 Naf8e3480a1304c7e904f144354175cd7
27 Nb9dd132d45204c9a815eed80b72406aa
28 Nc2493ad4231b4624875edce5556e215f
29 Ncea1c0dff4b44a05bf3dadd3e7516026
30 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028760559
31 https://doi.org/10.1186/1756-0500-4-404
32 schema:sdDatePublished 2019-04-11T00:16
33 schema:sdLicense https://scigraph.springernature.com/explorer/license/
34 schema:sdPublisher Nd1024491e38f43b6812d12bfd463e9c5
35 schema:url http://link.springer.com/10.1186%2F1756-0500-4-404
36 sgo:license sg:explorer/license/
37 sgo:sdDataset articles
38 rdf:type schema:ScholarlyArticle
39 N1a3e37b6ffc9482b8dc0ab0a3a38d32d rdf:first sg:person.012247636637.59
40 rdf:rest rdf:nil
41 N2d9d8eae98de4607babed1262ce070ae schema:issueNumber 1
42 rdf:type schema:PublicationIssue
43 N32c8ab3c49b74944878ed3043dd88dc4 rdf:first sg:person.01030646403.51
44 rdf:rest Nd2ad9cb88893400db3b44e465b85936a
45 N670b49ae63a14da09307ced4761bc77a schema:volumeNumber 4
46 rdf:type schema:PublicationVolume
47 Na117d16c9fc54615af4ccb639863faf5 schema:name readcube_id
48 schema:value 8901dc5b57815f35c8e1d66c5405e3db835a2fed96120d8fef73f20b6ea265e2
49 rdf:type schema:PropertyValue
50 Naf8e3480a1304c7e904f144354175cd7 schema:name nlm_unique_id
51 schema:value 101462768
52 rdf:type schema:PropertyValue
53 Nb9dd132d45204c9a815eed80b72406aa schema:name dimensions_id
54 schema:value pub.1028760559
55 rdf:type schema:PropertyValue
56 Nc2493ad4231b4624875edce5556e215f schema:name pubmed_id
57 schema:value 21992227
58 rdf:type schema:PropertyValue
59 Ncea1c0dff4b44a05bf3dadd3e7516026 schema:name doi
60 schema:value 10.1186/1756-0500-4-404
61 rdf:type schema:PropertyValue
62 Nd1024491e38f43b6812d12bfd463e9c5 schema:name Springer Nature - SN SciGraph project
63 rdf:type schema:Organization
64 Nd2ad9cb88893400db3b44e465b85936a rdf:first sg:person.01370437776.45
65 rdf:rest Ne1ec334291874b31b1dfafa77e3aa96d
66 Ne1ec334291874b31b1dfafa77e3aa96d rdf:first sg:person.0762165246.36
67 rdf:rest N1a3e37b6ffc9482b8dc0ab0a3a38d32d
68 anzsrc-for:11 schema:inDefinedTermSet anzsrc-for:
69 schema:name Medical and Health Sciences
70 rdf:type schema:DefinedTerm
71 anzsrc-for:1107 schema:inDefinedTermSet anzsrc-for:
72 schema:name Immunology
73 rdf:type schema:DefinedTerm
74 sg:grant.2723791 http://pending.schema.org/fundedItem sg:pub.10.1186/1756-0500-4-404
75 rdf:type schema:MonetaryGrant
76 sg:grant.2724081 http://pending.schema.org/fundedItem sg:pub.10.1186/1756-0500-4-404
77 rdf:type schema:MonetaryGrant
78 sg:grant.2724082 http://pending.schema.org/fundedItem sg:pub.10.1186/1756-0500-4-404
79 rdf:type schema:MonetaryGrant
80 sg:grant.5246872 http://pending.schema.org/fundedItem sg:pub.10.1186/1756-0500-4-404
81 rdf:type schema:MonetaryGrant
82 sg:journal.1039457 schema:issn 1756-0500
83 schema:name BMC Research Notes
84 rdf:type schema:Periodical
85 sg:person.01030646403.51 schema:affiliation https://www.grid.ac/institutes/grid.94365.3d
86 schema:familyName Prabakaran
87 schema:givenName Ponraj
88 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01030646403.51
89 rdf:type schema:Person
90 sg:person.012247636637.59 schema:affiliation https://www.grid.ac/institutes/grid.94365.3d
91 schema:familyName Dimitrov
92 schema:givenName Dimiter S
93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012247636637.59
94 rdf:type schema:Person
95 sg:person.01370437776.45 schema:affiliation https://www.grid.ac/institutes/grid.94365.3d
96 schema:familyName Streaker
97 schema:givenName Emily
98 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01370437776.45
99 rdf:type schema:Person
100 sg:person.0762165246.36 schema:affiliation https://www.grid.ac/institutes/grid.94365.3d
101 schema:familyName Chen
102 schema:givenName Weizao
103 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0762165246.36
104 rdf:type schema:Person
105 sg:pub.10.1186/1471-2164-12-106 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004430249
106 https://doi.org/10.1186/1471-2164-12-106
107 rdf:type schema:CreativeWork
108 https://doi.org/10.1002/bies.200900181 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010604039
109 rdf:type schema:CreativeWork
110 https://doi.org/10.1016/0022-2836(81)90087-5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024589839
111 rdf:type schema:CreativeWork
112 https://doi.org/10.1073/pnas.0909775106 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042221024
113 rdf:type schema:CreativeWork
114 https://doi.org/10.1093/bioinformatics/btq614 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022615178
115 rdf:type schema:CreativeWork
116 https://doi.org/10.1093/bioinformatics/btq653 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030919191
117 rdf:type schema:CreativeWork
118 https://doi.org/10.1093/nar/gkn316 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051706958
119 rdf:type schema:CreativeWork
120 https://doi.org/10.1126/scitranslmed.3000540 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048321753
121 rdf:type schema:CreativeWork
122 https://doi.org/10.4049/jimmunol.1000445 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017371685
123 rdf:type schema:CreativeWork
124 https://doi.org/10.4161/mabs.2.3.11779 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010469742
125 rdf:type schema:CreativeWork
126 https://www.grid.ac/institutes/grid.94365.3d schema:alternateName National Institutes of Health
127 schema:name Basic Research Program, Science Applications International Corporation-Frederick, Inc, MD 21702, NCI-Frederick, Frederick, USA
128 Protein Interactions Group, Center for Cancer Research Nanobiology Program, National Cancer Institute (NCI)-Frederick, National Institutes of Health (NIH), 21702-1201, Frederick, MD, USA
129 rdf:type schema:Organization
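
In the table above, the schema:author value is a blank node that starts an RDF list: each node carries one author in rdf:first and points to the next node in rdf:rest, until rdf:nil. The Python sketch below shows one way to recover the author order from such triples. The function name walk_rdf_list and the tuple representation are assumptions for illustration; the node and person identifiers are copied from rows 39-40, 43-44 and 64-67.

```python
RDF_FIRST = "rdf:first"
RDF_REST = "rdf:rest"
RDF_NIL = "rdf:nil"


def walk_rdf_list(triples, head):
    """Follow rdf:first / rdf:rest links from head and return the items in order."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}

    items = []
    visited = set()
    node = head
    while node != RDF_NIL:
        # Guard against cycles and nodes without an rdf:first value.
        if node in visited or node not in first:
            raise ValueError(f"malformed RDF list at {node}")
        visited.add(node)
        items.append(first[node])
        node = rest.get(node, RDF_NIL)
    return items


# (subject, predicate, object) tuples transcribed from the table above.
TRIPLES = [
    ("N1a3e37b6ffc9482b8dc0ab0a3a38d32d", "rdf:first", "sg:person.012247636637.59"),
    ("N1a3e37b6ffc9482b8dc0ab0a3a38d32d", "rdf:rest", "rdf:nil"),
    ("N32c8ab3c49b74944878ed3043dd88dc4", "rdf:first", "sg:person.01030646403.51"),
    ("N32c8ab3c49b74944878ed3043dd88dc4", "rdf:rest", "Nd2ad9cb88893400db3b44e465b85936a"),
    ("Nd2ad9cb88893400db3b44e465b85936a", "rdf:first", "sg:person.01370437776.45"),
    ("Nd2ad9cb88893400db3b44e465b85936a", "rdf:rest", "Ne1ec334291874b31b1dfafa77e3aa96d"),
    ("Ne1ec334291874b31b1dfafa77e3aa96d", "rdf:first", "sg:person.0762165246.36"),
    ("Ne1ec334291874b31b1dfafa77e3aa96d", "rdf:rest", "N1a3e37b6ffc9482b8dc0ab0a3a38d32d"),
]

# The list head is the object of schema:author in row 3.
author_order = walk_rdf_list(TRIPLES, "N32c8ab3c49b74944878ed3043dd88dc4")
print(author_order)
```

The resulting order (Prabakaran, Streaker, Chen, Dimitrov) matches the AUTHORS line of the record, even though the table lists the list nodes alphabetically by blank-node id.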
 



