Highly Scalable Algorithms for Robust String Barcoding


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2005

AUTHORS

B. DasGupta , K. M. Konwar , I. I. Măndoiu , A. A. Shvartsman

ABSTRACT

String barcoding is a recently introduced technique for genomic-based identification of microorganisms. In this paper we describe the engineering of highly scalable algorithms for robust string barcoding. Our methods enable distinguisher selection based on whole genomic sequences of hundreds of microorganisms of up to bacterial size on a well-equipped workstation, and can be easily parallelized to further extend the applicability range to thousands of bacterial size genomes. Experimental results on both randomly generated and NCBI genomic data show that whole-genome based selection results in a number of distinguishers nearly matching the information theoretic lower bounds for the problem.

PAGES

1020-1028

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/11428848_129

DOI

http://dx.doi.org/10.1007/11428848_129

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1004539269



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department of Computer Science, University of Illinois at Chicago, 60607-7053, Chicago, IL", 
          "id": "http://www.grid.ac/institutes/grid.185648.6", 
          "name": [
            "Department of Computer Science, University of Illinois at Chicago, 60607-7053, Chicago, IL"
          ], 
          "type": "Organization"
        }, 
        "familyName": "DasGupta", 
        "givenName": "B.", 
        "id": "sg:person.0763403270.10", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0763403270.10"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Rd., 06269-2155, Storrs, CT", 
          "id": "http://www.grid.ac/institutes/grid.63054.34", 
          "name": [
            "Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Rd., 06269-2155, Storrs, CT"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Konwar", 
        "givenName": "K. M.", 
        "id": "sg:person.015370702377.45", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015370702377.45"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Rd., 06269-2155, Storrs, CT", 
          "id": "http://www.grid.ac/institutes/grid.63054.34", 
          "name": [
            "Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Rd., 06269-2155, Storrs, CT"
          ], 
          "type": "Organization"
        }, 
        "familyName": "M\u0103ndoiu", 
        "givenName": "I. I.", 
        "id": "sg:person.01017610620.16", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01017610620.16"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Rd., 06269-2155, Storrs, CT", 
          "id": "http://www.grid.ac/institutes/grid.63054.34", 
          "name": [
            "Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Rd., 06269-2155, Storrs, CT"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Shvartsman", 
        "givenName": "A. A.", 
        "id": "sg:person.012761271003.11", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012761271003.11"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2005", 
    "datePublishedReg": "2005-01-01", 
    "description": "String barcoding is a recently introduced technique for genomic-based identification of microorganisms. In this paper we describe the engineering of highly scalable algorithms for robust string barcoding. Our methods enable distinguisher selection based on whole genomic sequences of hundreds of microorganisms of up to bacterial size on a well-equipped workstation, and can be easily parallelized to further extend the applicability range to thousands of bacterial size genomes. Experimental results on both randomly generated and NCBI genomic data show that whole-genome based selection results in a number of distinguishers nearly matching the information theoretic lower bounds for the problem.", 
    "editor": [
      {
        "familyName": "Sunderam", 
        "givenName": "Vaidy S.", 
        "type": "Person"
      }, 
      {
        "familyName": "van Albada", 
        "givenName": "Geert Dick", 
        "type": "Person"
      }, 
      {
        "familyName": "Sloot", 
        "givenName": "Peter M. A.", 
        "type": "Person"
      }, 
      {
        "familyName": "Dongarra", 
        "givenName": "Jack J.", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/11428848_129", 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-26043-1", 
        "978-3-540-32114-9"
      ], 
      "name": "Computational Science \u2013 ICCS 2005", 
      "type": "Book"
    }, 
    "keywords": [
      "scalable algorithm", 
      "information-theoretic lower bounds", 
      "selection results", 
      "experimental results", 
      "algorithm", 
      "genomic data", 
      "lower bounds", 
      "workstations", 
      "distinguisher", 
      "thousands", 
      "engineering", 
      "bounds", 
      "whole genomic sequence", 
      "technique", 
      "hundreds", 
      "selection", 
      "data", 
      "method", 
      "results", 
      "genomic sequences", 
      "identification of microorganisms", 
      "number", 
      "identification", 
      "sequence", 
      "barcoding", 
      "bacterial size", 
      "applicability range", 
      "microorganisms", 
      "size", 
      "genome", 
      "range", 
      "problem", 
      "paper"
    ], 
    "name": "Highly Scalable Algorithms for Robust String Barcoding", 
    "pagination": "1020-1028", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1004539269"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/11428848_129"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/11428848_129", 
      "https://app.dimensions.ai/details/publication/pub.1004539269"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-11-24T21:15", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221124/entities/gbq_results/chapter/chapter_308.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/11428848_129"
  }
]
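
As a rough illustration of how the JSON-LD record above can be consumed, the following Python sketch reads the author names and the DOI from it. The embedded string is a trimmed copy of the record that keeps only the fields used here, and the helper names author_names and product_id are hypothetical, not part of any SciGraph tooling.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
# The raw string leaves the \u0103 escape for the JSON parser to decode.
RECORD_JSON = r"""
[
  {
    "id": "sg:pub.10.1007/11428848_129",
    "name": "Highly Scalable Algorithms for Robust String Barcoding",
    "author": [
      {"familyName": "DasGupta", "givenName": "B."},
      {"familyName": "Konwar", "givenName": "K. M."},
      {"familyName": "M\u0103ndoiu", "givenName": "I. I."},
      {"familyName": "Shvartsman", "givenName": "A. A."}
    ],
    "productId": [
      {"name": "dimensions_id", "value": ["pub.1004539269"]},
      {"name": "doi", "value": ["10.1007/11428848_129"]}
    ]
  }
]
"""


def author_names(record):
    """Return 'given family' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def product_id(record, name):
    """Return the first value of the productId entry called `name`, or None."""
    for entry in record.get("productId", []):
        if entry.get("name") == name:
            values = entry.get("value", [])
            return values[0] if values else None
    return None


# The document is a JSON array holding a single record object.
record = json.loads(RECORD_JSON)[0]
print(author_names(record))
print(product_id(record, "doi"))
```

Because JSON-LD is plain JSON, the standard json module is enough for this kind of lookup; a dedicated JSON-LD processor would only be needed to expand the @context into full URIs.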
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/11428848_129'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/11428848_129'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/11428848_129'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/11428848_129'
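
The four curl commands above differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. This is a minimal sketch, not an official client: the names ACCEPT_TYPES, build_request and fetch are hypothetical, and the code does not check whether the service still answers at this URL.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/11428848_129"

# Media types taken from the curl commands above; the short keys are
# illustrative labels only.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for the given format."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text.

    Needs network access; availability of the endpoint is an assumption.
    """
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


req = build_request("turtle")
print(req.full_url, req.get_header("Accept"))
```

Calling fetch("nt") would then correspond to the N-Triples curl command, and so on for the other keys.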


 

This table displays all metadata directly associated with this object as RDF triples.

131 TRIPLES      22 PREDICATES      58 URIs      51 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/11428848_129 schema:about anzsrc-for:06
2 anzsrc-for:0604
3 schema:author N0d2eeac2bea241b4ba148b603bd8eea8
4 schema:datePublished 2005
5 schema:datePublishedReg 2005-01-01
6 schema:description String barcoding is a recently introduced technique for genomic-based identification of microorganisms. In this paper we describe the engineering of highly scalable algorithms for robust string barcoding. Our methods enable distinguisher selection based on whole genomic sequences of hundreds of microorganisms of up to bacterial size on a well-equipped workstation, and can be easily parallelized to further extend the applicability range to thousands of bacterial size genomes. Experimental results on both randomly generated and NCBI genomic data show that whole-genome based selection results in a number of distinguishers nearly matching the information theoretic lower bounds for the problem.
7 schema:editor N922c03e3cb92425bb63d55c2aa4c90d2
8 schema:genre chapter
9 schema:isAccessibleForFree true
10 schema:isPartOf N69ceecf3d570410ebd65d7d0dc2a919d
11 schema:keywords algorithm
12 applicability range
13 bacterial size
14 barcoding
15 bounds
16 data
17 distinguisher
18 engineering
19 experimental results
20 genome
21 genomic data
22 genomic sequences
23 hundreds
24 identification
25 identification of microorganisms
26 information-theoretic lower bounds
27 lower bounds
28 method
29 microorganisms
30 number
31 paper
32 problem
33 range
34 results
35 scalable algorithm
36 selection
37 selection results
38 sequence
39 size
40 technique
41 thousands
42 whole genomic sequence
43 workstations
44 schema:name Highly Scalable Algorithms for Robust String Barcoding
45 schema:pagination 1020-1028
46 schema:productId N06b9a78365a541a488491aa3e7d09a12
47 Naf7836a77c894a5fa88100c04ee9b059
48 schema:publisher N36439762b4e2435ca8f2622456470021
49 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004539269
50 https://doi.org/10.1007/11428848_129
51 schema:sdDatePublished 2022-11-24T21:15
52 schema:sdLicense https://scigraph.springernature.com/explorer/license/
53 schema:sdPublisher N91067f82814a4590a797f99739bf916a
54 schema:url https://doi.org/10.1007/11428848_129
55 sgo:license sg:explorer/license/
56 sgo:sdDataset chapters
57 rdf:type schema:Chapter
58 N06b9a78365a541a488491aa3e7d09a12 schema:name dimensions_id
59 schema:value pub.1004539269
60 rdf:type schema:PropertyValue
61 N0d2eeac2bea241b4ba148b603bd8eea8 rdf:first sg:person.0763403270.10
62 rdf:rest Ne4b7103f30c945489310a0d621ccd4c5
63 N1489792069414491b36ea93d1e28c958 schema:familyName Dongarra
64 schema:givenName Jack J.
65 rdf:type schema:Person
66 N29609fee131e4b6998781b66824917c7 schema:familyName Sloot
67 schema:givenName Peter M. A.
68 rdf:type schema:Person
69 N36439762b4e2435ca8f2622456470021 schema:name Springer Nature
70 rdf:type schema:Organisation
71 N4f594edad74e4da5ac60c3bf5fb9b898 schema:familyName van Albada
72 schema:givenName Geert Dick
73 rdf:type schema:Person
74 N68df7bfad9924a9180edd05f366521a7 rdf:first sg:person.012761271003.11
75 rdf:rest rdf:nil
76 N69ceecf3d570410ebd65d7d0dc2a919d schema:isbn 978-3-540-26043-1
77 978-3-540-32114-9
78 schema:name Computational Science – ICCS 2005
79 rdf:type schema:Book
80 N6f4b258daade49a7bc19a3de0dd5ad96 rdf:first sg:person.01017610620.16
81 rdf:rest N68df7bfad9924a9180edd05f366521a7
82 N8d7f7fe339e4420fa4096074ff05c647 schema:familyName Sunderam
83 schema:givenName Vaidy S.
84 rdf:type schema:Person
85 N91067f82814a4590a797f99739bf916a schema:name Springer Nature - SN SciGraph project
86 rdf:type schema:Organization
87 N922c03e3cb92425bb63d55c2aa4c90d2 rdf:first N8d7f7fe339e4420fa4096074ff05c647
88 rdf:rest N9b5e9fa037124be386aa355e1631e24c
89 N9b5e9fa037124be386aa355e1631e24c rdf:first N4f594edad74e4da5ac60c3bf5fb9b898
90 rdf:rest Naa909a376b0f4440b8ac8629e3dc4882
91 N9ce9628f408441ada689f526da89c064 rdf:first N1489792069414491b36ea93d1e28c958
92 rdf:rest rdf:nil
93 Naa909a376b0f4440b8ac8629e3dc4882 rdf:first N29609fee131e4b6998781b66824917c7
94 rdf:rest N9ce9628f408441ada689f526da89c064
95 Naf7836a77c894a5fa88100c04ee9b059 schema:name doi
96 schema:value 10.1007/11428848_129
97 rdf:type schema:PropertyValue
98 Ne4b7103f30c945489310a0d621ccd4c5 rdf:first sg:person.015370702377.45
99 rdf:rest N6f4b258daade49a7bc19a3de0dd5ad96
100 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
101 schema:name Biological Sciences
102 rdf:type schema:DefinedTerm
103 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
104 schema:name Genetics
105 rdf:type schema:DefinedTerm
106 sg:person.01017610620.16 schema:affiliation grid-institutes:grid.63054.34
107 schema:familyName Măndoiu
108 schema:givenName I. I.
109 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01017610620.16
110 rdf:type schema:Person
111 sg:person.012761271003.11 schema:affiliation grid-institutes:grid.63054.34
112 schema:familyName Shvartsman
113 schema:givenName A. A.
114 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012761271003.11
115 rdf:type schema:Person
116 sg:person.015370702377.45 schema:affiliation grid-institutes:grid.63054.34
117 schema:familyName Konwar
118 schema:givenName K. M.
119 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015370702377.45
120 rdf:type schema:Person
121 sg:person.0763403270.10 schema:affiliation grid-institutes:grid.185648.6
122 schema:familyName DasGupta
123 schema:givenName B.
124 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0763403270.10
125 rdf:type schema:Person
126 grid-institutes:grid.185648.6 schema:alternateName Department of Computer Science, University of Illinois at Chicago, 60607-7053, Chicago, IL
127 schema:name Department of Computer Science, University of Illinois at Chicago, 60607-7053, Chicago, IL
128 rdf:type schema:Organization
129 grid-institutes:grid.63054.34 schema:alternateName Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Rd., 06269-2155, Storrs, CT
130 schema:name Computer Science and Engineering Department, University of Connecticut, 371 Fairfield Rd., 06269-2155, Storrs, CT
131 rdf:type schema:Organization
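
In the table above, schema:author points to a blank node, and the author order is encoded as an RDF list: each cell has an rdf:first (the item) and an rdf:rest (the next cell), ending at rdf:nil. The Python sketch below walks that chain using the cell identifiers copied from the table. The dictionary layout and the function name walk_rdf_list are assumptions made for illustration; an RDF library would normally do this step.

```python
RDF_NIL = "rdf:nil"

# List cells copied from the table: cell -> (rdf:first, rdf:rest).
AUTHOR_CELLS = {
    "N0d2eeac2bea241b4ba148b603bd8eea8": (
        "sg:person.0763403270.10", "Ne4b7103f30c945489310a0d621ccd4c5"),
    "Ne4b7103f30c945489310a0d621ccd4c5": (
        "sg:person.015370702377.45", "N6f4b258daade49a7bc19a3de0dd5ad96"),
    "N6f4b258daade49a7bc19a3de0dd5ad96": (
        "sg:person.01017610620.16", "N68df7bfad9924a9180edd05f366521a7"),
    "N68df7bfad9924a9180edd05f366521a7": (
        "sg:person.012761271003.11", RDF_NIL),
}

# schema:familyName values for the same person identifiers, from the table.
FAMILY_NAMES = {
    "sg:person.0763403270.10": "DasGupta",
    "sg:person.015370702377.45": "Konwar",
    "sg:person.01017610620.16": "Măndoiu",
    "sg:person.012761271003.11": "Shvartsman",
}


def walk_rdf_list(head, cells):
    """Follow rdf:rest links from `head` and collect each rdf:first value."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


# The head cell is the object of schema:author in the table.
authors = walk_rdf_list("N0d2eeac2bea241b4ba148b603bd8eea8", AUTHOR_CELLS)
print([FAMILY_NAMES[a] for a in authors])
```

The editor list hanging off schema:editor uses the same first/rest structure and could be walked with the same function.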
 



