A Bayesian Approach to High-Throughput Biological Model Generation


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2009

AUTHORS

Xinghua Shi, Rick Stevens

ABSTRACT

With the availability of hundreds and soon thousands of complete genomes, the construction of genome-scale metabolic models for these organisms has attracted much attention. Manual work still dominates the process of model generation, however, and leads to the huge gap between the number of complete genomes and genome-scale metabolic models. The challenge in constructing genome-scale models from existing databases is that usually such a directly extracted model is incomplete and contains network holes. Network holes occur when a network is disconnected and certain metabolites cannot be produced or consumed. In order to construct a valid metabolic model, network holes need to be filled by introducing candidate reactions into the network. As a step toward the high-throughput generation of biological models, we propose a Bayesian approach to improving draft genome-scale metabolic models. A collection of 23 types of biological and topological evidence is extracted from the SEED [1], KEGG [2], and BiGG [3] databases. Based on this evidence, we create 23 individual predictors using Bayesian approaches. To combine these individual predictors and unify their predictive results, we build an ensemble of individual predictors on majority vote and four classifiers: naive Bayes classifier, Bayesian network, multilayer perceptron network and AdaBoost. A set of experiments is performed to train and test individual predictors and integrative mechanisms of single predictors and to evaluate the performance of our approach.

PAGES

376-387

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-642-00727-9_35

DOI

http://dx.doi.org/10.1007/978-3-642-00727-9_35

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1042680072



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Statistics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department of Computer Science, University of Chicago, IL 60637, Chicago, USA", 
          "id": "http://www.grid.ac/institutes/grid.170205.1", 
          "name": [
            "Department of Computer Science, University of Chicago, IL 60637, Chicago, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Shi", 
        "givenName": "Xinghua", 
        "id": "sg:person.01226144741.88", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01226144741.88"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "The Computing, Environment and Life Science, Argonne National Laboratory, IL 60439, Argonne, USA", 
          "id": "http://www.grid.ac/institutes/grid.187073.a", 
          "name": [
            "Department of Computer Science, University of Chicago, IL 60637, Chicago, USA", 
            "The Computing, Environment and Life Science, Argonne National Laboratory, IL 60439, Argonne, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Stevens", 
        "givenName": "Rick", 
        "id": "sg:person.0707416220.12", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0707416220.12"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2009", 
    "datePublishedReg": "2009-01-01", 
    "description": "With the availability of hundreds and soon thousands of complete genomes, the construction of genome-scale metabolic models for these organisms has attracted much attention. Manual work still dominates the process of model generation, however, and leads to the huge gap between the number of complete genomes and genome-scale metabolic models. The challenge in constructing genome-scale models from existing databases is that usually such a directly extracted model is incomplete and contains network holes. Network holes occur when a network is disconnected and certain metabolites cannot be produced or consumed. In order to construct a valid metabolic model, network holes need to be filled by introducing candidate reactions into the network. As a step toward the high-throughput generation of biological models, we propose a Bayesian approach to improving draft genome-scale metabolic models. A collection of 23 types of biological and topological evidence is extracted from the SEED [1], KEGG [2], and BiGG [3] databases. Based on this evidence, we create 23 individual predictors using Bayesian approaches. To combine these individual predictors and unify their predictive results, we build an ensemble of individual predictors on majority vote and four classifiers: naive Bayes classifier, Bayesian network, multilayer perceptron network and AdaBoost. A set of experiments is performed to train and test individual predictors and integrative mechanisms of single predictors and to evaluate the performance of our approach.", 
    "editor": [
      {
        "familyName": "Rajasekaran", 
        "givenName": "Sanguthevar", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-642-00727-9_35", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-642-00726-2", 
        "978-3-642-00727-9"
      ], 
      "name": "Bioinformatics and Computational Biology", 
      "type": "Book"
    }, 
    "keywords": [
      "Bayesian approach", 
      "genome-scale metabolic model", 
      "model generation", 
      "Bayesian networks", 
      "multilayer perceptron network", 
      "perceptron network", 
      "metabolic model", 
      "genome-scale models", 
      "biological models", 
      "network holes", 
      "network", 
      "topological evidence", 
      "predictive results", 
      "model", 
      "approach", 
      "Naive Bayes classifier", 
      "individual predictors", 
      "ensemble", 
      "Bayes classifier", 
      "set", 
      "high-throughput generation", 
      "majority vote", 
      "construction", 
      "set of experiments", 
      "classifier", 
      "performance", 
      "generation", 
      "number", 
      "order", 
      "step", 
      "availability of hundreds", 
      "thousands", 
      "work", 
      "huge gap", 
      "results", 
      "AdaBoost", 
      "process", 
      "holes", 
      "experiments", 
      "hundreds", 
      "collection", 
      "types", 
      "single predictor", 
      "attention", 
      "challenges", 
      "availability", 
      "complete genome", 
      "gap", 
      "predictors", 
      "database", 
      "manual work", 
      "KEGG", 
      "vote", 
      "genome", 
      "integrative mechanisms", 
      "mechanism", 
      "organisms", 
      "candidate reactions", 
      "evidence", 
      "valid metabolic model", 
      "reaction", 
      "BiGG database", 
      "certain metabolites", 
      "seeds", 
      "metabolites", 
      "draft genome-scale metabolic models", 
      "High-Throughput Biological Model Generation", 
      "Biological Model Generation"
    ], 
    "name": "A Bayesian Approach to High-Throughput Biological Model Generation", 
    "pagination": "376-387", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1042680072"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-642-00727-9_35"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-642-00727-9_35", 
      "https://app.dimensions.ai/details/publication/pub.1042680072"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2021-11-01T19:03", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20211101/entities/gbq_results/chapter/chapter_7.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-642-00727-9_35"
  }
]
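
As a minimal sketch of how a record like the one above might be consumed, the following Python snippet parses a trimmed copy of the JSON-LD and pulls out a few citation fields. The field names come from the record itself; the helper name `summarize` and the decision to keep only a subset of fields are illustrative assumptions, not part of SciGraph.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Shi", "givenName": "Xinghua", "type": "Person"},
      {"familyName": "Stevens", "givenName": "Rick", "type": "Person"}
    ],
    "datePublished": "2009",
    "name": "A Bayesian Approach to High-Throughput Biological Model Generation",
    "pagination": "376-387",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1042680072"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-642-00727-9_35"]}
    ],
    "type": "Chapter"
  }
]
"""


def summarize(record: dict) -> dict:
    """Collect a few citation fields from one SciGraph-style JSON-LD record.

    Hypothetical helper written for this example.
    """
    authors = [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]
    # productId is a list of name/value pairs; index it by name for lookup.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    return {
        "title": record["name"],
        "authors": authors,
        "year": record.get("datePublished"),
        "pages": record.get("pagination"),
        "doi": ids.get("doi"),
    }


# The top-level JSON value is a list holding a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary)
```

Because JSON-LD is plain JSON, the standard `json` module is enough here; a dedicated JSON-LD processor would only be needed to expand the `@context` into full IRIs.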
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-00727-9_35'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-00727-9_35'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-00727-9_35'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-00727-9_35'
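
The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The URL and media types below are copied from the commands above; the short format labels and the helper names `build_request` and `fetch` are assumptions made for this example, and the sketch assumes the endpoint is still reachable.

```python
import urllib.request

# URL and media types are copied from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-642-00727-9_35"

# The short labels on the left are hypothetical names chosen for this sketch.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> urllib.request.Request:
    """Return a GET request whose Accept header selects the given RDF format."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt: str, timeout: float = 10.0) -> str:
    """Send the request and decode the body as UTF-8 (performs network I/O)."""
    with urllib.request.urlopen(build_request(fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building a request does not touch the network; call fetch("turtle") to download.
request = build_request("turtle")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Keeping request construction separate from the network call makes the header logic easy to check offline.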


 

This table displays all metadata directly associated with this object as RDF triples.

139 TRIPLES      23 PREDICATES      94 URIs      87 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-642-00727-9_35 schema:about anzsrc-for:01
2 anzsrc-for:0104
3 schema:author Nd30b34c54c6c42d9ae12ad99483e8c4f
4 schema:datePublished 2009
5 schema:datePublishedReg 2009-01-01
6 schema:description With the availability of hundreds and soon thousands of complete genomes, the construction of genome-scale metabolic models for these organisms has attracted much attention. Manual work still dominates the process of model generation, however, and leads to the huge gap between the number of complete genomes and genome-scale metabolic models. The challenge in constructing genome-scale models from existing databases is that usually such a directly extracted model is incomplete and contains network holes. Network holes occur when a network is disconnected and certain metabolites cannot be produced or consumed. In order to construct a valid metabolic model, network holes need to be filled by introducing candidate reactions into the network. As a step toward the high-throughput generation of biological models, we propose a Bayesian approach to improving draft genome-scale metabolic models. A collection of 23 types of biological and topological evidence is extracted from the SEED [1], KEGG [2], and BiGG [3] databases. Based on this evidence, we create 23 individual predictors using Bayesian approaches. To combine these individual predictors and unify their predictive results, we build an ensemble of individual predictors on majority vote and four classifiers: naive Bayes classifier, Bayesian network, multilayer perceptron network and AdaBoost. A set of experiments is performed to train and test individual predictors and integrative mechanisms of single predictors and to evaluate the performance of our approach.
7 schema:editor Nd9dd5533fec645eabe04805545d5abb5
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree true
11 schema:isPartOf Ne85de69dde4247cc87c134f7287fe1e4
12 schema:keywords AdaBoost
13 Bayes classifier
14 Bayesian approach
15 Bayesian networks
16 BiGG database
17 Biological Model Generation
18 High-Throughput Biological Model Generation
19 KEGG
20 Naive Bayes classifier
21 approach
22 attention
23 availability
24 availability of hundreds
25 biological models
26 candidate reactions
27 certain metabolites
28 challenges
29 classifier
30 collection
31 complete genome
32 construction
33 database
34 draft genome-scale metabolic models
35 ensemble
36 evidence
37 experiments
38 gap
39 generation
40 genome
41 genome-scale metabolic model
42 genome-scale models
43 high-throughput generation
44 holes
45 huge gap
46 hundreds
47 individual predictors
48 integrative mechanisms
49 majority vote
50 manual work
51 mechanism
52 metabolic model
53 metabolites
54 model
55 model generation
56 multilayer perceptron network
57 network
58 network holes
59 number
60 order
61 organisms
62 perceptron network
63 performance
64 predictive results
65 predictors
66 process
67 reaction
68 results
69 seeds
70 set
71 set of experiments
72 single predictor
73 step
74 thousands
75 topological evidence
76 types
77 valid metabolic model
78 vote
79 work
80 schema:name A Bayesian Approach to High-Throughput Biological Model Generation
81 schema:pagination 376-387
82 schema:productId Nb1e08ff7fd5742b3a14c803c4ada0bc7
83 Nca523b0bf56b4cf7acf38048fd0242d2
84 schema:publisher N914a66dc64b44a31b92033538dcd1b7b
85 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042680072
86 https://doi.org/10.1007/978-3-642-00727-9_35
87 schema:sdDatePublished 2021-11-01T19:03
88 schema:sdLicense https://scigraph.springernature.com/explorer/license/
89 schema:sdPublisher Nee71c49485f14654962e8c7d4245204c
90 schema:url https://doi.org/10.1007/978-3-642-00727-9_35
91 sgo:license sg:explorer/license/
92 sgo:sdDataset chapters
93 rdf:type schema:Chapter
94 N34ec8369b8fe40a8aaa529feb26a3b3a rdf:first sg:person.0707416220.12
95 rdf:rest rdf:nil
96 N41c067b5b406466599fd587060d787e1 schema:familyName Rajasekaran
97 schema:givenName Sanguthevar
98 rdf:type schema:Person
99 N914a66dc64b44a31b92033538dcd1b7b schema:name Springer Nature
100 rdf:type schema:Organisation
101 Nb1e08ff7fd5742b3a14c803c4ada0bc7 schema:name dimensions_id
102 schema:value pub.1042680072
103 rdf:type schema:PropertyValue
104 Nca523b0bf56b4cf7acf38048fd0242d2 schema:name doi
105 schema:value 10.1007/978-3-642-00727-9_35
106 rdf:type schema:PropertyValue
107 Nd30b34c54c6c42d9ae12ad99483e8c4f rdf:first sg:person.01226144741.88
108 rdf:rest N34ec8369b8fe40a8aaa529feb26a3b3a
109 Nd9dd5533fec645eabe04805545d5abb5 rdf:first N41c067b5b406466599fd587060d787e1
110 rdf:rest rdf:nil
111 Ne85de69dde4247cc87c134f7287fe1e4 schema:isbn 978-3-642-00726-2
112 978-3-642-00727-9
113 schema:name Bioinformatics and Computational Biology
114 rdf:type schema:Book
115 Nee71c49485f14654962e8c7d4245204c schema:name Springer Nature - SN SciGraph project
116 rdf:type schema:Organization
117 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
118 schema:name Mathematical Sciences
119 rdf:type schema:DefinedTerm
120 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
121 schema:name Statistics
122 rdf:type schema:DefinedTerm
123 sg:person.01226144741.88 schema:affiliation grid-institutes:grid.170205.1
124 schema:familyName Shi
125 schema:givenName Xinghua
126 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01226144741.88
127 rdf:type schema:Person
128 sg:person.0707416220.12 schema:affiliation grid-institutes:grid.187073.a
129 schema:familyName Stevens
130 schema:givenName Rick
131 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0707416220.12
132 rdf:type schema:Person
133 grid-institutes:grid.170205.1 schema:alternateName Department of Computer Science, University of Chicago, IL 60637, Chicago, USA
134 schema:name Department of Computer Science, University of Chicago, IL 60637, Chicago, USA
135 rdf:type schema:Organization
136 grid-institutes:grid.187073.a schema:alternateName The Computing, Environment and Life Science, Argonne National Laboratory, IL 60439, Argonne, USA
137 schema:name Department of Computer Science, University of Chicago, IL 60637, Chicago, USA
138 The Computing, Environment and Life Science, Argonne National Laboratory, IL 60439, Argonne, USA
139 rdf:type schema:Organization
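
Summary counts like those shown above the table (triples, predicates, URIs, literals, blank nodes) could be computed from an N-Triples serialization roughly as follows. This is a sketch under stated assumptions: the page does not say how it counts, so the code counts distinct terms across all positions; the six sample triples are a hand-written excerpt using the standard `schema.org` and `rdf` namespaces, with `_:b0` standing in for the publisher blank node; and the regular expression covers only the simple term forms used here, not the full N-Triples grammar.

```python
import re

# One RDF term: an IRI in angle brackets, a blank node label, or a quoted
# literal with an optional datatype or language tag. Simplified on purpose.
TERM = r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
TRIPLE = re.compile(rf'^\s*{TERM}\s+{TERM}\s+{TERM}\s*\.\s*$')

# Hand-written excerpt for illustration; "_:b0" is a made-up blank node label.
SAMPLE = '''
<http://scigraph.springernature.com/pub.10.1007/978-3-642-00727-9_35> <http://schema.org/name> "A Bayesian Approach to High-Throughput Biological Model Generation" .
<http://scigraph.springernature.com/pub.10.1007/978-3-642-00727-9_35> <http://schema.org/pagination> "376-387" .
<http://scigraph.springernature.com/pub.10.1007/978-3-642-00727-9_35> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Chapter> .
<http://scigraph.springernature.com/pub.10.1007/978-3-642-00727-9_35> <http://schema.org/publisher> _:b0 .
_:b0 <http://schema.org/name> "Springer Nature" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organisation> .
'''


def summarize_triples(ntriples: str) -> dict:
    """Count triples and distinct predicates, IRIs, literals, and blank nodes."""
    predicates, iris, literals, blanks = set(), set(), set(), set()
    count = 0
    for line in ntriples.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"not a valid triple line: {line!r}")
        count += 1
        predicates.add(match.group(2))
        # Classify subject, predicate, and object by their leading characters.
        for term in match.groups():
            if term.startswith("<"):
                iris.add(term)
            elif term.startswith("_:"):
                blanks.add(term)
            else:
                literals.add(term)
    return {
        "triples": count,
        "predicates": len(predicates),
        "uris": len(iris),
        "literals": len(literals),
        "blank_nodes": len(blanks),
    }


stats = summarize_triples(SAMPLE)
print(stats)
```

For real data, a full RDF library would be a safer choice than a regular expression, since N-Triples allows escapes and term forms that this pattern does not handle.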
 



