Gene finding in novel genomes


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2004-05-14

AUTHORS

Ian Korf

ABSTRACT

Background: Computational gene prediction continues to be an important problem, especially for genomes with little experimental data.

Results: I introduce the SNAP gene finder, which has been designed to be easily adaptable to a variety of genomes. In novel genomes without an appropriate gene finder, I demonstrate that employing a foreign gene finder can produce highly inaccurate results, and that the most compatible parameters may not come from the nearest phylogenetic neighbor. I find that foreign gene finders are more usefully employed to bootstrap parameter estimation and that the resulting parameters can be highly accurate.

Conclusion: Since gene prediction is sensitive to species-specific parameters, every genome needs a dedicated gene finder.

PAGES

59

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1471-2105-5-59

DOI

http://dx.doi.org/10.1186/1471-2105-5-59

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1042217342

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/15144565



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Animals", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Arabidopsis", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Caenorhabditis elegans", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Computational Biology", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Databases, Genetic", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Drosophila melanogaster", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genes", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genes, Helminth", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genes, Insect", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genes, Plant", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genetic Variation", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genome", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genome, Plant", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Predictive Value of Tests", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Software", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, CB10 1SA, Hinxton, Cambridgeshire, UK", 
          "id": "http://www.grid.ac/institutes/grid.10306.34", 
          "name": [
            "Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, CB10 1SA, Hinxton, Cambridgeshire, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Korf", 
        "givenName": "Ian", 
        "id": "sg:person.01063041236.78", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01063041236.78"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/s003350010039", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1008076660", 
          "https://doi.org/10.1007/s003350010039"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2004-05-14", 
    "datePublishedReg": "2004-05-14", 
    "description": "BackgroundComputational gene prediction continues to be an important problem, especially for genomes with little experimental data.ResultsI introduce the SNAP gene finder which has been designed to be easily adaptable to a variety of genomes. In novel genomes without an appropriate gene finder, I demonstrate that employing a foreign gene finder can produce highly inaccurate results, and that the most compatible parameters may not come from the nearest phylogenetic neighbor. I find that foreign gene finders are more usefully employed to bootstrap parameter estimation and that the resulting parameters can be highly accurate.ConclusionSince gene prediction is sensitive to species-specific parameters, every genome needs a dedicated gene finder.", 
    "genre": "article", 
    "id": "sg:pub.10.1186/1471-2105-5-59", 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1023786", 
        "issn": [
          "1471-2105"
        ], 
        "name": "BMC Bioinformatics", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "5"
      }
    ], 
    "keywords": [
      "gene finders", 
      "gene prediction", 
      "novel genomes", 
      "variety of genomes", 
      "nearest phylogenetic neighbors", 
      "phylogenetic neighbors", 
      "species-specific parameters", 
      "genome", 
      "genes", 
      "little experimental data", 
      "variety", 
      "prediction", 
      "important problem", 
      "data", 
      "finder", 
      "results", 
      "inaccurate results", 
      "parameters", 
      "neighbors", 
      "problem", 
      "experimental data", 
      "compatible parameters", 
      "parameter estimation", 
      "estimation"
    ], 
    "name": "Gene finding in novel genomes", 
    "pagination": "59", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1042217342"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1471-2105-5-59"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "15144565"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1471-2105-5-59", 
      "https://app.dimensions.ai/details/publication/pub.1042217342"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-08-04T16:55", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220804/entities/gbq_results/article/article_388.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1186/1471-2105-5-59"
  }
]
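
Because the record is plain JSON, its identifiers can be read with any JSON parser. The following is a minimal sketch in Python, assuming a hand-trimmed excerpt of the record above; the helper name `product_ids` and the variable `RECORD_EXCERPT` are illustrative and not part of SciGraph.

```python
import json

# Hand-trimmed excerpt of the JSON-LD record shown above (most fields omitted).
RECORD_EXCERPT = """
[
  {
    "id": "sg:pub.10.1186/1471-2105-5-59",
    "name": "Gene finding in novel genomes",
    "datePublished": "2004-05-14",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1042217342"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1471-2105-5-59"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["15144565"]}
    ],
    "type": "ScholarlyArticle"
  }
]
"""


def product_ids(record):
    """Map each productId name to its first value (hypothetical helper)."""
    return {
        entry["name"]: entry["value"][0]
        for entry in record.get("productId", [])
        if entry.get("value")
    }


# The top level of the document is a list holding a single record.
record = json.loads(RECORD_EXCERPT)[0]
ids = product_ids(record)
print(record["name"], ids["doi"], ids["pubmed_id"])
```

The sketch treats the record as ordinary JSON and does not expand the `@context`; a dedicated JSON-LD library would be needed to resolve prefixed identifiers such as `sg:pub...` into full URIs.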
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-5-59'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-5-59'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-5-59'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-5-59'
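
The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The mapping `MEDIA_TYPES` and the helper `build_request` are illustrative names, not part of any SciGraph client; the media types and URL are the ones used in the curl commands above.

```python
from urllib.request import Request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/1471-2105-5-59"

# Format label -> media type, taken from the curl examples above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for the given RDF format."""
    try:
        media_type = MEDIA_TYPES[fmt]
    except KeyError:
        raise ValueError(f"unknown format: {fmt!r}") from None
    return Request(url, headers={"Accept": media_type})


req = build_request("turtle")
print(req.full_url, req.get_header("Accept"))
```

The request is only constructed here, so the sketch runs offline. Passing `req` to `urllib.request.urlopen` would perform the actual fetch; whether the service still answers at this URL is not verified here.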


 

This table displays all metadata directly associated with this object as RDF triples.

148 TRIPLES      21 PREDICATES      65 URIs      56 LITERALS      22 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1471-2105-5-59 schema:about N0025f1c5ed9b4a0c83e3cd8cda0451b8
2 N03903cd18ffb48b983f154accd5de4b2
3 N0f367583fc584870991ba9080501ec53
4 N1d768b8c591746cf8a7a031e60bb2a3a
5 N2032e959284146038771f22a273025f8
6 N20aceae22913499ca1320397634f65a0
7 N2ce6ae13aa0d4702a8a272ccf71b1935
8 N3429724df6c34b60875071bd7a041f47
9 N57de2fbb666e4d3bbf1e1e54eccc3015
10 N6704af28fceb49cc98fb6901912c7c95
11 N8c78bea5fcdc4464a72755f4e0de8a2d
12 N97dd6353ce3549b6bcdd59122316b497
13 Nab8ee228b4504416a85374521ea6b921
14 Ncc6c1b24bf034ecd890dca44105a522e
15 Nf35fd06b9456404594af1ee1c6cd58a5
16 anzsrc-for:06
17 anzsrc-for:0604
18 schema:author Nd6ff08851f2c43d3b64a26b37efa7191
19 schema:citation sg:pub.10.1007/s003350010039
20 schema:datePublished 2004-05-14
21 schema:datePublishedReg 2004-05-14
22 schema:description BackgroundComputational gene prediction continues to be an important problem, especially for genomes with little experimental data.ResultsI introduce the SNAP gene finder which has been designed to be easily adaptable to a variety of genomes. In novel genomes without an appropriate gene finder, I demonstrate that employing a foreign gene finder can produce highly inaccurate results, and that the most compatible parameters may not come from the nearest phylogenetic neighbor. I find that foreign gene finders are more usefully employed to bootstrap parameter estimation and that the resulting parameters can be highly accurate.ConclusionSince gene prediction is sensitive to species-specific parameters, every genome needs a dedicated gene finder.
23 schema:genre article
24 schema:isAccessibleForFree true
25 schema:isPartOf N0ee46c4c0d7f446fb501960fecce0475
26 Nfaf0d305ec7f449e8c1eb6dd584317ff
27 sg:journal.1023786
28 schema:keywords compatible parameters
29 data
30 estimation
31 experimental data
32 finder
33 gene finders
34 gene prediction
35 genes
36 genome
37 important problem
38 inaccurate results
39 little experimental data
40 nearest phylogenetic neighbors
41 neighbors
42 novel genomes
43 parameter estimation
44 parameters
45 phylogenetic neighbors
46 prediction
47 problem
48 results
49 species-specific parameters
50 variety
51 variety of genomes
52 schema:name Gene finding in novel genomes
53 schema:pagination 59
54 schema:productId N41f197d068ef44d6b3571951bd8bcb63
55 N74d6cf4a39f848e681a66eb8196ec782
56 Ne8cf91a4d3e64644bab3476c846a1655
57 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042217342
58 https://doi.org/10.1186/1471-2105-5-59
59 schema:sdDatePublished 2022-08-04T16:55
60 schema:sdLicense https://scigraph.springernature.com/explorer/license/
61 schema:sdPublisher N3a4bc68112474b1fa649e8ce5cebdfd0
62 schema:url https://doi.org/10.1186/1471-2105-5-59
63 sgo:license sg:explorer/license/
64 sgo:sdDataset articles
65 rdf:type schema:ScholarlyArticle
66 N0025f1c5ed9b4a0c83e3cd8cda0451b8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
67 schema:name Genome, Plant
68 rdf:type schema:DefinedTerm
69 N03903cd18ffb48b983f154accd5de4b2 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
70 schema:name Caenorhabditis elegans
71 rdf:type schema:DefinedTerm
72 N0ee46c4c0d7f446fb501960fecce0475 schema:volumeNumber 5
73 rdf:type schema:PublicationVolume
74 N0f367583fc584870991ba9080501ec53 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
75 schema:name Genetic Variation
76 rdf:type schema:DefinedTerm
77 N1d768b8c591746cf8a7a031e60bb2a3a schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
78 schema:name Arabidopsis
79 rdf:type schema:DefinedTerm
80 N2032e959284146038771f22a273025f8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
81 schema:name Drosophila melanogaster
82 rdf:type schema:DefinedTerm
83 N20aceae22913499ca1320397634f65a0 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
84 schema:name Genes, Helminth
85 rdf:type schema:DefinedTerm
86 N2ce6ae13aa0d4702a8a272ccf71b1935 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
87 schema:name Computational Biology
88 rdf:type schema:DefinedTerm
89 N3429724df6c34b60875071bd7a041f47 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
90 schema:name Software
91 rdf:type schema:DefinedTerm
92 N3a4bc68112474b1fa649e8ce5cebdfd0 schema:name Springer Nature - SN SciGraph project
93 rdf:type schema:Organization
94 N41f197d068ef44d6b3571951bd8bcb63 schema:name pubmed_id
95 schema:value 15144565
96 rdf:type schema:PropertyValue
97 N57de2fbb666e4d3bbf1e1e54eccc3015 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
98 schema:name Genes
99 rdf:type schema:DefinedTerm
100 N6704af28fceb49cc98fb6901912c7c95 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
101 schema:name Predictive Value of Tests
102 rdf:type schema:DefinedTerm
103 N74d6cf4a39f848e681a66eb8196ec782 schema:name dimensions_id
104 schema:value pub.1042217342
105 rdf:type schema:PropertyValue
106 N8c78bea5fcdc4464a72755f4e0de8a2d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
107 schema:name Databases, Genetic
108 rdf:type schema:DefinedTerm
109 N97dd6353ce3549b6bcdd59122316b497 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
110 schema:name Genes, Plant
111 rdf:type schema:DefinedTerm
112 Nab8ee228b4504416a85374521ea6b921 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
113 schema:name Genome
114 rdf:type schema:DefinedTerm
115 Ncc6c1b24bf034ecd890dca44105a522e schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
116 schema:name Animals
117 rdf:type schema:DefinedTerm
118 Nd6ff08851f2c43d3b64a26b37efa7191 rdf:first sg:person.01063041236.78
119 rdf:rest rdf:nil
120 Ne8cf91a4d3e64644bab3476c846a1655 schema:name doi
121 schema:value 10.1186/1471-2105-5-59
122 rdf:type schema:PropertyValue
123 Nf35fd06b9456404594af1ee1c6cd58a5 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
124 schema:name Genes, Insect
125 rdf:type schema:DefinedTerm
126 Nfaf0d305ec7f449e8c1eb6dd584317ff schema:issueNumber 1
127 rdf:type schema:PublicationIssue
128 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
129 schema:name Biological Sciences
130 rdf:type schema:DefinedTerm
131 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
132 schema:name Genetics
133 rdf:type schema:DefinedTerm
134 sg:journal.1023786 schema:issn 1471-2105
135 schema:name BMC Bioinformatics
136 schema:publisher Springer Nature
137 rdf:type schema:Periodical
138 sg:person.01063041236.78 schema:affiliation grid-institutes:grid.10306.34
139 schema:familyName Korf
140 schema:givenName Ian
141 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01063041236.78
142 rdf:type schema:Person
143 sg:pub.10.1007/s003350010039 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008076660
144 https://doi.org/10.1007/s003350010039
145 rdf:type schema:CreativeWork
146 grid-institutes:grid.10306.34 schema:alternateName Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, CB10 1SA, Hinxton, Cambridgeshire, UK
147 schema:name Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, CB10 1SA, Hinxton, Cambridgeshire, UK
148 rdf:type schema:Organization
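
Summary counts like those shown above the table (triples, predicates, URIs, literals, blank nodes) can be tallied from an N-Triples serialization. The sketch below is a rough Python illustration that runs on a four-line hand-written sample, not on the real record. The function name `summarize`, the sample data, the simplified regular expression, and the counting convention (distinct terms, with URIs counted in any position) are all assumptions; the page may count differently.

```python
import re

# Hand-written sample in N-Triples syntax (illustrative, not the real record).
SAMPLE = """\
<http://scigraph.springernature.com/pub.10.1186/1471-2105-5-59> <http://schema.org/name> "Gene finding in novel genomes" .
<http://scigraph.springernature.com/pub.10.1186/1471-2105-5-59> <http://schema.org/about> _:b0 .
_:b0 <http://schema.org/name> "Genome" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/DefinedTerm> .
"""

# Simplified pattern: subject, predicate, object, terminating dot.
# It is not a complete N-Triples parser (no escape handling, for instance).
TRIPLE = re.compile(r'^(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')


def summarize(ntriples):
    """Count triples and distinct predicates, URIs, literals, and blank nodes."""
    triples = 0
    predicates, uris, literals, blanks = set(), set(), set(), set()
    for line in ntriples.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"not a triple: {line!r}")
        subject, predicate, obj = match.groups()
        triples += 1
        predicates.add(predicate)
        for term in (subject, predicate, obj):
            if term.startswith("<"):
                uris.add(term)
            elif term.startswith("_:"):
                blanks.add(term)
            else:
                literals.add(term)
    return {
        "triples": triples,
        "predicates": len(predicates),
        "uris": len(uris),
        "literals": len(literals),
        "blank_nodes": len(blanks),
    }


print(summarize(SAMPLE))
```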
 



