Prodigal: prokaryotic gene recognition and translation initiation site identification


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2010-03-08

AUTHORS

Doug Hyatt, Gwo-Liang Chen, Philip F LoCascio, Miriam L Land, Frank W Larimer, Loren J Hauser

ABSTRACT

Background: The quality of automated gene prediction in microbial organisms has improved steadily over the past decade, but there is still room for improvement. Increasing the number of correct identifications, both of genes and of the translation initiation sites for each gene, and reducing the overall number of false positives, are all desirable goals.

Results: With our years of experience in manually curating genomes for the Joint Genome Institute, we developed a new gene prediction algorithm called Prodigal (PROkaryotic DYnamic programming Gene-finding ALgorithm). With Prodigal, we focused specifically on the three goals of improved gene structure prediction, improved translation initiation site recognition, and reduced false positives. We compared the results of Prodigal to existing gene-finding methods to demonstrate that it met each of these objectives.

Conclusion: We built a fast, lightweight, open source gene prediction program called Prodigal (http://compbio.ornl.gov/prodigal/). Prodigal achieved good results compared to existing methods, and we believe it will be a valuable asset to automated microbial annotation pipelines.

PAGES

119

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1471-2105-11-119

DOI

http://dx.doi.org/10.1186/1471-2105-11-119

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1026423599

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/20211023



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Algorithms", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Databases, Genetic", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genome, Bacterial", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Peptide Chain Initiation, Translational", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Prokaryotic Cells", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Software", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Genome Science and Technology Graduate School, The University of Tennessee, 37996, Knoxville, TN, USA", 
          "id": "http://www.grid.ac/institutes/grid.411461.7", 
          "name": [
            "Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA", 
            "Genome Science and Technology Graduate School, The University of Tennessee, 37996, Knoxville, TN, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hyatt", 
        "givenName": "Doug", 
        "id": "sg:person.0622320143.93", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0622320143.93"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA", 
          "id": "http://www.grid.ac/institutes/grid.135519.a", 
          "name": [
            "Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chen", 
        "givenName": "Gwo-Liang", 
        "id": "sg:person.01066216246.51", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01066216246.51"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA", 
          "id": "http://www.grid.ac/institutes/grid.135519.a", 
          "name": [
            "Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "LoCascio", 
        "givenName": "Philip F", 
        "id": "sg:person.01133554113.43", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01133554113.43"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "DOE Joint Genome Institute, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA", 
          "id": "http://www.grid.ac/institutes/grid.135519.a", 
          "name": [
            "Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA", 
            "DOE Joint Genome Institute, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Land", 
        "givenName": "Miriam L", 
        "id": "sg:person.01115346474.60", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01115346474.60"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Genome Science and Technology Graduate School, The University of Tennessee, 37996, Knoxville, TN, USA", 
          "id": "http://www.grid.ac/institutes/grid.411461.7", 
          "name": [
            "Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA", 
            "Genome Science and Technology Graduate School, The University of Tennessee, 37996, Knoxville, TN, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Larimer", 
        "givenName": "Frank W", 
        "id": "sg:person.01203541750.08", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01203541750.08"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "DOE Joint Genome Institute, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA", 
          "id": "http://www.grid.ac/institutes/grid.135519.a", 
          "name": [
            "Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA", 
            "DOE Joint Genome Institute, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hauser", 
        "givenName": "Loren J", 
        "id": "sg:person.01200425336.30", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01200425336.30"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1186/1471-2105-4-21", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013695495", 
          "https://doi.org/10.1186/1471-2105-4-21"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-8-97", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040156671", 
          "https://doi.org/10.1186/1471-2105-8-97"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2010-03-08", 
    "datePublishedReg": "2010-03-08", 
    "description": "BackgroundThe quality of automated gene prediction in microbial organisms has improved steadily over the past decade, but there is still room for improvement. Increasing the number of correct identifications, both of genes and of the translation initiation sites for each gene, and reducing the overall number of false positives, are all desirable goals.ResultsWith our years of experience in manually curating genomes for the Joint Genome Institute, we developed a new gene prediction algorithm called Prodigal (PROkaryotic DYnamic programming Gene-finding ALgorithm). With Prodigal, we focused specifically on the three goals of improved gene structure prediction, improved translation initiation site recognition, and reduced false positives. We compared the results of Prodigal to existing gene-finding methods to demonstrate that it met each of these objectives.ConclusionWe built a fast, lightweight, open source gene prediction program called Prodigal http://compbio.ornl.gov/prodigal/. Prodigal achieved good results compared to existing methods, and we believe it will be a valuable asset to automated microbial annotation pipelines.", 
    "genre": "article", 
    "id": "sg:pub.10.1186/1471-2105-11-119", 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1023786", 
        "issn": [
          "1471-2105"
        ], 
        "name": "BMC Bioinformatics", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "11"
      }
    ], 
    "keywords": [
      "Joint Genome Institute", 
      "gene prediction algorithms", 
      "gene prediction programs", 
      "gene-finding methods", 
      "gene structure prediction", 
      "translation initiation site", 
      "Genome Institute", 
      "gene prediction", 
      "annotation pipeline", 
      "microbial organisms", 
      "gene recognition", 
      "site recognition", 
      "initiation site", 
      "structure prediction", 
      "prediction programs", 
      "genes", 
      "site identification", 
      "genome", 
      "translation initiation site recognition", 
      "organisms", 
      "correct identification", 
      "identification", 
      "Prodigal", 
      "false positives", 
      "prediction algorithm", 
      "sites", 
      "past decade", 
      "overall number", 
      "number", 
      "recognition", 
      "pipeline", 
      "results", 
      "positives", 
      "desirable goal", 
      "prediction", 
      "decades", 
      "valuable asset", 
      "ConclusionWe", 
      "goal", 
      "program", 
      "years", 
      "method", 
      "quality", 
      "objective", 
      "ResultsWith", 
      "improvement", 
      "Institute", 
      "better results", 
      "algorithm", 
      "room", 
      "years of experience", 
      "assets", 
      "experience"
    ], 
    "name": "Prodigal: prokaryotic gene recognition and translation initiation site identification", 
    "pagination": "119", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1026423599"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1471-2105-11-119"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "20211023"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1471-2105-11-119", 
      "https://app.dimensions.ai/details/publication/pub.1026423599"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-12-01T06:28", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/article/article_499.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1186/1471-2105-11-119"
  }
]
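
As a rough illustration of how the record above can be consumed, the sketch below reads a trimmed copy of it with Python's standard `json` module and pulls out the title, date, authors, and identifiers. The helper name `summarize` and the shape of its return value are assumptions made for this example, not part of SciGraph, and only two of the six authors are kept in the trimmed copy to keep it short.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept,
# and the author list is shortened to its first two entries.
RECORD = json.loads("""
[
  {
    "author": [
      {"familyName": "Hyatt", "givenName": "Doug"},
      {"familyName": "Chen", "givenName": "Gwo-Liang"}
    ],
    "datePublished": "2010-03-08",
    "name": "Prodigal: prokaryotic gene recognition and translation initiation site identification",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1026423599"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1471-2105-11-119"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["20211023"]}
    ]
  }
]
""")


def summarize(record):
    """Hypothetical helper: collect title, date, authors, and identifiers."""
    article = record[0]  # the top-level JSON-LD array holds a single article
    authors = [f"{a['givenName']} {a['familyName']}" for a in article["author"]]
    # productId is a list of PropertyValue objects; map each name to its first value
    ids = {p["name"]: p["value"][0] for p in article["productId"]}
    return {
        "title": article["name"],
        "date": article["datePublished"],
        "authors": authors,
        "ids": ids,
    }


summary = summarize(RECORD)
print(summary["ids"]["doi"], summary["authors"])
```

Because JSON-LD is plain JSON, no RDF library is needed for simple field access like this; a dedicated JSON-LD processor would only be required to expand the `@context` into full URIs.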
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-119'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-119'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-119'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-119'
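
The four curl commands differ only in their `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is a minimal sketch under stated assumptions: the short format keys (`json-ld`, `nt`, `turtle`, `xml`) and the function names `build_request` and `fetch` are made up for this example, and `fetch` needs network access and a reachable SciGraph endpoint.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Accept header values taken from the curl commands above;
# the dictionary keys are hypothetical short names.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build (but do not send) a GET request for one record in the given format."""
    return urllib.request.Request(BASE_URL + record_id, headers={"Accept": ACCEPT[fmt]})


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body as text (requires network)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request is safe offline; only fetch() touches the network.
req = build_request("pub.10.1186/1471-2105-11-119", "turtle")
print(req.full_url, req.get_header("Accept"))
```

Keeping request construction separate from sending makes it possible to inspect the URL and header without a live connection.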


 

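This table displays all metadata directly associated with this object as RDF triples. Each row is numbered; a row that omits the subject or predicate reuses the one from the preceding row.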

186 TRIPLES      21 PREDICATES      86 URIs      76 LITERALS      13 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1471-2105-11-119 schema:about N54bd8d6241344f329402bff0bac182a0
2 N79f67678d2bd4aa4895181140539c97d
3 N7e9bb690b044488c97d870a4654c92ea
4 N95f12b96854b42568bb0788ed6fbac49
5 N98254d568bd8451d95b2a57610202ea9
6 N9ad041748a96490b9e5f54752807f965
7 anzsrc-for:06
8 anzsrc-for:0604
9 schema:author N219afb083fd041d9b45cfae7f551d69d
10 schema:citation sg:pub.10.1186/1471-2105-4-21
11 sg:pub.10.1186/1471-2105-8-97
12 schema:datePublished 2010-03-08
13 schema:datePublishedReg 2010-03-08
14 schema:description BackgroundThe quality of automated gene prediction in microbial organisms has improved steadily over the past decade, but there is still room for improvement. Increasing the number of correct identifications, both of genes and of the translation initiation sites for each gene, and reducing the overall number of false positives, are all desirable goals.ResultsWith our years of experience in manually curating genomes for the Joint Genome Institute, we developed a new gene prediction algorithm called Prodigal (PROkaryotic DYnamic programming Gene-finding ALgorithm). With Prodigal, we focused specifically on the three goals of improved gene structure prediction, improved translation initiation site recognition, and reduced false positives. We compared the results of Prodigal to existing gene-finding methods to demonstrate that it met each of these objectives.ConclusionWe built a fast, lightweight, open source gene prediction program called Prodigal http://compbio.ornl.gov/prodigal/. Prodigal achieved good results compared to existing methods, and we believe it will be a valuable asset to automated microbial annotation pipelines.
15 schema:genre article
16 schema:isAccessibleForFree true
17 schema:isPartOf N40a7387fe497499a8de50072a4d7013c
18 Nb4b9f268a1674ccb9a56eca06c64c4ef
19 sg:journal.1023786
20 schema:keywords ConclusionWe
21 Genome Institute
22 Institute
23 Joint Genome Institute
24 Prodigal
25 ResultsWith
26 algorithm
27 annotation pipeline
28 assets
29 better results
30 correct identification
31 decades
32 desirable goal
33 experience
34 false positives
35 gene prediction
36 gene prediction algorithms
37 gene prediction programs
38 gene recognition
39 gene structure prediction
40 gene-finding methods
41 genes
42 genome
43 goal
44 identification
45 improvement
46 initiation site
47 method
48 microbial organisms
49 number
50 objective
51 organisms
52 overall number
53 past decade
54 pipeline
55 positives
56 prediction
57 prediction algorithm
58 prediction programs
59 program
60 quality
61 recognition
62 results
63 room
64 site identification
65 site recognition
66 sites
67 structure prediction
68 translation initiation site
69 translation initiation site recognition
70 valuable asset
71 years
72 years of experience
73 schema:name Prodigal: prokaryotic gene recognition and translation initiation site identification
74 schema:pagination 119
75 schema:productId N0f34305990694167b8af01b3194ff823
76 N6f880fcd23c54323b1832f21b84f5bf3
77 Ndf9e0e7412774082b1e14c3bbbfacd78
78 schema:sameAs https://app.dimensions.ai/details/publication/pub.1026423599
79 https://doi.org/10.1186/1471-2105-11-119
80 schema:sdDatePublished 2022-12-01T06:28
81 schema:sdLicense https://scigraph.springernature.com/explorer/license/
82 schema:sdPublisher N1057f060d9cb461bb8aad49e35aa1e4a
83 schema:url https://doi.org/10.1186/1471-2105-11-119
84 sgo:license sg:explorer/license/
85 sgo:sdDataset articles
86 rdf:type schema:ScholarlyArticle
87 N0f34305990694167b8af01b3194ff823 schema:name doi
88 schema:value 10.1186/1471-2105-11-119
89 rdf:type schema:PropertyValue
90 N1057f060d9cb461bb8aad49e35aa1e4a schema:name Springer Nature - SN SciGraph project
91 rdf:type schema:Organization
92 N1c11360fc9a944fa9e556423f4b8a96a rdf:first sg:person.01133554113.43
93 rdf:rest Nc7c3de49a9c748de8e5e7c0a6039789f
94 N219afb083fd041d9b45cfae7f551d69d rdf:first sg:person.0622320143.93
95 rdf:rest N36e6a8ff55d14a90a58535b6b5ae9147
96 N2e10170fa106474ea6ceb2d7f14757d1 rdf:first sg:person.01203541750.08
97 rdf:rest N32859869266742509ab09c5232267160
98 N32859869266742509ab09c5232267160 rdf:first sg:person.01200425336.30
99 rdf:rest rdf:nil
100 N36e6a8ff55d14a90a58535b6b5ae9147 rdf:first sg:person.01066216246.51
101 rdf:rest N1c11360fc9a944fa9e556423f4b8a96a
102 N40a7387fe497499a8de50072a4d7013c schema:volumeNumber 11
103 rdf:type schema:PublicationVolume
104 N54bd8d6241344f329402bff0bac182a0 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
105 schema:name Algorithms
106 rdf:type schema:DefinedTerm
107 N6f880fcd23c54323b1832f21b84f5bf3 schema:name dimensions_id
108 schema:value pub.1026423599
109 rdf:type schema:PropertyValue
110 N79f67678d2bd4aa4895181140539c97d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
111 schema:name Prokaryotic Cells
112 rdf:type schema:DefinedTerm
113 N7e9bb690b044488c97d870a4654c92ea schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
114 schema:name Genome, Bacterial
115 rdf:type schema:DefinedTerm
116 N95f12b96854b42568bb0788ed6fbac49 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
117 schema:name Databases, Genetic
118 rdf:type schema:DefinedTerm
119 N98254d568bd8451d95b2a57610202ea9 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
120 schema:name Software
121 rdf:type schema:DefinedTerm
122 N9ad041748a96490b9e5f54752807f965 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
123 schema:name Peptide Chain Initiation, Translational
124 rdf:type schema:DefinedTerm
125 Nb4b9f268a1674ccb9a56eca06c64c4ef schema:issueNumber 1
126 rdf:type schema:PublicationIssue
127 Nc7c3de49a9c748de8e5e7c0a6039789f rdf:first sg:person.01115346474.60
128 rdf:rest N2e10170fa106474ea6ceb2d7f14757d1
129 Ndf9e0e7412774082b1e14c3bbbfacd78 schema:name pubmed_id
130 schema:value 20211023
131 rdf:type schema:PropertyValue
132 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
133 schema:name Biological Sciences
134 rdf:type schema:DefinedTerm
135 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
136 schema:name Genetics
137 rdf:type schema:DefinedTerm
138 sg:journal.1023786 schema:issn 1471-2105
139 schema:name BMC Bioinformatics
140 schema:publisher Springer Nature
141 rdf:type schema:Periodical
142 sg:person.01066216246.51 schema:affiliation grid-institutes:grid.135519.a
143 schema:familyName Chen
144 schema:givenName Gwo-Liang
145 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01066216246.51
146 rdf:type schema:Person
147 sg:person.01115346474.60 schema:affiliation grid-institutes:grid.135519.a
148 schema:familyName Land
149 schema:givenName Miriam L
150 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01115346474.60
151 rdf:type schema:Person
152 sg:person.01133554113.43 schema:affiliation grid-institutes:grid.135519.a
153 schema:familyName LoCascio
154 schema:givenName Philip F
155 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01133554113.43
156 rdf:type schema:Person
157 sg:person.01200425336.30 schema:affiliation grid-institutes:grid.135519.a
158 schema:familyName Hauser
159 schema:givenName Loren J
160 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01200425336.30
161 rdf:type schema:Person
162 sg:person.01203541750.08 schema:affiliation grid-institutes:grid.411461.7
163 schema:familyName Larimer
164 schema:givenName Frank W
165 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01203541750.08
166 rdf:type schema:Person
167 sg:person.0622320143.93 schema:affiliation grid-institutes:grid.411461.7
168 schema:familyName Hyatt
169 schema:givenName Doug
170 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0622320143.93
171 rdf:type schema:Person
172 sg:pub.10.1186/1471-2105-4-21 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013695495
173 https://doi.org/10.1186/1471-2105-4-21
174 rdf:type schema:CreativeWork
175 sg:pub.10.1186/1471-2105-8-97 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040156671
176 https://doi.org/10.1186/1471-2105-8-97
177 rdf:type schema:CreativeWork
178 grid-institutes:grid.135519.a schema:alternateName Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA
179 DOE Joint Genome Institute, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA
180 schema:name Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA
181 DOE Joint Genome Institute, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA
182 rdf:type schema:Organization
183 grid-institutes:grid.411461.7 schema:alternateName Genome Science and Technology Graduate School, The University of Tennessee, 37996, Knoxville, TN, USA
184 schema:name Computational Biology and Bioinformatics Group, Oak Ridge National Laboratory, 37831, Oak Ridge, TN, USA
185 Genome Science and Technology Graduate School, The University of Tennessee, 37996, Knoxville, TN, USA
186 rdf:type schema:Organization
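
The author order is not stored as a flat list in the triples: `schema:author` points at a blank node, and each blank node carries one `rdf:first` (an author) and one `rdf:rest` (the next node, or `rdf:nil` at the end). The sketch below walks that chain using the node identifiers and family names copied from the table; the function name `walk_list` and the dictionary layout are assumptions made for this example.

```python
# Blank-node rows copied from the table: node -> (rdf:first, rdf:rest)
CELLS = {
    "N219afb083fd041d9b45cfae7f551d69d": ("sg:person.0622320143.93", "N36e6a8ff55d14a90a58535b6b5ae9147"),
    "N36e6a8ff55d14a90a58535b6b5ae9147": ("sg:person.01066216246.51", "N1c11360fc9a944fa9e556423f4b8a96a"),
    "N1c11360fc9a944fa9e556423f4b8a96a": ("sg:person.01133554113.43", "Nc7c3de49a9c748de8e5e7c0a6039789f"),
    "Nc7c3de49a9c748de8e5e7c0a6039789f": ("sg:person.01115346474.60", "N2e10170fa106474ea6ceb2d7f14757d1"),
    "N2e10170fa106474ea6ceb2d7f14757d1": ("sg:person.01203541750.08", "N32859869266742509ab09c5232267160"),
    "N32859869266742509ab09c5232267160": ("sg:person.01200425336.30", "rdf:nil"),
}

# schema:familyName values for each person, also copied from the table.
FAMILY_NAME = {
    "sg:person.0622320143.93": "Hyatt",
    "sg:person.01066216246.51": "Chen",
    "sg:person.01133554113.43": "LoCascio",
    "sg:person.01115346474.60": "Land",
    "sg:person.01203541750.08": "Larimer",
    "sg:person.01200425336.30": "Hauser",
}

# Object of the schema:author triple for this article (row 9).
HEAD = "N219afb083fd041d9b45cfae7f551d69d"


def walk_list(head, cells):
    """Follow rdf:first / rdf:rest links from head until rdf:nil."""
    items, seen, node = [], set(), head
    while node != "rdf:nil":
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


ordered_authors = [FAMILY_NAME[p] for p in walk_list(HEAD, CELLS)]
print(ordered_authors)
```

The resulting order matches the AUTHORS field at the top of the record, which is the point of the linked-list encoding: plain RDF triples are unordered, so sequence has to be expressed through `rdf:rest` links.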
 



