Bioinformatics Software for Analyzing Microbial Genomes


Ontology type: schema:MonetaryGrant     


Grant Info

YEARS

2008-2019

FUNDING AMOUNT

2088253.0 USD

ABSTRACT

DESCRIPTION (provided by applicant): This project will support the continued development and maintenance of four bioinformatics software systems that are widely used in research on gene finding and genome annotation. The first of these, Glimmer, is used to find genes in bacteria, viruses, archaea, and simple eukaryotes. Glimmer is highly accurate, finding over 99% of the genes in most bacteria. It has been used by thousands of scientists around the world, including in the majority of published bacterial and archaeal genome sequencing projects over the past decade. Collectively, the three main publications describing Glimmer have been cited over 2,600 times, including 400 citations in 2012 alone. Usage of Glimmer has increased in recent years due to the explosion in next-generation sequencing projects, which are particularly cost-effective for bacterial genomes. Our very recent introduction of a new version of Glimmer customized for metagenomics data is intended to make it available to microbiome researchers. Glimmer's algorithm is also the basis of PhymmBL, a new system for classifying sequences from metagenomics projects, which we will also support under this project. The second system, MUMmer, is a highly efficient system for whole-genome alignment that is widely used to compare bacterial genomes to one another and to compare genome assemblies to detect changes, both large and small. MUMmer and its components, especially Nucmer, have been widely used and have been incorporated in many other systems, including a recent multi-genome aligner, Mugsy, and several genome assembly packages. The three main publications describing MUMmer have been cited over 1,900 times, including 200 citations in 2012. A major reason for the recent increase in usage of these systems, beyond the drop in sequencing costs, is the growth of metagenomics research, particularly the Human Microbiome Project.
This project will also support two other systems, TransTermHP and OperonDB, and the web databases that accompany them. TransTermHP finds transcription terminators in bacterial and archaeal genomes, and we have used it to build a website containing predictions for over 1,500 genomes, all of which are freely downloadable. OperonDB includes a database and a software system that identifies operons in a collection of prokaryotic genomes using conserved synteny across species. Each of these systems has been widely used and cited, and this project requests funding to rebuild the databases on a larger collection of genomes and to continue to expand them as more genomes appear. All of the software and data generated by this project will continue to be freely available under an open source license, allowing other researchers to use, modify, and redistribute them without restrictions of any kind.
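
The abstract says OperonDB identifies operons using conserved synteny across species. As a rough illustration of that idea only, and not OperonDB's actual algorithm, the toy sketch below flags gene pairs that sit next to each other on the same strand in several genomes. The genome contents, the gene labels, the `min_genomes` threshold, and both function names are hypothetical.

```python
from collections import Counter

# Hypothetical toy genomes: each is an ordered list of (ortholog label, strand).
genomes = {
    "species_A": [("geneA", "+"), ("geneB", "+"), ("geneC", "-")],
    "species_B": [("geneX", "+"), ("geneA", "+"), ("geneB", "+")],
    "species_C": [("geneB", "-"), ("geneA", "-"), ("geneC", "+")],
}


def adjacent_pairs(genome):
    """Return unordered pairs of neighbouring genes that share a strand."""
    pairs = set()
    for (gene1, strand1), (gene2, strand2) in zip(genome, genome[1:]):
        if strand1 == strand2:
            pairs.add(frozenset((gene1, gene2)))
    return pairs


def conserved_pairs(genomes, min_genomes=2):
    """Return pairs that are same-strand neighbours in at least min_genomes genomes."""
    counts = Counter()
    for genome in genomes.values():
        counts.update(adjacent_pairs(genome))
    return {pair for pair, seen in counts.items() if seen >= min_genomes}


candidates = conserved_pairs(genomes)
for pair in candidates:
    print(sorted(pair))
```

In this toy data only the geneA/geneB pair stays adjacent on one strand in more than one genome, so it is the only candidate reported. A real method would also need ortholog detection and a statistical model of how often adjacency is conserved by chance.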

URL

http://projectreporter.nih.gov/project_info_description.cfm?aid=9206164

Related SciGraph Publications

  • 2018-11-28. CHESS: a new human gene catalog curated from thousands of large-scale RNA sequencing experiments reveals extensive transcriptional noise in GENOME BIOLOGY
  • 2018-11-16. KrakenUniq: confident and fast metagenomics classification using unique k-mer counts in GENOME BIOLOGY
  • 2017-05-08. Horizontal gene transfer is not a hallmark of the human genome in GENOME BIOLOGY
  • 2016-08-11. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown in NATURE PROTOCOLS
  • 2015-11-03. Use and mis-use of supplementary material in science publications in BMC BIOINFORMATICS
  • 2014-01-01. Kraken: ultrafast metagenomic sequence classification using exact alignments in GENOME BIOLOGY
  • 2012-10-30. Thousands of missed genes found in bacterial genomes and their analysis with COMBREX in BIOLOGY DIRECT
  • 2012-05-16. Butterfly genome reveals promiscuous exchange of mimicry adaptations among species in NATURE
  • 2012-03-04. Fast gapped-read alignment with Bowtie 2 in NATURE METHODS
  • 2012-03-01. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks in NATURE PROTOCOLS
  • 2011-11-29. Repetitive DNA and next-generation sequencing: computational challenges and solutions in NATURE REVIEWS GENETICS
  • 2011-08-11. TopHat-Fusion: an algorithm for discovery of novel fusion transcripts in GENOME BIOLOGY
  • 2011-07-04. Detection of lineage-specific evolutionary changes among primate species in BMC BIOINFORMATICS
  • 2011-06-30. Improving pan-genome annotation using whole genome multiple alignment in BMC BIOINFORMATICS
  • 2011-05-31. Complete Columbian mammoth mitogenome suggests interbreeding with woolly mammoths in GENOME BIOLOGY
  • 2011-04-28. PhymmBL expanded: confidence scores, custom databases, parallelization and more in NATURE METHODS
  • 2011-03-16. Improving RNA-Seq expression estimates by correcting for fragment bias in GENOME BIOLOGY
  • 2010-12-26. The genome of woodland strawberry (Fragaria vesca) in NATURE GENETICS
  • 2010-11-29. Quake: quality-aware detection and correction of sequencing errors in GENOME BIOLOGY
  • 2010-11-02. Clustering metagenomic sequences with interpolated Markov models in BMC BIOINFORMATICS
  • 2010-10-07. Do-it-yourself genetic testing in GENOME BIOLOGY
  • 2010-09-16. Probing the pan-genome of Listeria monocytogenes: new insights into intraspecific niche expansion and genomic diversification in BMC GENOMICS
  • 2010-07. Cloud computing and the DNA data race in NATURE BIOTECHNOLOGY
  • 2010-05-05. Between a chicken and a grape: estimating the number of human genes in GENOME BIOLOGY
  • 2010-05-02. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation in NATURE BIOTECHNOLOGY
  • 2010-03-10. Detection and correction of false segmental duplications caused by genome mis-assembly in GENOME BIOLOGY
  • 2009-11-20. Searching for SNPs with cloud computing in GENOME BIOLOGY
  • 2009-09-16. Efficient oligonucleotide probe selection for pan-genomic tiling arrays in BMC BIOINFORMATICS
  • 2009-08-02. Phymm and PhymmBL: metagenomic phylogenetic classification with interpolated Markov models in NATURE METHODS
  • 2009-07. The genome of the blood fluke Schistosoma mansoni in NATURE
  • 2009-05. How to map billions of short reads onto genomes in NATURE BIOTECHNOLOGY
  • 2009-04-24. A whole-genome assembly of the domestic cow, Bos taurus in GENOME BIOLOGY
  • 2009-03-04. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome in GENOME BIOLOGY
  • 2008-10-17. Analysis of Carica papaya Telomeres and Telomere-Associated Proteins: Insights into the Evolution of Telomere Maintenance in Brassicales in TROPICAL PLANT BIOLOGY
  • 2008-10. Comparative genomics of the neglected human malaria parasite Plasmodium vivax in NATURE
  • 2008-09-26. Genome-Wide Analysis of Repetitive Elements in Papaya in TROPICAL PLANT BIOLOGY
  • 2008-09. What are decision trees? in NATURE BIOTECHNOLOGY
  • 2008-07-09. The contents of the syringe in NATURE
  • 2008-05-01. Genome sequence and rapid evolution of the rice pathogen Xanthomonas oryzae pv. oryzae PXO99A in BMC GENOMICS
  • 2008-04. The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus) in NATURE
  • 2008-01-11. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments in GENOME BIOLOGY
  • 2007-12-10. High-throughput sequence alignment using Graphics Processing Units in BMC BIOINFORMATICS
  • 2007-02-21. Rapid, accurate, computational discovery of Rho-independent transcription terminators illuminates their relationship to DNA uptake in GENOME BIOLOGY

JSON-LD is the canonical representation for SciGraph data.

    TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground or Google SDTT

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/31", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "type": "DefinedTerm"
          }
        ], 
        "amount": {
          "currency": "USD", 
          "type": "MonetaryAmount", 
          "value": 2088253.0
        }, 
        "description": "DESCRIPTION (provided by applicant): This project will support the continued development and maintenance of four bioinformatics software systems that are widely used in research on gene finding and genome annotation. The first of these, Glimmer, is used to find genes in bacteria, viruses, archaea, and simple eukaryotes. Glimmer is highly accurate, finding over 99% of the genes in most bacteria. It has been used by thousands of scientists around the world, including in the majority of published bacterial and archaeal genome sequencing projects over the past decade. Collectively, the three main publications describing Glimmer have been cited over 2,600 times, including 400 citations in 2012 alone. Usage of Glimmer has increased in recent years due to the explosion in next-generation sequencing projects, which are particularly cost-effective for bacterial genomes. Our very recent introduction of a new version of Glimmer customized for metagenomics data is intended to make it available to microbiome researchers. Glimmer's algorithm is also the basis of PhymmBL, a new system for classifying sequences from metagenomics projects, which we will also support under this project. The second system, MUMmer, is a highly efficient system for whole-genome alignment that is widely used to compare bacterial genomes to one another and to compare genome assemblies to detect changes, both large and small. MUMmer and its components, especially Nucmer, have been widely used and have been incorporated in many other systems, including a recent multi-genome aligner, Mugsy, and several genome assembly packages. The three main publications describing MUMmer have been cited over 1,900 times, including 200 citations in 2012. A major reason for the recent increase in usage of these systems, beyond the drop in sequencing costs, is the growth of metagenomics research, particularly the Human Microbiome Project.\nThis project will also support two other systems, TransTermHP and OperonDB, and the web databases that accompany them. TransTermHP finds transcription terminators in bacterial and archaeal genomes, and we have used it to build a website containing predictions for over 1,500 genomes, all of which are freely downloadable. OperonDB includes a database and a software system that identifies operons in a collection of prokaryotic genomes using conserved synteny across species. Each of these systems has been widely used and cited, and this project requests funding to rebuild the databases on a larger collection of genomes and to continue to expand them as more genomes appear. All of the software and data generated by this project will continue to be freely available under an open source license, allowing other researchers to use, modify, and redistribute them without restrictions of any kind.", 
        "endDate": "2019-07-31", 
        "funder": {
          "id": "http://www.grid.ac/institutes/grid.280785.0", 
          "type": "Organization"
        }, 
        "id": "sg:grant.2519905", 
        "identifier": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "grant.2519905"
            ]
          }, 
          {
            "name": "nih_id", 
            "type": "PropertyValue", 
            "value": [
              "R01GM083873"
            ]
          }
        ], 
        "keywords": [
          "bacterial genomes", 
          "sequencing projects", 
          "software systems", 
          "genome sequencing projects", 
          "whole-genome alignments", 
          "next-generation sequencing projects", 
          "Human Microbiome Project", 
          "open source license", 
          "simple eukaryotes", 
          "archaeal genomes", 
          "microbial genomes", 
          "genome annotation", 
          "genome assembly", 
          "prokaryotic genomes", 
          "transcription terminator", 
          "more genomes", 
          "metagenomic projects", 
          "most bacteria", 
          "genomic data", 
          "genome", 
          "metagenomic research", 
          "sequencing costs", 
          "Microbiome Project", 
          "gene finding", 
          "bioinformatics software", 
          "web databases", 
          "microbiome researchers", 
          "large collection", 
          "genes", 
          "MUMmer", 
          "assembly packages", 
          "thousands of scientists", 
          "bacteria", 
          "algorithm", 
          "software", 
          "efficient system", 
          "new version", 
          "synteny", 
          "eukaryotes", 
          "new system", 
          "archaea", 
          "operon", 
          "database", 
          "usage", 
          "species", 
          "recent introduction", 
          "terminator", 
          "project", 
          "recent years", 
          "PhymmBL", 
          "NUCmer", 
          "system", 
          "annotation", 
          "sequence", 
          "second system", 
          "assembly", 
          "collection", 
          "researchers", 
          "websites", 
          "growth", 
          "aligners", 
          "license", 
          "unrestricted use", 
          "maintenance", 
          "virus", 
          "package", 
          "thousands", 
          "recent increase", 
          "past decade", 
          "data", 
          "cost", 
          "main publications", 
          "research", 
          "version", 
          "alignment", 
          "explosion", 
          "kind", 
          "time", 
          "major reason", 
          "prediction", 
          "development", 
          "basis", 
          "components", 
          "world", 
          "scientists", 
          "description", 
          "changes", 
          "citations", 
          "publications", 
          "majority", 
          "glimmer", 
          "use", 
          "findings", 
          "increase", 
          "decades", 
          "restriction", 
          "introduction", 
          "reasons", 
          "years", 
          "drop"
        ], 
        "name": "Bioinformatics Software for Analyzing Microbial Genomes", 
        "recipient": [
          {
            "id": "http://www.grid.ac/institutes/grid.21107.35", 
            "type": "Organization"
          }, 
          {
            "affiliation": {
              "id": "http://www.grid.ac/institutes/None", 
              "name": "JOHNS HOPKINS UNIVERSITY", 
              "type": "Organization"
            }, 
            "familyName": "SALZBERG", 
            "givenName": "STEVEN L.", 
            "id": "sg:person.01223441713.02", 
            "type": "Person"
          }, 
          {
            "member": "sg:person.01223441713.02", 
            "roleName": "PI", 
            "type": "Role"
          }
        ], 
        "sameAs": [
          "https://app.dimensions.ai/details/grant/grant.2519905"
        ], 
        "sdDataset": "grants", 
        "sdDatePublished": "2022-12-01T06:57", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/grant/grant_33.jsonl", 
        "startDate": "2008-03-25", 
        "type": "MonetaryGrant", 
        "url": "http://projectreporter.nih.gov/project_info_description.cfm?aid=9206164"
      }
    ]
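
Because the record above is JSON-LD, it can be read with an ordinary JSON parser. The following is a minimal sketch that uses a hand-trimmed copy of the record, keeping only a few of its fields; the function name `summarize_grant` and the shape of the returned dictionary are illustrative choices, not part of SciGraph.

```python
import json
from datetime import date

# Hand-trimmed copy of the record above; most fields are omitted for brevity.
record_text = """
[
  {
    "amount": {"currency": "USD", "type": "MonetaryAmount", "value": 2088253.0},
    "startDate": "2008-03-25",
    "endDate": "2019-07-31",
    "identifier": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["grant.2519905"]},
      {"name": "nih_id", "type": "PropertyValue", "value": ["R01GM083873"]}
    ],
    "name": "Bioinformatics Software for Analyzing Microbial Genomes",
    "type": "MonetaryGrant"
  }
]
"""


def summarize_grant(text):
    """Pull the headline fields out of a one-element SciGraph grant record."""
    grant = json.loads(text)[0]
    start = date.fromisoformat(grant["startDate"])
    end = date.fromisoformat(grant["endDate"])
    # Each identifier entry holds a name and a one-element list of values.
    ids = {item["name"]: item["value"][0] for item in grant["identifier"]}
    return {
        "name": grant["name"],
        "amount": grant["amount"]["value"],
        "currency": grant["amount"]["currency"],
        "days": (end - start).days,
        "nih_id": ids.get("nih_id"),
    }


summary = summarize_grant(record_text)
print(summary)
```

The sketch treats the record as plain JSON and ignores the `@context`, which is enough for reading literal values. Resolving the compact identifiers into full URIs would need a JSON-LD processor.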
     

    Download the RDF metadata as: json-ld, nt, turtle, xml

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/grant.2519905'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/grant.2519905'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/grant.2519905'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/grant.2519905'
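
The four curl commands above differ only in their `Accept` header. A minimal Python sketch of the same content negotiation, using only the standard library, might look like this. The short format keys and the helper names `build_request` and `fetch` are assumptions made for illustration, and the sketch does not check whether the endpoint currently responds.

```python
from urllib.request import Request, urlopen

BASE_URL = "https://scigraph.springernature.com/grant.2519905"

# Media types copied from the curl commands above; the short keys are illustrative.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Build a GET request that asks for the given RDF serialization."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=BASE_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building a request does not touch the network; calling fetch() does.
request = build_request("turtle")
print(request.full_url, request.get_header("Accept"))
```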


     

    This table displays all metadata directly associated with this object as RDF triples.

    144 TRIPLES      18 PREDICATES      121 URIs      113 LITERALS      5 BLANK NODES

    Subject Predicate Object
    1 sg:grant.2519905 schema:about anzsrc-for:31
    2 schema:amount N69267e77597a4e4380801bfe3da47460
    3 schema:description DESCRIPTION (provided by applicant): This project will support the continued development and maintenance of four bioinformatics software systems that are widely used in research on gene finding and genome annotation. The first of these, Glimmer, is used to find genes in bacteria, viruses, archaea, and simple eukaryotes. Glimmer is highly accurate, finding over 99% of the genes in most bacteria. It has been used by thousands of scientists around the world, including in the majority of published bacterial and archaeal genome sequencing projects over the past decade. Collectively, the three main publications describing Glimmer have been cited over 2,600 times, including 400 citations in 2012 alone. Usage of Glimmer has increased in recent years due to the explosion in next-generation sequencing projects, which are particularly cost-effective for bacterial genomes. Our very recent introduction of a new version of Glimmer customized for metagenomics data is intended to make it available to microbiome researchers. Glimmer's algorithm is also the basis of PhymmBL, a new system for classifying sequences from metagenomics projects, which we will also support under this project. The second system, MUMmer, is a highly efficient system for whole-genome alignment that is widely used to compare bacterial genomes to one another and to compare genome assemblies to detect changes, both large and small. MUMmer and its components, especially Nucmer, have been widely used and have been incorporated in many other systems, including a recent multi-genome aligner, Mugsy, and several genome assembly packages. The three main publications describing MUMmer have been cited over 1,900 times, including 200 citations in 2012. A major reason for the recent increase in usage of these systems, beyond the drop in sequencing costs, is the growth of metagenomics research, particularly the Human Microbiome Project.
This project will also support two other systems, TransTermHP and OperonDB, and the web databases that accompany them. TransTermHP finds transcription terminators in bacterial and archaeal genomes, and we have used it to build a website containing predictions for over 1,500 genomes, all of which are freely downloadable. OperonDB includes a database and a software system that identifies operons in a collection of prokaryotic genomes using conserved synteny across species. Each of these systems has been widely used and cited, and this project requests funding to rebuild the databases on a larger collection of genomes and to continue to expand them as more genomes appear. All of the software and data generated by this project will continue to be freely available under an open source license, allowing other researchers to use, modify, and redistribute them without restrictions of any kind.
    4 schema:endDate 2019-07-31
    5 schema:funder grid-institutes:grid.280785.0
    6 schema:identifier N87777ff0455a4cdca25e3c30b0dca7e6
    7 Na60a083f6f6c4b30ab0936b54f361dd1
    8 schema:keywords Human Microbiome Project
    9 MUMmer
    10 Microbiome Project
    11 NUCmer
    12 PhymmBL
    13 algorithm
    14 aligners
    15 alignment
    16 annotation
    17 archaea
    18 archaeal genomes
    19 assembly
    20 assembly packages
    21 bacteria
    22 bacterial genomes
    23 basis
    24 bioinformatics software
    25 changes
    26 citations
    27 collection
    28 components
    29 cost
    30 data
    31 database
    32 decades
    33 description
    34 development
    35 drop
    36 efficient system
    37 eukaryotes
    38 explosion
    39 findings
    40 gene finding
    41 genes
    42 genome
    43 genome annotation
    44 genome assembly
    45 genome sequencing projects
    46 genomic data
    47 glimmer
    48 growth
    49 increase
    50 introduction
    51 kind
    52 large collection
    53 license
    54 main publications
    55 maintenance
    56 major reason
    57 majority
    58 metagenomic projects
    59 metagenomic research
    60 microbial genomes
    61 microbiome researchers
    62 more genomes
    63 most bacteria
    64 new system
    65 new version
    66 next-generation sequencing projects
    67 open source license
    68 operon
    69 package
    70 past decade
    71 prediction
    72 project
    73 prokaryotic genomes
    74 publications
    75 reasons
    76 recent increase
    77 recent introduction
    78 recent years
    79 research
    80 researchers
    81 restriction
    82 scientists
    83 second system
    84 sequence
    85 sequencing costs
    86 sequencing projects
    87 simple eukaryotes
    88 software
    89 software systems
    90 species
    91 synteny
    92 system
    93 terminator
    94 thousands
    95 thousands of scientists
    96 time
    97 transcription terminator
    98 unrestricted use
    99 usage
    100 use
    101 version
    102 virus
    103 web databases
    104 websites
    105 whole-genome alignments
    106 world
    107 years
    108 schema:name Bioinformatics Software for Analyzing Microbial Genomes
    109 schema:recipient N9ffcec8948d94080a6b8cecbf04f5f84
    110 sg:person.01223441713.02
    111 grid-institutes:grid.21107.35
    112 schema:sameAs https://app.dimensions.ai/details/grant/grant.2519905
    113 schema:sdDatePublished 2022-12-01T06:57
    114 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    115 schema:sdPublisher Nc240c453631e4a28af74fb6308689628
    116 schema:startDate 2008-03-25
    117 schema:url http://projectreporter.nih.gov/project_info_description.cfm?aid=9206164
    118 sgo:license sg:explorer/license/
    119 sgo:sdDataset grants
    120 rdf:type schema:MonetaryGrant
    121 N69267e77597a4e4380801bfe3da47460 schema:currency USD
    122 schema:value 2088253.0
    123 rdf:type schema:MonetaryAmount
    124 N87777ff0455a4cdca25e3c30b0dca7e6 schema:name nih_id
    125 schema:value R01GM083873
    126 rdf:type schema:PropertyValue
    127 N9ffcec8948d94080a6b8cecbf04f5f84 schema:member sg:person.01223441713.02
    128 schema:roleName PI
    129 rdf:type schema:Role
    130 Na60a083f6f6c4b30ab0936b54f361dd1 schema:name dimensions_id
    131 schema:value grant.2519905
    132 rdf:type schema:PropertyValue
    133 Nc240c453631e4a28af74fb6308689628 schema:name Springer Nature - SN SciGraph project
    134 rdf:type schema:Organization
    135 anzsrc-for:31 schema:inDefinedTermSet anzsrc-for:
    136 rdf:type schema:DefinedTerm
    137 sg:person.01223441713.02 schema:affiliation grid-institutes:None
    138 schema:familyName SALZBERG
    139 schema:givenName STEVEN L.
    140 rdf:type schema:Person
    141 grid-institutes:None schema:name JOHNS HOPKINS UNIVERSITY
    142 rdf:type schema:Organization
    143 grid-institutes:grid.21107.35 schema:Organization
    144 grid-institutes:grid.280785.0 schema:Organization
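
Summary counts like those shown above the table (triples, predicates) follow directly from a list of triples. The sketch below works on a small subset of rows copied by hand from the table, kept in their prefixed form rather than as full URIs; the function name `summarize` and the returned keys are illustrative.

```python
from collections import Counter

# A hand-copied subset of the table above as (subject, predicate, object) tuples.
triples = [
    ("sg:grant.2519905", "schema:startDate", "2008-03-25"),
    ("sg:grant.2519905", "schema:endDate", "2019-07-31"),
    ("sg:grant.2519905", "schema:keywords", "MUMmer"),
    ("sg:grant.2519905", "schema:keywords", "PhymmBL"),
    ("sg:grant.2519905", "rdf:type", "schema:MonetaryGrant"),
    ("sg:person.01223441713.02", "schema:familyName", "SALZBERG"),
    ("sg:person.01223441713.02", "rdf:type", "schema:Person"),
]


def summarize(triples):
    """Count triples, distinct subjects, and how often each predicate is used."""
    per_predicate = Counter(predicate for _, predicate, _ in triples)
    return {
        "triples": len(triples),
        "subjects": len({subject for subject, _, _ in triples}),
        "predicates": len(per_predicate),
        "per_predicate": per_predicate,
    }


stats = summarize(triples)
print(stats["triples"], "triples,", stats["predicates"], "predicates")
```

On the full table the same counting would give the 144 triples and 18 predicates reported above, provided every row is expanded to carry its own subject and predicate.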
     



