Ontology type: schema:ScholarlyArticle
Open Access: True
2009-12-15
AUTHORS: Christiam Camacho, George Coulouris, Vahram Avagyan, Ning Ma, Jason Papadopoulos, Kevin Bealer, Thomas L Madden
ABSTRACT
Background: Sequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user-interface of the current command-line applications.
Results: We describe features and improvements of rewritten BLAST software and introduce new command-line applications. Long query sequences are broken into chunks for processing, in some cases leading to dramatically shorter run times. For long database sequences, it is possible to retrieve only the relevant parts of the sequence, reducing CPU time and memory usage for searches of short queries against databases of contigs or chromosomes. The program can now retrieve masking information for database sequences from the BLAST databases. A new modular software library can now access subject sequence data from arbitrary data sources. We introduce several new features, including strategy files that allow a user to save and reuse their favorite set of options. The strategy files can be uploaded to and downloaded from the NCBI BLAST web site.
Conclusion: The new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences. We have also improved the user interface of the command-line applications.
PAGES: 421
http://scigraph.springernature.com/pub.10.1186/1471-2105-10-421
DOI: http://dx.doi.org/10.1186/1471-2105-10-421
DIMENSIONS: https://app.dimensions.ai/details/publication/pub.1050579230
PUBMED: https://www.ncbi.nlm.nih.gov/pubmed/20003500
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information and Computing Sciences",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information Systems",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Computational Biology",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Databases, Genetic",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Sequence Alignment",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Software",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA",
"id": "http://www.grid.ac/institutes/grid.419234.9",
"name": [
"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA"
],
"type": "Organization"
},
"familyName": "Camacho",
"givenName": "Christiam",
"id": "sg:person.0770117714.83",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0770117714.83"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA",
"id": "http://www.grid.ac/institutes/grid.419234.9",
"name": [
"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA"
],
"type": "Organization"
},
"familyName": "Coulouris",
"givenName": "George",
"id": "sg:person.0776642273.17",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0776642273.17"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA",
"id": "http://www.grid.ac/institutes/grid.419234.9",
"name": [
"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA"
],
"type": "Organization"
},
"familyName": "Avagyan",
"givenName": "Vahram",
"id": "sg:person.0722540102.57",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0722540102.57"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA",
"id": "http://www.grid.ac/institutes/grid.419234.9",
"name": [
"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA"
],
"type": "Organization"
},
"familyName": "Ma",
"givenName": "Ning",
"id": "sg:person.01220574714.68",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01220574714.68"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA",
"id": "http://www.grid.ac/institutes/grid.419234.9",
"name": [
"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA"
],
"type": "Organization"
},
"familyName": "Papadopoulos",
"givenName": "Jason",
"type": "Person"
},
{
"affiliation": {
"alternateName": "National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA",
"id": "http://www.grid.ac/institutes/grid.419234.9",
"name": [
"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA"
],
"type": "Organization"
},
"familyName": "Bealer",
"givenName": "Kevin",
"type": "Person"
},
{
"affiliation": {
"alternateName": "National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA",
"id": "http://www.grid.ac/institutes/grid.419234.9",
"name": [
"National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, 20894, Bethesda, MD, USA"
],
"type": "Organization"
},
"familyName": "Madden",
"givenName": "Thomas L",
"id": "sg:person.01155045411.88",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01155045411.88"
],
"type": "Person"
}
],
"citation": [
{
"id": "sg:pub.10.1038/nature01262",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1039854529",
"https://doi.org/10.1038/nature01262"
],
"type": "CreativeWork"
}
],
"datePublished": "2009-12-15",
"datePublishedReg": "2009-12-15",
"description": "BackgroundSequence similarity searching is a very important bioinformatics task. While Basic Local Alignment Search Tool (BLAST) outperforms exact methods through its use of heuristics, the speed of the current BLAST software is suboptimal for very long queries or database sequences. There are also some shortcomings in the user-interface of the current command-line applications.ResultsWe describe features and improvements of rewritten BLAST software and introduce new command-line applications. Long query sequences are broken into chunks for processing, in some cases leading to dramatically shorter run times. For long database sequences, it is possible to retrieve only the relevant parts of the sequence, reducing CPU time and memory usage for searches of short queries against databases of contigs or chromosomes. The program can now retrieve masking information for database sequences from the BLAST databases. A new modular software library can now access subject sequence data from arbitrary data sources. We introduce several new features, including strategy files that allow a user to save and reuse their favorite set of options. The strategy files can be uploaded to and downloaded from the NCBI BLAST web site.ConclusionThe new BLAST command-line applications, compared to the current BLAST tools, demonstrate substantial speed improvements for long queries as well as chromosome length database sequences. We have also improved the user interface of the command-line applications.",
"genre": "article",
"id": "sg:pub.10.1186/1471-2105-10-421",
"inLanguage": "en",
"isAccessibleForFree": true,
"isPartOf": [
{
"id": "sg:journal.1023786",
"issn": [
"1471-2105"
],
"name": "BMC Bioinformatics",
"publisher": "Springer Nature",
"type": "Periodical"
},
{
"issueNumber": "1",
"type": "PublicationIssue"
},
{
"type": "PublicationVolume",
"volumeNumber": "10"
}
],
"keywords": [
"command-line application",
"long queries",
"modular software library",
"Basic Local Alignment Search Tool",
"arbitrary data sources",
"important bioinformatics tasks",
"long query sequences",
"database sequences",
"substantial speed improvements",
"short queries",
"software library",
"user interface",
"memory usage",
"bioinformatics tasks",
"use of heuristics",
"query sequence",
"queries",
"Web sites",
"Local Alignment Search Tool",
"search tools",
"speed improvement",
"data sources",
"new features",
"BLAST software",
"CPU time",
"relevant parts",
"exact method",
"software",
"files",
"BLAST database",
"applications",
"database",
"users",
"BLAST tool",
"heuristics",
"architecture",
"chunks",
"tool",
"task",
"features",
"usage",
"processing",
"information",
"library",
"set",
"search",
"interface",
"sequence data",
"shortcomings",
"sequence",
"improvement",
"speed",
"time",
"similarity",
"data",
"method",
"program",
"use",
"part",
"source",
"cases",
"options",
"contigs",
"ResultsWe",
"sites",
"chromosomes"
],
"name": "BLAST+: architecture and applications",
"pagination": "421",
"productId": [
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1050579230"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1186/1471-2105-10-421"
]
},
{
"name": "pubmed_id",
"type": "PropertyValue",
"value": [
"20003500"
]
}
],
"sameAs": [
"https://doi.org/10.1186/1471-2105-10-421",
"https://app.dimensions.ai/details/publication/pub.1050579230"
],
"sdDataset": "articles",
"sdDatePublished": "2022-05-20T07:25",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/article/article_494.jsonl",
"type": "ScholarlyArticle",
"url": "https://doi.org/10.1186/1471-2105-10-421"
}
]
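The record above can be read with any JSON parser. The following is a minimal sketch, assuming a trimmed copy of the record (only two of the seven authors and the identifier fields are kept for brevity) and a hypothetical helper named `summarize`; it is an illustration, not part of SciGraph.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Camacho", "givenName": "Christiam", "type": "Person"},
      {"familyName": "Madden", "givenName": "Thomas L", "type": "Person"}
    ],
    "datePublished": "2009-12-15",
    "name": "BLAST+: architecture and applications",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1050579230"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1471-2105-10-421"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["20003500"]}
    ],
    "type": "ScholarlyArticle"
  }
]
"""


def summarize(record_json):
    """Return title, date, author names and identifiers from a SciGraph-style record.

    `summarize` is a hypothetical helper written for this example.
    """
    record = json.loads(record_json)[0]  # the top level is a one-element array
    authors = [
        f"{person['givenName']} {person['familyName']}"
        for person in record.get("author", [])
    ]
    # productId is a list of {name, value: [...]} entries; flatten it into a dict.
    ids = {
        entry["name"]: entry["value"][0]
        for entry in record.get("productId", [])
        if entry.get("value")
    }
    return {
        "title": record.get("name"),
        "date": record.get("datePublished"),
        "authors": authors,
        "ids": ids,
    }


summary = summarize(RECORD_JSON)
print(summary["title"])
print(", ".join(summary["authors"]))
print(summary["ids"]["doi"])
```

Flattening `productId` into a dictionary keyed by `name` makes the DOI, PubMed, and Dimensions identifiers addressable without scanning the list each time.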
Download the RDF metadata as JSON-LD, N-Triples, Turtle, or RDF/XML:
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-10-421'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-10-421'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-10-421'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-10-421'
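The four curl commands differ only in their `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is a minimal sketch: the names `ACCEPT_TYPES` and `build_request` are hypothetical, and the requests are only constructed, not sent, because the availability of the endpoint is not verified here.

```python
from urllib.request import Request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/1471-2105-10-421"

# Media types copied from the curl commands above, keyed by a short format name.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for one RDF serialization."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


for fmt in ACCEPT_TYPES:
    request = build_request(fmt)
    print(fmt, "->", request.get_header("Accept"))
```

To actually fetch a serialization, the request object can be passed to `urllib.request.urlopen`, for example `urlopen(build_request("turtle")).read()`.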
This table displays all metadata directly associated with this object as RDF triples.
187 TRIPLES
22 PREDICATES
97 URIs
88 LITERALS
11 BLANK NODES
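Summary counts of this kind can be derived from an N-Triples serialization. The following is a minimal sketch under stated assumptions: the sample triples are illustrative (the subject and predicate IRIs are guesses, since the JSON-LD context is not shown here), the regular expression covers only simple N-Triples lines, and the counting rules (distinct predicates, distinct IRIs in subject or object position, literal occurrences, distinct blank nodes) are an assumption that may differ from how the page computes its totals.

```python
import re

# One RDF term: <iri>, _:blankNode, or "literal" with optional datatype or language tag.
TERM = r'(<[^>]*>|_:\w+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
TRIPLE = re.compile(rf"^\s*{TERM}\s+{TERM}\s+{TERM}\s*\.\s*$")

# Illustrative triples only; the IRIs are assumptions, the values come from the record.
SAMPLE = """
<http://scigraph.springernature.com/pub.10.1186/1471-2105-10-421> <http://schema.org/name> "BLAST+: architecture and applications" .
<http://scigraph.springernature.com/pub.10.1186/1471-2105-10-421> <http://schema.org/pagination> "421" .
<http://scigraph.springernature.com/pub.10.1186/1471-2105-10-421> <http://schema.org/isPartOf> _:b0 .
_:b0 <http://schema.org/volumeNumber> "10" .
<http://scigraph.springernature.com/pub.10.1186/1471-2105-10-421> <http://schema.org/sameAs> <https://doi.org/10.1186/1471-2105-10-421> .
"""


def rdf_stats(ntriples_text):
    """Count triples, predicates, IRIs, literals and blank nodes (hypothetical helper)."""
    triples = 0
    literals = 0
    predicates, uris, blank_nodes = set(), set(), set()
    for line in ntriples_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"not a simple N-Triples line: {line!r}")
        subject, predicate, obj = match.groups()
        triples += 1
        predicates.add(predicate)
        for term in (subject, obj):
            if term.startswith("<"):
                uris.add(term)
            elif term.startswith("_:"):
                blank_nodes.add(term)
            else:
                literals += 1  # literals are counted per occurrence
    return {
        "triples": triples,
        "predicates": len(predicates),
        "uris": len(uris),
        "literals": literals,
        "blank_nodes": len(blank_nodes),
    }


stats = rdf_stats(SAMPLE)
print(stats)
```

For production use, a full RDF parser is a safer choice than a regular expression, since N-Triples allows escapes and term forms that this pattern does not validate.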