Ontology type: schema:ScholarlyArticle
Open Access: True
2011-12
AUTHORS: Torbjørn Rognes
ABSTRACT
BACKGROUND: The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation.
RESULTS: A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License.
CONCLUSIONS: Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance.
PAGES: 221
http://scigraph.springernature.com/pub.10.1186/1471-2105-12-221
DOI: http://dx.doi.org/10.1186/1471-2105-12-221
DIMENSIONS: https://app.dimensions.ai/details/publication/pub.1010513713
PUBMED: https://www.ncbi.nlm.nih.gov/pubmed/21631914
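The inter-sequence idea in the abstract (one query residue compared against residues from several database sequences at once) can be illustrated with a minimal sketch. This is not SWIPE's code: plain Python lists stand in for SIMD lanes, and the match/mismatch scores, the linear gap penalty, the padding scheme and the function names `sw_score` and `sw_scores_batch` are all assumptions made for illustration. The abstract itself refers to BLOSUM score matrices and does not describe the gap model.

```python
LANES = 16                          # the abstract mentions sixteen database sequences
MATCH, MISMATCH, GAP = 2, -1, -2    # illustrative scores, not taken from the paper
PAD = "\0"                          # hypothetical filler for sequences shorter than the batch width


def sw_score(query, target):
    """Reference scalar Smith-Waterman score for a single pair (linear gaps)."""
    prev = [0] * (len(query) + 1)   # previous database column
    best = 0
    for d in target:
        cur = [0] * (len(query) + 1)
        for i, q in enumerate(query, start=1):
            sub = MATCH if q == d else MISMATCH
            cur[i] = max(0, prev[i - 1] + sub, prev[i] + GAP, cur[i - 1] + GAP)
            best = max(best, cur[i])
        prev = cur
    return best


def sw_scores_batch(query, db_seqs):
    """Score one query against up to LANES database sequences, lane by lane."""
    if len(db_seqs) > LANES:
        raise ValueError("at most %d sequences per batch" % LANES)
    n = len(db_seqs)
    width = max((len(s) for s in db_seqs), default=0)
    padded = [s.ljust(width, PAD) for s in db_seqs]
    prev = [[0] * n for _ in range(len(query) + 1)]
    best = [0] * n
    for j in range(width):
        column = [s[j] for s in padded]        # one database residue per lane
        cur = [[0] * n for _ in range(len(query) + 1)]
        for i, q in enumerate(query, start=1):
            # This inner loop is the part a SIMD instruction would do in one step.
            for lane in range(n):
                d = column[lane]
                # Padding never matches, so it can only lower scores.
                sub = MATCH if (d != PAD and q == d) else MISMATCH
                h = max(0,
                        prev[i - 1][lane] + sub,
                        prev[i][lane] + GAP,
                        cur[i - 1][lane] + GAP)
                cur[i][lane] = h
                if h > best[lane]:
                    best[lane] = h
        prev = cur
    return best


if __name__ == "__main__":
    print(sw_scores_batch("MKVLA", ["MKVLA", "AAMKVA", "LA", ""]))
```

Because every step taken through the padding has a negative score, padded cells can never exceed a score already reached inside the real sequence, so each lane returns the same score as the scalar routine.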
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information Systems",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information and Computing Sciences",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Algorithms",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Amino Acid Sequence",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Computers",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Databases, Protein",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Molecular Sequence Data",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Programming Languages",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Proteins",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Sequence Alignment",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Software",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "Sencel Bioinformatics (Norway)",
"id": "https://www.grid.ac/institutes/grid.458831.3",
"name": [
"Department of Informatics, University of Oslo, PO Box 1080, NO-0316, Blindern, Oslo, Norway",
"Centre for Molecular Biology and Neuroscience (CMBN), Department of Microbiology, Rikshospitalet, Oslo University Hospital, PO Box 4950, NO-0424, Nydalen, Oslo, Norway",
"Sencel Bioinformatics AS, PO Box 180, NO-0319, Vinderen, Oslo, Norway"
],
"type": "Organization"
},
"familyName": "Rognes",
"givenName": "Torbj\u00f8rn",
"id": "sg:person.01304226265.49",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01304226265.49"
],
"type": "Person"
}
],
"citation": [
{
"id": "https://doi.org/10.1093/nar/gkp846",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1006602999"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1093/bioinformatics/13.2.145",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1008319186"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/1471-2105-9-377",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1009826728",
"https://doi.org/10.1186/1471-2105-9-377"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1073/pnas.89.22.10915",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1010234644"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1016/s0022-2836(05)80360-2",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1013618994"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1093/bioinformatics/btl582",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1014155557"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/1756-0500-2-73",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1016918082",
"https://doi.org/10.1186/1756-0500-2-73"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1016/b978-0-12-384988-5.00011-5",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1017852546"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/1756-0500-1-107",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1023265691",
"https://doi.org/10.1186/1756-0500-1-107"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/1471-2105-8-185",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1024279614",
"https://doi.org/10.1186/1471-2105-8-185"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1016/0022-2836(81)90087-5",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1024589839"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1016/0022-2836(82)90398-9",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1025042064"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1093/bioinformatics/16.8.699",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1025315480"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1371/journal.pcbi.1000386",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1039097186"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/1756-0500-3-93",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1041106303",
"https://doi.org/10.1186/1756-0500-3-93"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1093/nar/25.17.3389",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1047265454"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1016/j.ygeno.2010.03.001",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1052838811"
],
"type": "CreativeWork"
},
{
"id": "https://doi.org/10.1109/ipdps.2009.5160931",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1094780600"
],
"type": "CreativeWork"
}
],
"datePublished": "2011-12",
"datePublishedReg": "2011-12-01",
"description": "BACKGROUND: The Smith-Waterman algorithm for local sequence alignment is more sensitive than heuristic methods for database searching, but also more time-consuming. The fastest approach to parallelisation with SIMD technology has previously been described by Farrar in 2007. The aim of this study was to explore whether further speed could be gained by other approaches to parallelisation.\nRESULTS: A faster approach and implementation is described and benchmarked. In the new tool SWIPE, residues from sixteen different database sequences are compared in parallel to one query residue. Using a 375 residue query sequence a speed of 106 billion cell updates per second (GCUPS) was achieved on a dual Intel Xeon X5650 six-core processor system, which is over six times more rapid than software based on Farrar's 'striped' approach. SWIPE was about 2.5 times faster when the programs used only a single thread. For shorter queries, the increase in speed was larger. SWIPE was about twice as fast as BLAST when using the BLOSUM50 score matrix, while BLAST was about twice as fast as SWIPE for the BLOSUM62 matrix. The software is designed for 64 bit Linux on processors with SSSE3. Source code is available from http://dna.uio.no/swipe/ under the GNU Affero General Public License.\nCONCLUSIONS: Efficient parallelisation using SIMD on standard hardware makes it possible to run Smith-Waterman database searches more than six times faster than before. The approach described here could significantly widen the potential application of Smith-Waterman searches. Other applications that require optimal local alignment scores could also benefit from improved performance.",
"genre": "research_article",
"id": "sg:pub.10.1186/1471-2105-12-221",
"inLanguage": [
"en"
],
"isAccessibleForFree": true,
"isPartOf": [
{
"id": "sg:journal.1023786",
"issn": [
"1471-2105"
],
"name": "BMC Bioinformatics",
"type": "Periodical"
},
{
"issueNumber": "1",
"type": "PublicationIssue"
},
{
"type": "PublicationVolume",
"volumeNumber": "12"
}
],
"name": "Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation",
"pagination": "221",
"productId": [
{
"name": "readcube_id",
"type": "PropertyValue",
"value": [
"3216829ee180732a981f613819f94e81f82537d78e65dd5dbbf0212845f3c032"
]
},
{
"name": "pubmed_id",
"type": "PropertyValue",
"value": [
"21631914"
]
},
{
"name": "nlm_unique_id",
"type": "PropertyValue",
"value": [
"100965194"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1186/1471-2105-12-221"
]
},
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1010513713"
]
}
],
"sameAs": [
"https://doi.org/10.1186/1471-2105-12-221",
"https://app.dimensions.ai/details/publication/pub.1010513713"
],
"sdDataset": "articles",
"sdDatePublished": "2019-04-10T14:16",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8660_00000549.jsonl",
"type": "ScholarlyArticle",
"url": "http://link.springer.com/10.1186/1471-2105-12-221"
}
]
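Since the record is ordinary JSON, it can be read with any JSON parser. The sketch below parses a shortened excerpt of the record above and collects the `productId` entries into a dictionary. The helper name `product_ids` and the trimmed excerpt are assumptions for illustration; a real consumer would load the full record.

```python
import json

# Shortened excerpt of the record above; most fields are omitted for brevity.
EXCERPT = """
[
  {
    "id": "sg:pub.10.1186/1471-2105-12-221",
    "name": "Faster Smith-Waterman database searches with inter-sequence SIMD parallelisation",
    "productId": [
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["21631914"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1471-2105-12-221"]}
    ],
    "type": "ScholarlyArticle"
  }
]
"""


def product_ids(record):
    """Map each productId name to its first value (hypothetical helper)."""
    return {p["name"]: p["value"][0] for p in record.get("productId", [])}


records = json.loads(EXCERPT)   # the top level is a list holding one article
article = records[0]
ids = product_ids(article)

if __name__ == "__main__":
    print(article["name"])
    print(ids["doi"], ids["pubmed_id"])
```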
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-12-221'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-12-221'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-12-221'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-12-221'
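The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The short format keys and the names `build_request` and `fetch` are assumptions for illustration, and the sketch does not guarantee that the endpoint still responds.

```python
from urllib.request import Request, urlopen

RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/1471-2105-12-221"

# Media types taken from the curl commands above; the keys are hypothetical short names.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking for the given RDF serialisation."""
    return Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Only the request is built here; call fetch("turtle") to contact the server.
    req = build_request("turtle")
    print(req.full_url, req.get_header("Accept"))
```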
The metadata directly associated with this object comprises 165 RDF triples, 21 predicates, 56 URIs, 30 literals and 18 blank nodes.