BioFlow: a web based workflow management software for design and execution of genomics pipelines


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2014-09-18

AUTHORS

Harold Garner, Ashwin Puthige

ABSTRACT

Background

Bioinformatics data analysis is usually done sequentially by chaining together multiple tools. These chains are created by writing scripts and tracking the inputs and outputs of all stages. Writing such scripts requires programming skills. Executing multiple pipelines in parallel and keeping track of all the generated files is difficult and error prone. Checking results and task completion requires users to log in remotely to their servers and run commands to identify process status. Users would benefit from a web-based tool that allows creation and execution of pipelines remotely. The tool should also keep track of all the files generated and maintain a history of user activities.

Results

A software tool for building and executing workflows is described here. The individual tools in the workflows can be any command line executable or script. The software has an intuitive mechanism for adding new tools to be used in workflows. It contains a workflow designer where workflows can be created by visually connecting various components. Workflows are executed by job runners. The outputs and the job history are saved. The tool is web-based software, and all actions can be performed remotely.

Conclusions

Users without scripting knowledge can utilize the tool to build pipelines for executing tasks. Pipelines can be modeled as workflows that are reusable. BioFlow enables users to easily add new tools to the database. The workflows can be created and executed remotely. A number of parallel jobs can be easily controlled. Distributed execution is possible by running multiple instances of the application. Any number of tasks can be executed, and the output will be stored, making it easy to correlate the outputs to the jobs executed.


PAGES

20-20

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1751-0473-9-20

DOI

http://dx.doi.org/10.1186/1751-0473-9-20

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1037235726



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0803", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Computer Software", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information Systems", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Virginia Bioinformatics Institute, Washington Street 0477, Blacksburg 24061, VA, USA", 
          "id": "http://www.grid.ac/institutes/grid.438526.e", 
          "name": [
            "Virginia Bioinformatics Institute, Washington Street 0477, Blacksburg 24061, VA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Garner", 
        "givenName": "Harold", 
        "id": "sg:person.0613230631.83", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0613230631.83"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Virginia Bioinformatics Institute, Washington Street 0477, Blacksburg 24061, VA, USA", 
          "id": "http://www.grid.ac/institutes/grid.438526.e", 
          "name": [
            "Virginia Bioinformatics Institute, Washington Street 0477, Blacksburg 24061, VA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Puthige", 
        "givenName": "Ashwin", 
        "id": "sg:person.01033744466.12", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01033744466.12"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1186/gb-2010-11-8-r86", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046347776", 
          "https://doi.org/10.1186/gb-2010-11-8-r86"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2014-09-18", 
    "datePublishedReg": "2014-09-18", 
    "description": "Background\n\nBioinformatics data analysis is usually done sequentially by chaining together multiple tools. These are created by writing scripts and tracking the inputs and outputs of all stages. Writing such scripts require programming skills. Executing multiple pipelines in parallel and keeping track of all the generated files is difficult and error prone. Checking results and task completion requires users to remotely login to their servers and run commands to identify process status. Users would benefit from a web-based tool that allows creation and execution of pipelines remotely. The tool should also keep track of all the files generated and maintain a history of user activities.\n\nResults\n\nA software tool for building and executing workflows is described here. The individual tools in the workflows can be any command line executable or script. The software has an intuitive mechanism for adding new tools to be used in workflows. It contains a workflow designer where workflows can be creating by visually connecting various components. Workflows are executed by job runners. The outputs and the job history are saved. The tool is web based software tool and all actions can be performed remotely.\n\nConclusions\n\nUsers without scripting knowledge can utilize the tool to build pipelines for executing tasks. Pipelines can be modeled as workflows that are reusable. BioFlow enables users to easily add new tools to the database. The workflows can be created and executed remotely. A number of parallel jobs can be easily controlled. Distributed execution is possible by running multiple instances of the application. Any number of tasks can be executed and the output will be stored making it is easy to correlate the outputs to the jobs executed.\n\n",
    "genre": "article",
    "id": "sg:pub.10.1186/1751-0473-9-20",
    "inLanguage": "en",
    "isAccessibleForFree": true,
    "isPartOf": [
      {
        "id": "sg:journal.1037006",
        "issn": [
          "1751-0473"
        ],
        "name": "Source Code for Biology and Medicine",
        "publisher": "Springer Nature",
        "type": "Periodical"
      },
      {
        "issueNumber": "1",
        "type": "PublicationIssue"
      },
      {
        "type": "PublicationVolume",
        "volumeNumber": "9"
      }
    ],
    "keywords": [
      "software tools",
      "workflow management software",
      "web-based tool",
      "number of tasks",
      "workflow designer",
      "user activity",
      "management software",
      "programming skills",
      "parallel jobs",
      "error prone",
      "command line",
      "multiple instances",
      "intuitive mechanism",
      "bioinformatics data analysis",
      "multiple pipelines",
      "such scripts",
      "workflow",
      "users",
      "individual tools",
      "multiple tools",
      "execution",
      "task completion",
      "process status",
      "software",
      "new tool",
      "Web",
      "files",
      "genomics pipeline",
      "pipeline",
      "BioFlow",
      "task",
      "scripts",
      "tool",
      "data analysis",
      "server",
      "prone",
      "command",
      "designers",
      "output",
      "jobs",
      "track",
      "database",
      "instances",
      "applications",
      "creation",
      "input",
      "design",
      "number",
      "knowledge",
      "results",
      "parallel",
      "components",
      "skills",
      "background",
      "analysis",
      "job history",
      "completion",
      "stage",
      "mechanism",
      "action",
      "lines",
      "sec",
      "activity",
      "status",
      "history",
      "conclusion",
      "runners",
      "execution of pipelines",
      "job runners"
    ],
    "name": "BioFlow: a web based workflow management software for design and execution of genomics pipelines",
    "pagination": "20-20",
    "productId": [
      {
        "name": "dimensions_id",
        "type": "PropertyValue",
        "value": [
          "pub.1037235726"
        ]
      },
      {
        "name": "doi",
        "type": "PropertyValue",
        "value": [
          "10.1186/1751-0473-9-20"
        ]
      }
    ],
    "sameAs": [
      "https://doi.org/10.1186/1751-0473-9-20",
      "https://app.dimensions.ai/details/publication/pub.1037235726"
    ],
    "sdDataset": "articles",
    "sdDatePublished": "2022-01-01T18:33",
    "sdLicense": "https://scigraph.springernature.com/explorer/license/",
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project",
      "type": "Organization"
    },
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/article/article_624.jsonl",
    "type": "ScholarlyArticle",
    "url": "https://doi.org/10.1186/1751-0473-9-20"
  }
]
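
As a rough illustration of how the record above might be consumed, the following sketch parses a trimmed copy of it with Python's standard `json` module and extracts the title, DOI, and author names. The helper name `summarize_record` and the trimmed `SAMPLE` string are assumptions made for this example; only the field names (`name`, `productId`, `author`, `givenName`, `familyName`) are taken from the record itself.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
SAMPLE = """
[
  {
    "id": "sg:pub.10.1186/1751-0473-9-20",
    "name": "BioFlow: a web based workflow management software for design and execution of genomics pipelines",
    "author": [
      {"familyName": "Garner", "givenName": "Harold", "type": "Person"},
      {"familyName": "Puthige", "givenName": "Ashwin", "type": "Person"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1037235726"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1751-0473-9-20"]}
    ]
  }
]
"""


def summarize_record(text):
    """Return title, DOI and author names from a SciGraph-style JSON-LD string.

    Hypothetical helper written for this example, not part of any SciGraph API.
    """
    record = json.loads(text)[0]  # the document is a one-element list
    # productId is a list of name/value pairs; pick the entry named "doi".
    doi = next(
        (p["value"][0] for p in record.get("productId", []) if p.get("name") == "doi"),
        None,
    )
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])]
    return {"title": record["name"], "doi": doi, "authors": authors}


summary = summarize_record(SAMPLE)
print(summary["doi"], summary["authors"])
```

Treating the record as plain JSON is enough here because JSON-LD is a JSON syntax; a full JSON-LD processor would only be needed to expand the `@context` into absolute IRIs.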
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1751-0473-9-20'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1751-0473-9-20'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1751-0473-9-20'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1751-0473-9-20'
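
The four curl commands above differ only in the `Accept` header, which is ordinary HTTP content negotiation. A minimal Python sketch of the same idea follows, using only the standard library. The names `MEDIA_TYPES`, `build_request`, and `fetch` are hypothetical, the dictionary keys are labels chosen for this example, and whether the endpoint still answers these requests is not something this page confirms.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/1751-0473-9-20"

# Media types copied from the curl commands above; the keys are
# illustrative labels, not identifiers defined by SciGraph.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for the given RDF format."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the body as text; requires network access."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Inspect the request without touching the network.
req = build_request("turtle")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Calling `fetch("json-ld")` would perform the same request as the first curl command, assuming the service is reachable.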


 

This table displays all metadata directly associated with this object as RDF triples.

141 TRIPLES      22 PREDICATES      96 URIs      86 LITERALS      6 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1751-0473-9-20 schema:about anzsrc-for:08
2 anzsrc-for:0803
3 anzsrc-for:0806
4 schema:author N790d6c6967d8474b9de427420f2f8726
5 schema:citation sg:pub.10.1186/gb-2010-11-8-r86
6 schema:datePublished 2014-09-18
7 schema:datePublishedReg 2014-09-18
8 schema:description <sec>Background<p>Bioinformatics data analysis is usually done sequentially by chaining together multiple tools. These are created by writing scripts and tracking the inputs and outputs of all stages. Writing such scripts require programming skills. Executing multiple pipelines in parallel and keeping track of all the generated files is difficult and error prone. Checking results and task completion requires users to remotely login to their servers and run commands to identify process status. Users would benefit from a web-based tool that allows creation and execution of pipelines remotely. The tool should also keep track of all the files generated and maintain a history of user activities.</p></sec><sec>Results<p>A software tool for building and executing workflows is described here. The individual tools in the workflows can be any command line executable or script. The software has an intuitive mechanism for adding new tools to be used in workflows. It contains a workflow designer where workflows can be creating by visually connecting various components. Workflows are executed by job runners. The outputs and the job history are saved. The tool is web based software tool and all actions can be performed remotely.</p></sec><sec>Conclusions<p>Users without scripting knowledge can utilize the tool to build pipelines for executing tasks. Pipelines can be modeled as workflows that are reusable. BioFlow enables users to easily add new tools to the database. The workflows can be created and executed remotely. A number of parallel jobs can be easily controlled. Distributed execution is possible by running multiple instances of the application. Any number of tasks can be executed and the output will be stored making it is easy to correlate the outputs to the jobs executed.</p></sec>
9 schema:genre article
10 schema:inLanguage en
11 schema:isAccessibleForFree true
12 schema:isPartOf Na8b05a6c4a6c4b32a5a03495b57e196b
13 Nc889b73f91944643a223dcd0ef2394b0
14 sg:journal.1037006
15 schema:keywords BioFlow
16 Web
17 action
18 activity
19 analysis
20 applications
21 background
22 bioinformatics data analysis
23 command
24 command line
25 completion
26 components
27 conclusion
28 creation
29 data analysis
30 database
31 design
32 designers
33 error prone
34 execution
35 execution of pipelines
36 files
37 genomics pipeline
38 history
39 individual tools
40 input
41 instances
42 intuitive mechanism
43 job history
44 job runners
45 jobs
46 knowledge
47 lines
48 management software
49 mechanism
50 multiple instances
51 multiple pipelines
52 multiple tools
53 new tool
54 number
55 number of tasks
56 output
57 parallel
58 parallel jobs
59 pipeline
60 process status
61 programming skills
62 prone
63 results
64 runners
65 scripts
66 sec
67 server
68 skills
69 software
70 software tools
71 stage
72 status
73 such scripts
74 task
75 task completion
76 tool
77 track
78 user activity
79 users
80 web-based tool
81 workflow
82 workflow designer
83 workflow management software
84 schema:name BioFlow: a web based workflow management software for design and execution of genomics pipelines
85 schema:pagination 20-20
86 schema:productId Na862f35ae4374625b5851f92fda13b95
87 Nba614c662fea43d180702cc16cc6262f
88 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037235726
89 https://doi.org/10.1186/1751-0473-9-20
90 schema:sdDatePublished 2022-01-01T18:33
91 schema:sdLicense https://scigraph.springernature.com/explorer/license/
92 schema:sdPublisher Na8c56f1b3a8348de93db73545f076ce8
93 schema:url https://doi.org/10.1186/1751-0473-9-20
94 sgo:license sg:explorer/license/
95 sgo:sdDataset articles
96 rdf:type schema:ScholarlyArticle
97 N1ccdd711c76f487f8cc08fc95d72c2d7 rdf:first sg:person.01033744466.12
98 rdf:rest rdf:nil
99 N790d6c6967d8474b9de427420f2f8726 rdf:first sg:person.0613230631.83
100 rdf:rest N1ccdd711c76f487f8cc08fc95d72c2d7
101 Na862f35ae4374625b5851f92fda13b95 schema:name doi
102 schema:value 10.1186/1751-0473-9-20
103 rdf:type schema:PropertyValue
104 Na8b05a6c4a6c4b32a5a03495b57e196b schema:volumeNumber 9
105 rdf:type schema:PublicationVolume
106 Na8c56f1b3a8348de93db73545f076ce8 schema:name Springer Nature - SN SciGraph project
107 rdf:type schema:Organization
108 Nba614c662fea43d180702cc16cc6262f schema:name dimensions_id
109 schema:value pub.1037235726
110 rdf:type schema:PropertyValue
111 Nc889b73f91944643a223dcd0ef2394b0 schema:issueNumber 1
112 rdf:type schema:PublicationIssue
113 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
114 schema:name Information and Computing Sciences
115 rdf:type schema:DefinedTerm
116 anzsrc-for:0803 schema:inDefinedTermSet anzsrc-for:
117 schema:name Computer Software
118 rdf:type schema:DefinedTerm
119 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
120 schema:name Information Systems
121 rdf:type schema:DefinedTerm
122 sg:journal.1037006 schema:issn 1751-0473
123 schema:name Source Code for Biology and Medicine
124 schema:publisher Springer Nature
125 rdf:type schema:Periodical
126 sg:person.01033744466.12 schema:affiliation grid-institutes:grid.438526.e
127 schema:familyName Puthige
128 schema:givenName Ashwin
129 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01033744466.12
130 rdf:type schema:Person
131 sg:person.0613230631.83 schema:affiliation grid-institutes:grid.438526.e
132 schema:familyName Garner
133 schema:givenName Harold
134 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0613230631.83
135 rdf:type schema:Person
136 sg:pub.10.1186/gb-2010-11-8-r86 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046347776
137 https://doi.org/10.1186/gb-2010-11-8-r86
138 rdf:type schema:CreativeWork
139 grid-institutes:grid.438526.e schema:alternateName Virginia Bioinformatics Institute, Washington Street 0477, Blacksburg 24061, VA, USA
140 schema:name Virginia Bioinformatics Institute, Washington Street 0477, Blacksburg 24061, VA, USA
141 rdf:type schema:Organization
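
Rows 97 to 100 of the table encode the author order as an RDF collection: each blank node points to one author through `rdf:first` and to the next node through `rdf:rest`, and the chain ends at `rdf:nil`. The sketch below walks that chain using plain tuples copied from the table. The function name `walk_rdf_list` and the tuple representation are assumptions for illustration; an RDF library would normally do this work.

```python
# Rows 4 and 97-100 of the table above, written as (subject, predicate, object).
TRIPLES = [
    ("sg:pub.10.1186/1751-0473-9-20", "schema:author", "N790d6c6967d8474b9de427420f2f8726"),
    ("N1ccdd711c76f487f8cc08fc95d72c2d7", "rdf:first", "sg:person.01033744466.12"),
    ("N1ccdd711c76f487f8cc08fc95d72c2d7", "rdf:rest", "rdf:nil"),
    ("N790d6c6967d8474b9de427420f2f8726", "rdf:first", "sg:person.0613230631.83"),
    ("N790d6c6967d8474b9de427420f2f8726", "rdf:rest", "N1ccdd711c76f487f8cc08fc95d72c2d7"),
]


def walk_rdf_list(head, triples):
    """Follow rdf:first / rdf:rest links from `head` until rdf:nil is reached."""
    # Index the triples by (subject, predicate) for direct lookup.
    index = {(s, p): o for s, p, o in triples}
    items, node, seen = [], head, set()
    while node != "rdf:nil":
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError("cycle in RDF list at " + node)
        seen.add(node)
        items.append(index[(node, "rdf:first")])
        node = index[(node, "rdf:rest")]
    return items


# The schema:author triple gives the head node of the list.
head = next(o for s, p, o in TRIPLES if p == "schema:author")
authors = walk_rdf_list(head, TRIPLES)
print(authors)
```

The resulting order (Garner, then Puthige) matches the author order in the JSON-LD record, which is the reason the list structure is used instead of two unordered `schema:author` triples.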
 



