BioFlow: a web based workflow management software for design and execution of genomics pipelines


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2014-09-18

AUTHORS

Harold Garner, Ashwin Puthige

ABSTRACT

Background: Bioinformatics data analysis is usually done sequentially by chaining together multiple tools. These pipelines are created by writing scripts and tracking the inputs and outputs of all stages. Writing such scripts requires programming skills. Executing multiple pipelines in parallel and keeping track of all the generated files is difficult and error prone. Checking results and task completion requires users to log in to their servers remotely and run commands to identify process status. Users would benefit from a web-based tool that allows creation and execution of pipelines remotely. The tool should also keep track of all the files generated and maintain a history of user activities.

Results: A software tool for building and executing workflows is described here. The individual tools in the workflows can be any command line executable or script. The software has an intuitive mechanism for adding new tools to be used in workflows. It contains a workflow designer where workflows can be created by visually connecting various components. Workflows are executed by job runners. The outputs and the job history are saved. The tool is web-based, and all actions can be performed remotely.

Conclusions: Users without scripting knowledge can utilize the tool to build pipelines for executing tasks. Pipelines can be modeled as workflows that are reusable. BioFlow enables users to easily add new tools to the database. The workflows can be created and executed remotely. A number of parallel jobs can be easily controlled. Distributed execution is possible by running multiple instances of the application. Any number of tasks can be executed, and the output is stored, making it easy to correlate the outputs with the jobs executed.

PAGES

20

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1751-0473-9-20

DOI

http://dx.doi.org/10.1186/1751-0473-9-20

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1037235726



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0803", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Computer Software", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information Systems", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Virginia Bioinformatics Institute, Washington Street 0477, 24061, Blacksburg, VA, USA", 
          "id": "http://www.grid.ac/institutes/grid.438526.e", 
          "name": [
            "Virginia Bioinformatics Institute, Washington Street 0477, 24061, Blacksburg, VA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Garner", 
        "givenName": "Harold", 
        "id": "sg:person.0613230631.83", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0613230631.83"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Virginia Bioinformatics Institute, Washington Street 0477, 24061, Blacksburg, VA, USA", 
          "id": "http://www.grid.ac/institutes/grid.438526.e", 
          "name": [
            "Virginia Bioinformatics Institute, Washington Street 0477, 24061, Blacksburg, VA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Puthige", 
        "givenName": "Ashwin", 
        "id": "sg:person.01033744466.12", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01033744466.12"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1186/gb-2010-11-8-r86", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046347776", 
          "https://doi.org/10.1186/gb-2010-11-8-r86"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2014-09-18", 
    "datePublishedReg": "2014-09-18", 
    "description": "BackgroundBioinformatics data analysis is usually done sequentially by chaining together multiple tools. These are created by writing scripts and tracking the inputs and outputs of all stages. Writing such scripts require programming skills. Executing multiple pipelines in parallel and keeping track of all the generated files is difficult and error prone. Checking results and task completion requires users to remotely login to their servers and run commands to identify process status. Users would benefit from a web-based tool that allows creation and execution of pipelines remotely. The tool should also keep track of all the files generated and maintain a history of user activities.ResultsA software tool for building and executing workflows is described here. The individual tools in the workflows can be any command line executable or script. The software has an intuitive mechanism for adding new tools to be used in workflows. It contains a workflow designer where workflows can be creating by visually connecting various components. Workflows are executed by job runners. The outputs and the job history are saved. The tool is web based software tool and all actions can be performed remotely.ConclusionsUsers without scripting knowledge can utilize the tool to build pipelines for executing tasks. Pipelines can be modeled as workflows that are reusable. BioFlow enables users to easily add new tools to the database. The workflows can be created and executed remotely. A number of parallel jobs can be easily controlled. Distributed execution is possible by running multiple instances of the application. Any number of tasks can be executed and the output will be stored making it is easy to correlate the outputs to the jobs executed.", 
    "genre": "article", 
    "id": "sg:pub.10.1186/1751-0473-9-20", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1037006", 
        "issn": [
          "1751-0473"
        ], 
        "name": "Source Code for Biology and Medicine", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "9"
      }
    ], 
    "keywords": [
      "software tools", 
      "workflow management software", 
      "number of tasks", 
      "web-based tool", 
      "workflow designer", 
      "user activity", 
      "parallel jobs", 
      "management software", 
      "error prone", 
      "programming skills", 
      "command line", 
      "multiple instances", 
      "multiple pipelines", 
      "such scripts", 
      "intuitive mechanism", 
      "individual tools", 
      "workflow", 
      "users", 
      "task completion", 
      "execution", 
      "multiple tools", 
      "process status", 
      "genomics pipeline", 
      "new tool", 
      "software", 
      "files", 
      "pipeline", 
      "Web", 
      "task", 
      "scripts", 
      "data analysis", 
      "tool", 
      "server", 
      "prone", 
      "command", 
      "designers", 
      "jobs", 
      "BioFlow", 
      "output", 
      "track", 
      "database", 
      "instances", 
      "input", 
      "applications", 
      "creation", 
      "design", 
      "number", 
      "knowledge", 
      "parallel", 
      "components", 
      "results", 
      "job history", 
      "skills", 
      "completion", 
      "analysis", 
      "stage", 
      "action", 
      "mechanism", 
      "lines", 
      "activity", 
      "status", 
      "history", 
      "runners"
    ], 
    "name": "BioFlow: a web based workflow management software for design and execution of genomics pipelines", 
    "pagination": "20", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1037235726"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1751-0473-9-20"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1751-0473-9-20", 
      "https://app.dimensions.ai/details/publication/pub.1037235726"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-05-20T07:29", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/article/article_620.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1186/1751-0473-9-20"
  }
]
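
As a minimal sketch of reading the record above, the following Python parses a trimmed copy of the JSON-LD and pulls out the title, author names, and DOI. The `record_json` excerpt and the helper name `summarize` are illustrative assumptions; only field names that appear in the record are used.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
record_json = """
[
  {
    "author": [
      {"familyName": "Garner", "givenName": "Harold", "type": "Person"},
      {"familyName": "Puthige", "givenName": "Ashwin", "type": "Person"}
    ],
    "datePublished": "2014-09-18",
    "name": "BioFlow: a web based workflow management software for design and execution of genomics pipelines",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1037235726"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1751-0473-9-20"]}
    ]
  }
]
"""


def summarize(record):
    """Return title, author names, and DOI from one article record (hypothetical helper)."""
    authors = [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]
    # productId is a list of name/value pairs; pick the entry named "doi".
    doi = next(
        (p["value"][0] for p in record.get("productId", []) if p.get("name") == "doi"),
        None,
    )
    return {"title": record.get("name"), "authors": authors, "doi": doi}


# The top-level JSON value is a list holding a single record.
summary = summarize(json.loads(record_json)[0])
print(summary)
```

Looking up the DOI by its `name` rather than by list position keeps the sketch independent of the order of the `productId` entries.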
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1751-0473-9-20'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1751-0473-9-20'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1751-0473-9-20'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1751-0473-9-20'
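
The four curl commands above differ only in the `Accept` header they send. A minimal Python sketch of the same content negotiation, using only the standard library, might look like the following. The names `ACCEPT_TYPES` and `build_request` are hypothetical, and the request is only constructed here, not sent; passing it to `urllib.request.urlopen` would perform the fetch.

```python
from urllib.request import Request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/1751-0473-9-20"

# Media types taken from the curl commands above, keyed by a short format name.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a request asking for the given RDF serialization."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


req = build_request("turtle")
print(req.full_url, req.get_header("Accept"))
```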


 

This table displays all metadata directly associated with this object as RDF triples.

135 TRIPLES      22 PREDICATES      90 URIs      80 LITERALS      6 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1751-0473-9-20 schema:about anzsrc-for:08
2 anzsrc-for:0803
3 anzsrc-for:0806
4 schema:author N6f8172f8a9944bc2a97ddf326d07339a
5 schema:citation sg:pub.10.1186/gb-2010-11-8-r86
6 schema:datePublished 2014-09-18
7 schema:datePublishedReg 2014-09-18
8 schema:description BackgroundBioinformatics data analysis is usually done sequentially by chaining together multiple tools. These are created by writing scripts and tracking the inputs and outputs of all stages. Writing such scripts require programming skills. Executing multiple pipelines in parallel and keeping track of all the generated files is difficult and error prone. Checking results and task completion requires users to remotely login to their servers and run commands to identify process status. Users would benefit from a web-based tool that allows creation and execution of pipelines remotely. The tool should also keep track of all the files generated and maintain a history of user activities.ResultsA software tool for building and executing workflows is described here. The individual tools in the workflows can be any command line executable or script. The software has an intuitive mechanism for adding new tools to be used in workflows. It contains a workflow designer where workflows can be creating by visually connecting various components. Workflows are executed by job runners. The outputs and the job history are saved. The tool is web based software tool and all actions can be performed remotely.ConclusionsUsers without scripting knowledge can utilize the tool to build pipelines for executing tasks. Pipelines can be modeled as workflows that are reusable. BioFlow enables users to easily add new tools to the database. The workflows can be created and executed remotely. A number of parallel jobs can be easily controlled. Distributed execution is possible by running multiple instances of the application. Any number of tasks can be executed and the output will be stored making it is easy to correlate the outputs to the jobs executed.
9 schema:genre article
10 schema:inLanguage en
11 schema:isAccessibleForFree true
12 schema:isPartOf N298d53f8c6f7449eaf530f6d51858be5
13 N988130625f944475aa687b5faf650162
14 sg:journal.1037006
15 schema:keywords BioFlow
16 Web
17 action
18 activity
19 analysis
20 applications
21 command
22 command line
23 completion
24 components
25 creation
26 data analysis
27 database
28 design
29 designers
30 error prone
31 execution
32 files
33 genomics pipeline
34 history
35 individual tools
36 input
37 instances
38 intuitive mechanism
39 job history
40 jobs
41 knowledge
42 lines
43 management software
44 mechanism
45 multiple instances
46 multiple pipelines
47 multiple tools
48 new tool
49 number
50 number of tasks
51 output
52 parallel
53 parallel jobs
54 pipeline
55 process status
56 programming skills
57 prone
58 results
59 runners
60 scripts
61 server
62 skills
63 software
64 software tools
65 stage
66 status
67 such scripts
68 task
69 task completion
70 tool
71 track
72 user activity
73 users
74 web-based tool
75 workflow
76 workflow designer
77 workflow management software
78 schema:name BioFlow: a web based workflow management software for design and execution of genomics pipelines
79 schema:pagination 20
80 schema:productId N1ea317d647b84f8581b93c3f6343f194
81 Na1ed4f5c05814db7a00f732678a35410
82 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037235726
83 https://doi.org/10.1186/1751-0473-9-20
84 schema:sdDatePublished 2022-05-20T07:29
85 schema:sdLicense https://scigraph.springernature.com/explorer/license/
86 schema:sdPublisher N1aa06a8e7fde443c9e3ed0056779839d
87 schema:url https://doi.org/10.1186/1751-0473-9-20
88 sgo:license sg:explorer/license/
89 sgo:sdDataset articles
90 rdf:type schema:ScholarlyArticle
91 N1aa06a8e7fde443c9e3ed0056779839d schema:name Springer Nature - SN SciGraph project
92 rdf:type schema:Organization
93 N1ea317d647b84f8581b93c3f6343f194 schema:name doi
94 schema:value 10.1186/1751-0473-9-20
95 rdf:type schema:PropertyValue
96 N298d53f8c6f7449eaf530f6d51858be5 schema:issueNumber 1
97 rdf:type schema:PublicationIssue
98 N6f8172f8a9944bc2a97ddf326d07339a rdf:first sg:person.0613230631.83
99 rdf:rest N994f3870d4ea4b7fa00c9974ba023090
100 N988130625f944475aa687b5faf650162 schema:volumeNumber 9
101 rdf:type schema:PublicationVolume
102 N994f3870d4ea4b7fa00c9974ba023090 rdf:first sg:person.01033744466.12
103 rdf:rest rdf:nil
104 Na1ed4f5c05814db7a00f732678a35410 schema:name dimensions_id
105 schema:value pub.1037235726
106 rdf:type schema:PropertyValue
107 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
108 schema:name Information and Computing Sciences
109 rdf:type schema:DefinedTerm
110 anzsrc-for:0803 schema:inDefinedTermSet anzsrc-for:
111 schema:name Computer Software
112 rdf:type schema:DefinedTerm
113 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
114 schema:name Information Systems
115 rdf:type schema:DefinedTerm
116 sg:journal.1037006 schema:issn 1751-0473
117 schema:name Source Code for Biology and Medicine
118 schema:publisher Springer Nature
119 rdf:type schema:Periodical
120 sg:person.01033744466.12 schema:affiliation grid-institutes:grid.438526.e
121 schema:familyName Puthige
122 schema:givenName Ashwin
123 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01033744466.12
124 rdf:type schema:Person
125 sg:person.0613230631.83 schema:affiliation grid-institutes:grid.438526.e
126 schema:familyName Garner
127 schema:givenName Harold
128 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0613230631.83
129 rdf:type schema:Person
130 sg:pub.10.1186/gb-2010-11-8-r86 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046347776
131 https://doi.org/10.1186/gb-2010-11-8-r86
132 rdf:type schema:CreativeWork
133 grid-institutes:grid.438526.e schema:alternateName Virginia Bioinformatics Institute, Washington Street 0477, 24061, Blacksburg, VA, USA
134 schema:name Virginia Bioinformatics Institute, Washington Street 0477, 24061, Blacksburg, VA, USA
135 rdf:type schema:Organization
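
In the table above, the author order is encoded as an RDF collection: rows 98 to 103 chain blank nodes through `rdf:first` and `rdf:rest` until `rdf:nil`. The following Python sketch walks that chain to recover the ordered author identifiers. The tuple representation and the function name `walk_rdf_list` are assumptions made for illustration, not part of SciGraph.

```python
# Rows 98-103 of the table, written as (subject, predicate, object) tuples.
triples = [
    ("N6f8172f8a9944bc2a97ddf326d07339a", "rdf:first", "sg:person.0613230631.83"),
    ("N6f8172f8a9944bc2a97ddf326d07339a", "rdf:rest", "N994f3870d4ea4b7fa00c9974ba023090"),
    ("N994f3870d4ea4b7fa00c9974ba023090", "rdf:first", "sg:person.01033744466.12"),
    ("N994f3870d4ea4b7fa00c9974ba023090", "rdf:rest", "rdf:nil"),
]


def walk_rdf_list(head, triples):
    """Follow rdf:first / rdf:rest links from `head` until rdf:nil is reached."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items = []
    node = head
    while node != "rdf:nil":
        items.append(first[node])  # the list element held by this node
        node = rest[node]          # the next node in the chain
    return items


# The head node is the object of schema:author in row 4 of the table.
authors = walk_rdf_list("N6f8172f8a9944bc2a97ddf326d07339a", triples)
print(authors)
```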
 



