StarFlow: A Script-Centric Data Analysis Environment


Ontology type: schema:Chapter
Open Access: True


Chapter Info

DATE

2010-11-30

AUTHORS

Elaine Angelino, Daniel Yamins, Margo Seltzer

ABSTRACT

We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions enabling robust parallel executions of complex analysis pipelines, and (4) a seamless interface with the Python scripting language. We describe real applications of StarFlow, including automatic parallelization of complex workflows in the cloud.

PAGES

236-250

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27

DOI

http://dx.doi.org/10.1007/978-3-642-17819-1_27

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1021418233



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information Systems", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.38142.3c", 
          "name": [
            "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Angelino", 
        "givenName": "Elaine", 
        "id": "sg:person.013503020776.43", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013503020776.43"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.38142.3c", 
          "name": [
            "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yamins", 
        "givenName": "Daniel", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.38142.3c", 
          "name": [
            "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Seltzer", 
        "givenName": "Margo", 
        "id": "sg:person.01344037330.14", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01344037330.14"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2010-11-30", 
    "datePublishedReg": "2010-11-30", 
    "description": "We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions enabling robust parallel executions of complex analysis pipelines, and (4) a seamless interface with the Python scripting language. We describe real applications of StarFlow, including automatic parallelization of complex workflows in the cloud.", 
    "editor": [
      {
        "familyName": "McGuinness", 
        "givenName": "Deborah L.", 
        "type": "Person"
      }, 
      {
        "familyName": "Michaelis", 
        "givenName": "James R.", 
        "type": "Person"
      }, 
      {
        "familyName": "Moreau", 
        "givenName": "Luc", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-642-17819-1_27", 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-642-17818-4", 
        "978-3-642-17819-1"
      ], 
      "name": "Provenance and Annotation of Data and Processes", 
      "type": "Book"
    }, 
    "keywords": [
      "data flow dependencies", 
      "data analysis environment", 
      "command-line tool", 
      "Python scripting language", 
      "complex analysis pipelines", 
      "workflow abstractions", 
      "user annotations", 
      "automatic parallelization", 
      "scripting language", 
      "complex workflows", 
      "parallel execution", 
      "analysis environment", 
      "dependency network", 
      "runtime analysis", 
      "real applications", 
      "seamless interface", 
      "analysis pipeline", 
      "static analysis", 
      "data analysis", 
      "StarFlow", 
      "parallelization", 
      "main features", 
      "novel combination", 
      "execution", 
      "workflow", 
      "environment", 
      "annotation", 
      "cloud", 
      "abstraction", 
      "network", 
      "language", 
      "pipeline", 
      "interface", 
      "tool", 
      "extraction", 
      "applications", 
      "dependency", 
      "features", 
      "support", 
      "analysis", 
      "control", 
      "combination", 
      "changes"
    ], 
    "name": "StarFlow: A Script-Centric Data Analysis Environment", 
    "pagination": "236-250", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1021418233"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-642-17819-1_27"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-642-17819-1_27", 
      "https://app.dimensions.ai/details/publication/pub.1021418233"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-08-04T17:15", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220804/entities/gbq_results/chapter/chapter_172.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-642-17819-1_27"
  }
]
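
As a minimal sketch of how the record above might be consumed, the following Python snippet parses a trimmed excerpt of the JSON-LD and pulls out the title, authors, DOI and page range. The excerpt keeps only the fields it uses, and the names RECORD_JSON and summarize are illustrative assumptions, not part of SciGraph.

```python
import json

# Trimmed excerpt of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Angelino", "givenName": "Elaine", "type": "Person"},
      {"familyName": "Yamins", "givenName": "Daniel", "type": "Person"},
      {"familyName": "Seltzer", "givenName": "Margo", "type": "Person"}
    ],
    "name": "StarFlow: A Script-Centric Data Analysis Environment",
    "pagination": "236-250",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1021418233"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-642-17819-1_27"]}
    ],
    "type": "Chapter"
  }
]
"""


def summarize(record_json):
    """Return title, author names, DOI and pages from a SciGraph-style record."""
    record = json.loads(record_json)[0]  # the payload is a one-element list
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # productId is a list of name/value pairs; index it by name for lookup.
    ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {
        "title": record["name"],
        "authors": authors,
        "doi": ids.get("doi"),
        "pages": record["pagination"],
    }


summary = summarize(RECORD_JSON)
print(summary["authors"])
print(summary["doi"])
```

Because JSON-LD is plain JSON, the standard json module is enough here; a dedicated JSON-LD processor would only be needed to expand the "@context" into full URIs.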
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular linked data format that is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27'
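
The four curl commands above differ only in their Accept header, so the same content negotiation can be sketched in Python with the standard library. The media types and URL are copied from the commands; the names FORMATS, build_request and fetch are illustrative assumptions, and the sketch does not assume the service still answers at this address.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27"

# Media types taken from the curl commands above; the short keys are our own labels.
FORMATS = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for one serialization."""
    return urllib.request.Request(url, headers={"Accept": FORMATS[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Inspect the request without touching the network.
req = build_request("turtle")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Keeping request construction separate from the network call makes the header logic easy to check offline; calling fetch("json-ld") would perform the equivalent of the first curl command.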


 

This table displays all metadata directly associated with this object as RDF triples.

129 TRIPLES      22 PREDICATES      68 URIs      60 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-642-17819-1_27 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 anzsrc-for:0806
4 schema:author Nf1b25bc3df2240c5ae55ce38b3b833fa
5 schema:datePublished 2010-11-30
6 schema:datePublishedReg 2010-11-30
7 schema:description We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions enabling robust parallel executions of complex analysis pipelines, and (4) a seamless interface with the Python scripting language. We describe real applications of StarFlow, including automatic parallelization of complex workflows in the cloud.
8 schema:editor N7b5e19ceca6747f19b9cc5301f49b3fb
9 schema:genre chapter
10 schema:isAccessibleForFree true
11 schema:isPartOf N126796ff60e84871aaf6cc31c6aee19e
12 schema:keywords Python scripting language
13 StarFlow
14 abstraction
15 analysis
16 analysis environment
17 analysis pipeline
18 annotation
19 applications
20 automatic parallelization
21 changes
22 cloud
23 combination
24 command-line tool
25 complex analysis pipelines
26 complex workflows
27 control
28 data analysis
29 data analysis environment
30 data flow dependencies
31 dependency
32 dependency network
33 environment
34 execution
35 extraction
36 features
37 interface
38 language
39 main features
40 network
41 novel combination
42 parallel execution
43 parallelization
44 pipeline
45 real applications
46 runtime analysis
47 scripting language
48 seamless interface
49 static analysis
50 support
51 tool
52 user annotations
53 workflow
54 workflow abstractions
55 schema:name StarFlow: A Script-Centric Data Analysis Environment
56 schema:pagination 236-250
57 schema:productId N53c195513c5a4e1ea54299aa64c280ea
58 Nc6ab7aa0579840319f42e3334e29baca
59 schema:publisher N2718eca6d92946e789a2c9eb522776ba
60 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021418233
61 https://doi.org/10.1007/978-3-642-17819-1_27
62 schema:sdDatePublished 2022-08-04T17:15
63 schema:sdLicense https://scigraph.springernature.com/explorer/license/
64 schema:sdPublisher N97cf70f11e1e4a3ebb074ff008b82649
65 schema:url https://doi.org/10.1007/978-3-642-17819-1_27
66 sgo:license sg:explorer/license/
67 sgo:sdDataset chapters
68 rdf:type schema:Chapter
69 N126796ff60e84871aaf6cc31c6aee19e schema:isbn 978-3-642-17818-4
70 978-3-642-17819-1
71 schema:name Provenance and Annotation of Data and Processes
72 rdf:type schema:Book
73 N2718eca6d92946e789a2c9eb522776ba schema:name Springer Nature
74 rdf:type schema:Organisation
75 N2859ca728f9c4340b056cf0161afee82 rdf:first N5d81b6bce94145c9a0a76e75996dc579
76 rdf:rest Nb8c264fb646d485a9ddbee5f5ee4a313
77 N32dbbc8a373f4f8dac16c644c4955861 schema:affiliation grid-institutes:grid.38142.3c
78 schema:familyName Yamins
79 schema:givenName Daniel
80 rdf:type schema:Person
81 N42c3ffb83bc64a17abd8378023ec27ac rdf:first N32dbbc8a373f4f8dac16c644c4955861
82 rdf:rest Nf945fdd86f0f4105941d1f903035671b
83 N443c131dcc174ab6ba6f12d4a3ca6b08 schema:familyName Moreau
84 schema:givenName Luc
85 rdf:type schema:Person
86 N53c195513c5a4e1ea54299aa64c280ea schema:name doi
87 schema:value 10.1007/978-3-642-17819-1_27
88 rdf:type schema:PropertyValue
89 N5d81b6bce94145c9a0a76e75996dc579 schema:familyName Michaelis
90 schema:givenName James R.
91 rdf:type schema:Person
92 N670d4e987ca24ad2829dad4d0d41f700 schema:familyName McGuinness
93 schema:givenName Deborah L.
94 rdf:type schema:Person
95 N7b5e19ceca6747f19b9cc5301f49b3fb rdf:first N670d4e987ca24ad2829dad4d0d41f700
96 rdf:rest N2859ca728f9c4340b056cf0161afee82
97 N97cf70f11e1e4a3ebb074ff008b82649 schema:name Springer Nature - SN SciGraph project
98 rdf:type schema:Organization
99 Nb8c264fb646d485a9ddbee5f5ee4a313 rdf:first N443c131dcc174ab6ba6f12d4a3ca6b08
100 rdf:rest rdf:nil
101 Nc6ab7aa0579840319f42e3334e29baca schema:name dimensions_id
102 schema:value pub.1021418233
103 rdf:type schema:PropertyValue
104 Nf1b25bc3df2240c5ae55ce38b3b833fa rdf:first sg:person.013503020776.43
105 rdf:rest N42c3ffb83bc64a17abd8378023ec27ac
106 Nf945fdd86f0f4105941d1f903035671b rdf:first sg:person.01344037330.14
107 rdf:rest rdf:nil
108 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
109 schema:name Information and Computing Sciences
110 rdf:type schema:DefinedTerm
111 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
112 schema:name Artificial Intelligence and Image Processing
113 rdf:type schema:DefinedTerm
114 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
115 schema:name Information Systems
116 rdf:type schema:DefinedTerm
117 sg:person.01344037330.14 schema:affiliation grid-institutes:grid.38142.3c
118 schema:familyName Seltzer
119 schema:givenName Margo
120 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01344037330.14
121 rdf:type schema:Person
122 sg:person.013503020776.43 schema:affiliation grid-institutes:grid.38142.3c
123 schema:familyName Angelino
124 schema:givenName Elaine
125 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013503020776.43
126 rdf:type schema:Person
127 grid-institutes:grid.38142.3c schema:alternateName School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA
128 schema:name School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA
129 rdf:type schema:Organization
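
In the table above, the ordered author list is stored as an RDF collection: schema:author points at a blank node, and each node carries an rdf:first item and an rdf:rest link until rdf:nil is reached. A minimal Python sketch of walking that chain follows, using rows copied from the table; the helper names index and walk_list are illustrative assumptions.

```python
# (subject, predicate, object) rows copied from the table above.
TRIPLES = [
    ("sg:pub.10.1007/978-3-642-17819-1_27", "schema:author", "Nf1b25bc3df2240c5ae55ce38b3b833fa"),
    ("Nf1b25bc3df2240c5ae55ce38b3b833fa", "rdf:first", "sg:person.013503020776.43"),
    ("Nf1b25bc3df2240c5ae55ce38b3b833fa", "rdf:rest", "N42c3ffb83bc64a17abd8378023ec27ac"),
    ("N42c3ffb83bc64a17abd8378023ec27ac", "rdf:first", "N32dbbc8a373f4f8dac16c644c4955861"),
    ("N42c3ffb83bc64a17abd8378023ec27ac", "rdf:rest", "Nf945fdd86f0f4105941d1f903035671b"),
    ("Nf945fdd86f0f4105941d1f903035671b", "rdf:first", "sg:person.01344037330.14"),
    ("Nf945fdd86f0f4105941d1f903035671b", "rdf:rest", "rdf:nil"),
    ("sg:person.013503020776.43", "schema:givenName", "Elaine"),
    ("sg:person.013503020776.43", "schema:familyName", "Angelino"),
    ("N32dbbc8a373f4f8dac16c644c4955861", "schema:givenName", "Daniel"),
    ("N32dbbc8a373f4f8dac16c644c4955861", "schema:familyName", "Yamins"),
    ("sg:person.01344037330.14", "schema:givenName", "Margo"),
    ("sg:person.01344037330.14", "schema:familyName", "Seltzer"),
]


def index(triples):
    """Map (subject, predicate) to object; each pair is unique in this excerpt."""
    return {(s, p): o for s, p, o in triples}


def walk_list(head, lookup):
    """Follow rdf:first / rdf:rest links from head until rdf:nil, keeping order."""
    items = []
    node = head
    while node != "rdf:nil":
        items.append(lookup[(node, "rdf:first")])
        node = lookup[(node, "rdf:rest")]
    return items


lookup = index(TRIPLES)
head = lookup[("sg:pub.10.1007/978-3-642-17819-1_27", "schema:author")]
authors = [
    f"{lookup[(person, 'schema:givenName')]} {lookup[(person, 'schema:familyName')]}"
    for person in walk_list(head, lookup)
]
print(authors)
```

The same traversal applies to the schema:editor list, whose head node also appears in the table. A full RDF library would handle collections automatically; the dictionary lookup is used here only to make the linked-list structure explicit.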
 



