StarFlow: A Script-Centric Data Analysis Environment


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2010-11-30

AUTHORS

Elaine Angelino , Daniel Yamins , Margo Seltzer

ABSTRACT

We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions enabling robust parallel executions of complex analysis pipelines, and (4) a seamless interface with the Python scripting language. We describe real applications of StarFlow, including automatic parallelization of complex workflows in the cloud.

PAGES

236-250

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27

DOI

http://dx.doi.org/10.1007/978-3-642-17819-1_27

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1021418233



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information Systems", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.38142.3c", 
          "name": [
            "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Angelino", 
        "givenName": "Elaine", 
        "id": "sg:person.013503020776.43", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013503020776.43"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.38142.3c", 
          "name": [
            "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yamins", 
        "givenName": "Daniel", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.38142.3c", 
          "name": [
            "School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Seltzer", 
        "givenName": "Margo", 
        "id": "sg:person.01344037330.14", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01344037330.14"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2010-11-30", 
    "datePublishedReg": "2010-11-30", 
    "description": "We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions enabling robust parallel executions of complex analysis pipelines, and (4) a seamless interface with the Python scripting language. We describe real applications of StarFlow, including automatic parallelization of complex workflows in the cloud.", 
    "editor": [
      {
        "familyName": "McGuinness", 
        "givenName": "Deborah L.", 
        "type": "Person"
      }, 
      {
        "familyName": "Michaelis", 
        "givenName": "James R.", 
        "type": "Person"
      }, 
      {
        "familyName": "Moreau", 
        "givenName": "Luc", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-642-17819-1_27", 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-642-17818-4", 
        "978-3-642-17819-1"
      ], 
      "name": "Provenance and Annotation of Data and Processes", 
      "type": "Book"
    }, 
    "keywords": [
      "data flow dependencies", 
      "data analysis environment", 
      "command-line tool", 
      "Python scripting language", 
      "complex analysis pipelines", 
      "workflow abstractions", 
      "user annotations", 
      "automatic parallelization", 
      "scripting language", 
      "complex workflows", 
      "parallel execution", 
      "analysis environment", 
      "dependency network", 
      "runtime analysis", 
      "real applications", 
      "seamless interface", 
      "analysis pipeline", 
      "static analysis", 
      "data analysis", 
      "StarFlow", 
      "parallelization", 
      "main features", 
      "novel combination", 
      "execution", 
      "workflow", 
      "environment", 
      "annotation", 
      "cloud", 
      "abstraction", 
      "network", 
      "language", 
      "pipeline", 
      "interface", 
      "tool", 
      "extraction", 
      "applications", 
      "dependency", 
      "features", 
      "support", 
      "analysis", 
      "control", 
      "combination", 
      "changes"
    ], 
    "name": "StarFlow: A Script-Centric Data Analysis Environment", 
    "pagination": "236-250", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1021418233"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-642-17819-1_27"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-642-17819-1_27", 
      "https://app.dimensions.ai/details/publication/pub.1021418233"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-08-04T17:15", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220804/entities/gbq_results/chapter/chapter_172.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-642-17819-1_27"
  }
]
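Because JSON-LD is ordinary JSON, the record above can be read with any JSON parser. The following is a minimal sketch in Python, assuming a trimmed copy of the record that keeps only the fields it uses; the names `RECORD_JSON` and `summarize` are illustrative and not part of SciGraph.

```python
import json

# Trimmed copy of the record shown above; only the fields read below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Angelino", "givenName": "Elaine", "type": "Person"},
      {"familyName": "Yamins", "givenName": "Daniel", "type": "Person"},
      {"familyName": "Seltzer", "givenName": "Margo", "type": "Person"}
    ],
    "name": "StarFlow: A Script-Centric Data Analysis Environment",
    "pagination": "236-250",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1021418233"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-642-17819-1_27"]}
    ],
    "type": "Chapter"
  }
]
"""


def summarize(record):
    """Pull title, author names, DOI, and page range out of one record dict."""
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # Each productId entry holds a one-element list of values keyed by its name.
    identifiers = {p["name"]: p["value"][0] for p in record["productId"]}
    return {
        "title": record["name"],
        "authors": authors,
        "doi": identifiers.get("doi"),
        "pages": record["pagination"],
    }


# The top-level JSON value is a list containing a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary)
```

The sketch treats the document as plain JSON and does not expand the `@context`; a full JSON-LD processor would be needed to resolve property names to their IRIs.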
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27'
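The four curl commands above differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. This is an illustration under assumptions: the helper names `build_request` and `fetch` and the short format keys are made up here, and whether the endpoint still responds has not been checked. The block only builds a request; `fetch` performs the network call when invoked.

```python
import urllib.request

# Record URL and media types copied from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-642-17819-1_27"

ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Return a Request whose Accept header selects the given serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Download the record in the requested format (requires network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Inspect the request without sending it.
req = build_request("turtle")
print(req.full_url, req.get_header("Accept"))
```

Calling `fetch("nt")` would then correspond to the N-Triples curl command, assuming the server honors the header as described above.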


 

This table displays all metadata directly associated with this object as RDF triples.

129 TRIPLES      22 PREDICATES      68 URIs      60 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-642-17819-1_27 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 anzsrc-for:0806
4 schema:author N05a81b995ed14ca1b8cc0848d0004cd0
5 schema:datePublished 2010-11-30
6 schema:datePublishedReg 2010-11-30
7 schema:description We introduce StarFlow, a script-centric environment for data analysis. StarFlow has four main features: (1) extraction of control and data-flow dependencies through a novel combination of static analysis, dynamic runtime analysis, and user annotations, (2) command-line tools for exploring and propagating changes through the resulting dependency network, (3) support for workflow abstractions enabling robust parallel executions of complex analysis pipelines, and (4) a seamless interface with the Python scripting language. We describe real applications of StarFlow, including automatic parallelization of complex workflows in the cloud.
8 schema:editor Nff9717c0379f4fcc994ba97b40c94540
9 schema:genre chapter
10 schema:isAccessibleForFree true
11 schema:isPartOf Nc8a2f6db077f4893977bf14cfac6671b
12 schema:keywords Python scripting language
13 StarFlow
14 abstraction
15 analysis
16 analysis environment
17 analysis pipeline
18 annotation
19 applications
20 automatic parallelization
21 changes
22 cloud
23 combination
24 command-line tool
25 complex analysis pipelines
26 complex workflows
27 control
28 data analysis
29 data analysis environment
30 data flow dependencies
31 dependency
32 dependency network
33 environment
34 execution
35 extraction
36 features
37 interface
38 language
39 main features
40 network
41 novel combination
42 parallel execution
43 parallelization
44 pipeline
45 real applications
46 runtime analysis
47 scripting language
48 seamless interface
49 static analysis
50 support
51 tool
52 user annotations
53 workflow
54 workflow abstractions
55 schema:name StarFlow: A Script-Centric Data Analysis Environment
56 schema:pagination 236-250
57 schema:productId N45fd7fd3f7f04bf0a615523fcf234245
58 Nda8d032777304f248bf7bb0e1865d746
59 schema:publisher Nc71bbac520bb4ace81224fdab64a6cda
60 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021418233
61 https://doi.org/10.1007/978-3-642-17819-1_27
62 schema:sdDatePublished 2022-08-04T17:15
63 schema:sdLicense https://scigraph.springernature.com/explorer/license/
64 schema:sdPublisher N3ba7164d6b88476e8ecca15b67144397
65 schema:url https://doi.org/10.1007/978-3-642-17819-1_27
66 sgo:license sg:explorer/license/
67 sgo:sdDataset chapters
68 rdf:type schema:Chapter
69 N05a81b995ed14ca1b8cc0848d0004cd0 rdf:first sg:person.013503020776.43
70 rdf:rest Ne0c4fc37821f42a8a5cdc325bc820d63
71 N0d21e5b11a664de3a31d04b7ed5c9299 schema:familyName Moreau
72 schema:givenName Luc
73 rdf:type schema:Person
74 N3ba7164d6b88476e8ecca15b67144397 schema:name Springer Nature - SN SciGraph project
75 rdf:type schema:Organization
76 N451b1601342344a984085a0161c4f137 schema:familyName McGuinness
77 schema:givenName Deborah L.
78 rdf:type schema:Person
79 N45fd7fd3f7f04bf0a615523fcf234245 schema:name dimensions_id
80 schema:value pub.1021418233
81 rdf:type schema:PropertyValue
82 N4842ab480e354142b0ea8a2a3eb63c7e schema:affiliation grid-institutes:grid.38142.3c
83 schema:familyName Yamins
84 schema:givenName Daniel
85 rdf:type schema:Person
86 N68385ea42ff24b7488ef521e9be575c9 schema:familyName Michaelis
87 schema:givenName James R.
88 rdf:type schema:Person
89 N824e9c845e894f459081e861c7b5888d rdf:first N0d21e5b11a664de3a31d04b7ed5c9299
90 rdf:rest rdf:nil
91 Nc71bbac520bb4ace81224fdab64a6cda schema:name Springer Nature
92 rdf:type schema:Organisation
93 Nc8a2f6db077f4893977bf14cfac6671b schema:isbn 978-3-642-17818-4
94 978-3-642-17819-1
95 schema:name Provenance and Annotation of Data and Processes
96 rdf:type schema:Book
97 Nd40152f05d454c4686bbd4b982804b40 rdf:first N68385ea42ff24b7488ef521e9be575c9
98 rdf:rest N824e9c845e894f459081e861c7b5888d
99 Nda8d032777304f248bf7bb0e1865d746 schema:name doi
100 schema:value 10.1007/978-3-642-17819-1_27
101 rdf:type schema:PropertyValue
102 Ne0c4fc37821f42a8a5cdc325bc820d63 rdf:first N4842ab480e354142b0ea8a2a3eb63c7e
103 rdf:rest Nea5b7644f68c42c4b17d3de5be116352
104 Nea5b7644f68c42c4b17d3de5be116352 rdf:first sg:person.01344037330.14
105 rdf:rest rdf:nil
106 Nff9717c0379f4fcc994ba97b40c94540 rdf:first N451b1601342344a984085a0161c4f137
107 rdf:rest Nd40152f05d454c4686bbd4b982804b40
108 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
109 schema:name Information and Computing Sciences
110 rdf:type schema:DefinedTerm
111 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
112 schema:name Artificial Intelligence and Image Processing
113 rdf:type schema:DefinedTerm
114 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
115 schema:name Information Systems
116 rdf:type schema:DefinedTerm
117 sg:person.01344037330.14 schema:affiliation grid-institutes:grid.38142.3c
118 schema:familyName Seltzer
119 schema:givenName Margo
120 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01344037330.14
121 rdf:type schema:Person
122 sg:person.013503020776.43 schema:affiliation grid-institutes:grid.38142.3c
123 schema:familyName Angelino
124 schema:givenName Elaine
125 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013503020776.43
126 rdf:type schema:Person
127 grid-institutes:grid.38142.3c schema:alternateName School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA
128 schema:name School of Engineering and Applied Sciences, Harvard University, 33 Oxford St., 02138, Cambridge, MA, USA
129 rdf:type schema:Organization
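The author and editor lists in the table are stored as RDF collections: each list node points to its item with rdf:first and to the next node with rdf:rest, and the chain ends at rdf:nil. The sketch below recovers the author order from rows 69-70 and 102-105, with names taken from rows 83-84, 118-119, and 123-124. The dictionaries are hand-copied from the table, and the names `FIRST`, `REST`, `NAMES`, and `walk_list` are illustrative rather than part of any SciGraph tooling.

```python
# rdf:first and rdf:rest links for the schema:author list (rows 69-70, 102-105).
FIRST = {
    "N05a81b995ed14ca1b8cc0848d0004cd0": "sg:person.013503020776.43",
    "Ne0c4fc37821f42a8a5cdc325bc820d63": "N4842ab480e354142b0ea8a2a3eb63c7e",
    "Nea5b7644f68c42c4b17d3de5be116352": "sg:person.01344037330.14",
}
REST = {
    "N05a81b995ed14ca1b8cc0848d0004cd0": "Ne0c4fc37821f42a8a5cdc325bc820d63",
    "Ne0c4fc37821f42a8a5cdc325bc820d63": "Nea5b7644f68c42c4b17d3de5be116352",
    "Nea5b7644f68c42c4b17d3de5be116352": "rdf:nil",
}

# Given and family names of each person node, as listed in the table.
NAMES = {
    "sg:person.013503020776.43": "Elaine Angelino",
    "N4842ab480e354142b0ea8a2a3eb63c7e": "Daniel Yamins",
    "sg:person.01344037330.14": "Margo Seltzer",
}


def walk_list(head):
    """Follow rdf:rest links from head, collecting each rdf:first item in order."""
    items = []
    seen = set()
    node = head
    while node != "rdf:nil":
        if node in seen:
            raise ValueError(f"cycle detected at list node {node}")
        seen.add(node)
        items.append(FIRST[node])
        node = REST[node]
    return items


# Row 4 gives the head of the author list.
author_nodes = walk_list("N05a81b995ed14ca1b8cc0848d0004cd0")
authors = [NAMES[n] for n in author_nodes]
print(authors)
```

The same walk applies to the editor list, whose head node appears in row 8.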
 



