The what, why, and how of born-open data


Ontology type: schema:ScholarlyArticle
Open Access: True


Article Info

DATE

2015-10-01

AUTHORS

Jeffrey N. Rouder

ABSTRACT

Although many researchers agree that scientific data should be open to scrutiny to ferret out poor analyses and outright fraud, most raw data sets are not available on demand. There are many reasons researchers do not open their data, and one is technical. It is often time consuming to prepare and archive data. In response, my laboratory has automated the process such that our data are archived the night they are created without any human approval or action. All data are versioned, logged, time stamped, and uploaded including aborted runs and data from pilot subjects. The archive is GitHub, github.com, the world’s largest collection of open-source materials. Data archived in this manner are called born open. In this paper, I discuss the benefits of born-open data and provide a brief technical overview of the process. I also address some of the common concerns about opening data before publication.

PAGES

1062-1069

References to SciGraph publications

  • 2012-05-16. Replication studies: Bad copy in NATURE
Identifiers

    URI

    http://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z

    DOI

    http://dx.doi.org/10.3758/s13428-015-0630-z

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1001324777

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/26428912



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Data Interpretation, Statistical", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Databases, Factual", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Information Dissemination", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Internet", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Publishing", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Research Personnel", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Scientific Misconduct", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Software", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "University of Missouri, 65211, Columbia, MO, USA", 
              "id": "http://www.grid.ac/institutes/grid.134936.a", 
              "name": [
                "University of Missouri, 65211, Columbia, MO, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Rouder", 
            "givenName": "Jeffrey N.", 
            "id": "sg:person.013247007107.17", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013247007107.17"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1038/485298a", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1016495803", 
              "https://doi.org/10.1038/485298a"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2015-10-01", 
        "datePublishedReg": "2015-10-01", 
        "description": "Although many researchers agree that scientific data should be open to scrutiny to ferret out poor analyses and outright fraud, most raw data sets are not available on demand. There are many reasons researchers do not open their data, and one is technical. It is often time consuming to prepare and archive data. In response, my laboratory has automated the process such that our data are archived the night they are created without any human approval or action. All data are versioned, logged, time stamped, and uploaded including aborted runs and data from pilot subjects. The archive is GitHub, github.com, the world\u2019s largest collection of open-source materials. Data archived in this manner are called born open. In this paper, I discuss the benefits of born-open data and provide a brief technical overview of the process. I also address some of the common concerns about opening data before publication.", 
        "genre": "article", 
        "id": "sg:pub.10.3758/s13428-015-0630-z", 
        "isAccessibleForFree": true, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.3132642", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1319746", 
            "issn": [
              "1554-351X", 
              "1532-5970"
            ], 
            "name": "Behavior Research Methods", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "3", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "48"
          }
        ], 
        "keywords": [
          "large collection", 
          "human approval", 
          "raw data sets", 
          "brief technical overview", 
          "technical overview", 
          "data sets", 
          "archive data", 
          "world's largest collection", 
          "poor analysis", 
          "open source materials", 
          "scientific data", 
          "reason researchers", 
          "pilot subjects", 
          "GitHub", 
          "researchers", 
          "fraud", 
          "data", 
          "set", 
          "archives", 
          "collection", 
          "demand", 
          "process", 
          "time", 
          "common concern", 
          "overview", 
          "benefits", 
          "manner", 
          "run", 
          "concern", 
          "outright fraud", 
          "publications", 
          "analysis", 
          "action", 
          "laboratory", 
          "scrutiny", 
          "subjects", 
          "approval", 
          "night", 
          "response", 
          "materials", 
          "paper"
        ], 
        "name": "The what, why, and how of born-open data", 
        "pagination": "1062-1069", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1001324777"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.3758/s13428-015-0630-z"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "26428912"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.3758/s13428-015-0630-z", 
          "https://app.dimensions.ai/details/publication/pub.1001324777"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-08-04T17:02", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20220804/entities/gbq_results/article/article_651.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.3758/s13428-015-0630-z"
      }
    ]
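
Because the record is plain JSON, the usual citation fields can be pulled out with a standard JSON parser. The following is a minimal sketch, not part of SciGraph itself: `RECORD_JSON` is a hand-trimmed copy of the record above, and `summarize` is a hypothetical helper name introduced only for illustration.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
    "author": [
      {"familyName": "Rouder", "givenName": "Jeffrey N.", "type": "Person"}
    ],
    "datePublished": "2015-10-01",
    "isPartOf": [
      {"name": "Behavior Research Methods", "type": "Periodical"},
      {"issueNumber": "3", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "48"}
    ],
    "name": "The what, why, and how of born-open data",
    "pagination": "1062-1069",
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.3758/s13428-015-0630-z"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["26428912"]}
    ],
    "type": "ScholarlyArticle"
  }
]
"""


def summarize(record_json):
    """Return a flat dict of citation fields from a SciGraph-style JSON-LD array."""
    # The record is a one-element array; the article is its first item.
    article = json.loads(record_json)[0]
    # Index identifiers by their "name" (doi, pubmed_id, ...).
    ids = {p["name"]: p["value"][0] for p in article.get("productId", [])}
    # Index the journal, volume, and issue entries by their "type".
    parts = {p["type"]: p for p in article.get("isPartOf", [])}
    return {
        "title": article["name"],
        "authors": [f'{a["givenName"]} {a["familyName"]}' for a in article.get("author", [])],
        "journal": parts.get("Periodical", {}).get("name"),
        "volume": parts.get("PublicationVolume", {}).get("volumeNumber"),
        "issue": parts.get("PublicationIssue", {}).get("issueNumber"),
        "pages": article.get("pagination"),
        "date": article.get("datePublished"),
        "doi": ids.get("doi"),
        "pubmed_id": ids.get("pubmed_id"),
    }


summary = summarize(RECORD_JSON)
print(summary)
```

The sketch treats the record as ordinary JSON and ignores the `@context`; a full JSON-LD processor would be needed to expand the short keys into complete IRIs.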
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular linked data format that is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z'
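
The four curl commands differ only in their `Accept` header, so the same content negotiation can be scripted. Below is a minimal sketch using Python's standard `urllib.request`; the URL and media types are copied from the commands above, while `FORMATS`, `build_request`, and `fetch` are hypothetical names chosen for this example. It assumes the endpoint still answers such requests, which is not verified here.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z"

# Media types taken from the curl commands above; the short keys are illustrative.
FORMATS = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for one RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": FORMATS[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=10):
    """Send the request and return the body as text; requires network access."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request is side-effect free, so it can be inspected offline.
request = build_request("turtle")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `fetch("json-ld")` would be the scripted equivalent of the first curl command. Keeping request construction separate from sending makes the header logic checkable without a network connection.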


     

    This table displays all metadata directly associated with this object as RDF triples.

    140 TRIPLES      21 PREDICATES      75 URIs      66 LITERALS      15 BLANK NODES

    Row Subject Predicate Object (a row that omits the subject or predicate repeats the one from the row above)
    1 sg:pub.10.3758/s13428-015-0630-z schema:about N14f5d9f0f652416c9089b74410ab2d19
    2 N3c18a0b5a6344300a97d0f40510b6c47
    3 N45bfee3f68114a29b0f24f20227889ce
    4 N950af14a4f1b4a6a8732ec42318bd294
    5 N9b79c3ba693e48efaa3e9a52421235a9
    6 Na080da061324441b901a23b60b214df9
    7 Nab820370bdfd49d3bf75e17ae870b02c
    8 Nf8a0092fffef43dca1fd41e2ff9d5e9f
    9 anzsrc-for:08
    10 anzsrc-for:0801
    11 schema:author Nb698323339a641e49365dab379613e41
    12 schema:citation sg:pub.10.1038/485298a
    13 schema:datePublished 2015-10-01
    14 schema:datePublishedReg 2015-10-01
    15 schema:description Although many researchers agree that scientific data should be open to scrutiny to ferret out poor analyses and outright fraud, most raw data sets are not available on demand. There are many reasons researchers do not open their data, and one is technical. It is often time consuming to prepare and archive data. In response, my laboratory has automated the process such that our data are archived the night they are created without any human approval or action. All data are versioned, logged, time stamped, and uploaded including aborted runs and data from pilot subjects. The archive is GitHub, github.com, the world’s largest collection of open-source materials. Data archived in this manner are called born open. In this paper, I discuss the benefits of born-open data and provide a brief technical overview of the process. I also address some of the common concerns about opening data before publication.
    16 schema:genre article
    17 schema:isAccessibleForFree true
    18 schema:isPartOf N3b06be2aaff5436196e1e2de059b0a49
    19 N9694562077cf442baf4b0cf1b5e992f7
    20 sg:journal.1319746
    21 schema:keywords GitHub
    22 action
    23 analysis
    24 approval
    25 archive data
    26 archives
    27 benefits
    28 brief technical overview
    29 collection
    30 common concern
    31 concern
    32 data
    33 data sets
    34 demand
    35 fraud
    36 human approval
    37 laboratory
    38 large collection
    39 manner
    40 materials
    41 night
    42 open source materials
    43 outright fraud
    44 overview
    45 paper
    46 pilot subjects
    47 poor analysis
    48 process
    49 publications
    50 raw data sets
    51 reason researchers
    52 researchers
    53 response
    54 run
    55 scientific data
    56 scrutiny
    57 set
    58 subjects
    59 technical overview
    60 time
    61 world's largest collection
    62 schema:name The what, why, and how of born-open data
    63 schema:pagination 1062-1069
    64 schema:productId Na9a6324f32704dc69d2e7e71e70e30f0
    65 Naedf64a9fe1b4ffc8ce86f5f657f6197
    66 Nfe5ba63613fd42c7bff657d869425557
    67 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001324777
    68 https://doi.org/10.3758/s13428-015-0630-z
    69 schema:sdDatePublished 2022-08-04T17:02
    70 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    71 schema:sdPublisher Ne9136048400a434eb0018c20ba7d8cfb
    72 schema:url https://doi.org/10.3758/s13428-015-0630-z
    73 sgo:license sg:explorer/license/
    74 sgo:sdDataset articles
    75 rdf:type schema:ScholarlyArticle
    76 N14f5d9f0f652416c9089b74410ab2d19 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    77 schema:name Publishing
    78 rdf:type schema:DefinedTerm
    79 N3b06be2aaff5436196e1e2de059b0a49 schema:volumeNumber 48
    80 rdf:type schema:PublicationVolume
    81 N3c18a0b5a6344300a97d0f40510b6c47 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    82 schema:name Internet
    83 rdf:type schema:DefinedTerm
    84 N45bfee3f68114a29b0f24f20227889ce schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    85 schema:name Software
    86 rdf:type schema:DefinedTerm
    87 N950af14a4f1b4a6a8732ec42318bd294 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    88 schema:name Databases, Factual
    89 rdf:type schema:DefinedTerm
    90 N9694562077cf442baf4b0cf1b5e992f7 schema:issueNumber 3
    91 rdf:type schema:PublicationIssue
    92 N9b79c3ba693e48efaa3e9a52421235a9 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    93 schema:name Information Dissemination
    94 rdf:type schema:DefinedTerm
    95 Na080da061324441b901a23b60b214df9 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    96 schema:name Data Interpretation, Statistical
    97 rdf:type schema:DefinedTerm
    98 Na9a6324f32704dc69d2e7e71e70e30f0 schema:name pubmed_id
    99 schema:value 26428912
    100 rdf:type schema:PropertyValue
    101 Nab820370bdfd49d3bf75e17ae870b02c schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    102 schema:name Scientific Misconduct
    103 rdf:type schema:DefinedTerm
    104 Naedf64a9fe1b4ffc8ce86f5f657f6197 schema:name dimensions_id
    105 schema:value pub.1001324777
    106 rdf:type schema:PropertyValue
    107 Nb698323339a641e49365dab379613e41 rdf:first sg:person.013247007107.17
    108 rdf:rest rdf:nil
    109 Ne9136048400a434eb0018c20ba7d8cfb schema:name Springer Nature - SN SciGraph project
    110 rdf:type schema:Organization
    111 Nf8a0092fffef43dca1fd41e2ff9d5e9f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    112 schema:name Research Personnel
    113 rdf:type schema:DefinedTerm
    114 Nfe5ba63613fd42c7bff657d869425557 schema:name doi
    115 schema:value 10.3758/s13428-015-0630-z
    116 rdf:type schema:PropertyValue
    117 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    118 schema:name Information and Computing Sciences
    119 rdf:type schema:DefinedTerm
    120 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    121 schema:name Artificial Intelligence and Image Processing
    122 rdf:type schema:DefinedTerm
    123 sg:grant.3132642 http://pending.schema.org/fundedItem sg:pub.10.3758/s13428-015-0630-z
    124 rdf:type schema:MonetaryGrant
    125 sg:journal.1319746 schema:issn 1532-5970
    126 1554-351X
    127 schema:name Behavior Research Methods
    128 schema:publisher Springer Nature
    129 rdf:type schema:Periodical
    130 sg:person.013247007107.17 schema:affiliation grid-institutes:grid.134936.a
    131 schema:familyName Rouder
    132 schema:givenName Jeffrey N.
    133 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013247007107.17
    134 rdf:type schema:Person
    135 sg:pub.10.1038/485298a schema:sameAs https://app.dimensions.ai/details/publication/pub.1016495803
    136 https://doi.org/10.1038/485298a
    137 rdf:type schema:CreativeWork
    138 grid-institutes:grid.134936.a schema:alternateName University of Missouri, 65211, Columbia, MO, USA
    139 schema:name University of Missouri, 65211, Columbia, MO, USA
    140 rdf:type schema:Organization
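
Summary counts like those shown above the table can be approximated from the N-Triples serialization, which puts one triple on each line. The sketch below is an illustration under stated assumptions, not the method SciGraph uses: `SAMPLE` is a hand-written six-triple excerpt based on rows of the table, the `schema:` and `rdf:` prefixes are expanded to their standard namespaces, the blank node label `_:b0` is made up, and `tally` and `object_kind` are hypothetical helper names. How the page counts URIs and literals (distinct values or occurrences) is not stated, so the sketch simply classifies each triple's object.

```python
from collections import Counter

# Hand-written excerpt in N-Triples form; the blank node label _:b0 is illustrative.
SAMPLE = """\
<http://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z> <http://schema.org/name> "The what, why, and how of born-open data" .
<http://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z> <http://schema.org/pagination> "1062-1069" .
<http://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z> <http://schema.org/isPartOf> _:b0 .
<http://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/ScholarlyArticle> .
_:b0 <http://schema.org/volumeNumber> "48" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/PublicationVolume> .
"""


def object_kind(term):
    """Classify an N-Triples object term as 'uri', 'blank', or 'literal'."""
    if term.startswith("<"):
        return "uri"
    if term.startswith("_:"):
        return "blank"
    return "literal"


def tally(ntriples):
    """Count triples, distinct predicates, and object kinds in N-Triples text."""
    triples = 0
    predicates = set()
    objects = Counter()
    for line in ntriples.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Subject and predicate never contain whitespace; the object may.
        _subject, predicate, rest = line.split(None, 2)
        obj = rest.rstrip()
        if obj.endswith("."):
            obj = obj[:-1].rstrip()  # drop the statement terminator
        triples += 1
        predicates.add(predicate)
        objects[object_kind(obj)] += 1
    return {"triples": triples, "predicates": len(predicates), "objects": dict(objects)}


stats = tally(SAMPLE)
print(stats)
```

For anything beyond a quick tally, a dedicated RDF library is a safer choice than line splitting, since it handles escapes, language tags, and datatypes properly.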
     



