The what, why, and how of born-open data


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2015-10-01

AUTHORS

Jeffrey N. Rouder

ABSTRACT

Although many researchers agree that scientific data should be open to scrutiny to ferret out poor analyses and outright fraud, most raw data sets are not available on demand. There are many reasons researchers do not open their data, and one is technical. It is often time consuming to prepare and archive data. In response, my laboratory has automated the process such that our data are archived the night they are created without any human approval or action. All data are versioned, logged, time stamped, and uploaded including aborted runs and data from pilot subjects. The archive is GitHub, github.com, the world’s largest collection of open-source materials. Data archived in this manner are called born open. In this paper, I discuss the benefits of born-open data and provide a brief technical overview of the process. I also address some of the common concerns about opening data before publication.

PAGES

1062-1069

References to SciGraph publications

  • 2012-05-16. Replication studies: Bad copy in NATURE
Identifiers

    URI

    http://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z

    DOI

    http://dx.doi.org/10.3758/s13428-015-0630-z

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1001324777

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/26428912



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Data Interpretation, Statistical", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Databases, Factual", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Information Dissemination", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Internet", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Publishing", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Research Personnel", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Scientific Misconduct", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Software", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "University of Missouri, 65211, Columbia, MO, USA", 
              "id": "http://www.grid.ac/institutes/grid.134936.a", 
              "name": [
                "University of Missouri, 65211, Columbia, MO, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Rouder", 
            "givenName": "Jeffrey N.", 
            "id": "sg:person.013247007107.17", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013247007107.17"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1038/485298a", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1016495803", 
              "https://doi.org/10.1038/485298a"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2015-10-01", 
        "datePublishedReg": "2015-10-01", 
        "description": "Although many researchers agree that scientific data should be open to scrutiny to ferret out poor analyses and outright fraud, most raw data sets are not available on demand. There are many reasons researchers do not open their data, and one is technical. It is often time consuming to prepare and archive data. In response, my laboratory has automated the process such that our data are archived the night they are created without any human approval or action. All data are versioned, logged, time stamped, and uploaded including aborted runs and data from pilot subjects. The archive is GitHub, github.com, the world\u2019s largest collection of open-source materials. Data archived in this manner are called born open. In this paper, I discuss the benefits of born-open data and provide a brief technical overview of the process. I also address some of the common concerns about opening data before publication.", 
        "genre": "article", 
        "id": "sg:pub.10.3758/s13428-015-0630-z", 
        "isAccessibleForFree": true, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.3132642", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1319746", 
            "issn": [
              "1554-351X", 
              "1532-5970"
            ], 
            "name": "Behavior Research Methods", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "3", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "48"
          }
        ], 
        "keywords": [
          "large collection", 
          "human approval", 
          "raw data sets", 
          "brief technical overview", 
          "technical overview", 
          "data sets", 
          "archive data", 
          "world's largest collection", 
          "poor analysis", 
          "open source materials", 
          "scientific data", 
          "reason researchers", 
          "pilot subjects", 
          "GitHub", 
          "researchers", 
          "fraud", 
          "data", 
          "set", 
          "archives", 
          "collection", 
          "demand", 
          "process", 
          "time", 
          "common concern", 
          "overview", 
          "benefits", 
          "manner", 
          "run", 
          "concern", 
          "outright fraud", 
          "publications", 
          "analysis", 
          "action", 
          "laboratory", 
          "scrutiny", 
          "subjects", 
          "approval", 
          "night", 
          "response", 
          "materials", 
          "paper"
        ], 
        "name": "The what, why, and how of born-open data", 
        "pagination": "1062-1069", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1001324777"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.3758/s13428-015-0630-z"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "26428912"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.3758/s13428-015-0630-z", 
          "https://app.dimensions.ai/details/publication/pub.1001324777"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-08-04T17:02", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20220804/entities/gbq_results/article/article_651.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.3758/s13428-015-0630-z"
      }
    ]
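    Because the record is plain JSON, its identifiers can be read with any JSON parser. The following is a minimal sketch, not part of the SciGraph tooling: the excerpt is trimmed from the record above, and the helper name `product_ids` is hypothetical.

```python
import json

# Trimmed excerpt of the JSON-LD record shown above (only a few keys kept).
RECORD_EXCERPT = """
[
  {
    "name": "The what, why, and how of born-open data",
    "pagination": "1062-1069",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1001324777"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.3758/s13428-015-0630-z"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["26428912"]}
    ],
    "type": "ScholarlyArticle"
  }
]
"""


def product_ids(record):
    """Map each productId name to its first value (hypothetical helper)."""
    return {p["name"]: p["value"][0] for p in record.get("productId", [])}


# The top level of the record is a list containing one article object.
records = json.loads(RECORD_EXCERPT)
ids = product_ids(records[0])
print(ids["doi"])
```

    The same function should work on the full record, since it only relies on the `productId` list having the shape shown above.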
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z'
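
    The four curl commands above differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. This is an illustrative sketch under assumptions: the names `ACCEPT_TYPES`, `build_request`, and `fetch` are hypothetical, and the endpoint may no longer respond as it did when this record was generated.

```python
import urllib.request

# URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.3758/s13428-015-0630-z"

# Media types copied from the Accept headers in the curl commands above.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a request asking for the given RDF format."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

    Calling `fetch("turtle")` would then correspond to the Turtle curl command; building the request separately from sending it makes the header easy to inspect without network access.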


     

    This table displays all metadata directly associated with this object as RDF triples.

    140 TRIPLES      21 PREDICATES      75 URIs      66 LITERALS      15 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.3758/s13428-015-0630-z schema:about N14fd2a0fabd34a37ad5074e949d589d7
    2 N3a4218ee45a1428285f114b6cac2fc09
    3 N71aec5d3828b4688b85b024e706cd67f
    4 N79fa06ed470f4f83950db49e05f2826b
    5 N98c0cedcc9124e4293e27e82d26ef6bc
    6 Nafccb64edf2344e9ad4b021b2b11ec07
    7 Ncb0dd3f7d12543e1a8a2b4bdbc9c7045
    8 Nd9662fe2af534eea910ace5f9b8eb7c2
    9 anzsrc-for:08
    10 anzsrc-for:0801
    11 schema:author N8d6f4394ff444e6ea8acba60abc2ebe4
    12 schema:citation sg:pub.10.1038/485298a
    13 schema:datePublished 2015-10-01
    14 schema:datePublishedReg 2015-10-01
    15 schema:description Although many researchers agree that scientific data should be open to scrutiny to ferret out poor analyses and outright fraud, most raw data sets are not available on demand. There are many reasons researchers do not open their data, and one is technical. It is often time consuming to prepare and archive data. In response, my laboratory has automated the process such that our data are archived the night they are created without any human approval or action. All data are versioned, logged, time stamped, and uploaded including aborted runs and data from pilot subjects. The archive is GitHub, github.com, the world’s largest collection of open-source materials. Data archived in this manner are called born open. In this paper, I discuss the benefits of born-open data and provide a brief technical overview of the process. I also address some of the common concerns about opening data before publication.
    16 schema:genre article
    17 schema:isAccessibleForFree true
    18 schema:isPartOf N99673c784e44411a905fb2279f94d691
    19 Nc719f4077d954ae18185b32e34460190
    20 sg:journal.1319746
    21 schema:keywords GitHub
    22 action
    23 analysis
    24 approval
    25 archive data
    26 archives
    27 benefits
    28 brief technical overview
    29 collection
    30 common concern
    31 concern
    32 data
    33 data sets
    34 demand
    35 fraud
    36 human approval
    37 laboratory
    38 large collection
    39 manner
    40 materials
    41 night
    42 open source materials
    43 outright fraud
    44 overview
    45 paper
    46 pilot subjects
    47 poor analysis
    48 process
    49 publications
    50 raw data sets
    51 reason researchers
    52 researchers
    53 response
    54 run
    55 scientific data
    56 scrutiny
    57 set
    58 subjects
    59 technical overview
    60 time
    61 world's largest collection
    62 schema:name The what, why, and how of born-open data
    63 schema:pagination 1062-1069
    64 schema:productId N59e01e8a49f2427882e5dd71e6598c21
    65 N6d116b28b7ad43da9fd8960980eff7b4
    66 Nfaf511031c324a0ab9105611fe5a645c
    67 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001324777
    68 https://doi.org/10.3758/s13428-015-0630-z
    69 schema:sdDatePublished 2022-08-04T17:02
    70 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    71 schema:sdPublisher Ncb21881cf05148c988de38b88192822f
    72 schema:url https://doi.org/10.3758/s13428-015-0630-z
    73 sgo:license sg:explorer/license/
    74 sgo:sdDataset articles
    75 rdf:type schema:ScholarlyArticle
    76 N14fd2a0fabd34a37ad5074e949d589d7 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    77 schema:name Scientific Misconduct
    78 rdf:type schema:DefinedTerm
    79 N3a4218ee45a1428285f114b6cac2fc09 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    80 schema:name Publishing
    81 rdf:type schema:DefinedTerm
    82 N59e01e8a49f2427882e5dd71e6598c21 schema:name pubmed_id
    83 schema:value 26428912
    84 rdf:type schema:PropertyValue
    85 N6d116b28b7ad43da9fd8960980eff7b4 schema:name dimensions_id
    86 schema:value pub.1001324777
    87 rdf:type schema:PropertyValue
    88 N71aec5d3828b4688b85b024e706cd67f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    89 schema:name Information Dissemination
    90 rdf:type schema:DefinedTerm
    91 N79fa06ed470f4f83950db49e05f2826b schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    92 schema:name Software
    93 rdf:type schema:DefinedTerm
    94 N8d6f4394ff444e6ea8acba60abc2ebe4 rdf:first sg:person.013247007107.17
    95 rdf:rest rdf:nil
    96 N98c0cedcc9124e4293e27e82d26ef6bc schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    97 schema:name Data Interpretation, Statistical
    98 rdf:type schema:DefinedTerm
    99 N99673c784e44411a905fb2279f94d691 schema:volumeNumber 48
    100 rdf:type schema:PublicationVolume
    101 Nafccb64edf2344e9ad4b021b2b11ec07 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    102 schema:name Internet
    103 rdf:type schema:DefinedTerm
    104 Nc719f4077d954ae18185b32e34460190 schema:issueNumber 3
    105 rdf:type schema:PublicationIssue
    106 Ncb0dd3f7d12543e1a8a2b4bdbc9c7045 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    107 schema:name Databases, Factual
    108 rdf:type schema:DefinedTerm
    109 Ncb21881cf05148c988de38b88192822f schema:name Springer Nature - SN SciGraph project
    110 rdf:type schema:Organization
    111 Nd9662fe2af534eea910ace5f9b8eb7c2 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    112 schema:name Research Personnel
    113 rdf:type schema:DefinedTerm
    114 Nfaf511031c324a0ab9105611fe5a645c schema:name doi
    115 schema:value 10.3758/s13428-015-0630-z
    116 rdf:type schema:PropertyValue
    117 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    118 schema:name Information and Computing Sciences
    119 rdf:type schema:DefinedTerm
    120 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    121 schema:name Artificial Intelligence and Image Processing
    122 rdf:type schema:DefinedTerm
    123 sg:grant.3132642 http://pending.schema.org/fundedItem sg:pub.10.3758/s13428-015-0630-z
    124 rdf:type schema:MonetaryGrant
    125 sg:journal.1319746 schema:issn 1532-5970
    126 1554-351X
    127 schema:name Behavior Research Methods
    128 schema:publisher Springer Nature
    129 rdf:type schema:Periodical
    130 sg:person.013247007107.17 schema:affiliation grid-institutes:grid.134936.a
    131 schema:familyName Rouder
    132 schema:givenName Jeffrey N.
    133 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013247007107.17
    134 rdf:type schema:Person
    135 sg:pub.10.1038/485298a schema:sameAs https://app.dimensions.ai/details/publication/pub.1016495803
    136 https://doi.org/10.1038/485298a
    137 rdf:type schema:CreativeWork
    138 grid-institutes:grid.134936.a schema:alternateName University of Missouri, 65211, Columbia, MO, USA
    139 schema:name University of Missouri, 65211, Columbia, MO, USA
    140 rdf:type schema:Organization
     



