Protecting Census 2021 Origin-Destination Data Using a Combination of Cell-Key Perturbation and Suppression


Ontology type: schema:Chapter     


Chapter Info

DATE

2018-08-25

AUTHORS

Iain Dove, Christos Ntoumos, Keith Spicer

ABSTRACT

The UK Office for National Statistics (ONS) is intending to produce outputs involving travel to and from different locations (origins and destinations) in 2021, as they have done for previous Censuses. This data poses a particular challenge for protecting against disclosure risk, as categorising respondents on multiple geographical variables yields very sparse tables. This paper explores the disclosure risk and data utility of one option for protecting this data: applying cell-key perturbation (noise), and suppressing the remaining disclosive values. It finds that these methods provide good protection for the data with considerable loss of utility for outputs at low geographies. Whether this is an acceptable approach will be determined by user feedback.
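
The abstract names the two protection steps without giving their details. The sketch below is only a toy illustration of the general cell-key idea: each record carries a random key, a cell's key is derived from the keys of the records it contains, and the noise added to the cell count is looked up from that cell key, so the same cell always receives the same noise. The key range, the lookup rule `toy_ptable`, and the suppression threshold `suppress_below` are all hypothetical and are not taken from the chapter.

```python
MAX_KEY = 256  # hypothetical record-key range; not taken from the chapter


def toy_ptable(count, cell_key):
    """Hypothetical noise lookup: most cells unchanged, some moved by +/-1."""
    if count == 0:
        return 0  # empty cells are left at zero in this toy rule
    if cell_key < 32:
        return -1
    if cell_key >= 224:
        return 1
    return 0


def protect_cell(record_keys, suppress_below=3):
    """Perturb one cell count, then suppress it if it is still small.

    record_keys: the record keys of the respondents falling in the cell.
    Returns the protected count, or None when the value is suppressed.
    The threshold rule is an assumption made for illustration only.
    """
    count = len(record_keys)
    cell_key = sum(record_keys) % MAX_KEY  # same records -> same cell key
    perturbed = count + toy_ptable(count, cell_key)
    if 0 < perturbed < suppress_below:
        return None  # small non-zero value treated as disclosive here
    return perturbed


print(protect_cell([100, 100, 100, 100]))  # cell key 144, no noise applied
print(protect_cell([10, 20]))              # cell key 30, noise -1, then suppressed
```

Because the noise depends only on the records in the cell, the same origin-destination flow would receive the same perturbed value in every table in which it appears, under these assumptions.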

PAGES

43-55

References to SciGraph publications

  • 2008. Invariant Post-tabular Protection of Census Frequency Counts in PRIVACY IN STATISTICAL DATABASES
  • 2010. Data Swapping for Protecting Census Tables in PRIVACY IN STATISTICAL DATABASES
Book

    TITLE

    Privacy in Statistical Databases

    ISBN

    978-3-319-99770-4
    978-3-319-99771-1

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-3-319-99771-1_4

    DOI

    http://dx.doi.org/10.1007/978-3-319-99771-1_4

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1106352235



    JSON-LD is the canonical representation for SciGraph data.

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Statistics", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Mathematical Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Office for National Statistics", 
              "id": "https://www.grid.ac/institutes/grid.426100.1", 
              "name": [
                "Office for National Statistics, PO15 5RR, Titchfield, UK"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Dove", 
            "givenName": "Iain", 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Office for National Statistics", 
              "id": "https://www.grid.ac/institutes/grid.426100.1", 
              "name": [
                "Office for National Statistics, PO15 5RR, Titchfield, UK"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Ntoumos", 
            "givenName": "Christos", 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Office for National Statistics", 
              "id": "https://www.grid.ac/institutes/grid.426100.1", 
              "name": [
                "Office for National Statistics, PO15 5RR, Titchfield, UK"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Spicer", 
            "givenName": "Keith", 
            "id": "sg:person.012700142123.49", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012700142123.49"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/978-3-540-87471-3_7", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031243917", 
              "https://doi.org/10.1007/978-3-540-87471-3_7"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-15838-4_4", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1034319090", 
              "https://doi.org/10.1007/978-3-642-15838-4_4"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1106813468", 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2018-08-25", 
        "datePublishedReg": "2018-08-25", 
        "description": "The UK Office for National Statistics (ONS) is intending to produce outputs involving travel to and from different locations (origins and destinations) in 2021, as they have done for previous Censuses. This data poses a particular challenge for protecting against disclosure risk, as categorising respondents on multiple geographical variables yields very sparse tables. This paper explores the disclosure risk and data utility of one option for protecting this data: applying cell-key perturbation (noise), and suppressing the remaining disclosive values. It finds that these methods provide good protection for the data with considerable loss of utility for outputs at low geographies. Whether this is an acceptable approach will be determined by user feedback.", 
        "editor": [
          {
            "familyName": "Domingo-Ferrer", 
            "givenName": "Josep", 
            "type": "Person"
          }, 
          {
            "familyName": "Montes", 
            "givenName": "Francisco", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-3-319-99771-1_4", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": false, 
        "isPartOf": {
          "isbn": [
            "978-3-319-99770-4", 
            "978-3-319-99771-1"
          ], 
          "name": "Privacy in Statistical Databases", 
          "type": "Book"
        }, 
        "name": "Protecting Census 2021 Origin-Destination Data Using a Combination of Cell-Key Perturbation and Suppression", 
        "pagination": "43-55", 
        "productId": [
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-3-319-99771-1_4"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "274662a4d7e1f5c15322ff6feb789a2f1478b0f8ff0814c5a449eb500584a736"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1106352235"
            ]
          }
        ], 
        "publisher": {
          "location": "Cham", 
          "name": "Springer International Publishing", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-3-319-99771-1_4", 
          "https://app.dimensions.ai/details/publication/pub.1106352235"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-16T05:02", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000325_0000000325/records_100812_00000000.jsonl", 
        "type": "Chapter", 
        "url": "https://link.springer.com/10.1007%2F978-3-319-99771-1_4"
      }
    ]
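
Because JSON-LD is valid JSON, the record above can be read with any JSON parser. The following is a minimal sketch that pulls the title, author names, and DOI out of a trimmed copy of the record; the helper name `summarise` and the choice of fields are illustrative assumptions, not part of SciGraph.

```python
import json

# Trimmed copy of the record above; only the fields used here are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Dove", "givenName": "Iain", "type": "Person"},
      {"familyName": "Ntoumos", "givenName": "Christos", "type": "Person"},
      {"familyName": "Spicer", "givenName": "Keith", "type": "Person"}
    ],
    "name": "Protecting Census 2021 Origin-Destination Data Using a Combination of Cell-Key Perturbation and Suppression",
    "pagination": "43-55",
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-319-99771-1_4"]},
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1106352235"]}
    ]
  }
]
"""


def summarise(record_json):
    """Return title, authors, DOI and pages from a SciGraph-style record."""
    record = json.loads(record_json)[0]  # the payload is a one-element list
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # productId is a list of name/value pairs; index it by name.
    ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {
        "title": record["name"],
        "authors": authors,
        "doi": ids.get("doi"),
        "pages": record["pagination"],
    }


summary = summarise(RECORD_JSON)
print(summary["authors"])
print(summary["doi"])
```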
     

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-99771-1_4'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-99771-1_4'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-99771-1_4'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-99771-1_4'
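
The four curl commands above differ only in the `Accept` header. As a rough Python equivalent using only the standard library, the sketch below builds the same requests; the names `ACCEPT_TYPES`, `build_request`, and `fetch` are hypothetical helpers, and whether the endpoint still responds is not something this sketch checks.

```python
import urllib.request

# Media types copied from the curl examples above.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-319-99771-1_4"


def build_request(fmt, url=RECORD_URL):
    """Return a Request asking for the record in the given RDF serialisation."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building a request does not touch the network; only fetch() does.
req = build_request("turtle")
print(req.get_header("Accept"))
```

Calling `fetch("json-ld")` would correspond to the first curl command, assuming the service is reachable.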


     

    This table displays all metadata directly associated with this object as RDF triples.

    92 TRIPLES      23 PREDICATES      29 URIs      19 LITERALS      8 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/978-3-319-99771-1_4 schema:about anzsrc-for:01
    2 anzsrc-for:0104
    3 schema:author Na2c7d2d6df254032934abea54f174aee
    4 schema:citation sg:pub.10.1007/978-3-540-87471-3_7
    5 sg:pub.10.1007/978-3-642-15838-4_4
    6 https://app.dimensions.ai/details/publication/pub.1106813468
    7 schema:datePublished 2018-08-25
    8 schema:datePublishedReg 2018-08-25
    9 schema:description The UK Office for National Statistics (ONS) is intending to produce outputs involving travel to and from different locations (origins and destinations) in 2021, as they have done for previous Censuses. This data poses a particular challenge for protecting against disclosure risk, as categorising respondents on multiple geographical variables yields very sparse tables. This paper explores the disclosure risk and data utility of one option for protecting this data: applying cell-key perturbation (noise), and suppressing the remaining disclosive values. It finds that these methods provide good protection for the data with considerable loss of utility for outputs at low geographies. Whether this is an acceptable approach will be determined by user feedback.
    10 schema:editor Na407cf3de1be4a219fb41ed0018d6b4a
    11 schema:genre chapter
    12 schema:inLanguage en
    13 schema:isAccessibleForFree false
    14 schema:isPartOf Nfe50b79e02534d5e8ff9ed745622ff40
    15 schema:name Protecting Census 2021 Origin-Destination Data Using a Combination of Cell-Key Perturbation and Suppression
    16 schema:pagination 43-55
    17 schema:productId Nc7c80b3c682040158a2f68ae676defc5
    18 Nd12a0bfc83a847c9975e573d58a5ff55
    19 Nd91ac5b805ef457a887920d7bcc6a2d7
    20 schema:publisher N7a11d90494a94b0a9074046af3a09ff7
    21 schema:sameAs https://app.dimensions.ai/details/publication/pub.1106352235
    22 https://doi.org/10.1007/978-3-319-99771-1_4
    23 schema:sdDatePublished 2019-04-16T05:02
    24 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    25 schema:sdPublisher Na011cf5251b24bf3a797756982fad9d7
    26 schema:url https://link.springer.com/10.1007%2F978-3-319-99771-1_4
    27 sgo:license sg:explorer/license/
    28 sgo:sdDataset chapters
    29 rdf:type schema:Chapter
    30 N08d777a17d264cde8155592319fddba6 rdf:first N5bbdd465c8584d45bf0f87b6f8b2ad19
    31 rdf:rest Nd549180cad3841eaac02e96dad4b96b4
    32 N15fd4515bae44fba950c47d394beb68f rdf:first N67fa761cc3234ca3ac99156278f2093b
    33 rdf:rest rdf:nil
    34 N1a41429fc813490d8f806b864977d602 schema:affiliation https://www.grid.ac/institutes/grid.426100.1
    35 schema:familyName Dove
    36 schema:givenName Iain
    37 rdf:type schema:Person
    38 N5bbdd465c8584d45bf0f87b6f8b2ad19 schema:affiliation https://www.grid.ac/institutes/grid.426100.1
    39 schema:familyName Ntoumos
    40 schema:givenName Christos
    41 rdf:type schema:Person
    42 N67fa761cc3234ca3ac99156278f2093b schema:familyName Montes
    43 schema:givenName Francisco
    44 rdf:type schema:Person
    45 N7a11d90494a94b0a9074046af3a09ff7 schema:location Cham
    46 schema:name Springer International Publishing
    47 rdf:type schema:Organisation
    48 N94521cfcc0bd405fb7392bc58d3cc456 schema:familyName Domingo-Ferrer
    49 schema:givenName Josep
    50 rdf:type schema:Person
    51 Na011cf5251b24bf3a797756982fad9d7 schema:name Springer Nature - SN SciGraph project
    52 rdf:type schema:Organization
    53 Na2c7d2d6df254032934abea54f174aee rdf:first N1a41429fc813490d8f806b864977d602
    54 rdf:rest N08d777a17d264cde8155592319fddba6
    55 Na407cf3de1be4a219fb41ed0018d6b4a rdf:first N94521cfcc0bd405fb7392bc58d3cc456
    56 rdf:rest N15fd4515bae44fba950c47d394beb68f
    57 Nc7c80b3c682040158a2f68ae676defc5 schema:name dimensions_id
    58 schema:value pub.1106352235
    59 rdf:type schema:PropertyValue
    60 Nd12a0bfc83a847c9975e573d58a5ff55 schema:name readcube_id
    61 schema:value 274662a4d7e1f5c15322ff6feb789a2f1478b0f8ff0814c5a449eb500584a736
    62 rdf:type schema:PropertyValue
    63 Nd549180cad3841eaac02e96dad4b96b4 rdf:first sg:person.012700142123.49
    64 rdf:rest rdf:nil
    65 Nd91ac5b805ef457a887920d7bcc6a2d7 schema:name doi
    66 schema:value 10.1007/978-3-319-99771-1_4
    67 rdf:type schema:PropertyValue
    68 Nfe50b79e02534d5e8ff9ed745622ff40 schema:isbn 978-3-319-99770-4
    69 978-3-319-99771-1
    70 schema:name Privacy in Statistical Databases
    71 rdf:type schema:Book
    72 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
    73 schema:name Mathematical Sciences
    74 rdf:type schema:DefinedTerm
    75 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
    76 schema:name Statistics
    77 rdf:type schema:DefinedTerm
    78 sg:person.012700142123.49 schema:affiliation https://www.grid.ac/institutes/grid.426100.1
    79 schema:familyName Spicer
    80 schema:givenName Keith
    81 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012700142123.49
    82 rdf:type schema:Person
    83 sg:pub.10.1007/978-3-540-87471-3_7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031243917
    84 https://doi.org/10.1007/978-3-540-87471-3_7
    85 rdf:type schema:CreativeWork
    86 sg:pub.10.1007/978-3-642-15838-4_4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034319090
    87 https://doi.org/10.1007/978-3-642-15838-4_4
    88 rdf:type schema:CreativeWork
    89 https://app.dimensions.ai/details/publication/pub.1106813468 rdf:type schema:CreativeWork
    90 https://www.grid.ac/institutes/grid.426100.1 schema:alternateName Office for National Statistics
    91 schema:name Office for National Statistics, PO15 5RR, Titchfield, UK
    92 rdf:type schema:Organization
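
In the table above, the ordered author list is stored as an RDF collection: `schema:author` points at a blank node, and each node carries one `rdf:first` (the item) and one `rdf:rest` (the next node, or `rdf:nil` at the end). The sketch below walks that chain using the author-related rows copied from the table; the function name `walk_rdf_list` and the tuple representation are assumptions made for illustration.

```python
RDF_NIL = "rdf:nil"
AUTHOR_HEAD = "Na2c7d2d6df254032934abea54f174aee"  # object of schema:author (row 3)

# (subject, predicate, object) rows copied from the table above.
TRIPLES = [
    ("Na2c7d2d6df254032934abea54f174aee", "rdf:first", "N1a41429fc813490d8f806b864977d602"),
    ("Na2c7d2d6df254032934abea54f174aee", "rdf:rest", "N08d777a17d264cde8155592319fddba6"),
    ("N08d777a17d264cde8155592319fddba6", "rdf:first", "N5bbdd465c8584d45bf0f87b6f8b2ad19"),
    ("N08d777a17d264cde8155592319fddba6", "rdf:rest", "Nd549180cad3841eaac02e96dad4b96b4"),
    ("Nd549180cad3841eaac02e96dad4b96b4", "rdf:first", "sg:person.012700142123.49"),
    ("Nd549180cad3841eaac02e96dad4b96b4", "rdf:rest", "rdf:nil"),
    ("N1a41429fc813490d8f806b864977d602", "schema:familyName", "Dove"),
    ("N5bbdd465c8584d45bf0f87b6f8b2ad19", "schema:familyName", "Ntoumos"),
    ("sg:person.012700142123.49", "schema:familyName", "Spicer"),
]


def walk_rdf_list(head, triples):
    """Follow rdf:first / rdf:rest from head and return the items in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items = []
    node = head
    while node != RDF_NIL:
        items.append(first[node])
        node = rest[node]
    return items


family_name = {s: o for s, p, o in TRIPLES if p == "schema:familyName"}
authors = [family_name[node] for node in walk_rdf_list(AUTHOR_HEAD, TRIPLES)]
print(authors)
```

The same traversal applies to the `schema:editor` list, which starts at the blank node in row 10.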
     



