Second-generation PLINK: rising to the challenge of larger and richer datasets


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2015-12

AUTHORS

Christopher C Chang, Carson C Chow, Laurent CAM Tellier, Shashaank Vattikuti, Shaun M Purcell, James J Lee

ABSTRACT

BACKGROUND: PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics. However, the steady accumulation of data from imputation and whole-genome sequencing studies has exposed a strong need for faster and scalable implementations of key functions, such as logistic regression, linkage disequilibrium estimation, and genomic distance evaluation. In addition, GWAS and population-genetic data now frequently contain genotype likelihoods, phase information, and/or multiallelic variants, none of which can be represented by PLINK 1's primary data format.

FINDINGS: To address these issues, we are developing a second-generation codebase for PLINK. The first major release from this codebase, PLINK 1.9, introduces extensive use of bit-level parallelism, [Formula: see text]-time/constant-space Hardy-Weinberg equilibrium and Fisher's exact tests, and many other algorithmic improvements. In combination, these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM. We have also developed an extension to the data format which adds low-overhead support for genotype likelihoods, phase, multiallelic variants, and reference vs. alternate alleles, which is the basis of our planned second release (PLINK 2.0).

CONCLUSIONS: The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility. For the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.

PAGES

7

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/s13742-015-0047-8

DOI

http://dx.doi.org/10.1186/s13742-015-0047-8

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1037894462

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/25722852



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Algorithms", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Computational Biology", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Datasets as Topic", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genetics, Population", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genome-Wide Association Study", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genotyping Techniques", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Likelihood Functions", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Linkage Disequilibrium", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Logistic Models", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Polymorphism, Single Nucleotide", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Software", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Beijing Genomics Institute", 
          "id": "https://www.grid.ac/institutes/grid.21155.32", 
          "name": [
            "Complete Genomics, 2071 Stierlin Court, 94043, Mountain View, CA, USA", 
            "BGI Cognitive Genomics Lab, Building No. 11, Bei Shan Industrial Zone, Yantian District, 518083, Shenzhen, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chang", 
        "givenName": "Christopher C", 
        "id": "sg:person.0603307336.99", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0603307336.99"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "National Institutes of Health", 
          "id": "https://www.grid.ac/institutes/grid.94365.3d", 
          "name": [
            "Mathematical Biology Section, NIDDK/LBM, National Institutes of Health, 20892, Bethesda, MD, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chow", 
        "givenName": "Carson C", 
        "id": "sg:person.01344501026.40", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01344501026.40"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Copenhagen", 
          "id": "https://www.grid.ac/institutes/grid.5254.6", 
          "name": [
            "BGI Cognitive Genomics Lab, Building No. 11, Bei Shan Industrial Zone, Yantian District, 518083, Shenzhen, China", 
            "Bioinformatics Centre, University of Copenhagen, 2200, Copenhagen, Denmark"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Tellier", 
        "givenName": "Laurent CAM", 
        "id": "sg:person.0665035701.74", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0665035701.74"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "National Institutes of Health", 
          "id": "https://www.grid.ac/institutes/grid.94365.3d", 
          "name": [
            "Mathematical Biology Section, NIDDK/LBM, National Institutes of Health, 20892, Bethesda, MD, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Vattikuti", 
        "givenName": "Shashaank", 
        "id": "sg:person.0655025005.16", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0655025005.16"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Massachusetts General Hospital", 
          "id": "https://www.grid.ac/institutes/grid.32224.35", 
          "name": [
            "Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, 02142, Cambridge, MA, USA", 
            "Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, 10029, New York, NY, USA", 
            "Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 10029, New York, NY, USA", 
            "Analytic and Translational Genetics Unit, Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, 02114, Boston, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Purcell", 
        "givenName": "Shaun M", 
        "id": "sg:person.01351436610.29", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01351436610.29"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Minnesota", 
          "id": "https://www.grid.ac/institutes/grid.17635.36", 
          "name": [
            "Mathematical Biology Section, NIDDK/LBM, National Institutes of Health, 20892, Bethesda, MD, USA", 
            "Department of Psychology, University of Minnesota Twin Cities, 55455, Minneapolis, MN, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lee", 
        "givenName": "James J", 
        "id": "sg:person.01326713436.22", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01326713436.22"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1038/nature11632", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000661742", 
          "https://doi.org/10.1038/nature11632"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ajhg.2011.01.010", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001408750"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/168173.168412", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1002897629"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bth457", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1008081196"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ajhg.2010.11.011", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009497006"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/gepi.21696", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009542095"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nbt.2241", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009931717", 
          "https://doi.org/10.1038/nbt.2241"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.1069424", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010139737"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pgen.1002625", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010725215"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/comjnl/20.4.364", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011411491"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1515/sagmb-2012-0039", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013334003"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btr341", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013587508"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bts086", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014692388"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btr330", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018404011"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1086/519795", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019061180"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-9-309", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019379161", 
          "https://doi.org/10.1186/1471-2105-9-309"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/hdy.1974.89", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1020115003", 
          "https://doi.org/10.1038/hdy.1974.89"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btp352", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023014918"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1534/genetics.113.150029", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1025094141"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btu495", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1025878059"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1214/07-aoas131", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031228174"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.107524.110", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032096953"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-15-10", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033055767", 
          "https://doi.org/10.1186/1471-2105-15-10"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-8-428", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033445757", 
          "https://doi.org/10.1186/1471-2105-8-428"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pgen.1000180", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041672048"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/sim.3531", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041835237"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pgen.1000529", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043446290"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/2047-217x-3-10", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047107593", 
          "https://doi.org/10.1186/2047-217x-3-10"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/1816038.1816021", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048324140"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.csda.2005.09.004", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1049185754"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/6497.214326", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1051743138"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ajhg.2010.07.021", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052707952"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ci200235e", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055402698"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1086/302698", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1058610264"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1086/378099", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1058671155"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1086/429864", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1058716362"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.28.706.49", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062560131"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2307/2532296", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1069977714"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2015-12", 
    "datePublishedReg": "2015-12-01", 
    "description": "BACKGROUND: PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics. However, the steady accumulation of data from imputation and whole-genome sequencing studies has exposed a strong need for faster and scalable implementations of key functions, such as logistic regression, linkage disequilibrium estimation, and genomic distance evaluation. In addition, GWAS and population-genetic data now frequently contain genotype likelihoods, phase information, and/or multiallelic variants, none of which can be represented by PLINK 1's primary data format.\nFINDINGS: To address these issues, we are developing a second-generation codebase for PLINK. The first major release from this codebase, PLINK 1.9, introduces extensive use of bit-level parallelism, [Formula: see text]-time/constant-space Hardy-Weinberg equilibrium and Fisher's exact tests, and many other algorithmic improvements. In combination, these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM. We have also developed an extension to the data format which adds low-overhead support for genotype likelihoods, phase, multiallelic variants, and reference vs. alternate alleles, which is the basis of our planned second release (PLINK 2.0).\nCONCLUSIONS: The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility. For the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/s13742-015-0047-8", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.2725077", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1047731", 
        "issn": [
          "2047-217X"
        ], 
        "name": "GigaScience", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "4"
      }
    ], 
    "name": "Second-generation PLINK: rising to the challenge of larger and richer datasets", 
    "pagination": "7", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "b484627dc77a04193c297dd61bf0f336575f5cb9eff09d34a08e0f5b968ab53a"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "25722852"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "101596872"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/s13742-015-0047-8"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1037894462"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/s13742-015-0047-8", 
      "https://app.dimensions.ai/details/publication/pub.1037894462"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T20:49", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8684_00000523.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1186%2Fs13742-015-0047-8"
  }
]
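As a minimal sketch of how a record shaped like the one above could be consumed, the following Python reads the JSON-LD array, joins each author's `givenName` and `familyName`, and collects the cited ids. The function name `summarize_record` and the abbreviated `SAMPLE` string are illustrative assumptions, not part of SciGraph. Because citation arrays in exports like this may contain repeated ids, the sketch removes repeats while keeping the original order.

```python
import json


def summarize_record(jsonld_text):
    """Return (title, author names, unique cited ids) from a SciGraph-style JSON-LD array."""
    records = json.loads(jsonld_text)
    record = records[0]  # the export wraps a single article in a list

    # Author order in the array is preserved as-is.
    authors = [
        f"{a.get('givenName', '')} {a.get('familyName', '')}".strip()
        for a in record.get("author", [])
    ]

    # dict.fromkeys drops repeated ids while keeping first-seen order.
    cited = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    return record.get("name"), authors, cited


# Hypothetical, heavily abbreviated stand-in for the full record (assumption).
SAMPLE = """[{"name": "Second-generation PLINK: rising to the challenge of larger and richer datasets",
  "author": [{"familyName": "Chang", "givenName": "Christopher C"},
             {"familyName": "Chow", "givenName": "Carson C"}],
  "citation": [{"id": "sg:pub.10.1186/1471-2105-9-309"},
               {"id": "sg:pub.10.1186/1471-2105-9-309"},
               {"id": "https://doi.org/10.1086/519795"}]}]"""

if __name__ == "__main__":
    title, authors, cited = summarize_record(SAMPLE)
    print(title)
    print(authors)
    print(cited)
```

The same function should work on the full export, provided the text is saved as valid JSON first.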
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/s13742-015-0047-8'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/s13742-015-0047-8'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/s13742-015-0047-8'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/s13742-015-0047-8'
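The four curl calls differ only in their `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The helper names `build_request` and `fetch` and the short format keys are assumptions made for illustration. Whether the endpoint still responds is not something this sketch can guarantee, so the usage example only builds the request and does not send it.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Format keys are hypothetical shorthand; the MIME types are the ones used in the curl calls.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build (but do not send) a GET request for one record in the given format."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": ACCEPT_TYPES[fmt]},
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body decoded as UTF-8."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    req = build_request("pub.10.1186/s13742-015-0047-8", "turtle")
    print(req.full_url, req.get_header("Accept"))
```

Calling `fetch("pub.10.1186/s13742-015-0047-8", "nt")` would perform the equivalent of the N-Triples curl command, assuming the service is reachable.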


 

This table displays all metadata directly associated with this object as RDF triples.

288 TRIPLES      21 PREDICATES      78 URIs      32 LITERALS      20 BLANK NODES
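Summary counts of this kind can be approximated from an N-Triples download. The sketch below is a simplified line parser, not a full N-Triples implementation, and the page does not state how it defines each count. It assumes that URIs, literals, and blank nodes are counted as distinct terms in subject or object position, and that predicates are counted separately. The function name `triple_stats` and the sample data are hypothetical.

```python
import re

# Simplified term patterns (assumption: well-formed, one triple per line).
SUBJECT = r'(<[^>]*>|_:[A-Za-z0-9_.-]+)'
PREDICATE = r'(<[^>]*>)'
OBJECT = r'(<[^>]*>|_:[A-Za-z0-9_.-]+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
TRIPLE = re.compile(r'^\s*' + SUBJECT + r'\s+' + PREDICATE + r'\s+' + OBJECT + r'\s*\.\s*$')


def triple_stats(ntriples_text):
    """Count triples and distinct predicates, URIs, literals, and blank nodes."""
    triples = 0
    predicates, uris, literals, blanks = set(), set(), set(), set()
    for line in ntriples_text.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"not a simple N-Triples line: {line!r}")
        subj, pred, obj = match.groups()
        triples += 1
        predicates.add(pred)
        # Classify subject and object terms by their leading characters.
        for term in (subj, obj):
            if term.startswith("<"):
                uris.add(term)
            elif term.startswith("_:"):
                blanks.add(term)
            else:
                literals.add(term)
    return {
        "triples": triples,
        "predicates": len(predicates),
        "uris": len(uris),
        "literals": len(literals),
        "blank_nodes": len(blanks),
    }


# Hypothetical four-triple sample (assumption, not taken from the record).
SAMPLE_NT = """
<http://example.org/pub1> <http://schema.org/name> "Example article" .
<http://example.org/pub1> <http://schema.org/about> _:b0 .
_:b0 <http://schema.org/name> "Genetics" .
_:b0 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/DefinedTerm> .
"""

if __name__ == "__main__":
    print(triple_stats(SAMPLE_NT))
```

For real data, a dedicated RDF library would be a safer choice than a regular expression, since N-Triples allows escapes and term forms this pattern does not cover.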

Row Subject Predicate Object (a row that omits the subject or predicate continues the one from the row above)
1 sg:pub.10.1186/s13742-015-0047-8 schema:about N2b8d7b26ae364a94a9480c636f25e8d1
2 N34d0d964334f4a65bccbee1aeae1c33f
3 N3bcd2e9697da42ad8b63a6d8c6b55c7f
4 N4f70ad7f88ee431097f9a5c7bdf80200
5 N604e9c4452a4471c8e6f33300bfe83ca
6 N686d93a5e8724f3d9d71ef5b7784f0ea
7 N7029eaabebe74c858a97815b1644c25e
8 N7723ab5d2e4d44eda25180f95a685530
9 N7e5e98c7e4a546a2ac39f33e2de2def3
10 Nda77df0af48e4581902a0c9636db75e9
11 Nf222c4f69ed84549b790eedb29ee575d
12 anzsrc-for:06
13 anzsrc-for:0604
14 schema:author N90c9433b93934e83b9208237a89ce8b5
15 schema:citation sg:pub.10.1038/hdy.1974.89
16 sg:pub.10.1038/nature11632
17 sg:pub.10.1038/nbt.2241
18 sg:pub.10.1186/1471-2105-15-10
19 sg:pub.10.1186/1471-2105-8-428
20 sg:pub.10.1186/1471-2105-9-309
21 sg:pub.10.1186/2047-217x-3-10
22 https://doi.org/10.1002/gepi.21696
23 https://doi.org/10.1002/sim.3531
24 https://doi.org/10.1016/j.ajhg.2010.07.021
25 https://doi.org/10.1016/j.ajhg.2010.11.011
26 https://doi.org/10.1016/j.ajhg.2011.01.010
27 https://doi.org/10.1016/j.csda.2005.09.004
28 https://doi.org/10.1021/ci200235e
29 https://doi.org/10.1086/302698
30 https://doi.org/10.1086/378099
31 https://doi.org/10.1086/429864
32 https://doi.org/10.1086/519795
33 https://doi.org/10.1093/bioinformatics/bth457
34 https://doi.org/10.1093/bioinformatics/btp352
35 https://doi.org/10.1093/bioinformatics/btr330
36 https://doi.org/10.1093/bioinformatics/btr341
37 https://doi.org/10.1093/bioinformatics/bts086
38 https://doi.org/10.1093/bioinformatics/btu495
39 https://doi.org/10.1093/comjnl/20.4.364
40 https://doi.org/10.1101/gr.107524.110
41 https://doi.org/10.1126/science.1069424
42 https://doi.org/10.1126/science.28.706.49
43 https://doi.org/10.1145/168173.168412
44 https://doi.org/10.1145/1816038.1816021
45 https://doi.org/10.1145/6497.214326
46 https://doi.org/10.1214/07-aoas131
47 https://doi.org/10.1371/journal.pgen.1000180
48 https://doi.org/10.1371/journal.pgen.1000529
49 https://doi.org/10.1371/journal.pgen.1002625
50 https://doi.org/10.1515/sagmb-2012-0039
51 https://doi.org/10.1534/genetics.113.150029
52 https://doi.org/10.2307/2532296
53 schema:datePublished 2015-12
54 schema:datePublishedReg 2015-12-01
55 schema:description BACKGROUND: PLINK 1 is a widely used open-source C/C++ toolset for genome-wide association studies (GWAS) and research in population genetics. However, the steady accumulation of data from imputation and whole-genome sequencing studies has exposed a strong need for faster and scalable implementations of key functions, such as logistic regression, linkage disequilibrium estimation, and genomic distance evaluation. In addition, GWAS and population-genetic data now frequently contain genotype likelihoods, phase information, and/or multiallelic variants, none of which can be represented by PLINK 1's primary data format. FINDINGS: To address these issues, we are developing a second-generation codebase for PLINK. The first major release from this codebase, PLINK 1.9, introduces extensive use of bit-level parallelism, [Formula: see text]-time/constant-space Hardy-Weinberg equilibrium and Fisher's exact tests, and many other algorithmic improvements. In combination, these changes accelerate most operations by 1-4 orders of magnitude, and allow the program to handle datasets too large to fit in RAM. We have also developed an extension to the data format which adds low-overhead support for genotype likelihoods, phase, multiallelic variants, and reference vs. alternate alleles, which is the basis of our planned second release (PLINK 2.0). CONCLUSIONS: The second-generation versions of PLINK will offer dramatic improvements in performance and compatibility. For the first time, users without access to high-end computing resources can perform several essential analyses of the feature-rich and very large genetic datasets coming into use.
56 schema:genre research_article
57 schema:inLanguage en
58 schema:isAccessibleForFree true
59 schema:isPartOf N3649ac2a52be4ee09f4412cb341f509f
60 N9c5870f5620b43c695722b804f281ebe
61 sg:journal.1047731
62 schema:name Second-generation PLINK: rising to the challenge of larger and richer datasets
63 schema:pagination 7
64 schema:productId N063db3b8bc5e46e28464bab260f74931
65 N0e10b2a43e274f9893c00123783ceebc
66 N11c3b191ce4c4022ab49d28dd7277014
67 N390b3db8d75341a3823186bb0d3ef114
68 N7ae45565f977466bbb28087ba1a5b1e2
69 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037894462
70 https://doi.org/10.1186/s13742-015-0047-8
71 schema:sdDatePublished 2019-04-10T20:49
72 schema:sdLicense https://scigraph.springernature.com/explorer/license/
73 schema:sdPublisher Nf3e79f1c3d0847cd862896cd44f6110c
74 schema:url http://link.springer.com/10.1186%2Fs13742-015-0047-8
75 sgo:license sg:explorer/license/
76 sgo:sdDataset articles
77 rdf:type schema:ScholarlyArticle
78 N063db3b8bc5e46e28464bab260f74931 schema:name pubmed_id
79 schema:value 25722852
80 rdf:type schema:PropertyValue
81 N0e10b2a43e274f9893c00123783ceebc schema:name doi
82 schema:value 10.1186/s13742-015-0047-8
83 rdf:type schema:PropertyValue
84 N11c3b191ce4c4022ab49d28dd7277014 schema:name nlm_unique_id
85 schema:value 101596872
86 rdf:type schema:PropertyValue
87 N2b8d7b26ae364a94a9480c636f25e8d1 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
88 schema:name Polymorphism, Single Nucleotide
89 rdf:type schema:DefinedTerm
90 N34d0d964334f4a65bccbee1aeae1c33f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
91 schema:name Software
92 rdf:type schema:DefinedTerm
93 N3649ac2a52be4ee09f4412cb341f509f schema:issueNumber 1
94 rdf:type schema:PublicationIssue
95 N390b3db8d75341a3823186bb0d3ef114 schema:name readcube_id
96 schema:value b484627dc77a04193c297dd61bf0f336575f5cb9eff09d34a08e0f5b968ab53a
97 rdf:type schema:PropertyValue
98 N3adc130e008d4d6eaf0d544f77da3384 rdf:first sg:person.01326713436.22
99 rdf:rest rdf:nil
100 N3bcd2e9697da42ad8b63a6d8c6b55c7f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
101 schema:name Genotyping Techniques
102 rdf:type schema:DefinedTerm
103 N4f70ad7f88ee431097f9a5c7bdf80200 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
104 schema:name Genetics, Population
105 rdf:type schema:DefinedTerm
106 N5a95ce6612c6414fa7db1c193b9f7fc8 rdf:first sg:person.0665035701.74
107 rdf:rest N64e7837551af4dbcad5b52b0e045cb4b
108 N604e9c4452a4471c8e6f33300bfe83ca schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
109 schema:name Logistic Models
110 rdf:type schema:DefinedTerm
111 N64e7837551af4dbcad5b52b0e045cb4b rdf:first sg:person.0655025005.16
112 rdf:rest Nfa6d3f28a1e14b4d8fd1822fd93a67e4
113 N686d93a5e8724f3d9d71ef5b7784f0ea schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
114 schema:name Likelihood Functions
115 rdf:type schema:DefinedTerm
116 N7029eaabebe74c858a97815b1644c25e schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
117 schema:name Algorithms
118 rdf:type schema:DefinedTerm
119 N7723ab5d2e4d44eda25180f95a685530 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
120 schema:name Computational Biology
121 rdf:type schema:DefinedTerm
122 N7ae45565f977466bbb28087ba1a5b1e2 schema:name dimensions_id
123 schema:value pub.1037894462
124 rdf:type schema:PropertyValue
125 N7e5e98c7e4a546a2ac39f33e2de2def3 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
126 schema:name Datasets as Topic
127 rdf:type schema:DefinedTerm
128 N804d37cfb426469f8c467d5612eb2fc9 rdf:first sg:person.01344501026.40
129 rdf:rest N5a95ce6612c6414fa7db1c193b9f7fc8
130 N90c9433b93934e83b9208237a89ce8b5 rdf:first sg:person.0603307336.99
131 rdf:rest N804d37cfb426469f8c467d5612eb2fc9
132 N9c5870f5620b43c695722b804f281ebe schema:volumeNumber 4
133 rdf:type schema:PublicationVolume
134 Nda77df0af48e4581902a0c9636db75e9 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
135 schema:name Linkage Disequilibrium
136 rdf:type schema:DefinedTerm
137 Nf222c4f69ed84549b790eedb29ee575d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
138 schema:name Genome-Wide Association Study
139 rdf:type schema:DefinedTerm
140 Nf3e79f1c3d0847cd862896cd44f6110c schema:name Springer Nature - SN SciGraph project
141 rdf:type schema:Organization
142 Nfa6d3f28a1e14b4d8fd1822fd93a67e4 rdf:first sg:person.01351436610.29
143 rdf:rest N3adc130e008d4d6eaf0d544f77da3384
144 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
145 schema:name Biological Sciences
146 rdf:type schema:DefinedTerm
147 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
148 schema:name Genetics
149 rdf:type schema:DefinedTerm
150 sg:grant.2725077 http://pending.schema.org/fundedItem sg:pub.10.1186/s13742-015-0047-8
151 rdf:type schema:MonetaryGrant
152 sg:journal.1047731 schema:issn 2047-217X
153 schema:name GigaScience
154 rdf:type schema:Periodical
155 sg:person.01326713436.22 schema:affiliation https://www.grid.ac/institutes/grid.17635.36
156 schema:familyName Lee
157 schema:givenName James J
158 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01326713436.22
159 rdf:type schema:Person
160 sg:person.01344501026.40 schema:affiliation https://www.grid.ac/institutes/grid.94365.3d
161 schema:familyName Chow
162 schema:givenName Carson C
163 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01344501026.40
164 rdf:type schema:Person
165 sg:person.01351436610.29 schema:affiliation https://www.grid.ac/institutes/grid.32224.35
166 schema:familyName Purcell
167 schema:givenName Shaun M
168 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01351436610.29
169 rdf:type schema:Person
170 sg:person.0603307336.99 schema:affiliation https://www.grid.ac/institutes/grid.21155.32
171 schema:familyName Chang
172 schema:givenName Christopher C
173 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0603307336.99
174 rdf:type schema:Person
175 sg:person.0655025005.16 schema:affiliation https://www.grid.ac/institutes/grid.94365.3d
176 schema:familyName Vattikuti
177 schema:givenName Shashaank
178 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0655025005.16
179 rdf:type schema:Person
180 sg:person.0665035701.74 schema:affiliation https://www.grid.ac/institutes/grid.5254.6
181 schema:familyName Tellier
182 schema:givenName Laurent CAM
183 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0665035701.74
184 rdf:type schema:Person
185 sg:pub.10.1038/hdy.1974.89 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020115003
186 https://doi.org/10.1038/hdy.1974.89
187 rdf:type schema:CreativeWork
188 sg:pub.10.1038/nature11632 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000661742
189 https://doi.org/10.1038/nature11632
190 rdf:type schema:CreativeWork
191 sg:pub.10.1038/nbt.2241 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009931717
192 https://doi.org/10.1038/nbt.2241
193 rdf:type schema:CreativeWork
194 sg:pub.10.1186/1471-2105-15-10 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033055767
195 https://doi.org/10.1186/1471-2105-15-10
196 rdf:type schema:CreativeWork
197 sg:pub.10.1186/1471-2105-8-428 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033445757
198 https://doi.org/10.1186/1471-2105-8-428
199 rdf:type schema:CreativeWork
200 sg:pub.10.1186/1471-2105-9-309 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019379161
201 https://doi.org/10.1186/1471-2105-9-309
202 rdf:type schema:CreativeWork
203 sg:pub.10.1186/2047-217x-3-10 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047107593
204 https://doi.org/10.1186/2047-217x-3-10
205 rdf:type schema:CreativeWork
206 https://doi.org/10.1002/gepi.21696 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009542095
207 rdf:type schema:CreativeWork
208 https://doi.org/10.1002/sim.3531 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041835237
209 rdf:type schema:CreativeWork
210 https://doi.org/10.1016/j.ajhg.2010.07.021 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052707952
211 rdf:type schema:CreativeWork
212 https://doi.org/10.1016/j.ajhg.2010.11.011 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009497006
213 rdf:type schema:CreativeWork
214 https://doi.org/10.1016/j.ajhg.2011.01.010 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001408750
215 rdf:type schema:CreativeWork
216 https://doi.org/10.1016/j.csda.2005.09.004 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049185754
217 rdf:type schema:CreativeWork
218 https://doi.org/10.1021/ci200235e schema:sameAs https://app.dimensions.ai/details/publication/pub.1055402698
219 rdf:type schema:CreativeWork
220 https://doi.org/10.1086/302698 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058610264
221 rdf:type schema:CreativeWork
222 https://doi.org/10.1086/378099 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058671155
223 rdf:type schema:CreativeWork
224 https://doi.org/10.1086/429864 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058716362
225 rdf:type schema:CreativeWork
226 https://doi.org/10.1086/519795 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019061180
227 rdf:type schema:CreativeWork
228 https://doi.org/10.1093/bioinformatics/bth457 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008081196
229 rdf:type schema:CreativeWork
230 https://doi.org/10.1093/bioinformatics/btp352 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023014918
231 rdf:type schema:CreativeWork
232 https://doi.org/10.1093/bioinformatics/btr330 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018404011
233 rdf:type schema:CreativeWork
234 https://doi.org/10.1093/bioinformatics/btr341 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013587508
235 rdf:type schema:CreativeWork
236 https://doi.org/10.1093/bioinformatics/bts086 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014692388
237 rdf:type schema:CreativeWork
238 https://doi.org/10.1093/bioinformatics/btu495 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025878059
239 rdf:type schema:CreativeWork
240 https://doi.org/10.1093/comjnl/20.4.364 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011411491
241 rdf:type schema:CreativeWork
242 https://doi.org/10.1101/gr.107524.110 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032096953
243 rdf:type schema:CreativeWork
244 https://doi.org/10.1126/science.1069424 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010139737
245 rdf:type schema:CreativeWork
246 https://doi.org/10.1126/science.28.706.49 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062560131
247 rdf:type schema:CreativeWork
248 https://doi.org/10.1145/168173.168412 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002897629
249 rdf:type schema:CreativeWork
250 https://doi.org/10.1145/1816038.1816021 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048324140
251 rdf:type schema:CreativeWork
252 https://doi.org/10.1145/6497.214326 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051743138
253 rdf:type schema:CreativeWork
254 https://doi.org/10.1214/07-aoas131 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031228174
255 rdf:type schema:CreativeWork
256 https://doi.org/10.1371/journal.pgen.1000180 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041672048
257 rdf:type schema:CreativeWork
258 https://doi.org/10.1371/journal.pgen.1000529 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043446290
259 rdf:type schema:CreativeWork
260 https://doi.org/10.1371/journal.pgen.1002625 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010725215
261 rdf:type schema:CreativeWork
262 https://doi.org/10.1515/sagmb-2012-0039 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013334003
263 rdf:type schema:CreativeWork
264 https://doi.org/10.1534/genetics.113.150029 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025094141
265 rdf:type schema:CreativeWork
266 https://doi.org/10.2307/2532296 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069977714
267 rdf:type schema:CreativeWork
268 https://www.grid.ac/institutes/grid.17635.36 schema:alternateName University of Minnesota
269 schema:name Department of Psychology, University of Minnesota Twin Cities, 55455, Minneapolis, MN, USA
270 Mathematical Biology Section, NIDDK/LBM, National Institutes of Health, 20892, Bethesda, MD, USA
271 rdf:type schema:Organization
272 https://www.grid.ac/institutes/grid.21155.32 schema:alternateName Beijing Genomics Institute
273 schema:name BGI Cognitive Genomics Lab, Building No. 11, Bei Shan Industrial Zone, Yantian District, 518083, Shenzhen, China
274 Complete Genomics, 2071 Stierlin Court, 94043, Mountain View, CA, USA
275 rdf:type schema:Organization
276 https://www.grid.ac/institutes/grid.32224.35 schema:alternateName Massachusetts General Hospital
277 schema:name Analytic and Translational Genetics Unit, Psychiatric and Neurodevelopmental Genetics Unit, Massachusetts General Hospital, 02114, Boston, MA, USA
278 Division of Psychiatric Genomics, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, 10029, New York, NY, USA
279 Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 10029, New York, NY, USA
280 Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, 02142, Cambridge, MA, USA
281 rdf:type schema:Organization
282 https://www.grid.ac/institutes/grid.5254.6 schema:alternateName University of Copenhagen
283 schema:name BGI Cognitive Genomics Lab, Building No. 11, Bei Shan Industrial Zone, Yantian District, 518083, Shenzhen, China
284 Bioinformatics Centre, University of Copenhagen, 2200, Copenhagen, Denmark
285 rdf:type schema:Organization
286 https://www.grid.ac/institutes/grid.94365.3d schema:alternateName National Institutes of Health
287 schema:name Mathematical Biology Section, NIDDK/LBM, National Institutes of Health, 20892, Bethesda, MD, USA
288 rdf:type schema:Organization
 





