Assembly and annotation of an Ashkenazi human reference genome


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2020-06-02

AUTHORS

Alaina Shumate, Aleksey V. Zimin, Rachel M. Sherman, Daniela Puiu, Justin M. Wagner, Nathan D. Olson, Mihaela Pertea, Marc L. Salit, Justin M. Zook, Steven L. Salzberg

ABSTRACT

Background: Thousands of experiments and studies use the human reference genome as a resource each year. This single reference genome, GRCh38, is a mosaic created from a small number of individuals, representing a very small sample of the human population. There is a need for reference genomes from multiple human populations to avoid potential biases.

Results: Here, we describe the assembly and annotation of the genome of an Ashkenazi individual and the creation of a new, population-specific human reference genome. This genome is more contiguous and more complete than GRCh38, the latest version of the human reference genome, and is annotated with highly similar gene content. The Ashkenazi reference genome, Ash1, contains 2,973,118,650 nucleotides as compared to 2,937,639,212 in GRCh38. Annotation identified 20,157 protein-coding genes, of which 19,563 are >99% identical to their counterparts on GRCh38. Most of the remaining genes have small differences. Forty of the protein-coding genes in GRCh38 are missing from Ash1; however, all of these genes are members of multi-gene families for which Ash1 contains other copies. Eleven genes appear on different chromosomes from their homologs in GRCh38. Alignment of DNA sequences from an unrelated Ashkenazi individual to Ash1 identified ~1 million fewer homozygous SNPs than alignment of those same sequences to the more-distant GRCh38 genome, illustrating one of the benefits of population-specific reference genomes.

Conclusions: The Ash1 genome is presented as a reference for any genetic studies involving Ashkenazi Jewish individuals.
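The headline figures in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the abstract; the constant and function names are illustrative assumptions and do not come from the paper.

```python
# Figures quoted in the abstract; the names are illustrative, not from the paper.
ASH1_NUCLEOTIDES = 2_973_118_650
GRCH38_NUCLEOTIDES = 2_937_639_212
ASH1_PROTEIN_CODING_GENES = 20_157
GENES_OVER_99_PCT_IDENTICAL = 19_563


def abstract_summary() -> dict:
    """Derive simple differences and ratios from the quoted figures."""
    return {
        # How many more nucleotides Ash1 contains than GRCh38.
        "extra_nucleotides": ASH1_NUCLEOTIDES - GRCH38_NUCLEOTIDES,
        # Annotated genes that are not >99% identical to a GRCh38 counterpart.
        "remaining_genes": ASH1_PROTEIN_CODING_GENES - GENES_OVER_99_PCT_IDENTICAL,
        # Share of annotated genes that are >99% identical, in percent.
        "pct_over_99_identical": round(
            100 * GENES_OVER_99_PCT_IDENTICAL / ASH1_PROTEIN_CODING_GENES, 2
        ),
    }


if __name__ == "__main__":
    for key, value in abstract_summary().items():
        print(f"{key}: {value}")
```

On these figures, Ash1 is 35,479,438 nucleotides longer than GRCh38, and 594 of the 20,157 annotated protein-coding genes fall outside the >99% identity group.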

PAGES

129

References to SciGraph publications

  • 2001-02-15. Initial sequencing and analysis of the human genome in NATURE
  • 2016-10-05. De novo assembly and phasing of a Korean human genome in NATURE
  • 2019-08-09. Is it time to change the reference genome? in GENOME BIOLOGY
  • 2015-04-24. Characterization and identification of hidden rare variants in the human genome in BMC GENOMICS
  • 2019-04-01. An open resource for accurately benchmarking small variant and reference calls in NATURE BIOTECHNOLOGY
  • 2019-03-11. Best practices for benchmarking germline small-variant calls in human genomes in NATURE BIOTECHNOLOGY
  • 2016-06-07. Extensive sequencing of seven human genomes to characterize benchmark reference materials in SCIENTIFIC DATA
  • 2017-10-26. Catching hidden variation: systematic correction of reference minor allele annotation in clinical variant calling in GENETICS IN MEDICINE
  • 2012-09-19. Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory in BMC BIOINFORMATICS
  • 2018-07-16. A synthetic-diploid benchmark for accurate variant-calling evaluation in NATURE METHODS
  • 2016-10-12. Genomics is failing on diversity in NATURE
  • 2020-05-27. The mutational constraint spectrum quantified from variation in 141,456 humans in NATURE
  • 2014-02-28. Harvard Personal Genome Project: lessons from participatory public research in GENOME MEDICINE
  • 2018-08-02. De novo human genome assemblies reveal spectrum of alternative haplotypes in diverse populations in NATURE COMMUNICATIONS
  • 2018-11-28. CHESS: a new human gene catalog curated from thousands of large-scale RNA sequencing experiments reveals extensive transcriptional noise in GENOME BIOLOGY
  • 2014-02-16. Integrating human sequence data sets provides a resource of benchmark SNP and indel genotype calls in NATURE BIOTECHNOLOGY
  • 2012-03-04. Fast gapped-read alignment with Bowtie 2 in NATURE METHODS

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1186/s13059-020-02047-7

    DOI

    http://dx.doi.org/10.1186/s13059-020-02047-7

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1128162826

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/32487205



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Biological Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Genetics", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Genome, Human", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Humans", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Molecular Sequence Annotation", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Reference Values", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Translocation, Genetic", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA", 
              "id": "http://www.grid.ac/institutes/grid.21107.35", 
              "name": [
                "Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA", 
                "Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Shumate", 
            "givenName": "Alaina", 
            "id": "sg:person.07665515552.55", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07665515552.55"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA", 
              "id": "http://www.grid.ac/institutes/grid.21107.35", 
              "name": [
                "Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA", 
                "Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Zimin", 
            "givenName": "Aleksey V.", 
            "id": "sg:person.0642355665.18", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642355665.18"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA", 
              "id": "http://www.grid.ac/institutes/grid.21107.35", 
              "name": [
                "Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA", 
                "Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Sherman", 
            "givenName": "Rachel M.", 
            "id": "sg:person.016043540221.18", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016043540221.18"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA", 
              "id": "http://www.grid.ac/institutes/grid.21107.35", 
              "name": [
                "Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA", 
                "Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Puiu", 
            "givenName": "Daniela", 
            "id": "sg:person.0776626013.24", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0776626013.24"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "National Institute of Standards and Technology, Gaithersburg, MD, USA", 
              "id": "http://www.grid.ac/institutes/grid.94225.38", 
              "name": [
                "National Institute of Standards and Technology, Gaithersburg, MD, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Wagner", 
            "givenName": "Justin M.", 
            "id": "sg:person.016166661351.64", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016166661351.64"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "National Institute of Standards and Technology, Gaithersburg, MD, USA", 
              "id": "http://www.grid.ac/institutes/grid.94225.38", 
              "name": [
                "National Institute of Standards and Technology, Gaithersburg, MD, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Olson", 
            "givenName": "Nathan D.", 
            "id": "sg:person.01052711105.47", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01052711105.47"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA", 
              "id": "http://www.grid.ac/institutes/grid.21107.35", 
              "name": [
                "Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA", 
                "Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Pertea", 
            "givenName": "Mihaela", 
            "id": "sg:person.0720234625.35", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0720234625.35"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Joint Initiative for Metrology in Biology, Stanford University, Stanford, CA, USA", 
              "id": "http://www.grid.ac/institutes/grid.168010.e", 
              "name": [
                "Joint Initiative for Metrology in Biology, Stanford University, Stanford, CA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Salit", 
            "givenName": "Marc L.", 
            "id": "sg:person.0635305134.21", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0635305134.21"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "National Institute of Standards and Technology, Gaithersburg, MD, USA", 
              "id": "http://www.grid.ac/institutes/grid.94225.38", 
              "name": [
                "National Institute of Standards and Technology, Gaithersburg, MD, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Zook", 
            "givenName": "Justin M.", 
            "id": "sg:person.01040401176.42", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01040401176.42"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA", 
              "id": "http://www.grid.ac/institutes/grid.21107.35", 
              "name": [
                "Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA", 
                "Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA", 
                "Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA", 
                "Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Salzberg", 
            "givenName": "Steven L.", 
            "id": "sg:person.01223441713.02", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01223441713.02"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1186/1471-2105-13-238", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1028668057", 
              "https://doi.org/10.1186/1471-2105-13-238"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/s12864-015-1481-9", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1039450295", 
              "https://doi.org/10.1186/s12864-015-1481-9"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/s41586-020-2308-7", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1127885244", 
              "https://doi.org/10.1038/s41586-020-2308-7"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/35057062", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1042854081", 
              "https://doi.org/10.1038/35057062"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/s13059-018-1590-2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1110263091", 
              "https://doi.org/10.1186/s13059-018-1590-2"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/gim.2017.168", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1092347962", 
              "https://doi.org/10.1038/gim.2017.168"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/s41587-019-0074-6", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1113157788", 
              "https://doi.org/10.1038/s41587-019-0074-6"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nbt.2835", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1015051370", 
              "https://doi.org/10.1038/nbt.2835"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/538161a", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1009029590", 
              "https://doi.org/10.1038/538161a"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gm527", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1016277336", 
              "https://doi.org/10.1186/gm527"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/s13059-019-1774-4", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1120247943", 
              "https://doi.org/10.1186/s13059-019-1774-4"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nmeth.1923", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006541515", 
              "https://doi.org/10.1038/nmeth.1923"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/s41467-018-05513-w", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1105886905", 
              "https://doi.org/10.1038/s41467-018-05513-w"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/sdata.2016.25", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1041924932", 
              "https://doi.org/10.1038/sdata.2016.25"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nature20098", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1049863914", 
              "https://doi.org/10.1038/nature20098"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/s41592-018-0054-7", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1105534003", 
              "https://doi.org/10.1038/s41592-018-0054-7"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/s41587-019-0054-x", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1112678001", 
              "https://doi.org/10.1038/s41587-019-0054-x"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2020-06-02", 
        "datePublishedReg": "2020-06-02", 
        "description": "BackgroundThousands of experiments and studies use the human reference genome as a resource each year. This single reference genome, GRCh38, is a mosaic created from a small number of individuals, representing a very small sample of the human population. There is a need for reference genomes from multiple human populations to avoid potential biases.ResultsHere, we describe the assembly and annotation of the genome of an Ashkenazi individual and the creation of a new, population-specific human reference genome. This genome is more contiguous and more complete than GRCh38, the latest version of the human reference genome, and is annotated with highly similar gene content. The Ashkenazi reference genome, Ash1, contains 2,973,118,650 nucleotides as compared to 2,937,639,212 in GRCh38. Annotation identified 20,157 protein-coding genes, of which 19,563 are >\u200999% identical to their counterparts on GRCh38. Most of the remaining genes have small differences. Forty of the protein-coding genes in GRCh38 are missing from Ash1; however, all of these genes are members of multi-gene families for which Ash1 contains other copies. Eleven genes appear on different chromosomes from their homologs in GRCh38. Alignment of DNA sequences from an unrelated Ashkenazi individual to Ash1 identified ~\u20091 million fewer homozygous SNPs than alignment of those same sequences to the more-distant GRCh38 genome, illustrating one of the benefits of population-specific reference genomes.ConclusionsThe Ash1 genome is presented as a reference for any genetic studies involving Ashkenazi Jewish individuals.", 
        "genre": "article", 
        "id": "sg:pub.10.1186/s13059-020-02047-7", 
        "isAccessibleForFree": true, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.2529453", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.8827846", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.8383234", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1023439", 
            "issn": [
              "1474-760X", 
              "1465-6906"
            ], 
            "name": "Genome Biology", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "1", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "21"
          }
        ], 
        "keywords": [
          "human reference genome", 
          "protein-coding genes", 
          "reference genome", 
          "similar gene content", 
          "multi-gene family", 
          "single reference genome", 
          "population-specific reference genomes", 
          "multiple human populations", 
          "human population", 
          "Ashkenazi individuals", 
          "gene content", 
          "homozygous SNPs", 
          "different chromosomes", 
          "Eleven genes", 
          "DNA sequences", 
          "genome", 
          "ASH1", 
          "GRCh38", 
          "genetic studies", 
          "genes", 
          "Ashkenazi Jewish individuals", 
          "annotation", 
          "sequence", 
          "assembly", 
          "same sequence", 
          "homolog", 
          "chromosomes", 
          "nucleotides", 
          "SNPs", 
          "copies", 
          "population", 
          "Jewish individuals", 
          "ResultsHere", 
          "family", 
          "members", 
          "small number", 
          "individuals", 
          "potential biases", 
          "alignment", 
          "small differences", 
          "study", 
          "counterparts", 
          "content", 
          "number", 
          "experiments", 
          "resources", 
          "differences", 
          "latest version", 
          "small samples", 
          "samples", 
          "biases", 
          "years", 
          "reference", 
          "benefits", 
          "need", 
          "creation", 
          "version"
        ], 
        "name": "Assembly and annotation of an Ashkenazi human reference genome", 
        "pagination": "129", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1128162826"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1186/s13059-020-02047-7"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "32487205"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1186/s13059-020-02047-7", 
          "https://app.dimensions.ai/details/publication/pub.1128162826"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-10-01T06:47", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20221001/entities/gbq_results/article/article_866.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1186/s13059-020-02047-7"
      }
    ]
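Because the record is plain JSON, any JSON parser can read it. The following is a minimal sketch using Python's standard library. The embedded excerpt is trimmed from the record above (two of the ten authors, identifiers kept in full), and the helper names `load_article`, `author_names`, and `product_ids` are hypothetical, not part of any SciGraph tooling.

```python
import json

# Trimmed excerpt of the JSON-LD record shown above (only a few keys kept).
RECORD_EXCERPT = """
[
  {
    "author": [
      {"familyName": "Shumate", "givenName": "Alaina", "type": "Person"},
      {"familyName": "Zimin", "givenName": "Aleksey V.", "type": "Person"}
    ],
    "datePublished": "2020-06-02",
    "name": "Assembly and annotation of an Ashkenazi human reference genome",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1128162826"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/s13059-020-02047-7"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["32487205"]}
    ],
    "type": "ScholarlyArticle"
  }
]
"""


def load_article(text: str) -> dict:
    """Return the first ScholarlyArticle object in a JSON-LD array."""
    for node in json.loads(text):
        if node.get("type") == "ScholarlyArticle":
            return node
    raise ValueError("no ScholarlyArticle node found")


def author_names(article: dict) -> list[str]:
    """Format each author as 'givenName familyName', in record order."""
    return [f"{a['givenName']} {a['familyName']}" for a in article.get("author", [])]


def product_ids(article: dict) -> dict[str, str]:
    """Map identifier name (doi, pubmed_id, ...) to its first listed value."""
    return {p["name"]: p["value"][0] for p in article.get("productId", [])}


if __name__ == "__main__":
    article = load_article(RECORD_EXCERPT)
    print(article["name"])
    print(author_names(article))
    print(product_ids(article))
```

This treats the record as ordinary JSON and ignores the `@context`; a full JSON-LD processor would be needed to expand the keys into their schema.org IRIs.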
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/s13059-020-02047-7'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/s13059-020-02047-7'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/s13059-020-02047-7'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/s13059-020-02047-7'
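The four curl commands differ only in the Accept header, so the same request can be parameterised by format. Below is a minimal sketch with Python's standard `urllib`. The `ACCEPT_TYPES` table, its short format labels, and the `build_request` and `fetch` helpers are hypothetical names introduced for illustration. Whether the endpoint still responds is not verified here.

```python
from urllib.request import Request, urlopen

# URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/s13059-020-02047-7"

# Short labels (an assumption) mapped to the media types used in the curl commands.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> Request:
    """Build a GET request asking for the record in the given RDF format."""
    try:
        media_type = ACCEPT_TYPES[fmt]
    except KeyError:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(ACCEPT_TYPES)}") from None
    return Request(url, headers={"Accept": media_type})


def fetch(fmt: str, url: str = RECORD_URL, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text (needs network access)."""
    with urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Equivalent to the Turtle curl command above.
    print(fetch("turtle")[:500])
```

Keeping request construction separate from the network call lets the header logic be checked offline.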


     

    This table displays all metadata directly associated with this object as RDF triples.

    286 TRIPLES      21 PREDICATES      104 URIs      79 LITERALS      12 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1186/s13059-020-02047-7 schema:about N2ec73abdf30c492ca63814ceec77c19a
    2 N83c9a150d0224c6383767642d3b2590d
    3 N85deb08133204eb3b4c91a69111d1e47
    4 Na42ed9421c724316b8f7caf4fba38e51
    5 Nd52edd9a6d1e4c23b3ac5e50a0366bb6
    6 anzsrc-for:06
    7 anzsrc-for:0604
    8 schema:author N9a4a39e0829343d59cf127567834b237
    9 schema:citation sg:pub.10.1038/35057062
    10 sg:pub.10.1038/538161a
    11 sg:pub.10.1038/gim.2017.168
    12 sg:pub.10.1038/nature20098
    13 sg:pub.10.1038/nbt.2835
    14 sg:pub.10.1038/nmeth.1923
    15 sg:pub.10.1038/s41467-018-05513-w
    16 sg:pub.10.1038/s41586-020-2308-7
    17 sg:pub.10.1038/s41587-019-0054-x
    18 sg:pub.10.1038/s41587-019-0074-6
    19 sg:pub.10.1038/s41592-018-0054-7
    20 sg:pub.10.1038/sdata.2016.25
    21 sg:pub.10.1186/1471-2105-13-238
    22 sg:pub.10.1186/gm527
    23 sg:pub.10.1186/s12864-015-1481-9
    24 sg:pub.10.1186/s13059-018-1590-2
    25 sg:pub.10.1186/s13059-019-1774-4
    26 schema:datePublished 2020-06-02
    27 schema:datePublishedReg 2020-06-02
    28 schema:description BackgroundThousands of experiments and studies use the human reference genome as a resource each year. This single reference genome, GRCh38, is a mosaic created from a small number of individuals, representing a very small sample of the human population. There is a need for reference genomes from multiple human populations to avoid potential biases.ResultsHere, we describe the assembly and annotation of the genome of an Ashkenazi individual and the creation of a new, population-specific human reference genome. This genome is more contiguous and more complete than GRCh38, the latest version of the human reference genome, and is annotated with highly similar gene content. The Ashkenazi reference genome, Ash1, contains 2,973,118,650 nucleotides as compared to 2,937,639,212 in GRCh38. Annotation identified 20,157 protein-coding genes, of which 19,563 are > 99% identical to their counterparts on GRCh38. Most of the remaining genes have small differences. Forty of the protein-coding genes in GRCh38 are missing from Ash1; however, all of these genes are members of multi-gene families for which Ash1 contains other copies. Eleven genes appear on different chromosomes from their homologs in GRCh38. Alignment of DNA sequences from an unrelated Ashkenazi individual to Ash1 identified ~ 1 million fewer homozygous SNPs than alignment of those same sequences to the more-distant GRCh38 genome, illustrating one of the benefits of population-specific reference genomes.ConclusionsThe Ash1 genome is presented as a reference for any genetic studies involving Ashkenazi Jewish individuals.
    29 schema:genre article
    30 schema:isAccessibleForFree true
    31 schema:isPartOf Na0185d5bd1c34122a4ffc81204865f96
    32 Nbf5d80a2a21a4e53a5c64d3c2f876a23
    33 sg:journal.1023439
    34 schema:keywords ASH1
    35 Ashkenazi Jewish individuals
    36 Ashkenazi individuals
    37 DNA sequences
    38 Eleven genes
    39 GRCh38
    40 Jewish individuals
    41 ResultsHere
    42 SNPs
    43 alignment
    44 annotation
    45 assembly
    46 benefits
    47 biases
    48 chromosomes
    49 content
    50 copies
    51 counterparts
    52 creation
    53 differences
    54 different chromosomes
    55 experiments
    56 family
    57 gene content
    58 genes
    59 genetic studies
    60 genome
    61 homolog
    62 homozygous SNPs
    63 human population
    64 human reference genome
    65 individuals
    66 latest version
    67 members
    68 multi-gene family
    69 multiple human populations
    70 need
    71 nucleotides
    72 number
    73 population
    74 population-specific reference genomes
    75 potential biases
    76 protein-coding genes
    77 reference
    78 reference genome
    79 resources
    80 same sequence
    81 samples
    82 sequence
    83 similar gene content
    84 single reference genome
    85 small differences
    86 small number
    87 small samples
    88 study
    89 version
    90 years
    91 schema:name Assembly and annotation of an Ashkenazi human reference genome
    92 schema:pagination 129
    93 schema:productId Naa51c7654eda4fe7ad16c893c044ec60
    94 Nc4c973dfa29844f395dcc6aec7b51895
    95 Nea3942e180e24840b5d8fac5915f2143
    96 schema:sameAs https://app.dimensions.ai/details/publication/pub.1128162826
    97 https://doi.org/10.1186/s13059-020-02047-7
    98 schema:sdDatePublished 2022-10-01T06:47
    99 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    100 schema:sdPublisher Nef1d5732c8cb45bc8dccaee03416a073
    101 schema:url https://doi.org/10.1186/s13059-020-02047-7
    102 sgo:license sg:explorer/license/
    103 sgo:sdDataset articles
    104 rdf:type schema:ScholarlyArticle
    105 N2c878c8f51bf47eb91c5ba9cdfdee23c rdf:first sg:person.0720234625.35
    106 rdf:rest Nb4971a59d31c4e31af63a1df81f95a2b
    107 N2ec73abdf30c492ca63814ceec77c19a schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    108 schema:name Translocation, Genetic
    109 rdf:type schema:DefinedTerm
    110 N4a2d798e61be4c37986f267b0bb7769e rdf:first sg:person.01052711105.47
    111 rdf:rest N2c878c8f51bf47eb91c5ba9cdfdee23c
    112 N83c9a150d0224c6383767642d3b2590d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    113 schema:name Reference Values
    114 rdf:type schema:DefinedTerm
    115 N8467bd7ae36a48a8808eb0293ff30bf1 rdf:first sg:person.016043540221.18
    116 rdf:rest Nc448cab2d5e242a4809c65d3a4f8ee8b
    117 N85deb08133204eb3b4c91a69111d1e47 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    118 schema:name Molecular Sequence Annotation
    119 rdf:type schema:DefinedTerm
    120 N8641c92edcd94d6cac89ec5af208cc68 rdf:first sg:person.01040401176.42
    121 rdf:rest N98a4c53c608b4048b42746d823b1d7e4
    122 N98a4c53c608b4048b42746d823b1d7e4 rdf:first sg:person.01223441713.02
    123 rdf:rest rdf:nil
    124 N9a4a39e0829343d59cf127567834b237 rdf:first sg:person.07665515552.55
    125 rdf:rest Ndc15abbc15a94cc6b1e9655d15ae0176
    126 Na0185d5bd1c34122a4ffc81204865f96 schema:volumeNumber 21
    127 rdf:type schema:PublicationVolume
    128 Na42ed9421c724316b8f7caf4fba38e51 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    129 schema:name Genome, Human
    130 rdf:type schema:DefinedTerm
    131 Naa51c7654eda4fe7ad16c893c044ec60 schema:name dimensions_id
    132 schema:value pub.1128162826
    133 rdf:type schema:PropertyValue
    134 Nb4971a59d31c4e31af63a1df81f95a2b rdf:first sg:person.0635305134.21
    135 rdf:rest N8641c92edcd94d6cac89ec5af208cc68
    136 Nbf5d80a2a21a4e53a5c64d3c2f876a23 schema:issueNumber 1
    137 rdf:type schema:PublicationIssue
    138 Nc448cab2d5e242a4809c65d3a4f8ee8b rdf:first sg:person.0776626013.24
    139 rdf:rest Nd08233aacf164e148f0c116c2e678b4f
    140 Nc4c973dfa29844f395dcc6aec7b51895 schema:name pubmed_id
    141 schema:value 32487205
    142 rdf:type schema:PropertyValue
    143 Nd08233aacf164e148f0c116c2e678b4f rdf:first sg:person.016166661351.64
    144 rdf:rest N4a2d798e61be4c37986f267b0bb7769e
    145 Nd52edd9a6d1e4c23b3ac5e50a0366bb6 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    146 schema:name Humans
    147 rdf:type schema:DefinedTerm
    148 Ndc15abbc15a94cc6b1e9655d15ae0176 rdf:first sg:person.0642355665.18
    149 rdf:rest N8467bd7ae36a48a8808eb0293ff30bf1
    150 Nea3942e180e24840b5d8fac5915f2143 schema:name doi
    151 schema:value 10.1186/s13059-020-02047-7
    152 rdf:type schema:PropertyValue
    153 Nef1d5732c8cb45bc8dccaee03416a073 schema:name Springer Nature - SN SciGraph project
    154 rdf:type schema:Organization
    155 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
    156 schema:name Biological Sciences
    157 rdf:type schema:DefinedTerm
    158 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
    159 schema:name Genetics
    160 rdf:type schema:DefinedTerm
    161 sg:grant.2529453 http://pending.schema.org/fundedItem sg:pub.10.1186/s13059-020-02047-7
    162 rdf:type schema:MonetaryGrant
    163 sg:grant.8383234 http://pending.schema.org/fundedItem sg:pub.10.1186/s13059-020-02047-7
    164 rdf:type schema:MonetaryGrant
    165 sg:grant.8827846 http://pending.schema.org/fundedItem sg:pub.10.1186/s13059-020-02047-7
    166 rdf:type schema:MonetaryGrant
    167 sg:journal.1023439 schema:issn 1465-6906
    168 1474-760X
    169 schema:name Genome Biology
    170 schema:publisher Springer Nature
    171 rdf:type schema:Periodical
    172 sg:person.01040401176.42 schema:affiliation grid-institutes:grid.94225.38
    173 schema:familyName Zook
    174 schema:givenName Justin M.
    175 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01040401176.42
    176 rdf:type schema:Person
    177 sg:person.01052711105.47 schema:affiliation grid-institutes:grid.94225.38
    178 schema:familyName Olson
    179 schema:givenName Nathan D.
    180 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01052711105.47
    181 rdf:type schema:Person
    182 sg:person.01223441713.02 schema:affiliation grid-institutes:grid.21107.35
    183 schema:familyName Salzberg
    184 schema:givenName Steven L.
    185 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01223441713.02
    186 rdf:type schema:Person
    187 sg:person.016043540221.18 schema:affiliation grid-institutes:grid.21107.35
    188 schema:familyName Sherman
    189 schema:givenName Rachel M.
    190 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016043540221.18
    191 rdf:type schema:Person
    192 sg:person.016166661351.64 schema:affiliation grid-institutes:grid.94225.38
    193 schema:familyName Wagner
    194 schema:givenName Justin M.
    195 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016166661351.64
    196 rdf:type schema:Person
    197 sg:person.0635305134.21 schema:affiliation grid-institutes:grid.168010.e
    198 schema:familyName Salit
    199 schema:givenName Marc L.
    200 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0635305134.21
    201 rdf:type schema:Person
    202 sg:person.0642355665.18 schema:affiliation grid-institutes:grid.21107.35
    203 schema:familyName Zimin
    204 schema:givenName Aleksey V.
    205 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0642355665.18
    206 rdf:type schema:Person
    207 sg:person.0720234625.35 schema:affiliation grid-institutes:grid.21107.35
    208 schema:familyName Pertea
    209 schema:givenName Mihaela
    210 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0720234625.35
    211 rdf:type schema:Person
    212 sg:person.07665515552.55 schema:affiliation grid-institutes:grid.21107.35
    213 schema:familyName Shumate
    214 schema:givenName Alaina
    215 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07665515552.55
    216 rdf:type schema:Person
    217 sg:person.0776626013.24 schema:affiliation grid-institutes:grid.21107.35
    218 schema:familyName Puiu
    219 schema:givenName Daniela
    220 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0776626013.24
    221 rdf:type schema:Person
    222 sg:pub.10.1038/35057062 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042854081
    223 https://doi.org/10.1038/35057062
    224 rdf:type schema:CreativeWork
    225 sg:pub.10.1038/538161a schema:sameAs https://app.dimensions.ai/details/publication/pub.1009029590
    226 https://doi.org/10.1038/538161a
    227 rdf:type schema:CreativeWork
    228 sg:pub.10.1038/gim.2017.168 schema:sameAs https://app.dimensions.ai/details/publication/pub.1092347962
    229 https://doi.org/10.1038/gim.2017.168
    230 rdf:type schema:CreativeWork
    231 sg:pub.10.1038/nature20098 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049863914
    232 https://doi.org/10.1038/nature20098
    233 rdf:type schema:CreativeWork
    234 sg:pub.10.1038/nbt.2835 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015051370
    235 https://doi.org/10.1038/nbt.2835
    236 rdf:type schema:CreativeWork
    237 sg:pub.10.1038/nmeth.1923 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006541515
    238 https://doi.org/10.1038/nmeth.1923
    239 rdf:type schema:CreativeWork
    240 sg:pub.10.1038/s41467-018-05513-w schema:sameAs https://app.dimensions.ai/details/publication/pub.1105886905
    241 https://doi.org/10.1038/s41467-018-05513-w
    242 rdf:type schema:CreativeWork
    243 sg:pub.10.1038/s41586-020-2308-7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1127885244
    244 https://doi.org/10.1038/s41586-020-2308-7
    245 rdf:type schema:CreativeWork
    246 sg:pub.10.1038/s41587-019-0054-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1112678001
    247 https://doi.org/10.1038/s41587-019-0054-x
    248 rdf:type schema:CreativeWork
    249 sg:pub.10.1038/s41587-019-0074-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1113157788
    250 https://doi.org/10.1038/s41587-019-0074-6
    251 rdf:type schema:CreativeWork
    252 sg:pub.10.1038/s41592-018-0054-7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1105534003
    253 https://doi.org/10.1038/s41592-018-0054-7
    254 rdf:type schema:CreativeWork
    255 sg:pub.10.1038/sdata.2016.25 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041924932
    256 https://doi.org/10.1038/sdata.2016.25
    257 rdf:type schema:CreativeWork
    258 sg:pub.10.1186/1471-2105-13-238 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028668057
    259 https://doi.org/10.1186/1471-2105-13-238
    260 rdf:type schema:CreativeWork
    261 sg:pub.10.1186/gm527 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016277336
    262 https://doi.org/10.1186/gm527
    263 rdf:type schema:CreativeWork
    264 sg:pub.10.1186/s12864-015-1481-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039450295
    265 https://doi.org/10.1186/s12864-015-1481-9
    266 rdf:type schema:CreativeWork
    267 sg:pub.10.1186/s13059-018-1590-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1110263091
    268 https://doi.org/10.1186/s13059-018-1590-2
    269 rdf:type schema:CreativeWork
    270 sg:pub.10.1186/s13059-019-1774-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1120247943
    271 https://doi.org/10.1186/s13059-019-1774-4
    272 rdf:type schema:CreativeWork
    273 grid-institutes:grid.168010.e schema:alternateName Joint Initiative for Metrology in Biology, Stanford University, Stanford, CA, USA
    274 schema:name Joint Initiative for Metrology in Biology, Stanford University, Stanford, CA, USA
    275 rdf:type schema:Organization
    276 grid-institutes:grid.21107.35 schema:alternateName Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
    277 Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
    278 Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
    279 schema:name Center for Computational Biology, Johns Hopkins University, Baltimore, MD, USA
    280 Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
    281 Department of Biostatistics, Johns Hopkins University, Baltimore, MD, USA
    282 Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
    283 rdf:type schema:Organization
    284 grid-institutes:grid.94225.38 schema:alternateName National Institute of Standards and Technology, Gaithersburg, MD, USA
    285 schema:name National Institute of Standards and Technology, Gaithersburg, MD, USA
    286 rdf:type schema:Organization
     



