Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2003-04

AUTHORS

Joseph Cheung, Xavier Estivill, Razi Khaja, Jeffrey R MacDonald, Ken Lau, Lap-Chee Tsui, Stephen W Scherer

ABSTRACT

BACKGROUND: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies.
RESULTS: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each at least 5 kb in size and with at least 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473, or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants.
CONCLUSION: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.
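The percentages quoted in the abstract follow directly from the reported counts. The short Python check below is only an illustration, not part of the record: the variable names and the `percent` helper are hypothetical, and the only inputs are the figures stated in the abstract.

```python
# Figures quoted in the abstract; all names here are illustrative only.
assembly_mb = 3043.1        # size of the June 2002 assembly analysed, in Mb
duplicated_mb = 107.4       # sequence contained in segmental duplications, in Mb
misassigned_mb = 38.9       # sequence likely involved in misassignment errors, in Mb
total_snps = 2_327_473      # SNPs examined in the public databases
paralogous_snps = 199_965   # SNPs flagged as potential paralogous sequence variants


def percent(part: float, whole: float) -> float:
    """Return `part` expressed as a percentage of `whole`."""
    return 100.0 * part / whole


duplicated_pct = percent(duplicated_mb, assembly_mb)
misassigned_pct = percent(misassigned_mb, assembly_mb)
paralogous_pct = percent(paralogous_snps, total_snps)

print(f"segmental duplications: {duplicated_pct:.2f}%")
print(f"potential misassignments: {misassigned_pct:.2f}%")
print(f"potential paralogous variants: {paralogous_pct:.1f}%")
```

Rounded to the precision used in the abstract, the three ratios agree with the quoted 3.53%, 1.28% and 8.6%.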

PAGES

r25

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/gb-2003-4-4-r25

DOI

http://dx.doi.org/10.1186/gb-2003-4-4-r25

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1025892956

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/12702206



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Artifacts", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Base Sequence", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Chromosomes, Human", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Computational Biology", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Gene Duplication", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genetic Diseases, Inborn", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genetic Variation", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genome, Human", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Humans", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Polymorphism, Single Nucleotide", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Sequence Analysis, DNA", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Hospital for Sick Children", 
          "id": "https://www.grid.ac/institutes/grid.42327.30", 
          "name": [
            "Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Cheung", 
        "givenName": "Joseph", 
        "id": "sg:person.01346013070.79", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01346013070.79"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Pompeu Fabra University", 
          "id": "https://www.grid.ac/institutes/grid.5612.0", 
          "name": [
            "Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada", 
            "Genes and Disease Program, Genomic Regulation Center, and Facultat Ciencies de la Salut i de la Vida, Universitat Pompeu Fabra, E-08003, Barcelona, Catalonia, Spain"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Estivill", 
        "givenName": "Xavier", 
        "id": "sg:person.0675214555.67", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0675214555.67"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Hospital for Sick Children", 
          "id": "https://www.grid.ac/institutes/grid.42327.30", 
          "name": [
            "Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Khaja", 
        "givenName": "Razi", 
        "id": "sg:person.0702001650.57", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0702001650.57"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Hospital for Sick Children", 
          "id": "https://www.grid.ac/institutes/grid.42327.30", 
          "name": [
            "Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "MacDonald", 
        "givenName": "Jeffrey R", 
        "id": "sg:person.01005010277.13", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01005010277.13"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Hospital for Sick Children", 
          "id": "https://www.grid.ac/institutes/grid.42327.30", 
          "name": [
            "Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lau", 
        "givenName": "Ken", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Hong Kong", 
          "id": "https://www.grid.ac/institutes/grid.194645.b", 
          "name": [
            "Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada", 
            "Department of Molecular and Medical Genetics, University of Toronto, 555 University Avenue, M5G 1X8, Toronto, ON, Canada", 
            "The University of Hong Kong, Pokfulam Road, Hong Kong"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Tsui", 
        "givenName": "Lap-Chee", 
        "id": "sg:person.01243057047.05", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01243057047.05"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Toronto", 
          "id": "https://www.grid.ac/institutes/grid.17063.33", 
          "name": [
            "Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada", 
            "Department of Molecular and Medical Genetics, University of Toronto, 555 University Avenue, M5G 1X8, Toronto, ON, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Scherer", 
        "givenName": "Stephen W", 
        "id": "sg:person.011066475117.55", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011066475117.55"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1016/s0168-9525(02)02592-1", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000179704"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/18.2.335", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001279450"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/gcc.10111", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001434096"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.1058040", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001517867"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0092-8674(01)00447-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003113849"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/hmg/11.17.1987", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003966518"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.1072047", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1006710139"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrg705", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1008302356", 
          "https://doi.org/10.1038/nrg705"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.403602", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1012195556"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0022-2836(05)80360-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013618994"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0960-9822(01)00490-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1015447031"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/ng0396-288", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017805829", 
          "https://doi.org/10.1038/ng0396-288"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/35097067", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018632260", 
          "https://doi.org/10.1038/35097067"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/35093500", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1029953644", 
          "https://doi.org/10.1038/35093500"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/ng753", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031273704", 
          "https://doi.org/10.1038/ng753"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1006/geno.2000.6312", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036764135"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/ng757", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040986528", 
          "https://doi.org/10.1038/ng757"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.187101", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042408677"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/35057062", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042854081", 
          "https://doi.org/10.1038/35057062"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0168-9525(98)01555-8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044342085"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.152171299", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046195397"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/hmg/9.4.489", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1049132213"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.290.5494.1151", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050543187"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1086/319506", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1058622732"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1086/341610", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1058639071"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/9780471650126.dob0273.pub2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1089871629"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/0471684228.egp03898", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1089901767"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2003-04", 
    "datePublishedReg": "2003-04-01", 
    "description": "BACKGROUND: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies.\nRESULTS: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each with size equal or more than 5 kb and 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants.\nCONCLUSION: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/gb-2003-4-4-r25", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1023439", 
        "issn": [
          "1474-760X", 
          "1465-6906"
        ], 
        "name": "Genome Biology", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "4", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "4"
      }
    ], 
    "name": "Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence", 
    "pagination": "r25", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "7b5b10a15174f02750b22560d30b4eb07ae075baf38e7f6b661a34f5e5467970"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "12702206"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "100960660"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/gb-2003-4-4-r25"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1025892956"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/gb-2003-4-4-r25", 
      "https://app.dimensions.ai/details/publication/pub.1025892956"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-11T00:16", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8695_00000513.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1186%2Fgb-2003-4-4-r25"
  }
]
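As a rough illustration of how the JSON-LD record above can be consumed, the sketch below reads it with Python's standard `json` module. It assumes a trimmed excerpt of the record (two of the seven authors and two of the product identifiers) so that the example stays short; the helper names `author_names` and `product_ids` are hypothetical and not part of SciGraph.

```python
import json

# Trimmed excerpt of the record above: only the fields used below are kept,
# and only two of the seven authors are included to keep the sketch short.
RECORD_EXCERPT = """
[
  {
    "id": "sg:pub.10.1186/gb-2003-4-4-r25",
    "author": [
      {"familyName": "Cheung", "givenName": "Joseph", "type": "Person"},
      {"familyName": "Lau", "givenName": "Ken", "type": "Person"}
    ],
    "productId": [
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["12702206"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/gb-2003-4-4-r25"]}
    ]
  }
]
"""


def author_names(record: dict) -> list[str]:
    """Return 'Given Family' strings in the order the authors are listed."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def product_ids(record: dict) -> dict[str, str]:
    """Map each productId name (for example 'doi') to its first value."""
    return {p["name"]: p["value"][0] for p in record.get("productId", [])}


# The top level of the document is a JSON array holding a single record.
record = json.loads(RECORD_EXCERPT)[0]
names = author_names(record)
ids = product_ids(record)

print(names)
print(ids["doi"], ids["pubmed_id"])
```

Because JSON-LD is plain JSON, no RDF library is needed for simple lookups like these; a dedicated JSON-LD processor would only be required to expand the `@context` into full IRIs.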
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/gb-2003-4-4-r25'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/gb-2003-4-4-r25'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/gb-2003-4-4-r25'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/gb-2003-4-4-r25'
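The four curl commands differ only in the `Accept` header, which is ordinary HTTP content negotiation. The same requests could be sketched in Python with the standard library as shown below. The URL and media types are copied from the commands above; the short format labels, `build_request` and `fetch` are hypothetical names chosen for this example, and the sketch assumes the SciGraph endpoint is still reachable.

```python
import urllib.request

# Base URL taken from the curl examples above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/gb-2003-4-4-r25"

# Media types copied from the four curl commands; the keys are illustrative labels.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> urllib.request.Request:
    """Build (but do not send) a GET request asking for the given RDF format."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt: str, timeout: float = 30.0) -> bytes:
    """Send the request and return the raw response body (needs network access)."""
    with urllib.request.urlopen(build_request(fmt), timeout=timeout) as response:
        return response.read()


# Building the request has no side effects, so it is safe to inspect offline.
request = build_request("turtle")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `fetch("json-ld")` would then perform the equivalent of the first curl command.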


 

This table displays all metadata directly associated with this object as RDF triples.

255 TRIPLES      21 PREDICATES      67 URIs      32 LITERALS      20 BLANK NODES
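In the triples listed in the table, the ordered author list is stored as an RDF collection: `schema:author` points at a blank node, and each node carries an `rdf:first` (the author) and an `rdf:rest` (the next node), ending in `rdf:nil`. The sketch below walks that chain in plain Python. The triples are transcribed from the table; `walk_collection` is a hypothetical helper written for this illustration, not a SciGraph or RDF-library function.

```python
# (subject, predicate, object) triples transcribed from the table of triples.
TRIPLES = [
    ("N9e8d9caa0917465da5e3936e89c6e8f4", "rdf:first", "sg:person.01346013070.79"),
    ("N9e8d9caa0917465da5e3936e89c6e8f4", "rdf:rest", "Nc4c570b6a53842de8d1ba75d583f3a8d"),
    ("Nc4c570b6a53842de8d1ba75d583f3a8d", "rdf:first", "sg:person.0675214555.67"),
    ("Nc4c570b6a53842de8d1ba75d583f3a8d", "rdf:rest", "N05ae2509230d4c3787d89c581a6ae8ff"),
    ("N05ae2509230d4c3787d89c581a6ae8ff", "rdf:first", "sg:person.0702001650.57"),
    ("N05ae2509230d4c3787d89c581a6ae8ff", "rdf:rest", "N89f844372c0446e8a133d47658f2f541"),
    ("N89f844372c0446e8a133d47658f2f541", "rdf:first", "sg:person.01005010277.13"),
    ("N89f844372c0446e8a133d47658f2f541", "rdf:rest", "Nf7d05c61d16c486aa947f83642f4559a"),
    # This entry points at a blank node: the author has no sg:person identifier.
    ("Nf7d05c61d16c486aa947f83642f4559a", "rdf:first", "N8b66801edc45420f8d86c25eaed28dbe"),
    ("Nf7d05c61d16c486aa947f83642f4559a", "rdf:rest", "N169d4c7b7dcb4245910181af20fe12c3"),
    ("N169d4c7b7dcb4245910181af20fe12c3", "rdf:first", "sg:person.01243057047.05"),
    ("N169d4c7b7dcb4245910181af20fe12c3", "rdf:rest", "Ncfacb9cf5ab4476f9a421eadeff02966"),
    ("Ncfacb9cf5ab4476f9a421eadeff02966", "rdf:first", "sg:person.011066475117.55"),
    ("Ncfacb9cf5ab4476f9a421eadeff02966", "rdf:rest", "rdf:nil"),
]


def walk_collection(head: str, triples: list[tuple[str, str, str]]) -> list[str]:
    """Follow rdf:first / rdf:rest links from `head` until rdf:nil is reached."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items = []
    node = head
    while node != "rdf:nil":
        items.append(first[node])
        node = rest[node]
    return items


# The head node is the object of the schema:author triple in the table.
authors_in_order = walk_collection("N9e8d9caa0917465da5e3936e89c6e8f4", TRIPLES)
for position, author in enumerate(authors_in_order, start=1):
    print(position, author)
```

The linked-list encoding is what preserves author order, since a plain set of `schema:author` triples would carry no ordering.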

Subject Predicate Object
1 sg:pub.10.1186/gb-2003-4-4-r25 schema:about N2c7c098622744c0888c1fe8878db7928
2 N4e9b57bfb9f14b12b3296c13b5febc68
3 N8c5331c624dd44848d377d6e33c1982d
4 Nad70f4da60ad499286443c82b743a771
5 Nb5d048afc1b44246bee806ca74b914bb
6 Nc1cc59af7197445fb538e84a9f9b260f
7 Nc8f343bbd8234f6ca88a5c90e18d23a9
8 Nc93003aa50574af6932f6c4a85910122
9 Ne27ef2b6a4274d7ba017d02aa5675a1a
10 Ne7729f859e3444dfa9c1bb7fa6330cdc
11 Nfb98bf7988e340679a6f480b40ed96b9
12 anzsrc-for:06
13 anzsrc-for:0604
14 schema:author N9e8d9caa0917465da5e3936e89c6e8f4
15 schema:citation sg:pub.10.1038/35057062
16 sg:pub.10.1038/35093500
17 sg:pub.10.1038/35097067
18 sg:pub.10.1038/ng0396-288
19 sg:pub.10.1038/ng753
20 sg:pub.10.1038/ng757
21 sg:pub.10.1038/nrg705
22 https://doi.org/10.1002/0471684228.egp03898
23 https://doi.org/10.1002/9780471650126.dob0273.pub2
24 https://doi.org/10.1002/gcc.10111
25 https://doi.org/10.1006/geno.2000.6312
26 https://doi.org/10.1016/s0022-2836(05)80360-2
27 https://doi.org/10.1016/s0092-8674(01)00447-0
28 https://doi.org/10.1016/s0168-9525(02)02592-1
29 https://doi.org/10.1016/s0168-9525(98)01555-8
30 https://doi.org/10.1016/s0960-9822(01)00490-0
31 https://doi.org/10.1073/pnas.152171299
32 https://doi.org/10.1086/319506
33 https://doi.org/10.1086/341610
34 https://doi.org/10.1093/bioinformatics/18.2.335
35 https://doi.org/10.1093/hmg/11.17.1987
36 https://doi.org/10.1093/hmg/9.4.489
37 https://doi.org/10.1101/gr.187101
38 https://doi.org/10.1101/gr.403602
39 https://doi.org/10.1126/science.1058040
40 https://doi.org/10.1126/science.1072047
41 https://doi.org/10.1126/science.290.5494.1151
42 schema:datePublished 2003-04
43 schema:datePublishedReg 2003-04-01
44 schema:description BACKGROUND: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. RESULTS: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each with size equal or more than 5 kb and 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. CONCLUSION: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.
45 schema:genre research_article
46 schema:inLanguage en
47 schema:isAccessibleForFree true
48 schema:isPartOf Na4ad01557fd8496a9038f94116d14c08
49 Nbc407c50c28c45d8989c67e591d7578f
50 sg:journal.1023439
51 schema:name Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence
52 schema:pagination r25
53 schema:productId N01f7888fe764424c820e9e05345c5d2c
54 N228642c996044c6aa2e88fba22471cd8
55 N4ed642509061450abe6b5168f94367ac
56 Ncae4c9022a9e4964b4e2a0a4410270b9
57 Nfda4e7ce30b1445fb861a4e78feeb055
58 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025892956
59 https://doi.org/10.1186/gb-2003-4-4-r25
60 schema:sdDatePublished 2019-04-11T00:16
61 schema:sdLicense https://scigraph.springernature.com/explorer/license/
62 schema:sdPublisher N458b123690744053b17086b26c95aa3e
63 schema:url http://link.springer.com/10.1186%2Fgb-2003-4-4-r25
64 sgo:license sg:explorer/license/
65 sgo:sdDataset articles
66 rdf:type schema:ScholarlyArticle
67 N01f7888fe764424c820e9e05345c5d2c schema:name doi
68 schema:value 10.1186/gb-2003-4-4-r25
69 rdf:type schema:PropertyValue
70 N05ae2509230d4c3787d89c581a6ae8ff rdf:first sg:person.0702001650.57
71 rdf:rest N89f844372c0446e8a133d47658f2f541
72 N169d4c7b7dcb4245910181af20fe12c3 rdf:first sg:person.01243057047.05
73 rdf:rest Ncfacb9cf5ab4476f9a421eadeff02966
74 N228642c996044c6aa2e88fba22471cd8 schema:name readcube_id
75 schema:value 7b5b10a15174f02750b22560d30b4eb07ae075baf38e7f6b661a34f5e5467970
76 rdf:type schema:PropertyValue
77 N2c7c098622744c0888c1fe8878db7928 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
78 schema:name Base Sequence
79 rdf:type schema:DefinedTerm
80 N458b123690744053b17086b26c95aa3e schema:name Springer Nature - SN SciGraph project
81 rdf:type schema:Organization
82 N4e9b57bfb9f14b12b3296c13b5febc68 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
83 schema:name Artifacts
84 rdf:type schema:DefinedTerm
85 N4ed642509061450abe6b5168f94367ac schema:name nlm_unique_id
86 schema:value 100960660
87 rdf:type schema:PropertyValue
88 N89f844372c0446e8a133d47658f2f541 rdf:first sg:person.01005010277.13
89 rdf:rest Nf7d05c61d16c486aa947f83642f4559a
90 N8b66801edc45420f8d86c25eaed28dbe schema:affiliation https://www.grid.ac/institutes/grid.42327.30
91 schema:familyName Lau
92 schema:givenName Ken
93 rdf:type schema:Person
94 N8c5331c624dd44848d377d6e33c1982d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
95 schema:name Gene Duplication
96 rdf:type schema:DefinedTerm
97 N9e8d9caa0917465da5e3936e89c6e8f4 rdf:first sg:person.01346013070.79
98 rdf:rest Nc4c570b6a53842de8d1ba75d583f3a8d
99 Na4ad01557fd8496a9038f94116d14c08 schema:issueNumber 4
100 rdf:type schema:PublicationIssue
101 Nad70f4da60ad499286443c82b743a771 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
102 schema:name Genome, Human
103 rdf:type schema:DefinedTerm
104 Nb5d048afc1b44246bee806ca74b914bb schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
105 schema:name Humans
106 rdf:type schema:DefinedTerm
107 Nbc407c50c28c45d8989c67e591d7578f schema:volumeNumber 4
108 rdf:type schema:PublicationVolume
109 Nc1cc59af7197445fb538e84a9f9b260f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
110 schema:name Polymorphism, Single Nucleotide
111 rdf:type schema:DefinedTerm
112 Nc4c570b6a53842de8d1ba75d583f3a8d rdf:first sg:person.0675214555.67
113 rdf:rest N05ae2509230d4c3787d89c581a6ae8ff
114 Nc8f343bbd8234f6ca88a5c90e18d23a9 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
115 schema:name Genetic Variation
116 rdf:type schema:DefinedTerm
117 Nc93003aa50574af6932f6c4a85910122 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
118 schema:name Computational Biology
119 rdf:type schema:DefinedTerm
120 Ncae4c9022a9e4964b4e2a0a4410270b9 schema:name dimensions_id
121 schema:value pub.1025892956
122 rdf:type schema:PropertyValue
123 Ncfacb9cf5ab4476f9a421eadeff02966 rdf:first sg:person.011066475117.55
124 rdf:rest rdf:nil
125 Ne27ef2b6a4274d7ba017d02aa5675a1a schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
126 schema:name Sequence Analysis, DNA
127 rdf:type schema:DefinedTerm
128 Ne7729f859e3444dfa9c1bb7fa6330cdc schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
129 schema:name Genetic Diseases, Inborn
130 rdf:type schema:DefinedTerm
131 Nf7d05c61d16c486aa947f83642f4559a rdf:first N8b66801edc45420f8d86c25eaed28dbe
132 rdf:rest N169d4c7b7dcb4245910181af20fe12c3
133 Nfb98bf7988e340679a6f480b40ed96b9 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
134 schema:name Chromosomes, Human
135 rdf:type schema:DefinedTerm
136 Nfda4e7ce30b1445fb861a4e78feeb055 schema:name pubmed_id
137 schema:value 12702206
138 rdf:type schema:PropertyValue
139 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
140 schema:name Biological Sciences
141 rdf:type schema:DefinedTerm
142 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
143 schema:name Genetics
144 rdf:type schema:DefinedTerm
145 sg:journal.1023439 schema:issn 1465-6906
146 1474-760X
147 schema:name Genome Biology
148 rdf:type schema:Periodical
149 sg:person.01005010277.13 schema:affiliation https://www.grid.ac/institutes/grid.42327.30
150 schema:familyName MacDonald
151 schema:givenName Jeffrey R
152 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01005010277.13
153 rdf:type schema:Person
154 sg:person.011066475117.55 schema:affiliation https://www.grid.ac/institutes/grid.17063.33
155 schema:familyName Scherer
156 schema:givenName Stephen W
157 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011066475117.55
158 rdf:type schema:Person
159 sg:person.01243057047.05 schema:affiliation https://www.grid.ac/institutes/grid.194645.b
160 schema:familyName Tsui
161 schema:givenName Lap-Chee
162 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01243057047.05
163 rdf:type schema:Person
164 sg:person.01346013070.79 schema:affiliation https://www.grid.ac/institutes/grid.42327.30
165 schema:familyName Cheung
166 schema:givenName Joseph
167 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01346013070.79
168 rdf:type schema:Person
169 sg:person.0675214555.67 schema:affiliation https://www.grid.ac/institutes/grid.5612.0
170 schema:familyName Estivill
171 schema:givenName Xavier
172 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0675214555.67
173 rdf:type schema:Person
174 sg:person.0702001650.57 schema:affiliation https://www.grid.ac/institutes/grid.42327.30
175 schema:familyName Khaja
176 schema:givenName Razi
177 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0702001650.57
178 rdf:type schema:Person
179 sg:pub.10.1038/35057062 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042854081
180 https://doi.org/10.1038/35057062
181 rdf:type schema:CreativeWork
182 sg:pub.10.1038/35093500 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029953644
183 https://doi.org/10.1038/35093500
184 rdf:type schema:CreativeWork
185 sg:pub.10.1038/35097067 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018632260
186 https://doi.org/10.1038/35097067
187 rdf:type schema:CreativeWork
188 sg:pub.10.1038/ng0396-288 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017805829
189 https://doi.org/10.1038/ng0396-288
190 rdf:type schema:CreativeWork
191 sg:pub.10.1038/ng753 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031273704
192 https://doi.org/10.1038/ng753
193 rdf:type schema:CreativeWork
194 sg:pub.10.1038/ng757 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040986528
195 https://doi.org/10.1038/ng757
196 rdf:type schema:CreativeWork
197 sg:pub.10.1038/nrg705 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008302356
198 https://doi.org/10.1038/nrg705
199 rdf:type schema:CreativeWork
200 https://doi.org/10.1002/0471684228.egp03898 schema:sameAs https://app.dimensions.ai/details/publication/pub.1089901767
201 rdf:type schema:CreativeWork
202 https://doi.org/10.1002/9780471650126.dob0273.pub2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1089871629
203 rdf:type schema:CreativeWork
204 https://doi.org/10.1002/gcc.10111 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001434096
205 rdf:type schema:CreativeWork
206 https://doi.org/10.1006/geno.2000.6312 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036764135
207 rdf:type schema:CreativeWork
208 https://doi.org/10.1016/s0022-2836(05)80360-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013618994
209 rdf:type schema:CreativeWork
210 https://doi.org/10.1016/s0092-8674(01)00447-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003113849
211 rdf:type schema:CreativeWork
212 https://doi.org/10.1016/s0168-9525(02)02592-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000179704
213 rdf:type schema:CreativeWork
214 https://doi.org/10.1016/s0168-9525(98)01555-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044342085
215 rdf:type schema:CreativeWork
216 https://doi.org/10.1016/s0960-9822(01)00490-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015447031
217 rdf:type schema:CreativeWork
218 https://doi.org/10.1073/pnas.152171299 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046195397
219 rdf:type schema:CreativeWork
220 https://doi.org/10.1086/319506 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058622732
221 rdf:type schema:CreativeWork
222 https://doi.org/10.1086/341610 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058639071
223 rdf:type schema:CreativeWork
224 https://doi.org/10.1093/bioinformatics/18.2.335 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001279450
225 rdf:type schema:CreativeWork
226 https://doi.org/10.1093/hmg/11.17.1987 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003966518
227 rdf:type schema:CreativeWork
228 https://doi.org/10.1093/hmg/9.4.489 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049132213
229 rdf:type schema:CreativeWork
230 https://doi.org/10.1101/gr.187101 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042408677
231 rdf:type schema:CreativeWork
232 https://doi.org/10.1101/gr.403602 schema:sameAs https://app.dimensions.ai/details/publication/pub.1012195556
233 rdf:type schema:CreativeWork
234 https://doi.org/10.1126/science.1058040 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001517867
235 rdf:type schema:CreativeWork
236 https://doi.org/10.1126/science.1072047 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006710139
237 rdf:type schema:CreativeWork
238 https://doi.org/10.1126/science.290.5494.1151 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050543187
239 rdf:type schema:CreativeWork
240 https://www.grid.ac/institutes/grid.17063.33 schema:alternateName University of Toronto
241 schema:name Department of Molecular and Medical Genetics, University of Toronto, 555 University Avenue, M5G 1X8, Toronto, ON, Canada
242 Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada
243 rdf:type schema:Organization
244 https://www.grid.ac/institutes/grid.194645.b schema:alternateName University of Hong Kong
245 schema:name Department of Molecular and Medical Genetics, University of Toronto, 555 University Avenue, M5G 1X8, Toronto, ON, Canada
246 Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada
247 The University of Hong Kong, Pokfulam Road, Hong Kong
248 rdf:type schema:Organization
249 https://www.grid.ac/institutes/grid.42327.30 schema:alternateName Hospital for Sick Children
250 schema:name Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada
251 rdf:type schema:Organization
252 https://www.grid.ac/institutes/grid.5612.0 schema:alternateName Pompeu Fabra University
253 schema:name Genes and Disease Program, Genomic Regulation Center, and Facultat Ciencies de la Salut i de la Vida, Universitat Pompeu Fabra, E-08003, Barcelona, Catalonia, Spain
254 Program in Genetics and Genomic Biology, Research Institute, The Hospital for Sick Children, Toronto, Canada
255 rdf:type schema:Organization
 



