Estimating evolutionary distances between genomic sequences from spaced-word matches


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2015-12

AUTHORS

Burkhard Morgenstern, Bingyao Zhu, Sebastian Horwege, Chris André Leimeister

ABSTRACT

Alignment-free methods are increasingly used to calculate evolutionary distances between DNA and protein sequences as a basis of phylogeny reconstruction. Most of these methods, however, use heuristic distance functions that are not based on any explicit model of molecular evolution. Herein, we propose a simple estimator d_N of the evolutionary distance between two DNA sequences that is calculated from the number N of (spaced) word matches between them. We show that this distance function is more accurate than other distance measures that are used by alignment-free methods. In addition, we calculate the variance of the normalized number N of (spaced) word matches. We show that the variance of N is smaller for spaced words than for contiguous words, and that the variance is further reduced if our spaced-words approach is used with multiple patterns of 'match positions' and 'don't care positions'. Our software is available online and as downloadable source code at: http://spaced.gobics.de/.
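To make the notion of a spaced-word match concrete, the following is a minimal Python sketch, not the authors' software. It assumes a pattern written as a string of '1' (match position) and '0' (don't care position), and it counts N as the number of position pairs whose spaced words agree at all match positions. The function names and the toy sequences are illustrative assumptions. The estimator d_N is not reproduced here because the abstract does not state its formula.

```python
from collections import Counter


def spaced_words(seq: str, pattern: str) -> Counter:
    """Count the spaced words of `seq` under `pattern` ('1' = match, '0' = don't care)."""
    match_positions = [k for k, flag in enumerate(pattern) if flag == "1"]
    words = Counter()
    for start in range(len(seq) - len(pattern) + 1):
        # Keep only the characters at match positions; don't-care positions are ignored.
        words["".join(seq[start + k] for k in match_positions)] += 1
    return words


def count_spaced_word_matches(seq1: str, seq2: str, pattern: str) -> int:
    """Number N of position pairs (i, j) whose spaced words are identical."""
    words1 = spaced_words(seq1, pattern)
    words2 = spaced_words(seq2, pattern)
    return sum(count * words2[word] for word, count in words1.items())


# Toy example (hypothetical sequences): spaced pattern vs. contiguous pattern.
print(count_spaced_word_matches("ACGTAC", "ACGAAC", "101"))  # prints 2
print(count_spaced_word_matches("ACGTAC", "ACGAAC", "111"))  # prints 1
```

A contiguous word is the special case of a pattern made only of '1' characters, so the same function covers both settings compared in the abstract.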

PAGES

5

References to SciGraph publications

  • 2004-12. Oligo kernels for datamining on biological sequences: a case study on prokaryotic translation initiation sites in BMC BIOINFORMATICS
  • 2014. Estimating Evolutionary Distances from Spaced-Word Matches in ALGORITHMS IN BIOINFORMATICS
  • 2012-12. Separating metagenomic short reads into genomes via clustering in ALGORITHMS FOR MOLECULAR BIOLOGY
  • 2007-12. Comparing sequences without using alignments: application to HIV/SIV subtyping in BMC BIOINFORMATICS
  • 2012-12. Alignment-free phylogeny of whole genomes using underlying subwords in ALGORITHMS FOR MOLECULAR BIOLOGY
  • 2014-05. Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms in NATURE BIOTECHNOLOGY
  • 2004-12. TETRA: a web-service and a stand-alone program for the analysis and comparison of tetranucleotide usage patterns in DNA sequences in BMC BIOINFORMATICS
  • 2008. CompostBin: A DNA Composition-Based Algorithm for Binning Environmental Shotgun Reads in RESEARCH IN COMPUTATIONAL MOLECULAR BIOLOGY
  • 2005-12. Genome comparison without alignment using shortest unique substrings in BMC BIOINFORMATICS
  • 2008-12. Word correlation matrices for protein sequence analysis and remote homology detection in BMC BIOINFORMATICS
  • 2010-06. Genome characteristics of a generalist marine bacterial lineage in THE ISME JOURNAL
  • 2014-08. Robust k-mer frequency estimation using gapped k-mers in JOURNAL OF MATHEMATICAL BIOLOGY
  • 2009-03. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome in GENOME BIOLOGY
  • 2012-12. Direct vs 2-stage approaches to structured motif finding in ALGORITHMS FOR MOLECULAR BIOLOGY
  • 2013. The Gapped Spectrum Kernel for Support Vector Machines in MACHINE LEARNING AND DATA MINING IN PATTERN RECOGNITION
Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1186/s13015-015-0032-x

    DOI

    http://dx.doi.org/10.1186/s13015-015-0032-x

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1051413139

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/25685176


    JSON-LD is the canonical representation for SciGraph data.

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Genetics", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Biological Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "University of \u00c9vry Val d'Essonne", 
              "id": "https://www.grid.ac/institutes/grid.8390.2", 
              "name": [
                "University of G\u00f6ttingen, Department of Bioinformatics, Goldschmidtstr. 1, 37073, G\u00f6ttingen, Germany", 
                "Universit\u00e9 d\u2019Evry Val d\u2019Essonne, Laboratoire Statistique et G\u00e9nome, UMR CNRS 8071, USC INRA 23 Boulevard de France, 91037, Evry, France"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Morgenstern", 
            "givenName": "Burkhard", 
            "id": "sg:person.0645534251.08", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0645534251.08"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of G\u00f6ttingen", 
              "id": "https://www.grid.ac/institutes/grid.7450.6", 
              "name": [
                "University of G\u00f6ttingen, Department of General Microbiology, Grisebachstr. 8, 37073, G\u00f6ttingen, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Zhu", 
            "givenName": "Bingyao", 
            "id": "sg:person.01323711134.72", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01323711134.72"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of G\u00f6ttingen", 
              "id": "https://www.grid.ac/institutes/grid.7450.6", 
              "name": [
                "University of G\u00f6ttingen, Department of Bioinformatics, Goldschmidtstr. 1, 37073, G\u00f6ttingen, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Horwege", 
            "givenName": "Sebastian", 
            "id": "sg:person.01230147602.11", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01230147602.11"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of G\u00f6ttingen", 
              "id": "https://www.grid.ac/institutes/grid.7450.6", 
              "name": [
                "University of G\u00f6ttingen, Department of Bioinformatics, Goldschmidtstr. 1, 37073, G\u00f6ttingen, Germany"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Leimeister", 
            "givenName": "Chris Andr\u00e9", 
            "id": "sg:person.0735220502.03", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0735220502.03"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1186/1471-2105-9-259", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1002376230", 
              "https://doi.org/10.1186/1471-2105-9-259"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-540-78839-3_3", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1004083256", 
              "https://doi.org/10.1007/978-3-540-78839-3_3"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/18.3.440", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006017712"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1073/pnas.202468099", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008602816"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/nbt.2862", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1011219673", 
              "https://doi.org/10.1038/nbt.2862"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-662-44753-6_13", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1011739535", 
              "https://doi.org/10.1007/978-3-662-44753-6_13"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/gkr1246", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1011839928"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btn025", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1012266713"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btm211", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1013418609"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btu331", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1013421163"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btr186", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1015876268"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/b978-1-4832-3211-9.50009-7", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1016180325"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/ismej.2009.150", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1016723931", 
              "https://doi.org/10.1038/ismej.2009.150"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-5-163", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1017298406", 
              "https://doi.org/10.1186/1471-2105-5-163"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btq689", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1017638252"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1073/pnas.0813249106", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1017699788"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btl510", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1019520182"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/gkh362", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1020789913"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btl376", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1021993261"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1371/journal.pcbi.1003711", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1025723993"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-5-169", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1027569341", 
              "https://doi.org/10.1186/1471-2105-5-169"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-8-1", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1027920111", 
              "https://doi.org/10.1186/1471-2105-8-1"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btu177", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1029212748"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/gkt003", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1030907104"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btu815", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031438618"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-6-123", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1032819871", 
              "https://doi.org/10.1186/1471-2105-6-123"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s00285-013-0705-3", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1033021300", 
              "https://doi.org/10.1007/s00285-013-0705-3"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/bts397", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1035175462"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/gkr648", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1035552552"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/gkh155", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1040169796"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1748-7188-7-27", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1040184107", 
              "https://doi.org/10.1186/1748-7188-7-27"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btr176", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1040969457"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1073/pnas.83.14.5155", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1041403481"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0025-5564(81)90043-2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1041640515"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bib/bbu005", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1042019053"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1748-7188-7-34", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1043383249", 
              "https://doi.org/10.1186/1748-7188-7-34"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/cmb.2010.0245", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1043925121"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/gku398", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1044368991"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-39712-7_1", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045307141", 
              "https://doi.org/10.1007/978-3-642-39712-7_1"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1748-7188-7-20", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045972818", 
              "https://doi.org/10.1186/1748-7188-7-20"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1371/journal.pone.0006901", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1047316233"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/cmb.2009.0198", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1047941896"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2009-10-3-r25", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1049583368", 
              "https://doi.org/10.1186/gb-2009-10-3-r25"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1101/gr.074492.107", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1051720574"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1371/journal.pone.0008700", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1053138932"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/cmb.2006.13.1465", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059245443"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/cmb.2006.13.336", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059245478"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/cmb.2009.0106", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059245836"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/cmb.2010.0171", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059245958"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1089/cmb.2014.0173", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1059246256"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/18.61115", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061100441"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.18637/jss.v007.i10", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1068672116"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1075024926", 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/oxfordjournals.molbev.a040454", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1079752303"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1142/9789812799623_0053", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1096080294"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2015-12", 
        "datePublishedReg": "2015-12-01", 
        "description": "Alignment-free methods are increasingly used to calculate evolutionary distances between DNA and protein sequences as a basis of phylogeny reconstruction. Most of these methods, however, use heuristic distance functions that are not based on any explicit model of molecular evolution. Herein, we propose a simple estimator d N of the evolutionary distance between two DNA sequences that is calculated from the number N of (spaced) word matches between them. We show that this distance function is more accurate than other distance measures that are used by alignment-free methods. In addition, we calculate the variance of the normalized number N of (spaced) word matches. We show that the variance of N is smaller for spaced words than for contiguous words, and that the variance is further reduced if our spaced-words approach is used with multiple patterns of 'match positions' and 'don't care positions'. Our software is available online and as downloadable source code at: http://spaced.gobics.de/. ", 
        "genre": "research_article", 
        "id": "sg:pub.10.1186/s13015-015-0032-x", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isPartOf": [
          {
            "id": "sg:journal.1036449", 
            "issn": [
              "1748-7188"
            ], 
            "name": "Algorithms for Molecular Biology", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "1", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "10"
          }
        ], 
        "name": "Estimating evolutionary distances between genomic sequences from spaced-word matches", 
        "pagination": "5", 
        "productId": [
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "f4e93f72d662ff19095e2912f1b16640f5edf80bd1d85ffbc22410182a40c9f7"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "25685176"
            ]
          }, 
          {
            "name": "nlm_unique_id", 
            "type": "PropertyValue", 
            "value": [
              "101265088"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1186/s13015-015-0032-x"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1051413139"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1186/s13015-015-0032-x", 
          "https://app.dimensions.ai/details/publication/pub.1051413139"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2019-04-11T13:12", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000367_0000000367/records_88257_00000001.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "http://link.springer.com/10.1186%2Fs13015-015-0032-x"
      }
    ]
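    As a rough illustration of how a record like this could be consumed, the Python sketch below collects the DOIs of the cited works. It assumes the record has already been parsed into a dictionary; the helper name cited_dois and the three-entry sample (an excerpt of the "citation" array) are illustrative, and the sketch skips any entry that carries no doi.org link.

```python
import json

DOI_PREFIX = "https://doi.org/"


def cited_dois(record: dict) -> list:
    """Return the DOIs cited by one SciGraph record, in order, without repeats."""
    seen = set()
    dois = []
    for work in record.get("citation", []):
        # The DOI may be the "id" itself or one of the "sameAs" links.
        candidates = [work.get("id", "")] + list(work.get("sameAs", []))
        for uri in candidates:
            if uri.startswith(DOI_PREFIX):
                doi = uri[len(DOI_PREFIX):]
                if doi not in seen:
                    seen.add(doi)
                    dois.append(doi)
                break  # one DOI per cited work is enough
    return dois


# Excerpt of the record, used here as sample input.
sample = json.loads("""
[{"citation": [
  {"id": "sg:pub.10.1038/nbt.2862",
   "sameAs": ["https://app.dimensions.ai/details/publication/pub.1011219673",
              "https://doi.org/10.1038/nbt.2862"],
   "type": "CreativeWork"},
  {"id": "https://doi.org/10.1093/nar/gkr1246",
   "sameAs": ["https://app.dimensions.ai/details/publication/pub.1011839928"],
   "type": "CreativeWork"},
  {"id": "https://app.dimensions.ai/details/publication/pub.1075024926",
   "type": "CreativeWork"}
]}]
""")
print(cited_dois(sample[0]))  # prints ['10.1038/nbt.2862', '10.1093/nar/gkr1246']
```

    The record is a one-element JSON array, which is why the sketch indexes the parsed data with [0] before reading "citation".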
     

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/s13015-015-0032-x'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/s13015-015-0032-x'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/s13015-015-0032-x'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/s13015-015-0032-x'
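
    The same content negotiation can be sketched in Python with the standard library only. This is an illustrative equivalent of the curl calls, not an official client: the names BASE_URL, ACCEPT, build_request and fetch, and the short format keys, are assumptions made for this example, and whether the endpoint still answers is not something this page confirms.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Accept headers taken from the curl commands above; the short keys are illustrative.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id: str, fmt: str = "json-ld") -> urllib.request.Request:
    """Prepare (but do not send) a GET request for one record in the given format."""
    return urllib.request.Request(BASE_URL + record_id, headers={"Accept": ACCEPT[fmt]})


def fetch(record_id: str, fmt: str = "json-ld", timeout: float = 30.0) -> str:
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    req = build_request("pub.10.1186/s13015-015-0032-x", "turtle")
    print(req.full_url, req.get_header("Accept"))
```

    Keeping request construction separate from the network call lets the headers be checked offline; fetch is only needed when the data should actually be downloaded.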


     

    This table displays all metadata directly associated to this object as RDF triples.

    273 TRIPLES      21 PREDICATES      84 URIs      21 LITERALS      9 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1186/s13015-015-0032-x schema:about anzsrc-for:06
    2 anzsrc-for:0604
    3 schema:author Na72fa26712ae4396a4c86f3ad2df4e1d
    4 schema:citation sg:pub.10.1007/978-3-540-78839-3_3
    5 sg:pub.10.1007/978-3-642-39712-7_1
    6 sg:pub.10.1007/978-3-662-44753-6_13
    7 sg:pub.10.1007/s00285-013-0705-3
    8 sg:pub.10.1038/ismej.2009.150
    9 sg:pub.10.1038/nbt.2862
    10 sg:pub.10.1186/1471-2105-5-163
    11 sg:pub.10.1186/1471-2105-5-169
    12 sg:pub.10.1186/1471-2105-6-123
    13 sg:pub.10.1186/1471-2105-8-1
    14 sg:pub.10.1186/1471-2105-9-259
    15 sg:pub.10.1186/1748-7188-7-20
    16 sg:pub.10.1186/1748-7188-7-27
    17 sg:pub.10.1186/1748-7188-7-34
    18 sg:pub.10.1186/gb-2009-10-3-r25
    19 https://app.dimensions.ai/details/publication/pub.1075024926
    20 https://doi.org/10.1016/0025-5564(81)90043-2
    21 https://doi.org/10.1016/b978-1-4832-3211-9.50009-7
    22 https://doi.org/10.1073/pnas.0813249106
    23 https://doi.org/10.1073/pnas.202468099
    24 https://doi.org/10.1073/pnas.83.14.5155
    25 https://doi.org/10.1089/cmb.2006.13.1465
    26 https://doi.org/10.1089/cmb.2006.13.336
    27 https://doi.org/10.1089/cmb.2009.0106
    28 https://doi.org/10.1089/cmb.2009.0198
    29 https://doi.org/10.1089/cmb.2010.0171
    30 https://doi.org/10.1089/cmb.2010.0245
    31 https://doi.org/10.1089/cmb.2014.0173
    32 https://doi.org/10.1093/bib/bbu005
    33 https://doi.org/10.1093/bioinformatics/18.3.440
    34 https://doi.org/10.1093/bioinformatics/btl376
    35 https://doi.org/10.1093/bioinformatics/btl510
    36 https://doi.org/10.1093/bioinformatics/btm211
    37 https://doi.org/10.1093/bioinformatics/btn025
    38 https://doi.org/10.1093/bioinformatics/btq689
    39 https://doi.org/10.1093/bioinformatics/btr176
    40 https://doi.org/10.1093/bioinformatics/btr186
    41 https://doi.org/10.1093/bioinformatics/bts397
    42 https://doi.org/10.1093/bioinformatics/btu177
    43 https://doi.org/10.1093/bioinformatics/btu331
    44 https://doi.org/10.1093/bioinformatics/btu815
    45 https://doi.org/10.1093/nar/gkh155
    46 https://doi.org/10.1093/nar/gkh362
    47 https://doi.org/10.1093/nar/gkr1246
    48 https://doi.org/10.1093/nar/gkr648
    49 https://doi.org/10.1093/nar/gkt003
    50 https://doi.org/10.1093/nar/gku398
    51 https://doi.org/10.1093/oxfordjournals.molbev.a040454
    52 https://doi.org/10.1101/gr.074492.107
    53 https://doi.org/10.1109/18.61115
    54 https://doi.org/10.1142/9789812799623_0053
    55 https://doi.org/10.1371/journal.pcbi.1003711
    56 https://doi.org/10.1371/journal.pone.0006901
    57 https://doi.org/10.1371/journal.pone.0008700
    58 https://doi.org/10.18637/jss.v007.i10
    59 schema:datePublished 2015-12
    60 schema:datePublishedReg 2015-12-01
    61 schema:description Alignment-free methods are increasingly used to calculate evolutionary distances between DNA and protein sequences as a basis of phylogeny reconstruction. Most of these methods, however, use heuristic distance functions that are not based on any explicit model of molecular evolution. Herein, we propose a simple estimator d N of the evolutionary distance between two DNA sequences that is calculated from the number N of (spaced) word matches between them. We show that this distance function is more accurate than other distance measures that are used by alignment-free methods. In addition, we calculate the variance of the normalized number N of (spaced) word matches. We show that the variance of N is smaller for spaced words than for contiguous words, and that the variance is further reduced if our spaced-words approach is used with multiple patterns of 'match positions' and 'don't care positions'. Our software is available online and as downloadable source code at: http://spaced.gobics.de/.
    62 schema:genre research_article
    63 schema:inLanguage en
    64 schema:isAccessibleForFree true
    65 schema:isPartOf N01f8af5af62349688a1cc379d069bc2c
    66 Nc7d955bde5304fb2a679ce1e4b7f8ad2
    67 sg:journal.1036449
    68 schema:name Estimating evolutionary distances between genomic sequences from spaced-word matches
    69 schema:pagination 5
    70 schema:productId N19073f17e937488ebe1eba42c8ad61ff
    71 N2c54b0c7b06e4790849afa4dd4f5064f
    72 Nc9dd3b9625f44c62abf4f77d2e0ce3d2
    73 Ncc2a5add8047434b98cea10ce01a4f4a
    74 Nccc97a91c5cb4b88adadd26f5ab8ee70
    75 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051413139
    76 https://doi.org/10.1186/s13015-015-0032-x
    77 schema:sdDatePublished 2019-04-11T13:12
    78 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    79 schema:sdPublisher N9151156f2fc048e7a9e9f2c0169b6fa6
    80 schema:url http://link.springer.com/10.1186%2Fs13015-015-0032-x
    81 sgo:license sg:explorer/license/
    82 sgo:sdDataset articles
    83 rdf:type schema:ScholarlyArticle
    84 N01f8af5af62349688a1cc379d069bc2c schema:issueNumber 1
    85 rdf:type schema:PublicationIssue
    86 N19073f17e937488ebe1eba42c8ad61ff schema:name pubmed_id
    87 schema:value 25685176
    88 rdf:type schema:PropertyValue
    89 N2be4d2f8a9a44b82b0349d9343d7f7a5 rdf:first sg:person.01230147602.11
    90 rdf:rest N700772020e4f433bb06285951e284cf4
    91 N2c54b0c7b06e4790849afa4dd4f5064f schema:name doi
    92 schema:value 10.1186/s13015-015-0032-x
    93 rdf:type schema:PropertyValue
    94 N69019b5b85d74c5f91fa709374a31ce8 rdf:first sg:person.01323711134.72
    95 rdf:rest N2be4d2f8a9a44b82b0349d9343d7f7a5
    96 N700772020e4f433bb06285951e284cf4 rdf:first sg:person.0735220502.03
    97 rdf:rest rdf:nil
    98 N9151156f2fc048e7a9e9f2c0169b6fa6 schema:name Springer Nature - SN SciGraph project
    99 rdf:type schema:Organization
    100 Na72fa26712ae4396a4c86f3ad2df4e1d rdf:first sg:person.0645534251.08
    101 rdf:rest N69019b5b85d74c5f91fa709374a31ce8
    102 Nc7d955bde5304fb2a679ce1e4b7f8ad2 schema:volumeNumber 10
    103 rdf:type schema:PublicationVolume
    104 Nc9dd3b9625f44c62abf4f77d2e0ce3d2 schema:name readcube_id
    105 schema:value f4e93f72d662ff19095e2912f1b16640f5edf80bd1d85ffbc22410182a40c9f7
    106 rdf:type schema:PropertyValue
    107 Ncc2a5add8047434b98cea10ce01a4f4a schema:name dimensions_id
    108 schema:value pub.1051413139
    109 rdf:type schema:PropertyValue
    110 Nccc97a91c5cb4b88adadd26f5ab8ee70 schema:name nlm_unique_id
    111 schema:value 101265088
    112 rdf:type schema:PropertyValue
    113 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
    114 schema:name Biological Sciences
    115 rdf:type schema:DefinedTerm
    116 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
    117 schema:name Genetics
    118 rdf:type schema:DefinedTerm
    119 sg:journal.1036449 schema:issn 1748-7188
    120 schema:name Algorithms for Molecular Biology
    121 rdf:type schema:Periodical
    122 sg:person.01230147602.11 schema:affiliation https://www.grid.ac/institutes/grid.7450.6
    123 schema:familyName Horwege
    124 schema:givenName Sebastian
    125 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01230147602.11
    126 rdf:type schema:Person
    127 sg:person.01323711134.72 schema:affiliation https://www.grid.ac/institutes/grid.7450.6
    128 schema:familyName Zhu
    129 schema:givenName Bingyao
    130 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01323711134.72
    131 rdf:type schema:Person
    132 sg:person.0645534251.08 schema:affiliation https://www.grid.ac/institutes/grid.8390.2
    133 schema:familyName Morgenstern
    134 schema:givenName Burkhard
    135 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0645534251.08
    136 rdf:type schema:Person
    137 sg:person.0735220502.03 schema:affiliation https://www.grid.ac/institutes/grid.7450.6
    138 schema:familyName Leimeister
    139 schema:givenName Chris André
    140 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0735220502.03
    141 rdf:type schema:Person
    142 sg:pub.10.1007/978-3-540-78839-3_3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004083256
    143 https://doi.org/10.1007/978-3-540-78839-3_3
    144 rdf:type schema:CreativeWork
    145 sg:pub.10.1007/978-3-642-39712-7_1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045307141
    146 https://doi.org/10.1007/978-3-642-39712-7_1
    147 rdf:type schema:CreativeWork
    148 sg:pub.10.1007/978-3-662-44753-6_13 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011739535
    149 https://doi.org/10.1007/978-3-662-44753-6_13
    150 rdf:type schema:CreativeWork
    151 sg:pub.10.1007/s00285-013-0705-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033021300
    152 https://doi.org/10.1007/s00285-013-0705-3
    153 rdf:type schema:CreativeWork
    154 sg:pub.10.1038/ismej.2009.150 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016723931
    155 https://doi.org/10.1038/ismej.2009.150
    156 rdf:type schema:CreativeWork
    157 sg:pub.10.1038/nbt.2862 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011219673
    158 https://doi.org/10.1038/nbt.2862
    159 rdf:type schema:CreativeWork
    160 sg:pub.10.1186/1471-2105-5-163 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017298406
    161 https://doi.org/10.1186/1471-2105-5-163
    162 rdf:type schema:CreativeWork
    163 sg:pub.10.1186/1471-2105-5-169 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027569341
    164 https://doi.org/10.1186/1471-2105-5-169
    165 rdf:type schema:CreativeWork
    166 sg:pub.10.1186/1471-2105-6-123 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032819871
    167 https://doi.org/10.1186/1471-2105-6-123
    168 rdf:type schema:CreativeWork
    169 sg:pub.10.1186/1471-2105-8-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027920111
    170 https://doi.org/10.1186/1471-2105-8-1
    171 rdf:type schema:CreativeWork
    172 sg:pub.10.1186/1471-2105-9-259 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002376230
    173 https://doi.org/10.1186/1471-2105-9-259
    174 rdf:type schema:CreativeWork
    175 sg:pub.10.1186/1748-7188-7-20 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045972818
    176 https://doi.org/10.1186/1748-7188-7-20
    177 rdf:type schema:CreativeWork
    178 sg:pub.10.1186/1748-7188-7-27 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040184107
    179 https://doi.org/10.1186/1748-7188-7-27
    180 rdf:type schema:CreativeWork
    181 sg:pub.10.1186/1748-7188-7-34 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043383249
    182 https://doi.org/10.1186/1748-7188-7-34
    183 rdf:type schema:CreativeWork
    184 sg:pub.10.1186/gb-2009-10-3-r25 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049583368
    185 https://doi.org/10.1186/gb-2009-10-3-r25
    186 rdf:type schema:CreativeWork
    187 https://app.dimensions.ai/details/publication/pub.1075024926 schema:CreativeWork
    188 https://doi.org/10.1016/0025-5564(81)90043-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041640515
    189 rdf:type schema:CreativeWork
    190 https://doi.org/10.1016/b978-1-4832-3211-9.50009-7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016180325
    191 rdf:type schema:CreativeWork
    192 https://doi.org/10.1073/pnas.0813249106 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017699788
    193 rdf:type schema:CreativeWork
    194 https://doi.org/10.1073/pnas.202468099 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008602816
    195 rdf:type schema:CreativeWork
    196 https://doi.org/10.1073/pnas.83.14.5155 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041403481
    197 rdf:type schema:CreativeWork
    198 https://doi.org/10.1089/cmb.2006.13.1465 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059245443
    199 rdf:type schema:CreativeWork
    200 https://doi.org/10.1089/cmb.2006.13.336 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059245478
    201 rdf:type schema:CreativeWork
    202 https://doi.org/10.1089/cmb.2009.0106 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059245836
    203 rdf:type schema:CreativeWork
    204 https://doi.org/10.1089/cmb.2009.0198 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047941896
    205 rdf:type schema:CreativeWork
    206 https://doi.org/10.1089/cmb.2010.0171 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059245958
    207 rdf:type schema:CreativeWork
    208 https://doi.org/10.1089/cmb.2010.0245 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043925121
    209 rdf:type schema:CreativeWork
    210 https://doi.org/10.1089/cmb.2014.0173 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059246256
    211 rdf:type schema:CreativeWork
    212 https://doi.org/10.1093/bib/bbu005 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042019053
    213 rdf:type schema:CreativeWork
    214 https://doi.org/10.1093/bioinformatics/18.3.440 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006017712
    215 rdf:type schema:CreativeWork
    216 https://doi.org/10.1093/bioinformatics/btl376 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021993261
    217 rdf:type schema:CreativeWork
    218 https://doi.org/10.1093/bioinformatics/btl510 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019520182
    219 rdf:type schema:CreativeWork
    220 https://doi.org/10.1093/bioinformatics/btm211 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013418609
    221 rdf:type schema:CreativeWork
    222 https://doi.org/10.1093/bioinformatics/btn025 schema:sameAs https://app.dimensions.ai/details/publication/pub.1012266713
    223 rdf:type schema:CreativeWork
    224 https://doi.org/10.1093/bioinformatics/btq689 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017638252
    225 rdf:type schema:CreativeWork
    226 https://doi.org/10.1093/bioinformatics/btr176 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040969457
    227 rdf:type schema:CreativeWork
    228 https://doi.org/10.1093/bioinformatics/btr186 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015876268
    229 rdf:type schema:CreativeWork
    230 https://doi.org/10.1093/bioinformatics/bts397 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035175462
    231 rdf:type schema:CreativeWork
    232 https://doi.org/10.1093/bioinformatics/btu177 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029212748
    233 rdf:type schema:CreativeWork
    234 https://doi.org/10.1093/bioinformatics/btu331 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013421163
    235 rdf:type schema:CreativeWork
    236 https://doi.org/10.1093/bioinformatics/btu815 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031438618
    237 rdf:type schema:CreativeWork
    238 https://doi.org/10.1093/nar/gkh155 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040169796
    239 rdf:type schema:CreativeWork
    240 https://doi.org/10.1093/nar/gkh362 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020789913
    241 rdf:type schema:CreativeWork
    242 https://doi.org/10.1093/nar/gkr1246 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011839928
    243 rdf:type schema:CreativeWork
    244 https://doi.org/10.1093/nar/gkr648 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035552552
    245 rdf:type schema:CreativeWork
    246 https://doi.org/10.1093/nar/gkt003 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030907104
    247 rdf:type schema:CreativeWork
    248 https://doi.org/10.1093/nar/gku398 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044368991
    249 rdf:type schema:CreativeWork
    250 https://doi.org/10.1093/oxfordjournals.molbev.a040454 schema:sameAs https://app.dimensions.ai/details/publication/pub.1079752303
    251 rdf:type schema:CreativeWork
    252 https://doi.org/10.1101/gr.074492.107 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051720574
    253 rdf:type schema:CreativeWork
    254 https://doi.org/10.1109/18.61115 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061100441
    255 rdf:type schema:CreativeWork
    256 https://doi.org/10.1142/9789812799623_0053 schema:sameAs https://app.dimensions.ai/details/publication/pub.1096080294
    257 rdf:type schema:CreativeWork
    258 https://doi.org/10.1371/journal.pcbi.1003711 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025723993
    259 rdf:type schema:CreativeWork
    260 https://doi.org/10.1371/journal.pone.0006901 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047316233
    261 rdf:type schema:CreativeWork
    262 https://doi.org/10.1371/journal.pone.0008700 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053138932
    263 rdf:type schema:CreativeWork
    264 https://doi.org/10.18637/jss.v007.i10 schema:sameAs https://app.dimensions.ai/details/publication/pub.1068672116
    265 rdf:type schema:CreativeWork
    266 https://www.grid.ac/institutes/grid.7450.6 schema:alternateName University of Göttingen
    267 schema:name University of Göttingen, Department of Bioinformatics, Goldschmidtstr. 1, 37073, Göttingen, Germany
    268 University of Göttingen, Department of General Microbiology, Grisebachstr. 8, 37073, Göttingen, Germany
    269 rdf:type schema:Organization
    270 https://www.grid.ac/institutes/grid.8390.2 schema:alternateName University of Évry Val d'Essonne
    271 schema:name University of Göttingen, Department of Bioinformatics, Goldschmidtstr. 1, 37073, Göttingen, Germany
    272 Université d’Evry Val d’Essonne, Laboratoire Statistique et Génome, UMR CNRS 8071, USC INRA 23 Boulevard de France, 91037, Evry, France
    273 rdf:type schema:Organization
     



