A minimal descriptor of an ancestral recombinations graph


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2011-02-15

AUTHORS

Laxmi Parida, Pier Francesco Palamara, Asif Javed

ABSTRACT

Background: Ancestral Recombinations Graph (ARG) is a phylogenetic structure that encodes both duplication events, such as mutations, as well as genetic exchange events, such as recombinations: this captures the (genetic) dynamics of a population evolving over generations.

Results: In this paper, we identify structure-preserving and samples-preserving core of an ARG G and call it the minimal descriptor ARG of G. Its structure-preserving characteristic ensures that all the branch lengths of the marginal trees of the minimal descriptor ARG are identical to that of G and the samples-preserving property asserts that the patterns of genetic variation in the samples of the minimal descriptor ARG are exactly the same as that of G. We also prove that even an unbounded G has a finite minimal descriptor, that continues to preserve certain (graph-theoretic) properties of G and for an appropriate class of ARGs, our estimate (Eqn 8) as well as empirical observation is that the expected reduction in the number of vertices is exponential.

Conclusions: Based on the definition of this lossless and bounded structure, we derive local properties of the vertices of a minimal descriptor ARG, which lend itself very naturally to the design of efficient sampling algorithms. We further show that a class of minimal descriptors, that of binary ARGs, models the standard coalescent exactly (Thm 6).

PAGES

s6

References to SciGraph publications

  • 2006-03-15. Fast "coalescent" simulation in BMC GENOMIC DATA
  • 2009-01-30. Minimizing recombinations in consensus networks for phylogeographic studies in BMC BIOINFORMATICS
  • 1997. An Ancestral Recombination Graph in PROGRESS IN POPULATION GENETICS AND HUMAN EVOLUTION
Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1186/1471-2105-12-s1-s6

    DOI

    http://dx.doi.org/10.1186/1471-2105-12-s1-s6

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1038455285

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/21342589


    JSON-LD is the canonical representation for SciGraph data.

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Mathematical Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Statistics", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Algorithms", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Computational Biology", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Models, Genetic", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Phylogeny", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Recombination, Genetic", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Computational Genomics, IBM T J Watson Research, Yorktown, New York, USA", 
              "id": "http://www.grid.ac/institutes/None", 
              "name": [
                "Computational Genomics, IBM T J Watson Research, Yorktown, New York, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Parida", 
            "givenName": "Laxmi", 
            "id": "sg:person.01336557015.68", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01336557015.68"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "The work done during an internship at, IBM T J Watson Research Center, USA", 
              "id": "http://www.grid.ac/institutes/grid.481554.9", 
              "name": [
                "Columbia University, New York, USA", 
                "The work done during an internship at, IBM T J Watson Research Center, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Palamara", 
            "givenName": "Pier Francesco", 
            "id": "sg:person.0577752122.01", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0577752122.01"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Computational Genomics, IBM T J Watson Research, Yorktown, New York, USA", 
              "id": "http://www.grid.ac/institutes/None", 
              "name": [
                "Computational Genomics, IBM T J Watson Research, Yorktown, New York, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Javed", 
            "givenName": "Asif", 
            "id": "sg:person.0634253750.76", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0634253750.76"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/978-1-4757-2609-1_16", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1089756973", 
              "https://doi.org/10.1007/978-1-4757-2609-1_16"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-10-s1-s72", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1021721164", 
              "https://doi.org/10.1186/1471-2105-10-s1-s72"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2156-7-16", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1049183548", 
              "https://doi.org/10.1186/1471-2156-7-16"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2011-02-15", 
        "datePublishedReg": "2011-02-15", 
        "description": "BackgroundAncestral Recombinations Graph (ARG) is a phylogenetic structure that encodes both duplication events, such as mutations, as well as genetic exchange events, such as recombinations: this captures the (genetic) dynamics of a population evolving over generations.ResultsIn this paper, we identify structure-preserving and samples-preserving core of an ARG G and call it the minimal descriptor ARG of G. Its structure-preserving characteristic ensures that all the branch lengths of the marginal trees of the minimal descriptor ARG are identical to that of G and the samples-preserving property asserts that the patterns of genetic variation in the samples of the minimal descriptor ARG are exactly the same as that of G. We also prove that even an unbounded G has a finite minimal descriptor, that continues to preserve certain (graph-theoretic) properties of G and for an appropriate class of ARGs, our estimate (Eqn 8) as well as empirical observation is that the expected reduction in the number of vertices is exponential.ConclusionsBased on the definition of this lossless and bounded structure, we derive local properties of the vertices of a minimal descriptor ARG, which lend itself very naturally to the design of efficient sampling algorithms. We further show that a class of minimal descriptors, that of binary ARGs, models the standard coalescent exactly (Thm 6).", 
        "genre": "article", 
        "id": "sg:pub.10.1186/1471-2105-12-s1-s6", 
        "isAccessibleForFree": true, 
        "isPartOf": [
          {
            "id": "sg:journal.1023786", 
            "issn": [
              "1471-2105"
            ], 
            "name": "BMC Bioinformatics", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "Suppl 1", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "12"
          }
        ], 
        "keywords": [
          "minimal descriptors", 
          "number of vertices", 
          "ancestral recombination graph", 
          "certain properties", 
          "appropriate class", 
          "local properties", 
          "standard coalescent", 
          "marginal trees", 
          "vertices", 
          "graph", 
          "branch lengths", 
          "class", 
          "empirical observations", 
          "properties", 
          "coalescent", 
          "algorithm", 
          "dynamics", 
          "lossless", 
          "estimates", 
          "structure", 
          "descriptors", 
          "number", 
          "ensures", 
          "definition", 
          "design", 
          "observations", 
          "core", 
          "length", 
          "genetic exchange events", 
          "variation", 
          "generation", 
          "trees", 
          "recombination", 
          "reduction", 
          "exchange events", 
          "samples", 
          "events", 
          "patterns", 
          "phylogenetic structure", 
          "population", 
          "genetic variation", 
          "mutations", 
          "Arg", 
          "ResultsIn", 
          "paper", 
          "duplication events"
        ], 
        "name": "A minimal descriptor of an ancestral recombinations graph", 
        "pagination": "s6", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1038455285"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1186/1471-2105-12-s1-s6"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "21342589"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1186/1471-2105-12-s1-s6", 
          "https://app.dimensions.ai/details/publication/pub.1038455285"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-09-02T15:55", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20220902/entities/gbq_results/article/article_546.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1186/1471-2105-12-s1-s6"
      }
    ]
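As a rough illustration of how the record above can be read by a program, the following Python sketch loads an abbreviated copy of the JSON-LD array and pulls out the title, author names, and identifiers. The `summarize` function and the `RECORD_JSON` constant are hypothetical names introduced here; only fields that appear in the record above are used.

```python
import json

# Abbreviated copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "name": "A minimal descriptor of an ancestral recombinations graph",
    "datePublished": "2011-02-15",
    "author": [
      {"familyName": "Parida", "givenName": "Laxmi", "type": "Person"},
      {"familyName": "Palamara", "givenName": "Pier Francesco", "type": "Person"},
      {"familyName": "Javed", "givenName": "Asif", "type": "Person"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1038455285"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1471-2105-12-s1-s6"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["21342589"]}
    ],
    "type": "ScholarlyArticle"
  }
]
"""


def summarize(record_json: str) -> dict:
    """Pull a few human-readable fields out of a SciGraph-style JSON-LD array."""
    article = json.loads(record_json)[0]  # the payload is a one-element list
    # Each productId entry names one identifier scheme and holds a list of values.
    ids = {p["name"]: p["value"][0] for p in article.get("productId", [])}
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in article.get("author", [])]
    return {
        "title": article["name"],
        "published": article.get("datePublished"),
        "authors": authors,
        "doi": ids.get("doi"),
        "pubmed_id": ids.get("pubmed_id"),
    }


if __name__ == "__main__":
    print(summarize(RECORD_JSON))
```

Plain `json` is enough for this kind of field lookup because JSON-LD is ordinary JSON; a dedicated linked-data library would only be needed to resolve the `@context` or to convert between RDF serializations.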
     

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-12-s1-s6'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-12-s1-s6'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-12-s1-s6'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-12-s1-s6'
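The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The names `MEDIA_TYPES`, `build_request`, and `fetch` are hypothetical, and whether the endpoint still answers is an assumption; the sketch only shows how the requests would be formed.

```python
import urllib.request

# Base URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/1471-2105-12-s1-s6"

# Media types from the curl commands above, keyed by a short (hypothetical) label.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> urllib.request.Request:
    """Return a GET request whose Accept header asks for the given format."""
    if fmt not in MEDIA_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt: str, url: str = RECORD_URL, timeout: float = 10.0) -> str:
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Only builds the request; call fetch("turtle") to actually contact the server.
    req = build_request("turtle")
    print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Keeping request construction separate from the network call makes the header logic easy to check offline.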


     

    This table displays all metadata directly associated to this object as RDF triples.

    156 TRIPLES      21 PREDICATES      79 URIs      68 LITERALS      12 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1186/1471-2105-12-s1-s6 schema:about N0d84cfdbd3634460bfbbbeb89a928792
    2 Nb9a2c58b313549059086e3329bba38cc
    3 Nc36a1349bab1497ab8016cea72d91f08
    4 Ncb43a69b1f154b23992fa60be00d75b3
    5 Ndeacce44d6a54f0ca134a5d566108b1b
    6 anzsrc-for:01
    7 anzsrc-for:0104
    8 schema:author Nbb95487a9bb74977951215295978fece
    9 schema:citation sg:pub.10.1007/978-1-4757-2609-1_16
    10 sg:pub.10.1186/1471-2105-10-s1-s72
    11 sg:pub.10.1186/1471-2156-7-16
    12 schema:datePublished 2011-02-15
    13 schema:datePublishedReg 2011-02-15
    14 schema:description BackgroundAncestral Recombinations Graph (ARG) is a phylogenetic structure that encodes both duplication events, such as mutations, as well as genetic exchange events, such as recombinations: this captures the (genetic) dynamics of a population evolving over generations.ResultsIn this paper, we identify structure-preserving and samples-preserving core of an ARG G and call it the minimal descriptor ARG of G. Its structure-preserving characteristic ensures that all the branch lengths of the marginal trees of the minimal descriptor ARG are identical to that of G and the samples-preserving property asserts that the patterns of genetic variation in the samples of the minimal descriptor ARG are exactly the same as that of G. We also prove that even an unbounded G has a finite minimal descriptor, that continues to preserve certain (graph-theoretic) properties of G and for an appropriate class of ARGs, our estimate (Eqn 8) as well as empirical observation is that the expected reduction in the number of vertices is exponential.ConclusionsBased on the definition of this lossless and bounded structure, we derive local properties of the vertices of a minimal descriptor ARG, which lend itself very naturally to the design of efficient sampling algorithms. We further show that a class of minimal descriptors, that of binary ARGs, models the standard coalescent exactly (Thm 6).
    15 schema:genre article
    16 schema:isAccessibleForFree true
    17 schema:isPartOf N27da77b3e5a648a7898c4327f5448023
    18 N37cdbc663a3c42b182efa4cc0edaecf1
    19 sg:journal.1023786
    20 schema:keywords Arg
    21 ResultsIn
    22 algorithm
    23 ancestral recombination graph
    24 appropriate class
    25 branch lengths
    26 certain properties
    27 class
    28 coalescent
    29 core
    30 definition
    31 descriptors
    32 design
    33 duplication events
    34 dynamics
    35 empirical observations
    36 ensures
    37 estimates
    38 events
    39 exchange events
    40 generation
    41 genetic exchange events
    42 genetic variation
    43 graph
    44 length
    45 local properties
    46 lossless
    47 marginal trees
    48 minimal descriptors
    49 mutations
    50 number
    51 number of vertices
    52 observations
    53 paper
    54 patterns
    55 phylogenetic structure
    56 population
    57 properties
    58 recombination
    59 reduction
    60 samples
    61 standard coalescent
    62 structure
    63 trees
    64 variation
    65 vertices
    66 schema:name A minimal descriptor of an ancestral recombinations graph
    67 schema:pagination s6
    68 schema:productId N64bca3b9fa534bddbea7f6c6f1aa03b4
    69 N7eaaf6d0cb03417ebe3e1cba826c337a
    70 Ned9965a91e3a43559c779838f64db221
    71 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038455285
    72 https://doi.org/10.1186/1471-2105-12-s1-s6
    73 schema:sdDatePublished 2022-09-02T15:55
    74 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    75 schema:sdPublisher Na09d2b34cc374c399d0b820880b99385
    76 schema:url https://doi.org/10.1186/1471-2105-12-s1-s6
    77 sgo:license sg:explorer/license/
    78 sgo:sdDataset articles
    79 rdf:type schema:ScholarlyArticle
    80 N0d84cfdbd3634460bfbbbeb89a928792 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    81 schema:name Models, Genetic
    82 rdf:type schema:DefinedTerm
    83 N27da77b3e5a648a7898c4327f5448023 schema:volumeNumber 12
    84 rdf:type schema:PublicationVolume
    85 N2b351f5faa994bed8911fa3448023490 rdf:first sg:person.0634253750.76
    86 rdf:rest rdf:nil
    87 N37cdbc663a3c42b182efa4cc0edaecf1 schema:issueNumber Suppl 1
    88 rdf:type schema:PublicationIssue
    89 N550f22c909f543b3a528ce2c5d75214a rdf:first sg:person.0577752122.01
    90 rdf:rest N2b351f5faa994bed8911fa3448023490
    91 N64bca3b9fa534bddbea7f6c6f1aa03b4 schema:name dimensions_id
    92 schema:value pub.1038455285
    93 rdf:type schema:PropertyValue
    94 N7eaaf6d0cb03417ebe3e1cba826c337a schema:name doi
    95 schema:value 10.1186/1471-2105-12-s1-s6
    96 rdf:type schema:PropertyValue
    97 Na09d2b34cc374c399d0b820880b99385 schema:name Springer Nature - SN SciGraph project
    98 rdf:type schema:Organization
    99 Nb9a2c58b313549059086e3329bba38cc schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    100 schema:name Phylogeny
    101 rdf:type schema:DefinedTerm
    102 Nbb95487a9bb74977951215295978fece rdf:first sg:person.01336557015.68
    103 rdf:rest N550f22c909f543b3a528ce2c5d75214a
    104 Nc36a1349bab1497ab8016cea72d91f08 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    105 schema:name Algorithms
    106 rdf:type schema:DefinedTerm
    107 Ncb43a69b1f154b23992fa60be00d75b3 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    108 schema:name Computational Biology
    109 rdf:type schema:DefinedTerm
    110 Ndeacce44d6a54f0ca134a5d566108b1b schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    111 schema:name Recombination, Genetic
    112 rdf:type schema:DefinedTerm
    113 Ned9965a91e3a43559c779838f64db221 schema:name pubmed_id
    114 schema:value 21342589
    115 rdf:type schema:PropertyValue
    116 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
    117 schema:name Mathematical Sciences
    118 rdf:type schema:DefinedTerm
    119 anzsrc-for:0104 schema:inDefinedTermSet anzsrc-for:
    120 schema:name Statistics
    121 rdf:type schema:DefinedTerm
    122 sg:journal.1023786 schema:issn 1471-2105
    123 schema:name BMC Bioinformatics
    124 schema:publisher Springer Nature
    125 rdf:type schema:Periodical
    126 sg:person.01336557015.68 schema:affiliation grid-institutes:None
    127 schema:familyName Parida
    128 schema:givenName Laxmi
    129 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01336557015.68
    130 rdf:type schema:Person
    131 sg:person.0577752122.01 schema:affiliation grid-institutes:grid.481554.9
    132 schema:familyName Palamara
    133 schema:givenName Pier Francesco
    134 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0577752122.01
    135 rdf:type schema:Person
    136 sg:person.0634253750.76 schema:affiliation grid-institutes:None
    137 schema:familyName Javed
    138 schema:givenName Asif
    139 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0634253750.76
    140 rdf:type schema:Person
    141 sg:pub.10.1007/978-1-4757-2609-1_16 schema:sameAs https://app.dimensions.ai/details/publication/pub.1089756973
    142 https://doi.org/10.1007/978-1-4757-2609-1_16
    143 rdf:type schema:CreativeWork
    144 sg:pub.10.1186/1471-2105-10-s1-s72 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021721164
    145 https://doi.org/10.1186/1471-2105-10-s1-s72
    146 rdf:type schema:CreativeWork
    147 sg:pub.10.1186/1471-2156-7-16 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049183548
    148 https://doi.org/10.1186/1471-2156-7-16
    149 rdf:type schema:CreativeWork
    150 grid-institutes:None schema:alternateName Computational Genomics, IBM T J Watson Research, Yorktown, New York, USA
    151 schema:name Computational Genomics, IBM T J Watson Research, Yorktown, New York, USA
    152 rdf:type schema:Organization
    153 grid-institutes:grid.481554.9 schema:alternateName The work done during an internship at, IBM T J Watson Research Center, USA
    154 schema:name Columbia University, New York, USA
    155 The work done during an internship at, IBM T J Watson Research Center, USA
    156 rdf:type schema:Organization
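Summary counts like those shown above the table (triples, predicates, URIs, literals, blank nodes) can be derived from an N-Triples serialization. The sketch below is a minimal, regex-based counter, not a full N-Triples parser. The sample data is illustrative: the expanded `schema.org` predicate IRIs and the blank-node label `_:b0` are assumptions, and the rule used here (distinct terms across all positions) may differ from how the page itself computes its totals.

```python
import re

# One triple per line: subject, predicate, object, then a terminating dot.
# Literal suffixes (datatype or language tag) are matched but not captured.
TRIPLE_RE = re.compile(
    r'^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(<[^>]*>|_:\S+|".*")'
    r'(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?\s*\.\s*$'
)

# Illustrative sample; IRIs and the blank-node label are assumed, not copied from the service.
SAMPLE = """\
<http://scigraph.springernature.com/pub.10.1186/1471-2105-12-s1-s6> <http://schema.org/name> "A minimal descriptor of an ancestral recombinations graph" .
<http://scigraph.springernature.com/pub.10.1186/1471-2105-12-s1-s6> <http://schema.org/pagination> "s6" .
<http://scigraph.springernature.com/pub.10.1186/1471-2105-12-s1-s6> <http://schema.org/about> _:b0 .
_:b0 <http://schema.org/name> "Phylogeny" .
_:b0 <http://schema.org/inDefinedTermSet> <https://www.nlm.nih.gov/mesh/> .
"""


def triple_stats(ntriples: str) -> dict:
    """Count triples and the distinct predicates, IRIs, literals, and blank nodes."""
    predicates, uris, literals, blanks = set(), set(), set(), set()
    count = 0
    for line in ntriples.splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE_RE.match(line)
        if match is None:
            raise ValueError(f"not a recognizable triple: {line!r}")
        subj, pred, obj = match.groups()
        count += 1
        predicates.add(pred)
        for term in (subj, pred, obj):
            if term.startswith("<"):
                uris.add(term)
            elif term.startswith("_:"):
                blanks.add(term)
            else:
                literals.add(term)
    return {
        "triples": count,
        "predicates": len(predicates),
        "uris": len(uris),
        "literals": len(literals),
        "blank_nodes": len(blanks),
    }


if __name__ == "__main__":
    print(triple_stats(SAMPLE))
```

In the table above, a row that omits the subject or predicate inherits it from the nearest preceding row that states one; N-Triples has no such shorthand, which is why every sample line repeats its subject in full.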
     



