Investigating how non-homologous recombination structures genes, proteins, operons, clusters, genomes and ecosystems


Ontology type: schema:MonetaryGrant     


Grant Info

YEARS

2016-2019

FUNDING AMOUNT

325894 GBP

ABSTRACT

Since Darwin's time we have thought of the evolution of life on the planet in terms of a great unifying tree of life. Darwin, writing to Thomas Henry Huxley, said "The time will come (though I will not live to see it), when we shall have fairly true genealogies of each great kingdom of life". For much of the intervening years, the focus has been on trying to construct this great tree of life. However, while much of life is tree-like and diversifying, much of life is also involved in the process of merging. From simple symbioses, which might or might not become permanent (e.g. the chloroplast that we see powering plant life on the planet is the descendant of a once free-living bacterium), to the hybridization of plants or animals, to the fusion of genes, we see many, many instances of mergings. Unfortunately, our knowledge of these mergings lags well behind our knowledge of diversifying evolution. In this proposal, we will broaden our understanding of mergings and make such analyses easier and more comprehensive.

Our first objective is to develop software to help analyse genetic data. In this effort we have been helped enormously - and perhaps quite surprisingly - by entities such as Google, Facebook and Twitter. These companies have based their technology around the kinds of graphs we are using for molecular sequences. When a person joins Facebook, they are represented by the software as a "node" on a graph. When they "friend" somebody, an "edge" is drawn between these two nodes. When they "like" a post or a page, a different kind of edge is drawn between the person node and the page node. People and pages form a bipartite graph. Pages are characterised, say, as being political pages, or pages with an interest in sport or furniture, etc. Therefore, there is another level for pages. Overall, between people, pages, groups, interests, etc., Facebook represents its entire business as a multi-level graph. We are now doing the same kind of thing for evolving entities.

In the case of multilevel analysis of evolving objects, we can represent the smallest of evolving objects (say, a protein domain) as a node. If two domains are homologous (they share a common ancestor and are related), then we can draw an edge between them. If they appear on the same protein/gene, then we can draw edges between the domains and that gene (much as if two people on Facebook share an interest in fishing). We can then characterise the gene as being of a particular "kind", say metabolic, or membrane-embedded. We can also indicate genes on our network that sit on the same chromosome (analogous to saying they have the same "interest"). We can also have a network level where we indicate whether the organism is free-living, pathogenic, anaerobic, involved in a metabolic consortium, etc.

In the same way that communities form on social networks, communities form on sequence networks. There are many parallels and we can gain significant insights into how evolution is really structuring life on the planet. For instance, preliminary studies have shown that some sequences are promiscuous and some are not. Certain domains are widespread in genes, while some are only found in one kind of sequence and no other. We see plasmids, such as those found in the Lyme-disease-causing bacterium Borrelia, that have unique kinds of genes, but these genes are found across the diversity of Borrelia plasmids. In other words, the genes are species-restricted, but not plasmid-restricted.

The outcome of this programme will be to have flexible software and several new insights into how evolution has structured genes and genomes.

Technical Summary

The merging of evolving entities, known as introgression, is the focus of this proposal. The challenge is to move from the current situation, where analyses of genetic mergers are relatively ad hoc, to a situation where introgression is as widely understood as phylogenetics, where analysis tools are as widely available, user-friendly and flexible, and where evolutionary biologists investigate their data as easily for introgressive processes as they currently investigate the data for treelike processes. This transformation requires careful analysis of concepts, the development of exceptional software and the analysis of data from the diversity of evolving entities.

We will develop a flexible, robust network analysis program that will be a major new addition to the toolset for evolutionary biology. We will develop N-Rooted Fusion Graphs that will facilitate entirely new insights into what happens to sequences post-fusion. We will develop approaches based on graph theory in order to explain how nature structures evolving objects (domains, genes, operons, clusters, genomes, consortia and ecosystems). We will identify communities in bipartite/k-partite graphs with a view to understanding sequence promiscuity, co-occurrence of sequences, major gene flows from one lineage to another, and the level that is most important for understanding an ecosystem (the gene level, or the species level, or the protein domain level, etc).
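The multilevel graph described in the abstract can be illustrated with a small sketch. This is not the project's software: the gene names, domain names and gene "kinds" below are made-up toy data, and measuring promiscuity as the number of distinct genes a domain occurs in is an assumption chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical toy data: gene node -> protein-domain nodes found on that gene.
GENE_DOMAINS = {
    "geneA": ["kinase", "SH2"],
    "geneB": ["kinase", "SH3"],
    "geneC": ["kinase"],
    "geneD": ["zinc_finger"],
}

# Hypothetical extra level: the "kind" assigned to each gene.
GENE_KIND = {
    "geneA": "metabolic",
    "geneB": "membrane-embedded",
    "geneC": "metabolic",
    "geneD": "metabolic",
}


def domain_to_genes(gene_domains):
    """Invert gene -> domains to get the domain side of the bipartite graph."""
    index = defaultdict(set)
    for gene, domains in gene_domains.items():
        for domain in domains:
            index[domain].add(gene)
    return dict(index)


def promiscuity(gene_domains):
    """Degree of each domain node: the number of distinct genes it occurs in."""
    return {d: len(genes) for d, genes in domain_to_genes(gene_domains).items()}


def co_occurring_domains(gene_domains):
    """Domain pairs that are linked through at least one shared gene node."""
    pairs = set()
    for domains in gene_domains.values():
        unique = sorted(set(domains))
        for i, first in enumerate(unique):
            for second in unique[i + 1:]:
                pairs.add((first, second))
    return pairs


def kinds_per_domain(gene_domains, gene_kind):
    """Walk domain -> gene -> kind to see which gene kinds a domain reaches."""
    return {
        domain: {gene_kind[gene] for gene in genes}
        for domain, genes in domain_to_genes(gene_domains).items()
    }


if __name__ == "__main__":
    print(promiscuity(GENE_DOMAINS))
    print(co_occurring_domains(GENE_DOMAINS))
    print(kinds_per_domain(GENE_DOMAINS, GENE_KIND))
```

In this toy data the "kinase" domain is the widespread one, while "zinc_finger" is confined to a single gene, which mirrors the distinction the abstract draws between promiscuous and restricted sequences.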

URL

http://gtr.rcuk.ac.uk/project/6FFFDB43-F2C5-4E06-8705-F61A48C0A1F3

Related SciGraph Publications

  • 2017-04. Why prokaryotes have pangenomes in NATURE MICROBIOLOGY

JSON-LD is the canonical representation for SciGraph data.

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/2208", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/2206", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "type": "DefinedTerm"
          }
        ], 
        "amount": {
          "currency": "GBP", 
          "type": "MonetaryAmount", 
          "value": "325894"
        }, 
        "description": "Since Darwin's time we have thought of the evolution of life on the planet in terms of a great unifying tree of life. Darwin, writing to Thomas Henry Huxley said \"The time will come (though I will not live to see it), when we shall have fairly true genealogies of each great kingdom of life\". For much of the intervening years, the focus has been on trying to construct this great tree of life. However, while much of life is tree-like and diversifying, much of life is also involved in the process of merging. From simple symbioses, which might or might not become permanent (e.g. the chloroplast that we see powering plant life on the planet is the descendent of a once free-living bacterium), to the hybridization of plants or animals, to the fusion of genes, we see many, many instances of mergings. Unfortunately, our knowledge of these mergings lags well behind our knowledge of diversifying evolution. In this proposal, we will broaden our understanding of mergings and make such analyses easier and more comprehensive.\n\nOur first objective is to develop software to help analyse genetic data. In this effort we have been helped enormously - and perhaps quite surprisingly - by entities such as Google, Facebook and Twitter. These companies have based their technology around the kinds of graphs we are using for molecular sequences. When a person joins Facebook, they are represented by the software as a \"node\" on a graph. When they \"friend\" somebody, then an \"edge\" is drawn between these two nodes. When they \"like\" a post or a page, a different kind of edge is drawn between the person node and the page node. People and pages form a bipartite graph. Pages are characterised, say as being political pages, or pages with an interest in sport or furniture, etc. Therefore, there is another level for pages. Overall, between people, pages, groups, interests, etc. Facebook represents their entire business as a multi-level graph. \nWe are now doing the same kind of thing for evolving entities.\n\nIn the case of multilevel analysis of evolving objects, we can represent the smallest of evolving objects (say, a protein domain) as a node. If two domains are homologous (they share a common ancestor and are related), then we can draw an edge between them. If the appear on the same protein/gene, then we can draw edges between the domains and that gene (like as if two people have the same interest in fishing, on facebook). We can then characterise the gene as being of a particular \"kind\", say metabolic, or membrane-embedded. We can also indicate genes on our network, that are sitting on the same chromosome (analogous to saying they have the same \"interest\"). We can also have a network level where we indicate whether the organism is free-living, pathogenic, anaerobic, involved in a metabolic consortium, etc.\n\nIn the same way that we see on social networks that communities form, we see on sequence networks that communities form. There are many parallels and we can gain significant insights into how evolution is really structuring life on the planet. For instance, preliminary studies have shown that some sequences are promiscuous and some are not. Certain domains are widespread in genes, while some are only found in one kind of sequence and no other. We see plasmids, such as those found in the Lyme-disease-causing bacterium Borrelia that have unique kinds of genes, but these genes are found across the diversity of Borrelia plasmids. In other words, the genes are species-restricted, but not plasmid restricted.\n\nThe outcome of this programme will be to have flexible software and several new insights into how evolution has structures genes and genomes.\n\nTechnical Summary\nThe merging of evolving entities, known as introgression, is the focus of this proposal. \nThe challenge is to move from the current situation where analyses of genetic mergers are relatively ad hoc, to a situation where introgression is as widely understood as phylogenetics, where analysis tools are as widely available, user-friendly and flexible and where evolutionary biologists investigate their data as easily for introgressive processes as they currently investigate the data for treelike processes. This transformation requires careful analysis of concepts, the development of exceptional software and the analysis of data from the diversity of evolving entities.\n\nWe will develop a flexible, robust network analysis program that will be a major new addition to the toolset for evolutionary biology. We will develop N-Rooted Fusion Graphs that will facilitate entirely new insights into what happens to sequences post-fusion. We will develop approaches based on graph theory in order to explain how nature structures evolving objects (domains, genes, operons, clusters, genomes, consortia and ecosystems). We will identify communities in bipartite/k-partite graphs with a view to understanding sequence promiscuity, co-occurrence of sequences, major gene flows from one lineage to another, the level that is most important for understanding an ecosystem (the gene level, or the species level, or the protein domain level, etc).", 
        "endDate": "2019-11-06T00:00:00Z", 
        "funder": {
          "id": "https://www.grid.ac/institutes/grid.418100.c", 
          "type": "Organization"
        }, 
        "id": "sg:grant.5125593", 
        "identifier": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "5125593"
            ]
          }, 
          {
            "name": "gtr_id", 
            "type": "PropertyValue", 
            "value": [
              "6FFFDB43-F2C5-4E06-8705-F61A48C0A1F3"
            ]
          }
        ], 
        "inLanguage": [
          "en"
        ], 
        "keywords": [
          "evolutionary biology", 
          "technology", 
          "bacterium Borrelia", 
          "understanding sequence promiscuity", 
          "plasmid", 
          "same protein/gene", 
          "Facebook", 
          "non-homologous recombination structures genes", 
          "planet", 
          "people", 
          "gene level", 
          "protein domain level", 
          "descendents", 
          "knowledge", 
          "pages", 
          "introgressive processes", 
          "multilevel analysis", 
          "technical summary", 
          "genome", 
          "preliminary study", 
          "lineage", 
          "diversity", 
          "object", 
          "free-living bacterium", 
          "political pages", 
          "species level", 
          "plant life", 
          "time", 
          "careful analysis", 
          "sequence", 
          "occurrence", 
          "view", 
          "concept", 
          "evolution", 
          "flexible software", 
          "page node", 
          "community", 
          "same interests", 
          "appear", 
          "kind", 
          "person node", 
          "years", 
          "exceptional software", 
          "first objective", 
          "software", 
          "efforts", 
          "furniture", 
          "analysis tools", 
          "entity", 
          "great tree", 
          "species", 
          "analysis", 
          "great kingdoms", 
          "same kind", 
          "metabolic", 
          "membrane", 
          "organisms", 
          "same chromosome", 
          "current situation", 
          "common ancestor", 
          "bipartite graphs", 
          "mergings lags", 
          "ecosystems", 
          "order", 
          "toolset", 
          "edge", 
          "many instances", 
          "many parallels", 
          "Darwin's time", 
          "partite graph", 
          "friends", 
          "levels", 
          "nature structure", 
          "entire business", 
          "Lyme-disease", 
          "graph", 
          "true genealogy", 
          "evolutionary biologists", 
          "sequence networks", 
          "such analyses", 
          "situation", 
          "merging", 
          "process", 
          "understanding", 
          "terms", 
          "fishing", 
          "simple symbioses", 
          "new insights", 
          "significant insights", 
          "several new insights", 
          "introgression", 
          "network", 
          "multi-level graph", 
          "chloroplasts", 
          "unique kind", 
          "data", 
          "genetic merger", 
          "plants", 
          "life", 
          "major gene", 
          "domain", 
          "things", 
          "robust network analysis program", 
          "fusion", 
          "network level", 
          "companies", 
          "Thomas Henry Huxley", 
          "clusters", 
          "hybridization", 
          "operon", 
          "structure gene", 
          "N-Rooted Fusion Graphs", 
          "social networks", 
          "certain domains", 
          "sport", 
          "same way", 
          "great unifying tree", 
          "persons", 
          "cases", 
          "groups", 
          "Borrelia plasmids", 
          "somebody", 
          "instance", 
          "consortium", 
          "users", 
          "different kinds", 
          "treelike processes", 
          "protein", 
          "development", 
          "program", 
          "genes", 
          "proposal", 
          "Google", 
          "bipartite/k", 
          "genetic data", 
          "Darwin", 
          "Twitter", 
          "challenges", 
          "outcome", 
          "nodes", 
          "post", 
          "graph theory", 
          "phylogenetic", 
          "focus", 
          "interest", 
          "major new additions", 
          "metabolic consortium", 
          "trees", 
          "protein domains", 
          "animals", 
          "molecular sequences", 
          "transformation", 
          "other words", 
          "approach"
        ], 
        "name": "Investigating how non-homologous recombination structures genes, proteins, operons, clusters, genomes and ecosystems", 
        "recipient": [
          {
            "id": "https://www.grid.ac/institutes/grid.5379.8", 
            "type": "Organization"
          }
        ], 
        "sameAs": [
          "https://app.dimensions.ai/details/grant/grant.5125593"
        ], 
        "sdDataset": "grants", 
        "sdDatePublished": "2019-03-07T11:33", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com.uberresearch.data.processor/core_data/20181219_192338/projects/base/gtr_projects_0.xml.gz", 
        "startDate": "2016-11-07T00:00:00Z", 
        "type": "MonetaryGrant", 
        "url": "http://gtr.rcuk.ac.uk/project/6FFFDB43-F2C5-4E06-8705-F61A48C0A1F3"
      }
    ]
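Because the record above is plain JSON, its headline fields can be read with a standard JSON parser. The following is a minimal sketch, assuming a trimmed copy of the record that keeps only a few of the keys shown above; the function name `summarise_grant` and the shape of the returned dictionary are illustrative choices, not part of SciGraph.

```python
import json
from datetime import datetime

# Trimmed copy of the record above (only the fields used below are kept).
RECORD_JSON = """
[
  {
    "id": "sg:grant.5125593",
    "type": "MonetaryGrant",
    "amount": {"currency": "GBP", "type": "MonetaryAmount", "value": "325894"},
    "startDate": "2016-11-07T00:00:00Z",
    "endDate": "2019-11-06T00:00:00Z",
    "funder": {"id": "https://www.grid.ac/institutes/grid.418100.c", "type": "Organization"},
    "recipient": [
      {"id": "https://www.grid.ac/institutes/grid.5379.8", "type": "Organization"}
    ]
  }
]
"""

DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def summarise_grant(record):
    """Pull the headline fields out of one grant record (hypothetical helper)."""
    start = datetime.strptime(record["startDate"], DATE_FORMAT)
    end = datetime.strptime(record["endDate"], DATE_FORMAT)
    return {
        "id": record["id"],
        # The amount is stored as a string in the record, so convert it.
        "amount": int(record["amount"]["value"]),
        "currency": record["amount"]["currency"],
        "start_year": start.year,
        "end_year": end.year,
        "duration_days": (end - start).days,
        "funder": record["funder"]["id"],
        "recipients": [r["id"] for r in record["recipient"]],
    }


# The top level of the document is a list holding a single record.
summary = summarise_grant(json.loads(RECORD_JSON)[0])

if __name__ == "__main__":
    for key, value in summary.items():
        print(f"{key}: {value}")
```

Treating the document as ordinary JSON ignores the `@context` mapping; a full JSON-LD processor would be needed to expand the short keys into their schema.org URIs.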
     

    Download the RDF metadata as: json-ld, nt, turtle, xml.

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/grant.5125593'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/grant.5125593'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/grant.5125593'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/grant.5125593'
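The four curl calls differ only in their `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The short format keys, the function names and the 30-second timeout are assumptions made for this example, and the sketch presumes the endpoint shown in the curl commands is still reachable.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above; the short keys are
# labels chosen for this sketch.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build a GET request that asks for one serialisation of a record."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": ACCEPT_TYPES[fmt]},
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Perform the request and return the response body as text."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Network access is required for this call.
    print(fetch("grant.5125593", "turtle"))
```

Keeping `build_request` separate from `fetch` makes it possible to inspect the URL and headers without touching the network.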


     

    This table displays all metadata directly associated to this object as RDF triples.

    191 TRIPLES      19 PREDICATES      175 URIs      167 LITERALS      4 BLANK NODES

    Subject Predicate Object
    1 sg:grant.5125593 schema:about anzsrc-for:2206
    2 anzsrc-for:2208
    3 schema:amount Nb17d411137ba42c8bd5016b261b5b4dc
    4 schema:description Since Darwin's time we have thought of the evolution of life on the planet in terms of a great unifying tree of life. Darwin, writing to Thomas Henry Huxley said "The time will come (though I will not live to see it), when we shall have fairly true genealogies of each great kingdom of life". For much of the intervening years, the focus has been on trying to construct this great tree of life. However, while much of life is tree-like and diversifying, much of life is also involved in the process of merging. From simple symbioses, which might or might not become permanent (e.g. the chloroplast that we see powering plant life on the planet is the descendent of a once free-living bacterium), to the hybridization of plants or animals, to the fusion of genes, we see many, many instances of mergings. Unfortunately, our knowledge of these mergings lags well behind our knowledge of diversifying evolution. In this proposal, we will broaden our understanding of mergings and make such analyses easier and more comprehensive. Our first objective is to develop software to help analyse genetic data. In this effort we have been helped enormously - and perhaps quite surprisingly - by entities such as Google, Facebook and Twitter. These companies have based their technology around the kinds of graphs we are using for molecular sequences. When a person joins Facebook, they are represented by the software as a "node" on a graph. When they "friend" somebody, then an "edge" is drawn between these two nodes. When they "like" a post or a page, a different kind of edge is drawn between the person node and the page node. People and pages form a bipartite graph. Pages are characterised, say as being political pages, or pages with an interest in sport or furniture, etc. Therefore, there is another level for pages. Overall, between people, pages, groups, interests, etc. Facebook represents their entire business as a multi-level graph. 
We are now doing the same kind of thing for evolving entities. In the case of multilevel analysis of evolving objects, we can represent the smallest of evolving objects (say, a protein domain) as a node. If two domains are homologous (they share a common ancestor and are related), then we can draw an edge between them. If the appear on the same protein/gene, then we can draw edges between the domains and that gene (like as if two people have the same interest in fishing, on facebook). We can then characterise the gene as being of a particular "kind", say metabolic, or membrane-embedded. We can also indicate genes on our network, that are sitting on the same chromosome (analogous to saying they have the same "interest"). We can also have a network level where we indicate whether the organism is free-living, pathogenic, anaerobic, involved in a metabolic consortium, etc. In the same way that we see on social networks that communities form, we see on sequence networks that communities form. There are many parallels and we can gain significant insights into how evolution is really structuring life on the planet. For instance, preliminary studies have shown that some sequences are promiscuous and some are not. Certain domains are widespread in genes, while some are only found in one kind of sequence and no other. We see plasmids, such as those found in the Lyme-disease-causing bacterium Borrelia that have unique kinds of genes, but these genes are found across the diversity of Borrelia plasmids. In other words, the genes are species-restricted, but not plasmid restricted. The outcome of this programme will be to have flexible software and several new insights into how evolution has structures genes and genomes. Technical Summary The merging of evolving entities, known as introgression, is the focus of this proposal. 
The challenge is to move from the current situation where analyses of genetic mergers are relatively ad hoc, to a situation where introgression is as widely understood as phylogenetics, where analysis tools are as widely available, user-friendly and flexible and where evolutionary biologists investigate their data as easily for introgressive processes as they currently investigate the data for treelike processes. This transformation requires careful analysis of concepts, the development of exceptional software and the analysis of data from the diversity of evolving entities. We will develop a flexible, robust network analysis program that will be a major new addition to the toolset for evolutionary biology. We will develop N-Rooted Fusion Graphs that will facilitate entirely new insights into what happens to sequences post-fusion. We will develop approaches based on graph theory in order to explain how nature structures evolving objects (domains, genes, operons, clusters, genomes, consortia and ecosystems). We will identify communities in bipartite/k-partite graphs with a view to understanding sequence promiscuity, co-occurrence of sequences, major gene flows from one lineage to another, the level that is most important for understanding an ecosystem (the gene level, or the species level, or the protein domain level, etc).
    5 schema:endDate 2019-11-06T00:00:00Z
    6 schema:funder https://www.grid.ac/institutes/grid.418100.c
    7 schema:identifier N3e72e0925ac74df496a28ce49fdf53c3
    8 Nb166041e5402478ba41a51331f4478a6
    9 schema:inLanguage en
    10 schema:keywords Borrelia plasmids
    11 Darwin
    12 Darwin's time
    13 Facebook
    14 Google
    15 Lyme-disease
    16 N-Rooted Fusion Graphs
    17 Thomas Henry Huxley
    18 Twitter
    19 analysis
    20 analysis tools
    21 animals
    22 appear
    23 approach
    24 bacterium Borrelia
    25 bipartite graphs
    26 bipartite/k
    27 careful analysis
    28 cases
    29 certain domains
    30 challenges
    31 chloroplasts
    32 clusters
    33 common ancestor
    34 community
    35 companies
    36 concept
    37 consortium
    38 current situation
    39 data
    40 descendents
    41 development
    42 different kinds
    43 diversity
    44 domain
    45 ecosystems
    46 edge
    47 efforts
    48 entire business
    49 entity
    50 evolution
    51 evolutionary biologists
    52 evolutionary biology
    53 exceptional software
    54 first objective
    55 fishing
    56 flexible software
    57 focus
    58 free-living bacterium
    59 friends
    60 furniture
    61 fusion
    62 gene level
    63 genes
    64 genetic data
    65 genetic merger
    66 genome
    67 graph
    68 graph theory
    69 great kingdoms
    70 great tree
    71 great unifying tree
    72 groups
    73 hybridization
    74 instance
    75 interest
    76 introgression
    77 introgressive processes
    78 kind
    79 knowledge
    80 levels
    81 life
    82 lineage
    83 major gene
    84 major new additions
    85 many instances
    86 many parallels
    87 membrane
    88 merging
    89 mergings lags
    90 metabolic
    91 metabolic consortium
    92 molecular sequences
    93 multi-level graph
    94 multilevel analysis
    95 nature structure
    96 network
    97 network level
    98 new insights
    99 nodes
    100 non-homologous recombination structures genes
    101 object
    102 occurrence
    103 operon
    104 order
    105 organisms
    106 other words
    107 outcome
    108 page node
    109 pages
    110 partite graph
    111 people
    112 person node
    113 persons
    114 phylogenetic
    115 planet
    116 plant life
    117 plants
    118 plasmid
    119 political pages
    120 post
    121 preliminary study
    122 process
    123 program
    124 proposal
    125 protein
    126 protein domain level
    127 protein domains
    128 robust network analysis program
    129 same chromosome
    130 same interests
    131 same kind
    132 same protein/gene
    133 same way
    134 sequence
    135 sequence networks
    136 several new insights
    137 significant insights
    138 simple symbioses
    139 situation
    140 social networks
    141 software
    142 somebody
    143 species
    144 species level
    145 sport
    146 structure gene
    147 such analyses
    148 technical summary
    149 technology
    150 terms
    151 things
    152 time
    153 toolset
    154 transformation
    155 treelike processes
    156 trees
    157 true genealogy
    158 understanding
    159 understanding sequence promiscuity
    160 unique kind
    161 users
    162 view
    163 years
    164 schema:name Investigating how non-homologous recombination structures genes, proteins, operons, clusters, genomes and ecosystems
    165 schema:recipient https://www.grid.ac/institutes/grid.5379.8
    166 schema:sameAs https://app.dimensions.ai/details/grant/grant.5125593
    167 schema:sdDatePublished 2019-03-07T11:33
    168 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    169 schema:sdPublisher N21181c4ab8164678bd793042f4992ab8
    170 schema:startDate 2016-11-07T00:00:00Z
    171 schema:url http://gtr.rcuk.ac.uk/project/6FFFDB43-F2C5-4E06-8705-F61A48C0A1F3
    172 sgo:license sg:explorer/license/
    173 sgo:sdDataset grants
    174 rdf:type schema:MonetaryGrant
    175 N21181c4ab8164678bd793042f4992ab8 schema:name Springer Nature - SN SciGraph project
    176 rdf:type schema:Organization
    177 N3e72e0925ac74df496a28ce49fdf53c3 schema:name dimensions_id
    178 schema:value 5125593
    179 rdf:type schema:PropertyValue
    180 Nb166041e5402478ba41a51331f4478a6 schema:name gtr_id
    181 schema:value 6FFFDB43-F2C5-4E06-8705-F61A48C0A1F3
    182 rdf:type schema:PropertyValue
    183 Nb17d411137ba42c8bd5016b261b5b4dc schema:currency GBP
    184 schema:value 325894
    185 rdf:type schema:MonetaryAmount
    186 anzsrc-for:2206 schema:inDefinedTermSet anzsrc-for:
    187 rdf:type schema:DefinedTerm
    188 anzsrc-for:2208 schema:inDefinedTermSet anzsrc-for:
    189 rdf:type schema:DefinedTerm
    190 https://www.grid.ac/institutes/grid.418100.c rdf:type schema:Organization
    191 https://www.grid.ac/institutes/grid.5379.8 rdf:type schema:Organization
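The blank-node identifiers in the table (those beginning with `N`) stand for nested objects such as the monetary amount and the two identifiers. A minimal sketch of how such rows can be folded back into a nested structure follows; the triples are a hand-copied subset of the table, read on the assumption that a row with no subject or predicate inherits them from the row above, and the helper names are hypothetical.

```python
# Subset of the table above as (subject, predicate, object) tuples.
TRIPLES = [
    ("sg:grant.5125593", "schema:amount", "Nb17d411137ba42c8bd5016b261b5b4dc"),
    ("sg:grant.5125593", "schema:identifier", "N3e72e0925ac74df496a28ce49fdf53c3"),
    ("sg:grant.5125593", "schema:identifier", "Nb166041e5402478ba41a51331f4478a6"),
    ("sg:grant.5125593", "rdf:type", "schema:MonetaryGrant"),
    ("N3e72e0925ac74df496a28ce49fdf53c3", "schema:name", "dimensions_id"),
    ("N3e72e0925ac74df496a28ce49fdf53c3", "schema:value", "5125593"),
    ("N3e72e0925ac74df496a28ce49fdf53c3", "rdf:type", "schema:PropertyValue"),
    ("Nb166041e5402478ba41a51331f4478a6", "schema:name", "gtr_id"),
    ("Nb166041e5402478ba41a51331f4478a6", "schema:value", "6FFFDB43-F2C5-4E06-8705-F61A48C0A1F3"),
    ("Nb166041e5402478ba41a51331f4478a6", "rdf:type", "schema:PropertyValue"),
    ("Nb17d411137ba42c8bd5016b261b5b4dc", "schema:currency", "GBP"),
    ("Nb17d411137ba42c8bd5016b261b5b4dc", "schema:value", "325894"),
    ("Nb17d411137ba42c8bd5016b261b5b4dc", "rdf:type", "schema:MonetaryAmount"),
]


def index_by_subject(triples):
    """Group triples as subject -> predicate -> list of objects."""
    index = {}
    for subject, predicate, obj in triples:
        index.setdefault(subject, {}).setdefault(predicate, []).append(obj)
    return index


def resolve(node, index, seen=frozenset()):
    """Expand a node into a nested dict, following objects that are subjects.

    Plain values (and nodes already visited on this path) are returned as-is,
    which also guards against cycles.
    """
    if node not in index or node in seen:
        return node
    seen = seen | {node}
    return {
        predicate: [resolve(obj, index, seen) for obj in objects]
        for predicate, objects in index[node].items()
    }


INDEX = index_by_subject(TRIPLES)
GRANT = resolve("sg:grant.5125593", INDEX)

if __name__ == "__main__":
    amount = GRANT["schema:amount"][0]
    print(amount["schema:value"][0], amount["schema:currency"][0])
    for identifier in GRANT["schema:identifier"]:
        print(identifier["schema:name"][0], identifier["schema:value"][0])
```

The nested result has the same shape as the `amount` and `identifier` entries of the JSON-LD record, which is one way to check that the two serialisations agree.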
     



