Evolutionary Analysis and Comparative Genomics of Protein Superfamilies


Ontology type: schema:MonetaryGrant     


Grant Info

YEARS

2009-2019

FUNDING AMOUNT

11780231.0 USD

ABSTRACT

The provenance of several components of major uniquely eukaryotic molecular machines is increasingly being traced back to prokaryotic biological conflict systems. L. Aravind and his team demonstrated that the N-terminal single-stranded DNA-binding domain from the anti-restriction protein ArdC, deployed by bacterial mobile elements against their host, was independently acquired twice by eukaryotes, giving rise to the DNA-binding domains of XPC/Rad4 and the Tc-38-like proteins in the stem kinetoplastid. In both instances, the ArdC-N domain was tandemly duplicated, forming an extensive DNA-binding interface. In XPC/Rad4, the ArdC-N domains (BHDs) also fused to the inactive transglutaminase domain of a peptide-N-glycanase ultimately derived from an archaeal conflict system. Alongside, they delineated several parallel acquisitions from conjugative elements/bacteriophages that gave rise to key components of the kinetoplast DNA (kDNA) replication apparatus. These findings resolve two outstanding questions in eukaryote biology: (1) the origin of the unique DNA lesion-recognition component of nucleotide excision repair (NER) and (2) the origin of the unusual, plasmid-like features of kDNA.

The evolution of release factors catalyzing the hydrolysis of the final peptidyl-tRNA bond and the release of the polypeptide from the ribosome has been a longstanding paradox. While the components of the translation apparatus are generally well conserved across extant life, structurally unrelated release factor peptidyl hydrolases (RF-PHs) emerged in the stems of the bacterial and archaeo-eukaryotic lineages. L. Aravind and his team analyzed the diversification of RF-PH domains within the broader evolutionary framework of the translation apparatus. From this, they reconstructed the possible state of translation termination in the Last Universal Common Ancestor, with possible tRNA-like terminators. Further, the evolutionary trajectories of the several auxiliary release factors in ribosome quality control (RQC) and rescue pathways point to multiple independent solutions to this problem and to frequent transfers between superkingdoms, including the recently characterized ArfT, which is more widely distributed across life than previously appreciated. The eukaryotic RQC system was pieced together from components with disparate provenance, which include the long-sought-after Vms1/ANKZF1 RF-PH of bacterial origin. They also uncovered an under-appreciated evolutionary driver of innovation in rescue pathways: effectors deployed in biological conflicts that target the ribosome. At least three rescue pathways (centered on the prfH/RFH, baeRF-1, and C12orf65 RF-PH domains) were likely innovated in response to such conflicts.

Numerous, diverse, highly variable defense and offense genetic systems are encoded in most bacterial genomes and are involved in various forms of conflict among competing microbes or their eukaryotic hosts. In collaboration with Dr. Eugene Koonin's group, Dr. L. Aravind focused on the offense and self-versus-nonself discrimination systems encoded by archaeal genomes, which so far have remained largely uncharacterized and unannotated. Specifically, they analyzed archaeal genomic loci encoding polymorphic and related toxin systems and ribosomally synthesized antimicrobial peptides. Using sensitive methods for sequence comparison and the guilt-by-association approach, they identified such systems in 141 archaeal genomes. These toxins can be classified into four major groups based on the structure of the components involved in toxin delivery. The toxin domains are often shared between and within each system. They revisited the halocin families and substantially expanded the halocin C8 family, which was identified in diverse archaeal genomes and also in certain bacteria. Finally, they employed features of protein sequences and genomic locus organization characteristic of archaeocins and polymorphic toxins to identify candidates for analogous but not necessarily homologous systems among uncharacterized protein families. This work confidently predicts that more than 1,600 archaeal proteins, currently annotated as hypothetical in public databases, are components of conflict and self-versus-nonself discrimination systems.

Diverse and highly variable systems involved in biological conflicts and self-versus-nonself discrimination are ubiquitous in bacteria but much less studied in archaea. They performed comprehensive comparative genomic analyses of the archaeal systems that share components with analogous bacterial systems and proposed an approach to identify new systems that could be involved in these functions. They predicted polymorphic toxin systems in 141 archaeal genomes and identified new, archaea-specific toxin and immunity protein families. These systems are widely represented in archaea and are predicted to play major roles in interactions between species and in intermicrobial conflicts. This work is expected to stimulate experimental research to advance the understanding of poorly characterized major aspects of archaeal biology.

A diverse collection of enzymes comprising the protocatechuate dioxygenases (PCADs) has been characterized in several extradiol aromatic compound degradation pathways. Structural studies have shown a relationship between PCADs and the more broadly distributed, functionally enigmatic Memo domain linked to several human diseases. To better understand the evolution of this PCAD-Memo protein superfamily, L. Aravind and his team explored its structural and functional determinants to establish a unified evolutionary framework, identifying 15 clearly delineable families, including a previously underappreciated diversity in five Memo clade families. They placed the superfamily's origin within the greater radiation of the nucleoside phosphorylase/hydrolase-peptide/amidohydrolase fold prior to the last universal common ancestor of all extant organisms. In addition to identifying active-site residues across the superfamily, they described three distinct, structurally variable regions emanating from the core scaffold, often housing conserved residues specific to individual families. These were predicted to contribute to the active-site pocket, potentially in substrate specificity and allosteric regulation. They also identified several previously undescribed conserved genome contexts, providing insight into potentially novel substrates in PCAD clade families. They extended known conserved contextual associations for the Memo clade beyond previously described associations with the AMMECR1 domain and a radical S-adenosylmethionine family domain. These observations point to two distinct yet potentially overlapping contexts wherein the elusive molecular function of the Memo domain could be finally resolved, thereby linking it to nucleotide base and aliphatic isoprenoid modification. In total, this report throws light on the functions of large swaths of the experimentally uncharacterized PCAD-Memo families.

mRNAs are regulated by nucleotide modifications that influence their cellular fate. Two of the most abundant modified nucleotides are N6-methyladenosine (m6A), found within mRNAs, and N6,2'-O-dimethyladenosine (m6Am), which is found at the first transcribed nucleotide. Distinguishing these modifications in mapping studies has been difficult. L. Aravind identified PCIF1 as the methyltransferase that catalyzes the m6Am modification. With Dr. Eric Greer's group at Harvard University, he worked to biochemically characterize PCIF1. They found that PCIF1 binds the m7G cap and depends on it. By depleting PCIF1, they generated transcriptome-wide maps that distinguish m6Am and m6A. They showed that m6A and m6Am misannotations arise from mRNA isoforms with alternative transcription start sites (TSSs). This explains the biological significance of this RNA modification.

URL

http://projectreporter.nih.gov/project_info_description.cfm?aid=10018682

Related SciGraph Publications

  • 2018-04-18. Evolutionary convergence and divergence in archaeal chromosomal proteins and Chromo-like domains from bacteria and eukaryotes in SCIENTIFIC REPORTS
  • 2018-04-09. Vms1/Ankzf1 peptidyl-tRNA hydrolase releases nascent chains from stalled ribosomes in NATURE
  • 2015-05-15. The eukaryotic translation initiation regulator CDC123 defines a divergent clade of ATP-grasp enzymes with a predicted role in novel protein modifications in BIOLOGY DIRECT
  • 2015-01-16. Structure and sequence analyses of Bacteroides proteins BVU_4064 and BF1687 reveal presence of two novel predominantly-beta domains, predicted to be involved in lipid and cell surface interactions in BMC BIOINFORMATICS
  • 2014-10-31. Multiple enzymatic activities of ParB/Srx superfamily mediate sexual conflict among conjugative plasmids in NATURE COMMUNICATIONS
  • 2013-08-15. Novel autoproteolytic and DNA-damage sensing components in the bacterial SOS response and oxidized methylcytosine-induced eukaryotic DNA demethylation systems in BIOLOGY DIRECT
  • 2013-06-15. Comprehensive analysis of the HEPN superfamily: identification of novel roles in intra-genomic conflicts, defense, pathogenesis and RNA processing in BIOLOGY DIRECT
  • 2013-06-08. Two novel PIWI families: roles in inter-genomic conflicts in bacteria and Mediator-dependent modulation of transcription in eukaryotes in BIOLOGY DIRECT
  • 2012-11-14. Live virus-free or die: coupling of antivirus immunity and programmed suicide or dormancy in prokaryotes in BIOLOGY DIRECT
  • 2012-11-12. ALOG domains: provenance of plant homeotic and developmental regulators from the DNA-binding domain of a novel class of DIRS1-type retroposons in BIOLOGY DIRECT
  • 2012-06-25. Polymorphic toxin systems: Comprehensive characterization of trafficking modes, processing, mechanisms of action, immunity and ecology using comparative genomics in BIOLOGY DIRECT
  • 2010-08-02. Predicted class-I aminoacyl tRNA synthetase-like proteins in non-ribosomal peptide synthesis in BIOLOGY DIRECT
  • 2010-06-30. Presence of a classical RRM-fold palm domain in Thg1-type 3'- 5'nucleic acid polymerases and the origin of the GGDEF and CRISPR polymerase domains in BIOLOGY DIRECT
  • 2010-03-19. OST-HTH: a novel predicted RNA-binding domain in BIOLOGY DIRECT
  • 2010-01-07. Novel eukaryotic enzymes modifying cell-surface biopolymers in BIOLOGY DIRECT
  • 2009-08-14. The Anabaena sensory rhodopsin transducer defines a novel superfamily of prokaryotic small-molecule binding domains in BIOLOGY DIRECT
  • 2009-06-21. Structure of a lamprey variable lymphocyte receptor in complex with a protein antigen in NATURE STRUCTURAL & MOLECULAR BIOLOGY
  • 2009-04-15. Reconstructing prokaryotic transcriptional regulatory networks: lessons from actinobacteria in BMC BIOLOGY
  • 2009-03-30. Reconstructing the ubiquitin network - cross-talk with other systems and identification of novel functions in GENOME BIOLOGY
  • 2008-11-03. Unraveling the biochemistry and provenance of pupylation: a prokaryotic analog of ubiquitination in BIOLOGY DIRECT

JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "type": "DefinedTerm"
          }
        ], 
        "amount": {
          "currency": "USD", 
          "type": "MonetaryAmount", 
          "value": 11780231.0
        }, 
        "description": "The provenance of several components of major uniquely eukaryotic molecular machines are increasingly being traced back to prokaryotic biological conflict systems. L Aravind and his team demonstrated that the N-terminal single-stranded DNA-binding domain from the anti-restriction protein ArdC, deployed by bacterial mobile elements against their host, was independently acquired twice by eukaryotes, giving rise to the DNA-binding domains of XPC/Rad4 and the Tc-38-like proteins in the stem kinetoplastid. In both instances, the ArdC-N domain tandemly duplicated forming an extensive DNA-binding interface. In XPC/Rad4, the ArdC-N domains (BHDs) also fused to the inactive transglutaminase domain of a peptide-N-glycanase ultimately derived from an archaeal conflict system. Alongside, they delineated several parallel acquisitions from conjugative elements/bacteriophages that gave rise to key components of the kinetoplast DNA (kDNA) replication apparatus. These findings resolve two outstanding questions in eukaryote biology: (1) the origin of the unique DNA lesion-recognition component of NER and (2) origin of the unusual, plasmid-like features of kDNA. The evolution of release factors catalyzing the hydrolysis of the final peptidyl-tRNA bond and the release of the polypeptide from the ribosome has been a longstanding paradox. While the components of the translation apparatus are generally well-conserved across extant life, structurally unrelated release factor peptidyl hydrolases (RF-PHs) emerged in the stems of the bacterial and archaeo-eukaryotic lineages. L. Aravind and his team analyzed the diversification of RF-PH domains within the broader evolutionary framework of the translation apparatus. Thus, they reconstructed the possible state of translation termination in the Last Universal Common Ancestor with possible tRNA-like terminators. 
Further, evolutionary trajectories of the several auxiliary release factors in ribosome quality control (RQC) and rescue pathways point to multiple independent solutions to this problem and frequent transfers between superkingdoms including the recently characterized ArfT, which is more widely distributed across life than previously appreciated. The eukaryotic RQC system was pieced together from components with disparate provenance, which include the long-sought-after Vms1/ANKZF1 RF-PH of bacterial origin. They also uncover an under-appreciated evolutionary driver of innovation in rescue pathways: effectors deployed in biological conflicts that target the ribosome. At least three rescue pathways (centered on the prfH/RFH, baeRF-1, and C12orf65 RF-PH domains), were likely innovated in response to such conflicts. Numerous, diverse, highly variable defense and offense genetic systems are encoded in most bacterial genomes and are involved in various forms of conflict among competing microbes or their eukaryotic hosts. In collaboration with Dr. Eugene Koonin's group Dr. L. Aravind focused on the offense and self-versus-nonself discrimination systems encoded by archaeal genomes that so far have remained largely uncharacterized and unannotated. Specifically, they analyzed archaeal genomic loci encoding polymorphic and related toxin systems and ribosomally synthesized antimicrobial peptides. Using sensitive methods for sequence comparison and the guilt by association approach, they identified such systems in 141 archaeal genomes. These toxins can be classified into four major groups based on the structure of the components involved in the toxin delivery. The toxin domains are often shared between and within each system. They revisited the halocin families and substantially expand the halocin C8 family, which was identified in diverse archaeal genomes and also certain bacteria. 
Finally, they employ features of protein sequences and genomic locus organization characteristic of archaeocins and polymorphic toxins to identify candidates for analogous but not necessarily homologous systems among uncharacterized protein families. This work confidently predicts that more than 1,600 archaeal proteins, currently annotated as hypothetical in public databases, are components of conflict and self-versus-nonself discrimination systems. Diverse and highly variable systems involved in biological conflicts and self-versus-nonself discrimination are ubiquitous in bacteria but much less studied in archaea. They performed comprehensive comparative genomic analyses of the archaeal systems that share components with analogous bacterial systems and propose an approach to identify new systems that could be involved in these functions. They predicted polymorphic toxin systems in 141 archaeal genomes and identify new, archaea-specific toxin and immunity protein families. These systems are widely represented in archaea and are predicted to play major roles in interactions between species and in intermicrobial conflicts. This work is expected to stimulate experimental research to advance the understanding of poorly characterized major aspects of archaeal biology. A diverse collection of enzymes comprising the protocatechuate dioxygenases (PCADs) has been characterized in several extradiol aromatic compound degradation pathways. Structural studies have shown a relationship between PCADs and the more broadly-distributed, functionally enigmatic Memo domain linked to several human diseases. To better understand the evolution of this PCAD-Memo protein superfamily, L Aravind and his team explored their structural and functional determinants to establish a unified evolutionary framework, identifying 15 clearly-delineable families, including a previously-underappreciated diversity in five Memo clade families. 
They placed the superfamily's origin within the greater radiation of the nucleoside phosphorylase/hydrolase-peptide/amidohydrolase fold prior to the last universal common ancestor of all extant organisms. In addition to identifying active-site residues across the superfamily, they described three distinct, structurally-variable regions emanating from the core scaffold often housing conserved residues specific to individual families. These were predicted to contribute to the active-site pocket, potentially in substrate specificity and allosteric regulation. They also identified several previously-undescribed conserved genome contexts, providing insight into potentially novel substrates in PCAD clade families. They extended known conserved contextual associations for the Memo clade beyond previously-described associations with the AMMECR1 domain and a radical S-adenosylmethionine family domain. These observations point to two distinct yet potentially overlapping contexts wherein the elusive molecular function of the Memo domain could be finally resolved, thereby linking it to nucleotide base and aliphatic isoprenoid modification. In total, this report throws light on the functions of large swaths of the experimentally-uncharacterized PCAD-Memo families. mRNAs are regulated by nucleotide modifications that influence their cellular fate. Two of the most abundant modified nucleotides are N6-methyladenosine (m6A), found within mRNAs, and N6,2'-O-dimethyladenosine (m6Am), which is found at the first transcribed nucleotide. Distinguishing these modifications in mapping studies has been difficult. L. Aravind identified PCIF1 as the methyltransferase that catalyzes that modification. With Dr. Eric Greer's group at Harvard University he worked to biochemically characterize PCIF1 that generates m6Am. They founnd that PCIF1 binds and is dependent on the m7G cap. By depleting PCIF1, we generated transcriptome-wide maps that distinguish m6Am and m6A. 
They showed that m6A and m6Am misannotations arise from mRNA isoforms with alternative transcription start sites (TSSs). This explains the biological significance of this RNA modification", 
        "endDate": "2019-01-01", 
        "funder": {
          "id": "http://www.grid.ac/institutes/grid.280285.5", 
          "type": "Organization"
        }, 
        "id": "sg:grant.2726065", 
        "identifier": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "grant.2726065"
            ]
          }, 
          {
            "name": "nih_id", 
            "type": "PropertyValue", 
            "value": [
              "ZIALM594244"
            ]
          }
        ], 
        "inLanguage": [
          "en"
        ], 
        "keywords": [
          "last universal common ancestor", 
          "universal common ancestor", 
          "ribosome quality control", 
          "XPC/Rad4", 
          "DNA-binding domain", 
          "archaeal genomes", 
          "transcription start site", 
          "translation apparatus", 
          "biological conflicts", 
          "common ancestor", 
          "protein family", 
          "aromatic compound degradation pathways", 
          "toxin systems", 
          "comprehensive comparative genomic analysis", 
          "rescue pathway", 
          "release factors", 
          "alternative transcription start sites", 
          "biological conflict systems", 
          "polymorphic toxin systems", 
          "structurally variable regions", 
          "elusive molecular functions", 
          "most bacterial genomes", 
          "uncharacterized protein families", 
          "DNA replication apparatus", 
          "comparative genomic analysis", 
          "m7G cap", 
          "DNA-binding interface", 
          "evolutionary framework", 
          "transcriptome-wide map", 
          "broader evolutionary framework", 
          "active site residues", 
          "active site pocket", 
          "bacterial mobile elements", 
          "eukaryote biology", 
          "archaeal biology", 
          "archaeal proteins", 
          "polymorphic toxins", 
          "eukaryotic hosts", 
          "locus organization", 
          "evolutionary drivers", 
          "transglutaminase domain", 
          "archaeal systems", 
          "comparative genomics", 
          "underappreciated diversity", 
          "extant organisms", 
          "genomic loci", 
          "molecular functions", 
          "bacterial genomes", 
          "genome context", 
          "evolutionary analysis", 
          "evolutionary trajectories", 
          "allosteric regulation", 
          "cellular fate", 
          "protein superfamilies", 
          "RNA modifications", 
          "isoprenoid modification", 
          "replication apparatus", 
          "genetic system", 
          "translation termination", 
          "genomic analysis", 
          "nucleotide modifications", 
          "sequence comparison", 
          "PCIF1", 
          "start site", 
          "substrate specificity", 
          "variable defenses", 
          "protein sequences", 
          "bacterial systems", 
          "genome", 
          "first transcribed", 
          "pathways points", 
          "molecular machines", 
          "toxin domain", 
          "mRNA isoforms", 
          "novel substrate", 
          "human diseases", 
          "disparate provenance", 
          "biological significance", 
          "greater radiation", 
          "toxin delivery", 
          "mobile elements", 
          "certain bacteria", 
          "association approach", 
          "public databases", 
          "bacterial origin", 
          "Rad4", 
          "major groups", 
          "archaea", 
          "degradation pathway", 
          "longstanding paradox", 
          "homologous system", 
          "diverse collection", 
          "antimicrobial peptides", 
          "superfamily", 
          "protein", 
          "m6Am", 
          "ribosomes", 
          "ancestor", 
          "functional determinants", 
          "mapping studies", 
          "structural studies", 
          "extant life", 
          "outstanding questions", 
          "pathway", 
          "biology", 
          "nonself discrimination", 
          "Aravind", 
          "bacteria", 
          "mRNA", 
          "family", 
          "multiple independent solutions", 
          "host", 
          "toxin", 
          "domain", 
          "ARDC", 
          "superkingdoms", 
          "eukaryotes", 
          "archaeocins", 
          "kinetoplastids", 
          "frequent transfer", 
          "clade", 
          "individual families", 
          "genomics", 
          "m6A.", 
          "misannotations", 
          "major role", 
          "lineages", 
          "dioxygenases", 
          "core scaffold", 
          "m6A", 
          "amidohydrolase", 
          "hydrolases", 
          "microbes", 
          "dimethyladenosine", 
          "loci", 
          "peptides", 
          "key component", 
          "conflict systems", 
          "organisms", 
          "methyltransferase", 
          "polypeptide", 
          "nucleotides", 
          "species", 
          "effectors", 
          "isoforms", 
          "terminator", 
          "diversity", 
          "bacteriophages", 
          "modification", 
          "diversification", 
          "origin", 
          "glycanase", 
          "evolution", 
          "residues", 
          "enzyme", 
          "regulation", 
          "kDNA", 
          "family domain", 
          "components of conflict", 
          "sequence", 
          "fate", 
          "function", 
          "stem", 
          "defense", 
          "NER", 
          "pocket", 
          "quality control", 
          "apparatus", 
          "components", 
          "hydrolysis", 
          "sites", 
          "scaffolds", 
          "insights", 
          "major aspects", 
          "specificity", 
          "role", 
          "discrimination system", 
          "substrate", 
          "interaction", 
          "provenance", 
          "factors", 
          "region", 
          "analysis", 
          "association", 
          "understanding", 
          "response", 
          "determinants", 
          "release", 
          "sensitive method", 
          "drivers", 
          "large swaths", 
          "cap", 
          "study", 
          "such conflicts", 
          "structure", 
          "rise", 
          "addition", 
          "transcribed", 
          "candidates", 
          "system", 
          "significance", 
          "variable systems", 
          "features", 
          "maps", 
          "elements", 
          "disease", 
          "form", 
          "collection", 
          "acquisition", 
          "termination", 
          "organization", 
          "group", 
          "transfer", 
          "control", 
          "findings", 
          "relationship", 
          "observations", 
          "forms of conflict", 
          "approach", 
          "radiation", 
          "bonds", 
          "database", 
          "aspects", 
          "discrimination", 
          "total", 
          "comparison", 
          "base", 
          "paradox", 
          "report", 
          "work", 
          "parallel acquisition", 
          "questions", 
          "context", 
          "life", 
          "state", 
          "instances", 
          "swath", 
          "conflict", 
          "research", 
          "trajectories", 
          "interface", 
          "delivery", 
          "possible states", 
          "point", 
          "method", 
          "Harvard University", 
          "new system", 
          "framework", 
          "experimental research", 
          "contextual associations", 
          "such systems", 
          "innovation", 
          "self", 
          "solution", 
          "machine", 
          "collaboration", 
          "problem", 
          "independent solutions", 
          "University", 
          "guilt", 
          "housing", 
          "team", 
          "offenses"
        ], 
        "name": "Evolutionary Analysis and Comparative Genomics of Protein Superfamilies", 
        "recipient": [
          {
            "id": "http://www.grid.ac/institutes/grid.280285.5", 
            "type": "Organization"
          }, 
          {
            "affiliation": {
              "id": "http://www.grid.ac/institutes/None", 
              "name": "NATIONAL LIBRARY OF MEDICINE", 
              "type": "Organization"
            }, 
            "familyName": "IYER", 
            "givenName": "ARAVIND", 
            "id": "sg:person.0627270374.18", 
            "type": "Person"
          }, 
          {
            "member": "sg:person.0627270374.18", 
            "roleName": "PI", 
            "type": "Role"
          }
        ], 
        "sameAs": [
          "https://app.dimensions.ai/details/grant/grant.2726065"
        ], 
        "sdDataset": "grants", 
        "sdDatePublished": "2022-01-01T19:29", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/grant/grant_26.jsonl", 
        "startDate": "2009-01-01", 
        "type": "MonetaryGrant", 
        "url": "http://projectreporter.nih.gov/project_info_description.cfm?aid=10018682"
      }
    ]
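
The record above can be read with any JSON parser. The following is a minimal sketch in Python, assuming a trimmed copy of the record in which only a few fields are reproduced; the names `RECORD_JSON` and `summarize_grant` are illustrative and not part of SciGraph. It resolves the principal investigator by following the `Role` entry's `member` reference to the matching `Person` entry.

```python
import json

# Trimmed copy of the record shown above; only a few fields are reproduced.
RECORD_JSON = """
[
  {
    "id": "sg:grant.2726065",
    "type": "MonetaryGrant",
    "name": "Evolutionary Analysis and Comparative Genomics of Protein Superfamilies",
    "amount": {"currency": "USD", "type": "MonetaryAmount", "value": 11780231.0},
    "startDate": "2009-01-01",
    "endDate": "2019-01-01",
    "recipient": [
      {"id": "http://www.grid.ac/institutes/grid.280285.5", "type": "Organization"},
      {"familyName": "IYER", "givenName": "ARAVIND",
       "id": "sg:person.0627270374.18", "type": "Person"},
      {"member": "sg:person.0627270374.18", "roleName": "PI", "type": "Role"}
    ]
  }
]
"""


def summarize_grant(record: dict) -> dict:
    """Pull out the headline fields of a MonetaryGrant record (hypothetical helper)."""
    # Index Person entries by id so Role entries can be resolved against them.
    people = {r["id"]: r for r in record["recipient"] if r.get("type") == "Person"}
    # Role entries point at a Person via "member"; keep those flagged as PI.
    pis = [
        people[r["member"]]
        for r in record["recipient"]
        if r.get("type") == "Role"
        and r.get("roleName") == "PI"
        and r.get("member") in people
    ]
    return {
        "id": record["id"],
        "amount": record["amount"]["value"],
        "currency": record["amount"]["currency"],
        "years": (record["startDate"][:4], record["endDate"][:4]),
        "pi": [f'{p["givenName"]} {p["familyName"]}' for p in pis],
    }


grant = json.loads(RECORD_JSON)[0]
summary = summarize_grant(grant)
print(summary)
```

Keeping the lookup keyed on `id` avoids relying on the order of entries in `recipient`, which the record does not guarantee.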
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/grant.2726065'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/grant.2726065'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/grant.2726065'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/grant.2726065'
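
The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is a minimal sketch, assuming the endpoint is still reachable and behaves as the curl commands imply; the short format keys and the helper names `build_request` and `fetch` are illustrative, not part of any SciGraph client.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types copied from the curl commands above; the dictionary keys
# are illustrative short names chosen for this sketch.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id: str, fmt: str = "json-ld") -> urllib.request.Request:
    """Build (but do not send) a GET request for one serialization."""
    return urllib.request.Request(
        BASE_URL + record_id, headers={"Accept": ACCEPT[fmt]}
    )


def fetch(record_id: str, fmt: str = "json-ld", timeout: float = 30.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request needs no network access; fetch() does.
req = build_request("grant.2726065", "turtle")
print(req.full_url, req.get_header("Accept"))
```

Separating `build_request` from `fetch` lets the header selection be checked offline; calling `fetch("grant.2726065")` would perform the actual download.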


     

    This table displays all metadata directly associated to this object as RDF triples.

    307 TRIPLES      19 PREDICATES      284 URIs      277 LITERALS      5 BLANK NODES

    Subject Predicate Object
    1 sg:grant.2726065 schema:about anzsrc-for:06
    2 schema:amount Nd86fe4d2f98e4b4084646ce5765a4b0f
    3 schema:description The provenance of several components of major uniquely eukaryotic molecular machines are increasingly being traced back to prokaryotic biological conflict systems. L Aravind and his team demonstrated that the N-terminal single-stranded DNA-binding domain from the anti-restriction protein ArdC, deployed by bacterial mobile elements against their host, was independently acquired twice by eukaryotes, giving rise to the DNA-binding domains of XPC/Rad4 and the Tc-38-like proteins in the stem kinetoplastid. In both instances, the ArdC-N domain tandemly duplicated forming an extensive DNA-binding interface. In XPC/Rad4, the ArdC-N domains (BHDs) also fused to the inactive transglutaminase domain of a peptide-N-glycanase ultimately derived from an archaeal conflict system. Alongside, they delineated several parallel acquisitions from conjugative elements/bacteriophages that gave rise to key components of the kinetoplast DNA (kDNA) replication apparatus. These findings resolve two outstanding questions in eukaryote biology: (1) the origin of the unique DNA lesion-recognition component of NER and (2) origin of the unusual, plasmid-like features of kDNA. The evolution of release factors catalyzing the hydrolysis of the final peptidyl-tRNA bond and the release of the polypeptide from the ribosome has been a longstanding paradox. While the components of the translation apparatus are generally well-conserved across extant life, structurally unrelated release factor peptidyl hydrolases (RF-PHs) emerged in the stems of the bacterial and archaeo-eukaryotic lineages. L. Aravind and his team analyzed the diversification of RF-PH domains within the broader evolutionary framework of the translation apparatus. Thus, they reconstructed the possible state of translation termination in the Last Universal Common Ancestor with possible tRNA-like terminators. 
Further, evolutionary trajectories of the several auxiliary release factors in ribosome quality control (RQC) and rescue pathways point to multiple independent solutions to this problem and frequent transfers between superkingdoms, including the recently characterized ArfT, which is more widely distributed across life than previously appreciated. The eukaryotic RQC system was pieced together from components with disparate provenance, which include the long-sought-after Vms1/ANKZF1 RF-PH of bacterial origin. They also uncovered an under-appreciated evolutionary driver of innovation in rescue pathways: effectors deployed in biological conflicts that target the ribosome. At least three rescue pathways (centered on the prfH/RFH, baeRF-1, and C12orf65 RF-PH domains) were likely innovated in response to such conflicts. Numerous, diverse, highly variable defense and offense genetic systems are encoded in most bacterial genomes and are involved in various forms of conflict among competing microbes or their eukaryotic hosts. In collaboration with Dr. Eugene Koonin's group, Dr. L. Aravind focused on the offense and self-versus-nonself discrimination systems encoded by archaeal genomes that so far have remained largely uncharacterized and unannotated. Specifically, they analyzed archaeal genomic loci encoding polymorphic and related toxin systems and ribosomally synthesized antimicrobial peptides. Using sensitive methods for sequence comparison and the guilt-by-association approach, they identified such systems in 141 archaeal genomes. These toxins can be classified into four major groups based on the structure of the components involved in the toxin delivery. The toxin domains are often shared between and within each system. They revisited the halocin families and substantially expanded the halocin C8 family, which was identified in diverse archaeal genomes and also certain bacteria. 
Finally, they employed features of protein sequences and genomic locus organization characteristic of archaeocins and polymorphic toxins to identify candidates for analogous but not necessarily homologous systems among uncharacterized protein families. This work confidently predicts that more than 1,600 archaeal proteins, currently annotated as hypothetical in public databases, are components of conflict and self-versus-nonself discrimination systems. Diverse and highly variable systems involved in biological conflicts and self-versus-nonself discrimination are ubiquitous in bacteria but much less studied in archaea. They performed comprehensive comparative genomic analyses of the archaeal systems that share components with analogous bacterial systems and proposed an approach to identify new systems that could be involved in these functions. They predicted polymorphic toxin systems in 141 archaeal genomes and identified new, archaea-specific toxin and immunity protein families. These systems are widely represented in archaea and are predicted to play major roles in interactions between species and in intermicrobial conflicts. This work is expected to stimulate experimental research to advance the understanding of poorly characterized major aspects of archaeal biology. A diverse collection of enzymes comprising the protocatechuate dioxygenases (PCADs) has been characterized in several extradiol aromatic compound degradation pathways. Structural studies have shown a relationship between PCADs and the more broadly-distributed, functionally enigmatic Memo domain linked to several human diseases. To better understand the evolution of this PCAD-Memo protein superfamily, L. Aravind and his team explored its structural and functional determinants to establish a unified evolutionary framework, identifying 15 clearly-delineable families, including a previously-underappreciated diversity in five Memo clade families. 
They placed the superfamily's origin within the greater radiation of the nucleoside phosphorylase/hydrolase-peptide/amidohydrolase fold prior to the last universal common ancestor of all extant organisms. In addition to identifying active-site residues across the superfamily, they described three distinct, structurally-variable regions emanating from the core scaffold often housing conserved residues specific to individual families. These were predicted to contribute to the active-site pocket, potentially in substrate specificity and allosteric regulation. They also identified several previously-undescribed conserved genome contexts, providing insight into potentially novel substrates in PCAD clade families. They extended known conserved contextual associations for the Memo clade beyond previously-described associations with the AMMECR1 domain and a radical S-adenosylmethionine family domain. These observations point to two distinct yet potentially overlapping contexts wherein the elusive molecular function of the Memo domain could be finally resolved, thereby linking it to nucleotide base and aliphatic isoprenoid modification. In total, this report throws light on the functions of large swaths of the experimentally-uncharacterized PCAD-Memo families. mRNAs are regulated by nucleotide modifications that influence their cellular fate. Two of the most abundant modified nucleotides are N6-methyladenosine (m6A), found within mRNAs, and N6,2'-O-dimethyladenosine (m6Am), which is found at the first transcribed nucleotide. Distinguishing these modifications in mapping studies has been difficult. L. Aravind identified PCIF1 as the methyltransferase that catalyzes the m6Am modification. With Dr. Eric Greer's group at Harvard University, he worked to biochemically characterize PCIF1, which generates m6Am. They found that PCIF1 binds and is dependent on the m7G cap. By depleting PCIF1, they generated transcriptome-wide maps that distinguish m6Am and m6A. 
They showed that m6A and m6Am misannotations arise from mRNA isoforms with alternative transcription start sites (TSSs). This explains the biological significance of this RNA modification.
    4 schema:endDate 2019-01-01
    5 schema:funder grid-institutes:grid.280285.5
    6 schema:identifier N341eb94db7ea4b519174ffc01252021f
    7 N4fc14f79ab2e486f89ec572a09a314fa
    8 schema:inLanguage en
    9 schema:keywords ARDC
    10 Aravind
    11 DNA replication apparatus
    12 DNA-binding domain
    13 DNA-binding interface
    14 Harvard University
    15 NER
    16 PCIF1
    17 RNA modifications
    18 Rad4
    19 University
    20 XPC/Rad4
    21 acquisition
    22 active site pocket
    23 active site residues
    24 addition
    25 allosteric regulation
    26 alternative transcription start sites
    27 amidohydrolase
    28 analysis
    29 ancestor
    30 antimicrobial peptides
    31 apparatus
    32 approach
    33 archaea
    34 archaeal biology
    35 archaeal genomes
    36 archaeal proteins
    37 archaeal systems
    38 archaeocins
    39 aromatic compound degradation pathways
    40 aspects
    41 association
    42 association approach
    43 bacteria
    44 bacterial genomes
    45 bacterial mobile elements
    46 bacterial origin
    47 bacterial systems
    48 bacteriophages
    49 base
    50 biological conflict systems
    51 biological conflicts
    52 biological significance
    53 biology
    54 bonds
    55 broader evolutionary framework
    56 candidates
    57 cap
    58 cellular fate
    59 certain bacteria
    60 clade
    61 collaboration
    62 collection
    63 common ancestor
    64 comparative genomic analysis
    65 comparative genomics
    66 comparison
    67 components
    68 components of conflict
    69 comprehensive comparative genomic analysis
    70 conflict
    71 conflict systems
    72 context
    73 contextual associations
    74 control
    75 core scaffold
    76 database
    77 defense
    78 degradation pathway
    79 delivery
    80 determinants
    81 dimethyladenosine
    82 dioxygenases
    83 discrimination
    84 discrimination system
    85 disease
    86 disparate provenance
    87 diverse collection
    88 diversification
    89 diversity
    90 domain
    91 drivers
    92 effectors
    93 elements
    94 elusive molecular functions
    95 enzyme
    96 eukaryote biology
    97 eukaryotes
    98 eukaryotic hosts
    99 evolution
    100 evolutionary analysis
    101 evolutionary drivers
    102 evolutionary framework
    103 evolutionary trajectories
    104 experimental research
    105 extant life
    106 extant organisms
    107 factors
    108 family
    109 family domain
    110 fate
    111 features
    112 findings
    113 first transcribed
    114 form
    115 forms of conflict
    116 framework
    117 frequent transfer
    118 function
    119 functional determinants
    120 genetic system
    121 genome
    122 genome context
    123 genomic analysis
    124 genomic loci
    125 genomics
    126 glycanase
    127 greater radiation
    128 group
    129 guilt
    130 homologous system
    131 host
    132 housing
    133 human diseases
    134 hydrolases
    135 hydrolysis
    136 independent solutions
    137 individual families
    138 innovation
    139 insights
    140 instances
    141 interaction
    142 interface
    143 isoforms
    144 isoprenoid modification
    145 kDNA
    146 key component
    147 kinetoplastids
    148 large swaths
    149 last universal common ancestor
    150 life
    151 lineages
    152 loci
    153 locus organization
    154 longstanding paradox
    155 m6A
    156 m6A.
    157 m6Am
    158 m7G cap
    159 mRNA
    160 mRNA isoforms
    161 machine
    162 major aspects
    163 major groups
    164 major role
    165 mapping studies
    166 maps
    167 method
    168 methyltransferase
    169 microbes
    170 misannotations
    171 mobile elements
    172 modification
    173 molecular functions
    174 molecular machines
    175 most bacterial genomes
    176 multiple independent solutions
    177 new system
    178 nonself discrimination
    179 novel substrate
    180 nucleotide modifications
    181 nucleotides
    182 observations
    183 offenses
    184 organisms
    185 organization
    186 origin
    187 outstanding questions
    188 paradox
    189 parallel acquisition
    190 pathway
    191 pathways points
    192 peptides
    193 pocket
    194 point
    195 polymorphic toxin systems
    196 polymorphic toxins
    197 polypeptide
    198 possible states
    199 problem
    200 protein
    201 protein family
    202 protein sequences
    203 protein superfamilies
    204 provenance
    205 public databases
    206 quality control
    207 questions
    208 radiation
    209 region
    210 regulation
    211 relationship
    212 release
    213 release factors
    214 replication apparatus
    215 report
    216 rescue pathway
    217 research
    218 residues
    219 response
    220 ribosome quality control
    221 ribosomes
    222 rise
    223 role
    224 scaffolds
    225 self
    226 sensitive method
    227 sequence
    228 sequence comparison
    229 significance
    230 sites
    231 solution
    232 species
    233 specificity
    234 start site
    235 state
    236 stem
    237 structural studies
    238 structurally variable regions
    239 structure
    240 study
    241 substrate
    242 substrate specificity
    243 such conflicts
    244 such systems
    245 superfamily
    246 superkingdoms
    247 swath
    248 system
    249 team
    250 termination
    251 terminator
    252 total
    253 toxin
    254 toxin delivery
    255 toxin domain
    256 toxin systems
    257 trajectories
    258 transcribed
    259 transcription start site
    260 transcriptome-wide map
    261 transfer
    262 transglutaminase domain
    263 translation apparatus
    264 translation termination
    265 uncharacterized protein families
    266 underappreciated diversity
    267 understanding
    268 universal common ancestor
    269 variable defenses
    270 variable systems
    271 work
    272 schema:name Evolutionary Analysis and Comparative Genomics of Protein Superfamilies
    273 schema:recipient N7bfa10b49544410494b1f0a1003e8624
    274 sg:person.0627270374.18
    275 grid-institutes:grid.280285.5
    276 schema:sameAs https://app.dimensions.ai/details/grant/grant.2726065
    277 schema:sdDatePublished 2022-01-01T19:29
    278 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    279 schema:sdPublisher Na7579e5be8c542b784e9180873019ebe
    280 schema:startDate 2009-01-01
    281 schema:url http://projectreporter.nih.gov/project_info_description.cfm?aid=10018682
    282 sgo:license sg:explorer/license/
    283 sgo:sdDataset grants
    284 rdf:type schema:MonetaryGrant
    285 N341eb94db7ea4b519174ffc01252021f schema:name dimensions_id
    286 schema:value grant.2726065
    287 rdf:type schema:PropertyValue
    288 N4fc14f79ab2e486f89ec572a09a314fa schema:name nih_id
    289 schema:value ZIALM594244
    290 rdf:type schema:PropertyValue
    291 N7bfa10b49544410494b1f0a1003e8624 schema:member sg:person.0627270374.18
    292 schema:roleName PI
    293 rdf:type schema:Role
    294 Na7579e5be8c542b784e9180873019ebe schema:name Springer Nature - SN SciGraph project
    295 rdf:type schema:Organization
    296 Nd86fe4d2f98e4b4084646ce5765a4b0f schema:currency USD
    297 schema:value 11780231.0
    298 rdf:type schema:MonetaryAmount
    299 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
    300 rdf:type schema:DefinedTerm
    301 sg:person.0627270374.18 schema:affiliation grid-institutes:None
    302 schema:familyName IYER
    303 schema:givenName ARAVIND
    304 rdf:type schema:Person
    305 grid-institutes:None schema:name NATIONAL LIBRARY OF MEDICINE
    306 rdf:type schema:Organization
    307 grid-institutes:grid.280285.5 schema:Organization
     



