Evolutionary Analysis and Comparative Genomics of Protei


Ontology type: schema:MonetaryGrant     


Grant Info

YEARS

2004-2005

FUNDING AMOUNT

N/A

ABSTRACT

1) Natural history and classification of the Helix-turn-helix domains. The helix-turn-helix (HTH) domain is a common denominator in basal and specific transcription factors from the three super-kingdoms of life. At its core, the domain comprises an open tri-helical bundle, which typically binds DNA with the 3rd helix. Drawing on the wealth of data that has accumulated over two decades since the discovery of the domain, we present an overview of the natural history of the HTH domain from the viewpoint of structural analysis and comparative genomics. In structural terms, the HTH domains have developed several elaborations on the basic 3-helical core, such as the tetra-helical bundle, the winged-helix and the ribbon-helix-helix type configurations. In functional terms, the HTH domains are present in the most prevalent transcription factors of all prokaryotic genomes and some eukaryotic genomes. They have been recruited to a wide range of functions beyond transcription regulation, which include DNA repair and replication, RNA metabolism and protein-protein interactions in diverse signaling contexts. Beyond their basic role in mediating macromolecular interactions, the HTH domains have also been incorporated into the catalytic domains of diverse enzymes. We discuss the general domain architectural themes that have arisen amongst the HTH domains as a result of their recruitment to these diverse functions. We present a natural classification, higher-order relationships and phyletic pattern analysis of all the major families of HTH domains. This reconstruction suggests that there were at least 6-11 different HTH domains in the last universal common ancestor of all life forms, which covered much of the structural diversity and part of the functional versatility of the extant representatives of this domain. In prokaryotes the total number of HTH domains per genome shows a strong power-equation type scaling with the gene number per genome. However, the HTH domains in two-component signaling pathways show a linear scaling with gene number, in contrast to the non-linear scaling of HTH domains in single-component systems and sigma factors. These observations point to distinct evolutionary forces in the emergence of different signaling systems with HTH transcription factors. The archaea and bacteria share a number of ancient families of specific HTH transcription factors. However, they do not share any orthologous HTH proteins in the basal transcription apparatus. This differential relationship of their basal and specific transcriptional machinery poses an apparent conundrum regarding the origins of their transcription apparatus.

2) Origin of acetylcholine-receptor type ligand-gated channels of animals. Acetylcholine receptor type ligand-gated ion channels (ART-LGIC; also known as Cys-loop receptors) are a superfamily of proteins that includes the receptors for major neurotransmitters such as acetylcholine, serotonin, glycine, GABA, glutamate and histamine, and for Zn2+ ions. They play a central role in fast synaptic signaling in animal nervous systems and so far have not been found outside of the Metazoa. We have identified homologs of ART-LGICs in several bacteria and a single archaeal genus, Methanosarcina. The homology between the animal receptors and the prokaryotic homologs spans the entire length of the former, including both the ligand-binding and channel-forming transmembrane domains. A sequence-structure analysis using the structure of Lymnaea stagnalis acetylcholine-binding protein and the newly detected prokaryotic versions indicates the presence of at least one aromatic residue in the ligand-binding boxes of almost all representatives of the superfamily. Investigation of the domain architectures of the bacterial forms shows that they are often fused to other small-molecule-binding domains, such as the periplasmic binding protein superfamily I (PBP-I), Cache and MCP-N domains. Some of the bacterial forms also occur in predicted operons with the genes of the PBP-II superfamily and the Cache domains. Analysis of phyletic patterns suggests that the ART-LGICs are currently absent in all eukaryotic lineages other than animals. Moreover, phylogenetic analysis and conserved sequence motifs also suggest that a subset of the bacterial forms is closer to the metazoan forms. From the information from the bacterial forms we infer that cation-pi or hydrophobic interactions with the ligand are likely to be a pervasive feature of the entire superfamily, even though the individual residues involved in the process may vary. The conservation pattern in the channel-forming transmembrane domains also suggests similar channel-gating mechanisms in the prokaryotic versions. From the distribution of charged residues in the prokaryotic M2 transmembrane segments, we expect that there will be examples of both cation and anion selectivity within the prokaryotic members. Contextual connections suggest that the prokaryotic forms may function as chemotactic receptors for low molecular weight solutes. The phyletic patterns and phylogenetic relationships suggest the possibility that the metazoan receptors emerged through an early lateral transfer from a prokaryotic source, before the divergence of extant metazoan lineages.

3) Apicomplexan transcription factors. The comparative genomics of apicomplexans, such as the malarial parasite Plasmodium, the cattle parasite Theileria and the emerging human parasite Cryptosporidium, has suggested an unexpected paucity of specific transcription factors (TFs) with DNA binding domains that are closely related to those found in the major families of TFs from other eukaryotes. This apparent lack of specific TFs is paradoxical, given that the apicomplexans show a complex developmental cycle in one or more hosts and a reproducible pattern of differential gene expression in the course of this cycle. Using sensitive sequence profile searches, we show that the apicomplexans possess a lineage-specific expansion of a novel family of proteins with a version of the AP2 (Apetala2)-integrase DNA binding domain, which is present in numerous plant TFs. About 20-27 members of this apicomplexan AP2 (ApiAP2) family are encoded in different apicomplexan genomes, with each protein containing one to four copies of the AP2 DNA binding domain. Using gene expression data from Plasmodium falciparum, we show that guilds of ApiAP2 genes are expressed in different stages of intraerythrocytic development. By analogy to the plant AP2 proteins and based on the expression patterns, we predict that the ApiAP2 proteins are likely to function as previously unknown specific TFs in the apicomplexans and regulate the progression of their developmental cycle. In addition to the ApiAP2 family, we also identified two other novel families of AP2 DNA binding domains in bacteria and transposons. Using structure similarity searches, we also identified divergent versions of the AP2-integrase DNA binding domain fold in the DNA binding region of the PI-SceI homing endonuclease and the C-terminal domain of the pleckstrin homology (PH) domain-like modules of eukaryotes. Integrating these findings, we present a reconstruction of the evolutionary scenario of the AP2-integrase DNA binding domain fold, which suggests that it underwent multiple independent combinations with different types of mobile endonucleases or recombinases. It appears that the eukaryotic versions have emerged from versions of the domain associated with mobile elements, followed by independent lineage-specific expansions, which accompanied their recruitment to transcription regulation functions.
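The contrast in part 1 of the abstract, between power-type scaling of total HTH domains and linear scaling of two-component HTH domains, can be illustrated with a log-log fit. The sketch below is only an illustration: the gene counts, the prefactors and the exponent 1.8 are made-up values (the abstract reports no numbers), and `fit_power_law` is a hypothetical helper, not the method used in the funded work. If y = a * x^b, then log y is linear in log x with slope b, so a slope near 1 indicates linear scaling and a slope clearly above 1 indicates non-linear scaling.

```python
import math


def fit_power_law(gene_counts, domain_counts):
    """Least-squares fit of log(y) = log(a) + b * log(x); returns (a, b)."""
    xs = [math.log(x) for x in gene_counts]
    ys = [math.log(y) for y in domain_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                        # scaling exponent
    a = math.exp(mean_y - b * mean_x)    # prefactor
    return a, b


# Synthetic, purely illustrative data (assumptions, not values from the grant).
genes = [500, 1000, 2000, 4000, 8000]
hth_total = [1e-4 * g ** 1.8 for g in genes]      # assumed power-type scaling
hth_two_component = [0.005 * g for g in genes]    # assumed linear scaling

a1, b1 = fit_power_law(genes, hth_total)
a2, b2 = fit_power_law(genes, hth_two_component)
print(f"all HTH domains: exponent = {b1:.2f}")
print(f"two-component HTH domains: exponent = {b2:.2f}")
```

With real per-genome counts the points would scatter around the fitted line, so the estimated exponent would come with an uncertainty that this sketch does not compute.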

URL

http://projectreporter.nih.gov/project_info_description.cfm?aid=7148173

JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "type": "DefinedTerm"
      }
    ], 
    "description": "1) Natural history and classification of the Helix-turn-helix domains. The helix-turn-helix (HTH) domain is a common denominator in basal and specific transcription factors from the three super-kingdoms of life. At its core, the domain comprises of an open tri-helical bundle, which typically binds DNA with the 3rd helix. Drawing on the wealth of data that has accumulated over two decades since the discovery of the domain, we present an overview of the natural history of the HTH domain from the viewpoint of structural analysis and comparative genomics. In structural terms, the HTH domains have developed several elaborations on the basic 3-helical core, such as the tetra-helical bundle, the winged-helix and the ribbon-helix-helix type configurations. In functional terms, the HTH domains are present in the most prevalent transcription factors of all prokaryotic genomes and some eukaryotic genomes. They have been recruited to a wide range of functions beyond transcription regulation, which include DNA repair and replication, RNA metabolism and protein-protein interactions in diverse signaling contexts. Beyond their basic role in mediating macromolecular interactions, the HTH domains have also been incorporated into the catalytic domains of diverse enzymes. We discuss the general domain architectural themes that have arisen amongst the HTH domains as a result of their recruitment to these diverse functions. We present a natural classification, higher-order relationships and phyletic pattern analysis of all the major families of HTH domains. This reconstruction suggests that there were at least 6-11 different HTH domains in the last universal common ancestor of all life forms, which covered much of the structural diversity and part of the functional versatility of the extant representatives of this domain. In prokaryotes the total number of HTH domains per genome shows a strong power-equation type scaling with the gene number per genome. 
However, the HTH domains in two-component signaling pathways show a linear scaling with gene number, in contrast to the non-linear scaling of HTH domains in single-component systems and sigma factors. These observations point to distinct evolutionary forces in the emergence of different signaling systems with HTH transcription factors. The archaea and bacteria share a number of ancient families of specific HTH transcription factors. However, they do not share any orthologous HTH proteins in the basal transcription apparatus. This differential relationship of their basal and specific transcriptional machinery poses an apparent conundrum regarding the origins of their transcription apparatus. 2) Origin of acetylcholine-receptor type ligand gated channels of animals. Acetylcholine receptor type ligand-gated ion channels (ART-LGIC; also known as Cys-loop receptors) are a superfamily of proteins that include the receptors for major neurotransmitters such as acetylcholine, serotonin, glycine, GABA, glutamate and histamine, and for Zn2+ ions. They play a central role in fast synaptic signaling in animal nervous systems and so far have not been found outside of the Metazoa. We have identified homologs of ART-LGICs in several bacteria and a single archaeal genus, Methanosarcina. The homology between the animal receptors and the prokaryotic homologs spans the entire length of the former, including both the ligand-binding and channel-forming transmembrane domains. A sequence-structure analysis using the structure of Lymnaea stagnalis acetylcholine-binding protein and the newly detected prokaryotic versions indicates the presence of at least one aromatic residue in the ligand-binding boxes of almost all representatives of the superfamily. Investigation of the domain architectures of the bacterial forms shows that they may often show fusions with other small-molecule-binding domains, such as the periplasmic binding protein superfamily I (PBP-I), Cache and MCP-N domains. 
Some of the bacterial forms also occur in predicted operons with the genes of the PBP-II superfamily and the Cache domains. Analysis of phyletic patterns suggests that the ART-LGICs are currently absent in all other eukaryotic lineages except animals. Moreover, phylogenetic analysis and conserved sequence motifs also suggest that a subset of the bacterial forms is closer to the metazoan forms. From the information from the bacterial forms we infer that cation-pi or hydrophobic interactions with the ligand are likely to be a pervasive feature of the entire superfamily, even though the individual residues involved in the process may vary. The conservation pattern in the channel-forming transmembrane domains also suggests similar channel-gating mechanisms in the prokaryotic versions. From the distribution of charged residues in the prokaryotic M2 transmembrane segments, we expect that there will be examples of both cation and anion selectivity within the prokaryotic members. Contextual connections suggest that the prokaryotic forms may function as chemotactic receptors for low molecular weight solutes. The phyletic patterns and phylogenetic relationships suggest the possibility that the metazoan receptors emerged through an early lateral transfer from a prokaryotic source, before the divergence of extant metazoan lineages. 3) Apicomplexan transcription factors. The comparative genomics of apicomplexans, such as the malarial parasite Plasmodium, the cattle parasite Theileria and the emerging human parasite Cryptosporidium, have suggested an unexpected paucity of specific transcription factors (TFs) with DNA binding domains that are closely related to those found in the major families of TFs from other eukaryotes. This apparent lack of specific TFs is paradoxical, given that the apicomplexans show a complex developmental cycle in one or more hosts and a reproducible pattern of differential gene expression in course of this cycle. 
Using sensitive sequence profile searches, we show that the apicomplexans possess a lineage-specific expansion of a novel family of proteins with a version of the AP2 (Apetala2)-integrase DNA binding domain, which is present in numerous plant TFs. About 20-27 members of this apicomplexan AP2 (ApiAP2) family are encoded in different apicomplexan genomes, wth each protein containing one to four copies of the AP2 DNA binding domain. Using gene expression data from Plasmodium falciparum, we show that guilds of ApiAP2 genes are expressed in different stages of intraerythrocytic development. By analogy to the plant AP2 proteins and based on the expression patterns, we predict that the ApiAP2 proteins are likely to function as previously unknown specific TFs in the apicomplexans and regulate the progression of their developmental cycle. In addition to the ApiAP2 family, we also identified two other novel families of AP2 DNA binding domains in bacteria and transposons. Using structure similarity searches, we also identified divergent versions of the AP2-integrase DNA binding domain fold in the DNA binding region of the PI-SceI homing endonuclease and the C-terminal domain of the pleckstrin homology (PH) domain-like modules of eukaryotes. Integrating these findings, we present a reconstruction of the evolutionary scenario of the AP2-integrase DNA binding domain fold, which suggests that it underwent multiple independent combinations with different types of mobile endonucleases or recombinases. It appears that the eukaryotic versions have emerged from versions of the domain associated with mobile elements, followed by independent lineage-specific expansions, which accompanied their recruitment to transcription regulation functions.", 
    "endDate": "2005-01-01", 
    "funder": {
      "id": "http://www.grid.ac/institutes/grid.280285.5", 
      "type": "Organization"
    }, 
    "id": "sg:grant.2720300", 
    "identifier": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "grant.2720300"
        ]
      }, 
      {
        "name": "nih_id", 
        "type": "PropertyValue", 
        "value": [
          "Z01LM092504"
        ]
      }
    ], 
    "inLanguage": [
      "en"
    ], 
    "keywords": [
      "specific transcription factors", 
      "channel-forming transmembrane domains", 
      "lineage-specific expansions", 
      "HTH domain", 
      "transcription factors", 
      "comparative genomics", 
      "AP2 DNA", 
      "phyletic patterns", 
      "transcription apparatus", 
      "gene number", 
      "prokaryotic versions", 
      "transmembrane domain", 
      "bacterial forms", 
      "two-component signaling pathways", 
      "developmental cycle", 
      "last universal common ancestor", 
      "sensitive sequence profile searches", 
      "major families", 
      "phyletic pattern analysis", 
      "distinct evolutionary forces", 
      "Apicomplexan AP2 (ApiAP2) family", 
      "plant transcription factors", 
      "sequence profile searches", 
      "universal common ancestor", 
      "basal transcription apparatus", 
      "periplasmic binding protein", 
      "protein-protein interactions", 
      "complex developmental cycle", 
      "superfamily of proteins", 
      "transcription regulation functions", 
      "ligand-gated ion channels", 
      "novel family", 
      "malarial parasite Plasmodium", 
      "sequence-structure analysis", 
      "differential gene expression", 
      "acetylcholine-binding protein", 
      "different signaling systems", 
      "animal nervous systems", 
      "gene expression data", 
      "M2 transmembrane segment", 
      "metazoan forms", 
      "AP2 family", 
      "ApiAP2 genes", 
      "ApiAP2 proteins", 
      "ApiAP2 family", 
      "eukaryotic lineages", 
      "fast synaptic signaling", 
      "apicomplexan genomes", 
      "metazoan lineages", 
      "prokaryotic forms", 
      "eukaryotic genomes", 
      "evolutionary forces", 
      "eukaryotic versions", 
      "transcriptional machinery", 
      "transcription regulation", 
      "prokaryotic members", 
      "prokaryotic homolog", 
      "HTH protein", 
      "phylogenetic relationships", 
      "sigma factor", 
      "domain architecture", 
      "prokaryotic genomes", 
      "conservation patterns", 
      "helix domain", 
      "RNA metabolism", 
      "transmembrane segments", 
      "AP2 proteins", 
      "evolutionary analysis", 
      "homing endonuclease", 
      "common ancestor", 
      "structure similarity search", 
      "sequence motifs", 
      "phylogenetic analysis", 
      "ancient family", 
      "prokaryotic sources", 
      "extant representatives", 
      "archaeal genera", 
      "diverse enzymes", 
      "catalytic domain", 
      "DNA repair", 
      "terminal domain", 
      "intraerythrocytic development", 
      "apicomplexans", 
      "evolutionary scenario", 
      "life forms", 
      "profile searches", 
      "diverse functions", 
      "animal receptors", 
      "genome", 
      "Cache domains", 
      "lateral transfer", 
      "functional versatility", 
      "expression patterns", 
      "gene expression", 
      "parasite Plasmodium", 
      "signaling pathways", 
      "domain comprises", 
      "expression data", 
      "individual residues", 
      "signaling system", 
      "binding protein", 
      "genomics", 
      "macromolecular interactions", 
      "aromatic residues", 
      "more hosts", 
      "mobile elements", 
      "synaptic signaling", 
      "ion channels", 
      "protein", 
      "eukaryotes", 
      "DNA", 
      "structural diversity", 
      "homolog", 
      "divergent versions", 
      "superfamily", 
      "functional terms", 
      "lineages", 
      "natural classification", 
      "regulation function", 
      "unexpected paucity", 
      "genes", 
      "chemotactic receptors", 
      "residues", 
      "bacteria", 
      "endonuclease", 
      "Plasmodium falciparum", 
      "central role", 
      "wealth of data", 
      "apparent conundrum", 
      "similarity search", 
      "family", 
      "reproducible pattern", 
      "structural terms", 
      "anion selectivity", 
      "low molecular weight solutes", 
      "receptors", 
      "parasite Cryptosporidium", 
      "domain", 
      "recruitment", 
      "metazoans", 
      "hydrophobic interactions", 
      "archaea", 
      "operon", 
      "apparent lack", 
      "guilds", 
      "pervasive feature", 
      "recombinases", 
      "structural analysis", 
      "transposon", 
      "protei", 
      "ancestor", 
      "architectural themes", 
      "higher-order relationships", 
      "AP2", 
      "homology", 
      "genus", 
      "entire length", 
      "nervous system", 
      "HTH", 
      "machinery", 
      "signaling", 
      "Methanosarcina", 
      "major neurotransmitter", 
      "members", 
      "diversity", 
      "motif", 
      "helix", 
      "different stages", 
      "divergence", 
      "enzyme", 
      "animals", 
      "regulation", 
      "interaction", 
      "pathway", 
      "Theileria", 
      "patterns", 
      "host", 
      "Plasmodium", 
      "basic role", 
      "role", 
      "copies", 
      "replication", 
      "pattern analysis", 
      "molecular weight solutes", 
      "expression", 
      "function", 
      "ligands", 
      "metabolism", 
      "cycle", 
      "origin", 
      "falciparum", 
      "weight solutes", 
      "glycine", 
      "representatives", 
      "discovery", 
      "non-linear scaling", 
      "common denominator", 
      "wide range", 
      "Cryptosporidium", 
      "cattle", 
      "glutamate", 
      "fusion", 
      "factors", 
      "apparatus", 
      "natural history", 
      "analysis", 
      "mechanism", 
      "repair", 
      "bundles", 
      "number", 
      "total number", 
      "form", 
      "neurotransmitters", 
      "progression", 
      "expansion", 
      "region", 
      "types", 
      "box", 
      "contrast", 
      "stage", 
      "relationship", 
      "independent combinations", 
      "emergence", 
      "subset", 
      "Zn2", 
      "channels", 
      "segments", 
      "development", 
      "presence", 
      "conundrum", 
      "versatility", 
      "comprises", 
      "GABA", 
      "structure", 
      "addition", 
      "data", 
      "length", 
      "distribution", 
      "different types", 
      "elements", 
      "system", 
      "solutes", 
      "combination", 
      "process", 
      "transfer", 
      "core", 
      "findings", 
      "lack", 
      "wealth", 
      "history", 
      "decades", 
      "overview", 
      "part", 
      "paucity", 
      "source", 
      "acetylcholine", 
      "observations", 
      "search", 
      "contextual connections", 
      "selectivity", 
      "architecture", 
      "reconstruction", 
      "module", 
      "possibility", 
      "range", 
      "elaboration", 
      "features", 
      "information", 
      "type ligands", 
      "results", 
      "classification", 
      "investigation", 
      "histamine", 
      "example", 
      "context", 
      "analogy", 
      "scaling", 
      "cations", 
      "denominator", 
      "wth", 
      "ions", 
      "connection", 
      "force", 
      "scenarios", 
      "terms", 
      "life", 
      "type configuration", 
      "version", 
      "course", 
      "differential relationships", 
      "viewpoint", 
      "single-component systems", 
      "configuration", 
      "themes", 
      "cache", 
      "linear scaling"
    ], 
    "name": "Evolutionary Analysis and Comparative Genomics of Protei", 
    "recipient": [
      {
        "id": "http://www.grid.ac/institutes/grid.280285.5", 
        "type": "Organization"
      }, 
      {
        "affiliation": {
          "id": "http://www.grid.ac/institutes/None", 
          "name": "NATIONAL LIBRARY OF MEDICINE", 
          "type": "Organization"
        }, 
        "familyName": "ARAVIND", 
        "givenName": "L", 
        "id": "sg:person.01106662166.38", 
        "type": "Person"
      }, 
      {
        "member": "sg:person.01106662166.38", 
        "roleName": "PI", 
        "type": "Role"
      }
    ], 
    "sameAs": [
      "https://app.dimensions.ai/details/grant/grant.2720300"
    ], 
    "sdDataset": "grants", 
    "sdDatePublished": "2022-05-10T10:57", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220509/entities/gbq_results/grant/grant_14.jsonl", 
    "startDate": "2004-01-01", 
    "type": "MonetaryGrant", 
    "url": "http://projectreporter.nih.gov/project_info_description.cfm?aid=7148173"
  }
]
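Because the record above is ordinary JSON, its main fields can be read with a standard JSON parser. The following is a minimal sketch, assuming the record has been saved or fetched as text; `RECORD_EXCERPT` is a trimmed copy of the fields shown above, and `summarize_grant` is a hypothetical helper name, not part of any SciGraph tooling. It pairs each `Person` entry in `recipient` with the `Role` entry that points to it through `member`.

```python
import json

# Trimmed excerpt of the record above (description and keywords omitted).
RECORD_EXCERPT = """
[
  {
    "id": "sg:grant.2720300",
    "name": "Evolutionary Analysis and Comparative Genomics of Protei",
    "startDate": "2004-01-01",
    "endDate": "2005-01-01",
    "identifier": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["grant.2720300"]},
      {"name": "nih_id", "type": "PropertyValue", "value": ["Z01LM092504"]}
    ],
    "recipient": [
      {"id": "http://www.grid.ac/institutes/grid.280285.5", "type": "Organization"},
      {"familyName": "ARAVIND", "givenName": "L",
       "id": "sg:person.01106662166.38", "type": "Person"},
      {"member": "sg:person.01106662166.38", "roleName": "PI", "type": "Role"}
    ],
    "type": "MonetaryGrant"
  }
]
"""


def summarize_grant(record):
    """Collect identifiers, dates, and investigators from one grant record."""
    recipients = record.get("recipient", [])
    # Map identifier names (e.g. "nih_id") to their first value.
    identifiers = {e["name"]: e["value"][0] for e in record.get("identifier", [])}
    # Role entries reference a person by id through "member".
    roles = {r["member"]: r["roleName"] for r in recipients if r.get("type") == "Role"}
    investigators = [
        {"name": f'{r["givenName"]} {r["familyName"]}', "role": roles.get(r["id"])}
        for r in recipients
        if r.get("type") == "Person"
    ]
    return {
        "id": record["id"],
        "name": record["name"],
        "start": record.get("startDate"),
        "end": record.get("endDate"),
        "nih_id": identifiers.get("nih_id"),
        "investigators": investigators,
    }


grant = json.loads(RECORD_EXCERPT)[0]
summary = summarize_grant(grant)
print(summary)
```

The sketch treats the record as plain JSON and does not expand the `@context`; a full JSON-LD processor would be needed to resolve the terms to their URIs.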
 

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/grant.2720300'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/grant.2720300'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/grant.2720300'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/grant.2720300'
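The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is a hedged sketch: `ACCEPT_TYPES`, `build_request` and `fetch` are illustrative names introduced here, and whether the endpoint still responds is an assumption, so the example only builds the requests and leaves the network call to the reader.

```python
from urllib.request import Request, urlopen

BASE_URL = "https://scigraph.springernature.com/grant.2720300"

# Media types taken from the curl commands above; the short keys are our own labels.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Build a GET request asking for the record in the given RDF serialization."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=BASE_URL, timeout=30):
    """Send the request and return the body as text (needs network access)."""
    with urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Show the header each format would send, without contacting the server.
for fmt in ACCEPT_TYPES:
    request = build_request(fmt)
    print(fmt, "->", request.get_header("Accept"))
```

Calling `fetch("turtle")` would perform the same request as the third curl command, provided the service is reachable.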


 

This table displays all metadata directly associated with this object as RDF triples.

336 TRIPLES      18 PREDICATES      316 URIs      309 LITERALS      4 BLANK NODES

Subject Predicate Object
1 sg:grant.2720300 schema:about anzsrc-for:06
2 schema:description 1) Natural history and classification of the Helix-turn-helix domains. The helix-turn-helix (HTH) domain is a common denominator in basal and specific transcription factors from the three super-kingdoms of life. At its core, the domain comprises of an open tri-helical bundle, which typically binds DNA with the 3rd helix. Drawing on the wealth of data that has accumulated over two decades since the discovery of the domain, we present an overview of the natural history of the HTH domain from the viewpoint of structural analysis and comparative genomics. In structural terms, the HTH domains have developed several elaborations on the basic 3-helical core, such as the tetra-helical bundle, the winged-helix and the ribbon-helix-helix type configurations. In functional terms, the HTH domains are present in the most prevalent transcription factors of all prokaryotic genomes and some eukaryotic genomes. They have been recruited to a wide range of functions beyond transcription regulation, which include DNA repair and replication, RNA metabolism and protein-protein interactions in diverse signaling contexts. Beyond their basic role in mediating macromolecular interactions, the HTH domains have also been incorporated into the catalytic domains of diverse enzymes. We discuss the general domain architectural themes that have arisen amongst the HTH domains as a result of their recruitment to these diverse functions. We present a natural classification, higher-order relationships and phyletic pattern analysis of all the major families of HTH domains. This reconstruction suggests that there were at least 6-11 different HTH domains in the last universal common ancestor of all life forms, which covered much of the structural diversity and part of the functional versatility of the extant representatives of this domain. In prokaryotes the total number of HTH domains per genome shows a strong power-equation type scaling with the gene number per genome. 
However, the HTH domains in two-component signaling pathways show a linear scaling with gene number, in contrast to the non-linear scaling of HTH domains in single-component systems and sigma factors. These observations point to distinct evolutionary forces in the emergence of different signaling systems with HTH transcription factors. The archaea and bacteria share a number of ancient families of specific HTH transcription factors. However, they do not share any orthologous HTH proteins in the basal transcription apparatus. This differential relationship of their basal and specific transcriptional machinery poses an apparent conundrum regarding the origins of their transcription apparatus. 2) Origin of acetylcholine-receptor type ligand gated channels of animals. Acetylcholine receptor type ligand-gated ion channels (ART-LGIC; also known as Cys-loop receptors) are a superfamily of proteins that include the receptors for major neurotransmitters such as acetylcholine, serotonin, glycine, GABA, glutamate and histamine, and for Zn2+ ions. They play a central role in fast synaptic signaling in animal nervous systems and so far have not been found outside of the Metazoa. We have identified homologs of ART-LGICs in several bacteria and a single archaeal genus, Methanosarcina. The homology between the animal receptors and the prokaryotic homologs spans the entire length of the former, including both the ligand-binding and channel-forming transmembrane domains. A sequence-structure analysis using the structure of Lymnaea stagnalis acetylcholine-binding protein and the newly detected prokaryotic versions indicates the presence of at least one aromatic residue in the ligand-binding boxes of almost all representatives of the superfamily. Investigation of the domain architectures of the bacterial forms shows that they may often show fusions with other small-molecule-binding domains, such as the periplasmic binding protein superfamily I (PBP-I), Cache and MCP-N domains. 
Some of the bacterial forms also occur in predicted operons with the genes of the PBP-II superfamily and the Cache domains. Analysis of phyletic patterns suggests that the ART-LGICs are currently absent in all other eukaryotic lineages except animals. Moreover, phylogenetic analysis and conserved sequence motifs also suggest that a subset of the bacterial forms is closer to the metazoan forms. From the information from the bacterial forms we infer that cation-pi or hydrophobic interactions with the ligand are likely to be a pervasive feature of the entire superfamily, even though the individual residues involved in the process may vary. The conservation pattern in the channel-forming transmembrane domains also suggests similar channel-gating mechanisms in the prokaryotic versions. From the distribution of charged residues in the prokaryotic M2 transmembrane segments, we expect that there will be examples of both cation and anion selectivity within the prokaryotic members. Contextual connections suggest that the prokaryotic forms may function as chemotactic receptors for low molecular weight solutes. The phyletic patterns and phylogenetic relationships suggest the possibility that the metazoan receptors emerged through an early lateral transfer from a prokaryotic source, before the divergence of extant metazoan lineages. 3) Apicomplexan transcription factors. The comparative genomics of apicomplexans, such as the malarial parasite Plasmodium, the cattle parasite Theileria and the emerging human parasite Cryptosporidium, have suggested an unexpected paucity of specific transcription factors (TFs) with DNA binding domains that are closely related to those found in the major families of TFs from other eukaryotes. This apparent lack of specific TFs is paradoxical, given that the apicomplexans show a complex developmental cycle in one or more hosts and a reproducible pattern of differential gene expression in course of this cycle. 
Using sensitive sequence profile searches, we show that the apicomplexans possess a lineage-specific expansion of a novel family of proteins with a version of the AP2 (Apetala2)-integrase DNA binding domain, which is present in numerous plant TFs. About 20-27 members of this apicomplexan AP2 (ApiAP2) family are encoded in different apicomplexan genomes, with each protein containing one to four copies of the AP2 DNA binding domain. Using gene expression data from Plasmodium falciparum, we show that guilds of ApiAP2 genes are expressed in different stages of intraerythrocytic development. By analogy to the plant AP2 proteins and based on the expression patterns, we predict that the ApiAP2 proteins are likely to function as previously unknown specific TFs in the apicomplexans and regulate the progression of their developmental cycle. In addition to the ApiAP2 family, we also identified two other novel families of AP2 DNA binding domains in bacteria and transposons. Using structure similarity searches, we also identified divergent versions of the AP2-integrase DNA binding domain fold in the DNA binding region of the PI-SceI homing endonuclease and the C-terminal domain of the pleckstrin homology (PH) domain-like modules of eukaryotes. Integrating these findings, we present a reconstruction of the evolutionary scenario of the AP2-integrase DNA binding domain fold, which suggests that it underwent multiple independent combinations with different types of mobile endonucleases or recombinases. It appears that the eukaryotic versions have emerged from versions of the domain associated with mobile elements, followed by independent lineage-specific expansions, which accompanied their recruitment to transcription regulation functions.
3 schema:endDate 2005-01-01
4 schema:funder grid-institutes:grid.280285.5
5 schema:identifier N7a19e039448e4139a0329399bc92c07a
6 Nc24b562254174a309754c74dcf6da0b6
7 schema:inLanguage en
8 schema:keywords AP2
9 AP2 DNA
10 AP2 family
11 AP2 proteins
12 ApiAP2 family
13 ApiAP2 genes
14 ApiAP2 proteins
15 Apicomplexan AP2 (ApiAP2) family
16 Cache domains
17 Cryptosporidium
18 DNA
19 DNA repair
20 GABA
21 HTH
22 HTH domain
23 HTH protein
24 M2 transmembrane segment
25 Methanosarcina
26 Plasmodium
27 Plasmodium falciparum
28 RNA metabolism
29 Theileria
30 Zn2
31 acetylcholine
32 acetylcholine-binding protein
33 addition
34 analogy
35 analysis
36 ancestor
37 ancient family
38 animal nervous systems
39 animal receptors
40 animals
41 anion selectivity
42 apicomplexan genomes
43 apicomplexans
44 apparatus
45 apparent conundrum
46 apparent lack
47 archaea
48 archaeal genera
49 architectural themes
50 architecture
51 aromatic residues
52 bacteria
53 bacterial forms
54 basal transcription apparatus
55 basic role
56 binding protein
57 box
58 bundles
59 cache
60 catalytic domain
61 cations
62 cattle
63 central role
64 channel-forming transmembrane domains
65 channels
66 chemotactic receptors
67 classification
68 combination
69 common ancestor
70 common denominator
71 comparative genomics
72 complex developmental cycle
73 comprises
74 configuration
75 connection
76 conservation patterns
77 context
78 contextual connections
79 contrast
80 conundrum
81 copies
82 core
83 course
84 cycle
85 data
86 decades
87 denominator
88 development
89 developmental cycle
90 different signaling systems
91 different stages
92 different types
93 differential gene expression
94 differential relationships
95 discovery
96 distinct evolutionary forces
97 distribution
98 divergence
99 divergent versions
100 diverse enzymes
101 diverse functions
102 diversity
103 domain
104 domain architecture
105 domain comprises
106 elaboration
107 elements
108 emergence
109 endonuclease
110 entire length
111 enzyme
112 eukaryotes
113 eukaryotic genomes
114 eukaryotic lineages
115 eukaryotic versions
116 evolutionary analysis
117 evolutionary forces
118 evolutionary scenario
119 example
120 expansion
121 expression
122 expression data
123 expression patterns
124 extant representatives
125 factors
126 falciparum
127 family
128 fast synaptic signaling
129 features
130 findings
131 force
132 form
133 function
134 functional terms
135 functional versatility
136 fusion
137 gene expression
138 gene expression data
139 gene number
140 genes
141 genome
142 genomics
143 genus
144 glutamate
145 glycine
146 guilds
147 helix
148 helix domain
149 higher-order relationships
150 histamine
151 history
152 homing endonuclease
153 homolog
154 homology
155 host
156 hydrophobic interactions
157 independent combinations
158 individual residues
159 information
160 interaction
161 intraerythrocytic development
162 investigation
163 ion channels
164 ions
165 lack
166 last universal common ancestor
167 lateral transfer
168 length
169 life
170 life forms
171 ligand-gated ion channels
172 ligands
173 lineage-specific expansions
174 lineages
175 linear scaling
176 low molecular weight solutes
177 machinery
178 macromolecular interactions
179 major families
180 major neurotransmitter
181 malarial parasite Plasmodium
182 mechanism
183 members
184 metabolism
185 metazoan forms
186 metazoan lineages
187 metazoans
188 mobile elements
189 module
190 molecular weight solutes
191 more hosts
192 motif
193 natural classification
194 natural history
195 nervous system
196 neurotransmitters
197 non-linear scaling
198 novel family
199 number
200 observations
201 operon
202 origin
203 overview
204 parasite Cryptosporidium
205 parasite Plasmodium
206 part
207 pathway
208 pattern analysis
209 patterns
210 paucity
211 periplasmic binding protein
212 pervasive feature
213 phyletic pattern analysis
214 phyletic patterns
215 phylogenetic analysis
216 phylogenetic relationships
217 plant transcription factors
218 possibility
219 presence
220 process
221 profile searches
222 progression
223 prokaryotic forms
224 prokaryotic genomes
225 prokaryotic homolog
226 prokaryotic members
227 prokaryotic sources
228 prokaryotic versions
229 protei
230 protein
231 protein-protein interactions
232 range
233 receptors
234 recombinases
235 reconstruction
236 recruitment
237 region
238 regulation
239 regulation function
240 relationship
241 repair
242 replication
243 representatives
244 reproducible pattern
245 residues
246 results
247 role
248 scaling
249 scenarios
250 search
251 segments
252 selectivity
253 sensitive sequence profile searches
254 sequence motifs
255 sequence profile searches
256 sequence-structure analysis
257 sigma factor
258 signaling
259 signaling pathways
260 signaling system
261 similarity search
262 single-component systems
263 solutes
264 source
265 specific transcription factors
266 stage
267 structural analysis
268 structural diversity
269 structural terms
270 structure
271 structure similarity search
272 subset
273 superfamily
274 superfamily of proteins
275 synaptic signaling
276 system
277 terminal domain
278 terms
279 themes
280 total number
281 transcription apparatus
282 transcription factors
283 transcription regulation
284 transcription regulation functions
285 transcriptional machinery
286 transfer
287 transmembrane domain
288 transmembrane segments
289 transposon
290 two-component signaling pathways
291 type configuration
292 type ligands
293 types
294 unexpected paucity
295 universal common ancestor
296 versatility
297 version
298 viewpoint
299 wealth
300 wealth of data
301 weight solutes
302 wide range
303 wth
304 schema:name Evolutionary Analysis and Comparative Genomics of Protei
305 schema:recipient N97d83ee92e2d46b5aa1f0137f0cedce6
306 sg:person.01106662166.38
307 grid-institutes:grid.280285.5
308 schema:sameAs https://app.dimensions.ai/details/grant/grant.2720300
309 schema:sdDatePublished 2022-05-10T10:57
310 schema:sdLicense https://scigraph.springernature.com/explorer/license/
311 schema:sdPublisher N2d49bf3c7de240328f369f446b8ad16b
312 schema:startDate 2004-01-01
313 schema:url http://projectreporter.nih.gov/project_info_description.cfm?aid=7148173
314 sgo:license sg:explorer/license/
315 sgo:sdDataset grants
316 rdf:type schema:MonetaryGrant
317 N2d49bf3c7de240328f369f446b8ad16b schema:name Springer Nature - SN SciGraph project
318 rdf:type schema:Organization
319 N7a19e039448e4139a0329399bc92c07a schema:name nih_id
320 schema:value Z01LM092504
321 rdf:type schema:PropertyValue
322 N97d83ee92e2d46b5aa1f0137f0cedce6 schema:member sg:person.01106662166.38
323 schema:roleName PI
324 rdf:type schema:Role
325 Nc24b562254174a309754c74dcf6da0b6 schema:name dimensions_id
326 schema:value grant.2720300
327 rdf:type schema:PropertyValue
328 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
329 rdf:type schema:DefinedTerm
330 sg:person.01106662166.38 schema:affiliation grid-institutes:None
331 schema:familyName ARAVIND
332 schema:givenName L
333 rdf:type schema:Person
334 grid-institutes:None schema:name NATIONAL LIBRARY OF MEDICINE
335 rdf:type schema:Organization
336 grid-institutes:grid.280285.5 schema:Organization
 



