Ontology type: schema:ScholarlyArticle
Open Access: True
PUBLICATION DATE: 2010-01-04
AUTHORS: Marta Gîrdea, Laurent Noé, Gregory Kucherov
ABSTRACT
Background: Frameshift mutations in protein-coding DNA sequences produce a drastic change in the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins' common origin. Moreover, when a large number of substitutions are additionally involved in the divergence, homology detection becomes difficult even at the DNA level.
Results: We developed a novel method to infer distant homology relations between two proteins that accounts for frameshift and point mutations that may have affected the coding sequences. We design a dynamic programming alignment algorithm over memory-efficient graph representations of the complete set of putative DNA sequences of each protein, with the goal of determining the two putative DNA sequences that have the best-scoring alignment under a powerful scoring system designed to reflect the most probable evolutionary process. Our implementation is freely available at http://bioinfo.lifl.fr/path/.
Conclusions: Our approach makes it possible to uncover evolutionary information that is not captured by traditional alignment methods, which is confirmed by biologically significant examples.
PAGES: 6
URI: http://scigraph.springernature.com/pub.10.1186/1748-7188-5-6
DOI: http://dx.doi.org/10.1186/1748-7188-5-6
DIMENSIONS: https://app.dimensions.ai/details/publication/pub.1013342982
PUBMED: https://www.ncbi.nlm.nih.gov/pubmed/20047662
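The abstract refers to "the complete set of putative DNA sequences of each protein". As a rough illustration of what that set is, the sketch below back-translates a protein with the standard genetic code and counts or enumerates the candidate coding sequences. It is not the authors' implementation: the function names are hypothetical, and naive enumeration grows exponentially with protein length, which is presumably why the abstract speaks of memory-efficient graph representations instead.

```python
from itertools import product
from math import prod

# Standard genetic code in the conventional TCAG ordering:
# the i-th codon of product("TCAG", repeat=3) encodes AMINO[i] ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid to the list of codons that encode it.
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)


def count_back_translations(protein):
    """Number of DNA sequences that translate to `protein` (hypothetical helper)."""
    return prod(len(CODONS[aa]) for aa in protein)


def back_translations(protein):
    """Lazily yield every putative coding sequence; only practical for short inputs."""
    for combo in product(*(CODONS[aa] for aa in protein)):
        yield "".join(combo)


if __name__ == "__main__":
    print(count_back_translations("MKL"))
    print(list(back_translations("MW")))
```

Methionine and tryptophan each have a single codon, while leucine, serine and arginine have six, so the size of the set depends strongly on composition as well as length.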
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Biological Sciences",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Genetics",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "Institut National de Recherche en Informatique et en Automatique, Centre de Recherche Lille - Nord Europe, France",
"id": "http://www.grid.ac/institutes/None",
"name": [
"Laboratoire d'Informatique Fondamentale de Lille, (Centre National de la Recherche Scientifique, Universit\u00e9 Lille 1), Lille, France",
"Institut National de Recherche en Informatique et en Automatique, Centre de Recherche Lille - Nord Europe, France"
],
"type": "Organization"
},
"familyName": "G\u00eerdea",
"givenName": "Marta",
"id": "sg:person.01203454254.70",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01203454254.70"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Institut National de Recherche en Informatique et en Automatique, Centre de Recherche Lille - Nord Europe, France",
"id": "http://www.grid.ac/institutes/None",
"name": [
"Laboratoire d'Informatique Fondamentale de Lille, (Centre National de la Recherche Scientifique, Universit\u00e9 Lille 1), Lille, France",
"Institut National de Recherche en Informatique et en Automatique, Centre de Recherche Lille - Nord Europe, France"
],
"type": "Organization"
},
"familyName": "No\u00e9",
"givenName": "Laurent",
"id": "sg:person.01233236632.20",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01233236632.20"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "French-Russian J-V Poncelet Laboratory, Moscow, Russia",
"id": "http://www.grid.ac/institutes/None",
"name": [
"Laboratoire d'Informatique Fondamentale de Lille, (Centre National de la Recherche Scientifique, Universit\u00e9 Lille 1), Lille, France",
"Institut National de Recherche en Informatique et en Automatique, Centre de Recherche Lille - Nord Europe, France",
"French-Russian J-V Poncelet Laboratory, Moscow, Russia"
],
"type": "Organization"
},
"familyName": "Kucherov",
"givenName": "Gregory",
"id": "sg:person.013163701366.40",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013163701366.40"
],
"type": "Person"
}
],
"citation": [
{
"id": "sg:pub.10.1186/1471-2105-6-156",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1023271915",
"https://doi.org/10.1186/1471-2105-6-156"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/978-3-642-04241-6_10",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1030953214",
"https://doi.org/10.1007/978-3-642-04241-6_10"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/1471-2105-6-134",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1033106693",
"https://doi.org/10.1186/1471-2105-6-134"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/s00239-004-0138-0",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1022873889",
"https://doi.org/10.1007/s00239-004-0138-0"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/1471-2164-8-371",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1050226563",
"https://doi.org/10.1186/1471-2164-8-371"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/978-3-642-04241-6_20",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1019087962",
"https://doi.org/10.1007/978-3-642-04241-6_20"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/3-540-63220-4_59",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1053403180",
"https://doi.org/10.1007/3-540-63220-4_59"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/bf00162968",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1013701222",
"https://doi.org/10.1007/bf00162968"
],
"type": "CreativeWork"
}
],
"datePublished": "2010-01-04",
"datePublishedReg": "2010-01-04",
"description": "BackgroundFrameshift mutations in protein-coding DNA sequences produce a drastic change in the resulting protein sequence, which prevents classic protein alignment methods from revealing the proteins' common origin. Moreover, when a large number of substitutions are additionally involved in the divergence, the homology detection becomes difficult even at the DNA level.ResultsWe developed a novel method to infer distant homology relations of two proteins, that accounts for frameshift and point mutations that may have affected the coding sequences. We design a dynamic programming alignment algorithm over memory-efficient graph representations of the complete set of putative DNA sequences of each protein, with the goal of determining the two putative DNA sequences which have the best scoring alignment under a powerful scoring system designed to reflect the most probable evolutionary process. Our implementation is freely available at http://bioinfo.lifl.fr/path/.ConclusionsOur approach allows to uncover evolutionary information that is not captured by traditional alignment methods, which is confirmed by biologically significant examples.",
"genre": "article",
"id": "sg:pub.10.1186/1748-7188-5-6",
"inLanguage": "en",
"isAccessibleForFree": true,
"isPartOf": [
{
"id": "sg:journal.1036449",
"issn": [
"1748-7188"
],
"name": "Algorithms for Molecular Biology",
"publisher": "Springer Nature",
"type": "Periodical"
},
{
"issueNumber": "1",
"type": "PublicationIssue"
},
{
"type": "PublicationVolume",
"volumeNumber": "5"
}
],
"keywords": [
"DNA sequences",
"protein-coding DNA sequences",
"dynamic programming alignment algorithms",
"alignment method",
"traditional alignment methods",
"common origin",
"protein homology",
"evolutionary information",
"protein sequences",
"alignment algorithm",
"evolutionary processes",
"graph representation",
"powerful scoring system",
"homology relations",
"homology detection",
"point mutations",
"frameshift mutation",
"mutations",
"sequence",
"ConclusionsOur approach",
"protein",
"DNA levels",
"best scoring",
"novel method",
"large number",
"homology",
"frameshift",
"algorithm",
"divergence",
"significant examples",
"implementation",
"drastic changes",
"complete set",
"representation",
"information",
"method",
"set",
"detection",
"substitution",
"system",
"goal",
"origin",
"example",
"scoring",
"presence",
"levels",
"changes",
"number",
"process",
"ResultsWe",
"approach",
"scoring system",
"relation"
],
"name": "Back-translation for discovering distant protein homologies in the presence of frameshift mutations",
"pagination": "6",
"productId": [
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1013342982"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1186/1748-7188-5-6"
]
},
{
"name": "pubmed_id",
"type": "PropertyValue",
"value": [
"20047662"
]
}
],
"sameAs": [
"https://doi.org/10.1186/1748-7188-5-6",
"https://app.dimensions.ai/details/publication/pub.1013342982"
],
"sdDataset": "articles",
"sdDatePublished": "2022-05-20T07:26",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/article/article_510.jsonl",
"type": "ScholarlyArticle",
"url": "https://doi.org/10.1186/1748-7188-5-6"
}
]
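A record shaped like the one above can be read with an ordinary JSON parser. The following is a minimal sketch, assuming the record is a one-element array whose object carries `name`, `author` and `productId` fields as shown; `summarize` and the trimmed `SAMPLE` string are illustrative names, not part of SciGraph.

```python
import json

# Trimmed copy of the record above, kept small for illustration.
SAMPLE = r'''
[
  {
    "name": "Back-translation for discovering distant protein homologies in the presence of frameshift mutations",
    "author": [
      {"familyName": "G\u00eerdea", "givenName": "Marta"},
      {"familyName": "No\u00e9", "givenName": "Laurent"},
      {"familyName": "Kucherov", "givenName": "Gregory"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1013342982"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1748-7188-5-6"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["20047662"]}
    ]
  }
]
'''


def summarize(jsonld_text):
    """Extract title, author names and identifiers from a record (hypothetical helper)."""
    record = json.loads(jsonld_text)[0]
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])]
    # productId entries hold their value in a one-element list.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    return {
        "title": record["name"],
        "authors": authors,
        "doi": ids.get("doi"),
        "pubmed_id": ids.get("pubmed_id"),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

This treats the record as plain JSON and ignores the `@context`; resolving terms to full IRIs would need a JSON-LD processor.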
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1748-7188-5-6'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1748-7188-5-6'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1748-7188-5-6'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1748-7188-5-6'
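The four curl commands differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. The helper names and the short format keys below are assumptions made for illustration, and the sketch does not check whether the endpoint still responds.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above; the short keys are our own labels.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build (but do not send) a request for one serialization of a record."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": MEDIA_TYPES[fmt]},
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    req = build_request("pub.10.1186/1748-7188-5-6", "turtle")
    print(req.full_url, req.get_header("Accept"))
```

Separating `build_request` from `fetch` keeps the header logic checkable without a network connection.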
RDF metadata directly associated with this object:
163 triples
22 predicates
87 URIs
71 literals
7 blank nodes
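Counts of this kind can be approximated from an N-Triples dump, since that format puts one triple on each line. The sketch below is a simplified line splitter rather than a full N-Triples parser, the sample data is invented, and it counts distinct URIs, literals and blank nodes, which is an assumption about how the page derives its figures.

```python
# Invented sample data; not taken from the record described here.
SAMPLE_NT = """
<https://example.org/pub/1> <http://schema.org/name> "An example title" .
<https://example.org/pub/1> <http://schema.org/author> _:b0 .
_:b0 <http://schema.org/familyName> "Doe" .
_:b0 <http://schema.org/givenName> "Jane" .
"""


def triple_stats(ntriples_text):
    """Count triples and distinct predicates, URIs, literals and blank nodes."""
    predicates, uris, literals, blanks = set(), set(), set(), set()
    triples = 0
    for line in ntriples_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Subject and predicate never contain whitespace; the object may.
        subject, predicate, rest = line.split(None, 2)
        obj = rest.rstrip()
        if obj.endswith("."):
            obj = obj[:-1].rstrip()  # drop the statement terminator
        triples += 1
        predicates.add(predicate)
        for term in (subject, predicate, obj):
            if term.startswith("<"):
                uris.add(term)
            elif term.startswith("_:"):
                blanks.add(term)
            else:
                literals.add(term)
    return {
        "triples": triples,
        "predicates": len(predicates),
        "uris": len(uris),
        "literals": len(literals),
        "blank_nodes": len(blanks),
    }


if __name__ == "__main__":
    print(triple_stats(SAMPLE_NT))
```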