galaxieEST: addressing EST identity through automated phylogenetic analysis


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2004-12

AUTHORS

R Henrik Nilsson, Balaji Rajashekar, Karl-Henrik Larsson, Björn M Ursing

ABSTRACT

BACKGROUND: Research involving expressed sequence tags (ESTs) is intricately coupled to the existence of large, well-annotated sequence repositories. Comparatively complete and satisfactory annotated public sequence libraries are, however, available only for a limited range of organisms, rendering the absence of sequences and gene structure information a tangible problem for those working with taxa lacking an EST or genome sequencing project. Paralogous genes belonging to the same gene family but distinguished by derived characteristics are particularly prone to misidentification and erroneous annotation; high but incomplete levels of sequence similarity are typically difficult to interpret and have formed the basis of many unsubstantiated assumptions of orthology. In these cases, a phylogenetic study of the query sequence together with the most similar sequences in the database may be of great value to the identification process. In order to facilitate this laborious procedure, a project to employ automated phylogenetic analysis in the identification of ESTs was initiated. RESULTS: galaxieEST is an open source Perl-CGI script package designed to complement traditional similarity-based identification of EST sequences through employment of automated phylogenetic analysis. It uses a series of BLAST runs as a sieve to retrieve nucleotide and protein sequences for inclusion in neighbour joining and parsimony analyses; the output includes the BLAST output, the results of the phylogenetic analyses, and the corresponding multiple alignments. galaxieEST is available as an on-line web service for identification of fungal ESTs and for download / local installation for use with any organism group at http://galaxie.cgb.ki.se/galaxieEST.html. 
CONCLUSIONS: By addressing sequence relatedness in addition to similarity, galaxieEST provides an integrative view on EST origin and identity, which may prove particularly useful in cases where similarity searches return one or more pertinent, but not full, matches and additional information on the query EST is needed.

PAGES

87


Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1471-2105-5-87

DOI

http://dx.doi.org/10.1186/1471-2105-5-87

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1024481166

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/15236648



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Computational Biology", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Expressed Sequence Tags", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Phylogeny", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Sequence Alignment", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Software", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of Gothenburg", 
          "id": "https://www.grid.ac/institutes/grid.8761.8", 
          "name": [
            "Botanical Institute, G\u00f6teborg University, Box 461, SE-405 30, G\u00f6teborg, Sweden"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Nilsson", 
        "givenName": "R Henrik", 
        "id": "sg:person.01145211747.88", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01145211747.88"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Lund University", 
          "id": "https://www.grid.ac/institutes/grid.4514.4", 
          "name": [
            "Department of Microbial Ecology, Lund University, SE-223 62, Lund, Sweden"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Rajashekar", 
        "givenName": "Balaji", 
        "id": "sg:person.0666361066.33", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0666361066.33"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Gothenburg", 
          "id": "https://www.grid.ac/institutes/grid.8761.8", 
          "name": [
            "Botanical Institute, G\u00f6teborg University, Box 461, SE-405 30, G\u00f6teborg, Sweden"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Larsson", 
        "givenName": "Karl-Henrik", 
        "id": "sg:person.01361572401.02", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01361572401.02"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Karolinska Institute", 
          "id": "https://www.grid.ac/institutes/grid.4714.6", 
          "name": [
            "Karolinska Institutet, Center for Genomics and Bioinformatics, Berzelius v\u00e4g 35, SE-171 77, Stockholm, Sweden"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ursing", 
        "givenName": "Bj\u00f6rn M", 
        "id": "sg:person.0652741073.68", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0652741073.68"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1093/bioinformatics/bth119", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003235153"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-4-39", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018633949", 
          "https://doi.org/10.1186/1471-2105-4-39"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/13.4.477", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019841193"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/29.6.1272", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040102707"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/22.22.4673", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042438223"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/25.17.3389", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047265454"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/4236.612229", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061172160"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2004-12", 
    "datePublishedReg": "2004-12-01", 
    "description": "BACKGROUND: Research involving expressed sequence tags (ESTs) is intricately coupled to the existence of large, well-annotated sequence repositories. Comparatively complete and satisfactory annotated public sequence libraries are, however, available only for a limited range of organisms, rendering the absence of sequences and gene structure information a tangible problem for those working with taxa lacking an EST or genome sequencing project. Paralogous genes belonging to the same gene family but distinguished by derived characteristics are particularly prone to misidentification and erroneous annotation; high but incomplete levels of sequence similarity are typically difficult to interpret and have formed the basis of many unsubstantiated assumptions of orthology. In these cases, a phylogenetic study of the query sequence together with the most similar sequences in the database may be of great value to the identification process. In order to facilitate this laborious procedure, a project to employ automated phylogenetic analysis in the identification of ESTs was initiated.\nRESULTS: galaxieEST is an open source Perl-CGI script package designed to complement traditional similarity-based identification of EST sequences through employment of automated phylogenetic analysis. It uses a series of BLAST runs as a sieve to retrieve nucleotide and protein sequences for inclusion in neighbour joining and parsimony analyses; the output includes the BLAST output, the results of the phylogenetic analyses, and the corresponding multiple alignments. galaxieEST is available as an on-line web service for identification of fungal ESTs and for download / local installation for use with any organism group at http://galaxie.cgb.ki.se/galaxieEST.html.\nCONCLUSIONS: By addressing sequence relatedness in addition to similarity, galaxieEST provides an integrative view on EST origin and identity, which may prove particularly useful in cases where similarity searches return one or more pertinent, but not full, matches and additional information on the query EST is needed.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/1471-2105-5-87", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1023786", 
        "issn": [
          "1471-2105"
        ], 
        "name": "BMC Bioinformatics", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "5"
      }
    ], 
    "name": "galaxieEST: addressing EST identity through automated phylogenetic analysis", 
    "pagination": "87", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "74a13ac75971a462282cf0c49eaeb04738dde2fbda94b5cd2ed5cffb3f70aee7"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "15236648"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "100965194"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1471-2105-5-87"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1024481166"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1471-2105-5-87", 
      "https://app.dimensions.ai/details/publication/pub.1024481166"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-11T09:53", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000347_0000000347/records_89793_00000001.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://link.springer.com/10.1186%2F1471-2105-5-87"
  }
]
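
Because JSON-LD is plain JSON, the record above can be read with an ordinary JSON parser. The following is a minimal Python sketch, assuming a trimmed excerpt of the record (only the fields used here are kept). The helper name `summarize` is hypothetical and is not part of SciGraph.

```python
import json

# Trimmed excerpt of the record above; only the fields used below are kept.
# A raw string keeps the \u00f6 escape intact for the JSON parser.
RECORD = r"""
[
  {
    "name": "galaxieEST: addressing EST identity through automated phylogenetic analysis",
    "datePublished": "2004-12",
    "author": [
      {"familyName": "Nilsson", "givenName": "R Henrik"},
      {"familyName": "Rajashekar", "givenName": "Balaji"},
      {"familyName": "Larsson", "givenName": "Karl-Henrik"},
      {"familyName": "Ursing", "givenName": "Bj\u00f6rn M"}
    ],
    "productId": [
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["15236648"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1471-2105-5-87"]}
    ]
  }
]
"""


def summarize(record_json):
    """Return title, author names, and identifiers from a SciGraph-style record."""
    record = json.loads(record_json)[0]  # the document is a one-element list
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # productId is a list of name/value pairs; each value is itself a list.
    ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {"title": record["name"], "authors": authors, "ids": ids}


summary = summarize(RECORD)
print(summary["title"])
print("; ".join(summary["authors"]))
print("doi:", summary["ids"]["doi"])
```

The array order of `author` carries the author order, so the sketch reads it as-is rather than sorting.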
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-5-87'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-5-87'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-5-87'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-5-87'
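
The four commands above differ only in the `Accept` header, that is, they rely on HTTP content negotiation against a single URL. A minimal Python sketch of the same requests using the standard library follows. The names `FORMATS`, `build_request`, and `fetch` are hypothetical, and whether the SciGraph endpoint still answers these requests is not verified here.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/pub.10.1186/1471-2105-5-87"

# Media types taken from the curl commands above; the short keys are
# labels chosen for this sketch.
FORMATS = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Build (but do not send) a GET request asking for one RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": FORMATS[fmt]})


def fetch(fmt, url=BASE_URL, timeout=30):
    """Send the request and return the body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Only show which header each format would send; no request is made here.
    for fmt in FORMATS:
        print(fmt, "->", build_request(fmt).get_header("Accept"))
```

Separating `build_request` from `fetch` keeps the header logic checkable offline; calling `fetch("turtle")` would perform the actual download.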


 

This table displays all metadata directly associated to this object as RDF triples.

137 TRIPLES      21 PREDICATES      41 URIs      26 LITERALS      14 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1471-2105-5-87 schema:about N240f29a5486b4215967fdd49df664cff
2 N87e7aa68d2a14a9895aba1f0c22ba875
3 Na4a72b32aed74e038e83beb14e1e6a9a
4 Nb448d752799b481990f312211ec0a3d4
5 Nd688ecd23e5a424989050a10fda50dde
6 anzsrc-for:06
7 anzsrc-for:0604
8 schema:author Nb5b376626e9b4a498eebee08213d3572
9 schema:citation sg:pub.10.1186/1471-2105-4-39
10 https://doi.org/10.1093/bioinformatics/13.4.477
11 https://doi.org/10.1093/bioinformatics/bth119
12 https://doi.org/10.1093/nar/22.22.4673
13 https://doi.org/10.1093/nar/25.17.3389
14 https://doi.org/10.1093/nar/29.6.1272
15 https://doi.org/10.1109/4236.612229
16 schema:datePublished 2004-12
17 schema:datePublishedReg 2004-12-01
18 schema:description BACKGROUND: Research involving expressed sequence tags (ESTs) is intricately coupled to the existence of large, well-annotated sequence repositories. Comparatively complete and satisfactory annotated public sequence libraries are, however, available only for a limited range of organisms, rendering the absence of sequences and gene structure information a tangible problem for those working with taxa lacking an EST or genome sequencing project. Paralogous genes belonging to the same gene family but distinguished by derived characteristics are particularly prone to misidentification and erroneous annotation; high but incomplete levels of sequence similarity are typically difficult to interpret and have formed the basis of many unsubstantiated assumptions of orthology. In these cases, a phylogenetic study of the query sequence together with the most similar sequences in the database may be of great value to the identification process. In order to facilitate this laborious procedure, a project to employ automated phylogenetic analysis in the identification of ESTs was initiated. RESULTS: galaxieEST is an open source Perl-CGI script package designed to complement traditional similarity-based identification of EST sequences through employment of automated phylogenetic analysis. It uses a series of BLAST runs as a sieve to retrieve nucleotide and protein sequences for inclusion in neighbour joining and parsimony analyses; the output includes the BLAST output, the results of the phylogenetic analyses, and the corresponding multiple alignments. galaxieEST is available as an on-line web service for identification of fungal ESTs and for download / local installation for use with any organism group at http://galaxie.cgb.ki.se/galaxieEST.html. CONCLUSIONS: By addressing sequence relatedness in addition to similarity, galaxieEST provides an integrative view on EST origin and identity, which may prove particularly useful in cases where similarity searches return one or more pertinent, but not full, matches and additional information on the query EST is needed.
19 schema:genre research_article
20 schema:inLanguage en
21 schema:isAccessibleForFree true
22 schema:isPartOf N2f939f3e599a4874af78b13029da090a
23 N7cd0f19d136f4a86ade09f4db9c5950a
24 sg:journal.1023786
25 schema:name galaxieEST: addressing EST identity through automated phylogenetic analysis
26 schema:pagination 87
27 schema:productId N5201db462d2540f195e857491c436e87
28 Na25202778fbc497ebe3106b673bc0c8e
29 Nddd749a3858b43cca195c1e34581c4a9
30 Nfa725c8968914846a608fd46af6efb42
31 Nfd654fb260bd41e8bf10709262bf2782
32 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024481166
33 https://doi.org/10.1186/1471-2105-5-87
34 schema:sdDatePublished 2019-04-11T09:53
35 schema:sdLicense https://scigraph.springernature.com/explorer/license/
36 schema:sdPublisher N27173d86816e4cb0a8f8d7ba46c5bc33
37 schema:url https://link.springer.com/10.1186%2F1471-2105-5-87
38 sgo:license sg:explorer/license/
39 sgo:sdDataset articles
40 rdf:type schema:ScholarlyArticle
41 N240f29a5486b4215967fdd49df664cff schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
42 schema:name Computational Biology
43 rdf:type schema:DefinedTerm
44 N27173d86816e4cb0a8f8d7ba46c5bc33 schema:name Springer Nature - SN SciGraph project
45 rdf:type schema:Organization
46 N2f939f3e599a4874af78b13029da090a schema:volumeNumber 5
47 rdf:type schema:PublicationVolume
48 N5201db462d2540f195e857491c436e87 schema:name dimensions_id
49 schema:value pub.1024481166
50 rdf:type schema:PropertyValue
51 N600f891a561746d29bcbe8a3e2d0edc8 rdf:first sg:person.01361572401.02
52 rdf:rest N91eef9022ee04ba5bf4e5b6edaf5b2d6
53 N7cd0f19d136f4a86ade09f4db9c5950a schema:issueNumber 1
54 rdf:type schema:PublicationIssue
55 N87e7aa68d2a14a9895aba1f0c22ba875 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
56 schema:name Sequence Alignment
57 rdf:type schema:DefinedTerm
58 N91eef9022ee04ba5bf4e5b6edaf5b2d6 rdf:first sg:person.0652741073.68
59 rdf:rest rdf:nil
60 Na25202778fbc497ebe3106b673bc0c8e schema:name nlm_unique_id
61 schema:value 100965194
62 rdf:type schema:PropertyValue
63 Na4a72b32aed74e038e83beb14e1e6a9a schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
64 schema:name Expressed Sequence Tags
65 rdf:type schema:DefinedTerm
66 Nb3051f61eacc4d55b0db1524de1ce8d1 rdf:first sg:person.0666361066.33
67 rdf:rest N600f891a561746d29bcbe8a3e2d0edc8
68 Nb448d752799b481990f312211ec0a3d4 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
69 schema:name Software
70 rdf:type schema:DefinedTerm
71 Nb5b376626e9b4a498eebee08213d3572 rdf:first sg:person.01145211747.88
72 rdf:rest Nb3051f61eacc4d55b0db1524de1ce8d1
73 Nd688ecd23e5a424989050a10fda50dde schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
74 schema:name Phylogeny
75 rdf:type schema:DefinedTerm
76 Nddd749a3858b43cca195c1e34581c4a9 schema:name doi
77 schema:value 10.1186/1471-2105-5-87
78 rdf:type schema:PropertyValue
79 Nfa725c8968914846a608fd46af6efb42 schema:name readcube_id
80 schema:value 74a13ac75971a462282cf0c49eaeb04738dde2fbda94b5cd2ed5cffb3f70aee7
81 rdf:type schema:PropertyValue
82 Nfd654fb260bd41e8bf10709262bf2782 schema:name pubmed_id
83 schema:value 15236648
84 rdf:type schema:PropertyValue
85 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
86 schema:name Biological Sciences
87 rdf:type schema:DefinedTerm
88 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
89 schema:name Genetics
90 rdf:type schema:DefinedTerm
91 sg:journal.1023786 schema:issn 1471-2105
92 schema:name BMC Bioinformatics
93 rdf:type schema:Periodical
94 sg:person.01145211747.88 schema:affiliation https://www.grid.ac/institutes/grid.8761.8
95 schema:familyName Nilsson
96 schema:givenName R Henrik
97 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01145211747.88
98 rdf:type schema:Person
99 sg:person.01361572401.02 schema:affiliation https://www.grid.ac/institutes/grid.8761.8
100 schema:familyName Larsson
101 schema:givenName Karl-Henrik
102 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01361572401.02
103 rdf:type schema:Person
104 sg:person.0652741073.68 schema:affiliation https://www.grid.ac/institutes/grid.4714.6
105 schema:familyName Ursing
106 schema:givenName Björn M
107 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0652741073.68
108 rdf:type schema:Person
109 sg:person.0666361066.33 schema:affiliation https://www.grid.ac/institutes/grid.4514.4
110 schema:familyName Rajashekar
111 schema:givenName Balaji
112 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0666361066.33
113 rdf:type schema:Person
114 sg:pub.10.1186/1471-2105-4-39 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018633949
115 https://doi.org/10.1186/1471-2105-4-39
116 rdf:type schema:CreativeWork
117 https://doi.org/10.1093/bioinformatics/13.4.477 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019841193
118 rdf:type schema:CreativeWork
119 https://doi.org/10.1093/bioinformatics/bth119 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003235153
120 rdf:type schema:CreativeWork
121 https://doi.org/10.1093/nar/22.22.4673 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042438223
122 rdf:type schema:CreativeWork
123 https://doi.org/10.1093/nar/25.17.3389 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047265454
124 rdf:type schema:CreativeWork
125 https://doi.org/10.1093/nar/29.6.1272 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040102707
126 rdf:type schema:CreativeWork
127 https://doi.org/10.1109/4236.612229 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061172160
128 rdf:type schema:CreativeWork
129 https://www.grid.ac/institutes/grid.4514.4 schema:alternateName Lund University
130 schema:name Department of Microbial Ecology, Lund University, SE-223 62, Lund, Sweden
131 rdf:type schema:Organization
132 https://www.grid.ac/institutes/grid.4714.6 schema:alternateName Karolinska Institute
133 schema:name Karolinska Institutet, Center for Genomics and Bioinformatics, Berzelius väg 35, SE-171 77, Stockholm, Sweden
134 rdf:type schema:Organization
135 https://www.grid.ac/institutes/grid.8761.8 schema:alternateName University of Gothenburg
136 schema:name Botanical Institute, Göteborg University, Box 461, SE-405 30, Göteborg, Sweden
137 rdf:type schema:Organization
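
In the table above, the single `schema:author` object (row 8) is a blank node that heads an RDF list: each node points to one person via `rdf:first` and to the next node via `rdf:rest`, ending at `rdf:nil`. This is how author order survives in an otherwise unordered set of triples. A minimal Python sketch that walks this list follows. The node identifiers and family names are copied from the table, while the dictionary layout and the function name `walk_rdf_list` are assumptions made for illustration.

```python
# rdf:first / rdf:rest pairs copied from rows 51-52, 58-59, 66-67 and 71-72.
LIST_NODES = {
    "Nb5b376626e9b4a498eebee08213d3572": (
        "sg:person.01145211747.88", "Nb3051f61eacc4d55b0db1524de1ce8d1"),
    "Nb3051f61eacc4d55b0db1524de1ce8d1": (
        "sg:person.0666361066.33", "N600f891a561746d29bcbe8a3e2d0edc8"),
    "N600f891a561746d29bcbe8a3e2d0edc8": (
        "sg:person.01361572401.02", "N91eef9022ee04ba5bf4e5b6edaf5b2d6"),
    "N91eef9022ee04ba5bf4e5b6edaf5b2d6": (
        "sg:person.0652741073.68", "rdf:nil"),
}

# schema:familyName values copied from rows 95, 100, 105 and 110.
FAMILY_NAMES = {
    "sg:person.01145211747.88": "Nilsson",
    "sg:person.01361572401.02": "Larsson",
    "sg:person.0652741073.68": "Ursing",
    "sg:person.0666361066.33": "Rajashekar",
}

AUTHOR_LIST_HEAD = "Nb5b376626e9b4a498eebee08213d3572"  # object of row 8


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from `head` and collect the rdf:first values in order."""
    items = []
    seen = set()
    node = head
    while node != "rdf:nil":
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


ordered_authors = [FAMILY_NAMES[p] for p in walk_rdf_list(AUTHOR_LIST_HEAD, LIST_NODES)]
print(ordered_authors)  # ['Nilsson', 'Rajashekar', 'Larsson', 'Ursing']
```

The recovered order matches the author line of the article, even though the rows of the table itself are sorted by identifier rather than by position in the list.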
 



