Topology Independent Protein Structural Alignment


Ontology type: schema:Chapter     


Chapter Info

DATE

2007-01-01

AUTHORS

Joe Dundas, T. A. Binkowski, Bhaskar DasGupta, Jie Liang

ABSTRACT

Protein structural alignment is an indispensable tool used for many different studies in bioinformatics. Most structural alignment algorithms assume that the structural units of two similar proteins will align sequentially. This assumption may not be true for all similar proteins and as a result, proteins with similar structure but with permuted sequence arrangement are often missed. We present a solution to the problem based on an approximation algorithm that finds a sequence-order independent structural alignment that is close to optimal. We first exhaustively fragment two proteins and calculate a novel similarity score between all possible aligned fragment pairs. We treat each aligned fragment pair as a vertex on a graph. Vertices are connected by an edge if there are intra residue sequence conflicts. We regard the realignment of the fragment pairs as a special case of the maximum-weight independent set problem and solve this computationally intensive problem approximately by iteratively solving relaxations of an appropriate integer programming formulation. The resulting structural alignment is sequence order independent. Our method is insensitive to gaps, insertions/deletions, and circular permutations.
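The graph formulation in the abstract can be illustrated with a toy example. The sketch below is not the authors' method: it replaces the iterated integer-programming relaxation with an exhaustive search that only works for a handful of fragment pairs, and it assumes a simplified conflict rule (two pairs conflict when their residue ranges overlap in either protein). The fragment positions, lengths, and scores are hypothetical values chosen for illustration.

```python
from itertools import combinations

# Hypothetical aligned fragment pairs:
# (start in protein A, start in protein B, fragment length, similarity score).
# These values are illustrative and are not taken from the chapter.
FRAGMENT_PAIRS = [
    (0, 20, 5, 4.0),   # A[0:5]   <-> B[20:25]
    (3, 0, 5, 3.0),    # A[3:8]   <-> B[0:5], overlaps pair 0 in protein A
    (10, 2, 4, 2.5),   # A[10:14] <-> B[2:6], overlaps pair 1 in protein B
    (20, 10, 5, 3.5),  # A[20:25] <-> B[10:15], overlaps nothing
]


def ranges_overlap(start1, start2, len1, len2):
    """True if the half-open ranges [start1, start1+len1) and [start2, start2+len2) intersect."""
    return start1 < start2 + len2 and start2 < start1 + len1


def in_conflict(p, q):
    """Assumed conflict rule: two pairs reuse a residue in protein A or in protein B."""
    return (ranges_overlap(p[0], q[0], p[2], q[2])
            or ranges_overlap(p[1], q[1], p[2], q[2]))


def best_alignment(pairs):
    """Exhaustive maximum-weight independent set over the conflict graph.

    Vertices are fragment pairs, edges join conflicting pairs. The search is
    exponential in the number of pairs, so it is only suitable for toy inputs.
    """
    n = len(pairs)
    edges = {(i, j) for i, j in combinations(range(n), 2)
             if in_conflict(pairs[i], pairs[j])}
    best_score, best_subset = 0.0, ()
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            # Skip subsets that contain two conflicting fragment pairs.
            if any(edge in edges for edge in combinations(subset, 2)):
                continue
            score = sum(pairs[i][3] for i in subset)
            if score > best_score:
                best_score, best_subset = score, subset
    return best_subset, best_score


chosen, total = best_alignment(FRAGMENT_PAIRS)
print(chosen, total)
```

In this toy input the selected pairs map an early region of protein A to a late region of protein B and a later region of A to an early region of B, which is the kind of sequence-order independent alignment the abstract describes.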

PAGES

171-182

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-540-74126-8_16

DOI

http://dx.doi.org/10.1007/978-3-540-74126-8_16

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1037587669



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0601", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biochemistry and Cell Biology", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607\u20137052", 
          "id": "http://www.grid.ac/institutes/grid.185648.6", 
          "name": [
            "Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607\u20137052"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Dundas", 
        "givenName": "Joe", 
        "id": "sg:person.015617564507.74", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015617564507.74"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607\u20137052", 
          "id": "http://www.grid.ac/institutes/grid.185648.6", 
          "name": [
            "Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607\u20137052"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Binkowski", 
        "givenName": "T. A.", 
        "id": "sg:person.014205644054.00", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014205644054.00"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Computer Science, University of Illinois at Chicago, Chicago, Illinois 60607-7053", 
          "id": "http://www.grid.ac/institutes/grid.185648.6", 
          "name": [
            "Department of Computer Science, University of Illinois at Chicago, Chicago, Illinois 60607-7053"
          ], 
          "type": "Organization"
        }, 
        "familyName": "DasGupta", 
        "givenName": "Bhaskar", 
        "id": "sg:person.0763403270.10", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0763403270.10"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607\u20137052", 
          "id": "http://www.grid.ac/institutes/grid.185648.6", 
          "name": [
            "Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607\u20137052"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Liang", 
        "givenName": "Jie", 
        "id": "sg:person.01277443336.48", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01277443336.48"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2007-01-01", 
    "datePublishedReg": "2007-01-01", 
    "description": "Protein structural alignment is an indispensable tool used for many different studies in bioinformatics. Most structural alignment algorithms assume that the structural units of two similar proteins will align sequentially. This assumption may not be true for all similar proteins and as a result, proteins with similar structure but with permuted sequence arrangement are often missed. We present a solution to the problem based on an approximation algorithm that finds a sequence-order independent structural alignment that is close to optimal. We first exhaustively fragment two proteins and calculate a novel similarity score between all possible aligned fragment pairs. We treat each aligned fragment pair as a vertex on a graph. Vertices are connected by an edge if there are intra residue sequence conflicts. We regard the realignment of the fragment pairs as a special case of the maximum-weight independent set problem and solve this computationally intensive problem approximately by iteratively solving relaxations of an appropriate integer programming formulation. The resulting structural alignment is sequence order independent. Our method is insensitive to gaps, insertions/deletions, and circular permutations.", 
    "editor": [
      {
        "familyName": "Giancarlo", 
        "givenName": "Raffaele", 
        "type": "Person"
      }, 
      {
        "familyName": "Hannenhalli", 
        "givenName": "Sridhar", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-540-74126-8_16", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-540-74125-1", 
        "978-3-540-74126-8"
      ], 
      "name": "Algorithms in Bioinformatics", 
      "type": "Book"
    }, 
    "keywords": [
      "protein structural alignments", 
      "maximum weight independent set problem", 
      "independent set problem", 
      "structural alignment algorithms", 
      "structural alignment", 
      "similar proteins", 
      "integer programming formulation", 
      "intensive problems", 
      "alignment algorithm", 
      "similarity scores", 
      "set problem", 
      "approximation algorithm", 
      "insertions/deletions", 
      "sequence conflicts", 
      "programming formulation", 
      "circular permutation", 
      "algorithm", 
      "sequence arrangement", 
      "protein", 
      "sequence order", 
      "fragment pairs", 
      "indispensable tool", 
      "alignment", 
      "bioinformatics", 
      "graph", 
      "novel similarity score", 
      "vertices", 
      "special case", 
      "deletion", 
      "similar structure", 
      "tool", 
      "permutations", 
      "pairs", 
      "structural units", 
      "solution", 
      "edge", 
      "order", 
      "method", 
      "different studies", 
      "assumption", 
      "results", 
      "units", 
      "structure", 
      "gap", 
      "arrangement", 
      "formulation", 
      "study", 
      "conflict", 
      "cases", 
      "scores", 
      "relaxation", 
      "realignment", 
      "problem"
    ], 
    "name": "Topology Independent Protein Structural Alignment", 
    "pagination": "171-182", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1037587669"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-540-74126-8_16"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-540-74126-8_16", 
      "https://app.dimensions.ai/details/publication/pub.1037587669"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-12-01T06:46", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/chapter/chapter_130.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-540-74126-8_16"
  }
]
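A record like the one above can be read with any JSON parser, since JSON-LD is plain JSON. The following is a minimal sketch that embeds a trimmed copy of the record (only the fields it reads) and pulls out the title, author names, and DOI. The function name `summarize` is hypothetical and not part of any SciGraph tooling.

```python
import json

# Trimmed copy of the record above, keeping only the fields used below.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Dundas", "givenName": "Joe"},
      {"familyName": "Binkowski", "givenName": "T. A."},
      {"familyName": "DasGupta", "givenName": "Bhaskar"},
      {"familyName": "Liang", "givenName": "Jie"}
    ],
    "name": "Topology Independent Protein Structural Alignment",
    "pagination": "171-182",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1037587669"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-540-74126-8_16"]}
    ]
  }
]
"""


def summarize(record_json):
    """Return title, author names, and DOI from a SciGraph-style JSON-LD string."""
    record = json.loads(record_json)[0]  # the payload is a one-element list
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record["author"]]
    # productId is a list of name/value entries; pick the one named "doi".
    doi = next(p["value"][0] for p in record["productId"] if p["name"] == "doi")
    return {"title": record["name"], "authors": authors, "doi": doi}


print(summarize(RECORD_JSON))
```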
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-74126-8_16'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-74126-8_16'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-74126-8_16'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-540-74126-8_16'
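The same content negotiation can be sketched in Python with the standard library. The code below mirrors the four curl commands by setting the `Accept` header. The helper names `build_request` and `fetch` are hypothetical, and whether the endpoint still answers is not something this page confirms, so the network call is wrapped in a function rather than run on import.

```python
from urllib.request import Request, urlopen

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-540-74126-8_16"

# Media types taken from the curl examples above.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking for the record in the given RDF serialization."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Download the record as text; raises urllib.error.URLError on network failure."""
    with urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    request = build_request("turtle")
    print(request.full_url, request.get_header("Accept"))
```

Calling `fetch("json-ld")` would be the equivalent of the first curl command, assuming the service is reachable.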


 

This table displays all metadata directly associated with this object as RDF triples.

140 TRIPLES      22 PREDICATES      77 URIs      70 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-540-74126-8_16 schema:about anzsrc-for:06
2 anzsrc-for:0601
3 schema:author Nada06a5be925497583286eaa7be94b5e
4 schema:datePublished 2007-01-01
5 schema:datePublishedReg 2007-01-01
6 schema:description Protein structural alignment is an indispensable tool used for many different studies in bioinformatics. Most structural alignment algorithms assume that the structural units of two similar proteins will align sequentially. This assumption may not be true for all similar proteins and as a result, proteins with similar structure but with permuted sequence arrangement are often missed. We present a solution to the problem based on an approximation algorithm that finds a sequence-order independent structural alignment that is close to optimal. We first exhaustively fragment two proteins and calculate a novel similarity score between all possible aligned fragment pairs. We treat each aligned fragment pair as a vertex on a graph. Vertices are connected by an edge if there are intra residue sequence conflicts. We regard the realignment of the fragment pairs as a special case of the maximum-weight independent set problem and solve this computationally intensive problem approximately by iteratively solving relaxations of an appropriate integer programming formulation. The resulting structural alignment is sequence order independent. Our method is insensitive to gaps, insertions/deletions, and circular permutations.
7 schema:editor Ned6db7e8c31c4a8f86310f3c492a3ffd
8 schema:genre chapter
9 schema:isAccessibleForFree false
10 schema:isPartOf N9c0eac5ce83642aeb555ba74cfcb5d58
11 schema:keywords algorithm
12 alignment
13 alignment algorithm
14 approximation algorithm
15 arrangement
16 assumption
17 bioinformatics
18 cases
19 circular permutation
20 conflict
21 deletion
22 different studies
23 edge
24 formulation
25 fragment pairs
26 gap
27 graph
28 independent set problem
29 indispensable tool
30 insertions/deletions
31 integer programming formulation
32 intensive problems
33 maximum weight independent set problem
34 method
35 novel similarity score
36 order
37 pairs
38 permutations
39 problem
40 programming formulation
41 protein
42 protein structural alignments
43 realignment
44 relaxation
45 results
46 scores
47 sequence arrangement
48 sequence conflicts
49 sequence order
50 set problem
51 similar proteins
52 similar structure
53 similarity scores
54 solution
55 special case
56 structural alignment
57 structural alignment algorithms
58 structural units
59 structure
60 study
61 tool
62 units
63 vertices
64 schema:name Topology Independent Protein Structural Alignment
65 schema:pagination 171-182
66 schema:productId N4668f976a80b4e489e5e9e9826be1be9
67 Nabbdf92af197473ba87395d607e5e095
68 schema:publisher N00a394f1f756490b8ef4688d336f106b
69 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037587669
70 https://doi.org/10.1007/978-3-540-74126-8_16
71 schema:sdDatePublished 2022-12-01T06:46
72 schema:sdLicense https://scigraph.springernature.com/explorer/license/
73 schema:sdPublisher N7506aa4e738549398e85de49c923e0a8
74 schema:url https://doi.org/10.1007/978-3-540-74126-8_16
75 sgo:license sg:explorer/license/
76 sgo:sdDataset chapters
77 rdf:type schema:Chapter
78 N00a394f1f756490b8ef4688d336f106b schema:name Springer Nature
79 rdf:type schema:Organisation
80 N0f0fea861e3f40dc9f4924e62e2b3c57 rdf:first sg:person.01277443336.48
81 rdf:rest rdf:nil
82 N1e3fab9fb134459496e494d049a39a5b rdf:first N7967d382220e401795c64cf3ea3dfd94
83 rdf:rest rdf:nil
84 N4668f976a80b4e489e5e9e9826be1be9 schema:name doi
85 schema:value 10.1007/978-3-540-74126-8_16
86 rdf:type schema:PropertyValue
87 N7506aa4e738549398e85de49c923e0a8 schema:name Springer Nature - SN SciGraph project
88 rdf:type schema:Organization
89 N7967d382220e401795c64cf3ea3dfd94 schema:familyName Hannenhalli
90 schema:givenName Sridhar
91 rdf:type schema:Person
92 N7b28ee9b39864b1f879faacdaddde5b0 schema:familyName Giancarlo
93 schema:givenName Raffaele
94 rdf:type schema:Person
95 N9c0eac5ce83642aeb555ba74cfcb5d58 schema:isbn 978-3-540-74125-1
96 978-3-540-74126-8
97 schema:name Algorithms in Bioinformatics
98 rdf:type schema:Book
99 Nabbdf92af197473ba87395d607e5e095 schema:name dimensions_id
100 schema:value pub.1037587669
101 rdf:type schema:PropertyValue
102 Nada06a5be925497583286eaa7be94b5e rdf:first sg:person.015617564507.74
103 rdf:rest Nfe15b512af7e4a20817517d8d91cb4a6
104 Ncb755bff3cac4f09aef8eea387a00d6a rdf:first sg:person.0763403270.10
105 rdf:rest N0f0fea861e3f40dc9f4924e62e2b3c57
106 Ned6db7e8c31c4a8f86310f3c492a3ffd rdf:first N7b28ee9b39864b1f879faacdaddde5b0
107 rdf:rest N1e3fab9fb134459496e494d049a39a5b
108 Nfe15b512af7e4a20817517d8d91cb4a6 rdf:first sg:person.014205644054.00
109 rdf:rest Ncb755bff3cac4f09aef8eea387a00d6a
110 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
111 schema:name Biological Sciences
112 rdf:type schema:DefinedTerm
113 anzsrc-for:0601 schema:inDefinedTermSet anzsrc-for:
114 schema:name Biochemistry and Cell Biology
115 rdf:type schema:DefinedTerm
116 sg:person.01277443336.48 schema:affiliation grid-institutes:grid.185648.6
117 schema:familyName Liang
118 schema:givenName Jie
119 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01277443336.48
120 rdf:type schema:Person
121 sg:person.014205644054.00 schema:affiliation grid-institutes:grid.185648.6
122 schema:familyName Binkowski
123 schema:givenName T. A.
124 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014205644054.00
125 rdf:type schema:Person
126 sg:person.015617564507.74 schema:affiliation grid-institutes:grid.185648.6
127 schema:familyName Dundas
128 schema:givenName Joe
129 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015617564507.74
130 rdf:type schema:Person
131 sg:person.0763403270.10 schema:affiliation grid-institutes:grid.185648.6
132 schema:familyName DasGupta
133 schema:givenName Bhaskar
134 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0763403270.10
135 rdf:type schema:Person
136 grid-institutes:grid.185648.6 schema:alternateName Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607–7052
137 Department of Computer Science, University of Illinois at Chicago, Chicago, Illinois 60607-7053
138 schema:name Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607–7052
139 Department of Computer Science, University of Illinois at Chicago, Chicago, Illinois 60607-7053
140 rdf:type schema:Organization
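In the table above, the author order is stored as an RDF collection: each blank node carries an `rdf:first` item and an `rdf:rest` pointer to the next node, ending at `rdf:nil`. As a minimal sketch, the traversal can be written with a plain dictionary holding the four list nodes copied from the table. The function name `walk_rdf_list` is hypothetical, and a real application would more likely use an RDF library.

```python
RDF_NIL = "rdf:nil"

# Blank-node list cells copied from the triples table: node -> (rdf:first, rdf:rest).
AUTHOR_LIST_NODES = {
    "Nada06a5be925497583286eaa7be94b5e": ("sg:person.015617564507.74", "Nfe15b512af7e4a20817517d8d91cb4a6"),
    "Nfe15b512af7e4a20817517d8d91cb4a6": ("sg:person.014205644054.00", "Ncb755bff3cac4f09aef8eea387a00d6a"),
    "Ncb755bff3cac4f09aef8eea387a00d6a": ("sg:person.0763403270.10", "N0f0fea861e3f40dc9f4924e62e2b3c57"),
    "N0f0fea861e3f40dc9f4924e62e2b3c57": ("sg:person.01277443336.48", RDF_NIL),
}

# Family names for each person identifier, also taken from the table.
FAMILY_NAMES = {
    "sg:person.015617564507.74": "Dundas",
    "sg:person.014205644054.00": "Binkowski",
    "sg:person.0763403270.10": "DasGupta",
    "sg:person.01277443336.48": "Liang",
}


def walk_rdf_list(head, nodes):
    """Follow rdf:first / rdf:rest links from head until rdf:nil, returning the items in order."""
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# The schema:author triple of the chapter points at this head node.
ordered_ids = walk_rdf_list("Nada06a5be925497583286eaa7be94b5e", AUTHOR_LIST_NODES)
print([FAMILY_NAMES[person] for person in ordered_ids])
```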
 



