Protein Structure Prediction Based on Sequence Similarity


Ontology type: schema:Chapter     


Chapter Info

DATE

2009-06-29

AUTHORS

Lukasz Jaroszewski

ABSTRACT

The observation that similar protein sequences fold into similar three-dimensional structures provides a basis for the methods which predict structural features of a novel protein based on the similarity between its sequence and sequences of known protein structures. Similarity over entire sequence or large sequence fragment(s) enables prediction and modeling of entire structural domains while statistics derived from distributions of local features of known protein structures make it possible to predict such features in proteins with unknown structures. The accuracy of models of protein structures is sufficient for many practical purposes such as analysis of point mutation effects, enzymatic reactions, interaction interfaces of protein complexes, and active sites. Protein models are also used for phasing of crystallographic data and, in some cases, for drug design. By using models one can avoid the costly and time-consuming process of experimental structure determination. The purpose of this chapter is to give a practical review of the most popular protein structure prediction methods based on sequence similarity and to outline a practical approach to protein structure prediction. While the main focus of this chapter is on template-based protein structure prediction, it also provides references to other methods and programs which play an important role in protein structure prediction.

PAGES

129-156

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-1-59745-524-4_7

DOI

http://dx.doi.org/10.1007/978-1-59745-524-4_7

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1000887598

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/19623489



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/03", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Chemical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0399", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Other Chemical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0601", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biochemistry and Cell Biology", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Algorithms", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Computational Biology", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Computer Simulation", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Databases, Protein", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Models, Molecular", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Protein Folding", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Protein Structure, Tertiary", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Proteins", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Sequence Alignment", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Sequence Homology, Amino Acid", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Software", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Software Design", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Structural Homology, Protein", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "The Burnham Institute, La Jolla, CA, USA", 
          "id": "http://www.grid.ac/institutes/grid.479509.6", 
          "name": [
            "The Burnham Institute, La Jolla, CA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Jaroszewski", 
        "givenName": "Lukasz", 
        "id": "sg:person.01071146351.70", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01071146351.70"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2009-06-29", 
    "datePublishedReg": "2009-06-29", 
    "description": "The observation that similar protein sequences fold into similar three-dimensional structures provides a basis for the methods which predict structural features of a novel protein based on the similarity between its sequence and sequences of known protein structures. Similarity over entire sequence or large sequence fragment(s) enables prediction and modeling of entire structural domains while statistics derived from distributions of local features of known protein structures make it possible to predict such features in proteins with unknown structures. The accuracy of models of protein structures is sufficient for many practical purposes such as analysis of point mutation effects, enzymatic reactions, interaction interfaces of protein complexes, and active sites. Protein models are also used for phasing of crystallographic data and, in some cases, for drug design. By using models one can avoid the costly and time-consuming process of experimental structure determination. The purpose of this chapter is to give a practical review of the most popular protein structure prediction methods based on sequence similarity and to outline a practical approach to protein structure prediction. While the main focus of this chapter is on template-based protein structure prediction, it also provides references to other methods and programs which play an important role in protein structure prediction.", 
    "editor": [
      {
        "familyName": "Astakhov", 
        "givenName": "Vadim", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-1-59745-524-4_7", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-1-934115-63-3", 
        "978-1-59745-524-4"
      ], 
      "name": "Biomedical Informatics", 
      "type": "Book"
    }, 
    "keywords": [
      "protein structure prediction", 
      "local features", 
      "time-consuming process", 
      "template-based protein structure prediction", 
      "large sequences", 
      "interaction interface", 
      "structure prediction", 
      "such features", 
      "accuracy of models", 
      "protein structure prediction methods", 
      "structure prediction methods", 
      "prediction method", 
      "similar protein sequences", 
      "practical approach", 
      "main focus", 
      "features", 
      "experimental structure determination", 
      "protein sequences", 
      "method", 
      "similarity", 
      "entire sequence", 
      "prediction", 
      "modeling", 
      "accuracy", 
      "model", 
      "practical purposes", 
      "interface", 
      "sequence", 
      "protein structure", 
      "domain", 
      "unknown structure", 
      "design", 
      "structure", 
      "statistics", 
      "purpose", 
      "protein models", 
      "data", 
      "drug design", 
      "process", 
      "chapter", 
      "focus", 
      "program", 
      "three-dimensional structure", 
      "basis", 
      "distribution", 
      "analysis", 
      "phasing", 
      "reference", 
      "important role", 
      "observations", 
      "structural features", 
      "effect", 
      "enzymatic reaction", 
      "active site", 
      "cases", 
      "determination", 
      "practical review", 
      "approach", 
      "structural domains", 
      "reaction", 
      "structure determination", 
      "review", 
      "sequence similarity", 
      "role", 
      "mutation effects", 
      "protein complexes", 
      "sites", 
      "point mutation effect", 
      "complexes", 
      "crystallographic data", 
      "novel protein", 
      "protein", 
      "entire structural domains", 
      "popular protein structure prediction methods"
    ], 
    "name": "Protein Structure Prediction Based on Sequence Similarity", 
    "pagination": "129-156", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1000887598"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-1-59745-524-4_7"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "19623489"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-1-59745-524-4_7", 
      "https://app.dimensions.ai/details/publication/pub.1000887598"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-01-01T19:09", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/chapter/chapter_164.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-1-59745-524-4_7"
  }
]
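Once the record has been saved locally, its fields can be read with any JSON parser. The following is a minimal sketch, assuming a trimmed copy of the record above is available as a string; the helper names `product_ids` and `subject_names` are hypothetical and not part of any SciGraph tooling.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
SAMPLE = """
[
  {
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06",
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
        "name": "Biological Sciences",
        "type": "DefinedTerm"
      },
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
        "name": "Sequence Alignment",
        "type": "DefinedTerm"
      }
    ],
    "name": "Protein Structure Prediction Based on Sequence Similarity",
    "pagination": "129-156",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1000887598"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-1-59745-524-4_7"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["19623489"]}
    ]
  }
]
"""

MESH = "https://www.nlm.nih.gov/mesh/"


def product_ids(record):
    """Map each identifier name (doi, pubmed_id, ...) to its first value."""
    return {p["name"]: p["value"][0] for p in record.get("productId", [])}


def subject_names(record, term_set):
    """Return the names of 'about' terms that belong to the given term set."""
    return [
        term["name"]
        for term in record.get("about", [])
        if term.get("inDefinedTermSet") == term_set
    ]


# The top-level JSON value is a list holding a single record object.
record = json.loads(SAMPLE)[0]
ids = product_ids(record)
mesh_subjects = subject_names(record, MESH)
```

Here `ids["doi"]` holds the chapter DOI and `mesh_subjects` lists only the MeSH terms, leaving out the ANZSRC classification entries.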
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-1-59745-524-4_7'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-1-59745-524-4_7'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-1-59745-524-4_7'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-1-59745-524-4_7'
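The four curl commands differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. This is an illustrative sketch: the function name `build_request` and the short format keys are assumptions, and the block only constructs the request without sending it, since whether the endpoint still responds has not been verified here.

```python
from urllib.request import Request

SCIGRAPH_BASE = "https://scigraph.springernature.com/"

# Media types taken from the curl examples above; the short keys are illustrative.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build (but do not send) a GET request asking for one RDF serialization."""
    if fmt not in ACCEPT_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return Request(SCIGRAPH_BASE + record_id, headers={"Accept": ACCEPT_TYPES[fmt]})


req = build_request("pub.10.1007/978-1-59745-524-4_7", "turtle")
```

Passing `req` to `urllib.request.urlopen` would perform the actual download, mirroring the corresponding curl call.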


 

This table displays all metadata directly associated with this object as RDF triples.

198 TRIPLES      23 PREDICATES      114 URIs      105 LITERALS      21 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-1-59745-524-4_7 schema:about N04ff633f13d64bf396ccad7ff14c862a
2 N2877b7ff145145d1ae4fadd82b4244d6
3 N84022b1ea7274e3faf7301084e241019
4 N868d675ff1ca46aeb1c1a8a409fdd2ff
5 N94b75e7db9c94cb1b88f10a4ac3657eb
6 N979686cd0c804ad584269f207ecdb883
7 N98ed38f87fa54584a3bceb218f2234ef
8 N9ccbd10f0ba44e738fc4a42d4dfd62a7
9 Naa523b95f98541e78dc548d31ea9a54e
10 Nc2d2917c6a0b480d8d4b42dfc08bc9b9
11 Ncd355b3a7daa48c6821dfe7e9fcf6861
12 Ne22d3e07d5734081bc0fc2d9e602db30
13 Ne6677d4deb7040d283720d9a6f41b329
14 anzsrc-for:03
15 anzsrc-for:0399
16 anzsrc-for:06
17 anzsrc-for:0601
18 schema:author N39a37799f29943918da805597df54a19
19 schema:datePublished 2009-06-29
20 schema:datePublishedReg 2009-06-29
21 schema:description The observation that similar protein sequences fold into similar three-dimensional structures provides a basis for the methods which predict structural features of a novel protein based on the similarity between its sequence and sequences of known protein structures. Similarity over entire sequence or large sequence fragment(s) enables prediction and modeling of entire structural domains while statistics derived from distributions of local features of known protein structures make it possible to predict such features in proteins with unknown structures. The accuracy of models of protein structures is sufficient for many practical purposes such as analysis of point mutation effects, enzymatic reactions, interaction interfaces of protein complexes, and active sites. Protein models are also used for phasing of crystallographic data and, in some cases, for drug design. By using models one can avoid the costly and time-consuming process of experimental structure determination. The purpose of this chapter is to give a practical review of the most popular protein structure prediction methods based on sequence similarity and to outline a practical approach to protein structure prediction. While the main focus of this chapter is on template-based protein structure prediction, it also provides references to other methods and programs which play an important role in protein structure prediction.
22 schema:editor N6138555baee54b13ae0401313bffa623
23 schema:genre chapter
24 schema:inLanguage en
25 schema:isAccessibleForFree false
26 schema:isPartOf N501b5bc5e0584c4ca0717ff67ba06e27
27 schema:keywords accuracy
28 accuracy of models
29 active site
30 analysis
31 approach
32 basis
33 cases
34 chapter
35 complexes
36 crystallographic data
37 data
38 design
39 determination
40 distribution
41 domain
42 drug design
43 effect
44 entire sequence
45 entire structural domains
46 enzymatic reaction
47 experimental structure determination
48 features
49 focus
50 important role
51 interaction interface
52 interface
53 large sequences
54 local features
55 main focus
56 method
57 model
58 modeling
59 mutation effects
60 novel protein
61 observations
62 phasing
63 point mutation effect
64 popular protein structure prediction methods
65 practical approach
66 practical purposes
67 practical review
68 prediction
69 prediction method
70 process
71 program
72 protein
73 protein complexes
74 protein models
75 protein sequences
76 protein structure
77 protein structure prediction
78 protein structure prediction methods
79 purpose
80 reaction
81 reference
82 review
83 role
84 sequence
85 sequence similarity
86 similar protein sequences
87 similarity
88 sites
89 statistics
90 structural domains
91 structural features
92 structure
93 structure determination
94 structure prediction
95 structure prediction methods
96 such features
97 template-based protein structure prediction
98 three-dimensional structure
99 time-consuming process
100 unknown structure
101 schema:name Protein Structure Prediction Based on Sequence Similarity
102 schema:pagination 129-156
103 schema:productId N179e98b4051c4f87956ad0af3bb8911e
104 N962b14c2afe844c2b36a87446ceded12
105 Ne1a7d298560a4ff68aa84c370eda95bf
106 schema:publisher Nf50c9a98ea7141f9b81c965c85a4dd98
107 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000887598
108 https://doi.org/10.1007/978-1-59745-524-4_7
109 schema:sdDatePublished 2022-01-01T19:09
110 schema:sdLicense https://scigraph.springernature.com/explorer/license/
111 schema:sdPublisher Nbf14c7126f1b48a6a33abf2874a33849
112 schema:url https://doi.org/10.1007/978-1-59745-524-4_7
113 sgo:license sg:explorer/license/
114 sgo:sdDataset chapters
115 rdf:type schema:Chapter
116 N04ff633f13d64bf396ccad7ff14c862a schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
117 schema:name Structural Homology, Protein
118 rdf:type schema:DefinedTerm
119 N179e98b4051c4f87956ad0af3bb8911e schema:name dimensions_id
120 schema:value pub.1000887598
121 rdf:type schema:PropertyValue
122 N1acf7f9a353842878d19a8f99853cab3 schema:familyName Astakhov
123 schema:givenName Vadim
124 rdf:type schema:Person
125 N2877b7ff145145d1ae4fadd82b4244d6 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
126 schema:name Sequence Homology, Amino Acid
127 rdf:type schema:DefinedTerm
128 N39a37799f29943918da805597df54a19 rdf:first sg:person.01071146351.70
129 rdf:rest rdf:nil
130 N501b5bc5e0584c4ca0717ff67ba06e27 schema:isbn 978-1-59745-524-4
131 978-1-934115-63-3
132 schema:name Biomedical Informatics
133 rdf:type schema:Book
134 N6138555baee54b13ae0401313bffa623 rdf:first N1acf7f9a353842878d19a8f99853cab3
135 rdf:rest rdf:nil
136 N84022b1ea7274e3faf7301084e241019 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
137 schema:name Databases, Protein
138 rdf:type schema:DefinedTerm
139 N868d675ff1ca46aeb1c1a8a409fdd2ff schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
140 schema:name Models, Molecular
141 rdf:type schema:DefinedTerm
142 N94b75e7db9c94cb1b88f10a4ac3657eb schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
143 schema:name Protein Folding
144 rdf:type schema:DefinedTerm
145 N962b14c2afe844c2b36a87446ceded12 schema:name pubmed_id
146 schema:value 19623489
147 rdf:type schema:PropertyValue
148 N979686cd0c804ad584269f207ecdb883 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
149 schema:name Computer Simulation
150 rdf:type schema:DefinedTerm
151 N98ed38f87fa54584a3bceb218f2234ef schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
152 schema:name Protein Structure, Tertiary
153 rdf:type schema:DefinedTerm
154 N9ccbd10f0ba44e738fc4a42d4dfd62a7 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
155 schema:name Proteins
156 rdf:type schema:DefinedTerm
157 Naa523b95f98541e78dc548d31ea9a54e schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
158 schema:name Computational Biology
159 rdf:type schema:DefinedTerm
160 Nbf14c7126f1b48a6a33abf2874a33849 schema:name Springer Nature - SN SciGraph project
161 rdf:type schema:Organization
162 Nc2d2917c6a0b480d8d4b42dfc08bc9b9 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
163 schema:name Sequence Alignment
164 rdf:type schema:DefinedTerm
165 Ncd355b3a7daa48c6821dfe7e9fcf6861 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
166 schema:name Software Design
167 rdf:type schema:DefinedTerm
168 Ne1a7d298560a4ff68aa84c370eda95bf schema:name doi
169 schema:value 10.1007/978-1-59745-524-4_7
170 rdf:type schema:PropertyValue
171 Ne22d3e07d5734081bc0fc2d9e602db30 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
172 schema:name Software
173 rdf:type schema:DefinedTerm
174 Ne6677d4deb7040d283720d9a6f41b329 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
175 schema:name Algorithms
176 rdf:type schema:DefinedTerm
177 Nf50c9a98ea7141f9b81c965c85a4dd98 schema:name Springer Nature
178 rdf:type schema:Organisation
179 anzsrc-for:03 schema:inDefinedTermSet anzsrc-for:
180 schema:name Chemical Sciences
181 rdf:type schema:DefinedTerm
182 anzsrc-for:0399 schema:inDefinedTermSet anzsrc-for:
183 schema:name Other Chemical Sciences
184 rdf:type schema:DefinedTerm
185 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
186 schema:name Biological Sciences
187 rdf:type schema:DefinedTerm
188 anzsrc-for:0601 schema:inDefinedTermSet anzsrc-for:
189 schema:name Biochemistry and Cell Biology
190 rdf:type schema:DefinedTerm
191 sg:person.01071146351.70 schema:affiliation grid-institutes:grid.479509.6
192 schema:familyName Jaroszewski
193 schema:givenName Lukasz
194 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01071146351.70
195 rdf:type schema:Person
196 grid-institutes:grid.479509.6 schema:alternateName The Burnham Institute, La Jolla, CA, USA
197 schema:name The Burnham Institute, La Jolla, CA, USA
198 rdf:type schema:Organization
 



