New Computational Methods for Data-driven Protein Structure Prediction


Ontology type: schema:MonetaryGrant     


Grant Info

YEARS

2010-2019

FUNDING AMOUNT

2539300 USD

ABSTRACT

Proteins and their interactions play fundamental roles in all biological processes. Accurate description of protein structure and interactions is a fundamental step towards understanding biological life and is highly relevant to the development of therapeutics and drugs. However, there is a large gap between the number of available protein sequences and the number of proteins (complexes) with solved structures and accurate interaction description, which has to be filled by computational prediction. The long-term goal of this project is to apply statistical machine learning and optimization algorithms to understand protein sequence-structure-function relationships by analyzing low- and high-throughput sequence, structure and functional data and to develop algorithms for structure and functional prediction. Our hypothesis is that by developing sophisticated algorithms to take advantage of the growing sequence and structure data, we can model sequence-structure relationships much more accurately and significantly improve structure and functional prediction, in particular for this proposal, residue (atomic) interaction strength prediction and remote homology detection. This project has produced a few CASP-winning, widely used data-driven algorithms and a web server (http://raptorx.uchicago.edu) for monomer protein modeling. This renewal will not only further develop machine learning algorithms (especially Deep Learning and probabilistic graphical models) for monomer proteins, but also branch out to protein interactions (complexes).
The specific aims are: (1) develop novel structure learning algorithms to predict inter-residue contacts and coevolved residues; (2) develop context-specific, coevolution-based, and distance-dependent statistical potentials using a new machine learning model called Deep Conditional (Markov) Neural Fields (DeepCNF); (3) develop Markov Random Fields (MRF) and DeepCNF methods for remote protein (interface/complex) homology detection and fold recognition to make use of long-range residue interactions predicted by the first two aims. This renewal will lead to further understanding and new models of protein sequence-structure-function relationships and yield publicly available software and servers for automated, accurate, quantitative analysis for a wide range of proteins and their interactions. The impact will be multiplied by tens of thousands of worldwide users employing the resulting software/servers to study a wide variety of proteins and interactions relevant to basic biological research and human diseases, in both low- and high-throughput experiments.

URL

http://projectreporter.nih.gov/project_info_description.cfm?aid=9545787

JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/2202", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/2201", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/2208", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "type": "DefinedTerm"
      }
    ], 
    "amount": {
      "currency": "USD", 
      "type": "MonetaryAmount", 
      "value": "2539300"
    }, 
    "description": "Proteins and their interactions play fundamental roles in all biological processes. Accurate description of protein structure and interactions is a fundamental step towards understanding biological life and highly relevant in the development of therapeutics and drugs. However, there is a large gap between the number of available protein sequences and the number of proteins (complexes) with solved structures and accurate interaction description, which has to be filled by computational prediction. The long-term goal of this project is to apply statistical machine learning and optimization algorithms to understand protein sequence-structure-function relationship by analyzing low- and high-throughput sequence, structure and functional data and to develop algorithms for structure and functional prediction. Our hypothesis is that by developing sophisticated algorithms to take advantage of the growing sequence and structure data, we can model sequence-structure relationship much more accurately and significantly improve structure and functional prediction, in particular for this proposal, residue (atomic) interaction strength prediction and remote homology detection. This project has produced a few CASP-winning, widely-used data-driven algorithms and web server (http://raptorx.uchicago.edu) for monomer protein modeling. This renewal will not only further develop machine learning algorithms (especially Deep Learning and probabilistic graphical models) for monomer proteins, but also branch out to protein interactions (complexes). 
The specific aims are: (1) develop novel structure learning algorithms to predict inter-reside contacts and coevolved residues; (2) develop context-specific, coevolution-based, and distance-dependent statistical potentials using a new machine learning model called Deep Conditional (Markov) Neural Fields (DeepCNF); (3) develop Markov Random Fields (MRF) and DeepCNF methods for remote protein (interface/complex) homology detection and fold recognition to make use of long-range residue interaction predicted by the first two aims. This renewal will lead to further understanding and new models of protein sequence-structure-function relationship and yield publicly available software and servers for automated, accurate, quantitative analysis for a wide range of proteins and their interactions. The impact will be multiplied by tens of thousands of worldwide users employing the resulting software/servers to study a wide variety of proteins and interactions relevant to basic biological research and human diseases, in both low- and high-throughput experiments.", 
    "endDate": "2019-08-31T00:00:00Z", 
    "funder": {
      "id": "https://www.grid.ac/institutes/grid.280785.0", 
      "type": "Organization"
    }, 
    "id": "sg:grant.2520623", 
    "identifier": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "2520623"
        ]
      }, 
      {
        "name": "nih_id", 
        "type": "PropertyValue", 
        "value": [
          "R01GM089753"
        ]
      }
    ], 
    "inLanguage": [
      "en"
    ], 
    "keywords": [
      "software/servers", 
      "DeepCNF", 
      "functional data", 
      "Deep Conditional", 
      "sequence-structure relationship", 
      "sophisticated algorithms", 
      "advantages", 
      "specific aim", 
      "project", 
      "development", 
      "aim", 
      "monomer protein modeling", 
      "novel structure", 
      "worldwide users", 
      "monomer protein", 
      "web server", 
      "renewal", 
      "data", 
      "interaction strength prediction", 
      "protein structure prediction", 
      "long-term goal", 
      "use", 
      "high-throughput sequence", 
      "interaction", 
      "wide range", 
      "quantitative analysis", 
      "contact", 
      "low-", 
      "optimization algorithms", 
      "structure", 
      "model", 
      "new computational method", 
      "distance-dependent statistical potentials", 
      "protein", 
      "Markov Random Fields", 
      "probabilistic graphical models", 
      "human disease", 
      "new model", 
      "Markov", 
      "remote proteins", 
      "server", 
      "fundamental role", 
      "Neural Fields", 
      "biological processes", 
      "residues", 
      "functional prediction", 
      "interface/complex", 
      "therapeutics", 
      "hypothesis", 
      "homology detection", 
      "wide variety", 
      "complexes", 
      "impact", 
      "remote homology detection", 
      "drugs", 
      "algorithms", 
      "protein sequence-structure", 
      "few CASP", 
      "deep learning", 
      "new machine", 
      "long-range residue interaction", 
      "statistical machine learning", 
      "function relationships", 
      "number", 
      "available software", 
      "available protein sequences", 
      "context", 
      "biological life", 
      "protein structure", 
      "proposal", 
      "fundamental step", 
      "further understanding", 
      "computational prediction", 
      "protein interactions", 
      "basic biological research", 
      "high-throughput experiments", 
      "inter", 
      "recognition", 
      "coevolution", 
      "large gap", 
      "machine", 
      "thousands", 
      "structure data", 
      "accurate description", 
      "accurate interaction description", 
      "DeepCNF methods", 
      "sequence", 
      "tens"
    ], 
    "name": "New Computational Methods for Data-driven Protein Structure Prediction", 
    "recipient": [
      {
        "id": "https://www.grid.ac/institutes/grid.287491.1", 
        "type": "Organization"
      }, 
      {
        "affiliation": {
          "id": "https://www.grid.ac/institutes/grid.287491.1", 
          "name": "TOYOTA TECHNOLOGICAL INSTITUTE / CHICAGO", 
          "type": "Organization"
        }, 
        "familyName": "XU", 
        "givenName": "JINBO", 
        "id": "sg:person.0603660076.01", 
        "type": "Person"
      }, 
      {
        "member": "sg:person.0603660076.01", 
        "roleName": "PI", 
        "type": "Role"
      }
    ], 
    "sameAs": [
      "https://app.dimensions.ai/details/grant/grant.2520623"
    ], 
    "sdDataset": "grants", 
    "sdDatePublished": "2019-03-07T12:17", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com.uberresearch.data.processor/core_data/20181219_192338/projects/base/nih_projects_9.xml.gz", 
    "startDate": "2010-05-14T00:00:00Z", 
    "type": "MonetaryGrant", 
    "url": "http://projectreporter.nih.gov/project_info_description.cfm?aid=9545787"
  }
]
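
As a minimal sketch of how the JSON-LD record above can be read with a plain JSON parser, the Python snippet below pulls out the grant id, NIH id, funding amount, years, and principal investigator. The embedded string is an abbreviated copy of the record (only the fields used are kept), and `summarize_grant` is a hypothetical helper name, not part of any SciGraph tooling.

```python
import json

# Abbreviated copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "amount": {"currency": "USD", "type": "MonetaryAmount", "value": "2539300"},
    "endDate": "2019-08-31T00:00:00Z",
    "id": "sg:grant.2520623",
    "identifier": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["2520623"]},
      {"name": "nih_id", "type": "PropertyValue", "value": ["R01GM089753"]}
    ],
    "recipient": [
      {"id": "https://www.grid.ac/institutes/grid.287491.1", "type": "Organization"},
      {"familyName": "XU", "givenName": "JINBO",
       "id": "sg:person.0603660076.01", "type": "Person"},
      {"member": "sg:person.0603660076.01", "roleName": "PI", "type": "Role"}
    ],
    "startDate": "2010-05-14T00:00:00Z",
    "type": "MonetaryGrant"
  }
]
"""


def summarize_grant(record):
    """Return a flat summary dict for one grant record (hypothetical helper)."""
    # Each identifier entry carries a name and a one-element list of values.
    identifiers = {
        item["name"]: item["value"][0] for item in record.get("identifier", [])
    }
    recipients = record.get("recipient", [])
    people = {r["id"]: r for r in recipients if r.get("type") == "Person"}
    # A Role entry points at a Person by id through its "member" field.
    pi_names = [
        people[r["member"]]["givenName"] + " " + people[r["member"]]["familyName"]
        for r in recipients
        if r.get("type") == "Role"
        and r.get("roleName") == "PI"
        and r.get("member") in people
    ]
    return {
        "id": record["id"],
        "nih_id": identifiers.get("nih_id"),
        "amount": int(record["amount"]["value"]),
        "currency": record["amount"]["currency"],
        "years": (record["startDate"][:4], record["endDate"][:4]),
        "principal_investigators": pi_names,
    }


summary = summarize_grant(json.loads(RECORD_JSON)[0])
print(summary)
```

Treating the record as ordinary JSON is enough for field lookups like these; a full JSON-LD processor would only be needed to expand the `@context` into complete IRIs.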
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/grant.2520623'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/grant.2520623'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/grant.2520623'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/grant.2520623'
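
The same content negotiation can be sketched in Python with only the standard library. The media types and URL are copied from the curl commands above; the function names `build_request` and `fetch` and the short format keys are assumptions made for this example, and whether the endpoint still responds has not been checked here.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above; the short keys are
# illustrative labels chosen for this sketch.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build (but do not send) a GET request for one record in the given format."""
    return urllib.request.Request(
        BASE_URL + record_id, headers={"Accept": MEDIA_TYPES[fmt]}
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the body as text. Requires network access."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request is side-effect free, so it can be inspected offline.
request = build_request("grant.2520623", "turtle")
print(request.full_url, request.get_header("Accept"))
```

Calling `fetch("grant.2520623", "nt")` would then correspond to the N-Triples curl command, with the serialization selected purely by the `Accept` header.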


 

This table displays all metadata directly associated with this object as RDF triples.

138 TRIPLES      19 PREDICATES      112 URIs      102 LITERALS      5 BLANK NODES

Subject Predicate Object
1 sg:grant.2520623 schema:about anzsrc-for:2201
2 anzsrc-for:2202
3 anzsrc-for:2208
4 schema:amount N62038ffa7bbe4214bd237ba583675568
5 schema:description Proteins and their interactions play fundamental roles in all biological processes. Accurate description of protein structure and interactions is a fundamental step towards understanding biological life and highly relevant in the development of therapeutics and drugs. However, there is a large gap between the number of available protein sequences and the number of proteins (complexes) with solved structures and accurate interaction description, which has to be filled by computational prediction. The long-term goal of this project is to apply statistical machine learning and optimization algorithms to understand protein sequence-structure-function relationship by analyzing low- and high-throughput sequence, structure and functional data and to develop algorithms for structure and functional prediction. Our hypothesis is that by developing sophisticated algorithms to take advantage of the growing sequence and structure data, we can model sequence-structure relationship much more accurately and significantly improve structure and functional prediction, in particular for this proposal, residue (atomic) interaction strength prediction and remote homology detection. This project has produced a few CASP-winning, widely-used data-driven algorithms and web server (http://raptorx.uchicago.edu) for monomer protein modeling. This renewal will not only further develop machine learning algorithms (especially Deep Learning and probabilistic graphical models) for monomer proteins, but also branch out to protein interactions (complexes). 
The specific aims are: (1) develop novel structure learning algorithms to predict inter-reside contacts and coevolved residues; (2) develop context-specific, coevolution-based, and distance-dependent statistical potentials using a new machine learning model called Deep Conditional (Markov) Neural Fields (DeepCNF); (3) develop Markov Random Fields (MRF) and DeepCNF methods for remote protein (interface/complex) homology detection and fold recognition to make use of long-range residue interaction predicted by the first two aims. This renewal will lead to further understanding and new models of protein sequence-structure-function relationship and yield publicly available software and servers for automated, accurate, quantitative analysis for a wide range of proteins and their interactions. The impact will be multiplied by tens of thousands of worldwide users employing the resulting software/servers to study a wide variety of proteins and interactions relevant to basic biological research and human diseases, in both low- and high-throughput experiments.
6 schema:endDate 2019-08-31T00:00:00Z
7 schema:funder https://www.grid.ac/institutes/grid.280785.0
8 schema:identifier N048485e959bf4d52b50f184acf126e3c
9 Nd1bc0cb8136d4791a1b387a9d5098aba
10 schema:inLanguage en
11 schema:keywords Deep Conditional
12 DeepCNF
13 DeepCNF methods
14 Markov
15 Markov Random Fields
16 Neural Fields
17 accurate description
18 accurate interaction description
19 advantages
20 aim
21 algorithms
22 available protein sequences
23 available software
24 basic biological research
25 biological life
26 biological processes
27 coevolution
28 complexes
29 computational prediction
30 contact
31 context
32 data
33 deep learning
34 development
35 distance-dependent statistical potentials
36 drugs
37 few CASP
38 function relationships
39 functional data
40 functional prediction
41 fundamental role
42 fundamental step
43 further understanding
44 high-throughput experiments
45 high-throughput sequence
46 homology detection
47 human disease
48 hypothesis
49 impact
50 inter
51 interaction
52 interaction strength prediction
53 interface/complex
54 large gap
55 long-range residue interaction
56 long-term goal
57 low-
58 machine
59 model
60 monomer protein
61 monomer protein modeling
62 new computational method
63 new machine
64 new model
65 novel structure
66 number
67 optimization algorithms
68 probabilistic graphical models
69 project
70 proposal
71 protein
72 protein interactions
73 protein sequence-structure
74 protein structure
75 protein structure prediction
76 quantitative analysis
77 recognition
78 remote homology detection
79 remote proteins
80 renewal
81 residues
82 sequence
83 sequence-structure relationship
84 server
85 software/servers
86 sophisticated algorithms
87 specific aim
88 statistical machine learning
89 structure
90 structure data
91 tens
92 therapeutics
93 thousands
94 use
95 web server
96 wide range
97 wide variety
98 worldwide users
99 schema:name New Computational Methods for Data-driven Protein Structure Prediction
100 schema:recipient Nf914ccb57b8e4d018be1c32991ecb80a
101 sg:person.0603660076.01
102 https://www.grid.ac/institutes/grid.287491.1
103 schema:sameAs https://app.dimensions.ai/details/grant/grant.2520623
104 schema:sdDatePublished 2019-03-07T12:17
105 schema:sdLicense https://scigraph.springernature.com/explorer/license/
106 schema:sdPublisher N089b314f3e324e19b9c9b5e180ee201f
107 schema:startDate 2010-05-14T00:00:00Z
108 schema:url http://projectreporter.nih.gov/project_info_description.cfm?aid=9545787
109 sgo:license sg:explorer/license/
110 sgo:sdDataset grants
111 rdf:type schema:MonetaryGrant
112 N048485e959bf4d52b50f184acf126e3c schema:name nih_id
113 schema:value R01GM089753
114 rdf:type schema:PropertyValue
115 N089b314f3e324e19b9c9b5e180ee201f schema:name Springer Nature - SN SciGraph project
116 rdf:type schema:Organization
117 N62038ffa7bbe4214bd237ba583675568 schema:currency USD
118 schema:value 2539300
119 rdf:type schema:MonetaryAmount
120 Nd1bc0cb8136d4791a1b387a9d5098aba schema:name dimensions_id
121 schema:value 2520623
122 rdf:type schema:PropertyValue
123 Nf914ccb57b8e4d018be1c32991ecb80a schema:member sg:person.0603660076.01
124 schema:roleName PI
125 rdf:type schema:Role
126 anzsrc-for:2201 schema:inDefinedTermSet anzsrc-for:
127 rdf:type schema:DefinedTerm
128 anzsrc-for:2202 schema:inDefinedTermSet anzsrc-for:
129 rdf:type schema:DefinedTerm
130 anzsrc-for:2208 schema:inDefinedTermSet anzsrc-for:
131 rdf:type schema:DefinedTerm
132 sg:person.0603660076.01 schema:affiliation https://www.grid.ac/institutes/grid.287491.1
133 schema:familyName XU
134 schema:givenName JINBO
135 rdf:type schema:Person
136 https://www.grid.ac/institutes/grid.280785.0 rdf:type schema:Organization
137 https://www.grid.ac/institutes/grid.287491.1 schema:name TOYOTA TECHNOLOGICAL INSTITUTE / CHICAGO
138 rdf:type schema:Organization
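
Summary counts like the ones shown above the table (triples, predicates, literals, blank nodes) can be derived from the N-Triples serialization. The sketch below is a simplified line splitter, not a full N-Triples parser, and `triple_stats` is a hypothetical helper. The sample is a hand-written rendering of five table rows (117 to 119, 137, 138), assuming the usual expansions of the `schema:` and `rdf:` prefixes. Literals are counted per occurrence, which may differ from how the page counts them.

```python
# Hand-written N-Triples rendering of five rows from the table above.
# Assumption: schema: expands to http://schema.org/ and rdf: to the
# standard RDF syntax namespace.
SAMPLE_NT = """\
_:N62038ffa7bbe4214bd237ba583675568 <http://schema.org/currency> "USD" .
_:N62038ffa7bbe4214bd237ba583675568 <http://schema.org/value> "2539300" .
_:N62038ffa7bbe4214bd237ba583675568 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/MonetaryAmount> .
<https://www.grid.ac/institutes/grid.287491.1> <http://schema.org/name> "TOYOTA TECHNOLOGICAL INSTITUTE / CHICAGO" .
<https://www.grid.ac/institutes/grid.287491.1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/Organization> .
"""


def triple_stats(nt_text):
    """Count triples, distinct predicates, literal objects, and blank nodes."""
    triples = []
    for line in nt_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Subject and predicate never contain whitespace; the object may.
        subject, predicate, rest = line.split(None, 2)
        obj = rest.rstrip()
        if obj.endswith("."):
            obj = obj[:-1].rstrip()  # drop the statement terminator
        triples.append((subject, predicate, obj))
    terms = {term for s, _, o in triples for term in (s, o)}
    return {
        "triples": len(triples),
        "predicates": len({p for _, p, _ in triples}),
        "literals": sum(1 for _, _, o in triples if o.startswith('"')),
        "blank_nodes": len({t for t in terms if t.startswith("_:")}),
    }


stats = triple_stats(SAMPLE_NT)
print(stats)
```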
 



