YEARS

2006-2007

AUTHORS

Andrew S. Peek

TITLE

Support Vector Machine modeling software for improving RNAi efficacy prediction

ABSTRACT

DESCRIPTION (provided by applicant): Title: Support vector machines predict sequence ~ activity relationships in RNA interference.

Project Summary/Abstract: Support Vector Machines (SVMs) are a group of supervised machine learning algorithms that build classification or regression models on training data so that the models can predict information not seen during model construction. RNA interference (RNAi) is the property of small (20 to 23 bases) RNA sequences that, with the help of the RNA Induced Silencing Complex (RISC), enable the catalytic cleavage of target RNA sequences and the knockdown of the expression level of the target gene. Loading an RNAi sequence into an active RISC involves several steps, and the biochemical activities of RNAi sequences once in an active RISC vary with multiple factors. Finding the relevant biochemical features that associate with these quantifiable measures of RNAi can allow i) better predictive models of RNAi and RNAi-like (e.g. microRNA) activities and ii) a better understanding of the relevant biochemical properties, since presumably less relevant properties should not increase the predictive abilities of models containing those properties. We have developed a novel feature mapping method, referred to as Binary Base mapping, that improves the ability of an SVM to predict RNAi activities when compared to two previous methods, referred to as Unit Vector and N-gram mapping. Alone, the Binary Base SVM method has greater predictive accuracy than a recently published neural network machine learning method on the same training and testing data.

Several additional mapping methods can be envisioned, including methods that incorporate RNAi thermodynamics, secondary structure or measures of entropy. Whether these mappings of sequence to vector space for SVM model construction, alone or in combination, lead to better predictive models or a better understanding of RNAi biochemistry is unknown. We are requesting funding for the specific aims of: i) testing whether the Binary Base method can be used to further dissect and identify relevant biochemical features associated with RNAi activity, ii) analyzing what additional vector mapping methods lead to predictive models with increased accuracy or greater understanding of relevant biochemical properties, and iii) investigating the distribution of sites within and among target mRNA genes where predictive SVM models identify high versus low activity.

Project Narrative: Small non-coding RNAs (sncRNAs) have regulatory influence in human development and disease, and a better understanding of how these molecules function involves the development of predictive models. Machine learning methods such as SVMs are one way to develop predictive models for these small RNA sequences, and the incorporation of novel mapping methods in SVMs leads to model improvement. Finding and combining additional sequence mapping methods can lead to better predictive models for RNA interference activity as well as related processes such as microRNA activity, chemical modification of RNAi, and RNAi stability or toxicity, further improving the understanding of how sncRNAs function and how they might be regulated.
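
The abstract names Unit Vector and N-gram mappings without defining them. As a rough illustration only, the Python sketch below assumes that "unit vector" means a one-hot encoding of each base and that "N-gram" means counts of overlapping length-n subsequences. Both are common conventions, but neither is confirmed by this record. The function names, the base ordering and the example sequence are hypothetical. Binary Base mapping is not described here, so it is not sketched.

```python
from itertools import product

BASES = "ACGU"  # RNA alphabet; the ordering of vector positions is an assumption


def unit_vector_map(seq):
    """One-hot encode each base into 4 positions (assumed meaning of 'Unit Vector').

    Characters outside BASES map to four zeros.
    """
    vec = []
    for base in seq.upper():
        vec.extend(1 if base == b else 0 for b in BASES)
    return vec


def ngram_map(seq, n=2):
    """Count overlapping n-grams, one vector position per possible n-gram.

    N-grams containing characters outside BASES are ignored.
    """
    keys = ["".join(p) for p in product(BASES, repeat=n)]
    counts = dict.fromkeys(keys, 0)
    seq = seq.upper()
    for i in range(len(seq) - n + 1):
        gram = seq[i:i + n]
        if gram in counts:
            counts[gram] += 1
    return [counts[k] for k in keys]


if __name__ == "__main__":
    example = "AUGGCUAGCUAGGAUCCAUGC"  # hypothetical 21-base sequence
    print(len(unit_vector_map(example)))  # 4 positions per base
    print(len(ngram_map(example, n=2)))   # 4**2 possible 2-grams
```

Vectors of this kind would then serve as the inputs on which an SVM classification or regression model is trained.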

FUNDED PUBLICATIONS

  • Improving model predictions for RNA interference activities that use support vector machine regression by combining and filtering features.
  • Comparing artificial neural networks, general linear models and support vector machines in building predictive models for small interfering RNAs.


    20 TRIPLES      17 PREDICATES      21 URIs      9 LITERALS

        Subject                                  Predicate                    Object
    1   grants:cd9f5e5860d475957213c2c69966bd58  sg:abstract                  (applicant-provided description; same text as the ABSTRACT section above)
    2                                            sg:endYear                   2007
    3                                            sg:fundingAmount             97773.0
    4                                            sg:fundingCurrency           USD
    5                                            sg:hasContribution           contributions:d2a48679007ef8b5561fd495e2dcfdcb
    6                                            sg:hasFieldOfResearchCode    anzsrc-for:08
    7                                                                         anzsrc-for:0801
    8                                            sg:hasFundedPublication      articles:3a6bccff290c37630b507cb2b6002b6f
    9                                                                         articles:6bac7f0daa98c3f58517336609ef64aa
    10                                                                        articles:e4e0bfa42502d786abeb141b325e91bc
    11                                           sg:hasFundingOrganization    grid-institutes:grid.280785.0
    12                                           sg:hasRecipientOrganization  grid-institutes:grid.420360.3
    13                                           sg:language                  English
    14                                           sg:license                   http://scigraph.springernature.com/explorer/license/
    15                                           sg:scigraphId                cd9f5e5860d475957213c2c69966bd58
    16                                           sg:startYear                 2006
    17                                           sg:title                     Support Vector Machine modeling software for improving RNAi efficacy prediction
    18                                           sg:webpage                   http://projectreporter.nih.gov/project_info_description.cfm?aid=7157547
    19                                           rdf:type                     sg:Grant
    20                                           rdfs:label                   Grant: Support Vector Machine modeling software for improving RNAi efficacy prediction

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular JSON format for linked data.

    curl -H 'Accept: application/ld+json' 'http://scigraph.springernature.com/things/grants/cd9f5e5860d475957213c2c69966bd58'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'http://scigraph.springernature.com/things/grants/cd9f5e5860d475957213c2c69966bd58'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'http://scigraph.springernature.com/things/grants/cd9f5e5860d475957213c2c69966bd58'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'http://scigraph.springernature.com/things/grants/cd9f5e5860d475957213c2c69966bd58'
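
The four curl commands above differ only in their Accept header, so the same content negotiation can be sketched in Python with the standard library. The URL and media types are copied from those commands. The dictionary keys and function names are hypothetical labels chosen for this example, and nothing in this record confirms that the endpoint still responds.

```python
import urllib.request

# URL and media types are taken from the curl commands above.
GRANT_URL = (
    "http://scigraph.springernature.com/things/grants/"
    "cd9f5e5860d475957213c2c69966bd58"
)

# Keys are hypothetical short labels; values are the Accept media types.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=GRANT_URL):
    """Return a GET request asking for the given RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=GRANT_URL, timeout=30):
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch("turtle")` would correspond to the Turtle curl command. `build_request` is kept separate so that the headers can be inspected without any network access.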





