YEARS

1988-1991

AUTHORS

Alan S Lapedes

TITLE

GENETIC DATABASES--APPLICATIONS FOR MACHINE LEARNING

ABSTRACT

Recent advances in neural network theory, in which the present authors have played a significant role, have resulted in machine learning algorithms of great power. In an initial investigation, the authors have applied these algorithms to detecting and exploiting pattern regularities in DNA and in amino acid sequences. In the two situations considered thus far (determining whether or not a fragment of DNA codes for a protein, and predicting protein secondary structure from amino acid sequence), the results of the neural net analysis technique equal or exceed those of conventional methods. We propose to investigate these two problems intensively, with the goal of verifying and expanding our initial results, particularly to DNA sequences of other species, especially humans. We plan to expand our investigations to include pattern recognition searches for promoter/terminator sequences, intron/exon splice junctions, and other regulatory signals. Methods for the sequence-to-structure problem will be extended to include new results in energy minimization techniques for analogue models that contain numerous local minima. Different network architectures and different representations for the data will be investigated. When a neural net method exceeds a conventional method in accuracy, we plan to analyze the network connections with the goal of understanding what rules the network developed (by virtue of the learning algorithm) that yielded the increased accuracy. Other machine learning algorithms, such as "classifier systems," will also be applied, as well as new approaches to information-theoretic constructions of default hierarchies.

FUNDED PUBLICATIONS

  • Predicting protein secondary structure using neural net and statistical methods.
  • Determination of eukaryotic protein coding regions using neural networks and information theory.



    17 TRIPLES      15 PREDICATES      18 URIs      7 LITERALS

    #   Subject                                  Predicate                    Object
    1   grants:638192350036a6f4415c637985088211  sg:abstract                  (abstract text, identical to the ABSTRACT section above)
    2   grants:638192350036a6f4415c637985088211  sg:endYear                   1991
    3   grants:638192350036a6f4415c637985088211  sg:hasContribution           contributions:4ab574d28734390491ae039fba99f894
    4   grants:638192350036a6f4415c637985088211  sg:hasFieldOfResearchCode    anzsrc-for:08
    5   grants:638192350036a6f4415c637985088211  sg:hasFieldOfResearchCode    anzsrc-for:0801
    6   grants:638192350036a6f4415c637985088211  sg:hasFundedPublication      articles:20aa066948f0c1724e4d2ac9633e762c
    7   grants:638192350036a6f4415c637985088211  sg:hasFundedPublication      articles:c9a396c5a451f8103892ca2139beeac6
    8   grants:638192350036a6f4415c637985088211  sg:hasFundingOrganization    grid-institutes:grid.280785.0
    9   grants:638192350036a6f4415c637985088211  sg:hasRecipientOrganization  grid-institutes:grid.148313.c
    10  grants:638192350036a6f4415c637985088211  sg:language                  English
    11  grants:638192350036a6f4415c637985088211  sg:license                   http://scigraph.springernature.com/explorer/license/
    12  grants:638192350036a6f4415c637985088211  sg:scigraphId                638192350036a6f4415c637985088211
    13  grants:638192350036a6f4415c637985088211  sg:startYear                 1988
    14  grants:638192350036a6f4415c637985088211  sg:title                     GENETIC DATABASES--APPLICATIONS FOR MACHINE LEARNING
    15  grants:638192350036a6f4415c637985088211  sg:webpage                   http://projectreporter.nih.gov/project_info_description.cfm?aid=3298717
    16  grants:638192350036a6f4415c637985088211  rdf:type                     sg:Grant
    17  grants:638192350036a6f4415c637985088211  rdfs:label                   Grant: GENETIC DATABASES--APPLICATIONS FOR MACHINE LEARNING

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular JSON format for linked data.

    curl -H 'Accept: application/ld+json' 'http://scigraph.springernature.com/things/grants/638192350036a6f4415c637985088211'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'http://scigraph.springernature.com/things/grants/638192350036a6f4415c637985088211'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'http://scigraph.springernature.com/things/grants/638192350036a6f4415c637985088211'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'http://scigraph.springernature.com/things/grants/638192350036a6f4415c637985088211'
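
    The four curl commands above differ only in the Accept header, so the same request can be sketched in Python with the standard library. This is a minimal sketch, not an official client: the URL and MIME types are copied from the commands above, while the names GRANT_URL, ACCEPT_TYPES, build_request and fetch, and the short format labels used as dictionary keys, are assumptions made for illustration. It also assumes the endpoint still answers as it did when this page was captured.

```python
import urllib.request

# URL copied from the curl commands above; the service may no longer respond.
GRANT_URL = (
    "http://scigraph.springernature.com/things/grants/"
    "638192350036a6f4415c637985088211"
)

# Keys are illustrative labels (an assumption); values are the MIME types
# used in the curl commands above.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=GRANT_URL):
    """Return a GET request whose Accept header selects the serialization."""
    if fmt not in ACCEPT_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=GRANT_URL, timeout=30):
    """Send the request and return the body as text (requires network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call (not executed here because it needs the live service):
# turtle_text = fetch("turtle")
```

    Keeping build_request separate from fetch makes it possible to inspect the header that would be sent without contacting the server.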





