YEARS

2011-2017

AUTHORS

Hong Yu

TITLE

Exploring Natural Language Processing, Image Processing, Machine Learning, and Us

ABSTRACT

DESCRIPTION (provided by applicant): Most biomedical text mining systems target only text and do not provide intelligent access to other important data such as figures. More than any other documentation, figures usually represent the evidence of discovery in the biomedical literature. Full-text biomedical articles nearly always incorporate images that are the crucial content of biomedical knowledge discovery. Biomedical scientists need to access images to validate research facts and to formulate or test novel research hypotheses. Evaluation has shown that textual statements reported in the literature are frequently noisy (i.e., contain false facts). Capturing images that are essentially experimental evidence supporting the textual facts will benefit biomedical information systems, databases, and biomedical scientists. We are developing a biomedical literature figure search engine, BioFigureSearch. We develop innovative algorithms and models in natural language processing, image processing, machine learning, and user interfacing. The deliverables will be novel biomedical natural language figure processing (bNLfP) algorithms and iBioFigureSearch, allowing biomedical scientists to access figure data effectively, and open-source tools that will enhance biomedical information retrieval, summarization, and question answering. The bNLfP algorithms we will be developing can be applied to or integrated into other biomedical text-mining systems.

FUNDED PUBLICATIONS

  • A robust data-driven approach for gene ontology annotation.
  • Figure-associated text summarization and evaluation.
  • Improving patients' electronic health record comprehension with NoteAid.
  • Automatically identifying health- and clinical-related content in wikipedia.
  • Automatically classifying sentences in full-text biomedical articles into Introduction, Methods, Results and Discussion.
  • Beyond captions: linking figures with abstract sentences in biomedical articles.
  • DeTEXT: A Database for Evaluating Text Extraction from Biomedical Literature Figures.
  • Learning to rank figures within a biomedical article.
  • Detecting hedge cues and their scope in biomedical text with conditional random fields.
  • Automatic discourse connective detection in biomedical text.
  • Automatically extracting information needs from complex clinical questions.
  • Toward automated consumer question answering: automatically separating consumer questions from professional questions in the healthcare domain.
  • Computational approaches for predicting biomedical research collaborations.
  • Automatic figure classification in bioscience literature.
  • CiteGraph: a citation network system for MEDLINE articles and analysis.
  • Parsing citations in biomedical articles using conditional random fields.


    34 TRIPLES      17 PREDICATES      35 URIs      9 LITERALS

    Subject Predicate Object
    1 grants:919b0c05151d1cacf65f86a7c10305c1 sg:abstract (same text as the ABSTRACT section above)
    2 sg:endYear 2017
    3 sg:fundingAmount 2477978.0
    4 sg:fundingCurrency USD
    5 sg:hasContribution contributions:491e76abbf72897842ca87b3ec53ad8a
    6 sg:hasFieldOfResearchCode anzsrc-for:08
    7 anzsrc-for:0801
    8 anzsrc-for:0806
    9 sg:hasFundedPublication articles:0d7df6ae128ebcd698a410cea399b454
    10 articles:0e9af004727305a8b0d7224b0f775589
    11 articles:1f287fecf60956cd6001b8bced3ecc75
    12 articles:447470357de1437d448fd5fa2baf280c
    13 articles:4ef9539444437dbba53496dc405533ea
    14 articles:5c019b61455239deed5dc58aedd3c466
    15 articles:82fe33b4de463f8894ca0b5be195d3e0
    16 articles:8791d4d9f11a74a0e000c02d78866072
    17 articles:978bb1fa8192bd6480231b673733985b
    18 articles:a2e282c9d8b9215e54d9c5a6c27e02a5
    19 articles:a669c6ee2eef5f22ef127dac1cee6073
    20 articles:b21d3f1aa650c5ed9d0b5825d7fe66ff
    21 articles:c01904dcbbe97cd33cda8976933fd2ed
    22 articles:d8fbc630e138f4a4c8796b149833a1a1
    23 articles:e245145f553d0228b4df40208790fad1
    24 articles:e5dcb3c032d80088e40b0c73394185f2
    25 sg:hasFundingOrganization grid-institutes:grid.280785.0
    26 sg:hasRecipientOrganization grid-institutes:grid.168645.8
    27 sg:language English
    28 sg:license http://scigraph.springernature.com/explorer/license/
    29 sg:scigraphId 919b0c05151d1cacf65f86a7c10305c1
    30 sg:startYear 2011
    31 sg:title Exploring Natural Language Processing, Image Processing, Machine Learning, and Us
    32 sg:webpage http://projectreporter.nih.gov/project_info_description.cfm?aid=8840267
    33 rdf:type sg:Grant
    34 rdfs:label Grant: Exploring Natural Language Processing, Image Processing, Machine Learning, and Us
    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular JSON format for linked data.

    curl -H 'Accept: application/ld+json' 'http://scigraph.springernature.com/things/grants/919b0c05151d1cacf65f86a7c10305c1'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'http://scigraph.springernature.com/things/grants/919b0c05151d1cacf65f86a7c10305c1'
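
    Because N-Triples puts one statement on each line, a response can be processed with very little code. The sketch below is a minimal illustration, not a full N-Triples parser: it assumes each line has the simple shape "subject predicate object ." and does not handle every escape or edge case in the specification. The example.org IRIs in SAMPLE are hypothetical placeholders, not the real identifiers used by this record, and the function names are invented for this example. The tally it produces is the kind of count behind a summary such as "34 TRIPLES 17 PREDICATES".

```python
import re
from collections import Counter

# Simplified pattern: subject (IRI or blank node), predicate (IRI),
# object (anything), then the terminating dot. This is an assumption
# that covers simple lines only, not the full N-Triples grammar.
TRIPLE_RE = re.compile(r'^(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')


def parse_ntriples(text):
    """Return a list of (subject, predicate, object) string tuples."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE_RE.match(line)
        if match is None:
            raise ValueError(f"not a simple N-Triples line: {line!r}")
        triples.append(match.groups())
    return triples


def predicate_counts(triples):
    """Count how many triples use each predicate."""
    return Counter(predicate for _, predicate, _ in triples)


# Hypothetical sample data; the IRIs are placeholders for illustration.
SAMPLE = """
<http://example.org/grants/g1> <http://example.org/ns/startYear> "2011" .
<http://example.org/grants/g1> <http://example.org/ns/endYear> "2017" .
<http://example.org/grants/g1> <http://example.org/ns/hasFundedPublication> <http://example.org/articles/a1> .
<http://example.org/grants/g1> <http://example.org/ns/hasFundedPublication> <http://example.org/articles/a2> .
"""

if __name__ == "__main__":
    parsed = parse_ntriples(SAMPLE)
    print(len(parsed), "triples,", len(predicate_counts(parsed)), "predicates")
```

    For production use, an established RDF library is likely a safer choice than a hand-written pattern like this one.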

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'http://scigraph.springernature.com/things/grants/919b0c05151d1cacf65f86a7c10305c1'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'http://scigraph.springernature.com/things/grants/919b0c05151d1cacf65f86a7c10305c1'
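
    The four curl commands differ only in their Accept header, so the same content negotiation can be wrapped in a small helper. The following is a sketch using only the Python standard library; the URL and media types are copied from the commands above, while the short format keys and the function names (build_request, fetch) are hypothetical names chosen for this example. Whether the endpoint still responds as described is not verified here.

```python
import urllib.request

# Base URL and grant identifier copied from the curl commands.
BASE_URL = "http://scigraph.springernature.com/things/grants/"
GRANT_ID = "919b0c05151d1cacf65f86a7c10305c1"

# Media types from the four curl examples; the short keys are
# hypothetical labels chosen for this sketch.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, grant_id=GRANT_ID):
    """Build (but do not send) a request for the grant in one RDF format."""
    if fmt not in ACCEPT_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(
        BASE_URL + grant_id,
        headers={"Accept": ACCEPT_TYPES[fmt]},
    )


def fetch(fmt, grant_id=GRANT_ID, timeout=30):
    """Send the request and return the response body as text."""
    request = build_request(fmt, grant_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Performs a real network call; requires the service to be reachable.
    print(fetch("turtle"))
```

    Keeping build_request separate from fetch lets the header logic be checked without any network access.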





