Combining heterogeneous data sources for accurate functional annotation of proteins


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2013-02

AUTHORS

Artem Sokolov, Christopher Funk, Kiley Graim, Karin Verspoor, Asa Ben-Hur

ABSTRACT

Combining heterogeneous sources of data is essential for accurate prediction of protein function. The task is complicated by the fact that while sequence-based features can be readily compared across species, most other data are species-specific. In this paper, we present a multi-view extension to GOstruct, a structured-output framework for function annotation of proteins. The extended framework can learn from disparate data sources, with each data source provided to the framework in the form of a kernel. Our empirical results demonstrate that the multi-view framework is able to utilize all available information, yielding better performance than sequence-based models trained across species and models trained from collections of data within a given species. This version of GOstruct participated in the recent Critical Assessment of Functional Annotations (CAFA) challenge; since then we have significantly improved the natural language processing component of the method, which now provides performance that is on par with that provided by sequence information. The GOstruct framework is available for download at http://strut.sourceforge.net.

PAGES

s10

References to SciGraph publications

  • 2000-05. Gene Ontology: tool for the unification of biology in NATURE GENETICS
  • 2004-12. GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes in BMC BIOINFORMATICS
  • 2011-12. The gene normalization task in BioCreative III in BMC BIOINFORMATICS
  • 2009-02. Protein function annotation by homology-based inference in GENOME BIOLOGY
  • 2005-05. Learning Statistical Models for Annotating Proteins with Function Information using Biomedical Text in BMC BIOINFORMATICS
  • 2005-05. Mining protein function from text using term-based support vector machines in BMC BIOINFORMATICS
  • 2005-05. Evaluation of BioCreAtIvE assessment of task 2 in BMC BIOINFORMATICS
  • 2005-05. Finding genomic ontology terms in text using evidence content in BMC BIOINFORMATICS
  • 2003-12. Automatic prediction of protein function in CELLULAR AND MOLECULAR LIFE SCIENCES
  • 2008-06. Consistent probabilistic outputs for protein function prediction in GENOME BIOLOGY
  • 2010-12. Low-complexity regions within protein sequences have position-dependent roles in BMC SYSTEMS BIOLOGY
  • 2008-06. GeneMANIA: a real-time multiple association network integration algorithm for predicting gene function in GENOME BIOLOGY
  • 2008-06. A critical assessment of Mus musculus gene function prediction using integrated genomic evidence in GENOME BIOLOGY
  • 1998-04. Predicting functions from protein sequences—where are the bottlenecks? in NATURE GENETICS
  • 2008-12. Towards structured output prediction of enzyme function in BMC PROCEEDINGS
  • 2012-12. A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools in BMC BIOINFORMATICS
  • 2008-06. Predicting gene function in a hierarchical context with an ensemble of classifiers in GENOME BIOLOGY

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1186/1471-2105-14-s3-s10

    DOI

    http://dx.doi.org/10.1186/1471-2105-14-s3-s10

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1020835861

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/23514123



    JSON-LD is the canonical representation for SciGraph data.

    TIP: You can open this SciGraph record using an external JSON-LD service such as the JSON-LD Playground or Google SDTT.

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information Systems", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Algorithms", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Animals", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Computational Biology", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Gene Expression", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Mice", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Molecular Sequence Annotation", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Protein Interaction Mapping", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Proteins", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Software", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Vocabulary, Controlled", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "University of California, Santa Cruz", 
              "id": "https://www.grid.ac/institutes/grid.205975.c", 
              "name": [
                "Department of Biomolecular Engineering, University of California Santa Cruz, 95064, Santa Cruz, California, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Sokolov", 
            "givenName": "Artem", 
            "id": "sg:person.01101321731.95", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01101321731.95"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of Colorado Anschutz Medical Campus", 
              "id": "https://www.grid.ac/institutes/grid.430503.1", 
              "name": [
                "Computational Bioscience Program, University of Colorado School of Medicine, 80045, Aurora, Colorado, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Funk", 
            "givenName": "Christopher", 
            "id": "sg:person.01142564764.38", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01142564764.38"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of California, Santa Cruz", 
              "id": "https://www.grid.ac/institutes/grid.205975.c", 
              "name": [
                "Department of Biomolecular Engineering, University of California Santa Cruz, 95064, Santa Cruz, California, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Graim", 
            "givenName": "Kiley", 
            "id": "sg:person.01243051406.16", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01243051406.16"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Data61", 
              "id": "https://www.grid.ac/institutes/grid.425461.0", 
              "name": [
                "Computational Bioscience Program, University of Colorado School of Medicine, 80045, Aurora, Colorado, USA", 
                "National ICT Australia, Victoria Research Lab, 3010, Melbourne, Australia"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Verspoor", 
            "givenName": "Karin", 
            "id": "sg:person.01372713104.04", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01372713104.04"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Colorado State University", 
              "id": "https://www.grid.ac/institutes/grid.47894.36", 
              "name": [
                "Department of Computer Science, Colorado State University, 80523, Fort Collins, Colorado, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Ben-Hur", 
            "givenName": "Asa", 
            "id": "sg:person.01242755504.30", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01242755504.30"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1186/1471-2105-6-s1-s21", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1000074431", 
              "https://doi.org/10.1186/1471-2105-6-s1-s21"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1752-0509-4-43", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1005678588", 
              "https://doi.org/10.1186/1752-0509-4-43"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btp122", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1005864951"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1002/prot.23029", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1005949120"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-6-s1-s18", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006919385", 
              "https://doi.org/10.1186/1471-2105-6-s1-s18"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2008-9-s1-s3", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1006959314", 
              "https://doi.org/10.1186/gb-2008-9-s1-s3"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/s0022-2836(05)80360-2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1013618994"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1006/jmbi.2000.4315", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1018016237"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-5-178", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1018640803", 
              "https://doi.org/10.1186/1471-2105-5-178"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1002/1097-0134(20001001)41:1<98::aid-prot120>3.0.co;2-s", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1019688976"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/ng0498-313", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1021131638", 
              "https://doi.org/10.1038/ng0498-313"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-12-s8-s2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1022575918", 
              "https://doi.org/10.1186/1471-2105-12-s8-s2"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/1102351.1102464", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1022920538"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2008-9-s1-s4", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1024435781", 
              "https://doi.org/10.1186/gb-2008-9-s1-s4"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/btk048", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1025615135"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/gkg095", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1027045924"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/bioinformatics/bth921", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031648236"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1753-6561-2-s4-s2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1032045755", 
              "https://doi.org/10.1186/1753-6561-2-s4-s2"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/gkn760", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1032318578"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/2147805.2147820", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1032813119"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-6-s1-s16", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1038229739", 
              "https://doi.org/10.1186/1471-2105-6-s1-s16"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2008-9-s1-s2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1039025062", 
              "https://doi.org/10.1186/gb-2008-9-s1-s2"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2009-10-2-207", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1040399255", 
              "https://doi.org/10.1186/gb-2009-10-2-207"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-6-s1-s22", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1042479871", 
              "https://doi.org/10.1186/1471-2105-6-s1-s22"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/75556", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1044135237", 
              "https://doi.org/10.1038/75556"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/gb-2008-9-s1-s6", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1044176432", 
              "https://doi.org/10.1186/gb-2008-9-s1-s6"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s00018-003-3114-8", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045087383", 
              "https://doi.org/10.1007/s00018-003-3114-8"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/279943.279962", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045398430"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/gkg582", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1047342895"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/gkg555", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1047366540"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1002/prot.20903", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1052562570"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-13-207", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1052987047", 
              "https://doi.org/10.1186/1471-2105-13-207"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1093/nar/gkr440", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1053252157"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1142/s0219720010004744", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1063004958"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1074854609", 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1076803194", 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1142/9789812704856_0029", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1096052467"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1142/9781860947292_0007", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1096052853"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.7551/mitpress/7443.001.0001", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1111386243"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2013-02", 
        "datePublishedReg": "2013-02-01", 
        "description": "Combining heterogeneous sources of data is essential for accurate prediction of protein function. The task is complicated by the fact that while sequence-based features can be readily compared across species, most other data are species-specific. In this paper, we present a multi-view extension to GOstruct, a structured-output framework for function annotation of proteins. The extended framework can learn from disparate data sources, with each data source provided to the framework in the form of a kernel. Our empirical results demonstrate that the multi-view framework is able to utilize all available information, yielding better performance than sequence-based models trained across species and models trained from collections of data within a given species. This version of GOstruct participated in the recent Critical Assessment of Functional Annotations (CAFA) challenge; since then we have significantly improved the natural language processing component of the method, which now provides performance that is on par with that provided by sequence information. The GOstruct framework is available for download at http://strut.sourceforge.net.", 
        "genre": "research_article", 
        "id": "sg:pub.10.1186/1471-2105-14-s3-s10", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.3111397", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.2681199", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.3111370", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1023786", 
            "issn": [
              "1471-2105"
            ], 
            "name": "BMC Bioinformatics", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "Suppl 3", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "14"
          }
        ], 
        "name": "Combining heterogeneous data sources for accurate functional annotation of proteins", 
        "pagination": "s10", 
        "productId": [
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "634668d20faa1fac2f28657eaf9133af28ea3f242d72b6d0e4bba1601172a681"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "23514123"
            ]
          }, 
          {
            "name": "nlm_unique_id", 
            "type": "PropertyValue", 
            "value": [
              "100965194"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1186/1471-2105-14-s3-s10"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1020835861"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1186/1471-2105-14-s3-s10", 
          "https://app.dimensions.ai/details/publication/pub.1020835861"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2019-04-11T08:57", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000325_0000000325/records_100809_00000000.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "http://link.springer.com/10.1186/1471-2105-14-S3-S10"
      }
    ]
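
Because the record is plain JSON, fields such as the title, author names, and identifiers can be read with any standard JSON parser. The following is a minimal sketch, not SciGraph tooling: it embeds a hand-trimmed excerpt of the record above (only the fields it reads), and the helper name `summarize` is hypothetical.

```python
import json

# Hand-trimmed excerpt of the record above; only the fields read below are kept.
RECORD_JSON = """
[
  {
    "id": "sg:pub.10.1186/1471-2105-14-s3-s10",
    "name": "Combining heterogeneous data sources for accurate functional annotation of proteins",
    "datePublished": "2013-02",
    "author": [
      {"familyName": "Sokolov", "givenName": "Artem"},
      {"familyName": "Funk", "givenName": "Christopher"},
      {"familyName": "Graim", "givenName": "Kiley"},
      {"familyName": "Verspoor", "givenName": "Karin"},
      {"familyName": "Ben-Hur", "givenName": "Asa"}
    ],
    "productId": [
      {"name": "pubmed_id", "value": ["23514123"]},
      {"name": "doi", "value": ["10.1186/1471-2105-14-s3-s10"]}
    ]
  }
]
"""


def summarize(record_json):
    """Return title, author names, and identifiers from a SciGraph-style record."""
    record = json.loads(record_json)[0]  # the payload is a one-element list
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # productId is a list of {name, value: [...]} entries; flatten it to a dict.
    ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {"title": record["name"], "authors": authors, "ids": ids}


summary = summarize(RECORD_JSON)
print(summary["title"])
print(", ".join(summary["authors"]))
print(summary["ids"]["doi"])
```

Flattening `productId` into a dictionary is a convenience choice for this sketch; it assumes each entry carries exactly one value, as in the record above.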
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-14-s3-s10'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-14-s3-s10'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-14-s3-s10'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-14-s3-s10'
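
The four curl commands differ only in their `Accept` header, so the same content negotiation can be scripted. The sketch below uses Python's standard `urllib.request`; the media types and base URL are copied from the commands above, while the names `ACCEPT_TYPES`, `build_request`, and `fetch` are illustrative. Whether the endpoint still answers is not verified here, so the example only builds the request and leaves the network call to `fetch`.

```python
import urllib.request

# Media types copied from the curl commands above.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}

BASE_URL = "https://scigraph.springernature.com/"


def build_request(pub_id, fmt="json-ld"):
    """Build (but do not send) a request for one record in the given format."""
    return urllib.request.Request(
        BASE_URL + pub_id,
        headers={"Accept": ACCEPT_TYPES[fmt]},
    )


def fetch(pub_id, fmt="json-ld", timeout=30):
    """Send the request and return the body as text (requires network access)."""
    with urllib.request.urlopen(build_request(pub_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Inspect the request that would be sent for the Turtle serialization.
req = build_request("pub.10.1186/1471-2105-14-s3-s10", "turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Calling `fetch("pub.10.1186/1471-2105-14-s3-s10", "nt")` would then correspond to the N-Triples curl command.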


     

    This table displays all metadata directly associated to this object as RDF triples.

    284 TRIPLES      21 PREDICATES      78 URIs      31 LITERALS      19 BLANK NODES
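
In the table, author order is not stored as a flat list but as an RDF collection: `schema:author` points at a blank node, and each blank node carries `rdf:first` (one person) and `rdf:rest` (the next node), ending in `rdf:nil`. A minimal sketch of walking such a chain follows. The node identifiers are copied by hand from the table rows; the dictionary layout and the function name `walk_rdf_list` are assumptions made for illustration, not a SciGraph API.

```python
RDF_NIL = "rdf:nil"

# node -> (rdf:first, rdf:rest), copied from the blank-node rows of the table.
LIST_NODES = {
    "N93c96f70f99d4c3aae51941c0400ffcd": ("sg:person.01101321731.95", "N4ec774306aa64c83a144d4588f5b8a2d"),
    "N4ec774306aa64c83a144d4588f5b8a2d": ("sg:person.01142564764.38", "Ne994aaf7708e49cea78d015044cc4270"),
    "Ne994aaf7708e49cea78d015044cc4270": ("sg:person.01243051406.16", "N78a7c61f8de14a6595791afa82da6031"),
    "N78a7c61f8de14a6595791afa82da6031": ("sg:person.01372713104.04", "Nba46d755661140d4b34d8802ee48ef83"),
    "Nba46d755661140d4b34d8802ee48ef83": ("sg:person.01242755504.30", RDF_NIL),
}

# Object of the schema:author triple in the table.
AUTHOR_LIST_HEAD = "N93c96f70f99d4c3aae51941c0400ffcd"


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head to rdf:nil, collecting rdf:first values."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


authors = walk_rdf_list(AUTHOR_LIST_HEAD, LIST_NODES)
print(authors)
```

The resulting order matches the author order given in the JSON-LD record (Sokolov, Funk, Graim, Verspoor, Ben-Hur).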

    Subject Predicate Object
    1 sg:pub.10.1186/1471-2105-14-s3-s10 schema:about N36c246d4c42a4c148c537b1d9d11a047
    2 N3f3004a548a24297b199b7de8a7c671f
    3 N569fbc2b545141fb90b8c2594b224c78
    4 N88a265e21eda4064a85cb9343feaecf8
    5 N929f61bc3e6144dcaa2c7afb4b6ff8cb
    6 Na51b28f5bc6b42219ee9b00ff2398f5a
    7 Nbeb0ee025dec46f795326a42fc31f723
    8 Nbfc3cca7f7c74e54a53bb617472bb35c
    9 Ndf7f9bf51e7e4fdcaf6c980eaf5e1d26
    10 Necec8b5b2bed494d9ce8599b0b4764cd
    11 anzsrc-for:08
    12 anzsrc-for:0806
    13 schema:author N93c96f70f99d4c3aae51941c0400ffcd
    14 schema:citation sg:pub.10.1007/s00018-003-3114-8
    15 sg:pub.10.1038/75556
    16 sg:pub.10.1038/ng0498-313
    17 sg:pub.10.1186/1471-2105-12-s8-s2
    18 sg:pub.10.1186/1471-2105-13-207
    19 sg:pub.10.1186/1471-2105-5-178
    20 sg:pub.10.1186/1471-2105-6-s1-s16
    21 sg:pub.10.1186/1471-2105-6-s1-s18
    22 sg:pub.10.1186/1471-2105-6-s1-s21
    23 sg:pub.10.1186/1471-2105-6-s1-s22
    24 sg:pub.10.1186/1752-0509-4-43
    25 sg:pub.10.1186/1753-6561-2-s4-s2
    26 sg:pub.10.1186/gb-2008-9-s1-s2
    27 sg:pub.10.1186/gb-2008-9-s1-s3
    28 sg:pub.10.1186/gb-2008-9-s1-s4
    29 sg:pub.10.1186/gb-2008-9-s1-s6
    30 sg:pub.10.1186/gb-2009-10-2-207
    31 https://app.dimensions.ai/details/publication/pub.1074854609
    32 https://app.dimensions.ai/details/publication/pub.1076803194
    33 https://doi.org/10.1002/1097-0134(20001001)41:1<98::aid-prot120>3.0.co;2-s
    34 https://doi.org/10.1002/prot.20903
    35 https://doi.org/10.1002/prot.23029
    36 https://doi.org/10.1006/jmbi.2000.4315
    37 https://doi.org/10.1016/s0022-2836(05)80360-2
    38 https://doi.org/10.1093/bioinformatics/bth921
    39 https://doi.org/10.1093/bioinformatics/btk048
    40 https://doi.org/10.1093/bioinformatics/btp122
    41 https://doi.org/10.1093/nar/gkg095
    42 https://doi.org/10.1093/nar/gkg555
    43 https://doi.org/10.1093/nar/gkg582
    44 https://doi.org/10.1093/nar/gkn760
    45 https://doi.org/10.1093/nar/gkr440
    46 https://doi.org/10.1142/9781860947292_0007
    47 https://doi.org/10.1142/9789812704856_0029
    48 https://doi.org/10.1142/s0219720010004744
    49 https://doi.org/10.1145/1102351.1102464
    50 https://doi.org/10.1145/2147805.2147820
    51 https://doi.org/10.1145/279943.279962
    52 https://doi.org/10.7551/mitpress/7443.001.0001
    53 schema:datePublished 2013-02
    54 schema:datePublishedReg 2013-02-01
    55 schema:description Combining heterogeneous sources of data is essential for accurate prediction of protein function. The task is complicated by the fact that while sequence-based features can be readily compared across species, most other data are species-specific. In this paper, we present a multi-view extension to GOstruct, a structured-output framework for function annotation of proteins. The extended framework can learn from disparate data sources, with each data source provided to the framework in the form of a kernel. Our empirical results demonstrate that the multi-view framework is able to utilize all available information, yielding better performance than sequence-based models trained across species and models trained from collections of data within a given species. This version of GOstruct participated in the recent Critical Assessment of Functional Annotations (CAFA) challenge; since then we have significantly improved the natural language processing component of the method, which now provides performance that is on par with that provided by sequence information. The GOstruct framework is available for download at http://strut.sourceforge.net.
    56 schema:genre research_article
    57 schema:inLanguage en
    58 schema:isAccessibleForFree true
    59 schema:isPartOf N667951d01ce240f3a5469ba35ca04a3a
    60 N9fa7406e8a7b446da69274a1685cf46f
    61 sg:journal.1023786
    62 schema:name Combining heterogeneous data sources for accurate functional annotation of proteins
    63 schema:pagination s10
    64 schema:productId N1d5c75bef67746d0ae897cddc9812ecb
    65 N70ef8e63aab64ba09b218b11190ea3cb
    66 N8d0487543efe4354aabf28d47d2668eb
    67 Ne817950e357c4ec8b3213174b6ea6042
    68 Nf1c684aa8ecf4c66b5705f2f0d8b213c
    69 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020835861
    70 https://doi.org/10.1186/1471-2105-14-s3-s10
    71 schema:sdDatePublished 2019-04-11T08:57
    72 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    73 schema:sdPublisher Na0a0b0dfe92846d586b772bf43fa7d97
    74 schema:url http://link.springer.com/10.1186/1471-2105-14-S3-S10
    75 sgo:license sg:explorer/license/
    76 sgo:sdDataset articles
    77 rdf:type schema:ScholarlyArticle
    78 N1d5c75bef67746d0ae897cddc9812ecb schema:name dimensions_id
    79 schema:value pub.1020835861
    80 rdf:type schema:PropertyValue
    81 N36c246d4c42a4c148c537b1d9d11a047 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    82 schema:name Protein Interaction Mapping
    83 rdf:type schema:DefinedTerm
    84 N3f3004a548a24297b199b7de8a7c671f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    85 schema:name Computational Biology
    86 rdf:type schema:DefinedTerm
    87 N4ec774306aa64c83a144d4588f5b8a2d rdf:first sg:person.01142564764.38
    88 rdf:rest Ne994aaf7708e49cea78d015044cc4270
    89 N569fbc2b545141fb90b8c2594b224c78 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    90 schema:name Proteins
    91 rdf:type schema:DefinedTerm
    92 N667951d01ce240f3a5469ba35ca04a3a schema:volumeNumber 14
    93 rdf:type schema:PublicationVolume
    94 N70ef8e63aab64ba09b218b11190ea3cb schema:name nlm_unique_id
    95 schema:value 100965194
    96 rdf:type schema:PropertyValue
    97 N78a7c61f8de14a6595791afa82da6031 rdf:first sg:person.01372713104.04
    98 rdf:rest Nba46d755661140d4b34d8802ee48ef83
    99 N88a265e21eda4064a85cb9343feaecf8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    100 schema:name Gene Expression
    101 rdf:type schema:DefinedTerm
    102 N8d0487543efe4354aabf28d47d2668eb schema:name pubmed_id
    103 schema:value 23514123
    104 rdf:type schema:PropertyValue
    105 N929f61bc3e6144dcaa2c7afb4b6ff8cb schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    106 schema:name Mice
    107 rdf:type schema:DefinedTerm
    108 N93c96f70f99d4c3aae51941c0400ffcd rdf:first sg:person.01101321731.95
    109 rdf:rest N4ec774306aa64c83a144d4588f5b8a2d
    110 N9fa7406e8a7b446da69274a1685cf46f schema:issueNumber Suppl 3
    111 rdf:type schema:PublicationIssue
    112 Na0a0b0dfe92846d586b772bf43fa7d97 schema:name Springer Nature - SN SciGraph project
    113 rdf:type schema:Organization
    114 Na51b28f5bc6b42219ee9b00ff2398f5a schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    115 schema:name Animals
    116 rdf:type schema:DefinedTerm
    117 Nba46d755661140d4b34d8802ee48ef83 rdf:first sg:person.01242755504.30
    118 rdf:rest rdf:nil
    119 Nbeb0ee025dec46f795326a42fc31f723 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    120 schema:name Molecular Sequence Annotation
    121 rdf:type schema:DefinedTerm
    122 Nbfc3cca7f7c74e54a53bb617472bb35c schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    123 schema:name Software
    124 rdf:type schema:DefinedTerm
    125 Ndf7f9bf51e7e4fdcaf6c980eaf5e1d26 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    126 schema:name Vocabulary, Controlled
    127 rdf:type schema:DefinedTerm
    128 Ne817950e357c4ec8b3213174b6ea6042 schema:name readcube_id
    129 schema:value 634668d20faa1fac2f28657eaf9133af28ea3f242d72b6d0e4bba1601172a681
    130 rdf:type schema:PropertyValue
    131 Ne994aaf7708e49cea78d015044cc4270 rdf:first sg:person.01243051406.16
    132 rdf:rest N78a7c61f8de14a6595791afa82da6031
    133 Necec8b5b2bed494d9ce8599b0b4764cd schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    134 schema:name Algorithms
    135 rdf:type schema:DefinedTerm
    136 Nf1c684aa8ecf4c66b5705f2f0d8b213c schema:name doi
    137 schema:value 10.1186/1471-2105-14-s3-s10
    138 rdf:type schema:PropertyValue
    139 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    140 schema:name Information and Computing Sciences
    141 rdf:type schema:DefinedTerm
    142 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
    143 schema:name Information Systems
    144 rdf:type schema:DefinedTerm
    145 sg:grant.2681199 http://pending.schema.org/fundedItem sg:pub.10.1186/1471-2105-14-s3-s10
    146 rdf:type schema:MonetaryGrant
    147 sg:grant.3111370 http://pending.schema.org/fundedItem sg:pub.10.1186/1471-2105-14-s3-s10
    148 rdf:type schema:MonetaryGrant
    149 sg:grant.3111397 http://pending.schema.org/fundedItem sg:pub.10.1186/1471-2105-14-s3-s10
    150 rdf:type schema:MonetaryGrant
    151 sg:journal.1023786 schema:issn 1471-2105
    152 schema:name BMC Bioinformatics
    153 rdf:type schema:Periodical
    154 sg:person.01101321731.95 schema:affiliation https://www.grid.ac/institutes/grid.205975.c
    155 schema:familyName Sokolov
    156 schema:givenName Artem
    157 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01101321731.95
    158 rdf:type schema:Person
    159 sg:person.01142564764.38 schema:affiliation https://www.grid.ac/institutes/grid.430503.1
    160 schema:familyName Funk
    161 schema:givenName Christopher
    162 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01142564764.38
    163 rdf:type schema:Person
    164 sg:person.01242755504.30 schema:affiliation https://www.grid.ac/institutes/grid.47894.36
    165 schema:familyName Ben-Hur
    166 schema:givenName Asa
    167 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01242755504.30
    168 rdf:type schema:Person
    169 sg:person.01243051406.16 schema:affiliation https://www.grid.ac/institutes/grid.205975.c
    170 schema:familyName Graim
    171 schema:givenName Kiley
    172 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01243051406.16
    173 rdf:type schema:Person
    174 sg:person.01372713104.04 schema:affiliation https://www.grid.ac/institutes/grid.425461.0
    175 schema:familyName Verspoor
    176 schema:givenName Karin
    177 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01372713104.04
    178 rdf:type schema:Person
    179 sg:pub.10.1007/s00018-003-3114-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045087383
    180 https://doi.org/10.1007/s00018-003-3114-8
    181 rdf:type schema:CreativeWork
    182 sg:pub.10.1038/75556 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044135237
    183 https://doi.org/10.1038/75556
    184 rdf:type schema:CreativeWork
    185 sg:pub.10.1038/ng0498-313 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021131638
    186 https://doi.org/10.1038/ng0498-313
    187 rdf:type schema:CreativeWork
    188 sg:pub.10.1186/1471-2105-12-s8-s2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022575918
    189 https://doi.org/10.1186/1471-2105-12-s8-s2
    190 rdf:type schema:CreativeWork
    191 sg:pub.10.1186/1471-2105-13-207 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052987047
    192 https://doi.org/10.1186/1471-2105-13-207
    193 rdf:type schema:CreativeWork
    194 sg:pub.10.1186/1471-2105-5-178 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018640803
    195 https://doi.org/10.1186/1471-2105-5-178
    196 rdf:type schema:CreativeWork
    197 sg:pub.10.1186/1471-2105-6-s1-s16 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038229739
    198 https://doi.org/10.1186/1471-2105-6-s1-s16
    199 rdf:type schema:CreativeWork
    200 sg:pub.10.1186/1471-2105-6-s1-s18 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006919385
    201 https://doi.org/10.1186/1471-2105-6-s1-s18
    202 rdf:type schema:CreativeWork
    203 sg:pub.10.1186/1471-2105-6-s1-s21 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000074431
    204 https://doi.org/10.1186/1471-2105-6-s1-s21
    205 rdf:type schema:CreativeWork
    206 sg:pub.10.1186/1471-2105-6-s1-s22 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042479871
    207 https://doi.org/10.1186/1471-2105-6-s1-s22
    208 rdf:type schema:CreativeWork
    209 sg:pub.10.1186/1752-0509-4-43 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005678588
    210 https://doi.org/10.1186/1752-0509-4-43
    211 rdf:type schema:CreativeWork
    212 sg:pub.10.1186/1753-6561-2-s4-s2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032045755
    213 https://doi.org/10.1186/1753-6561-2-s4-s2
    214 rdf:type schema:CreativeWork
    215 sg:pub.10.1186/gb-2008-9-s1-s2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039025062
    216 https://doi.org/10.1186/gb-2008-9-s1-s2
    217 rdf:type schema:CreativeWork
    218 sg:pub.10.1186/gb-2008-9-s1-s3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006959314
    219 https://doi.org/10.1186/gb-2008-9-s1-s3
    220 rdf:type schema:CreativeWork
    221 sg:pub.10.1186/gb-2008-9-s1-s4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024435781
    222 https://doi.org/10.1186/gb-2008-9-s1-s4
    223 rdf:type schema:CreativeWork
    224 sg:pub.10.1186/gb-2008-9-s1-s6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044176432
    225 https://doi.org/10.1186/gb-2008-9-s1-s6
    226 rdf:type schema:CreativeWork
    227 sg:pub.10.1186/gb-2009-10-2-207 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040399255
    228 https://doi.org/10.1186/gb-2009-10-2-207
    229 rdf:type schema:CreativeWork
    230 https://app.dimensions.ai/details/publication/pub.1074854609 rdf:type schema:CreativeWork
    231 https://app.dimensions.ai/details/publication/pub.1076803194 rdf:type schema:CreativeWork
    232 https://doi.org/10.1002/1097-0134(20001001)41:1<98::aid-prot120>3.0.co;2-s schema:sameAs https://app.dimensions.ai/details/publication/pub.1019688976
    233 rdf:type schema:CreativeWork
    234 https://doi.org/10.1002/prot.20903 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052562570
    235 rdf:type schema:CreativeWork
    236 https://doi.org/10.1002/prot.23029 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005949120
    237 rdf:type schema:CreativeWork
    238 https://doi.org/10.1006/jmbi.2000.4315 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018016237
    239 rdf:type schema:CreativeWork
    240 https://doi.org/10.1016/s0022-2836(05)80360-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013618994
    241 rdf:type schema:CreativeWork
    242 https://doi.org/10.1093/bioinformatics/bth921 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031648236
    243 rdf:type schema:CreativeWork
    244 https://doi.org/10.1093/bioinformatics/btk048 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025615135
    245 rdf:type schema:CreativeWork
    246 https://doi.org/10.1093/bioinformatics/btp122 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005864951
    247 rdf:type schema:CreativeWork
    248 https://doi.org/10.1093/nar/gkg095 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027045924
    249 rdf:type schema:CreativeWork
    250 https://doi.org/10.1093/nar/gkg555 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047366540
    251 rdf:type schema:CreativeWork
    252 https://doi.org/10.1093/nar/gkg582 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047342895
    253 rdf:type schema:CreativeWork
    254 https://doi.org/10.1093/nar/gkn760 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032318578
    255 rdf:type schema:CreativeWork
    256 https://doi.org/10.1093/nar/gkr440 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053252157
    257 rdf:type schema:CreativeWork
    258 https://doi.org/10.1142/9781860947292_0007 schema:sameAs https://app.dimensions.ai/details/publication/pub.1096052853
    259 rdf:type schema:CreativeWork
    260 https://doi.org/10.1142/9789812704856_0029 schema:sameAs https://app.dimensions.ai/details/publication/pub.1096052467
    261 rdf:type schema:CreativeWork
    262 https://doi.org/10.1142/s0219720010004744 schema:sameAs https://app.dimensions.ai/details/publication/pub.1063004958
    263 rdf:type schema:CreativeWork
    264 https://doi.org/10.1145/1102351.1102464 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022920538
    265 rdf:type schema:CreativeWork
    266 https://doi.org/10.1145/2147805.2147820 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032813119
    267 rdf:type schema:CreativeWork
    268 https://doi.org/10.1145/279943.279962 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045398430
    269 rdf:type schema:CreativeWork
    270 https://doi.org/10.7551/mitpress/7443.001.0001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1111386243
    271 rdf:type schema:CreativeWork
    272 https://www.grid.ac/institutes/grid.205975.c schema:alternateName University of California, Santa Cruz
    273 schema:name Department of Biomolecular Engineering, University of California Santa Cruz, 95064, Santa Cruz, California, USA
    274 rdf:type schema:Organization
    275 https://www.grid.ac/institutes/grid.425461.0 schema:alternateName Data61
    276 schema:name Computational Bioscience Program, University of Colorado School of Medicine, 80045, Aurora, Colorado, USA
    277 National ICT Australia, Victoria Research Lab, 3010, Melbourne, Australia
    278 rdf:type schema:Organization
    279 https://www.grid.ac/institutes/grid.430503.1 schema:alternateName University of Colorado Anschutz Medical Campus
    280 schema:name Computational Bioscience Program, University of Colorado School of Medicine, 80045, Aurora, Colorado, USA
    281 rdf:type schema:Organization
    282 https://www.grid.ac/institutes/grid.47894.36 schema:alternateName Colorado State University
    283 schema:name Department of Computer Science, Colorado State University, 80523, Fort Collins, Colorado, USA
    284 rdf:type schema:Organization
     



