Using machine learning tools for protein database biocuration assistance


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2018-12

AUTHORS

Caroline König, Ilmira Shaim, Alfredo Vellido, Enrique Romero, René Alquézar, Jesús Giraldo

ABSTRACT

Biocuration in the omics sciences has become paramount, as research in these fields rapidly evolves towards increasingly data-dependent models. As a result, the management of web-accessible publicly-available databases becomes a central task in biological knowledge dissemination. One relevant challenge for biocurators is the unambiguous identification of biological entities. In this study, we illustrate the adequacy of machine learning methods as biocuration assistance tools using a publicly available protein database as an example. This database contains information on G Protein-Coupled Receptors (GPCRs), which are part of eukaryotic cell membranes and relevant in cell communication as well as major drug targets in pharmacology. These receptors are characterized according to subtype labels. Previous analysis of this database provided evidence that some of the receptor sequences could be affected by a case of label noise, as they appeared to be too consistently misclassified by machine learning methods. Here, we extend our analysis to recent and quite substantially modified new versions of the database and reveal their now extremely accurate labeling using several machine learning models and different transformations of the unaligned sequences. These findings support the adequacy of our proposed method to identify problematic labeling cases as a tool for database biocuration.

PAGES

10148

Identifiers

URI

http://scigraph.springernature.com/pub.10.1038/s41598-018-28330-z

DOI

http://dx.doi.org/10.1038/s41598-018-28330-z

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1105236536

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/29977071



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Universitat Polit\u00e8cnica de Catalunya", 
          "id": "https://www.grid.ac/institutes/grid.6835.8", 
          "name": [
            "IDEAI Research Center, Universitat Polit\u00e8cnica de Catalunya, UPC BarcelonaTech, 08034, Barcelona, Spain"
          ], 
          "type": "Organization"
        }, 
        "familyName": "K\u00f6nig", 
        "givenName": "Caroline", 
        "id": "sg:person.0631670146.50", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0631670146.50"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universitat Polit\u00e8cnica de Catalunya", 
          "id": "https://www.grid.ac/institutes/grid.6835.8", 
          "name": [
            "IDEAI Research Center, Universitat Polit\u00e8cnica de Catalunya, UPC BarcelonaTech, 08034, Barcelona, Spain"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Shaim", 
        "givenName": "Ilmira", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universitat Polit\u00e8cnica de Catalunya", 
          "id": "https://www.grid.ac/institutes/grid.6835.8", 
          "name": [
            "IDEAI Research Center, Universitat Polit\u00e8cnica de Catalunya, UPC BarcelonaTech, 08034, Barcelona, Spain", 
            "Centro de Investigaci\u00f3n Biom\u00e9dica en Red en Bioingenier\u00eda, Biomateriales y Nanomedicina (CIBER-BBN), 08193, Cerdanyola del Vall\u00e8s, Spain"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Vellido", 
        "givenName": "Alfredo", 
        "id": "sg:person.01270336770.60", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01270336770.60"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universitat Polit\u00e8cnica de Catalunya", 
          "id": "https://www.grid.ac/institutes/grid.6835.8", 
          "name": [
            "IDEAI Research Center, Universitat Polit\u00e8cnica de Catalunya, UPC BarcelonaTech, 08034, Barcelona, Spain"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Romero", 
        "givenName": "Enrique", 
        "id": "sg:person.01164502317.77", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01164502317.77"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universitat Polit\u00e8cnica de Catalunya", 
          "id": "https://www.grid.ac/institutes/grid.6835.8", 
          "name": [
            "IDEAI Research Center, Universitat Polit\u00e8cnica de Catalunya, UPC BarcelonaTech, 08034, Barcelona, Spain"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Alqu\u00e9zar", 
        "givenName": "Ren\u00e9", 
        "id": "sg:person.0700003346.30", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0700003346.30"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Centro de Investigaci\u00f3n Biom\u00e9dica en Red de Salud Mental", 
          "id": "https://www.grid.ac/institutes/grid.469673.9", 
          "name": [
            "Institut de Neuroci\u00e8ncies - Unitat de Bioestad\u00ecstica, Universitat Aut\u00f2noma de Barcelona, 08193, Cerdanyola del Vall\u00e8s, Spain", 
            "Network Biomedical Research Center on Mental Health (CIBERSAM), 28029, Madrid, Spain"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Giraldo", 
        "givenName": "Jes\u00fas", 
        "id": "sg:person.01210663303.80", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01210663303.80"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/3-540-45014-9_1", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000204280", 
          "https://doi.org/10.1007/3-540-45014-9_1"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btl665", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000236398"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/bph.13509", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1002460091"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1124/mol.63.6.1256", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003054041"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0163-7258(03)00038-x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004232436"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-642-41190-8_36", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1008012862", 
          "https://doi.org/10.1007/978-3-642-41190-8_36"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btn028", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1008261674"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.biosystems.2006.03.006", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1008791531"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ipm.2009.03.002", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1012321154"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.bbrc.2013.08.023", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014287757"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0141287", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014761536"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s1359-6446(01)02131-6", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1015484954"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.compbiolchem.2005.09.006", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1016226726"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/18.1.147", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1016517627"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/455047a", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018445264", 
          "https://doi.org/10.1038/455047a"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/ijc.29047", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022453222"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1010933404324", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1024739340", 
          "https://doi.org/10.1023/a:1010933404324"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btg317", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1024930760"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrd2760", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1025311434", 
          "https://doi.org/10.1038/nrd2760"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/0471250953.bi0101s50", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1026436840"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/database/baw161", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1026572344"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1110/ps.2500102", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1026733522"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/s12859-015-0731-9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1028981300", 
          "https://doi.org/10.1186/s12859-015-0731-9"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1146/annurev-pharmtox-032112-135923", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030698697"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrd.2016.230", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030835959", 
          "https://doi.org/10.1038/nrd.2016.230"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature20566", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031000217", 
          "https://doi.org/10.1038/nature20566"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0041882", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032363911"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s1054-3589(10)58010-4", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034153661"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.gene.2005.07.029", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035607530"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s11517-014-1218-y", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035837530", 
          "https://doi.org/10.1007/s11517-014-1218-y"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0003-2670(93)80437-p", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038362770"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/498255a", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042946700", 
          "https://doi.org/10.1038/498255a"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.coph.2014.12.002", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044505059"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.compbiomed.2011.05.015", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045373136"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0046633", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046449008"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrd2518", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047432273", 
          "https://doi.org/10.1038/nrd2518"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/prot.20373", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1051371941"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/cn200025w", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1054062739"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/jm9700575", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055958127"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/pr060534k", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1056291290"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tnnls.2013.2292894", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061718465"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1198/tas.2011.11052", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1064201549"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.18547/gcb.2015.vol1.iss1.e19", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1068659075"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkw1218", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1079364417"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1082642903", 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.neuron.2017.03.016", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1085116577"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3390/molecules22071119", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1090350697"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/s12938-017-0357-4", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1091274879", 
          "https://doi.org/10.1186/s12938-017-0357-4"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/ijcnn.2015.7280613", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1095788447"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.7208/chicago/9780226416502.001.0001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1100805936"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/s12859-018-2121-6", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1101898328", 
          "https://doi.org/10.1186/s12859-018-2121-6"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1613/jair.1199", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1105579281"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2018-12", 
    "datePublishedReg": "2018-12-01", 
    "description": "Biocuration in the omics sciences has become paramount, as research in these fields rapidly evolves towards increasingly data-dependent models. As a result, the management of web-accessible publicly-available databases becomes a central task in biological knowledge dissemination. One relevant challenge for biocurators is the unambiguous identification of biological entities. In this study, we illustrate the adequacy of machine learning methods as biocuration assistance tools using a publicly available protein database as an example. This database contains information on G Protein-Coupled Receptors (GPCRs), which are part of eukaryotic cell membranes and relevant in cell communication as well as major drug targets in pharmacology. These receptors are characterized according to subtype labels. Previous analysis of this database provided evidence that some of the receptor sequences could be affected by a case of label noise, as they appeared to be too consistently misclassified by machine learning methods. Here, we extend our analysis to recent and quite substantially modified new versions of the database and reveal their now extremely accurate labeling using several machine learning models and different transformations of the unaligned sequences. These findings support the adequacy of our proposed method to identify problematic labeling cases as a tool for database biocuration.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1038/s41598-018-28330-z", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1045337", 
        "issn": [
          "2045-2322"
        ], 
        "name": "Scientific Reports", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "8"
      }
    ], 
    "name": "Using machine learning tools for protein database biocuration assistance", 
    "pagination": "10148", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "84d0c21d7d9aba71706d10d156385b8572eb6d4c2688691bf79e69fdd2ad6213"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "29977071"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "101563288"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1038/s41598-018-28330-z"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1105236536"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1038/s41598-018-28330-z", 
      "https://app.dimensions.ai/details/publication/pub.1105236536"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-11T02:32", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8700_00000604.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://www.nature.com/articles/s41598-018-28330-z"
  }
]
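
Because the record is plain JSON, any JSON parser can read it. The following is a minimal sketch in Python, assuming the array above is available as a string. The helper name `summarize_record` and the trimmed `SAMPLE` payload are illustrative only and are not part of SciGraph. The sketch also drops any repeated citation ids while keeping their original order.

```python
import json

# Trimmed, illustrative sample with the same shape as the record above.
SAMPLE = r"""
[
  {
    "name": "Using machine learning tools for protein database biocuration assistance",
    "author": [
      {"familyName": "K\u00f6nig", "givenName": "Caroline", "type": "Person"},
      {"familyName": "Shaim", "givenName": "Ilmira", "type": "Person"}
    ],
    "citation": [
      {"id": "sg:pub.10.1007/3-540-45014-9_1", "type": "CreativeWork"},
      {"id": "sg:pub.10.1007/3-540-45014-9_1", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1093/bioinformatics/btl665", "type": "CreativeWork"}
    ]
  }
]
"""


def summarize_record(text):
    """Return the title, author names, and de-duplicated citation ids."""
    record = json.loads(text)[0]  # the payload is a one-element array
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])
    ]
    # dict.fromkeys keeps first-seen order while dropping repeated ids
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    return {"title": record["name"], "authors": authors, "citations": citations}


summary = summarize_record(SAMPLE)
print(summary["title"])
print(summary["authors"])
print(summary["citations"])
```

The same function should work on the full record, since it only relies on the `name`, `author`, and `citation` keys shown above.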
 

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/s41598-018-28330-z'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/s41598-018-28330-z'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/s41598-018-28330-z'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/s41598-018-28330-z'
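
The four curl commands differ only in the `Accept` header, so the same content negotiation can be scripted. Below is a minimal sketch using Python's standard `urllib.request`. The names `MEDIA_TYPES`, `build_request`, and `fetch`, the short format keys, and the UTF-8 decoding are assumptions made for illustration. `fetch` requires network access and a reachable endpoint, so it is defined here but not called.

```python
import urllib.request

BASE = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above; the short keys are
# hypothetical labels chosen for this sketch.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Create a GET request whose Accept header selects the RDF serialization."""
    return urllib.request.Request(
        BASE + record_id, headers={"Accept": MEDIA_TYPES[fmt]}
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the body as text (needs network access)."""
    request = build_request(record_id, fmt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # Assumes the server answers in UTF-8.
        return response.read().decode("utf-8")


req = build_request("pub.10.1038/s41598-018-28330-z", "turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Calling `fetch("pub.10.1038/s41598-018-28330-z", "nt")` would be the scripted counterpart of the N-Triples curl command, provided the service responds.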


 

This table displays all metadata directly associated with this object as RDF triples.

275 TRIPLES      21 PREDICATES      81 URIs      21 LITERALS      9 BLANK NODES
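
In the triples that follow, the ordered author list is stored as an RDF collection: `schema:author` points at a blank node, each node carries an `rdf:first` (one author) and an `rdf:rest` (the next node), and the chain ends at `rdf:nil`. The following minimal sketch in Python shows how the author order can be recovered by walking that chain. The node identifiers are copied from the table, while `LIST_NODES` and `walk_rdf_list` are hypothetical names introduced for illustration.

```python
RDF_NIL = "rdf:nil"

# node -> (rdf:first, rdf:rest), copied from the triples table
LIST_NODES = {
    "Nff95df2029e0480abc07e8b1ad3a1368": (
        "sg:person.0631670146.50", "N02a618964511420a9056ae2937a2259c"),
    "N02a618964511420a9056ae2937a2259c": (
        "Ne62307e0e9b54802af0a59a460e2fcba", "Nc96101df80b440658f3695ed36b40092"),
    "Nc96101df80b440658f3695ed36b40092": (
        "sg:person.01270336770.60", "Nf92bb166eebb4b7682671b8039d01b7d"),
    "Nf92bb166eebb4b7682671b8039d01b7d": (
        "sg:person.01164502317.77", "N4d96e1ace6b64e9b94a93d4af6906805"),
    "N4d96e1ace6b64e9b94a93d4af6906805": (
        "sg:person.0700003346.30", "N62abb746b8904ee2bc7b5371a1a70641"),
    "N62abb746b8904ee2bc7b5371a1a70641": (
        "sg:person.01210663303.80", RDF_NIL),
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head and collect each rdf:first value."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# schema:author of the article points at the head node of the list.
authors = walk_rdf_list("Nff95df2029e0480abc07e8b1ad3a1368", LIST_NODES)
for position, author in enumerate(authors, start=1):
    print(position, author)
```

The second entry is itself a blank node (`Ne62307e0e9b54802af0a59a460e2fcba`) rather than an `sg:person` identifier. In the table, that node carries the `schema:familyName` and `schema:givenName` values for Ilmira Shaim.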

Row Subject Predicate Object (a row that omits the subject or predicate repeats the one from the row above)
1 sg:pub.10.1038/s41598-018-28330-z schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author Nff95df2029e0480abc07e8b1ad3a1368
4 schema:citation sg:pub.10.1007/3-540-45014-9_1
5 sg:pub.10.1007/978-3-642-41190-8_36
6 sg:pub.10.1007/s11517-014-1218-y
7 sg:pub.10.1023/a:1010933404324
8 sg:pub.10.1038/455047a
9 sg:pub.10.1038/498255a
10 sg:pub.10.1038/nature20566
11 sg:pub.10.1038/nrd.2016.230
12 sg:pub.10.1038/nrd2518
13 sg:pub.10.1038/nrd2760
14 sg:pub.10.1186/s12859-015-0731-9
15 sg:pub.10.1186/s12859-018-2121-6
16 sg:pub.10.1186/s12938-017-0357-4
17 https://app.dimensions.ai/details/publication/pub.1082642903
18 https://doi.org/10.1002/0471250953.bi0101s50
19 https://doi.org/10.1002/ijc.29047
20 https://doi.org/10.1002/prot.20373
21 https://doi.org/10.1016/0003-2670(93)80437-p
22 https://doi.org/10.1016/j.bbrc.2013.08.023
23 https://doi.org/10.1016/j.biosystems.2006.03.006
24 https://doi.org/10.1016/j.compbiolchem.2005.09.006
25 https://doi.org/10.1016/j.compbiomed.2011.05.015
26 https://doi.org/10.1016/j.coph.2014.12.002
27 https://doi.org/10.1016/j.gene.2005.07.029
28 https://doi.org/10.1016/j.ipm.2009.03.002
29 https://doi.org/10.1016/j.neuron.2017.03.016
30 https://doi.org/10.1016/s0163-7258(03)00038-x
31 https://doi.org/10.1016/s1054-3589(10)58010-4
32 https://doi.org/10.1016/s1359-6446(01)02131-6
33 https://doi.org/10.1021/cn200025w
34 https://doi.org/10.1021/jm9700575
35 https://doi.org/10.1021/pr060534k
36 https://doi.org/10.1093/bioinformatics/18.1.147
37 https://doi.org/10.1093/bioinformatics/btg317
38 https://doi.org/10.1093/bioinformatics/btl665
39 https://doi.org/10.1093/bioinformatics/btn028
40 https://doi.org/10.1093/database/baw161
41 https://doi.org/10.1093/nar/gkw1218
42 https://doi.org/10.1109/ijcnn.2015.7280613
43 https://doi.org/10.1109/tnnls.2013.2292894
44 https://doi.org/10.1110/ps.2500102
45 https://doi.org/10.1111/bph.13509
46 https://doi.org/10.1124/mol.63.6.1256
47 https://doi.org/10.1146/annurev-pharmtox-032112-135923
48 https://doi.org/10.1198/tas.2011.11052
49 https://doi.org/10.1371/journal.pone.0041882
50 https://doi.org/10.1371/journal.pone.0046633
51 https://doi.org/10.1371/journal.pone.0141287
52 https://doi.org/10.1613/jair.1199
53 https://doi.org/10.18547/gcb.2015.vol1.iss1.e19
54 https://doi.org/10.3390/molecules22071119
55 https://doi.org/10.7208/chicago/9780226416502.001.0001
56 schema:datePublished 2018-12
57 schema:datePublishedReg 2018-12-01
58 schema:description Biocuration in the omics sciences has become paramount, as research in these fields rapidly evolves towards increasingly data-dependent models. As a result, the management of web-accessible publicly-available databases becomes a central task in biological knowledge dissemination. One relevant challenge for biocurators is the unambiguous identification of biological entities. In this study, we illustrate the adequacy of machine learning methods as biocuration assistance tools using a publicly available protein database as an example. This database contains information on G Protein-Coupled Receptors (GPCRs), which are part of eukaryotic cell membranes and relevant in cell communication as well as major drug targets in pharmacology. These receptors are characterized according to subtype labels. Previous analysis of this database provided evidence that some of the receptor sequences could be affected by a case of label noise, as they appeared to be too consistently misclassified by machine learning methods. Here, we extend our analysis to recent and quite substantially modified new versions of the database and reveal their now extremely accurate labeling using several machine learning models and different transformations of the unaligned sequences. These findings support the adequacy of our proposed method to identify problematic labeling cases as a tool for database biocuration.
59 schema:genre research_article
60 schema:inLanguage en
61 schema:isAccessibleForFree true
62 schema:isPartOf N14d3aae1617744d1b50450381c427822
63 Nbd953e640c714b0fb11b975dbf596d81
64 sg:journal.1045337
65 schema:name Using machine learning tools for protein database biocuration assistance
66 schema:pagination 10148
67 schema:productId N1d24103b25394bf9ad03e11c43b9157f
68 N6c12cc6bd73b492280d1c2ed9e2799f6
69 N8e5e2246699f4153b2239d9d75c721b0
70 Nb912d0c1cf294ecfbc6d81176241d2f7
71 Ndc0cda9b67da46079a4c837e3d1a58c8
72 schema:sameAs https://app.dimensions.ai/details/publication/pub.1105236536
73 https://doi.org/10.1038/s41598-018-28330-z
74 schema:sdDatePublished 2019-04-11T02:32
75 schema:sdLicense https://scigraph.springernature.com/explorer/license/
76 schema:sdPublisher Nd50146d0a0d54b72a968e9a4c5218bc2
77 schema:url https://www.nature.com/articles/s41598-018-28330-z
78 sgo:license sg:explorer/license/
79 sgo:sdDataset articles
80 rdf:type schema:ScholarlyArticle
81 N02a618964511420a9056ae2937a2259c rdf:first Ne62307e0e9b54802af0a59a460e2fcba
82 rdf:rest Nc96101df80b440658f3695ed36b40092
83 N14d3aae1617744d1b50450381c427822 schema:issueNumber 1
84 rdf:type schema:PublicationIssue
85 N1d24103b25394bf9ad03e11c43b9157f schema:name doi
86 schema:value 10.1038/s41598-018-28330-z
87 rdf:type schema:PropertyValue
88 N4d96e1ace6b64e9b94a93d4af6906805 rdf:first sg:person.0700003346.30
89 rdf:rest N62abb746b8904ee2bc7b5371a1a70641
90 N62abb746b8904ee2bc7b5371a1a70641 rdf:first sg:person.01210663303.80
91 rdf:rest rdf:nil
92 N6c12cc6bd73b492280d1c2ed9e2799f6 schema:name pubmed_id
93 schema:value 29977071
94 rdf:type schema:PropertyValue
95 N8e5e2246699f4153b2239d9d75c721b0 schema:name nlm_unique_id
96 schema:value 101563288
97 rdf:type schema:PropertyValue
98 Nb912d0c1cf294ecfbc6d81176241d2f7 schema:name readcube_id
99 schema:value 84d0c21d7d9aba71706d10d156385b8572eb6d4c2688691bf79e69fdd2ad6213
100 rdf:type schema:PropertyValue
101 Nbd953e640c714b0fb11b975dbf596d81 schema:volumeNumber 8
102 rdf:type schema:PublicationVolume
103 Nc96101df80b440658f3695ed36b40092 rdf:first sg:person.01270336770.60
104 rdf:rest Nf92bb166eebb4b7682671b8039d01b7d
105 Nd50146d0a0d54b72a968e9a4c5218bc2 schema:name Springer Nature - SN SciGraph project
106 rdf:type schema:Organization
107 Ndc0cda9b67da46079a4c837e3d1a58c8 schema:name dimensions_id
108 schema:value pub.1105236536
109 rdf:type schema:PropertyValue
110 Ne62307e0e9b54802af0a59a460e2fcba schema:affiliation https://www.grid.ac/institutes/grid.6835.8
111 schema:familyName Shaim
112 schema:givenName Ilmira
113 rdf:type schema:Person
114 Nf92bb166eebb4b7682671b8039d01b7d rdf:first sg:person.01164502317.77
115 rdf:rest N4d96e1ace6b64e9b94a93d4af6906805
116 Nff95df2029e0480abc07e8b1ad3a1368 rdf:first sg:person.0631670146.50
117 rdf:rest N02a618964511420a9056ae2937a2259c
118 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
119 schema:name Information and Computing Sciences
120 rdf:type schema:DefinedTerm
121 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
122 schema:name Artificial Intelligence and Image Processing
123 rdf:type schema:DefinedTerm
124 sg:journal.1045337 schema:issn 2045-2322
125 schema:name Scientific Reports
126 rdf:type schema:Periodical
127 sg:person.01164502317.77 schema:affiliation https://www.grid.ac/institutes/grid.6835.8
128 schema:familyName Romero
129 schema:givenName Enrique
130 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01164502317.77
131 rdf:type schema:Person
132 sg:person.01210663303.80 schema:affiliation https://www.grid.ac/institutes/grid.469673.9
133 schema:familyName Giraldo
134 schema:givenName Jesús
135 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01210663303.80
136 rdf:type schema:Person
137 sg:person.01270336770.60 schema:affiliation https://www.grid.ac/institutes/grid.6835.8
138 schema:familyName Vellido
139 schema:givenName Alfredo
140 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01270336770.60
141 rdf:type schema:Person
142 sg:person.0631670146.50 schema:affiliation https://www.grid.ac/institutes/grid.6835.8
143 schema:familyName König
144 schema:givenName Caroline
145 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0631670146.50
146 rdf:type schema:Person
147 sg:person.0700003346.30 schema:affiliation https://www.grid.ac/institutes/grid.6835.8
148 schema:familyName Alquézar
149 schema:givenName René
150 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0700003346.30
151 rdf:type schema:Person
152 sg:pub.10.1007/3-540-45014-9_1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000204280
153 https://doi.org/10.1007/3-540-45014-9_1
154 rdf:type schema:CreativeWork
155 sg:pub.10.1007/978-3-642-41190-8_36 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008012862
156 https://doi.org/10.1007/978-3-642-41190-8_36
157 rdf:type schema:CreativeWork
158 sg:pub.10.1007/s11517-014-1218-y schema:sameAs https://app.dimensions.ai/details/publication/pub.1035837530
159 https://doi.org/10.1007/s11517-014-1218-y
160 rdf:type schema:CreativeWork
161 sg:pub.10.1023/a:1010933404324 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024739340
162 https://doi.org/10.1023/a:1010933404324
163 rdf:type schema:CreativeWork
164 sg:pub.10.1038/455047a schema:sameAs https://app.dimensions.ai/details/publication/pub.1018445264
165 https://doi.org/10.1038/455047a
166 rdf:type schema:CreativeWork
167 sg:pub.10.1038/498255a schema:sameAs https://app.dimensions.ai/details/publication/pub.1042946700
168 https://doi.org/10.1038/498255a
169 rdf:type schema:CreativeWork
170 sg:pub.10.1038/nature20566 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031000217
171 https://doi.org/10.1038/nature20566
172 rdf:type schema:CreativeWork
173 sg:pub.10.1038/nrd.2016.230 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030835959
174 https://doi.org/10.1038/nrd.2016.230
175 rdf:type schema:CreativeWork
176 sg:pub.10.1038/nrd2518 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047432273
177 https://doi.org/10.1038/nrd2518
178 rdf:type schema:CreativeWork
179 sg:pub.10.1038/nrd2760 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025311434
180 https://doi.org/10.1038/nrd2760
181 rdf:type schema:CreativeWork
182 sg:pub.10.1186/s12859-015-0731-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028981300
183 https://doi.org/10.1186/s12859-015-0731-9
184 rdf:type schema:CreativeWork
185 sg:pub.10.1186/s12859-018-2121-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1101898328
186 https://doi.org/10.1186/s12859-018-2121-6
187 rdf:type schema:CreativeWork
188 sg:pub.10.1186/s12938-017-0357-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1091274879
189 https://doi.org/10.1186/s12938-017-0357-4
190 rdf:type schema:CreativeWork
191 https://app.dimensions.ai/details/publication/pub.1082642903 rdf:type schema:CreativeWork
192 https://doi.org/10.1002/0471250953.bi0101s50 schema:sameAs https://app.dimensions.ai/details/publication/pub.1026436840
193 rdf:type schema:CreativeWork
194 https://doi.org/10.1002/ijc.29047 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022453222
195 rdf:type schema:CreativeWork
196 https://doi.org/10.1002/prot.20373 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051371941
197 rdf:type schema:CreativeWork
198 https://doi.org/10.1016/0003-2670(93)80437-p schema:sameAs https://app.dimensions.ai/details/publication/pub.1038362770
199 rdf:type schema:CreativeWork
200 https://doi.org/10.1016/j.bbrc.2013.08.023 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014287757
201 rdf:type schema:CreativeWork
202 https://doi.org/10.1016/j.biosystems.2006.03.006 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008791531
203 rdf:type schema:CreativeWork
204 https://doi.org/10.1016/j.compbiolchem.2005.09.006 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016226726
205 rdf:type schema:CreativeWork
206 https://doi.org/10.1016/j.compbiomed.2011.05.015 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045373136
207 rdf:type schema:CreativeWork
208 https://doi.org/10.1016/j.coph.2014.12.002 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044505059
209 rdf:type schema:CreativeWork
210 https://doi.org/10.1016/j.gene.2005.07.029 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035607530
211 rdf:type schema:CreativeWork
212 https://doi.org/10.1016/j.ipm.2009.03.002 schema:sameAs https://app.dimensions.ai/details/publication/pub.1012321154
213 rdf:type schema:CreativeWork
214 https://doi.org/10.1016/j.neuron.2017.03.016 schema:sameAs https://app.dimensions.ai/details/publication/pub.1085116577
215 rdf:type schema:CreativeWork
216 https://doi.org/10.1016/s0163-7258(03)00038-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1004232436
217 rdf:type schema:CreativeWork
218 https://doi.org/10.1016/s1054-3589(10)58010-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034153661
219 rdf:type schema:CreativeWork
220 https://doi.org/10.1016/s1359-6446(01)02131-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015484954
221 rdf:type schema:CreativeWork
222 https://doi.org/10.1021/cn200025w schema:sameAs https://app.dimensions.ai/details/publication/pub.1054062739
223 rdf:type schema:CreativeWork
224 https://doi.org/10.1021/jm9700575 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055958127
225 rdf:type schema:CreativeWork
226 https://doi.org/10.1021/pr060534k schema:sameAs https://app.dimensions.ai/details/publication/pub.1056291290
227 rdf:type schema:CreativeWork
228 https://doi.org/10.1093/bioinformatics/18.1.147 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016517627
229 rdf:type schema:CreativeWork
230 https://doi.org/10.1093/bioinformatics/btg317 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024930760
231 rdf:type schema:CreativeWork
232 https://doi.org/10.1093/bioinformatics/btl665 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000236398
233 rdf:type schema:CreativeWork
234 https://doi.org/10.1093/bioinformatics/btn028 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008261674
235 rdf:type schema:CreativeWork
236 https://doi.org/10.1093/database/baw161 schema:sameAs https://app.dimensions.ai/details/publication/pub.1026572344
237 rdf:type schema:CreativeWork
238 https://doi.org/10.1093/nar/gkw1218 schema:sameAs https://app.dimensions.ai/details/publication/pub.1079364417
239 rdf:type schema:CreativeWork
240 https://doi.org/10.1109/ijcnn.2015.7280613 schema:sameAs https://app.dimensions.ai/details/publication/pub.1095788447
241 rdf:type schema:CreativeWork
242 https://doi.org/10.1109/tnnls.2013.2292894 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061718465
243 rdf:type schema:CreativeWork
244 https://doi.org/10.1110/ps.2500102 schema:sameAs https://app.dimensions.ai/details/publication/pub.1026733522
245 rdf:type schema:CreativeWork
246 https://doi.org/10.1111/bph.13509 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002460091
247 rdf:type schema:CreativeWork
248 https://doi.org/10.1124/mol.63.6.1256 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003054041
249 rdf:type schema:CreativeWork
250 https://doi.org/10.1146/annurev-pharmtox-032112-135923 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030698697
251 rdf:type schema:CreativeWork
252 https://doi.org/10.1198/tas.2011.11052 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064201549
253 rdf:type schema:CreativeWork
254 https://doi.org/10.1371/journal.pone.0041882 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032363911
255 rdf:type schema:CreativeWork
256 https://doi.org/10.1371/journal.pone.0046633 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046449008
257 rdf:type schema:CreativeWork
258 https://doi.org/10.1371/journal.pone.0141287 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014761536
259 rdf:type schema:CreativeWork
260 https://doi.org/10.1613/jair.1199 schema:sameAs https://app.dimensions.ai/details/publication/pub.1105579281
261 rdf:type schema:CreativeWork
262 https://doi.org/10.18547/gcb.2015.vol1.iss1.e19 schema:sameAs https://app.dimensions.ai/details/publication/pub.1068659075
263 rdf:type schema:CreativeWork
264 https://doi.org/10.3390/molecules22071119 schema:sameAs https://app.dimensions.ai/details/publication/pub.1090350697
265 rdf:type schema:CreativeWork
266 https://doi.org/10.7208/chicago/9780226416502.001.0001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1100805936
267 rdf:type schema:CreativeWork
268 https://www.grid.ac/institutes/grid.469673.9 schema:alternateName Centro de Investigación Biomédica en Red de Salud Mental
269 schema:name Institut de Neurociències - Unitat de Bioestadìstica, Universitat Autònoma de Barcelona, 08193, Cerdanyola del Vallès, Spain
270 Network Biomedical Research Center on Mental Health (CIBERSAM), 28029, Madrid, Spain
271 rdf:type schema:Organization
272 https://www.grid.ac/institutes/grid.6835.8 schema:alternateName Universitat Politècnica de Catalunya
273 schema:name Centro de Investigación Biomédica en Red en Bioingeniería, Biomateriales y Nanomedicina (CIBER-BBN), 08193, Cerdanyola del Vallès, Spain
274 IDEAI Research Center, Universitat Politècnica de Catalunya, UPC BarcelonaTech, 08034, Barcelona, Spain
275 rdf:type schema:Organization
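
The author order in the listing above is not stored as a flat sequence. It is encoded as an RDF collection: a chain of blank-node cells, each holding one item in rdf:first and a pointer to the next cell in rdf:rest, ending at rdf:nil. The following is a minimal sketch of how that chain could be walked to recover the order. The cell and name tables are copied by hand from the triples above. The helper names (find_head, walk_list) are hypothetical, and the sketch assumes the list head is the one cell that no other cell points to.

```python
# Collection cells copied from the listing: cell -> (rdf:first, rdf:rest).
CELLS = {
    "N02a618964511420a9056ae2937a2259c": ("Ne62307e0e9b54802af0a59a460e2fcba", "Nc96101df80b440658f3695ed36b40092"),
    "N4d96e1ace6b64e9b94a93d4af6906805": ("sg:person.0700003346.30", "N62abb746b8904ee2bc7b5371a1a70641"),
    "N62abb746b8904ee2bc7b5371a1a70641": ("sg:person.01210663303.80", "rdf:nil"),
    "Nc96101df80b440658f3695ed36b40092": ("sg:person.01270336770.60", "Nf92bb166eebb4b7682671b8039d01b7d"),
    "Nf92bb166eebb4b7682671b8039d01b7d": ("sg:person.01164502317.77", "N4d96e1ace6b64e9b94a93d4af6906805"),
    "Nff95df2029e0480abc07e8b1ad3a1368": ("sg:person.0631670146.50", "N02a618964511420a9056ae2937a2259c"),
}

# schema:givenName and schema:familyName for each person node in the listing.
NAMES = {
    "sg:person.0631670146.50": "Caroline König",
    "Ne62307e0e9b54802af0a59a460e2fcba": "Ilmira Shaim",
    "sg:person.01270336770.60": "Alfredo Vellido",
    "sg:person.01164502317.77": "Enrique Romero",
    "sg:person.0700003346.30": "René Alquézar",
    "sg:person.01210663303.80": "Jesús Giraldo",
}


def find_head(cells):
    """Return the only cell that is not the rdf:rest of another cell (assumption)."""
    rest_targets = {rest for _first, rest in cells.values()}
    heads = [node for node in cells if node not in rest_targets]
    if len(heads) != 1:
        raise ValueError(f"expected exactly one list head, found {len(heads)}")
    return heads[0]


def walk_list(cells, head):
    """Follow rdf:rest from the head to rdf:nil, collecting each rdf:first."""
    items, node, seen = [], head, set()
    while node != "rdf:nil":
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


authors = [NAMES[person] for person in walk_list(CELLS, find_head(CELLS))]
print(authors)
```

Under these assumptions the walk yields the authors in the same order as the article header. With a full RDF parser the same traversal would normally be delegated to the library rather than written by hand.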
 



