Interpretable genotype-to-phenotype classifiers with performance guarantees


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2019-12

AUTHORS

Alexandre Drouin, Gaël Letarte, Frédéric Raymond, Mario Marchand, Jacques Corbeil, François Laviolette

ABSTRACT

Understanding the relationship between the genome of a cell and its phenotype is a central problem in precision medicine. Nonetheless, genotype-to-phenotype prediction comes with great challenges for machine learning algorithms that limit their use in this setting. The high dimensionality of the data tends to hinder generalization and challenges the scalability of most learning algorithms. Additionally, most algorithms produce models that are complex and difficult to interpret. We alleviate these limitations by proposing strong performance guarantees, based on sample compression theory, for rule-based learning algorithms that produce highly interpretable models. We show that these guarantees can be leveraged to accelerate learning and improve model interpretability. Our approach is validated through an application to the genomic prediction of antimicrobial resistance, an important public health concern. Highly accurate models were obtained for 12 species and 56 antibiotics, and their interpretation revealed known resistance mechanisms, as well as some potentially new ones. An open-source disk-based implementation that is both memory and computationally efficient is provided with this work. The implementation is turnkey, requires no prior knowledge of machine learning, and is complemented by comprehensive tutorials.

PAGES

4071

Identifiers

URI

http://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2

DOI

http://dx.doi.org/10.1038/s41598-019-40561-2

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1112675121

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/30858411



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "Department of Computer Science and Software Engineering, Universit\u00e9 Laval, Quebec, Canada", 
            "Big Data Research Centre, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Drouin", 
        "givenName": "Alexandre", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "Department of Computer Science and Software Engineering, Universit\u00e9 Laval, Quebec, Canada", 
            "Big Data Research Centre, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Letarte", 
        "givenName": "Ga\u00ebl", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "School of Nutrition, Universit\u00e9 Laval, Quebec, Canada", 
            "Institute of Nutrition and Functional Foods, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Raymond", 
        "givenName": "Fr\u00e9d\u00e9ric", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "Department of Computer Science and Software Engineering, Universit\u00e9 Laval, Quebec, Canada", 
            "Big Data Research Centre, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Marchand", 
        "givenName": "Mario", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "Big Data Research Centre, Universit\u00e9 Laval, Quebec, Canada", 
            "Infectious Disease Research Centre, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Corbeil", 
        "givenName": "Jacques", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "Department of Computer Science and Software Engineering, Universit\u00e9 Laval, Quebec, Canada", 
            "Big Data Research Centre, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Laviolette", 
        "givenName": "Fran\u00e7ois", 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1128/aac.05583-11", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000434250"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s1473-3099(10)70139-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003782186"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bib/bbt067", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007818694"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkw1017", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009329911"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btm344", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009424564"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00993593", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010564739", 
          "https://doi.org/10.1007/bf00993593"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3389/fmicb.2016.01887", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011970086"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/0471667196.ess0866.pub2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011977749"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0022-2836(05)80360-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013618994"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pcbi.1002822", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1016451804"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrg2986", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1016983398", 
          "https://doi.org/10.1038/nrg2986"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrg3868", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017872565", 
          "https://doi.org/10.1038/nrg3868"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1010933404324", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1024739340", 
          "https://doi.org/10.1023/a:1010933404324"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0140-6736(00)03167-6", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1027743530"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/jproc.2015.2494198", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1028018203"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btu177", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1029212748"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aac.00774-09", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1029908739"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/1273496.1273597", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030151299"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0378-1119(99)00219-x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031039232"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrc2294", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033431840", 
          "https://doi.org/10.1038/nrc2294"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nmicrobiol.2016.41", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033465527", 
          "https://doi.org/10.1038/nmicrobiol.2016.41"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.1751-5823.2001.tb00465.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034920734"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/jb.01410-13", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036484036"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btt020", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038110157"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2147/idr.s26613", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038424283"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/srep27930", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039514060", 
          "https://doi.org/10.1038/srep27930"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.cell.2013.09.006", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041185259"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btg005", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043080454"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.0907925106", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043355738"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/gepi.20473", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047078076"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrmicro3380", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047623025", 
          "https://doi.org/10.1038/nrmicro3380"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrg.2016.132", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047890909", 
          "https://doi.org/10.1038/nrg.2016.132"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0033275", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048954099"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/s12864-016-2889-6", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052400167", 
          "https://doi.org/10.1186/s12864-016-2889-6"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/cr0301088", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1053880075"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bib/bbt052", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1059413022"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aac.02413-16", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062709581"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.18637/jss.v033.i01", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1068672496"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/j.1460-2075.1989.tb03494.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1078895400"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1083256144", 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/jac/dkx067", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1083932525"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/113563", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1085102893"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bib/bbx083", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1090537010"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/molbev/msx200", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1090638745"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/msphere.00341-17", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1091333202"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1097/qco.0000000000000406", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1091690880"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1017/cbo9780511809682", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1098667572"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/s41598-017-18972-w", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1100212536", 
          "https://doi.org/10.1038/s41598-017-18972-w"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pcbi.1005958", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1100829676"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/297754", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1103192186"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pgen.1007758", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1109823786"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2019-12", 
    "datePublishedReg": "2019-12-01", 
    "description": "Understanding the relationship between the genome of a cell and its phenotype is a central problem in precision medicine. Nonetheless, genotype-to-phenotype prediction comes with great challenges for machine learning algorithms that limit their use in this setting. The high dimensionality of the data tends to hinder generalization and challenges the scalability of most learning algorithms. Additionally, most algorithms produce models that are complex and difficult to interpret. We alleviate these limitations by proposing strong performance guarantees, based on sample compression theory, for rule-based learning algorithms that produce highly interpretable models. We show that these guarantees can be leveraged to accelerate learning and improve model interpretability. Our approach is validated through an application to the genomic prediction of antimicrobial resistance, an important public health concern. Highly accurate models were obtained for 12 species and 56 antibiotics, and their interpretation revealed known resistance mechanisms, as well as some potentially new ones. An open-source disk-based implementation that is both memory and computationally efficient is provided with this work. The implementation is turnkey, requires no prior knowledge of machine learning, and is complemented by comprehensive tutorials.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1038/s41598-019-40561-2", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.4168696", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1045337", 
        "issn": [
          "2045-2322"
        ], 
        "name": "Scientific Reports", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "9"
      }
    ], 
    "name": "Interpretable genotype-to-phenotype classifiers with performance guarantees", 
    "pagination": "4071", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "317ec35e17e793eb038ba90061bcd1a4027065fd95dc24b91a20888fde295bb7"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "30858411"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "101563288"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1038/s41598-019-40561-2"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1112675121"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1038/s41598-019-40561-2", 
      "https://app.dimensions.ai/details/publication/pub.1112675121"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-11T13:19", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000368_0000000368/records_78956_00000001.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://www.nature.com/articles/s41598-019-40561-2"
  }
]
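
Because the record is plain JSON, it can be read with any JSON parser. The following is a minimal Python sketch, not part of SciGraph itself, that pulls the title, author names, identifiers, and citation ids out of a record shaped like the one above. The `summarize` helper and the trimmed `RECORD_JSON` sample are assumptions made for illustration; the sample repeats one citation id on purpose so that the de-duplication step has something to do.

```python
import json

# Hypothetical trimmed sample in the same shape as the JSON-LD record above.
# Only the fields used below are kept; one citation id is repeated on purpose.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Drouin", "givenName": "Alexandre", "type": "Person"},
      {"familyName": "Marchand", "givenName": "Mario", "type": "Person"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1002/gepi.20473", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1002/gepi.20473", "type": "CreativeWork"},
      {"id": "sg:pub.10.1038/nrg2986", "type": "CreativeWork"}
    ],
    "name": "Interpretable genotype-to-phenotype classifiers with performance guarantees",
    "productId": [
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["30858411"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1038/s41598-019-40561-2"]}
    ]
  }
]
"""


def summarize(record):
    """Return a small dict with the fields most often needed from a record."""
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])
    ]
    # productId is a list of name/value pairs; each value is a one-item list.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    # dict.fromkeys drops repeated ids while keeping first-seen order.
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    return {
        "title": record["name"],
        "authors": authors,
        "doi": ids.get("doi"),
        "pubmed_id": ids.get("pubmed_id"),
        "citations": citations,
    }


# The top-level JSON value is a list holding a single record.
record = json.loads(RECORD_JSON)[0]
summary = summarize(record)
print(summary["title"])
print(", ".join(summary["authors"]))
print(summary["doi"], summary["pubmed_id"], len(summary["citations"]))
```

The same function should work on the full record, since it only relies on the `author`, `citation`, `name`, and `productId` keys shown above.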
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular linked data format that is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2'
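
The four curl commands differ only in the `Accept` header, so the same content negotiation can be scripted. Below is a minimal Python sketch using only the standard library. The names `ACCEPT_TYPES`, `build_request`, and `fetch` are illustrative assumptions, and the sketch assumes the endpoint still answers these requests, which is not guaranteed.

```python
import urllib.request

# URL and MIME types are taken from the curl commands above.
BASE_URL = "https://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2"

ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Build (but do not send) a GET request asking for the given RDF format."""
    try:
        mime = ACCEPT_TYPES[fmt]
    except KeyError:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(ACCEPT_TYPES)}")
    return urllib.request.Request(url, headers={"Accept": mime})


def fetch(fmt, url=BASE_URL, timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request needs no network access; fetch() does.
request = build_request("turtle")
print(request.full_url)
print(request.get_header("Accept"))
```

Calling `fetch("nt")` would then correspond to the N-Triples curl command, and so on for the other formats.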


 

This table displays all metadata directly associated with this object as RDF triples.

266 TRIPLES      21 PREDICATES      80 URIs      21 LITERALS      9 BLANK NODES
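
In the table, `schema:author` points to a blank node that heads an RDF list: each list node carries an `rdf:first` value (one author) and an `rdf:rest` link to the next node, ending at `rdf:nil`. The following minimal Python sketch shows how the author order can be recovered by following those links. The node ids and family names are copied from the table rows; the dictionaries and the `walk_rdf_list` helper are assumptions made for illustration, not a SciGraph API.

```python
RDF_NIL = "rdf:nil"

# list node -> (rdf:first, rdf:rest), copied from the table rows.
LIST_NODES = {
    "N37341d45470546ba93193d9c5237d9c4": ("Nf5cb78f9d7e24bb0863d48f14411972e", "N15ee51f7b6cd4760a2ff7f3d7b920f38"),
    "N15ee51f7b6cd4760a2ff7f3d7b920f38": ("Nb585a1861d174739b032f9aa8fce2cc4", "N87014d0ecbb242b8a004afc2f4576778"),
    "N87014d0ecbb242b8a004afc2f4576778": ("Nc531d47d74644a99b8022ea499144f8d", "Ne501e271adcf400f977444ddb45e9b83"),
    "Ne501e271adcf400f977444ddb45e9b83": ("Nc011d11ab5054b2fad309ce7149f89e6", "N8075e8b5348244259f95a62c73ffaed7"),
    "N8075e8b5348244259f95a62c73ffaed7": ("Nde07e8e915ea4ff8a75cb2ec0ed71f73", "Nb1876aa39e254ac9a7b08c9fe3b3e636"),
    "Nb1876aa39e254ac9a7b08c9fe3b3e636": ("N60eea53f08df4d138398bc8b12b41e51", RDF_NIL),
}

# person node -> schema:familyName, copied from the table rows.
FAMILY_NAMES = {
    "Nf5cb78f9d7e24bb0863d48f14411972e": "Drouin",
    "Nb585a1861d174739b032f9aa8fce2cc4": "Letarte",
    "Nc531d47d74644a99b8022ea499144f8d": "Raymond",
    "Nc011d11ab5054b2fad309ce7149f89e6": "Marchand",
    "Nde07e8e915ea4ff8a75cb2ec0ed71f73": "Corbeil",
    "N60eea53f08df4d138398bc8b12b41e51": "Laviolette",
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head and collect the rdf:first values."""
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# The head node is the object of the schema:author triple.
AUTHOR_LIST_HEAD = "N37341d45470546ba93193d9c5237d9c4"
author_order = [FAMILY_NAMES[n] for n in walk_rdf_list(AUTHOR_LIST_HEAD, LIST_NODES)]
print(author_order)
```

The recovered order matches the AUTHORS line at the top of the record, which is the point of encoding authors as an ordered list instead of a plain set of triples.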

Subject Predicate Object
1 sg:pub.10.1038/s41598-019-40561-2 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N37341d45470546ba93193d9c5237d9c4
4 schema:citation sg:pub.10.1007/bf00993593
5 sg:pub.10.1023/a:1010933404324
6 sg:pub.10.1038/nmicrobiol.2016.41
7 sg:pub.10.1038/nrc2294
8 sg:pub.10.1038/nrg.2016.132
9 sg:pub.10.1038/nrg2986
10 sg:pub.10.1038/nrg3868
11 sg:pub.10.1038/nrmicro3380
12 sg:pub.10.1038/s41598-017-18972-w
13 sg:pub.10.1038/srep27930
14 sg:pub.10.1186/s12864-016-2889-6
15 https://app.dimensions.ai/details/publication/pub.1083256144
16 https://doi.org/10.1002/0471667196.ess0866.pub2
17 https://doi.org/10.1002/gepi.20473
18 https://doi.org/10.1002/j.1460-2075.1989.tb03494.x
19 https://doi.org/10.1016/j.cell.2013.09.006
20 https://doi.org/10.1016/s0022-2836(05)80360-2
21 https://doi.org/10.1016/s0140-6736(00)03167-6
22 https://doi.org/10.1016/s0378-1119(99)00219-x
23 https://doi.org/10.1016/s1473-3099(10)70139-0
24 https://doi.org/10.1017/cbo9780511809682
25 https://doi.org/10.1021/cr0301088
26 https://doi.org/10.1073/pnas.0907925106
27 https://doi.org/10.1093/bib/bbt052
28 https://doi.org/10.1093/bib/bbt067
29 https://doi.org/10.1093/bib/bbx083
30 https://doi.org/10.1093/bioinformatics/btg005
31 https://doi.org/10.1093/bioinformatics/btm344
32 https://doi.org/10.1093/bioinformatics/btt020
33 https://doi.org/10.1093/bioinformatics/btu177
34 https://doi.org/10.1093/jac/dkx067
35 https://doi.org/10.1093/molbev/msx200
36 https://doi.org/10.1093/nar/gkw1017
37 https://doi.org/10.1097/qco.0000000000000406
38 https://doi.org/10.1101/113563
39 https://doi.org/10.1101/297754
40 https://doi.org/10.1109/jproc.2015.2494198
41 https://doi.org/10.1111/j.1751-5823.2001.tb00465.x
42 https://doi.org/10.1128/aac.00774-09
43 https://doi.org/10.1128/aac.02413-16
44 https://doi.org/10.1128/aac.05583-11
45 https://doi.org/10.1128/jb.01410-13
46 https://doi.org/10.1128/msphere.00341-17
47 https://doi.org/10.1145/1273496.1273597
48 https://doi.org/10.1371/journal.pcbi.1002822
49 https://doi.org/10.1371/journal.pcbi.1005958
50 https://doi.org/10.1371/journal.pgen.1007758
51 https://doi.org/10.1371/journal.pone.0033275
52 https://doi.org/10.18637/jss.v033.i01
53 https://doi.org/10.2147/idr.s26613
54 https://doi.org/10.3389/fmicb.2016.01887
55 schema:datePublished 2019-12
56 schema:datePublishedReg 2019-12-01
57 schema:description Understanding the relationship between the genome of a cell and its phenotype is a central problem in precision medicine. Nonetheless, genotype-to-phenotype prediction comes with great challenges for machine learning algorithms that limit their use in this setting. The high dimensionality of the data tends to hinder generalization and challenges the scalability of most learning algorithms. Additionally, most algorithms produce models that are complex and difficult to interpret. We alleviate these limitations by proposing strong performance guarantees, based on sample compression theory, for rule-based learning algorithms that produce highly interpretable models. We show that these guarantees can be leveraged to accelerate learning and improve model interpretability. Our approach is validated through an application to the genomic prediction of antimicrobial resistance, an important public health concern. Highly accurate models were obtained for 12 species and 56 antibiotics, and their interpretation revealed known resistance mechanisms, as well as some potentially new ones. An open-source disk-based implementation that is both memory and computationally efficient is provided with this work. The implementation is turnkey, requires no prior knowledge of machine learning, and is complemented by comprehensive tutorials.
58 schema:genre research_article
59 schema:inLanguage en
60 schema:isAccessibleForFree true
61 schema:isPartOf N13e7b8efa9e844a0ba96a02ca6c5bd44
62 Ndf74cc2c9489402792ed9704bdc2969a
63 sg:journal.1045337
64 schema:name Interpretable genotype-to-phenotype classifiers with performance guarantees
65 schema:pagination 4071
66 schema:productId N56568677f9c14da89b962dc082e9bee6
67 N8e367792242e4e998d6ac15fbd22e0ff
68 Nbf5e95ff01e0483d94c48bb3dc68f989
69 Nec62da7f3e4d4715bc2d0fa1943357d4
70 Ned61c5ca487845f8bd40dbc737e5e14b
71 schema:sameAs https://app.dimensions.ai/details/publication/pub.1112675121
72 https://doi.org/10.1038/s41598-019-40561-2
73 schema:sdDatePublished 2019-04-11T13:19
74 schema:sdLicense https://scigraph.springernature.com/explorer/license/
75 schema:sdPublisher N863dd37695054104ab881d8cd9398634
76 schema:url https://www.nature.com/articles/s41598-019-40561-2
77 sgo:license sg:explorer/license/
78 sgo:sdDataset articles
79 rdf:type schema:ScholarlyArticle
80 N13e7b8efa9e844a0ba96a02ca6c5bd44 schema:volumeNumber 9
81 rdf:type schema:PublicationVolume
82 N15ee51f7b6cd4760a2ff7f3d7b920f38 rdf:first Nb585a1861d174739b032f9aa8fce2cc4
83 rdf:rest N87014d0ecbb242b8a004afc2f4576778
84 N37341d45470546ba93193d9c5237d9c4 rdf:first Nf5cb78f9d7e24bb0863d48f14411972e
85 rdf:rest N15ee51f7b6cd4760a2ff7f3d7b920f38
86 N56568677f9c14da89b962dc082e9bee6 schema:name doi
87 schema:value 10.1038/s41598-019-40561-2
88 rdf:type schema:PropertyValue
89 N60eea53f08df4d138398bc8b12b41e51 schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
90 schema:familyName Laviolette
91 schema:givenName François
92 rdf:type schema:Person
93 N8075e8b5348244259f95a62c73ffaed7 rdf:first Nde07e8e915ea4ff8a75cb2ec0ed71f73
94 rdf:rest Nb1876aa39e254ac9a7b08c9fe3b3e636
95 N863dd37695054104ab881d8cd9398634 schema:name Springer Nature - SN SciGraph project
96 rdf:type schema:Organization
97 N87014d0ecbb242b8a004afc2f4576778 rdf:first Nc531d47d74644a99b8022ea499144f8d
98 rdf:rest Ne501e271adcf400f977444ddb45e9b83
99 N8e367792242e4e998d6ac15fbd22e0ff schema:name readcube_id
100 schema:value 317ec35e17e793eb038ba90061bcd1a4027065fd95dc24b91a20888fde295bb7
101 rdf:type schema:PropertyValue
102 Nb1876aa39e254ac9a7b08c9fe3b3e636 rdf:first N60eea53f08df4d138398bc8b12b41e51
103 rdf:rest rdf:nil
104 Nb585a1861d174739b032f9aa8fce2cc4 schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
105 schema:familyName Letarte
106 schema:givenName Gaël
107 rdf:type schema:Person
108 Nbf5e95ff01e0483d94c48bb3dc68f989 schema:name dimensions_id
109 schema:value pub.1112675121
110 rdf:type schema:PropertyValue
111 Nc011d11ab5054b2fad309ce7149f89e6 schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
112 schema:familyName Marchand
113 schema:givenName Mario
114 rdf:type schema:Person
115 Nc531d47d74644a99b8022ea499144f8d schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
116 schema:familyName Raymond
117 schema:givenName Frédéric
118 rdf:type schema:Person
119 Nde07e8e915ea4ff8a75cb2ec0ed71f73 schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
120 schema:familyName Corbeil
121 schema:givenName Jacques
122 rdf:type schema:Person
123 Ndf74cc2c9489402792ed9704bdc2969a schema:issueNumber 1
124 rdf:type schema:PublicationIssue
125 Ne501e271adcf400f977444ddb45e9b83 rdf:first Nc011d11ab5054b2fad309ce7149f89e6
126 rdf:rest N8075e8b5348244259f95a62c73ffaed7
127 Nec62da7f3e4d4715bc2d0fa1943357d4 schema:name nlm_unique_id
128 schema:value 101563288
129 rdf:type schema:PropertyValue
130 Ned61c5ca487845f8bd40dbc737e5e14b schema:name pubmed_id
131 schema:value 30858411
132 rdf:type schema:PropertyValue
133 Nf5cb78f9d7e24bb0863d48f14411972e schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
134 schema:familyName Drouin
135 schema:givenName Alexandre
136 rdf:type schema:Person
137 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
138 schema:name Information and Computing Sciences
139 rdf:type schema:DefinedTerm
140 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
141 schema:name Artificial Intelligence and Image Processing
142 rdf:type schema:DefinedTerm
143 sg:grant.4168696 http://pending.schema.org/fundedItem sg:pub.10.1038/s41598-019-40561-2
144 rdf:type schema:MonetaryGrant
145 sg:journal.1045337 schema:issn 2045-2322
146 schema:name Scientific Reports
147 rdf:type schema:Periodical
148 sg:pub.10.1007/bf00993593 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010564739
149 https://doi.org/10.1007/bf00993593
150 rdf:type schema:CreativeWork
151 sg:pub.10.1023/a:1010933404324 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024739340
152 https://doi.org/10.1023/a:1010933404324
153 rdf:type schema:CreativeWork
154 sg:pub.10.1038/nmicrobiol.2016.41 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033465527
155 https://doi.org/10.1038/nmicrobiol.2016.41
156 rdf:type schema:CreativeWork
157 sg:pub.10.1038/nrc2294 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033431840
158 https://doi.org/10.1038/nrc2294
159 rdf:type schema:CreativeWork
160 sg:pub.10.1038/nrg.2016.132 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047890909
161 https://doi.org/10.1038/nrg.2016.132
162 rdf:type schema:CreativeWork
163 sg:pub.10.1038/nrg2986 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016983398
164 https://doi.org/10.1038/nrg2986
165 rdf:type schema:CreativeWork
166 sg:pub.10.1038/nrg3868 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017872565
167 https://doi.org/10.1038/nrg3868
168 rdf:type schema:CreativeWork
169 sg:pub.10.1038/nrmicro3380 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047623025
170 https://doi.org/10.1038/nrmicro3380
171 rdf:type schema:CreativeWork
172 sg:pub.10.1038/s41598-017-18972-w schema:sameAs https://app.dimensions.ai/details/publication/pub.1100212536
173 https://doi.org/10.1038/s41598-017-18972-w
174 rdf:type schema:CreativeWork
175 sg:pub.10.1038/srep27930 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039514060
176 https://doi.org/10.1038/srep27930
177 rdf:type schema:CreativeWork
178 sg:pub.10.1186/s12864-016-2889-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052400167
179 https://doi.org/10.1186/s12864-016-2889-6
180 rdf:type schema:CreativeWork
181 https://app.dimensions.ai/details/publication/pub.1083256144 rdf:type schema:CreativeWork
182 https://doi.org/10.1002/0471667196.ess0866.pub2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011977749
183 rdf:type schema:CreativeWork
184 https://doi.org/10.1002/gepi.20473 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047078076
185 rdf:type schema:CreativeWork
186 https://doi.org/10.1002/j.1460-2075.1989.tb03494.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1078895400
187 rdf:type schema:CreativeWork
188 https://doi.org/10.1016/j.cell.2013.09.006 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041185259
189 rdf:type schema:CreativeWork
190 https://doi.org/10.1016/s0022-2836(05)80360-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013618994
191 rdf:type schema:CreativeWork
192 https://doi.org/10.1016/s0140-6736(00)03167-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027743530
193 rdf:type schema:CreativeWork
194 https://doi.org/10.1016/s0378-1119(99)00219-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1031039232
195 rdf:type schema:CreativeWork
196 https://doi.org/10.1016/s1473-3099(10)70139-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003782186
197 rdf:type schema:CreativeWork
198 https://doi.org/10.1017/cbo9780511809682 schema:sameAs https://app.dimensions.ai/details/publication/pub.1098667572
199 rdf:type schema:CreativeWork
200 https://doi.org/10.1021/cr0301088 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053880075
201 rdf:type schema:CreativeWork
202 https://doi.org/10.1073/pnas.0907925106 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043355738
203 rdf:type schema:CreativeWork
204 https://doi.org/10.1093/bib/bbt052 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059413022
205 rdf:type schema:CreativeWork
206 https://doi.org/10.1093/bib/bbt067 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007818694
207 rdf:type schema:CreativeWork
208 https://doi.org/10.1093/bib/bbx083 schema:sameAs https://app.dimensions.ai/details/publication/pub.1090537010
209 rdf:type schema:CreativeWork
210 https://doi.org/10.1093/bioinformatics/btg005 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043080454
211 rdf:type schema:CreativeWork
212 https://doi.org/10.1093/bioinformatics/btm344 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009424564
213 rdf:type schema:CreativeWork
214 https://doi.org/10.1093/bioinformatics/btt020 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038110157
215 rdf:type schema:CreativeWork
216 https://doi.org/10.1093/bioinformatics/btu177 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029212748
217 rdf:type schema:CreativeWork
218 https://doi.org/10.1093/jac/dkx067 schema:sameAs https://app.dimensions.ai/details/publication/pub.1083932525
219 rdf:type schema:CreativeWork
220 https://doi.org/10.1093/molbev/msx200 schema:sameAs https://app.dimensions.ai/details/publication/pub.1090638745
221 rdf:type schema:CreativeWork
222 https://doi.org/10.1093/nar/gkw1017 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009329911
223 rdf:type schema:CreativeWork
224 https://doi.org/10.1097/qco.0000000000000406 schema:sameAs https://app.dimensions.ai/details/publication/pub.1091690880
225 rdf:type schema:CreativeWork
226 https://doi.org/10.1101/113563 schema:sameAs https://app.dimensions.ai/details/publication/pub.1085102893
227 rdf:type schema:CreativeWork
228 https://doi.org/10.1101/297754 schema:sameAs https://app.dimensions.ai/details/publication/pub.1103192186
229 rdf:type schema:CreativeWork
230 https://doi.org/10.1109/jproc.2015.2494198 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028018203
231 rdf:type schema:CreativeWork
232 https://doi.org/10.1111/j.1751-5823.2001.tb00465.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1034920734
233 rdf:type schema:CreativeWork
234 https://doi.org/10.1128/aac.00774-09 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029908739
235 rdf:type schema:CreativeWork
236 https://doi.org/10.1128/aac.02413-16 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062709581
237 rdf:type schema:CreativeWork
238 https://doi.org/10.1128/aac.05583-11 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000434250
239 rdf:type schema:CreativeWork
240 https://doi.org/10.1128/jb.01410-13 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036484036
241 rdf:type schema:CreativeWork
242 https://doi.org/10.1128/msphere.00341-17 schema:sameAs https://app.dimensions.ai/details/publication/pub.1091333202
243 rdf:type schema:CreativeWork
244 https://doi.org/10.1145/1273496.1273597 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030151299
245 rdf:type schema:CreativeWork
246 https://doi.org/10.1371/journal.pcbi.1002822 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016451804
247 rdf:type schema:CreativeWork
248 https://doi.org/10.1371/journal.pcbi.1005958 schema:sameAs https://app.dimensions.ai/details/publication/pub.1100829676
249 rdf:type schema:CreativeWork
250 https://doi.org/10.1371/journal.pgen.1007758 schema:sameAs https://app.dimensions.ai/details/publication/pub.1109823786
251 rdf:type schema:CreativeWork
252 https://doi.org/10.1371/journal.pone.0033275 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048954099
253 rdf:type schema:CreativeWork
254 https://doi.org/10.18637/jss.v033.i01 schema:sameAs https://app.dimensions.ai/details/publication/pub.1068672496
255 rdf:type schema:CreativeWork
256 https://doi.org/10.2147/idr.s26613 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038424283
257 rdf:type schema:CreativeWork
258 https://doi.org/10.3389/fmicb.2016.01887 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011970086
259 rdf:type schema:CreativeWork
260 https://www.grid.ac/institutes/grid.23856.3a schema:alternateName Université Laval
261 schema:name Big Data Research Centre, Université Laval, Quebec, Canada
262 Department of Computer Science and Software Engineering, Université Laval, Quebec, Canada
263 Infectious Disease Research Centre, Université Laval, Quebec, Canada
264 Institute of Nutrition and Functional Foods, Université Laval, Quebec, Canada
265 School of Nutrition, Université Laval, Quebec, Canada
266 rdf:type schema:Organization
 





