Interpretable genotype-to-phenotype classifiers with performance guarantees


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2019-12

AUTHORS

Alexandre Drouin, Gaël Letarte, Frédéric Raymond, Mario Marchand, Jacques Corbeil, François Laviolette

ABSTRACT

Understanding the relationship between the genome of a cell and its phenotype is a central problem in precision medicine. Nonetheless, genotype-to-phenotype prediction comes with great challenges for machine learning algorithms that limit their use in this setting. The high dimensionality of the data tends to hinder generalization and challenges the scalability of most learning algorithms. Additionally, most algorithms produce models that are complex and difficult to interpret. We alleviate these limitations by proposing strong performance guarantees, based on sample compression theory, for rule-based learning algorithms that produce highly interpretable models. We show that these guarantees can be leveraged to accelerate learning and improve model interpretability. Our approach is validated through an application to the genomic prediction of antimicrobial resistance, an important public health concern. Highly accurate models were obtained for 12 species and 56 antibiotics, and their interpretation revealed known resistance mechanisms, as well as some potentially new ones. An open-source disk-based implementation that is both memory and computationally efficient is provided with this work. The implementation is turnkey, requires no prior knowledge of machine learning, and is complemented by comprehensive tutorials.

PAGES

4071

Identifiers

URI

http://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2

DOI

http://dx.doi.org/10.1038/s41598-019-40561-2

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1112675121

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/30858411


JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "Department of Computer Science and Software Engineering, Universit\u00e9 Laval, Quebec, Canada", 
            "Big Data Research Centre, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Drouin", 
        "givenName": "Alexandre", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "Department of Computer Science and Software Engineering, Universit\u00e9 Laval, Quebec, Canada", 
            "Big Data Research Centre, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Letarte", 
        "givenName": "Ga\u00ebl", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "School of Nutrition, Universit\u00e9 Laval, Quebec, Canada", 
            "Institute of Nutrition and Functional Foods, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Raymond", 
        "givenName": "Fr\u00e9d\u00e9ric", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "Department of Computer Science and Software Engineering, Universit\u00e9 Laval, Quebec, Canada", 
            "Big Data Research Centre, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Marchand", 
        "givenName": "Mario", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "Big Data Research Centre, Universit\u00e9 Laval, Quebec, Canada", 
            "Infectious Disease Research Centre, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Corbeil", 
        "givenName": "Jacques", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Universit\u00e9 Laval", 
          "id": "https://www.grid.ac/institutes/grid.23856.3a", 
          "name": [
            "Department of Computer Science and Software Engineering, Universit\u00e9 Laval, Quebec, Canada", 
            "Big Data Research Centre, Universit\u00e9 Laval, Quebec, Canada"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Laviolette", 
        "givenName": "Fran\u00e7ois", 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1128/aac.05583-11", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000434250"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s1473-3099(10)70139-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003782186"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bib/bbt067", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007818694"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkw1017", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009329911"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btm344", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009424564"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00993593", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010564739", 
          "https://doi.org/10.1007/bf00993593"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3389/fmicb.2016.01887", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011970086"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/0471667196.ess0866.pub2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011977749"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0022-2836(05)80360-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013618994"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pcbi.1002822", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1016451804"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrg2986", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1016983398", 
          "https://doi.org/10.1038/nrg2986"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrg3868", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017872565", 
          "https://doi.org/10.1038/nrg3868"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1010933404324", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1024739340", 
          "https://doi.org/10.1023/a:1010933404324"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0140-6736(00)03167-6", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1027743530"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/jproc.2015.2494198", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1028018203"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btu177", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1029212748"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aac.00774-09", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1029908739"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/1273496.1273597", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030151299"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0378-1119(99)00219-x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031039232"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrc2294", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033431840", 
          "https://doi.org/10.1038/nrc2294"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nmicrobiol.2016.41", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033465527", 
          "https://doi.org/10.1038/nmicrobiol.2016.41"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.1751-5823.2001.tb00465.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034920734"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/jb.01410-13", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036484036"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btt020", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038110157"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2147/idr.s26613", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038424283"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/srep27930", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039514060", 
          "https://doi.org/10.1038/srep27930"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.cell.2013.09.006", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041185259"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btg005", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043080454"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.0907925106", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043355738"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/gepi.20473", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047078076"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrmicro3380", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047623025", 
          "https://doi.org/10.1038/nrmicro3380"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrg.2016.132", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047890909", 
          "https://doi.org/10.1038/nrg.2016.132"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0033275", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048954099"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/s12864-016-2889-6", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052400167", 
          "https://doi.org/10.1186/s12864-016-2889-6"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/cr0301088", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1053880075"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bib/bbt052", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1059413022"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aac.02413-16", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062709581"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.18637/jss.v033.i01", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1068672496"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/j.1460-2075.1989.tb03494.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1078895400"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1083256144", 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/jac/dkx067", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1083932525"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/113563", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1085102893"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bib/bbx083", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1090537010"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/molbev/msx200", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1090638745"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/msphere.00341-17", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1091333202"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1097/qco.0000000000000406", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1091690880"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1017/cbo9780511809682", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1098667572"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/s41598-017-18972-w", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1100212536", 
          "https://doi.org/10.1038/s41598-017-18972-w"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pcbi.1005958", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1100829676"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/297754", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1103192186"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pgen.1007758", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1109823786"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2019-12", 
    "datePublishedReg": "2019-12-01", 
    "description": "Understanding the relationship between the genome of a cell and its phenotype is a central problem in precision medicine. Nonetheless, genotype-to-phenotype prediction comes with great challenges for machine learning algorithms that limit their use in this setting. The high dimensionality of the data tends to hinder generalization and challenges the scalability of most learning algorithms. Additionally, most algorithms produce models that are complex and difficult to interpret. We alleviate these limitations by proposing strong performance guarantees, based on sample compression theory, for rule-based learning algorithms that produce highly interpretable models. We show that these guarantees can be leveraged to accelerate learning and improve model interpretability. Our approach is validated through an application to the genomic prediction of antimicrobial resistance, an important public health concern. Highly accurate models were obtained for 12 species and 56 antibiotics, and their interpretation revealed known resistance mechanisms, as well as some potentially new ones. An open-source disk-based implementation that is both memory and computationally efficient is provided with this work. The implementation is turnkey, requires no prior knowledge of machine learning, and is complemented by comprehensive tutorials.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1038/s41598-019-40561-2", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.4168696", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1045337", 
        "issn": [
          "2045-2322"
        ], 
        "name": "Scientific Reports", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "9"
      }
    ], 
    "name": "Interpretable genotype-to-phenotype classifiers with performance guarantees", 
    "pagination": "4071", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "317ec35e17e793eb038ba90061bcd1a4027065fd95dc24b91a20888fde295bb7"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "30858411"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "101563288"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1038/s41598-019-40561-2"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1112675121"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1038/s41598-019-40561-2", 
      "https://app.dimensions.ai/details/publication/pub.1112675121"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-11T13:19", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000368_0000000368/records_78956_00000001.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://www.nature.com/articles/s41598-019-40561-2"
  }
]
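
As a rough illustration of how a record shaped like the one above could be consumed, the following Python sketch reads the author names and the cited work identifiers from it. The sample is an abbreviated copy of the record (two authors, three citation entries); the helper names `author_names` and `unique_citation_ids` are hypothetical and are not part of SciGraph. The second helper also drops repeated citation ids, in case an export lists the same work more than once.

```python
import json

# Abbreviated copy of the record: only two authors and three citation entries
# are kept. The second citation is repeated on purpose to exercise the
# de-duplication helper below.
RECORD = json.loads(r"""
[
  {
    "id": "sg:pub.10.1038/s41598-019-40561-2",
    "author": [
      {"familyName": "Drouin", "givenName": "Alexandre", "type": "Person"},
      {"familyName": "Letarte", "givenName": "Ga\u00ebl", "type": "Person"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1002/gepi.20473", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1002/gepi.20473", "type": "CreativeWork"},
      {"id": "sg:pub.10.1038/nrg2986", "type": "CreativeWork"}
    ]
  }
]
""")


def author_names(record):
    """Return 'given family' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def unique_citation_ids(record):
    """Return citation ids in first-seen order, skipping repeats."""
    seen = set()
    ordered = []
    for entry in record.get("citation", []):
        cid = entry["id"]
        if cid not in seen:
            seen.add(cid)
            ordered.append(cid)
    return ordered


article = RECORD[0]  # the top-level JSON value is a list holding one record
print(author_names(article))
print(unique_citation_ids(article))
```

The top-level value is a list, so the record itself is the first element. Escapes such as `\u00eb` are decoded by the JSON parser into the accented characters.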
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2'
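
The four curl commands above differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. This is a minimal sketch, not an official client: the short format labels and the function names `build_request` and `fetch` are assumptions made for illustration, and it assumes the endpoint still answers as described on this page.

```python
import urllib.request

# URL and media types are taken from the curl commands; the short labels
# used as dictionary keys are hypothetical.
BASE_URL = "https://scigraph.springernature.com/pub.10.1038/s41598-019-40561-2"

ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Build a GET request whose Accept header selects the RDF serialization."""
    if fmt not in ACCEPT:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(ACCEPT)}")
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=BASE_URL, timeout=30):
    """Perform the request and return the response body as text."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Nothing is sent over the network until `fetch` is called, for example `fetch("turtle")`. Keeping `build_request` separate makes it possible to inspect the header without contacting the server.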


 

This table displays all metadata directly associated with this object as RDF triples.

266 TRIPLES      21 PREDICATES      80 URIs      21 LITERALS      9 BLANK NODES
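
In the table that follows, the author order is not stored as a plain array: schema:author points to a blank node, and each such node carries an rdf:first (the current item) and an rdf:rest (the next node, or rdf:nil at the end). The sketch below walks that chain to recover the author order. The pairs are copied from the table rows, with each blank-node label shortened to its first 8 characters for readability; the function name `walk_rdf_list` is hypothetical.

```python
RDF_NIL = "rdf:nil"

# list node -> (rdf:first, rdf:rest), labels shortened to 8 characters.
LIST_NODES = {
    "N3d45401": ("Naa44d0e", "N95c9560"),
    "N95c9560": ("Nf2b4d39", "Nfd20010"),
    "Nfd20010": ("N2aaab19", "N60dd422"),
    "N60dd422": ("N9253948", "Nb7aed88"),
    "Nb7aed88": ("Nfa8f828", "N71b5e7e"),
    "N71b5e7e": ("Nf182cd8", RDF_NIL),
}

# schema:familyName of each person node, also from the table.
FAMILY_NAME = {
    "Naa44d0e": "Drouin",
    "Nf2b4d39": "Letarte",
    "N2aaab19": "Raymond",
    "N9253948": "Marchand",
    "Nfa8f828": "Corbeil",
    "Nf182cd8": "Laviolette",
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head, collecting each rdf:first value."""
    items = []
    visited = set()
    node = head
    while node != RDF_NIL:
        if node in visited:
            raise ValueError(f"cycle detected at {node}")
        visited.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# "N3d45401" is the (shortened) object of schema:author in the table.
authors = [FAMILY_NAME[n] for n in walk_rdf_list("N3d45401", LIST_NODES)]
print(authors)
```

The recovered order matches the author list given at the top of this record.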

Row Subject Predicate Object (rows that omit the subject or predicate inherit it from the preceding row)
1 sg:pub.10.1038/s41598-019-40561-2 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N3d454012d55347cfa48886637ece41c7
4 schema:citation sg:pub.10.1007/bf00993593
5 sg:pub.10.1023/a:1010933404324
6 sg:pub.10.1038/nmicrobiol.2016.41
7 sg:pub.10.1038/nrc2294
8 sg:pub.10.1038/nrg.2016.132
9 sg:pub.10.1038/nrg2986
10 sg:pub.10.1038/nrg3868
11 sg:pub.10.1038/nrmicro3380
12 sg:pub.10.1038/s41598-017-18972-w
13 sg:pub.10.1038/srep27930
14 sg:pub.10.1186/s12864-016-2889-6
15 https://app.dimensions.ai/details/publication/pub.1083256144
16 https://doi.org/10.1002/0471667196.ess0866.pub2
17 https://doi.org/10.1002/gepi.20473
18 https://doi.org/10.1002/j.1460-2075.1989.tb03494.x
19 https://doi.org/10.1016/j.cell.2013.09.006
20 https://doi.org/10.1016/s0022-2836(05)80360-2
21 https://doi.org/10.1016/s0140-6736(00)03167-6
22 https://doi.org/10.1016/s0378-1119(99)00219-x
23 https://doi.org/10.1016/s1473-3099(10)70139-0
24 https://doi.org/10.1017/cbo9780511809682
25 https://doi.org/10.1021/cr0301088
26 https://doi.org/10.1073/pnas.0907925106
27 https://doi.org/10.1093/bib/bbt052
28 https://doi.org/10.1093/bib/bbt067
29 https://doi.org/10.1093/bib/bbx083
30 https://doi.org/10.1093/bioinformatics/btg005
31 https://doi.org/10.1093/bioinformatics/btm344
32 https://doi.org/10.1093/bioinformatics/btt020
33 https://doi.org/10.1093/bioinformatics/btu177
34 https://doi.org/10.1093/jac/dkx067
35 https://doi.org/10.1093/molbev/msx200
36 https://doi.org/10.1093/nar/gkw1017
37 https://doi.org/10.1097/qco.0000000000000406
38 https://doi.org/10.1101/113563
39 https://doi.org/10.1101/297754
40 https://doi.org/10.1109/jproc.2015.2494198
41 https://doi.org/10.1111/j.1751-5823.2001.tb00465.x
42 https://doi.org/10.1128/aac.00774-09
43 https://doi.org/10.1128/aac.02413-16
44 https://doi.org/10.1128/aac.05583-11
45 https://doi.org/10.1128/jb.01410-13
46 https://doi.org/10.1128/msphere.00341-17
47 https://doi.org/10.1145/1273496.1273597
48 https://doi.org/10.1371/journal.pcbi.1002822
49 https://doi.org/10.1371/journal.pcbi.1005958
50 https://doi.org/10.1371/journal.pgen.1007758
51 https://doi.org/10.1371/journal.pone.0033275
52 https://doi.org/10.18637/jss.v033.i01
53 https://doi.org/10.2147/idr.s26613
54 https://doi.org/10.3389/fmicb.2016.01887
55 schema:datePublished 2019-12
56 schema:datePublishedReg 2019-12-01
57 schema:description Understanding the relationship between the genome of a cell and its phenotype is a central problem in precision medicine. Nonetheless, genotype-to-phenotype prediction comes with great challenges for machine learning algorithms that limit their use in this setting. The high dimensionality of the data tends to hinder generalization and challenges the scalability of most learning algorithms. Additionally, most algorithms produce models that are complex and difficult to interpret. We alleviate these limitations by proposing strong performance guarantees, based on sample compression theory, for rule-based learning algorithms that produce highly interpretable models. We show that these guarantees can be leveraged to accelerate learning and improve model interpretability. Our approach is validated through an application to the genomic prediction of antimicrobial resistance, an important public health concern. Highly accurate models were obtained for 12 species and 56 antibiotics, and their interpretation revealed known resistance mechanisms, as well as some potentially new ones. An open-source disk-based implementation that is both memory and computationally efficient is provided with this work. The implementation is turnkey, requires no prior knowledge of machine learning, and is complemented by comprehensive tutorials.
58 schema:genre research_article
59 schema:inLanguage en
60 schema:isAccessibleForFree true
61 schema:isPartOf N5bfbd52f61654aedb37bcd686a3ac5b5
62 N87cbc427616c4b618c1cec122d60de81
63 sg:journal.1045337
64 schema:name Interpretable genotype-to-phenotype classifiers with performance guarantees
65 schema:pagination 4071
66 schema:productId N52c49398c5bb4737a6de79c9f08d06c9
67 N5d12aa44276946cba72e75442dd94e4f
68 N6201354195d34647b7ec2aff5f550f76
69 Nc304dfe1e68b478d9d4854367cf2fe01
70 Ned8f2313fd4b42c3bd1b308b2ea6cb4d
71 schema:sameAs https://app.dimensions.ai/details/publication/pub.1112675121
72 https://doi.org/10.1038/s41598-019-40561-2
73 schema:sdDatePublished 2019-04-11T13:19
74 schema:sdLicense https://scigraph.springernature.com/explorer/license/
75 schema:sdPublisher Nd6c5d38cfb8d43cfa191c30b03f0ace2
76 schema:url https://www.nature.com/articles/s41598-019-40561-2
77 sgo:license sg:explorer/license/
78 sgo:sdDataset articles
79 rdf:type schema:ScholarlyArticle
80 N2aaab198a33e4c5989bb74db63ebea2c schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
81 schema:familyName Raymond
82 schema:givenName Frédéric
83 rdf:type schema:Person
84 N3d454012d55347cfa48886637ece41c7 rdf:first Naa44d0e262a0411e935310c2c7ef824a
85 rdf:rest N95c95604d6ea4b13b3f800747010c44a
86 N52c49398c5bb4737a6de79c9f08d06c9 schema:name doi
87 schema:value 10.1038/s41598-019-40561-2
88 rdf:type schema:PropertyValue
89 N5bfbd52f61654aedb37bcd686a3ac5b5 schema:volumeNumber 9
90 rdf:type schema:PublicationVolume
91 N5d12aa44276946cba72e75442dd94e4f schema:name nlm_unique_id
92 schema:value 101563288
93 rdf:type schema:PropertyValue
94 N60dd422682294fd9884fb32d06f0627c rdf:first N92539480a6b64beebd59b5f652636cd6
95 rdf:rest Nb7aed88a273a42e8a59e151523e58f6e
96 N6201354195d34647b7ec2aff5f550f76 schema:name dimensions_id
97 schema:value pub.1112675121
98 rdf:type schema:PropertyValue
99 N71b5e7e4cdd04bed8444afc46f8e6ac1 rdf:first Nf182cd8cba4f4bd5bdcb507521a24641
100 rdf:rest rdf:nil
101 N87cbc427616c4b618c1cec122d60de81 schema:issueNumber 1
102 rdf:type schema:PublicationIssue
103 N92539480a6b64beebd59b5f652636cd6 schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
104 schema:familyName Marchand
105 schema:givenName Mario
106 rdf:type schema:Person
107 N95c95604d6ea4b13b3f800747010c44a rdf:first Nf2b4d3975abd45b18b1c296df8a6dacf
108 rdf:rest Nfd2001063595477faf5b0c40afe489b3
109 Naa44d0e262a0411e935310c2c7ef824a schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
110 schema:familyName Drouin
111 schema:givenName Alexandre
112 rdf:type schema:Person
113 Nb7aed88a273a42e8a59e151523e58f6e rdf:first Nfa8f8282712340cc98b59890640219f3
114 rdf:rest N71b5e7e4cdd04bed8444afc46f8e6ac1
115 Nc304dfe1e68b478d9d4854367cf2fe01 schema:name readcube_id
116 schema:value 317ec35e17e793eb038ba90061bcd1a4027065fd95dc24b91a20888fde295bb7
117 rdf:type schema:PropertyValue
118 Nd6c5d38cfb8d43cfa191c30b03f0ace2 schema:name Springer Nature - SN SciGraph project
119 rdf:type schema:Organization
120 Ned8f2313fd4b42c3bd1b308b2ea6cb4d schema:name pubmed_id
121 schema:value 30858411
122 rdf:type schema:PropertyValue
123 Nf182cd8cba4f4bd5bdcb507521a24641 schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
124 schema:familyName Laviolette
125 schema:givenName François
126 rdf:type schema:Person
127 Nf2b4d3975abd45b18b1c296df8a6dacf schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
128 schema:familyName Letarte
129 schema:givenName Gaël
130 rdf:type schema:Person
131 Nfa8f8282712340cc98b59890640219f3 schema:affiliation https://www.grid.ac/institutes/grid.23856.3a
132 schema:familyName Corbeil
133 schema:givenName Jacques
134 rdf:type schema:Person
135 Nfd2001063595477faf5b0c40afe489b3 rdf:first N2aaab198a33e4c5989bb74db63ebea2c
136 rdf:rest N60dd422682294fd9884fb32d06f0627c
137 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
138 schema:name Information and Computing Sciences
139 rdf:type schema:DefinedTerm
140 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
141 schema:name Artificial Intelligence and Image Processing
142 rdf:type schema:DefinedTerm
143 sg:grant.4168696 http://pending.schema.org/fundedItem sg:pub.10.1038/s41598-019-40561-2
144 rdf:type schema:MonetaryGrant
145 sg:journal.1045337 schema:issn 2045-2322
146 schema:name Scientific Reports
147 rdf:type schema:Periodical
148 sg:pub.10.1007/bf00993593 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010564739
149 https://doi.org/10.1007/bf00993593
150 rdf:type schema:CreativeWork
151 sg:pub.10.1023/a:1010933404324 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024739340
152 https://doi.org/10.1023/a:1010933404324
153 rdf:type schema:CreativeWork
154 sg:pub.10.1038/nmicrobiol.2016.41 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033465527
155 https://doi.org/10.1038/nmicrobiol.2016.41
156 rdf:type schema:CreativeWork
157 sg:pub.10.1038/nrc2294 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033431840
158 https://doi.org/10.1038/nrc2294
159 rdf:type schema:CreativeWork
160 sg:pub.10.1038/nrg.2016.132 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047890909
161 https://doi.org/10.1038/nrg.2016.132
162 rdf:type schema:CreativeWork
163 sg:pub.10.1038/nrg2986 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016983398
164 https://doi.org/10.1038/nrg2986
165 rdf:type schema:CreativeWork
166 sg:pub.10.1038/nrg3868 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017872565
167 https://doi.org/10.1038/nrg3868
168 rdf:type schema:CreativeWork
169 sg:pub.10.1038/nrmicro3380 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047623025
170 https://doi.org/10.1038/nrmicro3380
171 rdf:type schema:CreativeWork
172 sg:pub.10.1038/s41598-017-18972-w schema:sameAs https://app.dimensions.ai/details/publication/pub.1100212536
173 https://doi.org/10.1038/s41598-017-18972-w
174 rdf:type schema:CreativeWork
175 sg:pub.10.1038/srep27930 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039514060
176 https://doi.org/10.1038/srep27930
177 rdf:type schema:CreativeWork
178 sg:pub.10.1186/s12864-016-2889-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052400167
179 https://doi.org/10.1186/s12864-016-2889-6
180 rdf:type schema:CreativeWork
181 https://app.dimensions.ai/details/publication/pub.1083256144 rdf:type schema:CreativeWork
182 https://doi.org/10.1002/0471667196.ess0866.pub2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011977749
183 rdf:type schema:CreativeWork
184 https://doi.org/10.1002/gepi.20473 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047078076
185 rdf:type schema:CreativeWork
186 https://doi.org/10.1002/j.1460-2075.1989.tb03494.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1078895400
187 rdf:type schema:CreativeWork
188 https://doi.org/10.1016/j.cell.2013.09.006 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041185259
189 rdf:type schema:CreativeWork
190 https://doi.org/10.1016/s0022-2836(05)80360-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013618994
191 rdf:type schema:CreativeWork
192 https://doi.org/10.1016/s0140-6736(00)03167-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027743530
193 rdf:type schema:CreativeWork
194 https://doi.org/10.1016/s0378-1119(99)00219-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1031039232
195 rdf:type schema:CreativeWork
196 https://doi.org/10.1016/s1473-3099(10)70139-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003782186
197 rdf:type schema:CreativeWork
198 https://doi.org/10.1017/cbo9780511809682 schema:sameAs https://app.dimensions.ai/details/publication/pub.1098667572
199 rdf:type schema:CreativeWork
200 https://doi.org/10.1021/cr0301088 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053880075
201 rdf:type schema:CreativeWork
202 https://doi.org/10.1073/pnas.0907925106 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043355738
203 rdf:type schema:CreativeWork
204 https://doi.org/10.1093/bib/bbt052 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059413022
205 rdf:type schema:CreativeWork
206 https://doi.org/10.1093/bib/bbt067 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007818694
207 rdf:type schema:CreativeWork
208 https://doi.org/10.1093/bib/bbx083 schema:sameAs https://app.dimensions.ai/details/publication/pub.1090537010
209 rdf:type schema:CreativeWork
210 https://doi.org/10.1093/bioinformatics/btg005 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043080454
211 rdf:type schema:CreativeWork
212 https://doi.org/10.1093/bioinformatics/btm344 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009424564
213 rdf:type schema:CreativeWork
214 https://doi.org/10.1093/bioinformatics/btt020 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038110157
215 rdf:type schema:CreativeWork
216 https://doi.org/10.1093/bioinformatics/btu177 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029212748
217 rdf:type schema:CreativeWork
218 https://doi.org/10.1093/jac/dkx067 schema:sameAs https://app.dimensions.ai/details/publication/pub.1083932525
219 rdf:type schema:CreativeWork
220 https://doi.org/10.1093/molbev/msx200 schema:sameAs https://app.dimensions.ai/details/publication/pub.1090638745
221 rdf:type schema:CreativeWork
222 https://doi.org/10.1093/nar/gkw1017 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009329911
223 rdf:type schema:CreativeWork
224 https://doi.org/10.1097/qco.0000000000000406 schema:sameAs https://app.dimensions.ai/details/publication/pub.1091690880
225 rdf:type schema:CreativeWork
226 https://doi.org/10.1101/113563 schema:sameAs https://app.dimensions.ai/details/publication/pub.1085102893
227 rdf:type schema:CreativeWork
228 https://doi.org/10.1101/297754 schema:sameAs https://app.dimensions.ai/details/publication/pub.1103192186
229 rdf:type schema:CreativeWork
230 https://doi.org/10.1109/jproc.2015.2494198 schema:sameAs https://app.dimensions.ai/details/publication/pub.1028018203
231 rdf:type schema:CreativeWork
232 https://doi.org/10.1111/j.1751-5823.2001.tb00465.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1034920734
233 rdf:type schema:CreativeWork
234 https://doi.org/10.1128/aac.00774-09 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029908739
235 rdf:type schema:CreativeWork
236 https://doi.org/10.1128/aac.02413-16 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062709581
237 rdf:type schema:CreativeWork
238 https://doi.org/10.1128/aac.05583-11 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000434250
239 rdf:type schema:CreativeWork
240 https://doi.org/10.1128/jb.01410-13 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036484036
241 rdf:type schema:CreativeWork
242 https://doi.org/10.1128/msphere.00341-17 schema:sameAs https://app.dimensions.ai/details/publication/pub.1091333202
243 rdf:type schema:CreativeWork
244 https://doi.org/10.1145/1273496.1273597 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030151299
245 rdf:type schema:CreativeWork
246 https://doi.org/10.1371/journal.pcbi.1002822 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016451804
247 rdf:type schema:CreativeWork
248 https://doi.org/10.1371/journal.pcbi.1005958 schema:sameAs https://app.dimensions.ai/details/publication/pub.1100829676
249 rdf:type schema:CreativeWork
250 https://doi.org/10.1371/journal.pgen.1007758 schema:sameAs https://app.dimensions.ai/details/publication/pub.1109823786
251 rdf:type schema:CreativeWork
252 https://doi.org/10.1371/journal.pone.0033275 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048954099
253 rdf:type schema:CreativeWork
254 https://doi.org/10.18637/jss.v033.i01 schema:sameAs https://app.dimensions.ai/details/publication/pub.1068672496
255 rdf:type schema:CreativeWork
256 https://doi.org/10.2147/idr.s26613 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038424283
257 rdf:type schema:CreativeWork
258 https://doi.org/10.3389/fmicb.2016.01887 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011970086
259 rdf:type schema:CreativeWork
260 https://www.grid.ac/institutes/grid.23856.3a schema:alternateName Université Laval
261 schema:name Big Data Research Centre, Université Laval, Quebec, Canada
262 Department of Computer Science and Software Engineering, Université Laval, Quebec, Canada
263 Infectious Disease Research Centre, Université Laval, Quebec, Canada
264 Institute of Nutrition and Functional Foods, Université Laval, Quebec, Canada
265 School of Nutrition, Université Laval, Quebec, Canada
266 rdf:type schema:Organization
 



