Screen efficiency comparisons of decision tree and neural network algorithms in machine learning assisted drug design


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2019-04

AUTHORS

Qiumei Pu, Yinghao Li, Hong Zhang, Haodong Yao, Bo Zhang, Bingji Hou, Lin Li, Yuliang Zhao, Lina Zhao

ABSTRACT

In view of huge search space in drug design, machine learning has become a powerful method to predict the affinity between small molecular drug and targeting protein with the development of artificial intelligence technology. However, various machine learning algorithms including massive different parameters make the prediction framework choice to be quite difficult. In this work, we took a recent drug design competition (from XtalPi company on the DataCastle platform) as the typical case to find the optimized parameters for different machines learning algorithms and the most effective algorithm. After the parameter optimizations, we compared the typical machine learning methods as decision tree (XGBoost, LightGBM) and artificial neural network (MLP, CNN) with root-mean-square error (RMSE) and coefficient of determination (R2) evaluation. As a result, decision tree is more effective than the neural network as LightGBM>XGBoost>CNN>MLP in the affinity prediction of the specific drug design problem with ~160000 samples. For a much larger screening task in a more complicated drug design study, the sophisticated neural network model may go beyond the decision tree algorithm after generalization enhancing and overfitting reducing. The advanced machine learning methods could extract more information of protein-ligand bindings than traditional ones and improve the screen efficiency of drug design up to 200–1000 times.
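
The abstract names two evaluation metrics, RMSE and R2, without defining them. The sketch below uses the standard textbook definitions (RMSE as the square root of the mean squared residual, R2 as one minus the ratio of residual to total sum of squares). The function names and the sample values are illustrative assumptions and are not taken from the article.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical affinity values, NOT data from the article.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]

print("RMSE:", rmse(y_true, y_pred))
print("R2:  ", r2(y_true, y_pred))
```

A lower RMSE and a higher R2 both indicate a closer fit, which is the sense in which the abstract ranks the four models.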

PAGES

1-9

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s11426-018-9412-6

DOI

http://dx.doi.org/10.1007/s11426-018-9412-6

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1112395553



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Minzu University of China", 
          "id": "https://www.grid.ac/institutes/grid.411077.4", 
          "name": [
            "CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, Institute of High Energy Physics, Chinese Academy of Sciences, 100049, Beijing, China", 
            "School of Information Engineering, Minzu University of China, 100081, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Pu", 
        "givenName": "Qiumei", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Minzu University of China", 
          "id": "https://www.grid.ac/institutes/grid.411077.4", 
          "name": [
            "CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, Institute of High Energy Physics, Chinese Academy of Sciences, 100049, Beijing, China", 
            "School of Information Engineering, Minzu University of China, 100081, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Li", 
        "givenName": "Yinghao", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Minzu University of China", 
          "id": "https://www.grid.ac/institutes/grid.411077.4", 
          "name": [
            "School of Information Engineering, Minzu University of China, 100081, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhang", 
        "givenName": "Hong", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Chinese Academy of Sciences", 
          "id": "https://www.grid.ac/institutes/grid.410726.6", 
          "name": [
            "CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, Institute of High Energy Physics, Chinese Academy of Sciences, 100049, Beijing, China", 
            "University of Chinese Academy of Sciences, 101408, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yao", 
        "givenName": "Haodong", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Beijing Institute of Technology", 
          "id": "https://www.grid.ac/institutes/grid.43555.32", 
          "name": [
            "School of Computer, Beijing Institute of Technology, 100081, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhang", 
        "givenName": "Bo", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Chinese Academy of Sciences", 
          "id": "https://www.grid.ac/institutes/grid.410726.6", 
          "name": [
            "University of Chinese Academy of Sciences, 101408, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hou", 
        "givenName": "Bingji", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Beijing Institute of Technology", 
          "id": "https://www.grid.ac/institutes/grid.43555.32", 
          "name": [
            "School of Computer, Beijing Institute of Technology, 100081, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Li", 
        "givenName": "Lin", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Chinese Academy of Sciences", 
          "id": "https://www.grid.ac/institutes/grid.410726.6", 
          "name": [
            "CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, Institute of High Energy Physics, Chinese Academy of Sciences, 100049, Beijing, China", 
            "University of Chinese Academy of Sciences, 101408, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhao", 
        "givenName": "Yuliang", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Chinese Academy of Sciences", 
          "id": "https://www.grid.ac/institutes/grid.410726.6", 
          "name": [
            "CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, Institute of High Energy Physics, Chinese Academy of Sciences, 100049, Beijing, China", 
            "University of Chinese Academy of Sciences, 101408, Beijing, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhao", 
        "givenName": "Lina", 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1016/j.eswa.2013.08.015", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003312444"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/jcc.21372", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004469317"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.drudis.2014.10.012", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1008937145"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btr513", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013554534"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1124/pr.112.007336", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1015137292"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.asoc.2009.12.023", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1016287243"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s11426-016-0417-9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017005429", 
          "https://doi.org/10.1007/s11426-016-0417-9"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s12517-012-0610-x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019624469", 
          "https://doi.org/10.1007/s12517-012-0610-x"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.eswa.2011.04.149", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021057648"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/2939672.2939785", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021899069"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.ymeth.2016.06.024", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1024904990"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nchembio.576", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1025077038", 
          "https://doi.org/10.1038/nchembio.576"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/2347736.2347755", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1027581364"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.enzmictec.2012.10.009", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1029621414"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/483531a", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032733014", 
          "https://doi.org/10.1038/483531a"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1088/1367-2630/15/9/095003", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033584668"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0167-6296(02)00126-1", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1034082462"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.pharmthera.2013.01.016", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036857532"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.neucom.2014.08.021", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038390172"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bib/bbp023", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038784425"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.aaa8415", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039153602"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/2733373.2807412", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039662878"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrg3920", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040939097", 
          "https://doi.org/10.1038/nrg3920"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/gepi.21614", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041852572"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s11426-016-0007-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044760367", 
          "https://doi.org/10.1007/s11426-016-0007-2"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/wcms.1225", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046690188"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkv1072", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047581205"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1758-2946-6-32", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050522199", 
          "https://doi.org/10.1186/1758-2946-6-32"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1177/1087057108319644", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050550722"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/acs.jcim.6b00355", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055098402"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/ct400783h", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055424961"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/jm0491804", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055948927"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1021/jm1010572", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1055951899"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tnnls.2015.2424995", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061718846"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tsmcc.2011.2157494", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061798353"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.2174/156802610790232251", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1069193894"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.5194/gmdd-7-1525-2014", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1072670849"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s11426-017-9057-1", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1090352764", 
          "https://doi.org/10.1007/s11426-017-9057-1"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/v1/d14-1181", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099110546"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s11426-017-9169-4", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1100154650", 
          "https://doi.org/10.1007/s11426-017-9169-4"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2019-04", 
    "datePublishedReg": "2019-04-01", 
    "description": "In view of huge search space in drug design, machine learning has become a powerful method to predict the affinity between small molecular drug and targeting protein with the development of artificial intelligence technology. However, various machine learning algorithms including massive different parameters make the prediction framework choice to be quite difficult. In this work, we took a recent drug design competition (from XtalPi company on the DataCastle platform) as the typical case to find the optimized parameters for different machines learning algorithms and the most effective algorithm. After the parameter optimizations, we compared the typical machine learning methods as decision tree (XGBoost, LightGBM) and artificial neural network (MLP, CNN) with root-mean-square error (RMSE) and coefficient of determination (R2) evaluation. As a result, decision tree is more effective than the neural network as LightGBM>XGBoost>CNN>MLP in the affinity prediction of the specific drug design problem with ~160000 samples. For a much larger screening task in a more complicated drug design study, the sophisticated neural network model may go beyond the decision tree algorithm after generalization enhancing and overfitting reducing. The advanced machine learning methods could extract more information of protein-ligand bindings than traditional ones and improve the screen efficiency of drug design up to 200\u20131000 times.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1007/s11426-018-9412-6", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": [
      {
        "id": "sg:journal.1297513", 
        "issn": [
          "1674-7291", 
          "1862-2771"
        ], 
        "name": "Science China Chemistry", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "4", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "62"
      }
    ], 
    "name": "Screen efficiency comparisons of decision tree and neural network algorithms in machine learning assisted drug design", 
    "pagination": "1-9", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "b893db51b2c21a7adfecf37b377dcd24668d12b36915fe7343411dc9a99a8d7f"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s11426-018-9412-6"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1112395553"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s11426-018-9412-6", 
      "https://app.dimensions.ai/details/publication/pub.1112395553"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-11T14:20", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000372_0000000372/records_117121_00000003.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://link.springer.com/10.1007%2Fs11426-018-9412-6"
  }
]
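
Because JSON-LD is plain JSON, a record of this shape can be read with any JSON parser. The following is a minimal sketch, assuming the record has been saved as text: it lists the authors in order and collapses repeated citation `id` values. The embedded excerpt is trimmed to a few fields, one citation is repeated on purpose to exercise the de-duplication, and the helper names `author_names` and `unique_citation_ids` are illustrative and not part of any SciGraph tooling.

```python
import json

# Trimmed excerpt with the same shape as the record above.
# The second jcc.21372 entry is repeated deliberately for the demo.
RECORD = """
[
  {
    "id": "sg:pub.10.1007/s11426-018-9412-6",
    "author": [
      {"familyName": "Pu", "givenName": "Qiumei", "type": "Person"},
      {"familyName": "Li", "givenName": "Yinghao", "type": "Person"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1016/j.eswa.2013.08.015", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1002/jcc.21372", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1002/jcc.21372", "type": "CreativeWork"}
    ]
  }
]
"""


def author_names(article):
    """Return 'Given Family' strings in the order the record lists them."""
    return [f'{a["givenName"]} {a["familyName"]}' for a in article.get("author", [])]


def unique_citation_ids(article):
    """Return citation ids with repeats removed, keeping first-seen order."""
    # dict.fromkeys preserves insertion order and drops later duplicates.
    return list(dict.fromkeys(c["id"] for c in article.get("citation", [])))


# The top level of the record is a one-element array.
article = json.loads(RECORD)[0]
print(author_names(article))
print(unique_citation_ids(article))
```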
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s11426-018-9412-6'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s11426-018-9412-6'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s11426-018-9412-6'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s11426-018-9412-6'
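
The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched with Python's standard library. The media types and base URL are copied from the commands above. The short format keys, `build_request`, and `fetch` are names assumed here for illustration. The sketch does not check whether the endpoint still responds, so `fetch` is defined but not called.

```python
import urllib.request

BASE = "https://scigraph.springernature.com/"

# Media types copied from the curl commands; the short keys are our own labels.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt):
    """Build a GET request that asks for the record in the given RDF format."""
    return urllib.request.Request(BASE + record_id, headers={"Accept": ACCEPT[fmt]})


def fetch(record_id, fmt, timeout=30):
    """Perform the request and return the body as text (needs network access)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request is offline; only fetch() touches the network.
req = build_request("pub.10.1007/s11426-018-9412-6", "turtle")
print(req.full_url, req.get_header("Accept"))
```

Keeping request construction separate from the network call makes the header logic easy to inspect without contacting the server.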


 

This table displays all metadata directly associated with this object as RDF triples.

245 TRIPLES      21 PREDICATES      67 URIs      19 LITERALS      7 BLANK NODES
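
In the triples table, `schema:author` points at a blank node that heads an RDF list: each cell carries `rdf:first` (a person node) and `rdf:rest` (the next cell), and the chain ends at `rdf:nil`. The sketch below follows that chain to recover the author order. The cells and names are transcribed from the table rows, with every blank-node ID abbreviated to its first five characters for readability. The abbreviation and the helper `walk_list` are conveniences assumed here, not SciGraph conventions.

```python
RDF_NIL = "rdf:nil"

# cell -> (rdf:first, rdf:rest), IDs shortened to their first five characters.
CELLS = {
    "Na376": ("Ndfbb", "N565a"),
    "N565a": ("Nb10b", "Nd70e"),
    "Nd70e": ("N3d8b", "N9085"),
    "N9085": ("Nc9fa", "N20f0"),
    "N20f0": ("N1df2", "Na36d"),
    "Na36d": ("N5ca6", "Na639"),
    "Na639": ("N98c8", "N508e"),
    "N508e": ("Ndd57", "Nfecd"),
    "Nfecd": ("Ndb38", RDF_NIL),
}

# person node -> "givenName familyName", from the schema:Person rows.
PERSONS = {
    "Ndfbb": "Qiumei Pu",
    "Nb10b": "Yinghao Li",
    "N3d8b": "Hong Zhang",
    "Nc9fa": "Haodong Yao",
    "N1df2": "Bo Zhang",
    "N5ca6": "Bingji Hou",
    "N98c8": "Lin Li",
    "Ndd57": "Yuliang Zhao",
    "Ndb38": "Lina Zhao",
}


def walk_list(head, cells):
    """Follow rdf:rest from head to rdf:nil, collecting each rdf:first."""
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle at {node}")
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


# "Na376" is the object of schema:author in the table.
authors = [PERSONS[p] for p in walk_list("Na376", CELLS)]
print(authors)
```

The recovered order matches the AUTHORS field at the top of the record, which is why the list encoding is used: plain RDF triples carry no ordering of their own.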

Subject Predicate Object
1 sg:pub.10.1007/s11426-018-9412-6 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author Na376f217f4184a2ba67066417d944ab4
4 schema:citation sg:pub.10.1007/s11426-016-0007-2
5 sg:pub.10.1007/s11426-016-0417-9
6 sg:pub.10.1007/s11426-017-9057-1
7 sg:pub.10.1007/s11426-017-9169-4
8 sg:pub.10.1007/s12517-012-0610-x
9 sg:pub.10.1038/483531a
10 sg:pub.10.1038/nchembio.576
11 sg:pub.10.1038/nrg3920
12 sg:pub.10.1186/1758-2946-6-32
13 https://doi.org/10.1002/gepi.21614
14 https://doi.org/10.1002/jcc.21372
15 https://doi.org/10.1002/wcms.1225
16 https://doi.org/10.1016/j.asoc.2009.12.023
17 https://doi.org/10.1016/j.drudis.2014.10.012
18 https://doi.org/10.1016/j.enzmictec.2012.10.009
19 https://doi.org/10.1016/j.eswa.2011.04.149
20 https://doi.org/10.1016/j.eswa.2013.08.015
21 https://doi.org/10.1016/j.neucom.2014.08.021
22 https://doi.org/10.1016/j.pharmthera.2013.01.016
23 https://doi.org/10.1016/j.ymeth.2016.06.024
24 https://doi.org/10.1016/s0167-6296(02)00126-1
25 https://doi.org/10.1021/acs.jcim.6b00355
26 https://doi.org/10.1021/ct400783h
27 https://doi.org/10.1021/jm0491804
28 https://doi.org/10.1021/jm1010572
29 https://doi.org/10.1088/1367-2630/15/9/095003
30 https://doi.org/10.1093/bib/bbp023
31 https://doi.org/10.1093/bioinformatics/btr513
32 https://doi.org/10.1093/nar/gkv1072
33 https://doi.org/10.1109/tnnls.2015.2424995
34 https://doi.org/10.1109/tsmcc.2011.2157494
35 https://doi.org/10.1124/pr.112.007336
36 https://doi.org/10.1126/science.aaa8415
37 https://doi.org/10.1145/2347736.2347755
38 https://doi.org/10.1145/2733373.2807412
39 https://doi.org/10.1145/2939672.2939785
40 https://doi.org/10.1177/1087057108319644
41 https://doi.org/10.2174/156802610790232251
42 https://doi.org/10.3115/v1/d14-1181
43 https://doi.org/10.5194/gmdd-7-1525-2014
44 schema:datePublished 2019-04
45 schema:datePublishedReg 2019-04-01
46 schema:description In view of huge search space in drug design, machine learning has become a powerful method to predict the affinity between small molecular drug and targeting protein with the development of artificial intelligence technology. However, various machine learning algorithms including massive different parameters make the prediction framework choice to be quite difficult. In this work, we took a recent drug design competition (from XtalPi company on the DataCastle platform) as the typical case to find the optimized parameters for different machines learning algorithms and the most effective algorithm. After the parameter optimizations, we compared the typical machine learning methods as decision tree (XGBoost, LightGBM) and artificial neural network (MLP, CNN) with root-mean-square error (RMSE) and coefficient of determination (R2) evaluation. As a result, decision tree is more effective than the neural network as LightGBM>XGBoost>CNN>MLP in the affinity prediction of the specific drug design problem with ~160000 samples. For a much larger screening task in a more complicated drug design study, the sophisticated neural network model may go beyond the decision tree algorithm after generalization enhancing and overfitting reducing. The advanced machine learning methods could extract more information of protein-ligand bindings than traditional ones and improve the screen efficiency of drug design up to 200–1000 times.
47 schema:genre research_article
48 schema:inLanguage en
49 schema:isAccessibleForFree false
50 schema:isPartOf Na2e3aa1547c04477beaa1e666aa310d1
51 Nbde73306c4f146b9a4f772327fac723c
52 sg:journal.1297513
53 schema:name Screen efficiency comparisons of decision tree and neural network algorithms in machine learning assisted drug design
54 schema:pagination 1-9
55 schema:productId N77257375c772416da71faf16c753e4cb
56 Nb151621272b741dda8af0588bc81dd71
57 Nda8bcc5064f44b3794ab4897d1657aa7
58 schema:sameAs https://app.dimensions.ai/details/publication/pub.1112395553
59 https://doi.org/10.1007/s11426-018-9412-6
60 schema:sdDatePublished 2019-04-11T14:20
61 schema:sdLicense https://scigraph.springernature.com/explorer/license/
62 schema:sdPublisher Ndcee70c614884706827682885a318dc8
63 schema:url https://link.springer.com/10.1007%2Fs11426-018-9412-6
64 sgo:license sg:explorer/license/
65 sgo:sdDataset articles
66 rdf:type schema:ScholarlyArticle
67 N1df271b53624455695ede55fb9738918 schema:affiliation https://www.grid.ac/institutes/grid.43555.32
68 schema:familyName Zhang
69 schema:givenName Bo
70 rdf:type schema:Person
71 N20f005497991460fae763b8d6d5376aa rdf:first N1df271b53624455695ede55fb9738918
72 rdf:rest Na36dd62213434517b98c84f0ff681233
73 N3d8b8832fcae40d2a45e43050f442145 schema:affiliation https://www.grid.ac/institutes/grid.411077.4
74 schema:familyName Zhang
75 schema:givenName Hong
76 rdf:type schema:Person
77 N508e420c3d1440e1a7b8016752909461 rdf:first Ndd57304ecc7f4acebdaf2d6feaa3ea65
78 rdf:rest Nfecdac98eb3942b0834840fd7c00e43e
79 N565a191ccf8d46c5ab09bfd86d12f82e rdf:first Nb10b11e0e4464e8ea426310e74ddc706
80 rdf:rest Nd70ee7e4808748c9bb89d3dc60f0bcb4
81 N5ca69b58aef64e3d9ff28b262a6f58d9 schema:affiliation https://www.grid.ac/institutes/grid.410726.6
82 schema:familyName Hou
83 schema:givenName Bingji
84 rdf:type schema:Person
85 N77257375c772416da71faf16c753e4cb schema:name dimensions_id
86 schema:value pub.1112395553
87 rdf:type schema:PropertyValue
88 N9085065639194088b7040ff9eaf13076 rdf:first Nc9fa567628fd4e79a3c87bc256aa806d
89 rdf:rest N20f005497991460fae763b8d6d5376aa
90 N98c818a653554065baab26c96ca5b227 schema:affiliation https://www.grid.ac/institutes/grid.43555.32
91 schema:familyName Li
92 schema:givenName Lin
93 rdf:type schema:Person
94 Na2e3aa1547c04477beaa1e666aa310d1 schema:issueNumber 4
95 rdf:type schema:PublicationIssue
96 Na36dd62213434517b98c84f0ff681233 rdf:first N5ca69b58aef64e3d9ff28b262a6f58d9
97 rdf:rest Na63945cf06f3430bac33ecb8abd6d8ef
98 Na376f217f4184a2ba67066417d944ab4 rdf:first Ndfbbb496e7084a6fb278d228cd64e32a
99 rdf:rest N565a191ccf8d46c5ab09bfd86d12f82e
100 Na63945cf06f3430bac33ecb8abd6d8ef rdf:first N98c818a653554065baab26c96ca5b227
101 rdf:rest N508e420c3d1440e1a7b8016752909461
102 Nb10b11e0e4464e8ea426310e74ddc706 schema:affiliation https://www.grid.ac/institutes/grid.411077.4
103 schema:familyName Li
104 schema:givenName Yinghao
105 rdf:type schema:Person
106 Nb151621272b741dda8af0588bc81dd71 schema:name readcube_id
107 schema:value b893db51b2c21a7adfecf37b377dcd24668d12b36915fe7343411dc9a99a8d7f
108 rdf:type schema:PropertyValue
109 Nbde73306c4f146b9a4f772327fac723c schema:volumeNumber 62
110 rdf:type schema:PublicationVolume
111 Nc9fa567628fd4e79a3c87bc256aa806d schema:affiliation https://www.grid.ac/institutes/grid.410726.6
112 schema:familyName Yao
113 schema:givenName Haodong
114 rdf:type schema:Person
115 Nd70ee7e4808748c9bb89d3dc60f0bcb4 rdf:first N3d8b8832fcae40d2a45e43050f442145
116 rdf:rest N9085065639194088b7040ff9eaf13076
117 Nda8bcc5064f44b3794ab4897d1657aa7 schema:name doi
118 schema:value 10.1007/s11426-018-9412-6
119 rdf:type schema:PropertyValue
120 Ndb38f35bbc224a49bebc4e53f0705025 schema:affiliation https://www.grid.ac/institutes/grid.410726.6
121 schema:familyName Zhao
122 schema:givenName Lina
123 rdf:type schema:Person
124 Ndcee70c614884706827682885a318dc8 schema:name Springer Nature - SN SciGraph project
125 rdf:type schema:Organization
126 Ndd57304ecc7f4acebdaf2d6feaa3ea65 schema:affiliation https://www.grid.ac/institutes/grid.410726.6
127 schema:familyName Zhao
128 schema:givenName Yuliang
129 rdf:type schema:Person
130 Ndfbbb496e7084a6fb278d228cd64e32a schema:affiliation https://www.grid.ac/institutes/grid.411077.4
131 schema:familyName Pu
132 schema:givenName Qiumei
133 rdf:type schema:Person
134 Nfecdac98eb3942b0834840fd7c00e43e rdf:first Ndb38f35bbc224a49bebc4e53f0705025
135 rdf:rest rdf:nil
136 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
137 schema:name Information and Computing Sciences
138 rdf:type schema:DefinedTerm
139 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
140 schema:name Artificial Intelligence and Image Processing
141 rdf:type schema:DefinedTerm
142 sg:journal.1297513 schema:issn 1674-7291
143 1862-2771
144 schema:name Science China Chemistry
145 rdf:type schema:Periodical
146 sg:pub.10.1007/s11426-016-0007-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044760367
147 https://doi.org/10.1007/s11426-016-0007-2
148 rdf:type schema:CreativeWork
149 sg:pub.10.1007/s11426-016-0417-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017005429
150 https://doi.org/10.1007/s11426-016-0417-9
151 rdf:type schema:CreativeWork
152 sg:pub.10.1007/s11426-017-9057-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1090352764
153 https://doi.org/10.1007/s11426-017-9057-1
154 rdf:type schema:CreativeWork
155 sg:pub.10.1007/s11426-017-9169-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1100154650
156 https://doi.org/10.1007/s11426-017-9169-4
157 rdf:type schema:CreativeWork
158 sg:pub.10.1007/s12517-012-0610-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1019624469
159 https://doi.org/10.1007/s12517-012-0610-x
160 rdf:type schema:CreativeWork
161 sg:pub.10.1038/483531a schema:sameAs https://app.dimensions.ai/details/publication/pub.1032733014
162 https://doi.org/10.1038/483531a
163 rdf:type schema:CreativeWork
164 sg:pub.10.1038/nchembio.576 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025077038
165 https://doi.org/10.1038/nchembio.576
166 rdf:type schema:CreativeWork
167 sg:pub.10.1038/nrg3920 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040939097
168 https://doi.org/10.1038/nrg3920
169 rdf:type schema:CreativeWork
170 sg:pub.10.1186/1758-2946-6-32 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050522199
171 https://doi.org/10.1186/1758-2946-6-32
172 rdf:type schema:CreativeWork
173 https://doi.org/10.1002/gepi.21614 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041852572
174 rdf:type schema:CreativeWork
175 https://doi.org/10.1002/jcc.21372 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004469317
176 rdf:type schema:CreativeWork
177 https://doi.org/10.1002/wcms.1225 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046690188
178 rdf:type schema:CreativeWork
179 https://doi.org/10.1016/j.asoc.2009.12.023 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016287243
180 rdf:type schema:CreativeWork
181 https://doi.org/10.1016/j.drudis.2014.10.012 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008937145
182 rdf:type schema:CreativeWork
183 https://doi.org/10.1016/j.enzmictec.2012.10.009 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029621414
184 rdf:type schema:CreativeWork
185 https://doi.org/10.1016/j.eswa.2011.04.149 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021057648
186 rdf:type schema:CreativeWork
187 https://doi.org/10.1016/j.eswa.2013.08.015 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003312444
188 rdf:type schema:CreativeWork
189 https://doi.org/10.1016/j.neucom.2014.08.021 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038390172
190 rdf:type schema:CreativeWork
191 https://doi.org/10.1016/j.pharmthera.2013.01.016 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036857532
192 rdf:type schema:CreativeWork
193 https://doi.org/10.1016/j.ymeth.2016.06.024 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024904990
194 rdf:type schema:CreativeWork
195 https://doi.org/10.1016/s0167-6296(02)00126-1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1034082462
196 rdf:type schema:CreativeWork
197 https://doi.org/10.1021/acs.jcim.6b00355 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055098402
198 rdf:type schema:CreativeWork
199 https://doi.org/10.1021/ct400783h schema:sameAs https://app.dimensions.ai/details/publication/pub.1055424961
200 rdf:type schema:CreativeWork
201 https://doi.org/10.1021/jm0491804 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055948927
202 rdf:type schema:CreativeWork
203 https://doi.org/10.1021/jm1010572 schema:sameAs https://app.dimensions.ai/details/publication/pub.1055951899
204 rdf:type schema:CreativeWork
205 https://doi.org/10.1088/1367-2630/15/9/095003 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033584668
206 rdf:type schema:CreativeWork
207 https://doi.org/10.1093/bib/bbp023 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038784425
208 rdf:type schema:CreativeWork
209 https://doi.org/10.1093/bioinformatics/btr513 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013554534
210 rdf:type schema:CreativeWork
211 https://doi.org/10.1093/nar/gkv1072 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047581205
212 rdf:type schema:CreativeWork
213 https://doi.org/10.1109/tnnls.2015.2424995 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061718846
214 rdf:type schema:CreativeWork
215 https://doi.org/10.1109/tsmcc.2011.2157494 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061798353
216 rdf:type schema:CreativeWork
217 https://doi.org/10.1124/pr.112.007336 schema:sameAs https://app.dimensions.ai/details/publication/pub.1015137292
218 rdf:type schema:CreativeWork
219 https://doi.org/10.1126/science.aaa8415 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039153602
220 rdf:type schema:CreativeWork
221 https://doi.org/10.1145/2347736.2347755 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027581364
222 rdf:type schema:CreativeWork
223 https://doi.org/10.1145/2733373.2807412 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039662878
224 rdf:type schema:CreativeWork
225 https://doi.org/10.1145/2939672.2939785 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021899069
226 rdf:type schema:CreativeWork
227 https://doi.org/10.1177/1087057108319644 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050550722
228 rdf:type schema:CreativeWork
229 https://doi.org/10.2174/156802610790232251 schema:sameAs https://app.dimensions.ai/details/publication/pub.1069193894
230 rdf:type schema:CreativeWork
231 https://doi.org/10.3115/v1/d14-1181 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099110546
232 rdf:type schema:CreativeWork
233 https://doi.org/10.5194/gmdd-7-1525-2014 schema:sameAs https://app.dimensions.ai/details/publication/pub.1072670849
234 rdf:type schema:CreativeWork
235 https://www.grid.ac/institutes/grid.410726.6 schema:alternateName University of Chinese Academy of Sciences
236 schema:name CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, Institute of High Energy Physics, Chinese Academy of Sciences, 100049, Beijing, China
237 University of Chinese Academy of Sciences, 101408, Beijing, China
238 rdf:type schema:Organization
239 https://www.grid.ac/institutes/grid.411077.4 schema:alternateName Minzu University of China
240 schema:name CAS Key Laboratory for Biomedical Effects of Nanomaterials and Nanosafety, Institute of High Energy Physics, Chinese Academy of Sciences, 100049, Beijing, China
241 School of Information Engineering, Minzu University of China, 100081, Beijing, China
242 rdf:type schema:Organization
243 https://www.grid.ac/institutes/grid.43555.32 schema:alternateName Beijing Institute of Technology
244 schema:name School of Computer, Beijing Institute of Technology, 100081, Beijing, China
245 rdf:type schema:Organization
 





