Molecular graph convolutions: moving beyond fingerprints


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2016-08-24

AUTHORS

Steven Kearnes, Kevin McCloskey, Marc Berndl, Vijay Pande, Patrick Riley

ABSTRACT

Molecular “fingerprints” encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph—atoms, bonds, distances, etc.—which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.

PAGES

595-608

References to SciGraph publications

  • 2015-05-27. Deep learning in NATURE
  • 2008-03-13. Recommendations for evaluation of computational methods in JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN
  • 1986-10. Learning representations by back-propagating errors in NATURE

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8

    DOI

    http://dx.doi.org/10.1007/s10822-016-9938-8

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1053264924

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/27558503



    JSON-LD is the canonical representation for SciGraph data.

    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/03", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Chemical Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0304", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Medicinal and Biomolecular Chemistry", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0307", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Theoretical and Computational Chemistry", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Computer Graphics", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Computer-Aided Design", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Drug Design", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Ligands", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Machine Learning", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Molecular Structure", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Neural Networks, Computer", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Pharmaceutical Preparations", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA", 
              "id": "http://www.grid.ac/institutes/grid.168010.e", 
              "name": [
                "Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Kearnes", 
            "givenName": "Steven", 
            "id": "sg:person.0757445204.86", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0757445204.86"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA", 
              "id": "http://www.grid.ac/institutes/grid.420451.6", 
              "name": [
                "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "McCloskey", 
            "givenName": "Kevin", 
            "id": "sg:person.013333034335.94", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013333034335.94"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA", 
              "id": "http://www.grid.ac/institutes/grid.420451.6", 
              "name": [
                "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Berndl", 
            "givenName": "Marc", 
            "id": "sg:person.014130414735.74", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014130414735.74"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA", 
              "id": "http://www.grid.ac/institutes/grid.168010.e", 
              "name": [
                "Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Pande", 
            "givenName": "Vijay", 
            "id": "sg:person.01336150035.35", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01336150035.35"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA", 
              "id": "http://www.grid.ac/institutes/grid.420451.6", 
              "name": [
                "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Riley", 
            "givenName": "Patrick", 
            "id": "sg:person.015523355735.57", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015523355735.57"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1038/nature14539", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1010020120", 
              "https://doi.org/10.1038/nature14539"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s10822-008-9196-5", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045738948", 
              "https://doi.org/10.1007/s10822-008-9196-5"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/323533a0", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1018367015", 
              "https://doi.org/10.1038/323533a0"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2016-08-24", 
        "datePublishedReg": "2016-08-24", 
        "description": "Molecular \u201cfingerprints\u201d encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph\u2014atoms, bonds, distances, etc.\u2014which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.", 
        "genre": "article", 
        "id": "sg:pub.10.1007/s10822-016-9938-8", 
        "isAccessibleForFree": true, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.2675009", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.3536592", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1105375", 
            "issn": [
              "0928-2866", 
              "1573-9023"
            ], 
            "name": "Journal of Computer-Aided Molecular Design", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "8", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "30"
          }
        ], 
        "keywords": [
          "graph convolution", 
          "data-driven decision", 
          "fingerprint-based methods", 
          "graph structure", 
          "fingerprint representation", 
          "simple encoding", 
          "discovery applications", 
          "undirected graph", 
          "new paradigm", 
          "convolution", 
          "machine", 
          "drug discovery applications", 
          "information", 
          "great advantage", 
          "architecture", 
          "fingerprints", 
          "structural information", 
          "future improvements", 
          "cheminformatics", 
          "graph", 
          "encoding", 
          "paradigm", 
          "representation", 
          "exciting opportunities", 
          "particular aspects", 
          "model", 
          "applications", 
          "decisions", 
          "advantages", 
          "method", 
          "aspects", 
          "opportunities", 
          "improvement", 
          "workhorse", 
          "distance", 
          "structure", 
          "molecular structure", 
          "small molecules", 
          "bonds", 
          "molecules"
        ], 
        "name": "Molecular graph convolutions: moving beyond fingerprints", 
        "pagination": "595-608", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1053264924"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/s10822-016-9938-8"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "27558503"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/s10822-016-9938-8", 
          "https://app.dimensions.ai/details/publication/pub.1053264924"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-12-01T06:34", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/article/article_687.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1007/s10822-016-9938-8"
      }
    ]
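A record like the one above can be read with any JSON parser, since JSON-LD is plain JSON. The following is a minimal Python sketch, not an official SciGraph client: it embeds a trimmed copy of the record (only the fields it reads) and pulls out the title, the author names in listed order, and the identifiers under `productId`. The names `RECORD_JSON` and `summarize` are made up for this illustration.

```python
import json

# Trimmed copy of the record above: only the fields this sketch reads.
RECORD_JSON = """
[
  {
    "name": "Molecular graph convolutions: moving beyond fingerprints",
    "datePublished": "2016-08-24",
    "pagination": "595-608",
    "author": [
      {"familyName": "Kearnes", "givenName": "Steven"},
      {"familyName": "McCloskey", "givenName": "Kevin"},
      {"familyName": "Berndl", "givenName": "Marc"},
      {"familyName": "Pande", "givenName": "Vijay"},
      {"familyName": "Riley", "givenName": "Patrick"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1053264924"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/s10822-016-9938-8"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["27558503"]}
    ]
  }
]
"""


def summarize(record_json):
    """Return title, author names, and identifiers from a SciGraph-style JSON-LD array."""
    article = json.loads(record_json)[0]  # the record is a one-element array
    authors = [f"{a['givenName']} {a['familyName']}" for a in article.get("author", [])]
    # Each productId entry carries a name and a one-element list of values.
    ids = {p["name"]: p["value"][0] for p in article.get("productId", [])}
    return {"title": article["name"], "authors": authors, "ids": ids}


summary = summarize(RECORD_JSON)
print(summary["title"])
print(", ".join(summary["authors"]))
print(summary["ids"]["doi"])
```

The sketch treats the record as ordinary JSON and ignores the `@context`. A full JSON-LD processor would be needed to expand terms such as `name` into their schema.org URIs.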
     

    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8'
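The four curl commands differ only in the `Accept` header, which is ordinary HTTP content negotiation. As a rough Python equivalent using only the standard library, the sketch below builds the same requests. The format labels (`json-ld`, `nt`, `turtle`, `xml`) and the function names are assumptions for this example, and whether the service still answers at this URL has not been verified here.

```python
from urllib.request import Request, urlopen

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8"

# Media types copied from the curl commands above; the keys are illustrative labels.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for one RDF serialization."""
    return Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Inspect the request without touching the network.
req = build_request("turtle")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Calling `fetch("json-ld")` would perform the same request as the first curl command. It is left uncalled so the example runs offline.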


     

    This table displays all metadata directly associated to this object as RDF triples.

    184 TRIPLES      21 PREDICATES      77 URIs      65 LITERALS      15 BLANK NODES

    Row Subject Predicate Object (a row that omits the subject or predicate repeats the one from the row above)
    1 sg:pub.10.1007/s10822-016-9938-8 schema:about N2422b99e034a43629b974d84b6f1055c
    2 N2f869edbb5d3461b8e46e15e67afebb3
    3 N3bce84526d1f4cc4bfe1f45ac6a4320d
    4 N86155f0edeaa4bea9b3ed64c83025439
    5 N90ffa762f36d40c7b8755c43da6188a3
    6 N91cd44ccca4448a7bf86a61515db371f
    7 Nc0fffee18b6944e28b6ee000ba0a1a1a
    8 Nc5a0504e878d402c8bee641e23ed17f2
    9 anzsrc-for:03
    10 anzsrc-for:0304
    11 anzsrc-for:0307
    12 schema:author N4084f8e3f45f416386ccbbf7bd434a36
    13 schema:citation sg:pub.10.1007/s10822-008-9196-5
    14 sg:pub.10.1038/323533a0
    15 sg:pub.10.1038/nature14539
    16 schema:datePublished 2016-08-24
    17 schema:datePublishedReg 2016-08-24
    18 schema:description Molecular “fingerprints” encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph—atoms, bonds, distances, etc.—which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.
    19 schema:genre article
    20 schema:isAccessibleForFree true
    21 schema:isPartOf N2ebc1a8a30804a1ba3043871f3132e0e
    22 N90cfc32246b64a879f26d4f665ff3a11
    23 sg:journal.1105375
    24 schema:keywords advantages
    25 applications
    26 architecture
    27 aspects
    28 bonds
    29 cheminformatics
    30 convolution
    31 data-driven decision
    32 decisions
    33 discovery applications
    34 distance
    35 drug discovery applications
    36 encoding
    37 exciting opportunities
    38 fingerprint representation
    39 fingerprint-based methods
    40 fingerprints
    41 future improvements
    42 graph
    43 graph convolution
    44 graph structure
    45 great advantage
    46 improvement
    47 information
    48 machine
    49 method
    50 model
    51 molecular structure
    52 molecules
    53 new paradigm
    54 opportunities
    55 paradigm
    56 particular aspects
    57 representation
    58 simple encoding
    59 small molecules
    60 structural information
    61 structure
    62 undirected graph
    63 workhorse
    64 schema:name Molecular graph convolutions: moving beyond fingerprints
    65 schema:pagination 595-608
    66 schema:productId N03ea33a46b1145baa8d4fb41a64a3032
    67 Nfbceea29905b4613a45896d7da203644
    68 Nfe64570ce02948d2acc3ad059b8339f1
    69 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053264924
    70 https://doi.org/10.1007/s10822-016-9938-8
    71 schema:sdDatePublished 2022-12-01T06:34
    72 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    73 schema:sdPublisher N58f36064bca548098ccdd2e4586f53e9
    74 schema:url https://doi.org/10.1007/s10822-016-9938-8
    75 sgo:license sg:explorer/license/
    76 sgo:sdDataset articles
    77 rdf:type schema:ScholarlyArticle
    78 N03ea33a46b1145baa8d4fb41a64a3032 schema:name doi
    79 schema:value 10.1007/s10822-016-9938-8
    80 rdf:type schema:PropertyValue
    81 N2422b99e034a43629b974d84b6f1055c schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    82 schema:name Computer Graphics
    83 rdf:type schema:DefinedTerm
    84 N2ebc1a8a30804a1ba3043871f3132e0e schema:volumeNumber 30
    85 rdf:type schema:PublicationVolume
    86 N2f869edbb5d3461b8e46e15e67afebb3 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    87 schema:name Ligands
    88 rdf:type schema:DefinedTerm
    89 N3bce84526d1f4cc4bfe1f45ac6a4320d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    90 schema:name Computer-Aided Design
    91 rdf:type schema:DefinedTerm
    92 N4084f8e3f45f416386ccbbf7bd434a36 rdf:first sg:person.0757445204.86
    93 rdf:rest N9080461e1848445c91469f9eeb6283c1
    94 N58f36064bca548098ccdd2e4586f53e9 schema:name Springer Nature - SN SciGraph project
    95 rdf:type schema:Organization
    96 N86155f0edeaa4bea9b3ed64c83025439 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    97 schema:name Drug Design
    98 rdf:type schema:DefinedTerm
    99 N9080461e1848445c91469f9eeb6283c1 rdf:first sg:person.013333034335.94
    100 rdf:rest Nc19a8064dae74c79814380a03cf0a110
    101 N90cfc32246b64a879f26d4f665ff3a11 schema:issueNumber 8
    102 rdf:type schema:PublicationIssue
    103 N90ffa762f36d40c7b8755c43da6188a3 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    104 schema:name Neural Networks, Computer
    105 rdf:type schema:DefinedTerm
    106 N91cd44ccca4448a7bf86a61515db371f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    107 schema:name Pharmaceutical Preparations
    108 rdf:type schema:DefinedTerm
    109 Nc0fffee18b6944e28b6ee000ba0a1a1a schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    110 schema:name Machine Learning
    111 rdf:type schema:DefinedTerm
    112 Nc19a8064dae74c79814380a03cf0a110 rdf:first sg:person.014130414735.74
    113 rdf:rest Nea4c6a8e1c8a4754ba9e90676df3b489
    114 Nc5a0504e878d402c8bee641e23ed17f2 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    115 schema:name Molecular Structure
    116 rdf:type schema:DefinedTerm
    117 Ndbbeb90660c7441da66ea4dd33f68b23 rdf:first sg:person.015523355735.57
    118 rdf:rest rdf:nil
    119 Nea4c6a8e1c8a4754ba9e90676df3b489 rdf:first sg:person.01336150035.35
    120 rdf:rest Ndbbeb90660c7441da66ea4dd33f68b23
    121 Nfbceea29905b4613a45896d7da203644 schema:name dimensions_id
    122 schema:value pub.1053264924
    123 rdf:type schema:PropertyValue
    124 Nfe64570ce02948d2acc3ad059b8339f1 schema:name pubmed_id
    125 schema:value 27558503
    126 rdf:type schema:PropertyValue
    127 anzsrc-for:03 schema:inDefinedTermSet anzsrc-for:
    128 schema:name Chemical Sciences
    129 rdf:type schema:DefinedTerm
    130 anzsrc-for:0304 schema:inDefinedTermSet anzsrc-for:
    131 schema:name Medicinal and Biomolecular Chemistry
    132 rdf:type schema:DefinedTerm
    133 anzsrc-for:0307 schema:inDefinedTermSet anzsrc-for:
    134 schema:name Theoretical and Computational Chemistry
    135 rdf:type schema:DefinedTerm
    136 sg:grant.2675009 http://pending.schema.org/fundedItem sg:pub.10.1007/s10822-016-9938-8
    137 rdf:type schema:MonetaryGrant
    138 sg:grant.3536592 http://pending.schema.org/fundedItem sg:pub.10.1007/s10822-016-9938-8
    139 rdf:type schema:MonetaryGrant
    140 sg:journal.1105375 schema:issn 0928-2866
    141 1573-9023
    142 schema:name Journal of Computer-Aided Molecular Design
    143 schema:publisher Springer Nature
    144 rdf:type schema:Periodical
    145 sg:person.013333034335.94 schema:affiliation grid-institutes:grid.420451.6
    146 schema:familyName McCloskey
    147 schema:givenName Kevin
    148 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013333034335.94
    149 rdf:type schema:Person
    150 sg:person.01336150035.35 schema:affiliation grid-institutes:grid.168010.e
    151 schema:familyName Pande
    152 schema:givenName Vijay
    153 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01336150035.35
    154 rdf:type schema:Person
    155 sg:person.014130414735.74 schema:affiliation grid-institutes:grid.420451.6
    156 schema:familyName Berndl
    157 schema:givenName Marc
    158 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014130414735.74
    159 rdf:type schema:Person
    160 sg:person.015523355735.57 schema:affiliation grid-institutes:grid.420451.6
    161 schema:familyName Riley
    162 schema:givenName Patrick
    163 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015523355735.57
    164 rdf:type schema:Person
    165 sg:person.0757445204.86 schema:affiliation grid-institutes:grid.168010.e
    166 schema:familyName Kearnes
    167 schema:givenName Steven
    168 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0757445204.86
    169 rdf:type schema:Person
    170 sg:pub.10.1007/s10822-008-9196-5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045738948
    171 https://doi.org/10.1007/s10822-008-9196-5
    172 rdf:type schema:CreativeWork
    173 sg:pub.10.1038/323533a0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018367015
    174 https://doi.org/10.1038/323533a0
    175 rdf:type schema:CreativeWork
    176 sg:pub.10.1038/nature14539 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010020120
    177 https://doi.org/10.1038/nature14539
    178 rdf:type schema:CreativeWork
    179 grid-institutes:grid.168010.e schema:alternateName Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA
    180 schema:name Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA
    181 rdf:type schema:Organization
    182 grid-institutes:grid.420451.6 schema:alternateName Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA
    183 schema:name Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA
    184 rdf:type schema:Organization
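In the triples above, author order is not stored on the article directly. `schema:author` points to a blank node, and the order is encoded as an RDF collection: a chain of `rdf:first` / `rdf:rest` links ending in `rdf:nil`. The Python sketch below copies only those triples from the table (plus the `schema:familyName` rows) and walks the chain to recover the order. `TRIPLES` and `walk_list` are hypothetical names for this illustration. A real application would more likely load the data with an RDF library.

```python
# (subject, predicate, object) triples copied from the table above,
# kept in the table's order, which is not the author order.
TRIPLES = [
    ("sg:pub.10.1007/s10822-016-9938-8", "schema:author", "N4084f8e3f45f416386ccbbf7bd434a36"),
    ("N4084f8e3f45f416386ccbbf7bd434a36", "rdf:first", "sg:person.0757445204.86"),
    ("N4084f8e3f45f416386ccbbf7bd434a36", "rdf:rest", "N9080461e1848445c91469f9eeb6283c1"),
    ("N9080461e1848445c91469f9eeb6283c1", "rdf:first", "sg:person.013333034335.94"),
    ("N9080461e1848445c91469f9eeb6283c1", "rdf:rest", "Nc19a8064dae74c79814380a03cf0a110"),
    ("Nc19a8064dae74c79814380a03cf0a110", "rdf:first", "sg:person.014130414735.74"),
    ("Nc19a8064dae74c79814380a03cf0a110", "rdf:rest", "Nea4c6a8e1c8a4754ba9e90676df3b489"),
    ("Ndbbeb90660c7441da66ea4dd33f68b23", "rdf:first", "sg:person.015523355735.57"),
    ("Ndbbeb90660c7441da66ea4dd33f68b23", "rdf:rest", "rdf:nil"),
    ("Nea4c6a8e1c8a4754ba9e90676df3b489", "rdf:first", "sg:person.01336150035.35"),
    ("Nea4c6a8e1c8a4754ba9e90676df3b489", "rdf:rest", "Ndbbeb90660c7441da66ea4dd33f68b23"),
    ("sg:person.013333034335.94", "schema:familyName", "McCloskey"),
    ("sg:person.01336150035.35", "schema:familyName", "Pande"),
    ("sg:person.014130414735.74", "schema:familyName", "Berndl"),
    ("sg:person.015523355735.57", "schema:familyName", "Riley"),
    ("sg:person.0757445204.86", "schema:familyName", "Kearnes"),
]


def walk_list(head, triples):
    """Follow rdf:first / rdf:rest links from `head` until rdf:nil; return items in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, node, seen = [], head, set()
    while node != "rdf:nil":
        if node in seen:  # guard against malformed, cyclic lists
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# Find the list head, walk it, then map each person ID to a family name.
head = next(o for s, p, o in TRIPLES if p == "schema:author")
family = {s: o for s, p, o in TRIPLES if p == "schema:familyName"}
authors = [family[person] for person in walk_list(head, TRIPLES)]
print(authors)
```

Walking the chain yields the same order as the AUTHORS line at the top of the record, even though the triples themselves are listed by node identifier.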
     



