Molecular graph convolutions: moving beyond fingerprints


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2016-08-24

AUTHORS

Steven Kearnes, Kevin McCloskey, Marc Berndl, Vijay Pande, Patrick Riley

ABSTRACT

Molecular “fingerprints” encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph—atoms, bonds, distances, etc.—which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.

PAGES

595-608

References to SciGraph publications

  • 2015-05-27. Deep learning in NATURE
  • 2008-03-13. Recommendations for evaluation of computational methods in JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN
  • 1986-10. Learning representations by back-propagating errors in NATURE

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8

    DOI

    http://dx.doi.org/10.1007/s10822-016-9938-8

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1053264924

    PUBMED

    https://www.ncbi.nlm.nih.gov/pubmed/27558503



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/03", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Chemical Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0304", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Medicinal and Biomolecular Chemistry", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0307", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Theoretical and Computational Chemistry", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Computer Graphics", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Computer-Aided Design", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Drug Design", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Ligands", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Machine Learning", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Molecular Structure", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Neural Networks, Computer", 
            "type": "DefinedTerm"
          }, 
          {
            "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
            "name": "Pharmaceutical Preparations", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA", 
              "id": "http://www.grid.ac/institutes/grid.168010.e", 
              "name": [
                "Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Kearnes", 
            "givenName": "Steven", 
            "id": "sg:person.0757445204.86", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0757445204.86"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA", 
              "id": "http://www.grid.ac/institutes/grid.420451.6", 
              "name": [
                "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "McCloskey", 
            "givenName": "Kevin", 
            "id": "sg:person.013333034335.94", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013333034335.94"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA", 
              "id": "http://www.grid.ac/institutes/grid.420451.6", 
              "name": [
                "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Berndl", 
            "givenName": "Marc", 
            "id": "sg:person.014130414735.74", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014130414735.74"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA", 
              "id": "http://www.grid.ac/institutes/grid.168010.e", 
              "name": [
                "Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Pande", 
            "givenName": "Vijay", 
            "id": "sg:person.01336150035.35", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01336150035.35"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA", 
              "id": "http://www.grid.ac/institutes/grid.420451.6", 
              "name": [
                "Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Riley", 
            "givenName": "Patrick", 
            "id": "sg:person.015523355735.57", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015523355735.57"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1038/nature14539", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1010020120", 
              "https://doi.org/10.1038/nature14539"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s10822-008-9196-5", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1045738948", 
              "https://doi.org/10.1007/s10822-008-9196-5"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/323533a0", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1018367015", 
              "https://doi.org/10.1038/323533a0"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2016-08-24", 
        "datePublishedReg": "2016-08-24", 
        "description": "Molecular \u201cfingerprints\u201d encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph\u2014atoms, bonds, distances, etc.\u2014which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.", 
        "genre": "article", 
        "id": "sg:pub.10.1007/s10822-016-9938-8", 
        "isAccessibleForFree": true, 
        "isFundedItemOf": [
          {
            "id": "sg:grant.2675009", 
            "type": "MonetaryGrant"
          }, 
          {
            "id": "sg:grant.3536592", 
            "type": "MonetaryGrant"
          }
        ], 
        "isPartOf": [
          {
            "id": "sg:journal.1105375", 
            "issn": [
              "0928-2866", 
              "1573-9023"
            ], 
            "name": "Journal of Computer-Aided Molecular Design", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "8", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "30"
          }
        ], 
        "keywords": [
          "graph convolution", 
          "data-driven decision", 
          "fingerprint-based methods", 
          "graph structure", 
          "fingerprint representation", 
          "simple encoding", 
          "discovery applications", 
          "undirected graph", 
          "new paradigm", 
          "convolution", 
          "machine", 
          "drug discovery applications", 
          "information", 
          "great advantage", 
          "architecture", 
          "fingerprints", 
          "structural information", 
          "future improvements", 
          "cheminformatics", 
          "graph", 
          "encoding", 
          "paradigm", 
          "representation", 
          "exciting opportunities", 
          "particular aspects", 
          "model", 
          "applications", 
          "decisions", 
          "advantages", 
          "method", 
          "aspects", 
          "opportunities", 
          "improvement", 
          "workhorse", 
          "distance", 
          "structure", 
          "molecular structure", 
          "small molecules", 
          "bonds", 
          "molecules"
        ], 
        "name": "Molecular graph convolutions: moving beyond fingerprints", 
        "pagination": "595-608", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1053264924"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/s10822-016-9938-8"
            ]
          }, 
          {
            "name": "pubmed_id", 
            "type": "PropertyValue", 
            "value": [
              "27558503"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/s10822-016-9938-8", 
          "https://app.dimensions.ai/details/publication/pub.1053264924"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-12-01T06:34", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20221201/entities/gbq_results/article/article_687.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1007/s10822-016-9938-8"
      }
    ]
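As a minimal sketch of how a record of this shape might be consumed, the following Python reads a JSON-LD array and pulls out the title, the ordered author names, and the identifiers. The embedded `RECORD` string is a trimmed copy of the record above with most fields omitted, and the helper name `summarize` is hypothetical, not part of any SciGraph tooling.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD = """
[
  {
    "author": [
      {"familyName": "Kearnes", "givenName": "Steven"},
      {"familyName": "McCloskey", "givenName": "Kevin"},
      {"familyName": "Berndl", "givenName": "Marc"},
      {"familyName": "Pande", "givenName": "Vijay"},
      {"familyName": "Riley", "givenName": "Patrick"}
    ],
    "name": "Molecular graph convolutions: moving beyond fingerprints",
    "pagination": "595-608",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1053264924"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/s10822-016-9938-8"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["27558503"]}
    ]
  }
]
"""


def summarize(jsonld_text):
    """Return title, authors, and identifiers from the first entry of a record."""
    entry = json.loads(jsonld_text)[0]
    # productId is a list of name/value pairs; flatten it into a dict.
    ids = {p["name"]: p["value"][0] for p in entry.get("productId", [])}
    # Authors are kept in the order in which the record lists them.
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in entry.get("author", [])]
    return {"title": entry["name"], "authors": authors, "ids": ids}


summary = summarize(RECORD)
print(summary["title"])
print(", ".join(summary["authors"]))
print(summary["ids"]["doi"])
```

Flattening `productId` into a dictionary keyed by `name` is a convenience chosen for this sketch; it assumes each identifier carries a single value, as it does in this record.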
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8'
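The same content negotiation can be sketched in Python with the standard library. This is an illustrative sketch rather than an official client: `build_request` and `fetch` are hypothetical helper names, while the URL and media types are the ones used in the curl commands above. Only the request object is constructed when the block runs; calling `fetch` performs the actual download and needs network access.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/pub.10.1007/s10822-016-9938-8"

# Media types taken from the curl examples above.
FORMATS = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Build a GET request whose Accept header selects the RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": FORMATS[fmt]})


def fetch(fmt, url=BASE_URL, timeout=30):
    """Download the record in the requested format (requires network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Inspect the request without sending it.
req = build_request("turtle")
print(req.get_method(), req.full_url)
print(req.get_header("Accept"))
```

Keeping the format names in one dictionary means an unsupported format fails early with a `KeyError` instead of sending a request with an unintended header.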


     

    This table displays all metadata directly associated with this object as RDF triples.

    184 TRIPLES      21 PREDICATES      77 URIs      65 LITERALS      15 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/s10822-016-9938-8 schema:about N088b525ba63c4ae5a9b61dbcadf301d8
    2 N310102c0d08a4918badf3c5a38d6707e
    3 N3fb036cd5e2c404f89db9d808d0f21b3
    4 N61544828a6324884aac10048acdb11ab
    5 N7eedaa0bee1a42b48a406aedb0b9e8c5
    6 N8504cf9abbb9403eb951c4021c38fefc
    7 Nc2b34615284e45cba9379d8d9153d2c8
    8 Nc662960f98cd415f898b62f0df60d643
    9 anzsrc-for:03
    10 anzsrc-for:0304
    11 anzsrc-for:0307
    12 schema:author N23a33e4be5984dd0ab2588ee97689b67
    13 schema:citation sg:pub.10.1007/s10822-008-9196-5
    14 sg:pub.10.1038/323533a0
    15 sg:pub.10.1038/nature14539
    16 schema:datePublished 2016-08-24
    17 schema:datePublishedReg 2016-08-24
    18 schema:description Molecular “fingerprints” encoding structural information are the workhorse of cheminformatics and machine learning in drug discovery applications. However, fingerprint representations necessarily emphasize particular aspects of the molecular structure while ignoring others, rather than allowing the model to make data-driven decisions. We describe molecular graph convolutions, a machine learning architecture for learning from undirected graphs, specifically small molecules. Graph convolutions use a simple encoding of the molecular graph—atoms, bonds, distances, etc.—which allows the model to take greater advantage of information in the graph structure. Although graph convolutions do not outperform all fingerprint-based methods, they (along with other graph-based methods) represent a new paradigm in ligand-based virtual screening with exciting opportunities for future improvement.
    19 schema:genre article
    20 schema:isAccessibleForFree true
    21 schema:isPartOf N6521e8801ef243af847e5a381ce0a0e3
    22 Nb21c1ef83ca040c08336d33acb85c7e5
    23 sg:journal.1105375
    24 schema:keywords advantages
    25 applications
    26 architecture
    27 aspects
    28 bonds
    29 cheminformatics
    30 convolution
    31 data-driven decision
    32 decisions
    33 discovery applications
    34 distance
    35 drug discovery applications
    36 encoding
    37 exciting opportunities
    38 fingerprint representation
    39 fingerprint-based methods
    40 fingerprints
    41 future improvements
    42 graph
    43 graph convolution
    44 graph structure
    45 great advantage
    46 improvement
    47 information
    48 machine
    49 method
    50 model
    51 molecular structure
    52 molecules
    53 new paradigm
    54 opportunities
    55 paradigm
    56 particular aspects
    57 representation
    58 simple encoding
    59 small molecules
    60 structural information
    61 structure
    62 undirected graph
    63 workhorse
    64 schema:name Molecular graph convolutions: moving beyond fingerprints
    65 schema:pagination 595-608
    66 schema:productId N7eac044d78e34d3b8f3af6cbd6cf6e0e
    67 N90ea3f1b858544da9ae41cdb00e7d40e
    68 Nb32f6c352c854859b82ef8c6fff375d3
    69 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053264924
    70 https://doi.org/10.1007/s10822-016-9938-8
    71 schema:sdDatePublished 2022-12-01T06:34
    72 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    73 schema:sdPublisher N12b06d8b9ab5471ea6a40009c02c55a8
    74 schema:url https://doi.org/10.1007/s10822-016-9938-8
    75 sgo:license sg:explorer/license/
    76 sgo:sdDataset articles
    77 rdf:type schema:ScholarlyArticle
    78 N088b525ba63c4ae5a9b61dbcadf301d8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    79 schema:name Drug Design
    80 rdf:type schema:DefinedTerm
    81 N0fc565dd9f62421fa24e8db989df70ca rdf:first sg:person.015523355735.57
    82 rdf:rest rdf:nil
    83 N12b06d8b9ab5471ea6a40009c02c55a8 schema:name Springer Nature - SN SciGraph project
    84 rdf:type schema:Organization
    85 N23a33e4be5984dd0ab2588ee97689b67 rdf:first sg:person.0757445204.86
    86 rdf:rest Nf4aaea37fc9d4ffdafcb6e27ec71e9f9
    87 N310102c0d08a4918badf3c5a38d6707e schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    88 schema:name Molecular Structure
    89 rdf:type schema:DefinedTerm
    90 N3fb036cd5e2c404f89db9d808d0f21b3 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    91 schema:name Pharmaceutical Preparations
    92 rdf:type schema:DefinedTerm
    93 N61544828a6324884aac10048acdb11ab schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    94 schema:name Computer-Aided Design
    95 rdf:type schema:DefinedTerm
    96 N6521e8801ef243af847e5a381ce0a0e3 schema:issueNumber 8
    97 rdf:type schema:PublicationIssue
    98 N7eac044d78e34d3b8f3af6cbd6cf6e0e schema:name dimensions_id
    99 schema:value pub.1053264924
    100 rdf:type schema:PropertyValue
    101 N7eedaa0bee1a42b48a406aedb0b9e8c5 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    102 schema:name Machine Learning
    103 rdf:type schema:DefinedTerm
    104 N8504cf9abbb9403eb951c4021c38fefc schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    105 schema:name Neural Networks, Computer
    106 rdf:type schema:DefinedTerm
    107 N90ea3f1b858544da9ae41cdb00e7d40e schema:name doi
    108 schema:value 10.1007/s10822-016-9938-8
    109 rdf:type schema:PropertyValue
    110 N9d7259c848b9458eaadd37500af5126e rdf:first sg:person.01336150035.35
    111 rdf:rest N0fc565dd9f62421fa24e8db989df70ca
    112 Nb21c1ef83ca040c08336d33acb85c7e5 schema:volumeNumber 30
    113 rdf:type schema:PublicationVolume
    114 Nb32f6c352c854859b82ef8c6fff375d3 schema:name pubmed_id
    115 schema:value 27558503
    116 rdf:type schema:PropertyValue
    117 Nc2b34615284e45cba9379d8d9153d2c8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    118 schema:name Computer Graphics
    119 rdf:type schema:DefinedTerm
    120 Nc662960f98cd415f898b62f0df60d643 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
    121 schema:name Ligands
    122 rdf:type schema:DefinedTerm
    123 Nc8cce30a134d4ddebe8e58bb40baa909 rdf:first sg:person.014130414735.74
    124 rdf:rest N9d7259c848b9458eaadd37500af5126e
    125 Nf4aaea37fc9d4ffdafcb6e27ec71e9f9 rdf:first sg:person.013333034335.94
    126 rdf:rest Nc8cce30a134d4ddebe8e58bb40baa909
    127 anzsrc-for:03 schema:inDefinedTermSet anzsrc-for:
    128 schema:name Chemical Sciences
    129 rdf:type schema:DefinedTerm
    130 anzsrc-for:0304 schema:inDefinedTermSet anzsrc-for:
    131 schema:name Medicinal and Biomolecular Chemistry
    132 rdf:type schema:DefinedTerm
    133 anzsrc-for:0307 schema:inDefinedTermSet anzsrc-for:
    134 schema:name Theoretical and Computational Chemistry
    135 rdf:type schema:DefinedTerm
    136 sg:grant.2675009 http://pending.schema.org/fundedItem sg:pub.10.1007/s10822-016-9938-8
    137 rdf:type schema:MonetaryGrant
    138 sg:grant.3536592 http://pending.schema.org/fundedItem sg:pub.10.1007/s10822-016-9938-8
    139 rdf:type schema:MonetaryGrant
    140 sg:journal.1105375 schema:issn 0928-2866
    141 1573-9023
    142 schema:name Journal of Computer-Aided Molecular Design
    143 schema:publisher Springer Nature
    144 rdf:type schema:Periodical
    145 sg:person.013333034335.94 schema:affiliation grid-institutes:grid.420451.6
    146 schema:familyName McCloskey
    147 schema:givenName Kevin
    148 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013333034335.94
    149 rdf:type schema:Person
    150 sg:person.01336150035.35 schema:affiliation grid-institutes:grid.168010.e
    151 schema:familyName Pande
    152 schema:givenName Vijay
    153 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01336150035.35
    154 rdf:type schema:Person
    155 sg:person.014130414735.74 schema:affiliation grid-institutes:grid.420451.6
    156 schema:familyName Berndl
    157 schema:givenName Marc
    158 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014130414735.74
    159 rdf:type schema:Person
    160 sg:person.015523355735.57 schema:affiliation grid-institutes:grid.420451.6
    161 schema:familyName Riley
    162 schema:givenName Patrick
    163 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015523355735.57
    164 rdf:type schema:Person
    165 sg:person.0757445204.86 schema:affiliation grid-institutes:grid.168010.e
    166 schema:familyName Kearnes
    167 schema:givenName Steven
    168 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0757445204.86
    169 rdf:type schema:Person
    170 sg:pub.10.1007/s10822-008-9196-5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045738948
    171 https://doi.org/10.1007/s10822-008-9196-5
    172 rdf:type schema:CreativeWork
    173 sg:pub.10.1038/323533a0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018367015
    174 https://doi.org/10.1038/323533a0
    175 rdf:type schema:CreativeWork
    176 sg:pub.10.1038/nature14539 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010020120
    177 https://doi.org/10.1038/nature14539
    178 rdf:type schema:CreativeWork
    179 grid-institutes:grid.168010.e schema:alternateName Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA
    180 schema:name Stanford University, 318 Campus Dr. S296, 94305, Stanford, CA, USA
    181 rdf:type schema:Organization
    182 grid-institutes:grid.420451.6 schema:alternateName Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA
    183 schema:name Google Inc., 1600 Amphitheatre Pkwy, 94043, Mountain View, CA, USA
    184 rdf:type schema:Organization
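In the table above, `schema:author` points to a blank node that heads an RDF list: each list node carries an `rdf:first` (one author) and an `rdf:rest` (the next node, or `rdf:nil` at the end). The sketch below walks that chain to recover the author order. The dictionaries are hand-copied from the `rdf:first`, `rdf:rest`, and `schema:familyName` rows of the table, and `walk_list` is a hypothetical helper name used only for illustration.

```python
# rdf:first values, keyed by list node (copied from the triples table above).
FIRST = {
    "N23a33e4be5984dd0ab2588ee97689b67": "sg:person.0757445204.86",
    "Nf4aaea37fc9d4ffdafcb6e27ec71e9f9": "sg:person.013333034335.94",
    "Nc8cce30a134d4ddebe8e58bb40baa909": "sg:person.014130414735.74",
    "N9d7259c848b9458eaadd37500af5126e": "sg:person.01336150035.35",
    "N0fc565dd9f62421fa24e8db989df70ca": "sg:person.015523355735.57",
}

# rdf:rest values, keyed by list node; rdf:nil terminates the list.
REST = {
    "N23a33e4be5984dd0ab2588ee97689b67": "Nf4aaea37fc9d4ffdafcb6e27ec71e9f9",
    "Nf4aaea37fc9d4ffdafcb6e27ec71e9f9": "Nc8cce30a134d4ddebe8e58bb40baa909",
    "Nc8cce30a134d4ddebe8e58bb40baa909": "N9d7259c848b9458eaadd37500af5126e",
    "N9d7259c848b9458eaadd37500af5126e": "N0fc565dd9f62421fa24e8db989df70ca",
    "N0fc565dd9f62421fa24e8db989df70ca": "rdf:nil",
}

# schema:familyName for each person node.
FAMILY_NAME = {
    "sg:person.0757445204.86": "Kearnes",
    "sg:person.013333034335.94": "McCloskey",
    "sg:person.014130414735.74": "Berndl",
    "sg:person.01336150035.35": "Pande",
    "sg:person.015523355735.57": "Riley",
}


def walk_list(head, first, rest):
    """Follow rdf:rest links from head, collecting rdf:first values in order."""
    items, seen, node = [], set(), head
    while node != "rdf:nil":
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# The head node is the object of schema:author in the table.
people = walk_list("N23a33e4be5984dd0ab2588ee97689b67", FIRST, REST)
authors = [FAMILY_NAME[p] for p in people]
print(authors)
```

The list encoding is what preserves author order in RDF, since a plain set of `schema:author` triples would carry no ordering.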
     



