Group Contribution-Based Graph Convolution Network: Pure Property Estimation Model


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2022-07-14

AUTHORS

Sun Yoo Hwang, Jeong Won Kang

ABSTRACT

Property data for chemical compounds are essential for the design and operation of chemical processes. Experimental values are reported in the literature, but they are too scarce to meet the exploding demand for data. When data are not available, various estimation methods are employed. The group contribution method is one of the standard, simple techniques in use today. However, such methods are inherently inaccurate because they simplify the representation of the molecular structure. More advanced methods are emerging, with improved molecular representations and better handling of experimental data. However, these methods also suffer from a lack of valid data for adjusting their many parameters. In this contribution, we suggest a compromise between a complex machine learning algorithm and a linear group contribution method. Instead of representing a molecule as a graph of atoms, we employed bulkier blocks: a graph of functional groups. The new approach dramatically reduced the number of adjustable parameters for machine learning. The results show higher accuracy than the conventional methods. The whole process was also examined in several respects: incorporating uncertainties in the data, the robustness of the fitting process, and the detection of outlier data.

PAGES

136

References to SciGraph publications

  • 2016-08-24. Molecular graph convolutions: moving beyond fingerprints in JOURNAL OF COMPUTER-AIDED MOLECULAR DESIGN

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/s10765-022-03060-7

    DOI

    http://dx.doi.org/10.1007/s10765-022-03060-7

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1149472009



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/02", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Physical Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/03", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Chemical Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/09", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Engineering", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0203", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Classical Physics", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0306", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Physical Chemistry (incl. Structural)", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0904", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Chemical Engineering", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Department of Chemical Biological Engineering, Korea University, 145 Anam-ro, Sungbuk-gu, 02841, Seoul, Republic of Korea", 
              "id": "http://www.grid.ac/institutes/grid.222754.4", 
              "name": [
                "Department of Chemical Biological Engineering, Korea University, 145 Anam-ro, Sungbuk-gu, 02841, Seoul, Republic of Korea"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Hwang", 
            "givenName": "Sun Yoo", 
            "id": "sg:person.013261515052.49", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013261515052.49"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Graduate School of Energy and Environment, Korea University, 145 Anam-ro, Sungbuk-gu, 02841, Seoul, Republic of Korea", 
              "id": "http://www.grid.ac/institutes/grid.222754.4", 
              "name": [
                "Department of Chemical Biological Engineering, Korea University, 145 Anam-ro, Sungbuk-gu, 02841, Seoul, Republic of Korea", 
                "Graduate School of Energy and Environment, Korea University, 145 Anam-ro, Sungbuk-gu, 02841, Seoul, Republic of Korea"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Kang", 
            "givenName": "Jeong Won", 
            "id": "sg:person.014533173241.15", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014533173241.15"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/s10822-016-9938-8", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1053264924", 
              "https://doi.org/10.1007/s10822-016-9938-8"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2022-07-14", 
        "datePublishedReg": "2022-07-14", 
        "description": "Properties data for chemical compounds are essential information for the design and operation of chemical processes. Experimental values are reported in the literature, but that are too scarce compared with exploding demand for data. When the data are not available, various estimation methods are employed. The group contribution method is one of the standards and simple techniques used today. However, these methods have inherent inaccuracy due to the simplified representation of the molecular structure. More advanced methods are emerging, including improved molecular representations and handling experimental data. However, such processes also suffer from a lack of valid data for adjusting many parameters. We suggest a compromise between a complex machine learning algorithm and a linear group contribution method in this contribution. Instead of representing a molecule using a graph of atoms, we employed bulkier blocks\u2014a graph of functional groups. The new approach dramatically reduced the number of adjustable parameters for machine learning. The result shows higher accuracy than the conventional methods. The whole process was also examined in various aspects\u2014incorporating uncertainties in the data, the robustness of the fitting process, and detecting outlier data.Graphical Abstract", 
        "genre": "article", 
        "id": "sg:pub.10.1007/s10765-022-03060-7", 
        "isAccessibleForFree": true, 
        "isPartOf": [
          {
            "id": "sg:journal.1043587", 
            "issn": [
              "0195-928X", 
              "1572-9567"
            ], 
            "name": "International Journal of Thermophysics", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "9", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "43"
          }
        ], 
        "keywords": [
          "property estimation models", 
          "estimation method", 
          "fitting process", 
          "adjustable parameters", 
          "group contribution method", 
          "outlier data", 
          "estimation model", 
          "graph", 
          "experimental data", 
          "simplified representation", 
          "contribution method", 
          "high accuracy", 
          "complex machine", 
          "advanced methods", 
          "molecular representations", 
          "machine learning", 
          "new approach", 
          "conventional methods", 
          "such processes", 
          "representation", 
          "parameters", 
          "experimental values", 
          "chemical processes", 
          "property data", 
          "algorithm", 
          "robustness", 
          "uncertainty", 
          "inherent inaccuracies", 
          "simple technique", 
          "atoms", 
          "accuracy", 
          "inaccuracy", 
          "model", 
          "approach", 
          "data", 
          "whole process", 
          "process", 
          "technique", 
          "essential information", 
          "number", 
          "structure", 
          "machine", 
          "operation", 
          "design", 
          "compromise", 
          "results", 
          "values", 
          "contribution", 
          "information", 
          "literature", 
          "molecular structure", 
          "demand", 
          "learning", 
          "block", 
          "valid data", 
          "chemical compounds", 
          "functional groups", 
          "today", 
          "molecules", 
          "method", 
          "compounds", 
          "group", 
          "standards", 
          "lack"
        ], 
        "name": "Group Contribution-Based Graph Convolution Network: Pure Property Estimation Model", 
        "pagination": "136", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1149472009"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/s10765-022-03060-7"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/s10765-022-03060-7", 
          "https://app.dimensions.ai/details/publication/pub.1149472009"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-09-02T16:08", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20220902/entities/gbq_results/article/article_937.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1007/s10765-022-03060-7"
      }
    ]
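
    Once a record like the one above has been saved, its main citation fields can be read with any JSON parser. The following is a minimal sketch in Python, not part of SciGraph itself: the `summarize` helper is hypothetical, and `SAMPLE` is a trimmed copy of the record above that keeps only the fields the sketch reads.

```python
import json

# Trimmed copy of the record above; only the fields read below are kept.
SAMPLE = """
[
  {
    "name": "Group Contribution-Based Graph Convolution Network: Pure Property Estimation Model",
    "datePublished": "2022-07-14",
    "pagination": "136",
    "author": [
      {"familyName": "Hwang", "givenName": "Sun Yoo", "type": "Person"},
      {"familyName": "Kang", "givenName": "Jeong Won", "type": "Person"}
    ],
    "isPartOf": [
      {"name": "International Journal of Thermophysics", "type": "Periodical"},
      {"issueNumber": "9", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "43"}
    ]
  }
]
"""


def summarize(record):
    """Return a flat dict of citation fields (hypothetical helper)."""
    # "isPartOf" mixes journal, issue and volume nodes; index them by "type".
    parts = {p.get("type"): p for p in record.get("isPartOf", [])}
    return {
        "title": record.get("name"),
        "authors": [
            f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])
        ],
        "journal": parts.get("Periodical", {}).get("name"),
        "volume": parts.get("PublicationVolume", {}).get("volumeNumber"),
        "issue": parts.get("PublicationIssue", {}).get("issueNumber"),
        "pages": record.get("pagination"),
        "date": record.get("datePublished"),
    }


# The top level of the record is a list holding a single object.
summary = summarize(json.loads(SAMPLE)[0])
print(summary)
```

    Indexing the `isPartOf` entries by their `type` avoids relying on their order in the list, which the record does not appear to guarantee.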
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10765-022-03060-7'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10765-022-03060-7'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10765-022-03060-7'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10765-022-03060-7'
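
    The four curl commands above differ only in the `Accept` header, so the same content negotiation can be scripted. The sketch below uses Python's standard `urllib.request` and assumes the endpoint still answers as described on this page. The names `ACCEPT`, `build_request` and `fetch` are hypothetical, and the short format keys are arbitrary labels chosen for this example.

```python
import urllib.request

# URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/s10765-022-03060-7"

# Short labels (arbitrary) mapped to the MIME types used in the curl commands.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking for one RDF serialization."""
    if fmt not in ACCEPT:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(ACCEPT)}")
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request needs no network access, so it is safe to inspect here.
req = build_request("turtle")
print(req.full_url, req.get_header("Accept"))
```

    Calling `fetch("json-ld")` would perform the actual download; it is left out of the example so the snippet runs offline.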


     

    This table displays all metadata directly associated to this object as RDF triples.

    150 TRIPLES      21 PREDICATES      93 URIs      80 LITERALS      6 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/s10765-022-03060-7 schema:about anzsrc-for:02
    2 anzsrc-for:0203
    3 anzsrc-for:03
    4 anzsrc-for:0306
    5 anzsrc-for:09
    6 anzsrc-for:0904
    7 schema:author N6f1573e458cf40f6937a145841bd89c8
    8 schema:citation sg:pub.10.1007/s10822-016-9938-8
    9 schema:datePublished 2022-07-14
    10 schema:datePublishedReg 2022-07-14
    11 schema:description Properties data for chemical compounds are essential information for the design and operation of chemical processes. Experimental values are reported in the literature, but that are too scarce compared with exploding demand for data. When the data are not available, various estimation methods are employed. The group contribution method is one of the standards and simple techniques used today. However, these methods have inherent inaccuracy due to the simplified representation of the molecular structure. More advanced methods are emerging, including improved molecular representations and handling experimental data. However, such processes also suffer from a lack of valid data for adjusting many parameters. We suggest a compromise between a complex machine learning algorithm and a linear group contribution method in this contribution. Instead of representing a molecule using a graph of atoms, we employed bulkier blocks—a graph of functional groups. The new approach dramatically reduced the number of adjustable parameters for machine learning. The result shows higher accuracy than the conventional methods. The whole process was also examined in various aspects—incorporating uncertainties in the data, the robustness of the fitting process, and detecting outlier data.Graphical Abstract
    12 schema:genre article
    13 schema:isAccessibleForFree true
    14 schema:isPartOf N5fc46551e0024bc98743959e3b1f2262
    15 Ne0f5043e88514a16b70331b497689bd9
    16 sg:journal.1043587
    17 schema:keywords accuracy
    18 adjustable parameters
    19 advanced methods
    20 algorithm
    21 approach
    22 atoms
    23 block
    24 chemical compounds
    25 chemical processes
    26 complex machine
    27 compounds
    28 compromise
    29 contribution
    30 contribution method
    31 conventional methods
    32 data
    33 demand
    34 design
    35 essential information
    36 estimation method
    37 estimation model
    38 experimental data
    39 experimental values
    40 fitting process
    41 functional groups
    42 graph
    43 group
    44 group contribution method
    45 high accuracy
    46 inaccuracy
    47 information
    48 inherent inaccuracies
    49 lack
    50 learning
    51 literature
    52 machine
    53 machine learning
    54 method
    55 model
    56 molecular representations
    57 molecular structure
    58 molecules
    59 new approach
    60 number
    61 operation
    62 outlier data
    63 parameters
    64 process
    65 property data
    66 property estimation models
    67 representation
    68 results
    69 robustness
    70 simple technique
    71 simplified representation
    72 standards
    73 structure
    74 such processes
    75 technique
    76 today
    77 uncertainty
    78 valid data
    79 values
    80 whole process
    81 schema:name Group Contribution-Based Graph Convolution Network: Pure Property Estimation Model
    82 schema:pagination 136
    83 schema:productId Ncb77cbb4be244c4989cc9e1ef5a0d45e
    84 Nf984fa23edd2479cadc85d067ee9bfbc
    85 schema:sameAs https://app.dimensions.ai/details/publication/pub.1149472009
    86 https://doi.org/10.1007/s10765-022-03060-7
    87 schema:sdDatePublished 2022-09-02T16:08
    88 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    89 schema:sdPublisher N2b8b7184de2643a5a34c2ea17106c37f
    90 schema:url https://doi.org/10.1007/s10765-022-03060-7
    91 sgo:license sg:explorer/license/
    92 sgo:sdDataset articles
    93 rdf:type schema:ScholarlyArticle
    94 N2b8b7184de2643a5a34c2ea17106c37f schema:name Springer Nature - SN SciGraph project
    95 rdf:type schema:Organization
    96 N5fc46551e0024bc98743959e3b1f2262 schema:volumeNumber 43
    97 rdf:type schema:PublicationVolume
    98 N6f1573e458cf40f6937a145841bd89c8 rdf:first sg:person.013261515052.49
    99 rdf:rest Nad7ef148944247c68b2f2191c041d5b2
    100 Nad7ef148944247c68b2f2191c041d5b2 rdf:first sg:person.014533173241.15
    101 rdf:rest rdf:nil
    102 Ncb77cbb4be244c4989cc9e1ef5a0d45e schema:name doi
    103 schema:value 10.1007/s10765-022-03060-7
    104 rdf:type schema:PropertyValue
    105 Ne0f5043e88514a16b70331b497689bd9 schema:issueNumber 9
    106 rdf:type schema:PublicationIssue
    107 Nf984fa23edd2479cadc85d067ee9bfbc schema:name dimensions_id
    108 schema:value pub.1149472009
    109 rdf:type schema:PropertyValue
    110 anzsrc-for:02 schema:inDefinedTermSet anzsrc-for:
    111 schema:name Physical Sciences
    112 rdf:type schema:DefinedTerm
    113 anzsrc-for:0203 schema:inDefinedTermSet anzsrc-for:
    114 schema:name Classical Physics
    115 rdf:type schema:DefinedTerm
    116 anzsrc-for:03 schema:inDefinedTermSet anzsrc-for:
    117 schema:name Chemical Sciences
    118 rdf:type schema:DefinedTerm
    119 anzsrc-for:0306 schema:inDefinedTermSet anzsrc-for:
    120 schema:name Physical Chemistry (incl. Structural)
    121 rdf:type schema:DefinedTerm
    122 anzsrc-for:09 schema:inDefinedTermSet anzsrc-for:
    123 schema:name Engineering
    124 rdf:type schema:DefinedTerm
    125 anzsrc-for:0904 schema:inDefinedTermSet anzsrc-for:
    126 schema:name Chemical Engineering
    127 rdf:type schema:DefinedTerm
    128 sg:journal.1043587 schema:issn 0195-928X
    129 1572-9567
    130 schema:name International Journal of Thermophysics
    131 schema:publisher Springer Nature
    132 rdf:type schema:Periodical
    133 sg:person.013261515052.49 schema:affiliation grid-institutes:grid.222754.4
    134 schema:familyName Hwang
    135 schema:givenName Sun Yoo
    136 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013261515052.49
    137 rdf:type schema:Person
    138 sg:person.014533173241.15 schema:affiliation grid-institutes:grid.222754.4
    139 schema:familyName Kang
    140 schema:givenName Jeong Won
    141 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014533173241.15
    142 rdf:type schema:Person
    143 sg:pub.10.1007/s10822-016-9938-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053264924
    144 https://doi.org/10.1007/s10822-016-9938-8
    145 rdf:type schema:CreativeWork
    146 grid-institutes:grid.222754.4 schema:alternateName Department of Chemical Biological Engineering, Korea University, 145 Anam-ro, Sungbuk-gu, 02841, Seoul, Republic of Korea
    147 Graduate School of Energy and Environment, Korea University, 145 Anam-ro, Sungbuk-gu, 02841, Seoul, Republic of Korea
    148 schema:name Department of Chemical Biological Engineering, Korea University, 145 Anam-ro, Sungbuk-gu, 02841, Seoul, Republic of Korea
    149 Graduate School of Energy and Environment, Korea University, 145 Anam-ro, Sungbuk-gu, 02841, Seoul, Republic of Korea
    150 rdf:type schema:Organization
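
    The summary counts shown above the table (triples, predicates) can in principle be recomputed from the N-Triples serialization. The sketch below is a naive illustration only: `SAMPLE_NT` is hand-written to mirror a few table rows, the expanded `schema:` and `sg:` prefixes are assumptions, and `triple_stats` is a hypothetical helper that splits lines on whitespace rather than fully parsing N-Triples.

```python
from collections import Counter

# Hand-written lines mirroring a few table rows; the expanded prefixes
# (http://schema.org/, http://scigraph.springernature.com/) are assumptions.
SAMPLE_NT = """\
<http://scigraph.springernature.com/pub.10.1007/s10765-022-03060-7> <http://schema.org/pagination> "136" .
<http://scigraph.springernature.com/pub.10.1007/s10765-022-03060-7> <http://schema.org/genre> "article" .
<http://scigraph.springernature.com/pub.10.1007/s10765-022-03060-7> <http://schema.org/keywords> "graph" .
<http://scigraph.springernature.com/pub.10.1007/s10765-022-03060-7> <http://schema.org/keywords> "machine learning" .
"""


def triple_stats(nt_text):
    """Count triples, distinct predicates and distinct subjects."""
    predicates = Counter()
    subjects = set()
    for line in nt_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Subject and predicate contain no spaces; the object may, so it is
        # kept together with the trailing " ." in the third field.
        subject, predicate, _rest = line.split(None, 2)
        subjects.add(subject)
        predicates[predicate] += 1
    return {
        "triples": sum(predicates.values()),
        "predicates": len(predicates),
        "subjects": len(subjects),
    }


stats = triple_stats(SAMPLE_NT)
print(stats)
```

    For real data, a dedicated RDF parser would be the safer choice, since whitespace splitting does not validate the syntax or handle every literal form.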
     



