Alignment-Based Trace Clustering


Ontology type: schema:Chapter
Open Access: True


Chapter Info

DATE

2017-10-21

AUTHORS

Thomas Chatain , Josep Carmona , Boudewijn van Dongen

ABSTRACT

A novel method to cluster event log traces is presented in this paper. In contrast to the approaches in the literature, the clustering approach of this paper assumes an additional input: a process model that describes the current process. The core idea of the algorithm is to use model traces as centroids of the clusters detected, computed from a generalization of the notion of alignment. This way, model explanations of observed behavior are the driving force to compute the clusters, instead of current model agnostic approaches, e.g., which group log traces merely on their vector-space similarity. We believe alignment-based trace clustering provides results more useful for stakeholders. Moreover, in case of log incompleteness, noisy logs or concept drift, they can be more robust for dealing with highly deviating traces. The technique of this paper can be combined with any clustering technique to provide model explanations to the clusters computed. The proposed technique relies on encoding the individual alignment problems into the (pseudo-)Boolean domain, and has been implemented in our tool DarkSider that uses an open-source solver.
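
The core idea in the abstract, model traces acting as cluster centroids, can be illustrated with a deliberately simplified sketch. It is not the authors' method: the chapter derives centroids from a process model through a (pseudo-)Boolean encoding, whereas here the candidate model traces are simply given as a list, and an insert/delete edit distance stands in for alignment cost. The names `edit_distance`, `cluster_by_model_trace` and `max_cost`, and the example traces, are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Trace = Tuple[str, ...]


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Insert/delete-only edit distance, used here as a stand-in for alignment cost."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1])  # matching activities cost nothing
            else:
                curr.append(1 + min(prev[j], curr[j - 1]))  # delete x or insert y
        prev = curr
    return prev[-1]


def cluster_by_model_trace(
    log: List[Trace], model_traces: List[Trace], max_cost: int
) -> Tuple[Dict[Trace, List[Trace]], List[Trace]]:
    """Assign each log trace to its closest model trace (the centroid).

    Traces farther than max_cost from every centroid are left unclustered.
    """
    clusters: Dict[Trace, List[Trace]] = {m: [] for m in model_traces}
    unclustered: List[Trace] = []
    for trace in log:
        best = min(model_traces, key=lambda m: edit_distance(trace, m))
        if edit_distance(trace, best) <= max_cost:
            clusters[best].append(trace)
        else:
            unclustered.append(trace)
    return clusters, unclustered


# Hypothetical model traces and log, chosen only for illustration.
MODEL_TRACES: List[Trace] = [("a", "b", "c"), ("a", "d", "e")]
LOG: List[Trace] = [("a", "b", "c"), ("a", "c"), ("a", "d", "d", "e"), ("x", "y", "z")]

clusters, unclustered = cluster_by_model_trace(LOG, MODEL_TRACES, max_cost=2)
```

Each resulting cluster is explained by the model trace it is keyed on, which is the property the abstract emphasises. A highly deviating trace such as `("x", "y", "z")` is left out rather than forced into a cluster.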

PAGES

295-308

References to SciGraph publications

  • 2016. A Recursive Paradigm for Aligning Observed Behavior of Large Structured Process Models in BUSINESS PROCESS MANAGEMENT
  • 2007. Approaching Process Mining with Sequence Clustering: Experiments and Findings in BUSINESS PROCESS MANAGEMENT
  • 2009. Trace Clustering in Process Mining in BUSINESS PROCESS MANAGEMENT WORKSHOPS
  • 2016. Anti-alignments in Conformance Checking – The Dark Side of Process Models in APPLICATION AND THEORY OF PETRI NETS AND CONCURRENCY
  • 2018. Model and Event Log Reductions to Boost the Computation of Alignments in DATA-DRIVEN PROCESS DISCOVERY AND ANALYSIS
  • 1993. Complexity results for 1-safe nets in FOUNDATIONS OF SOFTWARE TECHNOLOGY AND THEORETICAL COMPUTER SCIENCE
  • 2010. Trace Clustering Based on Conserved Patterns: Towards Achieving Better Process Models in BUSINESS PROCESS MANAGEMENT WORKSHOPS

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-3-319-69904-2_24

    DOI

    http://dx.doi.org/10.1007/978-3-319-69904-2_24

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1092370091



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information Systems", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Laboratoire Sp\u00e9cification et V\u00e9rification", 
              "id": "https://www.grid.ac/institutes/grid.464035.0", 
              "name": [
                "LSV, ENS Paris-Saclay, CNRS, Inria, Cachan, France"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Chatain", 
            "givenName": "Thomas", 
            "id": "sg:person.013473531515.83", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013473531515.83"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Universitat Polit\u00e8cnica de Catalunya", 
              "id": "https://www.grid.ac/institutes/grid.6835.8", 
              "name": [
                "Universitat Polit\u00e8cnica de Catalunya, Barcelona, Spain"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Carmona", 
            "givenName": "Josep", 
            "id": "sg:person.012252436511.20", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012252436511.20"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Eindhoven University of Technology", 
              "id": "https://www.grid.ac/institutes/grid.6852.9", 
              "name": [
                "Eindhoven University of Technology, Eindhoven, The Netherlands"
              ], 
              "type": "Organization"
            }, 
            "familyName": "van Dongen", 
            "givenName": "Boudewijn", 
            "id": "sg:person.015723401221.54", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015723401221.54"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/3-540-57529-4_66", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1002723777", 
              "https://doi.org/10.1007/3-540-57529-4_66"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-540-75183-0_26", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1046507206", 
              "https://doi.org/10.1007/978-3-540-75183-0_26"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-319-39086-4_15", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1050241067", 
              "https://doi.org/10.1007/978-3-319-39086-4_15"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-12186-9_16", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1051249812", 
              "https://doi.org/10.1007/978-3-642-12186-9_16"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-642-00328-8_11", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1051634441", 
              "https://doi.org/10.1007/978-3-642-00328-8_11"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/5.24143", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061179070"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/tkde.2006.123", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061661511"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/tkde.2013.64", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061662817"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-319-45348-4_12", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1084916164", 
              "https://doi.org/10.1007/978-3-319-45348-4_12"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1137/1.9781611972795.35", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1088800353"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-319-74161-1_1", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1100653782", 
              "https://doi.org/10.1007/978-3-319-74161-1_1"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1002/0471741442", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1109491873"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://app.dimensions.ai/details/publication/pub.1109491873", 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2017-10-21", 
        "datePublishedReg": "2017-10-21", 
        "description": "A novel method to cluster event log traces is presented in this paper. In contrast to the approaches in the literature, the clustering approach of this paper assumes an additional input: a process model that describes the current process. The core idea of the algorithm is to use model traces as centroids of the clusters detected, computed from a generalization of the notion of alignment. This way, model explanations of observed behavior are the driving force to compute the clusters, instead of current model agnostic approaches, e.g., which group log traces merely on their vector-space similarity. We believe alignment-based trace clustering provides results more useful for stakeholders. Moreover, in case of log incompleteness, noisy logs or concept drift, they can be more robust for dealing with highly deviating traces. The technique of this paper can be combined with any clustering technique to provide model explanations to the clusters computed. The proposed technique relies on encoding the individual alignment problems into the (pseudo-)Boolean domain, and has been implemented in our tool DarkSider that uses an open-source solver.", 
        "editor": [
          {
            "familyName": "Mayr", 
            "givenName": "Heinrich C.", 
            "type": "Person"
          }, 
          {
            "familyName": "Guizzardi", 
            "givenName": "Giancarlo", 
            "type": "Person"
          }, 
          {
            "familyName": "Ma", 
            "givenName": "Hui", 
            "type": "Person"
          }, 
          {
            "familyName": "Pastor", 
            "givenName": "Oscar", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-3-319-69904-2_24", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": true, 
        "isPartOf": {
          "isbn": [
            "978-3-319-69903-5", 
            "978-3-319-69904-2"
          ], 
          "name": "Conceptual Modeling", 
          "type": "Book"
        }, 
        "name": "Alignment-Based Trace Clustering", 
        "pagination": "295-308", 
        "productId": [
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-3-319-69904-2_24"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "74fb04095b1a0c589cd58fb884038b35142f898df8b59bf61820fb10a44bcb4e"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1092370091"
            ]
          }
        ], 
        "publisher": {
          "location": "Cham", 
          "name": "Springer International Publishing", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-3-319-69904-2_24", 
          "https://app.dimensions.ai/details/publication/pub.1092370091"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-16T04:59", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000325_0000000325/records_100782_00000000.jsonl", 
        "type": "Chapter", 
        "url": "https://link.springer.com/10.1007%2F978-3-319-69904-2_24"
      }
    ]
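
Because JSON-LD is plain JSON, a record shaped like the one above can be read with a standard JSON parser. The sketch below works on a trimmed, hand-copied excerpt rather than the full record. The helper name `summarize` and the keys of the dictionary it returns are assumptions, and the repeated citation entry is placed in the excerpt deliberately to show de-duplication.

```python
import json

# Trimmed excerpt of the record; most fields are omitted for brevity.
RECORD = """
[
  {
    "name": "Alignment-Based Trace Clustering",
    "author": [
      {"familyName": "Chatain", "givenName": "Thomas"},
      {"familyName": "Carmona", "givenName": "Josep"},
      {"familyName": "van Dongen", "givenName": "Boudewijn"}
    ],
    "productId": [
      {"name": "doi", "value": ["10.1007/978-3-319-69904-2_24"]},
      {"name": "dimensions_id", "value": ["pub.1092370091"]}
    ],
    "citation": [
      {"id": "sg:pub.10.1007/3-540-57529-4_66"},
      {"id": "https://doi.org/10.1109/tkde.2006.123"},
      {"id": "https://doi.org/10.1109/tkde.2006.123"}
    ]
  }
]
"""


def summarize(record_json: str) -> dict:
    """Pull title, authors, DOI and unique citation ids out of a record."""
    node = json.loads(record_json)[0]  # the record is a one-element array
    authors = [f"{a['givenName']} {a['familyName']}" for a in node.get("author", [])]
    # productId is a list of name/value pairs; index it by name.
    product_ids = {p["name"]: p["value"][0] for p in node.get("productId", [])}
    # dict.fromkeys keeps first-seen order while dropping repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in node.get("citation", [])))
    return {
        "title": node["name"],
        "authors": authors,
        "doi": product_ids.get("doi"),
        "citations": citations,
    }


summary = summarize(RECORD)
```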
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-69904-2_24'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-69904-2_24'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-69904-2_24'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-69904-2_24'
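
The four curl commands differ only in their Accept header, so the same content negotiation can be sketched in Python with the standard library. The helper `build_request` and the short format keys in `ACCEPT` are assumptions made for this example; the URL and media types are the ones used in the commands above. The sketch only builds the request and does not send it.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl commands; the short keys are illustrative.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id: str, fmt: str) -> urllib.request.Request:
    """Build a GET request asking for the record in the given RDF format."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": ACCEPT[fmt]},
    )


req = build_request("pub.10.1007/978-3-319-69904-2_24", "turtle")
```

Passing `req` to `urllib.request.urlopen` would perform the actual fetch; whether the endpoint responds is outside what this sketch checks.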


     

    This table displays all metadata directly associated to this object as RDF triples.

    145 TRIPLES      23 PREDICATES      39 URIs      19 LITERALS      8 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/978-3-319-69904-2_24 schema:about anzsrc-for:08
    2 anzsrc-for:0806
    3 schema:author N8f541831c1cd4fbbbb174b3a39293232
    4 schema:citation sg:pub.10.1007/3-540-57529-4_66
    5 sg:pub.10.1007/978-3-319-39086-4_15
    6 sg:pub.10.1007/978-3-319-45348-4_12
    7 sg:pub.10.1007/978-3-319-74161-1_1
    8 sg:pub.10.1007/978-3-540-75183-0_26
    9 sg:pub.10.1007/978-3-642-00328-8_11
    10 sg:pub.10.1007/978-3-642-12186-9_16
    11 https://app.dimensions.ai/details/publication/pub.1109491873
    12 https://doi.org/10.1002/0471741442
    13 https://doi.org/10.1109/5.24143
    14 https://doi.org/10.1109/tkde.2006.123
    15 https://doi.org/10.1109/tkde.2013.64
    16 https://doi.org/10.1137/1.9781611972795.35
    17 schema:datePublished 2017-10-21
    18 schema:datePublishedReg 2017-10-21
    19 schema:description A novel method to cluster event log traces is presented in this paper. In contrast to the approaches in the literature, the clustering approach of this paper assumes an additional input: a process model that describes the current process. The core idea of the algorithm is to use model traces as centroids of the clusters detected, computed from a generalization of the notion of alignment. This way, model explanations of observed behavior are the driving force to compute the clusters, instead of current model agnostic approaches, e.g., which group log traces merely on their vector-space similarity. We believe alignment-based trace clustering provides results more useful for stakeholders. Moreover, in case of log incompleteness, noisy logs or concept drift, they can be more robust for dealing with highly deviating traces. The technique of this paper can be combined with any clustering technique to provide model explanations to the clusters computed. The proposed technique relies on encoding the individual alignment problems into the (pseudo-)Boolean domain, and has been implemented in our tool DarkSider that uses an open-source solver.
    20 schema:editor N673acad28a864bbdb6c15a7701fe47f9
    21 schema:genre chapter
    22 schema:inLanguage en
    23 schema:isAccessibleForFree true
    24 schema:isPartOf N546961f11e734b8b97685b04222bba04
    25 schema:name Alignment-Based Trace Clustering
    26 schema:pagination 295-308
    27 schema:productId N31b2845213884e0893be862036c27882
    28 N3df0aa8b83b5423ba91675fb40ab7eba
    29 Nea361c2811ab4fd69bf5c8fefb938daf
    30 schema:publisher N781a3b7a1e044b809237516b3ba4bfc0
    31 schema:sameAs https://app.dimensions.ai/details/publication/pub.1092370091
    32 https://doi.org/10.1007/978-3-319-69904-2_24
    33 schema:sdDatePublished 2019-04-16T04:59
    34 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    35 schema:sdPublisher N7d2a111938574ce3809dc38e3bef27a7
    36 schema:url https://link.springer.com/10.1007%2F978-3-319-69904-2_24
    37 sgo:license sg:explorer/license/
    38 sgo:sdDataset chapters
    39 rdf:type schema:Chapter
    40 N020ecc3f9d1445748fd000fd49a132e6 rdf:first N4e224052f1884e36acc7c3fd79dffc4f
    41 rdf:rest N4a5be24412444eee85e6a9001fd23323
    42 N31b2845213884e0893be862036c27882 schema:name doi
    43 schema:value 10.1007/978-3-319-69904-2_24
    44 rdf:type schema:PropertyValue
    45 N3df0aa8b83b5423ba91675fb40ab7eba schema:name readcube_id
    46 schema:value 74fb04095b1a0c589cd58fb884038b35142f898df8b59bf61820fb10a44bcb4e
    47 rdf:type schema:PropertyValue
    48 N3ebfaa93a7964d2594dc08a5decdf3ea schema:familyName Pastor
    49 schema:givenName Oscar
    50 rdf:type schema:Person
    51 N4a5be24412444eee85e6a9001fd23323 rdf:first N3ebfaa93a7964d2594dc08a5decdf3ea
    52 rdf:rest rdf:nil
    53 N4e224052f1884e36acc7c3fd79dffc4f schema:familyName Ma
    54 schema:givenName Hui
    55 rdf:type schema:Person
    56 N546961f11e734b8b97685b04222bba04 schema:isbn 978-3-319-69903-5
    57 978-3-319-69904-2
    58 schema:name Conceptual Modeling
    59 rdf:type schema:Book
    60 N673acad28a864bbdb6c15a7701fe47f9 rdf:first N93cf6f68d41c401b976d9be4521b112f
    61 rdf:rest N942595d9811f48e89794e371671afbbc
    62 N781a3b7a1e044b809237516b3ba4bfc0 schema:location Cham
    63 schema:name Springer International Publishing
    64 rdf:type schema:Organisation
    65 N7d2a111938574ce3809dc38e3bef27a7 schema:name Springer Nature - SN SciGraph project
    66 rdf:type schema:Organization
    67 N8f541831c1cd4fbbbb174b3a39293232 rdf:first sg:person.013473531515.83
    68 rdf:rest Nec56409efe424211ad399cb6df2491d9
    69 N93cf6f68d41c401b976d9be4521b112f schema:familyName Mayr
    70 schema:givenName Heinrich C.
    71 rdf:type schema:Person
    72 N942595d9811f48e89794e371671afbbc rdf:first Nfcb295b35d044e198e66c85d6e58981a
    73 rdf:rest N020ecc3f9d1445748fd000fd49a132e6
    74 Nc603f67f504f4519a68d584a3e24a2ee rdf:first sg:person.015723401221.54
    75 rdf:rest rdf:nil
    76 Nea361c2811ab4fd69bf5c8fefb938daf schema:name dimensions_id
    77 schema:value pub.1092370091
    78 rdf:type schema:PropertyValue
    79 Nec56409efe424211ad399cb6df2491d9 rdf:first sg:person.012252436511.20
    80 rdf:rest Nc603f67f504f4519a68d584a3e24a2ee
    81 Nfcb295b35d044e198e66c85d6e58981a schema:familyName Guizzardi
    82 schema:givenName Giancarlo
    83 rdf:type schema:Person
    84 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    85 schema:name Information and Computing Sciences
    86 rdf:type schema:DefinedTerm
    87 anzsrc-for:0806 schema:inDefinedTermSet anzsrc-for:
    88 schema:name Information Systems
    89 rdf:type schema:DefinedTerm
    90 sg:person.012252436511.20 schema:affiliation https://www.grid.ac/institutes/grid.6835.8
    91 schema:familyName Carmona
    92 schema:givenName Josep
    93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012252436511.20
    94 rdf:type schema:Person
    95 sg:person.013473531515.83 schema:affiliation https://www.grid.ac/institutes/grid.464035.0
    96 schema:familyName Chatain
    97 schema:givenName Thomas
    98 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013473531515.83
    99 rdf:type schema:Person
    100 sg:person.015723401221.54 schema:affiliation https://www.grid.ac/institutes/grid.6852.9
    101 schema:familyName van Dongen
    102 schema:givenName Boudewijn
    103 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015723401221.54
    104 rdf:type schema:Person
    105 sg:pub.10.1007/3-540-57529-4_66 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002723777
    106 https://doi.org/10.1007/3-540-57529-4_66
    107 rdf:type schema:CreativeWork
    108 sg:pub.10.1007/978-3-319-39086-4_15 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050241067
    109 https://doi.org/10.1007/978-3-319-39086-4_15
    110 rdf:type schema:CreativeWork
    111 sg:pub.10.1007/978-3-319-45348-4_12 schema:sameAs https://app.dimensions.ai/details/publication/pub.1084916164
    112 https://doi.org/10.1007/978-3-319-45348-4_12
    113 rdf:type schema:CreativeWork
    114 sg:pub.10.1007/978-3-319-74161-1_1 schema:sameAs https://app.dimensions.ai/details/publication/pub.1100653782
    115 https://doi.org/10.1007/978-3-319-74161-1_1
    116 rdf:type schema:CreativeWork
    117 sg:pub.10.1007/978-3-540-75183-0_26 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046507206
    118 https://doi.org/10.1007/978-3-540-75183-0_26
    119 rdf:type schema:CreativeWork
    120 sg:pub.10.1007/978-3-642-00328-8_11 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051634441
    121 https://doi.org/10.1007/978-3-642-00328-8_11
    122 rdf:type schema:CreativeWork
    123 sg:pub.10.1007/978-3-642-12186-9_16 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051249812
    124 https://doi.org/10.1007/978-3-642-12186-9_16
    125 rdf:type schema:CreativeWork
    126 https://app.dimensions.ai/details/publication/pub.1109491873 rdf:type schema:CreativeWork
    127 https://doi.org/10.1002/0471741442 schema:sameAs https://app.dimensions.ai/details/publication/pub.1109491873
    128 rdf:type schema:CreativeWork
    129 https://doi.org/10.1109/5.24143 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061179070
    130 rdf:type schema:CreativeWork
    131 https://doi.org/10.1109/tkde.2006.123 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061661511
    132 rdf:type schema:CreativeWork
    133 https://doi.org/10.1109/tkde.2013.64 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061662817
    134 rdf:type schema:CreativeWork
    135 https://doi.org/10.1137/1.9781611972795.35 schema:sameAs https://app.dimensions.ai/details/publication/pub.1088800353
    136 rdf:type schema:CreativeWork
    137 https://www.grid.ac/institutes/grid.464035.0 schema:alternateName Laboratoire Spécification et Vérification
    138 schema:name LSV, ENS Paris-Saclay, CNRS, Inria, Cachan, France
    139 rdf:type schema:Organization
    140 https://www.grid.ac/institutes/grid.6835.8 schema:alternateName Universitat Politècnica de Catalunya
    141 schema:name Universitat Politècnica de Catalunya, Barcelona, Spain
    142 rdf:type schema:Organization
    143 https://www.grid.ac/institutes/grid.6852.9 schema:alternateName Eindhoven University of Technology
    144 schema:name Eindhoven University of Technology, Eindhoven, The Netherlands
    145 rdf:type schema:Organization
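
In the table, the ordered author and editor lists are encoded as RDF collections: each blank node carries an `rdf:first` item and an `rdf:rest` pointer, ending at `rdf:nil`. The sketch below recovers the editor order by walking that chain, starting from the `schema:editor` object. The triples are copied from the table; the function `walk_rdf_list` and the tuple representation are assumptions, and a real application would more likely use an RDF library.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]
RDF_NIL = "rdf:nil"

# Editor-list triples from the table, intentionally not in list order.
TRIPLES: List[Triple] = [
    ("N020ecc3f9d1445748fd000fd49a132e6", "rdf:first", "N4e224052f1884e36acc7c3fd79dffc4f"),
    ("N020ecc3f9d1445748fd000fd49a132e6", "rdf:rest", "N4a5be24412444eee85e6a9001fd23323"),
    ("N4a5be24412444eee85e6a9001fd23323", "rdf:first", "N3ebfaa93a7964d2594dc08a5decdf3ea"),
    ("N4a5be24412444eee85e6a9001fd23323", "rdf:rest", RDF_NIL),
    ("N673acad28a864bbdb6c15a7701fe47f9", "rdf:first", "N93cf6f68d41c401b976d9be4521b112f"),
    ("N673acad28a864bbdb6c15a7701fe47f9", "rdf:rest", "N942595d9811f48e89794e371671afbbc"),
    ("N942595d9811f48e89794e371671afbbc", "rdf:first", "Nfcb295b35d044e198e66c85d6e58981a"),
    ("N942595d9811f48e89794e371671afbbc", "rdf:rest", "N020ecc3f9d1445748fd000fd49a132e6"),
    ("N3ebfaa93a7964d2594dc08a5decdf3ea", "schema:familyName", "Pastor"),
    ("N4e224052f1884e36acc7c3fd79dffc4f", "schema:familyName", "Ma"),
    ("N93cf6f68d41c401b976d9be4521b112f", "schema:familyName", "Mayr"),
    ("Nfcb295b35d044e198e66c85d6e58981a", "schema:familyName", "Guizzardi"),
]


def walk_rdf_list(head: str, triples: List[Triple]) -> List[str]:
    """Follow rdf:first / rdf:rest from head until rdf:nil, returning the items."""
    first: Dict[str, str] = {s: o for s, p, o in triples if p == "rdf:first"}
    rest: Dict[str, str] = {s: o for s, p, o in triples if p == "rdf:rest"}
    items: List[str] = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


family_name = {s: o for s, p, o in TRIPLES if p == "schema:familyName"}
# The schema:editor object in the table is the head of the list.
editor_nodes = walk_rdf_list("N673acad28a864bbdb6c15a7701fe47f9", TRIPLES)
editors = [family_name[n] for n in editor_nodes]
```

The recovered order matches the `editor` array in the JSON-LD record, which is the reason the list encoding exists: plain triples carry no ordering of their own.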
     



