A comparative evaluation of novelty detection algorithms for discrete sequences


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2019-11-08

AUTHORS

Rémi Domingues, Pietro Michiardi, Jérémie Barlet, Maurizio Filippone

ABSTRACT

The identification of anomalies in temporal data is a core component of numerous research areas such as intrusion detection, fault prevention, genomics and fraud detection. This article provides an experimental comparison of candidate methods for the novelty detection problem applied to discrete sequences. The objective of this study is to identify which state-of-the-art methods are efficient and appropriate candidates for a given use case. These recommendations rely on extensive novelty detection experiments based on a variety of public datasets in addition to novel industrial datasets. We also perform thorough scalability and memory usage tests resulting in new supplementary insights of the methods’ performance, key selection criteria to solve problems relying on large volumes of data and to meet the expectations of applications subject to strict response time constraints.

PAGES

3787-3812

References to SciGraph publications

  • 2004-10. A Survey of Outlier Detection Methodologies in ARTIFICIAL INTELLIGENCE REVIEW
  • 2010-04-08. Protein sequences classification by means of feature extraction with substitution matrices in BMC BIOINFORMATICS
  • 2009-02-19. RankAggreg, an R package for weighted rank aggregation in BMC BIOINFORMATICS
  • 2018-01-28. Visual interpretability for deep learning: a survey in FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING
  • 2019-09-23. Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms in SIMILARITY SEARCH AND APPLICATIONS
  • 2018-06-14. Deep Gaussian Process autoencoders for novelty detection in MACHINE LEARNING

Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4

    DOI

    http://dx.doi.org/10.1007/s10462-019-09779-4

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1122411601



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France", 
              "id": "http://www.grid.ac/institutes/grid.28848.3e", 
              "name": [
                "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Domingues", 
            "givenName": "R\u00e9mi", 
            "id": "sg:person.013615603455.39", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013615603455.39"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France", 
              "id": "http://www.grid.ac/institutes/grid.28848.3e", 
              "name": [
                "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Michiardi", 
            "givenName": "Pietro", 
            "id": "sg:person.016057235273.37", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016057235273.37"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Amadeus, 485 Route du Pin Montard, Sophia Antipolis, France", 
              "id": "http://www.grid.ac/institutes/None", 
              "name": [
                "Amadeus, 485 Route du Pin Montard, Sophia Antipolis, France"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Barlet", 
            "givenName": "J\u00e9r\u00e9mie", 
            "id": "sg:person.015304074031.53", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015304074031.53"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France", 
              "id": "http://www.grid.ac/institutes/grid.28848.3e", 
              "name": [
                "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Filippone", 
            "givenName": "Maurizio", 
            "id": "sg:person.07706215665.03", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07706215665.03"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1007/11823728_26", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1020787219", 
              "https://doi.org/10.1007/11823728_26"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1631/fitee.1700808", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1101554124", 
              "https://doi.org/10.1631/fitee.1700808"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-030-32047-8_16", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1121221916", 
              "https://doi.org/10.1007/978-3-030-32047-8_16"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-11-175", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1027592485", 
              "https://doi.org/10.1186/1471-2105-11-175"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s10994-018-5723-3", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1104606391", 
              "https://doi.org/10.1007/s10994-018-5723-3"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1023/b:aire.0000045502.10941.a9", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1014095928", 
              "https://doi.org/10.1023/b:aire.0000045502.10941.a9"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-10-62", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1052717070", 
              "https://doi.org/10.1186/1471-2105-10-62"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2019-11-08", 
        "datePublishedReg": "2019-11-08", 
        "description": "The identification of anomalies in temporal data is a core component of numerous research areas such as intrusion detection, fault prevention, genomics and fraud detection. This article provides an experimental comparison of candidate methods for the novelty detection problem applied to discrete sequences. The objective of this study is to identify which state-of-the-art methods are efficient and appropriate candidates for a given use case. These recommendations rely on extensive novelty detection experiments based on a variety of public datasets in addition to novel industrial datasets. We also perform thorough scalability and memory usage tests resulting in new supplementary insights of the methods\u2019 performance, key selection criteria to solve problems relying on large volumes of data and to meet the expectations of applications subject to strict response time constraints.", 
        "genre": "article", 
        "id": "sg:pub.10.1007/s10462-019-09779-4", 
        "inLanguage": "en", 
        "isAccessibleForFree": true, 
        "isPartOf": [
          {
            "id": "sg:journal.1126843", 
            "issn": [
              "0269-2821", 
              "1573-7462"
            ], 
            "name": "Artificial Intelligence Review", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "5", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "53"
          }
        ], 
        "keywords": [
          "response time constraints", 
          "novelty detection problem", 
          "novelty detection algorithm", 
          "identification of anomalies", 
          "intrusion detection", 
          "fraud detection", 
          "use cases", 
          "industrial datasets", 
          "public datasets", 
          "fault prevention", 
          "art methods", 
          "temporal data", 
          "detection algorithm", 
          "detection problem", 
          "discrete sequence", 
          "numerous research areas", 
          "research area", 
          "time constraints", 
          "experimental comparison", 
          "large volumes", 
          "dataset", 
          "expectations of applications", 
          "candidate method", 
          "core component", 
          "scalability", 
          "algorithm", 
          "usage test", 
          "detection", 
          "detection experiments", 
          "supplementary insights", 
          "comparative evaluation", 
          "constraints", 
          "method", 
          "key selection criteria", 
          "applications", 
          "data", 
          "performance", 
          "selection criteria", 
          "sequence", 
          "experiments", 
          "evaluation", 
          "recommendations", 
          "variety", 
          "objective", 
          "identification", 
          "components", 
          "state", 
          "article", 
          "area", 
          "expectations", 
          "criteria", 
          "genomics", 
          "insights", 
          "comparison", 
          "appropriate candidates", 
          "candidates", 
          "anomalies", 
          "cases", 
          "volume", 
          "addition", 
          "test", 
          "study", 
          "prevention", 
          "problem", 
          "extensive novelty detection experiments", 
          "novelty detection experiments", 
          "novel industrial datasets", 
          "thorough scalability", 
          "memory usage tests", 
          "new supplementary insights", 
          "strict response time constraints"
        ], 
        "name": "A comparative evaluation of novelty detection algorithms for discrete sequences", 
        "pagination": "3787-3812", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1122411601"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/s10462-019-09779-4"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/s10462-019-09779-4", 
          "https://app.dimensions.ai/details/publication/pub.1122411601"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-01-01T18:55", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/article/article_826.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1007/s10462-019-09779-4"
      }
    ]
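As a minimal sketch of how a record of this shape could be consumed, the Python snippet below pulls the bibliographic fields out of the JSON-LD. The embedded `RECORD_JSON` is an abbreviated copy of the record above (only the fields that are read are kept), and the helper name `summarize` is a hypothetical choice for illustration, not part of any SciGraph tooling.

```python
import json

# Abbreviated copy of the record above; only the fields read below are kept.
RECORD_JSON = r"""
[
  {
    "author": [
      {"familyName": "Domingues", "givenName": "R\u00e9mi"},
      {"familyName": "Michiardi", "givenName": "Pietro"},
      {"familyName": "Barlet", "givenName": "J\u00e9r\u00e9mie"},
      {"familyName": "Filippone", "givenName": "Maurizio"}
    ],
    "datePublished": "2019-11-08",
    "isPartOf": [
      {"id": "sg:journal.1126843", "name": "Artificial Intelligence Review", "type": "Periodical"},
      {"issueNumber": "5", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "53"}
    ],
    "name": "A comparative evaluation of novelty detection algorithms for discrete sequences",
    "pagination": "3787-3812",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1122411601"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/s10462-019-09779-4"]}
    ]
  }
]
"""


def summarize(record: dict) -> dict:
    """Flatten one article record into a small citation-style dictionary."""
    # "isPartOf" mixes journal, issue and volume entries; index them by type.
    parts = {part["type"]: part for part in record["isPartOf"]}
    # "productId" holds one PropertyValue per identifier scheme.
    ids = {pid["name"]: pid["value"][0] for pid in record["productId"]}
    return {
        "title": record["name"],
        "authors": [f'{a["givenName"]} {a["familyName"]}' for a in record["author"]],
        "journal": parts["Periodical"]["name"],
        "volume": parts["PublicationVolume"]["volumeNumber"],
        "issue": parts["PublicationIssue"]["issueNumber"],
        "pages": record["pagination"],
        "doi": ids["doi"],
        "published": record["datePublished"],
    }


# The top-level JSON value is a list containing a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
print(", ".join(summary["authors"]))
print(f'{summary["journal"]} {summary["volume"]}({summary["issue"]}): {summary["pages"]}')
```

Indexing `isPartOf` by its `type` field avoids relying on the order of the journal, issue and volume entries, which the record does not appear to guarantee.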
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4'
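The same content negotiation can be sketched in Python with only the standard library. The URL and the four `Accept` values are taken from the curl commands above; the short format labels, the `build_request` and `fetch` helper names, and the timeout are assumptions made for illustration, and the sketch presumes the endpoint still serves these formats.

```python
import urllib.request

# Record URL, as used in the curl commands above.
BASE_URL = "https://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4"

# Hypothetical short labels mapped to the media types shown above.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request whose Accept header selects the RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt: str, url: str = BASE_URL, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network; only fetch() does.
request = build_request("turtle")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `fetch("json-ld")` would correspond to the first curl command; it is left uncalled here so the snippet runs without network access.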


     

    This table displays all metadata directly associated with this object as RDF triples.

    181 TRIPLES      22 PREDICATES      102 URIs      87 LITERALS      6 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/s10462-019-09779-4 schema:about anzsrc-for:08
    2 anzsrc-for:0801
    3 schema:author N018259ffd846400ba855868aa7e3425b
    4 schema:citation sg:pub.10.1007/11823728_26
    5 sg:pub.10.1007/978-3-030-32047-8_16
    6 sg:pub.10.1007/s10994-018-5723-3
    7 sg:pub.10.1023/b:aire.0000045502.10941.a9
    8 sg:pub.10.1186/1471-2105-10-62
    9 sg:pub.10.1186/1471-2105-11-175
    10 sg:pub.10.1631/fitee.1700808
    11 schema:datePublished 2019-11-08
    12 schema:datePublishedReg 2019-11-08
    13 schema:description The identification of anomalies in temporal data is a core component of numerous research areas such as intrusion detection, fault prevention, genomics and fraud detection. This article provides an experimental comparison of candidate methods for the novelty detection problem applied to discrete sequences. The objective of this study is to identify which state-of-the-art methods are efficient and appropriate candidates for a given use case. These recommendations rely on extensive novelty detection experiments based on a variety of public datasets in addition to novel industrial datasets. We also perform thorough scalability and memory usage tests resulting in new supplementary insights of the methods’ performance, key selection criteria to solve problems relying on large volumes of data and to meet the expectations of applications subject to strict response time constraints.
    14 schema:genre article
    15 schema:inLanguage en
    16 schema:isAccessibleForFree true
    17 schema:isPartOf N77231e6fd786470ea043b4b67f4f5921
    18 Nf2322eaa236a4137a464e6958ec21417
    19 sg:journal.1126843
    20 schema:keywords addition
    21 algorithm
    22 anomalies
    23 applications
    24 appropriate candidates
    25 area
    26 art methods
    27 article
    28 candidate method
    29 candidates
    30 cases
    31 comparative evaluation
    32 comparison
    33 components
    34 constraints
    35 core component
    36 criteria
    37 data
    38 dataset
    39 detection
    40 detection algorithm
    41 detection experiments
    42 detection problem
    43 discrete sequence
    44 evaluation
    45 expectations
    46 expectations of applications
    47 experimental comparison
    48 experiments
    49 extensive novelty detection experiments
    50 fault prevention
    51 fraud detection
    52 genomics
    53 identification
    54 identification of anomalies
    55 industrial datasets
    56 insights
    57 intrusion detection
    58 key selection criteria
    59 large volumes
    60 memory usage tests
    61 method
    62 new supplementary insights
    63 novel industrial datasets
    64 novelty detection algorithm
    65 novelty detection experiments
    66 novelty detection problem
    67 numerous research areas
    68 objective
    69 performance
    70 prevention
    71 problem
    72 public datasets
    73 recommendations
    74 research area
    75 response time constraints
    76 scalability
    77 selection criteria
    78 sequence
    79 state
    80 strict response time constraints
    81 study
    82 supplementary insights
    83 temporal data
    84 test
    85 thorough scalability
    86 time constraints
    87 usage test
    88 use cases
    89 variety
    90 volume
    91 schema:name A comparative evaluation of novelty detection algorithms for discrete sequences
    92 schema:pagination 3787-3812
    93 schema:productId N846c7964324f45a189e897801baf7834
    94 Na6ea1f6917d54644b3992d8149d4bad3
    95 schema:sameAs https://app.dimensions.ai/details/publication/pub.1122411601
    96 https://doi.org/10.1007/s10462-019-09779-4
    97 schema:sdDatePublished 2022-01-01T18:55
    98 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    99 schema:sdPublisher N62dbbdfdaa0e41dabb05ccf0e10f97cf
    100 schema:url https://doi.org/10.1007/s10462-019-09779-4
    101 sgo:license sg:explorer/license/
    102 sgo:sdDataset articles
    103 rdf:type schema:ScholarlyArticle
    104 N018259ffd846400ba855868aa7e3425b rdf:first sg:person.013615603455.39
    105 rdf:rest N862c5a3aaa004c0db88ee0149df1d673
    106 N62dbbdfdaa0e41dabb05ccf0e10f97cf schema:name Springer Nature - SN SciGraph project
    107 rdf:type schema:Organization
    108 N77231e6fd786470ea043b4b67f4f5921 schema:volumeNumber 53
    109 rdf:type schema:PublicationVolume
    110 N846c7964324f45a189e897801baf7834 schema:name dimensions_id
    111 schema:value pub.1122411601
    112 rdf:type schema:PropertyValue
    113 N862c5a3aaa004c0db88ee0149df1d673 rdf:first sg:person.016057235273.37
    114 rdf:rest Ne695d93122a34565b722c152d6047d0a
    115 N9878613ddb554331bc2f624eb67fa7f7 rdf:first sg:person.07706215665.03
    116 rdf:rest rdf:nil
    117 Na6ea1f6917d54644b3992d8149d4bad3 schema:name doi
    118 schema:value 10.1007/s10462-019-09779-4
    119 rdf:type schema:PropertyValue
    120 Ne695d93122a34565b722c152d6047d0a rdf:first sg:person.015304074031.53
    121 rdf:rest N9878613ddb554331bc2f624eb67fa7f7
    122 Nf2322eaa236a4137a464e6958ec21417 schema:issueNumber 5
    123 rdf:type schema:PublicationIssue
    124 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    125 schema:name Information and Computing Sciences
    126 rdf:type schema:DefinedTerm
    127 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    128 schema:name Artificial Intelligence and Image Processing
    129 rdf:type schema:DefinedTerm
    130 sg:journal.1126843 schema:issn 0269-2821
    131 1573-7462
    132 schema:name Artificial Intelligence Review
    133 schema:publisher Springer Nature
    134 rdf:type schema:Periodical
    135 sg:person.013615603455.39 schema:affiliation grid-institutes:grid.28848.3e
    136 schema:familyName Domingues
    137 schema:givenName Rémi
    138 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013615603455.39
    139 rdf:type schema:Person
    140 sg:person.015304074031.53 schema:affiliation grid-institutes:None
    141 schema:familyName Barlet
    142 schema:givenName Jérémie
    143 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015304074031.53
    144 rdf:type schema:Person
    145 sg:person.016057235273.37 schema:affiliation grid-institutes:grid.28848.3e
    146 schema:familyName Michiardi
    147 schema:givenName Pietro
    148 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016057235273.37
    149 rdf:type schema:Person
    150 sg:person.07706215665.03 schema:affiliation grid-institutes:grid.28848.3e
    151 schema:familyName Filippone
    152 schema:givenName Maurizio
    153 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07706215665.03
    154 rdf:type schema:Person
    155 sg:pub.10.1007/11823728_26 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020787219
    156 https://doi.org/10.1007/11823728_26
    157 rdf:type schema:CreativeWork
    158 sg:pub.10.1007/978-3-030-32047-8_16 schema:sameAs https://app.dimensions.ai/details/publication/pub.1121221916
    159 https://doi.org/10.1007/978-3-030-32047-8_16
    160 rdf:type schema:CreativeWork
    161 sg:pub.10.1007/s10994-018-5723-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1104606391
    162 https://doi.org/10.1007/s10994-018-5723-3
    163 rdf:type schema:CreativeWork
    164 sg:pub.10.1023/b:aire.0000045502.10941.a9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014095928
    165 https://doi.org/10.1023/b:aire.0000045502.10941.a9
    166 rdf:type schema:CreativeWork
    167 sg:pub.10.1186/1471-2105-10-62 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052717070
    168 https://doi.org/10.1186/1471-2105-10-62
    169 rdf:type schema:CreativeWork
    170 sg:pub.10.1186/1471-2105-11-175 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027592485
    171 https://doi.org/10.1186/1471-2105-11-175
    172 rdf:type schema:CreativeWork
    173 sg:pub.10.1631/fitee.1700808 schema:sameAs https://app.dimensions.ai/details/publication/pub.1101554124
    174 https://doi.org/10.1631/fitee.1700808
    175 rdf:type schema:CreativeWork
    176 grid-institutes:None schema:alternateName Amadeus, 485 Route du Pin Montard, Sophia Antipolis, France
    177 schema:name Amadeus, 485 Route du Pin Montard, Sophia Antipolis, France
    178 rdf:type schema:Organization
    179 grid-institutes:grid.28848.3e schema:alternateName Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France
    180 schema:name Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France
    181 rdf:type schema:Organization
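In the table above, `schema:author` points to a blank node, and the author order is encoded as an RDF collection: each blank node carries an `rdf:first` (the person) and an `rdf:rest` (the next node), ending at `rdf:nil`. The following Python sketch shows one way to recover the ordered author list from those rows. The node identifiers and names are copied from the table; the dictionary layout and the function name `walk_rdf_list` are illustrative assumptions.

```python
RDF_NIL = "rdf:nil"

# Head of the author list, i.e. the object of schema:author in the table.
AUTHOR_LIST_HEAD = "N018259ffd846400ba855868aa7e3425b"

# Blank node -> (rdf:first, rdf:rest), copied from the table rows.
LIST_CELLS = {
    "N018259ffd846400ba855868aa7e3425b": ("sg:person.013615603455.39", "N862c5a3aaa004c0db88ee0149df1d673"),
    "N862c5a3aaa004c0db88ee0149df1d673": ("sg:person.016057235273.37", "Ne695d93122a34565b722c152d6047d0a"),
    "N9878613ddb554331bc2f624eb67fa7f7": ("sg:person.07706215665.03", RDF_NIL),
    "Ne695d93122a34565b722c152d6047d0a": ("sg:person.015304074031.53", "N9878613ddb554331bc2f624eb67fa7f7"),
}

# schema:familyName per person, also copied from the table.
FAMILY_NAME = {
    "sg:person.013615603455.39": "Domingues",
    "sg:person.015304074031.53": "Barlet",
    "sg:person.016057235273.37": "Michiardi",
    "sg:person.07706215665.03": "Filippone",
}


def walk_rdf_list(head: str, cells: dict) -> list:
    """Follow rdf:rest links from `head` and collect the rdf:first values in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            # Guard against malformed data that would loop forever.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


ordered_authors = [FAMILY_NAME[p] for p in walk_rdf_list(AUTHOR_LIST_HEAD, LIST_CELLS)]
print(ordered_authors)
```

The table lists the blank nodes in identifier order rather than list order, which is why the links have to be followed instead of reading the rows top to bottom.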
     



