A comparative evaluation of novelty detection algorithms for discrete sequences


Ontology type: schema:ScholarlyArticle
Open Access: True


Article Info

DATE

2019-11-08

AUTHORS

Rémi Domingues, Pietro Michiardi, Jérémie Barlet, Maurizio Filippone

ABSTRACT

The identification of anomalies in temporal data is a core component of numerous research areas such as intrusion detection, fault prevention, genomics and fraud detection. This article provides an experimental comparison of candidate methods for the novelty detection problem applied to discrete sequences. The objective of this study is to identify which state-of-the-art methods are efficient and appropriate candidates for a given use case. These recommendations rely on extensive novelty detection experiments based on a variety of public datasets in addition to novel industrial datasets. We also perform thorough scalability and memory usage tests resulting in new supplementary insights of the methods’ performance, key selection criteria to solve problems relying on large volumes of data and to meet the expectations of applications subject to strict response time constraints.

PAGES

3787-3812

References to SciGraph publications

  • 2004-10. A Survey of Outlier Detection Methodologies in ARTIFICIAL INTELLIGENCE REVIEW
  • 2010-04-08. Protein sequences classification by means of feature extraction with substitution matrices in BMC BIOINFORMATICS
  • 2009-02-19. RankAggreg, an R package for weighted rank aggregation in BMC BIOINFORMATICS
  • 2006. ITER: An Algorithm for Predictive Regression Rule Extraction in DATA WAREHOUSING AND KNOWLEDGE DISCOVERY
  • 2018-01-28. Visual interpretability for deep learning: a survey in FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING
  • 2019-09-23. Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms in SIMILARITY SEARCH AND APPLICATIONS
  • 2018-06-14. Deep Gaussian Process autoencoders for novelty detection in MACHINE LEARNING
Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4

    DOI

    http://dx.doi.org/10.1007/s10462-019-09779-4

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1122411601



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France", 
              "id": "http://www.grid.ac/institutes/grid.28848.3e", 
              "name": [
                "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Domingues", 
            "givenName": "R\u00e9mi", 
            "id": "sg:person.013615603455.39", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013615603455.39"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France", 
              "id": "http://www.grid.ac/institutes/grid.28848.3e", 
              "name": [
                "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Michiardi", 
            "givenName": "Pietro", 
            "id": "sg:person.016057235273.37", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016057235273.37"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Amadeus, 485 Route du Pin Montard, Sophia Antipolis, France", 
              "id": "http://www.grid.ac/institutes/None", 
              "name": [
                "Amadeus, 485 Route du Pin Montard, Sophia Antipolis, France"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Barlet", 
            "givenName": "J\u00e9r\u00e9mie", 
            "id": "sg:person.015304074031.53", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015304074031.53"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France", 
              "id": "http://www.grid.ac/institutes/grid.28848.3e", 
              "name": [
                "Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Filippone", 
            "givenName": "Maurizio", 
            "id": "sg:person.07706215665.03", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07706215665.03"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1023/b:aire.0000045502.10941.a9", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1014095928", 
              "https://doi.org/10.1023/b:aire.0000045502.10941.a9"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1631/fitee.1700808", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1101554124", 
              "https://doi.org/10.1631/fitee.1700808"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/11823728_26", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1020787219", 
              "https://doi.org/10.1007/11823728_26"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/s10994-018-5723-3", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1104606391", 
              "https://doi.org/10.1007/s10994-018-5723-3"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-11-175", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1027592485", 
              "https://doi.org/10.1186/1471-2105-11-175"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/978-3-030-32047-8_16", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1121221916", 
              "https://doi.org/10.1007/978-3-030-32047-8_16"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1186/1471-2105-10-62", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1052717070", 
              "https://doi.org/10.1186/1471-2105-10-62"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2019-11-08", 
        "datePublishedReg": "2019-11-08", 
        "description": "The identification of anomalies in temporal data is a core component of numerous research areas such as intrusion detection, fault prevention, genomics and fraud detection. This article provides an experimental comparison of candidate methods for the novelty detection problem applied to discrete sequences. The objective of this study is to identify which state-of-the-art methods are efficient and appropriate candidates for a given use case. These recommendations rely on extensive novelty detection experiments based on a variety of public datasets in addition to novel industrial datasets. We also perform thorough scalability and memory usage tests resulting in new supplementary insights of the methods\u2019 performance, key selection criteria to solve problems relying on large volumes of data and to meet the expectations of applications subject to strict response time constraints.", 
        "genre": "article", 
        "id": "sg:pub.10.1007/s10462-019-09779-4", 
        "inLanguage": "en", 
        "isAccessibleForFree": true, 
        "isPartOf": [
          {
            "id": "sg:journal.1126843", 
            "issn": [
              "0269-2821", 
              "1573-7462"
            ], 
            "name": "Artificial Intelligence Review", 
            "publisher": "Springer Nature", 
            "type": "Periodical"
          }, 
          {
            "issueNumber": "5", 
            "type": "PublicationIssue"
          }, 
          {
            "type": "PublicationVolume", 
            "volumeNumber": "53"
          }
        ], 
        "keywords": [
          "response time constraints", 
          "novelty detection problem", 
          "novelty detection algorithm", 
          "identification of anomalies", 
          "intrusion detection", 
          "fraud detection", 
          "use cases", 
          "industrial datasets", 
          "public datasets", 
          "fault prevention", 
          "art methods", 
          "temporal data", 
          "detection algorithm", 
          "detection problem", 
          "discrete sequence", 
          "numerous research areas", 
          "research area", 
          "time constraints", 
          "experimental comparison", 
          "large volumes", 
          "dataset", 
          "expectations of applications", 
          "candidate method", 
          "core component", 
          "scalability", 
          "algorithm", 
          "usage test", 
          "detection", 
          "detection experiments", 
          "supplementary insights", 
          "comparative evaluation", 
          "constraints", 
          "method", 
          "key selection criteria", 
          "applications", 
          "data", 
          "performance", 
          "selection criteria", 
          "sequence", 
          "experiments", 
          "evaluation", 
          "recommendations", 
          "variety", 
          "objective", 
          "identification", 
          "components", 
          "state", 
          "article", 
          "area", 
          "expectations", 
          "criteria", 
          "genomics", 
          "insights", 
          "comparison", 
          "appropriate candidates", 
          "candidates", 
          "anomalies", 
          "cases", 
          "volume", 
          "addition", 
          "test", 
          "study", 
          "prevention", 
          "problem"
        ], 
        "name": "A comparative evaluation of novelty detection algorithms for discrete sequences", 
        "pagination": "3787-3812", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1122411601"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/s10462-019-09779-4"
            ]
          }
        ], 
        "sameAs": [
          "https://doi.org/10.1007/s10462-019-09779-4", 
          "https://app.dimensions.ai/details/publication/pub.1122411601"
        ], 
        "sdDataset": "articles", 
        "sdDatePublished": "2022-05-20T07:35", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-springernature-scigraph/baseset/20220519/entities/gbq_results/article/article_806.jsonl", 
        "type": "ScholarlyArticle", 
        "url": "https://doi.org/10.1007/s10462-019-09779-4"
      }
    ]
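As a rough illustration of how the JSON-LD record above might be consumed, the sketch below loads a trimmed copy of it with Python's standard `json` module and pulls out a citation-style summary. The `summarize` helper and the `RECORD_JSON` constant are hypothetical names introduced here; the trimmed record keeps only the fields the helper reads, with values copied from the record above.

```python
import json

# Trimmed copy of the record above (assumption: only these fields are needed).
RECORD_JSON = r"""
[
  {
    "author": [
      {"familyName": "Domingues", "givenName": "R\u00e9mi"},
      {"familyName": "Michiardi", "givenName": "Pietro"},
      {"familyName": "Barlet", "givenName": "J\u00e9r\u00e9mie"},
      {"familyName": "Filippone", "givenName": "Maurizio"}
    ],
    "datePublished": "2019-11-08",
    "isPartOf": [
      {"id": "sg:journal.1126843", "name": "Artificial Intelligence Review", "type": "Periodical"},
      {"issueNumber": "5", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "53"}
    ],
    "name": "A comparative evaluation of novelty detection algorithms for discrete sequences",
    "pagination": "3787-3812"
  }
]
"""


def summarize(record):
    """Collect the fields needed for a short citation (hypothetical helper)."""
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # "isPartOf" mixes journal, issue and volume nodes; index them by "type".
    parts = {part["type"]: part for part in record["isPartOf"]}
    return {
        "title": record["name"],
        "authors": authors,
        "journal": parts["Periodical"]["name"],
        "volume": parts["PublicationVolume"]["volumeNumber"],
        "issue": parts["PublicationIssue"]["issueNumber"],
        "pages": record["pagination"],
        "date": record["datePublished"],
    }


# The top-level JSON value is a list holding a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary["authors"], summary["journal"], summary["volume"], summary["issue"])
```

Indexing `isPartOf` by `type` avoids relying on the order of the journal, issue and volume entries, which the record does not appear to guarantee.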
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4'
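The four curl commands above differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard `urllib.request` module. This is a minimal sketch: the short format keys (`json-ld`, `nt`, `turtle`, `xml`) and the `build_request` and `fetch` helpers are names made up for this example, and whether the endpoint still answers is not something this page confirms.

```python
from urllib.request import Request, urlopen

# URL taken from the curl commands above.
BASE_URL = "https://scigraph.springernature.com/pub.10.1007/s10462-019-09779-4"

# Media types copied from the curl commands; the keys are illustrative labels.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Return a Request asking for the given serialization (no network access)."""
    if fmt not in ACCEPT:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(ACCEPT)}")
    return Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=BASE_URL, timeout=10):
    """Download the record as text; assumes the service is reachable."""
    with urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch("turtle")` would then correspond to the third curl command. Building the request separately from sending it keeps the header logic checkable without touching the network.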


     

    This table lists all metadata directly associated with this object as RDF triples. Rows are numbered; a row that omits the subject or the predicate reuses the one from the row above.

    174 TRIPLES      22 PREDICATES      95 URIs      80 LITERALS      6 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/s10462-019-09779-4 schema:about anzsrc-for:08
    2 anzsrc-for:0801
    3 schema:author N2da951d7366b4f848ceaacfc2f81fa2f
    4 schema:citation sg:pub.10.1007/11823728_26
    5 sg:pub.10.1007/978-3-030-32047-8_16
    6 sg:pub.10.1007/s10994-018-5723-3
    7 sg:pub.10.1023/b:aire.0000045502.10941.a9
    8 sg:pub.10.1186/1471-2105-10-62
    9 sg:pub.10.1186/1471-2105-11-175
    10 sg:pub.10.1631/fitee.1700808
    11 schema:datePublished 2019-11-08
    12 schema:datePublishedReg 2019-11-08
    13 schema:description The identification of anomalies in temporal data is a core component of numerous research areas such as intrusion detection, fault prevention, genomics and fraud detection. This article provides an experimental comparison of candidate methods for the novelty detection problem applied to discrete sequences. The objective of this study is to identify which state-of-the-art methods are efficient and appropriate candidates for a given use case. These recommendations rely on extensive novelty detection experiments based on a variety of public datasets in addition to novel industrial datasets. We also perform thorough scalability and memory usage tests resulting in new supplementary insights of the methods’ performance, key selection criteria to solve problems relying on large volumes of data and to meet the expectations of applications subject to strict response time constraints.
    14 schema:genre article
    15 schema:inLanguage en
    16 schema:isAccessibleForFree true
    17 schema:isPartOf N7b0fa464f8d44ae6a24061bd873c1997
    18 N8f2667a559ba4c1084f115b3ad507ed2
    19 sg:journal.1126843
    20 schema:keywords addition
    21 algorithm
    22 anomalies
    23 applications
    24 appropriate candidates
    25 area
    26 art methods
    27 article
    28 candidate method
    29 candidates
    30 cases
    31 comparative evaluation
    32 comparison
    33 components
    34 constraints
    35 core component
    36 criteria
    37 data
    38 dataset
    39 detection
    40 detection algorithm
    41 detection experiments
    42 detection problem
    43 discrete sequence
    44 evaluation
    45 expectations
    46 expectations of applications
    47 experimental comparison
    48 experiments
    49 fault prevention
    50 fraud detection
    51 genomics
    52 identification
    53 identification of anomalies
    54 industrial datasets
    55 insights
    56 intrusion detection
    57 key selection criteria
    58 large volumes
    59 method
    60 novelty detection algorithm
    61 novelty detection problem
    62 numerous research areas
    63 objective
    64 performance
    65 prevention
    66 problem
    67 public datasets
    68 recommendations
    69 research area
    70 response time constraints
    71 scalability
    72 selection criteria
    73 sequence
    74 state
    75 study
    76 supplementary insights
    77 temporal data
    78 test
    79 time constraints
    80 usage test
    81 use cases
    82 variety
    83 volume
    84 schema:name A comparative evaluation of novelty detection algorithms for discrete sequences
    85 schema:pagination 3787-3812
    86 schema:productId N69ce57e94ee2411d80a80a3620b4db82
    87 Nfdc59e99312a480aad30a26c87b14a7b
    88 schema:sameAs https://app.dimensions.ai/details/publication/pub.1122411601
    89 https://doi.org/10.1007/s10462-019-09779-4
    90 schema:sdDatePublished 2022-05-20T07:35
    91 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    92 schema:sdPublisher N5a78b5123e1b4c588a13595ebf5798e3
    93 schema:url https://doi.org/10.1007/s10462-019-09779-4
    94 sgo:license sg:explorer/license/
    95 sgo:sdDataset articles
    96 rdf:type schema:ScholarlyArticle
    97 N0db85e9aa0844a5d9933449754867294 rdf:first sg:person.07706215665.03
    98 rdf:rest rdf:nil
    99 N2da951d7366b4f848ceaacfc2f81fa2f rdf:first sg:person.013615603455.39
    100 rdf:rest N739b538dabbe41ecb774a3786d2b8507
    101 N5a78b5123e1b4c588a13595ebf5798e3 schema:name Springer Nature - SN SciGraph project
    102 rdf:type schema:Organization
    103 N5dc49584a7624388a7fa0f881ff2b794 rdf:first sg:person.015304074031.53
    104 rdf:rest N0db85e9aa0844a5d9933449754867294
    105 N69ce57e94ee2411d80a80a3620b4db82 schema:name dimensions_id
    106 schema:value pub.1122411601
    107 rdf:type schema:PropertyValue
    108 N739b538dabbe41ecb774a3786d2b8507 rdf:first sg:person.016057235273.37
    109 rdf:rest N5dc49584a7624388a7fa0f881ff2b794
    110 N7b0fa464f8d44ae6a24061bd873c1997 schema:volumeNumber 53
    111 rdf:type schema:PublicationVolume
    112 N8f2667a559ba4c1084f115b3ad507ed2 schema:issueNumber 5
    113 rdf:type schema:PublicationIssue
    114 Nfdc59e99312a480aad30a26c87b14a7b schema:name doi
    115 schema:value 10.1007/s10462-019-09779-4
    116 rdf:type schema:PropertyValue
    117 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    118 schema:name Information and Computing Sciences
    119 rdf:type schema:DefinedTerm
    120 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    121 schema:name Artificial Intelligence and Image Processing
    122 rdf:type schema:DefinedTerm
    123 sg:journal.1126843 schema:issn 0269-2821
    124 1573-7462
    125 schema:name Artificial Intelligence Review
    126 schema:publisher Springer Nature
    127 rdf:type schema:Periodical
    128 sg:person.013615603455.39 schema:affiliation grid-institutes:grid.28848.3e
    129 schema:familyName Domingues
    130 schema:givenName Rémi
    131 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013615603455.39
    132 rdf:type schema:Person
    133 sg:person.015304074031.53 schema:affiliation grid-institutes:None
    134 schema:familyName Barlet
    135 schema:givenName Jérémie
    136 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015304074031.53
    137 rdf:type schema:Person
    138 sg:person.016057235273.37 schema:affiliation grid-institutes:grid.28848.3e
    139 schema:familyName Michiardi
    140 schema:givenName Pietro
    141 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016057235273.37
    142 rdf:type schema:Person
    143 sg:person.07706215665.03 schema:affiliation grid-institutes:grid.28848.3e
    144 schema:familyName Filippone
    145 schema:givenName Maurizio
    146 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07706215665.03
    147 rdf:type schema:Person
    148 sg:pub.10.1007/11823728_26 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020787219
    149 https://doi.org/10.1007/11823728_26
    150 rdf:type schema:CreativeWork
    151 sg:pub.10.1007/978-3-030-32047-8_16 schema:sameAs https://app.dimensions.ai/details/publication/pub.1121221916
    152 https://doi.org/10.1007/978-3-030-32047-8_16
    153 rdf:type schema:CreativeWork
    154 sg:pub.10.1007/s10994-018-5723-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1104606391
    155 https://doi.org/10.1007/s10994-018-5723-3
    156 rdf:type schema:CreativeWork
    157 sg:pub.10.1023/b:aire.0000045502.10941.a9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014095928
    158 https://doi.org/10.1023/b:aire.0000045502.10941.a9
    159 rdf:type schema:CreativeWork
    160 sg:pub.10.1186/1471-2105-10-62 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052717070
    161 https://doi.org/10.1186/1471-2105-10-62
    162 rdf:type schema:CreativeWork
    163 sg:pub.10.1186/1471-2105-11-175 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027592485
    164 https://doi.org/10.1186/1471-2105-11-175
    165 rdf:type schema:CreativeWork
    166 sg:pub.10.1631/fitee.1700808 schema:sameAs https://app.dimensions.ai/details/publication/pub.1101554124
    167 https://doi.org/10.1631/fitee.1700808
    168 rdf:type schema:CreativeWork
    169 grid-institutes:None schema:alternateName Amadeus, 485 Route du Pin Montard, Sophia Antipolis, France
    170 schema:name Amadeus, 485 Route du Pin Montard, Sophia Antipolis, France
    171 rdf:type schema:Organization
    172 grid-institutes:grid.28848.3e schema:alternateName Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France
    173 schema:name Department of Data Science, EURECOM, 450 Route des Chappes, Sophia Antipolis, France
    174 rdf:type schema:Organization
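In the table above, `schema:author` points at a blank node rather than at the people directly: the author order is stored as an RDF collection, a linked list in which each node carries one item (`rdf:first`) and a pointer to the next node (`rdf:rest`), ending at `rdf:nil`. The sketch below walks that list using the eight relevant rows copied from the table; the `walk_rdf_list` helper and the tuple layout are assumptions made for illustration, not part of SciGraph.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate, object) rows copied from the table above
# (rows 97-100, 103-104 and 108-109).
TRIPLES = [
    ("N0db85e9aa0844a5d9933449754867294", "rdf:first", "sg:person.07706215665.03"),
    ("N0db85e9aa0844a5d9933449754867294", "rdf:rest", "rdf:nil"),
    ("N2da951d7366b4f848ceaacfc2f81fa2f", "rdf:first", "sg:person.013615603455.39"),
    ("N2da951d7366b4f848ceaacfc2f81fa2f", "rdf:rest", "N739b538dabbe41ecb774a3786d2b8507"),
    ("N5dc49584a7624388a7fa0f881ff2b794", "rdf:first", "sg:person.015304074031.53"),
    ("N5dc49584a7624388a7fa0f881ff2b794", "rdf:rest", "N0db85e9aa0844a5d9933449754867294"),
    ("N739b538dabbe41ecb774a3786d2b8507", "rdf:first", "sg:person.016057235273.37"),
    ("N739b538dabbe41ecb774a3786d2b8507", "rdf:rest", "N5dc49584a7624388a7fa0f881ff2b794"),
]


def walk_rdf_list(head, triples):
    """Follow rdf:rest links from `head`, collecting each rdf:first item in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        if node in seen:  # guard against malformed, cyclic lists
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# Head of the list: the object of schema:author in row 3 of the table.
AUTHOR_LIST_HEAD = "N2da951d7366b4f848ceaacfc2f81fa2f"
authors = walk_rdf_list(AUTHOR_LIST_HEAD, TRIPLES)
print(authors)
```

The resulting order of person identifiers matches the author order given in the JSON-LD `author` array (Domingues, Michiardi, Barlet, Filippone).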
     



