Two-stage pruning method for gram-based categorical sequence clustering


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2019-04

AUTHORS

Liang Yuan, Wenjian Wang, Lifei Chen

ABSTRACT

Gram-based vector space model has been extensively applied to categorical sequence clustering. However, there is a general lack of an efficient method to determine the length of grams and to identify redundant and non-significant grams involved in the model. In this paper, a variable-length gram model is proposed, different from previous studies mainly focused on the fixed-length grams of sequences. The variable-length grams are obtained using a two-stage pruning method aimed at selecting the irredundant and significant subsequences from the prefix trees, created from the fixed-length grams with an initially large length. A robust partitioning algorithm is then defined for categorical sequence clustering on the normalized representation model using variable-length grams collected from the pruned trees. Experimental results on real-world sequence sets from various domains are given to demonstrate the performance of the proposed methods.
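
The abstract describes building prefix trees from fixed-length grams and then collecting shorter, variable-length grams from the pruned trees. A minimal sketch of that general idea follows. The abstract does not state the pruning criteria, so the sketch substitutes a plain minimum-count threshold; that threshold, the function names `build_prefix_tree` and `collect_grams`, and the toy sequences are all assumptions for illustration and are not the authors' two-stage method.

```python
from collections import defaultdict


def build_prefix_tree(sequences, max_len):
    """Insert every window of up to max_len symbols into a nested-dict trie.

    After construction, the count stored at depth d equals the number of
    occurrences of that d-gram across all sequences.
    """
    def node():
        return {"count": 0, "children": defaultdict(node)}

    root = node()
    for seq in sequences:
        for start in range(len(seq)):
            cur = root
            for symbol in seq[start:start + max_len]:
                cur = cur["children"][symbol]
                cur["count"] += 1
    return root


def collect_grams(root, min_count):
    """Return {gram: count} for grams whose count reaches min_count.

    Counts never increase along a path from the root, so a child below the
    threshold can be skipped together with its whole subtree.
    NOTE: a frequency threshold is a stand-in assumption, not the paper's rule.
    """
    grams = {}
    stack = [("", root)]
    while stack:
        prefix, cur = stack.pop()
        for symbol, child in cur["children"].items():
            if child["count"] >= min_count:
                gram = prefix + symbol
                grams[gram] = child["count"]
                stack.append((gram, child))
    return grams


tree = build_prefix_tree(["abab", "abc"], max_len=3)
kept = collect_grams(tree, min_count=2)
```

With these toy inputs the surviving grams have mixed lengths (`a`, `b`, and `ab`), which is the property the variable-length model relies on.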

PAGES

631-640

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s13042-017-0744-y

DOI

http://dx.doi.org/10.1007/s13042-017-0744-y

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1092693192



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of Electronic Science and Technology of China", 
          "id": "https://www.grid.ac/institutes/grid.54549.39", 
          "name": [
            "Network Operation Maintenance Center, University of Electronic Science and Technology of China, Chengdu, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yuan", 
        "givenName": "Liang", 
        "id": "sg:person.015110201075.53", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015110201075.53"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Shanxi University", 
          "id": "https://www.grid.ac/institutes/grid.163032.5", 
          "name": [
            "Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Wang", 
        "givenName": "Wenjian", 
        "id": "sg:person.013123716475.20", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013123716475.20"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Fujian Normal University", 
          "id": "https://www.grid.ac/institutes/grid.411503.2", 
          "name": [
            "Digit Fujian Internet-of-Thing Laboratory of Environment Monitoring, Fujian Normal University, Fuzhou, China"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chen", 
        "givenName": "Lifei", 
        "id": "sg:person.01310645333.95", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01310645333.95"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1093/bib/bbt067", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007818694"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-319-47650-6_2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1008684113", 
          "https://doi.org/10.1007/978-3-319-47650-6_2"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-13-174", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010064525", 
          "https://doi.org/10.1186/1471-2105-13-174"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s13042-015-0421-y", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021359505", 
          "https://doi.org/10.1007/s13042-015-0421-y"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.dsp.2014.02.014", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022006176"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10618-006-0060-8", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1024715987", 
          "https://doi.org/10.1007/s10618-006-0060-8"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-14-s8-s7", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030334663", 
          "https://doi.org/10.1186/1471-2105-14-s8-s7"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s13042-015-0381-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035090012", 
          "https://doi.org/10.1007/s13042-015-0381-2"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1145/1882471.1882478", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035938734"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s13042-013-0210-4", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036547359", 
          "https://doi.org/10.1007/s13042-013-0210-4"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tkde.2010.190", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061662182"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tkde.2013.104", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061662686"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tnnls.2016.2608354", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061719275"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1142/s021800141755014x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1084825485"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.asoc.2017.04.019", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1085213924"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/icdm.2008.43", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1094406112"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/icde.2003.1260785", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1095149849"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/ijcnn.2005.1556220", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1095278133"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2019-04", 
    "datePublishedReg": "2019-04-01", 
    "description": "Gram-based vector space model has been extensively applied to categorical sequence clustering. However, there is a general lack of an efficient method to determine the length of grams and to identify redundant and non-significant grams involved in the model. In this paper, a variable-length gram model is proposed, different from previous studies mainly focused on the fixed-length grams of sequences. The variable-length grams are obtained using a two-stage pruning method aimed at selecting the irredundant and significant subsequences from the prefix trees, created from the fixed-length grams with an initially large length. A robust partitioning algorithm is then defined for categorical sequence clustering on the normalized representation model using variable-length grams collected from the pruned trees. Experimental results on real-world sequence sets from various domains are given to demonstrate the performance of the proposed methods.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1007/s13042-017-0744-y", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": [
      {
        "id": "sg:journal.1136696", 
        "issn": [
          "1868-8071", 
          "1868-808X"
        ], 
        "name": "International Journal of Machine Learning and Cybernetics", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "4", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "10"
      }
    ], 
    "name": "Two-stage pruning method for gram-based categorical sequence clustering", 
    "pagination": "631-640", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "d69e984023401cf381eae23b3a1c82460d4e9db74bcb1f4e1f2bf269b79c8738"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s13042-017-0744-y"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1092693192"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s13042-017-0744-y", 
      "https://app.dimensions.ai/details/publication/pub.1092693192"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-11T12:53", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000364_0000000364/records_72856_00000000.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://link.springer.com/10.1007%2Fs13042-017-0744-y"
  }
]
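
A record shaped like the one above can be read with any JSON parser, since JSON-LD is plain JSON. The sketch below is one possible way to pull out author names and citation identifiers; it runs on a trimmed excerpt using field names and values from the record, and the helper names `author_names` and `citation_ids` are hypothetical. Citation ids are de-duplicated in case an entry is repeated.

```python
import json

# Trimmed excerpt with the same field names as the SciGraph record.
RECORD_EXCERPT = """
[
  {
    "id": "sg:pub.10.1007/s13042-017-0744-y",
    "author": [
      {"familyName": "Yuan", "givenName": "Liang"},
      {"familyName": "Wang", "givenName": "Wenjian"},
      {"familyName": "Chen", "givenName": "Lifei"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1093/bib/bbt067"},
      {"id": "sg:pub.10.1186/1471-2105-13-174"},
      {"id": "sg:pub.10.1186/1471-2105-13-174"}
    ]
  }
]
"""


def author_names(record):
    """Return 'Given Family' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def citation_ids(record):
    """Return citation ids without repeats, keeping first-seen order."""
    ids = (c["id"] for c in record.get("citation", []))
    return list(dict.fromkeys(ids))


# The top level of the document is a list holding a single record.
record = json.loads(RECORD_EXCERPT)[0]
print(author_names(record))
print(citation_ids(record))
```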
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s13042-017-0744-y'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s13042-017-0744-y'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s13042-017-0744-y'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s13042-017-0744-y'
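
The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The names `BASE`, `ACCEPT`, `build_request`, and `fetch` are hypothetical, and the sketch does not check whether the endpoint still responds.

```python
import urllib.request

BASE = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build (but do not send) a GET request asking for the given RDF format."""
    return urllib.request.Request(BASE + record_id, headers={"Accept": ACCEPT[fmt]})


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


req = build_request("pub.10.1007/s13042-017-0744-y", "turtle")
print(req.full_url, req.get_header("Accept"))
```

Keeping `build_request` separate from `fetch` lets the header logic be inspected without any network access.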


 

This table displays all metadata directly associated with this object as RDF triples.

142 TRIPLES      21 PREDICATES      45 URIs      19 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/s13042-017-0744-y schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author Nb04ba035e3364956adc533e05ea03a94
4 schema:citation sg:pub.10.1007/978-3-319-47650-6_2
5 sg:pub.10.1007/s10618-006-0060-8
6 sg:pub.10.1007/s13042-013-0210-4
7 sg:pub.10.1007/s13042-015-0381-2
8 sg:pub.10.1007/s13042-015-0421-y
9 sg:pub.10.1186/1471-2105-13-174
10 sg:pub.10.1186/1471-2105-14-s8-s7
11 https://doi.org/10.1016/j.asoc.2017.04.019
12 https://doi.org/10.1016/j.dsp.2014.02.014
13 https://doi.org/10.1093/bib/bbt067
14 https://doi.org/10.1109/icde.2003.1260785
15 https://doi.org/10.1109/icdm.2008.43
16 https://doi.org/10.1109/ijcnn.2005.1556220
17 https://doi.org/10.1109/tkde.2010.190
18 https://doi.org/10.1109/tkde.2013.104
19 https://doi.org/10.1109/tnnls.2016.2608354
20 https://doi.org/10.1142/s021800141755014x
21 https://doi.org/10.1145/1882471.1882478
22 schema:datePublished 2019-04
23 schema:datePublishedReg 2019-04-01
24 schema:description Gram-based vector space model has been extensively applied to categorical sequence clustering. However, there is a general lack of an efficient method to determine the length of grams and to identify redundant and non-significant grams involved in the model. In this paper, a variable-length gram model is proposed, different from previous studies mainly focused on the fixed-length grams of sequences. The variable-length grams are obtained using a two-stage pruning method aimed at selecting the irredundant and significant subsequences from the prefix trees, created from the fixed-length grams with an initially large length. A robust partitioning algorithm is then defined for categorical sequence clustering on the normalized representation model using variable-length grams collected from the pruned trees. Experimental results on real-world sequence sets from various domains are given to demonstrate the performance of the proposed methods.
25 schema:genre research_article
26 schema:inLanguage en
27 schema:isAccessibleForFree false
28 schema:isPartOf N78ffd9d42beb4c8191accc1f9214e5e5
29 Nb08d30d1e0c24e6cade32d7922d6ac57
30 sg:journal.1136696
31 schema:name Two-stage pruning method for gram-based categorical sequence clustering
32 schema:pagination 631-640
33 schema:productId N4329522f55f745bc809aed7cc0992848
34 Nb468df40bd724ee28203a1c00b351083
35 Nf8361cf0ef69465890d3a9b152b09db0
36 schema:sameAs https://app.dimensions.ai/details/publication/pub.1092693192
37 https://doi.org/10.1007/s13042-017-0744-y
38 schema:sdDatePublished 2019-04-11T12:53
39 schema:sdLicense https://scigraph.springernature.com/explorer/license/
40 schema:sdPublisher Na0977723d9f54078b6f8dc29b6b1bf6f
41 schema:url https://link.springer.com/10.1007%2Fs13042-017-0744-y
42 sgo:license sg:explorer/license/
43 sgo:sdDataset articles
44 rdf:type schema:ScholarlyArticle
45 N4329522f55f745bc809aed7cc0992848 schema:name dimensions_id
46 schema:value pub.1092693192
47 rdf:type schema:PropertyValue
48 N483956895dd84f499fe05a45a264a848 rdf:first sg:person.013123716475.20
49 rdf:rest Nebaa29323a6040a5acc3e727ba351112
50 N78ffd9d42beb4c8191accc1f9214e5e5 schema:volumeNumber 10
51 rdf:type schema:PublicationVolume
52 Na0977723d9f54078b6f8dc29b6b1bf6f schema:name Springer Nature - SN SciGraph project
53 rdf:type schema:Organization
54 Nb04ba035e3364956adc533e05ea03a94 rdf:first sg:person.015110201075.53
55 rdf:rest N483956895dd84f499fe05a45a264a848
56 Nb08d30d1e0c24e6cade32d7922d6ac57 schema:issueNumber 4
57 rdf:type schema:PublicationIssue
58 Nb468df40bd724ee28203a1c00b351083 schema:name doi
59 schema:value 10.1007/s13042-017-0744-y
60 rdf:type schema:PropertyValue
61 Nebaa29323a6040a5acc3e727ba351112 rdf:first sg:person.01310645333.95
62 rdf:rest rdf:nil
63 Nf8361cf0ef69465890d3a9b152b09db0 schema:name readcube_id
64 schema:value d69e984023401cf381eae23b3a1c82460d4e9db74bcb1f4e1f2bf269b79c8738
65 rdf:type schema:PropertyValue
66 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
67 schema:name Information and Computing Sciences
68 rdf:type schema:DefinedTerm
69 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
70 schema:name Artificial Intelligence and Image Processing
71 rdf:type schema:DefinedTerm
72 sg:journal.1136696 schema:issn 1868-8071
73 1868-808X
74 schema:name International Journal of Machine Learning and Cybernetics
75 rdf:type schema:Periodical
76 sg:person.01310645333.95 schema:affiliation https://www.grid.ac/institutes/grid.411503.2
77 schema:familyName Chen
78 schema:givenName Lifei
79 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01310645333.95
80 rdf:type schema:Person
81 sg:person.013123716475.20 schema:affiliation https://www.grid.ac/institutes/grid.163032.5
82 schema:familyName Wang
83 schema:givenName Wenjian
84 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013123716475.20
85 rdf:type schema:Person
86 sg:person.015110201075.53 schema:affiliation https://www.grid.ac/institutes/grid.54549.39
87 schema:familyName Yuan
88 schema:givenName Liang
89 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015110201075.53
90 rdf:type schema:Person
91 sg:pub.10.1007/978-3-319-47650-6_2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008684113
92 https://doi.org/10.1007/978-3-319-47650-6_2
93 rdf:type schema:CreativeWork
94 sg:pub.10.1007/s10618-006-0060-8 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024715987
95 https://doi.org/10.1007/s10618-006-0060-8
96 rdf:type schema:CreativeWork
97 sg:pub.10.1007/s13042-013-0210-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036547359
98 https://doi.org/10.1007/s13042-013-0210-4
99 rdf:type schema:CreativeWork
100 sg:pub.10.1007/s13042-015-0381-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035090012
101 https://doi.org/10.1007/s13042-015-0381-2
102 rdf:type schema:CreativeWork
103 sg:pub.10.1007/s13042-015-0421-y schema:sameAs https://app.dimensions.ai/details/publication/pub.1021359505
104 https://doi.org/10.1007/s13042-015-0421-y
105 rdf:type schema:CreativeWork
106 sg:pub.10.1186/1471-2105-13-174 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010064525
107 https://doi.org/10.1186/1471-2105-13-174
108 rdf:type schema:CreativeWork
109 sg:pub.10.1186/1471-2105-14-s8-s7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030334663
110 https://doi.org/10.1186/1471-2105-14-s8-s7
111 rdf:type schema:CreativeWork
112 https://doi.org/10.1016/j.asoc.2017.04.019 schema:sameAs https://app.dimensions.ai/details/publication/pub.1085213924
113 rdf:type schema:CreativeWork
114 https://doi.org/10.1016/j.dsp.2014.02.014 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022006176
115 rdf:type schema:CreativeWork
116 https://doi.org/10.1093/bib/bbt067 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007818694
117 rdf:type schema:CreativeWork
118 https://doi.org/10.1109/icde.2003.1260785 schema:sameAs https://app.dimensions.ai/details/publication/pub.1095149849
119 rdf:type schema:CreativeWork
120 https://doi.org/10.1109/icdm.2008.43 schema:sameAs https://app.dimensions.ai/details/publication/pub.1094406112
121 rdf:type schema:CreativeWork
122 https://doi.org/10.1109/ijcnn.2005.1556220 schema:sameAs https://app.dimensions.ai/details/publication/pub.1095278133
123 rdf:type schema:CreativeWork
124 https://doi.org/10.1109/tkde.2010.190 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061662182
125 rdf:type schema:CreativeWork
126 https://doi.org/10.1109/tkde.2013.104 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061662686
127 rdf:type schema:CreativeWork
128 https://doi.org/10.1109/tnnls.2016.2608354 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061719275
129 rdf:type schema:CreativeWork
130 https://doi.org/10.1142/s021800141755014x schema:sameAs https://app.dimensions.ai/details/publication/pub.1084825485
131 rdf:type schema:CreativeWork
132 https://doi.org/10.1145/1882471.1882478 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035938734
133 rdf:type schema:CreativeWork
134 https://www.grid.ac/institutes/grid.163032.5 schema:alternateName Shanxi University
135 schema:name Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan, China
136 rdf:type schema:Organization
137 https://www.grid.ac/institutes/grid.411503.2 schema:alternateName Fujian Normal University
138 schema:name Digit Fujian Internet-of-Thing Laboratory of Environment Monitoring, Fujian Normal University, Fuzhou, China
139 rdf:type schema:Organization
140 https://www.grid.ac/institutes/grid.54549.39 schema:alternateName University of Electronic Science and Technology of China
141 schema:name Network Operation Maintenance Center, University of Electronic Science and Technology of China, Chengdu, China
142 rdf:type schema:Organization
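
Summary counts like the ones shown above the table (triples, predicates, URIs, literals, blank nodes) can be approximated from an N-Triples export. The sketch below assumes each count refers to distinct terms, which the page does not state, and uses a simplified line splitter rather than a full N-Triples parser. The sample data uses a made-up `example.org` subject, and the function names are hypothetical.

```python
def parse_ntriples(text):
    """Split each non-empty, non-comment line into (subject, predicate, object).

    Subjects and predicates never contain whitespace in N-Triples, so the
    object is whatever remains once the trailing ' .' terminator is removed.
    """
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        subject, predicate, rest = line.split(None, 2)
        obj = rest.rstrip()
        if obj.endswith("."):
            obj = obj[:-1].rstrip()
        triples.append((subject, predicate, obj))
    return triples


def count_terms(triples):
    """Count triples and distinct terms by kind (assumed counting rule)."""
    terms = {term for triple in triples for term in triple}
    return {
        "triples": len(triples),
        "predicates": len({p for _, p, _ in triples}),
        "uris": sum(1 for t in terms if t.startswith("<")),
        "literals": sum(1 for t in terms if t.startswith('"')),
        "blank_nodes": sum(1 for t in terms if t.startswith("_:")),
    }


SAMPLE = """
<http://example.org/pub> <http://schema.org/name> "Two-stage pruning method" .
<http://example.org/pub> <http://schema.org/isPartOf> _:b0 .
_:b0 <http://schema.org/volumeNumber> "10" .
"""

print(count_terms(parse_ntriples(SAMPLE)))
```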
 



