An Analysis of Constructed Categories for Textual Classification Using Fuzzy Similarity and Agglomerative Hierarchical Methods


Ontology type: schema:Chapter     


Chapter Info

DATE

2010

AUTHORS

Marcus V. C. Guelpeli, Ana Cristina Bicharra Garcia, Flavia Cristina Bernardini

ABSTRACT

Ambiguity is a challenge faced by systems that handle natural language. To mitigate the linguistic ambiguities found in text classification, this work proposes a text categorizer based on Fuzzy Similarity. The clustering algorithms Stars and Cliques are adopted within the Agglomerative Hierarchical method; they identify groups of texts by specifying a relationship rule that creates categories from a similarity analysis of the textual terms. With the proposed methodology, categories can be created from the degree of similarity of the texts to be classified, without needing to determine the number of categories in advance. The combination of techniques used in the categorizer’s steps brought satisfactory results, proving to be efficient in textual classification.
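The abstract does not give the similarity formula or the exact grouping rules, so the following is only a rough sketch of the general idea under stated assumptions. It assumes each text is a mapping from terms to weights in [0, 1], uses a weighted Jaccard ratio (sum of minima over sum of maxima) as a stand-in fuzzy similarity, and implements greedy, non-overlapping variants of the star and clique rules. The names `fuzzy_similarity`, `stars`, `cliques`, and the threshold value are hypothetical and are not taken from the chapter.

```python
def fuzzy_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Weighted Jaccard over term weights (an assumed stand-in measure)."""
    terms = set(a) | set(b)
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    total = sum(max(a.get(t, 0.0), b.get(t, 0.0)) for t in terms)
    return shared / total if total else 0.0


def stars(docs: dict[str, dict[str, float]], threshold: float) -> list[list[str]]:
    """Star rule: each unassigned text seeds a group and pulls in every
    unassigned text that is similar enough to the seed."""
    unassigned = list(docs)
    groups = []
    while unassigned:
        center = unassigned.pop(0)
        members = [center] + [
            d for d in unassigned
            if fuzzy_similarity(docs[center], docs[d]) >= threshold
        ]
        unassigned = [d for d in unassigned if d not in members]
        groups.append(members)
    return groups


def cliques(docs: dict[str, dict[str, float]], threshold: float) -> list[list[str]]:
    """Clique rule (greedy): a text joins the first group in which it is
    similar enough to every current member, else it starts a new group."""
    groups: list[list[str]] = []
    for name in docs:
        for group in groups:
            if all(fuzzy_similarity(docs[name], docs[m]) >= threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Toy texts: "b" and "c" each overlap with "a" but not with each other.
docs = {
    "a": {"fuzzy": 1.0, "cluster": 1.0},
    "b": {"fuzzy": 1.0},
    "c": {"cluster": 1.0},
}
print(stars(docs, 0.5))    # one group around the seed "a"
print(cliques(docs, 0.5))  # "c" is split off because it is unlike "b"
```

In both functions the number of groups falls out of the threshold rather than being fixed beforehand, which is the property the abstract emphasizes.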

PAGES

277-306

References to SciGraph publications

  • 2006. A Survey of Clustering Data Mining Techniques in GROUPING MULTIDIMENSIONAL DATA
  • 1994-02. Fuzzy information retrieval in JOURNAL OF INTELLIGENT INFORMATION SYSTEMS
  • 1997-07. Exploiting Background Information in Knowledge Discovery from Text in JOURNAL OF INTELLIGENT INFORMATION SYSTEMS
  • 1998. From the data mine to the knowledge mill: Applying the principles of lexical analysis to the data mining and knowledge discovery process in PRINCIPLES OF DATA MINING AND KNOWLEDGE DISCOVERY

Book

    TITLE

    Emergent Web Intelligence: Advanced Semantic Technologies

    ISBN

    978-1-84996-076-2
    978-1-84996-077-9

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/978-1-84996-077-9_11

    DOI

    http://dx.doi.org/10.1007/978-1-84996-077-9_11

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1017970100



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "name": [
                "Departamento de Ci\u00eancia da Computa\u00e7\u00e3o, Instituto de Computa\u00e7\u00e3o\u2014IC, Universidade Federal Fluminense\u2014UFF, Rua Passo da P\u00e1tria 156, Bloco E, 3\u00ba andar, RJ CEP 24210-240, S\u00e3o Domingos, Niter\u00f3i, Brazil"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Guelpeli", 
            "givenName": "Marcus V. C.", 
            "id": "sg:person.012350245407.38", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012350245407.38"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "name": [
                "Departamento de Ci\u00eancia da Computa\u00e7\u00e3o, Instituto de Computa\u00e7\u00e3o\u2014IC, Universidade Federal Fluminense\u2014UFF, Rua Passo da P\u00e1tria 156, Bloco E, 3\u00ba andar, RJ CEP 24210-240, S\u00e3o Domingos, Niter\u00f3i, Brazil"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Garcia", 
            "givenName": "Ana Cristina Bicharra", 
            "id": "sg:person.07430767131.99", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07430767131.99"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "name": [
                "Departamento de Ci\u00eancia e Tecnologia\u2014RCT, P\u00f3lo Universit\u00e1rio de Rio das Ostras\u2014PURO, Universidade Federal Fluminense\u2014UFF, Rua Recife, s/n, Jardim Bela Vista, RJ CEP 28890-000, Rio das Ostras, Brazil"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Bernardini", 
            "givenName": "Flavia Cristina", 
            "id": "sg:person.013776555445.44", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013776555445.44"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "sg:pub.10.1023/a:1008693204338", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1008597299", 
              "https://doi.org/10.1023/a:1008693204338"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/s0019-9958(65)90241-x", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1009640697"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/bfb0094844", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1010267166", 
              "https://doi.org/10.1007/bfb0094844"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/3-540-28349-8_2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1011218014", 
              "https://doi.org/10.1007/3-540-28349-8_2"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/240455.240463", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1018673843"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/775047.775062", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1031094505"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1007/bf01014019", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1033037339", 
              "https://doi.org/10.1007/bf01014019"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1145/1167350.1167397", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1043823733"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/s0261-5177(01)00050-4", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1046548125"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/2.781637", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061106156"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/tsmc.1973.5408575", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061792730"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/tsmcc.2002.1009142", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061797640"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "2010", 
        "datePublishedReg": "2010-01-01", 
        "description": "Ambiguity is a challenge faced by systems that handle natural language. To assuage the issue of linguistic ambiguities found in text classification, this work proposes a text categorizer using the methodology of Fuzzy Similarity. The clustering algorithms Stars and Cliques are adopted in the Agglomerative Hierarchical method and they identify the groups of texts by specifying some type of relationship rule to create categories based on the similarity analysis of the textual terms. The proposal is based on the methodology suggested, categories can be created from the analysis of the degree of similarity of the texts to be classified, without needing to determine the number of initial categories. The combination of techniques proposed in the categorizer\u2019s steps brought satisfactory results, proving to be efficient in textual classification.", 
        "editor": [
          {
            "familyName": "Badr", 
            "givenName": "Youakim", 
            "type": "Person"
          }, 
          {
            "familyName": "Chbeir", 
            "givenName": "Richard", 
            "type": "Person"
          }, 
          {
            "familyName": "Abraham", 
            "givenName": "Ajith", 
            "type": "Person"
          }, 
          {
            "familyName": "Hassanien", 
            "givenName": "Aboul-Ella", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/978-1-84996-077-9_11", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": false, 
        "isPartOf": {
          "isbn": [
            "978-1-84996-076-2", 
            "978-1-84996-077-9"
          ], 
          "name": "Emergent Web Intelligence: Advanced Semantic Technologies", 
          "type": "Book"
        }, 
        "name": "An Analysis of Constructed Categories for Textual Classification Using Fuzzy Similarity and Agglomerative Hierarchical Methods", 
        "pagination": "277-306", 
        "productId": [
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1017970100"
            ]
          }, 
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/978-1-84996-077-9_11"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "559efbd17e574ae2d72103b06402626ecab2cfd9c5ae924a2689340f4cb7198c"
            ]
          }
        ], 
        "publisher": {
          "location": "London", 
          "name": "Springer London", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/978-1-84996-077-9_11", 
          "https://app.dimensions.ai/details/publication/pub.1017970100"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-16T08:00", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000359_0000000359/records_29186_00000000.jsonl", 
        "type": "Chapter", 
        "url": "https://link.springer.com/10.1007%2F978-1-84996-077-9_11"
      }
    ]
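Because JSON-LD is also valid JSON, a record like the one above can be read with an ordinary JSON parser. The sketch below is illustrative only: `RECORD_JSON` is a trimmed excerpt with just the fields it reads, it repeats one citation id on purpose to exercise the de-duplication step, and `summarize` is a hypothetical helper rather than part of any SciGraph tooling.

```python
import json

# Trimmed excerpt of the record; only the fields read below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Guelpeli", "givenName": "Marcus V. C."},
      {"familyName": "Garcia", "givenName": "Ana Cristina Bicharra"},
      {"familyName": "Bernardini", "givenName": "Flavia Cristina"}
    ],
    "citation": [
      {"id": "sg:pub.10.1007/bf01014019"},
      {"id": "sg:pub.10.1007/bf01014019"},
      {"id": "https://doi.org/10.1109/2.781637"}
    ],
    "pagination": "277-306",
    "productId": [
      {"name": "doi", "value": ["10.1007/978-1-84996-077-9_11"]}
    ]
  }
]
"""


def summarize(record_json: str) -> dict:
    """Return authors, DOI, pages, and de-duplicated citation ids."""
    record = json.loads(record_json)[0]  # the payload is a one-element list
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])
    ]
    # dict.fromkeys keeps first-seen order while dropping repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    doi = next(
        (p["value"][0] for p in record.get("productId", []) if p["name"] == "doi"),
        None,
    )
    return {
        "authors": authors,
        "doi": doi,
        "pages": record.get("pagination"),
        "citations": citations,
    }


summary = summarize(RECORD_JSON)
print(summary["authors"])
print(summary["doi"], summary["pages"], len(summary["citations"]))
```

This treats the record as plain JSON and ignores the `@context`; resolving prefixes such as `sg:` into full URIs would need a JSON-LD processor.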
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-1-84996-077-9_11'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-1-84996-077-9_11'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-1-84996-077-9_11'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-1-84996-077-9_11'
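The four curl calls above differ only in the `Accept` header. A minimal Python sketch of the same content negotiation, using only the standard library, might look like the following. `ACCEPT_TYPES`, `build_request`, and `fetch` are illustrative names, not part of any SciGraph client, and decoding the response as UTF-8 is an assumption.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-1-84996-077-9_11"

# Media types copied from the curl examples; the short keys are arbitrary labels.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> urllib.request.Request:
    """Build a GET request whose Accept header selects the RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt: str, url: str = RECORD_URL, timeout: float = 10.0) -> str:
    """Send the request and return the body as text (UTF-8 assumed)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Network call; it succeeds only if the service is reachable.
    print(fetch("turtle")[:500])
```

Keeping request construction separate from the network call makes the header logic easy to check without contacting the server.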


     

    This table displays all metadata directly associated with this object as RDF triples.

    137 TRIPLES      23 PREDICATES      39 URIs      20 LITERALS      8 BLANK NODES

    Subject Predicate Object
    1 sg:pub.10.1007/978-1-84996-077-9_11 schema:about anzsrc-for:08
    2 anzsrc-for:0801
    3 schema:author N1818ca538a704b8e953a18a5f413caf9
    4 schema:citation sg:pub.10.1007/3-540-28349-8_2
    5 sg:pub.10.1007/bf01014019
    6 sg:pub.10.1007/bfb0094844
    7 sg:pub.10.1023/a:1008693204338
    8 https://doi.org/10.1016/s0019-9958(65)90241-x
    9 https://doi.org/10.1016/s0261-5177(01)00050-4
    10 https://doi.org/10.1109/2.781637
    11 https://doi.org/10.1109/tsmc.1973.5408575
    12 https://doi.org/10.1109/tsmcc.2002.1009142
    13 https://doi.org/10.1145/1167350.1167397
    14 https://doi.org/10.1145/240455.240463
    15 https://doi.org/10.1145/775047.775062
    16 schema:datePublished 2010
    17 schema:datePublishedReg 2010-01-01
    18 schema:description Ambiguity is a challenge faced by systems that handle natural language. To assuage the issue of linguistic ambiguities found in text classification, this work proposes a text categorizer using the methodology of Fuzzy Similarity. The clustering algorithms Stars and Cliques are adopted in the Agglomerative Hierarchical method and they identify the groups of texts by specifying some type of relationship rule to create categories based on the similarity analysis of the textual terms. The proposal is based on the methodology suggested, categories can be created from the analysis of the degree of similarity of the texts to be classified, without needing to determine the number of initial categories. The combination of techniques proposed in the categorizer’s steps brought satisfactory results, proving to be efficient in textual classification.
    19 schema:editor N7c4d0b7492514e6f8d1be5892234e857
    20 schema:genre chapter
    21 schema:inLanguage en
    22 schema:isAccessibleForFree false
    23 schema:isPartOf Nd170c88237754041861e023e6af5fd74
    24 schema:name An Analysis of Constructed Categories for Textual Classification Using Fuzzy Similarity and Agglomerative Hierarchical Methods
    25 schema:pagination 277-306
    26 schema:productId N075be2d7aa78437ebc7cfd4a501b1f05
    27 Nd89d6654d24741f590ab2d228956c6ff
    28 Ne5286504be9d48b7a1dd6ce8ed6dfb5c
    29 schema:publisher N06ee75454173491a9c64c8025302a11a
    30 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017970100
    31 https://doi.org/10.1007/978-1-84996-077-9_11
    32 schema:sdDatePublished 2019-04-16T08:00
    33 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    34 schema:sdPublisher N9f9754b4845343849d015241277b656b
    35 schema:url https://link.springer.com/10.1007%2F978-1-84996-077-9_11
    36 sgo:license sg:explorer/license/
    37 sgo:sdDataset chapters
    38 rdf:type schema:Chapter
    39 N06ee75454173491a9c64c8025302a11a schema:location London
    40 schema:name Springer London
    41 rdf:type schema:Organisation
    42 N075be2d7aa78437ebc7cfd4a501b1f05 schema:name dimensions_id
    43 schema:value pub.1017970100
    44 rdf:type schema:PropertyValue
    45 N1818ca538a704b8e953a18a5f413caf9 rdf:first sg:person.012350245407.38
    46 rdf:rest N41de5f7fb54f4fd59ebae1ec3835b959
    47 N41de5f7fb54f4fd59ebae1ec3835b959 rdf:first sg:person.07430767131.99
    48 rdf:rest N978c7b9bf4604ae8a8cd28ab932c9b39
    49 N455c4cf868d64d6a8ea3a410f58f638c schema:name Departamento de Ciência da Computação, Instituto de Computação—IC, Universidade Federal Fluminense—UFF, Rua Passo da Pátria 156, Bloco E, 3º andar, RJ CEP 24210-240, São Domingos, Niterói, Brazil
    50 rdf:type schema:Organization
    51 N576a21aa86844da490b5efa12b31ed73 schema:familyName Badr
    52 schema:givenName Youakim
    53 rdf:type schema:Person
    54 N64e3745d61f04350aefc2ddc73df7a25 schema:name Departamento de Ciência e Tecnologia—RCT, Pólo Universitário de Rio das Ostras—PURO, Universidade Federal Fluminense—UFF, Rua Recife, s/n, Jardim Bela Vista, RJ CEP 28890-000, Rio das Ostras, Brazil
    55 rdf:type schema:Organization
    56 N70f1d520bfdf438385da37b8ea2ea498 rdf:first Nddee1271278a498bb8b6648ac36b8f9c
    57 rdf:rest rdf:nil
    58 N7c4d0b7492514e6f8d1be5892234e857 rdf:first N576a21aa86844da490b5efa12b31ed73
    59 rdf:rest N9c3de7f36259432080e1b2122e64986a
    60 N87d5b4470fff4236a38d32a91dfd7c76 schema:name Departamento de Ciência da Computação, Instituto de Computação—IC, Universidade Federal Fluminense—UFF, Rua Passo da Pátria 156, Bloco E, 3º andar, RJ CEP 24210-240, São Domingos, Niterói, Brazil
    61 rdf:type schema:Organization
    62 N881fd5981b5b41339fde68e536d8af8d rdf:first Nd63ca02e630c498ea079607b70a90a5f
    63 rdf:rest N70f1d520bfdf438385da37b8ea2ea498
    64 N978c7b9bf4604ae8a8cd28ab932c9b39 rdf:first sg:person.013776555445.44
    65 rdf:rest rdf:nil
    66 N9c3de7f36259432080e1b2122e64986a rdf:first Nbbe7cd44e9fc4abeb3fb3961c2d1a582
    67 rdf:rest N881fd5981b5b41339fde68e536d8af8d
    68 N9f9754b4845343849d015241277b656b schema:name Springer Nature - SN SciGraph project
    69 rdf:type schema:Organization
    70 Nbbe7cd44e9fc4abeb3fb3961c2d1a582 schema:familyName Chbeir
    71 schema:givenName Richard
    72 rdf:type schema:Person
    73 Nd170c88237754041861e023e6af5fd74 schema:isbn 978-1-84996-076-2
    74 978-1-84996-077-9
    75 schema:name Emergent Web Intelligence: Advanced Semantic Technologies
    76 rdf:type schema:Book
    77 Nd63ca02e630c498ea079607b70a90a5f schema:familyName Abraham
    78 schema:givenName Ajith
    79 rdf:type schema:Person
    80 Nd89d6654d24741f590ab2d228956c6ff schema:name doi
    81 schema:value 10.1007/978-1-84996-077-9_11
    82 rdf:type schema:PropertyValue
    83 Nddee1271278a498bb8b6648ac36b8f9c schema:familyName Hassanien
    84 schema:givenName Aboul-Ella
    85 rdf:type schema:Person
    86 Ne5286504be9d48b7a1dd6ce8ed6dfb5c schema:name readcube_id
    87 schema:value 559efbd17e574ae2d72103b06402626ecab2cfd9c5ae924a2689340f4cb7198c
    88 rdf:type schema:PropertyValue
    89 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    90 schema:name Information and Computing Sciences
    91 rdf:type schema:DefinedTerm
    92 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    93 schema:name Artificial Intelligence and Image Processing
    94 rdf:type schema:DefinedTerm
    95 sg:person.012350245407.38 schema:affiliation N455c4cf868d64d6a8ea3a410f58f638c
    96 schema:familyName Guelpeli
    97 schema:givenName Marcus V. C.
    98 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012350245407.38
    99 rdf:type schema:Person
    100 sg:person.013776555445.44 schema:affiliation N64e3745d61f04350aefc2ddc73df7a25
    101 schema:familyName Bernardini
    102 schema:givenName Flavia Cristina
    103 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013776555445.44
    104 rdf:type schema:Person
    105 sg:person.07430767131.99 schema:affiliation N87d5b4470fff4236a38d32a91dfd7c76
    106 schema:familyName Garcia
    107 schema:givenName Ana Cristina Bicharra
    108 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07430767131.99
    109 rdf:type schema:Person
    110 sg:pub.10.1007/3-540-28349-8_2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011218014
    111 https://doi.org/10.1007/3-540-28349-8_2
    112 rdf:type schema:CreativeWork
    113 sg:pub.10.1007/bf01014019 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033037339
    114 https://doi.org/10.1007/bf01014019
    115 rdf:type schema:CreativeWork
    116 sg:pub.10.1007/bfb0094844 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010267166
    117 https://doi.org/10.1007/bfb0094844
    118 rdf:type schema:CreativeWork
    119 sg:pub.10.1023/a:1008693204338 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008597299
    120 https://doi.org/10.1023/a:1008693204338
    121 rdf:type schema:CreativeWork
    122 https://doi.org/10.1016/s0019-9958(65)90241-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1009640697
    123 rdf:type schema:CreativeWork
    124 https://doi.org/10.1016/s0261-5177(01)00050-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046548125
    125 rdf:type schema:CreativeWork
    126 https://doi.org/10.1109/2.781637 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061106156
    127 rdf:type schema:CreativeWork
    128 https://doi.org/10.1109/tsmc.1973.5408575 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061792730
    129 rdf:type schema:CreativeWork
    130 https://doi.org/10.1109/tsmcc.2002.1009142 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061797640
    131 rdf:type schema:CreativeWork
    132 https://doi.org/10.1145/1167350.1167397 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043823733
    133 rdf:type schema:CreativeWork
    134 https://doi.org/10.1145/240455.240463 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018673843
    135 rdf:type schema:CreativeWork
    136 https://doi.org/10.1145/775047.775062 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031094505
    137 rdf:type schema:CreativeWork
     



