Ontology type: schema:ScholarlyArticle
2000-02
AUTHORS: David Gibson, Jon Kleinberg, Prabhakar Raghavan
ABSTRACT: We describe a novel approach for clustering collections of sets, and its application to the analysis and mining of categorical data. By “categorical data,” we mean tables with fields that cannot be naturally ordered by a metric – e.g., the names of producers of automobiles, or the names of products offered by a manufacturer. Our approach is based on an iterative method for assigning and propagating weights on the categorical values in a table; this facilitates a type of similarity measure arising from the co-occurrence of values in the dataset. Our techniques can be studied analytically in terms of certain types of non-linear dynamical systems.
PAGES: 222-236
http://scigraph.springernature.com/pub.10.1007/s007780050005
DOI: http://dx.doi.org/10.1007/s007780050005
DIMENSIONS: https://app.dimensions.ai/details/publication/pub.1031702635
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information and Computing Sciences",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0804",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Data Format",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0805",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Distributed Computing",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0806",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information Systems",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "Department of Computer Science UC Berkeley, Berkeley, CA 94720 USA; e-mail: dag@cs.berkeley.edu, US",
"id": "http://www.grid.ac/institutes/None",
"name": [
"Department of Computer Science UC Berkeley, Berkeley, CA 94720 USA; e-mail: dag@cs.berkeley.edu, US"
],
"type": "Organization"
},
"familyName": "Gibson",
"givenName": "David",
"id": "sg:person.011534145461.80",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011534145461.80"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Department of Computer Science, Cornell University, Ithaca, NY 14853; e-mail: kleinber@cs.cornell.edu, US",
"id": "http://www.grid.ac/institutes/grid.5386.8",
"name": [
"Department of Computer Science, Cornell University, Ithaca, NY 14853; e-mail: kleinber@cs.cornell.edu, US"
],
"type": "Organization"
},
"familyName": "Kleinberg",
"givenName": "Jon",
"id": "sg:person.011522233557.04",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011522233557.04"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Almaden Research Center IBM, San Jose, CA 95120 USA; e-mail: pragh@almaden.ibm.com, US",
"id": "http://www.grid.ac/institutes/None",
"name": [
"Almaden Research Center IBM, San Jose, CA 95120 USA; e-mail: pragh@almaden.ibm.com, US"
],
"type": "Organization"
},
"familyName": "Raghavan",
"givenName": "Prabhakar",
"id": "sg:person.012437241622.81",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012437241622.81"
],
"type": "Person"
}
],
"datePublished": "2000-02",
"datePublishedReg": "2000-02-01",
"description": "Abstract. We describe a novel approach for clustering collections of sets, and its application to the analysis and mining of categorical data. By \u201ccategorical data,\u201d we mean tables with fields that cannot be naturally ordered by a metric \u2013 e.g., the names of producers of automobiles, or the names of products offered by a manufacturer. Our approach is based on an iterative method for assigning and propagating weights on the categorical values in a table; this facilitates a type of similarity measure arising from the co-occurrence of values in the dataset. Our techniques can be studied analytically in terms of certain types of non-linear dynamical systems.",
"genre": "article",
"id": "sg:pub.10.1007/s007780050005",
"inLanguage": "en",
"isAccessibleForFree": false,
"isPartOf": [
{
"id": "sg:journal.1044889",
"issn": [
"1066-8888",
"0949-877X"
],
"name": "The VLDB Journal",
"publisher": "Springer Nature",
"type": "Periodical"
},
{
"issueNumber": "3",
"type": "PublicationIssue"
},
{
"type": "PublicationVolume",
"volumeNumber": "8"
}
],
"keywords": [
"dynamical systems",
"non-linear dynamical systems",
"categorical data",
"iterative method",
"collection of sets",
"certain types",
"approach",
"similarity measure",
"novel approach",
"system",
"categorical values",
"field",
"set",
"table",
"terms",
"applications",
"values",
"technique",
"data",
"names of products",
"dataset",
"types",
"analysis",
"measures",
"automobiles",
"collection",
"manufacturers",
"mining",
"weight",
"products",
"name",
"producers",
"method"
],
"name": "Clustering categorical data: an approach based on dynamical systems",
"pagination": "222-236",
"productId": [
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1031702635"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1007/s007780050005"
]
}
],
"sameAs": [
"https://doi.org/10.1007/s007780050005",
"https://app.dimensions.ai/details/publication/pub.1031702635"
],
"sdDataset": "articles",
"sdDatePublished": "2022-05-10T09:47",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-springernature-scigraph/baseset/20220509/entities/gbq_results/article/article_312.jsonl",
"type": "ScholarlyArticle",
"url": "https://doi.org/10.1007/s007780050005"
}
]
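Because the record is plain JSON, the standard library is enough to read it. The following is a minimal sketch, not part of the SciGraph page: `RECORD` is a hand-trimmed excerpt of the record above, and `summarize` is a hypothetical helper name. It assumes the record is a one-element list and that each `isPartOf` entry carries a distinct `type`, as it does here.

```python
import json

# Hand-trimmed excerpt of the record above; only the fields used below are kept.
RECORD = """
[
  {
    "author": [
      {"familyName": "Gibson", "givenName": "David", "type": "Person"},
      {"familyName": "Kleinberg", "givenName": "Jon", "type": "Person"},
      {"familyName": "Raghavan", "givenName": "Prabhakar", "type": "Person"}
    ],
    "isPartOf": [
      {"id": "sg:journal.1044889", "name": "The VLDB Journal", "type": "Periodical"},
      {"issueNumber": "3", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "8"}
    ],
    "name": "Clustering categorical data: an approach based on dynamical systems",
    "pagination": "222-236"
  }
]
"""


def summarize(record_json: str) -> dict:
    """Pull a few bibliographic fields out of a SciGraph-style JSON-LD record."""
    article = json.loads(record_json)[0]  # assumption: a one-element list
    # Index the isPartOf entries by their "type" so order does not matter.
    parts = {part["type"]: part for part in article["isPartOf"]}
    return {
        "title": article["name"],
        "authors": [f'{a["givenName"]} {a["familyName"]}' for a in article["author"]],
        "journal": parts["Periodical"]["name"],
        "volume": parts["PublicationVolume"]["volumeNumber"],
        "issue": parts["PublicationIssue"]["issueNumber"],
        "pages": article["pagination"],
    }


summary = summarize(RECORD)
print(summary)
```

Indexing `isPartOf` by `type` avoids relying on the position of the journal, issue, and volume entries, which the record does not guarantee.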
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s007780050005'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s007780050005'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s007780050005'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s007780050005'
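The four curl commands differ only in their `Accept` header, so the same content negotiation can be sketched in Python with `urllib.request`. The short format keys and the helper names `build_request` and `fetch` are illustrative assumptions, not part of any SciGraph API; the media types and URL are copied from the commands above. Whether the endpoint still answers is not something this page confirms, so the sketch only builds a request and leaves the network call to `fetch`.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/s007780050005"

# Media types copied from the curl commands above; the keys are illustrative.
FORMATS = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> urllib.request.Request:
    """Build a GET request asking the server for one RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": FORMATS[fmt]})


def fetch(fmt: str, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response body (needs network access)."""
    with urllib.request.urlopen(build_request(fmt), timeout=timeout) as response:
        return response.read()


# Inspect a request without sending it.
req = build_request("turtle")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Calling `fetch("nt")` would be the equivalent of the N-Triples curl command, provided the service is reachable.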
This table displays all metadata directly associated with this object as RDF triples.
118 TRIPLES
21 PREDICATES
61 URIs
51 LITERALS
6 BLANK NODES
# | Subject | Predicate | Object |
---|---|---|---|
1 | sg:pub.10.1007/s007780050005 | schema:about | anzsrc-for:08 |
2 | ″ | ″ | anzsrc-for:0804 |
3 | ″ | ″ | anzsrc-for:0805 |
4 | ″ | ″ | anzsrc-for:0806 |
5 | ″ | schema:author | Nb53e388802f34a3da822db44eebe67e0 |
6 | ″ | schema:datePublished | 2000-02 |
7 | ″ | schema:datePublishedReg | 2000-02-01 |
8 | ″ | schema:description | Abstract. We describe a novel approach for clustering collections of sets, and its application to the analysis and mining of categorical data. By “categorical data,” we mean tables with fields that cannot be naturally ordered by a metric – e.g., the names of producers of automobiles, or the names of products offered by a manufacturer. Our approach is based on an iterative method for assigning and propagating weights on the categorical values in a table; this facilitates a type of similarity measure arising from the co-occurrence of values in the dataset. Our techniques can be studied analytically in terms of certain types of non-linear dynamical systems. |
9 | ″ | schema:genre | article |
10 | ″ | schema:inLanguage | en |
11 | ″ | schema:isAccessibleForFree | false |
12 | ″ | schema:isPartOf | N606f616ed91c437198c10aae066f36f7 |
13 | ″ | ″ | Na5f65208507d4879929002c3f019146e |
14 | ″ | ″ | sg:journal.1044889 |
15 | ″ | schema:keywords | analysis |
16 | ″ | ″ | applications |
17 | ″ | ″ | approach |
18 | ″ | ″ | automobiles |
19 | ″ | ″ | categorical data |
20 | ″ | ″ | categorical values |
21 | ″ | ″ | certain types |
22 | ″ | ″ | collection |
23 | ″ | ″ | collection of sets |
24 | ″ | ″ | data |
25 | ″ | ″ | dataset |
26 | ″ | ″ | dynamical systems |
27 | ″ | ″ | field |
28 | ″ | ″ | iterative method |
29 | ″ | ″ | manufacturers |
30 | ″ | ″ | measures |
31 | ″ | ″ | method |
32 | ″ | ″ | mining |
33 | ″ | ″ | name |
34 | ″ | ″ | names of products |
35 | ″ | ″ | non-linear dynamical systems |
36 | ″ | ″ | novel approach |
37 | ″ | ″ | producers |
38 | ″ | ″ | products |
39 | ″ | ″ | set |
40 | ″ | ″ | similarity measure |
41 | ″ | ″ | system |
42 | ″ | ″ | table |
43 | ″ | ″ | technique |
44 | ″ | ″ | terms |
45 | ″ | ″ | types |
46 | ″ | ″ | values |
47 | ″ | ″ | weight |
48 | ″ | schema:name | Clustering categorical data: an approach based on dynamical systems |
49 | ″ | schema:pagination | 222-236 |
50 | ″ | schema:productId | N1470b1c99f034bd88dddd9074f8b901e |
51 | ″ | ″ | Nd4a52c5ffd1445f9891c1919ac4c7920 |
52 | ″ | schema:sameAs | https://app.dimensions.ai/details/publication/pub.1031702635 |
53 | ″ | ″ | https://doi.org/10.1007/s007780050005 |
54 | ″ | schema:sdDatePublished | 2022-05-10T09:47 |
55 | ″ | schema:sdLicense | https://scigraph.springernature.com/explorer/license/ |
56 | ″ | schema:sdPublisher | N6ea55a2a922d40689f4b40e6cea20c99 |
57 | ″ | schema:url | https://doi.org/10.1007/s007780050005 |
58 | ″ | sgo:license | sg:explorer/license/ |
59 | ″ | sgo:sdDataset | articles |
60 | ″ | rdf:type | schema:ScholarlyArticle |
61 | N1470b1c99f034bd88dddd9074f8b901e | schema:name | dimensions_id |
62 | ″ | schema:value | pub.1031702635 |
63 | ″ | rdf:type | schema:PropertyValue |
64 | N606f616ed91c437198c10aae066f36f7 | schema:issueNumber | 3 |
65 | ″ | rdf:type | schema:PublicationIssue |
66 | N6ea55a2a922d40689f4b40e6cea20c99 | schema:name | Springer Nature - SN SciGraph project |
67 | ″ | rdf:type | schema:Organization |
68 | N785ff1d7acfb49a98896b2b5defbf76a | rdf:first | sg:person.011522233557.04 |
69 | ″ | rdf:rest | Nb6bf070a54a6459ebc1d2dbf04e11cd9 |
70 | Na5f65208507d4879929002c3f019146e | schema:volumeNumber | 8 |
71 | ″ | rdf:type | schema:PublicationVolume |
72 | Nb53e388802f34a3da822db44eebe67e0 | rdf:first | sg:person.011534145461.80 |
73 | ″ | rdf:rest | N785ff1d7acfb49a98896b2b5defbf76a |
74 | Nb6bf070a54a6459ebc1d2dbf04e11cd9 | rdf:first | sg:person.012437241622.81 |
75 | ″ | rdf:rest | rdf:nil |
76 | Nd4a52c5ffd1445f9891c1919ac4c7920 | schema:name | doi |
77 | ″ | schema:value | 10.1007/s007780050005 |
78 | ″ | rdf:type | schema:PropertyValue |
79 | anzsrc-for:08 | schema:inDefinedTermSet | anzsrc-for: |
80 | ″ | schema:name | Information and Computing Sciences |
81 | ″ | rdf:type | schema:DefinedTerm |
82 | anzsrc-for:0804 | schema:inDefinedTermSet | anzsrc-for: |
83 | ″ | schema:name | Data Format |
84 | ″ | rdf:type | schema:DefinedTerm |
85 | anzsrc-for:0805 | schema:inDefinedTermSet | anzsrc-for: |
86 | ″ | schema:name | Distributed Computing |
87 | ″ | rdf:type | schema:DefinedTerm |
88 | anzsrc-for:0806 | schema:inDefinedTermSet | anzsrc-for: |
89 | ″ | schema:name | Information Systems |
90 | ″ | rdf:type | schema:DefinedTerm |
91 | sg:journal.1044889 | schema:issn | 0949-877X |
92 | ″ | ″ | 1066-8888 |
93 | ″ | schema:name | The VLDB Journal |
94 | ″ | schema:publisher | Springer Nature |
95 | ″ | rdf:type | schema:Periodical |
96 | sg:person.011522233557.04 | schema:affiliation | grid-institutes:grid.5386.8 |
97 | ″ | schema:familyName | Kleinberg |
98 | ″ | schema:givenName | Jon |
99 | ″ | schema:sameAs | https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011522233557.04 |
100 | ″ | rdf:type | schema:Person |
101 | sg:person.011534145461.80 | schema:affiliation | grid-institutes:None |
102 | ″ | schema:familyName | Gibson |
103 | ″ | schema:givenName | David |
104 | ″ | schema:sameAs | https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011534145461.80 |
105 | ″ | rdf:type | schema:Person |
106 | sg:person.012437241622.81 | schema:affiliation | grid-institutes:None |
107 | ″ | schema:familyName | Raghavan |
108 | ″ | schema:givenName | Prabhakar |
109 | ″ | schema:sameAs | https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012437241622.81 |
110 | ″ | rdf:type | schema:Person |
111 | grid-institutes:None | schema:alternateName | Almaden Research Center IBM, San Jose, CA 95120 USA; e-mail: pragh@almaden.ibm.com, US |
112 | ″ | ″ | Department of Computer Science UC Berkeley, Berkeley, CA 94720 USA; e-mail: dag@cs.berkeley.edu, US |
113 | ″ | schema:name | Almaden Research Center IBM, San Jose, CA 95120 USA; e-mail: pragh@almaden.ibm.com, US |
114 | ″ | ″ | Department of Computer Science UC Berkeley, Berkeley, CA 94720 USA; e-mail: dag@cs.berkeley.edu, US |
115 | ″ | rdf:type | schema:Organization |
116 | grid-institutes:grid.5386.8 | schema:alternateName | Department of Computer Science, Cornell University, Ithaca, NY 14853; e-mail: kleinber@cs.cornell.edu, US |
117 | ″ | schema:name | Department of Computer Science, Cornell University, Ithaca, NY 14853; e-mail: kleinber@cs.cornell.edu, US |
118 | ″ | rdf:type | schema:Organization |
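In the table above, author order is not stored on the article itself: `schema:author` (row 5) points at a blank node, and rows 68-69 and 72-75 chain the authors together as an RDF list through `rdf:first` and `rdf:rest` until `rdf:nil`. The sketch below walks that chain to recover the order. The triples are copied from those rows; `walk_rdf_list` and the tuple layout are illustrative choices, not something the page defines.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate, object) copied from rows 68-69 and 72-75 of the table.
TRIPLES = [
    ("N785ff1d7acfb49a98896b2b5defbf76a", "rdf:first", "sg:person.011522233557.04"),
    ("N785ff1d7acfb49a98896b2b5defbf76a", "rdf:rest", "Nb6bf070a54a6459ebc1d2dbf04e11cd9"),
    ("Nb53e388802f34a3da822db44eebe67e0", "rdf:first", "sg:person.011534145461.80"),
    ("Nb53e388802f34a3da822db44eebe67e0", "rdf:rest", "N785ff1d7acfb49a98896b2b5defbf76a"),
    ("Nb6bf070a54a6459ebc1d2dbf04e11cd9", "rdf:first", "sg:person.012437241622.81"),
    ("Nb6bf070a54a6459ebc1d2dbf04e11cd9", "rdf:rest", RDF_NIL),
]

# Family names from rows 97, 102 and 107.
FAMILY_NAME = {
    "sg:person.011534145461.80": "Gibson",
    "sg:person.011522233557.04": "Kleinberg",
    "sg:person.012437241622.81": "Raghavan",
}


def walk_rdf_list(head: str, triples: list) -> list:
    """Follow rdf:first / rdf:rest from `head` until rdf:nil; return the items in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# Row 5: the object of schema:author is the head of the list.
author_ids = walk_rdf_list("Nb53e388802f34a3da822db44eebe67e0", TRIPLES)
authors = [FAMILY_NAME[person] for person in author_ids]
print(authors)
```

The recovered order matches the `author` array in the JSON-LD record, where the list is already flattened.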