Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2003-07

AUTHORS

Stefano Monti, Pablo Tamayo, Jill Mesirov, Todd Golub

ABSTRACT

In this paper we present a new methodology of class discovery and clustering validation tailored to the task of analyzing gene expression data. The method can best be thought of as an analysis approach, to guide and assist in the use of any of a wide range of available clustering algorithms. We call the new methodology consensus clustering, and in conjunction with resampling techniques, it provides for a method to represent the consensus across multiple runs of a clustering algorithm and to assess the stability of the discovered clusters. The method can also be used to represent the consensus over multiple runs of a clustering algorithm with random restart (such as K-means, model-based Bayesian clustering, SOM, etc.), so as to account for its sensitivity to the initial conditions. Finally, it provides for a visualization tool to inspect cluster number, membership, and boundaries. We present the results of our experiments on both simulated data and real gene expression data aimed at evaluating the effectiveness of the methodology in discovering biologically meaningful clusters.
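
The resampling idea in the abstract can be illustrated with a minimal sketch. It assumes a small hand-written K-means as the base clustering algorithm and subsampling without replacement; the function names, the 80% subsample fraction, and the number of resamples are illustrative choices, not details taken from this record. Each entry of the resulting matrix is the fraction of resamples in which two items were placed in the same cluster, out of the resamples that contained both.

```python
import numpy as np


def kmeans_labels(X, k, rng, n_iter=20):
    """Tiny K-means (Lloyd's algorithm) with randomly chosen initial centers."""
    centers = X[rng.choice(len(X), size=k, replace=False)]  # fancy indexing copies
    for _ in range(n_iter):
        # Assign each item to its nearest center (squared Euclidean distance).
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its items; keep it if the cluster is empty.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def consensus_matrix(X, k, n_resamples=50, frac=0.8, seed=0):
    """Fraction of resamples in which each pair of items is co-clustered."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(X)
    together = np.zeros((n, n))  # times i and j landed in the same cluster
    sampled = np.zeros((n, n))   # times i and j were both in the subsample
    m = max(k, int(frac * n))    # subsample size (assumed fraction of the items)
    for _ in range(n_resamples):
        idx = rng.choice(n, size=m, replace=False)
        labels = kmeans_labels(X[idx], k, rng)
        same = (labels[:, None] == labels[None, :]).astype(float)
        sampled[np.ix_(idx, idx)] += 1.0
        together[np.ix_(idx, idx)] += same
    # Pairs that were never sampled together are given consensus 0.
    return np.divide(together, sampled, out=np.zeros_like(together), where=sampled > 0)
```

Entries near 1 or 0 indicate pairs whose assignment is stable across resamples; intermediate values indicate unstable membership. Reordering the matrix by cluster and drawing it as a heat map is one way to obtain the kind of visual inspection the abstract mentions.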

PAGES

91-118

Identifiers

URI

http://scigraph.springernature.com/pub.10.1023/a:1023949509487

DOI

http://dx.doi.org/10.1023/a:1023949509487

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1036378730



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.270301.7", 
          "name": [
            "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Monti", 
        "givenName": "Stefano", 
        "id": "sg:person.07563510377.41", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07563510377.41"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.270301.7", 
          "name": [
            "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Tamayo", 
        "givenName": "Pablo", 
        "id": "sg:person.0756245403.69", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0756245403.69"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.270301.7", 
          "name": [
            "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Mesirov", 
        "givenName": "Jill", 
        "id": "sg:person.0766253043.80", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0766253043.80"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.270301.7", 
          "name": [
            "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Golub", 
        "givenName": "Todd", 
        "id": "sg:person.01152644406.14", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01152644406.14"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/bf01908075", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022323983", 
          "https://doi.org/10.1007/bf01908075"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/3-540-45784-4_39", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1051759094", 
          "https://doi.org/10.1007/3-540-45784-4_39"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1007469629108", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044802083", 
          "https://doi.org/10.1023/a:1007469629108"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf02294245", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1012945757", 
          "https://doi.org/10.1007/bf02294245"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/415436a", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019207421", 
          "https://doi.org/10.1038/415436a"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf01908065", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043555994", 
          "https://doi.org/10.1007/bf01908065"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2002-3-7-research0036", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041996792", 
          "https://doi.org/10.1186/gb-2002-3-7-research0036"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2003-07", 
    "datePublishedReg": "2003-07-01", 
    "description": "In this paper we present a new methodology of class discovery and clustering validation tailored to the task of analyzing gene expression data. The method can best be thought of as an analysis approach, to guide and assist in the use of any of a wide range of available clustering algorithms. We call the new methodology consensus clustering, and in conjunction with resampling techniques, it provides for a method to represent the consensus across multiple runs of a clustering algorithm and to assess the stability of the discovered clusters. The method can also be used to represent the consensus over multiple runs of a clustering algorithm with random restart (such as K-means, model-based Bayesian clustering, SOM, etc.), so as to account for its sensitivity to the initial conditions. Finally, it provides for a visualization tool to inspect cluster number, membership, and boundaries. We present the results of our experiments on both simulated data and real gene expression data aimed at evaluating the effectiveness of the methodology in discovering biologically meaningful clusters.", 
    "genre": "article", 
    "id": "sg:pub.10.1023/a:1023949509487", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1125588", 
        "issn": [
          "0885-6125", 
          "1573-0565"
        ], 
        "name": "Machine Learning", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1-2", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "52"
      }
    ], 
    "keywords": [
      "gene expression data", 
      "real gene expression data", 
      "class discovery", 
      "multiple runs", 
      "gene expression microarray data", 
      "random restarts", 
      "expression data", 
      "clustering algorithm", 
      "visualization tool", 
      "expression microarray data", 
      "consensus clustering", 
      "initial conditions", 
      "meaningful clusters", 
      "cluster number", 
      "algorithm", 
      "new methodology", 
      "microarray data", 
      "analysis approach", 
      "methodology", 
      "clustering", 
      "resampling", 
      "task", 
      "visualization", 
      "restart", 
      "method", 
      "clusters", 
      "data", 
      "discovery", 
      "tool", 
      "effectiveness", 
      "wide range", 
      "approach", 
      "technique", 
      "stability", 
      "run", 
      "validation", 
      "boundaries", 
      "experiments", 
      "number", 
      "conditions", 
      "consensus", 
      "membership", 
      "results", 
      "use", 
      "conjunction", 
      "range", 
      "sensitivity", 
      "paper", 
      "new methodology consensus clustering", 
      "methodology consensus clustering"
    ], 
    "name": "Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data", 
    "pagination": "91-118", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1036378730"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1023/a:1023949509487"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1023/a:1023949509487", 
      "https://app.dimensions.ai/details/publication/pub.1036378730"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-01-01T18:13", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/article/article_374.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1023/a:1023949509487"
  }
]
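
As a rough illustration of how the JSON-LD record above might be consumed, the sketch below parses an abbreviated excerpt of it with Python's standard `json` module and builds a short citation summary. The `summarize` helper and the choice of fields are assumptions made for this example; only the field names and values come from the record.

```python
import json

# Abbreviated excerpt of the record above; most fields are omitted for brevity.
RECORD_JSON = """
[
  {
    "id": "sg:pub.10.1023/a:1023949509487",
    "name": "Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data",
    "author": [
      {"familyName": "Monti", "givenName": "Stefano", "type": "Person"},
      {"familyName": "Tamayo", "givenName": "Pablo", "type": "Person"},
      {"familyName": "Mesirov", "givenName": "Jill", "type": "Person"},
      {"familyName": "Golub", "givenName": "Todd", "type": "Person"}
    ],
    "datePublished": "2003-07",
    "isPartOf": [
      {"id": "sg:journal.1125588", "name": "Machine Learning", "type": "Periodical"},
      {"issueNumber": "1-2", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "52"}
    ],
    "pagination": "91-118",
    "type": "ScholarlyArticle"
  }
]
"""


def summarize(record):
    """Collect a few citation fields from one SciGraph article record (hypothetical helper)."""
    # Index the isPartOf entries by their type so each can be looked up directly.
    parts = {part["type"]: part for part in record.get("isPartOf", [])}
    return {
        "title": record["name"],
        "authors": [f'{a["givenName"]} {a["familyName"]}' for a in record["author"]],
        "journal": parts.get("Periodical", {}).get("name"),
        "volume": parts.get("PublicationVolume", {}).get("volumeNumber"),
        "issue": parts.get("PublicationIssue", {}).get("issueNumber"),
        "pages": record.get("pagination"),
        "date": record.get("datePublished"),
    }


# The document is a JSON array holding a single record.
record = json.loads(RECORD_JSON)[0]
summary = summarize(record)
```

Because JSON-LD is plain JSON, no linked-data library is needed for this kind of field extraction; a JSON-LD processor would only be required to expand the `@context` into full URIs.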
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1023/a:1023949509487'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1023/a:1023949509487'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1023/a:1023949509487'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1023/a:1023949509487'
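
The four curl commands above differ only in the `Accept` header. A minimal Python equivalent using the standard library might look like the sketch below. The short format labels and the `build_request` and `fetch` helper names are assumptions for this example, and whether the endpoint still answers these requests has not been verified here.

```python
from urllib.request import Request, urlopen

RECORD_URL = "https://scigraph.springernature.com/pub.10.1023/a:1023949509487"

# Media types taken from the curl examples; the short keys are illustrative labels.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Prepare (but do not send) a request asking for the given serialization."""
    return Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text."""
    with urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch("turtle")` would correspond to the Turtle curl command; the request is built separately so the header logic can be checked without network access.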


 

This table displays all metadata directly associated with this object as RDF triples.

157 TRIPLES      22 PREDICATES      83 URIs      68 LITERALS      6 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1023/a:1023949509487 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author Nd593d82bdc7c45fe9c6912f5aad1bb86
4 schema:citation sg:pub.10.1007/3-540-45784-4_39
5 sg:pub.10.1007/bf01908065
6 sg:pub.10.1007/bf01908075
7 sg:pub.10.1007/bf02294245
8 sg:pub.10.1023/a:1007469629108
9 sg:pub.10.1038/415436a
10 sg:pub.10.1186/gb-2002-3-7-research0036
11 schema:datePublished 2003-07
12 schema:datePublishedReg 2003-07-01
13 schema:description In this paper we present a new methodology of class discovery and clustering validation tailored to the task of analyzing gene expression data. The method can best be thought of as an analysis approach, to guide and assist in the use of any of a wide range of available clustering algorithms. We call the new methodology consensus clustering, and in conjunction with resampling techniques, it provides for a method to represent the consensus across multiple runs of a clustering algorithm and to assess the stability of the discovered clusters. The method can also be used to represent the consensus over multiple runs of a clustering algorithm with random restart (such as K-means, model-based Bayesian clustering, SOM, etc.), so as to account for its sensitivity to the initial conditions. Finally, it provides for a visualization tool to inspect cluster number, membership, and boundaries. We present the results of our experiments on both simulated data and real gene expression data aimed at evaluating the effectiveness of the methodology in discovering biologically meaningful clusters.
14 schema:genre article
15 schema:inLanguage en
16 schema:isAccessibleForFree true
17 schema:isPartOf Nedbe4e123fbf4b5b8850997ba7dc2e0c
18 Nf963abe1f15a4a1a874970b19cfc6f1d
19 sg:journal.1125588
20 schema:keywords algorithm
21 analysis approach
22 approach
23 boundaries
24 class discovery
25 cluster number
26 clustering
27 clustering algorithm
28 clusters
29 conditions
30 conjunction
31 consensus
32 consensus clustering
33 data
34 discovery
35 effectiveness
36 experiments
37 expression data
38 expression microarray data
39 gene expression data
40 gene expression microarray data
41 initial conditions
42 meaningful clusters
43 membership
44 method
45 methodology
46 methodology consensus clustering
47 microarray data
48 multiple runs
49 new methodology
50 new methodology consensus clustering
51 number
52 paper
53 random restarts
54 range
55 real gene expression data
56 resampling
57 restart
58 results
59 run
60 sensitivity
61 stability
62 task
63 technique
64 tool
65 use
66 validation
67 visualization
68 visualization tool
69 wide range
70 schema:name Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data
71 schema:pagination 91-118
72 schema:productId N3004088b118a43178f1c2273ace97d38
73 Naf4e095130064036b4f7de8589ae0462
74 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036378730
75 https://doi.org/10.1023/a:1023949509487
76 schema:sdDatePublished 2022-01-01T18:13
77 schema:sdLicense https://scigraph.springernature.com/explorer/license/
78 schema:sdPublisher Nbcce49899a69446498fc995b314a9c62
79 schema:url https://doi.org/10.1023/a:1023949509487
80 sgo:license sg:explorer/license/
81 sgo:sdDataset articles
82 rdf:type schema:ScholarlyArticle
83 N3004088b118a43178f1c2273ace97d38 schema:name dimensions_id
84 schema:value pub.1036378730
85 rdf:type schema:PropertyValue
86 N86afbbb8d3704a2783a691c0ebfa683b rdf:first sg:person.01152644406.14
87 rdf:rest rdf:nil
88 Na6ae6fa4163540cc8f8de4bad8de2c72 rdf:first sg:person.0756245403.69
89 rdf:rest Ncb81d52381144974bec7fc8a331cde89
90 Naf4e095130064036b4f7de8589ae0462 schema:name doi
91 schema:value 10.1023/a:1023949509487
92 rdf:type schema:PropertyValue
93 Nbcce49899a69446498fc995b314a9c62 schema:name Springer Nature - SN SciGraph project
94 rdf:type schema:Organization
95 Ncb81d52381144974bec7fc8a331cde89 rdf:first sg:person.0766253043.80
96 rdf:rest N86afbbb8d3704a2783a691c0ebfa683b
97 Nd593d82bdc7c45fe9c6912f5aad1bb86 rdf:first sg:person.07563510377.41
98 rdf:rest Na6ae6fa4163540cc8f8de4bad8de2c72
99 Nedbe4e123fbf4b5b8850997ba7dc2e0c schema:volumeNumber 52
100 rdf:type schema:PublicationVolume
101 Nf963abe1f15a4a1a874970b19cfc6f1d schema:issueNumber 1-2
102 rdf:type schema:PublicationIssue
103 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
104 schema:name Information and Computing Sciences
105 rdf:type schema:DefinedTerm
106 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
107 schema:name Artificial Intelligence and Image Processing
108 rdf:type schema:DefinedTerm
109 sg:journal.1125588 schema:issn 0885-6125
110 1573-0565
111 schema:name Machine Learning
112 schema:publisher Springer Nature
113 rdf:type schema:Periodical
114 sg:person.01152644406.14 schema:affiliation grid-institutes:grid.270301.7
115 schema:familyName Golub
116 schema:givenName Todd
117 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01152644406.14
118 rdf:type schema:Person
119 sg:person.0756245403.69 schema:affiliation grid-institutes:grid.270301.7
120 schema:familyName Tamayo
121 schema:givenName Pablo
122 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0756245403.69
123 rdf:type schema:Person
124 sg:person.07563510377.41 schema:affiliation grid-institutes:grid.270301.7
125 schema:familyName Monti
126 schema:givenName Stefano
127 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07563510377.41
128 rdf:type schema:Person
129 sg:person.0766253043.80 schema:affiliation grid-institutes:grid.270301.7
130 schema:familyName Mesirov
131 schema:givenName Jill
132 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0766253043.80
133 rdf:type schema:Person
134 sg:pub.10.1007/3-540-45784-4_39 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051759094
135 https://doi.org/10.1007/3-540-45784-4_39
136 rdf:type schema:CreativeWork
137 sg:pub.10.1007/bf01908065 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043555994
138 https://doi.org/10.1007/bf01908065
139 rdf:type schema:CreativeWork
140 sg:pub.10.1007/bf01908075 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022323983
141 https://doi.org/10.1007/bf01908075
142 rdf:type schema:CreativeWork
143 sg:pub.10.1007/bf02294245 schema:sameAs https://app.dimensions.ai/details/publication/pub.1012945757
144 https://doi.org/10.1007/bf02294245
145 rdf:type schema:CreativeWork
146 sg:pub.10.1023/a:1007469629108 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044802083
147 https://doi.org/10.1023/a:1007469629108
148 rdf:type schema:CreativeWork
149 sg:pub.10.1038/415436a schema:sameAs https://app.dimensions.ai/details/publication/pub.1019207421
150 https://doi.org/10.1038/415436a
151 rdf:type schema:CreativeWork
152 sg:pub.10.1186/gb-2002-3-7-research0036 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041996792
153 https://doi.org/10.1186/gb-2002-3-7-research0036
154 rdf:type schema:CreativeWork
155 grid-institutes:grid.270301.7 schema:alternateName Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA
156 schema:name Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA
157 rdf:type schema:Organization
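
In the table above, author order is encoded as an RDF list: `schema:author` points to a blank node, and each blank node carries an `rdf:first` item and an `rdf:rest` link to the next node until `rdf:nil`. The sketch below walks that chain using the node identifiers from rows 3 and 86 to 98 and the family names from rows 114 to 133. The `walk_rdf_list` helper and the dictionary layout are assumptions made for illustration.

```python
RDF_NIL = "rdf:nil"

# Blank node -> (rdf:first, rdf:rest), copied from the triples table.
CELLS = {
    "Nd593d82bdc7c45fe9c6912f5aad1bb86": ("sg:person.07563510377.41", "Na6ae6fa4163540cc8f8de4bad8de2c72"),
    "Na6ae6fa4163540cc8f8de4bad8de2c72": ("sg:person.0756245403.69", "Ncb81d52381144974bec7fc8a331cde89"),
    "Ncb81d52381144974bec7fc8a331cde89": ("sg:person.0766253043.80", "N86afbbb8d3704a2783a691c0ebfa683b"),
    "N86afbbb8d3704a2783a691c0ebfa683b": ("sg:person.01152644406.14", RDF_NIL),
}

# Object of the schema:author triple, i.e. the head of the list.
AUTHOR_LIST_HEAD = "Nd593d82bdc7c45fe9c6912f5aad1bb86"

# schema:familyName values for each person node.
FAMILY_NAMES = {
    "sg:person.07563510377.41": "Monti",
    "sg:person.0756245403.69": "Tamayo",
    "sg:person.0766253043.80": "Mesirov",
    "sg:person.01152644406.14": "Golub",
}


def walk_rdf_list(head, cells):
    """Follow rdf:rest links from head, collecting each rdf:first item in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


ordered_people = walk_rdf_list(AUTHOR_LIST_HEAD, CELLS)
ordered_names = [FAMILY_NAMES[person] for person in ordered_people]
```

The recovered order matches the author array in the JSON-LD record, which is the reason the triples use a linked list rather than four unordered `schema:author` statements.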
 



