Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2003-07

AUTHORS

Stefano Monti, Pablo Tamayo, Jill Mesirov, Todd Golub

ABSTRACT

In this paper we present a new methodology of class discovery and clustering validation tailored to the task of analyzing gene expression data. The method can best be thought of as an analysis approach, to guide and assist in the use of any of a wide range of available clustering algorithms. We call the new methodology consensus clustering, and in conjunction with resampling techniques, it provides for a method to represent the consensus across multiple runs of a clustering algorithm and to assess the stability of the discovered clusters. The method can also be used to represent the consensus over multiple runs of a clustering algorithm with random restart (such as K-means, model-based Bayesian clustering, SOM, etc.), so as to account for its sensitivity to the initial conditions. Finally, it provides for a visualization tool to inspect cluster number, membership, and boundaries. We present the results of our experiments on both simulated data and real gene expression data aimed at evaluating the effectiveness of the methodology in discovering biologically meaningful clusters.
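
The idea of representing the consensus across multiple resampled runs can be illustrated with a small sketch. This is not the authors' implementation: the toy one-dimensional two-cluster K-means, the function names `two_means_1d` and `consensus_matrix`, the subsampling fraction, and the number of runs are all assumptions chosen for illustration. Entry (i, j) of the resulting matrix is the fraction of runs in which items i and j fell in the same cluster, counted only over runs where both were sampled.

```python
import random


def two_means_1d(values, rng, max_iter=20):
    """Toy 1-D K-means with k=2 and a random start; returns a 0/1 label per value."""
    centers = rng.sample(values, 2)  # random restart: two data points as centers
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assign each value to its nearest center.
        new_labels = [min((0, 1), key=lambda c: abs(v - centers[c])) for v in values]
        # Move each center to the mean of its members.
        for c in (0, 1):
            members = [v for v, lab in zip(values, new_labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = sum(members) / len(members)
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels


def consensus_matrix(values, runs=50, fraction=0.8, seed=0):
    """Fraction of co-sampled runs in which each pair of items shares a cluster."""
    rng = random.Random(seed)
    n = len(values)
    together = [[0] * n for _ in range(n)]  # times i and j shared a cluster
    sampled = [[0] * n for _ in range(n)]   # times i and j were both drawn
    size = max(2, int(fraction * n))
    for _ in range(runs):
        idx = rng.sample(range(n), size)    # subsample without replacement
        labels = two_means_1d([values[i] for i in idx], rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


# Two well-separated groups of three items each (made-up data).
data = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
M = consensus_matrix(data)
```

For clearly separated groups like these, within-group entries of `M` are expected to sit at 1 and cross-group entries at 0; values in between would indicate pairs whose co-membership is unstable under resampling.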

PAGES

91-118

Identifiers

URI

http://scigraph.springernature.com/pub.10.1023/a:1023949509487

DOI

http://dx.doi.org/10.1023/a:1023949509487

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1036378730



JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service such as JSON-LD Playground or Google SDTT.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.270301.7", 
          "name": [
            "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Monti", 
        "givenName": "Stefano", 
        "id": "sg:person.07563510377.41", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07563510377.41"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.270301.7", 
          "name": [
            "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Tamayo", 
        "givenName": "Pablo", 
        "id": "sg:person.0756245403.69", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0756245403.69"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.270301.7", 
          "name": [
            "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Mesirov", 
        "givenName": "Jill", 
        "id": "sg:person.0766253043.80", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0766253043.80"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA", 
          "id": "http://www.grid.ac/institutes/grid.270301.7", 
          "name": [
            "Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Golub", 
        "givenName": "Todd", 
        "id": "sg:person.01152644406.14", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01152644406.14"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/bf01908075", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022323983", 
          "https://doi.org/10.1007/bf01908075"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/3-540-45784-4_39", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1051759094", 
          "https://doi.org/10.1007/3-540-45784-4_39"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1007469629108", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044802083", 
          "https://doi.org/10.1023/a:1007469629108"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf02294245", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1012945757", 
          "https://doi.org/10.1007/bf02294245"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/415436a", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019207421", 
          "https://doi.org/10.1038/415436a"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf01908065", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043555994", 
          "https://doi.org/10.1007/bf01908065"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/gb-2002-3-7-research0036", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041996792", 
          "https://doi.org/10.1186/gb-2002-3-7-research0036"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2003-07", 
    "datePublishedReg": "2003-07-01", 
    "description": "In this paper we present a new methodology of class discovery and clustering validation tailored to the task of analyzing gene expression data. The method can best be thought of as an analysis approach, to guide and assist in the use of any of a wide range of available clustering algorithms. We call the new methodology consensus clustering, and in conjunction with resampling techniques, it provides for a method to represent the consensus across multiple runs of a clustering algorithm and to assess the stability of the discovered clusters. The method can also be used to represent the consensus over multiple runs of a clustering algorithm with random restart (such as K-means, model-based Bayesian clustering, SOM, etc.), so as to account for its sensitivity to the initial conditions. Finally, it provides for a visualization tool to inspect cluster number, membership, and boundaries. We present the results of our experiments on both simulated data and real gene expression data aimed at evaluating the effectiveness of the methodology in discovering biologically meaningful clusters.", 
    "genre": "article", 
    "id": "sg:pub.10.1023/a:1023949509487", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1125588", 
        "issn": [
          "0885-6125", 
          "1573-0565"
        ], 
        "name": "Machine Learning", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1-2", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "52"
      }
    ], 
    "keywords": [
      "gene expression data", 
      "real gene expression data", 
      "class discovery", 
      "multiple runs", 
      "gene expression microarray data", 
      "random restarts", 
      "expression data", 
      "clustering algorithm", 
      "visualization tool", 
      "expression microarray data", 
      "consensus clustering", 
      "initial conditions", 
      "meaningful clusters", 
      "cluster number", 
      "algorithm", 
      "new methodology", 
      "microarray data", 
      "analysis approach", 
      "methodology", 
      "clustering", 
      "resampling", 
      "task", 
      "visualization", 
      "restart", 
      "method", 
      "clusters", 
      "data", 
      "discovery", 
      "tool", 
      "effectiveness", 
      "wide range", 
      "approach", 
      "technique", 
      "stability", 
      "run", 
      "validation", 
      "boundaries", 
      "experiments", 
      "number", 
      "conditions", 
      "consensus", 
      "membership", 
      "results", 
      "use", 
      "conjunction", 
      "range", 
      "sensitivity", 
      "paper", 
      "new methodology consensus clustering", 
      "methodology consensus clustering"
    ], 
    "name": "Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data", 
    "pagination": "91-118", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1036378730"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1023/a:1023949509487"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1023/a:1023949509487", 
      "https://app.dimensions.ai/details/publication/pub.1036378730"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-01-01T18:13", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/article/article_374.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1023/a:1023949509487"
  }
]
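
Because the record is plain JSON, it can be read with any JSON parser. The sketch below shows one way to pull a citation-style summary out of it. The `summarize` helper and the trimmed `RECORD_JSON` string are illustrative assumptions; the field names and values are copied from the record above.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Monti", "givenName": "Stefano"},
      {"familyName": "Tamayo", "givenName": "Pablo"},
      {"familyName": "Mesirov", "givenName": "Jill"},
      {"familyName": "Golub", "givenName": "Todd"}
    ],
    "datePublished": "2003-07",
    "isPartOf": [
      {"id": "sg:journal.1125588", "name": "Machine Learning", "type": "Periodical"},
      {"issueNumber": "1-2", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "52"}
    ],
    "name": "Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data",
    "pagination": "91-118",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1036378730"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1023/a:1023949509487"]}
    ]
  }
]
"""


def summarize(record):
    """Collect the citation fields of one SciGraph article record (hypothetical helper)."""
    # isPartOf mixes journal, issue, and volume entries; index them by their "type".
    parts = {p["type"]: p for p in record.get("isPartOf", [])}
    # productId entries are name/value pairs; take the first value of each.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    return {
        "title": record["name"],
        "authors": [f'{a["givenName"]} {a["familyName"]}' for a in record["author"]],
        "journal": parts.get("Periodical", {}).get("name"),
        "volume": parts.get("PublicationVolume", {}).get("volumeNumber"),
        "issue": parts.get("PublicationIssue", {}).get("issueNumber"),
        "pages": record.get("pagination"),
        "date": record.get("datePublished"),
        "doi": ids.get("doi"),
    }


# The top level of the document is a one-element array.
summary = summarize(json.loads(RECORD_JSON)[0])
```

Note that the JSON-LD `author` array preserves author order directly, whereas the RDF triples encode the same order as a linked list.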
 

Download the RDF metadata as: json-ld, nt, turtle, xml.

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1023/a:1023949509487'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1023/a:1023949509487'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1023/a:1023949509487'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1023/a:1023949509487'
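
The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The helper names `build_request` and `fetch` and the short format keys are assumptions made for this example, and the sketch presumes the endpoint still answers as described above.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1023/a:1023949509487"

# Media types taken from the curl commands above, keyed by a short format name.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for the given RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch("turtle")` would then correspond to the Turtle curl command; building the request separately makes the header easy to inspect without touching the network.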


 

This table displays all metadata directly associated with this object as RDF triples.

157 TRIPLES      22 PREDICATES      83 URIs      68 LITERALS      6 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1023/a:1023949509487 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N1dd0bc1adea94bd6931afa563fded5ee
4 schema:citation sg:pub.10.1007/3-540-45784-4_39
5 sg:pub.10.1007/bf01908065
6 sg:pub.10.1007/bf01908075
7 sg:pub.10.1007/bf02294245
8 sg:pub.10.1023/a:1007469629108
9 sg:pub.10.1038/415436a
10 sg:pub.10.1186/gb-2002-3-7-research0036
11 schema:datePublished 2003-07
12 schema:datePublishedReg 2003-07-01
13 schema:description In this paper we present a new methodology of class discovery and clustering validation tailored to the task of analyzing gene expression data. The method can best be thought of as an analysis approach, to guide and assist in the use of any of a wide range of available clustering algorithms. We call the new methodology consensus clustering, and in conjunction with resampling techniques, it provides for a method to represent the consensus across multiple runs of a clustering algorithm and to assess the stability of the discovered clusters. The method can also be used to represent the consensus over multiple runs of a clustering algorithm with random restart (such as K-means, model-based Bayesian clustering, SOM, etc.), so as to account for its sensitivity to the initial conditions. Finally, it provides for a visualization tool to inspect cluster number, membership, and boundaries. We present the results of our experiments on both simulated data and real gene expression data aimed at evaluating the effectiveness of the methodology in discovering biologically meaningful clusters.
14 schema:genre article
15 schema:inLanguage en
16 schema:isAccessibleForFree true
17 schema:isPartOf Na4d8afc97eb8468fbb45d75f4d5f7de3
18 Nd5c2fe0877fe462099e07e918fb29901
19 sg:journal.1125588
20 schema:keywords algorithm
21 analysis approach
22 approach
23 boundaries
24 class discovery
25 cluster number
26 clustering
27 clustering algorithm
28 clusters
29 conditions
30 conjunction
31 consensus
32 consensus clustering
33 data
34 discovery
35 effectiveness
36 experiments
37 expression data
38 expression microarray data
39 gene expression data
40 gene expression microarray data
41 initial conditions
42 meaningful clusters
43 membership
44 method
45 methodology
46 methodology consensus clustering
47 microarray data
48 multiple runs
49 new methodology
50 new methodology consensus clustering
51 number
52 paper
53 random restarts
54 range
55 real gene expression data
56 resampling
57 restart
58 results
59 run
60 sensitivity
61 stability
62 task
63 technique
64 tool
65 use
66 validation
67 visualization
68 visualization tool
69 wide range
70 schema:name Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data
71 schema:pagination 91-118
72 schema:productId N8d6ef72c35fa4f84a82a590a6dd7ad9a
73 Nc7048765b9d44418b8ec0be2401467a2
74 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036378730
75 https://doi.org/10.1023/a:1023949509487
76 schema:sdDatePublished 2022-01-01T18:13
77 schema:sdLicense https://scigraph.springernature.com/explorer/license/
78 schema:sdPublisher N3f448bfc8e2048fab10a1e5b36012895
79 schema:url https://doi.org/10.1023/a:1023949509487
80 sgo:license sg:explorer/license/
81 sgo:sdDataset articles
82 rdf:type schema:ScholarlyArticle
83 N1dd0bc1adea94bd6931afa563fded5ee rdf:first sg:person.07563510377.41
84 rdf:rest N42dbcefd2ff347408f87882cb8c9c512
85 N3601e3dd37e64d9185e8cf64409c07eb rdf:first sg:person.01152644406.14
86 rdf:rest rdf:nil
87 N3f448bfc8e2048fab10a1e5b36012895 schema:name Springer Nature - SN SciGraph project
88 rdf:type schema:Organization
89 N42dbcefd2ff347408f87882cb8c9c512 rdf:first sg:person.0756245403.69
90 rdf:rest Nef18b01b0df84228ac284ce8d9942044
91 N8d6ef72c35fa4f84a82a590a6dd7ad9a schema:name dimensions_id
92 schema:value pub.1036378730
93 rdf:type schema:PropertyValue
94 Na4d8afc97eb8468fbb45d75f4d5f7de3 schema:volumeNumber 52
95 rdf:type schema:PublicationVolume
96 Nc7048765b9d44418b8ec0be2401467a2 schema:name doi
97 schema:value 10.1023/a:1023949509487
98 rdf:type schema:PropertyValue
99 Nd5c2fe0877fe462099e07e918fb29901 schema:issueNumber 1-2
100 rdf:type schema:PublicationIssue
101 Nef18b01b0df84228ac284ce8d9942044 rdf:first sg:person.0766253043.80
102 rdf:rest N3601e3dd37e64d9185e8cf64409c07eb
103 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
104 schema:name Information and Computing Sciences
105 rdf:type schema:DefinedTerm
106 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
107 schema:name Artificial Intelligence and Image Processing
108 rdf:type schema:DefinedTerm
109 sg:journal.1125588 schema:issn 0885-6125
110 1573-0565
111 schema:name Machine Learning
112 schema:publisher Springer Nature
113 rdf:type schema:Periodical
114 sg:person.01152644406.14 schema:affiliation grid-institutes:grid.270301.7
115 schema:familyName Golub
116 schema:givenName Todd
117 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01152644406.14
118 rdf:type schema:Person
119 sg:person.0756245403.69 schema:affiliation grid-institutes:grid.270301.7
120 schema:familyName Tamayo
121 schema:givenName Pablo
122 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0756245403.69
123 rdf:type schema:Person
124 sg:person.07563510377.41 schema:affiliation grid-institutes:grid.270301.7
125 schema:familyName Monti
126 schema:givenName Stefano
127 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07563510377.41
128 rdf:type schema:Person
129 sg:person.0766253043.80 schema:affiliation grid-institutes:grid.270301.7
130 schema:familyName Mesirov
131 schema:givenName Jill
132 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0766253043.80
133 rdf:type schema:Person
134 sg:pub.10.1007/3-540-45784-4_39 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051759094
135 https://doi.org/10.1007/3-540-45784-4_39
136 rdf:type schema:CreativeWork
137 sg:pub.10.1007/bf01908065 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043555994
138 https://doi.org/10.1007/bf01908065
139 rdf:type schema:CreativeWork
140 sg:pub.10.1007/bf01908075 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022323983
141 https://doi.org/10.1007/bf01908075
142 rdf:type schema:CreativeWork
143 sg:pub.10.1007/bf02294245 schema:sameAs https://app.dimensions.ai/details/publication/pub.1012945757
144 https://doi.org/10.1007/bf02294245
145 rdf:type schema:CreativeWork
146 sg:pub.10.1023/a:1007469629108 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044802083
147 https://doi.org/10.1023/a:1007469629108
148 rdf:type schema:CreativeWork
149 sg:pub.10.1038/415436a schema:sameAs https://app.dimensions.ai/details/publication/pub.1019207421
150 https://doi.org/10.1038/415436a
151 rdf:type schema:CreativeWork
152 sg:pub.10.1186/gb-2002-3-7-research0036 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041996792
153 https://doi.org/10.1186/gb-2002-3-7-research0036
154 rdf:type schema:CreativeWork
155 grid-institutes:grid.270301.7 schema:alternateName Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA
156 schema:name Whitehead Institute/MIT Center for Genome Research, One Kendall Square, 02139, Cambridge, MA, USA
157 rdf:type schema:Organization
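
In the table, the author order is stored as an RDF list: `schema:author` points at a blank node, and each blank node carries an `rdf:first` item and an `rdf:rest` link until `rdf:nil` (rows 3 and 83 to 102). A minimal sketch of walking that chain follows. The dictionaries are hand-copied from those rows, and `walk_rdf_list` is a hypothetical helper, not part of any SciGraph tooling.

```python
RDF_NIL = "rdf:nil"

# Row 3: the object of schema:author is the head of the list.
HEAD = "N1dd0bc1adea94bd6931afa563fded5ee"

# Rows 83-102: blank node -> (rdf:first, rdf:rest).
LIST_NODES = {
    "N1dd0bc1adea94bd6931afa563fded5ee": ("sg:person.07563510377.41", "N42dbcefd2ff347408f87882cb8c9c512"),
    "N3601e3dd37e64d9185e8cf64409c07eb": ("sg:person.01152644406.14", RDF_NIL),
    "N42dbcefd2ff347408f87882cb8c9c512": ("sg:person.0756245403.69", "Nef18b01b0df84228ac284ce8d9942044"),
    "Nef18b01b0df84228ac284ce8d9942044": ("sg:person.0766253043.80", "N3601e3dd37e64d9185e8cf64409c07eb"),
}

# Rows 114-133: schema:familyName for each person.
FAMILY_NAME = {
    "sg:person.01152644406.14": "Golub",
    "sg:person.0756245403.69": "Tamayo",
    "sg:person.07563510377.41": "Monti",
    "sg:person.0766253043.80": "Mesirov",
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head to rdf:nil, collecting the rdf:first items."""
    items = []
    seen = set()
    while head != RDF_NIL:
        if head in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {head}")
        seen.add(head)
        first, rest = nodes[head]
        items.append(first)
        head = rest
    return items


authors_in_order = [FAMILY_NAME[p] for p in walk_rdf_list(HEAD, LIST_NODES)]
```

The traversal recovers the same author order as the AUTHORS field at the top of the record, even though the table itself lists the blank nodes alphabetically.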
 



