Simulated annealing for supervised gene selection


Ontology type: schema:ScholarlyArticle     


Article Info

DATE

2010-03-31

AUTHORS

Maurizio Filippone, Francesco Masulli, Stefano Rovetta

ABSTRACT

Genomic data, and more generally biomedical data, are often characterized by high dimensionality. An input selection procedure can attain the two objectives of highlighting the relevant variables (genes) and possibly improving classification results. In this paper, we propose a wrapper approach to gene selection in classification of gene expression data using simulated annealing along with supervised classification. The proposed approach can perform global combinatorial searches through the space of all possible input subsets, can handle cases with numerical, categorical or mixed inputs, and is able to find (sub-)optimal subsets of inputs giving low classification errors. The method has been tested on publicly available bioinformatics data sets using support vector machines and on a mixed type data set using classification trees. We also propose some heuristics able to speed up the convergence. The experimental results highlight the ability of the method to select minimal sets of relevant features.
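
The wrapper idea in the abstract (simulated annealing proposing input subsets, a classifier scoring them) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the single-flip neighbourhood, the geometric cooling schedule, and the toy cost that stands in for a classifier's error are all assumptions made for illustration.

```python
import math
import random

def anneal_subset(n_features, cost, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Search feature subsets by simulated annealing; return (best_mask, best_cost)."""
    rng = random.Random(seed)
    mask = [rng.random() < 0.5 for _ in range(n_features)]
    current = cost(mask)
    best_mask, best = mask[:], current
    t = t0
    for _ in range(steps):
        # Propose a neighbour by flipping one feature in or out of the subset.
        i = rng.randrange(n_features)
        mask[i] = not mask[i]
        candidate = cost(mask)
        delta = candidate - current
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = candidate
            if current < best:
                best_mask, best = mask[:], current
        else:
            mask[i] = not mask[i]  # rejected: undo the flip
        t *= cooling  # geometric cooling (an assumption, not from the paper)
    return best_mask, best

# Toy stand-in for a classifier's validation error (hypothetical): features
# 2 and 5 are "relevant"; each missing one costs 1.0, each extra one costs 0.1.
RELEVANT = {2, 5}

def toy_cost(mask):
    chosen = {i for i, on in enumerate(mask) if on}
    return len(RELEVANT - chosen) * 1.0 + len(chosen - RELEVANT) * 0.1

best_mask, best_cost = anneal_subset(10, toy_cost)
selected = [i for i, on in enumerate(best_mask) if on]
print(selected, best_cost)
```

In a real wrapper, `toy_cost` would be replaced by the cross-validated error of a classifier trained on the selected inputs, which is what makes each step expensive.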

PAGES

1471-1482


Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/s00500-010-0597-8

DOI

http://dx.doi.org/10.1007/s00500-010-0597-8

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1021001908



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/17", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Psychology and Cognitive Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0102", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Applied Mathematics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/1702", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Cognitive Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department of Computing Science, University of Glasgow, Sir Alwyn Williams Building, G12 8QQ, Glasgow, UK", 
          "id": "http://www.grid.ac/institutes/grid.8756.c", 
          "name": [
            "Department of Computing Science, University of Glasgow, Sir Alwyn Williams Building, G12 8QQ, Glasgow, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Filippone", 
        "givenName": "Maurizio", 
        "id": "sg:person.07706215665.03", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07706215665.03"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Sbarro Institute for Cancer Research and Molecular Medicine, Center for Biotechnology, Temple University, Philadelphia, PA, USA", 
          "id": "http://www.grid.ac/institutes/grid.264727.2", 
          "name": [
            "Department of Computer and Information Sciences, University of Genova, Genoa, Italy", 
            "CNISM Genova Research Unit, Genoa, Italy", 
            "Sbarro Institute for Cancer Research and Molecular Medicine, Center for Biotechnology, Temple University, Philadelphia, PA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Masulli", 
        "givenName": "Francesco", 
        "id": "sg:person.013052261502.67", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013052261502.67"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "CNISM Genova Research Unit, Genoa, Italy", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Department of Computer and Information Sciences, University of Genova, Genoa, Italy", 
            "CNISM Genova Research Unit, Genoa, Italy"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Rovetta", 
        "givenName": "Stefano", 
        "id": "sg:person.015767137221.48", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015767137221.48"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/11676935_28", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036239508", 
          "https://doi.org/10.1007/11676935_28"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1012487302797", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048573168", 
          "https://doi.org/10.1023/a:1012487302797"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00994018", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1025150743", 
          "https://doi.org/10.1007/bf00994018"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1008641220268", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033779583", 
          "https://doi.org/10.1023/a:1008641220268"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-1-4757-2440-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1027312764", 
          "https://doi.org/10.1007/978-1-4757-2440-0"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/3-540-57868-4_57", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1029649421", 
          "https://doi.org/10.1007/3-540-57868-4_57"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2010-03-31", 
    "datePublishedReg": "2010-03-31", 
    "description": "Genomic data, and more generally biomedical data, are often characterized by high dimensionality. An input selection procedure can attain the two objectives of highlighting the relevant variables (genes) and possibly improving classification results. In this paper, we propose a wrapper approach to gene selection in classification of gene expression data using simulated annealing along with supervised classification. The proposed approach can perform global combinatorial searches through the space of all possible input subsets, can handle cases with numerical, categorical or mixed inputs, and is able to find (sub-)optimal subsets of inputs giving low classification errors. The method has been tested on publicly available bioinformatics data sets using support vector machines and on a mixed type data set using classification trees. We also propose some heuristics able to speed up the convergence. The experimental results highlight the ability of the method to select minimal sets of relevant features.", 
    "genre": "article", 
    "id": "sg:pub.10.1007/s00500-010-0597-8", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.6852919", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1050238", 
        "issn": [
          "1432-7643", 
          "1433-7479"
        ], 
        "name": "Soft Computing", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "8", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "15"
      }
    ], 
    "keywords": [
      "supervised gene selection", 
      "gene expression data", 
      "genomic data", 
      "expression data", 
      "possible input subsets", 
      "gene selection", 
      "global combinatorial search", 
      "bioinformatics data sets", 
      "selection", 
      "trees", 
      "minimal set", 
      "selection procedure", 
      "subset", 
      "type data", 
      "ability", 
      "data sets", 
      "classification trees", 
      "data", 
      "input selection procedure", 
      "mixed inputs", 
      "results", 
      "mixed type data", 
      "set", 
      "approach", 
      "features", 
      "classification", 
      "input", 
      "search", 
      "biomedical data", 
      "method", 
      "lower classification error", 
      "objective", 
      "supervised classification", 
      "relevant features", 
      "space", 
      "high dimensionality", 
      "combinatorial search", 
      "variables", 
      "support vector machine", 
      "convergence", 
      "cases", 
      "procedure", 
      "vector machine", 
      "input subset", 
      "relevant variables", 
      "subset of inputs", 
      "wrapper approach", 
      "machine", 
      "dimensionality", 
      "classification error", 
      "error", 
      "classification results", 
      "annealing", 
      "paper", 
      "experimental results", 
      "heuristics", 
      "available bioinformatics data sets"
    ], 
    "name": "Simulated annealing for supervised gene selection", 
    "pagination": "1471-1482", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1021001908"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/s00500-010-0597-8"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1007/s00500-010-0597-8", 
      "https://app.dimensions.ai/details/publication/pub.1021001908"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-01-01T18:24", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/article/article_523.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1007/s00500-010-0597-8"
  }
]
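
As a rough illustration of how the record above can be consumed, the sketch below loads a trimmed copy of it with Python's standard `json` module and pulls out the title, the author names, and the DOI. The helper name `summarize` and the choice of fields are assumptions for illustration; only keys that appear in the record are read.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
record_text = """
[
  {
    "name": "Simulated annealing for supervised gene selection",
    "datePublished": "2010-03-31",
    "pagination": "1471-1482",
    "author": [
      {"familyName": "Filippone", "givenName": "Maurizio"},
      {"familyName": "Masulli", "givenName": "Francesco"},
      {"familyName": "Rovetta", "givenName": "Stefano"}
    ],
    "productId": [
      {"name": "dimensions_id", "value": ["pub.1021001908"]},
      {"name": "doi", "value": ["10.1007/s00500-010-0597-8"]}
    ]
  }
]
"""

def summarize(text):
    """Return title, author names and DOI from a SciGraph-style JSON-LD array."""
    article = json.loads(text)[0]  # the record is a one-element array
    authors = [f"{a['givenName']} {a['familyName']}" for a in article["author"]]
    # productId is a list of name/value pairs; each value is itself a list.
    ids = {p["name"]: p["value"][0] for p in article["productId"]}
    return {"title": article["name"], "authors": authors, "doi": ids.get("doi")}

summary = summarize(record_text)
print(summary)
```

Because JSON-LD is plain JSON, no RDF library is needed for this kind of field lookup; one would only be needed to expand the `@context` into full URIs.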
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/s00500-010-0597-8'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/s00500-010-0597-8'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/s00500-010-0597-8'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/s00500-010-0597-8'
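
The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard `urllib.request` module. The helper names `build_request` and `fetch` and the short format labels are hypothetical; the media types and the URL are copied from the commands above. Whether the endpoint still answers has not been verified here, so the example only builds the request and leaves sending it to the caller.

```python
import urllib.request

BASE = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above; the keys are our own labels.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}

def build_request(record_id, fmt="json-ld"):
    """Build (but do not send) a GET request for one record in one RDF format."""
    return urllib.request.Request(BASE + record_id,
                                  headers={"Accept": MEDIA_TYPES[fmt]})

def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")

req = build_request("pub.10.1007/s00500-010-0597-8", "turtle")
print(req.full_url, req.get_header("Accept"))
```

Calling `fetch("pub.10.1007/s00500-010-0597-8", "nt")` would perform the equivalent of the N-Triples curl command.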


 

This table displays all metadata directly associated with this object as RDF triples.

180 TRIPLES      22 PREDICATES      92 URIs      74 LITERALS      6 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/s00500-010-0597-8 schema:about anzsrc-for:01
2 anzsrc-for:0102
3 anzsrc-for:08
4 anzsrc-for:0801
5 anzsrc-for:17
6 anzsrc-for:1702
7 schema:author N559e4f4820c34860b5d33b43cad3b9cf
8 schema:citation sg:pub.10.1007/11676935_28
9 sg:pub.10.1007/3-540-57868-4_57
10 sg:pub.10.1007/978-1-4757-2440-0
11 sg:pub.10.1007/bf00994018
12 sg:pub.10.1023/a:1008641220268
13 sg:pub.10.1023/a:1012487302797
14 schema:datePublished 2010-03-31
15 schema:datePublishedReg 2010-03-31
16 schema:description Genomic data, and more generally biomedical data, are often characterized by high dimensionality. An input selection procedure can attain the two objectives of highlighting the relevant variables (genes) and possibly improving classification results. In this paper, we propose a wrapper approach to gene selection in classification of gene expression data using simulated annealing along with supervised classification. The proposed approach can perform global combinatorial searches through the space of all possible input subsets, can handle cases with numerical, categorical or mixed inputs, and is able to find (sub-)optimal subsets of inputs giving low classification errors. The method has been tested on publicly available bioinformatics data sets using support vector machines and on a mixed type data set using classification trees. We also propose some heuristics able to speed up the convergence. The experimental results highlight the ability of the method to select minimal sets of relevant features.
17 schema:genre article
18 schema:inLanguage en
19 schema:isAccessibleForFree false
20 schema:isPartOf N4c3c9c3945904815a2fa3145362a7b1f
21 Na42f02da4a704c3699aeb7367b5288f3
22 sg:journal.1050238
23 schema:keywords ability
24 annealing
25 approach
26 available bioinformatics data sets
27 bioinformatics data sets
28 biomedical data
29 cases
30 classification
31 classification error
32 classification results
33 classification trees
34 combinatorial search
35 convergence
36 data
37 data sets
38 dimensionality
39 error
40 experimental results
41 expression data
42 features
43 gene expression data
44 gene selection
45 genomic data
46 global combinatorial search
47 heuristics
48 high dimensionality
49 input
50 input selection procedure
51 input subset
52 lower classification error
53 machine
54 method
55 minimal set
56 mixed inputs
57 mixed type data
58 objective
59 paper
60 possible input subsets
61 procedure
62 relevant features
63 relevant variables
64 results
65 search
66 selection
67 selection procedure
68 set
69 space
70 subset
71 subset of inputs
72 supervised classification
73 supervised gene selection
74 support vector machine
75 trees
76 type data
77 variables
78 vector machine
79 wrapper approach
80 schema:name Simulated annealing for supervised gene selection
81 schema:pagination 1471-1482
82 schema:productId Nb2929ba9ecba4fd394058b66f0d129c0
83 Nee84b905563a4dd4abe8773ecffa9966
84 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021001908
85 https://doi.org/10.1007/s00500-010-0597-8
86 schema:sdDatePublished 2022-01-01T18:24
87 schema:sdLicense https://scigraph.springernature.com/explorer/license/
88 schema:sdPublisher Ne42b47232777454794096fe51cc5d3ad
89 schema:url https://doi.org/10.1007/s00500-010-0597-8
90 sgo:license sg:explorer/license/
91 sgo:sdDataset articles
92 rdf:type schema:ScholarlyArticle
93 N13c85af8bb5a4f88a0aa31cd51d59343 rdf:first sg:person.013052261502.67
94 rdf:rest N9fd02840926d4864b80d700ea8c4c3e5
95 N4c3c9c3945904815a2fa3145362a7b1f schema:issueNumber 8
96 rdf:type schema:PublicationIssue
97 N559e4f4820c34860b5d33b43cad3b9cf rdf:first sg:person.07706215665.03
98 rdf:rest N13c85af8bb5a4f88a0aa31cd51d59343
99 N9fd02840926d4864b80d700ea8c4c3e5 rdf:first sg:person.015767137221.48
100 rdf:rest rdf:nil
101 Na42f02da4a704c3699aeb7367b5288f3 schema:volumeNumber 15
102 rdf:type schema:PublicationVolume
103 Nb2929ba9ecba4fd394058b66f0d129c0 schema:name dimensions_id
104 schema:value pub.1021001908
105 rdf:type schema:PropertyValue
106 Ne42b47232777454794096fe51cc5d3ad schema:name Springer Nature - SN SciGraph project
107 rdf:type schema:Organization
108 Nee84b905563a4dd4abe8773ecffa9966 schema:name doi
109 schema:value 10.1007/s00500-010-0597-8
110 rdf:type schema:PropertyValue
111 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
112 schema:name Mathematical Sciences
113 rdf:type schema:DefinedTerm
114 anzsrc-for:0102 schema:inDefinedTermSet anzsrc-for:
115 schema:name Applied Mathematics
116 rdf:type schema:DefinedTerm
117 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
118 schema:name Information and Computing Sciences
119 rdf:type schema:DefinedTerm
120 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
121 schema:name Artificial Intelligence and Image Processing
122 rdf:type schema:DefinedTerm
123 anzsrc-for:17 schema:inDefinedTermSet anzsrc-for:
124 schema:name Psychology and Cognitive Sciences
125 rdf:type schema:DefinedTerm
126 anzsrc-for:1702 schema:inDefinedTermSet anzsrc-for:
127 schema:name Cognitive Sciences
128 rdf:type schema:DefinedTerm
129 sg:grant.6852919 http://pending.schema.org/fundedItem sg:pub.10.1007/s00500-010-0597-8
130 rdf:type schema:MonetaryGrant
131 sg:journal.1050238 schema:issn 1432-7643
132 1433-7479
133 schema:name Soft Computing
134 schema:publisher Springer Nature
135 rdf:type schema:Periodical
136 sg:person.013052261502.67 schema:affiliation grid-institutes:grid.264727.2
137 schema:familyName Masulli
138 schema:givenName Francesco
139 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013052261502.67
140 rdf:type schema:Person
141 sg:person.015767137221.48 schema:affiliation grid-institutes:None
142 schema:familyName Rovetta
143 schema:givenName Stefano
144 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015767137221.48
145 rdf:type schema:Person
146 sg:person.07706215665.03 schema:affiliation grid-institutes:grid.8756.c
147 schema:familyName Filippone
148 schema:givenName Maurizio
149 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07706215665.03
150 rdf:type schema:Person
151 sg:pub.10.1007/11676935_28 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036239508
152 https://doi.org/10.1007/11676935_28
153 rdf:type schema:CreativeWork
154 sg:pub.10.1007/3-540-57868-4_57 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029649421
155 https://doi.org/10.1007/3-540-57868-4_57
156 rdf:type schema:CreativeWork
157 sg:pub.10.1007/978-1-4757-2440-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027312764
158 https://doi.org/10.1007/978-1-4757-2440-0
159 rdf:type schema:CreativeWork
160 sg:pub.10.1007/bf00994018 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025150743
161 https://doi.org/10.1007/bf00994018
162 rdf:type schema:CreativeWork
163 sg:pub.10.1023/a:1008641220268 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033779583
164 https://doi.org/10.1023/a:1008641220268
165 rdf:type schema:CreativeWork
166 sg:pub.10.1023/a:1012487302797 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048573168
167 https://doi.org/10.1023/a:1012487302797
168 rdf:type schema:CreativeWork
169 grid-institutes:None schema:alternateName CNISM Genova Research Unit, Genoa, Italy
170 schema:name CNISM Genova Research Unit, Genoa, Italy
171 Department of Computer and Information Sciences, University of Genova, Genoa, Italy
172 rdf:type schema:Organization
173 grid-institutes:grid.264727.2 schema:alternateName Sbarro Institute for Cancer Research and Molecular Medicine, Center for Biotechnology, Temple University, Philadelphia, PA, USA
174 schema:name CNISM Genova Research Unit, Genoa, Italy
175 Department of Computer and Information Sciences, University of Genova, Genoa, Italy
176 Sbarro Institute for Cancer Research and Molecular Medicine, Center for Biotechnology, Temple University, Philadelphia, PA, USA
177 rdf:type schema:Organization
178 grid-institutes:grid.8756.c schema:alternateName Department of Computing Science, University of Glasgow, Sir Alwyn Williams Building, G12 8QQ, Glasgow, UK
179 schema:name Department of Computing Science, University of Glasgow, Sir Alwyn Williams Building, G12 8QQ, Glasgow, UK
180 rdf:type schema:Organization
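
In the table above, the author order is encoded as an RDF list: `schema:author` points at a blank node, and each blank node carries an `rdf:first` item and an `rdf:rest` pointer to the next node, ending in `rdf:nil`. The sketch below walks that chain with a plain dictionary; the identifiers are copied from rows 7 and 93 to 100, while the `cells` layout and the function name `walk_list` are assumptions for illustration.

```python
RDF_NIL = "rdf:nil"

# Blank node -> (rdf:first, rdf:rest), copied from the triples table above.
cells = {
    "N559e4f4820c34860b5d33b43cad3b9cf": (
        "sg:person.07706215665.03", "N13c85af8bb5a4f88a0aa31cd51d59343"),
    "N13c85af8bb5a4f88a0aa31cd51d59343": (
        "sg:person.013052261502.67", "N9fd02840926d4864b80d700ea8c4c3e5"),
    "N9fd02840926d4864b80d700ea8c4c3e5": (
        "sg:person.015767137221.48", RDF_NIL),
}

def walk_list(head, cells):
    """Follow rdf:rest pointers from head, collecting each rdf:first item."""
    items = []
    node = head
    while node != RDF_NIL:
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items

# The head node is the object of schema:author in row 7.
authors = walk_list("N559e4f4820c34860b5d33b43cad3b9cf", cells)
print(authors)
```

The result lists the person identifiers in the same order as the `author` array of the JSON-LD record (Filippone, Masulli, Rovetta).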
 



