SimBA: simulation algorithm to fit extant-population distributions


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2015-03-14

AUTHORS

Laxmi Parida, Niina Haiminen

ABSTRACT

Background: Simulation of populations with specified characteristics such as allele frequencies, linkage disequilibrium etc., is an integral component of many studies, including in-silico breeding optimization. Since the accuracy and sensitivity of population simulation is critical to the quality of the output of the applications that use them, accurate algorithms are required to provide a strong foundation to the methods in these studies.

Results: In this paper we present SimBA (Simulation using Best-fit Algorithm) a non-generative approach, based on a combination of stochastic techniques and discrete methods. We optimize a hill climbing algorithm and extend the framework to include multiple subpopulation structures. Additionally, we show that SimBA is very sensitive to the input specifications, i.e., very similar but distinct input characteristics result in distinct outputs with high fidelity to the specified distributions. This property of the simulation is not explicitly modeled or studied by previous methods.

Conclusions: We show that SimBA outperforms the existing population simulation methods, both in terms of accuracy as well as time-efficiency. Not only does it construct populations that meet the input specifications more stringently than other published methods, SimBA is also easy to use. It does not require explicit parameter adaptations or calibrations. Also, it can work with input specified as distributions, without an exemplar matrix or population as required by some methods. SimBA is available at http://researcher.ibm.com/project/5669.

PAGES

82

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/s12859-015-0525-0

DOI

http://dx.doi.org/10.1186/s12859-015-0525-0

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1006799034

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/25886895



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Algorithms", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Computer Simulation", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Conservation of Natural Resources", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Gene Frequency", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genetics, Population", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Humans", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Linkage Disequilibrium", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Models, Theoretical", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Population Dynamics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Computational Biology Center, IBM T. J. Watson Research, Yorktown Heights, NY, USA", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Computational Biology Center, IBM T. J. Watson Research, Yorktown Heights, NY, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Parida", 
        "givenName": "Laxmi", 
        "id": "sg:person.01336557015.68", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01336557015.68"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Computational Biology Center, IBM T. J. Watson Research, Yorktown Heights, NY, USA", 
          "id": "http://www.grid.ac/institutes/None", 
          "name": [
            "Computational Biology Center, IBM T. J. Watson Research, Yorktown Heights, NY, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Haiminen", 
        "givenName": "Niina", 
        "id": "sg:person.0746114007.76", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0746114007.76"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1007/978-3-662-44753-6_19", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1019593946", 
          "https://doi.org/10.1007/978-3-662-44753-6_19"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-1-4899-4441-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1108488398", 
          "https://doi.org/10.1007/978-1-4899-4441-2"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s13258-013-0081-9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041969888", 
          "https://doi.org/10.1007/s13258-013-0081-9"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nrg3130", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1035741028", 
          "https://doi.org/10.1038/nrg3130"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10528-011-9416-x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022978703", 
          "https://doi.org/10.1007/s10528-011-9416-x"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature02168", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033654326", 
          "https://doi.org/10.1038/nature02168"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2015-03-14", 
    "datePublishedReg": "2015-03-14", 
    "description": "BackgroundSimulation of populations with specified characteristics such as allele frequencies, linkage disequilibrium etc., is an integral component of many studies, including in-silico breeding optimization. Since the accuracy and sensitivity of population simulation is critical to the quality of the output of the applications that use them, accurate algorithms are required to provide a strong foundation to the methods in these studies.ResultsIn this paper we present SimBA (Simulation using Best-fit Algorithm) a non-generative approach, based on a combination of stochastic techniques and discrete methods. We optimize a hill climbing algorithm and extend the framework to include multiple subpopulation structures. Additionally, we show that SimBA is very sensitive to the input specifications, i.e., very similar but distinct input characteristics result in distinct outputs with high fidelity to the specified distributions. This property of the simulation is not explicitly modeled or studied by previous methods.ConclusionsWe show that SimBA outperforms the existing population simulation methods, both in terms of accuracy as well as time-efficiency. Not only does it construct populations that meet the input specifications more stringently than other published methods, SimBA is also easy to use. It does not require explicit parameter adaptations or calibrations. Also, it can work with input specified as distributions, without an exemplar matrix or population as required by some methods. SimBA is available at http://researcher.ibm.com/project/5669.", 
    "genre": "article", 
    "id": "sg:pub.10.1186/s12859-015-0525-0", 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1023786", 
        "issn": [
          "1471-2105"
        ], 
        "name": "BMC Bioinformatics", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "16"
      }
    ], 
    "keywords": [
      "stochastic techniques", 
      "parameter adaptation", 
      "non-generative approach", 
      "discrete method", 
      "simulation algorithm", 
      "accurate algorithm", 
      "terms of accuracy", 
      "simulation method", 
      "input characteristics", 
      "algorithm", 
      "input specification", 
      "previous methods", 
      "simulations", 
      "high fidelity", 
      "population simulations", 
      "distinct outputs", 
      "distribution", 
      "optimization", 
      "accuracy", 
      "SIMBA", 
      "output", 
      "subpopulation structure", 
      "matrix", 
      "specification", 
      "terms", 
      "properties", 
      "applications", 
      "framework", 
      "input", 
      "fidelity", 
      "approach", 
      "technique", 
      "structure", 
      "calibration", 
      "foundation", 
      "frequency", 
      "method", 
      "characteristics", 
      "strong foundation", 
      "components", 
      "quality", 
      "combination", 
      "integral component", 
      "adaptation", 
      "Hill", 
      "study", 
      "sensitivity", 
      "population", 
      "allele frequencies", 
      "disequilibrium", 
      "ConclusionsWe", 
      "ResultsIn", 
      "paper", 
      "BackgroundSimulation"
    ], 
    "name": "SimBA: simulation algorithm to fit extant-population distributions", 
    "pagination": "82", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1006799034"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/s12859-015-0525-0"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "25886895"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/s12859-015-0525-0", 
      "https://app.dimensions.ai/details/publication/pub.1006799034"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-09-02T15:59", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220902/entities/gbq_results/article/article_684.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1186/s12859-015-0525-0"
  }
]
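
Because the record above is ordinary JSON, it can be read with a standard JSON parser. The following is a minimal sketch in Python, assuming a trimmed copy of the record (only the `name`, `datePublished`, `author`, and `productId` fields shown above are kept); the helper name `summarize` and the layout of the returned dictionary are illustrative and not part of SciGraph.

```python
import json

# Trimmed copy of the record shown above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Parida", "givenName": "Laxmi", "type": "Person"},
      {"familyName": "Haiminen", "givenName": "Niina", "type": "Person"}
    ],
    "datePublished": "2015-03-14",
    "name": "SimBA: simulation algorithm to fit extant-population distributions",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1006799034"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/s12859-015-0525-0"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["25886895"]}
    ]
  }
]
"""


def summarize(record):
    """Collect a few human-readable fields from one record (hypothetical helper)."""
    # Each productId entry holds an identifier name and a one-element value list.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    # Authors appear in publication order in the "author" array.
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])]
    return {
        "title": record["name"],
        "date": record["datePublished"],
        "authors": authors,
        "doi": ids.get("doi"),
        "pubmed_id": ids.get("pubmed_id"),
    }


# The top-level JSON value is a list containing a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary)
```

The sketch treats the record as plain JSON and does not expand the `@context`; a dedicated JSON-LD processor would be needed to resolve terms to full IRIs.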
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/s12859-015-0525-0'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/s12859-015-0525-0'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/s12859-015-0525-0'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/s12859-015-0525-0'
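
The four curl commands above differ only in their `Accept` header. As a rough Python equivalent, the sketch below builds the same request with the standard library's `urllib.request.Request`. The short format keys (`json-ld`, `nt`, `turtle`, `xml`) and the helper name `build_request` are assumptions made for illustration; the URL and media types are taken from the commands above. The request is only constructed, not sent, and whether the endpoint still responds is not verified here.

```python
from urllib.request import Request

# Record URL and media types copied from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/s12859-015-0525-0"
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Return a GET request asking for the given serialization (hypothetical helper)."""
    # A KeyError here means the format key is not one of the four listed above.
    return Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


req = build_request("turtle")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))

# To actually fetch the data, pass the request to urllib.request.urlopen(req)
# and read the response body; that step needs network access.
```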


 

This table displays all metadata directly associated with this object as RDF triples.

181 TRIPLES      21 PREDICATES      94 URIs      80 LITERALS      16 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/s12859-015-0525-0 schema:about N06ed29f839904c3ab18c7f652ab27e5d
2 N23c866bafac44ca59eee4e637293eb99
3 N325c963e023a47a29a3c860468d035fe
4 N651cf763aeca4e2ea1370084c111247e
5 N92840247dede4aeaa89b0015d4c406d9
6 N938d1b7b3b5f4d88bc1ef2192ab971c6
7 Nb189d9ddeffa42d5a1448ca5f35ee684
8 Nd36c7a93cb6b43cda96ad9a7b91551e6
9 Ne73ce92e3afd47c7a8e643448de1fa7f
10 anzsrc-for:08
11 anzsrc-for:0801
12 schema:author N6bde4e0687d24c19b31d8ceb5a6749b9
13 schema:citation sg:pub.10.1007/978-1-4899-4441-2
14 sg:pub.10.1007/978-3-662-44753-6_19
15 sg:pub.10.1007/s10528-011-9416-x
16 sg:pub.10.1007/s13258-013-0081-9
17 sg:pub.10.1038/nature02168
18 sg:pub.10.1038/nrg3130
19 schema:datePublished 2015-03-14
20 schema:datePublishedReg 2015-03-14
21 schema:description BackgroundSimulation of populations with specified characteristics such as allele frequencies, linkage disequilibrium etc., is an integral component of many studies, including in-silico breeding optimization. Since the accuracy and sensitivity of population simulation is critical to the quality of the output of the applications that use them, accurate algorithms are required to provide a strong foundation to the methods in these studies.ResultsIn this paper we present SimBA (Simulation using Best-fit Algorithm) a non-generative approach, based on a combination of stochastic techniques and discrete methods. We optimize a hill climbing algorithm and extend the framework to include multiple subpopulation structures. Additionally, we show that SimBA is very sensitive to the input specifications, i.e., very similar but distinct input characteristics result in distinct outputs with high fidelity to the specified distributions. This property of the simulation is not explicitly modeled or studied by previous methods.ConclusionsWe show that SimBA outperforms the existing population simulation methods, both in terms of accuracy as well as time-efficiency. Not only does it construct populations that meet the input specifications more stringently than other published methods, SimBA is also easy to use. It does not require explicit parameter adaptations or calibrations. Also, it can work with input specified as distributions, without an exemplar matrix or population as required by some methods. SimBA is available at http://researcher.ibm.com/project/5669.
22 schema:genre article
23 schema:isAccessibleForFree true
24 schema:isPartOf N09142486e48e478da0c9bc9441c71e04
25 N4dfecd6844e14e099325c185d9742a31
26 sg:journal.1023786
27 schema:keywords BackgroundSimulation
28 ConclusionsWe
29 Hill
30 ResultsIn
31 SIMBA
32 accuracy
33 accurate algorithm
34 adaptation
35 algorithm
36 allele frequencies
37 applications
38 approach
39 calibration
40 characteristics
41 combination
42 components
43 discrete method
44 disequilibrium
45 distinct outputs
46 distribution
47 fidelity
48 foundation
49 framework
50 frequency
51 high fidelity
52 input
53 input characteristics
54 input specification
55 integral component
56 matrix
57 method
58 non-generative approach
59 optimization
60 output
61 paper
62 parameter adaptation
63 population
64 population simulations
65 previous methods
66 properties
67 quality
68 sensitivity
69 simulation algorithm
70 simulation method
71 simulations
72 specification
73 stochastic techniques
74 strong foundation
75 structure
76 study
77 subpopulation structure
78 technique
79 terms
80 terms of accuracy
81 schema:name SimBA: simulation algorithm to fit extant-population distributions
82 schema:pagination 82
83 schema:productId N1b0df85c03bd4421b637b25c3f4a8916
84 N80862c97a14647808323f91d8629f07c
85 Nb4bc8fbc88e542f5b966ed5528e913e2
86 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006799034
87 https://doi.org/10.1186/s12859-015-0525-0
88 schema:sdDatePublished 2022-09-02T15:59
89 schema:sdLicense https://scigraph.springernature.com/explorer/license/
90 schema:sdPublisher N2a2c378654cd4874a102a92a58cc38d6
91 schema:url https://doi.org/10.1186/s12859-015-0525-0
92 sgo:license sg:explorer/license/
93 sgo:sdDataset articles
94 rdf:type schema:ScholarlyArticle
95 N04b1b661cc184e60b2525e145ec40633 rdf:first sg:person.0746114007.76
96 rdf:rest rdf:nil
97 N06ed29f839904c3ab18c7f652ab27e5d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
98 schema:name Models, Theoretical
99 rdf:type schema:DefinedTerm
100 N09142486e48e478da0c9bc9441c71e04 schema:issueNumber 1
101 rdf:type schema:PublicationIssue
102 N1b0df85c03bd4421b637b25c3f4a8916 schema:name dimensions_id
103 schema:value pub.1006799034
104 rdf:type schema:PropertyValue
105 N23c866bafac44ca59eee4e637293eb99 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
106 schema:name Algorithms
107 rdf:type schema:DefinedTerm
108 N2a2c378654cd4874a102a92a58cc38d6 schema:name Springer Nature - SN SciGraph project
109 rdf:type schema:Organization
110 N325c963e023a47a29a3c860468d035fe schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
111 schema:name Linkage Disequilibrium
112 rdf:type schema:DefinedTerm
113 N4dfecd6844e14e099325c185d9742a31 schema:volumeNumber 16
114 rdf:type schema:PublicationVolume
115 N651cf763aeca4e2ea1370084c111247e schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
116 schema:name Population Dynamics
117 rdf:type schema:DefinedTerm
118 N6bde4e0687d24c19b31d8ceb5a6749b9 rdf:first sg:person.01336557015.68
119 rdf:rest N04b1b661cc184e60b2525e145ec40633
120 N80862c97a14647808323f91d8629f07c schema:name doi
121 schema:value 10.1186/s12859-015-0525-0
122 rdf:type schema:PropertyValue
123 N92840247dede4aeaa89b0015d4c406d9 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
124 schema:name Genetics, Population
125 rdf:type schema:DefinedTerm
126 N938d1b7b3b5f4d88bc1ef2192ab971c6 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
127 schema:name Humans
128 rdf:type schema:DefinedTerm
129 Nb189d9ddeffa42d5a1448ca5f35ee684 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
130 schema:name Computer Simulation
131 rdf:type schema:DefinedTerm
132 Nb4bc8fbc88e542f5b966ed5528e913e2 schema:name pubmed_id
133 schema:value 25886895
134 rdf:type schema:PropertyValue
135 Nd36c7a93cb6b43cda96ad9a7b91551e6 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
136 schema:name Conservation of Natural Resources
137 rdf:type schema:DefinedTerm
138 Ne73ce92e3afd47c7a8e643448de1fa7f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
139 schema:name Gene Frequency
140 rdf:type schema:DefinedTerm
141 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
142 schema:name Information and Computing Sciences
143 rdf:type schema:DefinedTerm
144 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
145 schema:name Artificial Intelligence and Image Processing
146 rdf:type schema:DefinedTerm
147 sg:journal.1023786 schema:issn 1471-2105
148 schema:name BMC Bioinformatics
149 schema:publisher Springer Nature
150 rdf:type schema:Periodical
151 sg:person.01336557015.68 schema:affiliation grid-institutes:None
152 schema:familyName Parida
153 schema:givenName Laxmi
154 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01336557015.68
155 rdf:type schema:Person
156 sg:person.0746114007.76 schema:affiliation grid-institutes:None
157 schema:familyName Haiminen
158 schema:givenName Niina
159 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0746114007.76
160 rdf:type schema:Person
161 sg:pub.10.1007/978-1-4899-4441-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1108488398
162 https://doi.org/10.1007/978-1-4899-4441-2
163 rdf:type schema:CreativeWork
164 sg:pub.10.1007/978-3-662-44753-6_19 schema:sameAs https://app.dimensions.ai/details/publication/pub.1019593946
165 https://doi.org/10.1007/978-3-662-44753-6_19
166 rdf:type schema:CreativeWork
167 sg:pub.10.1007/s10528-011-9416-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1022978703
168 https://doi.org/10.1007/s10528-011-9416-x
169 rdf:type schema:CreativeWork
170 sg:pub.10.1007/s13258-013-0081-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041969888
171 https://doi.org/10.1007/s13258-013-0081-9
172 rdf:type schema:CreativeWork
173 sg:pub.10.1038/nature02168 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033654326
174 https://doi.org/10.1038/nature02168
175 rdf:type schema:CreativeWork
176 sg:pub.10.1038/nrg3130 schema:sameAs https://app.dimensions.ai/details/publication/pub.1035741028
177 https://doi.org/10.1038/nrg3130
178 rdf:type schema:CreativeWork
179 grid-institutes:None schema:alternateName Computational Biology Center, IBM T. J. Watson Research, Yorktown Heights, NY, USA
180 schema:name Computational Biology Center, IBM T. J. Watson Research, Yorktown Heights, NY, USA
181 rdf:type schema:Organization
 



