Controlling Size When Aligning Multiple Genomic Sequences with Duplications


Ontology type: schema:Chapter      Open Access: True


Chapter Info

DATE

2006

AUTHORS

Minmei Hou, Piotr Berman, Louxin Zhang, Webb Miller

ABSTRACT

For a genomic region containing a tandem gene cluster, a proper set of alignments needs to align only orthologous segments, i.e., those separated by a speciation event. Otherwise, methods for finding regions under evolutionary selection will not perform properly. Conversely, the alignments should indicate every orthologous pair of genes or genomic segments. Attaining this goal in practice requires a technique for avoiding a combinatorial explosion in the number of local alignments. To better understand this process, we model it as a graph problem of finding a minimum cardinality set of cliques that contain all edges. We provide an upper bound for an important class of graphs (the problem is NP-hard and very difficult to approximate in the general case), and use the bound and computer simulations to evaluate two heuristic solutions. An implementation of one of them is evaluated on mammalian sequences from the α-globin gene cluster.
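
The graph problem named in the abstract (covering every edge with as few cliques as possible) can be illustrated with a small sketch. The greedy rule below is a generic textbook-style heuristic chosen for illustration; it is not claimed to be either of the two heuristics evaluated in the chapter, and the function name `greedy_edge_clique_cover` is hypothetical. It assumes an undirected graph without self-loops whose node labels can be sorted.

```python
from itertools import combinations


def greedy_edge_clique_cover(edges):
    """Return a list of cliques (sets of nodes) that together contain every edge.

    Illustrative greedy heuristic only: it gives a valid cover, not a minimum one.
    Assumes undirected edges, no self-loops, and sortable node labels.
    """
    # Build adjacency sets for the undirected graph.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    uncovered = {frozenset(e) for e in edges}
    cliques = []

    # Visit edges in a fixed order so the result is deterministic.
    for u, v in sorted(tuple(sorted(e)) for e in uncovered):
        if frozenset((u, v)) not in uncovered:
            continue  # already contained in an earlier clique
        clique = {u, v}
        # Grow the clique with common neighbours adjacent to all current members.
        for w in sorted(adj[u] & adj[v]):
            if all(w in adj[x] for x in clique):
                clique.add(w)
        cliques.append(clique)
        # Every pair inside a clique is an edge, so mark all of them covered.
        for a, b in combinations(clique, 2):
            uncovered.discard(frozenset((a, b)))
    return cliques


# A triangle 1-2-3 with a pendant edge 3-4 is covered by two cliques.
example_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
cover = greedy_edge_clique_cover(example_edges)
```

In the alignment setting described above, nodes would stand for genomic segments and edges for orthologous pairs, so each clique corresponds to one candidate alignment block; that mapping is an interpretation of the abstract, not a detail taken from the chapter text.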

PAGES

138-149

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/11851561_13

DOI

http://dx.doi.org/10.1007/11851561_13

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1005960990



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department of Computer Science and Engineering, Penn State, 16801, University Park, PA, USA", 
          "id": "http://www.grid.ac/institutes/grid.29857.31", 
          "name": [
            "Department of Computer Science and Engineering, Penn State, 16801, University Park, PA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hou", 
        "givenName": "Minmei", 
        "id": "sg:person.01044604166.80", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01044604166.80"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Computer Science and Engineering, Penn State, 16801, University Park, PA, USA", 
          "id": "http://www.grid.ac/institutes/grid.29857.31", 
          "name": [
            "Department of Computer Science and Engineering, Penn State, 16801, University Park, PA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Berman", 
        "givenName": "Piotr", 
        "id": "sg:person.01274506210.27", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01274506210.27"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Mathematics, National University of Singapore, Science Drive 2, 117543, Singapore", 
          "id": "http://www.grid.ac/institutes/grid.4280.e", 
          "name": [
            "Department of Mathematics, National University of Singapore, Science Drive 2, 117543, Singapore"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Zhang", 
        "givenName": "Louxin", 
        "id": "sg:person.012763757651.13", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012763757651.13"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Computer Science and Engineering, Penn State, 16801, University Park, PA, USA", 
          "id": "http://www.grid.ac/institutes/grid.29857.31", 
          "name": [
            "Department of Computer Science and Engineering, Penn State, 16801, University Park, PA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Miller", 
        "givenName": "Webb", 
        "id": "sg:person.01324623405.50", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01324623405.50"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2006", 
    "datePublishedReg": "2006-01-01", 
    "description": "For a genomic region containing a tandem gene cluster, a proper set of alignments needs to align only orthologous segments, i.e., those separated by a speciation event. Otherwise, methods for finding regions under evolutionary selection will not perform properly. Conversely, the alignments should indicate every orthologous pair of genes or genomic segments. Attaining this goal in practice requires a technique for avoiding a combinatorial explosion in the number of local alignments. To better understand this process, we model it as a graph problem of finding a minimum cardinality set of cliques that contain all edges. We provide an upper bound for an important class of graphs (the problem is NP-hard and very difficult to approximate in the general case), and use the bound and computer simulations to evaluate two heuristic solutions. An implementation of one of them is evaluated on mammalian sequences from the \u03b1-globin gene cluster.", 
    "editor": [
      {
        "familyName": "B\u00fccher", 
        "givenName": "Philipp", 
        "type": "Person"
      }, 
      {
        "familyName": "Moret", 
        "givenName": "Bernard M. E.", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/11851561_13", 
    "isAccessibleForFree": true, 
    "isPartOf": {
      "isbn": [
        "978-3-540-39583-6", 
        "978-3-540-39584-3"
      ], 
      "name": "Algorithms in Bioinformatics", 
      "type": "Book"
    }, 
    "keywords": [
      "gene cluster", 
      "tandem gene clusters", 
      "\u03b1-globin gene cluster", 
      "multiple genomic sequences", 
      "speciation events", 
      "orthologous pairs", 
      "mammalian sequences", 
      "genomic regions", 
      "orthologous segments", 
      "genomic sequences", 
      "genomic segments", 
      "evolutionary selection", 
      "sequence", 
      "genes", 
      "local alignment", 
      "duplication", 
      "important class", 
      "region", 
      "clusters", 
      "alignment", 
      "segments", 
      "selection", 
      "pairs", 
      "events", 
      "number", 
      "graph problems", 
      "set", 
      "heuristic solution", 
      "size", 
      "combinatorial explosion", 
      "process", 
      "minimum cardinality set", 
      "class", 
      "proper set", 
      "computer simulations", 
      "edge", 
      "graph", 
      "cliques", 
      "implementation", 
      "explosion", 
      "goal", 
      "method", 
      "technique", 
      "simulations", 
      "solution", 
      "problem", 
      "practice"
    ], 
    "name": "Controlling Size When Aligning Multiple Genomic Sequences with Duplications", 
    "pagination": "138-149", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1005960990"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/11851561_13"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/11851561_13", 
      "https://app.dimensions.ai/details/publication/pub.1005960990"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-08-04T17:17", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220804/entities/gbq_results/chapter/chapter_279.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/11851561_13"
  }
]
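
As a minimal sketch of how the JSON-LD record above might be consumed, the snippet below parses a shortened excerpt of it with the standard `json` module and pulls out the title, author names, page range, and DOI. The helper name `summarize` and the trimmed `RECORD_EXCERPT` are assumptions made for illustration; the field names (`author`, `name`, `pagination`, `productId`) are taken from the record itself.

```python
import json

# Shortened excerpt of the record shown above (only the fields used below).
RECORD_EXCERPT = """
[
  {
    "author": [
      {"familyName": "Hou", "givenName": "Minmei", "type": "Person"},
      {"familyName": "Berman", "givenName": "Piotr", "type": "Person"},
      {"familyName": "Zhang", "givenName": "Louxin", "type": "Person"},
      {"familyName": "Miller", "givenName": "Webb", "type": "Person"}
    ],
    "name": "Controlling Size When Aligning Multiple Genomic Sequences with Duplications",
    "pagination": "138-149",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1005960990"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/11851561_13"]}
    ]
  }
]
"""


def summarize(record_json):
    """Extract a few human-readable fields from a SciGraph-style JSON-LD array."""
    record = json.loads(record_json)[0]  # the payload is a one-element array
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])]
    # productId is a list of name/value pairs; each value is itself a list.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    return {
        "title": record.get("name"),
        "authors": authors,
        "pages": record.get("pagination"),
        "doi": ids.get("doi"),
    }


summary = summarize(RECORD_EXCERPT)
```

Because JSON-LD is plain JSON, no linked-data library is needed for this kind of field lookup; a dedicated RDF toolkit only becomes useful when the `@context` has to be expanded.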
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular linked data format that is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/11851561_13'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/11851561_13'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/11851561_13'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/11851561_13'
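
The four curl commands above differ only in their Accept header, so the same content negotiation can be sketched in Python with the standard library. The names `ACCEPT_TYPES`, `build_request`, and `fetch` are hypothetical helpers; the URL and MIME types are copied from the commands above. Whether the endpoint still answers these requests is not verified here, so `fetch` is defined but not called.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/11851561_13"

# Format label -> MIME type, taken from the curl examples above.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for the given RDF format."""
    if fmt not in ACCEPT_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Inspecting the request works offline.
req = build_request("turtle")
```

Keeping request construction separate from sending makes the header logic easy to check without touching the network.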


 

This table displays all metadata directly associated with this object as RDF triples.

135 TRIPLES      22 PREDICATES      72 URIs      65 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/11851561_13 schema:about anzsrc-for:06
2 anzsrc-for:0604
3 schema:author N8bb7972b9c6947589d603adbd035208d
4 schema:datePublished 2006
5 schema:datePublishedReg 2006-01-01
6 schema:description For a genomic region containing a tandem gene cluster, a proper set of alignments needs to align only orthologous segments, i.e., those separated by a speciation event. Otherwise, methods for finding regions under evolutionary selection will not perform properly. Conversely, the alignments should indicate every orthologous pair of genes or genomic segments. Attaining this goal in practice requires a technique for avoiding a combinatorial explosion in the number of local alignments. To better understand this process, we model it as a graph problem of finding a minimum cardinality set of cliques that contain all edges. We provide an upper bound for an important class of graphs (the problem is NP-hard and very difficult to approximate in the general case), and use the bound and computer simulations to evaluate two heuristic solutions. An implementation of one of them is evaluated on mammalian sequences from the α-globin gene cluster.
7 schema:editor Ndbde70bd64914d308127fa76d6ccfc06
8 schema:genre chapter
9 schema:isAccessibleForFree true
10 schema:isPartOf Nb2abb2c2faef4f85b4471af25c6e623f
11 schema:keywords alignment
12 class
13 cliques
14 clusters
15 combinatorial explosion
16 computer simulations
17 duplication
18 edge
19 events
20 evolutionary selection
21 explosion
22 gene cluster
23 genes
24 genomic regions
25 genomic segments
26 genomic sequences
27 goal
28 graph
29 graph problems
30 heuristic solution
31 implementation
32 important class
33 local alignment
34 mammalian sequences
35 method
36 minimum cardinality set
37 multiple genomic sequences
38 number
39 orthologous pairs
40 orthologous segments
41 pairs
42 practice
43 problem
44 process
45 proper set
46 region
47 segments
48 selection
49 sequence
50 set
51 simulations
52 size
53 solution
54 speciation events
55 tandem gene clusters
56 technique
57 α-globin gene cluster
58 schema:name Controlling Size When Aligning Multiple Genomic Sequences with Duplications
59 schema:pagination 138-149
60 schema:productId Naeb7f53e871b4822b81d8eadd33d8dfa
61 Nd5e138f2c12f40e29b3a440994dce110
62 schema:publisher Nb19e56ce97634e00bd4bb1dcc02c6521
63 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005960990
64 https://doi.org/10.1007/11851561_13
65 schema:sdDatePublished 2022-08-04T17:17
66 schema:sdLicense https://scigraph.springernature.com/explorer/license/
67 schema:sdPublisher Nbc3aca30a3c84035ae6c33ee73c189ac
68 schema:url https://doi.org/10.1007/11851561_13
69 sgo:license sg:explorer/license/
70 sgo:sdDataset chapters
71 rdf:type schema:Chapter
72 N0bbeae798a504fa5835659442a948382 rdf:first sg:person.01324623405.50
73 rdf:rest rdf:nil
74 N0e11cf3cfc044032a80439ba7bb377d9 schema:familyName Bücher
75 schema:givenName Philipp
76 rdf:type schema:Person
77 N3ce8154658b9471487f216c2a68d9b84 schema:familyName Moret
78 schema:givenName Bernard M. E.
79 rdf:type schema:Person
80 N45964d9bffb9421593a5d95a6d7f530d rdf:first sg:person.012763757651.13
81 rdf:rest N0bbeae798a504fa5835659442a948382
82 N725d91ed93ed496a95a90012ff01b274 rdf:first sg:person.01274506210.27
83 rdf:rest N45964d9bffb9421593a5d95a6d7f530d
84 N8bb7972b9c6947589d603adbd035208d rdf:first sg:person.01044604166.80
85 rdf:rest N725d91ed93ed496a95a90012ff01b274
86 Na33372045beb431fa1c1fb1bf2576254 rdf:first N3ce8154658b9471487f216c2a68d9b84
87 rdf:rest rdf:nil
88 Naeb7f53e871b4822b81d8eadd33d8dfa schema:name doi
89 schema:value 10.1007/11851561_13
90 rdf:type schema:PropertyValue
91 Nb19e56ce97634e00bd4bb1dcc02c6521 schema:name Springer Nature
92 rdf:type schema:Organisation
93 Nb2abb2c2faef4f85b4471af25c6e623f schema:isbn 978-3-540-39583-6
94 978-3-540-39584-3
95 schema:name Algorithms in Bioinformatics
96 rdf:type schema:Book
97 Nbc3aca30a3c84035ae6c33ee73c189ac schema:name Springer Nature - SN SciGraph project
98 rdf:type schema:Organization
99 Nd5e138f2c12f40e29b3a440994dce110 schema:name dimensions_id
100 schema:value pub.1005960990
101 rdf:type schema:PropertyValue
102 Ndbde70bd64914d308127fa76d6ccfc06 rdf:first N0e11cf3cfc044032a80439ba7bb377d9
103 rdf:rest Na33372045beb431fa1c1fb1bf2576254
104 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
105 schema:name Biological Sciences
106 rdf:type schema:DefinedTerm
107 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
108 schema:name Genetics
109 rdf:type schema:DefinedTerm
110 sg:person.01044604166.80 schema:affiliation grid-institutes:grid.29857.31
111 schema:familyName Hou
112 schema:givenName Minmei
113 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01044604166.80
114 rdf:type schema:Person
115 sg:person.01274506210.27 schema:affiliation grid-institutes:grid.29857.31
116 schema:familyName Berman
117 schema:givenName Piotr
118 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01274506210.27
119 rdf:type schema:Person
120 sg:person.012763757651.13 schema:affiliation grid-institutes:grid.4280.e
121 schema:familyName Zhang
122 schema:givenName Louxin
123 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012763757651.13
124 rdf:type schema:Person
125 sg:person.01324623405.50 schema:affiliation grid-institutes:grid.29857.31
126 schema:familyName Miller
127 schema:givenName Webb
128 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01324623405.50
129 rdf:type schema:Person
130 grid-institutes:grid.29857.31 schema:alternateName Department of Computer Science and Engineering, Penn State, 16801, University Park, PA, USA
131 schema:name Department of Computer Science and Engineering, Penn State, 16801, University Park, PA, USA
132 rdf:type schema:Organization
133 grid-institutes:grid.4280.e schema:alternateName Department of Mathematics, National University of Singapore, Science Drive 2, 117543, Singapore
134 schema:name Department of Mathematics, National University of Singapore, Science Drive 2, 117543, Singapore
135 rdf:type schema:Organization
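
In the table above, the author order is not stored on the publication node directly: `schema:author` points to a blank node, and the order is encoded as a chain of `rdf:first` / `rdf:rest` cells ending in `rdf:nil`. The sketch below walks that chain. The dictionary `AUTHOR_CELLS` is transcribed by hand from the rows above, and the function name `walk_rdf_list` is a hypothetical helper, not part of any SciGraph tooling.

```python
RDF_NIL = "rdf:nil"

# Blank node -> (rdf:first, rdf:rest), copied from the triple table above.
AUTHOR_CELLS = {
    "N8bb7972b9c6947589d603adbd035208d": ("sg:person.01044604166.80", "N725d91ed93ed496a95a90012ff01b274"),
    "N725d91ed93ed496a95a90012ff01b274": ("sg:person.01274506210.27", "N45964d9bffb9421593a5d95a6d7f530d"),
    "N45964d9bffb9421593a5d95a6d7f530d": ("sg:person.012763757651.13", "N0bbeae798a504fa5835659442a948382"),
    "N0bbeae798a504fa5835659442a948382": ("sg:person.01324623405.50", RDF_NIL),
}


def walk_rdf_list(head, cells):
    """Follow rdf:rest links from `head` and collect each rdf:first value in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")  # guard against malformed lists
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


# The head node is the object of schema:author in the table above.
ordered_authors = walk_rdf_list("N8bb7972b9c6947589d603adbd035208d", AUTHOR_CELLS)
```

Walking from the `schema:author` head yields the person identifiers for Hou, Berman, Zhang, and Miller, in that order, which matches the author array in the JSON-LD representation.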
 



