An Optimal Algorithm for Maximum-Sum Segment and Its Application in Bioinformatics View Full Text


Ontology type: schema:Chapter     


Chapter Info

DATE

2003-06-24

AUTHORS

Tsai-Hung Fan , Shufen Lee , Hsueh-I Lu , Tsung-Shan Tsou , Tsai-Cheng Wang , Adam Yao

ABSTRACT

We study a fundamental sequence algorithm arising from bioinformatics. Given two integers L and U and a sequence A of n numbers, the maximum-sum segment problem is to find a segment A[i,j] of A with L ≤ j+i+1 ≤ U that maximizes A[i]+A[i+1]+···+A[j]. The problem finds applications in finding repeats, designing low complexity filter, and locating segments with rich C+G content for biomolecular sequences. The best known algorithm, due to Lin, Jiang, and Chao, runs in O(n) time, based upon a clever technique called left-negative decomposition for A. In the present paper, we present a new O(n)-time algorithm that bypasses the left-negative decomposition. As a result, our algorithm has the capability to handle the input sequence in an online manner, which is clearly an important feature to cope with genome-scale sequences. We also show how to exploit the sparsity in the input sequence: If A is representable in O(k) space in some format, then our algorithm runs in O(k) time. Moreover, practical implementation of our algorithm running on the rice genome helps us to identify a very long repeat structure in rice chromosome 1 that is previously unknown. More... »

PAGES

251-257

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/3-540-45089-0_23

DOI

http://dx.doi.org/10.1007/3-540-45089-0_23

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1051758054


Indexing Status Check whether this publication has been indexed by Scopus and Web Of Science using the SN Indexing Status Tool
Incoming Citations Browse incoming citations for this publication using opencitations.net

JSON-LD is the canonical representation for SciGraph data.

TIP: You can open this SciGraph record using an external JSON-LD service: JSON-LD Playground Google SDTT

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Mathematical Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0102", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Applied Mathematics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Institute of Statistics, National Central University, 320, Chung-li, Taiwan, R.O.C.", 
          "id": "http://www.grid.ac/institutes/grid.37589.30", 
          "name": [
            "Institute of Statistics, National Central University, 320, Chung-li, Taiwan, R.O.C."
          ], 
          "type": "Organization"
        }, 
        "familyName": "Fan", 
        "givenName": "Tsai-Hung", 
        "id": "sg:person.01024023247.33", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01024023247.33"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Biomedical Sciences, Academia Sinica, 115, Taipei, Taiwan, R.O.C.", 
          "id": "http://www.grid.ac/institutes/grid.482251.8", 
          "name": [
            "Institute of Biomedical Sciences, Academia Sinica, 115, Taipei, Taiwan, R.O.C."
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lee", 
        "givenName": "Shufen", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Information Science, Academia Sinica, 115, Taipei, Taiwan, R.O.C.", 
          "id": "http://www.grid.ac/institutes/grid.28665.3f", 
          "name": [
            "Institute of Information Science, Academia Sinica, 115, Taipei, Taiwan, R.O.C."
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lu", 
        "givenName": "Hsueh-I", 
        "id": "sg:person.013345515575.46", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013345515575.46"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Statistics, National Central University, 320, Chung-li, Taiwan, R.O.C.", 
          "id": "http://www.grid.ac/institutes/grid.37589.30", 
          "name": [
            "Institute of Statistics, National Central University, 320, Chung-li, Taiwan, R.O.C."
          ], 
          "type": "Organization"
        }, 
        "familyName": "Tsou", 
        "givenName": "Tsung-Shan", 
        "id": "sg:person.0621035037.51", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0621035037.51"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Biomedical Sciences, Academia Sinica, 115, Taipei, Taiwan, R.O.C.", 
          "id": "http://www.grid.ac/institutes/grid.482251.8", 
          "name": [
            "Institute of Biomedical Sciences, Academia Sinica, 115, Taipei, Taiwan, R.O.C."
          ], 
          "type": "Organization"
        }, 
        "familyName": "Wang", 
        "givenName": "Tsai-Cheng", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Biomedical Sciences, Academia Sinica, 115, Taipei, Taiwan, R.O.C.", 
          "id": "http://www.grid.ac/institutes/grid.482251.8", 
          "name": [
            "Institute of Biomedical Sciences, Academia Sinica, 115, Taipei, Taiwan, R.O.C."
          ], 
          "type": "Organization"
        }, 
        "familyName": "Yao", 
        "givenName": "Adam", 
        "type": "Person"
      }
    ], 
    "datePublished": "2003-06-24", 
    "datePublishedReg": "2003-06-24", 
    "description": "We study a fundamental sequence algorithm arising from bioinformatics. Given two integers L and U and a sequence A of n numbers, the maximum-sum segment problem is to find a segment A[i,j] of A with L \u2264 j+i+1 \u2264 U that maximizes A[i]+A[i+1]+\u00b7\u00b7\u00b7+A[j]. The problem finds applications in finding repeats, designing low complexity filter, and locating segments with rich C+G content for biomolecular sequences. The best known algorithm, due to Lin, Jiang, and Chao, runs in O(n) time, based upon a clever technique called left-negative decomposition for A. In the present paper, we present a new O(n)-time algorithm that bypasses the left-negative decomposition. As a result, our algorithm has the capability to handle the input sequence in an online manner, which is clearly an important feature to cope with genome-scale sequences. We also show how to exploit the sparsity in the input sequence: If A is representable in O(k) space in some format, then our algorithm runs in O(k) time. Moreover, practical implementation of our algorithm running on the rice genome helps us to identify a very long repeat structure in rice chromosome 1 that is previously unknown.", 
    "editor": [
      {
        "familyName": "Ibarra", 
        "givenName": "Oscar H.", 
        "type": "Person"
      }, 
      {
        "familyName": "Dang", 
        "givenName": "Zhe", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/3-540-45089-0_23", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-540-40561-0", 
        "978-3-540-45089-4"
      ], 
      "name": "Implementation and Application of Automata", 
      "type": "Book"
    }, 
    "keywords": [
      "low-complexity filters", 
      "input sequence", 
      "complexity filters", 
      "biomolecular sequences", 
      "maximum sum segments", 
      "optimal algorithm", 
      "time algorithm", 
      "integer L", 
      "practical implementation", 
      "algorithm", 
      "sequence algorithm", 
      "online manner", 
      "maximum-sum segment problem", 
      "clever techniques", 
      "present paper", 
      "problem", 
      "chaos", 
      "important features", 
      "sparsity", 
      "segment problems", 
      "decomposition", 
      "space", 
      "applications", 
      "sequence A", 
      "filter", 
      "bioinformatics", 
      "genome-scale sequences", 
      "Lin", 
      "Jiang", 
      "sequence", 
      "technique", 
      "structure", 
      "number", 
      "implementation", 
      "time", 
      "results", 
      "capability", 
      "features", 
      "segments", 
      "manner", 
      "format", 
      "content", 
      "rice chromosome 1", 
      "repeat structure", 
      "rice genome", 
      "genome", 
      "paper", 
      "repeats", 
      "chromosome 1"
    ], 
    "name": "An Optimal Algorithm for Maximum-Sum Segment and Its Application in Bioinformatics", 
    "pagination": "251-257", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1051758054"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/3-540-45089-0_23"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/3-540-45089-0_23", 
      "https://app.dimensions.ai/details/publication/pub.1051758054"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-05-10T10:38", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220509/entities/gbq_results/chapter/chapter_137.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/3-540-45089-0_23"
  }
]
 

Download the RDF metadata as:  json-ld nt turtle xml License info

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/3-540-45089-0_23'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/3-540-45089-0_23'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/3-540-45089-0_23'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/3-540-45089-0_23'


 

This table displays all metadata directly associated to this object as RDF triples.

152 TRIPLES      23 PREDICATES      74 URIs      67 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/3-540-45089-0_23 schema:about anzsrc-for:01
2 anzsrc-for:0102
3 schema:author N7c142a9c0cb247a7892b15c6a01c8cae
4 schema:datePublished 2003-06-24
5 schema:datePublishedReg 2003-06-24
6 schema:description We study a fundamental sequence algorithm arising from bioinformatics. Given two integers L and U and a sequence A of n numbers, the maximum-sum segment problem is to find a segment A[i,j] of A with L ≤ j+i+1 ≤ U that maximizes A[i]+A[i+1]+···+A[j]. The problem finds applications in finding repeats, designing low complexity filter, and locating segments with rich C+G content for biomolecular sequences. The best known algorithm, due to Lin, Jiang, and Chao, runs in O(n) time, based upon a clever technique called left-negative decomposition for A. In the present paper, we present a new O(n)-time algorithm that bypasses the left-negative decomposition. As a result, our algorithm has the capability to handle the input sequence in an online manner, which is clearly an important feature to cope with genome-scale sequences. We also show how to exploit the sparsity in the input sequence: If A is representable in O(k) space in some format, then our algorithm runs in O(k) time. Moreover, practical implementation of our algorithm running on the rice genome helps us to identify a very long repeat structure in rice chromosome 1 that is previously unknown.
7 schema:editor N8424be24477a4a228e8243bfd19e1aa6
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree false
11 schema:isPartOf Nb6bdc0e6e2384b60ba6f2f0c01f5c317
12 schema:keywords Jiang
13 Lin
14 algorithm
15 applications
16 bioinformatics
17 biomolecular sequences
18 capability
19 chaos
20 chromosome 1
21 clever techniques
22 complexity filters
23 content
24 decomposition
25 features
26 filter
27 format
28 genome
29 genome-scale sequences
30 implementation
31 important features
32 input sequence
33 integer L
34 low-complexity filters
35 manner
36 maximum sum segments
37 maximum-sum segment problem
38 number
39 online manner
40 optimal algorithm
41 paper
42 practical implementation
43 present paper
44 problem
45 repeat structure
46 repeats
47 results
48 rice chromosome 1
49 rice genome
50 segment problems
51 segments
52 sequence
53 sequence A
54 sequence algorithm
55 space
56 sparsity
57 structure
58 technique
59 time
60 time algorithm
61 schema:name An Optimal Algorithm for Maximum-Sum Segment and Its Application in Bioinformatics
62 schema:pagination 251-257
63 schema:productId N4f3c21d61d8943fba6a978465a4d4263
64 N57685b1aa66447d49b9bea6e3c3f7aba
65 schema:publisher N09f073f7b9414c2cbebb8f75ee526983
66 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051758054
67 https://doi.org/10.1007/3-540-45089-0_23
68 schema:sdDatePublished 2022-05-10T10:38
69 schema:sdLicense https://scigraph.springernature.com/explorer/license/
70 schema:sdPublisher N7abefb2ce49e4d22b3af929188f97976
71 schema:url https://doi.org/10.1007/3-540-45089-0_23
72 sgo:license sg:explorer/license/
73 sgo:sdDataset chapters
74 rdf:type schema:Chapter
75 N09f073f7b9414c2cbebb8f75ee526983 schema:name Springer Nature
76 rdf:type schema:Organisation
77 N1699701380af44419ec6c81fd8a718fb schema:familyName Dang
78 schema:givenName Zhe
79 rdf:type schema:Person
80 N4f3c21d61d8943fba6a978465a4d4263 schema:name doi
81 schema:value 10.1007/3-540-45089-0_23
82 rdf:type schema:PropertyValue
83 N57685b1aa66447d49b9bea6e3c3f7aba schema:name dimensions_id
84 schema:value pub.1051758054
85 rdf:type schema:PropertyValue
86 N7abefb2ce49e4d22b3af929188f97976 schema:name Springer Nature - SN SciGraph project
87 rdf:type schema:Organization
88 N7c142a9c0cb247a7892b15c6a01c8cae rdf:first sg:person.01024023247.33
89 rdf:rest Nda59bf1cecd1499e8d90d06f3c5155b6
90 N80c9dc26f3d946af9dbebeeeb65644a3 rdf:first N9ddf58040b6f4afe8b2064c880f45e9e
91 rdf:rest Na0c92959e9944bd5862edcab8fce0450
92 N8424be24477a4a228e8243bfd19e1aa6 rdf:first Nd4d79dd558314693a66e217b7bf53df0
93 rdf:rest Ndb75fd1b15554a9e82cdb647046ec9fc
94 N860753f8a77245d9be41e1e0b67dd985 rdf:first sg:person.013345515575.46
95 rdf:rest Nde493863336f46a1a836f0dba494af01
96 N8d2497bc149d400aa97a7b222686a994 schema:affiliation grid-institutes:grid.482251.8
97 schema:familyName Lee
98 schema:givenName Shufen
99 rdf:type schema:Person
100 N9ddf58040b6f4afe8b2064c880f45e9e schema:affiliation grid-institutes:grid.482251.8
101 schema:familyName Wang
102 schema:givenName Tsai-Cheng
103 rdf:type schema:Person
104 Na0c92959e9944bd5862edcab8fce0450 rdf:first Ndc95be04e4574511b537ca3574991a73
105 rdf:rest rdf:nil
106 Nb6bdc0e6e2384b60ba6f2f0c01f5c317 schema:isbn 978-3-540-40561-0
107 978-3-540-45089-4
108 schema:name Implementation and Application of Automata
109 rdf:type schema:Book
110 Nd4d79dd558314693a66e217b7bf53df0 schema:familyName Ibarra
111 schema:givenName Oscar H.
112 rdf:type schema:Person
113 Nda59bf1cecd1499e8d90d06f3c5155b6 rdf:first N8d2497bc149d400aa97a7b222686a994
114 rdf:rest N860753f8a77245d9be41e1e0b67dd985
115 Ndb75fd1b15554a9e82cdb647046ec9fc rdf:first N1699701380af44419ec6c81fd8a718fb
116 rdf:rest rdf:nil
117 Ndc95be04e4574511b537ca3574991a73 schema:affiliation grid-institutes:grid.482251.8
118 schema:familyName Yao
119 schema:givenName Adam
120 rdf:type schema:Person
121 Nde493863336f46a1a836f0dba494af01 rdf:first sg:person.0621035037.51
122 rdf:rest N80c9dc26f3d946af9dbebeeeb65644a3
123 anzsrc-for:01 schema:inDefinedTermSet anzsrc-for:
124 schema:name Mathematical Sciences
125 rdf:type schema:DefinedTerm
126 anzsrc-for:0102 schema:inDefinedTermSet anzsrc-for:
127 schema:name Applied Mathematics
128 rdf:type schema:DefinedTerm
129 sg:person.01024023247.33 schema:affiliation grid-institutes:grid.37589.30
130 schema:familyName Fan
131 schema:givenName Tsai-Hung
132 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01024023247.33
133 rdf:type schema:Person
134 sg:person.013345515575.46 schema:affiliation grid-institutes:grid.28665.3f
135 schema:familyName Lu
136 schema:givenName Hsueh-I
137 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013345515575.46
138 rdf:type schema:Person
139 sg:person.0621035037.51 schema:affiliation grid-institutes:grid.37589.30
140 schema:familyName Tsou
141 schema:givenName Tsung-Shan
142 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0621035037.51
143 rdf:type schema:Person
144 grid-institutes:grid.28665.3f schema:alternateName Institute of Information Science, Academia Sinica, 115, Taipei, Taiwan, R.O.C.
145 schema:name Institute of Information Science, Academia Sinica, 115, Taipei, Taiwan, R.O.C.
146 rdf:type schema:Organization
147 grid-institutes:grid.37589.30 schema:alternateName Institute of Statistics, National Central University, 320, Chung-li, Taiwan, R.O.C.
148 schema:name Institute of Statistics, National Central University, 320, Chung-li, Taiwan, R.O.C.
149 rdf:type schema:Organization
150 grid-institutes:grid.482251.8 schema:alternateName Institute of Biomedical Sciences, Academia Sinica, 115, Taipei, Taiwan, R.O.C.
151 schema:name Institute of Biomedical Sciences, Academia Sinica, 115, Taipei, Taiwan, R.O.C.
152 rdf:type schema:Organization
 




Preview window. Press ESC to close (or click here)


...