Sigma-2: Multiple sequence alignment of non-coding DNA via an evolutionary model


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2010-12

AUTHORS

Gayathri Jayaraman, Rahul Siddharthan

ABSTRACT

BACKGROUND: While most multiple sequence alignment programs expect that all or most of their input is known to be homologous, and penalise insertions and deletions, this is not a reasonable assumption for non-coding DNA, which is much less strongly conserved than protein-coding genes. Arguing that the goal of sequence alignment should be the detection of homology and not similarity, we incorporate an evolutionary model into a previously published multiple sequence alignment program for non-coding DNA, Sigma, as a sensitive likelihood-based way to assess the significance of alignments. Version 1 of Sigma was successful in eliminating spurious alignments but exhibited relatively poor sensitivity on synthetic data. Sigma 1 used a p-value (the probability under the "null hypothesis" of non-homology) to assess the significance of alignments, and, optionally, a background model that captured short-range genomic correlations. Sigma version 2, described here, retains these features, but calculates the p-value using a sophisticated evolutionary model that we describe here, and also allows for a transition matrix for different substitution rates from and to different nucleotides. Our evolutionary model takes separate account of mutation and fixation, and can be extended to allow for locally differing functional constraints on sequence.

RESULTS: We demonstrate that, on real and synthetic data, Sigma-2 significantly outperforms other programs in specificity to genuine homology (that is, it minimises alignment of spuriously similar regions that do not have a common ancestry) while it is now as sensitive as the best current programs.

CONCLUSIONS: Comparing these results with an extrapolation of the best results from other available programs, we suggest that conservation rates in intergenic DNA are often significantly over-estimated.
It is increasingly important to align non-coding DNA correctly, in regulatory genomics and in the context of whole-genome alignment, and Sigma-2 is an important step in that direction.

PAGES

464

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1471-2105-11-464

DOI

http://dx.doi.org/10.1186/1471-2105-11-464

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1029723621

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/20846408



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "DNA, Intergenic", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Evolution, Molecular", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Genomics", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Likelihood Functions", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Sequence Alignment", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Sequence Analysis, DNA", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Software", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Institute of Mathematical Sciences", 
          "id": "https://www.grid.ac/institutes/grid.462414.1", 
          "name": [
            "The Institute of Mathematical Sciences, Taramani, 600 113, Chennai, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Jayaraman", 
        "givenName": "Gayathri", 
        "id": "sg:person.01336147564.03", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01336147564.03"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Mathematical Sciences", 
          "id": "https://www.grid.ac/institutes/grid.462414.1", 
          "name": [
            "The Institute of Mathematical Sciences, Taramani, 600 113, Chennai, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Siddharthan", 
        "givenName": "Rahul", 
        "id": "sg:person.0614124227.85", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0614124227.85"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1371/journal.pcbi.1000156", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000816158"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/356168a0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003127095", 
          "https://doi.org/10.1038/356168a0"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-11-54", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003163452", 
          "https://doi.org/10.1186/1471-2105-11-54"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature04979", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004695015", 
          "https://doi.org/10.1038/nature04979"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.78.1.454", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1005636769"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pcbi.0010067", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007052752"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btm404", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007683223"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.076554.108", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011842335"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0045-6039(85)90488-9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011933465"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-7-143", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014283709", 
          "https://doi.org/10.1186/1471-2105-7-143"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/15.7.607", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014666719"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.1960404", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1017077573"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.86.4.1183", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1020152024"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1006/jmbi.2000.4042", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022575813"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.0809770105", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023033178"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf01731581", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023239976", 
          "https://doi.org/10.1007/bf01731581"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btg008", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023659156"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0893-9659(01)80026-4", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1024072417"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0022-2836(81)90087-5", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1024589839"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/gkh340", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1025846396"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1748-7188-3-6", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1027480499", 
          "https://doi.org/10.1186/1748-7188-3-6"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00163848", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1027539169", 
          "https://doi.org/10.1007/bf00163848"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf02101694", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039008589", 
          "https://doi.org/10.1007/bf02101694"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/15.3.211", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1041319529"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bti376", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042305598"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/22.22.4673", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1042438223"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/gr.926603", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043392725"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf01734359", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044065382", 
          "https://doi.org/10.1007/bf01734359"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-6-298", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047546418", 
          "https://doi.org/10.1186/1471-2105-6-298"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf02193625", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1049370355", 
          "https://doi.org/10.1007/bf02193625"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btg1040", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1049915428"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/0092-8674(87)90322-9", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050657578"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pcbi.1000392", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1050816305"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0022-5193(05)80104-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1053616169"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1080/10635150802422324", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1058369811"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/6.2.81", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1059413944"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1126/science.172.3988.1089", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1062502913"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/oxfordjournals.molbev.a040752", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1077145115"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/oxfordjournals.molbev.a040023", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1082762864"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://app.dimensions.ai/details/publication/pub.1083334412", 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2010-12", 
    "datePublishedReg": "2010-12-01", 
    "description": "BACKGROUND: While most multiple sequence alignment programs expect that all or most of their input is known to be homologous, and penalise insertions and deletions, this is not a reasonable assumption for non-coding DNA, which is much less strongly conserved than protein-coding genes. Arguing that the goal of sequence alignment should be the detection of homology and not similarity, we incorporate an evolutionary model into a previously published multiple sequence alignment program for non-coding DNA, Sigma, as a sensitive likelihood-based way to assess the significance of alignments. Version 1 of Sigma was successful in eliminating spurious alignments but exhibited relatively poor sensitivity on synthetic data. Sigma 1 used a p-value (the probability under the \"null hypothesis\" of non-homology) to assess the significance of alignments, and, optionally, a background model that captured short-range genomic correlations. Sigma version 2, described here, retains these features, but calculates the p-value using a sophisticated evolutionary model that we describe here, and also allows for a transition matrix for different substitution rates from and to different nucleotides. Our evolutionary model takes separate account of mutation and fixation, and can be extended to allow for locally differing functional constraints on sequence.\nRESULTS: We demonstrate that, on real and synthetic data, Sigma-2 significantly outperforms other programs in specificity to genuine homology (that is, it minimises alignment of spuriously similar regions that do not have a common ancestry) while it is now as sensitive as the best current programs.\nCONCLUSIONS: Comparing these results with an extrapolation of the best results from other available programs, we suggest that conservation rates in intergenic DNA are often significantly over-estimated. \nIt is increasingly important to align non-coding DNA correctly, in regulatory genomics and in the context of whole-genome alignment, and Sigma-2 is an important step in that direction.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/1471-2105-11-464", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1023786", 
        "issn": [
          "1471-2105"
        ], 
        "name": "BMC Bioinformatics", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "11"
      }
    ], 
    "name": "Sigma-2: Multiple sequence alignment of non-coding DNA via an evolutionary model", 
    "pagination": "464", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "8d99105b5f7d122a78715703d1c0fb5191817fec0dc8f534cb37ad9b8a90b954"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "20846408"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "100965194"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1471-2105-11-464"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1029723621"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1471-2105-11-464", 
      "https://app.dimensions.ai/details/publication/pub.1029723621"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-10T20:53", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8684_00000550.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "http://link.springer.com/10.1186/1471-2105-11-464"
  }
]
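
A record of this shape can be read with any JSON parser. The following is a minimal Python sketch, not part of SciGraph itself: it assumes the record is a one-element JSON array as shown above, uses a trimmed sample in place of the full record, and the helper name `summarize` is hypothetical. It flattens the `productId` entries into a lookup table and collects citation ids, dropping repeats in case an export lists the same work more than once.

```python
import json

# Trimmed sample modelled on the record above (assumption: same key layout);
# it is not the full record.
SAMPLE = """
[
  {
    "id": "sg:pub.10.1186/1471-2105-11-464",
    "name": "Sigma-2: Multiple sequence alignment of non-coding DNA via an evolutionary model",
    "citation": [
      {"id": "sg:pub.10.1038/356168a0", "type": "CreativeWork"},
      {"id": "sg:pub.10.1186/1471-2105-11-54", "type": "CreativeWork"},
      {"id": "sg:pub.10.1186/1471-2105-11-54", "type": "CreativeWork"}
    ],
    "productId": [
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["20846408"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1471-2105-11-464"]}
    ]
  }
]
"""


def summarize(record):
    """Return title, identifiers and de-duplicated citation ids for one record."""
    # productId is a list of {name, value: [...]} entries; keep the first value of each.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    # dict.fromkeys preserves first-seen order while dropping repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    return {"title": record.get("name"), "ids": ids, "citations": citations}


summary = summarize(json.loads(SAMPLE)[0])
print(summary["ids"]["doi"])
print(len(summary["citations"]))
```

Keeping the identifiers in a plain dictionary makes it easy to look up `doi` or `pubmed_id` by name without depending on the order of the `productId` list.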
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-464'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-464'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-464'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-11-464'
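
The four curl commands differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. This is an illustrative sketch under stated assumptions: the short format labels and the names `build_request` and `fetch` are hypothetical, the media types are copied from the commands above, and `fetch` only works if the endpoint is still reachable.

```python
from urllib.request import Request, urlopen

RECORD_URL = "https://scigraph.springernature.com/pub.10.1186/1471-2105-11-464"

# Media types taken from the curl commands above; the keys are illustrative labels.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a GET request asking for one RDF serialization."""
    if fmt not in ACCEPT:
        raise ValueError(f"unknown format {fmt!r}; expected one of {sorted(ACCEPT)}")
    return Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the body as text (requires network access)."""
    with urlopen(build_request(fmt, url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)


# Building the request does not touch the network.
req = build_request("turtle")
print(req.get_header("Accept"))
```

Separating `build_request` from `fetch` lets the header logic be checked offline; calling `fetch("json-ld")` would perform the actual download.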


 

This table displays all metadata directly associated to this object as RDF triples.

233 TRIPLES      21 PREDICATES      76 URIs      28 LITERALS      16 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/1471-2105-11-464 schema:about N2401d312cc8e4120b6fd89538675e359
2 N509a8119e5394336a71f47b72be7b0a7
3 N76798f88ac324730ba78106b00a45dcc
4 N9d6afd17020e4c478050b83b93ce13ed
5 Na7e14f86caa34ad3943f273866e7ba4e
6 Ncbaa0c3b7cbc414386b5dac99d25f8be
7 Ndcee0d442a92427e92006b75e28c464d
8 anzsrc-for:06
9 anzsrc-for:0604
10 schema:author N7b4b3f9a12a64deba26d224a0dd528cb
11 schema:citation sg:pub.10.1007/bf00163848
12 sg:pub.10.1007/bf01731581
13 sg:pub.10.1007/bf01734359
14 sg:pub.10.1007/bf02101694
15 sg:pub.10.1007/bf02193625
16 sg:pub.10.1038/356168a0
17 sg:pub.10.1038/nature04979
18 sg:pub.10.1186/1471-2105-11-54
19 sg:pub.10.1186/1471-2105-6-298
20 sg:pub.10.1186/1471-2105-7-143
21 sg:pub.10.1186/1748-7188-3-6
22 https://app.dimensions.ai/details/publication/pub.1083334412
23 https://doi.org/10.1006/jmbi.2000.4042
24 https://doi.org/10.1016/0022-2836(81)90087-5
25 https://doi.org/10.1016/0045-6039(85)90488-9
26 https://doi.org/10.1016/0092-8674(87)90322-9
27 https://doi.org/10.1016/s0022-5193(05)80104-3
28 https://doi.org/10.1016/s0893-9659(01)80026-4
29 https://doi.org/10.1073/pnas.0809770105
30 https://doi.org/10.1073/pnas.78.1.454
31 https://doi.org/10.1073/pnas.86.4.1183
32 https://doi.org/10.1080/10635150802422324
33 https://doi.org/10.1093/bioinformatics/15.3.211
34 https://doi.org/10.1093/bioinformatics/15.7.607
35 https://doi.org/10.1093/bioinformatics/6.2.81
36 https://doi.org/10.1093/bioinformatics/btg008
37 https://doi.org/10.1093/bioinformatics/btg1040
38 https://doi.org/10.1093/bioinformatics/bti376
39 https://doi.org/10.1093/bioinformatics/btm404
40 https://doi.org/10.1093/nar/22.22.4673
41 https://doi.org/10.1093/nar/gkh340
42 https://doi.org/10.1093/oxfordjournals.molbev.a040023
43 https://doi.org/10.1093/oxfordjournals.molbev.a040752
44 https://doi.org/10.1101/gr.076554.108
45 https://doi.org/10.1101/gr.1960404
46 https://doi.org/10.1101/gr.926603
47 https://doi.org/10.1126/science.172.3988.1089
48 https://doi.org/10.1371/journal.pcbi.0010067
49 https://doi.org/10.1371/journal.pcbi.1000156
50 https://doi.org/10.1371/journal.pcbi.1000392
51 schema:datePublished 2010-12
52 schema:datePublishedReg 2010-12-01
53 schema:description BACKGROUND: While most multiple sequence alignment programs expect that all or most of their input is known to be homologous, and penalise insertions and deletions, this is not a reasonable assumption for non-coding DNA, which is much less strongly conserved than protein-coding genes. Arguing that the goal of sequence alignment should be the detection of homology and not similarity, we incorporate an evolutionary model into a previously published multiple sequence alignment program for non-coding DNA, Sigma, as a sensitive likelihood-based way to assess the significance of alignments. Version 1 of Sigma was successful in eliminating spurious alignments but exhibited relatively poor sensitivity on synthetic data. Sigma 1 used a p-value (the probability under the "null hypothesis" of non-homology) to assess the significance of alignments, and, optionally, a background model that captured short-range genomic correlations. Sigma version 2, described here, retains these features, but calculates the p-value using a sophisticated evolutionary model that we describe here, and also allows for a transition matrix for different substitution rates from and to different nucleotides. Our evolutionary model takes separate account of mutation and fixation, and can be extended to allow for locally differing functional constraints on sequence. RESULTS: We demonstrate that, on real and synthetic data, Sigma-2 significantly outperforms other programs in specificity to genuine homology (that is, it minimises alignment of spuriously similar regions that do not have a common ancestry) while it is now as sensitive as the best current programs. CONCLUSIONS: Comparing these results with an extrapolation of the best results from other available programs, we suggest that conservation rates in intergenic DNA are often significantly over-estimated. 
It is increasingly important to align non-coding DNA correctly, in regulatory genomics and in the context of whole-genome alignment, and Sigma-2 is an important step in that direction.
54 schema:genre research_article
55 schema:inLanguage en
56 schema:isAccessibleForFree true
57 schema:isPartOf N69d9533b0de0477eaecac9d8a174139f
58 Ndf414ea34729473cb0b00098e6c16907
59 sg:journal.1023786
60 schema:name Sigma-2: Multiple sequence alignment of non-coding DNA via an evolutionary model
61 schema:pagination 464
62 schema:productId N16c9da6878ae4eacb42b32b58a254e12
63 N52f5621fc50b4757860493073c7372ee
64 N9332795fd8e6478a98282aeb05e2a9a7
65 N96d8b7c6045c4df091f0d570e6deb287
66 Nc42ba598611c4c2b9d22352e431090df
67 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029723621
68 https://doi.org/10.1186/1471-2105-11-464
69 schema:sdDatePublished 2019-04-10T20:53
70 schema:sdLicense https://scigraph.springernature.com/explorer/license/
71 schema:sdPublisher Ne116a4707fc4448ead2afcabb93f6ae2
72 schema:url http://link.springer.com/10.1186/1471-2105-11-464
73 sgo:license sg:explorer/license/
74 sgo:sdDataset articles
75 rdf:type schema:ScholarlyArticle
76 N16c9da6878ae4eacb42b32b58a254e12 schema:name readcube_id
77 schema:value 8d99105b5f7d122a78715703d1c0fb5191817fec0dc8f534cb37ad9b8a90b954
78 rdf:type schema:PropertyValue
79 N2401d312cc8e4120b6fd89538675e359 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
80 schema:name Sequence Alignment
81 rdf:type schema:DefinedTerm
82 N509a8119e5394336a71f47b72be7b0a7 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
83 schema:name DNA, Intergenic
84 rdf:type schema:DefinedTerm
85 N52f5621fc50b4757860493073c7372ee schema:name doi
86 schema:value 10.1186/1471-2105-11-464
87 rdf:type schema:PropertyValue
88 N69d9533b0de0477eaecac9d8a174139f schema:issueNumber 1
89 rdf:type schema:PublicationIssue
90 N76798f88ac324730ba78106b00a45dcc schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
91 schema:name Genomics
92 rdf:type schema:DefinedTerm
93 N7b4b3f9a12a64deba26d224a0dd528cb rdf:first sg:person.01336147564.03
94 rdf:rest Nec8e6460b0d94dfe8c0635d0768f587d
95 N9332795fd8e6478a98282aeb05e2a9a7 schema:name pubmed_id
96 schema:value 20846408
97 rdf:type schema:PropertyValue
98 N96d8b7c6045c4df091f0d570e6deb287 schema:name dimensions_id
99 schema:value pub.1029723621
100 rdf:type schema:PropertyValue
101 N9d6afd17020e4c478050b83b93ce13ed schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
102 schema:name Software
103 rdf:type schema:DefinedTerm
104 Na7e14f86caa34ad3943f273866e7ba4e schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
105 schema:name Likelihood Functions
106 rdf:type schema:DefinedTerm
107 Nc42ba598611c4c2b9d22352e431090df schema:name nlm_unique_id
108 schema:value 100965194
109 rdf:type schema:PropertyValue
110 Ncbaa0c3b7cbc414386b5dac99d25f8be schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
111 schema:name Evolution, Molecular
112 rdf:type schema:DefinedTerm
113 Ndcee0d442a92427e92006b75e28c464d schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
114 schema:name Sequence Analysis, DNA
115 rdf:type schema:DefinedTerm
116 Ndf414ea34729473cb0b00098e6c16907 schema:volumeNumber 11
117 rdf:type schema:PublicationVolume
118 Ne116a4707fc4448ead2afcabb93f6ae2 schema:name Springer Nature - SN SciGraph project
119 rdf:type schema:Organization
120 Nec8e6460b0d94dfe8c0635d0768f587d rdf:first sg:person.0614124227.85
121 rdf:rest rdf:nil
122 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
123 schema:name Biological Sciences
124 rdf:type schema:DefinedTerm
125 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
126 schema:name Genetics
127 rdf:type schema:DefinedTerm
128 sg:journal.1023786 schema:issn 1471-2105
129 schema:name BMC Bioinformatics
130 rdf:type schema:Periodical
131 sg:person.01336147564.03 schema:affiliation https://www.grid.ac/institutes/grid.462414.1
132 schema:familyName Jayaraman
133 schema:givenName Gayathri
134 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01336147564.03
135 rdf:type schema:Person
136 sg:person.0614124227.85 schema:affiliation https://www.grid.ac/institutes/grid.462414.1
137 schema:familyName Siddharthan
138 schema:givenName Rahul
139 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0614124227.85
140 rdf:type schema:Person
141 sg:pub.10.1007/bf00163848 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027539169
142 https://doi.org/10.1007/bf00163848
143 rdf:type schema:CreativeWork
144 sg:pub.10.1007/bf01731581 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023239976
145 https://doi.org/10.1007/bf01731581
146 rdf:type schema:CreativeWork
147 sg:pub.10.1007/bf01734359 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044065382
148 https://doi.org/10.1007/bf01734359
149 rdf:type schema:CreativeWork
150 sg:pub.10.1007/bf02101694 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039008589
151 https://doi.org/10.1007/bf02101694
152 rdf:type schema:CreativeWork
153 sg:pub.10.1007/bf02193625 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049370355
154 https://doi.org/10.1007/bf02193625
155 rdf:type schema:CreativeWork
156 sg:pub.10.1038/356168a0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003127095
157 https://doi.org/10.1038/356168a0
158 rdf:type schema:CreativeWork
159 sg:pub.10.1038/nature04979 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004695015
160 https://doi.org/10.1038/nature04979
161 rdf:type schema:CreativeWork
162 sg:pub.10.1186/1471-2105-11-54 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003163452
163 https://doi.org/10.1186/1471-2105-11-54
164 rdf:type schema:CreativeWork
165 sg:pub.10.1186/1471-2105-6-298 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047546418
166 https://doi.org/10.1186/1471-2105-6-298
167 rdf:type schema:CreativeWork
168 sg:pub.10.1186/1471-2105-7-143 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014283709
169 https://doi.org/10.1186/1471-2105-7-143
170 rdf:type schema:CreativeWork
171 sg:pub.10.1186/1748-7188-3-6 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027480499
172 https://doi.org/10.1186/1748-7188-3-6
173 rdf:type schema:CreativeWork
174 https://app.dimensions.ai/details/publication/pub.1083334412 rdf:type schema:CreativeWork
175 https://doi.org/10.1006/jmbi.2000.4042 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022575813
176 rdf:type schema:CreativeWork
177 https://doi.org/10.1016/0022-2836(81)90087-5 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024589839
178 rdf:type schema:CreativeWork
179 https://doi.org/10.1016/0045-6039(85)90488-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011933465
180 rdf:type schema:CreativeWork
181 https://doi.org/10.1016/0092-8674(87)90322-9 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050657578
182 rdf:type schema:CreativeWork
183 https://doi.org/10.1016/s0022-5193(05)80104-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053616169
184 rdf:type schema:CreativeWork
185 https://doi.org/10.1016/s0893-9659(01)80026-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024072417
186 rdf:type schema:CreativeWork
187 https://doi.org/10.1073/pnas.0809770105 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023033178
188 rdf:type schema:CreativeWork
189 https://doi.org/10.1073/pnas.78.1.454 schema:sameAs https://app.dimensions.ai/details/publication/pub.1005636769
190 rdf:type schema:CreativeWork
191 https://doi.org/10.1073/pnas.86.4.1183 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020152024
192 rdf:type schema:CreativeWork
193 https://doi.org/10.1080/10635150802422324 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058369811
194 rdf:type schema:CreativeWork
195 https://doi.org/10.1093/bioinformatics/15.3.211 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041319529
196 rdf:type schema:CreativeWork
197 https://doi.org/10.1093/bioinformatics/15.7.607 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014666719
198 rdf:type schema:CreativeWork
199 https://doi.org/10.1093/bioinformatics/6.2.81 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059413944
200 rdf:type schema:CreativeWork
201 https://doi.org/10.1093/bioinformatics/btg008 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023659156
202 rdf:type schema:CreativeWork
203 https://doi.org/10.1093/bioinformatics/btg1040 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049915428
204 rdf:type schema:CreativeWork
205 https://doi.org/10.1093/bioinformatics/bti376 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042305598
206 rdf:type schema:CreativeWork
207 https://doi.org/10.1093/bioinformatics/btm404 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007683223
208 rdf:type schema:CreativeWork
209 https://doi.org/10.1093/nar/22.22.4673 schema:sameAs https://app.dimensions.ai/details/publication/pub.1042438223
210 rdf:type schema:CreativeWork
211 https://doi.org/10.1093/nar/gkh340 schema:sameAs https://app.dimensions.ai/details/publication/pub.1025846396
212 rdf:type schema:CreativeWork
213 https://doi.org/10.1093/oxfordjournals.molbev.a040023 schema:sameAs https://app.dimensions.ai/details/publication/pub.1082762864
214 rdf:type schema:CreativeWork
215 https://doi.org/10.1093/oxfordjournals.molbev.a040752 schema:sameAs https://app.dimensions.ai/details/publication/pub.1077145115
216 rdf:type schema:CreativeWork
217 https://doi.org/10.1101/gr.076554.108 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011842335
218 rdf:type schema:CreativeWork
219 https://doi.org/10.1101/gr.1960404 schema:sameAs https://app.dimensions.ai/details/publication/pub.1017077573
220 rdf:type schema:CreativeWork
221 https://doi.org/10.1101/gr.926603 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043392725
222 rdf:type schema:CreativeWork
223 https://doi.org/10.1126/science.172.3988.1089 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062502913
224 rdf:type schema:CreativeWork
225 https://doi.org/10.1371/journal.pcbi.0010067 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007052752
226 rdf:type schema:CreativeWork
227 https://doi.org/10.1371/journal.pcbi.1000156 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000816158
228 rdf:type schema:CreativeWork
229 https://doi.org/10.1371/journal.pcbi.1000392 schema:sameAs https://app.dimensions.ai/details/publication/pub.1050816305
230 rdf:type schema:CreativeWork
231 https://www.grid.ac/institutes/grid.462414.1 schema:alternateName Institute of Mathematical Sciences
232 schema:name The Institute of Mathematical Sciences, Taramani, 600 113, Chennai, India
233 rdf:type schema:Organization
 



