SCMHBP: prediction and analysis of heme binding proteins using propensity scores of dipeptides


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2014-12-08

AUTHORS

Yi-Fan Liou, Phasit Charoenkwan, Yerukala Sathipati Srinivasulu, Tamara Vasylenko, Shih-Chung Lai, Hua-Chin Lee, Yi-Hsiung Chen, Hui-Ling Huang, Shinn-Ying Ho

ABSTRACT

BACKGROUND: Heme binding proteins (HBPs) are metalloproteins that contain a heme ligand (an iron-porphyrin complex) as the prosthetic group. Several computational methods have been proposed to predict heme binding residues and thereby to understand the interactions between heme and its host proteins. However, few in silico methods for identifying HBPs have been proposed.
RESULTS: This work proposes a method based on the scoring card method (SCM), named SCMHBP, for predicting and analyzing HBPs from sequences. First, a balanced dataset of 747 HBPs (selected using the Gene Ontology term GO:0020037) and 747 non-HBPs (selected from 91,414 putative non-HBPs) with an identity of 25% was established. Next, a set of scores that quantify the propensity of amino acids and dipeptides to occur in HBPs is estimated using SCM to maximize the predictive accuracy of SCMHBP. Finally, the informative physicochemical properties of the 20 amino acids are identified from the estimated propensity scores for use in categorizing HBPs. The training accuracy of SCMHBP is 85.90%, and its mean test accuracy on three independent test datasets is 71.57%. SCMHBP performs well in comparison with methods such as support vector machine (SVM), decision tree J48, and Bayes classifiers. The putative non-HBPs with high sequence propensity scores are potential HBPs, which can be further validated experimentally. The propensity scores of individual amino acids and dipeptides are examined to elucidate the interactions between heme and its host proteins. The following characteristics of HBPs are derived from the propensity scores: 1) aromatic side chains are important to the effectiveness of specific HBP functions; 2) a hydrophobic environment is important in the interaction between heme and binding sites; and 3) the whole HBP has low flexibility whereas the heme binding residues are relatively flexible.
CONCLUSIONS: SCMHBP yields knowledge that improves our understanding of HBPs rather than merely improving the accuracy of HBP prediction.
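
The abstract describes scoring a sequence from dipeptide propensity scores but does not give the formula. The Python sketch below is one plausible reading, not the authors' implementation. It assumes a sequence score is the composition-weighted sum of dipeptide scores, compared against a threshold. The function names, the default score of 500, the threshold, and the toy score values are all hypothetical. The step that estimates the scores from training data is not shown.

```python
from collections import Counter


def dipeptide_composition(sequence):
    """Relative frequency of each overlapping dipeptide in a protein sequence."""
    seq = sequence.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    total = len(pairs)
    return {dp: n / total for dp, n in counts.items()} if total else {}


def sequence_score(sequence, propensity, default=500.0):
    """Composition-weighted sum of dipeptide propensity scores (assumed form).

    Dipeptides missing from `propensity` get `default`, a hypothetical
    neutral value chosen only for this sketch.
    """
    composition = dipeptide_composition(sequence)
    return sum(freq * propensity.get(dp, default)
               for dp, freq in composition.items())


def is_predicted_hbp(sequence, propensity, threshold):
    """Label a sequence as a putative HBP when its score exceeds the threshold."""
    return sequence_score(sequence, propensity) > threshold


# Made-up scores for illustration only; these are not values from the paper.
TOY_SCORES = {"AC": 900.0, "CA": 300.0}

if __name__ == "__main__":
    print(sequence_score("ACAC", TOY_SCORES))
    print(is_predicted_hbp("ACAC", TOY_SCORES, threshold=600.0))
```

Under this reading, a putative non-HBP whose score lies well above the threshold would be a candidate for experimental follow-up, as the abstract suggests.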

PAGES

s4-s4

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/1471-2105-15-s16-s4

DOI

http://dx.doi.org/10.1186/1471-2105-15-s16-s4

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1014547146

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/25522279



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Bayes Theorem", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Binding Sites", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Carrier Proteins", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Databases, Protein", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Dipeptides", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Heme", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Heme-Binding Proteins", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Hemeproteins", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Humans", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Hydrophobic and Hydrophilic Interactions", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Ligands", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Propensity Score", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Protein Conformation", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Software", 
        "type": "DefinedTerm"
      }, 
      {
        "inDefinedTermSet": "https://www.nlm.nih.gov/mesh/", 
        "name": "Support Vector Machine", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan", 
          "id": "http://www.grid.ac/institutes/grid.260539.b", 
          "name": [
            "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Liou", 
        "givenName": "Yi-Fan", 
        "id": "sg:person.0652635604.75", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0652635604.75"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan", 
          "id": "http://www.grid.ac/institutes/grid.260539.b", 
          "name": [
            "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Charoenkwan", 
        "givenName": "Phasit", 
        "id": "sg:person.0750772430.13", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0750772430.13"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan", 
          "id": "http://www.grid.ac/institutes/grid.260539.b", 
          "name": [
            "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Srinivasulu", 
        "givenName": "Yerukala Sathipati", 
        "id": "sg:person.0767064204.78", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0767064204.78"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan", 
          "id": "http://www.grid.ac/institutes/grid.260539.b", 
          "name": [
            "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Vasylenko", 
        "givenName": "Tamara", 
        "id": "sg:person.01035177404.59", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01035177404.59"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan", 
          "id": "http://www.grid.ac/institutes/grid.260539.b", 
          "name": [
            "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lai", 
        "givenName": "Shih-Chung", 
        "id": "sg:person.01103312604.02", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01103312604.02"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan", 
          "id": "http://www.grid.ac/institutes/grid.260539.b", 
          "name": [
            "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan", 
            "Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Lee", 
        "givenName": "Hua-Chin", 
        "id": "sg:person.01065221030.28", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065221030.28"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan", 
          "id": "http://www.grid.ac/institutes/grid.260539.b", 
          "name": [
            "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Chen", 
        "givenName": "Yi-Hsiung", 
        "id": "sg:person.01217541204.27", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01217541204.27"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan", 
          "id": "http://www.grid.ac/institutes/grid.260539.b", 
          "name": [
            "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan", 
            "Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Huang", 
        "givenName": "Hui-Ling", 
        "id": "sg:person.0702657230.18", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0702657230.18"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan", 
          "id": "http://www.grid.ac/institutes/grid.260539.b", 
          "name": [
            "Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan", 
            "Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Ho", 
        "givenName": "Shinn-Ying", 
        "id": "sg:person.01074261064.02", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01074261064.02"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1186/1471-2105-12-s1-s47", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032715721", 
          "https://doi.org/10.1186/1471-2105-12-s1-s47"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nsmb1182", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1021806369", 
          "https://doi.org/10.1038/nsmb1182"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-12-207", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1016930168", 
          "https://doi.org/10.1186/1471-2105-12-207"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf00993309", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003980985", 
          "https://doi.org/10.1007/bf00993309"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s10969-005-1103-x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040567833", 
          "https://doi.org/10.1007/s10969-005-1103-x"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-13-s17-s3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1078668083", 
          "https://doi.org/10.1186/1471-2105-13-s17-s3"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nature02724", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039585157", 
          "https://doi.org/10.1038/nature02724"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1477-5956-10-s1-s20", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1040559490", 
          "https://doi.org/10.1186/1477-5956-10-s1-s20"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1472-6807-11-13", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1011861864", 
          "https://doi.org/10.1186/1472-6807-11-13"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/bf01195768", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031855956", 
          "https://doi.org/10.1007/bf01195768"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/s00253-003-1432-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000298809", 
          "https://doi.org/10.1007/s00253-003-1432-2"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2014-12-08", 
    "datePublishedReg": "2014-12-08", 
    "description": "BACKGROUND: Heme binding proteins (HBPs) are metalloproteins that contain a heme ligand (an iron-porphyrin complex) as the prosthetic group. Several computational methods have been proposed to predict heme binding residues and thereby to understand the interactions between heme and its host proteins. However, few in silico methods for identifying HBPs have been proposed.\nRESULTS: This work proposes a scoring card method (SCM) based method (named SCMHBP) for predicting and analyzing HBPs from sequences. A balanced dataset of 747 HBPs (selected using a Gene Ontology term GO:0020037) and 747 non-HBPs (selected from 91,414 putative non-HBPs) with an identity of 25% was firstly established. Consequently, a set of scores that quantified the propensity of amino acids and dipeptides to be HBPs is estimated using SCM to maximize the predictive accuracy of SCMHBP. Finally, the informative physicochemical properties of 20 amino acids are identified by utilizing the estimated propensity scores to be used to categorize HBPs. The training and mean test accuracies of SCMHBP applied to three independent test datasets are 85.90% and 71.57%, respectively. SCMHBP performs well relative to comparison with such methods as support vector machine (SVM), decision tree J48, and Bayes classifiers. The putative non-HBPs with high sequence propensity scores are potential HBPs, which can be further validated by experimental confirmation. The propensity scores of individual amino acids and dipeptides are examined to elucidate the interactions between heme and its host proteins. 
The following characteristics of HBPs are derived from the propensity scores: 1) aromatic side chains are important to the effectiveness of specific HBP functions; 2) a hydrophobic environment is important in the interaction between heme and binding sites; and 3) the whole HBP has low flexibility whereas the heme binding residues are relatively flexible.\nCONCLUSIONS: SCMHBP yields knowledge that improves our understanding of HBPs rather than merely improves the prediction accuracy in predicting HBPs.", 
    "genre": "article", 
    "id": "sg:pub.10.1186/1471-2105-15-s16-s4", 
    "inLanguage": "en", 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1023786", 
        "issn": [
          "1471-2105"
        ], 
        "name": "BMC Bioinformatics", 
        "publisher": "Springer Nature", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "Suppl 16", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "15"
      }
    ], 
    "keywords": [
      "support vector machine", 
      "scoring card method", 
      "vector machine", 
      "balanced dataset", 
      "Bayes classifier", 
      "test dataset", 
      "prediction accuracy", 
      "independent test dataset", 
      "datasets", 
      "computational methods", 
      "such methods", 
      "accuracy", 
      "J48", 
      "test accuracy", 
      "low flexibility", 
      "predictive accuracy", 
      "classifier", 
      "machine", 
      "method", 
      "flexibility", 
      "card method", 
      "set", 
      "sets of scores", 
      "environment", 
      "effectiveness", 
      "informative physicochemical properties", 
      "silico methods", 
      "work", 
      "training", 
      "knowledge", 
      "prediction", 
      "yields knowledge", 
      "sequence", 
      "interaction", 
      "characteristics", 
      "function", 
      "comparison", 
      "analysis", 
      "chain", 
      "understanding", 
      "identity", 
      "scores", 
      "properties", 
      "sites", 
      "group", 
      "confirmation", 
      "experimental confirmation", 
      "binding residues", 
      "propensity", 
      "propensity score", 
      "physicochemical properties", 
      "residues", 
      "protein", 
      "individual amino acids", 
      "amino acids", 
      "dipeptides", 
      "heme binding protein", 
      "binding protein", 
      "heme binding residues", 
      "host proteins", 
      "metalloproteins", 
      "heme ligand", 
      "ligands", 
      "prosthetic group", 
      "heme", 
      "acid", 
      "aromatic side chains", 
      "side chains", 
      "hydrophobic environment", 
      "SCMHBP", 
      "tree J48", 
      "high sequence propensity scores", 
      "sequence propensity scores", 
      "potential HBPs", 
      "characteristics of HBPs", 
      "specific HBP functions", 
      "HBP functions", 
      "whole HBP", 
      "SCMHBP yields knowledge", 
      "understanding of HBPs"
    ], 
    "name": "SCMHBP: prediction and analysis of heme binding proteins using propensity scores of dipeptides", 
    "pagination": "s4-s4", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1014547146"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/1471-2105-15-s16-s4"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "25522279"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/1471-2105-15-s16-s4", 
      "https://app.dimensions.ai/details/publication/pub.1014547146"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-01-01T18:34", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/article/article_631.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1186/1471-2105-15-s16-s4"
  }
]
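
As a rough illustration of reading a record like the one above, the following Python sketch pulls the title, DOI, and author names out of a JSON-LD array using only the standard library. The `summarize` helper and the trimmed `SAMPLE_RECORD` are assumptions made for this example. The sample keeps only a few of the fields shown above and is not the full record.

```python
import json

# Trimmed copy of the record above, kept short for illustration.
SAMPLE_RECORD = """
[
  {
    "name": "SCMHBP: prediction and analysis of heme binding proteins using propensity scores of dipeptides",
    "author": [
      {"familyName": "Liou", "givenName": "Yi-Fan", "type": "Person"},
      {"familyName": "Charoenkwan", "givenName": "Phasit", "type": "Person"}
    ],
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1014547146"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/1471-2105-15-s16-s4"]}
    ]
  }
]
"""


def summarize(record_json):
    """Return title, DOI, and author names from a SciGraph-style JSON-LD array."""
    article = json.loads(record_json)[0]
    # Each productId entry holds a name and a one-element list of values.
    identifiers = {p["name"]: p["value"][0] for p in article.get("productId", [])}
    authors = [f'{a["givenName"]} {a["familyName"]}'
               for a in article.get("author", [])]
    return {
        "title": article["name"],
        "doi": identifiers.get("doi"),
        "authors": authors,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_RECORD))
```

The sketch treats the record as plain JSON and does not expand the `@context`. That is enough for simple field lookups but not for full linked-data processing.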
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-15-s16-s4'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-15-s16-s4'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-15-s16-s4'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/1471-2105-15-s16-s4'
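
The four curl commands differ only in the Accept header. A minimal Python sketch of the same content negotiation, using the standard library's `urllib.request`, might look like this. The `ACCEPT` mapping, `build_request`, and `fetch` are names made up for the example, and whether the endpoint still answers these requests has not been verified here.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Media types taken from the curl commands above; the short keys are arbitrary.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt):
    """Build a GET request that asks for one RDF serialization of a record."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": ACCEPT[fmt]},
    )


def fetch(record_id, fmt, timeout=30):
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    request = build_request("pub.10.1186/1471-2105-15-s16-s4", "turtle")
    print(request.full_url, request.get_header("Accept"))
```

Keeping `build_request` separate from `fetch` lets the header logic be checked without network access.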


 

This table displays all metadata directly associated with this object as RDF triples.

303 TRIPLES      22 PREDICATES      132 URIs      113 LITERALS      22 BLANK NODES
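
In the table, author order is stored as a linked list of blank nodes: each node has an `rdf:first` value (a person) and an `rdf:rest` link to the next node, ending at `rdf:nil`. The Python sketch below shows one way to walk such a list. The blank node labels (`_:b1` and so on) are placeholders, and the chain is shortened to the first three authors, so this is an illustration rather than a copy of the table.

```python
RDF_NIL = "rdf:nil"


def read_rdf_list(triples, head):
    """Follow rdf:first / rdf:rest links from `head` and return items in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        # Guard against cycles and dangling links in malformed data.
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"malformed RDF list at node {node!r}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# Shortened, relabelled chain; triples are deliberately out of order to show
# that the sequence comes from the links, not from row position.
TRIPLES = [
    ("_:b2", "rdf:first", "sg:person.0750772430.13"),
    ("_:b2", "rdf:rest", "_:b3"),
    ("_:b1", "rdf:first", "sg:person.0652635604.75"),
    ("_:b1", "rdf:rest", "_:b2"),
    ("_:b3", "rdf:first", "sg:person.0767064204.78"),
    ("_:b3", "rdf:rest", RDF_NIL),
]

if __name__ == "__main__":
    print(read_rdf_list(TRIPLES, "_:b1"))
```

In the full table the head of the list is the object of the `schema:author` row, and the same walk would recover all nine authors in publication order.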

Subject Predicate Object
1 sg:pub.10.1186/1471-2105-15-s16-s4 schema:about N0f292cef63ea421eb582ede7b7e7b71c
2 N228d5796662449f49f974a11aff430aa
3 N2bca06986b414c3baece94779f1954f6
4 N455a215fc8f346228001be44420f39cf
5 N5a643d25df2d4fac81cdbb7eff75bcae
6 N63162642662245f29a8554a62abc112c
7 N6644a1c93e434149af4d65a21d7ac4f1
8 N6c582d051f6a4b4885b98df2ecffc229
9 N6d2b04fe4e094db0978668ace31f9e59
10 N859974a27a9f4e9a9a1c89db6cab1b04
11 N8968ae5b6ac949fd88c66a56a5a5c2c8
12 N915a4b03d38c41118d11dd62404a69f9
13 N967c2afb62874645af15314cf9628eb4
14 Ndafcc6132eea40299acf49fd5e2cee3f
15 Ne0eed3f2b5a54e58b0e6285e4d2ad98c
16 anzsrc-for:08
17 anzsrc-for:0801
18 schema:author Nd30f99c8da634b5db06661475677dde0
19 schema:citation sg:pub.10.1007/bf00993309
20 sg:pub.10.1007/bf01195768
21 sg:pub.10.1007/s00253-003-1432-2
22 sg:pub.10.1007/s10969-005-1103-x
23 sg:pub.10.1038/nature02724
24 sg:pub.10.1038/nsmb1182
25 sg:pub.10.1186/1471-2105-12-207
26 sg:pub.10.1186/1471-2105-12-s1-s47
27 sg:pub.10.1186/1471-2105-13-s17-s3
28 sg:pub.10.1186/1472-6807-11-13
29 sg:pub.10.1186/1477-5956-10-s1-s20
30 schema:datePublished 2014-12-08
31 schema:datePublishedReg 2014-12-08
32 schema:description BACKGROUND: Heme binding proteins (HBPs) are metalloproteins that contain a heme ligand (an iron-porphyrin complex) as the prosthetic group. Several computational methods have been proposed to predict heme binding residues and thereby to understand the interactions between heme and its host proteins. However, few in silico methods for identifying HBPs have been proposed. RESULTS: This work proposes a scoring card method (SCM) based method (named SCMHBP) for predicting and analyzing HBPs from sequences. A balanced dataset of 747 HBPs (selected using a Gene Ontology term GO:0020037) and 747 non-HBPs (selected from 91,414 putative non-HBPs) with an identity of 25% was firstly established. Consequently, a set of scores that quantified the propensity of amino acids and dipeptides to be HBPs is estimated using SCM to maximize the predictive accuracy of SCMHBP. Finally, the informative physicochemical properties of 20 amino acids are identified by utilizing the estimated propensity scores to be used to categorize HBPs. The training and mean test accuracies of SCMHBP applied to three independent test datasets are 85.90% and 71.57%, respectively. SCMHBP performs well relative to comparison with such methods as support vector machine (SVM), decision tree J48, and Bayes classifiers. The putative non-HBPs with high sequence propensity scores are potential HBPs, which can be further validated by experimental confirmation. The propensity scores of individual amino acids and dipeptides are examined to elucidate the interactions between heme and its host proteins. The following characteristics of HBPs are derived from the propensity scores: 1) aromatic side chains are important to the effectiveness of specific HBP functions; 2) a hydrophobic environment is important in the interaction between heme and binding sites; and 3) the whole HBP has low flexibility whereas the heme binding residues are relatively flexible. 
CONCLUSIONS: SCMHBP yields knowledge that improves our understanding of HBPs rather than merely improves the prediction accuracy in predicting HBPs.
33 schema:genre article
34 schema:inLanguage en
35 schema:isAccessibleForFree true
36 schema:isPartOf N33e447f4b2dc4b7bbd8a2474bdddfdde
37 Nc9e56646a81d4aa0bae4388b7ad15726
38 sg:journal.1023786
39 schema:keywords Bayes classifier
40 HBP functions
41 J48
42 SCMHBP
43 SCMHBP yields knowledge
44 accuracy
45 acid
46 amino acids
47 analysis
48 aromatic side chains
49 balanced dataset
50 binding protein
51 binding residues
52 card method
53 chain
54 characteristics
55 characteristics of HBPs
56 classifier
57 comparison
58 computational methods
59 confirmation
60 datasets
61 dipeptides
62 effectiveness
63 environment
64 experimental confirmation
65 flexibility
66 function
67 group
68 heme
69 heme binding protein
70 heme binding residues
71 heme ligand
72 high sequence propensity scores
73 host proteins
74 hydrophobic environment
75 identity
76 independent test dataset
77 individual amino acids
78 informative physicochemical properties
79 interaction
80 knowledge
81 ligands
82 low flexibility
83 machine
84 metalloproteins
85 method
86 physicochemical properties
87 potential HBPs
88 prediction
89 prediction accuracy
90 predictive accuracy
91 propensity
92 propensity score
93 properties
94 prosthetic group
95 protein
96 residues
97 scores
98 scoring card method
99 sequence
100 sequence propensity scores
101 set
102 sets of scores
103 side chains
104 silico methods
105 sites
106 specific HBP functions
107 such methods
108 support vector machine
109 test accuracy
110 test dataset
111 training
112 tree J48
113 understanding
114 understanding of HBPs
115 vector machine
116 whole HBP
117 work
118 yields knowledge
119 schema:name SCMHBP: prediction and analysis of heme binding proteins using propensity scores of dipeptides
120 schema:pagination s4-s4
121 schema:productId N06cc4479a22540cab526007bce5f6e2a
122 N1ed54014aeb74283a4f33a2de15c7186
123 Nbb216c6d1189452aaf03e6d6e2952a47
124 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014547146
125 https://doi.org/10.1186/1471-2105-15-s16-s4
126 schema:sdDatePublished 2022-01-01T18:34
127 schema:sdLicense https://scigraph.springernature.com/explorer/license/
128 schema:sdPublisher Nae3d15675c7f426bb48108ef3366ca04
129 schema:url https://doi.org/10.1186/1471-2105-15-s16-s4
130 sgo:license sg:explorer/license/
131 sgo:sdDataset articles
132 rdf:type schema:ScholarlyArticle
133 N06cc4479a22540cab526007bce5f6e2a schema:name pubmed_id
134 schema:value 25522279
135 rdf:type schema:PropertyValue
136 N0f292cef63ea421eb582ede7b7e7b71c schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
137 schema:name Databases, Protein
138 rdf:type schema:DefinedTerm
139 N1ed54014aeb74283a4f33a2de15c7186 schema:name doi
140 schema:value 10.1186/1471-2105-15-s16-s4
141 rdf:type schema:PropertyValue
142 N228d5796662449f49f974a11aff430aa schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
143 schema:name Protein Conformation
144 rdf:type schema:DefinedTerm
145 N2bca06986b414c3baece94779f1954f6 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
146 schema:name Hemeproteins
147 rdf:type schema:DefinedTerm
148 N33e447f4b2dc4b7bbd8a2474bdddfdde schema:issueNumber Suppl 16
149 rdf:type schema:PublicationIssue
150 N455a215fc8f346228001be44420f39cf schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
151 schema:name Hydrophobic and Hydrophilic Interactions
152 rdf:type schema:DefinedTerm
153 N48a64cb0c6234cfba4418a7623db4538 rdf:first sg:person.01074261064.02
154 rdf:rest rdf:nil
155 N571b1da865394b19b8bb70c9b9bb1ec3 rdf:first sg:person.0750772430.13
156 rdf:rest Nc72b08c7f80c4b098c0f6defca43134e
157 N5a643d25df2d4fac81cdbb7eff75bcae schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
158 schema:name Dipeptides
159 rdf:type schema:DefinedTerm
160 N63162642662245f29a8554a62abc112c schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
161 schema:name Support Vector Machine
162 rdf:type schema:DefinedTerm
163 N6644a1c93e434149af4d65a21d7ac4f1 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
164 schema:name Software
165 rdf:type schema:DefinedTerm
166 N6c582d051f6a4b4885b98df2ecffc229 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
167 schema:name Heme
168 rdf:type schema:DefinedTerm
169 N6d2b04fe4e094db0978668ace31f9e59 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
170 schema:name Carrier Proteins
171 rdf:type schema:DefinedTerm
172 N859974a27a9f4e9a9a1c89db6cab1b04 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
173 schema:name Heme-Binding Proteins
174 rdf:type schema:DefinedTerm
175 N8968ae5b6ac949fd88c66a56a5a5c2c8 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
176 schema:name Propensity Score
177 rdf:type schema:DefinedTerm
178 N915a4b03d38c41118d11dd62404a69f9 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
179 schema:name Bayes Theorem
180 rdf:type schema:DefinedTerm
181 N967c2afb62874645af15314cf9628eb4 schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
182 schema:name Humans
183 rdf:type schema:DefinedTerm
184 N9e4e20f75a2547f08f59f936f18f4e95 rdf:first sg:person.01065221030.28
185 rdf:rest Ncb131c51871e483a9efaf0ec47c186bc
186 Nae3d15675c7f426bb48108ef3366ca04 schema:name Springer Nature - SN SciGraph project
187 rdf:type schema:Organization
188 Nbb216c6d1189452aaf03e6d6e2952a47 schema:name dimensions_id
189 schema:value pub.1014547146
190 rdf:type schema:PropertyValue
191 Nc023943f77f94d3bad43280907cb5222 rdf:first sg:person.01035177404.59
192 rdf:rest Nfea62baf19e3431b8b7367dee0922015
193 Nc72b08c7f80c4b098c0f6defca43134e rdf:first sg:person.0767064204.78
194 rdf:rest Nc023943f77f94d3bad43280907cb5222
195 Nc9e56646a81d4aa0bae4388b7ad15726 schema:volumeNumber 15
196 rdf:type schema:PublicationVolume
197 Ncb131c51871e483a9efaf0ec47c186bc rdf:first sg:person.01217541204.27
198 rdf:rest Nd7e28caedacd4a00b845e1f5957d3d82
199 Nd30f99c8da634b5db06661475677dde0 rdf:first sg:person.0652635604.75
200 rdf:rest N571b1da865394b19b8bb70c9b9bb1ec3
201 Nd7e28caedacd4a00b845e1f5957d3d82 rdf:first sg:person.0702657230.18
202 rdf:rest N48a64cb0c6234cfba4418a7623db4538
203 Ndafcc6132eea40299acf49fd5e2cee3f schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
204 schema:name Binding Sites
205 rdf:type schema:DefinedTerm
206 Ne0eed3f2b5a54e58b0e6285e4d2ad98c schema:inDefinedTermSet https://www.nlm.nih.gov/mesh/
207 schema:name Ligands
208 rdf:type schema:DefinedTerm
209 Nfea62baf19e3431b8b7367dee0922015 rdf:first sg:person.01103312604.02
210 rdf:rest N9e4e20f75a2547f08f59f936f18f4e95
211 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
212 schema:name Information and Computing Sciences
213 rdf:type schema:DefinedTerm
214 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
215 schema:name Artificial Intelligence and Image Processing
216 rdf:type schema:DefinedTerm
217 sg:journal.1023786 schema:issn 1471-2105
218 schema:name BMC Bioinformatics
219 schema:publisher Springer Nature
220 rdf:type schema:Periodical
221 sg:person.01035177404.59 schema:affiliation grid-institutes:grid.260539.b
222 schema:familyName Vasylenko
223 schema:givenName Tamara
224 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01035177404.59
225 rdf:type schema:Person
226 sg:person.01065221030.28 schema:affiliation grid-institutes:grid.260539.b
227 schema:familyName Lee
228 schema:givenName Hua-Chin
229 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01065221030.28
230 rdf:type schema:Person
231 sg:person.01074261064.02 schema:affiliation grid-institutes:grid.260539.b
232 schema:familyName Ho
233 schema:givenName Shinn-Ying
234 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01074261064.02
235 rdf:type schema:Person
236 sg:person.01103312604.02 schema:affiliation grid-institutes:grid.260539.b
237 schema:familyName Lai
238 schema:givenName Shih-Chung
239 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01103312604.02
240 rdf:type schema:Person
241 sg:person.01217541204.27 schema:affiliation grid-institutes:grid.260539.b
242 schema:familyName Chen
243 schema:givenName Yi-Hsiung
244 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01217541204.27
245 rdf:type schema:Person
246 sg:person.0652635604.75 schema:affiliation grid-institutes:grid.260539.b
247 schema:familyName Liou
248 schema:givenName Yi-Fan
249 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0652635604.75
250 rdf:type schema:Person
251 sg:person.0702657230.18 schema:affiliation grid-institutes:grid.260539.b
252 schema:familyName Huang
253 schema:givenName Hui-Ling
254 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0702657230.18
255 rdf:type schema:Person
256 sg:person.0750772430.13 schema:affiliation grid-institutes:grid.260539.b
257 schema:familyName Charoenkwan
258 schema:givenName Phasit
259 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0750772430.13
260 rdf:type schema:Person
261 sg:person.0767064204.78 schema:affiliation grid-institutes:grid.260539.b
262 schema:familyName Srinivasulu
263 schema:givenName Yerukala Sathipati
264 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0767064204.78
265 rdf:type schema:Person
266 sg:pub.10.1007/bf00993309 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003980985
267 https://doi.org/10.1007/bf00993309
268 rdf:type schema:CreativeWork
269 sg:pub.10.1007/bf01195768 schema:sameAs https://app.dimensions.ai/details/publication/pub.1031855956
270 https://doi.org/10.1007/bf01195768
271 rdf:type schema:CreativeWork
272 sg:pub.10.1007/s00253-003-1432-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000298809
273 https://doi.org/10.1007/s00253-003-1432-2
274 rdf:type schema:CreativeWork
275 sg:pub.10.1007/s10969-005-1103-x schema:sameAs https://app.dimensions.ai/details/publication/pub.1040567833
276 https://doi.org/10.1007/s10969-005-1103-x
277 rdf:type schema:CreativeWork
278 sg:pub.10.1038/nature02724 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039585157
279 https://doi.org/10.1038/nature02724
280 rdf:type schema:CreativeWork
281 sg:pub.10.1038/nsmb1182 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021806369
282 https://doi.org/10.1038/nsmb1182
283 rdf:type schema:CreativeWork
284 sg:pub.10.1186/1471-2105-12-207 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016930168
285 https://doi.org/10.1186/1471-2105-12-207
286 rdf:type schema:CreativeWork
287 sg:pub.10.1186/1471-2105-12-s1-s47 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032715721
288 https://doi.org/10.1186/1471-2105-12-s1-s47
289 rdf:type schema:CreativeWork
290 sg:pub.10.1186/1471-2105-13-s17-s3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1078668083
291 https://doi.org/10.1186/1471-2105-13-s17-s3
292 rdf:type schema:CreativeWork
293 sg:pub.10.1186/1472-6807-11-13 schema:sameAs https://app.dimensions.ai/details/publication/pub.1011861864
294 https://doi.org/10.1186/1472-6807-11-13
295 rdf:type schema:CreativeWork
296 sg:pub.10.1186/1477-5956-10-s1-s20 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040559490
297 https://doi.org/10.1186/1477-5956-10-s1-s20
298 rdf:type schema:CreativeWork
299 grid-institutes:grid.260539.b schema:alternateName Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
300 Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
301 schema:name Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
302 Institute of Bioinformatics and Systems Biology, National Chiao Tung University, Hsinchu, Taiwan
303 rdf:type schema:Organization
 





