funbarRF: DNA barcode-based fungal species prediction using multiclass Random Forest supervised learning model


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2019-12

AUTHORS

Prabina Kumar Meher, Tanmaya Kumar Sahu, Shachi Gahoi, Ruchi Tomar, Atmakuri Ramakrishna Rao

ABSTRACT

BACKGROUND: Identifying unknown fungal species aids the conservation of fungal diversity. Because many fungal species cannot be cultured, morphological identification of those species is almost impossible. DNA barcoding, however, can be employed to identify such species. For fungal taxonomy prediction, the ITS (internal transcribed spacer) region of rDNA (ribosomal DNA) is used as the barcode. Although computational prediction of fungal species has become feasible with the availability of a huge volume of barcode sequences in the public domain, prediction remains challenging because of the high degree of variability among ITS regions within species.

RESULTS: A Random Forest (RF)-based predictor was built for identification of unknown fungal species. The reference and query sequences were mapped onto numeric features based on gapped base pair compositions and then used as training and test sets, respectively, for prediction of fungal species using RF. Accuracy exceeded 85% when 4 sequences per species in the reference set were used, and it stabilized at ~88% when ≥7 sequences per species were used to train the model. The proposed model achieved accuracy comparable to existing methods when evaluated through a cross-validation procedure. It also outperformed several existing models used for identification of species other than fungi.

CONCLUSIONS: An online prediction server, "funbarRF", is available at http://cabgrid.res.in:8080/funbarrf/ for fungal species identification. An R package, funbarRF ( https://cran.r-project.org/web/packages/funbarRF/ ), is also available for prediction using high-throughput sequence data. This work is expected to support future efforts in fungal taxonomy assignment based on DNA barcodes.
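The abstract does not define "gapped base pair compositions". As a rough illustration only, the sketch below computes one common form of such a feature: the relative frequency of each ordered base pair whose two bases are separated by a fixed number of intervening positions. The function name `gapped_pair_composition`, the `max_gap` parameter, and the normalization are assumptions made for this example; they are not taken from the paper or from the funbarRF R package.

```python
from itertools import product

BASES = "ACGT"


def gapped_pair_composition(seq, max_gap=2):
    """Relative frequency of each ordered base pair at gaps 0..max_gap.

    A gap of g means the two bases are separated by g intervening positions.
    Pairs containing characters other than A, C, G, T (e.g. N) are skipped.
    NOTE: hypothetical feature definition, not the published funbarRF one.
    """
    seq = seq.upper()
    features = {}
    for gap in range(max_gap + 1):
        counts = {a + b: 0 for a, b in product(BASES, repeat=2)}
        span = gap + 1  # index distance between the two bases
        total = 0
        for i in range(len(seq) - span):
            pair = seq[i] + seq[i + span]
            if pair in counts:
                counts[pair] += 1
                total += 1
        for pair, n in counts.items():
            # Normalize by the number of valid pairs seen at this gap.
            features[f"{pair}_gap{gap}"] = n / total if total else 0.0
    return features


features = gapped_pair_composition("ACGT", max_gap=1)
print(len(features))          # 16 pairs x 2 gap sizes
print(features["AG_gap1"])    # A and G are one position apart
```

Each sequence then becomes a fixed-length numeric vector (16 values per gap size), which is the kind of input a Random Forest classifier can be trained on.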

PAGES

2

Journal

TITLE

BMC Genetics

ISSUE

1

VOLUME

20

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/s12863-018-0710-z

DOI

http://dx.doi.org/10.1186/s12863-018-0710-z

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1111225119

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/30616524



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Indian Agricultural Statistics Research Institute", 
          "id": "https://www.grid.ac/institutes/grid.463150.5", 
          "name": [
            "Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, 110012, New Delhi, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Meher", 
        "givenName": "Prabina Kumar", 
        "id": "sg:person.01010516501.02", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01010516501.02"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Indian Agricultural Statistics Research Institute", 
          "id": "https://www.grid.ac/institutes/grid.463150.5", 
          "name": [
            "Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, 110012, New Delhi, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Sahu", 
        "givenName": "Tanmaya Kumar", 
        "id": "sg:person.01017164135.43", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01017164135.43"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Indian Agricultural Statistics Research Institute", 
          "id": "https://www.grid.ac/institutes/grid.463150.5", 
          "name": [
            "Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, 110012, New Delhi, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Gahoi", 
        "givenName": "Shachi", 
        "id": "sg:person.0762652776.78", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0762652776.78"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Indian Agricultural Statistics Research Institute", 
          "id": "https://www.grid.ac/institutes/grid.463150.5", 
          "name": [
            "Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, 110012, New Delhi, India", 
            "Department of Bioinformatics, Janta Vedic College, 250611, Baraut, Baghpat, Uttar Pradesh, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Tomar", 
        "givenName": "Ruchi", 
        "id": "sg:person.014511257536.49", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014511257536.49"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Indian Agricultural Statistics Research Institute", 
          "id": "https://www.grid.ac/institutes/grid.463150.5", 
          "name": [
            "Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, 110012, New Delhi, India"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Rao", 
        "givenName": "Atmakuri Ramakrishna", 
        "id": "sg:person.013310041572.77", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013310041572.77"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1371/journal.pone.0030986", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1000783148"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0014689", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001079094"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0076910", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001440016"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.1755-0998.2010.02844.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1001517455"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1007/978-3-642-22709-7_30", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1002967803", 
          "https://doi.org/10.1007/978-3-642-22709-7_30"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.0905845106", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1003624990"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1756-0381-7-4", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1010349038", 
          "https://doi.org/10.1186/1756-0381-7-4"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/s13104-016-2203-3", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1012386029", 
          "https://doi.org/10.1186/s13104-016-2203-3"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/b978-1-55860-377-6.50023-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013049849"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0030490", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013594611"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0022-2836(05)80360-2", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1013618994"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/ismej.2012.106", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014657236", 
          "https://doi.org/10.1038/ismej.2012.106"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-10-s14-s7", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1014930590", 
          "https://doi.org/10.1186/1471-2105-10-s14-s7"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.1755-0998.2009.02635.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1015717931"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.fgb.2008.02.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1016202088"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/1755-0998.12073", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1022518269"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1261/rna.5060403", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1023788394"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1023/a:1010933404324", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1024739340", 
          "https://doi.org/10.1023/a:1010933404324"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/ece3.546", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1026745796"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btq003", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1027162145"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.1755-0998.2008.02372.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1027635822"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/s13040-016-0086-4", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1029380617", 
          "https://doi.org/10.1186/s13040-016-0086-4"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1098/rspb.2002.2218", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030057572"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/j.1471-8286.2007.01678.x", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1031314971"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btv419", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1032739181"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-7-s5-s15", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1033964515", 
          "https://doi.org/10.1186/1471-2105-7-s5-s15"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.fbr.2011.01.001", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1036010946"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aem.01541-09", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038354310"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/bti547", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1039451568"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1111/mec.12481", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1043063567"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1128/aem.00062-07", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1045980007"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1073/pnas.1117018109", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1046949865"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/nar/25.17.3389", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047265454"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.inpa.2016.08.002", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1047997438"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.gene.2016.07.010", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1048770769"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-10-s14-s10", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1049737748", 
          "https://doi.org/10.1186/1471-2105-10-s14-s10"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1098/rspb.2010.1089", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1051432252"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1371/journal.pone.0099982", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1052506737"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1086/282802", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1058592925"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1089/cmb.1997.4.127", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1059245161"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/bioinformatics/btw346", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1059414789"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3852/14-293", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1071480260"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1093/oxfordjournals.molbev.a040454", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1079752303"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1101/074161", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1085112883"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2019-12", 
    "datePublishedReg": "2019-12-01", 
    "description": "BACKGROUND: Identification of unknown fungal species aids to the conservation of fungal diversity. As many fungal species cannot be cultured, morphological identification of those species is almost impossible. But, DNA barcoding technique can be employed for identification of such species. For fungal taxonomy prediction, the ITS (internal transcribed spacer) region of rDNA (ribosomal DNA) is used as barcode. Though the computational prediction of fungal species has become feasible with the availability of huge volume of barcode sequences in public domain, prediction of fungal species is challenging due to high degree of variability among ITS regions within species.\nRESULTS: A Random Forest (RF)-based predictor was built for identification of unknown fungal species. The reference and query sequences were mapped onto numeric features based on gapped base pair compositions, and then used as training and test sets respectively for prediction of fungal species using RF. More than 85% accuracy was found when 4 sequences per species in the reference set were utilized; whereas it was seen to be stabilized at ~88% if \u22657 sequence per species in the reference set were used for training of the model. The proposed model achieved comparable accuracy, while evaluated against existing methods through cross-validation procedure. The proposed model also outperformed several existing models used for identification of different species other than fungi.\nCONCLUSIONS: An online prediction server \"funbarRF\" is established at http://cabgrid.res.in:8080/funbarrf/ for fungal species identification. Besides, an R-package funbarRF ( https://cran.r-project.org/web/packages/funbarRF/ ) is also available for prediction using high throughput sequence data. The effort put in this work will certainly supplement the future endeavors in the direction of fungal taxonomy assignments based on DNA barcode.", 
    "genre": "research_article", 
    "id": "sg:pub.10.1186/s12863-018-0710-z", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": true, 
    "isPartOf": [
      {
        "id": "sg:journal.1024251", 
        "issn": [
          "1471-2156"
        ], 
        "name": "BMC Genetics", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "20"
      }
    ], 
    "name": "funbarRF: DNA barcode-based fungal species prediction using multiclass Random Forest supervised learning model", 
    "pagination": "2", 
    "productId": [
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "68a3fccc145fdd2631edcda04f9032cfea08a83aed4a378ad178092606b97241"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "30616524"
        ]
      }, 
      {
        "name": "nlm_unique_id", 
        "type": "PropertyValue", 
        "value": [
          "100966978"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/s12863-018-0710-z"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1111225119"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/s12863-018-0710-z", 
      "https://app.dimensions.ai/details/publication/pub.1111225119"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2019-04-11T08:36", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000314_0000000314/records_55852_00000000.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://link.springer.com/10.1186%2Fs12863-018-0710-z"
  }
]
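A record shaped like the JSON-LD above can be read with any JSON parser. The following is a minimal sketch, assuming the top level is a one-element array whose object carries `name`, `author`, and `citation` keys as shown; the helper name `summarize_record` and the shortened `SAMPLE` string are invented for this example. Citation ids are de-duplicated in order of first appearance, since the same id can occur more than once in a dump.

```python
import json

# Shortened stand-in for the full record; only the keys used below are kept.
SAMPLE = """
[
  {
    "name": "funbarRF: DNA barcode-based fungal species prediction using multiclass Random Forest supervised learning model",
    "author": [
      {"familyName": "Meher", "givenName": "Prabina Kumar", "type": "Person"},
      {"familyName": "Sahu", "givenName": "Tanmaya Kumar", "type": "Person"}
    ],
    "citation": [
      {"id": "https://doi.org/10.1101/074161", "type": "CreativeWork"},
      {"id": "https://doi.org/10.1101/074161", "type": "CreativeWork"},
      {"id": "sg:pub.10.1186/1756-0381-7-4", "type": "CreativeWork"}
    ]
  }
]
"""


def summarize_record(jsonld_text):
    """Return the title, author names, and unique citation ids of one record."""
    record = json.loads(jsonld_text)[0]  # the dump is a one-element array
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])
    ]
    seen = set()
    citations = []
    for entry in record.get("citation", []):
        if entry["id"] not in seen:  # keep the first occurrence only
            seen.add(entry["id"])
            citations.append(entry["id"])
    return {"title": record.get("name"), "authors": authors, "citations": citations}


summary = summarize_record(SAMPLE)
print(summary["authors"])
print(len(summary["citations"]))
```

This treats the record as plain JSON. Resolving the `@context` or expanding compact ids such as `sg:pub...` would need a JSON-LD processor, which is outside the scope of this sketch.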
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/s12863-018-0710-z'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/s12863-018-0710-z'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/s12863-018-0710-z'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/s12863-018-0710-z'
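The same content negotiation can be done from a script. The sketch below mirrors the four curl calls using only the Python standard library; `build_request`, `fetch`, and the `ACCEPT` mapping are names made up for this example, and whether the endpoint still answers is not something this page confirms.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"

# Short format labels mapped to the Accept headers used in the curl examples.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt="json-ld"):
    """Build a GET request for one record in the chosen RDF serialization."""
    return urllib.request.Request(
        BASE_URL + record_id, headers={"Accept": ACCEPT[fmt]}
    )


def fetch(record_id, fmt="json-ld", timeout=30):
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


request = build_request("pub.10.1186/s12863-018-0710-z", "turtle")
print(request.full_url)
print(request.get_header("Accept"))
```

Calling `fetch("pub.10.1186/s12863-018-0710-z", "nt")` would perform the actual download. Building the request separately keeps the URL and header logic checkable without network access.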


 

This table displays all metadata directly associated to this object as RDF triples.

239 TRIPLES      21 PREDICATES      73 URIs      21 LITERALS      9 BLANK NODES
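In the triple table below, author order is not stored on the article node directly. `schema:author` points to a blank node, and each blank node holds one author in `rdf:first` and a link to the next node in `rdf:rest`, ending at `rdf:nil`. The sketch below walks that chain using the blank-node ids and person ids copied from the table; the function name `walk_rdf_list` and the two dictionaries are just one convenient way to hold the triples for this example.

```python
RDF_NIL = "rdf:nil"

# rdf:first and rdf:rest triples for the author list, copied from the table.
FIRST = {
    "N74d3388a881946108a07d0c3cc7cf1cb": "sg:person.01010516501.02",
    "N152aa422244146858ef1e057f3ae551f": "sg:person.01017164135.43",
    "N52c7debdecc741c3a885add359e9cfed": "sg:person.0762652776.78",
    "N144c80b16e624091927b489cd84d74bd": "sg:person.014511257536.49",
    "N08ee669e1d0d43b5babbfe84e5caa803": "sg:person.013310041572.77",
}
REST = {
    "N74d3388a881946108a07d0c3cc7cf1cb": "N152aa422244146858ef1e057f3ae551f",
    "N152aa422244146858ef1e057f3ae551f": "N52c7debdecc741c3a885add359e9cfed",
    "N52c7debdecc741c3a885add359e9cfed": "N144c80b16e624091927b489cd84d74bd",
    "N144c80b16e624091927b489cd84d74bd": "N08ee669e1d0d43b5babbfe84e5caa803",
    "N08ee669e1d0d43b5babbfe84e5caa803": RDF_NIL,
}


def walk_rdf_list(head, first, rest):
    """Follow rdf:rest links from head and collect each rdf:first value."""
    items = []
    visited = set()
    node = head
    while node != RDF_NIL:
        if node in visited:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        visited.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# The head node is the object of the schema:author triple.
authors = walk_rdf_list("N74d3388a881946108a07d0c3cc7cf1cb", FIRST, REST)
print(authors)
```

The resulting order (Meher, Sahu, Gahoi, Tomar, Rao by person id) matches the author order given in the article info.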

Subject Predicate Object
1 sg:pub.10.1186/s12863-018-0710-z schema:about anzsrc-for:06
2 anzsrc-for:0604
3 schema:author N74d3388a881946108a07d0c3cc7cf1cb
4 schema:citation sg:pub.10.1007/978-3-642-22709-7_30
5 sg:pub.10.1023/a:1010933404324
6 sg:pub.10.1038/ismej.2012.106
7 sg:pub.10.1186/1471-2105-10-s14-s10
8 sg:pub.10.1186/1471-2105-10-s14-s7
9 sg:pub.10.1186/1471-2105-7-s5-s15
10 sg:pub.10.1186/1756-0381-7-4
11 sg:pub.10.1186/s13040-016-0086-4
12 sg:pub.10.1186/s13104-016-2203-3
13 https://doi.org/10.1002/ece3.546
14 https://doi.org/10.1016/b978-1-55860-377-6.50023-2
15 https://doi.org/10.1016/j.fbr.2011.01.001
16 https://doi.org/10.1016/j.fgb.2008.02.001
17 https://doi.org/10.1016/j.gene.2016.07.010
18 https://doi.org/10.1016/j.inpa.2016.08.002
19 https://doi.org/10.1016/s0022-2836(05)80360-2
20 https://doi.org/10.1073/pnas.0905845106
21 https://doi.org/10.1073/pnas.1117018109
22 https://doi.org/10.1086/282802
23 https://doi.org/10.1089/cmb.1997.4.127
24 https://doi.org/10.1093/bioinformatics/bti547
25 https://doi.org/10.1093/bioinformatics/btq003
26 https://doi.org/10.1093/bioinformatics/btv419
27 https://doi.org/10.1093/bioinformatics/btw346
28 https://doi.org/10.1093/nar/25.17.3389
29 https://doi.org/10.1093/oxfordjournals.molbev.a040454
30 https://doi.org/10.1098/rspb.2002.2218
31 https://doi.org/10.1098/rspb.2010.1089
32 https://doi.org/10.1101/074161
33 https://doi.org/10.1111/1755-0998.12073
34 https://doi.org/10.1111/j.1471-8286.2007.01678.x
35 https://doi.org/10.1111/j.1755-0998.2008.02372.x
36 https://doi.org/10.1111/j.1755-0998.2009.02635.x
37 https://doi.org/10.1111/j.1755-0998.2010.02844.x
38 https://doi.org/10.1111/mec.12481
39 https://doi.org/10.1128/aem.00062-07
40 https://doi.org/10.1128/aem.01541-09
41 https://doi.org/10.1261/rna.5060403
42 https://doi.org/10.1371/journal.pone.0014689
43 https://doi.org/10.1371/journal.pone.0030490
44 https://doi.org/10.1371/journal.pone.0030986
45 https://doi.org/10.1371/journal.pone.0076910
46 https://doi.org/10.1371/journal.pone.0099982
47 https://doi.org/10.3852/14-293
48 schema:datePublished 2019-12
49 schema:datePublishedReg 2019-12-01
50 schema:description BACKGROUND: Identification of unknown fungal species aids to the conservation of fungal diversity. As many fungal species cannot be cultured, morphological identification of those species is almost impossible. But, DNA barcoding technique can be employed for identification of such species. For fungal taxonomy prediction, the ITS (internal transcribed spacer) region of rDNA (ribosomal DNA) is used as barcode. Though the computational prediction of fungal species has become feasible with the availability of huge volume of barcode sequences in public domain, prediction of fungal species is challenging due to high degree of variability among ITS regions within species. RESULTS: A Random Forest (RF)-based predictor was built for identification of unknown fungal species. The reference and query sequences were mapped onto numeric features based on gapped base pair compositions, and then used as training and test sets respectively for prediction of fungal species using RF. More than 85% accuracy was found when 4 sequences per species in the reference set were utilized; whereas it was seen to be stabilized at ~88% if ≥7 sequence per species in the reference set were used for training of the model. The proposed model achieved comparable accuracy, while evaluated against existing methods through cross-validation procedure. The proposed model also outperformed several existing models used for identification of different species other than fungi. CONCLUSIONS: An online prediction server "funbarRF" is established at http://cabgrid.res.in:8080/funbarrf/ for fungal species identification. Besides, an R-package funbarRF ( https://cran.r-project.org/web/packages/funbarRF/ ) is also available for prediction using high throughput sequence data. The effort put in this work will certainly supplement the future endeavors in the direction of fungal taxonomy assignments based on DNA barcode.
51 schema:genre research_article
52 schema:inLanguage en
53 schema:isAccessibleForFree true
54 schema:isPartOf Naa20e8292f124eff9620e5585f9dd2f4
55 Nfe6c9b71affc40669aaa1e962be30207
56 sg:journal.1024251
57 schema:name funbarRF: DNA barcode-based fungal species prediction using multiclass Random Forest supervised learning model
58 schema:pagination 2
59 schema:productId N0138b21a44234839af87f8058e5b8602
60 N3560a57756644e47acebd0c7205b30a6
61 N4a64d9c739dd4654b2963325c17bb878
62 N5e234f4e033a45ba917d3a1611c761a3
63 Ndb2f1093e1c44b01869395741e9236ea
64 schema:sameAs https://app.dimensions.ai/details/publication/pub.1111225119
65 https://doi.org/10.1186/s12863-018-0710-z
66 schema:sdDatePublished 2019-04-11T08:36
67 schema:sdLicense https://scigraph.springernature.com/explorer/license/
68 schema:sdPublisher Na2eab11f23b44034a3889573c378cd2c
69 schema:url https://link.springer.com/10.1186%2Fs12863-018-0710-z
70 sgo:license sg:explorer/license/
71 sgo:sdDataset articles
72 rdf:type schema:ScholarlyArticle
73 N0138b21a44234839af87f8058e5b8602 schema:name doi
74 schema:value 10.1186/s12863-018-0710-z
75 rdf:type schema:PropertyValue
76 N08ee669e1d0d43b5babbfe84e5caa803 rdf:first sg:person.013310041572.77
77 rdf:rest rdf:nil
78 N144c80b16e624091927b489cd84d74bd rdf:first sg:person.014511257536.49
79 rdf:rest N08ee669e1d0d43b5babbfe84e5caa803
80 N152aa422244146858ef1e057f3ae551f rdf:first sg:person.01017164135.43
81 rdf:rest N52c7debdecc741c3a885add359e9cfed
82 N3560a57756644e47acebd0c7205b30a6 schema:name nlm_unique_id
83 schema:value 100966978
84 rdf:type schema:PropertyValue
85 N4a64d9c739dd4654b2963325c17bb878 schema:name pubmed_id
86 schema:value 30616524
87 rdf:type schema:PropertyValue
88 N52c7debdecc741c3a885add359e9cfed rdf:first sg:person.0762652776.78
89 rdf:rest N144c80b16e624091927b489cd84d74bd
90 N5e234f4e033a45ba917d3a1611c761a3 schema:name dimensions_id
91 schema:value pub.1111225119
92 rdf:type schema:PropertyValue
93 N74d3388a881946108a07d0c3cc7cf1cb rdf:first sg:person.01010516501.02
94 rdf:rest N152aa422244146858ef1e057f3ae551f
95 Na2eab11f23b44034a3889573c378cd2c schema:name Springer Nature - SN SciGraph project
96 rdf:type schema:Organization
97 Naa20e8292f124eff9620e5585f9dd2f4 schema:volumeNumber 20
98 rdf:type schema:PublicationVolume
99 Ndb2f1093e1c44b01869395741e9236ea schema:name readcube_id
100 schema:value 68a3fccc145fdd2631edcda04f9032cfea08a83aed4a378ad178092606b97241
101 rdf:type schema:PropertyValue
102 Nfe6c9b71affc40669aaa1e962be30207 schema:issueNumber 1
103 rdf:type schema:PublicationIssue
104 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
105 schema:name Biological Sciences
106 rdf:type schema:DefinedTerm
107 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
108 schema:name Genetics
109 rdf:type schema:DefinedTerm
110 sg:journal.1024251 schema:issn 1471-2156
111 schema:name BMC Genetics
112 rdf:type schema:Periodical
113 sg:person.01010516501.02 schema:affiliation https://www.grid.ac/institutes/grid.463150.5
114 schema:familyName Meher
115 schema:givenName Prabina Kumar
116 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01010516501.02
117 rdf:type schema:Person
118 sg:person.01017164135.43 schema:affiliation https://www.grid.ac/institutes/grid.463150.5
119 schema:familyName Sahu
120 schema:givenName Tanmaya Kumar
121 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01017164135.43
122 rdf:type schema:Person
123 sg:person.013310041572.77 schema:affiliation https://www.grid.ac/institutes/grid.463150.5
124 schema:familyName Rao
125 schema:givenName Atmakuri Ramakrishna
126 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013310041572.77
127 rdf:type schema:Person
128 sg:person.014511257536.49 schema:affiliation https://www.grid.ac/institutes/grid.463150.5
129 schema:familyName Tomar
130 schema:givenName Ruchi
131 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014511257536.49
132 rdf:type schema:Person
133 sg:person.0762652776.78 schema:affiliation https://www.grid.ac/institutes/grid.463150.5
134 schema:familyName Gahoi
135 schema:givenName Shachi
136 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0762652776.78
137 rdf:type schema:Person
138 sg:pub.10.1007/978-3-642-22709-7_30 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002967803
139 https://doi.org/10.1007/978-3-642-22709-7_30
140 rdf:type schema:CreativeWork
141 sg:pub.10.1023/a:1010933404324 schema:sameAs https://app.dimensions.ai/details/publication/pub.1024739340
142 https://doi.org/10.1023/a:1010933404324
143 rdf:type schema:CreativeWork
144 sg:pub.10.1038/ismej.2012.106 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014657236
145 https://doi.org/10.1038/ismej.2012.106
146 rdf:type schema:CreativeWork
147 sg:pub.10.1186/1471-2105-10-s14-s10 schema:sameAs https://app.dimensions.ai/details/publication/pub.1049737748
148 https://doi.org/10.1186/1471-2105-10-s14-s10
149 rdf:type schema:CreativeWork
150 sg:pub.10.1186/1471-2105-10-s14-s7 schema:sameAs https://app.dimensions.ai/details/publication/pub.1014930590
151 https://doi.org/10.1186/1471-2105-10-s14-s7
152 rdf:type schema:CreativeWork
153 sg:pub.10.1186/1471-2105-7-s5-s15 schema:sameAs https://app.dimensions.ai/details/publication/pub.1033964515
154 https://doi.org/10.1186/1471-2105-7-s5-s15
155 rdf:type schema:CreativeWork
156 sg:pub.10.1186/1756-0381-7-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1010349038
157 https://doi.org/10.1186/1756-0381-7-4
158 rdf:type schema:CreativeWork
159 sg:pub.10.1186/s13040-016-0086-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1029380617
160 https://doi.org/10.1186/s13040-016-0086-4
161 rdf:type schema:CreativeWork
162 sg:pub.10.1186/s13104-016-2203-3 schema:sameAs https://app.dimensions.ai/details/publication/pub.1012386029
163 https://doi.org/10.1186/s13104-016-2203-3
164 rdf:type schema:CreativeWork
165 https://doi.org/10.1002/ece3.546 schema:sameAs https://app.dimensions.ai/details/publication/pub.1026745796
166 rdf:type schema:CreativeWork
167 https://doi.org/10.1016/b978-1-55860-377-6.50023-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013049849
168 rdf:type schema:CreativeWork
169 https://doi.org/10.1016/j.fbr.2011.01.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1036010946
170 rdf:type schema:CreativeWork
171 https://doi.org/10.1016/j.fgb.2008.02.001 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016202088
172 rdf:type schema:CreativeWork
173 https://doi.org/10.1016/j.gene.2016.07.010 schema:sameAs https://app.dimensions.ai/details/publication/pub.1048770769
174 rdf:type schema:CreativeWork
175 https://doi.org/10.1016/j.inpa.2016.08.002 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047997438
176 rdf:type schema:CreativeWork
177 https://doi.org/10.1016/s0022-2836(05)80360-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013618994
178 rdf:type schema:CreativeWork
179 https://doi.org/10.1073/pnas.0905845106 schema:sameAs https://app.dimensions.ai/details/publication/pub.1003624990
180 rdf:type schema:CreativeWork
181 https://doi.org/10.1073/pnas.1117018109 schema:sameAs https://app.dimensions.ai/details/publication/pub.1046949865
182 rdf:type schema:CreativeWork
183 https://doi.org/10.1086/282802 schema:sameAs https://app.dimensions.ai/details/publication/pub.1058592925
184 rdf:type schema:CreativeWork
185 https://doi.org/10.1089/cmb.1997.4.127 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059245161
186 rdf:type schema:CreativeWork
187 https://doi.org/10.1093/bioinformatics/bti547 schema:sameAs https://app.dimensions.ai/details/publication/pub.1039451568
188 rdf:type schema:CreativeWork
189 https://doi.org/10.1093/bioinformatics/btq003 schema:sameAs https://app.dimensions.ai/details/publication/pub.1027162145
190 rdf:type schema:CreativeWork
191 https://doi.org/10.1093/bioinformatics/btv419 schema:sameAs https://app.dimensions.ai/details/publication/pub.1032739181
192 rdf:type schema:CreativeWork
193 https://doi.org/10.1093/bioinformatics/btw346 schema:sameAs https://app.dimensions.ai/details/publication/pub.1059414789
194 rdf:type schema:CreativeWork
195 https://doi.org/10.1093/nar/25.17.3389 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047265454
196 rdf:type schema:CreativeWork
197 https://doi.org/10.1093/oxfordjournals.molbev.a040454 schema:sameAs https://app.dimensions.ai/details/publication/pub.1079752303
198 rdf:type schema:CreativeWork
199 https://doi.org/10.1098/rspb.2002.2218 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030057572
200 rdf:type schema:CreativeWork
201 https://doi.org/10.1098/rspb.2010.1089 schema:sameAs https://app.dimensions.ai/details/publication/pub.1051432252
202 rdf:type schema:CreativeWork
203 https://doi.org/10.1101/074161 schema:sameAs https://app.dimensions.ai/details/publication/pub.1085112883
204 rdf:type schema:CreativeWork
205 https://doi.org/10.1111/1755-0998.12073 schema:sameAs https://app.dimensions.ai/details/publication/pub.1022518269
206 rdf:type schema:CreativeWork
207 https://doi.org/10.1111/j.1471-8286.2007.01678.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1031314971
208 rdf:type schema:CreativeWork
209 https://doi.org/10.1111/j.1755-0998.2008.02372.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1027635822
210 rdf:type schema:CreativeWork
211 https://doi.org/10.1111/j.1755-0998.2009.02635.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1015717931
212 rdf:type schema:CreativeWork
213 https://doi.org/10.1111/j.1755-0998.2010.02844.x schema:sameAs https://app.dimensions.ai/details/publication/pub.1001517455
214 rdf:type schema:CreativeWork
215 https://doi.org/10.1111/mec.12481 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043063567
216 rdf:type schema:CreativeWork
217 https://doi.org/10.1128/aem.00062-07 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045980007
218 rdf:type schema:CreativeWork
219 https://doi.org/10.1128/aem.01541-09 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038354310
220 rdf:type schema:CreativeWork
221 https://doi.org/10.1261/rna.5060403 schema:sameAs https://app.dimensions.ai/details/publication/pub.1023788394
222 rdf:type schema:CreativeWork
223 https://doi.org/10.1371/journal.pone.0014689 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001079094
224 rdf:type schema:CreativeWork
225 https://doi.org/10.1371/journal.pone.0030490 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013594611
226 rdf:type schema:CreativeWork
227 https://doi.org/10.1371/journal.pone.0030986 schema:sameAs https://app.dimensions.ai/details/publication/pub.1000783148
228 rdf:type schema:CreativeWork
229 https://doi.org/10.1371/journal.pone.0076910 schema:sameAs https://app.dimensions.ai/details/publication/pub.1001440016
230 rdf:type schema:CreativeWork
231 https://doi.org/10.1371/journal.pone.0099982 schema:sameAs https://app.dimensions.ai/details/publication/pub.1052506737
232 rdf:type schema:CreativeWork
233 https://doi.org/10.3852/14-293 schema:sameAs https://app.dimensions.ai/details/publication/pub.1071480260
234 rdf:type schema:CreativeWork
235 https://www.grid.ac/institutes/grid.463150.5 schema:alternateName Indian Agricultural Statistics Research Institute
236 schema:name Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, 110012, New Delhi, India
237 Department of Bioinformatics, Janta Vedic College, 250611, Baraut, Baghpat, Uttar Pradesh, India
238 Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, 110012, New Delhi, India
239 rdf:type schema:Organization
 



