Ontology type: schema:ScholarlyArticle
Open Access: True
2022-05-03
AUTHORS: Daisy Salifu, Eric Ali Ibrahim, Henri E. Z. Tonnang
ABSTRACT: Analysis of landmark-based morphometric measurements taken on body parts of insects has been a useful taxonomic approach alongside DNA barcoding in insect identification. Statistical analysis of morphometrics has largely been dominated by traditional methods such as principal component analysis (PCA), canonical variate analysis (CVA) and discriminant analysis (DA). However, advances in computing power have created a paradigm shift toward modern tools such as machine learning. Herein, we assess the predictive performance of four machine learning classifiers: K-nearest neighbor (KNN), random forest (RF), support vector machine (SVM; linear, polynomial and radial kernels) and artificial neural network (ANN), on fruit fly morphometrics that were previously analysed using PCA and CVA. KNN and RF performed poorly, with overall model accuracy lower than the “no-information rate” (NIR) (p value > 0.1). The SVM models had a predictive accuracy of > 95%, significantly higher than the NIR (p < 0.001), Kappa > 0.78 and an area under the curve (AUC) of the receiver operating characteristic > 0.91, while the ANN model had a predictive accuracy of 96%, significantly higher than the NIR, a Kappa of 0.83 and an AUC of 0.98. Wing veins 2, 3, 8, 10, 14 and tibia length were of higher importance than other variables based on both SVM and ANN models. We conclude that SVM and ANN models could be used to discriminate fruit fly species, or any other morphologically similar pest taxa, based on wing vein and tibia length measurements. These algorithms are candidates for developing integrated, smart application software for insect discrimination and identification. The variable importance results of this study would be useful in future studies for deciding what must be measured.
PAGES: 7208
http://scigraph.springernature.com/pub.10.1038/s41598-022-11258-w
DOI: http://dx.doi.org/10.1038/s41598-022-11258-w
DIMENSIONS: https://app.dimensions.ai/details/publication/pub.1147551568
PUBMED: https://www.ncbi.nlm.nih.gov/pubmed/35505067
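The abstract compares overall accuracy against the no-information rate (NIR) and reports Cohen's Kappa. As a rough illustration of how these three quantities relate, the sketch below derives them from a confusion matrix. The matrix values are hypothetical and are not taken from the study; the function name `classification_summary` is likewise an assumption made for this example. NIR is taken here as the proportion of the largest true class, which is the accuracy obtained by always predicting that class.

```python
from typing import Dict, Sequence


def classification_summary(confusion: Sequence[Sequence[int]]) -> Dict[str, float]:
    """Compute accuracy, no-information rate and Cohen's kappa.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    # Row totals are the true class counts, column totals the predicted counts.
    true_totals = [sum(row) for row in confusion]
    pred_totals = [sum(confusion[i][j] for i in range(n_classes))
                   for j in range(n_classes)]

    accuracy = correct / total
    # Accuracy of a classifier that always predicts the largest true class.
    nir = max(true_totals) / total
    # Agreement expected by chance, given the marginal totals.
    expected = sum(t * p for t, p in zip(true_totals, pred_totals)) / total ** 2
    # Undefined (division by zero) when chance agreement is exactly 1.
    kappa = (accuracy - expected) / (1 - expected)
    return {"accuracy": accuracy, "nir": nir, "kappa": kappa}


# Hypothetical two-class confusion matrix, for illustration only.
example = classification_summary([[40, 10], [5, 45]])
print(example)
```

A model is only informative when its accuracy clearly exceeds the NIR; Kappa expresses the same idea by rescaling accuracy relative to chance agreement.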
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/01",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Mathematical Sciences",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information and Computing Sciences",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0104",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Statistics",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Artificial Intelligence and Image Processing",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Algorithms",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Machine Learning",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Neural Networks, Computer",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "ROC Curve",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Support Vector Machine",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "International Centre of Insect Physiology and Ecology (icipe), P.O. Box 30772-00100, Nairobi, Kenya",
"id": "http://www.grid.ac/institutes/grid.419326.b",
"name": [
"International Centre of Insect Physiology and Ecology (icipe), P.O. Box 30772-00100, Nairobi, Kenya"
],
"type": "Organization"
},
"familyName": "Salifu",
"givenName": "Daisy",
"id": "sg:person.0615611233.95",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0615611233.95"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Department of Statistics, Jomo Kenyatta University of Agriculture and Technology, P.O. Box 62000-00200, Nairobi, Kenya",
"id": "http://www.grid.ac/institutes/grid.411943.a",
"name": [
"International Centre of Insect Physiology and Ecology (icipe), P.O. Box 30772-00100, Nairobi, Kenya",
"Department of Statistics, Jomo Kenyatta University of Agriculture and Technology, P.O. Box 62000-00200, Nairobi, Kenya"
],
"type": "Organization"
},
"familyName": "Ibrahim",
"givenName": "Eric Ali",
"id": "sg:person.012520243612.57",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012520243612.57"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "International Centre of Insect Physiology and Ecology (icipe), P.O. Box 30772-00100, Nairobi, Kenya",
"id": "http://www.grid.ac/institutes/grid.419326.b",
"name": [
"International Centre of Insect Physiology and Ecology (icipe), P.O. Box 30772-00100, Nairobi, Kenya"
],
"type": "Organization"
},
"familyName": "Tonnang",
"givenName": "Henri E. Z.",
"id": "sg:person.01121617221.22",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01121617221.22"
],
"type": "Person"
}
],
"citation": [
{
"id": "sg:pub.10.1186/1756-3305-5-2",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1010349152",
"https://doi.org/10.1186/1756-3305-5-2"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/1756-3305-5-257",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1002203607",
"https://doi.org/10.1186/1756-3305-5-257"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/978-3-319-24277-4",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1028525626",
"https://doi.org/10.1007/978-3-319-24277-4"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/nbt1206-1565",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1051026888",
"https://doi.org/10.1038/nbt1206-1565"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/s13071-017-2163-z",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1085376761",
"https://doi.org/10.1186/s13071-017-2163-z"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/978-1-60327-101-1_2",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1041914437",
"https://doi.org/10.1007/978-1-60327-101-1_2"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/978-0-387-21706-2",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1035613449",
"https://doi.org/10.1007/978-0-387-21706-2"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/s10115-019-01335-4",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1111840881",
"https://doi.org/10.1007/s10115-019-01335-4"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/s42452-019-0295-9",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1112468091",
"https://doi.org/10.1007/s42452-019-0295-9"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1186/s13071-016-1943-1",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1016840595",
"https://doi.org/10.1186/s13071-016-1943-1"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1023/a:1010933404324",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1024739340",
"https://doi.org/10.1023/a:1010933404324"
],
"type": "CreativeWork"
}
],
"datePublished": "2022-05-03",
"datePublishedReg": "2022-05-03",
"description": "Analysis of landmark-based morphometric measurements taken on body parts of insects have been a useful taxonomic approach alongside DNA barcoding in insect identification. Statistical analysis of morphometrics have largely been dominated by traditional methods and approaches such as principal component analysis (PCA), canonical variate analysis (CVA) and discriminant analysis (DA). However, advancement in computing power creates a paradigm shift to apply modern tools such as machine learning. Herein, we assess the predictive performance of four machine learning classifiers; K-nearest neighbor (KNN), random forest (RF), support vector machine (the linear, polynomial and radial kernel SVMs) and artificial neural network (ANNs) on fruit fly morphometrics that were previously analysed using PCA and CVA. KNN and RF performed poorly with overall model accuracy lower than \u201cno-information rate\u201d (NIR) (p value\u2009>\u20090.1). The SVM models had a predictive accuracy of\u2009>\u200995%, significantly higher than NIR (p\u2009<\u20090.001), Kappa\u2009>\u20090.78 and area under curve (AUC) of the receiver operating characteristics was\u2009>\u20090.91; while ANN model had a predictive accuracy of 96%, significantly higher than NIR, Kappa of 0.83 and AUC was 0.98. Wing veins 2, 3, 8, 10, 14 and tibia length were of higher importance than other variables based on both SVM and ANN models. We conclude that SVM and ANN models could be used to discriminate fruit fly species based on wing vein and tibia length measurements or any other morphologically similar pest taxa. These algorithms could be used as candidates for developing an integrated and smart application software for insect discrimination and identification. Variable importance analysis results in this study would be useful for future studies for deciding what must be measured.",
"genre": "article",
"id": "sg:pub.10.1038/s41598-022-11258-w",
"inLanguage": "en",
"isAccessibleForFree": true,
"isPartOf": [
{
"id": "sg:journal.1045337",
"issn": [
"2045-2322"
],
"name": "Scientific Reports",
"publisher": "Springer Nature",
"type": "Periodical"
},
{
"issueNumber": "1",
"type": "PublicationIssue"
},
{
"type": "PublicationVolume",
"volumeNumber": "12"
}
],
"keywords": [
"artificial neural network",
"random forest",
"ANN model",
"application software",
"machine learning",
"neural network",
"vector machine",
"SVM model",
"nearest neighbors",
"predictive accuracy",
"overall model accuracy",
"machine",
"principal component analysis",
"importance analysis results",
"SVM",
"model accuracy",
"information rate",
"algorithm",
"discriminant analysis",
"traditional methods",
"modern tools",
"predictive performance",
"insect identification",
"accuracy",
"classifier",
"paradigm shift",
"KNN",
"body parts",
"software",
"tool",
"network",
"canonical variate analysis",
"component analysis",
"learning",
"neighbors",
"high importance",
"tibia length measurements",
"model",
"analysis results",
"advancement",
"performance",
"statistical analysis",
"identification",
"receiver",
"variate analysis",
"method",
"approach",
"analysis",
"power",
"AUC",
"measurements",
"variables",
"forest",
"part",
"results",
"area",
"curves",
"characteristics",
"kappa",
"candidates",
"importance",
"length",
"length measurements",
"NIR",
"discrimination",
"study",
"rate",
"shift",
"morphometrics",
"morphometric measurements",
"future studies",
"taxonomic approach",
"fruit",
"insects",
"tibia length",
"Herein",
"barcoding",
"vein",
"species",
"DNA barcoding",
"wing veins",
"taxa",
"pest taxa",
"vein 2"
],
"name": "Leveraging machine learning tools and algorithms for analysis of fruit fly morphometrics",
"pagination": "7208",
"productId": [
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1147551568"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1038/s41598-022-11258-w"
]
},
{
"name": "pubmed_id",
"type": "PropertyValue",
"value": [
"35505067"
]
}
],
"sameAs": [
"https://doi.org/10.1038/s41598-022-11258-w",
"https://app.dimensions.ai/details/publication/pub.1147551568"
],
"sdDataset": "articles",
"sdDatePublished": "2022-06-01T22:23",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-springernature-scigraph/baseset/20220601/entities/gbq_results/article/article_930.jsonl",
"type": "ScholarlyArticle",
"url": "https://doi.org/10.1038/s41598-022-11258-w"
}
]
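Because the record above is plain JSON, it can be read with any JSON parser. The following is a minimal sketch, assuming the record has the shape shown above (a list containing one object); `SAMPLE` is an abbreviated copy of that record and `summarize` is a hypothetical helper name, not part of any SciGraph tooling.

```python
import json

# Abbreviated copy of the record above; fields not used here are omitted.
SAMPLE = """
[
  {
    "name": "Leveraging machine learning tools and algorithms for analysis of fruit fly morphometrics",
    "author": [
      {"familyName": "Salifu", "givenName": "Daisy", "type": "Person"},
      {"familyName": "Ibrahim", "givenName": "Eric Ali", "type": "Person"},
      {"familyName": "Tonnang", "givenName": "Henri E. Z.", "type": "Person"}
    ],
    "isPartOf": [
      {"id": "sg:journal.1045337", "name": "Scientific Reports", "type": "Periodical"},
      {"issueNumber": "1", "type": "PublicationIssue"},
      {"type": "PublicationVolume", "volumeNumber": "12"}
    ],
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.1038/s41598-022-11258-w"]}
    ]
  }
]
"""


def summarize(record: dict) -> dict:
    """Pull a few bibliographic fields out of one SciGraph article record."""
    authors = [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]
    # productId is a list of {name, value[]} pairs; index it by name.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    # isPartOf mixes journal, issue and volume entries; index it by type.
    parts = {p["type"]: p for p in record.get("isPartOf", [])}
    return {
        "title": record.get("name"),
        "authors": authors,
        "doi": ids.get("doi"),
        "journal": parts.get("Periodical", {}).get("name"),
        "volume": parts.get("PublicationVolume", {}).get("volumeNumber"),
        "issue": parts.get("PublicationIssue", {}).get("issueNumber"),
    }


summary = summarize(json.loads(SAMPLE)[0])
print(summary)
```

This treats the record as ordinary JSON and ignores the `@context`; a full JSON-LD processor would be needed to expand the short keys into complete IRIs.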
The RDF metadata can be downloaded as JSON-LD, N-Triples, Turtle or RDF/XML by setting the Accept header:
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/s41598-022-11258-w'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/s41598-022-11258-w'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/s41598-022-11258-w'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/s41598-022-11258-w'
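The four curl requests above differ only in the Accept header. The same content negotiation can be sketched with Python's standard library. The helper names `build_request` and `fetch` and the short format keys are assumptions made for this example, and whether the endpoint still answers is not guaranteed; the response is also assumed to be UTF-8 text.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1038/s41598-022-11258-w"

# Media types copied from the curl commands above.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> urllib.request.Request:
    """Create a GET request asking for the given RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt: str, url: str = RECORD_URL, timeout: float = 30.0) -> str:
    """Send the request and return the body as text (assumes UTF-8)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Building a request does not touch the network; only fetch() does.
request = build_request("turtle")
print(request.full_url, request.get_header("Accept"))
```

Calling `fetch("nt")` would then correspond to the N-Triples curl command.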
This table displays all metadata directly associated with this object as RDF triples.
235 TRIPLES
22 PREDICATES
128 URIs
107 LITERALS
12 BLANK NODES
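Counts like these can be approximated from the N-Triples download, since that format puts one triple on each line. The sketch below is a simplified line parser, not a complete N-Triples implementation. The sample triples and their predicate IRIs are hypothetical, and counting URIs and blank nodes as distinct values while counting every literal occurrence is an assumption; the page does not say how its figures are derived.

```python
import re

# One RDF term: <IRI>, _:blankNode, or "literal" with optional datatype or language tag.
TERM = r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
LINE = re.compile(rf'^\s*{TERM}\s+{TERM}\s+{TERM}\s*\.\s*$')


def count_terms(ntriples: str) -> dict:
    """Count triples, predicates, URIs, literals and blank nodes in N-Triples text."""
    triples = 0
    literals = 0
    predicates, uris, blanks = set(), set(), set()
    for line in ntriples.splitlines():
        # Skip empty lines and comments.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"unrecognized N-Triples line: {line!r}")
        triples += 1
        subject, predicate, obj = match.groups()
        predicates.add(predicate)
        for term in (subject, predicate, obj):
            if term.startswith("<"):
                uris.add(term)
            elif term.startswith("_:"):
                blanks.add(term)
            else:
                literals += 1
    return {
        "triples": triples,
        "predicates": len(predicates),
        "uris": len(uris),
        "literals": literals,
        "blank_nodes": len(blanks),
    }


# Hypothetical sample lines, for illustration only.
SAMPLE_NT = """
<http://scigraph.springernature.com/pub.10.1038/s41598-022-11258-w> <http://schema.org/name> "Leveraging machine learning tools and algorithms for analysis of fruit fly morphometrics" .
<http://scigraph.springernature.com/pub.10.1038/s41598-022-11258-w> <http://schema.org/pagination> "7208" .
<http://scigraph.springernature.com/pub.10.1038/s41598-022-11258-w> <http://schema.org/isPartOf> _:b0 .
_:b0 <http://schema.org/volumeNumber> "12" .
"""

counts = count_terms(SAMPLE_NT)
print(counts)
```

For real data, a dedicated RDF library is the safer choice; the regular expression here only covers the common term shapes.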