Ontology type: schema:ScholarlyArticle
Open Access: True
2022-05-03
AUTHORS: Min Oh, Liqing Zhang
ABSTRACT: Predictive models trained on sequencing profiles often fail to achieve expected performance when externally validated on unseen profiles. While many factors such as batch effects, small data sets, and technical errors contribute to the gap between source and unseen data distributions, it is a challenging problem to generalize the predictive models across studies without any prior knowledge of the unseen data distribution. Here, this study proposes DeepBioGen, a sequencing profile augmentation procedure that characterizes visual patterns of sequencing profiles, generates realistic profiles based on a deep generative model capturing the patterns, and generalizes the subsequent classifiers. DeepBioGen outperforms other methods in terms of enhancing the generalizability of the prediction models on unseen data. The generalized classifiers surpass the state-of-the-art method, evaluated on RNA sequencing tumor expression profiles for anti-PD1 therapy response prediction and WGS human gut microbiome profiles for type 2 diabetes diagnosis.
PAGES: 7151
http://scigraph.springernature.com/pub.10.1038/s41598-022-11363-w
DOI: http://dx.doi.org/10.1038/s41598-022-11363-w
DIMENSIONS: https://app.dimensions.ai/details/publication/pub.1147549871
PUBMED: https://www.ncbi.nlm.nih.gov/pubmed/35504956
JSON-LD is the canonical representation for SciGraph data.
[
{
"@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json",
"about": [
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Information and Computing Sciences",
"type": "DefinedTerm"
},
{
"id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801",
"inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/",
"name": "Artificial Intelligence and Image Processing",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Diabetes Mellitus, Type 2",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Humans",
"type": "DefinedTerm"
},
{
"inDefinedTermSet": "https://www.nlm.nih.gov/mesh/",
"name": "Sequence Analysis, RNA",
"type": "DefinedTerm"
}
],
"author": [
{
"affiliation": {
"alternateName": "Department of Computer Science, Virginia Tech, Blacksburg, VA, USA",
"id": "http://www.grid.ac/institutes/grid.438526.e",
"name": [
"Department of Computer Science, Virginia Tech, Blacksburg, VA, USA"
],
"type": "Organization"
},
"familyName": "Oh",
"givenName": "Min",
"id": "sg:person.014266323475.09",
"sameAs": [
"https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014266323475.09"
],
"type": "Person"
},
{
"affiliation": {
"alternateName": "Department of Computer Science, Virginia Tech, Blacksburg, VA, USA",
"id": "http://www.grid.ac/institutes/grid.438526.e",
"name": [
"Department of Computer Science, Virginia Tech, Blacksburg, VA, USA"
],
"type": "Organization"
},
"familyName": "Zhang",
"givenName": "Liqing",
"type": "Person"
}
],
"citation": [
{
"id": "sg:pub.10.1007/978-3-642-15561-1_16",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1047848527",
"https://doi.org/10.1007/978-3-642-15561-1_16"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/nrg2825",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1037809833",
"https://doi.org/10.1038/nrg2825"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/978-3-030-01424-7_58",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1107352273",
"https://doi.org/10.1007/978-3-030-01424-7_58"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/nmeth.3589",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1028162909",
"https://doi.org/10.1038/nmeth.3589"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/s41591-018-0157-9",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1106133485",
"https://doi.org/10.1038/s41591-018-0157-9"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/505612a",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1044453509",
"https://doi.org/10.1038/505612a"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/533452a",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1010295273",
"https://doi.org/10.1038/533452a"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/nature12198",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1002791386",
"https://doi.org/10.1038/nature12198"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/s41598-019-52737-x",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1122598037",
"https://doi.org/10.1038/s41598-019-52737-x"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/978-3-319-68612-7_71",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1092344930",
"https://doi.org/10.1007/978-3-319-68612-7_71"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/s41467-019-13993-7",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1123915232",
"https://doi.org/10.1038/s41467-019-13993-7"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/nature11450",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1004546178",
"https://doi.org/10.1038/nature11450"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/nature06758",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1022504514",
"https://doi.org/10.1038/nature06758"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/s41591-018-0136-1",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1106133483",
"https://doi.org/10.1038/s41591-018-0136-1"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/s41467-020-19957-6",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1133284935",
"https://doi.org/10.1038/s41467-020-19957-6"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/s41598-018-37186-2",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1111102200",
"https://doi.org/10.1038/s41598-018-37186-2"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/s10994-006-6226-1",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1007730804",
"https://doi.org/10.1007/s10994-006-6226-1"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/s41598-019-56847-4",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1123950493",
"https://doi.org/10.1038/s41598-019-56847-4"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/s41467-019-14018-z",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1123953545",
"https://doi.org/10.1038/s41467-019-14018-z"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1038/s41598-019-47765-6",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1120146669",
"https://doi.org/10.1038/s41598-019-47765-6"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/978-3-030-01267-0_38",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1107463421",
"https://doi.org/10.1007/978-3-030-01267-0_38"
],
"type": "CreativeWork"
},
{
"id": "sg:pub.10.1007/bf02289263",
"sameAs": [
"https://app.dimensions.ai/details/publication/pub.1052823670",
"https://doi.org/10.1007/bf02289263"
],
"type": "CreativeWork"
}
],
"datePublished": "2022-05-03",
"datePublishedReg": "2022-05-03",
"description": "Predictive models trained on sequencing profiles often fail to achieve expected performance when externally validated on unseen profiles. While many factors such as batch effects, small data sets, and technical errors contribute to the gap between source and unseen data distributions, it is a challenging problem to generalize the predictive models across studies without any prior knowledge of the unseen data distribution. Here, this study proposes DeepBioGen, a sequencing profile augmentation procedure that characterizes visual patterns of sequencing profiles, generates realistic profiles based on a deep generative model capturing the patterns, and generalizes the subsequent classifiers. DeepBioGen outperforms other methods in terms of enhancing the generalizability of the prediction models on unseen data. The generalized classifiers surpass the state-of-the-art method, evaluated on RNA sequencing tumor expression profiles for anti-PD1 therapy response prediction and WGS human gut microbiome profiles for type 2 diabetes diagnosis.",
"genre": "article",
"id": "sg:pub.10.1038/s41598-022-11363-w",
"inLanguage": "en",
"isAccessibleForFree": true,
"isPartOf": [
{
"id": "sg:journal.1045337",
"issn": [
"2045-2322"
],
"name": "Scientific Reports",
"publisher": "Springer Nature",
"type": "Periodical"
},
{
"issueNumber": "1",
"type": "PublicationIssue"
},
{
"type": "PublicationVolume",
"volumeNumber": "12"
}
],
"keywords": [
"type 2 diabetes diagnosis",
"sequencing profiles",
"therapy response prediction",
"diabetes diagnosis",
"augmentation procedures",
"tumor expression profiles",
"human gut",
"technical errors",
"expression profiles",
"predictive model",
"diagnosis",
"profile",
"study",
"response prediction",
"gut",
"visual patterns",
"factors",
"patterns",
"generalizability",
"effect",
"procedure",
"prediction model",
"data",
"model",
"knowledge",
"batch effects",
"method",
"distribution",
"source",
"terms",
"state",
"prediction",
"gap",
"data sets",
"challenging problem",
"problem",
"error",
"classifier",
"performance",
"unseen data",
"generalized classifier",
"set",
"prior knowledge",
"small data sets",
"data distribution",
"deep generative models",
"generative model",
"subsequent classifier",
"art methods",
"realistic profiles"
],
"name": "Generalizing predictions to unseen sequencing profiles via deep generative models",
"pagination": "7151",
"productId": [
{
"name": "dimensions_id",
"type": "PropertyValue",
"value": [
"pub.1147549871"
]
},
{
"name": "doi",
"type": "PropertyValue",
"value": [
"10.1038/s41598-022-11363-w"
]
},
{
"name": "pubmed_id",
"type": "PropertyValue",
"value": [
"35504956"
]
}
],
"sameAs": [
"https://doi.org/10.1038/s41598-022-11363-w",
"https://app.dimensions.ai/details/publication/pub.1147549871"
],
"sdDataset": "articles",
"sdDatePublished": "2022-06-01T22:23",
"sdLicense": "https://scigraph.springernature.com/explorer/license/",
"sdPublisher": {
"name": "Springer Nature - SN SciGraph project",
"type": "Organization"
},
"sdSource": "s3://com-springernature-scigraph/baseset/20220601/entities/gbq_results/article/article_931.jsonl",
"type": "ScholarlyArticle",
"url": "https://doi.org/10.1038/s41598-022-11363-w"
}
]
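As a rough illustration of how a record like the one above might be consumed, the following Python sketch pulls the title, author names, publication date, and identifiers out of the JSON-LD. The embedded string is a trimmed copy of the record (only the fields used are kept), and `summarize` is a hypothetical helper name, not part of any SciGraph tooling.

```python
import json

# Trimmed copy of the record above; only the fields read below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Oh", "givenName": "Min", "type": "Person"},
      {"familyName": "Zhang", "givenName": "Liqing", "type": "Person"}
    ],
    "datePublished": "2022-05-03",
    "name": "Generalizing predictions to unseen sequencing profiles via deep generative models",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1147549871"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1038/s41598-022-11363-w"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["35504956"]}
    ],
    "type": "ScholarlyArticle"
  }
]
"""


def summarize(record):
    """Hypothetical helper: collect the headline fields of one record."""
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])
    ]
    # Each productId entry holds a list of values; keep the first one.
    ids = {
        p["name"]: p["value"][0]
        for p in record.get("productId", [])
        if p.get("value")
    }
    return {
        "title": record.get("name"),
        "authors": authors,
        "date": record.get("datePublished"),
        "ids": ids,
    }


# The top level of the document is a JSON array holding a single record.
records = json.loads(RECORD_JSON)
summary = summarize(records[0])
print(summary["title"])
print(", ".join(summary["authors"]))
print(summary["ids"]["doi"])
```

Because JSON-LD is plain JSON, the standard `json` module is enough here; a dedicated RDF library would only be needed to resolve the `@context` and work with the data as triples.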
JSON-LD is a popular format for linked data which is fully compatible with JSON.
curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1038/s41598-022-11363-w'
N-Triples is a line-based linked data format ideal for batch operations.
curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1038/s41598-022-11363-w'
Turtle is a human-readable linked data format.
curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1038/s41598-022-11363-w'
RDF/XML is a standard XML format for linked data.
curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1038/s41598-022-11363-w'
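The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is a minimal sketch, not an official client: `build_request` and `fetch` are hypothetical helper names, the short format keys are chosen here for illustration, and whether the endpoint still answers is an assumption.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1038/s41598-022-11363-w"

# Media types taken from the curl commands above; the keys are illustrative.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build (but do not send) a request asking for the given RDF format."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text.

    Assumes the server is reachable and returns UTF-8 text.
    """
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network, so it is safe to inspect.
req = build_request("turtle")
print(req.get_header("Accept"))
```

Calling `fetch("nt")` would then be the counterpart of the N-Triples curl command; it is left uncalled here because it needs network access.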
This table displays all metadata directly associated with this object as RDF triples.
217 TRIPLES
22 PREDICATES
101 URIs
71 LITERALS
10 BLANK NODES