Understanding how and why the Gene Ontology and its annotations evolve: the GO within UniProt


Ontology type: schema:ScholarlyArticle      Open Access: True


Article Info

DATE

2014-03-18

AUTHORS

Rachael P Huntley, Tony Sawford, Maria J Martin, Claire O’Donovan

ABSTRACT

The Gene Ontology Consortium (GOC) is a major bioinformatics project that provides structured controlled vocabularies to classify gene product function and location. GOC members create annotations to gene products using the Gene Ontology (GO) vocabularies, thus providing an extensive, publicly available resource. The GO and its annotations to gene products are now an integral part of functional analysis, and statistical tests using GO data are becoming routine for researchers to include when publishing functional information. While many helpful articles about the GOC are available, there are certain updates to the ontology and annotation sets that sometimes go unobserved. Here we describe some of the ways in which GO can change that should be carefully considered by all users of GO as they may have a significant impact on the resulting gene product annotations, and therefore the functional description of the gene product, or the interpretation of analyses performed on GO datasets. GO annotations for gene products change for many reasons, and while these changes generally improve the accuracy of the representation of the underlying biology, they do not necessarily imply that previous annotations were incorrect. We additionally describe the quality assurance mechanisms we employ to improve the accuracy of annotations, which necessarily changes the composition of the annotation sets we provide. We use the Universal Protein Resource (UniProt) for illustrative purposes of how the GO Consortium, as a whole, manages these changes.

PAGES

4

Identifiers

URI

http://scigraph.springernature.com/pub.10.1186/2047-217x-3-4

DOI

http://dx.doi.org/10.1186/2047-217x-3-4

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1045988176

PUBMED

https://www.ncbi.nlm.nih.gov/pubmed/24641996



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/06", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Biological Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0604", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Genetics", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK", 
          "id": "http://www.grid.ac/institutes/grid.225360.0", 
          "name": [
            "European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Huntley", 
        "givenName": "Rachael P", 
        "id": "sg:person.01120775354.41", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01120775354.41"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK", 
          "id": "http://www.grid.ac/institutes/grid.225360.0", 
          "name": [
            "European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Sawford", 
        "givenName": "Tony", 
        "id": "sg:person.01277647757.79", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01277647757.79"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK", 
          "id": "http://www.grid.ac/institutes/grid.225360.0", 
          "name": [
            "European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Martin", 
        "givenName": "Maria J", 
        "id": "sg:person.01007251502.34", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01007251502.34"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK", 
          "id": "http://www.grid.ac/institutes/grid.225360.0", 
          "name": [
            "European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "O\u2019Donovan", 
        "givenName": "Claire", 
        "id": "sg:person.01017767203.44", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01017767203.44"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "sg:pub.10.1038/nrg2363", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1018486403", 
          "https://doi.org/10.1038/nrg2363"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/75556", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1044135237", 
          "https://doi.org/10.1038/75556"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1186/1471-2105-11-530", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1037563685", 
          "https://doi.org/10.1186/1471-2105-11-530"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "sg:pub.10.1038/nbt1210-1248", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1002756332", 
          "https://doi.org/10.1038/nbt1210-1248"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2014-03-18", 
    "datePublishedReg": "2014-03-18", 
    "description": "The Gene Ontology Consortium (GOC) is a major bioinformatics project that provides structured controlled vocabularies to classify gene product function and location. GOC members create annotations to gene products using the Gene Ontology (GO) vocabularies, thus providing an extensive, publicly available resource. The GO and its annotations to gene products are now an integral part of functional analysis, and statistical tests using GO data are becoming routine for researchers to include when publishing functional information. While many helpful articles about the GOC are available, there are certain updates to the ontology and annotation sets that sometimes go unobserved. Here we describe some of the ways in which GO can change that should be carefully considered by all users of GO as they may have a significant impact on the resulting gene product annotations, and therefore the functional description of the gene product, or the interpretation of analyses performed on GO datasets. GO annotations for gene products change for many reasons, and while these changes generally improve the accuracy of the representation of the underlying biology, they do not necessarily imply that previous annotations were incorrect. We additionally describe the quality assurance mechanisms we employ to improve the accuracy of annotations, which necessarily changes the composition of the annotation sets we provide. We use the Universal Protein Resource (UniProt) for illustrative purposes of how the GO Consortium, as a whole, manages these changes.", 
    "genre": "article", 
    "id": "sg:pub.10.1186/2047-217x-3-4", 
    "isAccessibleForFree": true, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.5141094", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.5141042", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2697607", 
        "type": "MonetaryGrant"
      }, 
      {
        "id": "sg:grant.2697599", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": [
      {
        "id": "sg:journal.1047731", 
        "issn": [
          "2047-217X"
        ], 
        "name": "GigaScience", 
        "publisher": "Oxford University Press (OUP)", 
        "type": "Periodical"
      }, 
      {
        "issueNumber": "1", 
        "type": "PublicationIssue"
      }, 
      {
        "type": "PublicationVolume", 
        "volumeNumber": "3"
      }
    ], 
    "keywords": [
      "Gene Ontology Consortium", 
      "accuracy of annotation", 
      "Universal Protein Resource", 
      "bioinformatics projects", 
      "ontology vocabulary", 
      "annotation sets", 
      "GO data", 
      "annotation", 
      "product annotations", 
      "GO Consortium", 
      "GO datasets", 
      "available resources", 
      "ontology", 
      "functional description", 
      "previous annotations", 
      "helpful articles", 
      "gene product annotations", 
      "Gene Ontology vocabulary", 
      "assurance mechanisms", 
      "product functions", 
      "accuracy", 
      "GO annotations", 
      "users", 
      "product changes", 
      "vocabulary", 
      "resources", 
      "dataset", 
      "quality assurance mechanisms", 
      "integral part", 
      "gene products", 
      "functional information", 
      "UniProt", 
      "representation", 
      "information", 
      "update", 
      "set", 
      "interpretation of analyses", 
      "gene product functions", 
      "routines", 
      "certain updates", 
      "illustrative purposes", 
      "project", 
      "researchers", 
      "protein resources", 
      "consortium", 
      "Gene Ontology", 
      "functional analysis", 
      "GO", 
      "significant impact", 
      "statistical tests", 
      "way", 
      "data", 
      "description", 
      "location", 
      "analysis", 
      "purpose", 
      "part", 
      "products", 
      "article", 
      "biology", 
      "reasons", 
      "whole", 
      "function", 
      "interpretation", 
      "members", 
      "mechanism", 
      "impact", 
      "changes", 
      "composition", 
      "test"
    ], 
    "name": "Understanding how and why the Gene Ontology and its annotations evolve: the GO within UniProt", 
    "pagination": "4", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1045988176"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1186/2047-217x-3-4"
        ]
      }, 
      {
        "name": "pubmed_id", 
        "type": "PropertyValue", 
        "value": [
          "24641996"
        ]
      }
    ], 
    "sameAs": [
      "https://doi.org/10.1186/2047-217x-3-4", 
      "https://app.dimensions.ai/details/publication/pub.1045988176"
    ], 
    "sdDataset": "articles", 
    "sdDatePublished": "2022-10-01T06:39", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20221001/entities/gbq_results/article/article_639.jsonl", 
    "type": "ScholarlyArticle", 
    "url": "https://doi.org/10.1186/2047-217x-3-4"
  }
]
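
As an illustration of how the record above can be consumed, here is a minimal Python sketch that reads the JSON-LD as ordinary JSON and pulls out the title, publication date, author names, and identifiers. The embedded text is a trimmed copy of the record (only the fields used are kept), and the function name `summarize` is a hypothetical helper, not part of any SciGraph tooling.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD_JSON = r"""
[
  {
    "author": [
      {"familyName": "Huntley", "givenName": "Rachael P", "type": "Person"},
      {"familyName": "Sawford", "givenName": "Tony", "type": "Person"},
      {"familyName": "Martin", "givenName": "Maria J", "type": "Person"},
      {"familyName": "O\u2019Donovan", "givenName": "Claire", "type": "Person"}
    ],
    "datePublished": "2014-03-18",
    "name": "Understanding how and why the Gene Ontology and its annotations evolve: the GO within UniProt",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1045988176"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1186/2047-217x-3-4"]},
      {"name": "pubmed_id", "type": "PropertyValue", "value": ["24641996"]}
    ]
  }
]
"""


def summarize(record_text):
    """Return title, date, author names, and identifiers from a SciGraph record."""
    # The record is a JSON array holding a single article object.
    article = json.loads(record_text)[0]
    authors = [f"{a['givenName']} {a['familyName']}" for a in article["author"]]
    # Each productId entry carries a name and a one-element list of values.
    ids = {p["name"]: p["value"][0] for p in article["productId"]}
    return {
        "title": article["name"],
        "date": article["datePublished"],
        "authors": authors,
        "ids": ids,
    }


summary = summarize(RECORD_JSON)
print(summary["authors"])
print(summary["ids"]["doi"])
```

Treating the record as plain JSON is enough for simple field lookups like this; a full JSON-LD processor would only be needed to expand the `@context` into complete URIs.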
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1186/2047-217x-3-4'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1186/2047-217x-3-4'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1186/2047-217x-3-4'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1186/2047-217x-3-4'
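
The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. The format labels (`json-ld`, `nt`, `turtle`, `xml`) and the helper `build_request` are hypothetical names chosen for this sketch; the URL and MIME types are the ones shown in the commands above. The code only builds the request object and does not contact the server.

```python
from urllib.request import Request

BASE_URL = "https://scigraph.springernature.com/"

# Accept header values taken from the curl commands above.
# The dictionary keys are arbitrary labels chosen for this sketch.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt):
    """Build a GET request asking for one RDF serialization of a record."""
    if fmt not in ACCEPT_TYPES:
        raise ValueError(f"unknown format: {fmt!r}")
    return Request(BASE_URL + record_id, headers={"Accept": ACCEPT_TYPES[fmt]})


req = build_request("pub.10.1186/2047-217x-3-4", "turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would send the request; whether the service still answers at this address is not something this record confirms.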


 

This table displays all metadata directly associated with this object as RDF triples.

175 TRIPLES      21 PREDICATES      98 URIs      86 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1186/2047-217x-3-4 schema:about anzsrc-for:06
2 anzsrc-for:0604
3 schema:author N1bb023bd29dc4fda9795a8df8eba7e12
4 schema:citation sg:pub.10.1038/75556
5 sg:pub.10.1038/nbt1210-1248
6 sg:pub.10.1038/nrg2363
7 sg:pub.10.1186/1471-2105-11-530
8 schema:datePublished 2014-03-18
9 schema:datePublishedReg 2014-03-18
10 schema:description The Gene Ontology Consortium (GOC) is a major bioinformatics project that provides structured controlled vocabularies to classify gene product function and location. GOC members create annotations to gene products using the Gene Ontology (GO) vocabularies, thus providing an extensive, publicly available resource. The GO and its annotations to gene products are now an integral part of functional analysis, and statistical tests using GO data are becoming routine for researchers to include when publishing functional information. While many helpful articles about the GOC are available, there are certain updates to the ontology and annotation sets that sometimes go unobserved. Here we describe some of the ways in which GO can change that should be carefully considered by all users of GO as they may have a significant impact on the resulting gene product annotations, and therefore the functional description of the gene product, or the interpretation of analyses performed on GO datasets. GO annotations for gene products change for many reasons, and while these changes generally improve the accuracy of the representation of the underlying biology, they do not necessarily imply that previous annotations were incorrect. We additionally describe the quality assurance mechanisms we employ to improve the accuracy of annotations, which necessarily changes the composition of the annotation sets we provide. We use the Universal Protein Resource (UniProt) for illustrative purposes of how the GO Consortium, as a whole, manages these changes.
11 schema:genre article
12 schema:isAccessibleForFree true
13 schema:isPartOf N2428acacde704778a3199d6902fcab81
14 N89351db9b3e146a2bb8ec96f7dd570b9
15 sg:journal.1047731
16 schema:keywords GO
17 GO Consortium
18 GO annotations
19 GO data
20 GO datasets
21 Gene Ontology
22 Gene Ontology Consortium
23 Gene Ontology vocabulary
24 UniProt
25 Universal Protein Resource
26 accuracy
27 accuracy of annotation
28 analysis
29 annotation
30 annotation sets
31 article
32 assurance mechanisms
33 available resources
34 bioinformatics projects
35 biology
36 certain updates
37 changes
38 composition
39 consortium
40 data
41 dataset
42 description
43 function
44 functional analysis
45 functional description
46 functional information
47 gene product annotations
48 gene product functions
49 gene products
50 helpful articles
51 illustrative purposes
52 impact
53 information
54 integral part
55 interpretation
56 interpretation of analyses
57 location
58 mechanism
59 members
60 ontology
61 ontology vocabulary
62 part
63 previous annotations
64 product annotations
65 product changes
66 product functions
67 products
68 project
69 protein resources
70 purpose
71 quality assurance mechanisms
72 reasons
73 representation
74 researchers
75 resources
76 routines
77 set
78 significant impact
79 statistical tests
80 test
81 update
82 users
83 vocabulary
84 way
85 whole
86 schema:name Understanding how and why the Gene Ontology and its annotations evolve: the GO within UniProt
87 schema:pagination 4
88 schema:productId N40926df7e43842b9b19350de69ea849e
89 Nccade864cf714abcb9cfa0c23bb37565
90 Ne2a8c4a7c6d544a186971309960d9bee
91 schema:sameAs https://app.dimensions.ai/details/publication/pub.1045988176
92 https://doi.org/10.1186/2047-217x-3-4
93 schema:sdDatePublished 2022-10-01T06:39
94 schema:sdLicense https://scigraph.springernature.com/explorer/license/
95 schema:sdPublisher N687108643d254713bd8b12225412e2de
96 schema:url https://doi.org/10.1186/2047-217x-3-4
97 sgo:license sg:explorer/license/
98 sgo:sdDataset articles
99 rdf:type schema:ScholarlyArticle
100 N1bb023bd29dc4fda9795a8df8eba7e12 rdf:first sg:person.01120775354.41
101 rdf:rest Nd3de618935574882bc6deb857c01b052
102 N2428acacde704778a3199d6902fcab81 schema:issueNumber 1
103 rdf:type schema:PublicationIssue
104 N40926df7e43842b9b19350de69ea849e schema:name doi
105 schema:value 10.1186/2047-217x-3-4
106 rdf:type schema:PropertyValue
107 N687108643d254713bd8b12225412e2de schema:name Springer Nature - SN SciGraph project
108 rdf:type schema:Organization
109 N8038dbd1bbdf421ba279c46653d3373f rdf:first sg:person.01017767203.44
110 rdf:rest rdf:nil
111 N89351db9b3e146a2bb8ec96f7dd570b9 schema:volumeNumber 3
112 rdf:type schema:PublicationVolume
113 Nccade864cf714abcb9cfa0c23bb37565 schema:name pubmed_id
114 schema:value 24641996
115 rdf:type schema:PropertyValue
116 Nd3de618935574882bc6deb857c01b052 rdf:first sg:person.01277647757.79
117 rdf:rest Ned4a487a8228405ea3213379f04e032f
118 Ne2a8c4a7c6d544a186971309960d9bee schema:name dimensions_id
119 schema:value pub.1045988176
120 rdf:type schema:PropertyValue
121 Ned4a487a8228405ea3213379f04e032f rdf:first sg:person.01007251502.34
122 rdf:rest N8038dbd1bbdf421ba279c46653d3373f
123 anzsrc-for:06 schema:inDefinedTermSet anzsrc-for:
124 schema:name Biological Sciences
125 rdf:type schema:DefinedTerm
126 anzsrc-for:0604 schema:inDefinedTermSet anzsrc-for:
127 schema:name Genetics
128 rdf:type schema:DefinedTerm
129 sg:grant.2697599 http://pending.schema.org/fundedItem sg:pub.10.1186/2047-217x-3-4
130 rdf:type schema:MonetaryGrant
131 sg:grant.2697607 http://pending.schema.org/fundedItem sg:pub.10.1186/2047-217x-3-4
132 rdf:type schema:MonetaryGrant
133 sg:grant.5141042 http://pending.schema.org/fundedItem sg:pub.10.1186/2047-217x-3-4
134 rdf:type schema:MonetaryGrant
135 sg:grant.5141094 http://pending.schema.org/fundedItem sg:pub.10.1186/2047-217x-3-4
136 rdf:type schema:MonetaryGrant
137 sg:journal.1047731 schema:issn 2047-217X
138 schema:name GigaScience
139 schema:publisher Oxford University Press (OUP)
140 rdf:type schema:Periodical
141 sg:person.01007251502.34 schema:affiliation grid-institutes:grid.225360.0
142 schema:familyName Martin
143 schema:givenName Maria J
144 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01007251502.34
145 rdf:type schema:Person
146 sg:person.01017767203.44 schema:affiliation grid-institutes:grid.225360.0
147 schema:familyName O’Donovan
148 schema:givenName Claire
149 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01017767203.44
150 rdf:type schema:Person
151 sg:person.01120775354.41 schema:affiliation grid-institutes:grid.225360.0
152 schema:familyName Huntley
153 schema:givenName Rachael P
154 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01120775354.41
155 rdf:type schema:Person
156 sg:person.01277647757.79 schema:affiliation grid-institutes:grid.225360.0
157 schema:familyName Sawford
158 schema:givenName Tony
159 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01277647757.79
160 rdf:type schema:Person
161 sg:pub.10.1038/75556 schema:sameAs https://app.dimensions.ai/details/publication/pub.1044135237
162 https://doi.org/10.1038/75556
163 rdf:type schema:CreativeWork
164 sg:pub.10.1038/nbt1210-1248 schema:sameAs https://app.dimensions.ai/details/publication/pub.1002756332
165 https://doi.org/10.1038/nbt1210-1248
166 rdf:type schema:CreativeWork
167 sg:pub.10.1038/nrg2363 schema:sameAs https://app.dimensions.ai/details/publication/pub.1018486403
168 https://doi.org/10.1038/nrg2363
169 rdf:type schema:CreativeWork
170 sg:pub.10.1186/1471-2105-11-530 schema:sameAs https://app.dimensions.ai/details/publication/pub.1037563685
171 https://doi.org/10.1186/1471-2105-11-530
172 rdf:type schema:CreativeWork
173 grid-institutes:grid.225360.0 schema:alternateName European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK
174 schema:name European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK
175 rdf:type schema:Organization
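
In the table above, `schema:author` points to a blank node rather than to the people directly: the author order is stored as an RDF list, a chain of blank nodes linked by `rdf:first` and `rdf:rest` and terminated by `rdf:nil`. The following Python sketch walks that chain to recover the order. The triples and family names are copied from the table; the function `walk_rdf_list` is a hypothetical helper written for this illustration.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate, object) rows copied from the triples table above.
LIST_TRIPLES = [
    ("N1bb023bd29dc4fda9795a8df8eba7e12", "rdf:first", "sg:person.01120775354.41"),
    ("N1bb023bd29dc4fda9795a8df8eba7e12", "rdf:rest", "Nd3de618935574882bc6deb857c01b052"),
    ("N8038dbd1bbdf421ba279c46653d3373f", "rdf:first", "sg:person.01017767203.44"),
    ("N8038dbd1bbdf421ba279c46653d3373f", "rdf:rest", "rdf:nil"),
    ("Nd3de618935574882bc6deb857c01b052", "rdf:first", "sg:person.01277647757.79"),
    ("Nd3de618935574882bc6deb857c01b052", "rdf:rest", "Ned4a487a8228405ea3213379f04e032f"),
    ("Ned4a487a8228405ea3213379f04e032f", "rdf:first", "sg:person.01007251502.34"),
    ("Ned4a487a8228405ea3213379f04e032f", "rdf:rest", "N8038dbd1bbdf421ba279c46653d3373f"),
]

# schema:familyName values from the same table.
FAMILY_NAMES = {
    "sg:person.01120775354.41": "Huntley",
    "sg:person.01277647757.79": "Sawford",
    "sg:person.01007251502.34": "Martin",
    "sg:person.01017767203.44": "O\u2019Donovan",
}


def walk_rdf_list(head, triples):
    """Follow rdf:first / rdf:rest links from head until rdf:nil."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError("cycle detected in RDF list")
        seen.add(node)
        items.append(first[node])  # the element stored at this list node
        node = rest[node]          # move on to the next list node
    return items


# The head node is the object of the schema:author triple (row 3 of the table).
ordered_ids = walk_rdf_list("N1bb023bd29dc4fda9795a8df8eba7e12", LIST_TRIPLES)
ordered_names = [FAMILY_NAMES[person] for person in ordered_ids]
print(ordered_names)
```

The recovered order matches the AUTHORS field at the top of the record, even though the table lists the blank nodes alphabetically by identifier.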
 



