A Parallel Greek-Bulgarian Corpus: A Digital Resource of the Shared Cultural Heritage


Ontology type: schema:Chapter     


Chapter Info

DATE

2011-04-26

AUTHORS

Voula Giouli, Kiril Simov, Petya Osenova

ABSTRACT

There has been a long tradition in the digitization and manual documentation of cultural heritage data, yet the need for indexing and retrieval that goes beyond mere bibliographic information has only recently been recognized. This chapter reports on completed work aimed at highlighting textual cultural resources that, as of yet, remain under-exploited by creating the necessary infrastructure with the support and customization of Language Technologies (LT). The ultimate goal was to promote the study of cultural heritage of the neighboring areas of Greece and Bulgaria and to raise awareness about their common cultural identity, the focus being on literature, folklore and language. To this end, a bilingual collection of literary and folklore texts in Greek and Bulgarian was developed along with a number of accompanying resources. The authors present the methodology adopted for the automatic annotation of the textual data at various levels of linguistic analysis elaborating on the Greek and Bulgarian text processing tools that are integrated in the cross-lingual search and retrieval mechanisms, and discuss issues and problems encountered in the course of the project life-cycle.

PAGES

99-112

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-642-20227-8_6

DOI

http://dx.doi.org/10.1007/978-3-642-20227-8_6

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1006820406



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/20", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Language, Communication and Culture", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/2004", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Linguistics", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/2005", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Literary Studies", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Institute for Language and Speech Processing Epidavrou 6 & Artemidos, 15125, Athens, Greece", 
          "id": "http://www.grid.ac/institutes/grid.424851.e", 
          "name": [
            "Institute for Language and Speech Processing Epidavrou 6 & Artemidos, 15125, Athens, Greece"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Giouli", 
        "givenName": "Voula", 
        "id": "sg:person.012632535220.96", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012632535220.96"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Parallel Processing, Bulgarian Academy of Sciences, Acad. G. Bonchev 25A, 1113, Sofia, Bulgaria", 
          "id": "http://www.grid.ac/institutes/grid.424859.6", 
          "name": [
            "Institute of Parallel Processing, Bulgarian Academy of Sciences, Acad. G. Bonchev 25A, 1113, Sofia, Bulgaria"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Simov", 
        "givenName": "Kiril", 
        "id": "sg:person.07534373001.35", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07534373001.35"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Institute of Parallel Processing, Bulgarian Academy of Sciences, Acad. G. Bonchev 25A, 1113, Sofia, Bulgaria", 
          "id": "http://www.grid.ac/institutes/grid.424859.6", 
          "name": [
            "Institute of Parallel Processing, Bulgarian Academy of Sciences, Acad. G. Bonchev 25A, 1113, Sofia, Bulgaria"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Osenova", 
        "givenName": "Petya", 
        "id": "sg:person.014725200041.24", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014725200041.24"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2011-04-26", 
    "datePublishedReg": "2011-04-26", 
    "description": "There has been a long tradition in the digitization and manual documentation of cultural heritage data, yet the need for indexing and retrieval that goes beyond mere bibliographic information has only recently been recognized. This chapter reports on completed work aimed at highlighting textual cultural resources that, as of yet, remain under-exploited by creating the necessary infrastructure with the support and customization of Language Technologies (LT). The ultimate goal was to promote the study of cultural heritage of the neighboring areas of Greece and Bulgaria and to raise awareness about their common cultural identity, the focus being on literature, folklore and language. To this end, a bilingual collection of literary and folklore texts in Greek and Bulgarian was developed along with a number of accompanying resources. The authors present the methodology adopted for the automatic annotation of the textual data at various levels of linguistic analysis elaborating on the Greek and Bulgarian text processing tools that are integrated in the cross-lingual search and retrieval mechanisms, and discuss issues and problems encountered in the course of the project life-cycle.", 
    "editor": [
      {
        "familyName": "Sporleder", 
        "givenName": "Caroline", 
        "type": "Person"
      }, 
      {
        "familyName": "van den Bosch", 
        "givenName": "Antal", 
        "type": "Person"
      }, 
      {
        "familyName": "Zervanou", 
        "givenName": "Kalliopi", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-642-20227-8_6", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-642-20226-1", 
        "978-3-642-20227-8"
      ], 
      "name": "Language Technology for Cultural Heritage", 
      "type": "Book"
    }, 
    "keywords": [
      "language technology", 
      "cultural heritage", 
      "common cultural identity", 
      "cross-lingual search", 
      "cultural heritage data", 
      "text processing tools", 
      "bilingual collection", 
      "folklore texts", 
      "linguistic analysis", 
      "cultural identity", 
      "cultural resources", 
      "digital resources", 
      "heritage data", 
      "automatic annotation", 
      "textual data", 
      "long tradition", 
      "retrieval mechanism", 
      "manual documentation", 
      "bibliographic information", 
      "processing tools", 
      "Greek", 
      "heritage", 
      "necessary infrastructure", 
      "language", 
      "folklore", 
      "text", 
      "corpus", 
      "Bulgarians", 
      "tradition", 
      "identity", 
      "resources", 
      "indexing", 
      "ultimate goal", 
      "customization", 
      "retrieval", 
      "annotation", 
      "digitization", 
      "infrastructure", 
      "Greece", 
      "literature", 
      "authors", 
      "chapter", 
      "technology", 
      "awareness", 
      "focus", 
      "information", 
      "project", 
      "Bulgaria", 
      "search", 
      "issues", 
      "tool", 
      "collection", 
      "documentation", 
      "data", 
      "work", 
      "methodology", 
      "goal", 
      "course", 
      "need", 
      "support", 
      "end", 
      "number", 
      "neighboring areas", 
      "study", 
      "analysis", 
      "area", 
      "problem", 
      "mechanism", 
      "levels", 
      "mere bibliographic information", 
      "textual cultural resources", 
      "Bulgarian text processing tools", 
      "Parallel Greek-Bulgarian Corpus", 
      "Greek-Bulgarian Corpus"
    ], 
    "name": "A Parallel Greek-Bulgarian Corpus: A Digital Resource of the Shared Cultural Heritage", 
    "pagination": "99-112", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1006820406"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-642-20227-8_6"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-642-20227-8_6", 
      "https://app.dimensions.ai/details/publication/pub.1006820406"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2021-12-01T20:05", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20211201/entities/gbq_results/chapter/chapter_328.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-642-20227-8_6"
  }
]
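Because the record is plain JSON, it can be read with any JSON parser. The following is a minimal sketch, not an official SciGraph client: it embeds a trimmed copy of the record above (only the fields it uses) and pulls out the title, the author names, the DOI and the page range. The function name `summarize` and the shape of the returned dictionary are illustrative assumptions.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Giouli", "givenName": "Voula", "id": "sg:person.012632535220.96"},
      {"familyName": "Simov", "givenName": "Kiril", "id": "sg:person.07534373001.35"},
      {"familyName": "Osenova", "givenName": "Petya", "id": "sg:person.014725200041.24"}
    ],
    "name": "A Parallel Greek-Bulgarian Corpus: A Digital Resource of the Shared Cultural Heritage",
    "pagination": "99-112",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1006820406"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-642-20227-8_6"]}
    ]
  }
]
"""


def summarize(record_json):
    """Return title, author names, DOI and page range (hypothetical helper)."""
    record = json.loads(record_json)[0]  # the document is a one-element array
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # productId is a list of name/value pairs; index it by name for lookup.
    product_ids = {p["name"]: p["value"][0] for p in record["productId"]}
    return {
        "title": record["name"],
        "authors": authors,
        "doi": product_ids.get("doi"),
        "pages": record["pagination"],
    }


summary = summarize(RECORD_JSON)
print(summary["authors"], summary["doi"])
```

Indexing `productId` by its `name` field avoids relying on the order of the identifiers in the array.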
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a linked data format that is itself valid JSON, so any JSON parser can read it.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-20227-8_6'

N-Triples is a line-based linked data format with one triple per line, which suits streaming and batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-20227-8_6'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-20227-8_6'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-642-20227-8_6'
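The four curl commands above differ only in the Accept header, so the same content negotiation can be sketched in Python with the standard library. The URL and media types are copied from the commands above; the short format labels (`json-ld`, `turtle`, and so on) and the function names are assumptions made for this example, and whether the endpoint still responds is not verified here.

```python
import urllib.request

# Base URL and media types are copied from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-642-20227-8_6"

# The dictionary keys are hypothetical labels chosen for this sketch.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request whose Accept header selects the serialization."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Fetch the record in the chosen format (performs a network call)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network; only fetch() does.
request = build_request("turtle")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `fetch("turtle")` would then correspond to the third curl command.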


 

This table displays all metadata directly associated with this object as RDF triples.

165 TRIPLES      23 PREDICATES      99 URIs      91 LITERALS      7 BLANK NODES

Row Subject Predicate Object (a row that omits the subject or predicate repeats the value from the row above)
1 sg:pub.10.1007/978-3-642-20227-8_6 schema:about anzsrc-for:20
2 anzsrc-for:2004
3 anzsrc-for:2005
4 schema:author Nad5cd1c68c114d1f9273e674328fcf38
5 schema:datePublished 2011-04-26
6 schema:datePublishedReg 2011-04-26
7 schema:description There has been a long tradition in the digitization and manual documentation of cultural heritage data, yet the need for indexing and retrieval that goes beyond mere bibliographic information has only recently been recognized. This chapter reports on completed work aimed at highlighting textual cultural resources that, as of yet, remain under-exploited by creating the necessary infrastructure with the support and customization of Language Technologies (LT). The ultimate goal was to promote the study of cultural heritage of the neighboring areas of Greece and Bulgaria and to raise awareness about their common cultural identity, the focus being on literature, folklore and language. To this end, a bilingual collection of literary and folklore texts in Greek and Bulgarian was developed along with a number of accompanying resources. The authors present the methodology adopted for the automatic annotation of the textual data at various levels of linguistic analysis elaborating on the Greek and Bulgarian text processing tools that are integrated in the cross-lingual search and retrieval mechanisms, and discuss issues and problems encountered in the course of the project life-cycle.
8 schema:editor N2b039060736a4fa387545492d0ee8966
9 schema:genre chapter
10 schema:inLanguage en
11 schema:isAccessibleForFree false
12 schema:isPartOf N80c9a76e08714dd7848e63e75a1c27d0
13 schema:keywords Bulgaria
14 Bulgarian text processing tools
15 Bulgarians
16 Greece
17 Greek
18 Greek-Bulgarian Corpus
19 Parallel Greek-Bulgarian Corpus
20 analysis
21 annotation
22 area
23 authors
24 automatic annotation
25 awareness
26 bibliographic information
27 bilingual collection
28 chapter
29 collection
30 common cultural identity
31 corpus
32 course
33 cross-lingual search
34 cultural heritage
35 cultural heritage data
36 cultural identity
37 cultural resources
38 customization
39 data
40 digital resources
41 digitization
42 documentation
43 end
44 focus
45 folklore
46 folklore texts
47 goal
48 heritage
49 heritage data
50 identity
51 indexing
52 information
53 infrastructure
54 issues
55 language
56 language technology
57 levels
58 linguistic analysis
59 literature
60 long tradition
61 manual documentation
62 mechanism
63 mere bibliographic information
64 methodology
65 necessary infrastructure
66 need
67 neighboring areas
68 number
69 problem
70 processing tools
71 project
72 resources
73 retrieval
74 retrieval mechanism
75 search
76 study
77 support
78 technology
79 text
80 text processing tools
81 textual cultural resources
82 textual data
83 tool
84 tradition
85 ultimate goal
86 work
87 schema:name A Parallel Greek-Bulgarian Corpus: A Digital Resource of the Shared Cultural Heritage
88 schema:pagination 99-112
89 schema:productId N649f152486d747cdb0ef2aced8235f77
90 Nf35480edcfba4589a25b661d51943630
91 schema:publisher N2b5a92b493d74ce78f6034af5d1117e0
92 schema:sameAs https://app.dimensions.ai/details/publication/pub.1006820406
93 https://doi.org/10.1007/978-3-642-20227-8_6
94 schema:sdDatePublished 2021-12-01T20:05
95 schema:sdLicense https://scigraph.springernature.com/explorer/license/
96 schema:sdPublisher N4a3ee0be579e4834bcd64e66743d59cd
97 schema:url https://doi.org/10.1007/978-3-642-20227-8_6
98 sgo:license sg:explorer/license/
99 sgo:sdDataset chapters
100 rdf:type schema:Chapter
101 N085d1c8ea59b4785ab8a4eda74be444a rdf:first N28906ee41df74cdca4eb844fbd749d67
102 rdf:rest rdf:nil
103 N1990dcd922af49a2bbc49c8aff86ce4b rdf:first N1e3d86e863224b81a62e56e966ba5808
104 rdf:rest N085d1c8ea59b4785ab8a4eda74be444a
105 N1e3d86e863224b81a62e56e966ba5808 schema:familyName van den Bosch
106 schema:givenName Antal
107 rdf:type schema:Person
108 N28906ee41df74cdca4eb844fbd749d67 schema:familyName Zervanou
109 schema:givenName Kalliopi
110 rdf:type schema:Person
111 N2b039060736a4fa387545492d0ee8966 rdf:first Nc16329fff69f441baac11f56f7719009
112 rdf:rest N1990dcd922af49a2bbc49c8aff86ce4b
113 N2b5a92b493d74ce78f6034af5d1117e0 schema:name Springer Nature
114 rdf:type schema:Organisation
115 N4a3ee0be579e4834bcd64e66743d59cd schema:name Springer Nature - SN SciGraph project
116 rdf:type schema:Organization
117 N547db6b0b20a4abeb324d92eca261cac rdf:first sg:person.07534373001.35
118 rdf:rest Nc127010851ec4bb8a197448a33213d12
119 N649f152486d747cdb0ef2aced8235f77 schema:name dimensions_id
120 schema:value pub.1006820406
121 rdf:type schema:PropertyValue
122 N80c9a76e08714dd7848e63e75a1c27d0 schema:isbn 978-3-642-20226-1
123 978-3-642-20227-8
124 schema:name Language Technology for Cultural Heritage
125 rdf:type schema:Book
126 Nad5cd1c68c114d1f9273e674328fcf38 rdf:first sg:person.012632535220.96
127 rdf:rest N547db6b0b20a4abeb324d92eca261cac
128 Nc127010851ec4bb8a197448a33213d12 rdf:first sg:person.014725200041.24
129 rdf:rest rdf:nil
130 Nc16329fff69f441baac11f56f7719009 schema:familyName Sporleder
131 schema:givenName Caroline
132 rdf:type schema:Person
133 Nf35480edcfba4589a25b661d51943630 schema:name doi
134 schema:value 10.1007/978-3-642-20227-8_6
135 rdf:type schema:PropertyValue
136 anzsrc-for:20 schema:inDefinedTermSet anzsrc-for:
137 schema:name Language, Communication and Culture
138 rdf:type schema:DefinedTerm
139 anzsrc-for:2004 schema:inDefinedTermSet anzsrc-for:
140 schema:name Linguistics
141 rdf:type schema:DefinedTerm
142 anzsrc-for:2005 schema:inDefinedTermSet anzsrc-for:
143 schema:name Literary Studies
144 rdf:type schema:DefinedTerm
145 sg:person.012632535220.96 schema:affiliation grid-institutes:grid.424851.e
146 schema:familyName Giouli
147 schema:givenName Voula
148 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.012632535220.96
149 rdf:type schema:Person
150 sg:person.014725200041.24 schema:affiliation grid-institutes:grid.424859.6
151 schema:familyName Osenova
152 schema:givenName Petya
153 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014725200041.24
154 rdf:type schema:Person
155 sg:person.07534373001.35 schema:affiliation grid-institutes:grid.424859.6
156 schema:familyName Simov
157 schema:givenName Kiril
158 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07534373001.35
159 rdf:type schema:Person
160 grid-institutes:grid.424851.e schema:alternateName Institute for Language and Speech Processing Epidavrou 6 & Artemidos, 15125, Athens, Greece
161 schema:name Institute for Language and Speech Processing Epidavrou 6 & Artemidos, 15125, Athens, Greece
162 rdf:type schema:Organization
163 grid-institutes:grid.424859.6 schema:alternateName Institute of Parallel Processing, Bulgarian Academy of Sciences, Acad. G. Bonchev 25A, 1113, Sofia, Bulgaria
164 schema:name Institute of Parallel Processing, Bulgarian Academy of Sciences, Acad. G. Bonchev 25A, 1113, Sofia, Bulgaria
165 rdf:type schema:Organization
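In the table, schema:author and schema:editor do not point at people directly. Each points at a blank node that heads an RDF list: rdf:first holds an item and rdf:rest points to the next cell until rdf:nil. As a minimal sketch of how the ordered author list could be recovered, the cells from rows 117-118 and 126-129 are copied into a plain dictionary below. The dictionary layout and the function name `walk_rdf_list` are assumptions for illustration; an RDF library would normally do this step.

```python
RDF_NIL = "rdf:nil"

# Blank node -> (rdf:first, rdf:rest), copied from rows 117-118 and 126-129.
CELLS = {
    "Nad5cd1c68c114d1f9273e674328fcf38": (
        "sg:person.012632535220.96",
        "N547db6b0b20a4abeb324d92eca261cac",
    ),
    "N547db6b0b20a4abeb324d92eca261cac": (
        "sg:person.07534373001.35",
        "Nc127010851ec4bb8a197448a33213d12",
    ),
    "Nc127010851ec4bb8a197448a33213d12": (
        "sg:person.014725200041.24",
        RDF_NIL,
    ),
}


def walk_rdf_list(head, cells):
    """Follow rdf:rest links from head and collect each rdf:first in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            # Guard against malformed data that would loop forever.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = cells[node]
        items.append(first)
        node = rest
    return items


# Row 4 gives the head of the author list for this chapter.
authors = walk_rdf_list("Nad5cd1c68c114d1f9273e674328fcf38", CELLS)
print(authors)
```

The resulting order matches the author order in the JSON-LD record: Giouli, Simov, Osenova.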
 



