Collecting Data for Automatic Speech Recognition Systems in Dialectal Arabic Using Games with a Purpose


Ontology type: schema:Chapter     


Chapter Info

DATE

2015

AUTHORS

Dayna El-Sakhawy, Slim Abdennadher, Injy Hamed

ABSTRACT

Building Automatic Speech Recognition (ASR) systems for spoken languages usually suffers from the problem of limited available transcriptions. ASR systems require large speech corpora that contain speech and its corresponding transcriptions for training acoustic models. In this paper, we target Egyptian dialectal Arabic. Like other spoken languages, it is used mainly for speaking rather than writing. Transcriptions are usually collected manually by experts; however, this has proved to be a time-consuming and expensive process. We introduce Games With a Purpose as a cheap and fast approach to gathering transcriptions for Egyptian dialectal Arabic. Furthermore, Arabic orthographic transcriptions lack diacritization, which leads to ambiguity. In contrast, transcriptions written in the Arabic Chat Alphabet are widely used and capture the pronunciation effects conveyed by diacritics. In this work, we present the game (pronounced as makhamekho) that aims at collecting transcriptions in Arabic orthography as well as in the Arabic Chat Alphabet. It also gathers mappings of words from Arabic orthography to the Arabic Chat Alphabet.

PAGES

99-108

Book

TITLE

Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction

ISBN

978-3-319-15556-2
978-3-319-15557-9

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-319-15557-9_10

DOI

http://dx.doi.org/10.1007/978-3-319-15557-9_10

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1053559331



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "German University in Cairo", 
          "id": "https://www.grid.ac/institutes/grid.187323.c", 
          "name": [
            "Media Engineering and Technology Faculty, German University in Cairo, New Cairo, Egypt"
          ], 
          "type": "Organization"
        }, 
        "familyName": "El-Sakhawy", 
        "givenName": "Dayna", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "German University in Cairo", 
          "id": "https://www.grid.ac/institutes/grid.187323.c", 
          "name": [
            "Media Engineering and Technology Faculty, German University in Cairo, New Cairo, Egypt"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Abdennadher", 
        "givenName": "Slim", 
        "id": "sg:person.010445445574.13", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010445445574.13"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "German University in Cairo", 
          "id": "https://www.grid.ac/institutes/grid.187323.c", 
          "name": [
            "Media Engineering and Technology Faculty, German University in Cairo, New Cairo, Egypt"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hamed", 
        "givenName": "Injy", 
        "id": "sg:person.016542541344.73", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016542541344.73"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.3115/1034678.1034680", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009916577"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/slt.2010.5700870", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1093301973"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/icassp.2011.5947463", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1093438248"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.3115/1613715.1613751", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1099150814"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2015", 
    "datePublishedReg": "2015-01-01", 
    "description": "Building Automatic Speech Recognition (ASR) systems for spoken languages usually suffer from the problem of limited available transcriptions. Automatic Speech Recognition (ASR) systems require large speech corpora that contain speech and their corresponding transcriptions for training acoustic models. In this paper, we target the Egyptian dialectal Arabic. As other spoken languages, it is mainly used for spoken rather than writing purposes. Transcriptions are usually collected manually by experts. However, this proved to be a time-consuming and expensive process. In this paper, we introduce Games With a Purpose as a cheap and fast approach to gather transcriptions for Egyptian dialectal Arabic. Furthermore, Arabic orthographic transcriptions lack diacritizations, which leads to ambiguity. On the other hand, transcriptions written in Arabic Chat Alphabet are widely used, and include the pronunciation effects given by diacritics. In this work, we present the game (pronouced as makhamekho) that aims at collecting transcriptions in Arabic orthography, as well as in Arabic Chat Alphabet. It also gathers mappings of words from Arabic orthography to Arabic Chat Alphabet.", 
    "editor": [
      {
        "familyName": "B\u00f6ck", 
        "givenName": "Ronald", 
        "type": "Person"
      }, 
      {
        "familyName": "Bonin", 
        "givenName": "Francesca", 
        "type": "Person"
      }, 
      {
        "familyName": "Campbell", 
        "givenName": "Nick", 
        "type": "Person"
      }, 
      {
        "familyName": "Poppe", 
        "givenName": "Ronald", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-319-15557-9_10", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-319-15556-2", 
        "978-3-319-15557-9"
      ], 
      "name": "Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction", 
      "type": "Book"
    }, 
    "name": "Collecting Data for Automatic Speech Recognition Systems in Dialectal Arabic Using Games with a Purpose", 
    "pagination": "99-108", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-319-15557-9_10"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "ff1bee4bc14b3ee37ddf83036cec9be71ae6cd83db68fa76d21fd786d3f36f44"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1053559331"
        ]
      }
    ], 
    "publisher": {
      "location": "Cham", 
      "name": "Springer International Publishing", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-319-15557-9_10", 
      "https://app.dimensions.ai/details/publication/pub.1053559331"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T16:07", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8675_00000092.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/978-3-319-15557-9_10"
  }
]
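
Because JSON-LD is plain JSON, a record like the one above can be read with any JSON parser. The following is a minimal sketch, assuming the record is available as a string; the `RECORD` constant is a trimmed excerpt of the data above, and the `summarize` helper is a hypothetical name introduced only for illustration, not part of any SciGraph tooling.

```python
import json

# Trimmed excerpt of the record above; only the fields used below are kept.
RECORD = """
[
  {
    "author": [
      {"familyName": "El-Sakhawy", "givenName": "Dayna", "type": "Person"},
      {"familyName": "Abdennadher", "givenName": "Slim", "type": "Person"},
      {"familyName": "Hamed", "givenName": "Injy", "type": "Person"}
    ],
    "name": "Collecting Data for Automatic Speech Recognition Systems in Dialectal Arabic Using Games with a Purpose",
    "pagination": "99-108",
    "productId": [
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-319-15557-9_10"]},
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1053559331"]}
    ]
  }
]
"""


def summarize(record_json):
    """Return title, author names, and DOI from a SciGraph-style JSON-LD array."""
    record = json.loads(record_json)[0]  # the payload is a one-element array
    authors = [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]
    # productId is a list of name/value pairs; index it by name for lookup.
    product_ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    return {
        "title": record["name"],
        "authors": authors,
        "doi": product_ids.get("doi"),
        "pages": record.get("pagination"),
    }


summary = summarize(RECORD)
print(summary["authors"])
print(summary["doi"])
```

Indexing `productId` by its `name` field avoids relying on the order of the identifier entries, which the record does not guarantee.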
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-15557-9_10'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-15557-9_10'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-15557-9_10'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-15557-9_10'
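
The four curl calls above differ only in the `Accept` header, so the same content negotiation can be sketched in a script. This is an illustrative sketch using Python's standard `urllib.request`; the short format labels and the `build_request` and `fetch` helpers are names chosen here, and whether the endpoint still responds is an assumption that has not been verified.

```python
import urllib.request

# URL taken from the curl examples above.
BASE_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-319-15557-9_10"

# Media types copied from the curl examples; the short keys are illustrative labels.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=BASE_URL):
    """Build a GET request whose Accept header selects the RDF serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=BASE_URL, timeout=30):
    """Send the request and return the body as text (requires network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request performs no network I/O, so it is safe to inspect offline.
req = build_request("turtle")
print(req.get_header("Accept"))
```

Calling `fetch("nt")` would then be the scripted counterpart of the N-Triples curl command.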


 

This table displays all metadata directly associated with this object as RDF triples.

105 TRIPLES      23 PREDICATES      31 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-319-15557-9_10 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N741497cfdabe49f38ace2d94fdf85bbd
4 schema:citation https://doi.org/10.1109/icassp.2011.5947463
5 https://doi.org/10.1109/slt.2010.5700870
6 https://doi.org/10.3115/1034678.1034680
7 https://doi.org/10.3115/1613715.1613751
8 schema:datePublished 2015
9 schema:datePublishedReg 2015-01-01
10 schema:description Building Automatic Speech Recognition (ASR) systems for spoken languages usually suffer from the problem of limited available transcriptions. Automatic Speech Recognition (ASR) systems require large speech corpora that contain speech and their corresponding transcriptions for training acoustic models. In this paper, we target the Egyptian dialectal Arabic. As other spoken languages, it is mainly used for spoken rather than writing purposes. Transcriptions are usually collected manually by experts. However, this proved to be a time-consuming and expensive process. In this paper, we introduce Games With a Purpose as a cheap and fast approach to gather transcriptions for Egyptian dialectal Arabic. Furthermore, Arabic orthographic transcriptions lack diacritizations, which leads to ambiguity. On the other hand, transcriptions written in Arabic Chat Alphabet are widely used, and include the pronunciation effects given by diacritics. In this work, we present the game (pronouced as makhamekho) that aims at collecting transcriptions in Arabic orthography, as well as in Arabic Chat Alphabet. It also gathers mappings of words from Arabic orthography to Arabic Chat Alphabet.
11 schema:editor Nd170b7eaf7e7406283296d4a83f96d14
12 schema:genre chapter
13 schema:inLanguage en
14 schema:isAccessibleForFree false
15 schema:isPartOf N5fdcbfb45b384efcb43ffd7c991135a4
16 schema:name Collecting Data for Automatic Speech Recognition Systems in Dialectal Arabic Using Games with a Purpose
17 schema:pagination 99-108
18 schema:productId N5b5055e0cff24265bc37e5f6bb62ecb5
19 N86ff8c5ae22b409594ce456d50b2428e
20 Nef0cd309566a4ba2afc833c7540cc60c
21 schema:publisher N8fc24b78898747648d0e3bdb92a9a77d
22 schema:sameAs https://app.dimensions.ai/details/publication/pub.1053559331
23 https://doi.org/10.1007/978-3-319-15557-9_10
24 schema:sdDatePublished 2019-04-15T16:07
25 schema:sdLicense https://scigraph.springernature.com/explorer/license/
26 schema:sdPublisher N98857d061c2a4416860c06832259cf4a
27 schema:url http://link.springer.com/10.1007/978-3-319-15557-9_10
28 sgo:license sg:explorer/license/
29 sgo:sdDataset chapters
30 rdf:type schema:Chapter
31 N004e786dc0e7476b8eff1742803224ae schema:familyName Campbell
32 schema:givenName Nick
33 rdf:type schema:Person
34 N080ff082ca64421499c5070cfe7352be schema:familyName Böck
35 schema:givenName Ronald
36 rdf:type schema:Person
37 N29a82ce67f234adf932c0d6db3889ff0 schema:familyName Poppe
38 schema:givenName Ronald
39 rdf:type schema:Person
40 N2bad90927c9643f0871c68e7fb2b811f schema:familyName Bonin
41 schema:givenName Francesca
42 rdf:type schema:Person
43 N40c9417493a045aabb52fbcab60cafd6 schema:affiliation https://www.grid.ac/institutes/grid.187323.c
44 schema:familyName El-Sakhawy
45 schema:givenName Dayna
46 rdf:type schema:Person
47 N4317160d175647cea3f7d8340a06a995 rdf:first N004e786dc0e7476b8eff1742803224ae
48 rdf:rest Nadcfeeb753ae4d409a7065e2e3d116ed
49 N5b5055e0cff24265bc37e5f6bb62ecb5 schema:name dimensions_id
50 schema:value pub.1053559331
51 rdf:type schema:PropertyValue
52 N5fdcbfb45b384efcb43ffd7c991135a4 schema:isbn 978-3-319-15556-2
53 978-3-319-15557-9
54 schema:name Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction
55 rdf:type schema:Book
56 N741497cfdabe49f38ace2d94fdf85bbd rdf:first N40c9417493a045aabb52fbcab60cafd6
57 rdf:rest N7b08aefb1c68458795235b3f72d935ce
58 N7b08aefb1c68458795235b3f72d935ce rdf:first sg:person.010445445574.13
59 rdf:rest Nd1838669c5de453885e90914d642613f
60 N86ff8c5ae22b409594ce456d50b2428e schema:name readcube_id
61 schema:value ff1bee4bc14b3ee37ddf83036cec9be71ae6cd83db68fa76d21fd786d3f36f44
62 rdf:type schema:PropertyValue
63 N8fc24b78898747648d0e3bdb92a9a77d schema:location Cham
64 schema:name Springer International Publishing
65 rdf:type schema:Organisation
66 N94e443b277814d05afa737f251fa1319 rdf:first N2bad90927c9643f0871c68e7fb2b811f
67 rdf:rest N4317160d175647cea3f7d8340a06a995
68 N98857d061c2a4416860c06832259cf4a schema:name Springer Nature - SN SciGraph project
69 rdf:type schema:Organization
70 Nadcfeeb753ae4d409a7065e2e3d116ed rdf:first N29a82ce67f234adf932c0d6db3889ff0
71 rdf:rest rdf:nil
72 Nd170b7eaf7e7406283296d4a83f96d14 rdf:first N080ff082ca64421499c5070cfe7352be
73 rdf:rest N94e443b277814d05afa737f251fa1319
74 Nd1838669c5de453885e90914d642613f rdf:first sg:person.016542541344.73
75 rdf:rest rdf:nil
76 Nef0cd309566a4ba2afc833c7540cc60c schema:name doi
77 schema:value 10.1007/978-3-319-15557-9_10
78 rdf:type schema:PropertyValue
79 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
80 schema:name Information and Computing Sciences
81 rdf:type schema:DefinedTerm
82 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
83 schema:name Artificial Intelligence and Image Processing
84 rdf:type schema:DefinedTerm
85 sg:person.010445445574.13 schema:affiliation https://www.grid.ac/institutes/grid.187323.c
86 schema:familyName Abdennadher
87 schema:givenName Slim
88 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010445445574.13
89 rdf:type schema:Person
90 sg:person.016542541344.73 schema:affiliation https://www.grid.ac/institutes/grid.187323.c
91 schema:familyName Hamed
92 schema:givenName Injy
93 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016542541344.73
94 rdf:type schema:Person
95 https://doi.org/10.1109/icassp.2011.5947463 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093438248
96 rdf:type schema:CreativeWork
97 https://doi.org/10.1109/slt.2010.5700870 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093301973
98 rdf:type schema:CreativeWork
99 https://doi.org/10.3115/1034678.1034680 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009916577
100 rdf:type schema:CreativeWork
101 https://doi.org/10.3115/1613715.1613751 schema:sameAs https://app.dimensions.ai/details/publication/pub.1099150814
102 rdf:type schema:CreativeWork
103 https://www.grid.ac/institutes/grid.187323.c schema:alternateName German University in Cairo
104 schema:name Media Engineering and Technology Faculty, German University in Cairo, New Cairo, Egypt
105 rdf:type schema:Organization
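
In the table above, the ordered author and editor lists are encoded as RDF collections: each blank node carries an `rdf:first` (the item) and an `rdf:rest` (the next node), ending at `rdf:nil`. The sketch below walks the author chain using plain dictionaries whose contents are copied from the table rows; the `walk_rdf_list` helper is a hypothetical name, and a real application would more likely load the triples with an RDF library.

```python
RDF_NIL = "rdf:nil"

# node -> (rdf:first, rdf:rest), copied from the author list rows in the table.
AUTHOR_LIST = {
    "N741497cfdabe49f38ace2d94fdf85bbd": (
        "N40c9417493a045aabb52fbcab60cafd6",
        "N7b08aefb1c68458795235b3f72d935ce",
    ),
    "N7b08aefb1c68458795235b3f72d935ce": (
        "sg:person.010445445574.13",
        "Nd1838669c5de453885e90914d642613f",
    ),
    "Nd1838669c5de453885e90914d642613f": ("sg:person.016542541344.73", RDF_NIL),
}

# schema:familyName for each person node, also copied from the table.
FAMILY_NAMES = {
    "N40c9417493a045aabb52fbcab60cafd6": "El-Sakhawy",
    "sg:person.010445445574.13": "Abdennadher",
    "sg:person.016542541344.73": "Hamed",
}


def walk_rdf_list(head, nodes):
    """Follow rdf:first / rdf:rest links from head until rdf:nil, in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# The head node is the object of schema:author on the publication.
author_nodes = walk_rdf_list("N741497cfdabe49f38ace2d94fdf85bbd", AUTHOR_LIST)
author_names = [FAMILY_NAMES[n] for n in author_nodes]
print(author_names)
```

The same walk applies to the editor list, whose head is the object of `schema:editor`.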
 



