Data Driven Generation of Pronunciation Dictionaries


Ontology type: schema:Chapter     


Chapter Info

DATE

2000

AUTHORS

Matthias Eichner, Matthias Wolff, Rüdiger Hoffmann

ABSTRACT

In the framework of the German Verbmobil project we developed a procedure for the automatic, data-driven generation of pronunciation dictionaries for speech recognition systems. Most recognizers use only simple dictionaries containing the canonical pronunciation form. These represent the correct pronunciation, but in most cases the canonical pronunciation does not match the actual realization of the word. To solve this problem we chose an approach that derives pronunciation variants automatically from a speech database. The training algorithm is based on a canonical dictionary, which is compiled into a graph representation in a first stage. Pronunciation variants are then learned from a training sample consisting of a speech signal and its orthographic transcription. In this paper we focus on the experimental results obtained in the Verbmobil framework and introduce methods to evaluate pronunciation dictionaries generated by the training procedure.

PAGES

95-105

Book

TITLE

Verbmobil: Foundations of Speech-to-Speech Translation

ISBN

978-3-642-08730-1
978-3-662-04230-4

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-662-04230-4_7

DOI

http://dx.doi.org/10.1007/978-3-662-04230-4_7

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1040184765



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/1702", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Cognitive Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/17", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Psychology and Cognitive Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "TU Dresden", 
          "id": "https://www.grid.ac/institutes/grid.4488.0", 
          "name": [
            "Laboratory of Acoustics and Speech Communication, Technische Universit\u00e4t Dresden, Germany"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Eichner", 
        "givenName": "Matthias", 
        "id": "sg:person.07377047071.24", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07377047071.24"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "TU Dresden", 
          "id": "https://www.grid.ac/institutes/grid.4488.0", 
          "name": [
            "Laboratory of Acoustics and Speech Communication, Technische Universit\u00e4t Dresden, Germany"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Wolff", 
        "givenName": "Matthias", 
        "id": "sg:person.014247106521.76", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014247106521.76"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "TU Dresden", 
          "id": "https://www.grid.ac/institutes/grid.4488.0", 
          "name": [
            "Laboratory of Acoustics and Speech Communication, Technische Universit\u00e4t Dresden, Germany"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hoffmann", 
        "givenName": "R\u00fcdiger", 
        "id": "sg:person.07501447443.39", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07501447443.39"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1109/icassp.1995.479626", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1094353058"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/icassp.2000.862075", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1095444763"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2000", 
    "datePublishedReg": "2000-01-01", 
    "description": "In the framework of the German Verbmobil project we developed a procedure for the automatic, data-driven generation of pronunciation dictionaries for speech recognition systems. In most recognizers only simple dictionaries containing the canonical pronunciation form are used. They represent the correct pronunciation, but in most cases the canonical pronunciation does not match the actual realization of the word. To solve this problem we chose an approach to derive pronunciation variants automatically from a speech database. The training algorithm bases on a canonical dictionary which is compiled into a graph representation in a first stage. Pronunciation variants are then learned from a training sample consisting of speech signal and its orthographic transcription. In this paper we will focus on the experimental results obtained in the Verbmobil framework and introduce methods to evaluate pronunciation dictionaries generated by the training procedure.", 
    "editor": [
      {
        "familyName": "Wahlster", 
        "givenName": "Wolfgang", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-662-04230-4_7", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-642-08730-1", 
        "978-3-662-04230-4"
      ], 
      "name": "Verbmobil: Foundations of Speech-to-Speech Translation", 
      "type": "Book"
    }, 
    "name": "Data Driven Generation of Pronunciation Dictionaries", 
    "pagination": "95-105", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-662-04230-4_7"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "bf6cf6f40d96b1aaf1d824c0bd5601512439a1de648c5430a8eb65ae0e12cb46"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1040184765"
        ]
      }
    ], 
    "publisher": {
      "location": "Berlin, Heidelberg", 
      "name": "Springer Berlin Heidelberg", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-662-04230-4_7", 
      "https://app.dimensions.ai/details/publication/pub.1040184765"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T23:41", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8697_00000069.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/978-3-662-04230-4_7"
  }
]
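
As a minimal sketch of how a record like the one above could be read in code, the following Python snippet parses a trimmed copy of the JSON-LD and extracts the author names and the DOI. The helper names `author_names` and `product_ids` are illustrative assumptions, not part of any SciGraph tooling, and only the fields used here are reproduced.

```python
import json

# Trimmed copy of the record above; only the fields used below are kept.
RECORD = json.loads(r'''
[
  {
    "author": [
      {"familyName": "Eichner", "givenName": "Matthias"},
      {"familyName": "Wolff", "givenName": "Matthias"},
      {"familyName": "Hoffmann", "givenName": "R\u00fcdiger"}
    ],
    "name": "Data Driven Generation of Pronunciation Dictionaries",
    "pagination": "95-105",
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/978-3-662-04230-4_7"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1040184765"]}
    ]
  }
]
''')


def author_names(record):
    """Return 'given family' strings in the order the record lists them."""
    return [f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])]


def product_ids(record):
    """Map each productId name (e.g. 'doi') to its first value."""
    return {p["name"]: p["value"][0] for p in record.get("productId", [])}


chapter = RECORD[0]  # the top-level JSON-LD value is a one-element list
print(author_names(chapter))
print(product_ids(chapter)["doi"])
```

Note that the top-level value is a list, so the chapter object has to be taken from index 0 before any field lookup.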
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-662-04230-4_7'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-662-04230-4_7'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-662-04230-4_7'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-662-04230-4_7'
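
The four curl calls above differ only in the `Accept` header, which is ordinary HTTP content negotiation. A hedged Python equivalent using the standard library might look as follows. The short format keys in `ACCEPT` and the function name `build_request` are assumptions made for this sketch. The request is only constructed, not sent, since the availability of the service cannot be guaranteed here.

```python
import urllib.request

BASE = "https://scigraph.springernature.com/"

# Media types taken from the curl examples above; the dictionary keys are
# illustrative labels chosen for this sketch.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(record_id, fmt):
    """Build (but do not send) a GET request asking for the given RDF format."""
    return urllib.request.Request(BASE + record_id, headers={"Accept": ACCEPT[fmt]})


req = build_request("pub.10.1007/978-3-662-04230-4_7", "turtle")
print(req.full_url)
print(req.get_header("Accept"))

# To actually fetch the data (requires network access and a live service):
# body = urllib.request.urlopen(req).read()
```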


 

This table displays all metadata directly associated with this object as RDF triples.

85 TRIPLES      23 PREDICATES      29 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-662-04230-4_7 schema:about anzsrc-for:17
2 anzsrc-for:1702
3 schema:author N8f1449da5ee743e9923b05158a0f5b51
4 schema:citation https://doi.org/10.1109/icassp.1995.479626
5 https://doi.org/10.1109/icassp.2000.862075
6 schema:datePublished 2000
7 schema:datePublishedReg 2000-01-01
8 schema:description In the framework of the German Verbmobil project we developed a procedure for the automatic, data-driven generation of pronunciation dictionaries for speech recognition systems. In most recognizers only simple dictionaries containing the canonical pronunciation form are used. They represent the correct pronunciation, but in most cases the canonical pronunciation does not match the actual realization of the word. To solve this problem we chose an approach to derive pronunciation variants automatically from a speech database. The training algorithm bases on a canonical dictionary which is compiled into a graph representation in a first stage. Pronunciation variants are then learned from a training sample consisting of speech signal and its orthographic transcription. In this paper we will focus on the experimental results obtained in the Verbmobil framework and introduce methods to evaluate pronunciation dictionaries generated by the training procedure.
9 schema:editor N8f9575c265e744939b786527b7172741
10 schema:genre chapter
11 schema:inLanguage en
12 schema:isAccessibleForFree false
13 schema:isPartOf Nc67b0dc4e644477cbf82b5a2955c618d
14 schema:name Data Driven Generation of Pronunciation Dictionaries
15 schema:pagination 95-105
16 schema:productId N701ff279b87b47fb99e510a19528537b
17 Nb21615f8d16b4009b6d023853e36a1ad
18 Nd746e6b9980f40dca9f676a77934af49
19 schema:publisher N95fd2df4dcf7443597f91dfc25d85806
20 schema:sameAs https://app.dimensions.ai/details/publication/pub.1040184765
21 https://doi.org/10.1007/978-3-662-04230-4_7
22 schema:sdDatePublished 2019-04-15T23:41
23 schema:sdLicense https://scigraph.springernature.com/explorer/license/
24 schema:sdPublisher Nba996ee8e5bc4555845c8d0caa8cd1d6
25 schema:url http://link.springer.com/10.1007/978-3-662-04230-4_7
26 sgo:license sg:explorer/license/
27 sgo:sdDataset chapters
28 rdf:type schema:Chapter
29 N5e8124d703284a53af62407db439f6ff rdf:first sg:person.07501447443.39
30 rdf:rest rdf:nil
31 N701ff279b87b47fb99e510a19528537b schema:name doi
32 schema:value 10.1007/978-3-662-04230-4_7
33 rdf:type schema:PropertyValue
34 N845e892577e04104a804adf2c2837c51 rdf:first sg:person.014247106521.76
35 rdf:rest N5e8124d703284a53af62407db439f6ff
36 N8f1449da5ee743e9923b05158a0f5b51 rdf:first sg:person.07377047071.24
37 rdf:rest N845e892577e04104a804adf2c2837c51
38 N8f9575c265e744939b786527b7172741 rdf:first Nf72481482c824306ae2ffffe585768df
39 rdf:rest rdf:nil
40 N95fd2df4dcf7443597f91dfc25d85806 schema:location Berlin, Heidelberg
41 schema:name Springer Berlin Heidelberg
42 rdf:type schema:Organisation
43 Nb21615f8d16b4009b6d023853e36a1ad schema:name dimensions_id
44 schema:value pub.1040184765
45 rdf:type schema:PropertyValue
46 Nba996ee8e5bc4555845c8d0caa8cd1d6 schema:name Springer Nature - SN SciGraph project
47 rdf:type schema:Organization
48 Nc67b0dc4e644477cbf82b5a2955c618d schema:isbn 978-3-642-08730-1
49 978-3-662-04230-4
50 schema:name Verbmobil: Foundations of Speech-to-Speech Translation
51 rdf:type schema:Book
52 Nd746e6b9980f40dca9f676a77934af49 schema:name readcube_id
53 schema:value bf6cf6f40d96b1aaf1d824c0bd5601512439a1de648c5430a8eb65ae0e12cb46
54 rdf:type schema:PropertyValue
55 Nf72481482c824306ae2ffffe585768df schema:familyName Wahlster
56 schema:givenName Wolfgang
57 rdf:type schema:Person
58 anzsrc-for:17 schema:inDefinedTermSet anzsrc-for:
59 schema:name Psychology and Cognitive Sciences
60 rdf:type schema:DefinedTerm
61 anzsrc-for:1702 schema:inDefinedTermSet anzsrc-for:
62 schema:name Cognitive Sciences
63 rdf:type schema:DefinedTerm
64 sg:person.014247106521.76 schema:affiliation https://www.grid.ac/institutes/grid.4488.0
65 schema:familyName Wolff
66 schema:givenName Matthias
67 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014247106521.76
68 rdf:type schema:Person
69 sg:person.07377047071.24 schema:affiliation https://www.grid.ac/institutes/grid.4488.0
70 schema:familyName Eichner
71 schema:givenName Matthias
72 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07377047071.24
73 rdf:type schema:Person
74 sg:person.07501447443.39 schema:affiliation https://www.grid.ac/institutes/grid.4488.0
75 schema:familyName Hoffmann
76 schema:givenName Rüdiger
77 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.07501447443.39
78 rdf:type schema:Person
79 https://doi.org/10.1109/icassp.1995.479626 schema:sameAs https://app.dimensions.ai/details/publication/pub.1094353058
80 rdf:type schema:CreativeWork
81 https://doi.org/10.1109/icassp.2000.862075 schema:sameAs https://app.dimensions.ai/details/publication/pub.1095444763
82 rdf:type schema:CreativeWork
83 https://www.grid.ac/institutes/grid.4488.0 schema:alternateName TU Dresden
84 schema:name Laboratory of Acoustics and Speech Communication, Technische Universität Dresden, Germany
85 rdf:type schema:Organization
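
In the table above, the author order is encoded as an RDF collection: `schema:author` points to a blank node, and each blank node carries one `rdf:first` item and an `rdf:rest` link to the next node, ending in `rdf:nil`. The following Python sketch walks that chain using the node identifiers copied from rows 29-37. The dictionary layout and the function name `walk_rdf_list` are assumptions for illustration; a real application would more likely use an RDF library.

```python
RDF_NIL = "rdf:nil"

# node -> (rdf:first, rdf:rest), copied from rows 29-37 of the table above.
LIST_NODES = {
    "N8f1449da5ee743e9923b05158a0f5b51":
        ("sg:person.07377047071.24", "N845e892577e04104a804adf2c2837c51"),
    "N845e892577e04104a804adf2c2837c51":
        ("sg:person.014247106521.76", "N5e8124d703284a53af62407db439f6ff"),
    "N5e8124d703284a53af62407db439f6ff":
        ("sg:person.07501447443.39", RDF_NIL),
}

# schema:familyName values from rows 65, 70 and 75.
FAMILY_NAME = {
    "sg:person.07377047071.24": "Eichner",
    "sg:person.014247106521.76": "Wolff",
    "sg:person.07501447443.39": "Hoffmann",
}


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head and collect the rdf:first items in order."""
    items, seen, node = [], set(), head
    while node != RDF_NIL:
        if node in seen:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


# Row 3 gives the head of the author list (the object of schema:author).
authors = walk_rdf_list("N8f1449da5ee743e9923b05158a0f5b51", LIST_NODES)
print([FAMILY_NAME[a] for a in authors])
```

The recovered order matches the `author` array in the JSON-LD record, which is the purpose of the list encoding: plain RDF triples are unordered, so sequence has to be expressed through the `rdf:first`/`rdf:rest` chain.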
 



