CVX-Optimized Beamforming and Vector Taylor Series Compensation with German ASR Employing Star-Shaped Microphone Array


Ontology type: schema:Chapter     


Chapter Info

DATE

2014

AUTHORS

Juan A. Morales-Cordovilla , Hannes Pessentheiner , Martin Hagmüller , José A. González , Gernot Kubin

ABSTRACT

This paper addresses the problem of distant speech recognition in reverberant noisy conditions employing a star-shaped microphone array and vector Taylor series (VTS) compensation. First, a beamformer yields an enhanced single-channel signal by applying convex (CVX) optimization over three spatial dimensions given the spatio-temporal position of the target speaker as prior knowledge. Then, VTS compensation is applied over the speech features extracted from the temporal signal obtained by the beamformer. Finally, the compensated features are used for speech recognition. Due to a lack of existing resources in German to evaluate the proposed enhancement framework, this paper also introduces a new speech database. In particular, we present a medium-vocabulary German database for microphone array made of embedded clean signals contaminated with real room impulsive responses and mixed in a ‘natural’ way with real noises. We show that the proposed enhancement framework performs better than other related systems on the presented database.

PAGES

148-157

Book

TITLE

Advances in Speech and Language Technologies for Iberian Languages

ISBN

978-3-319-13622-6
978-3-319-13623-3

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_16

DOI

http://dx.doi.org/10.1007/978-3-319-13623-3_16

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1008540097



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Graz University of Technology", 
          "id": "https://www.grid.ac/institutes/grid.410413.3", 
          "name": [
            "Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Morales-Cordovilla", 
        "givenName": "Juan A.", 
        "id": "sg:person.016533156075.18", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016533156075.18"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Graz University of Technology", 
          "id": "https://www.grid.ac/institutes/grid.410413.3", 
          "name": [
            "Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Pessentheiner", 
        "givenName": "Hannes", 
        "id": "sg:person.011761437545.59", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011761437545.59"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Graz University of Technology", 
          "id": "https://www.grid.ac/institutes/grid.410413.3", 
          "name": [
            "Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Hagm\u00fcller", 
        "givenName": "Martin", 
        "id": "sg:person.01266415575.60", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01266415575.60"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Sheffield", 
          "id": "https://www.grid.ac/institutes/grid.11835.3e", 
          "name": [
            "Dept. of Computer Science, University of Sheffield, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Gonz\u00e1lez", 
        "givenName": "Jos\u00e9 A.", 
        "id": "sg:person.010477131004.17", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010477131004.17"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Graz University of Technology", 
          "id": "https://www.grid.ac/institutes/grid.410413.3", 
          "name": [
            "Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Kubin", 
        "givenName": "Gernot", 
        "id": "sg:person.014167736065.79", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014167736065.79"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1109/icassp.2009.4959524", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1094363026"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1002/9780470994443", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1098662670"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2014", 
    "datePublishedReg": "2014-01-01", 
    "description": "This paper addresses the problem of distant speech recognition in reverberant noisy conditions employing a star-shaped microphone array and vector Taylor series (VTS) compensation. First, a beamformer yields an enhanced single-channel signal by applying convex (CVX) optimization over three spatial dimensions given the spatio-temporal position of the target speaker as prior knowledge. Then, VTS compensation is applied over the speech features extracted from the temporal signal obtained by the beamformer. Finally, the compensated features are used for speech recognition. Due to a lack of existing resources in German to evaluate the proposed enhancement framework, this paper also introduces a new speech database. In particular, we present a medium-vocabulary German database for microphone array made of embedded clean signals contaminated with real room impulsive responses and mixed in a \u2018natural\u2019 way with real noises. We show that the proposed enhancement framework performs better than other related systems on the presented database.", 
    "editor": [
      {
        "familyName": "Navarro Mesa", 
        "givenName": "Juan Luis", 
        "type": "Person"
      }, 
      {
        "familyName": "Ortega", 
        "givenName": "Alfonso", 
        "type": "Person"
      }, 
      {
        "familyName": "Teixeira", 
        "givenName": "Ant\u00f3nio", 
        "type": "Person"
      }, 
      {
        "familyName": "Hern\u00e1ndez P\u00e9rez", 
        "givenName": "Eduardo", 
        "type": "Person"
      }, 
      {
        "familyName": "Quintana Morales", 
        "givenName": "Pedro", 
        "type": "Person"
      }, 
      {
        "familyName": "Ravelo Garc\u00eda", 
        "givenName": "Antonio", 
        "type": "Person"
      }, 
      {
        "familyName": "Guerra Moreno", 
        "givenName": "Iv\u00e1n", 
        "type": "Person"
      }, 
      {
        "familyName": "Toledano", 
        "givenName": "Doroteo T.", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-319-13623-3_16", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-319-13622-6", 
        "978-3-319-13623-3"
      ], 
      "name": "Advances in Speech and Language Technologies for Iberian Languages", 
      "type": "Book"
    }, 
    "name": "CVX-Optimized Beamforming and Vector Taylor Series Compensation with German ASR Employing Star-Shaped Microphone Array", 
    "pagination": "148-157", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-319-13623-3_16"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "28afc48049a2ccd712a1c089f08020bd6f65ddc140b4630309b34ef669bbc6ad"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1008540097"
        ]
      }
    ], 
    "publisher": {
      "location": "Cham", 
      "name": "Springer International Publishing", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-319-13623-3_16", 
      "https://app.dimensions.ai/details/publication/pub.1008540097"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T22:40", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8695_00000014.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/978-3-319-13623-3_16"
  }
]
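As a minimal sketch of how a record like the one above can be read with a plain JSON parser, the following pulls the DOI, the author names, and the page range out of the array. The embedded text is an abbreviated excerpt of the record (only the fields used here are kept), and the helper name `summarize` is hypothetical.

```python
import json

# Abbreviated excerpt of the JSON-LD record above; fields not used here are omitted.
RECORD_JSON = """
[
  {
    "id": "sg:pub.10.1007/978-3-319-13623-3_16",
    "pagination": "148-157",
    "author": [
      {"familyName": "Morales-Cordovilla", "givenName": "Juan A."},
      {"familyName": "Pessentheiner", "givenName": "Hannes"},
      {"familyName": "Hagmüller", "givenName": "Martin"},
      {"familyName": "González", "givenName": "José A."},
      {"familyName": "Kubin", "givenName": "Gernot"}
    ],
    "productId": [
      {"name": "doi", "value": ["10.1007/978-3-319-13623-3_16"]},
      {"name": "dimensions_id", "value": ["pub.1008540097"]}
    ]
  }
]
"""


def summarize(record_text):
    """Return (doi, author names, pagination) from a SciGraph-style JSON-LD array."""
    record = json.loads(record_text)[0]  # the top level is a one-element array
    # Map each productId name (doi, dimensions_id, ...) to its first value.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])]
    return ids.get("doi"), authors, record.get("pagination")


doi, authors, pages = summarize(RECORD_JSON)
print(doi)
print(authors)
print(pages)
```

Because JSON-LD is valid JSON, no RDF library is needed for this kind of field lookup; a JSON-LD processor only becomes necessary when the `@context` has to be expanded into full IRIs.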
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_16'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_16'
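To illustrate why a line-based format suits batch work, here is a rough sketch that counts predicates one line at a time. The sample lines are hand-written for illustration: the expansions of the `sg:` and `schema:` prefixes and the plain (untyped) literal are assumptions, not output copied from the service. The regular expression is a simplification of the N-Triples grammar; a full RDF parser is the safer choice for real data.

```python
import re
from collections import Counter

# Simplified N-Triples line: subject (IRI or blank node), predicate (IRI),
# object (anything up to the terminating dot). Not a full grammar.
TRIPLE = re.compile(r'^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$')

# Hypothetical sample lines; prefix expansions are assumed.
SAMPLE = """
<http://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_16> <http://schema.org/pagination> "148-157" .
<http://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_16> <http://schema.org/sameAs> <https://doi.org/10.1007/978-3-319-13623-3_16> .
<http://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_16> <http://schema.org/sameAs> <https://app.dimensions.ai/details/publication/pub.1008540097> .
"""


def predicate_counts(lines):
    """Count how often each predicate occurs, processing one line at a time."""
    counts = Counter()
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE.match(line)
        if match is None:
            raise ValueError(f"not a recognizable triple: {line!r}")
        counts[match.group(2)] += 1
    return counts


counts = predicate_counts(SAMPLE.splitlines())
for predicate, n in counts.items():
    print(n, predicate)
```

Since every triple sits on its own line, the same loop works unchanged on a file handle or a streamed response, with no need to hold the whole graph in memory.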

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_16'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_16'
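The four curl commands differ only in the Accept header, so the same content negotiation can be sketched with the standard library. The URL and media types are taken from the commands above; the names `ACCEPT`, `build_request`, and `fetch` are hypothetical, and whether the endpoint still answers is not something this sketch can guarantee.

```python
import urllib.request

BASE_URL = "https://scigraph.springernature.com/"
RECORD_ID = "pub.10.1007/978-3-319-13623-3_16"

# Short format name -> media type, as used in the curl commands above.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, record_id=RECORD_ID):
    """Build a GET request that asks for the record in the given RDF format."""
    return urllib.request.Request(
        BASE_URL + record_id,
        headers={"Accept": ACCEPT[fmt]},
    )


def fetch(fmt, record_id=RECORD_ID, timeout=30):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(fmt, record_id), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


request = build_request("turtle")
print(request.full_url)
print(request.get_header("Accept"))
```

Calling `fetch("json-ld")` would perform the actual download; it is left out of the module-level code so the sketch runs without network access.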


 

This table displays all metadata directly associated with this object as RDF triples.

137 TRIPLES      23 PREDICATES      29 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-319-13623-3_16 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N42a93cc4107f45538e3c085250963910
4 schema:citation https://doi.org/10.1002/9780470994443
5 https://doi.org/10.1109/icassp.2009.4959524
6 schema:datePublished 2014
7 schema:datePublishedReg 2014-01-01
8 schema:description This paper addresses the problem of distant speech recognition in reverberant noisy conditions employing a star-shaped microphone array and vector Taylor series (VTS) compensation. First, a beamformer yields an enhanced single-channel signal by applying convex (CVX) optimization over three spatial dimensions given the spatio-temporal position of the target speaker as prior knowledge. Then, VTS compensation is applied over the speech features extracted from the temporal signal obtained by the beamformer. Finally, the compensated features are used for speech recognition. Due to a lack of existing resources in German to evaluate the proposed enhancement framework, this paper also introduces a new speech database. In particular, we present a medium-vocabulary German database for microphone array made of embedded clean signals contaminated with real room impulsive responses and mixed in a ‘natural’ way with real noises. We show that the proposed enhancement framework performs better than other related systems on the presented database.
9 schema:editor Naccad93451c1425b8a42ec4dca7cc25c
10 schema:genre chapter
11 schema:inLanguage en
12 schema:isAccessibleForFree false
13 schema:isPartOf N3c82f4b0eb2d4b48b6efe9a06990e4cb
14 schema:name CVX-Optimized Beamforming and Vector Taylor Series Compensation with German ASR Employing Star-Shaped Microphone Array
15 schema:pagination 148-157
16 schema:productId N01b738cf83724275922726381dee3781
17 Ne97be92aaafe45578c66c3da34894440
18 Nfd3d21961ffc494cae64732cfed42492
19 schema:publisher N5b7ee45ff4cf4b75a050615a711b4930
20 schema:sameAs https://app.dimensions.ai/details/publication/pub.1008540097
21 https://doi.org/10.1007/978-3-319-13623-3_16
22 schema:sdDatePublished 2019-04-15T22:40
23 schema:sdLicense https://scigraph.springernature.com/explorer/license/
24 schema:sdPublisher N588d3a84e9b447ee8e55186062edf7c5
25 schema:url http://link.springer.com/10.1007/978-3-319-13623-3_16
26 sgo:license sg:explorer/license/
27 sgo:sdDataset chapters
28 rdf:type schema:Chapter
29 N01b738cf83724275922726381dee3781 schema:name dimensions_id
30 schema:value pub.1008540097
31 rdf:type schema:PropertyValue
32 N161b86273d4846e28792e19c58efba7a schema:familyName Guerra Moreno
33 schema:givenName Iván
34 rdf:type schema:Person
35 N1d09696637fb4e91b3a928a9528c399a schema:familyName Ravelo García
36 schema:givenName Antonio
37 rdf:type schema:Person
38 N2eccd35db2644f1a91fcd56e3b028166 rdf:first sg:person.01266415575.60
39 rdf:rest Na42369e3f52447c7a630b3865c91ee2d
40 N380f2b677f2e4d698813ed47edbf1e19 rdf:first N55641665bdcd4344827e2fc8d04cc4fb
41 rdf:rest rdf:nil
42 N3c80a023a92a49f1a8a1c08d722e3f19 rdf:first N161b86273d4846e28792e19c58efba7a
43 rdf:rest N380f2b677f2e4d698813ed47edbf1e19
44 N3c82f4b0eb2d4b48b6efe9a06990e4cb schema:isbn 978-3-319-13622-6
45 978-3-319-13623-3
46 schema:name Advances in Speech and Language Technologies for Iberian Languages
47 rdf:type schema:Book
48 N42a93cc4107f45538e3c085250963910 rdf:first sg:person.016533156075.18
49 rdf:rest Nbd94fb1f686848fa94eeb3a9f9180369
50 N4eed12d2f13e43488a26b22e97651734 schema:familyName Hernández Pérez
51 schema:givenName Eduardo
52 rdf:type schema:Person
53 N55641665bdcd4344827e2fc8d04cc4fb schema:familyName Toledano
54 schema:givenName Doroteo T.
55 rdf:type schema:Person
56 N559744aa1c274bdd988a335bddbfb79c schema:familyName Quintana Morales
57 schema:givenName Pedro
58 rdf:type schema:Person
59 N57c20bad762546dab9285dcc762ae4d2 schema:familyName Teixeira
60 schema:givenName António
61 rdf:type schema:Person
62 N588d3a84e9b447ee8e55186062edf7c5 schema:name Springer Nature - SN SciGraph project
63 rdf:type schema:Organization
64 N59089701807f45c1afb8a6344285b90b schema:familyName Ortega
65 schema:givenName Alfonso
66 rdf:type schema:Person
67 N5b021a2afafd491487f609c6a6f29990 rdf:first sg:person.014167736065.79
68 rdf:rest rdf:nil
69 N5b7ee45ff4cf4b75a050615a711b4930 schema:location Cham
70 schema:name Springer International Publishing
71 rdf:type schema:Organisation
72 N5cfac3459f774646a866991cc95e0a64 rdf:first N1d09696637fb4e91b3a928a9528c399a
73 rdf:rest N3c80a023a92a49f1a8a1c08d722e3f19
74 N70077afccc2e4844a18cd48520fa0430 rdf:first N59089701807f45c1afb8a6344285b90b
75 rdf:rest Ncf360d5f85a54876bcdc7c45baf8b514
76 N7037af75fa1443aa99dd0766497dcfd5 rdf:first N559744aa1c274bdd988a335bddbfb79c
77 rdf:rest N5cfac3459f774646a866991cc95e0a64
78 N9097153584b9489a868aed37a66c505a schema:familyName Navarro Mesa
79 schema:givenName Juan Luis
80 rdf:type schema:Person
81 N92beef8565da4144b80d8e7c3d633443 rdf:first N4eed12d2f13e43488a26b22e97651734
82 rdf:rest N7037af75fa1443aa99dd0766497dcfd5
83 Na42369e3f52447c7a630b3865c91ee2d rdf:first sg:person.010477131004.17
84 rdf:rest N5b021a2afafd491487f609c6a6f29990
85 Naccad93451c1425b8a42ec4dca7cc25c rdf:first N9097153584b9489a868aed37a66c505a
86 rdf:rest N70077afccc2e4844a18cd48520fa0430
87 Nbd94fb1f686848fa94eeb3a9f9180369 rdf:first sg:person.011761437545.59
88 rdf:rest N2eccd35db2644f1a91fcd56e3b028166
89 Ncf360d5f85a54876bcdc7c45baf8b514 rdf:first N57c20bad762546dab9285dcc762ae4d2
90 rdf:rest N92beef8565da4144b80d8e7c3d633443
91 Ne97be92aaafe45578c66c3da34894440 schema:name readcube_id
92 schema:value 28afc48049a2ccd712a1c089f08020bd6f65ddc140b4630309b34ef669bbc6ad
93 rdf:type schema:PropertyValue
94 Nfd3d21961ffc494cae64732cfed42492 schema:name doi
95 schema:value 10.1007/978-3-319-13623-3_16
96 rdf:type schema:PropertyValue
97 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
98 schema:name Information and Computing Sciences
99 rdf:type schema:DefinedTerm
100 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
101 schema:name Artificial Intelligence and Image Processing
102 rdf:type schema:DefinedTerm
103 sg:person.010477131004.17 schema:affiliation https://www.grid.ac/institutes/grid.11835.3e
104 schema:familyName González
105 schema:givenName José A.
106 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010477131004.17
107 rdf:type schema:Person
108 sg:person.011761437545.59 schema:affiliation https://www.grid.ac/institutes/grid.410413.3
109 schema:familyName Pessentheiner
110 schema:givenName Hannes
111 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.011761437545.59
112 rdf:type schema:Person
113 sg:person.01266415575.60 schema:affiliation https://www.grid.ac/institutes/grid.410413.3
114 schema:familyName Hagmüller
115 schema:givenName Martin
116 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.01266415575.60
117 rdf:type schema:Person
118 sg:person.014167736065.79 schema:affiliation https://www.grid.ac/institutes/grid.410413.3
119 schema:familyName Kubin
120 schema:givenName Gernot
121 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.014167736065.79
122 rdf:type schema:Person
123 sg:person.016533156075.18 schema:affiliation https://www.grid.ac/institutes/grid.410413.3
124 schema:familyName Morales-Cordovilla
125 schema:givenName Juan A.
126 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016533156075.18
127 rdf:type schema:Person
128 https://doi.org/10.1002/9780470994443 schema:sameAs https://app.dimensions.ai/details/publication/pub.1098662670
129 rdf:type schema:CreativeWork
130 https://doi.org/10.1109/icassp.2009.4959524 schema:sameAs https://app.dimensions.ai/details/publication/pub.1094363026
131 rdf:type schema:CreativeWork
132 https://www.grid.ac/institutes/grid.11835.3e schema:alternateName University of Sheffield
133 schema:name Dept. of Computer Science, University of Sheffield, UK
134 rdf:type schema:Organization
135 https://www.grid.ac/institutes/grid.410413.3 schema:alternateName Graz University of Technology
136 schema:name Signal Processing and Speech Communication Laboratory, Graz University of Technology, Austria
137 rdf:type schema:Organization
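In the table, the author order is not stored as an array but as an RDF collection: each blank node carries one `rdf:first` item and an `rdf:rest` pointer to the next node, ending at `rdf:nil`. The sketch below walks that chain starting from the `schema:author` node. The identifiers and family names are copied from the table rows; the dictionary layout and the function name `walk_list` are assumptions made for illustration.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate) -> object, copied from the author-list rows of the table.
TRIPLES = {
    ("N42a93cc4107f45538e3c085250963910", "rdf:first"): "sg:person.016533156075.18",
    ("N42a93cc4107f45538e3c085250963910", "rdf:rest"): "Nbd94fb1f686848fa94eeb3a9f9180369",
    ("Nbd94fb1f686848fa94eeb3a9f9180369", "rdf:first"): "sg:person.011761437545.59",
    ("Nbd94fb1f686848fa94eeb3a9f9180369", "rdf:rest"): "N2eccd35db2644f1a91fcd56e3b028166",
    ("N2eccd35db2644f1a91fcd56e3b028166", "rdf:first"): "sg:person.01266415575.60",
    ("N2eccd35db2644f1a91fcd56e3b028166", "rdf:rest"): "Na42369e3f52447c7a630b3865c91ee2d",
    ("Na42369e3f52447c7a630b3865c91ee2d", "rdf:first"): "sg:person.010477131004.17",
    ("Na42369e3f52447c7a630b3865c91ee2d", "rdf:rest"): "N5b021a2afafd491487f609c6a6f29990",
    ("N5b021a2afafd491487f609c6a6f29990", "rdf:first"): "sg:person.014167736065.79",
    ("N5b021a2afafd491487f609c6a6f29990", "rdf:rest"): RDF_NIL,
}

# schema:familyName values for the person nodes, also from the table.
FAMILY_NAME = {
    "sg:person.016533156075.18": "Morales-Cordovilla",
    "sg:person.011761437545.59": "Pessentheiner",
    "sg:person.01266415575.60": "Hagmüller",
    "sg:person.010477131004.17": "González",
    "sg:person.014167736065.79": "Kubin",
}


def walk_list(head, triples):
    """Follow rdf:first / rdf:rest from head until rdf:nil; return the items in order."""
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle in RDF list at {node}")
        seen.add(node)
        items.append(triples[(node, "rdf:first")])
        node = triples[(node, "rdf:rest")]
    return items


# The head node is the object of schema:author in row 3 of the table.
author_ids = walk_list("N42a93cc4107f45538e3c085250963910", TRIPLES)
author_names = [FAMILY_NAME[a] for a in author_ids]
print(author_names)
```

The same walk applied to the `schema:editor` head node would recover the editor order, which is why the table contains so many `rdf:first` and `rdf:rest` rows.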
 



