A Deep Neural Network Approach for Missing-Data Mask Estimation on Dual-Microphone Smartphones: Application to Noise-Robust Speech Recognition


Ontology type: schema:Chapter     


Chapter Info

DATE

2014

AUTHORS

Iván López-Espejo , José A. González , Ángel M. Gómez , Antonio M. Peinado

ABSTRACT

The inclusion of two or more microphones in smartphones is becoming quite common. These were originally intended to perform noise reduction, and so far little advantage has been taken of this feature for noise-robust automatic speech recognition (ASR). In this paper we propose a novel system to estimate missing-data masks for robust ASR on dual-microphone smartphones. This novel system is based on deep neural networks (DNNs), which have proven to be a powerful tool in the field of ASR in different ways. To assess the performance of the proposed technique, spectral reconstruction experiments are carried out on a dual-channel database derived from Aurora-2. Our results demonstrate that the DNN is better able to exploit the dual-channel information and yields an improvement in word accuracy of more than 6% over state-of-the-art single-channel mask estimation techniques.

PAGES

119-128

Book

TITLE

Advances in Speech and Language Technologies for Iberian Languages

ISBN

978-3-319-13622-6
978-3-319-13623-3

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_13

DOI

http://dx.doi.org/10.1007/978-3-319-13623-3_13

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1041244334



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "University of Granada", 
          "id": "https://www.grid.ac/institutes/grid.4489.1", 
          "name": [
            "Dept. of Signal Theory, Telematics and Communications, University of Granada, Spain"
          ], 
          "type": "Organization"
        }, 
        "familyName": "L\u00f3pez-Espejo", 
        "givenName": "Iv\u00e1n", 
        "id": "sg:person.010072002657.12", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010072002657.12"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Sheffield", 
          "id": "https://www.grid.ac/institutes/grid.11835.3e", 
          "name": [
            "Dept. of Computer Science, University of Sheffield, UK"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Gonz\u00e1lez", 
        "givenName": "Jos\u00e9 A.", 
        "id": "sg:person.010477131004.17", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010477131004.17"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Granada", 
          "id": "https://www.grid.ac/institutes/grid.4489.1", 
          "name": [
            "Dept. of Signal Theory, Telematics and Communications, University of Granada, Spain"
          ], 
          "type": "Organization"
        }, 
        "familyName": "G\u00f3mez", 
        "givenName": "\u00c1ngel M.", 
        "id": "sg:person.013366554741.94", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013366554741.94"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "University of Granada", 
          "id": "https://www.grid.ac/institutes/grid.4489.1", 
          "name": [
            "Dept. of Signal Theory, Telematics and Communications, University of Granada, Spain"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Peinado", 
        "givenName": "Antonio M.", 
        "id": "sg:person.013434670141.68", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013434670141.68"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1126/science.1127647", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1004607132"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1162/089976602760128018", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1007443228"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/j.specom.2004.03.007", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1009768526"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1016/s0167-6393(00)00034-0", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1030004932"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/msp.2012.2205597", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061423808"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tasl.2010.2087753", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061516639"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tasl.2013.2250961", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061517108"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/tassp.1984.1164453", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1061519527"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/icassp.2013.6639038", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1093924949"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1109/iscslp.2012.6423512", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1094983647"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2014", 
    "datePublishedReg": "2014-01-01", 
    "description": "The inclusion of two or more microphones in smartphones is becoming quite common. These were originally intended to perform noise reduction and few benefit is still being taken from this feature for noise-robust automatic speech recognition (ASR). In this paper we propose a novel system to estimate missing-data masks for robust ASR on dual-microphone smartphones. This novel system is based on deep neural networks (DNNs), which have proven to be a powerful tool in the field of ASR in different ways. To assess the performance of the proposed technique, spectral reconstruction experiments are carried out on a dual-channel database derived from Aurora-2. Our results demonstrate that the DNN is better able to exploit the dual-channel information and yields an improvement on word accuracy of more than 6% over state-of-the-art single-channel mask estimation techniques.", 
    "editor": [
      {
        "familyName": "Navarro Mesa", 
        "givenName": "Juan Luis", 
        "type": "Person"
      }, 
      {
        "familyName": "Ortega", 
        "givenName": "Alfonso", 
        "type": "Person"
      }, 
      {
        "familyName": "Teixeira", 
        "givenName": "Ant\u00f3nio", 
        "type": "Person"
      }, 
      {
        "familyName": "Hern\u00e1ndez P\u00e9rez", 
        "givenName": "Eduardo", 
        "type": "Person"
      }, 
      {
        "familyName": "Quintana Morales", 
        "givenName": "Pedro", 
        "type": "Person"
      }, 
      {
        "familyName": "Ravelo Garc\u00eda", 
        "givenName": "Antonio", 
        "type": "Person"
      }, 
      {
        "familyName": "Guerra Moreno", 
        "givenName": "Iv\u00e1n", 
        "type": "Person"
      }, 
      {
        "familyName": "Toledano", 
        "givenName": "Doroteo T.", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-319-13623-3_13", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-319-13622-6", 
        "978-3-319-13623-3"
      ], 
      "name": "Advances in Speech and Language Technologies for Iberian Languages", 
      "type": "Book"
    }, 
    "name": "A Deep Neural Network Approach for Missing-Data Mask Estimation on Dual-Microphone Smartphones: Application to Noise-Robust Speech Recognition", 
    "pagination": "119-128", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-319-13623-3_13"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "c4798063aaf417e45e57a42276872453492549d2fc67db5b216d9c1c01d438c1"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1041244334"
        ]
      }
    ], 
    "publisher": {
      "location": "Cham", 
      "name": "Springer International Publishing", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-319-13623-3_13", 
      "https://app.dimensions.ai/details/publication/pub.1041244334"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T14:27", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8669_00000268.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/978-3-319-13623-3_13"
  }
]
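The record above is plain JSON, so it can be read with any JSON parser. The following is a minimal Python sketch, not part of SciGraph itself, that pulls the title, authors, DOI, and cited works out of a trimmed excerpt of the record. The names `RECORD_JSON` and `summarize` are hypothetical, and the excerpt keeps only the fields the sketch uses (two of the ten citations, for brevity).

```python
import json

# Trimmed excerpt of the record above (assumption: only these fields are needed).
# A raw string keeps the \u escapes exactly as they appear in the JSON-LD.
RECORD_JSON = r"""
[
  {
    "author": [
      {"familyName": "L\u00f3pez-Espejo", "givenName": "Iv\u00e1n"},
      {"familyName": "Gonz\u00e1lez", "givenName": "Jos\u00e9 A."},
      {"familyName": "G\u00f3mez", "givenName": "\u00c1ngel M."},
      {"familyName": "Peinado", "givenName": "Antonio M."}
    ],
    "citation": [
      {"id": "https://doi.org/10.1126/science.1127647"},
      {"id": "https://doi.org/10.1162/089976602760128018"}
    ],
    "name": "A Deep Neural Network Approach for Missing-Data Mask Estimation on Dual-Microphone Smartphones: Application to Noise-Robust Speech Recognition",
    "pagination": "119-128",
    "productId": [
      {"name": "doi", "value": ["10.1007/978-3-319-13623-3_13"]},
      {"name": "dimensions_id", "value": ["pub.1041244334"]}
    ]
  }
]
"""


def summarize(record_json):
    """Return a small dict with the fields most readers look for."""
    # The payload is a JSON array holding a single record object.
    record = json.loads(record_json)[0]
    authors = [f"{a['givenName']} {a['familyName']}" for a in record["author"]]
    # productId is a list of name/value pairs; pick the one named "doi".
    doi = next(p["value"][0] for p in record["productId"] if p["name"] == "doi")
    return {
        "title": record["name"],
        "authors": authors,
        "doi": doi,
        "pages": record["pagination"],
        "cited": [c["id"] for c in record["citation"]],
    }


summary = summarize(RECORD_JSON)
print(", ".join(summary["authors"]))
```

The same function should work on the full record, since it only reads keys that are present there.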
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_13'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_13'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_13'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_13'
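The four curl commands above differ only in their Accept header. As a rough Python equivalent, the sketch below builds the same requests with the standard library. The helper names `build_request` and `fetch` and the format keys are hypothetical, and the sketch assumes the service answers with UTF-8 text; nothing is sent over the network until `fetch` is called.

```python
import urllib.request

# URL taken from the curl examples above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-319-13623-3_13"

# Media types taken from the curl examples; the short keys are our own labels.
ACCEPT_TYPES = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request asking for the given serialization."""
    return urllib.request.Request(url, headers={"Accept": ACCEPT_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the body as text (assumes UTF-8)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building a request does not touch the network.
request = build_request("turtle")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `fetch("json-ld")` would perform the actual download, provided the endpoint is reachable.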


 

This table displays all metadata directly associated with this object as RDF triples.

154 TRIPLES      23 PREDICATES      37 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-319-13623-3_13 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N19f6884d60324f79a7857188c20ab573
4 schema:citation https://doi.org/10.1016/j.specom.2004.03.007
5 https://doi.org/10.1016/s0167-6393(00)00034-0
6 https://doi.org/10.1109/icassp.2013.6639038
7 https://doi.org/10.1109/iscslp.2012.6423512
8 https://doi.org/10.1109/msp.2012.2205597
9 https://doi.org/10.1109/tasl.2010.2087753
10 https://doi.org/10.1109/tasl.2013.2250961
11 https://doi.org/10.1109/tassp.1984.1164453
12 https://doi.org/10.1126/science.1127647
13 https://doi.org/10.1162/089976602760128018
14 schema:datePublished 2014
15 schema:datePublishedReg 2014-01-01
16 schema:description The inclusion of two or more microphones in smartphones is becoming quite common. These were originally intended to perform noise reduction and few benefit is still being taken from this feature for noise-robust automatic speech recognition (ASR). In this paper we propose a novel system to estimate missing-data masks for robust ASR on dual-microphone smartphones. This novel system is based on deep neural networks (DNNs), which have proven to be a powerful tool in the field of ASR in different ways. To assess the performance of the proposed technique, spectral reconstruction experiments are carried out on a dual-channel database derived from Aurora-2. Our results demonstrate that the DNN is better able to exploit the dual-channel information and yields an improvement on word accuracy of more than 6% over state-of-the-art single-channel mask estimation techniques.
17 schema:editor N7d5da835395441bd9baad84b16a9f427
18 schema:genre chapter
19 schema:inLanguage en
20 schema:isAccessibleForFree false
21 schema:isPartOf N8726c23b69cd49ebbc2c936b7ee205c4
22 schema:name A Deep Neural Network Approach for Missing-Data Mask Estimation on Dual-Microphone Smartphones: Application to Noise-Robust Speech Recognition
23 schema:pagination 119-128
24 schema:productId N1c16bde5f1134d449adbcda4c886fb35
25 Nd6de1ecc9ac6422f90db0e3d6770057f
26 Nde266fc9d22f44acb8758f7fcbd96ad2
27 schema:publisher Ne2c40ff6511c4be3a6c10bf6e374d5af
28 schema:sameAs https://app.dimensions.ai/details/publication/pub.1041244334
29 https://doi.org/10.1007/978-3-319-13623-3_13
30 schema:sdDatePublished 2019-04-15T14:27
31 schema:sdLicense https://scigraph.springernature.com/explorer/license/
32 schema:sdPublisher Nbaf9bf93dea44e9c8eaf6c295ac0843c
33 schema:url http://link.springer.com/10.1007/978-3-319-13623-3_13
34 sgo:license sg:explorer/license/
35 sgo:sdDataset chapters
36 rdf:type schema:Chapter
37 N0bb3ff13b428487395070f4c641c1306 rdf:first Nf68ba12f6a974db693f525a5db03e3a8
38 rdf:rest rdf:nil
39 N0f0a55033b3e425b86bf5d57ff3854e8 rdf:first N9d5e14d0bbde41d29a76552809838abc
40 rdf:rest Ne79a526618a943e8b2a743786cc2c792
41 N19780a7f85bc4a15a94fbcc46151d939 rdf:first sg:person.013434670141.68
42 rdf:rest rdf:nil
43 N19f6884d60324f79a7857188c20ab573 rdf:first sg:person.010072002657.12
44 rdf:rest N59bdba48251d4e95ab4264c88a77a813
45 N1c16bde5f1134d449adbcda4c886fb35 schema:name dimensions_id
46 schema:value pub.1041244334
47 rdf:type schema:PropertyValue
48 N1d1858aa60e14cd281115c691207773c schema:familyName Quintana Morales
49 schema:givenName Pedro
50 rdf:type schema:Person
51 N462b0a0a48684e79874518e8c196e189 rdf:first sg:person.013366554741.94
52 rdf:rest N19780a7f85bc4a15a94fbcc46151d939
53 N4d9f6784377d4f019f12c52e3a7b0123 rdf:first Nd7492464145540d89eb1c80ed64be559
54 rdf:rest Ne4f4daf3124741f3919f7392a3109b5d
55 N59bdba48251d4e95ab4264c88a77a813 rdf:first sg:person.010477131004.17
56 rdf:rest N462b0a0a48684e79874518e8c196e189
57 N6dbb2261f19440dfb7aca30adb7daef6 schema:familyName Navarro Mesa
58 schema:givenName Juan Luis
59 rdf:type schema:Person
60 N7d5da835395441bd9baad84b16a9f427 rdf:first N6dbb2261f19440dfb7aca30adb7daef6
61 rdf:rest Nbe0ed88b32284c54813fc5baa5721afb
62 N828461e1edf240278241e115a0936af5 schema:familyName Ortega
63 schema:givenName Alfonso
64 rdf:type schema:Person
65 N86bca800f94a44fdbc61284b5b7d9b24 schema:familyName Guerra Moreno
66 schema:givenName Iván
67 rdf:type schema:Person
68 N8726c23b69cd49ebbc2c936b7ee205c4 schema:isbn 978-3-319-13622-6
69 978-3-319-13623-3
70 schema:name Advances in Speech and Language Technologies for Iberian Languages
71 rdf:type schema:Book
72 N98fa4b92cdda4268a51dd86b7692fb91 rdf:first N1d1858aa60e14cd281115c691207773c
73 rdf:rest N0f0a55033b3e425b86bf5d57ff3854e8
74 N9d5e14d0bbde41d29a76552809838abc schema:familyName Ravelo García
75 schema:givenName Antonio
76 rdf:type schema:Person
77 Nbaf9bf93dea44e9c8eaf6c295ac0843c schema:name Springer Nature - SN SciGraph project
78 rdf:type schema:Organization
79 Nbe0ed88b32284c54813fc5baa5721afb rdf:first N828461e1edf240278241e115a0936af5
80 rdf:rest N4d9f6784377d4f019f12c52e3a7b0123
81 Nd6de1ecc9ac6422f90db0e3d6770057f schema:name readcube_id
82 schema:value c4798063aaf417e45e57a42276872453492549d2fc67db5b216d9c1c01d438c1
83 rdf:type schema:PropertyValue
84 Nd7492464145540d89eb1c80ed64be559 schema:familyName Teixeira
85 schema:givenName António
86 rdf:type schema:Person
87 Nde266fc9d22f44acb8758f7fcbd96ad2 schema:name doi
88 schema:value 10.1007/978-3-319-13623-3_13
89 rdf:type schema:PropertyValue
90 Ne2c40ff6511c4be3a6c10bf6e374d5af schema:location Cham
91 schema:name Springer International Publishing
92 rdf:type schema:Organisation
93 Ne4f4daf3124741f3919f7392a3109b5d rdf:first Nf628efbb1aa04355acfdb00e32e7f2c8
94 rdf:rest N98fa4b92cdda4268a51dd86b7692fb91
95 Ne79a526618a943e8b2a743786cc2c792 rdf:first N86bca800f94a44fdbc61284b5b7d9b24
96 rdf:rest N0bb3ff13b428487395070f4c641c1306
97 Nf628efbb1aa04355acfdb00e32e7f2c8 schema:familyName Hernández Pérez
98 schema:givenName Eduardo
99 rdf:type schema:Person
100 Nf68ba12f6a974db693f525a5db03e3a8 schema:familyName Toledano
101 schema:givenName Doroteo T.
102 rdf:type schema:Person
103 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
104 schema:name Information and Computing Sciences
105 rdf:type schema:DefinedTerm
106 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
107 schema:name Artificial Intelligence and Image Processing
108 rdf:type schema:DefinedTerm
109 sg:person.010072002657.12 schema:affiliation https://www.grid.ac/institutes/grid.4489.1
110 schema:familyName López-Espejo
111 schema:givenName Iván
112 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010072002657.12
113 rdf:type schema:Person
114 sg:person.010477131004.17 schema:affiliation https://www.grid.ac/institutes/grid.11835.3e
115 schema:familyName González
116 schema:givenName José A.
117 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010477131004.17
118 rdf:type schema:Person
119 sg:person.013366554741.94 schema:affiliation https://www.grid.ac/institutes/grid.4489.1
120 schema:familyName Gómez
121 schema:givenName Ángel M.
122 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013366554741.94
123 rdf:type schema:Person
124 sg:person.013434670141.68 schema:affiliation https://www.grid.ac/institutes/grid.4489.1
125 schema:familyName Peinado
126 schema:givenName Antonio M.
127 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013434670141.68
128 rdf:type schema:Person
129 https://doi.org/10.1016/j.specom.2004.03.007 schema:sameAs https://app.dimensions.ai/details/publication/pub.1009768526
130 rdf:type schema:CreativeWork
131 https://doi.org/10.1016/s0167-6393(00)00034-0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1030004932
132 rdf:type schema:CreativeWork
133 https://doi.org/10.1109/icassp.2013.6639038 schema:sameAs https://app.dimensions.ai/details/publication/pub.1093924949
134 rdf:type schema:CreativeWork
135 https://doi.org/10.1109/iscslp.2012.6423512 schema:sameAs https://app.dimensions.ai/details/publication/pub.1094983647
136 rdf:type schema:CreativeWork
137 https://doi.org/10.1109/msp.2012.2205597 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061423808
138 rdf:type schema:CreativeWork
139 https://doi.org/10.1109/tasl.2010.2087753 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061516639
140 rdf:type schema:CreativeWork
141 https://doi.org/10.1109/tasl.2013.2250961 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061517108
142 rdf:type schema:CreativeWork
143 https://doi.org/10.1109/tassp.1984.1164453 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061519527
144 rdf:type schema:CreativeWork
145 https://doi.org/10.1126/science.1127647 schema:sameAs https://app.dimensions.ai/details/publication/pub.1004607132
146 rdf:type schema:CreativeWork
147 https://doi.org/10.1162/089976602760128018 schema:sameAs https://app.dimensions.ai/details/publication/pub.1007443228
148 rdf:type schema:CreativeWork
149 https://www.grid.ac/institutes/grid.11835.3e schema:alternateName University of Sheffield
150 schema:name Dept. of Computer Science, University of Sheffield, UK
151 rdf:type schema:Organization
152 https://www.grid.ac/institutes/grid.4489.1 schema:alternateName University of Granada
153 schema:name Dept. of Signal Theory, Telematics and Communications, University of Granada, Spain
154 rdf:type schema:Organization
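In the table above, the ordered author list is stored as an RDF collection: row 3 points at a blank node, and each node carries an rdf:first (the item) and an rdf:rest (the next node) until rdf:nil is reached. The Python sketch below walks that chain using the values from rows 41 to 44, 51, 52, 55, and 56, and the family names from rows 110, 115, 120, and 125. The dictionary layout and the function name `walk_rdf_list` are assumptions made for illustration, not a SciGraph API.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate) -> object, copied from the triple table above.
LIST_TRIPLES = {
    ("N19f6884d60324f79a7857188c20ab573", "rdf:first"): "sg:person.010072002657.12",
    ("N19f6884d60324f79a7857188c20ab573", "rdf:rest"): "N59bdba48251d4e95ab4264c88a77a813",
    ("N59bdba48251d4e95ab4264c88a77a813", "rdf:first"): "sg:person.010477131004.17",
    ("N59bdba48251d4e95ab4264c88a77a813", "rdf:rest"): "N462b0a0a48684e79874518e8c196e189",
    ("N462b0a0a48684e79874518e8c196e189", "rdf:first"): "sg:person.013366554741.94",
    ("N462b0a0a48684e79874518e8c196e189", "rdf:rest"): "N19780a7f85bc4a15a94fbcc46151d939",
    ("N19780a7f85bc4a15a94fbcc46151d939", "rdf:first"): "sg:person.013434670141.68",
    ("N19780a7f85bc4a15a94fbcc46151d939", "rdf:rest"): RDF_NIL,
}

# schema:familyName values for the four person nodes.
FAMILY_NAMES = {
    "sg:person.010072002657.12": "López-Espejo",
    "sg:person.010477131004.17": "González",
    "sg:person.013366554741.94": "Gómez",
    "sg:person.013434670141.68": "Peinado",
}


def walk_rdf_list(head, triples):
    """Follow rdf:first / rdf:rest links from head until rdf:nil."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            # Guard against malformed data that loops back on itself.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(triples[(node, "rdf:first")])
        node = triples[(node, "rdf:rest")]
    return items


# Row 3 (schema:author) gives the head of the author list.
AUTHOR_LIST_HEAD = "N19f6884d60324f79a7857188c20ab573"
authors = [FAMILY_NAMES[p] for p in walk_rdf_list(AUTHOR_LIST_HEAD, LIST_TRIPLES)]
print(authors)
```

The editor list (row 17) is encoded the same way, so the same walk would apply to it with a different head node.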
 



