A Comparative Analysis of Bayesian Nonparametric Inference Algorithms for Acoustic Modeling in Speech Recognition


Ontology type: schema:Chapter     


Chapter Info

DATE

2015

AUTHORS

John Steinberg , Amir Harati , Joseph Picone

ABSTRACT

Nonparametric Bayesian models have become increasingly popular in speech recognition for their ability to discover data’s underlying structure in an iterative manner. Dirichlet process mixtures (DPMs) are a widely used nonparametric method that do not require a priori assumptions about the structure of the data. DPMs, however, require an infinite number of parameters so inference algorithms are needed to make posterior calculations tractable. The focus of this work is an evaluation of three variational inference algorithms for acoustic modeling: Accelerated Variational Dirichlet Process Mixtures (AVDPM), Collapsed Variational Stick Breaking (CVSB), and Collapsed Dirichlet Priors (CDP). A phoneme classification task is chosen to more clearly assess the viability of these algorithms for acoustic modeling. Evaluations were conducted on the CALLHOME English and Mandarin corpora, consisting of two languages that, from a human perspective, are phonologically very different. In this work, we show that these inference algorithms yield error rates comparable to a baseline Gaussian mixture model (GMM) but with a factor of 20 fewer mixture components. AVDPM is the most attractive choice because it delivers the most compact models and is computationally efficient, enabling its application to big data problems.

PAGES

461-466

Book

TITLE

Innovations and Advances in Computing, Informatics, Systems Sciences, Networking and Engineering

ISBN

978-3-319-06772-8
978-3-319-06773-5

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-319-06773-5_61

DOI

http://dx.doi.org/10.1007/978-3-319-06773-5_61

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1013023752



JSON-LD is the canonical representation for SciGraph data.

[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Temple University", 
          "id": "https://www.grid.ac/institutes/grid.264727.2", 
          "name": [
            "Department of Electrical and Computer Engineering, Temple University, Philadelphia, PA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Steinberg", 
        "givenName": "John", 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Temple University", 
          "id": "https://www.grid.ac/institutes/grid.264727.2", 
          "name": [
            "Department of Electrical and Computer Engineering, Temple University, Philadelphia, PA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Harati", 
        "givenName": "Amir", 
        "id": "sg:person.0751601420.23", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0751601420.23"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Temple University", 
          "id": "https://www.grid.ac/institutes/grid.264727.2", 
          "name": [
            "Department of Electrical and Computer Engineering, Temple University, Philadelphia, PA, USA"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Picone", 
        "givenName": "Joseph", 
        "id": "sg:person.013440266677.70", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013440266677.70"
        ], 
        "type": "Person"
      }
    ], 
    "citation": [
      {
        "id": "https://doi.org/10.1214/aos/1176342871", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1038214170"
        ], 
        "type": "CreativeWork"
      }, 
      {
        "id": "https://doi.org/10.1214/06-ba104", 
        "sameAs": [
          "https://app.dimensions.ai/details/publication/pub.1064389454"
        ], 
        "type": "CreativeWork"
      }
    ], 
    "datePublished": "2015", 
    "datePublishedReg": "2015-01-01", 
    "description": "Nonparametric Bayesian models have become increasingly popular in speech recognition for their ability to discover data\u2019s underlying structure in an iterative manner. Dirichlet process mixtures (DPMs) are a widely used nonparametric method that do not require a priori assumptions about the structure of the data. DPMs, however, require an infinite number of parameters so inference algorithms are needed to make posterior calculations tractable. The focus of this work is an evaluation of three variational inference algorithms for acoustic modeling: Accelerated Variational Dirichlet Process Mixtures (AVDPM), Collapsed Variational Stick Breaking (CVSB), and Collapsed Dirichlet Priors (CDP). A phoneme classification task is chosen to more clearly assess the viability of these algorithms for acoustic modeling. Evaluations were conducted on the CALLHOME English and Mandarin corpora, consisting of two languages that, from a human perspective, are phonologically very different. In this work, we show that these inference algorithms yield error rates comparable to a baseline Gaussian mixture model (GMM) but with a factor of 20 fewer mixture components. AVDPM is the most attractive choice because it delivers the most compact models and is computationally efficient, enabling its application to big data problems.", 
    "editor": [
      {
        "familyName": "Sobh", 
        "givenName": "Tarek", 
        "type": "Person"
      }, 
      {
        "familyName": "Elleithy", 
        "givenName": "Khaled", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-319-06773-5_61", 
    "inLanguage": [
      "en"
    ], 
    "isAccessibleForFree": false, 
    "isFundedItemOf": [
      {
        "id": "sg:grant.3110216", 
        "type": "MonetaryGrant"
      }
    ], 
    "isPartOf": {
      "isbn": [
        "978-3-319-06772-8", 
        "978-3-319-06773-5"
      ], 
      "name": "Innovations and Advances in Computing, Informatics, Systems Sciences, Networking and Engineering", 
      "type": "Book"
    }, 
    "name": "A Comparative Analysis of Bayesian Nonparametric Inference Algorithms for Acoustic Modeling in Speech Recognition", 
    "pagination": "461-466", 
    "productId": [
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-319-06773-5_61"
        ]
      }, 
      {
        "name": "readcube_id", 
        "type": "PropertyValue", 
        "value": [
          "3e0b4dc2cf9c18cb687b9b521aa2300e42b04fc1a3dddb9e59ca548bb0a1c62b"
        ]
      }, 
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1013023752"
        ]
      }
    ], 
    "publisher": {
      "location": "Cham", 
      "name": "Springer International Publishing", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-319-06773-5_61", 
      "https://app.dimensions.ai/details/publication/pub.1013023752"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2019-04-15T17:12", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8678_00000251.jsonl", 
    "type": "Chapter", 
    "url": "http://link.springer.com/10.1007/978-3-319-06773-5_61"
  }
]
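
Because the record above is ordinary JSON, it can be read with a standard JSON parser. The following is a minimal Python sketch, assuming a trimmed copy of the record that keeps only the fields it reads; the helper name `summarize` is hypothetical and not part of any SciGraph tooling.

```python
import json

# Trimmed copy of the record above; only the fields read below are kept.
RECORD_JSON = """
[
  {
    "author": [
      {"familyName": "Steinberg", "givenName": "John", "type": "Person"},
      {"familyName": "Harati", "givenName": "Amir", "type": "Person"},
      {"familyName": "Picone", "givenName": "Joseph", "type": "Person"}
    ],
    "name": "A Comparative Analysis of Bayesian Nonparametric Inference Algorithms for Acoustic Modeling in Speech Recognition",
    "pagination": "461-466",
    "productId": [
      {"name": "doi", "type": "PropertyValue",
       "value": ["10.1007/978-3-319-06773-5_61"]},
      {"name": "dimensions_id", "type": "PropertyValue",
       "value": ["pub.1013023752"]}
    ]
  }
]
"""


def summarize(record):
    """Hypothetical helper: pull a few display fields out of one record dict."""
    authors = [
        f"{a['givenName']} {a['familyName']}" for a in record.get("author", [])
    ]
    # Each productId entry holds a name and a one-element list of values.
    ids = {p["name"]: p["value"][0] for p in record.get("productId", [])}
    return {
        "title": record.get("name"),
        "authors": authors,
        "doi": ids.get("doi"),
        "pages": record.get("pagination"),
    }


# The top-level JSON value is a list containing a single record.
summary = summarize(json.loads(RECORD_JSON)[0])
print(summary["authors"])
print(summary["doi"])
```

The sketch treats the document as plain JSON and ignores the `@context`; a JSON-LD processor would be needed to expand the keys into full IRIs.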
 

HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-06773-5_61'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-06773-5_61'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-06773-5_61'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-06773-5_61'
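
The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is a minimal sketch, not an official client: the names `MEDIA_TYPES`, `build_request`, and `fetch`, the short format labels, and the 30-second timeout are assumptions made for illustration, and whether the endpoint still responds is not something this page confirms.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-319-06773-5_61"

# Short labels (assumed for this sketch) mapped to the media types used in
# the curl commands above.
MEDIA_TYPES = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Build a GET request whose Accept header selects the serialization."""
    return urllib.request.Request(url, headers={"Accept": MEDIA_TYPES[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network, so it is safe to inspect.
req = build_request("turtle")
print(req.get_header("Accept"))
```

Calling `fetch("json-ld")` would issue the same request as the first curl command; it is left uncalled here because it depends on the service being reachable.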


 

This table displays all metadata directly associated with this object as RDF triples.

91 TRIPLES      23 PREDICATES      29 URIs      20 LITERALS      8 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-319-06773-5_61 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N3b5705cc753d407eb85feadda42837a0
4 schema:citation https://doi.org/10.1214/06-ba104
5 https://doi.org/10.1214/aos/1176342871
6 schema:datePublished 2015
7 schema:datePublishedReg 2015-01-01
8 schema:description Nonparametric Bayesian models have become increasingly popular in speech recognition for their ability to discover data’s underlying structure in an iterative manner. Dirichlet process mixtures (DPMs) are a widely used nonparametric method that do not require a priori assumptions about the structure of the data. DPMs, however, require an infinite number of parameters so inference algorithms are needed to make posterior calculations tractable. The focus of this work is an evaluation of three variational inference algorithms for acoustic modeling: Accelerated Variational Dirichlet Process Mixtures (AVDPM), Collapsed Variational Stick Breaking (CVSB), and Collapsed Dirichlet Priors (CDP). A phoneme classification task is chosen to more clearly assess the viability of these algorithms for acoustic modeling. Evaluations were conducted on the CALLHOME English and Mandarin corpora, consisting of two languages that, from a human perspective, are phonologically very different. In this work, we show that these inference algorithms yield error rates comparable to a baseline Gaussian mixture model (GMM) but with a factor of 20 fewer mixture components. AVDPM is the most attractive choice because it delivers the most compact models and is computationally efficient, enabling its application to big data problems.
9 schema:editor N4664a2222be347b6bbe605e11df87921
10 schema:genre chapter
11 schema:inLanguage en
12 schema:isAccessibleForFree false
13 schema:isPartOf N4ca9ef7cf22b4a44bd2e89d79d5cbafc
14 schema:name A Comparative Analysis of Bayesian Nonparametric Inference Algorithms for Acoustic Modeling in Speech Recognition
15 schema:pagination 461-466
16 schema:productId N20350b9ff99548a5bfe524976d6252ae
17 N432e23c2fd204c4da07d49d14e2a3806
18 N5d1c8e8052e04df5abbfec631f85094b
19 schema:publisher N2407e01603c84129ad519f4af8a2caa7
20 schema:sameAs https://app.dimensions.ai/details/publication/pub.1013023752
21 https://doi.org/10.1007/978-3-319-06773-5_61
22 schema:sdDatePublished 2019-04-15T17:12
23 schema:sdLicense https://scigraph.springernature.com/explorer/license/
24 schema:sdPublisher Nca0f76a857984858ac9baeb19ec391ea
25 schema:url http://link.springer.com/10.1007/978-3-319-06773-5_61
26 sgo:license sg:explorer/license/
27 sgo:sdDataset chapters
28 rdf:type schema:Chapter
29 N156db0958f45450d8019be378db3ffd9 rdf:first sg:person.013440266677.70
30 rdf:rest rdf:nil
31 N20350b9ff99548a5bfe524976d6252ae schema:name readcube_id
32 schema:value 3e0b4dc2cf9c18cb687b9b521aa2300e42b04fc1a3dddb9e59ca548bb0a1c62b
33 rdf:type schema:PropertyValue
34 N2407e01603c84129ad519f4af8a2caa7 schema:location Cham
35 schema:name Springer International Publishing
36 rdf:type schema:Organisation
37 N3b5705cc753d407eb85feadda42837a0 rdf:first N4d9e8999a2fe47339deb60cf8b868d05
38 rdf:rest Nee594a6338b7417db3dac56d44dd32c7
39 N432e23c2fd204c4da07d49d14e2a3806 schema:name doi
40 schema:value 10.1007/978-3-319-06773-5_61
41 rdf:type schema:PropertyValue
42 N4664a2222be347b6bbe605e11df87921 rdf:first N47608424519c425590a2d38e08073287
43 rdf:rest N5c237a7fe3f04ea9ae32fbae2b80d06d
44 N47608424519c425590a2d38e08073287 schema:familyName Sobh
45 schema:givenName Tarek
46 rdf:type schema:Person
47 N4ca9ef7cf22b4a44bd2e89d79d5cbafc schema:isbn 978-3-319-06772-8
48 978-3-319-06773-5
49 schema:name Innovations and Advances in Computing, Informatics, Systems Sciences, Networking and Engineering
50 rdf:type schema:Book
51 N4d9e8999a2fe47339deb60cf8b868d05 schema:affiliation https://www.grid.ac/institutes/grid.264727.2
52 schema:familyName Steinberg
53 schema:givenName John
54 rdf:type schema:Person
55 N5c237a7fe3f04ea9ae32fbae2b80d06d rdf:first Nbee2d387d4114fc4b7fb56f7de651b7d
56 rdf:rest rdf:nil
57 N5d1c8e8052e04df5abbfec631f85094b schema:name dimensions_id
58 schema:value pub.1013023752
59 rdf:type schema:PropertyValue
60 Nbee2d387d4114fc4b7fb56f7de651b7d schema:familyName Elleithy
61 schema:givenName Khaled
62 rdf:type schema:Person
63 Nca0f76a857984858ac9baeb19ec391ea schema:name Springer Nature - SN SciGraph project
64 rdf:type schema:Organization
65 Nee594a6338b7417db3dac56d44dd32c7 rdf:first sg:person.0751601420.23
66 rdf:rest N156db0958f45450d8019be378db3ffd9
67 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
68 schema:name Information and Computing Sciences
69 rdf:type schema:DefinedTerm
70 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
71 schema:name Artificial Intelligence and Image Processing
72 rdf:type schema:DefinedTerm
73 sg:grant.3110216 http://pending.schema.org/fundedItem sg:pub.10.1007/978-3-319-06773-5_61
74 rdf:type schema:MonetaryGrant
75 sg:person.013440266677.70 schema:affiliation https://www.grid.ac/institutes/grid.264727.2
76 schema:familyName Picone
77 schema:givenName Joseph
78 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013440266677.70
79 rdf:type schema:Person
80 sg:person.0751601420.23 schema:affiliation https://www.grid.ac/institutes/grid.264727.2
81 schema:familyName Harati
82 schema:givenName Amir
83 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0751601420.23
84 rdf:type schema:Person
85 https://doi.org/10.1214/06-ba104 schema:sameAs https://app.dimensions.ai/details/publication/pub.1064389454
86 rdf:type schema:CreativeWork
87 https://doi.org/10.1214/aos/1176342871 schema:sameAs https://app.dimensions.ai/details/publication/pub.1038214170
88 rdf:type schema:CreativeWork
89 https://www.grid.ac/institutes/grid.264727.2 schema:alternateName Temple University
90 schema:name Department of Electrical and Computer Engineering, Temple University, Philadelphia, PA, USA
91 rdf:type schema:Organization
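
In the table above, the value of `schema:author` is a blank node that heads an RDF list: each list node carries one `rdf:first` (the item) and one `rdf:rest` (the next node, or `rdf:nil` at the end). The following Python sketch, using only rows 29-30, 37-38, and 65-66 copied from the table, shows how the author order can be recovered by following that chain. The function name `walk_list` is hypothetical, and the triples are held as plain tuples rather than in an RDF library.

```python
RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"

# (subject, predicate, object) rows copied from the table above.
TRIPLES = [
    ("N156db0958f45450d8019be378db3ffd9", RDF_FIRST, "sg:person.013440266677.70"),
    ("N156db0958f45450d8019be378db3ffd9", RDF_REST, RDF_NIL),
    ("N3b5705cc753d407eb85feadda42837a0", RDF_FIRST, "N4d9e8999a2fe47339deb60cf8b868d05"),
    ("N3b5705cc753d407eb85feadda42837a0", RDF_REST, "Nee594a6338b7417db3dac56d44dd32c7"),
    ("Nee594a6338b7417db3dac56d44dd32c7", RDF_FIRST, "sg:person.0751601420.23"),
    ("Nee594a6338b7417db3dac56d44dd32c7", RDF_REST, "N156db0958f45450d8019be378db3ffd9"),
]


def walk_list(head, triples):
    """Follow rdf:first / rdf:rest from `head` and return the items in order."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items, node, seen = [], head, set()
    while node != RDF_NIL:
        # Guard against cycles and against nodes missing either link.
        if node in seen or node not in first or node not in rest:
            raise ValueError(f"malformed RDF list at node {node!r}")
        seen.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# Row 3 of the table gives the head node of the schema:author list.
authors = walk_list("N3b5705cc753d407eb85feadda42837a0", TRIPLES)
for item in authors:
    print(item)
```

According to rows 51-53, 80-82, and 75-77 of the table, the three items correspond to Steinberg, Harati, and Picone, which matches the author order in the JSON-LD record.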
 



