Speaker Change Detection Using Binary Key Modelling with Contextual Information


Ontology type: schema:Chapter     


Chapter Info

DATE

2017-09-27

AUTHORS

Jose Patino, Héctor Delgado, Nicholas Evans

ABSTRACT

Speaker change detection can be of benefit to a number of different speech processing tasks such as speaker diarization, recognition and detection. Current solutions rely either on highly localized data or on training with large quantities of background data. While efficient, the former tend to over-segment. While more stable, the latter are less efficient and need adaptation to mis-matching data. Building on previous work in speaker recognition and diarization, this paper reports a new binary key (BK) modelling approach to speaker change detection which aims to strike a balance between efficiency and segmentation accuracy. The BK approach benefits from training using a controllable degree of contextual data, rather than relying on external background data, and is efficient in terms of computation and speaker discrimination. Experiments on a subset of the standard ETAPE database show that the new approach outperforms the current state-of-the-art methods for speaker change detection and gives an average relative improvement in segment coverage and purity of 18.71% and 4.51% respectively.

PAGES

250-261

Book

TITLE

Statistical Language and Speech Processing

ISBN

978-3-319-68455-0
978-3-319-68456-7

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21

DOI

http://dx.doi.org/10.1007/978-3-319-68456-7_21

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1091962650



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department of Digital Security, EURECOM, Sophia Antipolis, France", 
          "id": "http://www.grid.ac/institutes/grid.28848.3e", 
          "name": [
            "Department of Digital Security, EURECOM, Sophia Antipolis, France"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Patino", 
        "givenName": "Jose", 
        "id": "sg:person.010630645461.61", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010630645461.61"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Digital Security, EURECOM, Sophia Antipolis, France", 
          "id": "http://www.grid.ac/institutes/grid.28848.3e", 
          "name": [
            "Department of Digital Security, EURECOM, Sophia Antipolis, France"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Delgado", 
        "givenName": "H\u00e9ctor", 
        "id": "sg:person.016045564603.89", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016045564603.89"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Digital Security, EURECOM, Sophia Antipolis, France", 
          "id": "http://www.grid.ac/institutes/grid.28848.3e", 
          "name": [
            "Department of Digital Security, EURECOM, Sophia Antipolis, France"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Evans", 
        "givenName": "Nicholas", 
        "id": "sg:person.010025622670.52", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010025622670.52"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2017-09-27", 
    "datePublishedReg": "2017-09-27", 
    "description": "Speaker change detection can be of benefit to a number of different speech processing tasks such as speaker diarization, recognition and detection. Current solutions rely either on highly localized data or on training with large quantities of background data. While efficient, the former tend to over-segment. While more stable, the latter are less efficient and need adaptation to mis-matching data. Building on previous work in speaker recognition and diarization, this paper reports a new binary key (BK) modelling approach to speaker change detection which aims to strike a balance between efficiency and segmentation accuracy. The BK approach benefits from training using a controllable degree of contextual data, rather than relying on external background data, and is efficient in terms of computation and speaker discrimination. Experiments on a subset of the standard ETAPE database show that the new approach outperforms the current state-of-the-art methods for speaker change detection and gives an average relative improvement in segment coverage and purity of 18.71% and 4.51% respectively.", 
    "editor": [
      {
        "familyName": "Camelin", 
        "givenName": "Nathalie", 
        "type": "Person"
      }, 
      {
        "familyName": "Est\u00e8ve", 
        "givenName": "Yannick", 
        "type": "Person"
      }, 
      {
        "familyName": "Mart\u00edn-Vide", 
        "givenName": "Carlos", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-319-68456-7_21", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-319-68455-0", 
        "978-3-319-68456-7"
      ], 
      "name": "Statistical Language and Speech Processing", 
      "type": "Book"
    }, 
    "keywords": [
      "speaker change detection", 
      "change detection", 
      "terms of computation", 
      "average relative improvement", 
      "segmentation accuracy", 
      "speaker recognition", 
      "art methods", 
      "segment coverage", 
      "current solutions", 
      "speaker diarization", 
      "contextual information", 
      "key modelling approaches", 
      "contextual data", 
      "approach benefits", 
      "diarization", 
      "relative improvement", 
      "speaker discrimination", 
      "different speech", 
      "new approach", 
      "previous work", 
      "current state", 
      "controllable degrees", 
      "recognition", 
      "detection", 
      "modelling approach", 
      "computation", 
      "task", 
      "database", 
      "information", 
      "accuracy", 
      "data", 
      "training", 
      "speech", 
      "background data", 
      "benefits", 
      "modelling", 
      "large quantities", 
      "efficiency", 
      "solution", 
      "work", 
      "method", 
      "experiments", 
      "coverage", 
      "subset", 
      "adaptation", 
      "improvement", 
      "terms", 
      "number", 
      "state", 
      "segments", 
      "degree", 
      "quantity", 
      "balance", 
      "discrimination", 
      "purity", 
      "approach", 
      "paper", 
      "mis-matching data", 
      "new binary key (BK) modelling approach", 
      "binary key (BK) modelling approach", 
      "BK approach benefits", 
      "external background data", 
      "standard ETAPE database", 
      "ETAPE database", 
      "Binary Key Modelling", 
      "Key Modelling"
    ], 
    "name": "Speaker Change Detection Using Binary Key Modelling with Contextual Information", 
    "pagination": "250-261", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1091962650"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-319-68456-7_21"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-319-68456-7_21", 
      "https://app.dimensions.ai/details/publication/pub.1091962650"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2021-11-01T18:46", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20211101/entities/gbq_results/chapter/chapter_111.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-319-68456-7_21"
  }
]
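Because the record above is ordinary JSON, it can be read with any JSON parser before reaching for RDF tooling. The following is a minimal sketch, assuming a trimmed excerpt of the record (only the fields it uses); the `summarize` helper and the shape of its return value are illustrative choices, not part of SciGraph.

```python
import json

# Trimmed excerpt of the JSON-LD record above; only the fields used below are kept.
RECORD = r"""
[
  {
    "author": [
      {"familyName": "Patino", "givenName": "Jose"},
      {"familyName": "Delgado", "givenName": "H\u00e9ctor"},
      {"familyName": "Evans", "givenName": "Nicholas"}
    ],
    "name": "Speaker Change Detection Using Binary Key Modelling with Contextual Information",
    "pagination": "250-261",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1091962650"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-319-68456-7_21"]}
    ]
  }
]
"""


def summarize(record_text):
    """Pull a few bibliographic fields out of a SciGraph-style JSON-LD record."""
    chapter = json.loads(record_text)[0]  # the record is a one-element array
    authors = [f"{a['givenName']} {a['familyName']}" for a in chapter["author"]]
    # productId is a list of name/value pairs; each value is itself a list.
    ids = {p["name"]: p["value"][0] for p in chapter["productId"]}
    return {
        "title": chapter["name"],
        "authors": authors,
        "doi": ids.get("doi"),
        "pages": chapter["pagination"],
    }


summary = summarize(RECORD)
print(", ".join(summary["authors"]))
print(summary["doi"])
```

Note that the `\u00e9` escape in the record is decoded by the JSON parser, so the given name comes back as "Héctor" without any extra handling.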
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21'
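The four curl commands above differ only in the Accept header, which is plain HTTP content negotiation. The same requests can be sketched with Python's standard library; the short format keys (`json-ld`, `nt`, `turtle`, `xml`) and the helper names are assumptions made for this example, and whether the endpoint still answers is not verified here.

```python
import urllib.request

RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21"

# Accept header values copied from the curl commands above.
# The dictionary keys are hypothetical labels chosen for this sketch.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt, url=RECORD_URL):
    """Return an unsent request asking for the record in the given format."""
    if fmt not in ACCEPT:
        raise ValueError(f"unknown format: {fmt!r}")
    return urllib.request.Request(url, headers={"Accept": ACCEPT[fmt]})


def fetch(fmt, url=RECORD_URL, timeout=30):
    """Send the request and return the body as text (requires network access)."""
    with urllib.request.urlopen(build_request(fmt, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network; only fetch() does.
request = build_request("turtle")
print(request.get_header("Accept"))
```

Keeping request construction separate from sending makes the negotiation logic easy to check offline; `fetch("nt")` would be the equivalent of the second curl command.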


 

This table displays all metadata directly associated with this object as RDF triples.

150 TRIPLES      23 PREDICATES      91 URIs      84 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-319-68456-7_21 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author Nf2453019b6674ca3a97b99d71a5cafb3
4 schema:datePublished 2017-09-27
5 schema:datePublishedReg 2017-09-27
6 schema:description Speaker change detection can be of benefit to a number of different speech processing tasks such as speaker diarization, recognition and detection. Current solutions rely either on highly localized data or on training with large quantities of background data. While efficient, the former tend to over-segment. While more stable, the latter are less efficient and need adaptation to mis-matching data. Building on previous work in speaker recognition and diarization, this paper reports a new binary key (BK) modelling approach to speaker change detection which aims to strike a balance between efficiency and segmentation accuracy. The BK approach benefits from training using a controllable degree of contextual data, rather than relying on external background data, and is efficient in terms of computation and speaker discrimination. Experiments on a subset of the standard ETAPE database show that the new approach outperforms the current state-of-the-art methods for speaker change detection and gives an average relative improvement in segment coverage and purity of 18.71% and 4.51% respectively.
7 schema:editor Ne713b7b8e30a416bbb5acf1b0a47b8e0
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree false
11 schema:isPartOf Nac2a645638f94e9d8b43cc31acb6fc19
12 schema:keywords BK approach benefits
13 Binary Key Modelling
14 ETAPE database
15 Key Modelling
16 accuracy
17 adaptation
18 approach
19 approach benefits
20 art methods
21 average relative improvement
22 background data
23 balance
24 benefits
25 binary key (BK) modelling approach
26 change detection
27 computation
28 contextual data
29 contextual information
30 controllable degrees
31 coverage
32 current solutions
33 current state
34 data
35 database
36 degree
37 detection
38 diarization
39 different speech
40 discrimination
41 efficiency
42 experiments
43 external background data
44 improvement
45 information
46 key modelling approaches
47 large quantities
48 method
49 mis-matching data
50 modelling
51 modelling approach
52 new approach
53 new binary key (BK) modelling approach
54 number
55 paper
56 previous work
57 purity
58 quantity
59 recognition
60 relative improvement
61 segment coverage
62 segmentation accuracy
63 segments
64 solution
65 speaker change detection
66 speaker diarization
67 speaker discrimination
68 speaker recognition
69 speech
70 standard ETAPE database
71 state
72 subset
73 task
74 terms
75 terms of computation
76 training
77 work
78 schema:name Speaker Change Detection Using Binary Key Modelling with Contextual Information
79 schema:pagination 250-261
80 schema:productId N2b948f73e2d345a3ada41eb12eca0fdd
81 N5b5a932b382f4f469e46eba7ab38b0a3
82 schema:publisher N966b46fd376e49fc8549177f701aca95
83 schema:sameAs https://app.dimensions.ai/details/publication/pub.1091962650
84 https://doi.org/10.1007/978-3-319-68456-7_21
85 schema:sdDatePublished 2021-11-01T18:46
86 schema:sdLicense https://scigraph.springernature.com/explorer/license/
87 schema:sdPublisher N420b14609eb4490aa460683f1b324b5c
88 schema:url https://doi.org/10.1007/978-3-319-68456-7_21
89 sgo:license sg:explorer/license/
90 sgo:sdDataset chapters
91 rdf:type schema:Chapter
92 N0eeda624fa2a4c27ba52a372e34c6f0e rdf:first sg:person.010025622670.52
93 rdf:rest rdf:nil
94 N0fd105a6d2c7490ebac78d43e0e27f4a schema:familyName Camelin
95 schema:givenName Nathalie
96 rdf:type schema:Person
97 N2b948f73e2d345a3ada41eb12eca0fdd schema:name doi
98 schema:value 10.1007/978-3-319-68456-7_21
99 rdf:type schema:PropertyValue
100 N31ab6796a6c6431bb1a2f423b6376793 schema:familyName Estève
101 schema:givenName Yannick
102 rdf:type schema:Person
103 N420b14609eb4490aa460683f1b324b5c schema:name Springer Nature - SN SciGraph project
104 rdf:type schema:Organization
105 N5b5a932b382f4f469e46eba7ab38b0a3 schema:name dimensions_id
106 schema:value pub.1091962650
107 rdf:type schema:PropertyValue
108 N772ea80b423b461e9df47a6d769222ca schema:familyName Martín-Vide
109 schema:givenName Carlos
110 rdf:type schema:Person
111 N815b0fc89e6546c7819c04cc8223487f rdf:first sg:person.016045564603.89
112 rdf:rest N0eeda624fa2a4c27ba52a372e34c6f0e
113 N966b46fd376e49fc8549177f701aca95 schema:name Springer Nature
114 rdf:type schema:Organisation
115 Na4d655ba54af4de29121e8281b654b8d rdf:first N772ea80b423b461e9df47a6d769222ca
116 rdf:rest rdf:nil
117 Nac2a645638f94e9d8b43cc31acb6fc19 schema:isbn 978-3-319-68455-0
118 978-3-319-68456-7
119 schema:name Statistical Language and Speech Processing
120 rdf:type schema:Book
121 Nc6e9b5d55a0a411eb06be410c1bdf022 rdf:first N31ab6796a6c6431bb1a2f423b6376793
122 rdf:rest Na4d655ba54af4de29121e8281b654b8d
123 Ne713b7b8e30a416bbb5acf1b0a47b8e0 rdf:first N0fd105a6d2c7490ebac78d43e0e27f4a
124 rdf:rest Nc6e9b5d55a0a411eb06be410c1bdf022
125 Nf2453019b6674ca3a97b99d71a5cafb3 rdf:first sg:person.010630645461.61
126 rdf:rest N815b0fc89e6546c7819c04cc8223487f
127 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
128 schema:name Information and Computing Sciences
129 rdf:type schema:DefinedTerm
130 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
131 schema:name Artificial Intelligence and Image Processing
132 rdf:type schema:DefinedTerm
133 sg:person.010025622670.52 schema:affiliation grid-institutes:grid.28848.3e
134 schema:familyName Evans
135 schema:givenName Nicholas
136 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010025622670.52
137 rdf:type schema:Person
138 sg:person.010630645461.61 schema:affiliation grid-institutes:grid.28848.3e
139 schema:familyName Patino
140 schema:givenName Jose
141 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010630645461.61
142 rdf:type schema:Person
143 sg:person.016045564603.89 schema:affiliation grid-institutes:grid.28848.3e
144 schema:familyName Delgado
145 schema:givenName Héctor
146 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016045564603.89
147 rdf:type schema:Person
148 grid-institutes:grid.28848.3e schema:alternateName Department of Digital Security, EURECOM, Sophia Antipolis, France
149 schema:name Department of Digital Security, EURECOM, Sophia Antipolis, France
150 rdf:type schema:Organization
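In the table above, author and editor order is encoded as RDF lists: `schema:author` points at a blank node, and each blank node carries an `rdf:first` (the item) and an `rdf:rest` (the next node, or `rdf:nil` at the end). A minimal sketch of following that chain, assuming the triples have already been loaded into a plain dictionary (the `walk_rdf_list` helper is hypothetical, not a SciGraph or RDF-library function):

```python
RDF_NIL = "rdf:nil"

# (rdf:first, rdf:rest) pairs copied from rows 125-126, 111-112 and 92-93 of the table above.
LIST_NODES = {
    "Nf2453019b6674ca3a97b99d71a5cafb3": (
        "sg:person.010630645461.61",
        "N815b0fc89e6546c7819c04cc8223487f",
    ),
    "N815b0fc89e6546c7819c04cc8223487f": (
        "sg:person.016045564603.89",
        "N0eeda624fa2a4c27ba52a372e34c6f0e",
    ),
    "N0eeda624fa2a4c27ba52a372e34c6f0e": (
        "sg:person.010025622670.52",
        RDF_NIL,
    ),
}

# Object of the schema:author triple (row 3 of the table).
AUTHOR_HEAD = "Nf2453019b6674ca3a97b99d71a5cafb3"


def walk_rdf_list(head, nodes):
    """Follow rdf:rest links from head, collecting each rdf:first value in order."""
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        first, rest = nodes[node]
        items.append(first)
        node = rest
    return items


authors = walk_rdf_list(AUTHOR_HEAD, LIST_NODES)
print(authors)
```

The result lists Patino, Delgado and Evans by their `sg:person` identifiers, matching the author order given at the top of the record. The editor list (starting from the object of `schema:editor`, row 7) can be traversed the same way.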
 



