Speaker Change Detection Using Binary Key Modelling with Contextual Information


Ontology type: schema:Chapter     


Chapter Info

DATE

2017-09-27

AUTHORS

Jose Patino , Héctor Delgado , Nicholas Evans

ABSTRACT

Speaker change detection can be of benefit to a number of different speech processing tasks such as speaker diarization, recognition and detection. Current solutions rely either on highly localized data or on training with large quantities of background data. While efficient, the former tend to over-segment. While more stable, the latter are less efficient and need adaptation to mis-matching data. Building on previous work in speaker recognition and diarization, this paper reports a new binary key (BK) modelling approach to speaker change detection which aims to strike a balance between efficiency and segmentation accuracy. The BK approach benefits from training using a controllable degree of contextual data, rather than relying on external background data, and is efficient in terms of computation and speaker discrimination. Experiments on a subset of the standard ETAPE database show that the new approach outperforms the current state-of-the-art methods for speaker change detection and gives an average relative improvement in segment coverage and purity of 18.71% and 4.51% respectively.

PAGES

250-261

Book

TITLE

Statistical Language and Speech Processing

ISBN

978-3-319-68455-0
978-3-319-68456-7

Identifiers

URI

http://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21

DOI

http://dx.doi.org/10.1007/978-3-319-68456-7_21

DIMENSIONS

https://app.dimensions.ai/details/publication/pub.1091962650



JSON-LD is the canonical representation for SciGraph data.


[
  {
    "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
    "about": [
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Information and Computing Sciences", 
        "type": "DefinedTerm"
      }, 
      {
        "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
        "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
        "name": "Artificial Intelligence and Image Processing", 
        "type": "DefinedTerm"
      }
    ], 
    "author": [
      {
        "affiliation": {
          "alternateName": "Department of Digital Security, EURECOM, Sophia Antipolis, France", 
          "id": "http://www.grid.ac/institutes/grid.28848.3e", 
          "name": [
            "Department of Digital Security, EURECOM, Sophia Antipolis, France"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Patino", 
        "givenName": "Jose", 
        "id": "sg:person.010630645461.61", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010630645461.61"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Digital Security, EURECOM, Sophia Antipolis, France", 
          "id": "http://www.grid.ac/institutes/grid.28848.3e", 
          "name": [
            "Department of Digital Security, EURECOM, Sophia Antipolis, France"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Delgado", 
        "givenName": "H\u00e9ctor", 
        "id": "sg:person.016045564603.89", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016045564603.89"
        ], 
        "type": "Person"
      }, 
      {
        "affiliation": {
          "alternateName": "Department of Digital Security, EURECOM, Sophia Antipolis, France", 
          "id": "http://www.grid.ac/institutes/grid.28848.3e", 
          "name": [
            "Department of Digital Security, EURECOM, Sophia Antipolis, France"
          ], 
          "type": "Organization"
        }, 
        "familyName": "Evans", 
        "givenName": "Nicholas", 
        "id": "sg:person.010025622670.52", 
        "sameAs": [
          "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010025622670.52"
        ], 
        "type": "Person"
      }
    ], 
    "datePublished": "2017-09-27", 
    "datePublishedReg": "2017-09-27", 
    "description": "Speaker change detection can be of benefit to a number of different speech processing tasks such as speaker diarization, recognition and detection. Current solutions rely either on highly localized data or on training with large quantities of background data. While efficient, the former tend to over-segment. While more stable, the latter are less efficient and need adaptation to mis-matching data. Building on previous work in speaker recognition and diarization, this paper reports a new binary key (BK) modelling approach to speaker change detection which aims to strike a balance between efficiency and segmentation accuracy. The BK approach benefits from training using a controllable degree of contextual data, rather than relying on external background data, and is efficient in terms of computation and speaker discrimination. Experiments on a subset of the standard ETAPE database show that the new approach outperforms the current state-of-the-art methods for speaker change detection and gives an average relative improvement in segment coverage and purity of 18.71% and 4.51% respectively.", 
    "editor": [
      {
        "familyName": "Camelin", 
        "givenName": "Nathalie", 
        "type": "Person"
      }, 
      {
        "familyName": "Est\u00e8ve", 
        "givenName": "Yannick", 
        "type": "Person"
      }, 
      {
        "familyName": "Mart\u00edn-Vide", 
        "givenName": "Carlos", 
        "type": "Person"
      }
    ], 
    "genre": "chapter", 
    "id": "sg:pub.10.1007/978-3-319-68456-7_21", 
    "inLanguage": "en", 
    "isAccessibleForFree": false, 
    "isPartOf": {
      "isbn": [
        "978-3-319-68455-0", 
        "978-3-319-68456-7"
      ], 
      "name": "Statistical Language and Speech Processing", 
      "type": "Book"
    }, 
    "keywords": [
      "speaker change detection", 
      "change detection", 
      "terms of computation", 
      "average relative improvement", 
      "segmentation accuracy", 
      "speaker recognition", 
      "art methods", 
      "segment coverage", 
      "current solutions", 
      "speaker diarization", 
      "contextual information", 
      "key modelling approaches", 
      "contextual data", 
      "approach benefits", 
      "diarization", 
      "relative improvement", 
      "speaker discrimination", 
      "different speech", 
      "new approach", 
      "previous work", 
      "current state", 
      "controllable degrees", 
      "recognition", 
      "detection", 
      "modelling approach", 
      "computation", 
      "task", 
      "database", 
      "information", 
      "accuracy", 
      "data", 
      "training", 
      "speech", 
      "background data", 
      "benefits", 
      "modelling", 
      "large quantities", 
      "efficiency", 
      "solution", 
      "work", 
      "method", 
      "experiments", 
      "coverage", 
      "subset", 
      "adaptation", 
      "improvement", 
      "terms", 
      "number", 
      "state", 
      "segments", 
      "degree", 
      "quantity", 
      "balance", 
      "discrimination", 
      "purity", 
      "approach", 
      "paper", 
      "mis-matching data", 
      "new binary key (BK) modelling approach", 
      "binary key (BK) modelling approach", 
      "BK approach benefits", 
      "external background data", 
      "standard ETAPE database", 
      "ETAPE database", 
      "Binary Key Modelling", 
      "Key Modelling"
    ], 
    "name": "Speaker Change Detection Using Binary Key Modelling with Contextual Information", 
    "pagination": "250-261", 
    "productId": [
      {
        "name": "dimensions_id", 
        "type": "PropertyValue", 
        "value": [
          "pub.1091962650"
        ]
      }, 
      {
        "name": "doi", 
        "type": "PropertyValue", 
        "value": [
          "10.1007/978-3-319-68456-7_21"
        ]
      }
    ], 
    "publisher": {
      "name": "Springer Nature", 
      "type": "Organisation"
    }, 
    "sameAs": [
      "https://doi.org/10.1007/978-3-319-68456-7_21", 
      "https://app.dimensions.ai/details/publication/pub.1091962650"
    ], 
    "sdDataset": "chapters", 
    "sdDatePublished": "2022-01-01T19:24", 
    "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
    "sdPublisher": {
      "name": "Springer Nature - SN SciGraph project", 
      "type": "Organization"
    }, 
    "sdSource": "s3://com-springernature-scigraph/baseset/20220101/entities/gbq_results/chapter/chapter_413.jsonl", 
    "type": "Chapter", 
    "url": "https://doi.org/10.1007/978-3-319-68456-7_21"
  }
]
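As a minimal sketch of how a record like the one above might be consumed, the following Python snippet parses a trimmed copy of the JSON-LD with the standard `json` module and extracts the title, DOI, page range and author names. The `summarize` helper and the `RECORD` constant are hypothetical names introduced only for this illustration; they are not part of SciGraph.

```python
import json

# Trimmed copy of the record above; only the fields read below are kept.
# A raw string is used so the \u00e9 escape reaches the JSON parser intact.
RECORD = r"""
[
  {
    "author": [
      {"familyName": "Patino", "givenName": "Jose"},
      {"familyName": "Delgado", "givenName": "H\u00e9ctor"},
      {"familyName": "Evans", "givenName": "Nicholas"}
    ],
    "name": "Speaker Change Detection Using Binary Key Modelling with Contextual Information",
    "pagination": "250-261",
    "productId": [
      {"name": "dimensions_id", "type": "PropertyValue", "value": ["pub.1091962650"]},
      {"name": "doi", "type": "PropertyValue", "value": ["10.1007/978-3-319-68456-7_21"]}
    ]
  }
]
"""


def summarize(record_json: str) -> dict:
    """Return a small summary of the first entity in a SciGraph-style JSON-LD array."""
    chapter = json.loads(record_json)[0]
    # productId is a list of name/value pairs; index it by name for easy lookup.
    ids = {p["name"]: p["value"][0] for p in chapter["productId"]}
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in chapter["author"]]
    return {
        "title": chapter["name"],
        "doi": ids["doi"],
        "pages": chapter["pagination"],
        "authors": authors,
    }


summary = summarize(RECORD)
print(summary)
```

Because JSON-LD is plain JSON, no RDF library is needed for this kind of field lookup; a dedicated JSON-LD processor only becomes necessary when the `@context` has to be expanded.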
 


HOW TO GET THIS DATA PROGRAMMATICALLY:

JSON-LD is a popular format for linked data which is fully compatible with JSON.

curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21'

N-Triples is a line-based linked data format ideal for batch operations.

curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21'

Turtle is a human-readable linked data format.

curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21'

RDF/XML is a standard XML format for linked data.

curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21'
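The four curl commands above differ only in the `Accept` header, which is ordinary HTTP content negotiation. A minimal Python sketch of the same idea, using only the standard library, is shown below. The `build_request` helper and the short format keys are assumptions made for this example, and the snippet only constructs the request; whether the endpoint still responds has not been verified here.

```python
from urllib.request import Request

# URL taken from the curl commands above.
RECORD_URL = "https://scigraph.springernature.com/pub.10.1007/978-3-319-68456-7_21"

# Media types taken from the curl commands above; the short keys are
# hypothetical labels chosen for this sketch.
ACCEPT = {
    "json-ld": "application/ld+json",
    "nt": "application/n-triples",
    "turtle": "text/turtle",
    "xml": "application/rdf+xml",
}


def build_request(fmt: str, url: str = RECORD_URL) -> Request:
    """Build a GET request that asks the server for one RDF serialization."""
    return Request(url, headers={"Accept": ACCEPT[fmt]})


req = build_request("turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

To actually download the data, the request object can be passed to `urllib.request.urlopen` and the response body read and decoded.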


 

This table displays all metadata directly associated with this object as RDF triples.

150 TRIPLES      23 PREDICATES      91 URIs      84 LITERALS      7 BLANK NODES

Subject Predicate Object
1 sg:pub.10.1007/978-3-319-68456-7_21 schema:about anzsrc-for:08
2 anzsrc-for:0801
3 schema:author N0b0bb775ae2645d7a6153c40fba1b402
4 schema:datePublished 2017-09-27
5 schema:datePublishedReg 2017-09-27
6 schema:description Speaker change detection can be of benefit to a number of different speech processing tasks such as speaker diarization, recognition and detection. Current solutions rely either on highly localized data or on training with large quantities of background data. While efficient, the former tend to over-segment. While more stable, the latter are less efficient and need adaptation to mis-matching data. Building on previous work in speaker recognition and diarization, this paper reports a new binary key (BK) modelling approach to speaker change detection which aims to strike a balance between efficiency and segmentation accuracy. The BK approach benefits from training using a controllable degree of contextual data, rather than relying on external background data, and is efficient in terms of computation and speaker discrimination. Experiments on a subset of the standard ETAPE database show that the new approach outperforms the current state-of-the-art methods for speaker change detection and gives an average relative improvement in segment coverage and purity of 18.71% and 4.51% respectively.
7 schema:editor N6ecfef0bdcf44c9b9534e50bdae60db6
8 schema:genre chapter
9 schema:inLanguage en
10 schema:isAccessibleForFree false
11 schema:isPartOf N4d82b58a7d2745abba7c3a4c9fd5b03d
12 schema:keywords BK approach benefits
13 Binary Key Modelling
14 ETAPE database
15 Key Modelling
16 accuracy
17 adaptation
18 approach
19 approach benefits
20 art methods
21 average relative improvement
22 background data
23 balance
24 benefits
25 binary key (BK) modelling approach
26 change detection
27 computation
28 contextual data
29 contextual information
30 controllable degrees
31 coverage
32 current solutions
33 current state
34 data
35 database
36 degree
37 detection
38 diarization
39 different speech
40 discrimination
41 efficiency
42 experiments
43 external background data
44 improvement
45 information
46 key modelling approaches
47 large quantities
48 method
49 mis-matching data
50 modelling
51 modelling approach
52 new approach
53 new binary key (BK) modelling approach
54 number
55 paper
56 previous work
57 purity
58 quantity
59 recognition
60 relative improvement
61 segment coverage
62 segmentation accuracy
63 segments
64 solution
65 speaker change detection
66 speaker diarization
67 speaker discrimination
68 speaker recognition
69 speech
70 standard ETAPE database
71 state
72 subset
73 task
74 terms
75 terms of computation
76 training
77 work
78 schema:name Speaker Change Detection Using Binary Key Modelling with Contextual Information
79 schema:pagination 250-261
80 schema:productId N2f0f3a9ed1d2497aa6f2f0000e1d7b76
81 N538b73032c694e089cb13f3114499aca
82 schema:publisher N5fa6717af112494b924bc454b7558bf6
83 schema:sameAs https://app.dimensions.ai/details/publication/pub.1091962650
84 https://doi.org/10.1007/978-3-319-68456-7_21
85 schema:sdDatePublished 2022-01-01T19:24
86 schema:sdLicense https://scigraph.springernature.com/explorer/license/
87 schema:sdPublisher N44464e6e9451434fabba8b31f39d61c3
88 schema:url https://doi.org/10.1007/978-3-319-68456-7_21
89 sgo:license sg:explorer/license/
90 sgo:sdDataset chapters
91 rdf:type schema:Chapter
92 N0b0bb775ae2645d7a6153c40fba1b402 rdf:first sg:person.010630645461.61
93 rdf:rest N194a26ee66c84f639dd3ec82e955b3ab
94 N194a26ee66c84f639dd3ec82e955b3ab rdf:first sg:person.016045564603.89
95 rdf:rest N66f81263c11a4adebeb6d0acdd301a5b
96 N2f0f3a9ed1d2497aa6f2f0000e1d7b76 schema:name dimensions_id
97 schema:value pub.1091962650
98 rdf:type schema:PropertyValue
99 N30cb33e2346b4bea8264a316a8f2a856 schema:familyName Martín-Vide
100 schema:givenName Carlos
101 rdf:type schema:Person
102 N44464e6e9451434fabba8b31f39d61c3 schema:name Springer Nature - SN SciGraph project
103 rdf:type schema:Organization
104 N4d82b58a7d2745abba7c3a4c9fd5b03d schema:isbn 978-3-319-68455-0
105 978-3-319-68456-7
106 schema:name Statistical Language and Speech Processing
107 rdf:type schema:Book
108 N50695a1a3ce24cabb13ac6dff8afecd3 schema:familyName Camelin
109 schema:givenName Nathalie
110 rdf:type schema:Person
111 N538b73032c694e089cb13f3114499aca schema:name doi
112 schema:value 10.1007/978-3-319-68456-7_21
113 rdf:type schema:PropertyValue
114 N5fa6717af112494b924bc454b7558bf6 schema:name Springer Nature
115 rdf:type schema:Organisation
116 N66f81263c11a4adebeb6d0acdd301a5b rdf:first sg:person.010025622670.52
117 rdf:rest rdf:nil
118 N6ecfef0bdcf44c9b9534e50bdae60db6 rdf:first N50695a1a3ce24cabb13ac6dff8afecd3
119 rdf:rest Nd8611872fa92412ca51bfbe0fef84900
120 N856d137659734205ab982aac27ece319 schema:familyName Estève
121 schema:givenName Yannick
122 rdf:type schema:Person
123 Nacc1181828694532b177952a6d01f328 rdf:first N30cb33e2346b4bea8264a316a8f2a856
124 rdf:rest rdf:nil
125 Nd8611872fa92412ca51bfbe0fef84900 rdf:first N856d137659734205ab982aac27ece319
126 rdf:rest Nacc1181828694532b177952a6d01f328
127 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
128 schema:name Information and Computing Sciences
129 rdf:type schema:DefinedTerm
130 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
131 schema:name Artificial Intelligence and Image Processing
132 rdf:type schema:DefinedTerm
133 sg:person.010025622670.52 schema:affiliation grid-institutes:grid.28848.3e
134 schema:familyName Evans
135 schema:givenName Nicholas
136 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010025622670.52
137 rdf:type schema:Person
138 sg:person.010630645461.61 schema:affiliation grid-institutes:grid.28848.3e
139 schema:familyName Patino
140 schema:givenName Jose
141 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.010630645461.61
142 rdf:type schema:Person
143 sg:person.016045564603.89 schema:affiliation grid-institutes:grid.28848.3e
144 schema:familyName Delgado
145 schema:givenName Héctor
146 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.016045564603.89
147 rdf:type schema:Person
148 grid-institutes:grid.28848.3e schema:alternateName Department of Digital Security, EURECOM, Sophia Antipolis, France
149 schema:name Department of Digital Security, EURECOM, Sophia Antipolis, France
150 rdf:type schema:Organization
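In the table above, author order is preserved through an RDF collection: each blank node carries an `rdf:first` triple pointing at one person and an `rdf:rest` triple pointing at the next node, ending in `rdf:nil`. As a rough sketch, the list can be walked as follows. The triples are copied from rows 92 to 95 and 116 to 117; the `read_rdf_list` function is a hypothetical helper written for this illustration, not a SciGraph API.

```python
RDF_NIL = "rdf:nil"

# (subject, predicate, object) triples for the schema:author list,
# copied from the table above.
TRIPLES = [
    ("N0b0bb775ae2645d7a6153c40fba1b402", "rdf:first", "sg:person.010630645461.61"),
    ("N0b0bb775ae2645d7a6153c40fba1b402", "rdf:rest", "N194a26ee66c84f639dd3ec82e955b3ab"),
    ("N194a26ee66c84f639dd3ec82e955b3ab", "rdf:first", "sg:person.016045564603.89"),
    ("N194a26ee66c84f639dd3ec82e955b3ab", "rdf:rest", "N66f81263c11a4adebeb6d0acdd301a5b"),
    ("N66f81263c11a4adebeb6d0acdd301a5b", "rdf:first", "sg:person.010025622670.52"),
    ("N66f81263c11a4adebeb6d0acdd301a5b", "rdf:rest", RDF_NIL),
]


def read_rdf_list(head: str, triples: list) -> list:
    """Follow rdf:first / rdf:rest links from `head` until rdf:nil is reached."""
    index = {(s, p): o for s, p, o in triples}
    items = []
    seen = set()
    node = head
    while node != RDF_NIL:
        if node in seen:
            # Guard against malformed, cyclic lists.
            raise ValueError(f"cycle detected at {node}")
        seen.add(node)
        items.append(index[(node, "rdf:first")])
        node = index[(node, "rdf:rest")]
    return items


# The head node is the object of the schema:author triple (row 3).
authors = read_rdf_list("N0b0bb775ae2645d7a6153c40fba1b402", TRIPLES)
print(authors)
```

The same traversal applies to the `schema:editor` list, whose head node is given in row 7.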
 



