A comparison of active shape model and scale decomposition based features for visual speech recognition


Ontology type: schema:Chapter     


Chapter Info

DATE

1998

AUTHORS

Iain Matthews , J. Andrew Bangham , Richard Harvey , Stephen Cox

ABSTRACT

Two quite different strategies for characterising mouth shapes for visual speech recognition (lipreading) are compared. The first strategy extracts the parameters required to fit an active shape model (ASM) to the outline of the lips. The second uses a feature derived from a one-dimensional multiscale spatial analysis (MSA) of the mouth region using a new processor derived from mathematical morphology and median filtering. With multispeaker trials, using image data only, the accuracy is 45% using MSA and 19% using ASM on a letters database. A digits database is simpler with accuracies of 77% and 77% respectively. These scores are significant since separate work has demonstrated that even quite low recognition accuracies in the vision channel can be combined with the audio system to give improved composite performance [16].

PAGES

514-528

References to SciGraph publications

  • 1976-12. Hearing lips and seeing voices in NATURE

Book

    TITLE

    Computer Vision — ECCV’98

    ISBN

    978-3-540-64613-6
    978-3-540-69235-5

    Identifiers

    URI

    http://scigraph.springernature.com/pub.10.1007/bfb0054762

    DOI

    http://dx.doi.org/10.1007/bfb0054762

    DIMENSIONS

    https://app.dimensions.ai/details/publication/pub.1043812027



    JSON-LD is the canonical representation for SciGraph data.


    [
      {
        "@context": "https://springernature.github.io/scigraph/jsonld/sgcontext.json", 
        "about": [
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/0801", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Artificial Intelligence and Image Processing", 
            "type": "DefinedTerm"
          }, 
          {
            "id": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/08", 
            "inDefinedTermSet": "http://purl.org/au-research/vocabulary/anzsrc-for/2008/", 
            "name": "Information and Computing Sciences", 
            "type": "DefinedTerm"
          }
        ], 
        "author": [
          {
            "affiliation": {
              "alternateName": "University of East Anglia", 
              "id": "https://www.grid.ac/institutes/grid.8273.e", 
              "name": [
                "School of Information Systems, University of East Anglia, NR4 7TJ\u00a0Norwich, UK"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Matthews", 
            "givenName": "Iain", 
            "id": "sg:person.0634154656.41", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0634154656.41"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of East Anglia", 
              "id": "https://www.grid.ac/institutes/grid.8273.e", 
              "name": [
                "School of Information Systems, University of East Anglia, NR4 7TJ\u00a0Norwich, UK"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Bangham", 
            "givenName": "J. Andrew", 
            "id": "sg:person.015235013665.90", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015235013665.90"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of East Anglia", 
              "id": "https://www.grid.ac/institutes/grid.8273.e", 
              "name": [
                "School of Information Systems, University of East Anglia, NR4 7TJ\u00a0Norwich, UK"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Harvey", 
            "givenName": "Richard", 
            "id": "sg:person.013620351211.30", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013620351211.30"
            ], 
            "type": "Person"
          }, 
          {
            "affiliation": {
              "alternateName": "University of East Anglia", 
              "id": "https://www.grid.ac/institutes/grid.8273.e", 
              "name": [
                "School of Information Systems, University of East Anglia, NR4 7TJ\u00a0Norwich, UK"
              ], 
              "type": "Organization"
            }, 
            "familyName": "Cox", 
            "givenName": "Stephen", 
            "id": "sg:person.0605735320.01", 
            "sameAs": [
              "https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0605735320.01"
            ], 
            "type": "Person"
          }
        ], 
        "citation": [
          {
            "id": "https://doi.org/10.1016/0165-1684(94)90156-2", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1016029284"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/0262-8856(94)90060-4", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1020753425"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1117/12.243349", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1021082839"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1016/1047-3203(92)90028-r", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1021647737"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "sg:pub.10.1038/264746a0", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1047030157", 
              "https://doi.org/10.1038/264746a0"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/34.494642", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061156401"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/35.41402", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061159096"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1109/83.503918", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1061239460"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1121/1.1907309", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062273221"
            ], 
            "type": "CreativeWork"
          }, 
          {
            "id": "https://doi.org/10.1121/1.1908620", 
            "sameAs": [
              "https://app.dimensions.ai/details/publication/pub.1062274414"
            ], 
            "type": "CreativeWork"
          }
        ], 
        "datePublished": "1998", 
        "datePublishedReg": "1998-01-01", 
        "description": "Two quite different strategies for characterising mouth shapes for visual speech recognition (lipreading) are compared. The first strategy extracts the parameters required to fit an active shape model (ASM) to the outline of the lips. The second uses a feature derived from a one-dimensional multiscale spatial analysis (MSA) of the mouth region using a new processor derived from mathematical morphology and median filtering. With multispeaker trials, using image data only, the accuracy is 45% using MSA and 19% using ASM on a letters database. A digits database is simpler with accuracies of 77% and 77% respectively. These scores are significant since separate work has demonstrated that even quite low recognition accuracies in the vision channel can be combined with the audio system to give improved composite performance [16].", 
        "editor": [
          {
            "familyName": "Burkhardt", 
            "givenName": "Hans", 
            "type": "Person"
          }, 
          {
            "familyName": "Neumann", 
            "givenName": "Bernd", 
            "type": "Person"
          }
        ], 
        "genre": "chapter", 
        "id": "sg:pub.10.1007/bfb0054762", 
        "inLanguage": [
          "en"
        ], 
        "isAccessibleForFree": false, 
        "isPartOf": {
          "isbn": [
            "978-3-540-64613-6", 
            "978-3-540-69235-5"
          ], 
          "name": "Computer Vision \u2014 ECCV\u201998", 
          "type": "Book"
        }, 
        "name": "A comparison of active shape model and scale decomposition based features for visual speech recognition", 
        "pagination": "514-528", 
        "productId": [
          {
            "name": "doi", 
            "type": "PropertyValue", 
            "value": [
              "10.1007/bfb0054762"
            ]
          }, 
          {
            "name": "readcube_id", 
            "type": "PropertyValue", 
            "value": [
              "d0b19fb1d01cd70374e334fb072341e0c42a2ccae7781f8afdceedcc8e7694d1"
            ]
          }, 
          {
            "name": "dimensions_id", 
            "type": "PropertyValue", 
            "value": [
              "pub.1043812027"
            ]
          }
        ], 
        "publisher": {
          "location": "Berlin, Heidelberg", 
          "name": "Springer Berlin Heidelberg", 
          "type": "Organisation"
        }, 
        "sameAs": [
          "https://doi.org/10.1007/bfb0054762", 
          "https://app.dimensions.ai/details/publication/pub.1043812027"
        ], 
        "sdDataset": "chapters", 
        "sdDatePublished": "2019-04-15T13:30", 
        "sdLicense": "https://scigraph.springernature.com/explorer/license/", 
        "sdPublisher": {
          "name": "Springer Nature - SN SciGraph project", 
          "type": "Organization"
        }, 
        "sdSource": "s3://com-uberresearch-data-dimensions-target-20181106-alternative/cleanup/v134/2549eaecd7973599484d7c17b260dba0a4ecb94b/merge/v9/a6c9fde33151104705d4d7ff012ea9563521a3ce/jats-lookup/v90/0000000001_0000000264/records_8664_00000270.jsonl", 
        "type": "Chapter", 
        "url": "http://link.springer.com/10.1007/BFb0054762"
      }
    ]
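A record like the one above can be read with any JSON parser. The following is a minimal sketch, assuming the record is available as a string; the function name `summarize` and the `SAMPLE` text are illustrative only (`SAMPLE` is a heavily trimmed stand-in for the full record, with one citation id repeated on purpose). It pulls out the title, the author names in listed order, and the citation ids, dropping any id that appears more than once.

```python
import json


def summarize(records_json: str) -> dict:
    """Extract title, author names and unique citation ids from a SciGraph JSON-LD dump."""
    # The page wraps the single record in a one-element JSON array.
    record = json.loads(records_json)[0]
    authors = [f'{a["givenName"]} {a["familyName"]}' for a in record.get("author", [])]
    # dict.fromkeys keeps first-seen order while discarding repeated ids.
    citations = list(dict.fromkeys(c["id"] for c in record.get("citation", [])))
    return {"title": record["name"], "authors": authors, "citations": citations}


# Trimmed stand-in for the full record (assumption: only the keys used above are kept).
SAMPLE = """[{
  "name": "A comparison of active shape model and scale decomposition based features for visual speech recognition",
  "author": [
    {"givenName": "Iain", "familyName": "Matthews"},
    {"givenName": "J. Andrew", "familyName": "Bangham"}
  ],
  "citation": [
    {"id": "https://doi.org/10.1109/34.494642"},
    {"id": "https://doi.org/10.1109/34.494642"},
    {"id": "sg:pub.10.1038/264746a0"}
  ]
}]"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

Plain `json` is enough here because only the literal keys are read; a full JSON-LD processor would be needed to expand terms against the `@context`.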
     


    HOW TO GET THIS DATA PROGRAMMATICALLY:

    JSON-LD is a popular format for linked data which is fully compatible with JSON.

    curl -H 'Accept: application/ld+json' 'https://scigraph.springernature.com/pub.10.1007/bfb0054762'

    N-Triples is a line-based linked data format ideal for batch operations.

    curl -H 'Accept: application/n-triples' 'https://scigraph.springernature.com/pub.10.1007/bfb0054762'

    Turtle is a human-readable linked data format.

    curl -H 'Accept: text/turtle' 'https://scigraph.springernature.com/pub.10.1007/bfb0054762'

    RDF/XML is a standard XML format for linked data.

    curl -H 'Accept: application/rdf+xml' 'https://scigraph.springernature.com/pub.10.1007/bfb0054762'
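The four curl commands differ only in the `Accept` header, so the same content negotiation can be sketched in Python with the standard library. This is an illustration under stated assumptions, not an official client: `FORMATS`, `build_request` and `fetch` are hypothetical names, the media types and base URL are copied from the commands above, and this page does not confirm that the endpoint still responds.

```python
import urllib.request

# Media types copied from the curl commands; the short keys are illustrative labels.
FORMATS = {
    "json-ld": "application/ld+json",
    "n-triples": "application/n-triples",
    "turtle": "text/turtle",
    "rdf-xml": "application/rdf+xml",
}

BASE_URL = "https://scigraph.springernature.com/"


def build_request(record_id: str, fmt: str) -> urllib.request.Request:
    """Return an unsent request asking for `record_id` in the chosen serialization."""
    return urllib.request.Request(BASE_URL + record_id, headers={"Accept": FORMATS[fmt]})


def fetch(record_id: str, fmt: str, timeout: float = 30.0) -> str:
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(record_id, fmt), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    req = build_request("pub.10.1007/bfb0054762", "turtle")
    print(req.full_url, req.get_header("Accept"))
```

Separating `build_request` from `fetch` lets the header logic be checked without touching the network.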


     

    This table displays all metadata directly associated to this object as RDF triples.

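    122 TRIPLES      23 PREDICATES      37 URIs      20 LITERALS      8 BLANK NODES

    Row Subject Predicate Object
    (A row that omits the subject, or the subject and predicate, reuses the ones from the row above.)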
    1 sg:pub.10.1007/bfb0054762 schema:about anzsrc-for:08
    2 anzsrc-for:0801
    3 schema:author N77b150960e314e3483faeb5408486a56
    4 schema:citation sg:pub.10.1038/264746a0
    5 https://doi.org/10.1016/0165-1684(94)90156-2
    6 https://doi.org/10.1016/0262-8856(94)90060-4
    7 https://doi.org/10.1016/1047-3203(92)90028-r
    8 https://doi.org/10.1109/34.494642
    9 https://doi.org/10.1109/35.41402
    10 https://doi.org/10.1109/83.503918
    11 https://doi.org/10.1117/12.243349
    12 https://doi.org/10.1121/1.1907309
    13 https://doi.org/10.1121/1.1908620
    14 schema:datePublished 1998
    15 schema:datePublishedReg 1998-01-01
    16 schema:description Two quite different strategies for characterising mouth shapes for visual speech recognition (lipreading) are compared. The first strategy extracts the parameters required to fit an active shape model (ASM) to the outline of the lips. The second uses a feature derived from a one-dimensional multiscale spatial analysis (MSA) of the mouth region using a new processor derived from mathematical morphology and median filtering. With multispeaker trials, using image data only, the accuracy is 45% using MSA and 19% using ASM on a letters database. A digits database is simpler with accuracies of 77% and 77% respectively. These scores are significant since separate work has demonstrated that even quite low recognition accuracies in the vision channel can be combined with the audio system to give improved composite performance [16].
    17 schema:editor Nca3edd5ee1104117a33910fa0eede359
    18 schema:genre chapter
    19 schema:inLanguage en
    20 schema:isAccessibleForFree false
    21 schema:isPartOf N4effbb4f0eec411d8fcb4ce3a0c074ae
    22 schema:name A comparison of active shape model and scale decomposition based features for visual speech recognition
    23 schema:pagination 514-528
    24 schema:productId N11be6d7a16694baba40a30c698527169
    25 N3c1d6c1f82c24da69d8abc12cc1614bb
    26 N86fc788977614269ae88f1714803e508
    27 schema:publisher N7515f944a7354252830169071ebfc89d
    28 schema:sameAs https://app.dimensions.ai/details/publication/pub.1043812027
    29 https://doi.org/10.1007/bfb0054762
    30 schema:sdDatePublished 2019-04-15T13:30
    31 schema:sdLicense https://scigraph.springernature.com/explorer/license/
    32 schema:sdPublisher N39d86e14fd2f40388ee4b18a9c447559
    33 schema:url http://link.springer.com/10.1007/BFb0054762
    34 sgo:license sg:explorer/license/
    35 sgo:sdDataset chapters
    36 rdf:type schema:Chapter
    37 N11be6d7a16694baba40a30c698527169 schema:name readcube_id
    38 schema:value d0b19fb1d01cd70374e334fb072341e0c42a2ccae7781f8afdceedcc8e7694d1
    39 rdf:type schema:PropertyValue
    40 N28320d453a2540009081165e14863e8c rdf:first sg:person.013620351211.30
    41 rdf:rest N3e9f8065c1214ec8ab0c07e5e791cad8
    42 N39d86e14fd2f40388ee4b18a9c447559 schema:name Springer Nature - SN SciGraph project
    43 rdf:type schema:Organization
    44 N3c1d6c1f82c24da69d8abc12cc1614bb schema:name dimensions_id
    45 schema:value pub.1043812027
    46 rdf:type schema:PropertyValue
    47 N3e9f8065c1214ec8ab0c07e5e791cad8 rdf:first sg:person.0605735320.01
    48 rdf:rest rdf:nil
    49 N444691bf1a0f4605be090a052c7cb00c schema:familyName Burkhardt
    50 schema:givenName Hans
    51 rdf:type schema:Person
    52 N4effbb4f0eec411d8fcb4ce3a0c074ae schema:isbn 978-3-540-64613-6
    53 978-3-540-69235-5
    54 schema:name Computer Vision — ECCV’98
    55 rdf:type schema:Book
    56 N74e6734e399a4ef4b2c701e2b999e7f1 rdf:first N8f52d281124441d1bc18114b1b4db9d4
    57 rdf:rest rdf:nil
    58 N7515f944a7354252830169071ebfc89d schema:location Berlin, Heidelberg
    59 schema:name Springer Berlin Heidelberg
    60 rdf:type schema:Organisation
    61 N77b150960e314e3483faeb5408486a56 rdf:first sg:person.0634154656.41
    62 rdf:rest Ndce77247244942f5b1a7785163bad39e
    63 N86fc788977614269ae88f1714803e508 schema:name doi
    64 schema:value 10.1007/bfb0054762
    65 rdf:type schema:PropertyValue
    66 N8f52d281124441d1bc18114b1b4db9d4 schema:familyName Neumann
    67 schema:givenName Bernd
    68 rdf:type schema:Person
    69 Nca3edd5ee1104117a33910fa0eede359 rdf:first N444691bf1a0f4605be090a052c7cb00c
    70 rdf:rest N74e6734e399a4ef4b2c701e2b999e7f1
    71 Ndce77247244942f5b1a7785163bad39e rdf:first sg:person.015235013665.90
    72 rdf:rest N28320d453a2540009081165e14863e8c
    73 anzsrc-for:08 schema:inDefinedTermSet anzsrc-for:
    74 schema:name Information and Computing Sciences
    75 rdf:type schema:DefinedTerm
    76 anzsrc-for:0801 schema:inDefinedTermSet anzsrc-for:
    77 schema:name Artificial Intelligence and Image Processing
    78 rdf:type schema:DefinedTerm
    79 sg:person.013620351211.30 schema:affiliation https://www.grid.ac/institutes/grid.8273.e
    80 schema:familyName Harvey
    81 schema:givenName Richard
    82 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.013620351211.30
    83 rdf:type schema:Person
    84 sg:person.015235013665.90 schema:affiliation https://www.grid.ac/institutes/grid.8273.e
    85 schema:familyName Bangham
    86 schema:givenName J. Andrew
    87 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.015235013665.90
    88 rdf:type schema:Person
    89 sg:person.0605735320.01 schema:affiliation https://www.grid.ac/institutes/grid.8273.e
    90 schema:familyName Cox
    91 schema:givenName Stephen
    92 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0605735320.01
    93 rdf:type schema:Person
    94 sg:person.0634154656.41 schema:affiliation https://www.grid.ac/institutes/grid.8273.e
    95 schema:familyName Matthews
    96 schema:givenName Iain
    97 schema:sameAs https://app.dimensions.ai/discover/publication?and_facet_researcher=ur.0634154656.41
    98 rdf:type schema:Person
    99 sg:pub.10.1038/264746a0 schema:sameAs https://app.dimensions.ai/details/publication/pub.1047030157
    100 https://doi.org/10.1038/264746a0
    101 rdf:type schema:CreativeWork
    102 https://doi.org/10.1016/0165-1684(94)90156-2 schema:sameAs https://app.dimensions.ai/details/publication/pub.1016029284
    103 rdf:type schema:CreativeWork
    104 https://doi.org/10.1016/0262-8856(94)90060-4 schema:sameAs https://app.dimensions.ai/details/publication/pub.1020753425
    105 rdf:type schema:CreativeWork
    106 https://doi.org/10.1016/1047-3203(92)90028-r schema:sameAs https://app.dimensions.ai/details/publication/pub.1021647737
    107 rdf:type schema:CreativeWork
    108 https://doi.org/10.1109/34.494642 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061156401
    109 rdf:type schema:CreativeWork
    110 https://doi.org/10.1109/35.41402 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061159096
    111 rdf:type schema:CreativeWork
    112 https://doi.org/10.1109/83.503918 schema:sameAs https://app.dimensions.ai/details/publication/pub.1061239460
    113 rdf:type schema:CreativeWork
    114 https://doi.org/10.1117/12.243349 schema:sameAs https://app.dimensions.ai/details/publication/pub.1021082839
    115 rdf:type schema:CreativeWork
    116 https://doi.org/10.1121/1.1907309 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062273221
    117 rdf:type schema:CreativeWork
    118 https://doi.org/10.1121/1.1908620 schema:sameAs https://app.dimensions.ai/details/publication/pub.1062274414
    119 rdf:type schema:CreativeWork
    120 https://www.grid.ac/institutes/grid.8273.e schema:alternateName University of East Anglia
    121 schema:name School of Information Systems, University of East Anglia, NR4 7TJ Norwich, UK
    122 rdf:type schema:Organization
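In the table above, `schema:author` points at a blank node rather than at the people directly: the authors are stored as an RDF collection, a linked list of `rdf:first` / `rdf:rest` pairs that ends in `rdf:nil` (rows 61-62, 71-72, 40-41 and 47-48). A minimal sketch of recovering the author order from such triples follows; the function name `walk_rdf_list` and the tuple representation of triples are assumptions made for illustration, while the node ids are copied from the table.

```python
RDF_FIRST = "rdf:first"
RDF_REST = "rdf:rest"
RDF_NIL = "rdf:nil"


def walk_rdf_list(head: str, triples: list) -> list:
    """Follow rdf:first / rdf:rest links from `head` and return the items in order."""
    first = {s: o for s, p, o in triples if p == RDF_FIRST}
    rest = {s: o for s, p, o in triples if p == RDF_REST}
    items, node, visited = [], head, set()
    while node != RDF_NIL:
        if node in visited:  # guard against a malformed, cyclic list
            raise ValueError(f"cycle detected at {node}")
        visited.add(node)
        items.append(first[node])
        node = rest[node]
    return items


# Author-list triples copied from the table (rows 40-41, 47-48, 61-62, 71-72).
AUTHOR_TRIPLES = [
    ("N28320d453a2540009081165e14863e8c", RDF_FIRST, "sg:person.013620351211.30"),
    ("N28320d453a2540009081165e14863e8c", RDF_REST, "N3e9f8065c1214ec8ab0c07e5e791cad8"),
    ("N3e9f8065c1214ec8ab0c07e5e791cad8", RDF_FIRST, "sg:person.0605735320.01"),
    ("N3e9f8065c1214ec8ab0c07e5e791cad8", RDF_REST, RDF_NIL),
    ("N77b150960e314e3483faeb5408486a56", RDF_FIRST, "sg:person.0634154656.41"),
    ("N77b150960e314e3483faeb5408486a56", RDF_REST, "Ndce77247244942f5b1a7785163bad39e"),
    ("Ndce77247244942f5b1a7785163bad39e", RDF_FIRST, "sg:person.015235013665.90"),
    ("Ndce77247244942f5b1a7785163bad39e", RDF_REST, "N28320d453a2540009081165e14863e8c"),
]

if __name__ == "__main__":
    # The head node is the object of schema:author in row 3.
    print(walk_rdf_list("N77b150960e314e3483faeb5408486a56", AUTHOR_TRIPLES))
```

The resulting order (Matthews, Bangham, Harvey, Cox) matches the author line at the top of the page, which is why the list structure is used: plain repeated `schema:author` triples would not preserve order.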
     



