Conference details

Authors: Lucas D. Terissi; Marianela Parodi; Juan Carlos Gómez.

Abstract: This paper proposes a visual speech classification scheme based on wavelets and Random Forests (RF). Wavelet analysis is used to represent the sequence of visual parameters, which may be either model-based or image-based features. The coefficients of these representations serve as features that model the visual information. Lip reading is then performed with these wavelet-based features and a Random Forests classifier. The scheme is evaluated on three isolated-word audio-visual databases, two of them public and the third compiled by the authors. Experimental results show that the proposed lip reading system achieves good performance on all three databases and outperforms other methods reported in the literature on the two public databases. The experiments on all three databases used the same configuration, i.e., neither the wavelet representation parameters nor the RF classifier parameters had to be adapted to each database.

Meeting type: Conference.

Type of work: Full paper.

Work: Lip reading using wavelet-based features and Random Forests classification.

Scientific meeting: 22nd International Conference on Pattern Recognition.

Location: Stockholm.

Published: Yes

Place of publication: Stockholm

Meeting month: 8

Year: 2014.
