Deep learning for cross-corpora based speech emotion recognition

Title: Deep learning for cross-corpora based speech emotion recognition
Institute: University of Twente (HMI)
Place: Enschede, The Netherlands
Type: Capita selecta and Research Topics
End date: not present
HMI Contact: Khiet Truong, Jaebok Kim


In speech emotion recognition, considerable effort has gone into collecting data and developing new algorithms that perform well under realistic conditions. We have five different corpora covering a large number of speakers and acoustic backgrounds. In this assignment, we aim to build speech emotion recognition models using these corpora and deep learning.

To do this, we need to address the following research questions:

1. How to capture underlying abstract features for aggregated corpora?
2. How to select sets of samples from aggregated corpora?
3. What is an optimal normalisation strategy across corpora?
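
As an illustration of the third question, one candidate strategy is to standardise each acoustic feature separately within each corpus, so that differences in recording conditions between corpora are reduced before the data are aggregated. The sketch below is only one possible baseline and not a prescribed method for this assignment. The function name `normalise_per_corpus`, the feature dimensions, and the synthetic data are assumptions made for illustration; the corpus labels merely reuse two of the corpus names.

```python
import numpy as np


def normalise_per_corpus(features, corpus_ids, eps=1e-8):
    """Z-normalise each feature column separately within each corpus.

    features   : array of shape (n_samples, n_features)
    corpus_ids : array of shape (n_samples,) naming the corpus of each sample
    eps        : small constant to avoid division by zero (assumed value)
    """
    features = np.asarray(features, dtype=float)
    corpus_ids = np.asarray(corpus_ids)
    normalised = np.empty_like(features)
    for corpus in np.unique(corpus_ids):
        mask = corpus_ids == corpus
        mean = features[mask].mean(axis=0)
        std = features[mask].std(axis=0)
        normalised[mask] = (features[mask] - mean) / (std + eps)
    return normalised


# Synthetic stand-in data: two "corpora" with different feature scales.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(5.0, 2.0, size=(100, 3)),
    rng.normal(-1.0, 0.5, size=(80, 3)),
])
ids = np.array(["EMODB"] * 100 + ["VAM"] * 80)

Z = normalise_per_corpus(X, ids)
```

After this step every corpus has approximately zero mean and unit variance per feature. Alternatives such as speaker-level or global normalisation can be compared against it by changing the grouping variable.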

For this research, feature extraction and modelling tools (including deep learning) will be provided, along with five well-known corpora: LDC, SEMAINE, EMODB, VAM, and ENTERFACE.


SCHULLER, Björn, et al. Cross-corpus acoustic emotion recognition: variances and strategies. Affective Computing, IEEE Transactions on, 2010, 1.2: 119-131.

DENG, Jun, et al. Sparse autoencoder-based feature transfer learning for speech emotion recognition. In: Affective Computing and Intelligent Interaction (ACII), 2013 Humaine Association Conference on. IEEE, 2013. p. 511-516.