Vol. 4 No. 4 (2025): October
RESEARCH ARTICLES

Emotion Recognition Using Multi-Scale Autoencoders with Cross-Session Adaptation

G ChennaKesava Reddy
Department of AI and Data Science, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India
P Reshma
Department of AI and Data Science, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India
T Vaishnavi
Department of AI and Data Science, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India
J Siva Shankar
Department of AI and Data Science, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India
N Venkata Sai
Department of AI and Data Science, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India
S Mohammed Mohid
Department of AI and Data Science, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India
T Bharath Kumar
Department of AI and Data Science, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India

Published 2025-04-20

Keywords

  • Multi-Scale Masked Autoencoders (MSMAE)
  • Electroencephalogram (EEG)
  • Dataset for Emotion Analysis using Physiological Signals
  • Long Short-Term Memory (LSTM)
  • Galvanic Skin Response (GSR)

How to Cite

G ChennaKesava Reddy, P Reshma, T Vaishnavi, J Siva Shankar, N Venkata Sai, S Mohammed Mohid, & T Bharath Kumar. (2025). Emotion Recognition Using Multi-Scale Autoencoders with Cross-Session Adaptation. International Journal of Computational Learning & Intelligence, 4(4), 706–715. https://doi.org/10.5281/zenodo.15251013

Abstract

Emotion recognition from EEG (electroencephalography) signals is a challenging yet promising area of research, with applications ranging from mental health monitoring to adaptive human-computer interaction. Traditional approaches, such as those based on Random Forest classifiers, have shown potential but often fall short of capturing the complex temporal and spatial patterns inherent in EEG data. In this study, we propose a novel framework that combines Multi-Scale Masked Autoencoders (MSMAE) with Convolutional Neural Networks (CNNs) for cross-session emotion recognition. Using the SEED-IV EEG dataset, our method leverages the multi-scale feature extraction of MSMAE to handle varying signal frequencies and the pattern recognition strength of CNNs to improve classification accuracy. The MSMAE framework pre-trains the CNN by reconstructing masked EEG signals at different scales, enabling it to learn robust, generalized features that transfer across sessions. Comparative evaluations show that the proposed MSMAE-CNN model significantly outperforms the existing Random Forest baseline, providing a more reliable and effective solution for emotion recognition in diverse and dynamic environments. This advancement not only highlights the potential of deep learning models for EEG-based emotion recognition but also sets a new benchmark for future research in this field.
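The abstract outlines the core mechanism: mask EEG windows at several temporal scales, pre-train an encoder-decoder to reconstruct the hidden samples, then reuse the encoder as the CNN backbone for emotion classification. The sketch below illustrates that pipeline in PyTorch; the channel count (62, matching the SEED-IV montage), mask ratio, patch scales, and layer sizes are illustrative assumptions, not the configuration reported in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleMaskedAutoencoder(nn.Module):
    """Masks EEG at several temporal scales and learns to reconstruct it."""

    def __init__(self, n_channels=62, mask_ratio=0.5, scales=(8, 16, 32)):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.scales = scales  # patch lengths, in samples, used for masking
        self.encoder = nn.Sequential(
            nn.Conv1d(n_channels, 64, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Conv1d(128, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, n_channels, kernel_size=7, padding=3),
        )

    def _mask(self, x, scale):
        # Zero a random subset of non-overlapping length-`scale` patches.
        b, _, t = x.shape
        keep = (torch.rand(b, t // scale, device=x.device) > self.mask_ratio).float()
        mask = keep.repeat_interleave(scale, dim=1)            # (b, (t//scale)*scale)
        mask = F.pad(mask, (0, t - mask.shape[1]), value=1.0)  # leave the tail visible
        return x * mask.unsqueeze(1), mask

    def forward(self, x):
        # Average the reconstruction loss over all masking scales,
        # scoring only the positions that were hidden from the encoder.
        loss = x.new_zeros(())
        for scale in self.scales:
            x_masked, mask = self._mask(x, scale)
            recon = self.decoder(self.encoder(x_masked))
            hidden = (1.0 - mask).unsqueeze(1)
            loss = loss + ((recon - x) ** 2 * hidden).sum() / hidden.sum().clamp(min=1.0)
        return loss / len(self.scales)

class EmotionClassifier(nn.Module):
    """Reuses the pretrained encoder as a CNN backbone plus a linear head."""

    def __init__(self, encoder, n_classes=4):  # SEED-IV labels four emotions
        super().__init__()
        self.encoder = encoder
        self.head = nn.Sequential(nn.AdaptiveAvgPool1d(1), nn.Flatten(),
                                  nn.Linear(128, n_classes))

    def forward(self, x):
        return self.head(self.encoder(x))

# Pre-train on EEG windows, then fine-tune the classification head.
mae = MultiScaleMaskedAutoencoder()
batch = torch.randn(8, 62, 256)   # 8 windows, 62 channels, 256 samples each
recon_loss = mae(batch)
recon_loss.backward()             # one pre-training step (optimizer omitted)
clf = EmotionClassifier(mae.encoder)
logits = clf(batch)               # (8, 4) emotion scores
```

In a cross-session protocol, the classifier would be trained on windows from some recording sessions and evaluated on a held-out session, which is where the generalized features learned during masked pre-training are meant to pay off.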
