International Journal of Advanced Engineering, Management and Science (IJAEMS)
Peer-Reviewed Journal
ISSN: 2454-1311 | Vol-10, Issue-5; Jul-Aug, 2024
Journal Home Page: https://ijaems.com/
DOI: https://dx.doi.org/10.22161/ijaems.105.2

Classifying Emotional Engagement in Online Learning Via Deep Learning Architecture
Prisha Jain1, Chaya Ravindra2
1Neerja Modi School, India
Email: [email protected]
2Department of Engineering and Technology, REVA University, India

Email: [email protected]

Received: 08 May 2024; Received in revised form: 16 Jun 2024; Accepted: 24 Jun 2024; Available online: 02 Jul 2024

Abstract— The world has seen a phenomenal rise in online learning over the past decade, with universities shifting courses to online modes, MOOCs (Massive Open Online Courses) emerging, and laptop and tab-based initiatives being extensively promoted. However, educators face significant challenges in analyzing learning environments due to issues such as the lack of in-person cues and small video size. To address these challenges, it is crucial to analyze the engagement levels of online classes. Of the various subcategories of engagement, emotional engagement is often overlooked, yet it is integral to the analysis and among its most decisive components. In response, we developed a deep learning architecture to analyze emotional engagement in online classes. Our method utilizes a ResNet50-based algorithm, refined through experimentation with various techniques such as transfer learning, optimizers, and pre-trained weights. The model adds to the analysis of different algorithms used for engagement detection in academia while achieving 81.34% validation accuracy and 81.04% training accuracy. Unlike many other models, our approach employs high-quality image data for training, with the aim of more reliable results. Moreover, we constructed a novel framework for applying emotional engagement analysis to real-world scenarios, thus bridging the pre-existing gap between implementation and academia. The integration of this technology into online learning has immense potential and could bring with it a shift in the quality of education. By fostering a safe and healthy learning space for every student, we can significantly enhance the effectiveness of online education systems.
Keywords— deep learning, emotional engagement, engagement, framework, online learning, ResNet-50

I. INTRODUCTION

Education stands as a fundamental pillar of modern-day society, and one of the most influential developments in this field is online learning. Over the past decade, online learning has rapidly gained popularity and usage (Mukhopadhyay et al., 2020), with the COVID-19 pandemic greatly catalyzing its implementation into society (Gupta et al., 2022). For instance, many universities and institutes have shifted onto virtual platforms. MOOCs (Massive Open Online Courses) have emerged, dramatically changing the education landscape, with over 150,000 being available in 2023 (Pickard et al., 2023). Multiple laptop and tab-based initiatives have been promoted by schools and governments globally (Clarke & Svanaes, 2014; Fuhrman, 2014). While online learning is advantageous due to how ubiquitously and flexibly it can be used, along with the increased course variety it provides, it still lacks in many aspects, including teacher-student interaction and practical education provision (Das & Paris, 2022).

One key challenge with online classes is analyzing learning environments. This is due to multiple reasons, including the absence of non-verbal and in-person cues, the minuscule size of videos, which makes it impractical to assess students' reactions and teach simultaneously, and the necessity of muting student microphones, which hinders interactive feedback. Therefore, teachers tend to teach without a complete understanding of whether or not students are concentrating on and comprehending the material, as has been shown in multiple studies

This article can be downloaded from here: www.ijaems.com 63


©2024 The Author(s). Published by Infogain Publication, This work is licensed under a Creative Commons Attribution 4.0 License.
http://creativecommons.org/licenses/by/4.0/
Jain and Ravindra International Journal of Advanced Engineering, Management and Science, 10(5) -2024

(Sobieszczuk-Nowicka et al., 2018; Mashoedah et al., 2018). It is also difficult for educators to understand the class dynamics and environment in online modes. As a result, students’ emotional well-being cannot be catered to. In turn, since students' participation is highly impacted by the direct attention and support they obtain from teachers, students are prompted to leave the class or disengage from lessons (Azlan et al., 2020).

To initiate change, it is necessary to systematically analyze online classes. The principal approach for analyzing learning environments is to monitor student engagement levels. Engagement can be defined as “the interaction between the time, effort and other relevant resources invested by both students and their institutions to optimize the student experience while also enhancing the learning outcomes and development of students as well as the performance of the institution” (Trowler, 2010). There are multiple types or sub-categories of engagement within the educational setting. Researchers agree that cognitive, emotional and behavioral engagement are the most decisive. Cognitive engagement refers to the willingness and effort to grasp more difficult concepts and try challenging puzzles, behavioral engagement refers to concentration and attention on the material, and emotional engagement refers to the presence of positive emotions such as interest and enthusiasm with regard to the material being taught (Hasnine et al., 2023).

This paper has limited its scope to emotional engagement due to its comprehensiveness and significance, along with how difficult it has been to quantify in pre-existing frameworks. According to Patrick et al., the premise is simple: “the more emotionally involved students are with their environment while studying a subject, the more engaged they are, and the more support students get with managing their emotional states, the more they can pay attention in classes” (Patrick et al., 2007). In other words, student engagement is directly related to achievement (Skinner et al., 1998). Hence, it is crucial for achieving learning goals and receiving quality education.

Many methods are used to gauge emotional engagement. Traditionally, educators rely on quizzes and questionnaires at the end of sessions, but this approach is prone to demand characteristics and depends on the student’s own perspective (McCambridge et al., 2012). It also requires a lot of effort from both the students and the educators. Hence, automation has been brought into the limelight, significantly shifting the potential scope of emotional engagement analysis. Our research delves into the field of automated analysis through the usage of deep learning.

Deep Learning (DL) is a subset of machine learning that utilizes multi-layered neural networks, called deep neural networks, to imitate the intricate decision-making capability of the human brain. These deep neural networks are trained on vast amounts of data to enable them to identify phenomena, observe patterns in information, and make predictions and decisions. They only need to be trained once, however, after which they can efficiently be used for purposes ranging from medical diagnosis to voice-enabled machinery (Goodfellow et al., 2016). Many deep learning architectures exist. This paper focuses on ResNet-50, which was developed by Microsoft researchers in 2015. It was designed to enable better performance through its residual connections. Its name is derived from its characteristic feature of having 50 layers in its network.

One particular machine learning technique that we use in the study is transfer learning. Transfer learning can be defined as a method where a model trained on one task is used as the starting point for a model on a second task. By using the learned features from the first task, the model can work more efficiently and quickly even with a small amount of data (Ali et al., 2023).

Fig. I The process of transfer learning

Fig. II Deep learning architecture
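As a rough illustration of the residual connections mentioned above, the sketch below applies a toy transformation F to an input vector and adds the input back, so that the block outputs F(x) + x. It uses plain Python lists rather than real convolutional layers; `toy_layer` and `residual_block` are hypothetical names introduced here for illustration and are not part of ResNet-50 or of any library.

```python
from typing import Callable, List

Vector = List[float]


def toy_layer(x: Vector) -> Vector:
    """Stand-in for a learned transformation F(x): scale by 0.5, then ReLU."""
    return [max(0.0, 0.5 * value) for value in x]


def residual_block(x: Vector, layer: Callable[[Vector], Vector]) -> Vector:
    """Return F(x) + x, i.e. the layer output plus the skip connection."""
    transformed = layer(x)
    if len(transformed) != len(x):
        raise ValueError("the skip connection needs matching shapes")
    return [fx + xi for fx, xi in zip(transformed, x)]


if __name__ == "__main__":
    print(residual_block([2.0, -4.0, 0.0], toy_layer))
```

Because the input is added back unchanged, a block whose transformation outputs zeros simply passes its input through, which is one reason very deep residual networks remain trainable.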


The key contributions of this paper are:
• This paper proposes a model that has been trained to detect the emotional engagement levels of students in real time, with 81.34% validation accuracy and 81.01% test accuracy.
• This paper adds to the body of research done in this field by methodically experimenting with 4+ datasets, 3+ algorithms, and a wide range of machine learning techniques to determine which is more lightweight and yields better results, along with the learning rate and epoch number at which it does so.
• The model uses high-quality data, a feature of datasets that is rarely seen in research in this field.
• This paper also aims to provide a modified framework that prioritizes privacy by analyzing student videos on their own devices and provides visual, easy-to-navigate graphical summaries to educators. It will also enable a student support system to assist students with dire emotional states.

In terms of potential limitations in our research, a prevalent issue is the scarcity of available high-quality data, which reduces the accuracy of models and their ability to learn relevant features. Moreover, there may be biases due to deep learning models mirroring the innate biases of the training data. For example, cultural accessories such as bindis and headscarves may not be properly identified by the model and hence may create discrepancies. The model may also have difficulty interpreting mixed emotions, since it is trained on artificially emotive images.

II. METHOD

This research was carried out on Google Colab with a T4 GPU, using the Python libraries Keras and TensorFlow. The dataset was uploaded to Google Drive, where file paths were used to reference the images and train the model on them. Initially, the employed system underwent training with the FER-2013 dataset, which contains 30,000+ images of people of different cultures and ages. However, due to low image quality and lack of color, the Facial Expressions Training Data was chosen instead. This dataset is a high-quality, coloured dataset consisting of 29,000+ images (96 by 96 pixels). It was taken from Kaggle, a public dataset publishing platform.

To pre-process the data, multiple steps were taken. The labeled data was first sorted into its respective emotion class folders, and split into training, validation and testing data in an 80-10-10 ratio. Training and validation data were shuffled to ensure random selection.

Fig. III Process of data cleaning
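The splitting step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' preprocessing script: the function name `split_dataset`, the fixed seed, and the idea of applying it to the list of file names in one emotion class folder are all introduced here for the example.

```python
import random
from typing import Dict, List, Sequence


def split_dataset(items: Sequence[str], seed: int = 42) -> Dict[str, List[str]]:
    """Shuffle items and split them 80/10/10 into train/validation/test."""
    shuffled = list(items)
    # A seeded generator keeps the shuffle reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    return {
        "train": shuffled[:n_train],
        "validation": shuffled[n_train:n_train + n_val],
        # Whatever remains after the first two slices becomes the test set.
        "test": shuffled[n_train + n_val:],
    }
```

Applying the function separately to each emotion class folder keeps the class proportions roughly equal across the three subsets.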

The employed CNN (Convolutional Neural Network) architecture was integral to this study. We experimented with MobileNet, ResNet-50 and EfficientNet, evaluating which would be better for the chosen objective. While all of them converged as epochs increased, ResNet-50 had the best overall performance since it gave higher accuracies even at smaller epochs. Additionally, transfer learning proved to be a crucial technique to increase the speed and accuracy of the model. We used a pre-trained ResNet50 model from Keras Applications.

To construct the architecture, we removed the fully connected layers at the top of the pre-trained models to enable customization of layers. 8 output classes were added, namely ‘Happy’, ‘Sad’, ‘Contempt’, ‘Surprised’, ‘Neutral’, ‘Fear’, and ‘Anger’.

In terms of the layers in the models, the functional transfer learning layers were followed by alternating flatten and dense layers. These dense layers were composed of 2048 neurons. For activation, ReLU was used to prevent gradients from saturating and hence address the issue of vanishing gradients. In the final layer, softmax was used to output class probabilities, which helped training converge at a faster rate.

Moreover, the model weights pre-trained on the standard ImageNet dataset were used. These weights were locked into the models to ensure learned representations are not lost. After the convolutional layers, global average pooling was used to reduce the amount of computation required while retaining important features. In terms of optimizers, we initially implemented Adam, which is a standard method to help the model converge faster. However, upon analysis, we deemed SGD (Stochastic Gradient Descent) to be better suited due to how well it converged to more optimal solutions.

In this study, loss calculation was done through sparse categorical cross entropy. Because it works with integer class labels rather than one-hot vectors, it saves memory as well as computation compared with other methods. The key metric we used to measure the success of the model was training accuracy, which estimates the potential of a model.

Fig. IV Model structure summary
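The architecture described in this section can be approximated with the Keras API as in the sketch below. It is an illustration only and is not the authors' exact model: the function name `build_emotion_classifier`, the single dense layer, the learning rate of 0.01, and the 96 by 96 input shape (taken from the dataset description) are assumptions made here, and the default of 8 classes follows the count stated in the text.

```python
def build_emotion_classifier(num_classes: int = 8, input_shape=(96, 96, 3)):
    """Frozen ResNet50 base plus a small dense head (illustrative sketch)."""
    # Imported lazily so the sketch can be read or imported without TensorFlow.
    import tensorflow as tf

    base = tf.keras.applications.ResNet50(
        include_top=False,    # drop the original fully connected layers
        weights="imagenet",   # ImageNet pre-trained weights
        input_shape=input_shape,
    )
    base.trainable = False    # "lock" the pre-trained weights

    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(2048, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        # The learning rate is an assumed value; the paper does not state it here.
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model
```

Calling `build_emotion_classifier()` downloads the ImageNet weights on first use, so it needs TensorFlow installed and network access.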

III. RESULTS

i) Model

Fig. V: Confusion matrix

In summary, the confusion matrix indicates that the model has a high accuracy (80.8%) and performs well in terms of precision (88.9%) and recall (89.7%) for class 1. However, it has a relatively high false positive rate (92.6%) and a low false negative rate (10.3%).

The final model is an 8-layer sequential classification model, composed of a pre-trained layer along with alternating dense and flattening layers. The usage of high-quality data and experimentation with parameters has resulted in lightweight yet high performance structuring. In essence, this model analyses student expressions to accurately classify their emotional engagement states.

ii) Framework

This framework is designed to be an extension app in online learning platforms such as Zoom. Currently, the market does not host any such platforms, with the closest alternative being Engagement Hub, an extension on Zoom Marketplace that allows users to automatically transcribe and analyze meeting recordings. This lack of implementation may be a result of how restricted engagement analysis via deep learning architecture is to academia.
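Returning to the confusion matrix summary in subsection i), the per-class metrics quoted there follow standard definitions, which the sketch below computes from one-vs-rest counts. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper's confusion matrix; the function also assumes every denominator is non-zero.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Compute one-vs-rest metrics for a single class from its four counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": tp / (tp + fp),             # correct among predicted positives
        "recall": tp / (tp + fn),                # found among actual positives
        "false_positive_rate": fp / (fp + tn),   # negatives wrongly flagged
        "false_negative_rate": fn / (fn + tp),   # positives that were missed
    }


if __name__ == "__main__":
    # Made-up counts, used only to exercise the formulas.
    print(binary_metrics(tp=8, fp=2, fn=2, tn=8))
```

Recall and the false negative rate always sum to 1 under these definitions, which is consistent with the 89.7% and 10.3% reported for class 1.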


Fig. VI Process of emotional engagement analysis

Following are the steps of the devised framework-
• At the start of any session, an automated message will be displayed on all student devices to notify them that they are being recorded and analyzed. This will be similar to the pre-existing feature on Zoom that notifies participants when screen recording is turned on by a user. Through this feature, the privacy rights of students will be protected.
• The cycle of emotional engagement analysis will repeat in a set interval of time, for example, every 2 minutes. On each student’s device, their camera will be connected to the framework and a screenshot will be taken.
• Through a basic AI (Artificial Intelligence) algorithm, the student’s face will be detected. Then, facial features of the image will be extracted by mapping of facial points. For both face detection and feature extraction, the OpenCV library will be used, which provides ready-to-use methods with advanced capabilities.
• The extracted image will then be run through an emotional engagement detection model, where it will be pre-processed and then analyzed. Through methods like transfer learning, optimization, pooling layers, etc., the model is fine-tuned to accurately predict the emotion of the student.
• The student’s name is then extracted from their name label. The name and its associated emotion classification are encrypted and sent to the teacher’s device.
• At the teacher’s device, all data is decrypted and entered into an array. This process will run in the backend, where it can’t be accessed by the teacher.
• The emotion classification data of the array will then be used to generate a pie-graph. This will be an easy to read, understandable format for educators to quickly access and analyze. The graph will be available during screen sharing and be readily movable across the educator’s page. This will ensure ease and efficiency.
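The last two steps of the framework, collecting the decrypted (name, emotion) pairs into an array and summarizing them for a pie graph, can be sketched as below. This is a minimal illustration under stated assumptions: `emotion_shares` is a hypothetical helper, the records are assumed to be already decrypted, and the plotting itself is left out.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def emotion_shares(records: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Map each emotion to its percentage share across all students."""
    # Names are ignored here; only the emotion labels feed the summary.
    counts = Counter(emotion for _name, emotion in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {emotion: 100.0 * n / total for emotion, n in counts.items()}


if __name__ == "__main__":
    session = [("A", "Happy"), ("B", "Happy"), ("C", "Sad"), ("D", "Neutral")]
    print(emotion_shares(session))
```

The resulting dictionary holds one percentage per emotion, which is the input a pie-chart routine would need, and it exposes no student names to the educator.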


Fig. VII Process of student support

Following are the steps for student support in the devised framework-
• For long-term courses that engage with students for more than 3 sessions, educators can turn on settings to enable student support. Data arrays from each session will be automatically stored on the teacher’s device. This data will be encrypted to prevent privacy invasion. It will be loaded back onto the streaming platform architecture in the backend when the next session starts.
• The data arrays in the backend will be analyzed and if a student is flagged to have shown negative emotions such as ‘sad’, ‘angry’, ‘contempt’, ‘fear’, etc. repeatedly (i.e. more than 7 times in an average session of 30 minutes), their device will be contacted. An automated support message will be sent asking for their consent to take further action. Moreover, if all students show negative emotions consistently, this will be an indication to the educator to make their sessions more engaging.
• If the student gives consent, they will be prompted to take one or more of 3 actions-
  • They can contact their teacher or other trusted staff, with whom they can then share their concerns. This method would be best suited for issues with the learning style or course load.
  • They can contact professional therapists or psychologists. We will suggest trained experts they can reach out to. This method is suited for personal issues, such as mental health disorders, financial issues, health-related challenges, etc.
  • We can also collaborate with high-quality therapist AI bots. This would work best for students who have minor problems, and aren't willing to spend a lot or aren't comfortable with professional therapy.

IV. DISCUSSION

For the purpose of this study, we developed a ResNet50-based classification model, with the aim of analyzing different architectures, datasets, parameters, etc. to develop the most accurate and efficient version. This was accompanied by the construction of a framework which detailed the real-time process of image extraction, feature detection, emotion classification and data storage. The student support system differs from pre-existing research in its ability to actually use the emotion classification data to assist students who are struggling.

This study’s results are promising, both in terms of model analysis and framework development. The high training accuracy reflects that the model architecture and hyperparameters are well-suited to the task. The ResNet50 model is particularly noteworthy due to its performance and lightweight characteristics. Additionally, the features in the dataset are highly predictive of the target variable.

While the research objectives were met, it’s essential to consider the limitations as well. Due to the lack of available data, the potential of this model was stunted to some extent. With resources like more computational power, it could have had better performance. As predicted, the model may also have difficulties in real-world scenarios, where lighting, angles, accessories, etc. may distort faces in images and lead to inaccurate predictions. The training data may also be artificial in its expression of specific emotions, leading to disparity with real-life analysis scenarios, since students don't portray singular emotions in real life but rather have mixed emotions that the model may confuse.
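The flagging rule from the student support steps (more than 7 negative classifications in a session) can be sketched as follows. The names `NEGATIVE_EMOTIONS`, `FLAG_THRESHOLD` and `students_to_contact` are hypothetical, and the label spellings are assumed to match the model's class names; the paper's "etc." leaves the full set of negative emotions open.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed label spellings, following the class names listed in the Method section.
NEGATIVE_EMOTIONS = {"Sad", "Anger", "Contempt", "Fear"}
FLAG_THRESHOLD = 7  # a student is flagged when the count is strictly greater


def students_to_contact(
    records: Iterable[Tuple[str, str]], threshold: int = FLAG_THRESHOLD
) -> List[str]:
    """Return the names of students with more than `threshold` negative readings."""
    negative_counts = Counter(
        name for name, emotion in records if emotion in NEGATIVE_EMOTIONS
    )
    return sorted(name for name, n in negative_counts.items() if n > threshold)


if __name__ == "__main__":
    session = [("A", "Sad")] * 8 + [("B", "Fear")] * 7 + [("C", "Happy")] * 15
    print(students_to_contact(session))
```

In this made-up session only student A exceeds the threshold; B has exactly 7 negative readings and is therefore not contacted.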


The framework is well developed, combining the ideas of notable researchers with the creative addition of more unique features. It is also realistic in terms of implementation and ethically sound. While it prompts the shifting of academia into practical usage, limitations such as the lack of resources meant that this study was unable to fully implement the framework. Moreover, the immediate concern of users not being willing to share their data or to permit being recorded persists and can only be resolved with a change in ideology towards sharing data.

In response to these limitations, a strategic procedure is necessary for future developments. Most importantly, gathering sufficient resources is required, since only with more data and computational power can the model be made to classify all variations of emotive images. Data can furthermore be augmented to increase both the amount of data and the balance of the amount per class of emotions. This will also reduce the chances of overfitting. Researchers who aim to create an optimal solution should also specifically use datasets that have extracted data from online learning sessions, to ensure the training data is similar in characteristics to real-world emotive data in online classes. In terms of future steps with this research, this model exhibits noteworthy scalability. We can make the analysis mechanism more multifaceted, with inclusion of behavioral engagement analysis and chat analysis, hence improving not only the accuracy of the model but also the reliability of its analysis.

V. CONCLUSION

This study presents a ResNet50-based model for real-time emotion classification of students, achieving validation and test accuracies of 81.34% and 81.01%, respectively. The research evaluates various architectures, datasets, parameters, and machine learning techniques to optimize performance. It uniquely employs high-quality data and considers privacy by processing videos on student devices, offering visual summaries for educators and support for emotionally distressed students.

Limitations include limited data and computational power, potential inaccuracies in diverse real-world conditions, and reluctance from users to share data. Enhancing performance requires more data, improved computational resources, and augmenting datasets to balance emotional classes and prevent overfitting. The model has significant scalability potential. Future enhancements could include analyzing behavioral engagement and chat interactions, which would increase both the accuracy and reliability of the model’s emotional engagement analysis.

ACKNOWLEDGEMENTS

I would like to extend my sincere thanks to Ndeavors, which managed the communication and collaboration between the authors of this paper, and to Neerja Modi School, which sponsored and supported the paper.

REFERENCES
[1] Ali, A. H., Yaseen, M. G., Aljanabi, M., & Abed, S. A. (2023). Transfer learning: A new promising technique. Mesopotamian Journal of Big Data, 2023, 29-30.
[2] Azlan, C. A., Wong, J. H. D., Tan, L. K., Huri, M. S. N. A., Ung, N. M., Pallath, V., ... & Ng, K. H. (2020). Teaching and learning of postgraduate medical physics using Internet-based e-learning during the COVID-19 pandemic–A case study from Malaysia. Physica Medica, 80, 10-16.
[3] Das, I., & Paris, K. (2022). A Deep Learning-Based Approach for Adaptive Virtual Learning with Human Facial Emotion Detection. Journal of Student Research, 11(3).
[4] Goodfellow, Ian, Yoshua Bengio, and Aaron Courville. Deep learning. MIT press, 2016.
[5] Gupta, S., Kumar, P., & Tekchandani, R. K. (2022). Facial emotion recognition based real-time learner engagement detection system in online learning context using deep learning models. Multimedia Tools and Applications, 82(8), 11365–11394. https://doi.org/10.1007/s11042-022-13558-9
[6] Hasnine, M. N., Nguyen, H. T., Tran, T. T. T., Bui, H. T., Akçapınar, G., & Ueda, H. (2023). A real-time learning analytics dashboard for automatic detection of online learners’ affective states. Sensors, 23(9), 4243.
[7] M. Mashoedah, M. Hartmann, H. D. Surjono, Z. Zamroni. The Vocational High School Teachers’ Awareness Level and Implementation of the Students’ Learning Style Assessment. Jurnal Pendidikan Teknologi dan Kejuruan. 24, 91-101 (2018).
[8] McCambridge, J., De Bruin, M., & Witton, J. (2012). The effects of demand characteristics on research participant behaviours in non-laboratory settings: a systematic review. PloS one, 7(6), e39116.
[9] Mukhopadhyay, M., Pal, S., Nayyar, A., Pramanik, P. K. D., Dasgupta, N., & Choudhury, P. (2020). Facial emotion detection to assess learner’s state of mind in an online learning system (ICIIT’20). Association for Computing Machinery. https://doi.org/10.1145/3385209.3385231
[10] Patrick, H., Ryan, A. M., & Kaplan, A. (2007). Early adolescents' perceptions of the classroom social environment, motivational beliefs, and engagement. Journal of educational psychology, 99(1), 83.
[11] Pickard, L. (2024, April 30). Massive List of MOOC Platforms Around the World in 2024 — Class Central. The Report by Class Central. https://www.classcentral.com/report/mooc-platforms/
[12] Shah, D., Pickard, L., & Ma, R. (2022). Massive list of MOOC platforms around the world in 2023. Class Central’s MOOC Report.


[13] Skinner, E. A., Zimmer-Gembeck, M. J., Connell, J. P., Eccles, J. S., & Wellborn, J. G. (1998). Individual differences and the development of perceived control. Monographs of the society for Research in Child Development, i-231.
[14] Sobieszczuk-Nowicka, E., Rybska, E., Jarmużek, J., Adamiec, M., & Chyleńska, Z. (2018). Are We Aware of What Is Going on in a Student’s Mind? Understanding Wrong Answers about Plant Tropisms and Connection between Student’s Conceptions and Metacognition in Teacher and Learner Minds. Education Sciences, 8(4), 164.
[15] Trowler, V. (2010). Student engagement literature review. The higher education academy, 11(1), 1-15.
[16] A mobile initiative that’s more than just a tablet handout -- campus technology. (2014, March 20). Campus Technology. https://campustechnology.com/articles/2014/03/20/a-mobile-initiative-thats-more-than-just-a-tablet-handout.aspx
