Chapter 2

This chapter reviews the evolution of Human Activity Recognition (HAR) from traditional handcrafted methods to advanced deep learning techniques, highlighting the effectiveness of CNNs and transfer learning. Key findings indicate that ensemble methods combining diverse models yield improved accuracy and generalization, particularly in complex scenarios. The insights gained from this literature survey inform the design of a proposed HAR solution utilizing multiple deep learning architectures.

Uploaded by

211fa07087
Copyright
© All Rights Reserved

CHAPTER 2

LITERATURE SURVEY

Human Activity Recognition (HAR) has been a prominent research domain in computer vision,
particularly since the rise of deep learning. Traditional HAR techniques used handcrafted
features such as HOG (Histogram of Oriented Gradients) and SIFT (Scale-Invariant Feature
Transform), often paired with classifiers such as Support Vector Machines (SVMs). However,
these methods struggled with complex backgrounds and dynamic postures.
With the introduction of Convolutional Neural Networks (CNNs) and transfer learning, HAR
performance has improved substantially.

S.No | Authors (Year)      | Title                                          | Algorithms Used   | Advantages                  | Limitations
-----+---------------------+------------------------------------------------+-------------------+-----------------------------+----------------------------------
1    | Jain et al. (2020)  | HAR using CNN                                  | CNN               | High accuracy on RGB images | Limited to one architecture
2    | Kumar et al. (2021) | Activity Detection using VGG & LSTM            | VGG + LSTM        | Temporal content modeling   | Computationally expensive
3    | Zhao et al. (2022)  | EfficientNet for HAR                           | EfficientNet      | Lightweight and scalable    | Needs fine-tuning
4    | Singh et al. (2019) | Transfer Learning in HAR                       | ResNet, Inception | Utilizes pre-training       | May not generalize across datasets
5    | Roy et al. (2023)   | Ensemble Learning for Human Activity Detection | Ensemble Voting   | Improved accuracy           | Multiple models

Table 1: Comparison of surveyed papers

This chapter discusses the existing research approaches and highlights key findings that shaped
the direction of this project.

2.1 Existing Research

Several recent studies have investigated human activity recognition using image data:

CNN-Based Models: CNNs automatically learn spatial hierarchies from image data. Early
HAR models utilized simple CNNs to extract deep features and classify activities. While
effective, they often suffered from overfitting on small datasets.
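
As a rough illustration of what one stage of such a simple CNN computes, the sketch below chains a convolution, a ReLU, and a max-pooling step in plain Python. The function names, the toy image, and the hand-set edge kernel are all assumptions made for this example; in a trained CNN the kernel weights would be learned from data, and none of this code comes from the surveyed papers.

```python
# Minimal sketch of one CNN stage: convolution -> ReLU -> max pooling.
# All names and values are illustrative, not taken from the surveyed papers.

def conv2d(image, kernel):
    """Valid (no padding) 2D cross-correlation of a 2D list with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(fmap):
    """Element-wise ReLU: negative responses are clipped to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool2x2(fmap):
    """Non-overlapping 2x2 max pooling; an odd trailing row or column is dropped."""
    return [
        [max(fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1])
         for j in range(0, len(fmap[0]) - 1, 2)]
        for i in range(0, len(fmap) - 1, 2)
    ]

# Toy 5x5 "image" with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-set vertical-edge kernel (an assumption; real CNNs learn these weights).
kernel = [[-1, 1], [-1, 1]]

features = max_pool2x2(relu(conv2d(image, kernel)))
print(features)
```

Stacking several such stages is what produces the spatial hierarchy: later stages operate on the pooled feature maps of earlier ones rather than on raw pixels.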

Transfer Learning with Pretrained Models: Researchers have employed models such as
VGG16, ResNet50, InceptionV3, and MobileNetV2, pretrained on ImageNet, for HAR tasks.
These models provide high-level features that generalize well across domains:

• VGG16/VGG19 offer deep, structured layers well suited to static posture classification.
• ResNet50 introduces skip connections, which help mitigate the vanishing gradient problem.
• InceptionV3 captures multi-scale features efficiently.
• EfficientNetB7 balances model accuracy and parameter size through compound scaling.
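
The core idea of transfer learning described above can be sketched without any deep learning library: a feature extractor stays frozen while only a small classification head is trained. In the sketch below, `frozen_features` is a hypothetical stand-in for a pretrained backbone, and the dataset, learning rate, and epoch count are made-up values chosen only so the example runs; it is not the training procedure used in any of the surveyed papers.

```python
import math

def frozen_features(x):
    """Hypothetical stand-in for a pretrained backbone; it is never updated."""
    return [x[0] + x[1], x[0] - x[1]]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(samples, labels, epochs=200, lr=0.5):
    """Fit only a logistic-regression head on top of the frozen features."""
    feats = [frozen_features(x) for x in samples]  # computed once: backbone is frozen
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            p = sigmoid(w[0] * f[0] + w[1] * f[1] + b)
            err = p - y  # gradient of the log loss with respect to the logit
            w = [w[k] - lr * err * f[k] for k in range(2)]
            b -= lr * err
    return w, b

def predict(x, w, b):
    f = frozen_features(x)
    return 1 if sigmoid(w[0] * f[0] + w[1] * f[1] + b) >= 0.5 else 0

# Tiny made-up dataset: label 1 when the first coordinate exceeds the second.
samples = [(2.0, 0.0), (3.0, 1.0), (0.0, 2.0), (1.0, 3.0)]
labels = [1, 1, 0, 0]

w, b = train_head(samples, labels)
predictions = [predict(x, w, b) for x in samples]
print(predictions)
```

Because the backbone's outputs never change, they can be computed once and cached, which is one reason transfer learning tends to train quickly on small datasets.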

Ensemble Learning: Combining multiple models using ensemble techniques (averaging,
majority voting, or stacking) has been shown to improve prediction accuracy. Ensembles are
particularly effective when models are diverse and uncorrelated.
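
Two of the combination rules named above, averaging and majority voting, can be sketched in a few lines. The function names, the class names, and the probability values below are assumptions invented for illustration; stacking is not shown because it requires training an additional meta-model.

```python
from collections import Counter

def soft_vote(prob_lists):
    """Average per-class probabilities across models; return (argmax index, averages)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: avg[c]), avg

def majority_vote(votes):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(votes).most_common(1)[0][0]

# Hypothetical class names and per-model softmax outputs for one image.
classes = ["sitting", "running", "clapping"]
model_probs = [
    [0.60, 0.30, 0.10],  # model A
    [0.40, 0.50, 0.10],  # model B
    [0.45, 0.50, 0.05],  # model C
]

soft_idx, avg_probs = soft_vote(model_probs)
hard_votes = [max(range(len(p)), key=lambda c: p[c]) for p in model_probs]
hard_idx = majority_vote(hard_votes)

print("averaging picks:", classes[soft_idx])
print("majority voting picks:", classes[hard_idx])
```

In this made-up case the two rules disagree: averaging favors the class that one model is very confident about, while majority voting follows the two models that narrowly prefer another class. Which rule works better in practice is an empirical question for a given dataset.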

Real-Time and Video-Based HAR: Some studies extend HAR to video sequences using 3D
CNNs or LSTM networks. However, these require temporal data and high computational
power, unlike still image-based classification.
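
A back-of-the-envelope count gives a sense of the extra cost of video models. The sketch below compares one 2D convolution layer applied to a single image with one 3D convolution layer applied to a clip, assuming stride 1, "same" padding, and no bias. The layer sizes (112x112 feature maps, 64 channels, 3x3 kernels, 16 frames) are arbitrary values picked for the example, not figures from the surveyed papers.

```python
def conv2d_cost(h, w, c_in, c_out, k):
    """Weights and multiply-accumulates for one k x k conv layer on an h x w map."""
    weights = k * k * c_in * c_out
    macs = h * w * weights  # one full kernel application per output position
    return weights, macs

def conv3d_cost(t, h, w, c_in, c_out, k):
    """Same count for a k x k x k conv layer on a clip of t frames."""
    weights = k * k * k * c_in * c_out
    macs = t * h * w * weights
    return weights, macs

# Arbitrary example sizes (assumptions for illustration only).
w2, m2 = conv2d_cost(112, 112, 64, 64, 3)
w3, m3 = conv3d_cost(16, 112, 112, 64, 64, 3)

print("weight ratio 3D/2D:", w3 / w2)
print("MAC ratio 3D/2D:", m3 / m2)
```

Under these assumptions the 3D layer has k times as many weights and t times k as many multiply-accumulates as its 2D counterpart, which is consistent with the higher computational demand noted above.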

2.2 Key Findings

• Deep learning models significantly outperform traditional feature-based methods in
HAR tasks.
• Transfer learning using pretrained models such as VGG16, ResNet50, and InceptionV3
speeds up training and yields better generalization on small datasets.
• Residual and multi-scale feature architectures (like ResNet and Inception) capture
complex activity features better than simple CNNs.
• Lightweight architectures (e.g., MobileNetV2) are suitable for deployment on mobile
or real-time applications, although they may show slightly reduced accuracy.
• Ensemble methods combining multiple model predictions yield higher accuracy and
stability, especially in datasets with high variability in poses and environments.
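
A standard idealized argument suggests why averaging improves stability and why model diversity matters. Assume (as a simplification, not a result from the surveyed papers) that each of the M models makes an error \epsilon_i with zero mean, common variance \sigma^2, and pairwise correlation \rho. The variance of the averaged error is then:

```latex
\operatorname{Var}\!\left(\frac{1}{M}\sum_{i=1}^{M}\epsilon_i\right)
  = \frac{1}{M^{2}}\left(M\sigma^{2} + M(M-1)\rho\,\sigma^{2}\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{M}\,\sigma^{2}
```

The second term shrinks as more models are added, while the first term does not. Under these assumptions, adding models only helps to the extent that their errors are weakly correlated.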

2.3 Summary

This literature review highlights the evolution from handcrafted feature methods to modern
deep learning-based techniques for human activity recognition. The surveyed studies
emphasize that no single model is universally optimal; hence, an ensemble of carefully selected
models can provide a more robust and accurate HAR system. These insights strongly influenced
the design of our proposed solution, where multiple deep learning architectures were trained
and integrated into an ensemble framework to enhance performance and reliability.
