
EISystems Report On Email Spam Detection (SRI HARI)

The internship report details the development of an email spam detection system using the Multinomial Naive Bayes algorithm, highlighting the skills acquired in Python programming and machine learning. The project successfully classifies emails as spam or ham, providing a user-friendly web interface for real-time predictions. The report includes an overview of EISystems Services, the organization behind the internship, and outlines the learning objectives and outcomes achieved during the program.

Uploaded by

Srihari Pulipati

Internship Report

On
Email Spam Detection using Multinomial Naive Bayes

Submitted by
[PULIPATI SRI HARI]
[University Roll No:- 22195A0506]
[College Name:- JNTUA College of Engineering Pulivendula]

Submitted to
Mallika Srivastava
Head, Training Delivery
EISystems Services

&

Mayur Dev Sewak
Head, Internships & Trainings
EISystems Services

EISYSTEMS SERVICES
FF-110, Express Greens Plaza, Sector 1
Vaishali – Delhi NCR – India 201010
W: www.eisystems.in | E: [email protected] | T: +91 92122-51000
Student’s Declaration
I, PULIPATI SRI HARI, a student of the BTech program, Roll No. 22195A0506, of the
Department of CSE, JNTUA College of Engineering Pulivendula, do hereby
declare that I have completed the mandatory internship at EiSystems Technologies
under the faculty guidance of MALLIKA SRIVASTAVA, Head of the Department of
Training Delivery, EISystems Services.

P. Sri hari
06/05/2024
(Signature and Date)

Endorsements

SIGNATURE
[Mallika Srivastava]
[Head, Training Delivery]
[EISystems Services]

SIGNATURE
[Mayur Dev Sewak]
[Head, Internships & Trainings]
[EISystems Services]

Table of Contents

Serial No | Title of the Content
1 | Executive Summary
2 | Overview of Organization
3 | Project Summary
4 | Data Flow Diagram / Process Flow
5 | Code Reflecting the Process Flow of the Software Project
6 | Code / Program with Supported Screenshots
7 | Input / Output with Datasets & Supported Screenshots
8 | Images / Video Links
9 | References
10 | Student Self Evaluation of the Short-Term Internship
11 | Annexure 1 - Daily Activity Report
12 | Annexure 2 - Weekly Progress Report

List of Figures

Serial No | Image Caption
1 | Fig - DFD of Email Spam Detection
2 | Fig - Importing the required libraries
3 | Fig - Data Preparation
4 | Fig - Data Visualization using Pie Chart
5 | Fig - Data Transformation using LabelEncoder
6 | Fig - Fitting the Training data to MultinomialNB
7 | Fig - Predicting the Email type either spam or ham
8 | Fig - Importing Streamlit library and Loading the Pickle files
9 | Fig - Implementation of main() and applying some style
10 | Fig - Adding Title of the interface and components
11 | Fig - Email Spam Dataset
12 | Fig - User Interface of the model
13 | Fig - Predicting Message as not Spam (Ham)
14 | Fig - Predicting Message as Spam

List of Tables

Serial No | Table Name
1 | Student Self Evaluation Table
2 | Daily Activity Report Week-1
3 | Daily Activity Report Week-2
4 | Daily Activity Report Week-3
5 | Daily Activity Report Week-4
6 | Daily Activity Report Week-5
7 | Daily Activity Report Week-6
8 | Daily Activity Report Week-7
9 | Daily Activity Report Week-8
10 | Weekly Progress Report Table

Nomenclature / Notations (if any)


Serial No | Notation | Description
1 | Data Flow Diagram | The Data Flow Diagram (DFD) for the Email Spam Detection Model illustrates the flow of data from email input to prediction result through processes such as text preprocessing, feature extraction, model training, and prediction, facilitating a clear understanding of data flow and system operations.

Executive Summary
The EISystems Data Science internship equipped participants with a robust foundation in Python
programming and machine learning concepts, fostering practical skills through hands-on projects. The
internship aimed to achieve proficiency in data analysis, machine learning algorithms, and project
management, resulting in tangible learning outcomes and valuable experiences.
Learning Objectives:
1. Attain proficiency in Python programming for data manipulation, analysis, and visualization.
2. Understand core machine learning algorithms and their application in solving real-world problems.
3. Develop skills in data preprocessing, feature engineering, model training, and evaluation.
4. Enhance project management, collaboration, and communication skills within a team environment.
Learning Outcomes:
1. Mastered Python programming for data science tasks, including data cleaning, exploration, and
visualization using Pandas, NumPy, and Matplotlib.
2. Demonstrated proficiency in core machine learning algorithms, applying techniques such as
regression, classification, and ensemble methods to real-world datasets.
3. Implemented effective data preprocessing techniques, handling missing values, encoding
categorical variables, and scaling features.
4. Successfully managed and executed data science projects, from project planning to presentation of
results, within specified timelines.

Summary of Activities:
1. Engaged in comprehensive training sessions covering Python programming, data manipulation, and
machine learning concepts.
2. Completed hands-on exercises and assignments to reinforce learning and practical skills.
3. Worked collaboratively on real-world data science projects, conducting data preprocessing,
exploratory data analysis, and predictive modeling.
4. Presented project findings and results in team meetings, fostering collaboration and knowledge
sharing.
5. Pursued continuous learning through self-study and exploration of additional resources to deepen
understanding and expand skill set in data science and Python programming.

Overview of Organization
EISystems Services is a leading technology solutions provider offering software development, data
analytics, and digital transformation services. Committed to innovation and excellence, we empower
organizations worldwide with cutting-edge technology solutions. Our vision is to be a global leader in
technology, driving innovation and delivering transformative results. We value excellence, integrity,
collaboration, innovation, and customer focus. Through our internship program, we provide hands-on
learning experiences and mentorship to nurture talent and foster innovation.

1) Introduction of the Organization:


EISystems Services is a leading technology solutions provider specializing in software development, data
analytics, and digital transformation services. With a commitment to innovation and excellence, EISystems
has established itself as a trusted partner for organizations seeking to leverage technology to achieve their
business objectives. Our comprehensive suite of services encompasses a wide range of domains, including
robotics, edge computing, ethical hacking, cybersecurity, data science, machine learning, and artificial
intelligence.

2) Vision, Mission, and Values of the Organization:


Vision: To be a global leader in technology solutions, driving innovation and delivering transformative
results for our clients.
Mission: To empower organizations with cutting-edge technology solutions that optimize efficiency,
enhance competitiveness, and enable growth.
Values:
1. Excellence: Striving for the highest standards of quality, performance, and customer satisfaction in
everything we do.
2. Integrity: Upholding honesty, transparency, and ethical conduct in all our interactions and business
practices.
3. Collaboration: Fostering teamwork, communication, and mutual respect to achieve common goals
and shared success.
4. Innovation: Embracing creativity, curiosity, and continuous improvement to drive innovation and
stay ahead of the curve.
5. Customer Focus: Putting the needs and priorities of our clients first, and delivering value-added
solutions that exceed their expectations.
3) Policy of the Organization, in relation with the intern role:
EISystems Services is dedicated to providing interns with valuable learning experiences, mentorship, and
professional growth opportunities. Our program offers hands-on experience, exposure to real-world
projects, and guidance from industry experts. Interns actively participate in projects, collaborate with team
members, and contribute creativity to drive impactful outcomes. We foster a culture of learning, diversity,
and inclusion, enabling interns to develop skills, explore interests, and prepare for successful careers.

Project Summary
Idea Behind Making This Project:
The idea behind this project is to develop a machine learning model capable of distinguishing between
spam and non-spam (ham) emails. By leveraging the Multinomial Naive Bayes algorithm, the project aims
to create an effective spam detection system that can automatically filter out unwanted emails, saving
time and improving inbox management.

About Project:
The project involves building a spam email classifier using the Multinomial Naive Bayes algorithm. It utilizes
a dataset of labeled email messages, where each message is categorized as spam or ham. The model is
trained on features extracted from the text content of emails, allowing it to learn patterns and
characteristics associated with spam emails.
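
To make the idea concrete, the following is a minimal from-scratch sketch of how a multinomial Naive Bayes classifier can learn word-count patterns from labeled messages. It is an illustration only: the project itself relies on scikit-learn's MultinomialNB, and the toy messages, the function names (`train_nb`, `predict_nb`), the whitespace tokenizer, and the smoothing value `alpha=1.0` are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a message and split it on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_nb(messages, labels, alpha=1.0):
    """Collect the counts a multinomial Naive Bayes model needs."""
    class_counts = Counter(labels)        # number of messages per class
    word_counts = defaultdict(Counter)    # word frequencies per class
    for text, label in zip(messages, labels):
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return {"class_counts": class_counts, "word_counts": word_counts,
            "vocab": vocab, "alpha": alpha}


def predict_nb(model, text):
    """Return the class with the highest log posterior for one message."""
    total_msgs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    scores = {}
    for label, n_msgs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_msgs / total_msgs)   # log prior of the class
        for word in tokenize(text):
            if word not in model["vocab"]:
                continue                        # skip words never seen in training
            # Laplace-smoothed log likelihood of the word given the class
            score += math.log((counts[word] + model["alpha"])
                              / (total_words + model["alpha"] * vocab_size))
        scores[label] = score
    return max(scores, key=scores.get)


# Toy training data (hypothetical, not the report's dataset)
train_messages = [
    "win a free prize now",
    "free offer claim your reward",
    "are we meeting for lunch today",
    "see you at the class tomorrow",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train_nb(train_messages, train_labels)
print(predict_nb(model, "claim your free prize"))
print(predict_nb(model, "lunch at the class today"))
```

With these four toy messages the first query is scored as spam and the second as ham, because each query shares its words with only one of the two classes.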

Software Used in Project:

Python: Programming language used for data preprocessing, model training, and evaluation.
Streamlit: Web application framework used for building the user interface.
Scikit-learn: Python library used for implementing the Multinomial Naive Bayes classifier and other
machine learning functionalities.

Technical Apparatus Requirements Before Making This Project:

Python programming skills for data manipulation, analysis, and machine learning model implementation.
Familiarity with machine learning algorithms, particularly the Multinomial Naive Bayes algorithm.
Understanding of text preprocessing techniques, including tokenization, stemming, and vectorization.
Basic knowledge of web development for building the user interface using Streamlit.
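
The three text preprocessing steps named above (tokenization, stemming, vectorization) can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions: `simple_stem` is a toy suffix stripper written for this example rather than a real stemming algorithm, and the helper names and sample sentences are hypothetical. In the project itself, vectorization is handled by scikit-learn's CountVectorizer.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters/digits as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def simple_stem(token):
    """Toy stemmer: strip a few common English suffixes (not a real Porter stemmer)."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def build_vocabulary(documents):
    """Map every stemmed token seen in the documents to a column index."""
    stems = sorted({simple_stem(t) for doc in documents for t in tokenize(doc)})
    return {stem: index for index, stem in enumerate(stems)}


def vectorize(text, vocabulary):
    """Turn one message into a list of term counts, one entry per vocabulary word."""
    counts = Counter(simple_stem(t) for t in tokenize(text))
    return [counts.get(stem, 0) for stem in vocabulary]


docs = ["Winning prizes daily!", "He wins a prize"]
vocab = build_vocabulary(docs)
print(vocab)
print(vectorize("prizes, prizes and more prizes", vocab))
```

The toy stemmer maps "prizes" and "prize" to the same column, but it turns "winning" into "winn" and "wins" into "win", which shows why a proper stemmer or lemmatizer is normally preferred over hand-written suffix rules.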

Result or Working of Project:


The project successfully develops a spam email classifier capable of accurately identifying spam and non-
spam emails. Users can input email content into the web interface, and the model predicts whether the
email is spam or ham. The system provides real-time feedback, allowing users to make informed decisions
about handling incoming emails.

Research Done:
Research was conducted to explore various machine learning algorithms suitable for text classification
tasks, with a focus on the Multinomial Naive Bayes algorithm for spam detection. Experimentation was
carried out to optimize model performance through hyperparameter tuning and feature selection
techniques. Additionally, research was conducted on best practices for data preprocessing, including text
cleaning and feature engineering, to improve the effectiveness of the spam detection system.
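
For reference, the decision rule that Multinomial Naive Bayes applies can be written out as below. The notation is an assumption introduced here, not taken from the report: n_w is the count of word w in the message being classified, N_{c,w} is the count of w across training messages of class c, N_c is the total word count for class c, |V| is the vocabulary size, and alpha is the additive smoothing constant. The smoothing constant is the main hyperparameter of this model; in scikit-learn's MultinomialNB it is the `alpha` argument, which defaults to 1.0.

```latex
\hat{y} = \arg\max_{c \in \{\text{ham},\,\text{spam}\}}
  \left[ \log P(c) + \sum_{w \in V} n_w \log P(w \mid c) \right],
\qquad
P(w \mid c) = \frac{N_{c,w} + \alpha}{N_c + \alpha\,|V|}
```

A larger alpha pulls the per-word probabilities toward a uniform distribution, while a value close to zero makes the model trust the raw training counts more.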

Data Flow Diagram / Process Flow


Here's a Process Flow representation of the Email Spam Detection project:

User Input
Data Preprocessing (Text Cleaning, Feature Extraction)
Model Training (MultinomialNB)
Model Evaluation (Confusion Matrix, Classification Report)
New Email Input
Save Trained Model (Classifier.pkl, Vectorizer.pkl)
Load Trained Model (Classifier.pkl, Vectorizer.pkl)
Model Prediction
Output (Spam/Ham Classification)

Figure 1:- DFD of Email Spam Detection
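
The "Save Trained Model" and "Load Trained Model" stages in the diagram rely on Python's pickle module. The sketch below shows the round trip on a stand-in object. The dictionary used here and the file name `model.pkl` are hypothetical; in the project the pickled objects are the trained classifier and the fitted vectorizer.

```python
import os
import pickle
import tempfile

# Stand-in for a fitted model (hypothetical); the project pickles the
# trained classifier and the fitted vectorizer instead.
trained_artifact = {"vocabulary": {"free": 0, "prize": 1},
                    "class_labels": ["ham", "spam"]}

with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "model.pkl")   # hypothetical file name

    # "Save Trained Model": serialize the object to disk in binary mode
    with open(path, "wb") as f:
        pickle.dump(trained_artifact, f)

    # "Load Trained Model": read the same bytes back into a new object
    with open(path, "rb") as f:
        restored_artifact = pickle.load(f)

# The restored object is an equal but separate copy of the original
print(restored_artifact == trained_artifact)
```

Because unpickling can execute arbitrary code, pickle files should only be loaded from sources that are trusted.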

Code Reflecting the Process Flow of the Software Project:

PROGRAM :-

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import (accuracy_score, confusion_matrix, classification_report,
                             ConfusionMatrixDisplay)
import pickle

# Load the dataset and keep only the label and message columns
dataset = pd.read_csv('spam.csv', encoding='latin1')
dataset.drop(columns=['Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4'], inplace=True)
dataset.columns = ["Category", "Message"]
dataset.drop_duplicates(keep='first', inplace=True)

# Count the messages in each class
number_of_spam = dataset[dataset['Category'] == 'spam'].shape[0]
number_of_ham = dataset[dataset['Category'] == 'ham'].shape[0]

# Visualize the class balance as a pie chart
plt.figure(figsize=(15, 6))
mail_categories = [number_of_ham, number_of_spam]
labels = [f"Ham = {number_of_ham}", f"Spam = {number_of_spam}"]
explode = [.2, 0]
plt.pie(mail_categories, labels=labels, explode=explode, autopct="%.2f %%")
plt.title("Ham vs Spam")
plt.show()

# Encode the text labels as integers (ham -> 0, spam -> 1)
encoder = LabelEncoder()
dataset['spam'] = encoder.fit_transform(dataset['Category'])

# Split into training (70%) and test (30%) sets
x = dataset['Message']
y = dataset['spam']
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=42)

# Convert the training messages to word counts and fit the classifier
vectorizer = CountVectorizer()
x_train_counts = vectorizer.fit_transform(x_train)
classifier = MultinomialNB()
classifier.fit(x_train_counts, y_train)

# Save trained classifier to a pickle file
with open('classifier.pkl', 'wb') as f:
    pickle.dump(classifier, f)

# Save fitted CountVectorizer to a pickle file
with open('vectorizer.pkl', 'wb') as f:
    pickle.dump(vectorizer, f)

# Transform the test messages with the vocabulary learned from the training set
x_test_counts = vectorizer.transform(x_test)

# Display the confusion matrix
# (stored in `cm` so the imported confusion_matrix function is not shadowed)
y_pred = classifier.predict(x_test_counts)
cm = confusion_matrix(y_test, y_pred)
cm_display = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=["Ham", "Spam"])
cm_display.plot()
plt.show()

print(classification_report(y_test, y_pred))

# Classify a few unseen messages
emails = [
    "Hey jessica, I'm at the Ms.Salahshor class waiting for you, where are you?",
    'Upto 20% discount on parking, exclusive offer just for you. Dont miss this reward!',
    "Join us on Saturday, February 24 at 14:00 UTC on our YouTube channel to take this "
    "interactive lesson, taught by Tutor Darryl.",
]

emails_count = vectorizer.transform(emails)
print(emails_count)
print(classifier.predict(emails_count))
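
The evaluation step above prints a confusion matrix and a classification report. As a rough guide to how those numbers are derived, the sketch below computes the four confusion-matrix counts and the precision, recall, F1, and accuracy values by hand. The label lists are hypothetical and are not results from the project; the convention 0 = ham, 1 = spam follows from LabelEncoder assigning integer codes in sorted order of the class names.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives, treating `positive` as the spam label."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1, returning 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


# Hypothetical labels: 0 = ham, 1 = spam
y_true = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
accuracy = (tp + tn) / len(y_true)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} accuracy={accuracy:.2f}")
```

For a spam filter, a false positive (a ham message flagged as spam) is usually the more costly error, so precision on the spam class is worth checking alongside overall accuracy.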

Code / Program with Supported Screenshots


Figure 2:- Importing the required libraries

Figure 3:- Data Preparation

Figure 4:- Data visualization using Pie chart

Figure 5:- Data Transformation by LabelEncoder

Figure 6:- Fitting the Training data to MultinomialNB

Figure 7:- Predicting the Email type either spam or ham

CODE FOR USER INTERFACE OF THE MODEL

Figure 8:- Importing streamlit library and loading the pickle files

Figure 9:- Implementation of main() and applying some style

Figure 10:- Adding Title of the interface and Components

Input / Output with Datasets & Supported Screenshots

Input Dataset of Email Spam and Ham Messages:

Figure 11:- Email Spam Dataset

Output Interface of the Model:

Figure 12:- User Interface of the Model

Images / Video Links

Email Spam Detection interface predicting messages as Ham or Spam

Figure 13:- Predicting Message as not Spam (Ham)

Figure 14:- Predicting Message as Spam

VIDEO LINK :-
https://drive.google.com/file/d/1n9DTSmHNeamZhfzEYSfqkx86rMGTbhLq/view?usp=sharing

References

1. Dataset Source:
• Kaggle: https://www.kaggle.com/datasets
• UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/index.php
2. Research Papers:
• "Machine Learning Techniques in Spam Email Detection": Link to paper
• "Email Spam Filtering: A Review": Link to paper
3. Books:
• "Machine Learning Yearning" by Andrew Ng: Link to book
• "Natural Language Processing with Python" by Steven Bird, Ewan Klein, and Edward Loper: Link to book
4. Online Articles and Tutorials:
• Towards Data Science: https://towardsdatascience.com/
• Analytics Vidhya: https://www.analyticsvidhya.com/
• Medium: https://medium.com/
5. Official Documentation:
• Scikit-learn documentation for Multinomial Naive Bayes: Link to documentation
• Streamlit documentation for building web applications: Link to documentation
6. GitHub Repositories:
• GitHub: https://github.com/ (Search for email spam detection projects)

Student Self Evaluation of the Short-Term Internship


Please rate your performance in the following areas:

1) Oral communication 1 2 3 4 5

2) Written communication 1 2 3 4 5

3) Initiative 1 2 3 4 5

4) Interaction with staff 1 2 3 4 5

5) Attitude 1 2 3 4 5

6) Dependability 1 2 3 4 5

7) Ability to learn 1 2 3 4 5

8) Planning and organization 1 2 3 4 5

9) Professionalism 1 2 3 4 5

10) Creativity 1 2 3 4 5

11) Quality of work 1 2 3 4 5

12) Productivity 1 2 3 4 5

13) Progress of learning 1 2 3 4 5

14) Adaptability to organization’s culture/policies 1 2 3 4 5

15) OVERALL PERFORMANCE 1 2 3 4 5

Rating Scale: 5 will be Best while 1 will be Worst

P. Sri hari
Signature of the Student

Annexure 1
Daily Activity Report

WEEK - 1

Day & Date | Brief Description of Daily Activity | Learning Outcome | Person In-Charge
Day 1 | Introduction to Python basics | Understanding basic Python syntax and data types | Mallika Srivastava
Day 2 | Variables, data types, and operators | Familiarity with Python variables, data types, and operators | Mallika Srivastava
Day 3 | Control flow and loops | Knowledge of conditional statements and loops in Python | Mallika Srivastava
Day 4 | Lists, tuples, dictionaries | Understanding Python data structures like lists, tuples, and dictionaries | Mallika Srivastava
Day 5 | Functions and file handling | Learning to define functions and work with files in Python | Mallika Srivastava

WEEK - 2

Day & Date | Brief Description of Daily Activity | Learning Outcome | Person In-Charge
Day 1 | Introduction to Python functions | Understanding function syntax and usage in Python | Mallika Srivastava
Day 2 | Parameters, arguments, and return values | Familiarity with function parameters, arguments, and return values | Mallika Srivastava
Day 3 | Scope of variables and built-in modules | Knowledge of variable scope and Python built-in modules | Mallika Srivastava
Day 4 | Creating and using custom modules | Learning to create and import custom modules in Python | Mallika Srivastava
Day 5 | Error handling and exception handling | Understanding error types and how to handle exceptions in Python | Mallika Srivastava

WEEK - 3

Day & Date | Brief Description of Daily Activity | Learning Outcome | Person In-Charge
Day 1 | Introduction to Python libraries | Understanding the purpose and usage of Python libraries | Mallika Srivastava
Day 2 | NumPy and Pandas | Familiarity with NumPy arrays and Pandas dataframes | Mallika Srivastava
Day 3 | Matplotlib and Seaborn | Knowledge of data visualization with Matplotlib and Seaborn | Mallika Srivastava
Day 4 | Scikit-learn and TensorFlow | Learning about machine learning with Scikit-learn and TensorFlow | Mallika Srivastava
Day 5 | Introduction to web scraping with BeautifulSoup | Understanding web scraping techniques with BeautifulSoup | Mallika Srivastava

WEEK - 4

Day & Date | Brief Description of Daily Activity | Learning Outcome | Person In-Charge
Day 1 | Introduction to data visualization | Understanding the importance of data visualization | Mallika Srivastava
Day 2 | Basic plotting with Matplotlib | Familiarity with basic plotting techniques in Matplotlib | Mallika Srivastava
Day 3 | Advanced plotting with Seaborn | Knowledge of advanced data visualization techniques with Seaborn | Mallika Srivastava
Day 4 | Interactive visualization with Plotly | Learning to create interactive plots with Plotly | Mallika Srivastava
Day 5 | Dashboard creation with Streamlit | Understanding how to create interactive dashboards with Streamlit | Mallika Srivastava

WEEK - 5

Day & Date | Brief Description of Daily Activity | Learning Outcome | Person In-Charge
Day 1 | Introduction to machine learning | Understanding the fundamentals of machine learning | Mallika Srivastava
Day 2 | Supervised learning and unsupervised learning | Familiarity with different types of machine learning techniques | Mallika Srivastava
Day 3 | Model evaluation metrics | Knowledge of metrics used to evaluate machine learning models | Mallika Srivastava
Day 4 | Model selection and hyperparameter tuning | Learning to select appropriate models and tune hyperparameters | Mallika Srivastava
Day 5 | Model deployment considerations and techniques | Understanding techniques and considerations for deploying machine learning models | Mallika Srivastava

WEEK - 6

Day & Date | Brief Description of Daily Activity | Learning Outcome | Person In-Charge
Day 1 | Introduction to machine learning algorithms | Understanding various machine learning algorithms | Mallika Srivastava
Day 2 | Linear regression and logistic regression | Familiarity with linear regression and logistic regression algorithms | Mallika Srivastava
Day 3 | Decision trees and ensemble methods | Knowledge of decision tree algorithms and ensemble methods | Mallika Srivastava
Day 4 | Support vector machines and k-nearest neighbors | Learning about support vector machines and k-nearest neighbors algorithms | Mallika Srivastava
Day 5 | Clustering algorithms and dimensionality reduction techniques | Understanding clustering algorithms and dimensionality reduction techniques | Mallika Srivastava

WEEK - 7

Day & Date | Brief Description of Daily Activity | Learning Outcome | Person In-Charge
Day 1 | Understanding email data and preprocessing | Familiarity with email dataset and preprocessing techniques | Mallika Srivastava
Day 2 | Feature extraction and selection | Knowledge of extracting relevant features from email data | Mallika Srivastava
Day 3 | Model selection and evaluation | Understanding how to select and evaluate models for email spam detection | Mallika Srivastava
Day 4 | Hyperparameter tuning and performance optimization | Learning to tune hyperparameters and optimize model performance | Mallika Srivastava
Day 5 | Fine-tuning the model and finalizing the email spam detection model | Applying final adjustments and optimizations to the model | Mallika Srivastava

WEEK - 8

Day & Date | Brief Description of Daily Activity | Learning Outcome | Person In-Charge
Day 1 | Designing the user interface for the email spam detection app | Understanding user interface design principles | Mallika Srivastava
Day 2 | Developing the frontend components | Familiarity with frontend development tools and frameworks | Mallika Srivastava
Day 3 | Integrating the frontend with the backend | Knowledge of integrating frontend and backend components | Mallika Srivastava
Day 4 | Testing and debugging the application | Learning to identify and fix bugs in the application | Mallika Srivastava
Day 5 | Deploying the application on GitHub Pages | Understanding the deployment process and hosting the app | Mallika Srivastava

Annexure 2
Weekly Progress Report
Week No: ______ (1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16)

Week(s) | Summary of Weekly Activity
Week 1 | Introduction to Python basics, covering variables, data types, control flow, and functions.
Week 2 | Delving into Python functions and modules, including parameters, return values, and custom module creation.
Week 3 | Exploring Python libraries such as NumPy, Pandas, Matplotlib, and Seaborn for data manipulation and visualization.
Week 4 | Introduction to data visualization techniques using Matplotlib, Seaborn, and Plotly, and basic dashboard creation with Streamlit.
Week 5 | Understanding machine learning fundamentals, including supervised and unsupervised learning, model evaluation metrics, and deployment considerations.
Week 6 | Learning various machine learning algorithms like linear regression, logistic regression, decision trees, and ensemble methods.
Week 7 | Focus on email spam detection model development, involving data preprocessing, feature extraction, model selection, and hyperparameter tuning.
Week 8 | Developing the user interface for the email spam detection app and deploying it on GitHub Pages after integrating frontend and backend components.