Intern

The document outlines an internship review focused on a project analyzing global cost of living using K Means Clustering. The project aimed to identify cost patterns across regions, providing insights for policymakers and businesses. Key findings highlighted the influence of economic factors and geographic location on living costs, with the internship enhancing technical skills in data analysis and machine learning.

Department of Information Technology

Internship Review
on
A Cost of Living Analysis Using K-Means Clustering
At
Team Members :
•E. Amulya Priya
•Ch. Shashank
•D. Swapna
Table Of Contents

01 Internship Overview
02 Project Introduction
03 Methodology: K-Means Clustering
04 Implementation
05 Result and Key Findings
06 Technical Learnings and Observations

INTERNSHIP OVERVIEW

Brief introduction to the organization

Driven by a passion to address the evolving needs of today’s global workforce, AimR set out to create a platform where individuals can access top-tier training and skill development. Recognizing the dynamic nature of the global job market, AimR was born out of a desire to equip individuals with the tools necessary to succeed in …

Objectives of the internship

Apply the skills and knowledge gained throughout the internship to real-world projects. This includes participating in collaborative efforts, solving complex problems, and contributing to the development of innovative machine learning and NLP solutions, thereby enhancing both technical abilities and teamwork skills.

Your role and responsibilities

My role involved data collection, preprocessing for various datasets, implementing the K-Means clustering algorithm, and generating visualizations to communicate findings effectively to the team.

PROJECT OVERVIEW

01 Title of the project

The project was titled "Cost of Living Analysis using K-Means Clustering," aimed at examining global cost variations across regions.

02 Objective of the project

The aim was to analyze diverse global cost of living datasets to identify distinct cost patterns, thus clustering geographical areas effectively and enabling insights about economic conditions.

03 Significance of the project

This project holds significance as it assists in understanding economic disparities and cost challenges faced by various regions, informing both policymakers and stakeholders about potential areas for intervention.

Expected Outcomes

Analysis of global cost of living data

A comprehensive comparison of cost structures across different countries, providing a framework for assessing how various factors affect living expenses globally.

Insights from clustering regions

Identifying high and low-cost areas would enable businesses and policymakers to tailor strategies that address the needs of specific demographics more effectively.

Tools and Technologies

Python

Python served as the primary programming language due to its versatility and extensive libraries for data manipulation and analysis.

Pandas, Scikit-learn

Pandas facilitated data handling and manipulation, while Scikit-learn provided robust machine learning tools necessary for implementing the K-Means clustering technique.

Matplotlib and Seaborn

These visualization libraries were utilized to create informative graphs and plots that demonstrate clustering results and aid in interpreting data insights.

Steps Involved

01 Data collection

The data was sourced from reputable platforms like Kaggle, which aggregate diverse datasets on living costs, housing, and expenses across multiple regions.

02 Data preprocessing

The preprocessing phase involved cleaning the dataset, addressing missing values through imputation, and standardizing the data for accurate analysis.

03 Feature selection

Key features selected for the analysis included housing costs, transportation costs, and food prices, essential for accurately clustering living cost profiles.

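The preprocessing and feature selection steps could look roughly like the sketch below. It is only an illustration: the toy data, the column names (housing_cost, transport_cost, food_cost), and the choice of mean imputation are assumptions, not the project's actual dataset or settings.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data; the column names are assumptions, not the project's schema.
df = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "housing_cost": [1200.0, np.nan, 450.0, 800.0],
    "transport_cost": [90.0, 60.0, np.nan, 75.0],
    "food_cost": [400.0, 250.0, 180.0, 300.0],
})

# Feature selection: keep only the cost columns used for clustering.
feature_cols = ["housing_cost", "transport_cost", "food_cost"]

# Fill missing values with each column's mean (one common imputation choice).
imputed = SimpleImputer(strategy="mean").fit_transform(df[feature_cols])

# Standardize so every feature has zero mean and unit variance.
scaled = StandardScaler().fit_transform(imputed)

features = pd.DataFrame(scaled, columns=feature_cols, index=df["country"])
print(features.round(2))
```

Standardizing matters here because K-Means relies on Euclidean distance, so a feature measured in larger units (such as housing costs) would otherwise dominate the clustering.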
Algorithm Explanation

What is K-Means Clustering?

K-Means is a popular unsupervised learning algorithm used to partition a dataset into K distinct clusters based on feature similarity, minimizing the variance within each cluster.

Determining optimal clusters

The optimal number of clusters was identified using the Elbow Method, which involves plotting the within-cluster variance against the number of clusters to find the point where adding more clusters results in diminishing returns.

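A minimal sketch of the Elbow Method with Scikit-learn follows. It uses synthetic data from make_blobs as a stand-in for the standardized cost features, and the range of k and the final choice of 3 clusters are assumptions for this toy data, not the values used in the project.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for standardized cost features (3 features, 3 true groups).
X, _ = make_blobs(n_samples=150, centers=3, n_features=3, random_state=0)

# Fit K-Means for a range of k and record the inertia
# (within-cluster sum of squared distances to the nearest centroid).
k_values = list(range(1, 9))
inertias = []
for k in k_values:
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(model.inertia_)

for k, inertia in zip(k_values, inertias):
    print(f"k={k}: inertia={inertia:.1f}")

# After choosing k by inspecting where the curve flattens, fit the final model.
chosen_k = 3  # assumption for this synthetic data
labels = KMeans(n_clusters=chosen_k, n_init=10, random_state=0).fit_predict(X)
```

Plotting k_values against inertias (for example with Matplotlib) gives the elbow curve; the bend where the drop in inertia levels off suggests a reasonable number of clusters.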
VISUAL REPRESENTATIONS

Cost of Living Index Choropleth


KEY FINDINGS

Insights derived from analysis

Insights showed that economic factors, geographic location, and lifestyle choices significantly influence living costs, indicating the relevance of tailored housing and policy interventions.

Summary of cluster patterns

The clusters revealed distinct patterns, with one cluster representing high-cost regions like urban centers, while another highlighted areas with significantly lower living costs, primarily in rural or less developed areas.

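One way to arrive at a summary like this is to average each feature per cluster and label the clusters by overall cost. The sketch below assumes two clusters and uses made-up regions and values; it is not the project's data or its reported result.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical data: region names and values are illustrative only.
df = pd.DataFrame({
    "region": ["R1", "R2", "R3", "R4", "R5", "R6"],
    "housing_cost": [2000, 1900, 2100, 400, 450, 380],
    "transport_cost": [120, 110, 130, 40, 35, 45],
    "food_cost": [600, 580, 620, 200, 210, 190],
})
feature_cols = ["housing_cost", "transport_cost", "food_cost"]

# Cluster on standardized features.
X = StandardScaler().fit_transform(df[feature_cols])
df["cluster"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Average each feature per cluster, in the original units, to profile the groups.
profile = df.groupby("cluster")[feature_cols].mean()

# The cluster with the largest total average cost is labeled "high".
high_cost_cluster = profile.sum(axis=1).idxmax()
df["cost_level"] = df["cluster"].map(
    lambda c: "high" if c == high_cost_cluster else "low"
)

print(profile)
print(df[["region", "cost_level"]])
```

Cluster IDs from K-Means are arbitrary, so the labeling is derived from the per-cluster averages instead of assuming that cluster 0 or 1 is the high-cost group.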
Technical Learnings And Observations

Data preprocessing techniques

Gained expertise in data cleaning, normalization, and handling missing data, crucial for maintaining dataset integrity and ensuring accurate analysis outcomes.

Machine learning algorithms

Developed a strong understanding of clustering algorithms, specifically K-Means, along with the theory behind their functionality and applications in real-world problems.

Visualization techniques

Enhanced skills in data visualization methods using Matplotlib and Seaborn, which proved effective in illustrating complex data patterns and making the findings comprehensible.

OUTCOME OF THE INTERNSHIP

01 Application of classroom knowledge

Practical application of theoretical concepts learned in class provided a deeper understanding and appreciation for data science, reinforcing the relevance of education in real-world scenarios.

02 Influence on career plans

The internship solidified my interest in a data science career, particularly in economic analysis, where I can leverage data to inform policy and business decisions.

THANK YOU !
