Intro To Statistics and Assignments

The document discusses the role of statistics in data science, highlighting descriptive and inferential statistics. It details types of data, measures of central tendency, variability, and the importance of variance and standard deviation in understanding data spread. Additionally, it provides an example of creating age-based insurance premiums and assignments related to statistical calculations.

Uploaded by

ashwin150900014

Statistics:

In data science, statistics plays a pivotal role in data analysis and
decision-making. It involves gathering, analysing and visualizing data in order
to draw conclusions from it.
Descriptive statistics describe the main features of a data set, while inferential
statistics use sample data to draw conclusions about a larger population.

Types of Data:

Numerical Data:
• Discrete
E.g.:
1. No. of students in Data Science People click
2. No. of voters in RS Puram

• Continuous
E.g.: Height, Weight…

Categorical Data:
1. E.g.:
Gender: Male, Female

2. E.g.:
Problem Statement: Designing Age-Based Insurance Premiums for Families (INR)
Objective:
An insurance company wants to create family insurance plans with premiums
tailored to different age groups. The goal is to categorize family members by
age, determine appropriate premiums based on risk, and ensure affordability
while maintaining profitability.

Age Categories and Premiums (INR):

1. Children (1–12 years)
o Low medical risk: Basic healthcare needs such as vaccinations and
minor illnesses.
o Premium range: ₹1,000 – ₹2,000 per month.
2. Teens (13–19 years)
o Moderate risk: Higher likelihood of sports injuries, mental health
care.
o Premium range: ₹2,000 – ₹3,500 per month.
3. Middle-aged Adults (21–59 years)
o Higher risk: Lifestyle-related health issues such as hypertension
and diabetes.
o Premium range: ₹4,000 – ₹8,000 per month.
4. Old-aged Adults (60+ years)
o Highest risk: Chronic diseases, frequent hospital visits, critical care.
o Premium range: ₹10,000 – ₹15,000 per month.

Types of Data and Level of Measurements:

Descriptive statistics – Describes the data

Descriptive statistics summarize a data set's characteristics using measures
like mean, median, and standard deviation. They are limited to the data
collected and are used to present and summarize it.

Measure of Central Tendency:

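Mean:

1. E.g.: Age of people visiting the tuition centre
Age: {18, 20, 25, 20, 15, 10}

2. E.g.: Age of people visiting the tuition centre, with one outlier (75)
Age: {18, 20, 25, 75, 15, 10}

Step 1: Sum of all values
18 + 20 + 25 + 75 + 15 + 10 = 163

Step 2: Number of values
There are 6 values in the data set.

Step 3: Divide the sum by the number of values
Mean = 163 / 6 ≈ 27.17
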
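
The calculation above can be checked with a short Python sketch. It uses the
two age sets from the examples and the standard library's statistics module;
the variable names are illustrative only.

```python
from statistics import mean

# Age sets from the two tuition-centre examples
ages_typical = [18, 20, 25, 20, 15, 10]
ages_outlier = [18, 20, 25, 75, 15, 10]  # 20 replaced by the outlier 75

mean_typical = mean(ages_typical)  # 108 / 6
mean_outlier = mean(ages_outlier)  # 163 / 6

print(mean_typical)
print(round(mean_outlier, 2))
```

A single large value (75) pulls the mean from 18 up to about 27.17, which is
why the mean is said to be sensitive to outliers.
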
Median:
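
The median is the middle value once the data are sorted; with an even number
of values it is the average of the two middle values. As a minimal sketch,
here it is applied to the two age sets used for the mean (variable names are
illustrative):

```python
from statistics import median

ages_typical = [18, 20, 25, 20, 15, 10]
ages_outlier = [18, 20, 25, 75, 15, 10]

# Sorted: 10, 15, 18, 20, 20, 25 -> middle values are 18 and 20
median_typical = median(ages_typical)
# Sorted: 10, 15, 18, 20, 25, 75 -> middle values are still 18 and 20
median_outlier = median(ages_outlier)

print(median_typical, median_outlier)
```

Both sets have the same median (19), so the outlier 75 does not move it.
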

Mode:
Eg: Age of people attending Maths Tuition in RS Puram.
Age: {19,20,21,25,21,24,25,21,20,18,24,24}
The most common ages among people attending Maths tuition in RS Puram are 21
and 24. Each appears three times, so this data set has two modes (it is
bimodal).
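
A small Python sketch of the same count, using the standard library;
multimode is used rather than mode because this data set has more than one
most-frequent value.

```python
from collections import Counter
from statistics import multimode

ages = [19, 20, 21, 25, 21, 24, 25, 21, 20, 18, 24, 24]

# How many times each age occurs
counts = Counter(ages)

# All values that share the highest frequency
modes = multimode(ages)

print(counts[21], counts[24])
print(modes)
```
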
Measures of Variability/Dispersion (Spread):
Range:
1.Eg: Age of people visiting the tuition centre
Age: {18,20,25,20,15,10}
Range = Max – Min = 25 – 10 = 15
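
The same range calculation as a one-line Python sketch (the variable names
are illustrative):

```python
ages = [18, 20, 25, 20, 15, 10]

# Range is the largest value minus the smallest value
age_range = max(ages) - min(ages)

print(age_range)
```
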

Variance:
Variance and Standard Deviation
Variance and Standard Deviation are measures of how spread out a set of data
is around the mean.
• Variance measures the average squared deviation of each data point
from the mean.
• Standard Deviation is the square root of the variance, providing a
measure of spread in the same units as the data.
Normal Distribution (Overview):
Variance
• Definition: Variance is the average of the squared differences between
each data point and the mean of the dataset. It provides a measure of
how spread out the data is.
• Interpretation:
o Higher variance means the data points are more spread out from
the mean. This indicates greater variability in the dataset.
o Lower variance suggests the data points are closer to the mean,
showing less variability.
• Units: Variance is expressed in squared units of the original data, which
makes it harder to interpret directly in the context of the data.
Standard Deviation
• Definition: Standard deviation is the square root of the variance. It gives
a measure of the spread of data points around the mean, but in the
same units as the original data.
• Interpretation:
o Higher standard deviation indicates more variability in the
dataset.
o Lower standard deviation suggests that the data points are closer
to the mean.
• Units: Standard deviation is expressed in the same units as the data
itself, making it more interpretable than variance.
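
As a hedged worked example, the sketch below applies these definitions to the
age set {18, 20, 25, 20, 15, 10} used earlier. It computes the variance as the
average squared deviation (dividing by n, the population form described
above) and also shows the sample form, which divides by n - 1; which one to
use depends on whether the data are the whole population or a sample.

```python
from statistics import mean, pvariance, variance

ages = [18, 20, 25, 20, 15, 10]

mu = mean(ages)  # 18

# Squared deviation of each value from the mean: 0, 4, 49, 4, 9, 64
squared_deviations = [(x - mu) ** 2 for x in ages]

# Population variance: average of the squared deviations (divide by n)
population_variance = sum(squared_deviations) / len(ages)

# Standard deviation: square root of the variance, in years
population_std = population_variance ** 0.5

# Sample variance divides by n - 1 instead of n
sample_variance = variance(ages)

# Cross-check the hand calculation against the library function
assert abs(population_variance - pvariance(ages)) < 1e-9

print(round(population_variance, 2), round(population_std, 2), sample_variance)
```

The variance here is in "years squared", while the standard deviation is back
in years, which is what makes it easier to read against the data.
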
Comparison & Inferences:
• Direct comparison: Both measure the dispersion of data, but the
standard deviation is generally more useful for understanding the spread
in real-world terms because it's in the same units as the data.
• Understanding variability: If you know the standard deviation, you can
gauge how far data points typically lie from the mean.
• For data analysis: If you're comparing datasets with similar means, the
dataset with the higher standard deviation (or variance) will show more
fluctuation or unpredictability.
In essence, while variance tells you about the overall spread, standard
deviation provides a more intuitive understanding of that spread in the context
of the original data's units.
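
To illustrate the comparison of datasets with similar means, here is a
minimal sketch with two made-up data sets (the values are hypothetical, chosen
only so that both have a mean of 50):

```python
from statistics import mean, pstdev

# Hypothetical data: same mean, very different spread
steady = [48, 49, 50, 51, 52]
volatile = [10, 30, 50, 70, 90]

steady_mean, volatile_mean = mean(steady), mean(volatile)
steady_std, volatile_std = pstdev(steady), pstdev(volatile)

print(steady_mean, round(steady_std, 2))
print(volatile_mean, round(volatile_std, 2))
```

The means are identical, but the second data set's standard deviation is
about twenty times larger, so its values fluctuate far more around that mean.
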

Inferential statistics
Inferential statistics, on the other hand, deals with drawing conclusions
about an entire population from a small sample collected from it. For example,
to predict the result of an election, pollsters survey voters in various parts
of the state or country and record their opinions. Based on the information
collected from this sample, they make inferences to predict the result for
the entire population.
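
The polling idea can be sketched as a small simulation. Everything here is an
assumption made for illustration: the population size, the 55% true support
level, the sample size and the random seed are invented, and a real poll would
also need a carefully designed sampling method.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 1 = supports candidate A, 0 = does not
TRUE_SUPPORT = 0.55
population = [1 if rng.random() < TRUE_SUPPORT else 0 for _ in range(100_000)]

# Survey a random sample of 1,000 voters instead of asking everyone
sample = rng.sample(population, 1_000)

population_share = sum(population) / len(population)
sample_share = sum(sample) / len(sample)

print(f"population: {population_share:.3f}, sample estimate: {sample_share:.3f}")
```

The sample share will usually land close to the population share, which is the
basis for inferring a population-level result from a sample.
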
Assignments:
1)
Calculate the mean, median, mode, variance, standard deviation and range for
the problem statement given below.
The number of calls from motorists per day for roadside service was recorded
for a particular month:
28, 122, 217, 130, 120, 86, 80, 90, 140, 120, 70, 40, 145, 113, 90, 68, 174, 194,
170, 100, 75, 104, 97, 75, 123, 100, 75, 104, 97, 75, 123, 100, 89, 120, 109.

2)
a) Find the mean, median, mode, variance, standard deviation, and total for
Annual Income.

b) Syntax: Filtered_data = data[data['Age'] <= 20]

Keeping only the Age, Gender and Annual Income (k$) columns, filter the rows
where Age is less than or equal to 20.

c) Keeping only the Age, Gender and Annual Income (k$) columns, filter the
rows where Age is less than or equal to 40.

d) Follow the syntax given above to filter the rows where Age is above 20 and
below 40, again keeping only the Age, Gender and Annual Income (k$) columns.

Hint: Use (&) AND Operator
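
As a hedged illustration of the hint, the sketch below combines two conditions
with & on a tiny made-up DataFrame. The rows are hypothetical stand-ins for
the assignment file; only the column names Age, Gender and Annual Income (k$)
come from the assignment text. Each condition needs its own parentheses
because & binds more tightly than the comparison operators.

```python
import pandas as pd

# Hypothetical rows standing in for the real CSV data
data = pd.DataFrame({
    "Age": [19, 21, 35, 40, 52],
    "Gender": ["Male", "Female", "Female", "Male", "Female"],
    "Annual Income (k$)": [15, 16, 28, 54, 60],
})

columns = ["Age", "Gender", "Annual Income (k$)"]

# Two conditions joined with & (element-wise AND)
mask = (data["Age"] > 20) & (data["Age"] < 40)

# Keep only the matching rows and the three columns of interest
filtered_data = data.loc[mask, columns]

print(filtered_data)
```

With the real file, the DataFrame would instead be loaded with pd.read_csv
before applying the same mask.
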

Note: The file Customer Segmentation.csv has been shared on WhatsApp.
