Adsexp 1
Date:
EXPERIMENT NO.:01
Aim: Study and implement descriptive and inferential statistics on a given dataset.
Theory:
Statistics is the branch of mathematics that deals with collecting, analyzing, interpreting,
presenting, and organizing data. It involves the study of methods for gathering, summarizing,
and interpreting data to make informed decisions and draw meaningful conclusions.
Descriptive Statistics: Descriptive statistics is the analysis of data that helps to describe,
show, and summarize it in a meaningful way. It presents raw data in an effective and meaningful
form using numerical calculations, graphs, or tables. This type of statistics is applied to data
that is already known.
1. Measures of Central Tendency:
● Mode: The most frequently occurring value.
tips['day'].mode()
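As a minimal sketch of the common measures of central tendency (mean, median, and mode), the example below uses Python's standard statistics module on a small, hypothetical set of tips and days. The values are made up for illustration and are not taken from the tips dataset. With pandas, the equivalent calls on a column would be .mean(), .median(), and .mode().

```python
import statistics

# Hypothetical observations (illustrative only, not the real tips data)
tip_amounts = [2.0, 3.0, 3.0, 4.0, 8.0]
days = ["Sat", "Sun", "Sat", "Thur", "Sat"]

mean_tip = statistics.mean(tip_amounts)      # arithmetic average
median_tip = statistics.median(tip_amounts)  # middle value of the sorted data
mode_tip = statistics.mode(tip_amounts)      # most frequent numeric value
mode_day = statistics.mode(days)             # mode also works for categories

print("Mean tip:", mean_tip)
print("Median tip:", median_tip)
print("Mode tip:", mode_tip)
print("Mode day:", mode_day)
```

In this sample the single large tip pulls the mean (4.0) above the median (3.0), which is why the median is often preferred for skewed data.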
2. Measures of Spread:
● Range: The difference between the maximum and minimum values.
tips['tip'].max() - tips['tip'].min()
● Variance: The average of the squared differences from the mean. It is sensitive to outliers.
Note that pandas' var() returns the sample variance, which divides by n - 1 rather than n.
round(tips['tip'].var(),3)
● Standard Deviation: The square root of the variance. Provides a measure of data spread
in the same units as the original data.
round(tips['tip'].std(),3)
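The measures of spread above can be sketched without the tips dataset. The example below is an illustration on five hypothetical tip amounts, using Python's standard statistics module. It also shows the difference between the population variance (divide by n) and the sample variance (divide by n - 1), the latter being what pandas' var() and std() compute by default.

```python
import statistics

# Hypothetical tip amounts (illustrative only, not the real tips data)
tip_amounts = [1.0, 2.0, 3.0, 4.0, 5.0]

# Range: maximum minus minimum
tip_range = max(tip_amounts) - min(tip_amounts)

# Population variance: average of the squared deviations (divides by n)
pop_var = statistics.pvariance(tip_amounts)

# Sample variance: divides by n - 1, matching the pandas default (ddof=1)
sample_var = statistics.variance(tip_amounts)

# Sample standard deviation: square root of the sample variance
sample_std = statistics.stdev(tip_amounts)

print("Range:", tip_range)
print("Population variance:", pop_var)
print("Sample variance:", sample_var)
print("Sample standard deviation:", round(sample_std, 3))
```

For these values the range is 4.0, the population variance is 2.0, and the sample variance is 2.5. The gap between the two variances shrinks as the number of observations grows.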
Inferential Statistics: Inferential statistics uses a random sample of data taken from a
population to describe and make inferences about that population. A population is any group of
data that includes all the data you are interested in. Inferential statistics lets you make
predictions from a small sample instead of working with the whole population.
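To illustrate the idea of inferring a population value from a sample, the sketch below simulates a hypothetical population of student heights, draws a random sample of 100, and builds an approximate 95% confidence interval for the population mean. The population parameters (mean 170 cm, standard deviation 10 cm), the sample size, and the use of the normal approximation are all assumptions chosen for illustration.

```python
import random
import statistics
from statistics import NormalDist

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated heights in cm (assumed values)
population = [random.gauss(170, 10) for _ in range(10_000)]

# Draw a simple random sample of 100 observations without replacement
sample = random.sample(population, 100)

sample_mean = statistics.mean(sample)
std_error = statistics.stdev(sample) / len(sample) ** 0.5

# 95% confidence interval using the normal approximation
z_crit = NormalDist().inv_cdf(0.975)  # roughly 1.96
ci_low = sample_mean - z_crit * std_error
ci_high = sample_mean + z_crit * std_error

print(f"Population mean: {statistics.mean(population):.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
print(f"95% CI:          ({ci_low:.2f}, {ci_high:.2f})")
```

In practice the population mean is unknown. It is printed here only so that the estimate from the sample can be compared with the value it is trying to recover.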
1. Regression analysis: Quantifies how one variable changes in relation to another. Linear
regression is the most common type of regression used in inferential statistics.
2. Hypothesis Testing: A method to draw conclusions about a population parameter based
on sample data. It involves setting null and alternative hypotheses, determining a
significance level, and using the p-value to decide whether to reject the null hypothesis.
● Z-Test is mainly used to compare the means of two groups, or a sample mean to a
population mean, especially when the sample size is large and the population
standard deviation is known.
● ANOVA is used to compare the means of three or more groups and assess whether any
of them differ significantly. It is widely used in experimental designs
involving multiple groups.
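The three techniques listed above can be sketched on small hypothetical inputs. The example below fits a least-squares line, runs a one-sample two-sided Z-test, and computes a one-way ANOVA F statistic by hand. All data values, the 5% significance level, and the variable names are assumptions made for illustration. Only the standard library is used, so the calculations stay visible.

```python
from statistics import NormalDist, mean

# --- 1. Simple linear regression by least squares (hypothetical data) ---
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
x_bar, y_bar = mean(x), mean(y)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar

# --- 2. One-sample, two-sided Z-test (assumed summary values) ---
# H0: the population mean is 50; H1: it is not 50
sample_mean, pop_mean, pop_sd, n = 52.0, 50.0, 10.0, 100
z_stat = (sample_mean - pop_mean) / (pop_sd / n ** 0.5)
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
reject_null = p_value < 0.05  # assumed significance level of 5%

# --- 3. One-way ANOVA F statistic for three hypothetical groups ---
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
all_values = [v for g in groups for v in g]
grand_mean = mean(all_values)
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"Regression: y = {slope:.2f} * x + {intercept:.2f}")
print(f"Z-test: z = {z_stat:.2f}, p = {p_value:.4f}, reject H0: {reject_null}")
print(f"ANOVA: F = {f_stat:.2f} (df = {df_between}, {df_within})")
```

The F statistic alone does not give a decision. Its p-value comes from the F distribution, which scipy.stats.f_oneway(*groups) reports together with the statistic.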
| Aspect | Descriptive Statistics | Inferential Statistics |
|---|---|---|
| Scope | Deals with observed data (specific to the sample or population). | Deals with inferences and predictions about a larger population based on a sample. |
| Outcome | Provides summary measures and visual representations (e.g., mean, variance). | Draws conclusions, tests hypotheses, and makes predictions beyond the data (e.g., p-values, confidence intervals). |
| Key Methods | Mean, Median, Mode, Standard Deviation, Range, Histograms, Bar Charts | Hypothesis Testing (t-tests, chi-square tests), Confidence Intervals, Regression, ANOVA, Correlation. |
| Example | Calculating the average age of a group of students, creating a histogram of test scores | Using data from a sample to predict the average height of a population of students or testing if two groups have different average heights. |
| Data Focus | Describes the actual data collected | Makes inferences about a population or future events based on the sample. |
Program:
# Import necessary libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy import stats
from sklearn.datasets import load_iris
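# Load the Iris dataset into a DataFrame
# (assumed setup: this step is missing from the listing)
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
# Compute measures of central tendency for each feature
mean_values = df.mean()
median_values = df.median()
mode_values = df.mode().iloc[0]  # first mode of each column
# Display results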
print("Mean values:")
print(mean_values)
print("\nMedian values:")
print(median_values)
print("\nMode values:")
print(mode_values)
# Visualization
# Create subplots for each feature
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
Output:
Conclusion: Thus, we studied how descriptive statistics summarize and visualize data, while
inferential statistics make predictions and generalizations about a population based on sample
data.