Statistical Analysis Presentation
• The code follows this flow: after the user selects a distribution, the program prompts for the parameters appropriate to that distribution.
• It then asks which type of probability to calculate: P(X < x), P(X > x), or P(a < X < b).
• Finally, it returns the computed probability along with a plot in which the corresponding region is shaded (a minimal sketch of this flow is shown below).
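For a continuous distribution, this flow can be sketched as follows using NumPy, Matplotlib, and SciPy (the libraries the project already relies on); the `shaded_probability` helper and the standard-normal example are illustrative assumptions rather than the project's actual code.

```python
# Minimal sketch of the probability-and-shaded-region flow (illustrative, not the project's exact code).
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

def shaded_probability(dist, a=None, b=None):
    """Compute and plot P(a < X < b) for a frozen continuous SciPy distribution.

    Pass only b for P(X < x), only a for P(X > x), or both for P(a < X < b).
    """
    lo = a if a is not None else -np.inf
    hi = b if b is not None else np.inf
    prob = dist.cdf(hi) - dist.cdf(lo)

    # Draw the density curve and shade the requested region.
    x = np.linspace(dist.ppf(0.001), dist.ppf(0.999), 500)
    plt.plot(x, dist.pdf(x))
    mask = (x >= lo) & (x <= hi)
    plt.fill_between(x[mask], dist.pdf(x[mask]), alpha=0.4)
    plt.title(f"Shaded probability = {prob:.4f}")
    plt.show()
    return prob

# Example: P(X < 1.5) for a standard normal distribution.
shaded_probability(stats.norm(loc=0, scale=1), b=1.5)
```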
Usability and Key Features – Probability Calculations
This project’s first part combines all essential probability distributions in one place:
1. Unified Distribution Options: All types of distributions (normal, binomial, Poisson, etc.)
are available in a single platform.
2. Simplifies Code: No need to write separate code for each distribution; all calculations
and visualizations are integrated, saving time and reducing complexity (a sketch of this idea follows the list).
3. Automatic Visualizations: Visualizations with shaded areas make it easy to interpret
probabilities without consulting separate tables (e.g., normal or chi-square tables).
4. Academic Applications: Ideal for academic purposes, providing a ready-to-use tool for
understanding and exploring various probability distributions.
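For illustration, one way such a unified platform can map the user's choice to a distribution is a simple registry of SciPy distribution objects; the names and parameters below are assumptions, not the project's actual code.

```python
# Illustrative sketch of a unified distribution registry (assumed names and parameters).
from scipy import stats

DISTRIBUTIONS = {
    "normal":   lambda mu, sigma: stats.norm(loc=mu, scale=sigma),
    "binomial": lambda n, p: stats.binom(n=n, p=p),
    "poisson":  lambda lam: stats.poisson(mu=lam),
    "chi2":     lambda k: stats.chi2(df=k),
}

# One code path serves every distribution, e.g. P(X <= 3) under the chosen model:
dist = DISTRIBUTIONS["poisson"](2.5)
print(dist.cdf(3))
```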
Hypothesis Testing and Data Analysis
The second half of this project streamlines hypothesis testing by
automating the key steps in the process (a sketch of the full flow follows the list):
1. Data Input
User provides data either manually or by uploading a CSV file.
2. Column Selection
Specific columns are selected for hypothesis testing, allowing for targeted analysis.
3. Null Value Removal
Automatically detects and removes any null (missing) values to ensure clean, usable data.
4. Assumption Checks
Verifies essential assumptions (e.g., normality, homogeneity of variance) for the chosen test, ensuring validity.
5. Hypothesis Test Selection
User selects the appropriate test: one-sample t-test, two-sample t-test, or z-test, based on the research question.
6. Hypothesis Testing
The test is conducted, and the p-value is calculated.
7. Compare p-value with Significance Level
The p-value is compared to the significance level (α) to decide whether to reject or fail to reject the null hypothesis.
8. Confidence Interval Visualization
Displays a confidence interval around the sample mean or difference, providing a visual summary of the test results.
9. Results Interpretation
Presents a summary of the findings, including confidence intervals and the final statistical decision.
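Putting the nine steps together, a minimal sketch of the one-sample t-test path might look like the following; the file name, column name, hypothesised mean, and significance level are placeholder assumptions, not the project's actual code.

```python
# Hedged sketch of the hypothesis-testing pipeline (steps 1-9 above); names are placeholders.
import pandas as pd
from scipy import stats

alpha = 0.05                                        # significance level (assumed)

# 1-3: load the data, select a column, and drop missing values
data = pd.read_csv("data.csv")                      # hypothetical file
sample = data["measurement"].dropna()               # hypothetical column

# 4: assumption check (normality via Shapiro-Wilk)
_, normality_p = stats.shapiro(sample)

# 5-6: one-sample t-test against a hypothesised mean
mu0 = 50                                            # hypothetical null value
t_stat, p_value = stats.ttest_1samp(sample, popmean=mu0)

# 7: compare the p-value with the significance level
decision = "reject H0" if p_value < alpha else "fail to reject H0"

# 8: 95% confidence interval around the sample mean
ci = stats.t.interval(0.95, len(sample) - 1, loc=sample.mean(), scale=stats.sem(sample))

# 9: summary of the results
print(f"Shapiro-Wilk p = {normality_p:.4f}")
print(f"t = {t_stat:.3f}, p = {p_value:.4f} -> {decision}")
print(f"95% CI for the mean: ({ci[0]:.3f}, {ci[1]:.3f})")
```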
Internal Working of the Program
Data Input and Validation
• The program first allows the user to enter data manually or upload a CSV file.
• After data is loaded, specific columns are selected based on the user’s analysis needs.
• The program validates the data to ensure it’s in the correct format, handling any errors in data types or
structure.
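As an illustration, the load-and-validate step could look like the sketch below; the file path, column name, and `load_column` helper are assumptions made for the example.

```python
# Illustrative sketch of the data-input and validation step (assumed file/column names).
import pandas as pd

def load_column(path: str, column: str) -> pd.Series:
    """Load a CSV file, select one column, and coerce it to a numeric type."""
    data = pd.read_csv(path)
    if column not in data.columns:
        raise ValueError(f"Column '{column}' not found in {path}")
    # Entries with the wrong data type become NaN and are removed in the null-removal step.
    return pd.to_numeric(data[column], errors="coerce")

values = load_column("data.csv", "measurement")
```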
Outlier Detection
Automatically detects and removes outliers from the dataset using the box-plot rule, ensuring a clean and accurate
analysis.
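A common way to apply the box-plot rule is the 1.5 × IQR criterion; the sketch below assumes that approach and is illustrative rather than the project's exact code.

```python
# Box-plot (1.5 * IQR) outlier removal - an assumed implementation for illustration.
import pandas as pd

def remove_outliers_iqr(values: pd.Series) -> pd.Series:
    """Keep only the values inside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return values[(values >= lower) & (values <= upper)]
```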
Assumption Checks
• Based on the selected test, the program runs the necessary assumption checks:
• Normality Check: Uses statistical tests like the Shapiro-Wilk test to check if data follows a normal
distribution.
• Variance Equality (for two-sample tests): For two-sample tests, it uses tests like Levene’s to ensure the
variances of groups are comparable.
• If assumptions are violated, the program may suggest alternatives or warn the user, ensuring valid and
appropriate test results.
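These checks can be sketched with SciPy's `shapiro` and `levene` tests; the `check_assumptions` helper and the 0.05 threshold below are assumptions for illustration.

```python
# Hedged sketch of the assumption checks (normality and, for two samples, equal variances).
from scipy import stats

def check_assumptions(group1, group2=None, alpha=0.05):
    report = {}
    _, p_norm = stats.shapiro(group1)                # Shapiro-Wilk normality test
    report["group1 normal"] = p_norm > alpha
    if group2 is not None:
        _, p_norm2 = stats.shapiro(group2)
        report["group2 normal"] = p_norm2 > alpha
        _, p_var = stats.levene(group1, group2)      # Levene's test for equal variances
        report["equal variances"] = p_var > alpha
    return report
```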
• Python Libraries: The project leverages several powerful Python libraries, such as:
• NumPy: For mathematical computations
• Pandas: For data manipulation and analysis
• Matplotlib and Seaborn: For generating visualizations
• SciPy: For performing statistical tests and calculating probabilities
Q&A