
13 Statistical Analysis Methods for Data Analysts & Data Scientists
btd · Nov 9, 2023

Statistical analysis techniques encompass a wide range of methods used to analyze data, make inferences, and draw conclusions about populations or datasets. Here is a list of various statistical analysis techniques:

1. Descriptive Statistics:
These techniques provide a summary of data, including measures of central
tendency (mean, median, mode), variability (range, variance, standard
deviation), and the shape of data distributions.

Measures of central tendency (mean, median, mode)

Measures of variability (range, variance, standard deviation)

Measures of distribution (skewness, kurtosis)

Frequency distributions and histograms
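
As a rough illustration, the sketch below computes these summaries in Python with pandas, NumPy, and SciPy (all assumed to be installed) on a small made-up sample.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Hypothetical sample of ten observations
data = pd.Series([12, 15, 14, 10, 18, 20, 14, 13, 16, 14])

print(data.mean(), data.median(), data.mode().iloc[0])   # central tendency
print(data.max() - data.min(), data.var(), data.std())   # range, variance, std dev
print(stats.skew(data), stats.kurtosis(data))            # shape of the distribution
counts, bin_edges = np.histogram(data, bins=5)           # frequency distribution
print(counts, bin_edges)
```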

2. Inferential Statistics:
These methods are used to draw conclusions about populations or datasets
based on a sample. They include hypothesis testing, confidence intervals,
regression analysis, correlation analysis, and more.

a. Hypothesis Testing:
Student’s t-test: A statistical test used to determine if there is a significant
difference between the means of two groups. It’s commonly employed
when working with small sample sizes.

Analysis of Variance (ANOVA): A statistical method used to assess whether there are any statistically significant differences between the means of three or more independent groups. A significant result indicates that at least one group mean differs; post-hoc tests are then used to identify which group(s) differ from the others.

Chi-squared test: A statistical test used to determine if there is a significant association between two categorical variables. It is often applied to data arranged in a contingency table.

Z-test: A statistical test that assesses whether the mean of a sample differs
significantly from a known population mean. It is particularly useful
when dealing with large sample sizes.
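
The tests above are all available in scipy.stats. Here is a minimal sketch on synthetic data (SciPy assumed to be installed); the groups and the contingency table are made up for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
group_a = rng.normal(50, 5, 30)          # synthetic measurements for group A
group_b = rng.normal(53, 5, 30)          # synthetic measurements for group B
group_c = rng.normal(55, 5, 30)

t_stat, p_ttest = stats.ttest_ind(group_a, group_b)          # two-sample t-test
f_stat, p_anova = stats.f_oneway(group_a, group_b, group_c)  # one-way ANOVA
table = np.array([[30, 10], [20, 25]])                       # hypothetical 2x2 contingency table
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)  # chi-squared test of independence

print(p_ttest, p_anova, p_chi2)
```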

b. Confidence Intervals:
Confidence interval estimation: A statistical technique used to quantify the
uncertainty around an estimate by providing a range of values within
which the true population parameter is likely to fall with a certain level of
confidence. The confidence interval is computed based on the sample
data and reflects the precision of the estimate. For example, a 95%
confidence interval indicates that if the same sampling process were
repeated many times, the true parameter would fall within the interval in
95% of those cases.
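
As a sketch, a 95% t-based confidence interval for a mean can be computed from a sample like this (synthetic data, SciPy assumed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(100, 15, 40)              # synthetic sample of 40 observations

mean = sample.mean()
sem = stats.sem(sample)                       # standard error of the mean
low, high = stats.t.interval(0.95, df=len(sample) - 1, loc=mean, scale=sem)
print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```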

c. Regression Analysis:
Linear regression: A statistical method used to model the relationship
between a dependent variable and one or more independent variables. It
assumes a linear relationship and aims to find the best-fitting line
(regression line) that minimizes the sum of squared differences between
the observed and predicted values.

Multiple regression: An extension of linear regression that involves modeling the relationship between a dependent variable and two or more independent variables. It allows for the analysis of how multiple factors influence the dependent variable simultaneously.

Logistic regression: A statistical method used for predicting the probability of a binary outcome. It models the relationship between a dependent binary variable and one or more independent variables using the logistic function. It is commonly used for classification problems.

Poisson regression: A type of regression used when the dependent variable represents counts and follows a Poisson distribution. It models the relationship between the counts and one or more independent variables, making it suitable for analyzing data where the outcome is a count variable (e.g., number of events) and the assumptions of normality are not met.

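A minimal sketch of these regression variants, using statsmodels (an assumption; any regression library would do) on simulated predictors and outcomes:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = sm.add_constant(rng.normal(size=(200, 2)))     # two simulated predictors plus an intercept

# Multiple linear regression (OLS)
y_linear = X @ np.array([1.5, 2.0, -1.0]) + rng.normal(0, 0.5, 200)
print(sm.OLS(y_linear, X).fit().params)

# Logistic regression for a binary outcome
y_binary = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + 1.2 * X[:, 1]))))
print(sm.Logit(y_binary, X).fit(disp=0).params)

# Poisson regression for count data
y_counts = rng.poisson(np.exp(0.3 + 0.8 * X[:, 1]))
print(sm.Poisson(y_counts, X).fit(disp=0).params)
```
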
d. Correlation Analysis:
Pearson correlation coefficient: A measure of the linear relationship
between two continuous variables. It ranges from -1 to +1, where +1
indicates a perfect positive linear relationship, -1 indicates a perfect
negative linear relationship, and 0 indicates no linear relationship. It is
sensitive to outliers and assumes that the variables are approximately
normally distributed.

Spearman rank correlation: A non-parametric measure of the strength and direction of the monotonic relationship between two variables. It assesses how well the relationship between the variables can be described using a monotonic function (either increasing or decreasing), making it more robust to outliers than Pearson correlation. It works by assigning ranks to the data points and then computing the correlation based on these ranks rather than the raw data values.

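Both coefficients are one-liners in scipy.stats; a quick sketch on simulated data (SciPy assumed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(size=100)
y = 0.7 * x + rng.normal(0, 0.5, 100)         # simulated linearly related variable

r, p_pearson = stats.pearsonr(x, y)           # linear association
rho, p_spearman = stats.spearmanr(x, y)       # rank-based, monotonic association
print(r, rho)
```
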
e. Time Series Analysis:


AutoRegressive Integrated Moving Average (ARIMA): A popular time series forecasting model that combines autoregression (AR), differencing (I for integrated), and moving averages (MA). ARIMA models are effective for capturing trends and autocorrelation in time series data, and the seasonal extension (SARIMA) adds seasonal terms, making them valuable for predicting future points in a time series.

Seasonal-Trend decomposition using Loess (STL): A method for decomposing a time series into its three main components: Seasonal, Trend, and Residual. STL decomposition helps analysts better understand and model time series data by separating out these components, making it easier to analyze and forecast each aspect independently.

Exponential smoothing: A time series forecasting method that assigns exponentially decreasing weights to past observations. This method is particularly useful for data with trends and seasonality. It includes different variants like Simple Exponential Smoothing (SES), Double Exponential Smoothing (Holt’s method), and Triple Exponential Smoothing (Holt-Winters method), each accommodating different time series patterns.

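The sketch below fits each of these with statsmodels (assumed installed) on a synthetic monthly series with a trend and yearly seasonality; the model orders and smoothing options are illustrative, not recommendations.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.tsa.seasonal import STL
from statsmodels.tsa.holtwinters import ExponentialSmoothing

# Synthetic monthly series: linear trend + yearly seasonality + noise
idx = pd.date_range("2015-01-01", periods=96, freq="MS")
rng = np.random.default_rng(4)
y = pd.Series(np.linspace(10, 30, 96)
              + 5 * np.sin(2 * np.pi * np.arange(96) / 12)
              + rng.normal(0, 1, 96), index=idx)

print(ARIMA(y, order=(1, 1, 1)).fit().forecast(steps=6))       # ARIMA forecast
stl = STL(y, period=12).fit()                                  # trend / seasonal / residual parts
hw = ExponentialSmoothing(y, trend="add", seasonal="add",
                          seasonal_periods=12).fit()           # Holt-Winters smoothing
print(hw.forecast(6))
```
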
f. Non-parametric Tests:
Mann-Whitney U test: A non-parametric test used to determine whether there is a significant difference between the distributions of two independent groups. It is an alternative to the independent samples t-test and does not assume normal distribution.

Wilcoxon signed-rank test: A non-parametric test used to assess whether there is a significant difference between paired observations. It is often applied when the paired differences are not normally distributed.

Kruskal-Wallis test: A non-parametric test used to determine whether there are statistically significant differences between three or more independent groups. It extends the Mann-Whitney U test to multiple groups and is applicable when the assumptions for parametric ANOVA are not met.

Friedman test: A non-parametric test used to detect differences in treatments across multiple related groups. It is an extension of the Wilcoxon signed-rank test for more than two related samples. The Friedman test is often used when the data is not normally distributed and violates the assumptions of a parametric repeated-measures ANOVA.

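All four tests are available in scipy.stats; a compact sketch on synthetic skewed data (SciPy assumed):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
g1 = rng.exponential(1.0, 30)                 # skewed, non-normal samples
g2 = rng.exponential(1.5, 30)
g3 = rng.exponential(2.0, 30)

print(stats.mannwhitneyu(g1, g2))             # two independent groups
print(stats.wilcoxon(g1, g2))                 # paired observations (same subjects, two conditions)
print(stats.kruskal(g1, g2, g3))              # three or more independent groups
print(stats.friedmanchisquare(g1, g2, g3))    # three or more related samples
```
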
g. Survival Analysis:
Kaplan-Meier estimator: A non-parametric statistical method used to
estimate the survival function from time-to-event data, such as the time
until a patient experiences an event (e.g., death). It is commonly used in
medical research and other fields where the time until an event is of
interest. The Kaplan-Meier estimator can handle censored data, where
the event of interest has not occurred for some subjects by the end of the
study.

Cox proportional hazards model: A statistical model used for survival analysis that examines the association between the time until an event occurs (survival time) and one or more predictor variables. The Cox model assumes that the hazard (risk of the event) for any individual is a constant multiple of the hazard for any other individual, and that this proportionality remains constant over time. It is a semi-parametric model that does not make strong assumptions about the shape of the baseline hazard function.

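A short sketch of both estimators, assuming the third-party lifelines package and a toy time-to-event table (durations, an event/censoring flag, and one covariate):

```python
import pandas as pd
from lifelines import KaplanMeierFitter, CoxPHFitter

# Toy data: follow-up time in months, event=1 if observed, 0 if censored
df = pd.DataFrame({
    "duration": [5, 8, 12, 3, 20, 15, 7, 25, 9, 18],
    "event":    [1, 1, 0, 1, 0, 1, 1, 0, 1, 1],
    "age":      [60, 55, 48, 70, 52, 65, 58, 45, 62, 50],
})

kmf = KaplanMeierFitter()
kmf.fit(df["duration"], event_observed=df["event"])      # handles the censored rows
print(kmf.survival_function_.head())

cph = CoxPHFitter()
cph.fit(df, duration_col="duration", event_col="event")  # estimates the effect of "age"
print(cph.summary[["coef", "p"]])
```
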
3. Multivariate Analysis:
This category covers techniques for analyzing data with multiple variables,
such as factor analysis, cluster analysis, PCA, canonical correlation analysis,
and discriminant analysis.

a. Factor Analysis:
Exploratory Factor Analysis (EFA): A statistical technique used to identify underlying relationships (factors) among a set of observed variables without pre-specifying the nature of those relationships. EFA aims to discover the structure of the data by grouping variables that tend to co-occur. It is often used in the initial stages of research to explore and generate hypotheses about the underlying structure of the data.

Confirmatory Factor Analysis (CFA): A statistical technique used to test a specific hypothesis about the structure of relationships among observed variables. Unlike EFA, CFA involves specifying a priori a model that hypothesizes how the observed variables are related to underlying latent factors. The goal is to confirm or reject the proposed factor structure based on the observed data. CFA is commonly used for validating existing theories or models in social sciences, psychology, and other fields.

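For the exploratory side, scikit-learn's FactorAnalysis gives a minimal sketch on simulated two-factor data (CFA generally requires a dedicated structural-equation package and is not shown):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(6)
# Six observed variables generated from two latent factors plus noise
latent = rng.normal(size=(300, 2))
loadings = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.0],
                     [0.0, 0.9], [0.1, 0.8], [0.0, 0.7]])
X = latent @ loadings.T + rng.normal(0, 0.3, (300, 6))

fa = FactorAnalysis(n_components=2, random_state=0).fit(X)
print(fa.components_)    # estimated loadings of each observed variable on the two factors
```
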
b. Cluster Analysis:
K-Means clustering: A partitioning method for clustering data points into distinct groups or clusters. The algorithm assigns each data point to the cluster whose mean (centroid) is closest, minimizing the sum of squared distances within clusters. K-Means is widely used for its simplicity and efficiency in creating clusters based on similarity.

Hierarchical clustering: A clustering method that creates a hierarchy of clusters. It starts with each data point as a separate cluster and iteratively merges or splits clusters based on their similarity. Hierarchical clustering results in a tree-like structure called a dendrogram, where the leaves represent individual data points and the branches represent the merging of clusters at different similarity levels.

DBSCAN (Density-Based Spatial Clustering of Applications with Noise): A density-based clustering algorithm that groups together data points that are close to each other and have a sufficient number of neighbors, forming dense regions. It is particularly effective at identifying clusters with irregular shapes and can identify noise points as well. DBSCAN is less sensitive to the initial configuration of points compared to K-Means and does not require specifying the number of clusters beforehand.

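All three algorithms are available in scikit-learn (assumed installed); a quick sketch on synthetic blobs:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN

X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.8, random_state=0)

kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
hier_labels = AgglomerativeClustering(n_clusters=3).fit_predict(X)
dbscan_labels = DBSCAN(eps=0.8, min_samples=5).fit_predict(X)   # label -1 marks noise points

print(np.unique(kmeans_labels), np.unique(hier_labels), np.unique(dbscan_labels))
```
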
c. Principal Component Analysis (PCA):


A dimensionality reduction technique used to transform a set of
correlated variables into a new set of uncorrelated variables called
principal components. PCA identifies the directions of maximum
variance in the data and projects the data onto these directions. It is
commonly employed for feature extraction, visualization, and reducing
the complexity of high-dimensional datasets.
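
A minimal sketch with scikit-learn's PCA on simulated correlated features (the library choice is an assumption):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 10))
X[:, 1] = X[:, 0] + rng.normal(0, 0.1, 200)     # make two features strongly correlated

pca = PCA(n_components=3).fit(X)
X_reduced = pca.transform(X)                    # project onto the top 3 components
print(pca.explained_variance_ratio_)            # variance captured by each component
```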

d. Canonical Correlation Analysis (CCA):


A multivariate statistical technique that explores the relationships
between two sets of variables. CCA identifies linear combinations of
variables (canonical variates) in each set such that the correlation
between the sets is maximized. It is often used to analyze the
relationships between pairs of variables or datasets and is particularly
useful when dealing with multiple correlated outcome variables. CCA
finds patterns of association rather than summarizing the variables
themselves, making it a valuable tool in fields such as psychology,
economics, and biology.
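
A small sketch with scikit-learn's CCA, where two simulated variable sets share one underlying signal (the data and library choice are illustrative):

```python
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(8)
shared = rng.normal(size=(200, 1))               # common latent signal behind both sets
X = np.hstack([shared + rng.normal(0, 0.3, (200, 1)) for _ in range(3)])
Y = np.hstack([shared + rng.normal(0, 0.3, (200, 1)) for _ in range(2)])

cca = CCA(n_components=1).fit(X, Y)
X_c, Y_c = cca.transform(X, Y)
print(np.corrcoef(X_c[:, 0], Y_c[:, 0])[0, 1])   # correlation between the canonical variates
```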

e. Discriminant Analysis:
Linear Discriminant Analysis (LDA): A classification and dimensionality
reduction technique that aims to find the linear combinations of features
that best separate two or more classes. LDA seeks to maximize the
distance between class means while minimizing the spread (variance)
within each class. It is commonly used for supervised classification
problems when the classes are known in advance.

Quadratic Discriminant Analysis (QDA): Similar to Linear Discriminant Analysis, QDA is a classification method that assumes different covariance matrices for each class, as opposed to LDA, which assumes a common covariance matrix for all classes. QDA is more flexible in handling cases where the classes have different variances. Like LDA, it is used for supervised classification problems when the classes are known.

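Both classifiers ship with scikit-learn; a sketch on its built-in iris dataset (a stand-in example, not from the article):

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                            QuadraticDiscriminantAnalysis)
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

lda = LinearDiscriminantAnalysis().fit(X_train, y_train)     # shared covariance matrix
qda = QuadraticDiscriminantAnalysis().fit(X_train, y_train)  # class-specific covariances
print(lda.score(X_test, y_test), qda.score(X_test, y_test))
```
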
4. Experimental Design:
Methods for designing and analyzing controlled experiments, including
ANOVA, randomized controlled trials, factorial experiments, and more.

Analysis of Variance (ANOVA): A statistical method used to analyze the differences among group means in a sample. It assesses whether there are any statistically significant differences between the means of three or more independent groups. ANOVA is often used in experimental research to compare the effect of different treatments or conditions.

Randomized Controlled Trials (RCTs): Experimental studies in which participants are randomly assigned to different groups, including a treatment group and a control group. RCTs are widely used in medical and scientific research to evaluate the effectiveness of interventions or treatments while minimizing bias and confounding variables.

Factorial Experiments: Experimental designs that involve studying the effects of two or more independent variables (factors) simultaneously. Factorial experiments allow researchers to investigate the main effects of each factor as well as potential interactions between factors, providing a more comprehensive understanding of their combined influence on the dependent variable.

Block Design: A design in experimental research where participants are grouped into blocks based on certain characteristics that are expected to influence the outcome. Within each block, random assignment to different conditions or treatments is performed. Block designs help control for potential sources of variability and increase the precision of the experiment.

Crossover Design: A type of experimental design commonly used in clinical trials where each participant receives different treatments at different times, with a washout period in between. Crossover designs help control for individual differences, making each participant serve as their own control. They are often employed when carryover effects are a concern.

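As one concrete sketch, a simulated 2x2 factorial experiment can be analyzed with a two-way ANOVA via statsmodels (assumed installed); the factor names and effect sizes below are made up.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(9)
# 2x2 factorial design: two factors, two levels each, 20 replicates per cell
df = pd.DataFrame({
    "factor_a": np.repeat(["low", "high"], 40),
    "factor_b": np.tile(np.repeat(["ctrl", "treat"], 20), 2),
})
df["response"] = (10 + 2.0 * (df["factor_a"] == "high")
                  + 1.5 * (df["factor_b"] == "treat")
                  + rng.normal(0, 1, len(df)))

model = smf.ols("response ~ C(factor_a) * C(factor_b)", data=df).fit()
print(anova_lm(model, typ=2))     # main effects and the interaction term
```
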
5. Bayesian Statistics:
Bayesian techniques involve updating beliefs based on prior knowledge and
new evidence. They include Bayesian inference, Bayesian networks, and
Markov Chain Monte Carlo (MCMC) methods.

Bayesian Inference: A statistical method that involves updating probability estimates based on prior knowledge and new evidence. In Bayesian inference, probability is treated as a measure of belief or certainty, and Bayes’ theorem is used to update probabilities as new data becomes available. It provides a framework for incorporating prior knowledge, making predictions, and updating beliefs in a principled manner.

Bayesian Networks: Graphical models that represent the probabilistic relationships among a set of variables using a directed acyclic graph. Nodes in the graph represent random variables, and edges represent probabilistic dependencies. Bayesian networks are used for modeling and reasoning under uncertainty, making them valuable for tasks such as risk assessment, decision support, and predictive modeling.

Markov Chain Monte Carlo (MCMC): A computational technique for sampling from complex probability distributions, particularly in Bayesian statistics. MCMC methods, such as the Metropolis-Hastings algorithm and the Gibbs sampler, generate a sequence of samples that converge to the desired distribution. MCMC is widely used for Bayesian inference when analytical solutions are difficult to obtain, and it is employed in various fields, including statistics, machine learning, and physics.

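A self-contained sketch of the first and last ideas: a conjugate Beta-Binomial update done analytically, and the same posterior approximated with a toy Metropolis-Hastings sampler written in plain NumPy (the numbers are illustrative).

```python
import numpy as np

rng = np.random.default_rng(10)

# Bayesian inference with a conjugate prior: Beta(2, 2) prior on a coin's bias,
# updated after observing 7 heads in 10 flips -> Beta(9, 5) posterior.
a_post, b_post = 2 + 7, 2 + 3
print("analytic posterior mean:", a_post / (a_post + b_post))

# The same posterior approximated with a minimal Metropolis-Hastings sampler
def log_post(theta):
    if not 0 < theta < 1:
        return -np.inf
    return 8 * np.log(theta) + 4 * np.log(1 - theta)   # Beta(9, 5) density, up to a constant

samples, theta = [], 0.5
for _ in range(20000):
    proposal = theta + rng.normal(0, 0.1)              # random-walk proposal
    if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
        theta = proposal                               # accept; otherwise keep the current value
    samples.append(theta)
print("MCMC posterior mean:", np.mean(samples[2000:]))  # discard burn-in samples
```
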
6. Spatial Analysis:
These techniques are used to analyze data with geographic or spatial
attributes. Methods include spatial autocorrelation analysis, geostatistics,
and kernel density estimation.

Spatial Autocorrelation Analysis: A statistical technique used to assess whether the values of a variable in a geographic space exhibit spatial dependence. It evaluates whether similar or dissimilar values tend to occur near each other. Positive spatial autocorrelation indicates clustering of similar values, while negative spatial autocorrelation suggests a dispersion pattern. Spatial autocorrelation analysis is common in geography, ecology, and other fields studying spatial patterns.

Geostatistics: A set of statistical methods used for analyzing spatial data that incorporates the spatial structure and variation of the data. Geostatistical techniques, including kriging, variogram analysis, and spatial interpolation, are widely used in fields like environmental science, geology, and agriculture to model and predict spatial patterns and distributions.

Kernel Density Estimation: A non-parametric method for estimating the probability density function of a continuous random variable in a spatial context. Kernel density estimation creates a smooth, continuous surface that represents the spatial distribution of events or observations. It is commonly used in spatial statistics to visualize and analyze the intensity or concentration of events across a geographic area, helping to identify hotspots or clusters.

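As a small sketch of the last technique, SciPy's gaussian_kde can turn simulated event coordinates into a density surface whose peaks mark hotspots (the data and grid are made up):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(11)
# Synthetic event locations: two spatial clusters of points
cluster1 = rng.normal([0.0, 0.0], 0.5, (150, 2))
cluster2 = rng.normal([3.0, 3.0], 0.8, (100, 2))
points = np.vstack([cluster1, cluster2]).T            # shape (2, n), as gaussian_kde expects

kde = gaussian_kde(points)
xx, yy = np.mgrid[-2:5:100j, -2:5:100j]               # evaluation grid over the study area
density = kde(np.vstack([xx.ravel(), yy.ravel()])).reshape(xx.shape)
print(density.max())                                  # high values correspond to event hotspots
```
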
7. Machine Learning Techniques:


A wide range of algorithms used for tasks such as classification, regression,
clustering, and dimensionality reduction.

Decision Trees: A supervised machine learning algorithm that makes decisions by recursively splitting the dataset based on features. It creates a tree structure where each internal node represents a decision based on a feature, and each leaf node represents the predicted outcome.

Random Forest: An ensemble learning method that constructs multiple decision trees during training and outputs the mode (classification) or mean prediction (regression) of the individual trees. It improves accuracy and reduces overfitting compared to a single decision tree.

Support Vector Machines (SVM): A supervised learning algorithm used for classification and regression tasks. SVM aims to find a hyperplane that best separates data points into different classes while maximizing the margin between the classes.

Neural Networks: A class of machine learning models inspired by the structure and functioning of the human brain. Neural networks consist of interconnected nodes (neurons) organized into layers, including input, hidden, and output layers. They are powerful for a wide range of tasks, including image and speech recognition, and can be used for both regression and classification.

K-Nearest Neighbors (K-NN): A simple, instance-based learning algorithm used for classification and regression. It classifies a data point based on the majority class of its k nearest neighbors in the feature space.

Clustering Algorithms (e.g., K-Means): Unsupervised learning methods that group similar data points together. K-Means is a popular clustering algorithm that partitions data into k clusters based on similarity.

Dimensionality Reduction (e.g., PCA): Techniques to reduce the number of features in a dataset while preserving its essential information. PCA, for example, identifies and retains the most important features by transforming the data into a new set of uncorrelated variables called principal components.

Ensemble Methods (e.g., Bagging and Boosting): Techniques that combine multiple models to improve overall performance. Bagging (Bootstrap Aggregating) creates an ensemble by training models on bootstrapped subsets of the data. Boosting combines weak models sequentially, giving more weight to misclassified instances.

Natural Language Processing (NLP) techniques for text analysis: A field of
study focused on the interaction between computers and human
language. NLP techniques include tasks like sentiment analysis, named
entity recognition, and language translation, often using machine
learning algorithms.

Deep Learning (e.g., Convolutional Neural Networks, Recurrent Neural Networks): A subset of machine learning that involves neural networks with many layers (deep neural networks). Convolutional Neural Networks (CNNs) are effective for image-related tasks, while Recurrent Neural Networks (RNNs) are well-suited for sequential data, such as natural language processing tasks. Deep learning has achieved significant success in various domains due to its ability to automatically learn hierarchical representations.

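The sketch below fits a few of these classifiers with scikit-learn on its built-in breast-cancer dataset, purely to show the shared fit/score workflow (the dataset and model choices are illustrative):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for name, model in [("random forest", RandomForestClassifier(random_state=0)),
                    ("SVM", SVC()),
                    ("k-NN", KNeighborsClassifier(n_neighbors=5))]:
    model.fit(X_train, y_train)                  # train on the training split
    print(name, model.score(X_test, y_test))     # accuracy on the held-out split
```
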
8. Time-Frequency Analysis:
Methods for analyzing signals and time series data in both the time and
frequency domains.

Fast Fourier Transform (FFT): A computational algorithm that efficiently computes the Discrete Fourier Transform (DFT) and its inverse. FFT is widely used for analyzing and processing signals in various applications, such as signal processing, audio analysis, image processing, and telecommunications. It transforms a signal from the time domain to the frequency domain, revealing its frequency components.

Wavelet Transform: A mathematical technique for transforming signals into a different representation, emphasizing different aspects of the signal at different scales. The wavelet transform is particularly useful for analyzing signals with non-stationary characteristics, where different frequencies dominate at different points in time. It has applications in signal and image processing, compression, and denoising.

Short-Time Fourier Transform (STFT): A time-frequency analysis technique that provides a compromise between the time and frequency resolutions of a signal. STFT divides a signal into short overlapping segments and applies the Fourier transform to each segment, producing a time-varying representation of the signal’s frequency content. STFT is commonly used in audio processing and speech analysis, where the signal characteristics change over time.

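A brief sketch contrasting a plain FFT with an STFT on a synthetic non-stationary signal (NumPy and SciPy assumed; the frequencies are arbitrary):

```python
import numpy as np
from scipy.signal import stft

fs = 1000                                             # sampling rate in Hz
t = np.arange(0, 2, 1 / fs)
# Non-stationary signal: 50 Hz during the first second, 120 Hz during the second
signal = np.where(t < 1, np.sin(2 * np.pi * 50 * t), np.sin(2 * np.pi * 120 * t))

spectrum = np.abs(np.fft.rfft(signal))                # FFT: frequency content, no time localization
freqs = np.fft.rfftfreq(len(signal), 1 / fs)
print(freqs[spectrum.argmax()])

f, times, Zxx = stft(signal, fs=fs, nperseg=256)      # STFT: frequency content over time
print(Zxx.shape)                                      # (frequencies, time segments)
```
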
9. Meta-Analysis:
Techniques used to combine and analyze results from multiple studies or
experiments.
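
For instance, a fixed-effect meta-analysis pools study estimates with inverse-variance weights; a NumPy sketch with hypothetical effect sizes and standard errors:

```python
import numpy as np

# Effect estimates and standard errors from five hypothetical studies
effects = np.array([0.30, 0.45, 0.25, 0.50, 0.35])
std_errs = np.array([0.10, 0.15, 0.12, 0.20, 0.08])

weights = 1 / std_errs**2                               # inverse-variance weights
pooled = np.sum(weights * effects) / np.sum(weights)    # fixed-effect pooled estimate
pooled_se = np.sqrt(1 / np.sum(weights))
print(pooled, pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)   # estimate and 95% CI
```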

10. Simulation and Monte Carlo Analysis:


These techniques use simulations to model complex systems and analyze
probabilistic outcomes.
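
A classic toy example: estimating pi by sampling random points and checking how many land inside the unit circle (pure NumPy):

```python
import numpy as np

rng = np.random.default_rng(12)
n = 1_000_000
points = rng.uniform(-1, 1, (n, 2))       # random points in the square [-1, 1] x [-1, 1]
inside = (points**2).sum(axis=1) <= 1     # True if the point falls inside the unit circle
print(4 * inside.mean())                  # converges to pi as n grows
```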

11. Quality Control and Process Control:


Statistical methods used to monitor and improve the quality of products or
processes. They include control charts and Six Sigma methodologies.

Control charts: Statistical tools used in quality control to monitor and maintain the stability of a process over time. Control charts display process variation over time and help identify whether observed variations are within acceptable limits or if there are any patterns or trends that may indicate a need for process adjustment. They are essential in manufacturing, healthcare, and other industries to ensure consistent product or service quality.

Six Sigma methodologies: A set of techniques and tools for process improvement, quality management, and reduction of defects or errors in a manufacturing or business process. Six Sigma aims to achieve nearly defect-free processes by systematically identifying and removing the causes of variation and waste. It follows a structured problem-solving approach, often defined by the DMAIC (Define, Measure, Analyze, Improve, Control) cycle, and emphasizes data-driven decision-making. Organizations adopting Six Sigma strive to achieve a level of performance where only 3.4 defects per million opportunities occur.

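As a minimal sketch, an individuals-style control chart reduces to a centerline and three-sigma limits estimated from an in-control baseline; the simulated process below drifts upward after day 40 (all numbers are made up):

```python
import numpy as np

rng = np.random.default_rng(13)
# Daily process measurements; the mean shifts upward from day 40 onward
measurements = np.concatenate([rng.normal(100, 2, 40), rng.normal(104, 2, 10)])

center = measurements[:40].mean()                    # baseline (in-control) process mean
sigma = measurements[:40].std(ddof=1)
ucl, lcl = center + 3 * sigma, center - 3 * sigma    # upper/lower 3-sigma control limits

out_of_control = np.where((measurements > ucl) | (measurements < lcl))[0]
print("points outside control limits:", out_of_control)
```
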
12. Econometric Analysis:


Techniques for modeling and analyzing economic data, often used in
economic research and policy analysis.

13. Spatial Analysis and Geographic Information Systems (GIS):


Analyzing geographic and spatial data to make informed decisions, manage
resources, and understand spatial relationships.

These are just a selection of statistical analysis techniques, and the choice of method depends on the nature of the data, the research questions, and the specific objectives of a study or analysis. Researchers and analysts often use a combination of these techniques to gain a comprehensive understanding of their data and draw meaningful conclusions.
