Statistics and Data Analytics Notes

The document provides comprehensive notes on statistics and data analytics, covering key concepts such as descriptive statistics, probability distributions, hypothesis testing, and linear algebra. It also discusses statistical modeling techniques like regression and ANOVA, as well as advanced data analytics topics including vector spaces and eigenvalues. Overall, it serves as a foundational guide for understanding data analysis and statistical inference.

Uploaded by yashg8883
Copyright: © All Rights Reserved
Comprehensive Statistics and Data Analytics Notes

Statistics: Introduction & Descriptive Statistics

Statistics is the science of collecting, organizing, analyzing, and interpreting data to make informed decisions.

Descriptive statistics summarize data using numerical measures and graphical tools.

- Mean: The arithmetic average of a data set.

- Median: The middle value separating the higher half from the lower half.

- Mode: The most frequently occurring value.

- Variance: The average of the squared differences from the mean.

- Standard Deviation: The square root of the variance, indicating data spread.
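The measures above can be computed with Python's standard `statistics` module. This is a minimal sketch on a made-up data set (the values in `data` are hypothetical). It uses the population forms `pvariance` and `pstdev`, which match the "average of the squared differences" definition given here; the sample versions (`variance`, `stdev`) divide by n - 1 instead.

```python
import statistics

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)            # arithmetic average
median = statistics.median(data)        # middle value of the sorted data
mode = statistics.mode(data)            # most frequently occurring value
variance = statistics.pvariance(data)   # mean of squared deviations from the mean
std_dev = statistics.pstdev(data)       # square root of the population variance

print(mean, median, mode, variance, std_dev)
```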

Data Visualization

Data visualization involves presenting data in graphical format to identify patterns, trends, and outliers.

Common tools include:

- Histograms

- Boxplots

- Scatter plots

- Bar charts

- Line graphs

Introduction to Probability Distributions

Probability distributions describe how probabilities are distributed over values of a random variable.

- Discrete distributions: Binomial, Poisson

- Continuous distributions: Normal, Exponential

Each has properties like mean, variance, skewness, and kurtosis.
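As an illustration of how a distribution's properties follow from its probabilities, the sketch below evaluates the binomial and Poisson probability mass functions and recovers the binomial mean and variance by direct summation. The parameter values (`n`, `p`, `lam`) and function names are assumptions chosen for the example.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical parameters for illustration.
n, p = 10, 0.3
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean and variance computed from the definition of expectation.
binom_mean = sum(k * pk for k, pk in enumerate(probs))
binom_var = sum((k - binom_mean) ** 2 * pk for k, pk in enumerate(probs))

# The Poisson probabilities over a long enough range sum to (almost) 1.
lam = 4.0
poisson_total = sum(poisson_pmf(k, lam) for k in range(60))

print(binom_mean, binom_var, poisson_total)
```

The summed values agree with the closed forms np and np(1 - p) for the binomial distribution.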

Hypothesis Testing

Hypothesis testing is a statistical method for deciding whether sample data provide enough evidence to reject a default assumption.

- Null Hypothesis (H0): Assumes no effect or difference.

- Alternative Hypothesis (H1): Assumes an effect or difference exists.

- Test statistics: z-test, t-test, chi-square test, etc.

- P-value: Probability of observing data at least as extreme as the sample, assuming H0 is true.

- Significance level (α): Commonly set at 0.05
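The pieces listed above fit together as follows in a two-sided one-sample z-test. This is a sketch under stated assumptions: the population standard deviation `sigma` is taken as known, and the numbers (`sample_mean`, `mu0`, `sigma`, `n`) are invented for illustration.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def z_test_two_sided(sample_mean, mu0, sigma, n):
    """Return the z statistic and two-sided p-value for H0: mu = mu0."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
    return z, p_value

# Hypothetical numbers: H0 says the population mean is 50.
z, p = z_test_two_sided(sample_mean=52.0, mu0=50.0, sigma=5.0, n=25)

alpha = 0.05          # significance level
reject_h0 = p < alpha  # reject H0 when the p-value falls below alpha

print(z, p, reject_h0)
```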

Linear Algebra and Population Statistics

Linear algebra involves vectors, matrices, and linear transformations used in population modeling and statistical analysis.

- Population statistics: Mean, variance, and correlation structures modeled using matrices.

- Matrix operations support regression and multivariate analyses.
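To make the matrix view concrete, the sketch below builds a sample covariance matrix from a small table of observations in plain Python. The function name `covariance_matrix` and the data in `rows` are assumptions for illustration; each row is one observation and each column is one variable.

```python
def covariance_matrix(rows):
    """Sample covariance matrix (divides by n - 1) of a list of observations."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]

# Hypothetical data: the second variable is exactly twice the first.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
cov = covariance_matrix(rows)

# Correlation is the covariance scaled by both standard deviations.
correlation = cov[0][1] / (cov[0][0] ** 0.5 * cov[1][1] ** 0.5)

print(cov, correlation)
```

The diagonal holds the variances and the off-diagonal entries hold the covariances, so the matrix is symmetric.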

Mathematical Methods and Probability Theory

Probability theory underpins statistical inference.

- Set theory, combinatorics


- Conditional probability, Bayes' theorem

- Random variables and expectation

- Law of large numbers, Central Limit Theorem
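As a worked instance of conditional probability and Bayes' theorem, the sketch below computes the probability of a condition given a positive test. The function name and the numbers (1% prevalence, 95% sensitivity, 5% false-positive rate) are hypothetical.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A | B) from P(A), P(B | A), and P(B | not A)."""
    # Total probability of the evidence B.
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence

# Hypothetical screening test.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)

print(posterior)
```

Even with an accurate test, the posterior stays well below one half here because the prior is small.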

Sampling Distributions and Statistical Inference

- Sampling distributions describe the distribution of sample statistics.

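- The Central Limit Theorem states that the mean of a sufficiently large sample is approximately normally distributed, which justifies inference from samples to populations.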

- Confidence intervals and p-values form the basis of inference.
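A small simulation can illustrate these points. The sketch below draws repeated samples from a skewed exponential distribution, looks at the distribution of their means, and builds an approximate 95% confidence interval from one sample. The seed, sample size, and number of repetitions are arbitrary choices for the example, and the interval uses the normal critical value 1.96.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

sample_size = 50
num_samples = 2000

# Each sample comes from an exponential distribution with mean 1 and sd 1.
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(sample_size))
    for _ in range(num_samples)
]

# Sampling distribution of the mean: centred near 1, spread near 1/sqrt(n).
mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
theoretical_se = 1.0 / math.sqrt(sample_size)

# Approximate 95% confidence interval for the mean from a single sample.
one_sample = [random.expovariate(1.0) for _ in range(sample_size)]
x_bar = statistics.mean(one_sample)
se = statistics.stdev(one_sample) / math.sqrt(sample_size)
ci_low, ci_high = x_bar - 1.96 * se, x_bar + 1.96 * se

print(mean_of_means, sd_of_means, theoretical_se, (ci_low, ci_high))
```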

Quantitative Analysis

Quantitative analysis involves the use of mathematical and statistical modeling, measurement, and research to understand behavior.

- Descriptive and inferential statistics

- Optimization

- Time series analysis

Unit 2: Statistical Modelling

- Linear models & Regression: Predictive models using linear relationships

- ANOVA: Compares group means by partitioning total variance into between-group and within-group components

- Gauss-Markov Theorem: When errors have zero mean, constant variance, and are uncorrelated, OLS estimators are BLUE (Best Linear Unbiased Estimators)

- Least Squares Geometry: Minimizing the sum of squared residuals projects the response vector orthogonally onto the column space of the design matrix


- Model diagnostics: Residual analysis, influence, multicollinearity

- Transformations: e.g., Box-Cox

- Logistic & Poisson Regression for binary/count data
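For the linear model and least squares items above, a minimal sketch of simple linear regression in plain Python follows. The function name `fit_line` and the data are hypothetical. The closing checks reflect the least squares geometry: the residuals sum to zero and are orthogonal to the predictor.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data lying close to a straight line.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Residual diagnostics: both sums are zero up to rounding error.
residual_sum = sum(residuals)
residual_dot_x = sum(r * x for r, x in zip(residuals, xs))

print(intercept, slope, residual_sum, residual_dot_x)
```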

Unit 3: Data Analytics

- Open & Closed Sets: An open set contains none of its boundary points; a closed set contains all of them

- Compactness: Every open cover has a finite subcover

- Metric Space: Defines distance (e.g., Euclidean in R^n)

- Cauchy Sequences: Sequences whose terms get arbitrarily close to each other as the sequence progresses

- Completeness: Every Cauchy sequence converges to a point within the space

- Connectedness: Space can't be divided into two disjoint nonempty open sets
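Two of these ideas can be checked numerically. The sketch below uses the Euclidean metric in R^2 and then looks at the sequence 1/n, whose tails shrink in diameter as a Cauchy sequence requires. The points, the sequence, and the helper names are assumptions for illustration. The limit 0 lies outside the open interval (0, 1), which is why that interval is not complete.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points in R^n."""
    return math.dist(p, q)

# Hypothetical points in R^2; a metric must satisfy the triangle inequality.
a, b, c = (0.0, 0.0), (3.0, 4.0), (3.0, 0.0)
triangle_ok = euclidean(a, b) <= euclidean(a, c) + euclidean(c, b)

def tail_diameter(seq, start):
    """Largest distance between any two terms from index `start` onward."""
    tail = seq[start:]
    return max(tail) - min(tail)

# The sequence 1/n: later terms crowd together, so tail diameters shrink.
seq = [1.0 / n for n in range(1, 1001)]
diameters = [tail_diameter(seq, start) for start in (0, 9, 99)]

print(euclidean(a, b), triangle_ok, diameters)
```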

Unit 4: Advanced Data Analytics

- Vector Space: Collection of vectors closed under vector addition and scalar multiplication

- Subspaces: Subsets that are also vector spaces

- Independence, Basis & Dimension: A basis is a linearly independent set of vectors that spans the space; the number of vectors in a basis is the dimension

- Eigenvalues & Eigenvectors: Solve Av = λv, important in PCA and systems analysis
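The eigenvalue equation can be verified by hand for a small case. The sketch below handles only a symmetric 2x2 matrix [[a, b], [b, d]] with b nonzero, using the characteristic polynomial; the function names and the example matrix are hypothetical. Larger problems would normally use a numerical library.

```python
import math

def eig_2x2_symmetric(a, b, d):
    """Eigenpairs of [[a, b], [b, d]], assuming b != 0."""
    trace = a + d
    det = a * d - b * b
    # Roots of lambda^2 - trace * lambda + det = 0.
    disc = math.sqrt(trace * trace / 4.0 - det)
    pairs = []
    for lam in (trace / 2.0 + disc, trace / 2.0 - disc):
        # (b, lam - a) solves (A - lam * I) v = 0 when b != 0.
        pairs.append((lam, [b, lam - a]))
    return pairs

def matvec(A, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]

# Hypothetical symmetric matrix.
A = [[2.0, 1.0], [1.0, 2.0]]
pairs = eig_2x2_symmetric(2.0, 1.0, 2.0)

for lam, v in pairs:
    # Both sides of Av = lambda * v should agree.
    print(lam, v, matvec(A, v), [lam * x for x in v])
```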
