Unit 1

The document covers foundational concepts in statistics and data analysis, including definitions of mean, median, and mode, along with variance, standard deviation, and types of data visualizations. It also delves into statistical modeling techniques like linear regression, ANOVA, and logistic regression, as well as mathematical concepts such as metric spaces and vector spaces. Advanced topics include eigenvalues, eigenvectors, and their applications in data analytics.

Uploaded by

Jasbir Singh
Copyright
© All Rights Reserved

Unit 1: Statistics and Data Analysis Foundations

1. Define mean, median, and mode with examples.

o Mean: Average of values. Example: Mean of 2, 4, 6 is (2+4+6)/3 = 4.

o Median: Middle value when sorted. Example: Median of 3, 5, 7 is 5.

o Mode: Most frequent value. Example: Mode of 2, 2, 3 is 2.
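The three definitions above can be checked with Python's standard `statistics` module. This is a minimal sketch that reuses the example data from the answers; the variable names are illustrative.

```python
import statistics

# The three example data sets from the definitions above.
mean_value = statistics.mean([2, 4, 6])      # (2 + 4 + 6) / 3
median_value = statistics.median([3, 5, 7])  # middle of the sorted values
mode_value = statistics.mode([2, 2, 3])      # most frequent value

print(mean_value, median_value, mode_value)  # prints: 4 5 2
```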

2. How do you compute variance and standard deviation?

o Variance (σ²): Average of squared differences from the mean, σ² = (1/N) Σ (xᵢ − μ)². For a sample, divide by n − 1 instead of N.

o Standard Deviation (σ): Square root of variance.
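As a rough illustration, the computation can be written out by hand and compared with the standard library. The data set is the one from the mean example; dividing by n − 1 for the sample variance is the usual convention and is shown here as an addition to the definition above.

```python
import math
import statistics

data = [2, 4, 6]
mean = sum(data) / len(data)

# Population variance: average of the squared deviations from the mean.
squared_deviations = [(x - mean) ** 2 for x in data]
pop_var = sum(squared_deviations) / len(data)
pop_sd = math.sqrt(pop_var)

# Sample variance divides by n - 1 instead of n.
sample_var = sum(squared_deviations) / (len(data) - 1)

# The standard library gives the same numbers.
print(pop_var, statistics.pvariance(data))
print(sample_var, statistics.variance(data))
print(pop_sd, statistics.pstdev(data))
```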

3. Explain the significance of standard deviation in data analysis.

o It measures the spread or dispersion of data; low SD = data close to mean.

4. What are different types of data visualizations and when would you use each?

o Bar Chart: Comparing categories.

o Histogram: Distribution of continuous data.

o Pie Chart: Proportions of categories.

o Scatter Plot: Correlation between two variables.

5. What is the difference between PMF and PDF?

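o PMF (Probability Mass Function): For discrete variables; gives P(X = x) directly.

o PDF (Probability Density Function): For continuous variables; probabilities are areas under the curve, and P(X = x) = 0 for any single point.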

6. Explain the Normal Distribution and its properties.

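o Bell-shaped and symmetric about the mean; mean = median = mode; fully determined by μ and σ; area under the curve = 1; about 68%, 95%, and 99.7% of values lie within 1, 2, and 3 SDs of the mean.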

7. What is hypothesis testing? Explain with a real-world example.

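o A procedure for deciding, from sample data, whether there is enough evidence to reject a null hypothesis H₀. Example: testing whether a new drug is more effective than the existing one (H₀: no difference in effectiveness).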

8. Differentiate between Type I and Type II errors.

o Type I: Rejecting a true null hypothesis (false positive; probability α).

o Type II: Failing to reject a false null hypothesis (false negative; probability β).

9. What is a p-value? How is it interpreted?

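o Probability, assuming H₀ is true, of obtaining results at least as extreme as those observed. A small p-value (commonly p < 0.05) is taken as evidence against H₀ and leads to rejecting it; it is not the probability that H₀ is true.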
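To make the p-value concrete, here is a minimal sketch of a two-sided one-sample z-test. The function name and the numbers (sample mean 52, hypothesized mean 50, known SD 10, n = 100) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist

def z_test_p_value(sample_mean, mu0, sigma, n):
    """Two-sided p-value for a one-sample z-test (population SD known)."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # Probability, under H0, of a statistic at least as extreme as |z|.
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical numbers: z works out to 2.0 here.
p = z_test_p_value(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(round(p, 4))
print("reject H0" if p < 0.05 else "fail to reject H0")
```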

10. When would you use a t-test vs a z-test?

o t-test: Population SD unknown and estimated from the sample; typical for small samples.

o z-test: Population SD known, or sample large enough that the sample SD is a reliable estimate.


11. Define population vs sample.

o Population: Entire group.

o Sample: Subset of the population.

12. Explain the Central Limit Theorem and its implications.

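o For independent samples from a population with finite variance, the sampling distribution of the sample mean approaches a normal distribution as the sample size increases, whatever the shape of the population. Implication: normal-based tests and confidence intervals can be used for means of large samples.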
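A small simulation can illustrate the theorem. This sketch assumes a Uniform(0, 1) population and 2000 repeated samples per sample size; both choices, and the helper name `sample_means`, are arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

def sample_means(n, trials=2000):
    """Means of `trials` samples of size n drawn from Uniform(0, 1)."""
    return [
        statistics.fmean([random.random() for _ in range(n)])
        for _ in range(trials)
    ]

# Spread of the sample mean for increasing sample sizes.
spread = {}
for n in (1, 5, 30):
    means = sample_means(n)
    spread[n] = statistics.stdev(means)
    print(n, round(statistics.fmean(means), 3), round(spread[n], 3))
```

The means stay centred near 0.5 while their spread shrinks roughly like σ/√n, where σ = √(1/12) for this population.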

13. What are confidence intervals? How are they useful?

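o A range of values, computed from a sample, that contains the population parameter with a stated confidence level (e.g. 95%). Useful because they show the precision of an estimate, not just its value.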
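A minimal sketch of a confidence interval for a mean, assuming the normal approximation is acceptable. The data values and the function name are hypothetical; for a sample this small a t-based interval would usually be preferred and would be somewhat wider.

```python
import math
from statistics import NormalDist, fmean, stdev

def mean_confidence_interval(data, level=0.95):
    """Normal-approximation confidence interval for the population mean."""
    n = len(data)
    centre = fmean(data)
    std_err = stdev(data) / math.sqrt(n)       # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return centre - z * std_err, centre + z * std_err

# Hypothetical measurements with sample mean 50.
data = [48, 52, 50, 49, 51, 53, 47, 50]
low, high = mean_confidence_interval(data)
print(round(low, 2), round(high, 2))
```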

14. What is statistical inference?

o Drawing conclusions about a population based on a sample.

15. Explain conditional probability with an example.

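o Probability of event A given that B has occurred: P(A|B) = P(A ∩ B) / P(B), for P(B) > 0. Example: for a fair die, with A = "roll is 2" and B = "roll is even", P(A|B) = (1/6) / (1/2) = 1/3.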

16. What is Bayes’ theorem? State its formula.

o \[ P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} \]
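A worked example of the formula, as a sketch. The scenario (a diagnostic test with 1% prevalence, 95% sensitivity, and a 5% false-positive rate) and the function name are hypothetical; P(B) is expanded with the law of total probability.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(A|B) from P(A), P(B|A), and P(B|not A)."""
    # Law of total probability: P(B) = P(B|A)P(A) + P(B|not A)P(not A).
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers: A = "has the condition", B = "test is positive".
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 3))
```

Even with a fairly accurate test, the posterior is only about 16% here because the condition is rare.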

17. What are sampling distributions?

o Distribution of a statistic (e.g. the sample mean) across many samples of the same size drawn from the same population.

18. What is the role of linear algebra in statistics?

o Used in multivariate analysis, PCA, regression models.

Unit 2: Statistical Modelling

1. What is the difference between simple and multiple linear regression?

o Simple: One independent variable. Multiple: More than one.

2. Explain the assumptions of the linear regression model.

o Linearity, independence of errors, homoscedasticity (constant error variance), normality of errors, no perfect multicollinearity among predictors.

3. What does R-squared indicate?

o Proportion of variance explained by the model.

4. Define and interpret regression coefficients.

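o Each coefficient gives the expected change in the dependent variable for a one-unit increase in that predictor, holding the other predictors constant. The intercept is the expected value when all predictors are 0.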
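The coefficients and R-squared of a simple linear regression can be computed directly from their definitions. This is a minimal sketch on a small made-up data set; `fit_line` and `r_squared` are illustrative names.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def r_squared(xs, ys, slope, intercept):
    """Proportion of the variance in ys explained by the fitted line."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(r2, 3))
```

For this data the slope is 0.6 (each unit increase in x raises the predicted y by 0.6) and the line explains 60% of the variance.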

5. Explain the ANOVA test and its uses.

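o Tests whether the means of two or more groups (typically three or more) differ significantly, by comparing between-group variance to within-group variance with an F-test.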

6. What is the Gauss-Markov theorem?


o OLS estimators are BLUE when the errors have zero mean, constant variance, and are uncorrelated.

7. What does BLUE mean in the context of linear regression?

o Best Linear Unbiased Estimator: among all linear unbiased estimators, the one with the smallest variance.

8. Explain the concept of least squares geometrically.

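o The fitted values are the orthogonal projection of the observed vector y onto the column space of X, so the residual vector y − ŷ is as short as possible. On a scatter plot this minimizes the sum of squared vertical (not perpendicular) distances from the points to the line.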

9. What are orthogonal projections in linear models?

o ŷ = Hy with H = X(XᵀX)⁻¹Xᵀ: the observed y is projected onto the column space of X, and the residuals are orthogonal to every column of X.
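Projecting y onto the column space of X leaves residuals that are orthogonal to each column of X. A minimal numerical check for simple regression, where the columns of X are a constant column of ones and the predictor; the data is made up.

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

# Least-squares fit of y = intercept + slope * x.
n = len(xs)
x_bar, y_bar = sum(xs) / n, sum(ys) / n
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar

# Residual vector y - y_hat.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Dot products of the residuals with the two columns of X.
dot_with_ones = sum(residuals)
dot_with_x = sum(r * x for r, x in zip(residuals, xs))
print(abs(dot_with_ones) < 1e-9, abs(dot_with_x) < 1e-9)
```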

10. How do you identify multicollinearity in regression models?

o High VIF (variance inflation factor) values, commonly above 5 or 10, and strong correlations between predictors.
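For predictor j, VIF = 1 / (1 − Rⱼ²), where Rⱼ² comes from regressing predictor j on the other predictors. With only two predictors Rⱼ² is the squared correlation between them, which the sketch below uses; the data and function names are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF of either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical predictors: x2 is almost exactly twice x1.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1]
vif = vif_two_predictors(x1, x2)
print(vif > 10)
```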

11. What is residual analysis?

o Checks assumptions of model by analyzing residuals.

12. Explain influence diagnostics like Cook’s distance.

o Measures how much the fitted values change when a single observation is removed; large values flag influential data points.

13. What is the purpose of data transformation?

o To stabilize variance, make data closer to normal, linearize relationships, and improve model fit.

14. How does the Box-Cox transformation work?

o Applies a power transformation y(λ) = (y^λ − 1)/λ, or ln y when λ = 0, to positive data, with λ chosen to make the data closer to normal.
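The Box-Cox transform for a single positive value can be sketched as below. The function name is illustrative, and choosing λ (usually by maximum likelihood) is not shown.

```python
import math

def box_cox(y, lam):
    """Box-Cox power transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive data")
    if lam == 0:
        return math.log(y)       # limiting case as lam approaches 0
    return (y ** lam - 1) / lam

print(box_cox(4.0, 0.5))  # square-root-like transform
print(box_cox(3.0, 1.0))  # lam = 1 only shifts the data by 1
print(box_cox(math.e, 0)) # log transform
```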

15. Describe strategies for model selection.

o Forward selection, backward elimination, or stepwise selection (a combination of both), comparing candidate models by AIC/BIC.

16. What is AIC and BIC?

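o AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion): criteria for model selection that reward goodness of fit and penalize model complexity. Lower is better; BIC penalizes extra parameters more heavily for larger samples.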
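The standard definitions are AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of parameters, n the sample size, and L the maximized likelihood. A minimal sketch with hypothetical log-likelihood values for two candidate models:

```python
import math

def aic(log_likelihood, k):
    """Akaike Information Criterion for a model with k parameters."""
    return 2 * k - 2 * log_likelihood

def bic(log_likelihood, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_likelihood

# Hypothetical fits: the larger model improves the likelihood only slightly.
small = {"log_likelihood": -100.0, "k": 3}
large = {"log_likelihood": -99.5, "k": 6}
n = 100

for name, m in (("small", small), ("large", large)):
    print(name, aic(m["log_likelihood"], m["k"]),
          round(bic(m["log_likelihood"], m["k"], n), 2))
```

Both criteria prefer the smaller model here because the gain in fit does not offset the penalty for three extra parameters.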

17. Explain logistic regression. Where is it used?

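o Models the probability of a binary outcome as a logistic (sigmoid) function of a linear combination of the predictors. Used in classification problems such as spam detection or disease diagnosis.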
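A sketch of how a fitted logistic regression turns a predictor value into a probability and a class label. The coefficients (intercept −4, slope 2) and the 0.5 threshold are hypothetical; fitting the coefficients is not shown.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(x, intercept, slope):
    """P(outcome = 1 | x) for a one-predictor logistic model."""
    return sigmoid(intercept + slope * x)

# Hypothetical coefficients for illustration.
intercept, slope = -4.0, 2.0
for x in (1.0, 2.0, 3.0):
    p = predict_probability(x, intercept, slope)
    label = 1 if p >= 0.5 else 0
    print(x, round(p, 3), label)
```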

18. Explain Poisson regression and give a use case.

o For modeling count data, with the log of the expected count linear in the predictors. Use case: number of calls arriving at a call center per hour.

Unit 3: Mathematical Concepts in Data Analytics

1. What is an open set? Provide an example.

o Every point of the set has a neighborhood contained entirely in the set. Example: the interval (0,1) in ℝ.

2. What is a closed set? Provide an example.

o Contains all its limit points. Example: [0,1].


3. Explain compactness in metric spaces.

o Every open cover has a finite subcover.

4. State and explain the Heine-Borel theorem.

o In ℝⁿ, compact ⇔ closed and bounded.

5. Define a metric space. Give an example in R^n.

o A set M together with a distance function d: M × M → ℝ satisfying the metric properties. Example: ℝⁿ with the Euclidean distance d(x,y) = √(Σ (xᵢ − yᵢ)²).

6. What are the properties of a metric?

o Non-negativity: d(x,y) ≥ 0; identity of indiscernibles: d(x,y) = 0 ⇔ x = y; symmetry: d(x,y) = d(y,x); triangle inequality: d(x,z) ≤ d(x,y) + d(y,z).
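The four properties can be spot-checked numerically for the Euclidean distance. This is only a sanity check on a handful of arbitrary points, not a proof; `check_metric_axioms` is an illustrative helper name.

```python
import itertools
import math

def check_metric_axioms(points, d, tol=1e-12):
    """Return True if d satisfies the metric properties on the given points."""
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) < 0:                           # non-negativity
            return False
        if (d(x, y) == 0) != (x == y):            # identity of indiscernibles
            return False
        if not math.isclose(d(x, y), d(y, x)):    # symmetry
            return False
        if d(x, z) > d(x, y) + d(y, z) + tol:     # triangle inequality
            return False
    return True

# Arbitrary points in the plane.
points = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (-2.0, 5.0)]
print(math.dist(points[0], points[1]))
print(check_metric_axioms(points, math.dist))
```

By contrast, the squared distance fails the triangle inequality, so it is not a metric.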

7. Define a Cauchy sequence. Give an example.

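o For every ε > 0 there is an N such that d(xₘ, xₙ) < ε for all m, n ≥ N, i.e. the terms get arbitrarily close to each other. Example: xₙ = 1/n in ℝ.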

8. What does completeness of a metric space mean?

o Every Cauchy sequence converges in the space.

9. How is compactness different from completeness?

o Compact: All sequences have a convergent subsequence. Complete: All Cauchy sequences converge.

10. What is connectedness in a topological space?

o The space cannot be written as the union of two disjoint nonempty open sets.

11. Use Cauchy sequence to explain convergence.

o If sequence is Cauchy and space is complete → sequence converges.

12. Provide a real-world example where compactness is important.

o Optimization problems: a continuous function on a compact set attains its maximum and minimum (extreme value theorem), so an optimal solution exists.

Unit 4: Advanced Linear Algebra Concepts

1. Define a vector space. What are its axioms?

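o A set with vector addition and scalar multiplication satisfying 8 axioms: associativity and commutativity of addition, additive identity, additive inverses, compatibility of scalar multiplication, scalar identity, and the two distributive laws.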

2. What is a subspace? Give an example.

o Subset that is itself a vector space under the same operations. Example: Plane through the origin in ℝ³.

3. Explain linear independence with an example.

o No vector in set is linear combination of others. Example: (1,0), (0,1).

4. Define basis and dimension of a vector space.

o Basis: Linearly independent spanning set. Dimension: Number of basis vectors.


5. How do you determine if a set of vectors is a basis?

o Check linear independence and spanning.

6. What are eigenvalues and eigenvectors?

o Ax = λx for a nonzero vector x. λ is the eigenvalue, x is the eigenvector.

7. How do you compute eigenvalues of a matrix?

o Solve the characteristic equation det(A − λI) = 0 for λ.
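For a 2×2 matrix [[a, b], [c, d]] the characteristic equation reduces to λ² − (a + d)λ + (ad − bc) = 0, which the quadratic formula solves. A minimal sketch limited to real eigenvalues; the function name and the example matrix are illustrative.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Real eigenvalues of [[a, b], [c, d]] from det(A - lambda*I) = 0."""
    trace = a + d
    det = a * d - b * c
    disc = trace ** 2 - 4 * det   # discriminant of the characteristic equation
    if disc < 0:
        raise ValueError("eigenvalues are complex")
    root = math.sqrt(disc)
    return (trace + root) / 2, (trace - root) / 2

# Example: the symmetric matrix [[2, 1], [1, 2]].
print(eigenvalues_2x2(2.0, 1.0, 1.0, 2.0))  # prints: (3.0, 1.0)
```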

8. What is the spectral theorem?

o Every real symmetric matrix can be orthogonally diagonalized: A = QΛQᵀ with real eigenvalues and an orthonormal basis of eigenvectors.

9. Explain the importance of eigenvectors in PCA.

o Principal components are the eigenvectors of the covariance matrix; the eigenvalues give the variance captured along each component.

10. What is diagonalization of a matrix?

o A = PDP⁻¹, where D is a diagonal matrix of eigenvalues and the columns of P are the corresponding eigenvectors.
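The factorization can be verified by hand for a small case. This sketch assumes the matrix A = [[2, 1], [1, 2]], whose eigenvalues 3 and 1 have eigenvectors (1, 1) and (1, −1); `matmul` is a small helper written here for 2×2 matrices only.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

A = [[2.0, 1.0], [1.0, 2.0]]
P = [[1.0, 1.0], [1.0, -1.0]]      # columns are the eigenvectors (1, 1) and (1, -1)
D = [[3.0, 0.0], [0.0, 1.0]]       # matching eigenvalues on the diagonal
P_inv = [[0.5, 0.5], [0.5, -0.5]]  # inverse of P

# Multiplying the factors back together recovers A.
reconstructed = matmul(matmul(P, D), P_inv)
print(reconstructed == A)
```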

11. Provide a real-world application of eigenvectors.

o Face recognition, vibration analysis.

12. How do eigenvectors relate to transformation in space?

o They define invariant directions under a linear transformation: the transformation only scales them by λ, without rotating them.
