Assignment DSBDS Insem
Assignment 2
What is the need for statistics in data science and big data analytics?
Statistics plays a crucial role in data science and big data analytics by providing methodologies to
collect, analyze, interpret, and present data meaningfully. Without statistical principles, data analysis
would lack reliability and scientific accuracy.
Importance of Statistics in Data Science
1. Data Collection & Sampling:
o Statistics guides the choice of sampling technique, so that reliable conclusions
about a population can be drawn from a representative sample instead of every record.
2. Descriptive Analytics:
o Measures like mean, median, mode, variance, and standard deviation help summarize
data distributions.
3. Inferential Analytics:
o Hypothesis testing, confidence intervals, and regression analysis help in making
predictions and data-driven decisions.
4. Machine Learning & AI:
o Many machine learning algorithms rely on statistical concepts such as probability
distributions and Bayesian inference.
5. Data Cleaning & Transformation:
o Identifying missing values, handling outliers, and normalizing data require statistical
techniques.
6. Predictive Modeling & Forecasting:
o Statistical techniques like regression and time-series analysis help in forecasting
trends, while clustering groups similar observations to reveal patterns.
7. Big Data Processing:
o Since big data is vast and often unstructured, statistical models help in summarizing
and analyzing it efficiently.
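
Several of the points above (sampling, descriptive measures, outlier handling, and confidence intervals) can be illustrated with a short example. The following is a minimal sketch using only Python's standard library. The function names, the synthetic population, and the small outlier list are assumptions made for illustration and are not part of the assignment. The 1.5 * IQR rule and the normal-approximation interval are common choices, not the only ones.

```python
import math
import random
import statistics


def describe(values):
    """Descriptive analytics: summarize the centre and spread of a sample."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance (divides by n - 1)
        "stdev": statistics.stdev(values),        # sample standard deviation
    }


def iqr_outliers(values):
    """Data cleaning: return values lying more than 1.5 * IQR outside the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


def mean_confidence_interval(values, z=1.96):
    """Inferential analytics: approximate 95% confidence interval for the mean.

    Uses the normal approximation (z = 1.96), which assumes a reasonably
    large sample.
    """
    centre = statistics.mean(values)
    margin = z * statistics.stdev(values) / math.sqrt(len(values))
    return centre - margin, centre + margin


# Hypothetical population: 10,000 synthetic measurements (illustrative data only).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Data collection and sampling: a simple random sample of 200 records.
sample = rng.sample(population, 200)

summary = describe(sample)
low, high = mean_confidence_interval(sample)
print(f"sample mean = {summary['mean']:.2f}, stdev = {summary['stdev']:.2f}")
print(f"approx. 95% CI for the population mean: ({low:.2f}, {high:.2f})")
print("flagged outliers:", iqr_outliers([10, 12, 11, 13, 12, 11, 95]))
```

Here the sample of 200 records stands in for the full population of 10,000, which is the idea behind point 1. In the last line, the value 95 is flagged because it lies far outside the quartile range of the other readings.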