Module_3_Answers_Updated

Uploaded by

hwoeou

Module 3 - Answers

2 Marks

1. Define Scoring and Ranking.

Scoring: Assigning a value to data based on specific criteria.

Ranking: Ordering data based on scores.

2. Explain Z-score in statistics.

Z-score indicates how many standard deviations a data point is from the mean.

3. What does a z-score of 0 mean?

A z-score of 0 means the data point is equal to the mean.

4. Why are z-scores useful in data analysis?

Z-scores standardize data, enabling comparison across datasets with different scales.

5. Define Sampling.

Sampling is selecting a subset of data from a larger population for analysis.

6. Define Distribution.

Distribution describes how data values are spread across a range.

7. What is outlier detection?

Identifying data points that significantly deviate from the rest.


8. Define Standard Deviation.

A measure of the spread of data points around the mean.

9. Define Normalization and Outlier Detection.

Normalization: Rescaling data to a standard range (e.g., 0 to 1).

Outlier Detection: Identifying anomalous data points.

10. What is the importance of p-value?

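The p-value is the probability of observing results at least as extreme as those actually observed, assuming the null hypothesis is true. A small p-value is evidence against the null hypothesis, so it is used to judge statistical significance.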

5 Marks

11. Write down Characteristics of Z-Score.

- Represents the number of standard deviations from the mean.

- Standardized measure for data comparison.

- A positive z-score indicates data above the mean; negative indicates below.

- Z-scores have a mean of 0 and a standard deviation of 1.

- Useful for detecting outliers.

12. How is a z-score calculated?

Formula: Z = (X - mean) / standard deviation

Where: X = data point, mean = average, standard deviation = spread measure.

Example: For X = 155, mean = 170, standard deviation = 8:

Z = (155 - 170) / 8 = -1.875.
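
The formula above can be checked with a few lines of Python. This is a minimal sketch; the function name z_score is chosen here for illustration and is not part of the original answer.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / std_dev


# Same numbers as the worked example: X = 155, mean = 170, standard deviation = 8.
print(z_score(155, 170, 8))  # -1.875
```

The negative sign shows that the value lies below the mean.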

13. What is z-score and why are z-scores useful?

Z-score measures how far a data point is from the mean in terms of standard deviations.
It is useful for comparing data across different distributions and identifying outliers.

14. What is Null Hypothesis Testing? Explain with example.

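Null hypothesis testing evaluates whether an observed result could plausibly be due to chance alone.

Example: Testing if a new drug is effective compared to a placebo. The null hypothesis assumes no difference between them; it is rejected only if the observed difference would be unlikely under that assumption.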

15. What is min-max scaling? Give one example.

Min-max scaling rescales data to a fixed range, usually [0, 1].

Formula: X_scaled = (X - X_min) / (X_max - X_min)

Example: For data [2, 4, 6], scaled values are [0, 0.5, 1].
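
A short Python sketch of the min-max formula, reproducing the example above. The function name min_max_scale is an assumption made for illustration, and the guard for constant data is an added safety check, not part of the original formula.

```python
def min_max_scale(values):
    """Rescale values to the range [0, 1] using (X - X_min) / (X_max - X_min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so the formula would divide by zero.
        raise ValueError("cannot scale data whose values are all equal")
    return [(v - lo) / (hi - lo) for v in values]


print(min_max_scale([2, 4, 6]))  # [0.0, 0.5, 1.0]
```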

16. Explain normal distribution.

A symmetric, bell-shaped distribution where most data points cluster around the mean.

Characteristics:

- Mean = Median = Mode.

- About 68% of data lies within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.
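
The percentages for a normal distribution can be computed from the error function, since the share of data within k standard deviations of the mean equals erf(k / sqrt(2)). The sketch below uses only the standard math module; the helper name coverage_within is hypothetical.

```python
import math


def coverage_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(coverage_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```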

17. Explain binomial distribution.

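A probability distribution for the number of successes in n independent trials, where each trial has a binary outcome (success/failure) and the same probability of success.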

Parameters:

- n: number of trials.

- p: probability of success.

Example: The number of heads in 10 tosses of a fair coin (n = 10, p = 0.5).
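
For the coin example, the probability of exactly k successes is C(n, k) * p^k * (1 - p)^(n - k). A minimal sketch follows; the function name binomial_pmf is chosen here for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability of exactly 5 heads in 10 tosses of a fair coin.
print(binomial_pmf(5, 10, 0.5))  # 0.24609375
```

Summing binomial_pmf(k, 10, 0.5) over k = 0 to 10 gives 1, as expected for a probability distribution.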

18. Illustrate the concept of population and sample in detail.

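Population: The entire set of individuals or observations under study.

Sample: A subset of the population used for analysis.

Example: Surveying 100 students from a school of 1000. The 1000 students are the population; the 100 surveyed students are the sample.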
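
The school example can be simulated in Python. The heights below are synthetic values generated only for illustration (mean 170, standard deviation 8, seed 42 are all assumptions), so the printed numbers carry no real-world meaning.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (cm) of 1000 students.
population = [rng.gauss(170, 8) for _ in range(1000)]

# Sample: 100 students drawn at random without replacement.
sample = rng.sample(population, 100)

population_mean = statistics.fmean(population)
sample_mean = statistics.fmean(sample)
print(round(population_mean, 2), round(sample_mean, 2))
```

The sample mean is usually close to, but not exactly equal to, the population mean; that gap is the sampling error.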

19. The average height of adults in a population is 170 cm, with a standard deviation of 8 cm. If a person is 155 cm tall, what is their z-score?

Formula: Z = (X - mean) / standard deviation

Z = (155 - 170) / 8 = -1.875.

10 Marks

20. Brief overview of the steps involved in developing scoring systems.

1. Define Objectives: Clearly specify what the scoring system will achieve.

2. Identify Variables: Select features relevant to the objective.

3. Data Collection: Gather necessary data from reliable sources.

4. Preprocessing: Clean and normalize the data for consistency.

5. Assign Weights: Use statistical methods or domain expertise to weigh variables.

6. Develop Formula: Create a mathematical model for scoring.

7. Validate: Test the scoring system with sample datasets.

8. Implement: Deploy the system for real-world use.

9. Monitor and Update: Continuously evaluate performance and update as needed.
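
Steps 4 to 6 (preprocessing, assigning weights, developing a formula) can be sketched as a weighted sum of normalized features. Everything specific in this example is hypothetical: the candidates, the two features (exam mark, interview mark), and the weights 0.7 and 0.3 are made up for illustration.

```python
def min_max(values):
    """Normalize a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical raw data: name -> (exam mark out of 100, interview mark out of 10).
candidates = {"A": (85, 6), "B": (78, 9), "C": (92, 4)}
weights = (0.7, 0.3)  # assumed importance of exam vs. interview

names = list(candidates)
exam = min_max([candidates[n][0] for n in names])
interview = min_max([candidates[n][1] for n in names])

# Scoring formula: weighted sum of the normalized features.
scores = {
    n: weights[0] * e + weights[1] * i
    for n, e, i in zip(names, exam, interview)
}

# Highest score first.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['C', 'A', 'B']
```

Normalizing first keeps the exam mark from dominating only because it is measured on a larger scale.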

21. Write a note on scoring and ranking with example.

Scoring assigns numeric values to data based on criteria, while ranking orders data based on these scores.

Example: A university assigns scores to students based on exam performance.

- Student A: Score = 85 -> Rank = 1

- Student B: Score = 78 -> Rank = 2
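
The same example as a small Python sketch: scores are stored first, then ranks are derived by sorting. Ties are not handled here; that simplification is an assumption of this sketch.

```python
# Scores from the example above.
scores = {"Student A": 85, "Student B": 78}

# Ranking: sort by score, highest first, then number the positions from 1.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
ranks = {name: position for position, (name, _) in enumerate(ranked, start=1)}

print(ranks)  # {'Student A': 1, 'Student B': 2}
```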


22. Explain Z-Score with formula and example.

Z-score measures how many standard deviations a data point is from the mean.

Formula: Z = (X - mean) / standard deviation

Example: For a dataset with mean = 50 and standard deviation = 10, a value X = 70:

Z = (70 - 50) / 10 = 2.

23. Brief characteristics of Z-Score and its use in data science.

Characteristics:

- Indicates the relative position of a data point in a dataset.

- Useful for detecting outliers (a common rule of thumb flags values with Z > 3 or Z < -3).

- Standardizes datasets, making them comparable.

Use in Data Science:

- Identifying anomalies in datasets.

- Normalizing features for machine learning algorithms.

- Comparing variables across different distributions.
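
Outlier detection with z-scores can be sketched as follows. The data values are made up for illustration, the function name z_outliers is hypothetical, and the population standard deviation (statistics.pstdev) is an assumed choice; using the sample standard deviation would give slightly different z-scores.

```python
import statistics


def z_outliers(values, threshold=3.0):
    """Return the values whose z-score is above threshold or below -threshold."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs((v - mean) / std) > threshold]


# Hypothetical measurements: most are near 50, one is far away.
data = [48, 50, 52, 49, 51, 50, 47, 53, 50, 49,
        51, 50, 52, 48, 50, 49, 51, 50, 50, 120]
print(z_outliers(data))  # [120]
```

In very small datasets no point can reach a z-score of 3, so the threshold should be chosen with the sample size in mind.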

24. Explain statistical significance in terms of Null Hypothesis, Permutation Test, and P-values.

- Null Hypothesis: Assumes no effect or difference in a study.

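- P-value: Measures the probability of observing results at least as extreme as the current data, assuming the null hypothesis is true. Low p-values (conventionally < 0.05) suggest rejecting the null hypothesis.

- Permutation Test: A non-parametric method that tests hypotheses by repeatedly rearranging the group labels of the data and calculating the test statistic for each arrangement. The p-value is the fraction of arrangements whose statistic is at least as extreme as the observed one.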
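
The three ideas can be tied together in a small exact permutation test. In this sketch the null hypothesis is "no difference between the groups", the test statistic is the absolute difference in group means (an assumed choice), and the drug and placebo measurements are invented numbers used only to show the mechanics.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))

    extreme = 0
    arrangements = 0
    # Every way of relabelling n_a of the pooled values as "group A".
    for indices in combinations(range(len(pooled)), n_a):
        arrangements += 1
        # Small tolerance guards against floating-point rounding.
        if abs_mean_diff(sum(pooled[i] for i in indices)) >= observed - 1e-12:
            extreme += 1
    return extreme / arrangements


# Hypothetical outcome scores for a drug group and a placebo group.
drug = [5, 6, 7, 8]
placebo = [1, 2, 3, 4]
p_value = permutation_p_value(drug, placebo)
print(round(p_value, 4))  # 0.0286
```

Only 2 of the 70 possible arrangements are as extreme as the observed split, so the p-value is 2/70, which is below 0.05 and would lead to rejecting the null hypothesis at that level.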

25. Write a note on Sampling and Distribution.

Sampling: Selecting a subset of a population for study. Techniques include random, stratified, and
systematic sampling.
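
The three sampling techniques named above can be sketched in Python. The student list, the "junior"/"senior" strata, the 10% fraction, and the function names are all hypothetical choices made for this example.

```python
import random


def simple_random_sample(items, k, rng):
    """Random sampling: every item has the same chance of selection."""
    return rng.sample(items, k)


def systematic_sample(items, step, start=0):
    """Systematic sampling: take every step-th item from a starting position."""
    return items[start::step]


def stratified_sample(items, key, fraction, rng):
    """Stratified sampling: sample the same fraction from each group (stratum)."""
    strata = {}
    for item in items:
        strata.setdefault(key(item), []).append(item)
    chosen = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        chosen.extend(rng.sample(members, k))
    return chosen


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 students, 60 juniors and 40 seniors.
students = [(i, "junior" if i < 60 else "senior") for i in range(100)]

random_pick = simple_random_sample(students, 10, rng)
systematic_pick = systematic_sample(students, step=10)
stratified_pick = stratified_sample(students, key=lambda s: s[1], fraction=0.1, rng=rng)

print(len(random_pick), len(systematic_pick), len(stratified_pick))  # 10 10 10
```

The stratified sample always contains 6 juniors and 4 seniors, matching the proportions in the population, while a simple random sample may not.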

Distribution: Describes how data is spread. Types include:

- Normal Distribution: Bell-shaped curve.

- Uniform Distribution: Equal probabilities for all outcomes.

- Skewed Distribution: Data concentrated on one side, with a longer tail on the other.
