Ai Chapter 3

Uploaded by

rohankokatare6
Copyright
© All Rights Reserved

2 Marks Questions

1. Define Machine Learning.

• Machine learning is an application of AI that enables systems to learn and improve
from experience without being explicitly programmed. It focuses on developing
computer programs that can access data and learn for themselves.

2. List the steps of data exploration.

1. Reading data

2. Variable identification

3. Univariate Analysis

4. Bivariate Analysis

5. Missing value treatment

6. Outlier treatment

7. Variable transformation

3. Define accuracy, precision w.r.t Machine learning model.

• Accuracy: The ratio of total number of correct predictions to the total number of
predictions.

• Precision: Of all the instances the model predicted as positive, the proportion that is actually positive.
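
As a minimal sketch, both definitions can be computed from the four confusion-matrix counts (TP, TN, FP, FN, defined in questions 27 and 28). The function names and the counts below are illustrative assumptions, not part of the original notes.

```python
def accuracy(tp, tn, fp, fn):
    """Correct predictions divided by all predictions."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def precision(tp, fp):
    """True positives divided by all positive predictions."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


# Hypothetical counts, chosen only for illustration.
tp, tn, fp, fn = 40, 45, 5, 10

acc = accuracy(tp, tn, fp, fn)  # (40 + 45) / 100
prec = precision(tp, fp)        # 40 / (40 + 5)
print(f"accuracy={acc:.3f}, precision={prec:.3f}")
```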

4. List the stages of predictive modeling.

1. Problem definition

2. Hypothesis generation

3. Data extraction/collection


4. Data exploration and transformation

5. Model building/predictive modeling

6. Model deployment

5. Define Specificity, Sensitivity w.r.t Machine learning model.

• Sensitivity (Recall): The proportion of actual positive instances that the model
correctly identifies as positive.

• Specificity: The percentage of true negative instances out of the overall actual
negative instances present in the dataset.
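
A short sketch of the two ratios, assuming the usual confusion-matrix counts (TP, TN, FP, FN). The function names and numbers are hypothetical and serve only to show the arithmetic.

```python
def sensitivity(tp, fn):
    """Recall: true positives divided by all actual positives."""
    actual_positive = tp + fn
    return tp / actual_positive if actual_positive else 0.0


def specificity(tn, fp):
    """True negatives divided by all actual negatives."""
    actual_negative = tn + fp
    return tn / actual_negative if actual_negative else 0.0


# Hypothetical counts, chosen only for illustration.
tp, tn, fp, fn = 40, 45, 5, 10

sens = sensitivity(tp, fn)  # 40 / (40 + 10)
spec = specificity(tn, fp)  # 45 / (45 + 5)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```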

6. List the methods of variable transformation.

1. Logarithm

2. Square root

3. Cube root

4. Binning
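
The four methods can be sketched with NumPy and Pandas as follows. The `income` values, the bin edges, and the band labels are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical right-skewed values, for illustration only.
income = pd.Series([1_000, 8_000, 27_000, 64_000, 125_000], name="income")

log_income = np.log(income)    # logarithm: only valid for values > 0
sqrt_income = np.sqrt(income)  # square root: only valid for values >= 0
cbrt_income = np.cbrt(income)  # cube root: also defined for negative values

# Binning: convert the continuous values into three labelled ranges.
income_band = pd.cut(
    income,
    bins=[0, 10_000, 100_000, np.inf],
    labels=["low", "medium", "high"],
)

print(pd.DataFrame({
    "income": income,
    "log": log_income,
    "sqrt": sqrt_income,
    "cbrt": cbrt_income,
    "band": income_band,
}))
```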

7. List any 4 applications of ML.

1. Medical diagnostics

2. Email filtering

3. Computer vision

4. Stock price prediction

8. List functions of Pandas.

1. Reading different varieties of data (CSV, Excel, JSON)


2. Functions for filtering, selecting, and manipulating data

3. Visual exploration of data (plotting)

4. Exporting data
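
A small sketch of filtering, selecting, and exporting with Pandas. The table contents and column names are invented for illustration, and plotting is left out because it needs an extra plotting library.

```python
import io

import pandas as pd

# Hypothetical data, for illustration only.
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meera"],
    "age": [34, 19, 42],
    "city": ["Pune", "Mumbai", "Pune"],
})

# Filtering: keep rows that satisfy both conditions.
adults_in_pune = df[(df["age"] >= 21) & (df["city"] == "Pune")]

# Selecting: pick a single column from the filtered rows.
names = adults_in_pune["name"]

# Exporting: write CSV text to an in-memory buffer
# (pass a file path instead of the buffer to write to disk).
buffer = io.StringIO()
adults_in_pune.to_csv(buffer, index=False)
exported = buffer.getvalue()
print(exported)
```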

9. Classify ML.

1. Supervised Learning

2. Unsupervised Learning

3. Reinforcement Learning

10. Classify supervised learning.

1. Classification

2. Regression

11. List functions of Predictive modeling.

1. Making use of past data and other attributes

2. Predicting the future using this data

12. Differentiate categorical and continuous variables.

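• Categorical Variables: Discrete in nature; each value belongs to one of a limited set of categories (e.g., gender, color).

• Continuous Variables: Can take any numeric value within a range, so they have an infinite number of possible values (e.g., age,
salary).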

13. List types of graphical methods of Univariate analysis of continuous variables.

1. Histogram

2. Boxplot

14. Draw a labeled boxplot.

• (A visual representation would be required here; typically, a boxplot shows the
median, quartiles, and potential outliers.)
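
The values that a boxplot labels can be computed directly. This is a sketch that assumes the common 1.5 × IQR rule for whiskers and outliers, which the notes do not state; the sample values are made up.

```python
import numpy as np

# Hypothetical sample, for illustration only.
values = np.array([2, 4, 5, 7, 8, 9, 10, 12, 30])

# The box: first quartile, median, third quartile.
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1  # interquartile range (height of the box)

# Assumed convention: points beyond 1.5 * IQR from the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers extend to the most extreme points that are still inside the fences.
inside = values[(values >= lower_fence) & (values <= upper_fence)]
whisker_low, whisker_high = inside.min(), inside.max()
outliers = values[(values < lower_fence) | (values > upper_fence)]

print("Q1:", q1, "median:", median, "Q3:", q3)
print("whiskers:", whisker_low, "to", whisker_high)
print("outliers:", outliers)
```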

15. Define Univariate and Bivariate analysis.

• Univariate Analysis: Analyzing one variable at a time.

• Bivariate Analysis: Studying two variables together for their empirical relationship.

16. List types of Bivariate analysis.

1. Continuous-Continuous Analysis

2. Categorical-Continuous Analysis

3. Categorical-Categorical Analysis

17. List reasons for missing value in a dataset.

1. Non-response

2. Error in data collection

3. Error in reading data

18. List types of missing values.

1. Missing Completely At Random (MCAR)

2. Missing At Random (MAR)

3. Missing Not At Random (MNAR)

19. List methods to identify missing values.

1. Checking for null values


2. Summary statistics

3. Visualizations (e.g., heatmaps)
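
The first two methods can be sketched in Pandas as shown here. The DataFrame is invented for illustration, and the heatmap is omitted because it needs a plotting library.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps, for illustration only.
df = pd.DataFrame({
    "age": [25, np.nan, 31, 47],
    "city": ["Pune", "Mumbai", None, "Pune"],
    "salary": [50_000, 62_000, np.nan, np.nan],
})

# 1. Checking for null values: count missing entries per column.
missing_per_column = df.isnull().sum()

# 2. Summary statistics: the "count" row reports non-missing entries only,
#    so a count below the number of rows signals missing values.
count_summary = df.describe(include="all").loc["count"]

# Rows that contain at least one missing value.
rows_with_missing = df[df.isnull().any(axis=1)]

print(missing_per_column)
print(count_summary)
```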

20. List reasons for outliers.

1. Data entry error

2. Measurement error

3. Change in the underlying population

21. List methods to treat outliers.

1. Deleting observations

2. Transforming and binning values

3. Imputing outliers like missing values
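
A sketch of the three treatments on a small series. Flagging outliers with the 1.5 × IQR rule and imputing with the median are assumptions made for the example, and the data is made up.

```python
import numpy as np
import pandas as pd

# Hypothetical values with one extreme point, for illustration only.
s = pd.Series([12, 14, 15, 15, 16, 18, 95], dtype=float, name="value")

# Flag outliers with the (assumed) 1.5 * IQR rule.
q1, q3 = s.quantile(0.25), s.quantile(0.75)
iqr = q3 - q1
is_outlier = (s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)

# 1. Deleting observations: drop the flagged rows.
deleted = s[~is_outlier]

# 2. Transforming: a log transform pulls extreme values closer to the rest.
log_transformed = np.log(s)

# 3. Imputing: replace flagged values with the median of the remaining data.
imputed = s.where(~is_outlier, deleted.median())

print(deleted.tolist())
print(imputed.tolist())
```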

22. List methods of variable transformation.

1. Logarithm

2. Square root

3. Cube root

4. Binning

23. List the steps of predictive modeling.

1. Problem definition

2. Hypothesis generation

3. Data extraction/collection

4. Data exploration and transformation

5. Model building

6. Model deployment

24. Define model deployment.

• The process of placing a finished machine learning model into a live environment
where it can be used for its intended purpose.

25. List any 4 evaluation metrics.

1. Confusion matrix

2. Accuracy

3. Sensitivity/Recall

4. Precision

26. Draw a labeled 2x2 confusion matrix.

• (A visual representation would be required here; typically, it includes True Positive,
True Negative, False Positive, and False Negative.)

27. Define TP, TN w.r.t confusion matrix.

• True Positive (TP): The actual value is positive, and the model predicted it as positive.

• True Negative (TN): The actual value is negative, and the model predicted it as negative.

28. Define FP, FN w.r.t confusion matrix.

• False Positive (FP): The actual value is negative, but the model predicted it as
positive.

• False Negative (FN): The actual value is positive, but the model predicted it as
negative.
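
The four counts can be tallied from paired labels, as in this sketch. The label lists are hypothetical, with 1 standing for the positive class and 0 for the negative class.

```python
# Hypothetical labels: 1 = positive class, 0 = negative class.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]

pairs = list(zip(actual, predicted))
tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # positive, predicted positive
tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # negative, predicted negative
fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # negative, predicted positive
fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # positive, predicted negative

# Print the counts in the usual 2x2 layout (rows: actual, columns: predicted).
print("            Predicted +   Predicted -")
print(f"Actual +    TP = {tp}        FN = {fn}")
print(f"Actual -    FP = {fp}        TN = {tn}")
```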

29. List different methods of validation techniques.


1. Hold-out validation

2. Stratified Hold-out validation

3. Leave one out

4. K-fold cross validation
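
A minimal sketch of how K-fold cross validation splits the data, written with NumPy only. The helper name `k_fold_indices` and its parameters are hypothetical. Setting `k` equal to the number of samples gives leave-one-out validation.

```python
import numpy as np


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, validation_idx) pairs for K-fold cross validation."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)  # shuffle once before splitting
    folds = np.array_split(shuffled, k)    # k nearly equal-sized folds
    for i in range(k):
        validation_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, validation_idx


# Each sample appears in the validation set exactly once across the k rounds.
splits = list(k_fold_indices(n_samples=10, k=5))
for fold, (train_idx, validation_idx) in enumerate(splits):
    print(f"fold {fold}: train={sorted(train_idx)}, validation={sorted(validation_idx)}")
```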

30. What are hyperparameters?

• Hyperparameters are settings chosen before a training job runs, rather than learned from
the data, that control the behavior of an ML algorithm.

31. List different hyperparameter tuning methods.

1. Random Search

2. Grid Search

3. Bayesian Optimization

4. Tree-structured Parzen estimators (TPE)
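
The first two methods can be sketched in plain Python. The `validation_score` function is a hypothetical stand-in for training a model and scoring it on validation data, and the hyperparameter names and ranges are assumptions.

```python
import itertools
import random


def validation_score(learning_rate, depth):
    """Hypothetical stand-in for 'train a model, then score it on validation data'."""
    return -((learning_rate - 0.1) ** 2) - 0.01 * (depth - 4) ** 2


# Grid search: evaluate every combination of the listed values.
learning_rates = [0.01, 0.1, 1.0]
depths = [2, 4, 8]
grid = list(itertools.product(learning_rates, depths))
best_grid = max(grid, key=lambda params: validation_score(*params))

# Random search: evaluate a fixed number of randomly sampled combinations.
rng = random.Random(0)
candidates = [(rng.uniform(0.01, 1.0), rng.randint(2, 8)) for _ in range(20)]
best_random = max(candidates, key=lambda params: validation_score(*params))

print("grid search best:", best_grid)
print("random search best:", best_random)
```

Grid search is exhaustive over the listed values, so its cost grows with the product of the list lengths, while random search keeps the number of trials fixed regardless of how many hyperparameters there are.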

4 Marks Questions

1. Explain the confusion matrix with one example.

• For binary classification, a confusion matrix is a 2x2 table that shows the
performance of a classification model by comparing the actual values with the
predicted values. For example, if a model predicts whether an email is spam or not,
the confusion matrix shows the counts of True Positives (spam correctly predicted
as spam), True Negatives (non-spam correctly predicted as non-spam), False Positives
(non-spam predicted as spam), and False Negatives (spam predicted as non-spam).

2. Explain Bivariate analysis.

• Bivariate analysis involves studying the relationship between two variables. It helps
in understanding how one variable changes with another and can be used to identify
correlations or associations, although an association alone does not show that one
variable causes the other. For example, analyzing the relationship between age
and income can reveal trends and patterns.
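
For the continuous-continuous case, the age and income example can be sketched with a Pearson correlation coefficient. The numbers are invented for illustration.

```python
import pandas as pd

# Hypothetical age and income values, for illustration only.
df = pd.DataFrame({
    "age": [22, 28, 35, 41, 50, 58],
    "income": [25_000, 32_000, 48_000, 55_000, 70_000, 74_000],
})

# Pearson correlation: +1 is a perfect positive linear relationship,
# -1 a perfect negative one, and 0 means no linear relationship.
r = df["age"].corr(df["income"])
print(f"correlation between age and income: {r:.3f}")
```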

3. Classify Machine learning and explain each type.


• Machine learning can be classified into:

• Supervised Learning: Involves training a model on labeled data (e.g.,
classification and regression).

• Unsupervised Learning: Involves training a model on unlabeled data to find
patterns (e.g., clustering).

• Reinforcement Learning: Involves training a model through trial and error to
maximize a reward.

4. Explain the steps to read CSV and Excel files with Pandas in a Jupyter notebook.

• To read CSV and Excel files in Pandas:

1. Import the Pandas library.

2. Use pd.read_csv('data.csv') to read a CSV file and store it in a DataFrame.

3. Use pd.read_excel('data.xlsx') to read an Excel file.

4. Use the DataFrame's head() method to display the first few rows and confirm that the file was read
successfully.
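
The steps above can be sketched as follows. So that the example runs on its own, it first writes a small hypothetical `data.csv` to a temporary folder; in a real notebook the file would already exist.

```python
import os
import tempfile

import pandas as pd  # step 1: import the library

# Create a small hypothetical CSV file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "data.csv")
with open(csv_path, "w") as f:
    f.write("name,marks\nAsha,82\nRavi,74\nMeera,91\n")

# Step 2: read the CSV file into a DataFrame.
df = pd.read_csv(csv_path)

# Step 4: display the first few rows to confirm the file was read.
print(df.head())

# Step 3 follows the same pattern for Excel files, but needs an Excel
# engine such as openpyxl installed alongside pandas:
# df_excel = pd.read_excel("data.xlsx")
# print(df_excel.head())
```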

5. Explain the process of model building.

• Model building involves:

1. Algorithm Selection: Choosing the appropriate algorithm based on the problem type.

2. Training Model: Learning the relationship between independent and dependent variables
using the training data.

3. Prediction/Scoring: Using the trained model to predict outcomes on the test dataset.
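
The three steps can be sketched with a simple linear regression fitted by NumPy. The synthetic data, the 40/10 train/test split, and the choice of algorithm are assumptions made for illustration.

```python
import numpy as np

# Synthetic data: y depends linearly on x, plus random noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 3.0 * x + 2.0 + rng.normal(0, 1.0, size=x.size)

# Hold out 10 of the 50 samples as a test set.
indices = rng.permutation(x.size)
train_idx, test_idx = indices[:40], indices[40:]

# 1. Algorithm selection: the target is continuous, so use linear regression.
# 2. Training: fit slope and intercept on the training data only.
slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)

# 3. Prediction/scoring: predict on the test set and measure the error.
y_pred = slope * x[test_idx] + intercept
rmse = np.sqrt(np.mean((y[test_idx] - y_pred) ** 2))

print(f"slope={slope:.2f}, intercept={intercept:.2f}, test RMSE={rmse:.2f}")
```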

6. Explain Univariate analysis.

• Univariate analysis involves analyzing a single variable to summarize its
characteristics and understand its distribution. It can include calculating measures of
central tendency (mean, median) and dispersion (standard deviation) and visualizing
the data using histograms.
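
A short sketch of these summaries for one variable. The marks and the bin edges are invented for illustration, and the histogram is shown as counts per bin instead of a plot.

```python
import pandas as pd

# Hypothetical marks for a single variable, for illustration only.
marks = pd.Series([55, 62, 62, 70, 71, 75, 88, 93], name="marks")

mean = marks.mean()      # central tendency
median = marks.median()  # central tendency, robust to extreme values
std = marks.std()        # dispersion (sample standard deviation)

# Histogram in table form: how many values fall into each bin.
hist_counts = (
    pd.cut(marks, bins=[50, 60, 70, 80, 90, 100]).value_counts().sort_index()
)

print(f"mean={mean}, median={median}, std={std:.2f}")
print(hist_counts)
```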
