Notes XII AI
UNIT 1: CAPSTONE PROJECT: PART 2


Analytical Approach

Q1. Draw a diagram of the analytical approach.

Q2. Define the analytical approach.

An analytical approach in AI means carefully examining data and algorithms to understand and
solve problems. It starts with preparing and cleaning data, choosing the right algorithms, and
building models. Then, it involves testing these models to see how well they work. By
interpreting the results and making adjustments as needed, this approach helps in gaining
useful insights and making informed decisions based on data.
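
As a rough illustration, here is a minimal sketch of that workflow in Python, assuming scikit-learn and its built-in Iris dataset (both illustrative choices, not prescribed by these notes):

```python
# Illustrative workflow: prepare data, choose an algorithm, build, test.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Data preparation (scaling) chained with the chosen algorithm.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)  # build the model

# Test the model and interpret the result.
print("test accuracy:", model.score(X_test, y_test))
```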

Q3. Types of analytical approach.

1. Descriptive Analysis: Summarizes past data to show what has happened.
2. Diagnostic Analysis: Examines data to understand why something happened.
3. Predictive Analysis: Uses past data to forecast future outcomes.
4. Prescriptive Analysis: Provides recommendations on what actions to take based on data insights.

Q4. Define Data collection and Data requirement.

1. Data Collection: This refers to the process of acquiring data from different sources to use in
analysis or research. It involves identifying what data is needed, choosing appropriate methods
and tools for gathering it, and ensuring the data is accurate and relevant. Data can be collected
through various means, such as surveys, experiments, sensors, or accessing existing
databases.

2. Data Requirement: This outlines the specific types, quantities, and quality of data needed to
meet particular goals or answer specific questions. It defines what data is necessary to address
a problem or conduct analysis effectively. Data requirements include considerations of data
format, granularity, and the scope of information needed to ensure that the collected data will be
sufficient and appropriate for the intended use.

Q5. Define Data understanding and Data preparation.

Data understanding:- involves the initial exploration and examination of the data to gain insights
into its structure, quality, and patterns. This phase helps to comprehend the data and identify
any issues that might affect the analysis or modeling process.

Data preparation:- is the process of cleaning, transforming, and organizing data to make it
suitable for analysis or modeling. This includes handling missing values, correcting errors, and
converting data into appropriate formats.
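
A minimal data-preparation sketch, assuming pandas (an illustrative choice) and a made-up table containing a missing value and text-formatted dates:

```python
import pandas as pd

# Made-up data: one missing age, dates stored as plain text.
df = pd.DataFrame({
    "age":  [25, None, 31, 40],
    "date": ["2024-01-05", "2024-02-10", "2024-03-15", "2024-04-20"],
})

# Handle the missing value by filling it with the column mean.
df["age"] = df["age"].fillna(df["age"].mean())

# Convert the text column into a proper datetime format.
df["date"] = pd.to_datetime(df["date"])

print(df)
```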

Q6. Define Evaluation and Modeling.

Evaluation:- is the process of assessing the performance of a machine learning model using
various metrics and techniques. This phase helps determine how well the model generalizes to
new, unseen data and identifies any potential issues such as overfitting or underfitting.

Modeling:- is the phase where machine learning algorithms are applied to the prepared data to
create a model that can make predictions or classifications. This involves selecting appropriate
algorithms, training the model on the dataset, and fine-tuning it to optimize performance.

Q7. Define Deployment and Feedback.

Deployment is the process of setting up and making a product, service, or software available for
people to use. This includes preparing, adjusting settings, testing, and monitoring to ensure it
works correctly.

Feedback is the information or opinions from users about how something works, which helps in
identifying areas for improvement and enhancing overall quality.

Q8. Explain how to validate model quality.

To check a model’s quality, start by dividing your data into training, validation, and testing sets.
Use measures like accuracy or error rates to see how well the model performs. Try
cross-validation to confirm it works well with different data parts. Look at tools like confusion
matrices and ROC curves to understand results. Watch for overfitting (too tailored to training
data) and underfitting (not learning enough). Adjust settings for the best outcomes, and have
experts review it. Test the model under different conditions to make sure it stays reliable.
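
A minimal sketch of these checks, assuming scikit-learn and its built-in Iris dataset (illustrative assumptions, not part of the notes):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

X, y = load_iris(return_X_y=True)

# Divide the data into training (60%), validation (20%) and test (20%) sets.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=42)

model = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)

# A large gap between these two scores suggests overfitting.
print("train accuracy     :", accuracy_score(y_train, model.predict(X_train)))
print("validation accuracy:", accuracy_score(y_val, model.predict(X_val)))

# Final unbiased check on unseen data, plus a confusion matrix.
print("test accuracy      :", accuracy_score(y_test, model.predict(X_test)))
print(confusion_matrix(y_test, model.predict(X_test)))
```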

Q9. Define Train dataset and Test dataset.

Train Dataset: A train dataset is the portion of your data used to build and fit the model. It helps
the model learn patterns, relationships, and features by adjusting its parameters to minimize
errors. The model is repeatedly exposed to this data during the training process to improve its
performance and accuracy.

Test Dataset: A test dataset is a separate portion of your data that is used to evaluate the
model's performance after training. It provides an unbiased assessment of how well the model
generalizes to new, unseen data. The test dataset helps determine the model’s accuracy,
robustness, and effectiveness in making predictions.

Q10. Differentiate between Train dataset and Test dataset.

- Purpose: the train dataset is used to build and fit the model, while the test dataset is used to
evaluate it after training.
- Exposure: the model is repeatedly exposed to the training data during training, whereas the
test data is kept unseen until evaluation.
- Outcome: the training data shapes the model's parameters, while the test data provides an
unbiased estimate of how well the model generalizes to new data.

Q11. Explain the concept of cross validation.

Cross-validation is a method to check how well a model works by using different parts of the
data for training and testing. Here’s how it works in simple steps:

1. Divide Data: Split your data into several equal parts, called folds.

2. Train and Test: Train the model on some folds and test it on the remaining fold. Repeat this
process so each fold gets a chance to be the test set.

3. Evaluate: Calculate the model’s performance for each fold and then average the results. This
gives a better idea of how the model will perform on new, unseen data.

4. Advantages: It helps ensure the model is tested thoroughly and makes the most of the
available data, leading to a more reliable performance estimate.
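
A minimal 5-fold cross-validation sketch, assuming scikit-learn (the model and dataset are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# Steps 1-2: split into 5 folds; train on 4 and test on the held-out
# fold, repeating so each fold is the test set exactly once.
scores = cross_val_score(model, X, y, cv=5)

# Step 3: average the per-fold scores for a more reliable estimate.
print("per-fold accuracy:", scores)
print("mean accuracy    :", scores.mean())
```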

Q12. Define objective function, loss function, and gradient descent in metrics of model quality.

- Objective Function: A formula that the model tries to optimize, combining metrics and possibly
regularization to achieve the best performance.

- Loss Functions: Measures how far the model’s predictions are from the actual outcomes,
guiding how to improve the model. Examples include Mean Squared Error for regression and
Cross-Entropy Loss for classification.

- Gradient Descent: An algorithm that adjusts model parameters to minimize the loss function by
moving in the direction that reduces errors.
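
A toy sketch tying the three terms together: gradient descent adjusts the parameters of y = w*x + b to minimize an MSE loss (the data and learning rate are made-up illustrations):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])   # underlying relationship: y = 2x + 1

w, b = 0.0, 0.0   # model parameters to learn
lr = 0.05         # learning rate (step size)

for step in range(1000):
    y_pred = w * x + b
    error = y_pred - y
    loss = np.mean(error ** 2)         # objective/loss function: MSE
    grad_w = 2 * np.mean(error * x)    # gradient of the loss w.r.t. w
    grad_b = 2 * np.mean(error)        # gradient of the loss w.r.t. b
    w -= lr * grad_w                   # step against the gradient
    b -= lr * grad_b

print(f"w = {w:.2f}, b = {b:.2f}, loss = {loss:.6f}")  # approaches w=2, b=1
```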

Q13. Categorize loss functions into two types: Classification Loss and Regression Loss.

- Regression Loss: used when predicting continuous values; examples include Mean Squared
Error (MSE) and Root Mean Squared Error (RMSE).
- Classification Loss: used when predicting categories; a common example is Cross-Entropy
Loss.

Q14. Define the following:

A. RMSE (Root Mean Squared Error)
B. MSE (Mean Squared Error)

- Mean Squared Error (MSE): MSE measures the average squared difference between the
predicted values and the actual values. It quantifies how far the predictions are from the actual
outcomes, with larger errors having a greater impact due to squaring.

- Root Mean Squared Error (RMSE): RMSE is the square root of MSE. It provides a measure of
the standard deviation of the prediction errors and represents the average distance between
predicted and actual values in the same units as the data, making it easier to interpret.
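
A minimal sketch computing both metrics by hand (the prediction values are made-up illustrations):

```python
import math

actual    = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

# MSE: average of the squared differences between predicted and actual.
mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# RMSE: square root of MSE, in the same units as the data.
rmse = math.sqrt(mse)

print("MSE :", mse)    # 0.875
print("RMSE:", rmse)   # about 0.935
```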

Q15. How can you calculate the mean, median, and mode using Python?

1. Mean: The mean is calculated using `statistics.mean(data)`, which adds up all the numbers in
the list and divides by the total count.

2. Median: The median is calculated with `statistics.median(data)`, which sorts the list and finds
the middle value. If there’s an even number of elements, it averages the two middle values.

3. Mode: The mode is calculated with `statistics.mode(data)`, which finds the number that
appears most frequently in the list. Note that if there are multiple modes, it returns the first one it
finds.

Example:
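
A minimal sketch using the standard-library statistics module described above (the data list is a made-up illustration):

```python
import statistics

data = [2, 4, 4, 6, 8, 10]

print("Mean  :", statistics.mean(data))    # 34 / 6 = 5.666...
print("Median:", statistics.median(data))  # average of 4 and 6 = 5.0
print("Mode  :", statistics.mode(data))    # 4 appears most often
```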

UNIT 2: AI MODEL LIFE CYCLE

1. Define AI model life cycle.

The AI model lifecycle is a process that starts with defining the problem and gathering the
necessary data. Next, the data is cleaned and prepared for use. Various models are then trained
and tested to find the best fit for the problem. After training, the model is evaluated to ensure it
performs well. Once it passes these checks, the model is deployed to do its job in real-world
settings. To keep it effective, the model is continuously monitored and updated based on new
data or changing needs. This ongoing process helps the AI model stay useful and accurate.

2. Explain the different types of AI models.

a. Supervised Learning Models: These models learn from labeled datasets, where the
correct answers are provided. The model makes predictions or classifications based on
this training. Examples include linear regression, decision trees, support vector
machines, and neural networks.
b. Unsupervised Learning Models: These models work with unlabeled data and aim to
find patterns or groupings within the data. Common unsupervised learning models
include clustering algorithms like k-means, hierarchical clustering, and dimensionality
reduction techniques like principal component analysis (PCA); a short code sketch
contrasting them with supervised models appears after this list.
c. Reinforcement Learning Models: These models learn by interacting with an
environment, taking actions, and receiving feedback in the form of rewards or penalties.
The model aims to maximize cumulative rewards over time. Reinforcement learning is
often used in robotics, game playing, and autonomous systems.
d. Deep Learning Models: A subset of machine learning models, deep learning models
use neural networks with multiple layers (deep neural networks) to process complex
data. These models are particularly effective for tasks such as image and speech
recognition, natural language processing, and more. Examples include convolutional
neural networks (CNNs), recurrent neural networks (RNNs), and transformers.
e. Generative Models: These models generate new data samples that are similar to the
training data. They are used in applications such as image generation, text synthesis,
and music composition. Examples include Generative Adversarial Networks (GANs) and
Variational Autoencoders (VAEs).
f. Transfer Learning Models: These models leverage pre-trained models on a large
dataset and fine-tune them for a specific task with a smaller dataset. This approach is
often used when there is limited labeled data available for the specific task at hand.
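
As a rough illustration of the first two categories, here is a minimal sketch assuming scikit-learn and its built-in Iris dataset (illustrative choices, not prescribed by the notes):

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)

# Supervised: the model learns from features X together with labels y.
clf = DecisionTreeClassifier().fit(X, y)
print("predicted class:", clf.predict(X[:1]))

# Unsupervised: k-means sees only X and finds groupings on its own.
km = KMeans(n_clusters=3, n_init=10).fit(X)
print("cluster label  :", km.labels_[0])
```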

3. Explain the three main stages of the AI project lifecycle.

Stage 1. Project Scoping:

● Problem Definition: Clearly define the problem the AI solution aims to solve. This
involves understanding the business objectives and identifying how AI can add value.
● Feasibility Study: Assess whether the problem can be solved with AI, considering data
availability, technical requirements, and potential risks.
● Resource Planning: Determine the resources needed, including data, tools,
infrastructure, and personnel. Establish timelines and milestones for the project.
● Stakeholder Engagement: Engage with all relevant stakeholders to ensure their needs
are understood and incorporated into the project plan.

Stage 2. Design or Build Phase:

● Data Collection and Preparation: Gather the necessary data and prepare it for
analysis. This includes cleaning, transforming, and possibly augmenting the data to
ensure it is suitable for modeling.
● Model Selection and Development: Choose the appropriate AI models and techniques
based on the problem requirements and the nature of the data. Develop and train the
models using the prepared data.
● Testing and Evaluation: Rigorously test the model to evaluate its performance and
accuracy. This may involve using a separate validation dataset or applying
cross-validation techniques.
● Iteration: Based on the testing results, refine and improve the model iteratively. This
may involve tweaking the model parameters, trying different algorithms, or further data
preprocessing.

Stage 3. Deployment in Production:

● Integration: Integrate the AI model into the existing systems or workflows. This may
involve developing APIs or user interfaces to interact with the model.
● Monitoring and Maintenance: Continuously monitor the model’s performance in the
production environment. Implement mechanisms to collect feedback and data on the
model’s predictions or outputs.
● Updating and Retraining: As new data becomes available or if the model’s
performance degrades, update and retrain the model to ensure it remains effective and
accurate.
● Documentation and Communication: Document the model, its usage, and its
performance metrics. Communicate with stakeholders about the model’s impact and any
necessary updates.

4. Design/Building the Model. During this phase, you need to evaluate the various AI
development platforms. Explain.

Open Languages

● Python: The most popular programming language for AI and machine learning due to its
simplicity, readability, and the vast ecosystem of libraries and frameworks.
● R: Preferred for statistical analysis and data visualization, with extensive packages for
data manipulation, analysis, and machine learning.
● Scala: Often used in big data contexts, Scala is known for its compatibility with Apache
Spark, a popular framework for large-scale data processing.

Open Frameworks

● Scikit-learn: A Python library that provides simple and efficient tools for data mining and
data analysis, covering a range of machine learning algorithms.
● XGBoost: An optimized gradient boosting library designed to be highly efficient, flexible,
and portable, widely used for structured or tabular data.
● TensorFlow: An end-to-end open-source platform for machine learning, providing a
comprehensive ecosystem of tools, libraries, and community resources.

Approaches and Techniques

● Classic Machine Learning: Techniques such as regression, decision trees, and
clustering, forming the basis of many AI solutions.
● State-of-the-Art Techniques: Advanced methods like Generative Adversarial Networks
(GANs) and Reinforcement Learning (RL), pushing the boundaries of AI capabilities in
generating content and learning through interactions.

Productivity-Enhancing Capabilities

● Visual Modeling: Tools that provide a graphical interface to build and visualize models,
reducing the need for extensive coding.
● AutoAI: Automated machine learning techniques that help in automating the processes
of feature engineering, algorithm selection, and hyperparameter optimization, making it
easier and faster to build models.

Development Tools

● DataRobot: A platform that automates the end-to-end process of building, deploying,
and maintaining AI models, enhancing productivity and accuracy.
● H2O: An open-source AI platform providing tools for machine learning, deep learning,
and AI development.
● Watson Studio: An integrated environment from IBM for building and training machine
learning models, and deploying them into production.
● Azure ML Studio: A cloud-based service by Microsoft that enables building, training,
and deploying machine learning models quickly and efficiently.
● SageMaker: Amazon’s machine learning service that provides developers and data
scientists with the ability to build, train, and deploy ML models at scale.
● Anaconda: A distribution of Python and R for scientific computing, aimed at simplifying
package management and deployment.

UNIT 3: STORYTELLING

Q1. Why is storytelling so powerful and cross-cultural, and what does this mean for data
storytelling?

Storytelling is powerful because it draws people in and makes them feel connected. It helps
people understand different cultures and builds community. When we use stories to explain
data, the information becomes clearer and more engaging. This way, important messages are
easier to understand and more likely to inspire action.

For example: A company reports a 30% waste reduction. Instead of just numbers, they tell a
story about Alex, who started a recycling program and inspired others. This personal story
makes the data relatable and motivates action.

Q2. What are the steps involved in storytelling?

The steps involved in telling an effective data story are given below:

● Understanding the audience
● Choosing the right data and visualizations
● Drawing attention to key information
● Developing a narrative
● Engaging your audience

Q3. How to tell a great story with your data?

To tell a great story with your data, focus on these key points:

1. Define the Purpose: Know the key message you want to convey.

2. Know Your Audience: Tailor the story to their interests and understanding.

3. Create a Narrative: Structure the data with a clear beginning, middle, and end.

4. Use Visuals Effectively: Enhance the story with clear, relevant visuals.

5. Simplify Data: Present information in an easily understandable way.

6. Highlight Key Insights: Focus on the most impactful data points.

7. Engage Emotionally: Connect the data to real-world effects or stories.


Q4. Write the steps that can assist in finding compelling stories in data sets.

Step 1: Get the data and organize it.

Step 2: Visualize the data.

Step 3: Examine data relationships.

Step 4: Create a simple narrative embedded with conflict.

Q5. Why is data storytelling important?

Data storytelling is important because it simplifies complex information, makes data more
meaningful and engaging, and helps drive change by providing context and insight. Stories with
data are more persuasive, standardize communication, and make information more memorable.

Q6. Identify the elements that make a compelling data story and name them.

- Input: This is the data or information that you provide to a system (like a computer or a
process) for it to work with. For example, typing on a keyboard is input for a computer.

- Narrative: This is the way a story or series of events is told or structured. It includes the plot,
characters, and setting. For example, a novel or a film has a narrative that guides how the story
unfolds.

- Representation: This refers to how something is shown or portrayed. It can be visual, like a
painting, or conceptual, like how data is presented in charts. It’s about how an idea or thing is
depicted to others.
