AIML Notes
Unit I
Introduction to AI and ML in Mechanical Engineering
2. Logic: In AI, logic is like the language of reasoning. It helps computers understand
relationships between different pieces of information and draw conclusions. AI logic systems
use rules to represent knowledge and make decisions based on those rules. For example, they
can help in diagnosing diseases by analysing symptoms and medical data, or in planning routes
by considering traffic conditions and time constraints.
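To make rule-based reasoning concrete, here is a minimal Python sketch of forward chaining over a few made-up facts and rules; the symptoms and rules are illustrative assumptions, not a real diagnostic system.

```python
# Minimal sketch of rule-based reasoning (forward chaining).
# Facts and rules are illustrative only, not a real diagnostic system.

facts = {"fever", "cough"}                       # known facts
rules = [
    ({"fever", "cough"}, "suspected_flu"),       # IF fever AND cough THEN suspected_flu
    ({"suspected_flu"}, "recommend_rest"),       # IF suspected_flu THEN recommend_rest
]

changed = True
while changed:                                   # apply rules until nothing new is derived
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)                # draw the conclusion from the matched rule
            changed = True

print(facts)   # {'fever', 'cough', 'suspected_flu', 'recommend_rest'}
```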
4. Machine Learning: This is one of the most exciting fields in AI. Machine learning is all
about teaching computers to learn from data. Instead of programming explicit rules, we give
the computer examples and let it figure out patterns and relationships on its own. Machine
learning algorithms can be used for a wide range of tasks, such as recognizing handwriting,
predicting stock prices, or recommending movies on streaming platforms.
5. Natural Language Processing (NLP): Have you ever talked to a chatbot or asked Siri a
question? That's NLP in action. Natural language processing is about teaching computers to
understand and generate human language. NLP algorithms can analyze text to extract meaning,
translate between languages, or even generate human-like responses in conversations.
6. Vision: Computers can see, but not quite like humans do. Computer vision is the field of AI
that deals with teaching computers to interpret images and videos. This includes tasks like
object recognition (identifying what's in an image), image classification (sorting images into
categories), and image segmentation (dividing an image into meaningful parts). Computer
vision has applications in fields ranging from autonomous vehicles to medical imaging.
7. Robotics: Robotics combines engineering and AI to create machines that can sense, think,
and act in the physical world. Robots are becoming increasingly sophisticated, with sensors
that can detect objects and movements, algorithms that can make decisions based on sensory
input, and actuators that allow them to move and interact with their environment. AI plays a
crucial role in robotics by enabling robots to adapt to changing situations, learn from
experience, and perform complex tasks autonomously.
8. Expert Systems: Imagine having an expert in your pocket, ready to help you with any
problem you encounter. That's the idea behind expert systems. These are AI systems designed
to mimic the knowledge and reasoning skills of human experts in a specific domain. Expert
systems use a knowledge base of facts and rules, along with inference engines to draw
conclusions and make recommendations. They're used in various fields, including medicine
(for diagnosing diseases), finance (for making investment decisions), and engineering (for
troubleshooting technical problems).
“The art of creating machines that perform functions that require intelligence when
performed by people.” (Kurzweil)
“The study of how to make computers do things at which, at the moment, people are
better.” (Rich and Knight)
You enter a room which has a computer terminal. You have a fixed period of time to type
what you want into the terminal and study the replies. At the other end of the line is either a
human being or a computer system. If it is a computer system, and at the end of the period you
cannot reliably determine whether it is a system or a human, then the system is deemed to be
intelligent.
1. Setup:
The Turing Test typically involves a text-based conversation between a human evaluator and
two participants: a human and a machine.
The evaluator interacts with both participants through a computer terminal, without knowing
which is the human and which is the machine.
2. Criterion:
The goal of the Turing Test is for the machine to convince the evaluator that it is human. If
the machine can successfully mimic human behaviour to the extent that the evaluator cannot
distinguish it from the human participant, it is said to have passed the test.
3. Nature of Interaction:
The conversation between the evaluator and the participants can cover a wide range of topics
and may involve questions, responses, and exchanges of information.
The machine's task is to generate responses that are linguistically and contextually
appropriate, demonstrating comprehension, reasoning, and coherence akin to human
communication.
1. Indication of Intelligence:
Passing the Turing Test has historically been regarded as a significant milestone in AI
research, suggesting that a machine possesses human-like intelligence.
The ability to engage in natural language conversation and exhibit behaviours indistinguishable
from humans implies a level of understanding and cognitive capability comparable to human
intelligence.
2. Challenges and Limitations:
Critics of the Turing Test argue that it may prioritize superficial imitation over genuine
understanding and intelligence. A machine could potentially pass the test by employing clever
strategies or pre-programmed responses without truly comprehending the content of the
conversation.
Additionally, the Turing Test focuses primarily on linguistic abilities and may not adequately
assess other facets of intelligence, such as creativity, emotional intelligence, or problem-
solving skills.
let's not forget about the people – assembling a team with the right mix of data science and
engineering skills is crucial to making our vision a reality.
3. Designing the MVP: Building a Prototype
Now it's time to roll up our sleeves and create a minimum viable product (MVP). Think
of it as a small-scale version of our final AI solution – something we can test with real data to
see how well it works. Building an MVP lets us iron out any kinks and tweak our approach
before we dive into full-scale development.
Throughout this journey, we'll encounter decision points where we need to evaluate our
progress and decide whether to keep going or change course. If things aren't going as planned,
we might need to go back to an earlier stage and try a different approach. This cyclical nature
ensures that our AI development is flexible and adaptable, allowing us to evolve and improve
as we go.
By following these detailed stages and embracing the iterative nature of the lifecycle, we can
navigate the world of AI and machine learning with confidence, turning our ideas into impactful
solutions that make a real difference.
Evolution of AI/ML
AI Model Development
The diagram suggests that there is a way to train a machine learning model without labelled data. It is a good starting point for understanding how machine learning models can
be trained without labelled data. Labelled data is data that has been manually labelled with the
correct output. For example, if you were training a machine learning model to recognize images
of cats, you would need to provide the model with a dataset of images that have been labelled
as "cat" or "not cat".
Training a machine learning model without labelled data is a challenging task, but it is possible.
AI Model Training Process for Identifying and Sorting Shapes
Training an AI model to identify and sort shapes such as circles, triangles, and squares involves several steps. Here's a detailed breakdown of the process:
1. Data Collection:
Gather a diverse dataset containing images of circles, triangles, and squares. Ensure
that the images cover various sizes, colors, orientations, and backgrounds to improve the
model's robustness. Label each image with its corresponding shape (e.g., circle, triangle,
square) to provide supervised training data for the model.
2. Data Preprocessing:
Resize all images to a consistent size to ensure uniformity in the input data.
Normalize pixel values to a common scale (e.g., between 0 and 1) to facilitate efficient training.
3. Model Architecture Selection:
Choose an appropriate neural network architecture for image classification tasks.
Convolutional Neural Networks (CNNs) are commonly used for this purpose due to their
ability to effectively capture spatial features in images.
Design the architecture with input layers, convolutional layers for feature extraction, pooling
layers for dimensionality reduction, and fully connected layers for classification.
4. Model Training:
Split the dataset into training, validation, and test sets. The training set is used to train
the model, the validation set is used to tune hyperparameters and monitor performance, and the
test set is used to evaluate the final model.
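As a rough illustration of steps 1–4, the following Keras sketch assumes labelled shape images stored in hypothetical shapes/train and shapes/val folders (one sub-folder per class); the image size, layer sizes, and epoch count are arbitrary choices, not a prescribed configuration.

```python
# Hedged sketch of the shape-classification training process (steps 1-4 above).
# The folder names, image size, and hyperparameters are illustrative assumptions.
import tensorflow as tf

IMG_SIZE = (64, 64)

# 1. Data collection: labelled images in shapes/train/<class>/ and shapes/val/<class>/.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "shapes/train", image_size=IMG_SIZE, batch_size=32)
val_ds = tf.keras.utils.image_dataset_from_directory(
    "shapes/val", image_size=IMG_SIZE, batch_size=32)

# 2. Preprocessing: images are resized above; normalize pixel values to [0, 1].
rescale = tf.keras.layers.Rescaling(1.0 / 255)
train_ds = train_ds.map(lambda x, y: (rescale(x), y))
val_ds = val_ds.map(lambda x, y: (rescale(x), y))

# 3. Model architecture: a small CNN with convolution, pooling, and dense layers.
model = tf.keras.Sequential([
    tf.keras.Input(shape=IMG_SIZE + (3,)),
    tf.keras.layers.Conv2D(16, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(3, activation="softmax"),   # circle, triangle, square
])

# 4. Training: fit on the training set while monitoring the validation set.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, validation_data=val_ds, epochs=10)
```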
Advantages of AIML
1. Boost Revenue:
AI and ML empower businesses to offer customized and personalized services to
customers, thereby enhancing customer satisfaction and loyalty, leading to increased revenue
streams.
2. Reduce Cost:
By implementing AI and ML technologies, mechanical engineering processes can be
optimized and automated, resulting in cost savings through improved efficiency, reduced
errors, and better resource utilization.
3. Reduce Error:
Automation processes enabled by AI and ML significantly reduce the occurrence of
errors in mechanical engineering tasks, leading to enhanced quality control and operational
reliability.
4. Unleash Potential:
AI and ML algorithms enable businesses to extract valuable insights from large
datasets, uncovering untapped opportunities for innovation, optimization, and growth in
mechanical engineering operations.
5. Enhanced Predictive Maintenance:
AI and ML algorithms analyze sensor data from mechanical systems to predict
equipment failures before they occur, allowing for proactive maintenance and minimizing
downtime.
6. Optimized Product Design:
AI and ML techniques optimize product design processes by analyzing historical data,
simulations, and feedback to iteratively improve product performance, reliability, and
efficiency.
7. Improved Supply Chain Management:
AI and ML algorithms optimize supply chain operations by forecasting demand,
optimizing inventory levels, and identifying opportunities for cost savings and efficiency
improvements.
Applications of AIML
1. Data Acquisition:
This is where the journey begins. Data is gathered from various sources such as internal
databases, external data repositories, or even web scraping. This data could be anything from
customer information to sensor readings.
2. Data Analysis:
Once we have our data, it's time to dive in and explore. We analyse the data to uncover
patterns, trends, and insights. This involves using statistical methods, machine learning
techniques, or visualizations to understand the underlying structure of the data.
3. Data Cleansing and Preparation:
Before we can use our data for modelling, we need to clean it up. This involves
identifying and correcting errors, dealing with missing values, outliers, or inconsistencies.
Once our data is clean, we prepare it for use in a machine learning model by formatting it
appropriately.
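A small pandas sketch of this cleansing-and-preparation step is shown below; the file name and column names are hypothetical and only illustrate the kinds of operations described.

```python
# Hedged sketch of data cleansing and preparation with pandas.
# "sensor_readings.csv" and its columns are illustrative assumptions.
import pandas as pd

df = pd.read_csv("sensor_readings.csv")

# Remove exact duplicate rows.
df = df.drop_duplicates()

# Deal with missing values: fill numeric gaps with the column median.
df["temperature"] = df["temperature"].fillna(df["temperature"].median())
df["pressure"] = df["pressure"].fillna(df["pressure"].median())

# Handle outliers: keep readings within 3 standard deviations of the mean.
mean, std = df["temperature"].mean(), df["temperature"].std()
df = df[(df["temperature"] - mean).abs() <= 3 * std]

# Format for modelling: one-hot encode a categorical column.
df = pd.get_dummies(df, columns=["machine_type"])
```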
4. Model Deployment:
With our data ready, we can now build our machine learning model. This model is
trained using the cleaned and prepared data, and once trained, it can be deployed into
production. In production, the model can make predictions on new, unseen data.
5. Optimization:
The work doesn't end once the model is deployed. We continually monitor its
performance and look for ways to improve it. This may involve tweaking the model's
hyperparameters, retraining it on new data, or even changing the model architecture altogether.
Overall, the process depicted in the image is cyclical, meaning it doesn't end once we've built
and deployed a model. Instead, it's an ongoing journey of acquiring, analysing, cleaning, and
using data to develop and refine machine learning models, ensuring they remain accurate and
effective over time.
Implementation of AI in Automation
The figure above gives a brief idea of how we can implement an AI workflow in an automation system. The process consists of various steps which, if implemented correctly, enable us to implement AI in any advanced automation system. The steps are as follows:
6. Implement Model:
Once the AI model demonstrates satisfactory performance, proceed with its
implementation within the chosen workflow automation tool. Integrate the model seamlessly
into the workflow, configuring it to handle specific tasks and decision-making processes
efficiently.
7. Check Results:
Continuously monitor the AI-powered workflow to evaluate its effectiveness and
impact on business outcomes. Track key performance indicators (KPIs) to measure the success
of automation efforts and ensure alignment with organizational goals. Analyze model outputs,
identify any discrepancies, and refine the workflow as needed to optimize results.
8. Iterate and Refine:
Recognize that AI workflow automation is an iterative process. Regularly review and
refine both the AI model and the automated workflow to adapt to changing business
requirements, technological advancements, and evolving data patterns. Continuously seek
opportunities to enhance efficiency, accuracy, and scalability through ongoing iteration and
improvement.
By following this detailed workflow, organizations can effectively leverage AI and machine
learning technologies to automate complex business processes, drive operational efficiencies,
and achieve tangible business outcomes.
1. Training Data: This is the data that the machine learning algorithm learns from. It is
important to have high-quality training data in order to train an accurate machine learning
model.
2. Machine Learning Algorithm: This is the code that learns from the training data and makes
predictions on new data. There are many different machine learning algorithms, and the best
algorithm for a particular task will depend on the nature of the data and the desired outcome.
3. Machine Learning Model: This is the output of the machine learning algorithm. It is a
representation of the knowledge that the algorithm has learned from the training data.
4. Predictions: These are the outputs that the machine learning model makes on new data. The
accuracy of the predictions will depend on the quality of the training data and the complexity
of the task.
The red line in the image represents the boundary between the training data and the machine learning algorithm. During training, the algorithm sees only the training data; it does not have access to the new, unseen data on which predictions will later be made. This separation is important because it helps to ensure that the model is generalizing well and is not simply memorizing the training data.
The process of machine learning can be broken down into several steps:
1. Data Collection: This is the first step in the process, where data is collected from various
sources.
2. Data Preprocessing: This step involves cleaning and preparing the data for use in a
machine learning algorithm.
3. Model Selection: This step involves choosing a machine learning algorithm that is
appropriate for the task at hand.
4. Model Training: This step involves training the machine learning algorithm on the training
data.
5. Model Evaluation: This step involves evaluating the performance of the machine learning
model on a separate test dataset.
6. Model Deployment: This step involves deploying the machine learning model to
production.
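The short scikit-learn sketch below walks through these six steps on a bundled example dataset; the dataset, algorithm, and split ratio are illustrative choices rather than recommendations.

```python
# Compact sketch of the machine learning steps above on an example dataset.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# 1. Data collection (here, a bundled example dataset).
X, y = load_iris(return_X_y=True)

# 2. Data preprocessing: scale features to zero mean and unit variance.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# 3-4. Model selection and training.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# 5. Model evaluation on the held-out test set.
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))

# 6. Model deployment would typically mean saving the trained model
#    (e.g., with joblib) and serving it behind an application or API.
```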
Overall, MMIs are intricate systems with many parts. They work together to let users interact
with machines safely, efficiently, and easily. Whether you're driving a car or operating
machinery, MMIs make the experience smoother and more manageable.
The Fig. above is a diagram about different types of modelling techniques used in AI and
machine learning. Here's a detailed breakdown of the various techniques depicted in the
diagram:
• Data Rectification: This process extracts meaningful features from raw data by eliminating redundancies and noise. Data rectification employs recursive algorithms to clean the data.
• Predictive Modelling: This technique involves training a machine learning model to forecast future events or outcomes based on historical process data. The model can be used to predict various aspects like sales figures, equipment failures, or customer churn.
• Process Optimization: As the name suggests, this technique uses a trained AI model to optimize a process for target parameters. The model can be continuously updated based on real-time or previous process data to achieve the desired outcome.
• Fault Detection: This AI-powered technique helps identify anomalies from the normal process state. The model can pinpoint the root cause of these inconsistencies, allowing for corrective actions to be taken.
• Process Control: This technique involves training a machine learning model to establish and regulate process parameters to maintain a desired target. The model can monitor and adapt these parameters in real time to optimize the process.
• Mechanistic Modelling: This technique offers mechanistic insights into processes by combining reverse engineering with AI models. It helps gain a deeper understanding of how the various elements within a system function and interact with each other.
AIML is a language specifically designed for the creation of chatbots and virtual agents.
Following are the broad areas in which we have lots of opportunities in integration of AI and
ML in Mechanical Engineering.
• Design Optimization AI algorithms can analyze vast amounts of data to optimize designs.
This could involve optimizing for factors like material usage, weight reduction, or
structural integrity.
• Mechanical Systems Automation The design and optimization of mechanical systems and
parts may be automated using AI. This could involve tasks such as generating design
concepts or simulating the performance of different designs.
• Performance Simulation The performance of mechanical systems can also be simulated
and analysed using AI to forecast behaviour and suggest changes. This could be helpful in
predicting how a mechanical system will perform under different conditions or how it might
fail.
Overall, we can say that AI can be a powerful tool for mechanical engineers. By using AI for
design optimization, mechanical systems automation, and performance simulation, engineers
can create better products more efficiently.
It's important to note that while the concept might reference AIML, AI encompasses a
broader range of technologies than chatbot development. Machine learning, deep learning, and
natural language processing are some of the subfields of AI that are having a significant impact
on mechanical engineering.
Integrating AI/ML in mechanical engineering poses several challenges, which can be classified
into four main types: mechanical data complexity, data quality, explainability, and ethical
challenges.
Ethical Challenges:
Integrating AI/ML in mechanical engineering raises ethical considerations related to
privacy, bias, fairness, and accountability. Privacy concerns arise from the collection and use
of sensitive data, especially in applications involving personal information or proprietary
technology. Bias in AI/ML models can perpetuate unfair outcomes or discriminatory practices,
leading to social or legal ramifications. Ensuring fairness and equity in AI/ML algorithms
requires careful consideration of dataset biases and mitigation strategies. Moreover, ensuring
accountability and responsible AI deployment entails establishing clear guidelines, standards,
and governance frameworks to monitor and mitigate potential ethical risks throughout the AI
lifecycle.
Addressing these challenges requires interdisciplinary collaboration between
mechanical engineers, data scientists, ethicists, and policymakers to develop robust AI/ML
solutions that meet technical, ethical, and societal requirements in mechanical engineering
applications. Additionally, ongoing research and innovation are essential for advancing AI/ML
techniques tailored to the unique challenges and requirements of mechanical systems and
processes.
Unit II
AI Tools
AI Libraries and Frameworks
Definition:
AI libraries and frameworks are collections of pre-written code modules, functions, and
utilities designed to simplify and expedite AI development tasks. They offer a range of tools
and functionalities tailored to various AI tasks, including data preprocessing, model training,
evaluation, and deployment.
Caffe: Caffe is a deep learning framework developed by the Berkeley Vision and Learning Center (BVLC), optimized for speed and efficiency in image classification tasks. It is widely
used in computer vision applications.
Application Areas:
AI libraries and frameworks find applications across various domains, including
computer vision, natural language processing, reinforcement learning, robotics, and healthcare.
They empower developers to build AI-driven solutions for diverse use cases, from image
recognition and language translation to autonomous driving and medical diagnosis.
Features:
• Flexibility: TensorFlow offers a flexible architecture that supports both high-level APIs for
easy model development and low-level APIs for fine-grained control over model
components.
• Scalability: It can efficiently scale from running on a single CPU to distributed computing
environments with multiple GPUs or even across clusters of machines.
• Comprehensive Tooling: TensorFlow provides a comprehensive suite of tools for data
preprocessing, model training, evaluation, and deployment, including TensorBoard for
visualization and TensorFlow Serving for serving trained models in production
environments.
• Extensibility: It supports customization and extensibility through its rich ecosystem of
libraries, extensions, and integration with other popular machine learning frameworks.
• Support for Various Platforms: TensorFlow supports deployment across a wide range of
platforms, including desktops, servers, mobile devices, and cloud environments, making it
suitable for both research and production use cases.
Applications:
• Computer Vision: Image classification, object detection, image segmentation.
• Natural Language Processing (NLP): Text classification, sentiment analysis, language
translation.
• Speech Recognition: Voice recognition, speech synthesis, speaker identification.
• Reinforcement Learning: Game playing, robotics, autonomous systems.
• Healthcare: Medical image analysis, disease diagnosis, drug discovery.
Advantages:
• Performance: TensorFlow offers high performance and efficiency, especially when
running on GPUs or distributed computing environments.
• Flexibility: Its flexible architecture supports experimentation and customization, allowing
researchers and developers to implement a wide range of machine learning algorithms and
models.
• Scalability: TensorFlow can scale seamlessly from training small models on a single device
to training large-scale models on distributed systems.
• Community Support: TensorFlow benefits from a large and active community of
developers, researchers, and enthusiasts who contribute to its ongoing development,
provide support, and share resources such as tutorials, libraries, and pre-trained models.
TensorFlow Workflow
The image above shows a high-level overview of a TensorFlow workflow, specifically focused
on building and deploying machine learning models. Here's a breakdown of the steps involved:
• Input Function (tf.data): This stage involves preparing and feeding data into the
TensorFlow pipeline. It ensures the data is formatted appropriately for the machine learning
model.
• Model (model_fn): This represents the core machine learning model you're building using
TensorFlow. The model definition outlines the layers, architecture, and hyperparameters
that will be used for training.
• Train: This step refers to the training process where the machine learning model is trained
on the provided data using the defined model architecture.
• Evaluate: After training the model, it's crucial to evaluate its performance on a separate
dataset to assess its effectiveness and identify areas for improvement.
• Checkpoint: This refers to periodically saving the model's state during training. These
checkpoints can be used to resume training later or roll back to a previous state if necessary.
• Predict: Once trained and evaluated, the model can be used to generate predictions on new,
unseen data.
• TensorFlow Serving: This refers to deploying the trained model for real-world usage.
TensorFlow Serving allows you to serve the model through APIs or web applications,
enabling it to make predictions on data received from external sources.
• TensorFlow Lite: This step represents an optional pathway for deploying the model on mobile
and embedded devices. TensorFlow Lite is a lightweight framework that converts the
TensorFlow model into a smaller, more efficient format suitable for resource-constrained
environments.
• Data Files: These represent the raw data used to train the machine learning model. The file
format can vary depending on the data type (images, text, etc.).
Overall, the image depicts a simplified overview of the TensorFlow machine learning workflow,
encompassing data preparation, model building, training, evaluation, deployment, and potential
conversion for mobile/embedded devices.
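The condensed sketch below mirrors that workflow with the Keras API: a tf.data input pipeline, a model definition, training with checkpoints, evaluation, prediction, and saving the trained model. The synthetic data, file names, and hyperparameters are assumptions for illustration only.

```python
# Hedged sketch of the TensorFlow workflow described above.
import numpy as np
import tensorflow as tf

# Input function (tf.data): format the raw data for the model.
x = np.random.rand(1000, 4).astype("float32")
y = (x.sum(axis=1) > 2.0).astype("float32")
dataset = tf.data.Dataset.from_tensor_slices((x, y)).shuffle(1000).batch(32)

# Model definition: layers, architecture, and training configuration.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Train, periodically saving checkpoints of the model's weights.
checkpoint = tf.keras.callbacks.ModelCheckpoint("ckpt.weights.h5", save_weights_only=True)
model.fit(dataset, epochs=5, callbacks=[checkpoint])

# Evaluate, then predict on new data.
print(model.evaluate(dataset))
print(model.predict(x[:3]))

# Save the trained model; for TensorFlow Serving one would export it in the
# SavedModel format (e.g., model.export(...) in recent Keras versions or
# tf.saved_model.save(...) in older ones).
model.save("trained_model.keras")
```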
Introduction:
Keras is an open-source neural networks library written in Python. It was developed with a
focus on enabling fast experimentation and prototyping of deep learning models. Keras was
designed to be user-friendly, modular, and extensible, making it a popular choice among
researchers and practitioners in the field of artificial intelligence.
Development:
Keras was originally developed by François Chollet, a Google engineer, as part of the research
project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System). It was
released in March 2015 and has since gained widespread adoption due to its simplicity and ease
of use.
Features:
User-Friendly Interface: Keras offers a simple and intuitive API that allows users to define
and train neural networks with minimal code. It abstracts away the complexities of deep
learning frameworks, making it accessible to beginners and experts alike.
Modularity: Keras follows a modular design philosophy, allowing users to construct neural
network models by assembling pre-built layers and modules. This modular approach enables
rapid experimentation and facilitates the creation of complex network architectures.
Flexibility: It provides support for both convolutional neural networks (CNNs) and recurrent
neural networks (RNNs), as well as a wide range of activation functions, optimizers, and loss
functions. Additionally, Keras can seamlessly integrate with other deep learning libraries such
as TensorFlow and Theano.
Extensibility: Keras allows users to easily extend its functionality by creating custom layers,
loss functions, and callbacks. This extensibility enables the implementation of advanced deep
learning techniques and architectures.
Compatibility: Keras is compatible with multiple backends, including TensorFlow, Theano,
and Microsoft Cognitive Toolkit (CNTK). This compatibility ensures that users can leverage
the performance benefits of different backend engines while retaining the same high-level API.
Applications:
Image Classification: Identifying objects in images, such as cats, dogs, or cars.
Natural Language Processing (NLP): Text classification, sentiment analysis, named entity
recognition.
Sequence Modelling: Time series forecasting, speech recognition, language translation.
Generative Modelling: Generating images, text, or music using generative adversarial
networks (GANs) or variational autoencoders (VAEs).
Advantages:
Simplicity: Keras offers a straightforward and easy-to-understand interface, making it suitable
for beginners and researchers without deep learning expertise.
Flexibility: Its modular design and extensible architecture enable users to create custom neural
network architectures tailored to their specific requirements.
Integration: Keras seamlessly integrates with popular deep learning frameworks such as
TensorFlow, allowing users to leverage the performance benefits of these backends while using
the same high-level API.
Community Support: Keras benefits from a large and active community of developers,
researchers, and enthusiasts who contribute to its ongoing development, provide support, and
share resources such as tutorials, libraries, and pre-trained models.
The image above is related to a high-level comparison between traditional machine learning
and deep learning approaches. Here's a breakdown of the key elements in the image:
Deep Learning
• Raw Data - Deep learning models can work directly with raw data, such as images, text,
or sensor readings. This eliminates the need for manual feature engineering, as the deep
learning model can automatically learn the relevant features from the data itself.
• Hidden Layers - Deep learning models contain multiple hidden layers between the input
and output layers. These hidden layers allow the model to learn complex relationships
between the features in the data. The more hidden layers a model has, the deeper it is said
to be.
• Deep Neural Network - Deep learning models are essentially deep neural networks, which
are inspired by the structure and function of the human brain. These neural networks consist
of interconnected nodes (artificial neurons) that process information layer by layer.
Key Differences
The image highlights the key differences between traditional machine learning and deep
learning:
• Feature Engineering - Traditional machine learning requires manual feature engineering,
while deep learning can learn features automatically.
• Model Complexity - Deep learning models can be more complex than traditional machine
learning models, with multiple hidden layers and a deeper architecture.
• Data Requirements - Deep learning models often require more data to train effectively
compared to traditional machine learning models.
Overall, the image depicts the advantages of deep learning over traditional machine learning
for complex tasks. Deep learning's ability to learn features automatically and its deeper
architecture allows it to capture intricate patterns in data, making it a powerful tool for various
applications.
In Keras, a popular deep learning library, there are two primary ways to build neural network
architectures: the Sequential model and the Functional API. Let's explore each approach:
1. Sequential Model:
The Sequential model is the simplest and most commonly used way to build neural
networks in Keras. It allows you to create models layer-by-layer in a linear fashion, where each
layer has exactly one input tensor and one output tensor. This model is suitable for building
straightforward architectures, such as feedforward neural networks or simple convolutional
neural networks (CNNs).
You can create a Sequential model and add layers to it using the Sequential class and the add() method, as in the sketch below. The Sequential model is easy to understand and implement, making it ideal for beginners and quick prototyping.
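A minimal sketch of this pattern (the layer sizes and the 784-feature input are arbitrary illustrative choices):

```python
# Minimal Keras Sequential sketch: layers are stacked linearly with add().
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential()
model.add(keras.Input(shape=(784,)))                 # input with 784 features
model.add(layers.Dense(64, activation="relu"))       # hidden layer
model.add(layers.Dense(10, activation="softmax"))    # 10-class output

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```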
2. Functional Model:
The Functional API provides a more flexible and powerful way to build neural network
architectures compared to the Sequential model. It allows you to create models with complex
architectures, including multiple inputs, multiple outputs, shared layers, and branching
networks.
With the Functional API, you can define multiple input tensors and output tensors and explicitly
connect layers together to create a directed acyclic graph (DAG) of layers.
This model is suitable for building advanced architectures, such as Siamese networks, multi-input/multi-output networks, and models with residual connections. You can create models with the Functional API by instantiating keras.Model objects and connecting layers from the keras.layers module, as in the sketch below.
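A minimal sketch of the same idea; the shapes and layer sizes are arbitrary, and the point is simply that layers are called on tensors and the resulting graph of layers is wrapped in keras.Model.

```python
# Minimal Functional API sketch: build a directed acyclic graph of layers.
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(784,))
h = layers.Dense(64, activation="relu")(inputs)      # layers are called on tensors
h = layers.Dense(64, activation="relu")(h)
outputs = layers.Dense(10, activation="softmax")(h)

model = keras.Model(inputs=inputs, outputs=outputs)  # wrap the DAG of layers
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```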
Advantages of Keras:
• Keras is an API that was designed to be easy for people to learn; it was made to be simple.
• Prototyping time in Keras is short, which means that ideas can be implemented and deployed faster.
• The research community for Keras is vast and highly developed.
MATLAB
• MATLAB provided all the necessary tools to successfully apply AI in imaging and
radiomics.
• MATLAB was used for tasks such as data preprocessing, image segmentation.
• MATLAB is a programming platform designed specifically for engineers and scientists to
analyze and design systems and products.
• The heart of MATLAB is the MATLAB language, a matrix-based language for the natural expression of computational mathematics.
• Data Analysis and Visualization: MATLAB provides powerful tools for data analysis,
manipulation, and visualization. It offers functions for data cleaning, filtering, statistics,
plotting, and creating interactive visualizations.
• Algorithm Development: MATLAB is widely used for developing and implementing
algorithms.
The image above is a diagram depicting the process of model tuning in MATLAB. Model
tuning is an iterative process that involves adjusting the hyperparameters of a machine learning
model to improve its performance on a given task. Here's a detailed breakdown of the steps
outlined in the diagram:
Select Features
• This initial step involves choosing the features or input variables that will be used to train
the machine learning model. Selecting a relevant set of features is crucial for achieving
optimal model performance.
Preprocessed Data
• This refers to the data that has been prepared for use in the machine learning model.
Preprocessing may involve steps such as cleaning the data, handling missing values, and
scaling the features to a common range.
Extract Features
• In some cases, you might extract additional features from the raw data to improve the
model's ability to learn complex patterns. Feature extraction techniques can involve
mathematical transformations or domain-specific knowledge.
Overall, the image highlights the importance of model tuning in achieving optimal performance
from machine learning models in MATLAB. By following a structured approach that involves
selecting features, preprocessing data, training the model, optimizing hyperparameters, and
evaluating performance, data scientists can leverage MATLAB's capabilities to build robust
and effective machine learning models.
AI in Predictive Maintenance
By using machine learning (ML) algorithms to underpin larger AI frameworks,
companies can collect historic and current data to anticipate failures before they happen
and take action to reduce the risk. AI systems can analyze historical and real-time
equipment, inventory, and purchasing data to help maintenance teams maintain optimal
inventory levels. These systems can identify patterns in your parts usage and purchase
history to recommend when you should restock parts.
In manufacturing, predictive maintenance using AI helps minimize unplanned
downtime by analyzing sensor data from machines and equipment. ML algorithms can detect
anomalies and patterns in data, predicting when maintenance is needed.
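As a hedged illustration of this idea, the sketch below trains an Isolation Forest (one possible anomaly-detection algorithm, not necessarily the one a given system uses) on synthetic vibration and temperature readings and flags unusual measurements that might precede a failure.

```python
# Sketch of ML-based anomaly detection for predictive maintenance.
# The synthetic vibration/temperature data stands in for real machine telemetry.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
normal = rng.normal(loc=[0.5, 60.0], scale=[0.05, 2.0], size=(500, 2))  # healthy operation
faulty = rng.normal(loc=[0.9, 75.0], scale=[0.10, 3.0], size=(10, 2))   # degraded behaviour
readings = np.vstack([normal, faulty])                                  # [vibration, temperature]

detector = IsolationForest(contamination=0.02, random_state=0).fit(normal)
flags = detector.predict(readings)            # -1 marks an anomaly, +1 is normal
print("anomalous readings:", int((flags == -1).sum()))
```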
The image above is a block diagram of a simple machine learning model. The block diagram
shows how a simple machine learning model works step-by-step. Here's a breakdown of the
different stages:
• Training Data: This is the data that the machine learning algorithm learns from. It is
important to have high-quality training data in order to train an accurate machine learning
model. In the image, the training data goes into a rectangular box labelled "Data".
• Data Preprocessing: This step involves cleaning and preparing the data for use in a machine
learning algorithm. Common preprocessing tasks include handling missing values,
normalization, and feature scaling. The image depicts this stage with a rectangular box
labelled "Preprocess".
• Feature Engineering: This step involves creating new features from the raw data that might
be more informative for the machine learning model. This is an optional step, but it can be
important for improving the performance of the model. The image does not explicitly show
this stage, but it can be incorporated between "Preprocess" and "Model" blocks.
• Model: This is the code that learns from the training data and makes predictions on new
data. There are many different machine learning algorithms, and the best algorithm for a
particular task will depend on the nature of the data and the desired outcome. The image
represents the model as a rectangular box labelled "Model".
• Learning Algorithm: This is the algorithm that is used to train the machine learning model.
The learning algorithm takes the training data as input and outputs a model that can be used
to make predictions on new data. The specific learning algorithm used will depend on the
type of machine learning model being used. The image does not differentiate between
"Model" and "Learning Algorithm".
• Predictions: These are the outputs that the machine learning model makes on new data. The
accuracy of the predictions will depend on the quality of the training data and the complexity
of the task. The image shows the predictions coming out of a rectangular box labelled
"Predictions".
The red line in the image represents the boundary between the training data and the machine learning model. During training, the model sees only the training data; it does not have access to the new, unseen data on which predictions will later be made. This separation is important because it helps to ensure that the model is generalizing well and is not simply memorizing the training data.
The process of machine learning can be broken down into several steps:
1. Data Collection: This is the first step in the process, where data is collected from various
sources.
2. Data Preprocessing: This step involves cleaning and preparing the data for use in a machine
learning algorithm.
3. Model Selection: This step involves choosing a machine learning algorithm that is
appropriate for the task at hand.
4. Model Training: This step involves training the machine learning algorithm on the training
data.
5. Model Evaluation: This step involves evaluating the performance of the machine learning
model on a separate test dataset.
6. Model Deployment: This step involves deploying the machine learning model to
production.
Machine learning is a powerful tool that can be used to solve a wide variety of problems.
However, it is important to remember that machine learning is not a magic bullet. The success
of a machine learning project depends on a number of factors, including the quality of the data,
the choice of machine learning algorithm, and the expertise of the data scientist.
The image above is a diagram of a data science project lifecycle, including the main stages involved in bringing a data science project from ideation to production. Here's a breakdown of the stages depicted in the diagram:
1. Business Understanding: This initial stage involves understanding the business problem
or opportunity that the data science project aims to address. It's crucial to clearly define the
goals and objectives of the project to ensure it aligns with the overall business strategy.
2. Data Acquisition: Here, data scientists gather the relevant data required to build the
machine learning model. This data may come from various sources, including internal
databases, external data providers, or web scraping. Data quality is essential at this stage,
as it significantly impacts the performance of the model.
3. Data Understanding: Once the data is collected, data scientists explore and analyze it to
gain insights into its properties, distribution, and potential challenges. This stage involves
data cleaning, handling missing values, and identifying anomalies or outliers.
4. Data Preparation: In this stage, the data is preprocessed to prepare it for use in a machine
learning model. Preprocessing steps may involve scaling numerical features, encoding
categorical features, and feature engineering to create new features that might be more
informative for the model.
5. Modelling: This stage involves selecting a suitable machine learning algorithm and
training the model on the prepared data. The choice of algorithm depends on the nature of
the problem and the type of predictions you want to make. Some common machine learning
algorithms include linear regression, decision trees, random forests, and support vector
machines.
6. Model Evaluation: After training the model, it's crucial to evaluate its performance on a
separate test dataset to assess its generalizability and avoid overfitting. Overfitting refers to
a scenario where the model performs well on the training data but poorly on unseen data.
Common metrics used for model evaluation include accuracy, precision, recall, and F1-
score.
7. Deployment: If the model performs well on the test dataset, it can be deployed to
production. This may involve integrating the model into a web application, API, or data
pipeline to make predictions on new, real-world data.
8. Monitoring: Once deployed, it's essential to monitor the model's performance over time to
ensure it continues to make accurate predictions. Monitoring may involve tracking model
metrics and retraining the model with new data if its performance degrades.
The cyclical nature of the arrows in the diagram emphasizes that data science projects are
iterative. As you learn more from the data and the deployed model, you can revisit earlier stages
to refine your approach or improve the model's performance.
Overall, the image provides a high-level overview of the data science project lifecycle,
highlighting the key stages involved in taking a project from conception to real-world
implementation.
Unit III
Machine Learning
Linear Regression
Introduction:
Linear regression is a fundamental supervised learning technique used for modelling the
relationship between a dependent variable (target) and one or more independent variables
(features). It assumes a linear relationship between the input variables and the output variable,
making it one of the simplest and most widely used regression methods.
Linear regression analysis is used to predict the value of a variable based on the value
of another variable. The variable you want to predict is called the dependent variable. The
variable you are using to predict the other variable's value is called the independent variable.
Linear regression shows the linear relationship between the independent (predictor) variable, plotted on the X-axis, and the dependent (output) variable, plotted on the Y-axis.
Simple linear regression is used to estimate the relationship between two quantitative variables.
Key Concepts:
Linear Model: In linear regression, the relationship between the independent variables and the dependent variable is assumed to be linear, i.e. the output is modelled as a weighted sum of the inputs plus an intercept: y = β0 + β1x1 + … + βnxn + ε.
Objective: The goal of linear regression is to find the best-fitting line (or hyperplane in higher
dimensions) that minimizes the difference between the predicted values and the actual values
of the target variable.
Simple Linear Regression: In simple linear regression, there is only one independent variable x, and the model takes the form y = β0 + β1x + ε.
Multiple Linear Regression: In multiple linear regression, there are multiple independent
variables that are used to predict the dependent variable 𝑦.
The relationship between the independent variables and the dependent variable is represented
by a hyperplane in a multidimensional space.
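A brief scikit-learn sketch of simple and multiple linear regression on synthetic data is given below; the coefficients and noise level are arbitrary illustrative choices.

```python
# Sketch of simple and multiple linear regression on synthetic data.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Simple linear regression: one independent variable x.
x = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * x[:, 0] + 5.0 + rng.normal(0, 1, size=100)    # y ≈ 3x + 5 plus noise
simple = LinearRegression().fit(x, y)
print(simple.coef_, simple.intercept_)                   # close to [3.0] and 5.0

# Multiple linear regression: several independent variables.
X = rng.uniform(0, 10, size=(100, 3))
y2 = X @ np.array([1.5, -2.0, 0.7]) + 4.0 + rng.normal(0, 1, size=100)
multi = LinearRegression().fit(X, y2)
print(multi.coef_, multi.intercept_)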
Advantages:
Interpretability: Linear regression models are easy to interpret, making it straightforward to
understand the relationship between the independent and dependent variables.
Computational Efficiency: Linear regression models are computationally efficient and can
handle large datasets with ease.
Versatility: Linear regression can be applied to both continuous and categorical variables,
making it suitable for a wide range of applications.
Baseline Model: Linear regression serves as a baseline model for more complex machine
learning algorithms, providing a simple and interpretable benchmark for comparison.
Challenges:
Assumption Violations: Linear regression relies on several assumptions that may not hold true
in real-world datasets, such as the linearity and independence of variables.
Overfitting/Underfitting: Linear regression may suffer from underfitting (when the model is
too simple to capture the underlying patterns) or overfitting (when the model is too complex
and captures noise in the data).
Limited Complexity: Linear regression assumes a linear relationship between the variables,
which may not always be the case in practice.
In summary, linear regression is a powerful and widely used technique for modelling the
relationship between variables. By understanding its principles, assumptions, and applications,
you can leverage linear regression to gain insights from data and make informed decisions in
various domains.
Linear regression analysis is a systematic approach used to understand and model the
relationship between one or more independent variables and a dependent variable. Linear
Regression Analysis consists of more than just fitting a linear line through a cloud of data
points. It consists of three stages:
1. Analysing the Correlation and Directionality of the Data:
Before fitting a linear model, it's essential to analyse the correlation and directionality
of the data. This involves examining the relationships between the independent and dependent
variables. Correlation analysis helps determine the strength and direction of the linear
relationship between variables. It quantifies the degree to which changes in one variable are
associated with changes in another variable.
Directionality analysis examines whether the relationship between variables is positive
or negative. A positive correlation indicates that as one variable increases, the other variable
also tends to increase, while a negative correlation indicates the opposite.
Scatter plots, correlation coefficients (such as Pearson's correlation coefficient), and
hypothesis testing (such as t-tests) are commonly used techniques for analyzing correlation and
directionality.
2. Estimating the Model - Fitting the Line:
Once the correlation and directionality of the data are understood, the next stage
involves estimating the linear model by fitting a line to the data.
In simple linear regression, a straight line is fitted to the data using the method of least squares.
The goal is to minimize the sum of squared differences between the observed values and the
values predicted by the model.
3. Evaluating the Validity and Usefulness of the Model:
Once the model is estimated, it's important to evaluate its validity and usefulness to
ensure that it provides meaningful insights and predictions.
Model evaluation involves assessing several aspects, including:
Goodness of fit: How well does the model fit the data? This can be evaluated using metrics
such as R-squared (coefficient of determination) and adjusted R-squared.
Residual analysis: Are the residuals (the differences between observed and predicted values)
normally distributed and homoscedastic? Residual plots and statistical tests (such as Shapiro-
Wilk test) can be used to assess this.
Significance of coefficients: Are the coefficients statistically significant? Hypothesis testing
(such as t-tests) can be used to determine whether the coefficients are different from zero.
Finally, the model should be validated using external data or compared to alternative models
to ensure its robustness and generalizability.
The image above is a conceptual diagram outlining the linear regression process.
Data Collection: This initial step involves gathering the data that will be used to train the linear
regression model. The data should be relevant to the problem you're trying to solve and include
independent variables (predictor variables) and a dependent variable (target variable).
o Mean squared error (MSE): This metric measures the average squared
difference between the predicted values and the actual values.
o Root mean squared error (RMSE): The square root of the MSE.
• Model Refinement: Based on the evaluation results, you may need to refine your model.
This could involve:
o Trying different transformations of the variables
o Selecting a different model complexity (e.g., adding or removing terms from
the linear equation)
o Collecting more data
• Prediction: Once you have a well-performing model, you can use it to make predictions
on new data. For instance, you can predict the value of the dependent variable for new
observations based on the values of the independent variables.
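For concreteness, the short sketch below computes MSE, RMSE, and R-squared for a handful of made-up actual and predicted values.

```python
# Sketch of the evaluation metrics mentioned above on illustrative values.
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 7.5, 10.0])
y_pred = np.array([2.8, 5.4, 7.0, 10.5])

mse = mean_squared_error(y_true, y_pred)   # average squared difference
rmse = np.sqrt(mse)                        # same units as the target variable
r2 = r2_score(y_true, y_pred)              # goodness of fit

print(f"MSE={mse:.3f}  RMSE={rmse:.3f}  R^2={r2:.3f}")
```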
The arrows in the diagram suggest that linear regression is an iterative process. You may need
to go back and forth between different steps as you clean your data, explore relationships,
evaluate your model, and refine it.
Key Concepts:
Linear Relationship:
Multilinear regression assumes a linear relationship between the independent variables and the dependent variable. The relationship is represented by a linear equation of the form:
y = β0 + β1x1 + β2x2 + … + βnxn + ε
where y is the dependent variable, x1 … xn are the independent variables, β0 is the intercept, β1 … βn are the regression coefficients, and ε is the error term.
Advantages:
Multiple Predictors: Multilinear regression allows for the analysis of the combined effects of
multiple predictors on the dependent variable.
Interpretability: The coefficients in the multilinear regression model provide insights into the
strength and direction of the relationships between the independent variables and the dependent
variable.
Flexibility: Multilinear regression can accommodate both quantitative and categorical
independent variables, making it suitable for a wide range of data types.
Challenges:
Assumption Violations: Multilinear regression relies on assumptions such as linearity,
independence, homoscedasticity, and normality of residuals. Violations of these assumptions
can affect the validity of the model.
Overfitting/Underfitting: Multilinear regression models may suffer from overfitting (when
the model is too complex and captures noise in the data) or underfitting (when the model is too
simple to capture the underlying patterns).
Logistic Regression
Logistic regression is a process of modelling the probability of a discrete outcome given
an input variable. The most common logistic regression models a binary outcome; something
that can take two values such as true/false, yes/no, and so on.
Logistic regression is a powerful statistical technique used for modelling the
relationship between a binary dependent variable and one or more independent variables.
Despite its name, logistic regression is a classification algorithm rather than a regression
algorithm, as it is commonly used to predict the probability of occurrence of a binary outcome.
Example data (input value and binary outcome):
20 – No     50 – Yes
40 – No     45 – Yes
10 – Yes    15 – No
30 – No     35 – Yes
60 – No     85 – No
Key Concepts:
Binary Dependent Variable:
Logistic regression is used when the dependent variable (or response variable) is categorical
with two possible outcomes, typically coded as 0 and 1.
Examples of binary outcomes include yes/no, success/failure, pass/fail, and presence/absence.
Odds Ratio:
Logistic regression models the odds of the positive outcome rather than the outcome
itself. The odds ratio represents the odds of the positive outcome occurring relative to the odds
of the negative outcome. It is calculated as the ratio of the probability of the positive outcome
to the probability of the negative outcome.
Maximum Likelihood Estimation (MLE):
The parameters (coefficients) of the logistic regression model are estimated using
maximum likelihood estimation (MLE), a statistical method that maximizes the likelihood of
observing the given data under the assumed model. MLE finds the values of the coefficients
that make the observed data most probable given the assumed logistic regression model.
In summary, logistic regression is a versatile and widely used classification technique for
modelling the relationship between independent variables and a binary outcome. By
understanding its principles, applications, advantages, and challenges, analysts can leverage
logistic regression to make accurate predictions and informed decisions in various domains.
The image above is a diagram of a logistic regression model. Here's a detailed breakdown of
the components in the image:
S-shaped curve: This curve represents the logistic function, which is the core of logistic
regression. It maps the linear combination of the independent variables (x-axis) to a probability
value between 0 and 1 (y-axis). The S-shape ensures that the predicted probabilities always fall
within this range.
Independent variables (X): These are the predictor variables that you input into the model.
The model will learn the relationship between these variables and the dependent variable.
Linear combination of weights (ΣwX): This represents the weighted sum of the independent
variables, where each weight (w) reflects the model's learned importance of each variable in
predicting the outcome.
Threshold (θ): The threshold is a decision boundary on the x-axis of the S-shaped curve.
Values above the threshold are classified as positive outcomes (1), while values below the
threshold are classified as negative outcomes (0).
Predicted probability (y): This is the output of the logistic regression model, representing the
likelihood of a positive outcome (between 0 and 1) for a given set of independent variables.
1. Data Collection: You start by collecting data that includes the independent variables and
the dependent variable (binary outcome).
2. Model Training: The logistic regression model is trained on the data. During training, the
model learns the weights for each independent variable and the threshold value. This is
done by minimizing the difference between the predicted probabilities and the actual
outcomes in the data.
3. Prediction: Once trained, you can use the model to predict the probability of a positive
outcome for new data points. The model takes the new data point's independent variable
values as input, calculates the weighted sum, and then applies the sigmoid function to get
the predicted probability.
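The minimal sketch below follows these three steps on a tiny made-up one-feature dataset: the model learns weights, the sigmoid turns the weighted sum into a probability, and a 0.5 threshold yields the predicted class.

```python
# Minimal logistic regression sketch on illustrative one-feature data.
import numpy as np
from sklearn.linear_model import LogisticRegression

# 1. Data collection: feature values with a binary outcome (0 or 1).
X = np.array([[1], [2], [3], [4], [5], [6], [7], [8]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# 2. Model training: learn the weights and intercept.
clf = LogisticRegression().fit(X, y)

# 3. Prediction: the sigmoid gives a probability, thresholded at 0.5.
proba = clf.predict_proba([[4.5]])[0, 1]    # probability of the positive class
label = int(proba >= 0.5)
print(f"P(positive) = {proba:.2f} -> predicted class {label}")
```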
The image above is a diagram showing a high-level overview of the steps involved in building
a machine learning model.
1. Problem Definition
This initial stage involves clearly defining the problem you're trying to solve with a
machine learning model. Formulating a well-defined problem statement is crucial for the
success of the machine learning project, as it guides the entire process from data collection to
model selection and evaluation.
2. Data Collection
Once you have a clear understanding of the problem, you need to gather the data that
will be used to train the machine learning model. The data should be relevant to the problem
and include features (variables) that can potentially influence the outcome you want to predict.
Data can be collected from various sources such as databases, surveys, web scraping, or APIs.
3. Data Understanding
After collecting the data, it's essential to explore and analyze it to gain insights into its
properties, distribution, and potential challenges. This stage involves data cleaning tasks like
identifying and handling missing values, outliers, and inconsistencies. Exploratory data
analysis (EDA) may also involve visualizing the data using histograms, scatter plots, and box
plots to understand the relationships between variables.
4. Data Preparation
In this stage, the data is preprocessed to prepare it for use in a machine learning model.
Preprocessing steps may involve:
o Scaling: Scaling numerical features to a common range to prevent features with larger
values from dominating the model during training.
o Encoding: Transforming categorical variables into numerical representations that machine
learning models can understand. Common encoding techniques include one-hot encoding
and label encoding.
o Feature Engineering: Creating new features from existing ones that might be more
informative for the model.
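The sketch below illustrates the scaling and one-hot encoding steps with scikit-learn's ColumnTransformer; the column names and toy values are hypothetical.

```python
# Hedged sketch of the data preparation steps (scaling, one-hot encoding).
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import StandardScaler, OneHotEncoder

df = pd.DataFrame({
    "temperature": [65.0, 72.5, 58.1, 80.3],
    "pressure": [1.2, 1.5, 1.1, 1.9],
    "material": ["steel", "aluminium", "steel", "titanium"],
})

preprocess = ColumnTransformer([
    ("scale", StandardScaler(), ["temperature", "pressure"]),          # numeric features
    ("encode", OneHotEncoder(handle_unknown="ignore"), ["material"]),  # categorical feature
])

X = preprocess.fit_transform(df)   # ready to feed into a machine learning model
print(X.shape)
```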
5. Model Selection
Here, you choose a suitable machine learning algorithm for your task. The choice of
algorithm depends on several factors, including the nature of the problem (classification,
regression, etc.), the type of data you have (tabular, image, text), and the desired outcome.
Common machine learning algorithms include linear regression, decision trees, random forests,
support vector machines, and deep learning models like convolutional neural networks (CNNs)
for image recognition or recurrent neural networks (RNNs) for sequence data.
6. Model Training
In this stage, the machine learning model is trained on the prepared data. The training
process involves fitting the model to the data and learning the patterns and relationships
between the features and the target variable. Training algorithms iteratively adjust the model's
parameters to minimize the error between the model's predictions and the actual values in the
training data.
7. Model Evaluation
After training the model, it's crucial to evaluate its performance on a separate test
dataset to assess its generalizability and avoid overfitting. Overfitting refers to a scenario where
the model performs well on the training data but poorly on unseen data. Common metrics used
for evaluation depend on the machine learning task, but they may include accuracy, precision,
recall, F1-score, mean squared error (MSE), or root mean squared error (RMSE).
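Steps 6 and 7 can be tied together in a short, purely illustrative sketch: synthetic data is split into training and test sets, a model is fitted on the training portion, and the classification metrics named above are computed on the held-out test data.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)  # synthetic data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(random_state=0)
model.fit(X_train, y_train)            # step 6: model training
y_pred = model.predict(X_test)         # step 7: evaluate on unseen (test) data

print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("F1-score :", f1_score(y_test, y_pred))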
8. Model Deployment
If the model performs well on the test dataset, you can deploy it for real-world use.
Deployment involves integrating the model into a production environment where it can make
predictions on new data. For instance, the model could be deployed as a web service or
embedded into a mobile application.
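As an illustration of the "deployed as a web service" option only, the sketch below wraps a previously trained model in a tiny Flask application. It assumes a scikit-learn model saved with joblib; the file name, endpoint path, and request format are hypothetical.

import joblib
from flask import Flask, request, jsonify

app = Flask(__name__)
model = joblib.load("model.joblib")            # hypothetical file saved after training

@app.route("/predict", methods=["POST"])
def predict():
    payload = request.get_json()               # e.g. {"features": [[65.0, 1.2, 0.8]]}
    prediction = model.predict(payload["features"])
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    app.run(port=5000)                         # serve predictions over HTTP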
9. Model Monitoring
Once deployed, it's essential to monitor the model's performance over time to ensure it
continues to make accurate predictions. Monitoring may involve tracking the model's
evaluation metrics on new data and retraining the model with fresh data if its performance
degrades.
Unit IV
Introduction to
Power BI
Introduction:
Power BI is a powerful business analytics tool developed by Microsoft that enables users to
visualize and analyse data from various sources, gain insights, and make data-driven decisions.
It offers a suite of features for data preparation, modelling, visualization, and collaboration,
making it a popular choice for businesses of all sizes.
Development:
Power BI was first introduced by Microsoft in 2013 as a cloud-based business intelligence
service.
Over the years, it has evolved into a comprehensive suite of tools, including Power BI Desktop
(for authoring reports), Power BI Service (for sharing and collaborating on reports), and Power
BI Mobile (for accessing reports on mobile devices).
Microsoft continues to invest in Power BI, regularly releasing updates and new features to
enhance functionality and usability.
Data Connectivity:
Power BI supports connectivity to a wide range of data sources, including Excel spreadsheets,
SQL databases, cloud services (such as Azure, Google Analytics, and Salesforce), and on-
premises data sources. Users can import data directly into Power BI or establish live
connections to maintain real-time access to data.
Data Preparation:
Power BI offers robust data preparation capabilities, allowing users to clean, transform, and
shape data using a user-friendly interface. Features such as data profiling, data cleansing, and
data modelling simplify the process of preparing data for analysis.
Data Modelling:
Users can create relationships between different datasets, define calculated columns and
measures, and build data models using Power BI's intuitive modelling tools. Power BI's Data
Analysis Expressions (DAX) language enables advanced calculations and aggregation
functions for creating sophisticated analytical models.
Visualization:
Power BI provides a rich set of visualization options, including bar charts, line graphs, pie
charts, maps, and custom visuals. Users can customize and format visualizations to suit their
preferences, adding titles, labels, colors, and interactive elements to enhance readability and
engagement.
Dashboarding:
Power BI allows users to create interactive dashboards by pinning visualizations from multiple
reports onto a single canvas. Dashboards can be customized with filters, slicers, and other
interactive controls to enable users to explore data dynamically.
Collaboration and Sharing:
Power BI Service enables users to publish, share, and collaborate on reports and dashboards
with colleagues within or outside their organization. Features such as content packs, apps, and
workspaces facilitate seamless collaboration and knowledge sharing across teams.
Data Visualization:
Data visualization is the graphical representation of data using charts, graphs, maps, and other
visual elements. Visualization enables users to explore and understand complex datasets more
effectively by presenting information in a visually intuitive manner.
Common types of data visualizations include bar charts, line graphs, pie charts, scatter plots,
heatmaps, and geospatial maps. Interactive visualization tools allow users to manipulate and
explore data dynamically, enabling deeper insights and analysis.
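A minimal Python/matplotlib sketch of two of the chart types mentioned above is given here as an illustration; the monthly figures are made up.

import matplotlib.pyplot as plt

months = ["Jan", "Feb", "Mar", "Apr"]
output = [120, 135, 128, 150]                  # e.g. units produced per month (illustrative)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.bar(months, output)                        # bar chart: compare categories
ax1.set_title("Monthly output (bar chart)")
ax2.plot(months, output, marker="o")           # line graph: show a trend over time
ax2.set_title("Output trend (line graph)")
plt.tight_layout()
plt.show()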
Benefits:
Enhances Understanding: Visual representations of data make it easier to understand
complex relationships and patterns, even for non-technical users.
Facilitates Decision-Making: Visualizations provide actionable insights that support decision-
making processes, helping stakeholders identify trends, opportunities, and areas for
improvement.
Improves Communication: Visualizations serve as a powerful communication tool, enabling
stakeholders to share insights and findings with others more effectively.
Supports Data Exploration: Interactive visualizations allow users to explore data
dynamically, drilling down into specific details and uncovering hidden insights.
Applications:
Power BI finds applications across various industries and functions, including:
➢ Business Intelligence and Analytics
➢ Financial Reporting and Analysis
➢ Sales and Marketing Analytics
➢ Operations and Supply Chain Management
➢ Human Resources and Talent Management
Building Blocks of Power BI:
1. Visualizations
A visualization is a representation of data in a visual format. It could be a line chart, a bar
graph, a color-coded map, or any other visual way to present the data.
2. Datasets
A dataset is a collection of data that Power BI uses to create its visualizations. You can
have a simple dataset based on a single table from a Microsoft Excel workbook.
A dataset can also be built from many different sources, which Power BI can filter and
combine into a single dataset for use.
For example: One data source contains countries and locations in the form of latitude and
longitude. Another data source contains demographics of these countries like population and
GDP. Power BI can combine these two data sources into one dataset which can be used for
visualizations.
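Power BI performs this combination through its data model and connectors; the underlying idea, though, is an ordinary join. A conceptual sketch of the same example in Python/pandas is shown below (the column names are invented and the figures are only approximate, for illustration).

import pandas as pd

locations = pd.DataFrame({
    "country": ["India", "Japan"],
    "latitude": [20.6, 36.2],
    "longitude": [78.9, 138.2],
})
demographics = pd.DataFrame({
    "country": ["India", "Japan"],
    "population_millions": [1428, 124],
    "gdp_trillion_usd": [3.7, 4.2],
})

dataset = locations.merge(demographics, on="country")   # one combined dataset
print(dataset)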
An important feature of Power BI is the ability to connect to various data sources
using its connectors. Whether the data you want is in Excel or a Microsoft SQL Server
database, in Azure or Oracle, or in a service like Facebook, Salesforce, or MailChimp, Power
BI has built-in data connectors that let you easily connect to that data, filter it if necessary,
and bring it into your dataset.
After you have a dataset, you can begin creating visualizations that show different
portions of it in different ways, and gain insights based on what you see. That is where
reports come in.
3. Reports
In Power BI, a report is a collection of related visualizations that appear together on one
or more pages. For example, a report built from the Gapminder data might look at GDP,
population, and life expectancy by global region.
Reports let us create and structure visualizations on pages based on the way we want to
tell the story.
4. Dashboards
A Power BI dashboard is a collection of visuals from a single page that you can share with
others. Often it is a selected group of visuals that provide quick insight into the data or story
you are trying to present.
A dashboard must fit on a single page, often called a canvas (the canvas is the blank
backdrop in Power BI Desktop or the service, where you put visualizations). Think of it like
the canvas that an artist or painter uses — a workspace where you create, combine, and
rework interesting and compelling visuals. You can share dashboards with other users or
groups, who can then interact with your dashboards when they’re in the Power BI service
or on their mobile device.
Dashboards in Power BI
Introduction:
Dashboards in Power BI serve as dynamic, interactive canvases for presenting key insights and
performance metrics derived from data analysis. They enable users to monitor, analyze, and
share real-time information in a visually compelling and intuitive manner.
Development:
Power BI Dashboards were introduced as part of the Power BI suite by Microsoft to
enhance data visualization and decision-making capabilities. Continual updates and
enhancements to Power BI have refined dashboard creation, offering users an ever-expanding
array of features and capabilities.
Visualizations:
Power BI Dashboards support a wide range of visualizations, including charts, graphs,
maps, and custom visuals, allowing users to represent data in a meaningful and impactful way.
Users can select from a library of pre-built visualizations or create custom visualizations using
Power BI's developer tools.
Interactivity:
Dashboards in Power BI are highly interactive, enabling users to explore data
dynamically by applying filters, drill-downs, and slicers to visualize specific insights.
Interactive elements such as buttons, bookmarks, and tooltips further enhance user engagement
and exploration.
Real-Time Updates:
Power BI Dashboards can display real-time data updates, ensuring that users have
access to the most current information available. Live connections to data sources enable
automatic refreshes and ensure that dashboards reflect the latest changes in underlying data.
Mobile Accessibility:
Power BI Dashboards are optimized for mobile devices, allowing users to access and
interact with dashboards on smartphones and tablets using the Power BI Mobile app.
Responsive design ensures that dashboards adapt to different screen sizes and orientations for
a seamless mobile experience.
Customization:
Users can customize dashboards in Power BI to suit their specific needs and
preferences, including layout, colors, branding, and interactive features.
Custom visuals and themes enable users to create unique and visually stunning dashboards
tailored to their audience.
Advantages:
Actionable Insights: Power BI Dashboards provide users with actionable insights derived
from data analysis, enabling informed decision-making and driving business outcomes.
Visualization: Visualizations in Power BI Dashboards make complex data more accessible and
understandable, facilitating communication and collaboration across teams.
Interactivity: Interactive features empower users to explore data dynamically, uncovering
hidden patterns and trends that drive deeper understanding and analysis.
Real-Time Monitoring: Real-time updates ensure that dashboards reflect the latest
information, enabling timely decision-making and response to changing business conditions.
Consider the flow of information and prioritize the most important insights to be displayed
prominently on the dashboard.
Step 4: Add Interactivity
Enhance your dashboard with interactive features such as slicers, filters, and drill-
downs to enable users to explore the data dynamically.
Use slicers to filter data based on specific criteria, such as time periods, regions, or product
categories. Enable cross-filtering and cross-highlighting between visualizations to maintain
context and interactivity.
Step 5: Create Dashboard Tiles
Once you have designed your dashboard layout and added interactivity, you can convert
individual visualizations into dashboard tiles.
Select the visualizations you want to include on the dashboard and click the "Pin Live Page"
or "Pin to Dashboard" button to pin them to the dashboard.
Choose the dashboard to which you want to pin the visualization tiles or create a new
dashboard.
Step 6: Customize Dashboard Tiles
After pinning visualizations to the dashboard, customize the dashboard tiles to optimize
their appearance and functionality.
Resize and rearrange tiles to create a visually appealing layout that fits the dashboard canvas.
Add titles, subtitles, and descriptions to provide context and guidance for users navigating the
dashboard.
Step 7: Publish and Share the Dashboard
Once you have finalized your dashboard design, publish it to the Power BI Service to
share it with others. Click the "Publish" button in Power BI Desktop to upload the dashboard
to the Power BI Service. Share the dashboard with colleagues or stakeholders by granting them
access permissions or embedding it in SharePoint, Teams, or other applications.
Step 8: Monitor and Maintain the Dashboard
Regularly monitor the performance and usage of your dashboard in the Power BI
Service. Use usage metrics and feedback from users to identify areas for improvement and
optimization. Update the dashboard as needed to reflect changes in data or business
requirements, ensuring that it remains relevant and valuable to users.
Unit V
Advanced Excel
Techniques
Introduction:
Microsoft Excel is part of the Microsoft Office suite of productivity software, which also
includes Word, PowerPoint, Outlook, and Access.
It was first released in 1985 for the Apple Macintosh and later for Microsoft Windows in 1987,
becoming one of the most popular spreadsheet applications globally. Excel uses a grid of cells
organized into rows and columns, where users can enter data, perform calculations, and create
charts and graphs.
Excel includes advanced data analysis tools such as PivotTables, PivotCharts, and Data
Analysis ToolPak, which enable users to summarize, filter, and analyze large datasets quickly.
These tools facilitate tasks such as trend analysis, regression analysis, and hypothesis testing,
helping users gain insights from their data.
Data Visualization:
Excel allows users to create visually appealing dashboards and reports by combining charts,
graphs, and tables on a single worksheet.
Users can use features like conditional formatting and sparklines to highlight key trends and
outliers in their data visually.
Applications:
Microsoft Excel is widely used across various industries and sectors for diverse purposes,
including:
➢ Financial Analysis and Budgeting
➢ Business Planning and Forecasting
➢ Data Management and Reporting
➢ Project Management and Task Tracking
➢ Academic Research and Statistical Analysis
➢ Business Intelligence: Visualizations are widely used in business intelligence for
monitoring key performance indicators (KPIs), analyzing trends, and identifying areas for
optimization.
➢ Data Analytics: Data visualizations aid in exploratory data analysis, hypothesis testing,
and model evaluation, enabling data scientists to derive insights and validate findings.
➢ Presentations and Reports: Visualizations enhance the presentation of findings in reports,
presentations, and dashboards, making information more engaging and accessible to
audiences.
Advantages:
User-Friendly Interface: Excel's intuitive interface makes it accessible to users with varying
levels of expertise.
Versatility: Excel can handle a wide range of tasks, from simple calculations to complex data
analysis and modelling.
Integration: Excel integrates seamlessly with other Microsoft Office applications and third-
party software, enhancing productivity and collaboration.
Cost-Effective: Excel is widely available and relatively inexpensive compared to specialized
data analysis software.
Features of MS Excel
Ribbon
The Ribbon in MS-Excel is the topmost row of tabs that provides the user with different
facilities/functionalities. These tabs are:
Home Tab
It provides the basic facilities like changing the font, size of text, editing the cells in the
spreadsheet, autosum, etc.
Insert Tab
It provides the facilities like inserting tables, pivot tables, images, clip art, charts, links, etc.
Page Layout
It provides all the facilities related to the layout of the spreadsheet, such as margins,
orientation, height, width, background, etc. The worksheet will appear the same in the hard
copy (printout) as well.
Formulas
It is a package of different in-built formulas/functions which the user can apply simply by
selecting the cell or range of cells that hold the values.
Data
The Data tab helps to perform different operations on large sets of data, such as analysis
through what-if analysis tools and many other data analysis tools, removing duplicate data,
transposing rows and columns, etc. It also helps to access data from different sources, such
as MS Access, the web, etc.
Review
This tab provides facilities such as the thesaurus, spell checking, and text translation, and
helps to protect and share the worksheet and workbook.
View
It contains the commands to manage the view of the workbook, such as showing/hiding the
ruler and gridlines, freezing panes, and adding macros.
ARRAY Formulas:
Array formulas enable users to perform calculations on arrays of data rather than individual
cells. They are particularly useful for performing complex calculations, such as matrix
operations, statistical analysis, and multi-criteria calculations.
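Excel's own array-formula syntax is not reproduced here; the sketch below is a conceptual Python/NumPy analogue of the same "operate on a whole range at once" idea, with values invented for illustration.

import numpy as np

quantity = np.array([10, 4, 7, 3])
unit_price = np.array([2.5, 10.0, 4.0, 8.0])

revenue_per_item = quantity * unit_price          # element-wise, like an array formula
total_revenue = np.sum(quantity * unit_price)     # like the array formula {=SUM(A2:A5*B2:B5)}
large_cheap_orders = np.sum((quantity > 5) & (unit_price < 5))   # multi-criteria count
print(revenue_per_item, total_revenue, large_cheap_orders)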
Some other functions useful for data analysis in Excel are CONCAT, SUBSTITUTE, TRIM, FILTER, etc.
Dashboards in Excel are powerful tools for visually summarizing and presenting key insights
from data in a concise and interactive format. They allow users to monitor performance, track
trends, and make data-driven decisions more effectively. Here's an overview of dashboards in
Excel:
Introduction:
A dashboard is a visual representation of data that provides a snapshot view of key performance
indicators (KPIs), metrics, and trends.
Excel dashboards typically consist of charts, graphs, tables, and interactive elements arranged
on a single worksheet or across multiple sheets.
Interactive Controls:
Excel dashboards can incorporate interactive elements such as drop-down lists, scroll bars, and
option buttons to enable users to filter and explore data dynamically.
These controls enhance user engagement and allow for more personalized data analysis.
Conditional Formatting:
Conditional formatting is used to highlight important data points and trends visually.
Users can apply formatting rules based on specified criteria, such as color scales, data bars, and
icon sets, to draw attention to significant changes or outliers in the data.
Data Modelling and Automation:
Excel offers various automation techniques to streamline repetitive tasks and increase
productivity.
Macros: Macros are recorded sequences of actions that automate repetitive tasks in Excel.
They can be created using the Macro Recorder or written in VBA (Visual Basic for
Applications) code.
Data Validation: Data validation rules restrict the type and format of data that users can
input into cells, ensuring data accuracy and consistency.
Power Query: Power Query is a data connection and transformation tool in Excel that
automates the process of importing, cleaning, and shaping data from various sources.
Power Pivot: Power Pivot is an Excel add-in that enhances data analysis capabilities by
enabling users to create data models, perform advanced calculations, and generate interactive
reports.
Benefits:
Efficiency: Data modelling and automation techniques in Excel streamline workflows,
reduce manual effort, and increase productivity.
Accuracy: By automating calculations and data manipulation tasks, Excel helps minimize
errors and ensure data accuracy.
Consistency: Automation techniques ensure consistency in data formatting, calculations, and
reporting, leading to standardized and reliable outputs.
Scalability: Excel's data modelling and automation capabilities can scale to handle large
datasets and complex analysis requirements, making it suitable for both small-scale and
enterprise-level applications.
Best Practices for Dashboard Design:
Define Objectives:
Clearly define the purpose and objectives of the dashboard to ensure it meets the needs of its
intended audience.
Choose Relevant Metrics:
Select key metrics and KPIs that align with the objectives of the dashboard and provide
actionable insights.
Keep it Simple:
Avoid clutter and information overload by focusing on essential data points and using clear,
concise visuals.
Use Consistent Formatting:
Maintain consistency in formatting, color schemes, and font styles to enhance readability and
visual appeal.
Test and Iterate:
Test the dashboard with end-users and gather feedback to identify areas for improvement.
Iterate on the design based on user input and evolving business requirements.