Unit 3 - Data Science, Machine Learning

The document discusses the history and evolution of machine learning and artificial intelligence. It covers early concepts from ancient philosophers through the development of computers and the modern era of deep learning and big data. Key algorithms and approaches are explained, including neural networks, reinforcement learning, and natural language processing.

Uploaded by

badaltanwarr
Copyright
© © All Rights Reserved

Section C

Data Science, Machine learning – history and evolution, AI evolution,
statistics vs. data mining vs. data analytics vs. data science.
Supervised and unsupervised learning.
Important Definitions
• Data mining is a process that uses statistical, mathematical, and artificial
intelligence techniques to extract and identify useful information and
subsequent knowledge (or patterns) from large sets of data.
• More formally, data mining is "the nontrivial process of identifying valid,
novel, potentially useful, and ultimately understandable patterns in data
stored in structured databases," where the data are organized in records
structured by categorical, ordinal, and continuous variables.
• Machine learning is the process by which a computer learns from
experience (e.g., using programs that can learn from historical cases).
Machine learning - History and Evolution
Early Concepts (Ancient to 1940s):
• Ancient philosophers and mathematicians like Aristotle and Pythagoras laid the
foundation for logical reasoning and mathematical principles, which are
fundamental to machine learning.
• In the 17th century, the philosopher and mathematician Gottfried Wilhelm Leibniz
proposed the idea of automating reasoning through a symbolic calculus.
• In the 19th century, George Boole's work on Boolean algebra provided a
mathematical framework for logic, which became essential for machine learning
algorithms.
• Alan Turing's 1936 paper on the Turing machine introduced the concept of a
universal machine that could simulate any other machine. This idea is fundamental
to modern computing and machine learning.
Early Computational Models (1940s-1950s):
• The development of electronic computers in the mid-20th century allowed
researchers to experiment with computational models of learning and artificial
intelligence (AI).
• In 1950, Alan Turing published a paper titled "Computing Machinery and
Intelligence," which introduced the Turing Test as a measure of a machine's ability
to exhibit intelligent behavior.

Rule-Based Systems (1950s-1960s):


• Early AI research focused on rule-based systems and expert systems, which used
predefined rules to make decisions or solve problems.
• The General Problem Solver (GPS), developed by Allen Newell and Herbert A.
Simon, was one of the first computer programs capable of solving a wide range of
problems.
Symbolic AI and Knowledge-Based Systems (1960s-1980s):


• Symbolic AI, also known as "good old-fashioned AI" (GOFAI), dominated this era.
Researchers used formal logic and symbols to represent knowledge and make
inferences.
• Expert systems, such as MYCIN (used for medical diagnosis) and Dendral (used for
chemistry), gained popularity during this time.

Connectionism and Neural Networks (1980s-1990s):

• The field of artificial neural networks gained attention, inspired by the human
brain's neural structure.
• Backpropagation, a key algorithm for training neural networks, was popularized
in the mid-1980s.
• However, neural networks fell out of favor in the 1990s because of their
practical limitations at the time and the dominance of other AI approaches.
Reinforcement Learning and Support Vector Machines (1990s-2000s):

• Reinforcement learning gained prominence through algorithms such as Q-learning
and their application to game-playing agents.
• Support vector machines (SVMs) became popular for classification tasks.

Big Data and Deep Learning (2010s-Present):


• The advent of big data and powerful hardware led to a resurgence of neural
networks, particularly deep learning.
• Convolutional neural networks (CNNs) revolutionized image recognition, and
recurrent neural networks (RNNs) improved sequence modeling.
• Advances in deep learning contributed to significant breakthroughs in speech
recognition, natural language processing, and computer vision.
• Reinforcement learning saw notable successes, with deep reinforcement learning
methods achieving superhuman performance in games like Go and Dota 2.
Data Science
• Data science is commonly defined as a methodology by which
actionable insights can be inferred from data.
• In general, data science allows us to adopt four different
strategies to explore the world using data:
1. Probing reality: Data can be gathered by passive or by active
methods. In the latter case, data represents the response of
the world to our actions.
• Analysis of those responses can be extremely valuable when
it comes to taking decisions about our subsequent actions.
• One example is the use of A/B testing for web development:
what is the best button size and color?
• The best answer can only be found by probing the world.
2. Pattern discovery: Divide and conquer is an old heuristic used to
solve complex problems.
• Datafied problems can be analyzed automatically to discover useful
patterns and natural clusters that can greatly simplify their solutions.
• The use of this technique to profile users is a critical ingredient
today in such important fields as programmatic advertising or digital
marketing.
3. Predicting future events: Predictive analytics allows decisions to be
taken in anticipation of future events, not only reactively.
• The identification of predictable events represents valuable
knowledge.
• For example, predictive analytics can be used to optimize the tasks
planned for retail store staff during the following week, by
analyzing data such as weather, historic sales, traffic conditions, etc.
4. Understanding people and the world: The development of
deep learning methods for natural language understanding and
for visual object recognition is a good example of this kind of
research.
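The A/B testing example under strategy 1 can be made concrete with a small sketch. It compares the conversion rates of two button variants using a two-proportion z-test. The visitor and click counts are made up for illustration, and the function name is an assumption, not something taken from the slides.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: both variants convert equally."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: variant A (small button) vs. variant B (large button).
z, p = two_proportion_z_test(conv_a=120, n_a=2400, conv_b=156, n_b=2400)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the variants is unlikely to be due to chance alone. The cut-off (0.05 is a common convention) is a choice the analyst has to make before running the experiment.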
Artificial Intelligence
• An artificial intelligence is a system that can learn how to learn.
• AI is the subfield of computer science concerned with symbolic reasoning and
problem-solving.
• It can also be described as a series of instructions (an algorithm) that allows
computers to write their own algorithms without being explicitly programmed to do so.
• Artificial Intelligence (AI) is a broad and interdisciplinary field of computer
science that focuses on creating intelligent agents or systems capable of
simulating human-like cognitive processes.
• These systems aim to perform tasks that typically require human intelligence,
such as understanding natural language, recognizing patterns, making
decisions, and learning from experience.
• AI is an interdisciplinary field that covers (and requires) the study of manifold
sub-disciplines, such as natural language processing and computer vision, as well as
the Internet of Things and robotics.
Artificial Intelligence - Some key aspects and concepts
• Machine Learning: Machine learning is a subset of AI that focuses on developing
algorithms and models that allow computers to learn from data and improve
their performance on a specific task over time. Common machine-learning
techniques include supervised learning, unsupervised learning, and
reinforcement learning.
• Deep Learning: Deep learning is a subfield of machine learning that involves
neural networks with multiple layers (deep neural networks). It has been highly
successful in tasks such as image and speech recognition and natural language
processing.
• Natural Language Processing (NLP): NLP is a branch of AI that deals with the
interaction between computers and human language. It enables computers to
understand, interpret, and generate human language, facilitating applications
like chatbots, translation, and sentiment analysis.
• Computer Vision: Computer vision involves teaching computers to interpret and
understand visual information from the world, such as images and videos. It has
applications in facial recognition, object detection, and autonomous vehicles.
• Robotics: AI plays a crucial role in robotics, enabling robots to perceive their
environment, make decisions, and perform tasks autonomously. This has
applications in industries like manufacturing, healthcare, and space exploration.

• Reinforcement Learning: In reinforcement learning, agents learn to make decisions
by interacting with an environment. They receive feedback in the form of rewards
or penalties and aim to maximize their cumulative reward over time. This approach
is used in autonomous systems, game playing, and robotics.
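The reward-driven loop described under reinforcement learning can be sketched with tabular Q-learning, the algorithm named in the history section. The five-cell corridor environment, the reward values, and the hyperparameters below are hypothetical choices made only for illustration.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal (hypothetical environment)
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

def step(state, action):
    """Apply an action; reward 1.0 on reaching the goal, else a small penalty."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01
    return next_state, reward, done

def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit, sometimes explore; ties are broken randomly.
            if rng.random() < EPSILON or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q_table = train()
# Greedy policy for every non-goal cell.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

The agent is never told that moving right is correct. It only sees rewards and penalties, and the learned table ends up favoring the action with the higher cumulative reward.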
Data Mining Vs Statistics
| Data Mining | Statistics |
|---|---|
| Exploratory: dig into the data first, discover novel patterns, and then form theories. | Confirmatory: propose a theory first and then test it using various statistical tools. |
| Involves data cleaning. | Statistical methods are applied to clean data. |
| Usually involves working with large datasets. | Usually involves working with small datasets. |
| Makes generous use of heuristic thinking. | Leaves no scope for heuristic thinking. |
| Inductive process. | Deductive process (does not involve making any predictions). |
| Handles numeric and non-numeric data. | Handles numeric data. |
| Less concerned about data collection. | More concerned about data collection. |
| Popular methods include estimation, classification, neural networks, clustering, association, and visualization. | Popular methods include inferential and descriptive statistics. |
Data Science Vs Data Analytics:
Data science is an umbrella term that encompasses data analytics, data mining, machine learning,
and several other related disciplines. While a data scientist is expected to forecast the future based
on past patterns, data analysts extract meaningful insights from various data sources. A data
scientist creates questions, while a data analyst finds answers to an existing set of questions.
| Aspect | Data Science | Data Analytics |
|---|---|---|
| Scope | Macro: data science encompasses a broader range of activities, including data collection, data cleaning, data transformation, machine learning, statistical analysis, and data visualization. | Micro: data analytics is more focused on processing and analyzing structured data using various techniques such as descriptive statistics, data mining, and business intelligence. |
| Objective | Data science is a multidisciplinary field that aims to extract insights, knowledge, and predictions from complex and unstructured data. It often involves asking open-ended questions and exploring data to discover new patterns and trends. | Data analytics focuses on examining historical data to identify trends, draw conclusions, and support decision-making. Its primary goal is to provide answers to specific questions and solve well-defined problems. |
| Techniques | Data scientists use a wide variety of techniques, including machine learning algorithms, statistical modeling, and deep learning, to develop predictive models and gain a deep understanding of data. | Data analysts primarily use descriptive and diagnostic analytics techniques to summarize data, identify trends, and gain insights. While some basic predictive analytics might be involved, the focus is less on building complex predictive models. |
| Role | Data scientists are responsible for developing complex models, creating algorithms, and designing experiments to solve business problems. They have a strong background in computer science, mathematics, and domain expertise. | Data analysts play a key role in generating reports, dashboards, and visualizations to support operational and tactical decisions within an organization. They often work closely with business stakeholders. |
| Tools | Data scientists use programming languages like Python and R extensively, along with tools like Jupyter notebooks and libraries such as TensorFlow and scikit-learn. | Data analysts commonly use tools like Excel, Tableau, Power BI, and SQL for data analysis and reporting. They may not need extensive programming or machine learning expertise. |
| Output | The primary output of data science includes predictive models, data-driven recommendations, and actionable insights that drive decision-making at a strategic level within an organization. | Data analytics produces reports, charts, and dashboards that provide a clear picture of historical performance, enabling businesses to make informed decisions for day-to-day operations and short-term planning. |
Supervised and Unsupervised Learning
| Aspect | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Objective | In supervised learning, the algorithm learns to map input data to a known target or output variable. The primary goal is to make predictions or classify data based on labeled examples. | Unsupervised learning, in contrast, deals with unlabeled data and seeks to discover patterns, structures, or relationships within the data without the guidance of predefined target variables. |
| Labeled data | Supervised learning requires a labeled dataset, where each data point has associated target values or class labels. These labels serve as the ground truth for the learning algorithm. | Unsupervised learning algorithms work with data that lacks explicit labels or categories. The goal is to find hidden structures or groupings within the data. |
| Training process | During training, the algorithm adjusts its model parameters to minimize the difference between its predictions and the true labels. Common supervised learning tasks include regression (predicting a continuous value) and classification (assigning data points to predefined classes). | Unsupervised learning is closer to true artificial intelligence, as it learns much as a child learns everyday routines from experience. |
| Examples | Some common examples of supervised learning tasks include predicting house prices based on features like square footage and location (regression) or classifying emails as spam or not spam (classification). | Clustering customer data to identify distinct customer segments, reducing the dimensionality of image data for feature extraction, and topic modeling for text data are examples of unsupervised learning applications. |
| Evaluation | Supervised learning models are evaluated based on their ability to accurately predict or classify new, unseen data. Common evaluation metrics include accuracy, precision, recall, and mean squared error, among others. | Unsupervised learning models are typically evaluated differently from supervised models. Evaluation often involves measuring the quality of the discovered patterns or structures. However, evaluation can be more subjective and context-dependent in unsupervised learning. |
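The supervised evaluation metrics named in the table (accuracy, precision, recall, and mean squared error) can be computed directly from their definitions. The sketch below uses made-up spam labels and house prices; the function names are illustrative and do not come from any particular library.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical classification data: 1 = spam, 0 = not spam.
spam_true = [1, 0, 1, 1, 0, 0, 1, 0]
spam_pred = [1, 0, 0, 1, 0, 1, 1, 0]
accuracy, precision, recall = classification_metrics(spam_true, spam_pred)
print(accuracy, precision, recall)

# Hypothetical regression data: house prices in thousands.
price_true = [200, 250, 300]
price_pred = [210, 240, 330]
mse = mean_squared_error(price_true, price_pred)
print(mse)
```

Accuracy counts all correct predictions, while precision and recall look only at the positive class, which matters when one class (such as spam) is much rarer than the other.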
In summary, the key difference between supervised
and unsupervised learning lies in the presence or
absence of labeled data and the primary objectives.
Supervised learning focuses on making predictions
or classifications based on labeled data, while
unsupervised learning aims to discover hidden
patterns or structures in unlabeled data. Each
approach has its own set of algorithms, techniques,
and applications suited to specific problem
domains.
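The contrast summarized above can be illustrated on a single toy dataset. In this sketch the supervised model (a nearest-centroid classifier) is given the labels, while the unsupervised routine (a two-cluster k-means) sees only the raw numbers. The customer spending figures, the labels, and all function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def fit_nearest_centroid(points, labels):
    """Supervised: use the labels to compute one centroid per class."""
    return {label: mean([x for x, y in zip(points, labels) if y == label])
            for label in set(labels)}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

def kmeans_two_clusters(points, iterations=10):
    """Unsupervised: split points into two groups without any labels."""
    centers = [min(points), max(points)]  # simple deterministic initialization
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for x in points:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            clusters[nearest].append(x)
        # Recompute each center; keep the old one if a cluster ends up empty.
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical monthly spend per customer, with labels for the supervised case only.
spend = [12, 15, 18, 14, 80, 95, 88, 91]
segment = ["low", "low", "low", "low", "high", "high", "high", "high"]

model = fit_nearest_centroid(spend, segment)
print(predict(model, 20), predict(model, 70))

centers, clusters = kmeans_two_clusters(spend)
print(centers, clusters)
```

Both approaches recover the same two groups here, but only the supervised model can name them "low" and "high". The clustering result is just two unlabeled groups that an analyst still has to interpret.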
Important Questions
• Differentiate between supervised and unsupervised learning.
• What is the primary difference between supervised and unsupervised learning,
and how does it impact the way each approach is used in machine learning
tasks?
• What are the fundamental distinctions between data science and data analytics,
and how do these differences impact the roles and responsibilities associated with
each field?
• Differentiate between data science and data analytics.
• Provide an in-depth discussion of a specific aspect of artificial
intelligence, such as natural language processing, computer vision, reinforcement
learning, or any other area of your expertise or interest.
• Discuss some aspect of artificial intelligence.
• Discuss data science and outline the strategies and techniques that data scientists
employ to extract insights from data.
• What is the concept of data science, and how do data scientists apply various
strategies to analyze and interpret data?
• Explain data science and discuss the different methodologies that data scientists
employ.
