Introduction To Big Data

Uploaded by

usharaninayak007

Course in Artificial Intelligence for Government Officials

1. Introduction to Big Data
What is Big Data
Big Data is a phrase used to mean a massive volume of both
structured and unstructured data that is so large that it is difficult
to process using traditional database and software techniques [1].
Professionally, Big Data is a field that studies ways of extracting,
analyzing, or otherwise dealing with data sets that are too complex
to be handled by traditional data-processing systems [2].

1. Source: https://www.webopedia.com/TERM/B/big_data.html/
2. Source: https://www.analyticsinsight.net/using-artificial-intelligence-in-big-data/

5 V's of Big Data
Velocity: Data is generated, and has to be processed, at a very fast
rate. It is estimated that the volume of data doubles every 2 years.

Variety: Nowadays data is not stored only in rows and columns.
Data is structured as well as unstructured.

Volume: The amount of data being handled is very large, on the order
of petabytes (10^15 bytes).

Veracity: The quality of the data, which should be clean and accurate.

Value: The ability to transform a tsunami of data into business value.

Source: https://www.javatpoint.com/what-is-big-data

Structured & Unstructured Data

Structured Data

➢ Structured data is data that adheres to a pre-defined data model and
is therefore straightforward to analyze.

➢ Structured data conforms to a tabular format with relationships
between the different rows and columns.

➢ Examples: Excel files, Google Sheets, SQL databases, customer data,
phone records, transaction history.

Source: https://www.bigdataframework.org/data-types-structured-vs-unstructured-data/

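Structured Data

SELECT d.CustomerID, d.State, d.Gender, p.Product
FROM "demographic table" AS d
JOIN "product table" AS p
  ON p.CustomerID = d.CustomerID  -- assumed join key; the slide names no join column
WHERE p.Product = 'XXYY';         -- XXYY is the slide's placeholder product value
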
Unstructured Data
➢ Unstructured data is information that either does not have a
predefined data model or is not organized in a pre-defined manner.

➢ Unstructured information is typically text-heavy, but may contain
data such as dates, numbers, and facts as well.

➢ This results in irregularities and ambiguities that make it harder
to understand with traditional programs than data stored in
structured databases.

➢ Common examples of unstructured data include audio and video files,
NoSQL databases, and PDF files.

What is Artificial Intelligence

Artificial intelligence (AI), sometimes called machine intelligence,
is intelligence demonstrated by machines, unlike the natural
intelligence displayed by humans and animals.

Source: https://en.wikipedia.org/wiki/Artificial_intelligence

What Artificial Intelligence can do

[Figure: examples of what AI can do: image recognition, speech recognition, self-driving car]
Image source: https://towardsdatascience.com/what-is-big-data-and-what-artificial-intelligence-can-do-d3f1d14b84ce

How Big Data and AI are connected
➢ There is a massive amount of data online and offline, ranging from people,
their routines, and their preferences to non-living things, their properties,
and their uses.

➢ This huge stockpile of data, when properly harnessed, can give valuable
insights and business analytics to the sector or industry to which the data
set belongs. Artificially intelligent algorithms are therefore written to
benefit from large and complex data.

Source: https://www.analyticsinsight.net/using-artificial-intelligence-in-big-data/

How Big Data and AI are connected

One example is natural language processing, where millions of samples of
human language are recorded and linked to their corresponding computer
programming language translations.

Computers are thus programmed and used to help organizations analyze
and process huge amounts of human language data.

Source: https://www.analyticsinsight.net/using-artificial-intelligence-in-big-data/

How Big Data and AI are connected

AI helps farmers count and monitor their produce through every growth
stage until maturity.

AI can identify weak points or defects long before they spread to other
areas of a large farm.

For example, the AI uses satellite systems or drones to view the land and
extract the data.

Source: https://www.analyticsinsight.net/using-artificial-intelligence-in-big-data/

How Big Data and AI are connected
The Securities and Exchange Commission (SEC) is using network analytics and
natural language processing to foil illegal trading activities in financial
markets.

Trading data analytics are used for high-frequency trading, decision-based
trading, risk analysis, and predictive analysis.

They are also used for early fraud warning, card fraud detection, archival
and analysis of audit trails, enterprise credit reporting, customer data
transformation, etc.

Source: https://www.analyticsinsight.net/using-artificial-intelligence-in-big-data/

Introduction to Search Algorithms in AI

Searching algorithms are a key element of artificial intelligence; they
teach computers to “act rationally” by achieving a certain goal with a
certain input value.

Essentially, artificial intelligence can find solutions to given problems
through the use of searching algorithms.

Source: https://medium.com/datadriveninvestor/searching-algorithms-for-artificial-intelligence-85d58a8e4a42

Search Algorithm Terminologies:

1. Search: Searching is a step-by-step procedure for solving a search problem
in a given search space. A search problem has three main components:

a. Search space: The set of possible solutions that a system may have.
b. Start state: The state from which the agent begins the search.
c. Goal test: A function that observes the current state and returns
whether the goal state has been achieved.

2. Search tree: A tree representation of a search problem. The root of the
search tree is the root node, which corresponds to the initial state.

3. Actions: A description of all the actions available to the agent.

4. Transition model: A description of what each action does.

5. Path cost: A function that assigns a numeric cost to each path.

6. Solution: An action sequence that leads from the start node to the
goal node.

7. Optimal solution: A solution that has the lowest cost among all solutions.

Source: https://www.javatpoint.com/search-algorithms-in-ai

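The terminology above can be tied together in a small Python sketch. The search space `SPACE`, its action names, and its step costs are hypothetical values chosen only for illustration; they do not come from the slides.

```python
from typing import Dict, List, Tuple

# Hypothetical search space: state -> {action: (next_state, step_cost)}.
SPACE: Dict[str, Dict[str, Tuple[str, int]]] = {
    "S": {"to_A": ("A", 1), "to_B": ("B", 4)},
    "A": {"to_G": ("G", 5)},
    "B": {"to_G": ("G", 1)},
    "G": {},
}
START = "S"  # start state


def goal_test(state: str) -> bool:
    """Goal test: is the current state the goal state?"""
    return state == "G"


def actions(state: str) -> List[str]:
    """Actions: everything the agent can do in this state."""
    return list(SPACE[state])


def result(state: str, action: str) -> str:
    """Transition model: the state reached by taking an action."""
    return SPACE[state][action][0]


def path_cost(start: str, plan: List[str]) -> int:
    """Path cost: sum of the step costs along an action sequence."""
    total, state = 0, start
    for action in plan:
        next_state, cost = SPACE[state][action]
        total += cost
        state = next_state
    return total


def is_solution(start: str, plan: List[str]) -> bool:
    """Solution: an action sequence that ends in a goal state."""
    state = start
    for action in plan:
        state = result(state, action)
    return goal_test(state)


candidates = [["to_A", "to_G"], ["to_B", "to_G"]]
solutions = [plan for plan in candidates if is_solution(START, plan)]
# Optimal solution: the solution with the lowest path cost.
optimal = min(solutions, key=lambda plan: path_cost(START, plan))
```

Both candidate plans are solutions here, and the optimal one is simply the cheaper of the two.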
Properties of Search Algorithms:
The following four essential properties are used to compare the efficiency
of search algorithms:

❖ Completeness: A search algorithm is complete if it is guaranteed to return
a solution whenever at least one solution exists for the given input.

❖ Optimality: If the solution found by an algorithm is guaranteed to be the
best solution (lowest path cost) among all solutions, it is said to be an
optimal solution.

❖ Time Complexity: A measure of the time an algorithm takes to complete
its task.

❖ Space Complexity: The maximum storage space required at any point during
the search, expressed in terms of the complexity of the problem.

Source: https://www.javatpoint.com/search-algorithms-in-ai

Types of Search Algorithms

There are two types of search algorithms: informed search and uninformed
search.

Source: https://www.educba.com/search-algorithms-in-ai/?source=leftnav

Informed Search Algorithms

➢ Informed search algorithms have domain knowledge.
➢ They are given the problem description as well as extra information,
such as how far away the goal node is.
➢ Informed search is also called heuristic search.
➢ It might not always give the optimal solution, but it usually gives a
good solution in a reasonable time.
➢ It can solve complex problems more easily than uninformed search.

Source: https://www.educba.com/search-algorithms-in-ai/?source=leftnav

Informed Search Algorithms

Informed search is mainly of two types:

➢ Greedy Best First Search
➢ A* Search

Source: https://www.educba.com/search-algorithms-in-ai/?source=leftnav

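As a rough illustration of how a heuristic guides the search, here is a minimal greedy best-first sketch in Python. The graph `GRAPH` and the heuristic table `H` (estimated distance to the goal) are made-up values, not taken from the slides. A* would differ by ordering the frontier on path cost so far plus the heuristic, rather than on the heuristic alone.

```python
import heapq
from typing import Dict, List, Optional

# Hypothetical graph and heuristic (estimated distance to goal "G").
GRAPH: Dict[str, List[str]] = {
    "S": ["A", "B"], "A": ["C"], "B": ["G"], "C": ["G"], "G": [],
}
H: Dict[str, int] = {"S": 3, "A": 2, "B": 1, "C": 1, "G": 0}


def greedy_best_first(start: str, goal: str) -> Optional[List[str]]:
    """Always expand the frontier node that looks closest to the goal."""
    frontier = [(H[start], start, [start])]  # (heuristic, node, path so far)
    visited = set()
    while frontier:
        _, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in visited:
            continue
        visited.add(node)
        for nxt in GRAPH[node]:
            if nxt not in visited:
                heapq.heappush(frontier, (H[nxt], nxt, path + [nxt]))
    return None  # frontier exhausted without reaching the goal


path = greedy_best_first("S", "G")
```

With these values the search prefers B over A because B has the lower heuristic estimate.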
Uninformed Search Algorithms

➢ Uninformed search algorithms do not have any domain knowledge.
➢ They work in a brute-force manner and are therefore also called
brute-force algorithms.
➢ They have no knowledge of how far away the goal node is; they only know
how to traverse the tree and how to distinguish a leaf node from a
goal node.
➢ They examine every node without any prior knowledge and are therefore
also called blind search algorithms.

Source: https://www.educba.com/search-algorithms-in-ai/?source=leftnav

Uninformed Search Algorithms

The following are the various types of uninformed search algorithms:

➢ Breadth-first Search
➢ Depth-first Search
➢ Depth-limited Search
➢ Iterative deepening depth-first search
➢ Uniform cost search
➢ Bidirectional Search

Source: https://www.educba.com/search-algorithms-in-ai/?source=leftnav

Uninformed Search Algorithms

Breadth-First Search (BFS)

➢ In breadth-first search, the tree or graph is traversed breadth-wise, i.e. it
starts from a node called the search key, explores all the neighboring nodes
of the search key at the current depth first, and then moves to the nodes at
the next level.
➢ It is implemented using a queue data structure, which works on the concept of
first in, first out (FIFO).
➢ The time complexity of BFS is O(b^d), where b (the branching factor) is the
average number of child nodes of any given node and d is the depth.
➢ The disadvantage of this algorithm is that it requires a lot of memory
because it has to store each level of nodes in order to generate the next one.
It may also check duplicate nodes.

Source: https://www.educba.com/search-algorithms-in-ai/?source=leftnav

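The FIFO behaviour described above can be sketched in Python with `collections.deque`. The tree `TREE` is a hypothetical example, not the tree from the slide figure.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hypothetical tree: node -> children.
TREE: Dict[str, List[str]] = {
    "S": ["A", "B"], "A": ["C", "D"], "B": ["E"],
    "C": [], "D": ["G"], "E": [], "G": [],
}


def bfs(start: str, goal: str) -> Tuple[List[str], Optional[List[str]]]:
    """Return (visit order, path to goal), or (visit order, None) if not found."""
    frontier = deque([start])   # FIFO queue
    parent = {start: None}      # also serves as the set of discovered nodes
    order = []
    while frontier:
        node = frontier.popleft()
        order.append(node)
        if node == goal:
            path = []
            while node is not None:  # walk parent links back to the start
                path.append(node)
                node = parent[node]
            return order, path[::-1]
        for child in TREE[node]:
            if child not in parent:  # skip duplicates
                parent[child] = node
                frontier.append(child)
    return order, None


order, path = bfs("S", "G")
```

The visit order shows every node at one depth being expanded before any node at the next depth.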
Uninformed Search Algorithms

Breadth-First Search (BFS)
Example: If the search starts from root node “S” to reach goal node “K”, it will traverse
S → A → B → C → D → G → H → E → F → I → K.
It traverses level by level, i.e. it explores the shallowest nodes first.

Uninformed Search Algorithms

Depth-First Search (DFS)
➢ In depth-first search, the tree or graph is traversed depth-wise, i.e. it starts
from a node called the search key and follows each branch as far as possible
before backtracking.
➢ It is implemented using a stack data structure, which works on the concept of
last in, first out (LIFO).
➢ The time complexity of DFS is O(b^m), where b is the branching factor and m is
the maximum depth of any node. It only needs to store the nodes along the
current path, so its space requirement is low.
➢ The major disadvantage is that this algorithm may enter an infinite loop, for
example when the search space contains cycles or infinitely deep paths.

Source: https://www.educba.com/search-algorithms-in-ai/?source=leftnav

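A minimal Python sketch of the LIFO behaviour follows, using an explicit stack. The graph `GRAPH` is hypothetical and deliberately contains a cycle (C points back to S). The `visited` set is an addition that keeps the search from looping on that cycle.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical graph with a cycle C -> S.
GRAPH: Dict[str, List[str]] = {
    "S": ["A", "B"], "A": ["C", "D"], "B": ["E"],
    "C": ["S"], "D": ["G"], "E": [], "G": [],
}


def dfs(start: str, goal: str) -> Tuple[List[str], Optional[List[str]]]:
    """Return (visit order, path to goal), or (visit order, None) if not found."""
    stack = [(start, [start])]  # LIFO stack of (node, path so far)
    visited = set()
    order = []
    while stack:
        node, path = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        if node == goal:
            return order, path
        # Push children in reverse so the leftmost child is expanded first.
        for child in reversed(GRAPH[node]):
            if child not in visited:
                stack.append((child, path + [child]))
    return order, None


order, path = dfs("S", "G")
```

The search goes all the way down the A branch before it would ever look at B.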
Uninformed Search Algorithms

Depth-First Search (DFS)

Example: If the search starts from root node “S” to reach goal node “K”, it will traverse
S → A → B → D → E → C → G
It traverses depth-wise, i.e. it explores the deepest node first.

Source: https://www.educba.com/search-algorithms-in-ai/?source=leftnav

Uninformed Search Algorithms

Depth-Limited Search Algorithm:

➢ A depth-limited search algorithm is similar to depth-first search with a
predetermined depth limit.
➢ Depth-limited search solves the infinite-path drawback of depth-first
search. In this algorithm, a node at the depth limit is treated as if it has
no further successor nodes.
➢ Depth-limited search can terminate with two conditions of failure:
• Standard failure value: indicates that the problem does not have any solution.
• Cutoff failure value: indicates that there is no solution for the problem
within the given depth limit.

Source: https://www.javatpoint.com/ai-uninformed-search-algorithms

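The two failure values can be made concrete with a short recursive sketch in Python. The tree `TREE` and the sentinel name `CUTOFF` are illustrative assumptions. `None` plays the role of the standard failure value.

```python
from typing import Dict, List, Optional, Union

CUTOFF = "cutoff"  # sentinel for the cutoff failure value (name is an assumption)

# Hypothetical tree; the goal "G" sits at depth 3.
TREE: Dict[str, List[str]] = {
    "S": ["A", "B"], "A": ["C"], "B": [], "C": ["G"], "G": [],
}


def depth_limited_search(node: str, goal: str, limit: int) -> Union[List[str], str, None]:
    """Return a path, CUTOFF if the limit was hit, or None for standard failure."""
    if node == goal:
        return [node]
    if limit == 0:
        return CUTOFF  # treat the node as if it had no successors
    cutoff_hit = False
    for child in TREE[node]:
        result = depth_limited_search(child, goal, limit - 1)
        if result == CUTOFF:
            cutoff_hit = True
        elif result is not None:
            return [node] + result
    return CUTOFF if cutoff_hit else None


too_shallow = depth_limited_search("S", "G", 2)   # limit reached before the goal
deep_enough = depth_limited_search("S", "G", 3)   # goal found at depth 3
no_solution = depth_limited_search("S", "Z", 10)  # "Z" does not exist in TREE
```

A cutoff result says only that nothing was found within the limit, whereas `None` says the whole tree was explored without success.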
Uninformed Search Algorithms

Depth-Limited Search Algorithm:

Advantages:

➢ Depth-limited search is memory efficient.

Disadvantages:

➢ Depth-limited search is incomplete: it cannot find a goal that lies
deeper than the depth limit.
➢ It may not be optimal if the problem has more than one solution.

Source: https://www.javatpoint.com/ai-uninformed-search-algorithms

Uninformed Search Algorithms

Depth-Limited Search Algorithm:

Example: If the search starts from root node “S” to reach goal node “J”, it will traverse
S → A → C → D → B → I → J
The predetermined depth limit is set to level 2.

Source: https://www.educba.com/search-algorithms-in-ai/?source=leftnav

Uninformed Search Algorithms

Uniform-cost Search Algorithm:

➢ Uniform-cost search is a searching algorithm used for traversing a weighted
tree or graph.
➢ This algorithm comes into play when a different cost is available for each edge.
➢ The primary goal of uniform-cost search is to find a path to the goal node
that has the lowest cumulative cost.
➢ Uniform-cost search expands nodes according to their path costs from the root
node.
➢ It can be used to solve any graph or tree where the optimal cost is required.
➢ A uniform-cost search algorithm is implemented with a priority queue. It gives
maximum priority to the lowest cumulative cost.
➢ Uniform-cost search is equivalent to the BFS algorithm if the path cost of all
edges is the same.

Source: https://www.javatpoint.com/ai-uninformed-search-algorithms

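The priority-queue idea can be sketched in Python with `heapq`. The edge weights in `GRAPH` are hypothetical; the slide's own weighted graph is not available in this text.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Hypothetical weighted graph: node -> [(neighbor, edge_cost), ...].
GRAPH: Dict[str, List[Tuple[str, int]]] = {
    "S": [("A", 1), ("B", 5)],
    "A": [("D", 3), ("B", 2)],
    "B": [("G", 6)],
    "D": [("G", 2)],
    "G": [],
}


def uniform_cost_search(start: str, goal: str) -> Optional[Tuple[int, List[str]]]:
    """Return (total cost, path) for a cheapest path, or None if unreachable."""
    frontier = [(0, start, [start])]  # min-heap ordered by cumulative cost
    best: Dict[str, int] = {}         # cheapest cost at which each node was expanded
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in best and best[node] <= cost:
            continue  # already expanded via a path that is at least as cheap
        best[node] = cost
        for nxt, step in GRAPH[node]:
            heapq.heappush(frontier, (cost + step, nxt, path + [nxt]))
    return None


found = uniform_cost_search("S", "G")
```

The goal is reported only when it is popped from the queue, not when it is first generated, which is what lets a cheaper path found later still win.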
Uninformed Search Algorithms

Uniform-cost Search Algorithm:

Advantages:

➢ Uniform-cost search is optimal because at every state the path with the
least cost is chosen.

Disadvantages:

➢ It does not care about the number of steps involved in the search and is
concerned only with path cost. As a result, the algorithm may get stuck in an
infinite loop, for example along a sequence of zero-cost actions.

Source: https://www.javatpoint.com/ai-uninformed-search-algorithms

Uninformed Search Algorithms
Uniform-cost Search Algorithm:

Example: If the search starts from root node “S” to reach goal node “G”, it will traverse
S → A → D → G
The algorithm chooses the path with the lowest cost first.

Source: https://www.educba.com/search-algorithms-in-ai/?source=leftnav

Uninformed Search Algorithms

Iterative deepening depth-first Search:

➢ The iterative deepening algorithm is a combination of the DFS and BFS
algorithms. It finds the best depth limit by gradually increasing the limit
until a goal is found.

➢ The algorithm performs depth-first search up to a certain "depth limit" and
keeps increasing the depth limit after each iteration until the goal node is
found.

➢ It combines the benefits of breadth-first search's fast search and
depth-first search's memory efficiency.

➢ Iterative deepening is a useful uninformed search when the search space is
large and the depth of the goal node is unknown.

Source: https://www.javatpoint.com/ai-uninformed-search-algorithms

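A minimal Python sketch of iterative deepening follows: a depth-limited DFS is rerun with limits 0, 1, 2, and so on. The tree `TREE` and the `max_depth` safety bound are assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical binary tree: node -> children.
TREE: Dict[str, List[str]] = {
    "A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"],
    "D": [], "E": [], "F": [], "G": [],
}


def dls(node: str, goal: str, limit: int) -> Optional[List[str]]:
    """Depth-first search that never goes deeper than `limit` edges."""
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in TREE[node]:
        found = dls(child, goal, limit - 1)
        if found is not None:
            return [node] + found
    return None


def iddfs(start: str, goal: str, max_depth: int) -> Optional[Tuple[int, List[str]]]:
    """Try depth limits 0..max_depth; return (limit used, path) or None."""
    for limit in range(max_depth + 1):
        path = dls(start, goal, limit)  # repeats the shallower work each round
        if path is not None:
            return limit, path
    return None


found = iddfs("A", "G", max_depth=5)
```

Each round redoes the work of the previous one, which is the cost paid for keeping only one path in memory at a time.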
Uninformed Search Algorithms

Iterative deepening depth-first Search:

Advantages:

➢ It combines the benefits of the BFS and DFS search algorithms in terms of
fast search and memory efficiency.

Disadvantages:

➢ The main drawback of IDDFS is that it repeats all the work of the previous
phase.

Source: https://www.javatpoint.com/ai-uninformed-search-algorithms

Uninformed Search Algorithms
Iterative deepening depth-first Search:
Example: If the search starts from root node “A” to reach goal node “K”, it will traverse
1st iteration: A
2nd iteration: A, B, C
3rd iteration: A, B, D, E, C, F, G
4th iteration: A, B, D, H, I, E, C, F, K, G
In the fourth iteration, the algorithm will find the goal node.

Uninformed Search Algorithms

Bidirectional Search Algorithm:

➢ The bidirectional search algorithm runs two simultaneous searches, one from
the initial state, called the forward search, and the other from the goal
node, called the backward search, to find the goal node.
➢ Bidirectional search replaces one single search graph with two small
subgraphs: one starts the search from the initial vertex and the other starts
from the goal vertex.
➢ The search stops when these two graphs intersect each other.
➢ Bidirectional search can use search techniques such as BFS, DFS, DLS, etc.

Source: https://www.javatpoint.com/ai-uninformed-search-algorithms

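One way to sketch this in Python is to alternate two breadth-first searches until their frontiers touch. The undirected graph `GRAPH` is hypothetical, and the helper names `_trace` and `expand` are illustrative. This simple alternation returns a connecting path but makes no claim that it is always the shortest one.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hypothetical undirected graph: node -> neighbors.
GRAPH: Dict[int, List[int]] = {
    1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4, 6], 6: [5],
}


def _trace(parents: Dict[int, Optional[int]], node: Optional[int]) -> List[int]:
    """Follow parent links from `node` back to the root of that search."""
    path = []
    while node is not None:
        path.append(node)
        node = parents[node]
    return path


def bidirectional_search(start: int, goal: int) -> Optional[Tuple[int, List[int]]]:
    """Return (meeting node, path from start to goal), or None if unconnected."""
    if start == goal:
        return start, [start]
    parents_f = {start: None}  # forward search tree
    parents_b = {goal: None}   # backward search tree
    queue_f, queue_b = deque([start]), deque([goal])

    def expand(queue, own, other):
        node = queue.popleft()
        for nb in GRAPH[node]:
            if nb not in own:
                own[nb] = node
                queue.append(nb)
                if nb in other:  # the two searches intersect here
                    return nb
        return None

    while queue_f and queue_b:
        meet = expand(queue_f, parents_f, parents_b)
        if meet is None:
            meet = expand(queue_b, parents_b, parents_f)
        if meet is not None:
            path = _trace(parents_f, meet)[::-1] + _trace(parents_b, meet)[1:]
            return meet, path
    return None


found = bidirectional_search(1, 6)
```

The backward search needs the goal state up front and needs edges that can be followed in reverse, which is why the example uses an undirected graph.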
Uninformed Search Algorithms

Bidirectional Search Algorithm:

Advantages:

➢ Bidirectional search is fast.
➢ Bidirectional search requires less memory.

Disadvantages:

➢ Implementation of the bidirectional search tree is difficult.
➢ In bidirectional search, one must know the goal state in advance.

Source: https://www.javatpoint.com/ai-uninformed-search-algorithms

Uninformed Search Algorithms
Bidirectional Search Algorithm:
In the search tree for this example (figure not reproduced), the bidirectional
search algorithm is applied. The algorithm divides one graph/tree into two
sub-graphs. It starts traversing from node 1 in the forward direction and from
goal node 16 in the backward direction.

The algorithm terminates at node 9, where the two searches meet.

Agents in Artificial Intelligence

• An AI system can be defined as the study of the rational agent and its
environment.
• Agents sense the environment through sensors and act on their
environment through actuators.
• An AI agent can have mental properties such as knowledge, belief,
intention, etc.

Agents in Artificial Intelligence
What is an Agent?
An agent can be anything that perceives its environment through sensors and
acts upon that environment through actuators. An agent runs in a cycle of
perceiving, thinking, and acting. An agent can be:

• Human agent: A human agent has eyes, ears, and other organs that work
as sensors, and hands, legs, and the vocal tract that work as actuators.
• Robotic agent: A robotic agent can have cameras, infrared range finders,
and NLP as sensors, and various motors as actuators.
• Software agent: A software agent can have keystrokes and file contents as
sensory input; it acts on those inputs and displays output on the screen.

Agents in Artificial Intelligence
Terminologies
Sensor: A sensor is a device that detects changes in the environment and
sends the information to other electronic devices. An agent observes its
environment through sensors.

Actuators: Actuators are the components of machines that convert energy
into motion. The actuators are only responsible for moving and controlling a
system. An actuator can be an electric motor, gears, rails, etc.

Effectors: Effectors are the devices that affect the environment. Effectors
can be legs, wheels, arms, fingers, wings, fins, and display screens.

Intelligent Agents
An intelligent agent is an autonomous entity that acts upon an
environment using sensors and actuators to achieve goals. An
intelligent agent may learn from the environment to achieve its
goals. A thermostat is an example of an intelligent agent.

The following are the four main rules for an AI agent:

Rule 1: An AI agent must have the ability to perceive the environment.
Rule 2: The observation must be used to make decisions.
Rule 3: The decision should result in an action.
Rule 4: The action taken by an AI agent must be a rational action.

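The thermostat example can be lined up with the four rules in a few lines of Python. The function name, the action labels, the target temperature, and the tolerance are all hypothetical choices for this sketch.

```python
def thermostat_agent(temperature_c: float,
                     target_c: float = 21.0,
                     tolerance: float = 0.5) -> str:
    """Map one temperature reading to one heater action.

    Rule 1: the temperature reading is the percept.
    Rule 2: the comparisons below are the decision.
    Rule 3: the returned label is the action sent to the heater.
    Rule 4: the action moves the room toward the target, which is the
            sense in which it is rational for this performance measure.
    """
    if temperature_c < target_c - tolerance:
        return "heat_on"
    if temperature_c > target_c + tolerance:
        return "heat_off"
    return "no_change"


readings = [18.0, 21.2, 25.0]
actions = [thermostat_agent(t) for t in readings]
```

This agent does not learn; it only reacts to the current percept, which is the simplest case the rules allow.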
Rational Agent
A rational agent is an agent that has clear preferences, models
uncertainty, and acts so as to maximize its performance measure over
all possible actions.

A rational agent is said to do the right thing. AI is about creating
rational agents to use in game theory and decision theory for various
real-world scenarios.

For an AI agent, rational action is most important because in a
reinforcement learning algorithm the agent gets a positive reward for
each best possible action and a negative reward for each wrong action.
Note: Rational agents in AI are very similar to intelligent agents.

Rationality
The rationality of an agent is measured by its performance measure.
Rationality can be judged on the basis of the following points:

• The performance measure, which defines the success criterion.
• The agent's prior knowledge of its environment.
• The best possible actions that the agent can perform.
• The sequence of percepts.

Note: Rationality differs from omniscience because an omniscient agent knows the actual
outcome of its actions and acts accordingly, which is not possible in reality.

Structure of an AI Agent
The task of AI is to design an agent program that implements the agent
function. The structure of an intelligent agent is a combination of an
architecture and an agent program. It can be viewed as:

Agent = Architecture + Agent program

The following are the three main terms involved in the structure of an AI agent:

• Architecture: The machinery that an AI agent executes on.

• Agent function: A function that maps a percept sequence to an action.

f: P* → A

• Agent program: An implementation of the agent function. An agent program
executes on the physical architecture to produce the function f.

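The split between agent function and agent program can be illustrated with a toy two-square vacuum world in Python. The percept format, the location names "A" and "B", and the action labels are assumptions made for this sketch.

```python
from typing import List, Tuple

Percept = Tuple[str, str]  # (location, status), e.g. ("A", "Dirty")


def agent_function(percepts: List[Percept]) -> str:
    """f: P* -> A. Maps the whole percept sequence to a single action."""
    location, status = percepts[-1]
    if status == "Dirty":
        return "Suck"
    # Use the history: stop once both squares have been seen clean.
    clean_seen = {loc for loc, st in percepts if st == "Clean"}
    if clean_seen >= {"A", "B"}:
        return "NoOp"
    return "Right" if location == "A" else "Left"


class AgentProgram:
    """Implementation of f: stores the percept sequence between calls."""

    def __init__(self) -> None:
        self.percepts: List[Percept] = []

    def __call__(self, percept: Percept) -> str:
        self.percepts.append(percept)
        return agent_function(self.percepts)


program = AgentProgram()
actions = [program(p) for p in [("A", "Dirty"), ("A", "Clean"), ("B", "Clean")]]
```

The architecture is whatever runs `program` and feeds it percepts; here that role is played by the list comprehension on the last line.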
PEAS Representation
PEAS is a type of model on which an AI agent works. When we define an
AI agent or rational agent, we can group its properties under the
PEAS representation model. It is made up of four words:

P: Performance measure
E: Environment
A: Actuators
S: Sensors

Here the performance measure is the objective that defines the success of an agent's behavior.

Example of Agents with their PEAS representation

Agent | Performance measure | Environment | Actuators | Sensors
1. Medical Diagnosis | Healthy patient; Minimized cost | Patient; Hospital; Staff | Tests; Treatments | Keyboard (entry of symptoms)
2. Vacuum Cleaner | Cleanliness; Efficiency; Battery life; Security | Room; Table; Wood floor; Carpet; Various obstacles | Wheels; Brushes; Vacuum extractor | Camera; Dirt detection sensor; Cliff sensor; Bump sensor; Infrared wall sensor
3. Part-picking Robot | Percentage of parts in correct bins | Conveyor belt with parts; Bins | Jointed arms; Hand | Camera; Joint angle sensors
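
A PEAS description is just structured data, so it can be written down as a small record type. The sketch below uses a Python dataclass; the class name `PEAS` and its field names are illustrative choices, and the values are taken from the vacuum cleaner row above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PEAS:
    """One agent described by its PEAS properties."""
    agent: str
    performance: List[str]
    environment: List[str]
    actuators: List[str]
    sensors: List[str]


vacuum = PEAS(
    agent="Vacuum Cleaner",
    performance=["Cleanliness", "Efficiency", "Battery life", "Security"],
    environment=["Room", "Table", "Wood floor", "Carpet", "Various obstacles"],
    actuators=["Wheels", "Brushes", "Vacuum extractor"],
    sensors=["Camera", "Dirt detection sensor", "Cliff sensor",
             "Bump sensor", "Infrared wall sensor"],
)

summary = f"{vacuum.agent}: {len(vacuum.sensors)} sensors, {len(vacuum.actuators)} actuators"
```

Writing the four lists out this way makes it easy to check that every agent in a design has all four PEAS components filled in.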
