Data mining and cryptograph notes for computer science
In today's world, technology is growing very fast, and we are getting in touch with different new
Here, one of the booming technologies of computer science is Artificial Intelligence which is ready
It is currently working with a variety of subfields, ranging from general to specific, such as
self-driving cars, playing chess, proving theorems, playing music, painting, etc.
2nd semester 2010 Dr. Qusai Abuein 2
Hua Zhibing, China's first virtual
student developed by Tsinghua
University
Artificial Intelligence is composed of two words, Artificial and Intelligence, where Artificial means
"man-made" and Intelligence means "thinking power"; hence AI means "a man-made thinking
power."
"It is a branch of computer science by which we can create intelligent machines which can behave
like humans, think like humans, and make decisions."
Artificial Intelligence exists when a machine has human-based skills such as learning, reasoning, and
solving problems.
With Artificial Intelligence you do not need to preprogram a machine for every task; instead, you
can create a machine with programmed algorithms which can work with its own intelligence, and that is the
awesomeness of AI.
It is believed that AI is not a new idea: some people say that, according to Greek myth, there were
mechanical men in early days which could work and behave like humans.
Goals of Artificial Intelligence
Building a machine which can perform tasks that require human intelligence, such as:
• Proving a theorem
Creating some system which can exhibit intelligent behavior, learn new things by itself, demonstrate,
of Artificial Intelligence.
Problem-solving agents:
Search:
Search Space: Search space represents a set of possible solutions, which a system may have.
Goal test: It is a function which observes the current state and returns whether the goal state has been
achieved or not.
Search tree: A tree representation of search problem is called Search tree. The root of the search
Actions: It gives the description of all the available actions to the agent.
Transition model: A description of what each action does; it can be represented as a transition
model.
Solution: It is an action sequence which leads from the start node to the goal node.
Optimal Solution: A solution is optimal if it has the lowest cost among all solutions.
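The terms above can be made concrete with a small sketch. The state names, action names, and the TRANSITIONS table below are hypothetical and chosen only for illustration; they do not come from the notes.

```python
from typing import Dict, List

# Hypothetical toy search space: each state maps to {action: next_state}.
TRANSITIONS: Dict[str, Dict[str, str]] = {
    "S": {"left": "A", "right": "B"},
    "A": {"down": "G"},
    "B": {"down": "A"},
    "G": {},
}
START, GOAL = "S", "G"


def actions(state: str) -> List[str]:
    """Actions: everything the agent may do in `state`."""
    return list(TRANSITIONS[state])


def result(state: str, action: str) -> str:
    """Transition model: the state reached by doing `action` in `state`."""
    return TRANSITIONS[state][action]


def goal_test(state: str) -> bool:
    """Goal test: has the goal state been achieved?"""
    return state == GOAL


def is_solution(plan: List[str]) -> bool:
    """A solution is an action sequence leading from the start to the goal."""
    state = START
    for action in plan:
        if action not in actions(state):
            return False
        state = result(state, action)
    return goal_test(state)
```

If every action is assumed to cost 1, the plan ["left", "down"] is the optimal solution here, while ["right", "down", "down"] is a solution that is not optimal.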
Properties of Search Algorithms:
Following are the four essential properties of search algorithms to compare the efficiency of these
algorithms:
Optimality: If the solution found by an algorithm is guaranteed to be the best solution (lowest path cost)
among all other solutions, then such a solution is said to be an optimal solution.
Time Complexity: Time complexity is a measure of time for an algorithm to complete its task.
Space Complexity: It is the maximum storage space required at any point during the search, as the
Based on the search problems we can classify the search algorithms into uninformed (Blind search)
The uninformed search does not contain any domain knowledge such as closeness, the location
of the goal.
It operates in a brute-force way as it only includes information about how to traverse the tree
Uninformed search explores the search tree without any information about the search space beyond
the initial state, the operators, and the test for the goal, so it is also called blind search.
It examines each node of the tree until it achieves the goal node.
Informed Search
In an informed search, problem information is available which can guide the search. Informed
search strategies can find a solution more efficiently than an uninformed search strategy.
A heuristic is a way which might not always be guaranteed for best solutions but guaranteed to
Informed search can solve many complex problems which could not be solved in another way.
Greedy Search
A* Search
Uninformed Search Algorithms
Uninformed search is a class of general-purpose search algorithms which operate in a brute-force way.
Uninformed search algorithms do not have additional information about state or search space other
Breadth-first search is the most common search strategy for traversing a tree or graph. This
BFS algorithm starts searching from the root node of the tree and expands all successor node at the
If there is more than one solution for a given problem, then BFS will provide the minimal solution which
Disadvantages:
It requires lots of memory since each level of the tree must be saved into memory to expand the next level.
BFS needs lots of time if the solution is far away from the root node.
Example:
In the below tree structure, we have shown the traversing of the tree using BFS algorithm from the root node S to
goal node K.
BFS search algorithm traverses in layers, so it will follow the path which is shown by the dotted arrow, and the
S---> A--->B---->C--->D---->G--->H--->E---->F---->I---->K
Time Complexity:
Time Complexity of BFS algorithm can be obtained by the number of nodes traversed in BFS until the
shallowest Node.
O(b^d), where b is the branching factor and d is the depth of the shallowest solution.
Space Complexity: Space complexity of BFS algorithm is given by the memory size of the frontier, which is O(b^(d+1)).
Completeness: BFS is complete, which means if the shallowest goal node is at some finite depth, then BFS will
find a solution.
Optimality: BFS is optimal only when all actions have the same cost, whatever the depth of the goal.
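A minimal BFS sketch follows. The figure for the example is not available, so the TREE below is an assumption: it is one tree whose layer-by-layer order matches the traversal S, A, B, C, D, G, H, E, F, I, K given above.

```python
from collections import deque

# Assumed tree, reconstructed only from the traversal order in the example.
TREE = {
    "S": ["A", "B"],
    "A": ["C", "D"],
    "B": ["G", "H"],
    "C": ["E", "F"],
    "G": ["I"],
    "H": ["K"],
}


def bfs(graph, start, goal):
    """Return (path to goal or None, order in which nodes were taken off the queue)."""
    frontier = deque([[start]])  # FIFO queue of paths, shallowest first
    visited = {start}
    order = []
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        order.append(node)
        if node == goal:
            return path, order
        for child in graph.get(node, []):
            if child not in visited:
                visited.add(child)
                frontier.append(path + [child])
    return None, order


path, order = bfs(TREE, "S", "K")
print(order)  # ['S', 'A', 'B', 'C', 'D', 'G', 'H', 'E', 'F', 'I', 'K']
print(path)   # ['S', 'B', 'H', 'K']
```

The queue holds every node of the current layer before the next layer is touched, which is where the memory cost noted under Disadvantages comes from.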
Depth-first Search
Depth-first search is a recursive algorithm for traversing a tree or graph data structure.
It is called the depth-first search because it starts from the root node and follows each path to its greatest depth
Advantage:
DFS requires very little memory as it only needs to store a stack of the nodes on the path from root node to the
current node.
It takes less time to reach the goal node than BFS algorithm (if it traverses in the right path).
Disadvantage:
There is the possibility that many states keep re-occurring, and there is no guarantee of finding the solution.
DFS algorithm searches deep down a branch first, and sometimes it may go into an infinite loop.
Example:
In the below search tree, we have shown the flow of depth-first search, and it will follow the order as:
tree as E has no other successor and still goal node is not found.
After backtracking it will traverse node C and then G, and here it will terminate as it found goal node.
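A recursive DFS sketch is shown below. The tree is an assumption, since the figure is missing: it is built so that E has no successor and the goal G is found under C after backtracking, as the example describes.

```python
# Assumed tree; only the backtracking behaviour described above is taken from the notes.
TREE = {
    "S": ["A", "H"],
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["G"],
}


def dfs(graph, start, goal):
    """Return (path to goal or None, order in which nodes were visited)."""
    visited = set()
    order = []

    def visit(node, path):
        visited.add(node)
        order.append(node)
        if node == goal:
            return path
        for child in graph.get(node, []):
            if child not in visited:  # skip re-occurring states
                found = visit(child, path + [child])
                if found is not None:
                    return found
        return None  # dead end: backtrack to the caller

    return visit(start, [start]), order


path, order = dfs(TREE, "S", "G")
print(order)  # ['S', 'A', 'B', 'D', 'E', 'C', 'G']
print(path)   # ['S', 'A', 'C', 'G']
```

The visited set is a design choice of this sketch; it guards against the re-occurring states mentioned under Disadvantage, but it does not help with an infinitely deep branch.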
Depth-Limited Search Algorithm:
A depth-limited search algorithm is similar to depth-first search with a predetermined limit to depth level L.
Depth-limited search can solve the drawback of the infinite path in the Depth-first search. In this algorithm, the node
Standard failure value: It indicates that problem does not have any solution.
Cutoff failure value: It defines no solution for the problem within a given depth limit.
Advantages:
Disadvantages:
It may not be optimal if the problem has more than one solution.
Completeness: DLS search algorithm is complete if the solution
lies within the depth limit.
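The two failure values can be illustrated with a short sketch. The constants CUTOFF and FAILURE and the example tree are hypothetical names and data introduced here; the sketch also assumes the search space is a tree (no cycles).

```python
CUTOFF = "cutoff"    # cutoff failure: no solution within the given depth limit
FAILURE = "failure"  # standard failure: the problem has no solution at all

# Hypothetical tree with the goal G at depth 3.
TREE = {
    "S": ["A", "B"],
    "A": ["C", "D"],
    "B": ["I", "J"],
    "D": ["G"],
}


def depth_limited_search(graph, node, goal, limit):
    """Return a path to `goal`, or CUTOFF, or FAILURE."""
    if node == goal:
        return [node]
    if limit == 0:
        return CUTOFF  # nodes at the limit are treated as having no successors
    cutoff_occurred = False
    for child in graph.get(node, []):
        result = depth_limited_search(graph, child, goal, limit - 1)
        if result == CUTOFF:
            cutoff_occurred = True
        elif result != FAILURE:
            return [node] + result
    return CUTOFF if cutoff_occurred else FAILURE


print(depth_limited_search(TREE, "S", "G", 2))  # cutoff
print(depth_limited_search(TREE, "S", "G", 3))  # ['S', 'A', 'D', 'G']
print(depth_limited_search(TREE, "S", "Z", 5))  # failure
```

With limit 2 the goal at depth 3 is missed and the answer is a cutoff, which shows why DLS is only complete when the solution lies within the limit.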
Uniform-cost search is a searching algorithm used for traversing a weighted tree or graph.
This algorithm comes into play when a different cost is available for each edge.
The primary goal of the uniform-cost search is to find a path to the goal node which has the lowest cumulative
cost. Uniform-cost search expands nodes according to their path costs from the root node.
It can be used to solve any graph/tree where the optimal cost is in demand.
Uniform cost search is equivalent to BFS algorithm if the path cost of all edges is the same.
Advantages:
Uniform cost search is optimal because at every state the path with the least cost is chosen.
Disadvantages:
It does not care about the number of steps involved in searching and is only concerned about path cost.
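A minimal uniform-cost search sketch, using a priority queue ordered by cumulative path cost. The weighted GRAPH is a hypothetical example, not taken from the notes.

```python
import heapq

# Hypothetical weighted graph: node -> {successor: edge cost}.
GRAPH = {
    "S": {"A": 1, "B": 5},
    "A": {"B": 2, "G": 9},
    "B": {"G": 3},
}


def uniform_cost_search(graph, start, goal):
    """Return (lowest cumulative cost, path) or None if the goal is unreachable."""
    frontier = [(0, start, [start])]  # min-heap ordered by path cost from the root
    best = {start: 0}                 # cheapest known cost to each node
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale entry: a cheaper path to this node was found later
        for child, step in graph.get(node, {}).items():
            new_cost = cost + step
            if new_cost < best.get(child, float("inf")):
                best[child] = new_cost
                heapq.heappush(frontier, (new_cost, child, path + [child]))
    return None


print(uniform_cost_search(GRAPH, "S", "G"))  # (6, ['S', 'A', 'B', 'G'])
```

The cheapest path here takes three steps (cost 6) rather than the two-step paths S, A, G (cost 10) or S, B, G (cost 8), which illustrates that only path cost matters, not the number of steps.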
The iterative deepening algorithm is a combination of DFS and BFS algorithms. This search algorithm finds
out the best depth limit and does it by gradually increasing the limit until a goal is found.
This algorithm performs depth-first search up to a certain "depth limit", and it keeps increasing the depth
This Search algorithm combines the benefits of Breadth-first search's fast search and depth-first search's
memory efficiency.
The iterative deepening search algorithm is a useful uninformed search when the search space is large and the depth of the goal
node is unknown.
Advantages:
It combines the benefits of BFS and DFS search algorithm in terms of fast search and memory efficiency.
Disadvantages:
The main drawback of IDDFS is that it repeats all the work of the previous phase.
Example:
IDDFS algorithm performs successive iterations until it finds the goal node.
1st Iteration-----> A
2nd Iteration-----> A, B, C
3rd Iteration-----> A, B, D, E, C, F, G
4th Iteration-----> A, B, D, H, I, E, C, F, K, G
In the fourth iteration, the algorithm will find the goal node.
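The iterations above can be reproduced with a short sketch. The TREE is an assumption reconstructed from the four listed iterations (the figure is not available), and the goal is assumed to be K.

```python
# Assumed tree, reconstructed from the iteration lists in the example.
TREE = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": ["F", "G"],
    "D": ["H", "I"],
    "F": ["K"],
}


def visit_order(tree, node, limit):
    """Nodes a depth-first traversal touches when cut off `limit` levels below `node`."""
    order = [node]
    if limit > 0:
        for child in tree.get(node, []):
            order += visit_order(tree, child, limit - 1)
    return order


def depth_limited(tree, node, goal, limit):
    """Depth-first search that gives up below `limit`; returns a path or None."""
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for child in tree.get(node, []):
        found = depth_limited(tree, child, goal, limit - 1)
        if found is not None:
            return [node] + found
    return None


def iddfs(tree, start, goal, max_depth=10):
    """Increase the depth limit one level at a time until the goal is found."""
    for limit in range(max_depth + 1):
        path = depth_limited(tree, start, goal, limit)
        if path is not None:
            return limit, path
    return None


for limit in range(4):
    print(limit + 1, visit_order(TREE, "A", limit))
print(iddfs(TREE, "A", "K"))  # (3, ['A', 'C', 'F', 'K'])
```

Depth limit 3 corresponds to the fourth iteration. Each call to depth_limited starts again from the root, which is the repeated work named as the main drawback.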
Bidirectional Search Algorithm:
Bidirectional search algorithm runs two simultaneous searches, one from the initial state, called
forward-search, and the other from the goal node, called backward-search, to find the goal node.
Bidirectional search replaces one single search graph with two small subgraphs in which one starts
the search from an initial vertex and other starts from goal vertex.
The search stops when these two graphs intersect each other.
Bidirectional search can use search techniques such as BFS, DFS, DLS, etc.
Advantages:
Disadvantages:
It starts traversing from node 1 in the forward direction and starts from goal node 16 in the backward
direction.
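A sketch of bidirectional search using BFS in both directions. The graph of the example (nodes 1 to 16) is not reproduced in these notes, so the small undirected GRAPH below is hypothetical; the sketch assumes every edge can be followed in both directions so that the backward search can use the same adjacency lists.

```python
from collections import deque

# Hypothetical undirected graph (each edge is listed in both directions).
GRAPH = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2, 5], 5: [4, 6], 6: [5], 7: []}


def bidirectional_search(graph, start, goal):
    """Return a path from start to goal, or None if the two searches never meet."""
    if start == goal:
        return [start]
    parents_f, parents_b = {start: None}, {goal: None}
    frontier_f, frontier_b = deque([start]), deque([goal])

    def expand(frontier, parents, other_parents):
        """Expand one full BFS level; return the node where the searches meet."""
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for nbr in graph.get(node, []):
                if nbr not in parents:
                    parents[nbr] = node
                    if nbr in other_parents:
                        return nbr  # the two subgraphs intersect here
                    frontier.append(nbr)
        return None

    while frontier_f and frontier_b:
        meet = expand(frontier_f, parents_f, parents_b)
        if meet is None:
            meet = expand(frontier_b, parents_b, parents_f)
        if meet is not None:
            path, node = [], meet
            while node is not None:       # walk back to the start
                path.append(node)
                node = parents_f[node]
            path.reverse()
            node = parents_b[meet]
            while node is not None:       # walk forward to the goal
                path.append(node)
                node = parents_b[node]
            return path
    return None


print(bidirectional_search(GRAPH, 1, 6))  # [1, 2, 4, 5, 6]
```

In this run the forward search reaches node 4 from node 1 and the backward search reaches node 4 from node 6, so the search stops there and the two half-paths are joined.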
Knowledge-based search algorithms (also known as heuristic search) have been developed to incorporate
heuristic information, such as estimated costs, to guide the search process efficiently.
Greedy Best-First Search was one such algorithm, but it did not guarantee optimality for finding the
shortest path.
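To see why Greedy Best-First Search does not guarantee the shortest path, consider this sketch. The GRAPH and the heuristic table H are hypothetical values chosen so that the node that looks closest to the goal lies on an expensive route.

```python
import heapq

# Hypothetical weighted graph and heuristic estimates of the cost to reach G.
GRAPH = {"S": {"A": 1, "B": 2}, "A": {"G": 10}, "B": {"G": 3}}
H = {"S": 3, "A": 1, "B": 2, "G": 0}


def greedy_best_first(graph, h, start, goal):
    """Always expand the node with the smallest h(n); path cost so far is ignored."""
    frontier = [(h[start], start, [start], 0)]  # (h, node, path, cost so far)
    visited = set()
    while frontier:
        _, node, path, cost = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for succ, step in graph.get(node, {}).items():
            if succ not in visited:
                heapq.heappush(frontier, (h[succ], succ, path + [succ], cost + step))
    return None


print(greedy_best_first(GRAPH, H, "S", "G"))  # (11, ['S', 'A', 'G'])
```

The search follows A because h(A) is smaller than h(B) and ends with a path of cost 11, although the path S, B, G costs only 5.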
A* Search Algorithm in Artificial Intelligence
A* (pronounced "A-star") is a powerful graph traversal and path finding algorithm widely used in artificial intelligence and
computer science.
It is mainly used to find the shortest path between two nodes in a graph, given the estimated cost of getting from the current node to the
destination node.
The main advantage of the algorithm is its ability to provide an optimal path by exploring the graph in a more informed way compared
Algorithm A* combines the advantages of two other search algorithms: Dijkstra's algorithm and Greedy Best-First Search.
Like Dijkstra's algorithm, A* ensures that the path found is as short as possible but does so more efficiently by directing its search
A heuristic function, denoted h(n), estimates the cost of getting from any given node n to the destination node.
The main idea of A* is to evaluate each node based on two parameters:
g(n): the actual cost to get from the initial node to node n. It is the sum of the edge costs along the path from the initial node to n.
h(n): Heuristic cost (also known as "estimation cost") from node n to the destination node.
This problem-specific heuristic function must be admissible, meaning it never overestimates the actual cost of achieving
the goal. The evaluation function of node n is defined as f(n) = g(n) + h(n).
Algorithm A* selects the nodes to be explored based on the lowest value of f(n), preferring the nodes with the lowest
Find the node with the smallest f-value (i.e., the node with the smallest g(n) + h(n)) in the open list.
Move the selected node from the open list to the closed list.
For each successor, calculate its g-value as the sum of the current node's g value and the cost of moving
from the current node to the successor node. Update the g-value of the successor when a better path is found.
If the successor is not in the open list, add it with the calculated g-value and calculate its h-value. If it is
already in the open list, update its g value if the new path is better.
Repeat the cycle. Algorithm A* terminates when the target node is reached or when the open list empties, indicating
no paths from the start node to the target node. The A* search algorithm is widely used in various fields such as
robotics, video games, network routing, and design problems because it is efficient and can find optimal paths in
graphs or networks.
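The loop described above can be sketched as follows. The GRAPH and heuristic table H are hypothetical; H was chosen to be admissible and consistent for this graph. This sketch never reopens a node once it is on the closed list, which is only safe under that consistency assumption.

```python
import heapq

# Hypothetical weighted graph and a heuristic that never overestimates the cost to G.
GRAPH = {
    "S": {"A": 1, "B": 5},
    "A": {"B": 2, "G": 9},
    "B": {"G": 3},
}
H = {"S": 5, "A": 4, "B": 3, "G": 0}


def a_star(graph, h, start, goal):
    """Return (cost, path) of a cheapest path, or None when the open list empties."""
    open_list = [(h[start], 0, start, [start])]  # entries are (f, g, node, path)
    closed = set()
    best_g = {start: 0}
    while open_list:
        f, g, node, path = heapq.heappop(open_list)  # smallest f = g + h first
        if node == goal:
            return g, path
        if node in closed:
            continue  # an older, more expensive entry for an expanded node
        closed.add(node)  # move the node from the open list to the closed list
        for succ, cost in graph.get(node, {}).items():
            new_g = g + cost
            if succ not in closed and new_g < best_g.get(succ, float("inf")):
                best_g[succ] = new_g  # a better path to the successor was found
                heapq.heappush(open_list, (new_g + h[succ], new_g, succ, path + [succ]))
    return None


print(a_star(GRAPH, H, "S", "G"))  # (6, ['S', 'A', 'B', 'G'])
```

Instead of updating an entry already in the open list, the sketch pushes a new entry and skips the outdated one when it is popped; this is a common simplification when the open list is a binary heap.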
Early search algorithms:
Before the development of A*, various graph search algorithms existed, including Depth-First Search (DFS) and
Although these algorithms helped find paths, they did not guarantee optimality or consider heuristics to guide the
search
It was developed by Peter Hart, Nils Nilsson, and Bertram Raphael at the Stanford Research Institute (now SRI
International) as an extension of Dijkstra's algorithm and other search algorithms of the time.
A* was first published in 1968 and quickly gained recognition for its importance and effectiveness in the artificial
In 1959, Dutch computer scientist Edsger W. Dijkstra introduced Dijkstra's algorithm, which found the shortest path
Dijkstra's algorithm was efficient, but due to its exhaustive nature, it had limitations when used on larger graphs
A* development:
In 1968, Peter Hart, Nils Nilsson, and Bertram Raphael introduced the A* algorithm as a combination of Dijkstra's
A* used a heuristic function to estimate the cost from the current node to the destination node by combining it with
This allowed A* to explore the graph in a more informed way, avoiding unnecessary paths while still guaranteeing an optimal
solution.
Completeness and Optimality:
The authors of A* showed that the algorithm is complete (always finds a solution if one exists) and optimal (finds the shortest path)
A* quickly gained popularity in the AI and IT communities due to its efficiency. Researchers and developers have extended and
applied the A* algorithm to various fields, including robotics, video games, engineering, and network routing.
Several variations and optimizations of the A* algorithm have been proposed over the years, such as Incremental A* and Parallel A*.
Today, the A* search algorithm is still a fundamental and widely used algorithm in artificial intelligence and graph traversal.
Its impact on artificial intelligence and its contribution to pathfinding and optimization problems have made it a cornerstone
Optimal solution
Completeness
Efficiency
Versatility
Optimized search
Memory efficiency
Tunable Heuristics
Extensively researched
Web search
Optimal solution:
A* ensures finding the optimal (shortest) path from the start node to the destination node in the weighted
This optimality is a decisive advantage in many applications where finding the shortest path is essential.
Completeness:
If a solution exists, A* will find it, provided the graph does not have an infinite cost This completeness
Heuristics guide the search to a goal by focusing on promising paths and avoiding unnecessary exploration, making
A* more efficient than uninformed search algorithms such as breadth-first search or depth-first search.
Versatility:
A* is widely applicable to various problem areas, including pathfinding, route planning, robotics, game
A* can be used to find optimal solutions efficiently as long as a meaningful heuristic can be defined.
Optimized search:
A* maintains a priority queue to select the nodes with the smallest f(n) value (g(n) + h(n)) for
expansion.
This allows it to explore promising paths first, which reduces the search space and leads to faster
convergence.
Memory efficiency:
Unlike some other search algorithms, such as breadth-first search, A* stores only a limited number
of nodes in the priority queue, which makes it memory efficient, especially for large graphs.
Tunable Heuristics:
Better-informed heuristics can lead to faster convergence and fewer expanded nodes.
Extensively researched:
Many optimizations and variations have been developed, making it a reliable and well-understood
problem-solving tool.
Web search:
A* can be used for web-based path search, where the algorithm constantly updates the path according
dynamic scenarios.
Disadvantages of A* Search Algorithm in Artificial Intelligence
Although the A* (A-star) search algorithm is a widely used and powerful technique for solving AI pathfinding and
Heuristic accuracy
Memory usage
Time complexity
Cost Binding
The performance of the A* algorithm depends heavily on the accuracy of the heuristic function used to estimate
the cost from the current node to the destination. If the heuristic is inadmissible (it sometimes overestimates the actual cost) or
inconsistent (it violates the triangle inequality), A* may not find an optimal path or may explore more nodes than
Memory usage:
A* requires that all visited nodes be kept in memory to keep track of explored paths.
Memory usage can sometimes become a significant issue, especially when dealing with a large search space or
Although A* is generally efficient, its time complexity can be a concern for vast search spaces or graphs.
In the worst case, A* can take exponential time to find the optimal path if the heuristic is inappropriate
In specific scenarios, the A* algorithm needs to explore nodes far from the destination before finally
This problem occurs when the heuristic fails to direct the search toward the goal effectively early on.
Cost Binding (tie-breaking):
A* faces difficulties when multiple nodes have the same f-value (the sum of the actual cost and the
heuristic cost).
The tie-breaking strategy used can affect the efficiency of the search and which of several equally good paths is found. If not handled well,
it can lead to unnecessary nodes being explored and slow down the algorithm.
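One possible way to handle equal f-values is sketched below: among nodes with the same f, prefer the one with the smaller h (the one believed to be closer to the goal), and fall back to insertion order. The helper push and the node names are hypothetical, and other tie-breaking rules are equally valid.

```python
import heapq
from itertools import count

_counter = count()  # insertion order, used as the last tie-breaker


def push(open_list, node, g, h):
    """Order entries by f = g + h, then by smaller h, then by insertion order."""
    heapq.heappush(open_list, (g + h, h, next(_counter), node))


open_list = []
push(open_list, "far", g=1, h=5)        # f = 6, still far from the goal
push(open_list, "near", g=5, h=1)       # f = 6, almost at the goal
push(open_list, "also_near", g=5, h=1)  # f = 6, same h as "near"

order = [heapq.heappop(open_list)[3] for _ in range(3)]
print(order)  # ['near', 'also_near', 'far']
```

All three entries have f = 6, yet the two nodes with the smaller heuristic value are expanded first, which tends to reach the goal with fewer expansions.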
In dynamic environments where the cost of edges or nodes may change during the search, A* may not be
Replanning from scratch can be computationally expensive, and D* (Dynamic A*) algorithms were
Maze solving
Puzzle-solving
Network Routing
Game AI
Path finding in Games:
A* is often used in video games for character movement, enemy AI navigation, and finding the shortest path from one location to
Its ability to find the optimal path based on cost and heuristics makes it ideal for real-time applications such as games.
A* is used in robotics and autonomous vehicle navigation to plan an optimal route for robots to reach a destination, avoiding
Maze solving:
A* can efficiently find the shortest path through a maze, making it valuable in many maze-solving applications, such as solving
In GPS systems and mapping applications, A* can be used to find the optimal route between two points on a map, considering
Resource Allocation:
In scenarios where resources must be optimally allocated, A* can help find the most efficient allocation path, minimizing cost
Network Routing:
A* can be used in computer networks to find the most efficient route for data packets from a source to a destination node.
In some NLP tasks, A* can generate coherent and contextual responses by searching for possible word sequences based on their