# What is A*search?

A* (pronounced "A-star") is a popular pathfinding algorithm used in computer science and artificial intelligence. It helps find the shortest path between two points in a graph or a grid.

Imagine you're standing in a maze and you want to find the shortest path to reach the exit. A* can help you with that. The algorithm works by exploring different possible paths, evaluating each one based on two factors:

1. The actual cost of reaching a particular point from the starting point.

2. An estimated cost of how much farther it is to reach the destination from that point.

The algorithm then combines these two factors to make a decision on which path to explore next. It chooses the path that has the lowest combined cost.

To make these cost calculations, A* uses something called a heuristic function. This function provides an estimate of how close a particular point is to the destination. It helps guide the algorithm towards the goal by favoring paths that appear to be closer.

A* continues exploring different paths, updating the costs and making decisions based on the combined values until it reaches the destination. This way, it avoids exploring unnecessary paths and efficiently finds the shortest path.

In summary, A* is like having a smart explorer in a maze who keeps evaluating paths based on their total cost and the estimated remaining distance to the goal. It makes decisions to move towards the destination, ultimately finding the shortest path through the maze.

## Who invented it?

The A* algorithm was developed by Peter Hart, Nils Nilsson, and Bertram Raphael in 1968. They introduced it in a paper titled "A Formal Basis for the Heuristic Determination of Minimum Cost Paths" and it quickly became widely used in various fields of computer science and artificial intelligence. Since then, A* has been a fundamental algorithm in pathfinding and has had numerous applications in areas such as robotics, video games, and route planning systems.

## Pseudocode

```
function AStar(start, goal):
openSet = create an empty set
closedSet = create an empty set
add start to openSet
while openSet is not empty:
current = node in openSet with the lowest f_score
if current is equal to goal:
return reconstructPath(current)
remove current from openSet
add current to closedSet
for each neighbor of current:
if neighbor is in closedSet:
continue
tentative_g_score = g_score[current] + distance(current, neighbor)
if neighbor is not in openSet or tentative_g_score < g_score[neighbor]:
set g_score[neighbor] = tentative_g_score
set h_score[neighbor] = heuristic(neighbor, goal)
set f_score[neighbor] = g_score[neighbor] + h_score[neighbor]
set parent[neighbor] = current
if neighbor is not in openSet:
add neighbor to openSet
return failure
function reconstructPath(current):
path = [current]
while current has a parent:
current = parent[current]
add current to path
return path
```

In this pseudocode, start represents the starting node, goal represents the destination node, and neighbor refers to the neighboring nodes of the current node. g_score represents the cost of reaching a node from the start, h_score represents the heuristic (estimated) cost from a node to the goal, and f_score is the sum of g_score and h_score.

The algorithm uses an open set to keep track of nodes to be evaluated and a closed set to store nodes that have already been evaluated. It iteratively selects the node with the lowest f_score from the open set, evaluates its neighbors, and updates their scores accordingly. The process continues until the goal node is reached or there are no more nodes in the open set.

Finally, the algorithm reconstructs the path by following the parent pointers from the goal node back to the start node.

The algorithm uses an open set to keep track of nodes to be evaluated and a closed set to store nodes that have already been evaluated. It iteratively selects the node with the lowest f_score from the open set, evaluates its neighbors, and updates their scores accordingly. The process continues until the goal node is reached or there are no more nodes in the open set.

Finally, the algorithm reconstructs the path by following the parent pointers from the goal node back to the start node.

## Sample Code

```
// C++ code snippet
#include < iostream>
#include < vector>
#include < set>
#include < unordered_map>
#include < limits>
// Structure representing a node in the graph
struct Node {
int id;
std::vector< Node*> neighbors;
};
// Function to calculate the heuristic (estimated) cost from a node to the goal
int heuristic(Node* current, Node* goal) {
// Implement your own heuristic calculation here
// This is just a placeholder
return 0;
}
// Function to perform the A* algorithm
std::vector < Node*> AStar(Node* start, Node* goal) {
std::unordered_map < Node*, Node*> parent;
std::unordered_map < Node*, int> g_score;
std::unordered_map < Node*, int> h_score;
std::unordered_map < Node*, int> f_score;
std::set < std::pair < int, Node*>> openSet;
openSet.insert(std::make_pair(0, start));
g_score[start] = 0;
h_score[start] = heuristic(start, goal);
f_score[start] = g_score[start] + h_score[start];
while (!openSet.empty()) {
Node* current = openSet.begin()->second;
if (current == goal) {
// Reconstruct and return the path
std::vector
``` path;
while (current != nullptr) {
path.push_back(current);
current = parent[current];
}
std::reverse(path.begin(), path.end());
return path;
}
openSet.erase(openSet.begin());
for (Node* neighbor : current->neighbors) {
int tentative_g_score = g_score[current] + 1; // Assuming all edges have a cost of 1
if (g_score.find(neighbor) == g_score.end() || tentative_g_score < g_score[neighbor]) {
parent[neighbor] = current;
g_score[neighbor] = tentative_g_score;
h_score[neighbor] = heuristic(neighbor, goal);
f_score[neighbor] = g_score[neighbor] + h_score[neighbor];
openSet.insert(std::make_pair(f_score[neighbor], neighbor));
}
}
}
// No path found
return std::vector();
}
int main() {
// Create nodes for a simple graph
Node* A = new Node{1};
Node* B = new Node{2};
Node* C = new Node{3};
Node* D = new Node{4};
Node* E = new Node{5};
// Connect the nodes in the graph
A->neighbors = {B, C};
B->neighbors = {A, D};
C->neighbors = {A, D, E};
D->neighbors = {B, C, E};
E->neighbors = {C, D};
// Run A* algorithm
std::vector < Node*> path = AStar(A, E);
// Print the resulting path
if (!path.empty()) {
std::cout << "Path found: ";
for (Node* node : path) {
std::cout << node->id << " ";
}
std::cout << std::endl;
} else {
std::cout << "Path not found!" << std::endl;
}
// Clean up memory
delete A;
delete B;
delete C;
delete D;
delete E;
return 0;
}

```
# Python code snippet
import heapq
# Structure representing a node in the graph
class Node:
def __init__(self, id):
self.id = id
self.neighbors = []
# Function to calculate the heuristic (estimated) cost from a node to the goal
def heuristic(current, goal):
# Implement your own heuristic calculation here
# This is just a placeholder
return 0
# Function to perform the A* algorithm
def AStar(start, goal):
openSet = [(0, start)]
closedSet = set()
g_score = {start: 0}
h_score = {start: heuristic(start, goal)}
f_score = {start: g_score[start] + h_score[start]}
parent = {}
while openSet:
current = heapq.heappop(openSet)[1]
if current == goal:
# Reconstruct and return the path
path = []
while current is not None:
path.append(current)
current = parent.get(current)
return path[::-1]
closedSet.add(current)
for neighbor in current.neighbors:
tentative_g_score = g_score[current] + 1 # Assuming all edges have a cost of 1
if neighbor in closedSet:
continue
if neighbor not in g_score or tentative_g_score < g_score[neighbor]:
parent[neighbor] = current
g_score[neighbor] = tentative_g_score
h_score[neighbor] = heuristic(neighbor, goal)
f_score[neighbor] = g_score[neighbor] + h_score[neighbor]
heapq.heappush(openSet, (f_score[neighbor], neighbor))
# No path found
return []
# Create nodes for a simple graph
A = Node(1)
B = Node(2)
C = Node(3)
D = Node(4)
E = Node(5)
# Connect the nodes in the graph
A.neighbors = [B, C]
B.neighbors = [A, D]
C.neighbors = [A, D, E]
D.neighbors = [B, C, E]
E.neighbors = [C, D]
# Run A* algorithm
path = AStar(A, E)
# Print the resulting path
if path:
print("Path found:", [node.id for node in path])
else:
print("Path not found!")
```

```
import java.util.*;
// Structure representing a node in the graph
class Node {
int id;
List
``` neighbors;
public Node(int id) {
this.id = id;
this.neighbors = new ArrayList<>();
}
}
class AStar {
// Function to calculate the heuristic (estimated) cost from a node to the goal
static int heuristic(Node current, Node goal) {
// Implement your own heuristic calculation here
// This is just a placeholder
return 0;
}
// Function to perform the A* algorithm
static List AStar(Node start, Node goal) {
PriorityQueue openSet = new PriorityQueue<>(Comparator.comparingInt(node -> node.fScore));
Set closedSet = new HashSet<>();
Map gScore = new HashMap<>();
Map hScore = new HashMap<>();
Map fScore = new HashMap<>();
Map parent = new HashMap<>();
gScore.put(start, 0);
hScore.put(start, heuristic(start, goal));
fScore.put(start, gScore.get(start) + hScore.get(start));
openSet.add(start);
while (!openSet.isEmpty()) {
Node current = openSet.poll();
if (current == goal) {
// Reconstruct and return the path
List path = new ArrayList<>();
while (current != null) {
path.add(current);
current = parent.get(current);
}
Collections.reverse(path);
return path;
}
closedSet.add(current);
for (Node neighbor : current.neighbors) {
int tentativeGScore = gScore.get(current) + 1; // Assuming all edges have a cost of 1
if (closedSet.contains(neighbor)) {
continue;
}
if (!gScore.containsKey(neighbor) || tentativeGScore < gScore.get(neighbor)) {
parent.put(neighbor, current);
gScore.put(neighbor, tentativeGScore);
hScore.put(neighbor, heuristic(neighbor, goal));
fScore.put(neighbor, gScore.get(neighbor) + hScore.get(neighbor));
if (!openSet.contains(neighbor)) {
openSet.add(neighbor);
}
}
}
}
// No path found
return Collections.emptyList();
}
public static void main(String[] args) {
// Create nodes for a simple graph
Node A = new Node(1);
Node B = new Node(2);
Node C = new Node(3);
Node D = new Node(4);
Node E = new Node(5);
// Connect the nodes in the graph
A.neighbors = Arrays.asList(B, C);
B.neighbors = Arrays.asList(A, D);
C.neighbors = Arrays.asList(A, D, E);
D.neighbors = Arrays.asList(B, C, E);
E.neighbors = Arrays.asList(C, D);
// Run A* algorithm
List path = AStar(A, E);
// Print the resulting path
if (!path.isEmpty()) {
System.out.print("Path found: ");
for (Node node : path) {
System.out.print(node.id + " ");
}
System.out.println();
} else {
System.out.println("Path not found!");
}
}
}

## Time and Space Complexity

Time Complexity:

In the worst-case scenario, where every node needs to be expanded before reaching the goal, the time complexity of A* is exponential. However, with an effective heuristic function, A* often performs significantly better than that.

In the best-case scenario, where the heuristic provides perfect guidance towards the goal, A* can achieve a complexity of O(b^d), where b is the branching factor (average number of successors per node) and d is the depth of the solution. This makes it more efficient than uninformed search algorithms like breadth-first search or depth-first search.

In practice, the time complexity of A* is often closer to O(b^d) when the heuristic is well-designed and provides reasonable guidance towards the goal.

Space Complexity:

The space complexity of A* is determined by the number of nodes it needs to store in memory during the search. Specifically, it depends on the maximum number of nodes stored in the open set and the closed set.

In the worst-case scenario, where all nodes need to be stored, the space complexity is also exponential. However, in practice, A* typically has a lower space complexity due to various optimizations such as using efficient data structures (e.g., priority queues) and discarding unnecessary nodes.

The actual space complexity of A* depends on factors such as the size of the graph, the number of explored nodes, and the efficiency of the implementation. It can range from linear (O(V)) to exponential (O(b^d)).

In the worst-case scenario, where every node needs to be expanded before reaching the goal, the time complexity of A* is exponential. However, with an effective heuristic function, A* often performs significantly better than that.

In the best-case scenario, where the heuristic provides perfect guidance towards the goal, A* can achieve a complexity of O(b^d), where b is the branching factor (average number of successors per node) and d is the depth of the solution. This makes it more efficient than uninformed search algorithms like breadth-first search or depth-first search.

In practice, the time complexity of A* is often closer to O(b^d) when the heuristic is well-designed and provides reasonable guidance towards the goal.

Space Complexity:

The space complexity of A* is determined by the number of nodes it needs to store in memory during the search. Specifically, it depends on the maximum number of nodes stored in the open set and the closed set.

In the worst-case scenario, where all nodes need to be stored, the space complexity is also exponential. However, in practice, A* typically has a lower space complexity due to various optimizations such as using efficient data structures (e.g., priority queues) and discarding unnecessary nodes.

The actual space complexity of A* depends on factors such as the size of the graph, the number of explored nodes, and the efficiency of the implementation. It can range from linear (O(V)) to exponential (O(b^d)).

## Advantages

1. Completeness: A* is guaranteed to find a solution if one exists, given that the heuristic function is admissible (never overestimates the actual cost) and the graph is finite.

2. Optimality: When the heuristic function is both admissible and consistent (also known as monotonic), A* is optimal, meaning it guarantees finding the shortest path.

3. Efficiency: A* is generally efficient compared to uninformed search algorithms like breadth-first search or depth-first search because it uses the heuristic to guide the search towards the goal.

4. Flexibility: A* can be customized to fit various problem domains by choosing different heuristic functions, allowing it to adapt to different types of graphs and search spaces.

2. Optimality: When the heuristic function is both admissible and consistent (also known as monotonic), A* is optimal, meaning it guarantees finding the shortest path.

3. Efficiency: A* is generally efficient compared to uninformed search algorithms like breadth-first search or depth-first search because it uses the heuristic to guide the search towards the goal.

4. Flexibility: A* can be customized to fit various problem domains by choosing different heuristic functions, allowing it to adapt to different types of graphs and search spaces.

## Disadvantages

1. Heuristic Quality: The effectiveness of A* heavily relies on the quality of the heuristic function. If the heuristic is poorly designed or inaccurate, it may lead to suboptimal or inefficient paths.

2. Memory Requirements: A* requires storing and updating information for all nodes in the open and closed sets, which can consume significant memory, especially in large graphs.

3. Time Complexity: In the worst-case scenario, the time complexity of A* can be exponential, particularly when the branching factor of the graph is high and the heuristic function is not effective.

4. Difficulty with Changing Environments: A* assumes that the graph remains static during the search process. If the environment changes dynamically, such as obstacles moving or paths being blocked, A* may need to be recalculated or modified to handle such changes.

Overall, while A* has notable advantages in terms of completeness, optimality, and efficiency, it also has limitations in terms of heuristic quality, memory requirements, time complexity, and adaptability to changing environments. It is essential to consider these factors when deciding whether to use A* for a particular problem.

2. Memory Requirements: A* requires storing and updating information for all nodes in the open and closed sets, which can consume significant memory, especially in large graphs.

3. Time Complexity: In the worst-case scenario, the time complexity of A* can be exponential, particularly when the branching factor of the graph is high and the heuristic function is not effective.

4. Difficulty with Changing Environments: A* assumes that the graph remains static during the search process. If the environment changes dynamically, such as obstacles moving or paths being blocked, A* may need to be recalculated or modified to handle such changes.

Overall, while A* has notable advantages in terms of completeness, optimality, and efficiency, it also has limitations in terms of heuristic quality, memory requirements, time complexity, and adaptability to changing environments. It is essential to consider these factors when deciding whether to use A* for a particular problem.