Algorithm A*

A* is a special version of algorithm A in which the heuristic function is constrained to be admissible.

It is one of the most popular and efficient algorithms used for pathfinding and graph traversal.

It uses a heuristic function \(h(n)\), for which

\[h(n) \leq h^*(n)\]
  • \(h^*(n) \rightarrow\) the actual cost of an optimal path joining \(n\) to a goal node.

This is known as an admissible heuristic function.

Algorithm A* provides a guarantee of an optimal path from start node to a goal node, provided at least one such path exists.

Therefore, A* itself is said to be admissible.
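To make the procedure concrete, here is a minimal A* sketch in Python (the toy graph, heuristic values, and function name are our own illustrative assumptions; closed nodes are never reopened, which is safe here because the example heuristic is consistent):

```python
import heapq

def a_star(graph, h, start, goal):
    """Minimal A* sketch. `graph` maps each node to a list of
    (successor, arc_cost) pairs; `h(n)` must never overestimate the
    true remaining cost h*(n) (admissibility)."""
    # OPEN as a priority queue ordered by f(n) = g(n) + h(n).
    open_list = [(h(start), 0, start, [start])]
    closed = set()
    while open_list:
        f, g, n, path = heapq.heappop(open_list)
        if n == goal:
            return path, g          # first goal popped is optimal
        if n in closed:
            continue
        closed.add(n)
        for m, cost in graph.get(n, []):
            if m not in closed:
                heapq.heappush(open_list,
                               (g + cost + h(m), g + cost, m, path + [m]))
    return None, float('inf')       # no path from start to goal

# Toy example: the heuristic below satisfies h(n) <= h*(n).
graph = {'s': [('a', 1), ('b', 4)], 'a': [('b', 2), ('goal', 5)],
         'b': [('goal', 1)]}
h = {'s': 3, 'a': 3, 'b': 1, 'goal': 0}
path, cost = a_star(graph, h.get, 's', 'goal')   # optimal: s-a-b-goal, cost 4
```

Here `h.get` plays the role of \(h(n)\); with \(h \equiv 0\) the procedure reduces to uniform-cost (Dijkstra) search.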

Lemma

At any time before A* terminates, there must be at least one node \(n'\) on the \(\texttt{OPEN}\) list that lies on an optimal path from the start node to a goal node, with:

\[f(n') \leq f^*(s)\]

Let the optimal path be \((s = n_0, n_1, \dots, n_k = \texttt{goal})\)

Let \(n'\) be the first node in the sequence which is on \(\texttt{OPEN}\) at any time before A* terminates.

Since \(n'\) lies on an optimal solution path and all of its predecessors on that path have already been expanded,

\[g(n') = g^*(n')\]

Thus,

\[\begin{align*} f(n') &= g(n') + h(n') \\ &= g^*(n') + h(n')\\ \Rightarrow h(n') &= f(n') - g^*(n') \end{align*}\]

Since \(h\) is an admissible heuristic function,

\[\begin{align*} h(n') &\leq h^*(n')\\ \Rightarrow f(n') - g^*(n') &\leq h^*(n')\\ \Rightarrow f(n') &\leq g^*(n') + h^*(n')\\ \Rightarrow f(n') &\leq f^*(n')\\ \end{align*}\]

Since \(n'\) lies on an optimal path from \(s\) to \(\texttt{goal}\), \( f^*(n') = f^*(s) \). Hence,

\[f(n') \leq f^*(s)\]

Hence, the lemma is proved.
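The lemma can also be checked empirically. In this sketch (toy graph and heuristic are our own assumptions), before every expansion at least one \(\texttt{OPEN}\) node on the optimal path \(s\)-\(a\)-\(b\)-\(t\) has \(f \leq f^*(s) = 4\):

```python
import heapq

# Toy graph (illustrative): arcs as (successor, cost); h is admissible.
graph = {'s': [('a', 1), ('b', 4)], 'a': [('b', 2), ('t', 5)],
         'b': [('t', 1)]}
h = {'s': 3, 'a': 3, 'b': 1, 't': 0}
optimal_nodes = {'s', 'a', 'b', 't'}   # nodes of the optimal path s-a-b-t
f_star_s = 4                           # its cost, f*(s)

open_list = [(h['s'], 0, 's')]
g_best, closed = {'s': 0}, set()
lemma_held = True
while open_list:
    f, g, n = heapq.heappop(open_list)
    if n == 't':
        break                          # goal reached with cost g
    if n in closed:
        continue
    # Lemma: some OPEN node n' on the optimal path has f(n') <= f*(s).
    candidates = [fo for fo, go, no in open_list + [(f, g, n)]
                  if no in optimal_nodes and no not in closed]
    lemma_held &= any(fo <= f_star_s for fo in candidates)
    closed.add(n)
    for m, c in graph[n]:
        if g + c < g_best.get(m, float('inf')):
            g_best[m] = g + c
            heapq.heappush(open_list, (g + c + h[m], g + c, m))
```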

Theorem: A* Must Terminate

Preconditions:

  • There is a path from \(s\) to a goal node

  • Every node in the state space graph has a finite branching factor

  • The cost of every arc in the state space graph is strictly positive

Let us assume that A* does not terminate.

In that case:

  • \(g(n)\) (and hence, \(f(n)\)) will continue to increase indefinitely

  • there will come a time when no node \(n\) on OPEN has \(f(n) \leq f^*(s)\)

This is not consistent with the lemma.

Hence, if there is a path from \(s\) to a goal node, A* must terminate.

Theorem: A* Is Admissible

The path formed by A* upon terminating is an optimal solution if an admissible heuristic function is used.

Let \(s\) be the start node and \(t\) a goal node.

Let us assume that A* terminates by finding goal node \(t\) via a suboptimal path, i.e., without finding an optimal solution.

Since \(t\) is a goal node, \(h(t) = 0\). Therefore,

\[f(t) = g(t) + h(t) = g(t)\]

Since the path found to \(t\) is not optimal,

\[g(t) > g^*(t) \Rightarrow f(t) > g^*(t)\]

Again, since \(t\) is a goal node, \(g^*(t) = f^*(s)\). Therefore,

\[f(t) > f^*(s)\]

Since A* selected node \(t\) for expansion ahead of every other node on \(\texttt{OPEN}\), and the lemma guarantees some node \(n'\) on \(\texttt{OPEN}\) with \(f(n') \leq f^*(s)\), it must be the case that

\[f(t) \leq f^*(s)\]

This is a contradiction.

Therefore, A* must be admissible.
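The role of admissibility can be seen by running the same search with an overestimating heuristic (the toy graph and the specific \(h\) values are assumed for illustration): the admissible heuristic yields the optimal cost, while the inadmissible one terminates with a costlier path.

```python
import heapq

def a_star_cost(graph, h, start, goal):
    """Returns the cost of the first goal node popped from OPEN."""
    open_list = [(h(start), 0, start)]
    closed = set()
    while open_list:
        f, g, n = heapq.heappop(open_list)
        if n == goal:
            return g
        if n in closed:
            continue
        closed.add(n)
        for m, c in graph.get(n, []):
            if m not in closed:
                heapq.heappush(open_list, (g + c + h(m), g + c, m))
    return float('inf')

graph = {'s': [('a', 1), ('t', 3)], 'a': [('t', 1)]}   # optimal: s-a-t, cost 2
admissible   = {'s': 0, 'a': 0, 't': 0}.get            # h(n) <= h*(n) everywhere
inadmissible = {'s': 0, 'a': 10, 't': 0}.get           # h(a) = 10 > h*(a) = 1

cost_ok  = a_star_cost(graph, admissible, 's', 't')    # 2: optimal
cost_bad = a_star_cost(graph, inadmissible, 's', 't')  # 3: overestimate hid node a
```

The overestimate at \(a\) pushes its \(f\)-value above the direct arc's cost, so the search commits to the worse path \(s\)-\(t\).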

Informedness of A*

Let \(A_1\) and \(A_2\) be two versions of the A* algorithm with heuristic functions \(h_1(n)\) and \(h_2(n)\) respectively.

\(A_2\) is considered more informed than \(A_1\) if, for all non-goal nodes \(n\),

\[h_2(n) > h_1(n)\]

At the termination of their searches on any graph having a path from \(s\) to a goal node, every node expanded by \(A_2\) is also expanded by \(A_1\). It follows that \(A_1\) expands at least as many nodes as does \(A_2\).

\[h^*(n) = \min_i\{cost(n, g_i)\}\]
  • \(g_i \rightarrow\) the \(i\)-th goal node; \(h^*(n)\) is thus the cost of a cheapest path from \(n\) to any goal node.

The proof is by induction on the depth of nodes in the \(A_2\) search tree. For the base case (depth \(0\)), both algorithms expand the start node \(s\).

Induction Hypothesis

Assume \(A_1\) expands all the nodes expanded by \(A_2\) having depth \(k\) or less in the \(A_2\) search tree.

Let us now prove that any node \(n\) expanded by \(A_2\) and of depth \(k + 1\) in the \(A_2\) search tree is also expanded by \(A_1\).


Assume \(A_1\) did not expand node \(n\). Since \(A_1\) nevertheless terminated with a solution of cost \(f^*(s)\),

\[\begin{align*} f_1(n) &\geq f^*(s)\\ \Rightarrow g_1(n) + h_1(n) &\geq f^*(s)\\ \Rightarrow h_1(n) &\geq f^*(s) - g_1(n) \;\;\; \text{--} \;\;\; \text{(i)} \end{align*}\]

As per the induction hypothesis, any ancestor of \(n\) in the \(A_2\) search tree is also expanded by \(A_1\).

Hence node \(n\) is also in the \(A_1\) search tree, and

\[g_1(n) \leq g_2(n)\]
  • \(g(n) \rightarrow\) minimum of the costs of all paths discovered so far between \(s\) and \(n\).

Since \(A_2\) is admissible and it expands node \(n\),

\[\begin{align*} f_2(n) &\leq f^*(s)\\ \Rightarrow g_2(n) + h_2(n) &\leq f^*(s)\\ \Rightarrow h_2(n) &\leq f^*(s) - g_2(n)\\ \Rightarrow h_2(n) &\leq f^*(s) - g_1(n) \;\;\; \text{--} \;\;\; \text{(ii)} \end{align*}\]

From \(\text{(i)}\) and \(\text{(ii)}\),

\[h_2(n) \leq h_1(n)\]

But this contradicts the assumption that \(h_2(n) > h_1(n)\) for all non-goal nodes.

Hence, the theorem is proved.
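This result can be observed directly (the grid world and function name are our own illustration): on a \(10 \times 10\) grid with unit arc costs, \(h_1 = 0\) (uninformed) and \(h_2\) the Manhattan distance satisfy \(h_2(n) > h_1(n)\) at every non-goal node, and \(A_1\) expands at least as many nodes as \(A_2\).

```python
import heapq

def expansions(h, start, goal, size=10):
    """A* on a size x size grid with 4-neighbour unit moves;
    returns the number of nodes expanded before the goal is popped."""
    open_list = [(h(start), 0, start)]
    closed, count = set(), 0
    while open_list:
        f, g, n = heapq.heappop(open_list)
        if n == goal:
            return count
        if n in closed:
            continue
        closed.add(n)
        count += 1
        x, y = n
        for m in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= m[0] < size and 0 <= m[1] < size and m not in closed:
                heapq.heappush(open_list, (g + 1 + h(m), g + 1, m))
    return count

goal = (9, 9)
h1 = lambda n: 0                               # A1: uninformed
h2 = lambda n: abs(n[0] - 9) + abs(n[1] - 9)   # A2: Manhattan distance

n1 = expansions(h1, (0, 0), goal)
n2 = expansions(h2, (0, 0), goal)   # n1 >= n2: A1 does at least as much work
```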

Consistency/Monotone Condition

Describing the GRAPHSEARCH procedure, we noted that when a node \(n\) is expanded, some of its successors may already be on OPEN or CLOSED.

The search tree may then need to be adjusted so that it defines the least costly paths in \(G\) from node \(s\) to the descendants of node \(n\). In addition to the burden of adjusting the search tree, it is often computationally quite expensive to test whether a node has been generated before.

We now show that given a rather mild and reasonable restriction on \(h\), when A* selects a node for expansion, it has already found an optimal path to that node. Thus, with this restriction, there is no need for A* to test if a newly generated node is already on CLOSED and there is no need to change the parentage in the search tree of any successors of this node in the search graph.

Let \(n_j\) be an immediate successor of \(n_i\).


Then the consistency/monotone condition states that:

Along any path in the search graph, the heuristic estimate of the optimal (remaining) cost to the goal cannot decrease by more than the arc cost along the path. That is,

\[h(n_i) \leq h(n_j) + c(n_i, n_j)\]

The consistency condition also implies that \(f\)-values of the nodes in the search tree are monotonically non-decreasing as we move away from the start node.

For any three consecutive nodes \(n_i\), \(n_j\), \(n_k\) along a path from the start node,

\[f(n_i) \leq f(n_j) \leq f(n_k)\]

Proof

Let \(n_i\) and \(n_j\) be two nodes in the search tree generated by A*, with \(n_j\) a successor of \(n_i\), so that \(g(n_j) = g(n_i) + c(n_i, n_j)\).

By the consistency condition,

\[\begin{align*} h(n_j) &\geq h(n_i) - c(n_i, n_j)\\ \Rightarrow g(n_j) + h(n_j) &\geq h(n_i) + g(n_j) - c(n_i, n_j)\\ \Rightarrow g(n_j) + h(n_j) &\geq h(n_i) + g(n_i)\\ \Rightarrow f(n_j) &\geq f(n_i)\\ \end{align*}\]

Hence, proved.
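The non-decreasing \(f\)-values are easy to observe empirically (the grid example is our own assumption). Here the heuristic is the Chebyshev distance, which decreases by at most 1 per unit-cost move and is therefore consistent; the sequence of \(f\)-values in expansion order never decreases.

```python
import heapq

size, goal = 6, (5, 5)
h = lambda n: max(abs(n[0] - 5), abs(n[1] - 5))   # Chebyshev: consistent here

open_list = [(h((0, 0)), 0, (0, 0))]
closed, f_sequence = set(), []
while open_list:
    f, g, n = heapq.heappop(open_list)
    if n in closed:
        continue
    closed.add(n)
    f_sequence.append(f)                          # f-value at expansion time
    if n == goal:
        break
    x, y = n
    for m in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= m[0] < size and 0 <= m[1] < size and m not in closed:
            heapq.heappush(open_list, (g + 1 + h(m), g + 1, m))

monotone = all(a <= b for a, b in zip(f_sequence, f_sequence[1:]))
```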

Theorem: Optimal Path Under Monotonicity

If the monotone condition is satisfied, then A* has already found an optimal path to any node it selects for expansion.

That is, if A* selects \(n\) for expansion and if the monotone condition is satisfied, then

\[g(n) = g^*(n)\]

Proof

Let \(\xi = (n_0, n_1, n_2, \dots, n_l, n_{l+1}, \dots, n = n_k)\) be a sequence of nodes constituting an optimal path from \(n_0\) to \(n\).

Let \(n_l\) be the last node on \(\xi\) expanded by A*. Then \(n_{l+1}\) must be on \(\texttt{OPEN}\) and a candidate for expansion.

Let:

  • \(n_i =\) any node on \(\xi\)

  • \(n_{i+1} =\) a successor of \(n_i\), also on an optimal path

Computing the \(g\)-values along \(\xi\) (so that \(g(n_i) = g^*(n_i)\) for nodes on \(\xi\)), the monotone condition gives:

\[\begin{align*} g(n_{i+1}) + h(n_{i+1}) &= g(n_i) + c(n_i, n_{i+1}) + h(n_{i + 1})\\ &\geq g(n_i) + h(n_i) \end{align*}\]

Transitivity of the relation \(\geq\) gives us:

\[g(n_j) + h(n_j) \geq g(n_i) + h(n_i)\]

for any \(n_i\), \(n_j\) on \(\xi\) with \(i < j\).

In particular,

\[g(n) + h(n) \geq \underbrace{g(n_{l+1}) + h(n_{l+1})}_{f(n_{l+1})}\]

Since A* has found an optimal path to \(n_{l+1}\),

\[g(n_{l+1}) = g^*(n_{l+1})\]

Since A* is about to expand \(n\) instead of \(n_{l+1}\), it must be the case that

\[\begin{align*} f(n) &= g(n) + h(n)\\ &\leq f(n_{l+1}) \end{align*}\]

But since the \(g\)-values along the optimal path \(\xi\) equal the corresponding \(g^*\)-values, we have already established that

\[g^*(n) + h(n) \geq f(n_{l+1})\]

Therefore, \(g(n) \leq g^*(n)\)

But since our method for computing the \(g\)'s implies that \(g(n) \geq g^*(n)\), it must be the case that:

\[g(n) = g^*(n)\]

Hence, proved.
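This theorem can also be checked empirically (the grid example is our own assumption): with a consistent heuristic, every node's \(g\)-value equals \(g^*\) at the moment it is selected for expansion. On an obstacle-free grid starting from \((0, 0)\) with unit moves, \(g^*((x, y)) = x + y\).

```python
import heapq

size, goal = 6, (5, 5)
h = lambda n: abs(n[0] - 5) + abs(n[1] - 5)    # Manhattan: consistent here

open_list = [(h((0, 0)), 0, (0, 0))]
closed = set()
optimal_at_expansion = True
while open_list:
    f, g, n = heapq.heappop(open_list)
    if n in closed:
        continue                               # stale OPEN entry; skip
    closed.add(n)
    # On this obstacle-free grid, g*((x, y)) = x + y.
    optimal_at_expansion &= (g == n[0] + n[1])
    if n == goal:
        break
    x, y = n
    for m in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= m[0] < size and 0 <= m[1] < size and m not in closed:
            heapq.heappush(open_list, (g + 1 + h(m), g + 1, m))
```

Because the optimal \(g\)-value is already known when a node is expanded, no reopening of \(\texttt{CLOSED}\) nodes is ever needed, which is exactly why the sketch above may safely skip stale entries.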