Sebastian Lague A* Tutorial Series – Algorithm Explanation – Pt. 01

November 20, 2019

A* Tutorial Series

Pt. 01 – Algorithm Explanation

A* Pathfinding (E01: algorithm explanation)

Link – Tutorial

By: Sebastian Lague


Notes:

G cost = distance from starting node H cost (heuristic) = distance from the end node F cost = G cost + H cost

A* starts by creating a grid of an area. There is a starting node (initial position) and an end node (destination). Generally, you can move orthogonally (which is normalized as a distance of 1) or diagonally (which would then be a value of sqrt of 2, which is approx. 1.4). Just to make it look nicer and easier to read, it is standard to multiply these distances by 10 so that moving orthogonally has a value of 10, and diagonally has a value of 14.

The A* algorithm starts by generating the G cost and H cost of all 8 squares around the starting point (in a 2D example for ease of understanding). These are then used to calculate the F cost for each square. It starts by searching for the lowest F cost and then expanding the calculations to every square in contact with it (orthogonally and diagonally). Recalculations may be necessary if a certain path finds a way to reach the same square with a lower F cost. If multiple squares have the same F cost, it prioritizes the one with the lowest H cost (closest to the end node). And if there is still a tie, it basically can just pick one at random.

It is worth reiterating that the costs of a square can be updated through a single pathfinding event. This however only occurs, if the resulting F cost would be lower than what is already found in that square. This is actually very important as the search can lead to strange paths to certain squares giving them higher F costs than they should have when there is a much more direct way to reach that same square from the starting node.

Coding Approach

Psuedo Code (directly from tutorial):
OPEN //the set of nodes to be evaluated
CLOSED //the set of nodes already evaluated
add the start node to OPEN

loop
current = node in OPEN with the lowest f_cost
remove current from OPEN
add current to CLOSED

if current is the target node //path has been found
return

foreach neighbour of the current node
if neighbour is not traversable OR neighbour is in CLOSED
skip to the next neighbour

if new path to neighbour is shorter OR neighbour is not in OPEN
set f_cost of neighbour
set parent of neighbour to current
if neighbour is not in OPEN
add neighbour to OPEN

There are two lists of nodes: OPEN and CLOSED. The OPEN list are those nodes selected to be evaluated, and the CLOSED list are nodes that have already been evaluated. It starts by finding the node in the OPEN list with the lowest F cost. They then move this node from the OPEN list to the CLOSED list.

If that current node is the target node, it can be assumed the path has been determined so it can end right there. Otherwise, it checks each neighbour of the current node. If that neighbour is not traversable (an obstacle) or it is in the CLOSED list, it just skips to check the next neighbour.

Once it finds a neighbour to check, it checks that it is either not in the OPEN list (so this neighbour is a completely unchecked node since it is in no list since we also just checked to make sure it was not in the CLOSED list) or there is a new path to this neighbour that is shorter (which is done by calculating the current F cost of that neighbour, since it could be different now). If either of these are met it sets the calculated F cost as the actual F cost of this neighbour (since it is either lower or has never been calculated), and then sets the current node as a parent of this neighbour node. Finally, if neighbour was not in the OPEN list, it is added to the OPEN list.

Setting the current node as the parent of the neighbour in the last part of the psuedo code is helpful for keeping track of the full path. This gives some indication of where a node “came from”, so that when you reach the end you have some reference of which nodes to traverse.