Sebastian Lague A* Tutorial Series – Algorithm Explanation – Pt. 01
November 20, 2019
A* Tutorial Series
Pt. 01 – Algorithm Explanation
A* Pathfinding (E01: algorithm explanation)
By: Sebastian Lague
Notes:
G cost = distance from starting node H cost (heuristic) = distance from the end node F cost = G cost + H cost
A* starts by creating a grid of an area. There is a starting node (initial position) and an end node (destination). Generally, you can move orthogonally (which is normalized as a distance of 1) or diagonally (which would then be a value of sqrt of 2, which is approx. 1.4). Just to make it look nicer and easier to read, it is standard to multiply these distances by 10 so that moving orthogonally has a value of 10, and diagonally has a value of 14.
The A* algorithm starts by generating the G cost and H cost of all 8 squares around the starting point (in a 2D example for ease of understanding). These are then used to calculate the F cost for each square. It starts by searching for the lowest F cost and then expanding the calculations to every square in contact with it (orthogonally and diagonally). Recalculations may be necessary if a certain path finds a way to reach the same square with a lower F cost. If multiple squares have the same F cost, it prioritizes the one with the lowest H cost (closest to the end node). And if there is still a tie, it basically can just pick one at random.
It is worth reiterating that the costs of a square can be updated through a single pathfinding event. This however only occurs, if the resulting F cost would be lower than what is already found in that square. This is actually very important as the search can lead to strange paths to certain squares giving them higher F costs than they should have when there is a much more direct way to reach that same square from the starting node.
Coding Approach
Psuedo Code (directly from tutorial):OPEN //the set of nodes to be evaluatedCLOSED //the set of nodes already evaluatedadd the start node to OPENloop current = node in OPEN with the lowest f_cost remove current from OPEN add current to CLOSED if current is the target node //path has been found return foreach neighbour of the current node if neighbour is not traversable OR neighbour is in CLOSED skip to the next neighbour if new path to neighbour is shorter OR neighbour is not in OPEN set f_cost of neighbour set parent of neighbour to current if neighbour is not in OPEN add neighbour to OPEN
There are two lists of nodes: OPEN and CLOSED. The OPEN list are those nodes selected to be evaluated, and the CLOSED list are nodes that have already been evaluated. It starts by finding the node in the OPEN list with the lowest F cost. They then move this node from the OPEN list to the CLOSED list.
If that current node is the target node, it can be assumed the path has been determined so it can end right there. Otherwise, it checks each neighbour of the current node. If that neighbour is not traversable (an obstacle) or it is in the CLOSED list, it just skips to check the next neighbour.
Once it finds a neighbour to check, it checks that it is either not in the OPEN list (so this neighbour is a completely unchecked node since it is in no list since we also just checked to make sure it was not in the CLOSED list) or there is a new path to this neighbour that is shorter (which is done by calculating the current F cost of that neighbour, since it could be different now). If either of these are met it sets the calculated F cost as the actual F cost of this neighbour (since it is either lower or has never been calculated), and then sets the current node as a parent of this neighbour node. Finally, if neighbour was not in the OPEN list, it is added to the OPEN list.
Setting the current node as the parent of the neighbour in the last part of the psuedo code is helpful for keeping track of the full path. This gives some indication of where a node “came from”, so that when you reach the end you have some reference of which nodes to traverse.