## Monday, June 9, 2008

### Pathfinding notes and code

No progress this week, other than writing down a list of features, both necessary and desirable, for finishing my Thief-like Rogue-like.

When I was working on the pathfinding code I did some Googling for pathfinding tutorials, example code, etc. What I found was not entirely satisfactory, so I am adding some notes of my own here. This is talking specifically about the A* Search algorithm.

The basic idea is to search for a minimum-cost path through a graph of states. In my game a state is a position in the world, and the graph edges are the legal moves from one position to its adjacent positions. There can be up to eight of these in my game: north, south, east, west, and diagonals. The cost is based primarily on distance, but I add additional penalties to discourage movement through windows, over tables, through bushes, and things like that. I originally based cost on the number of moves required, ran into an aesthetic problem. On a square grid, if diagonal moves take the same amount of time as horizontal moves (which they do, in my game), there is no difference between these two moves:

If an AI needs to walk in a straight horizontal or vertical line it looks very strange if they veer out of their way, so I made the movement cost approximate true distance. It doesn't need to be exact; just enough so that a northeast-then-southeast move is more expensive than an east-east move, and a southeast move is cheaper than an east-then-south or south-then-east move. I assigned a cost of 2 to horizontal and vertical movements and a cost of 3 to diagonal movements, which took care of the problem.

The gist of A* is that we examine states in the order of the estimated total cost of a path through them from the start to the goal. In my code the states are called nodes, since they are nodes in a graph being searched. During a search I maintain a set (using an STL set) of nodes that have been seen so far. Here's the data structure:

``struct PATH_NODE{    PATH_NODE() {}    PATH_NODE(POINT2 pos_):        pos(pos_),        prev(NULL),        cost_from_start(0),        est_cost_to_goal(0),        num_paths(1),        closed(false)        {}    bool operator < (const PATH_NODE & other) const    {        return pos < other.pos;    }    POINT2 pos;    // Previous state on path through this node:    PATH_NODE * prev;    // Minimum cost found so far to reach this node from the start:    int cost_from_start;    // Estimated cost to reach goal from this node:    int est_cost_to_goal;    // How many paths with the same cost_from_start have been found so far?    int num_paths;    // Has the best path to this node been found already?    bool closed;};``

The node has a less-than operator for ordering nodes; this allows me to store them easily in an STL set. It's just ordering them by position, which will be unique.

The `prev` pointers allow the path to be reconstructed: once the search reaches the goal we just follow these pointers back to the start. std::set's implementation keeps entries at the same place in memory for their entire lifetimes so it's safe to keep pointers to them, as I'm doing here. `cost_from_start` may be reduced over time; it's possible to find one path to a node and later find a better path to that node. `est_cost_to_goal` never changes, though; it's based solely on the node's position. It's stored here just to save time recomputing it. `num_paths` is used to be able to choose uniformly randomly from paths with equal cost. Finally, `closed` indicates when we've found the best path to the node, after which we do not need to do anything further with it.

The other data structure used during the search is a priority queue, which keeps track of nodes to be examined in order of the total estimated cost to reach the goal through each one. I use the STL `priority_queue` for this, with the following queue entry data structure:

``struct PRIORITY_NODE{    PRIORITY_NODE() {}    PRIORITY_NODE(PATH_NODE * node_, int est_total_cost_):        node(node_),        est_total_cost(est_total_cost_)        {}    bool operator < (const PRIORITY_NODE & p) const    {        // Lower cost values have higher priority in the queue.        return p.est_total_cost < est_total_cost;    }    int est_total_cost;    PATH_NODE * node;};``

Again, I'm storing a pointer to a `PATH_NODE` inside a `std::set`.

Here's the pathfinding function:
``POINT2 move_toward(POINT2 pos_start, POINT2 pos_goal){    if (pos_start == pos_goal)        return POINT2(0, 0);    std::set<PATH_NODE> path_nodes;    std::priority_queue<PRIORITY_NODE> open;    typedef std::pair<std::set<PATH_NODE>::iterator, bool> INSERT_RESULT;        INSERT_RESULT insert_result = path_nodes.insert(PATH_NODE(pos_start));    PATH_NODE * start_node = &*insert_result.first;    start_node->est_cost_to_goal = cost_estimate(pos_start, pos_goal);    open.push(PRIORITY_NODE(start_node, start_node->est_cost_to_goal));    while (!open.empty())    {        PATH_NODE * node = open.top().node;        open.pop();        if (node->closed)            continue;        node->closed = true;        // Have we found a path?        if (node->pos == pos_goal)        {            while (node->prev != start_node)                node = node->prev;            POINT2 pos_new = node->pos;            if (GUARD::at(pos_new) || SHIP_CAPTAIN::at(pos_new))                return POINT2(0, 0);            return pos_new - pos_start;        }        // Generate successor states.        for (int i = 0; i < 8; ++i)        {            POINT2 pos_new = node->pos + dir[i];            int move_cost = guard_move_cost(node->pos, pos_new);            if (move_cost == infinite_cost)                continue;            move_cost += dir_cost[i];            int cost_from_start = node->cost_from_start + move_cost;            INSERT_RESULT insert_result = path_nodes.insert(PATH_NODE(pos_new));            PATH_NODE * node_next = &*insert_result.first;            if (insert_result.second)            {                // Node was just inserted                node_next->prev = node;                node_next->cost_from_start = cost_from_start;                node_next->est_cost_to_goal = cost_estimate(node_next->pos, pos_goal);                open.push(PRIORITY_NODE(node_next,                    node_next->cost_from_start + node_next->est_cost_to_goal));            }            else if (!node_next->closed)            {                // Node already exists; do we need to adjust its path?                if (cost_from_start < node->cost_from_start)                {                    // Better path to node_next found.                    node_next->prev = node;                    node_next->cost_from_start = cost_from_start;                    node_next->num_paths = 1;                    open.push(PRIORITY_NODE(node_next,                        node_next->cost_from_start + node_next->est_cost_to_goal));                }                else if (cost_from_start == node->cost_from_start)                {                    // Same-length path; randomly replace existing path with new one                    ++node_next->num_paths;                    if (random(node_next->num_paths) == 0)                    {                        node_next->prev = node;                    }                }            }        }    }    // No path to destination found.    return POINT2(0, 0);}``

If I find a better path to a node, I don't bother trying to adjust the priority on the existing priority queue entry for that node; I just push another entry into the queue with the improved priority. When the function eventually pops the older queue entry off it'll see that the node is already closed and just ignore it.

One thing that wasn't clear to me at first when I was learning the algorithm: the first time you pop off a node from the priority queue, you have found the best path to it. This is why we're able to immediately mark the node as closed in the code.

Eventually I may refactor this code to separate enumeration of successor states from the search. At the moment you can see a loop over the eight possible neighbor states.

My actual pathfinding function has a coda at the end. Instead of giving up when no path can be found, it searches through the node set for the closest node to the goal. In the case where multiple nodes are found with the same estimated cost to the goal, the one that is cheapest to reach from the start is chosen. Then I return the first step along that path.

As you can see, I'm currently regenerating paths from scratch every turn. So far it hasn't been a problem but if it does become a noticeable CPU drain it would be possible to cache paths in some fashion.

When a path is found there is some ugly code in there checking to see if a guard or ship captain is at the spot where we want to move. I'm treating them as passable during the search, and just not moving if one happens to be in the way.

Have you looked at Amit's pathfinding articles.

I found the same problem with my implementation, where diagonal moves were preferred over straight lines.

Check out the section on breaking ties in this article, it helped me solve the problem quite easily.

http://theory.stanford.edu/~amitp/GameProgramming/Heuristics.html

James McNeill said...

Amit's web pages are a great resource. Thanks for the reference to them! I do tend to stop there first for all things pathfinding.

My approach to cost estimation is similar to his Euclidian distance section. However, I don't use straight-line Euclidian distance to the goal as my cost estimate. Rather, I simply make the cost of diagonal moves relative to straight moves roughly mirror their differences in length. Exact cost ratios would be 1:sqrt(2). I'm using 2:3 instead to keep things integer-valued.

Thus, the cost-not-matching-heuristic caveat in Amit's article doesn't apply. These are the actual costs used to move horizontally and vertically, so they are used both in the cost estimate and in the cost tallying.

I estimate cost in the usual way: the number of diagonal moves is min(dx, dy) and the number of non-diagonal moves is max(dx, dy) - min(dx, dy). Multiply the diagonal moves by the diagonal move cost (3) and the non-diagonal moves by their cost (2) to get the total estimated cost.

My approach to breaking ties (choose uniformly randomly from equal-cost paths) is more general, and simpler, than any of the ideas presented on Amit's page. It doesn't aim to reduce the number of nodes explored, though, which seems to be what he's trying to do with some of his tie-breakers.

In practice, the paths I'm getting look exactly like I'd expect them to, so I'm happy with them.