algorithms(paths): ‹BF to A*› (#13)

This commit is contained in:
Matej Focko 2024-01-03 15:12:35 +01:00 committed by GitHub
commit 5549494f67
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
13 changed files with 1705 additions and 0 deletions

View file

@ -0,0 +1,573 @@
---
id: bf
slug: /paths/bf-to-astar/bf
title: BF
description: |
Solving the shortest path problem with a naïve approach that turns into
something.
tags:
- cpp
- brute force
- bellman ford
- dynamic programming
last_update:
date: 2024-01-01
---
## Basic idea
We will _ease in_ with our own algorithm to find the shortest path. We will
start by thinking about the ways we can achieve that. If we didn't have the `*`
cells, we could've easily run a BFS[^1] and be done with it. Maybe it is a good
place to start, or isn't, there is only one way to find out though.
_How does the BFS work?_ We know the vertex where we start and we know the
vertex we want to find the shortest path to. Given this knowledge we
incrementally visit all of our neighbours and we do that over and over until the
destination is found[^2]. Could we leverage this somehow?
## Naïve approach
Well, we could probably start with all vertices being _unreachable_ (having the
highest possible price) and try to improve what we've gotten so far until there
are no improvements. That sounds fine, we shall implement this. Since we are
going on repeat, we will name this function `bf()` as in _brute-force_, cause it
is trying to find it the hard way:
```cpp
const static std::vector<vertex_t> DIRECTIONS =
std::vector{std::make_pair(0, 1), std::make_pair(0, -1),
std::make_pair(1, 0), std::make_pair(-1, 0)};
auto bf(const graph& g, const vertex_t& source, const vertex_t& destination)
-> int {
// source must be within the bounds
assert(g.has(source));
// destination must be within the bounds
assert(g.has(destination));
// we need to initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
// now we need to improve the paths as long as possible
bool improvement_found;
do {
// reset the flag at the beginning
improvement_found = false;
// go through all of the vertices
for (int y = 0; y < g.height(); ++y) {
for (int x = 0; x < g.width(); ++x) {
// skip the cells we cannot reach
if (distances[y][x] == graph::unreachable()) {
continue;
}
// go through the neighbours
auto u = std::make_pair(x, y);
for (const auto& [dx, dy] : DIRECTIONS) {
auto v = std::make_pair(x + dx, y + dy);
auto cost = g.cost(u, v);
// if we can move to the cell and it's better, relax¹ it
if (cost != graph::unreachable() &&
distances[y][x] + cost < distances[y + dy][x + dx]) {
distances[y + dy][x + dx] = distances[y][x] + cost;
improvement_found = true;
}
}
}
}
} while (improvement_found);
return distances[destination.second][destination.first];
}
```
:::info Relaxation
I have made a brief mention of the relaxation in the comment in the code. You've
been probably thought that **relaxation of an edge** means that you found
a better solution to the problem.
In general it is an approximation technique that _reduces_ the problem of
finding the path `u → x1 → … → xn → v` to subproblems
`u → x1, x1 → x2, …, xn → v` such that the sum of the costs of each step is
**minimal**.
:::
### Correctness
_Is our solution correct?_ It appears to be correct… We have rather complicated
map and our algorithm has finished in an instant with the following output:
```
Normal cost: 1
Vortex cost: 5
Graph:
#############
#..#..*.*.**#
##***.....**#
#..########.#
#...###...#.#
#..#...##.#.#
#..#.*.#..#.#
#D...#....#.#
########*.*.#
#S..........#
#############
Cost: 22
```
If you have a better look at the map, you will realize that the cost `22` is the
one path skipping the `*` cells, since they cost more than going around.
We can play around a bit with it. The `*` cells can be even vortices that pull
you in with a negative price and let you _propel_ yourself out :wink: Let's
change their cost to `-1` then. Let's check what's the fastest path to the cell.
```
Normal cost: 1
Vortex cost: -1
Graph:
#############
#..#..*.*.**#
##***.....**#
#..########.#
#...###...#.#
#..#...##.#.#
#..#.*.#..#.#
#D...#....#.#
########*.*.#
#S..........#
#############
```
And we're somehow stuck… The issue comes from the fact that _spinning around_ in
the vortices allows us to lower the cost infinitely. That's why after each
iteration there is still a possibility to lower the cost, hence the algorithm
doesn't finish. _What can we do about this?_
:::tip
This algorithm is correct as long as there are no negative loops, i.e. ways how
to lower the cost infinitely. Therefore we can also just lay a precondition that
requires no negative loops to be present.
:::
### Fixing the infinite loop
Our issue lies in the fact that we can endlessly lower the cost. Such thing must
surely happen in some kind of a loop. We could probably track the relaxations
and once we spot repeating patterns, we know we can safely terminate with _some_
results at least.
This approach will not even work on our 2D map, let alone any graph. Problem is
that the _negative loops_ lower the cost in **each** iteration and that results
in lowering of the costs to the cells that are reachable from the said loops.
That's why this problem is relatively hard to tackle, it's not that easy to spot
the repeating patterns algorithmically.
On the other hand, we can approach this from the different perspective. Let's
assume the worst-case scenario (generalized for any graph):
> Let $K_n$ be complete graph. Let $P$ be the shortest path from $v_1$ to $v_n$
> such that $P$ has $n - 1$ edges, i.e. the shortest path between the two chosen
> vertices visits all vertices (not necessarily in order) and has the lowest
> cost.
>
> In such scenario assume the worst-case ordering of the relaxations (only one
> _helpful_ relaxation per iteration). In this case, in each iteration we find
> the next edge on our path $P$ as the last. This means that we need
> $\vert V \vert - 1$ iterations to find the shortest path $P$.
>
> Because we have laid $P$ as the shortest path from $v_1$ to $v_n$ and it
> visits all vertices, its prefixes are the shortest paths from $v_1$ to any
> other vertex in our graph.
>
> Therefore, we can safely assume that any relaxation after $\vert V \vert - 1$
> iterations, is the effect of a negative loop in the graph.
_How can we leverage this?_ We will go through the edges only as many times as
cells we have. Let's adjust the code to fix the looping:
```cpp
auto bf_finite(const graph& g, const vertex_t& source,
const vertex_t& destination) -> int {
// source must be within the bounds
assert(g.has(source));
// destination must be within the bounds
assert(g.has(destination));
// we need to initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
// now we only iterate as many times as cells that we have
for (int i = g.height() * g.width(); i > 0; --i) {
// go through all of the vertices
for (int y = 0; y < g.height(); ++y) {
for (int x = 0; x < g.width(); ++x) {
// skip the cells we cannot reach
if (distances[y][x] == graph::unreachable()) {
continue;
}
// go through the neighbours
auto u = std::make_pair(x, y);
for (const auto& [dx, dy] : DIRECTIONS) {
auto v = std::make_pair(x + dx, y + dy);
auto cost = g.cost(u, v);
// if we can move to the cell and it's better, relax¹ it
if (cost != graph::unreachable() &&
distances[y][x] + cost < distances[y + dy][x + dx]) {
distances[y + dy][x + dx] = distances[y][x] + cost;
}
}
}
}
}
return distances[destination.second][destination.first];
}
```
And we get the following result:
```
Normal cost: 1
Vortex cost: -1
Graph:
#############
#..#..*.*.**#
##***.....**#
#..########.#
#...###...#.#
#..#...##.#.#
#..#.*.#..#.#
#D...#....#.#
########*.*.#
#S..........#
#############
Cost: -236
```
The negative cost means that there is a way to _propel_ ourselves via some
vortices. Let's adjust the cost of _vortices_ back to the original `5` and check
whether our modified algorithm works as it did before. And it surely does yield
the `22` as before.
:::tip Refactoring
You can definitely notice some _deep nesting_ in our code, to counter this
phenomenon I will convert the looping over `x` and `y` to one variable that can
be decomposed to `x` and `y`. It is a very common practice when working with 2D
arrays/lists to represent them as 1D. In our case:
```
i : 0 → width * height - 1
x = i % width
y = i / width
```
:::
## Bellman-Ford
If you have ever attended any Algorithms course that had path-finding in its
syllabus, you probably feel like you've seen the algorithm above before[^3]… And
yes, the first algorithm I have proposed is a very dumb version of the
_Bellman-Ford_ algorithm, it's dumb, because it loops :wink: After our “looping”
prevention we got to the point that is almost the _Bellman-Ford_ with the one
exception that it doesn't report whether there are any negative cycles, it just
ends.
Let's have a look at a proper implementation of the Bellman-Ford algorithm:
```cpp
auto bellman_ford(const graph& g, const vertex_t& source)
-> std::vector<std::vector<int>> {
// source must be within the bounds
assert(g.has(source));
// we need to initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
// now we only iterate as many times as cells that we have
for (int i = g.height() * g.width(); i > 0; --i) {
// go through all of the vertices
for (int v = g.height() * g.width() - 1; v >= 0; --v) {
int y = v / g.width();
int x = v % g.width();
// skip the cells we cannot reach
if (distances[y][x] == graph::unreachable()) {
continue;
}
// go through the neighbours
auto u = std::make_pair(x, y);
for (const auto& [dx, dy] : DIRECTIONS) {
auto v = std::make_pair(x + dx, y + dy);
auto cost = g.cost(u, v);
// if we can move to the cell and it's better, relax¹ it
if (cost != graph::unreachable() &&
distances[y][x] + cost < distances[y + dy][x + dx]) {
distances[y + dy][x + dx] = distances[y][x] + cost;
}
}
}
}
// now we check for the negative loops
bool relaxed = false;
for (int v = g.height() * g.width() - 1; !relaxed && v >= 0; --v) {
int y = v / g.width();
int x = v % g.width();
// skip the cells we cannot reach
if (distances[y][x] == graph::unreachable()) {
continue;
}
// go through the neighbours
auto u = std::make_pair(x, y);
for (const auto& [dx, dy] : DIRECTIONS) {
auto v = std::make_pair(x + dx, y + dy);
auto cost = g.cost(u, v);
// if we can move to the cell and it's better, relax¹ it
if (cost != graph::unreachable() &&
distances[y][x] + cost < distances[y + dy][x + dx]) {
relaxed = true;
std::cerr << "Found a negative loop\n";
break;
}
}
}
return distances;
}
```
And if we run it with our negative cost of entering vortices:
```
[Bellman-Ford] Found a negative loop
[Bellman-Ford] Cost: -240
```
### On the Bellman-Ford
You might be surprised that we have managed to iterate from a brute-force method
that mindlessly tries to find a better path until there are no better paths left
all the way to the Bellman-Ford algorithm.
I always say that Bellman-Ford is a _smart_ brute-force. BF is also an algorithm
that leverages _dynamic programming_. You might wonder how can it utilize DP if
it is “technically” a brute-force technique. Table with the shortest distances
is the thing that makes it DP.
> I might not know the shortest path yet, but I do remember all of other paths,
> and I can improve them, if possible.
That's where the beauty of both _dynamic programming_ and _relaxing_ gets merged
together and does its magic.
Proof of the correctness of the BF is done via induction to the number of
iterations. I would suggest to try to prove the correctness yourself and
possibly look it up, if necessary.
Also the correctness of the BF relies on the conclusion we've made when fixing
the infinite-loop on our naïve BF solution.
## Time complexity
Let's have a short look at the time complexities of the presented algorithms:
1. naïve approach: given that there are no negative loops, we are bound by the
worst-case ordering of the relaxations which results in
$$
\mathcal{O}(\vert V \vert \cdot \vert E \vert)
$$
2. our naïve approach with the fixed count of iterations instead of the
`do-while` loop results in the same worst-case time complexity:
$$
\Theta(\vert V \vert \cdot \vert E \vert)
$$
3. and finally the well-known Bellman-Ford's algorithm time complexity:
$$
\Theta(\vert V \vert \cdot \vert E \vert)
$$
## Small refactor
Since we are literally copy-pasting the body of the loops just for the sake of
relaxing, we can factor that part out into a separate function:
```cpp
static auto _check_vertex(const graph& g,
std::vector<std::vector<int>>& distances, int v,
bool check_only = false) -> bool {
bool improvement_found = false;
// unpack the vertex coordinates
int y = v / g.width();
int x = v % g.width();
// skip the cells we cannot reach
if (distances[y][x] == graph::unreachable()) {
return false;
}
// go through the neighbours
auto u = std::make_pair(x, y);
for (const auto& [dx, dy] : DIRECTIONS) {
auto v = std::make_pair(x + dx, y + dy);
auto cost = g.cost(u, v);
// if we can move to the cell and it's better, relax¹ it
if (cost != graph::unreachable() &&
distances[y][x] + cost < distances[y + dy][x + dx]) {
if (check_only) {
return true;
}
distances[y + dy][x + dx] = distances[y][x] + cost;
improvement_found = true;
}
}
return improvement_found;
}
```
This function can be also used for checking the negative loops at the end of the
BF by using the `check_only` parameter to signal that we just want to know if
there would be any edge relaxed instead of performing the relaxation itself.
Then we can also see the differences between the specific versions of our
path-finding algorithms in a clear way:
```cpp
auto bf(const graph& g, const vertex_t& source, const vertex_t& destination)
-> int {
// source must be within the bounds
assert(g.has(source));
// destination must be within the bounds
assert(g.has(destination));
// we need to initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
// now we need to improve the paths as long as possible
bool improvement_found;
do {
// reset the flag at the beginning
improvement_found = false;
// go through all of the vertices
for (int v = g.height() * g.width() - 1; v >= 0; --v) {
improvement_found = _check_vertex(g, distances, v) || improvement_found;
}
} while (improvement_found);
return distances[destination.second][destination.first];
}
auto bf_finite(const graph& g, const vertex_t& source,
const vertex_t& destination) -> int {
// source must be within the bounds
assert(g.has(source));
// destination must be within the bounds
assert(g.has(destination));
// we need to initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
// now we only iterate as many times as cells that we have
for (int i = g.height() * g.width(); i > 0; --i) {
// go through all of the vertices
for (int v = g.height() * g.width() - 1; v >= 0; --v) {
_check_vertex(g, distances, v);
}
}
return distances[destination.second][destination.first];
}
auto bellman_ford(const graph& g, const vertex_t& source)
-> std::vector<std::vector<int>> {
// source must be within the bounds
assert(g.has(source));
// we need to initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
// now we only iterate as many times as cells that we have
for (int i = g.height() * g.width(); i > 0; --i) {
// go through all of the vertices
for (int v = g.height() * g.width() - 1; v >= 0; --v) {
_check_vertex(g, distances, v);
}
}
// now we check for the negative loops
for (int v = g.height() * g.width() - 1; v >= 0; --v) {
if (_check_vertex(g, distances, v, true)) {
std::cerr << "[Bellman-Ford] Found a negative loop\n";
break;
}
}
return distances;
}
```
---
:::tip
You might've noticed that I've been using abbreviation _BF_ interchangeably for
both _Bellman-Ford_ and _brute-force_. If you think about the way Bellman-Ford
algorithm works, you should realize that in the worst case it's updating the
shortest path till there no shorter path exists, so in a sense, you could really
consider it a brute-force algorithm.
:::
[^1]: [Breadth-first search](https://en.wikipedia.org/wiki/Breadth-first_search)
[^2]: Of course, there are some technicalities like keeping track of the visited
vertices to not taint the shortest path by already visited vertices.
[^3]: or at least you should, LOL

View file

@ -0,0 +1,318 @@
---
id: dijkstra
slug: /paths/bf-to-astar/dijkstra
title: Dijkstra's algorithm
description: |
Moving from Bellman-Ford into the Dijsktra's algorithm.
tags:
- cpp
- dynamic programming
- greedy
- dijkstra
last_update:
date: 2024-01-03
---
## Intro
Let's rewind back to the small argument in the previous post about the fact that
we can safely bound the amount of iterations with relaxations being done.
We have said that assuming the worst-case scenario (bad order of relaxations) we
**need** at most $\vert V \vert - 1$ iterations over all edges. We've used that
to our advantage to _bound_ the iterations instead of the `do-while` loop that
was a risk given the possibility of the infinite loop (when negative loops are
present in the graph).
:::tip
We could've possibly used both _boolean flag_ to denote that some relaxation has
happened and the upper bound of iterations, for some graphs that would result in
faster termination.
Using only the upper bound we try to relax edges even though we can't.
:::
Now the question arises, could we leverage this somehow in a different way? What
if we used it to improve the algorithm instead of just bounding the iterations?
Would that be even possible?
**Yes, it would!** And that's when _Dijkstra's algorithm_ comes in.
## Dijkstra's algorithm
I'll start with a well-known meme about Dijkstra's algorithm:
![Dijkstra's algorithm meme](/img/algorithms/paths/bf-to-astar/dijkstra-meme.jpg)
And then follow up on that with the actual backstory from Dijkstra himself:
> What is the shortest way to travel from Rotterdam to Groningen, in general:
> from given city to given city. It is the algorithm for the shortest path,
> which I designed in about twenty minutes. One morning I was shopping in
> Amsterdam with my young fiancée, and tired, we sat down on the café terrace to
> drink a cup of coffee and I was just thinking about whether I could do this,
> and I then designed the algorithm for the shortest path. As I said, it was
> a twenty-minute invention. In fact, it was published in '59, three years
> later. The publication is still readable, it is, in fact, quite nice. One of
> the reasons that it is so nice was that I designed it without pencil and
> paper. I learned later that one of the advantages of designing without pencil
> and paper is that you are almost forced to avoid all avoidable complexities.
> Eventually, that algorithm became to my great amazement, one of the
> cornerstones of my fame.
>
> — Edsger Dijkstra, in an interview with Philip L. Frana, Communications of the
> ACM, 2001
:::caution Precondition
As our own naïve algorithm, Dijkstra's algorithm has a precondition that places
a requirement of _no edges with negative weights_ in the graph. This
precondition is required because of the nature of the algorithm that requires
monotonically non-decreasing changes in the costs of shortest paths.
:::
## Short description
Let's have a brief look at the pseudocode taken from the Wikipedia:
```
function Dijkstra(Graph, source):
for each vertex v in Graph.Vertices:
dist[v] ← INFINITY
prev[v] ← UNDEFINED
add v to Q
dist[source] ← 0
while Q is not empty:
u ← vertex in Q with min dist[u]
remove u from Q
for each neighbor v of u still in Q:
alt ← dist[u] + Graph.Edges(u, v)
if alt < dist[v]:
dist[v] ← alt
prev[v] ← u
return dist[], prev[]
```
Dijkstra's algorithm works in such way that it always tries to find the shortest
paths from a vertex to which it already has a shortest path. This may result in
finding the shortest path to another vertex, or at least some path, till further
relaxation.
Given that we need to **always** choose the _cheapest_ vertex, we can use a min
heap to our advantage which can further improve the time complexity of the
algorithm.
## Used techniques
This algorithm leverages the _dynamic programming_ technique that has already
been mentioned with regards to the _Bellman-Ford_ algorithm and also _greedy_
technique. Let's talk about them both!
_Dynamic programming_ technique comes from the fact that we are continuously
building on top of the shortest paths that we have found so far. We slowly build
the shortest paths from the given source vertex to all other vertices that are
reachable.
_Greedy_ technique is utilized in such way that Dijkstra's algorithm always
improves the shortest paths from the vertex that is the closest, i.e. it tries
extending the shortest path to some vertex by appending an edge, such extended
path may (or may not) be the shortest path to another vertex.
:::tip Greedy algorithms
_Greedy algorithms_ are algorithms that choose the most optimal action
**at the moment**.
:::
The reason why the algorithm requires no edges with negative weights comes from
the fact that it's greedy. By laying the requirement of non-negative weights in
the graph we are guaranteed that at any given moment of processing outgoing
edges from a vertex, we already have a shortest path to the given vertex. This
means that either this is the shortest path, or there is some other vertex that
may have a higher cost, but the outgoing edge compensates for it.
## Implementation
Firstly we need to have some priority queue wrappers. C++ itself offers
functions that can be used for maintaining max heaps. They also have generalized
version with any ordering, in our case we need reverse ordering, because we need
the min heap.
```cpp
using pqueue_item_t = std::pair<int, vertex_t>;
using pqueue_t = std::vector<pqueue_item_t>;
auto pushq(pqueue_t& q, pqueue_item_t v) -> void {
q.push_back(v);
std::push_heap(q.begin(), q.end(), std::greater<>{});
}
auto popq(pqueue_t& q) -> std::optional<pqueue_item_t> {
if (q.empty()) {
return {};
}
std::pop_heap(q.begin(), q.end(), std::greater<>{});
pqueue_item_t top = q.back();
q.pop_back();
return std::make_optional(top);
}
```
And now we can finally move to the actual implementation of the Dijkstra's
algorithm:
```cpp
auto dijkstra(const graph& g, const vertex_t& source)
-> std::vector<std::vector<int>> {
// make sure that source exists
assert(g.has(source));
// initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// initialize the visited
std::vector<std::vector<bool>> visited(g.height(),
std::vector(g.width(), false));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
pqueue_t priority_queue{std::make_pair(0, source)};
std::optional<pqueue_item_t> item{};
while ((item = popq(priority_queue))) {
auto [cost, u] = *item;
auto [x, y] = u;
// we have already found the shortest path
if (visited[y][x]) {
continue;
}
visited[y][x] = true;
for (const auto& [dx, dy] : DIRECTIONS) {
auto v = std::make_pair(x + dx, y + dy);
auto cost = g.cost(u, v);
// if we can move to the cell and it's better, relax¹ it and update queue
if (cost != graph::unreachable() &&
distances[y][x] + cost < distances[y + dy][x + dx]) {
distances[y + dy][x + dx] = distances[y][x] + cost;
pushq(priority_queue, std::make_pair(distances[y + dy][x + dx], v));
}
}
}
return distances;
}
```
## Time complexity
The time complexity of Dijkstra's algorithm differs based on the backing data
structure.
The original implementation doesn't leverage the heap which results in
repetitive _look up_ of the “closest” vertex, hence we get the following
worst-case time complexity in the _Bachmann-Landau_ notation:
$$
\Theta(\vert V \vert^2)
$$
If we turn our attention to the backing data structure, we always want the
“cheapest” vertex, that's why we can use the min heap, given that we use
Fibonacci heap we can achieve the following amortized time complexity:
$$
\mathcal{O}(\vert E \vert + \vert V \vert \cdot \log{\vert V \vert})
$$
:::tip Fibonacci heap
Fibonacci heap is known as the heap that provides $\Theta(1)$ **amortized**
insertion and $\mathcal{O}(\log{n})$ **amortized** removal of the top (either
min or max).
:::
## Running the Dijkstra
Let's run our code:
```
Normal cost: 1
Vortex cost: 5
Graph:
#############
#..#..*.*.**#
##***.....**#
#..########.#
#...###...#.#
#..#...##.#.#
#..#.*.#..#.#
#D...#....#.#
########*.*.#
#S..........#
#############
[Finite BF] Cost: 22
[Bellman-Ford] Cost: 22
[Dijkstra] Cost: 22
```
OK, so it seems to be working just fine. Now the question arises:
> What happens when we have negative weights in our graph?
## Busting the myth about looping Dijkstra
One of the very common misconception about Dijkstra's algorithm is that it loops
infinitely when you have negative weights or loops in the graph. Well, if we use
our _propelling vortices_, not only we have the negative weights, but also the
negative loops. Let's run our code! Our first naïve approach was actually
looping:
```
Normal cost: 1
Vortex cost: -1
Graph:
#############
#..#..*.*.**#
##***.....**#
#..########.#
#...###...#.#
#..#...##.#.#
#..#.*.#..#.#
#D...#....#.#
########*.*.#
#S..........#
#############
[Finite BF] Cost: -240
[Bellman-Ford] Found a negative loop
[Bellman-Ford] Cost: -240
[Dijkstra] Cost: 14
```
Well, it definitely doesn't loop. How much does `14` make sense is a different
matter.
:::info Variations
There are multiple variations of the Dijkstra's algorithm. You **can** implement
it in such way that with negative weights or loops it loops infinitely, but it
can be countered. In our case we keep the track of the vertices that already got
a shortest path established via the `visited`, that's how even multiple entries
for one vertex in the heap are not an issue.
:::
## Summary
Now we have an algorithm for finding the shortest path that is faster than our
original naïve brute-force or Bellman-Ford. However we need to keep in mind its
requirement of no negative weights for correct functioning.
You can also see how we used our thought process of figuring out the worst-case
time complexity for the naïve or Bellman-Ford algorithm to improve the original
path-finding algorithms.

View file

@ -0,0 +1,224 @@
---
id: astar
slug: /paths/bf-to-astar/astar
title: A* algorithm
description: |
Moving from Dijkstra's algorithm into the A* algorithm.
tags:
- cpp
- dynamic programming
- astar
last_update:
date: 2024-01-03
---
## Intro
Let's start by the recap of what we've achieved so far:
1. We have implemented a naïve brute-force algorithm that tries to relax paths
as long as there are any paths to be relaxed.
2. Then we have fixed an issue caused by negative loops that can result in
a non-terminating run of our brute-force method. At this moment we have made
some small arguments why are bounding is enough and doesn't prevent any
shortest path to _not be_ discovered.
3. Finally we have converted our bounded brute-force algorithm into the
Bellman-Ford algorithm.
4. We have mentioned the worst-case time complexity of our bounded naïve
approach and also the Bellman-Ford algorithm. Our worst-case depended on the
fact that we assumed the worst possible ordering of the relaxations. However
we could also try to relax in the most ideal ordering which could result in a
faster algorithm and that's how we got to the Dijkstra's algorithm.
Now the question is, could we improve the Dijkstra's algorithm to get even
better results? And the answer is _maybe_!
Dijkstra's algorithm chooses the next cheapest vertex for relaxing. This is good
as long as there is no additional information. However, imagine a roadmap of
some country. If you're in the middle of the map and you want to go south, it
doesn't make much sense for you to go to the north (in the opposite direction),
but a little bit might make sense, so that you can switch to highway and go much
faster.
The important question here is how to _influence_ the algorithm, so that it does
choose the path that _makes more sense_ rather than the one that costs the
least.
## A* description
The _A* algorithm_ can be considered a modification of Dijkstra's algorithm. The
cost is still the same, we cannot change it, right? However when we pick the
vertices from the heap, we can influence the order by some _heuristic_. In this
case, we introduce a function that can suggest how feasible the vertex is.
## Roadmap heuristic
Let's have a look at the heuristic we could use for the roadmap example. There
are roads (the edges) and towns (the vertices). Cost could be an average time to
travel the road. What heuristic could we use to influence our algorithm to
choose a better ordering of the vertices when relaxing?
In the former example we've said that it doesn't make much sense to go in the
opposite direction than our goal is… We could choose the distance from our goal
as the heuristic, e.g. right now we're 100 km away from our goal, using this
road makes us 50 km away and using the other road we will be 200 km away.
## Heuristic for our map
Our map is a bit simpler, but we can use a very similar principle. We will use
the _Manhattan distance_, which is defined in a following way:
$$
\vert x_a - x_b \vert + \vert y_a - y_b \vert
$$
Since we cannot move in diagonals, it makes sense to maintain the distance in
the actual steps from the goal.
## Passing the heuristic
In our case, when we're using C++, we can just template the function that will
calculate the shortest path and pass the heuristic as a parameter.
## Implementation
Actual implementation is very easy once we have the Dijkstra's algorithm:
```cpp
auto astar(const graph& g, const vertex_t& source, const auto& h)
-> std::vector<std::vector<int>> {
// make sure that source exists
assert(g.has(source));
// initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// initialize the visited
std::vector<std::vector<bool>> visited(g.height(),
std::vector(g.width(), false));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
pqueue_t priority_queue{std::make_pair(0 + h(source), source)};
std::optional<pqueue_item_t> item{};
while ((item = popq(priority_queue))) {
auto [cost, u] = *item;
auto [x, y] = u;
// we have already found the shortest path
if (visited[y][x]) {
continue;
}
visited[y][x] = true;
for (const auto& [dx, dy] : DIRECTIONS) {
auto v = std::make_pair(x + dx, y + dy);
auto cost = g.cost(u, v);
// if we can move to the cell and it's better, relax¹ it and update queue
if (cost != graph::unreachable() &&
distances[y][x] + cost < distances[y + dy][x + dx]) {
distances[y + dy][x + dx] = distances[y][x] + cost;
pushq(priority_queue,
std::make_pair(distances[y + dy][x + dx] + h(v), v));
}
}
}
return distances;
}
```
## Running on our map
For this algorithm I will also show the example of a call:
```cpp
distances = astar(g, std::make_pair(1, 9), [](const auto& u) {
auto [x, y] = u;
return std::abs(1 - x) + std::abs(7 - y);
});
std::cout << "[A*] Cost: " << distances[7][1] << "\n";
```
First argument to the function is the graph itself. Second argument is the
source vertex where we start. And finally the lambda returns
_Manhattan distance_ to the goal.
And we get the following result:
```
Normal cost: 1
Vortex cost: 5
Graph:
#############
#..#..*.*.**#
##***.....**#
#..########.#
#...###...#.#
#..#...##.#.#
#..#.*.#..#.#
#D...#....#.#
########*.*.#
#S..........#
#############
[Finite BF] Cost: 22
[Bellman-Ford] Cost: 22
[Dijkstra] Cost: 22
[A*] Cost: 22
```
## Comparison
Now you may wonder how does it compare to the previous algorithms. Supposedly it
should be faster. Let's add counters and debugging output when we update
distance to our goal. And now if we run our code, we get the following output:
```
Normal cost: 1
Vortex cost: 5
Graph:
#############
#..#..*.*.**#
##***.....**#
#..########.#
#...###...#.#
#..#...##.#.#
#..#.*.#..#.#
#D...#....#.#
########*.*.#
#S..........#
#############
Relaxing path to goal in 40. relaxation
Relaxing path to goal in 68. relaxation
Relaxing path to goal in 89. relaxation
[Finite BF] Cost: 22
Relaxing path to goal in 40. relaxation
Relaxing path to goal in 68. relaxation
Relaxing path to goal in 89. relaxation
[Bellman-Ford] Cost: 22
Relaxing path to goal in 41. iteration
[Dijkstra] Cost: 22
Relaxing path to goal in 31. iteration
[A*] Cost: 22
```
From the output we can easily deduce that for both brute-force and Bellman-Ford,
which are in our case identical, we actually relax three times and for the last
time in the 89th iteration.
Dijkstra's algorithm manages to find the shortest path to our goal already in
the 41st iteration.
And finally after introducing some heuristic, we could find the shortest path
in the 31st iteration.
:::danger
Please keep in mind that choosing bad heuristic can actually lead to worse
results than using no heuristic at all.
:::
## Summary
And there we have it. We have made our way from the brute-force algorithm all
the way to more optimal ones. Hopefully we could notice how the small
improvements of the already existing algorithms made them much better.

View file

@ -0,0 +1,171 @@
---
id: index
slug: /paths/bf-to-astar
title: From BF to A*
description: |
Figuring out shortest-path problem from the BF to the A* algorithm.
tags:
- cpp
- brute force
- bellman ford
- dynamic programming
- dijkstra
- greedy
- a star
last_update:
date: 2024-01-01
---
## Intro
We will delve into the details and ideas of the most common path-finding
algorithms. For the purpose of demonstrating some “features” of the improved
algorithms, we will use a 2D map with some rules that will allow us to show cons
and pros of the shown algorithms.
Let's have a look at the example map:
```
#############
#..#..*.*.**#
##***.....**#
#..########.#
#...###...#.#
#..#...##.#.#
#..#.*.#..#.#
#....#....#.#
########*.*.#
#...........#
#############
```
We can see three different kinds of cells:
1. `#` which represent walls, that cannot be entered at all
2. `*` which represent vortices that can be entered at the cost of 5 coins
3. `.` which represent normal cells that can be entered for 1 coin (which is the
base price of moving around the map)
Let's dissect a specific position on the map to get a better grasp of the rules:
```
.
#S*
.
```
We are standing in the cell marked with `S` and we have the following options
* move to the north (`.`) with the cost of 1 coin,
* move to the west (`#`) **is not** allowed because of the wall,
* move to the east (`*`) is allowed with the cost of 5 coins, and finally
* move to the south (`.`) with the cost of 1 coin.
:::info
Further on I will follow the same scheme for marking cells with an addition of
`D` to denote the _destination_ to which we will be finding the shortest path.
:::
## Boilerplate
For working with this map I have prepared a basic structure for the graph in C++
that will abstract some of the internal workings of our map, namely:
* remembers the costs of moving around
* provides a simple function that returns price for moving **directly** between
two positions on the map
* allows us to print the map out, just in case we'd need some adjustments to be
made
We can see the `graph` header here:
```cpp
#ifndef _GRAPH_HPP
#define _GRAPH_HPP
#include <cmath>
#include <limits>
#include <ostream>
#include <utility>
#include <vector>
using vertex_t = std::pair<int, int>;
struct graph {
graph(const std::vector<std::vector<char>>& map)
: map(map),
_height(static_cast<int>(map.size())),
_width(map.empty() ? 0 : static_cast<int>(map[0].size())) {}
static auto unreachable() -> int { return UNREACHABLE; }
static auto normal_cost() -> int { return NORMAL_COST; }
static auto vortex_cost() -> int { return VORTEX_COST; }
auto cost(const vertex_t& u, const vertex_t& v) const -> int {
auto [ux, uy] = u;
auto [vx, vy] = v;
auto hd = std::abs(ux - vx) + std::abs(uy - vy);
switch (hd) {
// u = v; staying on the same cell
case 0:
return 0;
// u and v are neighbours
case 1:
break;
// u and v are not neighbouring cells
default:
return UNREACHABLE;
}
// boundary check
if (vy < 0 || vy >= _height || vx < 0 || vx >= _width) {
return UNREACHABLE;
}
switch (map[vy][vx]) {
case '#':
return UNREACHABLE;
case '*':
return VORTEX_COST;
default:
return NORMAL_COST;
}
}
auto width() const -> int { return _width; }
auto height() const -> int { return _height; }
auto has(const vertex_t& v) const -> bool {
auto [x, y] = v;
return (0 <= y && y < _height) && (0 <= x && x < _width);
}
friend std::ostream& operator<<(std::ostream& os, const graph& g);
private:
std::vector<std::vector<char>> map;
int _height, _width;
const static int UNREACHABLE = std::numeric_limits<int>::max();
// XXX: modify here to change the price of entering the vortex
const static int VORTEX_COST = 5;
const static int NORMAL_COST = 1;
};
std::ostream& operator<<(std::ostream& os, const graph& g) {
for (const auto& row : g.map) {
for (const char cell : row) {
os << cell;
}
os << "\n";
}
return os;
}
#endif /* _GRAPH_HPP */
```
:::info Source code
You can find all the source code referenced in this series
[here](pathname:///files/algorithms/paths/bf-to-astar.tar.gz).
:::
Let's finally start with some algorithms!

View file

@ -0,0 +1,59 @@
#ifndef _ASTAR_HPP
#define _ASTAR_HPP
#include <algorithm>
#include <cassert>
#include <functional>
#include <optional>
#include <utility>
#include <vector>
#include "graph.hpp"
auto astar(const graph& g, const vertex_t& source, const auto& h)
-> std::vector<std::vector<int>> {
// make sure that source exists
assert(g.has(source));
// initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// initialize the visited
std::vector<std::vector<bool>> visited(g.height(),
std::vector(g.width(), false));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
pqueue_t priority_queue{std::make_pair(0 + h(source), source)};
std::optional<pqueue_item_t> item{};
while ((item = popq(priority_queue))) {
auto [cost, u] = *item;
auto [x, y] = u;
// we have already found the shortest path
if (visited[y][x]) {
continue;
}
visited[y][x] = true;
for (const auto& [dx, dy] : DIRECTIONS) {
auto v = std::make_pair(x + dx, y + dy);
auto cost = g.cost(u, v);
// if we can move to the cell and it's better, relax¹ it and update queue
if (cost != graph::unreachable() &&
distances[y][x] + cost < distances[y + dy][x + dx]) {
distances[y + dy][x + dx] = distances[y][x] + cost;
pushq(priority_queue,
std::make_pair(distances[y + dy][x + dx] + h(v), v));
}
}
}
return distances;
}
#endif /* _ASTAR_HPP */

View file

@ -0,0 +1,136 @@
#ifndef _BF_HPP
#define _BF_HPP
#include <cassert>
#include <iostream>
#include <utility>
#include <vector>
#include "graph.hpp"
static auto _check_vertex(const graph& g,
std::vector<std::vector<int>>& distances, int v,
bool check_only = false) -> bool {
bool improvement_found = false;
// unpack the vertex coordinates
int y = v / g.width();
int x = v % g.width();
// skip the cells we cannot reach
if (distances[y][x] == graph::unreachable()) {
return false;
}
// go through the neighbours
auto u = std::make_pair(x, y);
for (const auto& [dx, dy] : DIRECTIONS) {
auto v = std::make_pair(x + dx, y + dy);
auto cost = g.cost(u, v);
// if we can move to the cell and it's better, relax¹ it
if (cost != graph::unreachable() &&
distances[y][x] + cost < distances[y + dy][x + dx]) {
if (check_only) {
return true;
}
distances[y + dy][x + dx] = distances[y][x] + cost;
improvement_found = true;
}
}
return improvement_found;
}
auto bf(const graph& g, const vertex_t& source, const vertex_t& destination)
-> int {
// source must be within the bounds
assert(g.has(source));
// destination must be within the bounds
assert(g.has(destination));
// we need to initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
// now we need to improve the paths as long as possible
bool improvement_found;
do {
// reset the flag at the beginning
improvement_found = false;
// go through all of the vertices
for (int v = g.height() * g.width() - 1; v >= 0; --v) {
improvement_found = _check_vertex(g, distances, v) || improvement_found;
}
} while (improvement_found);
return distances[destination.second][destination.first];
}
auto bf_finite(const graph& g, const vertex_t& source,
const vertex_t& destination) -> int {
// source must be within the bounds
assert(g.has(source));
// destination must be within the bounds
assert(g.has(destination));
// we need to initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
// now we only iterate as many times as cells that we have
for (int i = g.height() * g.width(); i > 0; --i) {
// go through all of the vertices
for (int v = g.height() * g.width() - 1; v >= 0; --v) {
_check_vertex(g, distances, v);
}
}
return distances[destination.second][destination.first];
}
auto bellman_ford(const graph& g, const vertex_t& source)
-> std::vector<std::vector<int>> {
// source must be within the bounds
assert(g.has(source));
// we need to initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
// now we only iterate as many times as cells that we have
for (int i = g.height() * g.width(); i > 0; --i) {
// go through all of the vertices
for (int v = g.height() * g.width() - 1; v >= 0; --v) {
_check_vertex(g, distances, v);
}
}
// now we check for the negative loops
for (int v = g.height() * g.width() - 1; v >= 0; --v) {
if (_check_vertex(g, distances, v, true)) {
std::cerr << "[Bellman-Ford] Found a negative loop\n";
break;
}
}
return distances;
}
#endif /* _BF_HPP */

View file

@ -0,0 +1,58 @@
#ifndef _DIJKSTRA_HPP
#define _DIJKSTRA_HPP
#include <algorithm>
#include <cassert>
#include <functional>
#include <optional>
#include <utility>
#include <vector>
#include "graph.hpp"
auto dijkstra(const graph& g, const vertex_t& source)
-> std::vector<std::vector<int>> {
// make sure that source exists
assert(g.has(source));
// initialize the distances
std::vector<std::vector<int>> distances(
g.height(), std::vector(g.width(), graph::unreachable()));
// initialize the visited
std::vector<std::vector<bool>> visited(g.height(),
std::vector(g.width(), false));
// source destination denotes the beginning where the cost is 0
auto [sx, sy] = source;
distances[sy][sx] = 0;
pqueue_t priority_queue{std::make_pair(0, source)};
std::optional<pqueue_item_t> item{};
while ((item = popq(priority_queue))) {
auto [cost, u] = *item;
auto [x, y] = u;
// we have already found the shortest path
if (visited[y][x]) {
continue;
}
visited[y][x] = true;
for (const auto& [dx, dy] : DIRECTIONS) {
auto v = std::make_pair(x + dx, y + dy);
auto cost = g.cost(u, v);
// if we can move to the cell and it's better, relax¹ it and update queue
if (cost != graph::unreachable() &&
distances[y][x] + cost < distances[y + dy][x + dx]) {
distances[y + dy][x + dx] = distances[y][x] + cost;
pushq(priority_queue, std::make_pair(distances[y + dy][x + dx], v));
}
}
}
return distances;
}
#endif /* _DIJKSTRA_HPP */

View file

@ -0,0 +1,108 @@
#ifndef _GRAPH_HPP
#define _GRAPH_HPP
#include <cmath>
#include <limits>
#include <ostream>
#include <utility>
#include <vector>
using vertex_t = std::pair<int, int>;
struct graph {
graph(const std::vector<std::vector<char>>& map)
: map(map),
_height(static_cast<int>(map.size())),
_width(map.empty() ? 0 : static_cast<int>(map[0].size())) {}
static auto unreachable() -> int { return UNREACHABLE; }
static auto normal_cost() -> int { return NORMAL_COST; }
static auto vortex_cost() -> int { return VORTEX_COST; }
auto cost(const vertex_t& u, const vertex_t& v) const -> int {
auto [ux, uy] = u;
auto [vx, vy] = v;
auto md = std::abs(ux - vx) + std::abs(uy - vy);
switch (md) {
// u = v; staying on the same cell
case 0:
return 0;
// u and v are neighbours
case 1:
break;
// u and v are not neighbouring cells
default:
return UNREACHABLE;
}
// boundary check
if (vy < 0 || vy >= _height || vx < 0 || vx >= _width) {
return UNREACHABLE;
}
switch (map[vy][vx]) {
case '#':
return UNREACHABLE;
case '*':
return VORTEX_COST;
default:
return NORMAL_COST;
}
}
auto width() const -> int { return _width; }
auto height() const -> int { return _height; }
auto has(const vertex_t& v) const -> bool {
auto [x, y] = v;
return (0 <= y && y < _height) && (0 <= x && x < _width);
}
friend std::ostream& operator<<(std::ostream& os, const graph& g);
private:
std::vector<std::vector<char>> map;
int _height, _width;
const static int UNREACHABLE = std::numeric_limits<int>::max();
// XXX: modify here to change the price of entering the vortex
const static int VORTEX_COST = 5;
const static int NORMAL_COST = 1;
};
std::ostream& operator<<(std::ostream& os, const graph& g) {
for (const auto& row : g.map) {
for (const char cell : row) {
os << cell;
}
os << "\n";
}
return os;
}
const static std::vector<vertex_t> DIRECTIONS =
std::vector{std::make_pair(0, 1), std::make_pair(0, -1),
std::make_pair(1, 0), std::make_pair(-1, 0)};
using pqueue_item_t = std::pair<int, vertex_t>;
using pqueue_t = std::vector<pqueue_item_t>;
auto pushq(pqueue_t& q, pqueue_item_t v) -> void {
q.push_back(v);
std::push_heap(q.begin(), q.end(), std::greater<>{});
}
auto popq(pqueue_t& q) -> std::optional<pqueue_item_t> {
if (q.empty()) {
return {};
}
std::pop_heap(q.begin(), q.end(), std::greater<>{});
pqueue_item_t top = q.back();
q.pop_back();
return std::make_optional(top);
}
#endif /* _GRAPH_HPP */

View file

@ -0,0 +1,50 @@
#include <iostream>
#include <string>
#include <utility>
#include <vector>
#include "astar.hpp"
#include "bf.hpp"
#include "dijkstra.hpp"
#include "graph.hpp"
auto line_to_vector(const std::string& l) -> std::vector<char> {
return std::vector(l.begin(), l.end());
}
auto main() -> int {
graph g{std::vector{
line_to_vector(std::string("#############")),
line_to_vector(std::string("#..#..*.*.**#")),
line_to_vector(std::string("##***.....**#")),
line_to_vector(std::string("#..########.#")),
line_to_vector(std::string("#...###...#.#")),
line_to_vector(std::string("#..#...##.#.#")),
line_to_vector(std::string("#..#.*.#..#.#")),
line_to_vector(std::string("#....#....#.#")),
line_to_vector(std::string("########*.*.#")),
line_to_vector(std::string("#...........#")),
line_to_vector(std::string("#############")),
}};
std::cout << "Normal cost: " << g.normal_cost() << "\n";
std::cout << "Vortex cost: " << g.vortex_cost() << "\n";
std::cout << "Graph:\n" << g;
// finding the distances from the bottom left corner to the 2 rows above
auto cost = bf_finite(g, std::make_pair(1, 9), std::make_pair(1, 7));
std::cout << "[Finite BF] Cost: " << cost << "\n";
auto distances = bellman_ford(g, std::make_pair(1, 9));
std::cout << "[Bellman-Ford] Cost: " << distances[7][1] << "\n";
distances = dijkstra(g, std::make_pair(1, 9));
std::cout << "[Dijkstra] Cost: " << distances[7][1] << "\n";
distances = astar(g, std::make_pair(1, 9), [](const auto& u) {
auto [x, y] = u;
return std::abs(1 - x) + std::abs(7 - y);
});
std::cout << "[A*] Cost: " << distances[7][1] << "\n";
return 0;
}

View file

@ -0,0 +1,8 @@
CXX=c++
CXXFLAGS=-std=c++20 -Wall -Wextra -g
all: format
$(CXX) $(CXXFLAGS) main.cpp -o main
format:
clang-format -i -style=google *.hpp *.cpp

Binary file not shown.

After

Width:  |  Height:  |  Size: 37 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 84 KiB