Inspired by: https://www.youtube.com/watch?v=ap05clkKxxw

If you need to align sequences, ChatGPT will point you to Needleman-Wunsch as a first suggestion.
When I watched and read tutorials, I found them quite confusing: the algorithm is usually explained by introducing a strange alignment matrix and a set of score-computation rules, and while some people might have the intuition to see why that works, I certainly didn't.
As always, there is a slightly longer but much more approachable road to understanding complicated algorithms such as Needleman-Wunsch!
**Start**

Let's say we have two strings, a and b:
a = "FLIEGE"
b = "LIEBEN"
N, M = len(a), len(b)
Without introducing any gaps, we can calculate the similarity of the (conveniently equally long) strings simply by aligning them character by character and counting the mismatches:
F L I E G E
L I E B E N
=> No characters match! Distance: 6
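In code, this gap-free comparison is just a positional mismatch count. A minimal sketch (the helper name hamming is mine, not from any library):

def hamming(x: str, y: str) -> int:
    # Count the positions at which two equally long strings disagree.
    assert len(x) == len(y)
    return sum(1 for cx, cy in zip(x, y) if cx != cy)

print(hamming(a, b))  # 6 -- no position matches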
Ok, but what if we allowed gaps, to make more matches possible? We could get an alignment like this one:
F L I E G E -
- L I E B E N
=> Two gaps, 4 matches, 1 mismatch. Distance: 3
That's much better! You might object and say: but what do you mean by better? While there is no universal law describing what "good" means here, we can simply say that the best alignment is the one that minimizes the Hamming distance, using as many gaps as we like but without changing the order of the characters. As it happens, that minimal score is exactly what people call the Levenshtein distance.
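To make "better" measurable, here is a small sketch that scores a given alignment, i.e. two equally long rows in which gaps are written as '-'. Every column whose characters differ costs 1, and since '-' never equals a letter, gap columns are counted automatically (score_alignment is my own name for this helper):

def score_alignment(row_a: str, row_b: str) -> int:
    # 1 per column whose characters differ; a '-' never equals a letter,
    # so gap columns cost 1 as well.
    assert len(row_a) == len(row_b)
    return sum(1 for x, y in zip(row_a, row_b) if x != y)

print(score_alignment("FLIEGE", "LIEBEN"))    # 6: the gap-free alignment
print(score_alignment("FLIEGE-", "-LIEBEN"))  # 3: two gaps + one mismatch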
Now, how would we find an alignment that minimizes that distance? The first thing to realise is that any alignment can only end in one of three cases:
Aligned end
... E
... N
Gap in a
... -
... N
Gap in b
... E
... -
For our Hamming distance it really doesn't matter if we have:
... -
... -
=> We can drop the last gap from both and still get the same alignment score, since the Hamming distance only counts mismatched characters.
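You can check this with the score_alignment sketch from above: a gap-gap column compares '-' with '-', which counts as a match and costs nothing.

print(score_alignment("FLIEGE-",  "-LIEBEN"))   # 3
print(score_alignment("FLIEGE--", "-LIEBEN-"))  # 3: the extra gap-gap column is free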
The laziest way to compute the best alignment from here on is to define it recursively: the best score for the prefixes a[0:i] and b[0:j] (i.e. the first i characters of a and the first j characters of b) is the minimum over the three cases above, i.e.:
aligned = (1 if a[i-1] != b[j-1] else 0) + best(a[0:i-1], b[0:j-1])
gap_a   = 1 + best(a[0:i],   b[0:j-1])  # b[j-1] sits opposite a gap in a
gap_b   = 1 + best(a[0:i-1], b[0:j])    # a[i-1] sits opposite a gap in b
best    = min(aligned, gap_a, gap_b)
Now the only remaining question is: what is the base case of the recursion?
Does the recursion end the same way it started? Since it peels characters off the back, it bottoms out at the front of the strings, where the very first column can again be one of three cases:
**Case 1:**
F ...
L ...
**Case 2:**
- ...
L ...
**Case 3:**
F ...
- ...
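Putting everything together: a minimal sketch of the full recursion, written on whole prefixes instead of indices, and assuming the usual base case that once one prefix is empty, the only option left is to pair every remaining character of the other prefix with a gap, so the remaining cost is simply its length. The lru_cache memoization just keeps the naive recursion from recomputing the same prefixes over and over; tabulating exactly these prefix scores is essentially what the Needleman-Wunsch matrix does.

from functools import lru_cache

@lru_cache(maxsize=None)
def best(x: str, y: str) -> int:
    # Minimal number of mismatches + gaps needed to align x and y.
    if not x:            # base case: x is empty, only gaps in x remain
        return len(y)
    if not y:            # base case: y is empty, only gaps in y remain
        return len(x)
    aligned = (1 if x[-1] != y[-1] else 0) + best(x[:-1], y[:-1])
    gap_a   = 1 + best(x, y[:-1])    # y's last character sits opposite a gap
    gap_b   = 1 + best(x[:-1], y)    # x's last character sits opposite a gap
    return min(aligned, gap_a, gap_b)

print(best("FLIEGE", "LIEBEN"))  # 3, matching the gapped alignment we found by hand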