# Advent of Code 2020: Day 8 Solution

Darryl Abbate
2020/12/08

A user on Reddit asked me to explain my AWK one-line solution to day 8’s puzzle (part 1); specifically to understand what’s going on in AWK.

``awk '{p[NR]=\$2\$1}END{i=1;for(;;){if(r[i]++){print a;exit}p[i]~/a/?a+=p[i++]:p[i]~/j/?i+=p[i]:++i}}' input``

The puzzle basically asks you to simulate a very simple virtual machine, with only `acc`, `jmp` and `nop` instructions. For the first part, you’re asked for the value in the accumulator before any instruction is run a second time.

## Input Parsing

By default, AWK’s Record Separator (`RS`) is a newline. The Field Separator (`FS`) is any whitespace. AWK will process the input one line at a time.

``{ p[NR] = \$2\$1 }``

This code block is an empty pattern, meaning AWK will unconditionally execute it once for each line of input. For each line of input, we want AWK to store the second field (`\$2`) concatenated with the first field (`\$1`) in an associative array `p`. The key being used here is `NR`, a special AWK variable which holds the current record number (or line number) of the input.

For example, when given the following input:

``````nop +335
acc +46
jmp +42``````

The array `p` will be populated as follows:

``````p = "+355nop"
p = "+46acc"
p = "+42jmp"``````

We want `\$2` to precede `\$1` in the concatenated strings to take advantage of the numeric coercion AWK employs when performing arithmetic on strings. AWK will coerce a string such as `+46acc` as the number `46` when used as an operand in an arithmetic expression.

## The `END` Block

Code in the `END` block is executed once all input has been processed.

``````END {
i = 1
for (;;) {
if (r[i]++) {
print a
exit
}
p[i] ~ /a/ ? a += p[i++] : p[i] ~ /j/ ? i += p[i] : ++i
}
}``````

The entire “virtual machine” is modeled in just 8 lines of AWK code inside the `END` block. We use a few variables to represent different parts of the machine.

Variable Description
`a` The accumulator. This value will change for every `acc` instruction executed.
`i` The “instruction pointer” (or program counter). This keeps track of our position in the input program (stored in the array `p`). This is initialized to `1` since our entry point (first instruction) is at `p`.
`p` The input program, which was initialized earlier.
`r` We use a second associative array `r` to mark which instructions have already been run.
``````for (;;) {
...
}``````

This represents the interpreter loop, which will run until we explicitly `exit` the program.

``````if (r[i]++) {
print a
exit
}``````

Here, we increment `r[i]` to mark the instruction `i` as “run.” If `r[i]` was already non-zero, we print the value in accumulator (`a`) and exit; the puzzle is solved!

``p[i] ~ /a/ ? a += p[i++] : p[i] ~ /j/ ? i += p[i] : ++i``

Otherwise, we use this chained ternary expression as a concise if-else construct.

If the string in `p[i]` contains the letter “`a`”, we know the instruction is `acc`; we add the value of `p[i]` to `a`. As mentioned earlier, AWK will coerce the string to a number when used as an addition operand. We also increment `i` so `p[i]` will point to the succeeding instruction when the interpreter loop jumps back to the beginning.

Else, if `p[i]` contains the letter “`j`”, we know it’s a `jmp` instruction. Instead of adding `p[i]` to `a`, we add it to `i` to simulate a “jump” in the program.

Finally, we can just assume the instruction is a `nop` if it reaches the final clause in the ternary expression. In this case, we just increment `i`.

AWK is one of my favorite programming languages, and I find it especially well-suited for Advent of Code puzzles. I solved every 2019 puzzle in AWK and even wrote an overly-complex assembler/disassembler/debugger/interpreter for Intcode in AWK.