AoC 2021 in bash

December 05, 2021

I feel AoC is a nice way to get to know the language of your choice. There's string manipulation, basic mathematics and more. bash isn't the ideal language for this task because (in my opinion) it wasn't meant to solve these kind of problems. Nevertheless, I wanted to expand my experience with bash and so I'll attempt to solve all the AoC problems using this language.

I might not be able to solve the problems on the same day (heck, I'm starting almost a week late) and might even skip some problems altogether. I also plan to use common tools like grep, sed, awk, etc and not rely on pure bash.

Index

· · ·

Day 01

Part 1

First, let's get the first value and setup variables

prev_value="$(head -1 $INP_FILE)"
# keep count in this variable
inc_count=0
# store input file in a variable
INP_FILE="../inp/1"

Now, skipping the first line, we need to iterate over the rest and count the changes

sed -n '2,$p' "$INP_FILE" | while read value; do
    (( value > prev_value )) && (( inc_count++ ))
    prev_value=$value
done

Finally, lets print the answer

echo $inc_count
# Output:
0

WRONG ANSWER. Umm… what?! After opening the file and confirming that the answer indeed isn't 0, I decided to do some print debugging…

sed -n '2,$p' "$INP_FILE" | while read value; do
    (( value > prev_value )) && (( inc_count++ ))
    echo $inc_count
    prev_value=$value
done

…and surely, non zero values were being printed:

...
1398
1399
1400
1400

This confirmed my doubt that the piped while loop was actually running inside a subshell.

A fix, also mentioned in the post was to wrap the while loop and echo inside braces to make it be processed by the same subshell.

sed -n '2,$p' "$INP_FILE" | {
    while read value; do
        (( value > prev_value )) && (( inc_count++ ))
        prev_value=$value
    done
    echo $inc_count
}

Note that the variable originally declared outside is still 0.

Part 2

Firstly, I'll make a function, part1 that solves part 1. It'll accept a single argument: a filename which has the numbers

We're going to use paste and process substitution to calculate the window sum.

A little bit of background first: we want to offset the stream of numbers and want to see them side by side, very much like the illustration provided on the website:

199  A
200  A B
208  A B C
210    B C D
200  E   C D
207  E F   D
240  E F G
269    F G H
260      G H
263        H

Using paste, we can merge two numbers from the lines together:

paste <(seq 1 5) <(seq 1 5)

# Output:
1	1
2	2
3	3
4	4
5	5

We can achieve the offset simply by using echo. Also, changing delimiter to +

paste -d+ <(seq 1 5) <(echo; seq 1 5)

# Output:
1+
2+1
3+2
4+3
5+4
+5

Passing this to bc will give us the sum for valid windows and print (standard_in) [NUMBER]: syntax error to stderr, which will get ignored anyways ;)

Putting it all together:

tmpfile="$(mktemp)"

# calculate sum of 3-interval window
# NOTE: numbers not part of a window will get ignored anyways
paste -d+ "$INP_FILE" <(echo; cat "$INP_FILE") <(echo; echo; cat "$INP_FILE") | bc > "$tmpfile" 2>/dev/null

part1 "$tmpfile"

rm "$tmpfile"
· · ·

Day 02

There was nothing surprising or special about my solution for Day 02 so I'll just put them here:

Part 1

function part1 {
    horizontal=0
    depth=0

    while read directive value; do
        case $directive in
            forward)
                (( horizontal += value ))
                ;;
            down)
                (( depth += value ))
                ;;
            up)
                (( depth -= value ))
                ;;
        esac
    done < "$INP_FILE"

    echo $(( horizontal * depth ))
}

Part 2

function part2 {
    horizontal=0
    depth=0
    aim=0

    while read directive value; do
        case $directive in
            forward)
                (( horizontal += value ))
                (( depth += value * aim ))
                ;;
            down)
                (( aim += value ))
                ;;
            up)
                (( aim -= value ))
                ;;
        esac
    done < "$INP_FILE"

    echo $(( horizontal * depth ))
}
· · ·

Day 03

Part 1

First, I'll define a few global variables that'll come handy in the solution. INP_FILE, like all other solutions and N_COLS. Since the number of columns in each line is fixed, we can simply check the characters in the first line and remove 1 for the newline character.

# this counts the newline too
N_COLS="$( head -1 "$INP_FILE" | wc -c )"
# account for that
(( N_COLS-- ))

With that done, I define a helper function that takes a file and calculates each columns' sum (actually, it counts the number of lines that have a 1 at that place). I use cut to only look at that specific column.

function count_set_bits {
    # count set bits at each position
    set_bit_counts=()

    for (( i=1; i <= N_COLS; i++)); do
        set_bit_counts+=("$( cut -c $i "$1" | grep 1 | wc -l )")
    done

    # "return" the bit count
    echo "${set_bit_counts[@]}"
}

This will contain the number of rows that have 1 at those columns.

Now, part 1 is trivial:

function part1 {
    set_bit_counts=($( count_set_bits "$1" ))
    # use redirection to avoid filename
    n="$( wc -l < "$1" )"

    gamma=0
    epsilon=0

    for (( i=0; i < N_COLS; i++ )); do
        (( set_bit_counts[i] > n - set_bit_counts[i] )) && majority_bit=1 || majority_bit=0

        (( gamma = 2 * gamma + majority_bit ))
        (( epsilon = 2 * epsilon + 1 - majority_bit ))
    done

    echo $(( gamma * epsilon ))
}

Part 2

For part 2, we need to continuously filter the list until only one remains. Since the core logic for calculating for Oxygen and Carbon Dioxide is the same, I'll only explain one here.

I'm going to use intermediate files again. This is definitely possible to do without files but this just makes it a little easier.

Basic setup

o2_candidates="$( cat "$1" )"
tmpfile=`mktemp`

# for keeping track of which bits we've already seen
cur_bit=0

Counting current number of candidates

Since the number of rows are changing at every step, we need to keep track of the current count. We also need to recalculate column sum at every step. The sum part is simple because of the function declared earlier (count_set_bits)

To get the count of candidates remaining, we can use wc:

wc -l <<< "$o2_candidates"

Filtering candidates

The other part of the puzzle is only keeping lines that we want to. This can be achieved by grep. When we're looking at the ith column, the first i-1 columns can be anything ("." in regex). So, for example if I want my 6th character to be Y and don't care about anything else, I'll use the following regex:

grep -E "^.{5}Y"

For this problem, filtering candidates looks like:

# ignore for `cur_bit` characters (0 indexed so -1 offset not needed)
# and then look for `set_bit`
o2_candidates="$( grep -E "^.{$cur_bit}${reqd_bit}" <<< "$o2_candidates" )"

Base 2 to decimal

bc has a special variable called ibase that controls the base of input. In case you were wondering, obase also exists. So, we can do math in binary and get the result in decimal. For example, multiplying 10 (1010 in binary) and 5 (101 in binary):

bc <<< "ibase=2; 1010 * 101"
# Output:
50

Full solution

function part2 {
    o2_candidates="$( cat "$1" )"
    co2_candidates="$( cat "$1" )"

    tmpfile=`mktemp`

    cp "$1" "$tmpfile"
    cur_bit=0
    while (( n="$( wc -l <<< "$o2_candidates" )" )); (( n > 1 )); do
        set_bit_counts=($( count_set_bits "$tmpfile" ))

        # if number of 1s >= number of 0s
        (( set_bit_counts[cur_bit] >= n - set_bit_counts[cur_bit] )) && reqd_bit=1 || reqd_bit=0

        # ignore for `cur_bit` characters (0 indexed so -1 offset not needed)
        # and then look for `reqd_bit`
        o2_candidates="$( grep -E "^.{$cur_bit}${reqd_bit}" <<< "$o2_candidates" )"
        echo "$o2_candidates" > "$tmpfile"

        (( cur_bit++ ))
    done

    # reset variables
    cur_bit=0
    cp "$1" "$tmpfile"
    while (( n="$( wc -l <<< "$co2_candidates" )" )); (( n > 1 )); do
        set_bit_counts=($( count_set_bits "$tmpfile" ))

        # if number of 0s > number of 1s
        (( n - set_bit_counts[cur_bit] > set_bit_counts[cur_bit] )) && reqd_bit=1 || reqd_bit=0

        co2_candidates="$( grep -E "^.{$cur_bit}${reqd_bit}" <<< "$co2_candidates" )"
        echo "$co2_candidates" > "$tmpfile"

        (( cur_bit++ ))
    done

    echo "ibase=2; $o2_candidates * $co2_candidates" | bc
    rm "$tmpfile"
}
· · ·

Day 04

Parts 1 and 2

Ok, this is where things got really interesting. For solving this, we need to cover a variety of concepts. Let's tackle them one by one and build towards the solution.

Multi-dimensional arrays

The bingo card is a 2D board. Bash natively doesn't support multiple dimensional arrays. Obviously we can map any N-d array to 1-d just by multiplying and adding indices but something better exists. Some other approaches can be found in this SO question.

I used the associative array approach. Basically, it is like a hashmap (or dict in python, Object in js, etc) where the key is the index in the form i,j and the value is the element at arr[i][j]. Since we have multiple boards, I used 3 indices for this array: first for indexing the board and rest two for indexing inside the board.

To facilitate fast number processing, I also store a mapping of number to index. When lets say 82 is called, I'll just use this LUT and update those specific boards only rather than iterating over all boards and checking each cell for 82.

Due to the requirements of this task, we need to keep track of which numbers have been marked. Some hack could've been arranged using negative numbers and adding one to everything but I chose to keep it simple and made another associative array, bingo_cards_marked that stores whether the number was marked or not.

# associative array
# key: [card index, row, col]
declare -A bingo_cards
declare -A bingo_cards_marked

function read_input {
    # read the list of random numbers
    IFS=, read -a numbers < <( head -1 "$INP" )

    local bingo_idx=0 line_idx=0 line bingo_row col key

    while read line; do
        # input next bingo card after 5 lines
        if (( line_idx == 5 )); then
            line_idx=0
            (( bingo_idx++ ))
        fi

        read -a bingo_row <<< "$line"
        for col in $( seq 0 $(( BINGO_COLS - 1 )) ); do
            key="$bingo_idx,$line_idx,$col"
            bingo_cards["$key"]=${bingo_row[$col]}
            # all numbers are unmarked in the beginning
            bingo_cards_marked["$key"]=0

            # store indices separated by space
            num2idx[${bingo_cards["$key"]}]="$( echo ${num2idx[${bingo_cards["$key"]}]} $key )"
        done

        (( line_idx++ ))
        #   this skips empty lines
        #   vvvvvvvvvvvvvvvvvvvv
    done < <(grep -v '^$' "$INP" | sed -n '2,$p')
    #                              ^^^^^^^^^^^^^
    #                              since first line has random number list
}

Checking for bingo

This was another piece of the puzzle which is going to be used repeatedly so the logic best resides inside a function. Given an index of a bingo card as argument, this goes over the rows and columns and checks whether atleast a single cell was unmarked. If nothing was unmarked, it returns 0 otherwise, it returns 1.

In bash, the return keyword in functions isn't what you expect it to be. As seen in the earlier problems, to really "return" from a function, we print it. In bash, the return keyword does two things:

  1. stop execution of the function: even if you're inside a loop, you'll get out of the function
  2. set the status code: a status code of 0 means success and anything else means failure. if and test use this definition

This SO answer explains this in more detail.

function bingo_done {
    # receives an index of bingo to check from bingo card list

    local bingo_dimen found_invalid_row found_invalid_col row_wise_key col_wise_key i j
    bingo_dimen=$BINGO_ROWS

    for i in $( seq 0 $(( bingo_dimen - 1 )) ); do
        found_invalid_row=0
        found_invalid_col=0

        for j in $( seq 0 $(( bingo_dimen - 1 )) ); do
            row_wise_key="$1,$i,$j"
            col_wise_key="$1,$j,$i"

            (( bingo_cards_marked["$row_wise_key"] == 0 )) && found_invalid_row=1
            (( bingo_cards_marked["$col_wise_key"] == 0 )) && found_invalid_col=1
        done

        # either didnt find invalid row or didnt find invalid col
        (( found_invalid_row * found_invalid_col == 0 )) && return 0
    done

    return 1
}

I also need to keep track of which bingo cards have won. The reason is twofold:

  1. We don't mark new numbers on a card that has already won
  2. It is needed in the way I'm solving this problem, to make sure that in every iteration I don't double count a card.

For this and for the solution to the problem itself, I'm using an array that stores card_number,score pairs in the order the cards won.

# list of bingo cards that were finished
# in order and their score
declare -a win_order_and_score

function bingo_win_recorded {
    # checks whether the card had already won
    local idx score winner

    for winner in "${win_order_and_score[@]}"; do
        IFS=, read idx score <<< "$winner"
        (( idx == "$1" )) && return 0
    done

    # $? will be 1, if statements will fail etc
    return 1
}

Now, we come to the actual bingo game. Since we're storing all the indices for each number, we just need to iterate over them, mark the number and check whether its a bingo in the subset of all cards.

Going into a little more depth in my solution here, num2idx stores a space separated list of indices. For example if the number 82 was present in cards 4, 9 and 13 at the top left, top right and bottom left corners respectively, num2idx[82] would look like 3,0,0 8,0,4 12,4,0. Note the space between comma separated 3-tuples. The -1 is due to 0-indexing.

We've already seen that we can use cut to separate out the individual items but here, we'll have to change the delimiter to a newline first. Instead, we use awk. Now awk, like cut also assumes that the different "rows" are in individual lines but we can change that using RS (record separator).

So, to extract just the card index, we can do something like:

awk -vRS=' ' -F, '{ print $1 }' <<< "3,0,0 8,0,4 12,4,0"
# Output:
3
8
12

Basics aside, this is how I go through the cards for which the number was called just now

function check_bingo_list {
    # receives an array of $bingo_idx,$line_idx,$col (basically whatever is stored in num2idx)
    bingo_idx_to_check=($( awk -vRS=' ' -F, '{ print $1 }' <<< "$1" ))

    bingo_found=()

    for idx in "${bingo_idx_to_check[@]}"; do
        #                    we dont want to add same card again
        #                    vvvvvvvvvvvvvvvvvvvvvvvvv
        bingo_done "$idx" && ! bingo_win_recorded $idx && bingo_found+=($idx)
    done

    echo "${bingo_found[@]}"
}

With all the pieces in place, we can now simulate the bingo game:

function simulate_game {
    read_input

    # loop over the list of random numbers
    for n in "${numbers[@]}"; do
        # update all bingo card values using `num2idx`
        for idx in ${num2idx[$n]}; do
            IFS=, read bingo_card _unused _unused <<< "$idx"
            # dont mark new numbers if card has already won
            bingo_win_recorded $bingo_card || bingo_cards_marked["$idx"]=1
        done

        # check if new bingo
        # check_bingo_list already filters for old winners
        bingo_found_idxs=($( check_bingo_list "${num2idx[$n]}" ))

        # calculate and store score
        for idx in "${bingo_found_idxs[@]}"; do
            sum=0
            for row in $( seq 0 $(( BINGO_ROWS - 1 )) ); do
                for col in $( seq 0 $(( BINGO_COLS - 1 )) ); do
                    key="$idx,$row,$col"
                    (( bingo_cards_marked["$key"] == 0 )) && (( sum += bingo_cards["$key"] ))
                done
            done

            win_order_and_score+=("$idx,$(( n*sum ))")
        done
    done
}

Now, we have the order in which boards won and their score stored in win_order_and_score. To solve parts 1 and 2, we just need to check the first and last element after calling the function:

simulate_game

echo "${win_order_and_score[0]}"
echo "${win_order_and_score[-1]}"
# Output:
88,10374
68,24742

While we can ignore the bingo card index, I want to see just the answers, like for other days. We can use parameter substitution:

${var#Pattern}, ${var##Pattern}

${var#Pattern}
    Remove from $var the shortest part of $Pattern that matches the front end of $var.

${var##Pattern}
    Remove from $var the longest part of $Pattern that matches the front end of $var.

simulate_game

echo "${win_order_and_score[0]##*,}"
echo "${win_order_and_score[-1]##*,}"
# Output:
10374
24742
· · ·

Day 05

Part 1

The 2D array part should be clear, thanks to day 4's problem.

Since we can deal with a single delimiter using read and IFS, its better to convert the ` -> ` to ,. While reading the input, we'll also figure out the total number of rows and cols.

N_ROW=0
N_COL=0

declare -a input
declare -A grid

function read_input {
    # input is comma separated now
    input=($( sed 's/ -> /,/' "$INP" ))

    local line x1 y1 x2 y2

    for line in "${input[@]}"; do
        IFS=, read x1 y1 x2 y2 <<< "$line"

        (( x1 + 1 > N_COL )) && N_COL=$(( x1 + 1 ))
        (( x2 + 1 > N_COL )) && N_COL=$(( x2 + 1 ))
        (( y1 + 1 > N_ROW )) && N_ROW=$(( y1 + 1 ))
        (( y2 + 1 > N_ROW )) && N_ROW=$(( y2 + 1 ))
    done
}

A helper function to initialize the associative array (sets all values to 0):

function init_grid {
    local i j key

    for (( i=0; i < N_ROW; i++ )); do
        for (( j=0; j < N_COL; j++ )); do
            key="$i,$j"
            grid["$key"]=0
        done
    done
}

For looping through the coordinates, we can use seq. seq takes an "increment" parameter that we can set to -1 for the descending values.

seq 5 -1 2
# Output:
5
4
3
2

Rest of part 1 is simple:

function straight_lines {
    local i j key line x1 y1 x2 y2 increment x y n_dangerous

    for line in "${input[@]}"; do
        IFS=, read x1 y1 x2 y2 <<< "$line"
        # y doesn't change
        # seq on x
        if (( y1 == y2 )); then
            y=$y1
            (( x1 > x2 )) && increment=-1 || increment=1
            for x in $( seq "$x1" "$increment" "$x2" ); do
                key="$y,$x"
                grid["$key"]=$(( grid["$key"] + 1 ))
            done
        # x doesn't change
        # seq on y
        elif (( x1 == x2 )); then
            x=$x1
            (( y1 > y2 )) && increment=-1 || increment=1
            for y in $( seq "$y1" "$increment" "$y2" ); do
                key="$y,$x"
                grid["$key"]=$(( grid["$key"] + 1 ))
            done
        fi
    done

    n_dangerous=0

    for (( i=0; i < N_ROW; i++ )); do
        for (( j=0; j < N_COL; j++ )); do
            key="$i,$j"
            (( grid["$key"] > 1 )) && (( n_dangerous++ ))
        done
    done

    echo "$n_dangerous"
}

Part 2

Since the second part asks for all the lines (both straight and diagonal), we can use the same grid after part 1 and just add the diagonal values to it.

Since both x and y are changing now, we'll need two seqs. Their outputs can be merged using paste.

function diagonal_lines {
    local i j key line x1 y1 x2 y2 x_inc y_inc x y n_dangerous

    for line in "${input[@]}"; do
        IFS=, read x1 y1 x2 y2 <<< "$line"
        (( x1 > x2 )) && x_inc=-1 || x_inc=1
        (( y1 > y2 )) && y_inc=-1 || y_inc=1

        # skip straight lines
        (( x1 == x2 )) && continue
        (( y1 == y2 )) && continue

        while read coords; do
            IFS=, read x y <<< "$coords"
            key="$y,$x"
            grid["$key"]=$(( grid["$key"] + 1 ))
        done <<< "$( paste -d, <( seq "$x1" "$x_inc" "$x2" ) <( seq "$y1" "$y_inc" "$y2" ) )"
    done

    n_dangerous=0

    for (( i=0; i < N_ROW; i++ )); do
        for (( j=0; j < N_COL; j++ )); do
            key="$i,$j"
            (( grid["$key"] > 1 )) && (( n_dangerous++ ))
        done
    done

    echo "$n_dangerous"
}
· · ·

My solutions, for both bash and very little rust can be found here.

« Port based routing