AoC 2021 in bash
December 05, 2021I feel AoC is a nice way to get to know the language of your choice. There's string manipulation, basic mathematics and more. bash
isn't the ideal language for this task because (in my opinion) it wasn't meant to solve these kind of problems. Nevertheless, I wanted to expand my experience with bash
and so I'll attempt to solve all the AoC problems using this language.
I might not be able to solve the problems on the same day (heck, I'm starting almost a week late) and might even skip some problems altogether. I also plan to use common tools like grep
, sed
, awk
, etc and not rely on pure bash.
Index
- Day 01 part 1 part 2
- Day 02 part 1 part 2
- Day 03 part 1 part 2
- Day 04 part 1 part 2
- Day 05 part 1 part 2
Day 01
Part 1
First, let's get the first value and setup variables
prev_value="$(head -1 $INP_FILE)"
# keep count in this variable
inc_count=0
# store input file in a variable
INP_FILE="../inp/1"
Now, skipping the first line, we need to iterate over the rest and count the changes
sed -n '2,$p' "$INP_FILE" | while read value; do
(( value > prev_value )) && (( inc_count++ ))
prev_value=$value
done
Finally, lets print the answer
echo $inc_count
# Output:
0
WRONG ANSWER
. Umm… what?! After opening the file and confirming that the answer indeed isn't 0, I decided to do some print debugging…
sed -n '2,$p' "$INP_FILE" | while read value; do
(( value > prev_value )) && (( inc_count++ ))
echo $inc_count
prev_value=$value
done
…and surely, non zero values were being printed:
...
1398
1399
1400
1400
This confirmed my doubt that the piped while
loop was actually running inside a subshell.
A fix, also mentioned in the post was to wrap the while
loop and echo
inside braces to make it be processed by the same subshell.
sed -n '2,$p' "$INP_FILE" | {
while read value; do
(( value > prev_value )) && (( inc_count++ ))
prev_value=$value
done
echo $inc_count
}
Note that the variable originally declared outside is still 0.
Part 2
Firstly, I'll make a function, part1
that solves part 1. It'll accept a single argument: a filename which has the numbers
We're going to use paste
and process substitution to calculate the window sum.
A little bit of background first: we want to offset the stream of numbers and want to see them side by side, very much like the illustration provided on the website:
199 A
200 A B
208 A B C
210 B C D
200 E C D
207 E F D
240 E F G
269 F G H
260 G H
263 H
Using paste, we can merge two numbers from the lines together:
paste <(seq 1 5) <(seq 1 5)
# Output:
1 1
2 2
3 3
4 4
5 5
We can achieve the offset simply by using echo
. Also, changing delimiter to +
paste -d+ <(seq 1 5) <(echo; seq 1 5)
# Output:
1+
2+1
3+2
4+3
5+4
+5
Passing this to bc
will give us the sum for valid windows and print (standard_in) [NUMBER]: syntax error
to stderr, which will get ignored anyways ;)
Putting it all together:
tmpfile="$(mktemp)"
# calculate sum of 3-interval window
# NOTE: numbers not part of a window will get ignored anyways
paste -d+ "$INP_FILE" <(echo; cat "$INP_FILE") <(echo; echo; cat "$INP_FILE") | bc > "$tmpfile" 2>/dev/null
part1 "$tmpfile"
rm "$tmpfile"
Day 02
There was nothing surprising or special about my solution for Day 02 so I'll just put them here:
Part 1
function part1 {
horizontal=0
depth=0
while read directive value; do
case $directive in
forward)
(( horizontal += value ))
;;
down)
(( depth += value ))
;;
up)
(( depth -= value ))
;;
esac
done < "$INP_FILE"
echo $(( horizontal * depth ))
}
Part 2
function part2 {
horizontal=0
depth=0
aim=0
while read directive value; do
case $directive in
forward)
(( horizontal += value ))
(( depth += value * aim ))
;;
down)
(( aim += value ))
;;
up)
(( aim -= value ))
;;
esac
done < "$INP_FILE"
echo $(( horizontal * depth ))
}
Day 03
Part 1
First, I'll define a few global variables that'll come handy in the solution. INP_FILE
, like all other solutions and N_COLS
.
Since the number of columns in each line is fixed, we can simply check the characters in the first line and remove 1 for the newline character.
# this counts the newline too
N_COLS="$( head -1 "$INP_FILE" | wc -c )"
# account for that
(( N_COLS-- ))
With that done, I define a helper function that takes a file and calculates each columns' sum (actually, it counts the number of lines that have a 1 at that place).
I use cut
to only look at that specific column.
function count_set_bits {
# count set bits at each position
set_bit_counts=()
for (( i=1; i <= N_COLS; i++)); do
set_bit_counts+=("$( cut -c $i "$1" | grep 1 | wc -l )")
done
# "return" the bit count
echo "${set_bit_counts[@]}"
}
This will contain the number of rows that have 1 at those columns.
Now, part 1 is trivial:
function part1 {
set_bit_counts=($( count_set_bits "$1" ))
# use redirection to avoid filename
n="$( wc -l < "$1" )"
gamma=0
epsilon=0
for (( i=0; i < N_COLS; i++ )); do
(( set_bit_counts[i] > n - set_bit_counts[i] )) && majority_bit=1 || majority_bit=0
(( gamma = 2 * gamma + majority_bit ))
(( epsilon = 2 * epsilon + 1 - majority_bit ))
done
echo $(( gamma * epsilon ))
}
Part 2
For part 2, we need to continuously filter the list until only one remains. Since the core logic for calculating for Oxygen and Carbon Dioxide is the same, I'll only explain one here.
I'm going to use intermediate files again. This is definitely possible to do without files but this just makes it a little easier.
Basic setup
o2_candidates="$( cat "$1" )"
tmpfile=`mktemp`
# for keeping track of which bits we've already seen
cur_bit=0
Counting current number of candidates
Since the number of rows are changing at every step, we need to keep track of the current count. We also need to recalculate column sum at every step. The sum part is simple because of the function declared earlier (count_set_bits
)
To get the count of candidates remaining, we can use wc
:
wc -l <<< "$o2_candidates"
Filtering candidates
The other part of the puzzle is only keeping lines that we want to. This can be achieved by grep
. When we're looking at the i
th column, the first i-1
columns can be anything (".
" in regex). So, for example if I want my 6th character to be Y
and don't care about anything else, I'll use the following regex:
grep -E "^.{5}Y"
For this problem, filtering candidates looks like:
# ignore for `cur_bit` characters (0 indexed so -1 offset not needed)
# and then look for `set_bit`
o2_candidates="$( grep -E "^.{$cur_bit}${reqd_bit}" <<< "$o2_candidates" )"
Base 2 to decimal
bc
has a special variable called ibase
that controls the base
of i
nput. In case you were wondering, obase
also exists. So, we can do math in binary and get the result in decimal. For example, multiplying 10 (1010
in binary) and 5 (101
in binary):
bc <<< "ibase=2; 1010 * 101"
# Output:
50
Full solution
function part2 {
o2_candidates="$( cat "$1" )"
co2_candidates="$( cat "$1" )"
tmpfile=`mktemp`
cp "$1" "$tmpfile"
cur_bit=0
while (( n="$( wc -l <<< "$o2_candidates" )" )); (( n > 1 )); do
set_bit_counts=($( count_set_bits "$tmpfile" ))
# if number of 1s >= number of 0s
(( set_bit_counts[cur_bit] >= n - set_bit_counts[cur_bit] )) && reqd_bit=1 || reqd_bit=0
# ignore for `cur_bit` characters (0 indexed so -1 offset not needed)
# and then look for `reqd_bit`
o2_candidates="$( grep -E "^.{$cur_bit}${reqd_bit}" <<< "$o2_candidates" )"
echo "$o2_candidates" > "$tmpfile"
(( cur_bit++ ))
done
# reset variables
cur_bit=0
cp "$1" "$tmpfile"
while (( n="$( wc -l <<< "$co2_candidates" )" )); (( n > 1 )); do
set_bit_counts=($( count_set_bits "$tmpfile" ))
# if number of 0s > number of 1s
(( n - set_bit_counts[cur_bit] > set_bit_counts[cur_bit] )) && reqd_bit=1 || reqd_bit=0
co2_candidates="$( grep -E "^.{$cur_bit}${reqd_bit}" <<< "$co2_candidates" )"
echo "$co2_candidates" > "$tmpfile"
(( cur_bit++ ))
done
echo "ibase=2; $o2_candidates * $co2_candidates" | bc
rm "$tmpfile"
}
Day 04
Parts 1 and 2
Ok, this is where things got really interesting. For solving this, we need to cover a variety of concepts. Let's tackle them one by one and build towards the solution.
Multi-dimensional arrays
The bingo card is a 2D board. Bash natively doesn't support multiple dimensional arrays. Obviously we can map any N-d array to 1-d just by multiplying and adding indices but something better exists. Some other approaches can be found in this SO question.
I used the associative array approach. Basically, it is like a hashmap (or dict
in python, Object
in js, etc) where the key is the index in the form i,j
and the value is the element at arr[i][j]
. Since we have multiple boards, I used 3 indices for this array: first for indexing the board and rest two for indexing inside the board.
To facilitate fast number processing, I also store a mapping of number to index. When lets say 82
is called, I'll just use this LUT and update those specific boards only rather than iterating over all boards and checking each cell for 82
.
Due to the requirements of this task, we need to keep track of which numbers have been marked. Some hack could've been arranged using negative numbers and adding one to everything but I chose to keep it simple and made another associative array, bingo_cards_marked
that stores whether the number was marked or not.
# associative array
# key: [card index, row, col]
declare -A bingo_cards
declare -A bingo_cards_marked
function read_input {
# read the list of random numbers
IFS=, read -a numbers < <( head -1 "$INP" )
local bingo_idx=0 line_idx=0 line bingo_row col key
while read line; do
# input next bingo card after 5 lines
if (( line_idx == 5 )); then
line_idx=0
(( bingo_idx++ ))
fi
read -a bingo_row <<< "$line"
for col in $( seq 0 $(( BINGO_COLS - 1 )) ); do
key="$bingo_idx,$line_idx,$col"
bingo_cards["$key"]=${bingo_row[$col]}
# all numbers are unmarked in the beginning
bingo_cards_marked["$key"]=0
# store indices separated by space
num2idx[${bingo_cards["$key"]}]="$( echo ${num2idx[${bingo_cards["$key"]}]} $key )"
done
(( line_idx++ ))
# this skips empty lines
# vvvvvvvvvvvvvvvvvvvv
done < <(grep -v '^$' "$INP" | sed -n '2,$p')
# ^^^^^^^^^^^^^
# since first line has random number list
}
Checking for bingo
This was another piece of the puzzle which is going to be used repeatedly so the logic best resides inside a function. Given an index of a bingo card as argument, this goes over the rows and columns and checks whether atleast a single cell was unmarked. If nothing was unmarked, it returns 0 otherwise, it returns 1.
In bash, the return
keyword in functions isn't what you expect it to be. As seen in the earlier problems, to really "return" from a function, we print it. In bash, the return keyword does two things:
- stop execution of the function: even if you're inside a loop, you'll get out of the function
- set the status code: a status code of 0 means success and anything else means failure.
if
andtest
use this definition
This SO answer explains this in more detail.
function bingo_done {
# receives an index of bingo to check from bingo card list
local bingo_dimen found_invalid_row found_invalid_col row_wise_key col_wise_key i j
bingo_dimen=$BINGO_ROWS
for i in $( seq 0 $(( bingo_dimen - 1 )) ); do
found_invalid_row=0
found_invalid_col=0
for j in $( seq 0 $(( bingo_dimen - 1 )) ); do
row_wise_key="$1,$i,$j"
col_wise_key="$1,$j,$i"
(( bingo_cards_marked["$row_wise_key"] == 0 )) && found_invalid_row=1
(( bingo_cards_marked["$col_wise_key"] == 0 )) && found_invalid_col=1
done
# either didnt find invalid row or didnt find invalid col
(( found_invalid_row * found_invalid_col == 0 )) && return 0
done
return 1
}
I also need to keep track of which bingo cards have won. The reason is twofold:
- We don't mark new numbers on a card that has already won
- It is needed in the way I'm solving this problem, to make sure that in every iteration I don't double count a card.
For this and for the solution to the problem itself, I'm using an array that stores card_number,score
pairs in the order the cards won.
# list of bingo cards that were finished
# in order and their score
declare -a win_order_and_score
function bingo_win_recorded {
# checks whether the card had already won
local idx score winner
for winner in "${win_order_and_score[@]}"; do
IFS=, read idx score <<< "$winner"
(( idx == "$1" )) && return 0
done
# $? will be 1, if statements will fail etc
return 1
}
Now, we come to the actual bingo game. Since we're storing all the indices for each number, we just need to iterate over them, mark the number and check whether its a bingo in the subset of all cards.
Going into a little more depth in my solution here, num2idx
stores a space separated list of indices. For example if the number 82
was present in cards 4, 9 and 13 at the top left, top right and bottom left corners respectively, num2idx[82]
would look like 3,0,0 8,0,4 12,4,0
. Note the space between comma separated 3-tuples. The -1 is due to 0-indexing.
We've already seen that we can use cut
to separate out the individual items but here, we'll have to change the delimiter to a newline first. Instead, we use awk
. Now awk
, like cut
also assumes that the different "rows" are in individual lines but we can change that using RS
(record separator).
So, to extract just the card index, we can do something like:
awk -vRS=' ' -F, '{ print $1 }' <<< "3,0,0 8,0,4 12,4,0"
# Output:
3
8
12
Basics aside, this is how I go through the cards for which the number was called just now
function check_bingo_list {
# receives an array of $bingo_idx,$line_idx,$col (basically whatever is stored in num2idx)
bingo_idx_to_check=($( awk -vRS=' ' -F, '{ print $1 }' <<< "$1" ))
bingo_found=()
for idx in "${bingo_idx_to_check[@]}"; do
# we dont want to add same card again
# vvvvvvvvvvvvvvvvvvvvvvvvv
bingo_done "$idx" && ! bingo_win_recorded $idx && bingo_found+=($idx)
done
echo "${bingo_found[@]}"
}
With all the pieces in place, we can now simulate the bingo game:
function simulate_game {
read_input
# loop over the list of random numbers
for n in "${numbers[@]}"; do
# update all bingo card values using `num2idx`
for idx in ${num2idx[$n]}; do
IFS=, read bingo_card _unused _unused <<< "$idx"
# dont mark new numbers if card has already won
bingo_win_recorded $bingo_card || bingo_cards_marked["$idx"]=1
done
# check if new bingo
# check_bingo_list already filters for old winners
bingo_found_idxs=($( check_bingo_list "${num2idx[$n]}" ))
# calculate and store score
for idx in "${bingo_found_idxs[@]}"; do
sum=0
for row in $( seq 0 $(( BINGO_ROWS - 1 )) ); do
for col in $( seq 0 $(( BINGO_COLS - 1 )) ); do
key="$idx,$row,$col"
(( bingo_cards_marked["$key"] == 0 )) && (( sum += bingo_cards["$key"] ))
done
done
win_order_and_score+=("$idx,$(( n*sum ))")
done
done
}
Now, we have the order in which boards won and their score stored in win_order_and_score
. To solve parts 1 and 2, we just need to check the first and last element after calling the function:
simulate_game
echo "${win_order_and_score[0]}"
echo "${win_order_and_score[-1]}"
# Output:
88,10374
68,24742
While we can ignore the bingo card index, I want to see just the answers, like for other days. We can use parameter substitution:
${var#Pattern}, ${var##Pattern}${var#Pattern}
Remove from $var the shortest part of $Pattern that matches the front end of $var.${var##Pattern}
Remove from $var the longest part of $Pattern that matches the front end of $var.
simulate_game
echo "${win_order_and_score[0]##*,}"
echo "${win_order_and_score[-1]##*,}"
# Output:
10374
24742
Day 05
Part 1
The 2D array part should be clear, thanks to day 4's problem.
Since we can deal with a single delimiter using read
and IFS
, its better to convert the ` -> ` to ,
. While reading the input, we'll also figure out the total number of rows and cols.
N_ROW=0
N_COL=0
declare -a input
declare -A grid
function read_input {
# input is comma separated now
input=($( sed 's/ -> /,/' "$INP" ))
local line x1 y1 x2 y2
for line in "${input[@]}"; do
IFS=, read x1 y1 x2 y2 <<< "$line"
(( x1 + 1 > N_COL )) && N_COL=$(( x1 + 1 ))
(( x2 + 1 > N_COL )) && N_COL=$(( x2 + 1 ))
(( y1 + 1 > N_ROW )) && N_ROW=$(( y1 + 1 ))
(( y2 + 1 > N_ROW )) && N_ROW=$(( y2 + 1 ))
done
}
A helper function to initialize the associative array (sets all values to 0):
function init_grid {
local i j key
for (( i=0; i < N_ROW; i++ )); do
for (( j=0; j < N_COL; j++ )); do
key="$i,$j"
grid["$key"]=0
done
done
}
For looping through the coordinates, we can use seq
. seq
takes an "increment" parameter that we can set to -1
for the descending values.
seq 5 -1 2
# Output:
5
4
3
2
Rest of part 1 is simple:
function straight_lines {
local i j key line x1 y1 x2 y2 increment x y n_dangerous
for line in "${input[@]}"; do
IFS=, read x1 y1 x2 y2 <<< "$line"
# y doesn't change
# seq on x
if (( y1 == y2 )); then
y=$y1
(( x1 > x2 )) && increment=-1 || increment=1
for x in $( seq "$x1" "$increment" "$x2" ); do
key="$y,$x"
grid["$key"]=$(( grid["$key"] + 1 ))
done
# x doesn't change
# seq on y
elif (( x1 == x2 )); then
x=$x1
(( y1 > y2 )) && increment=-1 || increment=1
for y in $( seq "$y1" "$increment" "$y2" ); do
key="$y,$x"
grid["$key"]=$(( grid["$key"] + 1 ))
done
fi
done
n_dangerous=0
for (( i=0; i < N_ROW; i++ )); do
for (( j=0; j < N_COL; j++ )); do
key="$i,$j"
(( grid["$key"] > 1 )) && (( n_dangerous++ ))
done
done
echo "$n_dangerous"
}
Part 2
Since the second part asks for all the lines (both straight and diagonal), we can use the same grid after part 1 and just add the diagonal values to it.
Since both x
and y
are changing now, we'll need two seq
s. Their outputs can be merged using paste
.
function diagonal_lines {
local i j key line x1 y1 x2 y2 x_inc y_inc x y n_dangerous
for line in "${input[@]}"; do
IFS=, read x1 y1 x2 y2 <<< "$line"
(( x1 > x2 )) && x_inc=-1 || x_inc=1
(( y1 > y2 )) && y_inc=-1 || y_inc=1
# skip straight lines
(( x1 == x2 )) && continue
(( y1 == y2 )) && continue
while read coords; do
IFS=, read x y <<< "$coords"
key="$y,$x"
grid["$key"]=$(( grid["$key"] + 1 ))
done <<< "$( paste -d, <( seq "$x1" "$x_inc" "$x2" ) <( seq "$y1" "$y_inc" "$y2" ) )"
done
n_dangerous=0
for (( i=0; i < N_ROW; i++ )); do
for (( j=0; j < N_COL; j++ )); do
key="$i,$j"
(( grid["$key"] > 1 )) && (( n_dangerous++ ))
done
done
echo "$n_dangerous"
}
My solutions, for both bash
and very little rust
can be found here.