GA 2.8: Ice Ice Baby, Part 1

CEGM1000 MUDE: Week 2.8, January 17, 2025.

In this notebook our goal is to get a handle on the uncertainty associated with when the ice breaks each year. We will describe the breakup time with two quantities and, because they are roughly independent, consider them separately:

the day on which the ice breaks
the minute of the day on which the ice breaks

Note that we will use April 1 as a reference day, such that minute 0 (in Python indexing) corresponds to 00:00 on April 1.

Note also the following historical information about the ice classic (relative to the reference provided above):

The $\mu$ and $\sigma$ of the day are 33 and 6.5
The $\mu$ and $\sigma$ of the minute in the day are 863 and 190

Maybe it's also handy to recall that there are 1440 minutes in a day. Running the following code snippet will convince you that 863 corresponds to the time 14:23.

print(f"The hour is: {863 // 60}")
print(f"The minute is: {863 % 60}")
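The same arithmetic extends to the April 1 reference. The sketch below is only an illustration: the helper names `to_datetime` and `to_absolute_minute` are hypothetical (they are not part of `tools.py`), and the year is an arbitrary choice made so that `datetime` has something to work with.

```python
from datetime import datetime, timedelta

# Assumption: the year is illustrative only; the notebook fixes April 1 as the reference day.
REFERENCE = datetime(2025, 4, 1)

def to_datetime(day, minute_of_day):
    """Convert a 0-based (day, minute of day) pair to a calendar time."""
    return REFERENCE + timedelta(days=day, minutes=minute_of_day)

def to_absolute_minute(day, minute_of_day):
    """Minutes elapsed since 00:00 on April 1 (1440 minutes per day)."""
    return day * 1440 + minute_of_day

print(to_datetime(0, 0))          # the reference itself: April 1 at 00:00
print(to_datetime(0, 863))        # April 1 at 14:23
print(to_absolute_minute(1, 0))   # first minute of the following day
```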

Great, let's get started! The first task is to get a handle on the probability associated with picking the right minute in the Ice Classic.

In [1]:

%load_ext autoreload
%autoreload 2
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as stats
from tools import *

Part 1: Probability of Winning

To get started, we should get an order of magnitude estimate of the probability of winning the Ice Classic. This is the probability of guessing the exact minute of the breakup. First we will calculate the probability of guessing the exact day of the breakup, and then we will calculate the probability of guessing the exact "minute of the day" of the breakup.

Note that the historic information of previous breakups can easily be seen on the Nenana Ice Classic website brochure. In particular, you can see the breakup primarily occurs in April and May, and that the time is generally in the early afternoon.

Task 1.1:

Using the information provided above, compute the probability that the exact day of the breakup will be the historical average. Assume that the distribution is Gaussian, and don't forget that the distribution is continuous and that you are interested in the entire day being correct.
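As a reminder of why continuity matters: a continuous distribution assigns zero probability to any single point, so the probability of a unit-wide bin comes from a difference of CDF values. The sketch below uses made-up parameters, not the Ice Classic statistics, and the standard library's `NormalDist` so that it runs on its own; `stats.norm(loc=mu, scale=sigma).cdf` from the `scipy.stats` import above can be used in the same way. Which bin edges represent "a day" is a modelling choice that is left to you.

```python
from statistics import NormalDist

# Hypothetical parameters for illustration only (not the breakup statistics).
mu, sigma = 10.0, 2.0
dist = NormalDist(mu=mu, sigma=sigma)

# P[X = 10] is zero for a continuous variable, so integrate the density over a bin.
# Assumption: two possible unit-wide bins are shown; neither is claimed to be "the" right one.
p_bin_from_mean = dist.cdf(11.0) - dist.cdf(10.0)   # bin starting at the mean
p_bin_centered = dist.cdf(10.5) - dist.cdf(9.5)     # bin centered on the mean

print(f"P[10 <= X < 11]    = {p_bin_from_mean:.4f}")
print(f"P[9.5 <= X < 10.5] = {p_bin_centered:.4f}")
```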

In [ ]:

YOUR_CODE_HERE
p_day = YOUR_CODE_HERE
print(f"P[day = historical average] = {p_day:.4f}")

Task 1.2:

Using the information provided above, compute the probability that the exact minute of the breakup will be the historical average. Assume that the distribution is Gaussian.

In [ ]:

YOUR_CODE_HERE
p_min = YOUR_CODE_HERE
print(f"P[min = historical average] = {p_min:.4f}")

Task 1.3:

Compute the probability that the actual breakup time is the historical average.

Hint: assume the day and minute are independent and if you think hard about Week 1.8 you might remember that you can probably make your life easier by assuming the joint probability is the union or intersection...
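To recall what independence buys you, here is a toy check that has nothing to do with the ice data: for two fair dice, the probability of the intersection of two events that each concern a different die equals the product of their individual probabilities. The `prob` helper is a hypothetical convenience function written for this illustration.

```python
from fractions import Fraction
from itertools import product

# Toy sample space: all 36 equally likely outcomes of two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Exact probability of an event, given equally likely outcomes."""
    favourable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favourable, len(outcomes))

p_a = prob(lambda o: o[0] == 3)                      # first die shows 3
p_b = prob(lambda o: o[1] == 5)                      # second die shows 5
p_a_and_b = prob(lambda o: o[0] == 3 and o[1] == 5)  # intersection of both events

print(f"P[A] = {p_a}, P[B] = {p_b}, P[A and B] = {p_a_and_b}")
print(f"P[A and B] == P[A] * P[B]: {p_a_and_b == p_a * p_b}")
```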

In [ ]:

p_day_and_min = YOUR_CODE_HERE
print(f"P[day and min] = {p_day_and_min:.4f}")

You probably calculated a number in the previous task, and it's probably not very large. And now remember that this is the mode of the distribution! Let's compute a few other probabilities to get a sense for how much less likely it is to guess the exact minute of the breakup when it is not the historical average.

Task 1.4:

Choose a minute that is less likely to occur (for example, April 16 at 3:33 AM) and evaluate the probabilities, comparing them to the mode.

In [ ]:

print(f"P[day = historical average] = {p_day:.2e}")
print(f"P[min = historical average] = {p_min:.2e}")
print(f"P[day and min] = {p_day_and_min:.2e}")

unlikely_day = YOUR_CODE_HERE
unlikely_min = YOUR_CODE_HERE
p_unlikely_day = YOUR_CODE_HERE
p_unlikely_min = YOUR_CODE_HERE
p_unlikely_day_and_min = YOUR_CODE_HERE

print(f"P[unlikely day] = {p_unlikely_day:.2e}")
print(f"P[unlikely min] = {p_unlikely_min:.2e}")
print(f"P[unlikely day and min] = {p_unlikely_day_and_min:.2e}")

Task 1.5: Reflect on the probabilities calculated above. Note in particular the order of magnitude of the probabilities and how they compare to each other.

Your answer here.

Part 2: More than One Ticket

We've established that a single ticket is a long shot. The probability of guessing the winning minute is tiny. But what if we could make multiple guesses? We will explore that here, but first let's introduce ourselves to a tool that can help us choose tickets easily. Remember that each ticket costs $3.

The file tools.py contains some code that will help us easily choose different combinations of tickets and evaluate the probability that they are winners. The code is not very well documented, so don't bother reading it. Instead, we will introduce it below with a few examples, in particular the classes:

Minutes: a class that makes it easy to convert between minutes in a variety of formats
Tickets: a class that makes it easy to choose tickets in a variety of ways
Models: a class that helps with calculations of probability, etc, including plots

Task 2.1:

Run the code cell below to see how you can define a set of tickets using a "list of lists". The first example illustrates how you can select all of the minutes in a single day, in this case, day 34, which corresponds to May 4.

In [ ]:

t = Tickets()
t.add([34])
t.status()

Task 2.2:

It turns out there is also an easy way to visualize which tickets we selected.

In [ ]:

t.show()

Task 2.3:

Now let's try specifying a set of days.

Note that this requires a list inside the list.

In [ ]:

t.add([[20, 23]])
t.status()
t.show()

Task 2.4:

What if we want to include a subset of hours, rather than the entire day? Here we can add May 7, 14:00-16:00 to our existing ticket set.

In [ ]:

t.add([[37], [13, 16]])
t.status()
t.show()

Task 2.5:

It is also possible to add the same hours as before for multiple days.

In [ ]:

t.add([[3, 8], [13, 16]])
t.status()
t.show()

Task 2.6:

Yup, we can do it all with minutes too! Can you predict what the following code will do?

In [ ]:

t.add([[5, 15], [3, 6], [15, 45]])
t.status()
t.show()

Task 2.7:

Finally, note that non-consecutive lists of day, hour and minute can also be used to add interesting patterns.

In [ ]:

t.add([[38, 40, 42, 49, 56],
       [3, 6, 9, 12, 15],
       [15, 16, 17, 18, 19, 20, 30, 40]])
t.status()
t.show()

Part 3: Probability

Specifying interesting combinations of tickets is nice, but what we are really interested in is...winning!

And to do that, we should have a firm understanding of the probability associated with our bets. The Models class allows us to calculate probability in a straightforward way. First we have to initialize it.

Task 3.1:

Run the cell below, but note that it may take a minute or two to run! This is because it is calculating the probability of winning for every possible ticket given the historical average of breakup and saving that information in a pickle file.

While you are waiting, see if you can guess which ticket is being selected and the order of magnitude of the probability that will be calculated.

In [ ]:

m = Models(model_id=2)
t_test = Tickets()
t_test.add([[25], [13], [0]])
p_ticket = m.get_p(t_test.tickets)
print(f"The probability of the ticket is {p_ticket:.3e}")
t_test.status()
t_test.show()

Task 3.2:

Returning to our previous (arbitrary) selection of tickets, we can calculate and visualize the probability of each ticket.

In [ ]:

m = Models()
prob_T = np.zeros((len(t.tickets)))
for i, ti in enumerate(t.tickets):
    prob_T[i] = m.get_p([ti])

prob_T_matrix = m.map_data_to_day_min(prob_T, t.tickets)
m.plot(prob_T_matrix,
       custom_title="Probability of ticket",
       custom_label="Probability of ticket")

Task 3.3:

Using the array of probabilities found above, compute the probability of any of the tickets being a winner.

Note: you actually have to do something in this task!
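Before combining the probabilities, it helps to ask how the events relate to each other. The sketch below uses a made-up array rather than the output of `Models`, and contrasts two textbook cases: mutually exclusive events, whose union probability is a plain sum, and independent events, whose union probability is one minus the product of the complements. Deciding which case describes a set of tickets is part of the task.

```python
import numpy as np

# Hypothetical per-event probabilities for illustration (not values from Models.get_p).
p = np.array([1e-4, 2e-4, 5e-5])

# Case 1: mutually exclusive events (at most one can happen), so P[union] is the sum.
p_any_exclusive = p.sum()

# Case 2: independent events (several can happen at once), so P[union] = 1 - prod(1 - p_i).
p_any_independent = 1 - np.prod(1 - p)

print(f"Mutually exclusive: {p_any_exclusive:.3e}")
print(f"Independent:        {p_any_independent:.3e}")
```

For probabilities this small the two results are nearly identical, because the overlap terms are products of already tiny numbers.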

In [ ]:

prob_any_T_wins = YOUR_CODE_HERE
print(f"Prob of any ticket winning: {prob_any_T_wins:.3e}")

Task 3.4:

To finish off the analysis in this notebook, check out the following visualization of the probability associated with every possible ticket. This information will be useful when you choose various ticket options later.

In [ ]:

m.plot(0)

End of notebook.
