Last night we got a quarter of an inch of rain at our house, making roads
“impassable” according to the Fairbanks Police Department, and turning the dog
yard, deck, and driveway into an icy mess. There are videos floating around
Facebook showing Fairbanks residents playing hockey in the street in front of
their houses, and a reported seven vehicles off the road on Ballaine
Hill.

Let’s check out the weather database and take a look at how often Fairbanks
experiences this type of event, and when they usually happen. I’m going to skip
the parts of the code showing how we get pivoted daily data from the database,
but they’re in this post.

Starting with pivoted data, we want to look for dates from November through March
with more than a tenth of an inch of precipitation, snowfall of less than two
tenths of an inch, and a daily high temperature above 20°F. Then we group by the
winter year and month, and aggregate the rain events into a single event. These
occurrences are rare enough that this aggregation shouldn’t combine events from
different parts of the month.
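I’ll skip the database code here too, but as a rough illustration, here’s the
shape of that filter and aggregation sketched in Python with pandas. The file
name, the column names (dte, prcp_in, snow_in, tmax_f), and the winter year
definition are assumptions, not the actual code:

    # Sketch of the rain event filter and aggregation; the file name and
    # column names (dte, prcp_in, snow_in, tmax_f) are assumptions.
    import pandas as pd

    daily = pd.read_csv('fai_daily.csv', parse_dates=['dte'])

    rain = daily[
        daily.dte.dt.month.isin([11, 12, 1, 2, 3])  # November through March
        & (daily.prcp_in > 0.1)   # more than a tenth of an inch of precipitation
        & (daily.snow_in < 0.2)   # snowfall less than two tenths of an inch
        & (daily.tmax_f > 20)     # daily high above 20°F
    ].copy()

    # Label each date with its winter year: July through December belong to
    # the winter starting that year, January through June to the winter
    # that started the year before.
    rain['winter_year'] = rain.dte.dt.year.where(
        rain.dte.dt.month >= 7, rain.dte.dt.year - 1)
    rain['month'] = rain.dte.dt.month

    # Collapse multiple rainy days in the same month into a single event.
    events = rain.groupby(['winter_year', 'month']).agg(
        days=('dte', 'count'), precip_in=('prcp_in', 'sum'))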

There haven’t been any rain events in December, which is a little surprising;
after December, February rains are the least common.

I looked at this two years ago (Winter freezing rain) using slightly different criteria. At
the bottom of that post I looked at the frequency of rain events over time and
concluded that they seem to come in cycles, but that the three events so far in
this decade were a bad sign. Now we can add another rain event to the total for
the 2010s.

We’re getting some bad home Internet service from Alaska Communications, and
it’s getting worse. There are clear patterns indicating lower quality service in
the evening, and very poor download rates over the past couple days. Scroll down
to check out the plots.

Introduction

Over the past year we’ve started having trouble watching streaming video over
our Internet connection. We’re paying around $100/month for phone, long
distance, and a 4 Mbps DSL Internet connection, which is a lot of money if we’re
not getting a quality product. The connection was pretty great when we first
signed up (and frankly, it’s still better than what a lot of people in Fairbanks
can get), but over time the quality has degraded, and despite having a
technician out to take a look, it hasn’t gotten better.

Methods

In September last year I started monitoring our bandwidth, once every two hours,
using the Python speedtest-cli tool, which uses speedtest.net to get the data.

To use it, install the package:

$ pip install speedtest-cli

Then set up a cron job on your server to run this once every two hours. I
have it running on the Raspberry Pi that collects our weather data. I use this
script, which appends data to a file each time it is run. You’ll want to change
the server to whichever is closest and most reliable at your location.
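The script isn’t reproduced here, but the gist of it, with a placeholder log
path, server ID, and timestamp format, is something like:

    # Minimal speed test logger.  The log path, server ID, and timestamp
    # format are placeholders.  speedtest-cli's --simple flag prints
    # three lines: Ping, Download, and Upload.
    import subprocess
    from datetime import datetime

    LOG = '/home/pi/speedtest/speedtest.log'
    SERVER = '1234'  # find nearby server IDs with: speedtest-cli --list

    output = subprocess.check_output(
        ['speedtest-cli', '--simple', '--server', SERVER]).decode()

    # Prepend a timestamp and append the three result lines to the log.
    with open(LOG, 'a') as f:
        f.write(datetime.now().isoformat() + '\n' + output)

A crontab entry along the lines of 0 */2 * * * /usr/bin/python
/home/pi/speedtest/log_speedtest.py (paths again placeholders) runs it every
two hours.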

This isn’t a very good format for analysis, so I wrote a Python script to
process the data into a tidy data set with one row per observation and columns
containing the ping time and the download and upload rates as numbers.
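Here’s a sketch of that reshaping step, assuming the placeholder log format
from the collection script above:

    # Reshape the appended log into one CSV row per observation, assuming
    # the timestamp-plus-three-lines format from the logging sketch above.
    import csv
    import re

    with open('/home/pi/speedtest/speedtest.log') as log, \
            open('speedtest_tidy.csv', 'w', newline='') as out:
        writer = csv.writer(out)
        writer.writerow(['timestamp', 'ping_ms', 'download_mbps', 'upload_mbps'])
        record = []
        for line in log:
            match = re.match(r'(?:Ping|Download|Upload): ([\d.]+)', line)
            if match:
                record.append(float(match.group(1)))
            elif line.strip():
                record = [line.strip()]  # a timestamp line starts a new record
            if len(record) == 4:         # timestamp plus the three values
                writer.writerow(record)
                record = []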

From here, we can look at the data in R. First, let’s see how our rates change
during the day. One thing we’ve noticed is that our Internet will be fine until
around seven or eight in the evening, at which point we can no longer stream
video successfully. Hulu is particularly bad at handling a lower quality
connection.

Next we get the data and add some columns to group it appropriately for
plotting.
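Although the analysis here is done in R, the same step sketched in Python with
pandas (file and column names follow the tidy data set above) looks like this:

    # Add a two-hour time-of-day bin and a month label for grouping.
    import pandas as pd

    speeds = pd.read_csv('speedtest_tidy.csv', parse_dates=['timestamp'])
    speeds['hour'] = speeds.timestamp.dt.hour // 2 * 2  # 0, 2, ..., 22
    speeds['month'] = speeds.timestamp.dt.strftime('%Y-%m')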

Box and whisker plots (boxplots) show how data is distributed. The box
represents the range where half the data lies (from the 25th to the 75th
percentile) and the line through the box represents the median value. The
vertical lines extending above and below the box (the whiskers), show where most
of the rest of the observations are, and the dots are extreme values. The
figure above has a single boxplot for each two-hour period, and the plots are
split into month-long periods so we can see if there are any trends over time.
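The figures themselves are made with R, but a similar set of panels in Python
with matplotlib, continuing the sketch above, might look like:

    # One boxplot per two-hour period, one panel per month.
    import matplotlib.pyplot as plt

    months = sorted(speeds.month.unique())
    fig, axes = plt.subplots(len(months), 1, sharey=True, squeeze=False,
                             figsize=(8, 3 * len(months)))
    for ax, month in zip(axes[:, 0], months):
        speeds[speeds.month == month].boxplot(
            column='download_mbps', by='hour', ax=ax)
        ax.set_title(month)
        ax.set_ylabel('Download (Mbps)')
    fig.suptitle('')  # drop the automatic "grouped by" title
    fig.savefig('speedtest_boxplots.png')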

There are some clear patterns across all months: our bandwidth is pretty close
to what we’re paying for during most of the day. The boxes are all up near 4
Mbps and they’re skinny, indicating that most of the observations are close to 4
Mbps. Starting in the early evening, the boxes get larger, showing that we’re
not always getting our expected rate. The boxes are very large between eight and
ten at night, which means we’re as likely to get 2 Mbps as the 4 we pay for.

Patterns over time are also showing up. Starting in January, there’s another
drop in our bandwidth around noon and by February it’s rare that we’re getting
the full speed of our connection at any time of day.

One note: it is possible that some of the decline in our bandwidth during the
evening is because the speed test script is competing with the other things we
are doing on the Internet when we are home from work. This doesn’t explain the
drop around noon, however, and when I look at the actual Internet usage diagrams
collected from our router using SNMP / MRTG, it doesn’t appear that we are
really using enough bandwidth to explain the dramatic and consistent drops seen
in the plot above.

Since February is starting to look different from the other months, I took a
closer look at the data for that month. I’m filtering the data to just February,
and based on a look at the initial version of this plot, I added trend lines for
the periods before and after noon on the 12th of February.
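Again, the plot is made in R; the same idea in Python, splitting at noon on
February 12th (the year, 2015, and the column names are carried over from the
earlier sketches), might be:

    # February closeup: observations plus separate linear trends before
    # and after noon on February 12th.
    import numpy as np
    import matplotlib.pyplot as plt
    import pandas as pd

    feb = speeds[speeds.timestamp.dt.month == 2].copy()
    split = pd.Timestamp('2015-02-12 12:00')
    # Days since the start of the data, as a numeric x for the fit.
    feb['t'] = (feb.timestamp - feb.timestamp.min()).dt.total_seconds() / 86400

    fig, ax = plt.subplots()
    ax.plot(feb.timestamp, feb.download_mbps, '.')
    for mask in (feb.timestamp < split, feb.timestamp >= split):
        part = feb[mask]
        slope, intercept = np.polyfit(part.t, part.download_mbps, 1)
        ax.plot(part.timestamp, intercept + slope * part.t)
    ax.set_ylabel('Download (Mbps)')
    fig.savefig('february_speeds.png')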

Ouch. Throughout the month our bandwidth has been going down, but you can also
see that after noon on the 12th, we’re no longer getting 4 Mbps no matter what
time of day it is. The trend line probably isn’t statistically significant for
this period, but it’s clear that our Internet service, for which we pay a lot of
money, is getting worse and worse, now averaging less than 2 Mbps.

Conclusion

I think there’s enough evidence here that we aren’t getting what we are paying
for from our ISP. Time to contact Alaska Communications and get them to either
reduce our rates based on the poor quality of service they are providing, or
upgrade their equipment to handle the traffic on our line. I suspect they
oversold the connection and the equipment can’t handle all the users trying to
get their full bandwidth at the same time.

Whenever we’re in the middle of a cold snap, as we are right now, I’m tempted to
see how the current snap compares to those in the past. The one we’re in right
now isn’t all that bad: sixteen days in a row where the minimum temperature is
colder than −20°F. In some years, such a threshold wouldn’t even qualify as the
definition of a “cold snap,” but right now, it feels like one.

Getting the length of runs of consecutive days in a database isn’t simple. What
we’ll do is get a list of all the days where the minimum daily temperature was
warmer than −20°F. Then we go through each record and count the number of days
between the current row and the next one. Most of these gaps will be one day,
but when the gap is greater than one, there are one or more days in between the
“warm” days where the minimum temperature was colder than −20°F (or the data was
missing). There’s a code sketch of this approach below.

For example, given this set of dates and temperatures from earlier this year:

date         tmin_f
2015‑01‑02   −15
2015‑01‑03   −20
2015‑01‑04   −26
2015‑01‑05   −30
2015‑01‑06   −30
2015‑01‑07   −26
2015‑01‑08   −17

Once we select only the rows where the temperature is above −20°F, we get this:

date         tmin_f
2015‑01‑02   −15
2015‑01‑08   −17

Now we can grab the start and end of the period (January 2nd plus one day and
January 8th minus one day) and get the length of the cold snap: January 3rd
through January 7th, a five-day snap. You can see why missing data would be a
problem, since it would create a gap that isn’t necessarily due to cold
temperatures.

I couldn’t figure out how to get the time periods and check them for validity
all in one step, so I wrote a simple function that counts the days with valid
data between two dates, then used this function in the real query. Only periods
with non-null data on each day during the cold snap were included.

    CREATE FUNCTION valid_n(date, date) RETURNS bigint
        AS 'SELECT count(*)
            FROM ghcnd_pivot
            WHERE station_name = ''FAIRBANKS INTL AP''
                AND dte BETWEEN $1 AND $2
                AND tmin_c IS NOT NULL'
        LANGUAGE SQL
        RETURNS NULL ON NULL INPUT;
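For illustration, here’s the gap-finding idea from above sketched in Python
with pandas instead of SQL; the file and column names are assumptions:

    # The same gap-finding idea in pandas.  Assumed columns: dte (date)
    # and tmin_f (daily minimum temperature, °F).
    import pandas as pd

    daily = pd.read_csv('fairbanks_daily.csv', parse_dates=['dte'])

    # Days warmer than -20°F, in order; the gaps between successive rows
    # are the runs of colder (or missing) days in between.
    warm = daily.loc[daily.tmin_f > -20, 'dte'].sort_values().reset_index(drop=True)
    gap = warm.diff().dt.days

    snaps = pd.DataFrame({
        'start': warm.shift() + pd.Timedelta(days=1),  # day after a warm day
        'end': warm - pd.Timedelta(days=1),            # day before the next one
        'days': gap - 1,
    })
    # A gap of one day means consecutive warm days, not a cold snap; and
    # as in the SQL above, snaps spanning missing data need to be checked.
    snaps = snaps[snaps.days > 0].sort_values('days', ascending=False)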

There have been seven cold snaps that lasted 16 days (including the one we’re
currently in), tied for 45th place.

Keep in mind that counting days where the daily minimum is −20°F or colder is a
pretty generous definition of a cold snap. If we require the minimum
temperatures to be below −40°, the lengths are considerably shorter:

Top ten longest cold snaps (−40° or colder minimum temp)

rank   start        end          days
1      1964‑12‑25   1965‑01‑11   18
2      1973‑01‑12   1973‑01‑26   15
2      1961‑12‑16   1961‑12‑30   15
2      2008‑12‑28   2009‑01‑11   15
5      1950‑02‑04   1950‑02‑17   14
5      1989‑01‑18   1989‑01‑31   14
5      1979‑02‑03   1979‑02‑16   14
5      1947‑01‑23   1947‑02‑05   14
9      1909‑01‑14   1909‑01‑25   12
9      1942‑12‑15   1942‑12‑26   12
9      1932‑02‑18   1932‑02‑29   12
9      1935‑12‑02   1935‑12‑13   12
9      1951‑01‑14   1951‑01‑25   12

I think it’s also interesting that only three of the top ten cold snaps defined
at −20°F also appear among those using a −40° threshold.

I’ve been a bit behind on mentioning the 2015 Tournament of Books.
The contestants were announced last month. As usual, here’s the list with a
three star rating system for those I’ve read: ★ - not worthy, ★★ - good,
★★★ - great.

Thus far, my early favorite is, of course, The Bone Clocks by David Mitchell.
It’s a fantastic book, similar in design to Cloud Atlas, but better. Both All
the Light We Cannot See and Dept. of Speculation are distant runners-up.
All the Light is a great story, told in very short and easy to digest
chapters, and Speculation is a funny, heartrending, strange, and ultimately
redemptive story of marriage.

Following up on yesterday’s post about minimum temperatures, I was thinking that
a cumulative measure of cold temperatures would probably be a better measure of
how cold a winter is. We all remember the extremely cold days each winter when
the propane gels or the car won’t start, but it’s the long periods of deep cold
that really take their toll on buildings, equipment, and people in the Interior.

One way of measuring this is to find all the days in a winter year when the
average temperature is below freezing, and sum how far below freezing each of
those days was. For example, if the temperature is 50°F, that’s not below
freezing so it doesn’t count. If the temperature is −40°, that’s 72 freezing
degrees (Fahrenheit), since 32 − (−40) = 72. Do this for each day in a winter
year and add up all the values.

Here’s the code to make the plot below (see my previous post for how we got
fai_pivot).
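The plotting code itself is R, using fai_pivot; the calculation and the split
trend lines, sketched in Python with pandas (file and column names are
assumptions, as before), look like this:

    # Cumulative freezing degree days per winter year, with linear trends
    # split at 1975.  Assumed columns: dte and tavg_f (daily average, °F).
    import pandas as pd
    from scipy import stats

    daily = pd.read_csv('fairbanks_daily.csv', parse_dates=['dte'])

    # Degrees below freezing each day; days at or above 32°F contribute 0.
    daily['fdd'] = (32 - daily.tavg_f).clip(lower=0)

    # Label each winter by the year it starts in (July through June).
    daily['winter_year'] = daily.dte.dt.year.where(
        daily.dte.dt.month >= 7, daily.dte.dt.year - 1)

    fdd = daily.groupby('winter_year').fdd.sum()

    # Separate regressions before and after the 1975 break.
    for part in (fdd[fdd.index < 1975], fdd[fdd.index >= 1975]):
        fit = stats.linregress(part.index, part.values)
        print(fit.slope, fit.pvalue)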

You’ll notice I’ve split the trend lines at 1975. When I ran the regressions
for the entire period, none of them were statistically significant, but looking
at the plot, it seems like something happened in 1975 where the cumulative
freezing degree days suddenly dropped. Since then, they’ve been increasing at a
faster, and this time statistically significant, rate.

This is odd, and it makes me wonder if I’ve made a mistake in the calculations,
because what this says is that, at least since 1975, the winters are getting
colder as measured by the total number of degrees below freezing each winter.
My previous post (and studies of climate in general) show that the climate is
warming, not cooling.

One bias that’s possible with cumulative calculations like this is that missing
data becomes more important. But I looked at the same relationships when
including only years with at least 364 days of valid data (only one or two
missing days), and the same pattern exists.

Curious. When combined, this analysis and yesterday’s suggest that winters in
Fairbanks are getting colder overall, but that the minimum temperature in any
year is likely to be warmer than in the past.