Introduction

I’m planning a short trip to visit family in Florida and thought I’d
take advantage of being in a new place to do some late winter
backpacking where it’s warmer than in Fairbanks. I think I’ve settled on
a 3‒5 day backpacking trip in Big South Fork National River and
Recreation Area, which is in northeastern Tennessee and southeastern
Kentucky.

Except for a couple of summer trips in New England in the 80s, my backpacking
experience has been in summer, in places where it doesn’t rain much and is
typically hot and dry (California, Oregon). So I’d like to find out what the
weather should be like when I’m there.

Data

I’ll use the Global Historical Climatology Network Daily (GHCN-Daily) dataset,
which contains daily weather observations for more than 100 thousand
stations across the globe. There are more than 26 thousand active
stations in the United States, and data for some U.S. stations goes back
to 1836. I loaded the entire dataset—2.4 billion records as of last
week—into a PostgreSQL database, partitioning the data by year. I’m
interested in daily minimum and maximum temperature (TMIN,
TMAX), precipitation (PRCP) and snowfall (SNOW), and in
stations within 50 miles of the center of the recreation area.
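The 50-mile station filter can be sketched with a great-circle distance calculation. This is a minimal Python illustration, not the query used for the post: the center point below is hypothetical (the post does not give the coordinates it used), `FAR_AWAY` is a made-up station, and the helper names are mine.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def stations_within(stations, center, max_miles=50.0):
    """Return (station_id, miles) pairs within max_miles, nearest first."""
    lat0, lon0 = center
    hits = []
    for station_id, lat, lon in stations:
        miles = haversine_miles(lat0, lon0, lat, lon)
        if miles <= max_miles:
            hits.append((station_id, round(miles, 2)))
    return sorted(hits, key=lambda pair: pair[1])


# HYPOTHETICAL center of the recreation area, for illustration only.
center = (36.5, -84.7)
stations = [
    ("USC00407141", 36.5514, -84.7967),  # PICKETT SP
    ("USC00406829", 36.5028, -84.5308),  # ONEIDA
    ("FAR_AWAY", 40.0, -80.0),           # made-up station outside the radius
]
nearby = stations_within(stations, center)
```

Because the center is a guess, the distances this produces will not match the `miles` column in the station table exactly.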

The following map shows the recreation area boundary (with some strange
drawing errors, probably due to using the fortify command) in green,
the Tennessee/Kentucky border across the middle of the plot, and the
19 stations used in the analysis.

Here are the details on the stations:

| station_id | station_name | start_year | end_year | latitude | longitude | miles |
|---|---|---|---|---|---|---|
| USC00407141 | PICKETT SP | 2000 | 2017 | 36.5514 | -84.7967 | 6.13 |
| USC00406829 | ONEIDA | 1959 | 2017 | 36.5028 | -84.5308 | 9.51 |
| USC00400081 | ALLARDT | 1928 | 2017 | 36.3806 | -84.8744 | 12.99 |
| USC00404590 | JAMESTOWN | 2003 | 2017 | 36.4258 | -84.9419 | 14.52 |
| USC00157677 | STEARNS 2S | 1936 | 2017 | 36.6736 | -84.4792 | 16.90 |
| USC00401310 | BYRDSTOWN | 1998 | 2017 | 36.5803 | -85.1256 | 24.16 |
| USC00406493 | NEWCOMB | 1999 | 2017 | 36.5517 | -84.1728 | 29.61 |
| USC00158711 | WILLIAMSBURG 1NW | 2011 | 2017 | 36.7458 | -84.1753 | 33.60 |
| USC00405332 | LIVINGSTON RADIO WLIV | 1961 | 2017 | 36.3775 | -85.3364 | 36.52 |
| USC00154208 | JAMESTOWN WWTP | 1971 | 2017 | 37.0056 | -85.0617 | 39.82 |
| USC00406170 | MONTEREY | 1904 | 2017 | 36.1483 | -85.2650 | 40.04 |
| USC00406619 | NORRIS | 1936 | 2017 | 36.2131 | -84.0603 | 41.13 |
| USC00402202 | CROSSVILLE ED & RESEARCH | 1912 | 2017 | 36.0147 | -85.1314 | 41.61 |
| USW00053868 | OAK RIDGE ASOS | 1999 | 2017 | 36.0236 | -84.2375 | 42.24 |
| USC00401561 | CELINA | 1948 | 2017 | 36.5408 | -85.4597 | 42.31 |
| USC00157510 | SOMERSET 2 N | 1950 | 2017 | 37.1167 | -84.6167 | 42.36 |
| USW00003841 | OAK RIDGE ATDD | 1948 | 2017 | 36.0028 | -84.2486 | 43.02 |
| USW00003847 | CROSSVILLE MEM AP | 1954 | 2017 | 35.9508 | -85.0814 | 43.87 |
| USC00404871 | KINGSTON | 2000 | 2017 | 35.8575 | -84.5278 | 45.86 |

To perform the analysis, I collected all valid observations for the stations
listed, then kept only those observations where the day of the year was
between 45 and 52 (February 14‒21).
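As a rough illustration of that date filter, the Python sketch below keeps only rows whose day of the year falls in the 45 to 52 window. The observation rows and their values are made up, and the function name is mine.

```python
from datetime import date


def in_trip_window(obs_date, first_doy=45, last_doy=52):
    """True when the date's day of the year is within the window (inclusive)."""
    doy = obs_date.timetuple().tm_yday
    return first_doy <= doy <= last_doy


# Made-up rows: (date, variable, value)
observations = [
    (date(1999, 2, 13), "TMIN", 25.0),
    (date(1999, 2, 14), "TMIN", 30.0),
    (date(2000, 2, 21), "TMAX", 55.0),
    (date(2000, 2, 22), "TMAX", 48.0),
]
kept = [row for row in observations if in_trip_window(row[0])]
```

Since the window ends before February 29, leap years do not shift which calendar dates are selected.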

| variable | observations |
|---|---|
| PRCP | 5,942 |
| SNOW | 5,091 |
| TMAX | 4,900 |
| TMIN | 4,846 |

Results

Temperature

We will consider temperature first. The following two plots show the
distribution of daily minimum and maximum temperatures. In both plots,
the bars represent the number of observations at that temperature, the
vertical red line through the middle of the plot shows the average
temperature, and the light orange and blue sections show the ranges of
temperatures enclosing 80% and 98% of the data.

The minimum daily temperature figure shows that the average minimum
temperature is below freezing (28.9 °F), and that 80% of the daily minimums
in the third week of February were between 15 and 43 °F (the light
orange region). The minimum temperature was colder than 15 °F or warmer
than 54 °F 2% of the time (the light blue region). The maximum daily
temperature averaged 51 °F, and was rarely below freezing
or above 72 °F.
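The summary values behind plots like these can be sketched as follows. This assumes the 80% and 98% bands are central percentile ranges (10th to 90th and 1st to 99th), which the post does not state explicitly. The function name is mine.

```python
from statistics import mean, quantiles


def summarize(temps):
    """Mean plus the central 80% and 98% ranges of a list of temperatures."""
    # quantiles(..., n=100) returns the 99 cut points for percentiles 1..99.
    cuts = quantiles(temps, n=100, method="inclusive")
    return {
        "mean": mean(temps),
        "range_80": (cuts[9], cuts[89]),   # 10th to 90th percentile
        "range_98": (cuts[0], cuts[98]),   # 1st to 99th percentile
    }


# Synthetic data, only to show the shape of the result.
summary = summarize(list(range(101)))
```

With real data, `temps` would be the TMIN or TMAX values for the selected stations and dates.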

Another way to look at this sort of data is to count particular occurrences and
divide by the total, “binning” the data into groups. Here we look at the number
of days that were below freezing, colder than 20 °F or colder than 10 °F.

| temperature | observed days | percent chance |
|---|---|---|
| below freezing | 3,006 | 62.0 |
| colder than 20 | 1,079 | 22.3 |
| colder than 10 | 203 | 4.2 |
| TOTAL | 4,846 | 100.0 |
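The binning is a count divided by the total. The short Python sketch below reproduces the percentages from the counts in the table above; the helper names are mine, and the temperature list passed to `count_below` is made up.

```python
def count_below(values, threshold):
    """Number of values strictly below the threshold."""
    return sum(1 for v in values if v < threshold)


def percent_chance(count, total):
    """Share of observations in a bin, as a percentage to one decimal."""
    return round(100.0 * count / total, 1)


# Counts taken from the minimum temperature table.
tmin_total = 4846
tmin_counts = {
    "below freezing": 3006,
    "colder than 20": 1079,
    "colder than 10": 203,
}
tmin_percent = {
    label: percent_chance(n, tmin_total) for label, n in tmin_counts.items()
}

# Counting directly from raw values (made-up minimums in °F).
sample_freezing_days = count_below([35.0, 31.9, 19.0, 9.0], 32.0)
```

Note that the bins overlap: a day colder than 10 °F is also counted as colder than 20 °F and below freezing, so the rows are not meant to sum to the total.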

What about the daily maximum temperature?

| temperature | observed days | percent chance |
|---|---|---|
| colder than 20 | 22 | 0.4 |
| below freezing | 371 | 7.6 |
| below 40 | 1,151 | 23.5 |
| above 50 | 2,569 | 52.4 |
| above 60 | 1,157 | 23.6 |
| above 70 | 80 | 1.6 |
| TOTAL | 4,900 | 100.0 |

The chances of it being below freezing during the day are pretty slim,
and more than half the time it’s warmer than 50 °F, so even if it’s cold
at night, I should be able to get plenty warm hiking during the day.

Precipitation

How often it rains, and how much falls when it does, are also important
for planning a successful backpacking trip. Most of my backpacking has
been done in the summer in California, where rainfall is rare and even
when it does rain, it’s typically over quickly. Daily weather data can’t
tell us about the hourly pattern of rainfall, but we can find out how
often and how much it has rained in the past.

| rainfall amount | observed days | percent chance |
|---|---|---|
| raining | 2,375 | 40.0 |
| tenth | 1,610 | 27.1 |
| quarter | 1,136 | 19.1 |
| half | 668 | 11.2 |
| inch | 308 | 5.2 |
| TOTAL | 5,942 | 100.0 |

This data shows that the chance of rain on any given day between
February 14th and the 21st is 40%, and the chance of getting at least a
tenth of an inch is about 27%. That’s certainly higher than in the Sierra
Nevada in July, although by August, afternoon thunderstorms are more
common in the mountains.

When there is precipitation, the distribution of precipitation totals
looks like this:

| cumulative frequency | precipitation (inches) |
|---|---|
| 1% | 0.01 |
| 5% | 0.02 |
| 10% | 0.02 |
| 25% | 0.07 |
| 50% | 0.22 |
| 75% | 0.59 |
| 90% | 1.18 |
| 95% | 1.71 |
| 99% | 2.56 |

These numbers are cumulative, which means that on 1 percent of the days
with precipitation, there was a hundredth of an inch of liquid
precipitation or less. Ten percent of the days had 0.02 inches or
less. And 50 percent of rainy days had 0.22 inches of liquid
precipitation or less. Reading the numbers from the top of the
distribution, there was more than an inch of rain on 10 percent of the days
on which it rained, which is a little disturbing.
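One way to read a cumulative table like this is as the share of rainy days at or below a given amount. The Python sketch below computes that share from daily totals. The sample amounts are made up and the function name is mine.

```python
def share_at_or_below(amounts, limit):
    """Percent of wet days (amount > 0) with precipitation at or below limit."""
    wet = [a for a in amounts if a > 0]
    if not wet:
        return 0.0
    return 100.0 * sum(1 for a in wet if a <= limit) / len(wet)


# Made-up daily totals in inches; zeros are dry days and are ignored.
daily = [0.0, 0.0, 0.01, 0.05, 0.22, 0.6, 1.2]
at_most_022 = share_at_or_below(daily, 0.22)
more_than_inch = 100.0 - share_at_or_below(daily, 1.0)
```

The complement, as in `more_than_inch`, gives the "reading from the top" view used for the one-inch figure.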

One final question about precipitation is how long it rains once it
starts raining. Do we get little showers here and there, or are there
large storms that dump rain for days without a break? To answer this
question, I counted the number of days between zero-rainfall days, which
is equal to the number of consecutive days where it rained.

| consecutive days | percent chance |
|---|---|
| 1 | 53.0 |
| 2 | 24.4 |
| 3 | 11.9 |
| 4 | 7.5 |
| 5 | 2.2 |
| 6 | 0.9 |
| 7 | 0.1 |
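Counting consecutive rainy days amounts to measuring run lengths between zero-rainfall days. The Python sketch below works on a single station's daily series in date order. It assumes no missing days and does not handle spells cut off at the edges of the date window, neither of which the post describes. The sample series is made up.

```python
from collections import Counter


def wet_spell_lengths(daily_precip):
    """Lengths of each run of consecutive days with precipitation > 0."""
    spells = []
    run = 0
    for amount in daily_precip:
        if amount > 0:
            run += 1
        elif run:
            spells.append(run)  # a dry day ends the current spell
            run = 0
    if run:
        spells.append(run)      # series ended while it was still raining
    return spells


def spell_percentages(spells):
    """Percent of spells at each length, to one decimal."""
    counts = Counter(spells)
    total = len(spells)
    return {k: round(100.0 * n / total, 1) for k, n in sorted(counts.items())}


# Made-up daily totals in inches for one station.
series = [0, 0.1, 0, 0.3, 0.2, 0, 0, 0.5, 0.1, 0.4, 0, 0.2]
spells = wet_spell_lengths(series)
percentages = spell_percentages(spells)
```

With several stations, each station's series would be processed separately and the spell lists combined before computing percentages.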

The results show that more than half of the rainy spells (53.0%) lasted a
single day before a day without rain. Spells of exactly three days made up
11.9% of the total, and spells of three days or longer made up 22.6%, so
rain on every day of a three-day trip is possible but not the usual pattern.

Snowfall

Repeating the precipitation analysis with snowfall:

| snowfall amount | observed days | percent chance |
|---|---|---|
| snowing | 322 | 6.3 |
| inch | 148 | 2.9 |
| two | 115 | 2.3 |
| TOTAL | 5,091 | 100.0 |

Snowfall isn’t common on these dates, but it did happen, so I will need to be
prepared for it. Also, the PRCP variable includes melted snow, so a small
portion of the precipitation from the previous section overlaps with the
snowfall shown here.

Conclusion

Based on this analysis, a 3‒5 day backpacking trip to the Big South Fork
National River and Recreation Area seems well within my abilities and my
gear. It will probably be below freezing at night (62% of the observed
minimums were), but isn’t likely to be much below 20 °F. Snowfall is
uncommon, and even though I will probably experience some rain, it
shouldn’t be too much or carry on for the entire trip.

Appendix

The R code for this analysis appears below. I’ve loaded the GHCND data
into a PostgreSQL database with observation data partitioned by year.
The database tables are structured basically as they come from the
National Centers for Environmental Information.