The Chicago Elevated Train, or the L as it is known, is a fixture of the city, with portions of the second-busiest transit system in the United States operating
since 1892. The L sees close to a million passengers every day spread across its 145 stations and more
than 100 miles of tracks.

Using ridership data released by the City of Chicago and stored in the Axibase Dataset Catalog, estimates of future
ridership can be made. Analytics tools in ATSD can model current ridership trends to highlight patterns in transit system
usage, and Future Value models can be used to estimate future rider totals. The population of Chicago is falling, and it is
the only major United States metropolitan area to experience such a change, but unfortunately that is unlikely to mean an
extra seat opening up on your commute home. Although the L is currently in a period of stagnant growth, the City of Chicago
plans to fully update and re-brand the L to adapt it to the modern city landscape and help it regain its relevance in the
age of mass car ownership.

Four points in time are examined (2001, 2006, 2011, and 2016) to establish standards of annual ridership. Month-to-month
data is visualized to show usage trends throughout the year, and finally, future ridership predictions are made
for the years 2020 and 2025.

Open the visualization below in ChartLab to see the data in its entirety. Use the drop-down menus at
the top to navigate through time, selecting the years to be observed. Use the third drop-down menu to toggle between visualizations
of the top or bottom fifteen stations, the top or bottom fifty stations, or select the wildcard *
option to see all 145.

To establish annual ridership averages, passenger totals for the fifteen busiest stations are used. Although these stations
make up just 10% of the total stations of the L, the passenger traffic between them accounts for a third of the total annual
traffic. This is a dynamic group of stations, meaning that the busiest stations in 2001 are not necessarily the busiest
stations in 2016. However, the purpose of such partitioning is merely to sample a relevant section of the whole, not to
comment on the features of a specific station.

While it is certainly possible that L ridership has peaked and will begin to decline in the future, three such
periods of stagnation are observable over the sixteen-year observation period. This leads to the conclusion that occasional
decreases in annual ridership have always occurred and are not necessarily indicative of true decline.

Similar increases and decreases in L ridership can be seen when the scale of the observation period is reduced to one year.
Drag the window to the left or right to show rider numbers for any given month in the legend at the top of the screen.

Here, February is the only month in which ridership among the top fifteen stations fell below four million people,
whereas in 2001 four million riders among the top fifteen stations would have ranked as one of the busiest months of the
year. This shows the overall growth of L ridership.

Because of the nature of L ridership, and the patterns observed in the visualizations above, using data from the previous
year as a baseline for performance during the current year is reasonable. Ridership increases consistently when the data is
aggregated in five-year intervals. Despite this, there still exist occasional periods of stagnation that must be considered,
which is why a second baseline must be established. Using an Average Value Baseline makes sense given the fluctuations in
passenger totals, because such a calculation accounts for both positive and negative changes in ridership and combines them
when calculating future year averages.
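The two baselines described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculation performed in ATSD: the function names are hypothetical, and the annual totals are made-up placeholder values rather than City of Chicago figures.

```python
from statistics import mean

# Hypothetical annual totals in millions of riders, for illustration only.
# These are NOT the City of Chicago figures used in this article.
annual_totals = [150.0, 153.0, 151.5, 156.0, 159.0]


def previous_year_changes(totals):
    """Percent change of each year relative to the year before it."""
    return [(curr - prev) / prev * 100 for prev, curr in zip(totals, totals[1:])]


def previous_year_baseline(totals):
    """Average year-over-year percent change across the period."""
    return mean(previous_year_changes(totals))


def average_value_baseline(totals):
    """Mean of all annual totals in the observation period."""
    return mean(totals)


py_avg_change = previous_year_baseline(annual_totals)
avb = average_value_baseline(annual_totals)
print(f"Average previous-year change: {py_avg_change:+.2f}%")
print(f"Average value baseline: {avb:.2f} million riders")
```

Because the previous-year changes include the negative years, a period of stagnation pulls the average change down instead of being ignored, which is the behavior the second baseline is meant to capture.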

Similar to the financial model for calculating the Future Value (FV) of an investment, the following formula can be used to
predict ridership on the L several years into the future:
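For reference, the standard financial form of Future Value compounds a present value at a fixed rate. The expression below is that textbook form; mapping its symbols onto ridership is an assumption made here for illustration and may differ from the exact expression used for the predictions in this article.

$$
FV = PV \cdot (1 + r)^{t}
$$

Under this assumed mapping, $PV$ would be the baseline ridership total, $r$ the average annual rate of change, and $t$ the number of years between the baseline year and the predicted year.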

For the given dataset, the average value baseline to be used is 170.69 million riders.

| Method | Average | Standard Deviation | Average Step |
|--------|---------|--------------------|--------------|
| PY | +1.62% | 2.75 | - |
| AVB | - | 10.89 | +1.49% |

These calculations show two possible methods of predicting future ridership aboard the Chicago L. Under the previous year
(PY) method, ridership in a typical year falls within 1.62% of the previous year's ridership. Under the average value
baseline (AVB) method, each year represents one step of 1.49% from the annual average. This means that between any two
given years, the difference is roughly 1.49% multiplied by the number of years between them, and between any two consecutive
years, the ridership totals are within 1.62% of each other. As the numbers are close together, they can even be used in
conjunction with one another, with the understanding that the expected mean percentage error (MPE) is on the order of 0.13%
multiplied by the difference in years or, in terms of passengers and using the average value for reference, roughly two
hundred thousand passengers multiplied by the difference in years. This error is not of an order higher than 10 x 10^-1 for
these purposes, and merely speaks to the likelihood that the model loses stability as it attempts to predict rider totals
further and further into the future without new training data being added and old training data being pruned.
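The error arithmetic in the paragraph above can be checked with a short sketch. The three constants come from this article; the function name and the choice to express the result in millions of riders are assumptions made for illustration.

```python
# Figures quoted in this article.
PY_AVG_CHANGE_PCT = 1.62     # average previous-year change
AVB_AVG_STEP_PCT = 1.49      # average step from the average value baseline
AVB_RIDERS_MILLION = 170.69  # average value baseline, in millions of riders


def expected_error(years_ahead):
    """Return (error in percent, error in millions of riders).

    The per-year error is taken as the gap between the two methods,
    scaled linearly by the number of years being predicted ahead.
    """
    gap_pct = PY_AVG_CHANGE_PCT - AVB_AVG_STEP_PCT
    error_pct = gap_pct * years_ahead
    error_riders_million = AVB_RIDERS_MILLION * error_pct / 100
    return round(error_pct, 2), round(error_riders_million, 2)


for years in (0, 4, 9):
    pct, riders = expected_error(years)
    print(f"{years} years ahead: +/- {pct}% (about {riders} million riders)")
```

The unrounded per-year figure is about 0.22 million riders, which the text rounds down to two hundred thousand, so the sketch gives slightly larger passenger counts than multiplying two hundred thousand by the number of years.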

Using 2016 as Year 0, 2020 as Year 4, and 2025 as Year 9, and the methods described above, predictions can be made for ridership
aboard the Chicago L Train in the future.

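| Year (t) | Rider Total (Million) (Avg of PYB and AVB) | MPE |
|----------|--------------------------------------------|-----|
| 2016 (0) | 195.56 | +/- 0% (0 passengers) |
| 2020 (4) | 207.88 | +/- 0.52% (800,000 passengers) |
| 2025 (9) | 230.90 | +/- 1.17% (1,800,000 passengers) |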

Essentially, this model predicts the probable bounds of ridership aboard the Chicago L. As the amount of time between the
training data and the prediction grows, the uncertainty of the model does as well, which is reflected by the growing difference
between the bounds of probable ridership.

While probably too broad a model to be used for serious funds allocation or city planning, the applications of such a model
are fairly diverse. Future growth estimates are excellent for determining target values for a given piece of infrastructure
and for providing bounds that act as guidelines for the results of more specific calculations. As time passes, the training
data can be modified to provide more time-specific information. If, for example, L ridership has indeed peaked and the rate
of its coming decline is to be calculated, early-2000s training data could be excluded and more recent data included to
predict future, lower ridership levels using a new average value baseline and an updated average percent change for the
previous year baseline.

The City of Chicago pays close attention to L ridership levels and publishes an annual report
detailing its own expectations for rider totals. In the reports available for 2017, the total number of riders
has been lower than the target posted by the Chicago Transit Authority (CTA). However, several stations experienced
significant growth in rider totals, and the CTA plans to open several new stations before 2020. With respect to the falling
population in Chicago, and Illinois as a whole, several analysts have predicted that the population of the Second City will
stabilize and even begin to grow again by 2020, which means that counting on the continued stagnation of transit ridership
in the city seems to be betting against the house.