Imagine you want to plot hourly precipitation measurements from a rain gauge on a bar graph. Your first step will be to put the precipitation data inside a Python data structure. Data structures are how computer programs store information. This information can subsequently be processed, analyzed and visualized.

There are many different types of data structures depending on the kind of data you wish to store. You will have to choose the data structures that best meet the requirements of the problem you are trying to solve. Fortunately, Python, as a "batteries-included" language, gives you a practical choice of data structures to select from. We will do a minimal exploration of Python data structures, just enough to get you going. For a more complete treatment of Python data structures, see the Data Structures section in the Python documentation.

Scientific data can be large and complex and may require data structures appropriate for scientific programming. We cover Python for scientific data further along in the Scientific Python Package section. This notebook covers basic Python data structures meant for general-purpose programming, which are necessary to write programs in any capacity. Choosing the right data structure for the problem you are targeting will help your programs run correctly and efficiently, and make them easier for others to understand. We will specifically examine three Python data structures: lists and tuples, which are both Python sequences, and dictionaries.

Python offers several data structures to store sequences of information, such as hourly mean sea level pressure readings from a weather station, or three-dimensional coordinates describing a location in a climate model. We will discuss two of them: lists and tuples.

A Python list is a sequence of values that are usually the same kind of item. Lists are ordered, which means the items of a list stay in the order in which they were inserted. They can contain strings, numbers, or more complex items. Lists are mutable, which is a fancy way of saying they can be changed after they are created. Here is a Python list of decadal concentrations of carbon dioxide (ppm) measured at the Mauna Loa observatory from 1970 to 2010, assigned to the co2 variable.

In [1]:

co2 = [325.68, 338.68, 354.35, 371.13, 387.37]

The list is delimited by square brackets, its values are separated by commas, and the whole list is assigned to the co2 variable with the = assignment operator.

Continuing with our list of carbon dioxide concentrations, we want to add a prediction for the year 2020 of 400.0 ppm to the co2 list. We can use the append method to add an item to the end of the list. (A method is like a function, but it is called with the . notation after the variable it is acting on. Instead of append(co2, 400.0) you have co2.append(400.0).)

In [2]:

co2.append(400.0)
print(co2)

[325.68, 338.68, 354.35, 371.13, 387.37, 400.0]
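To make the difference between a function and a method concrete, here is a minimal sketch that calls the built-in len() function on a list and then the append method on the same list. The name readings is a hypothetical copy used only for this illustration, so the notebook's co2 list is not affected.

```python
# Hypothetical copy of the decadal CO2 values (ppm), for illustration only
readings = [325.68, 338.68, 354.35, 371.13, 387.37]

# A function takes the list as an argument
n_before = len(readings)

# A method is called on the list itself with the . notation
readings.append(400.0)
n_after = len(readings)

print(n_before, n_after)
```

The append method changes the list in place, which is why the length reported by len() grows by one.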

Add an Item to the Front of the List

The carbon dioxide concentration in 1960 at Mauna Loa was 316.91 ppm. We can use the insert method to add an item to the list at the location of our choosing, in this case location (or index) 0. (Python sequences start at index 0, not 1 as in MATLAB or Fortran.)

In [3]:

co2.insert(0, 316.91)
print(co2)

[316.91, 325.68, 338.68, 354.35, 371.13, 387.37, 400.0]

Change a Value in the List

We want to improve our estimate of the year 2020 carbon dioxide concentration to a value of 401.0 ppm. We will access the 7th value on the list with the square bracket notation.

In [4]:

co2[6] = 401.0  # Remember, 7th item at index 6 because we start at 0, not 1
print(co2)

[316.91, 325.68, 338.68, 354.35, 371.13, 387.37, 401.0]

Tuples are also ordered sequences of information, but they are immutable, which means that once they are created, they cannot change. Immutability may seem like a strange concept given that computer programs are constantly manipulating and changing data, but your program becomes easier to understand when you can guarantee something is unchanging. Tuples tend to contain related items, such as the x and y coordinates of a point in a Cartesian plane, or the author, title and journal of a scholarly citation.

Here we define a tuple representing a geographic coordinate expressed as latitude, longitude and elevation in meters:

In [5]:

location = (40.0, -105.3, 1655.1)

The tuple is delimited by parentheses, its values are separated by commas, and the whole tuple is assigned to the location variable with the = assignment operator. Because tuples, unlike lists, are immutable, there are no operations to change them in place.
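As a quick check of immutability, the sketch below tries to assign to an element of a tuple and catches the TypeError that Python raises. The names site and higher_site are hypothetical; site holds the same values as the location tuple above.

```python
site = (40.0, -105.3, 1655.1)  # latitude, longitude, elevation in meters

try:
    site[2] = 1700.0  # item assignment is not allowed on a tuple
    changed = True
except TypeError as err:
    changed = False
    print("Cannot modify a tuple:", err)

# To "change" a tuple, build a new one from pieces of the old one
higher_site = site[:2] + (1700.0,)
print(higher_site)
```

The original tuple is left untouched; higher_site is a separate, new tuple.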

Python offers a rich variety of options to access values inside lists and tuples, and you will eventually want to understand indexing, slicing and striding expressions. For brevity, we will only examine a couple of examples of getting values out of sequences. Again, note that valid indices on lists and tuples start at 0 and end at the length of the sequence minus 1.

Individual items inside the list can be obtained with the square bracket notation. Here we will assign a couple of values from inside the list to two variables: co2_1960 and co2_2010. We will then print the values with Python 3 positional formatting.
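The code cell for this step is not reproduced here, so the following is only a sketch of what it might look like. It assumes the co2 list as it stands after the insert and update steps earlier in the notebook (1960 through 2020), and it reads "positional formatting" as str.format with numbered fields. The names message, co2_latest and co2_first_three are hypothetical.

```python
# CO2 concentrations (ppm) for 1960, 1970, ..., 2020, as built up earlier
co2 = [316.91, 325.68, 338.68, 354.35, 371.13, 387.37, 401.0]

co2_1960 = co2[0]  # first item, index 0
co2_2010 = co2[5]  # sixth item, index 5

# Positional fields {0} and {1} are filled from the arguments in order
message = "CO2 in 1960: {0} ppm, in 2010: {1} ppm".format(co2_1960, co2_2010)
print(message)

# Negative indices count from the end; a slice returns a new list
co2_latest = co2[-1]        # last item (the 2020 estimate)
co2_first_three = co2[0:3]  # items at indices 0, 1 and 2
```

The same square bracket notation works on tuples, since both are Python sequences.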

Dictionary data structures are easy to understand because you are already familiar with them. When you look up a word definition in a language dictionary or use an index in the back of a book, you are using a dictionary data structure. Dictionaries are composed of key and value pairs. For example,

hydrometeor - an atmospheric phenomenon or entity involving water or water vapor, such as rain or a cloud

Here, the key is "hydrometeor" and the value is "an atmospheric phenomenon or entity involving water or water vapor, such as rain or a cloud."
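In Python, the same key and value pair can be written as a dictionary literal with curly braces and a colon between key and value. This is a minimal sketch; the variable names glossary and definition are hypothetical.

```python
glossary = {
    "hydrometeor": "an atmospheric phenomenon or entity involving water "
                   "or water vapor, such as rain or a cloud",
}

# Look up the value (the definition) by its key (the word)
definition = glossary["hydrometeor"]
print(definition)
```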

Let's build upon the earlier tuple example by defining a dictionary of METAR weather stations. The keys are strings representing the METAR ICAO identifier, and the values are tuples representing the location of the station expressed as latitude, longitude and elevation in meters.
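The cell that defines this dictionary is not reproduced here. A sketch of such a definition follows; the variable name stations, the choice of stations, and the coordinates are illustrative assumptions with rounded, approximate values, not authoritative station metadata.

```python
# Keys: ICAO identifiers; values: (latitude, longitude, elevation in meters)
# Coordinates are rounded, approximate values for illustration only
stations = {
    "KDEN": (39.85, -104.66, 1656.0),
    "KORD": (41.98, -87.90, 204.0),
    "KSEA": (47.45, -122.31, 132.0),
}

print(len(stations))  # number of key and value pairs
```

Tuples suit the values here because a station's location is a fixed group of related numbers.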

Unlike lists and tuples, which are accessed by position, dictionaries are accessed by key. In versions of Python before 3.7, the order of entries in a dictionary was not guaranteed; since Python 3.7, dictionaries preserve the order in which entries were inserted. Either way, you will normally use Python dictionary operations to look up information by key rather than rely on ordering.
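A few common dictionary operations are sketched below: lookup by key, a membership test, get() with its default of None, adding an entry, and looping over items. The stations dictionary here is a hypothetical example with rounded, approximate coordinates.

```python
# Hypothetical METAR station dictionary: ICAO id -> (lat, lon, elevation in m)
stations = {
    "KDEN": (39.85, -104.66, 1656.0),
    "KORD": (41.98, -87.90, 204.0),
}

# Look up a value by its key and unpack the tuple
lat, lon, elev = stations["KDEN"]

# Test for membership before looking up, or use get() to avoid a KeyError
has_seattle = "KSEA" in stations
missing = stations.get("KSEA")  # None, because the key is absent

# Add a new key and value pair
stations["KSEA"] = (47.45, -122.31, 132.0)

# Loop over keys and values together
for icao, (slat, slon, selev) in stations.items():
    print("{0}: lat={1}, lon={2}, elevation={3} m".format(icao, slat, slon, selev))
```

Using stations["KSEA"] before the key exists would raise a KeyError, which is why the membership test or get() is often preferred for keys that may be absent.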

There are many topics concerning Python data structures that we did not cover in the interest of brevity. We encourage you to research more elaborate indexing, slicing and striding expressions. We also did not cover sets, a data structure composed of unique, unordered values similar to the keys of a dictionary. Several valuable built-in Python functions merit study, such as filter(), map() and sorted(), to name a few. Lastly, in the "Flow Control" notebook, we will examine Python list comprehensions to process information inside sequences and dictionaries.
