Before you start working with your financial data, it’s important to get a feel for the two main types of data you may encounter.

Although these terms may mean different things from country to country, the data can be separated into two basic types according to their political significance and technical characteristics. In this section, we look briefly at both types and at the questions each can help to answer.

Budget Data – Political Details

Budget data is defined as data relating to the broad funding priorities set forth by a government, often highly aggregated or grouped by goals at a particular agency or ministry. For instance, a government may pass a budget which contains elements such as “Allocate $20 million in funding for clean energy grants” or “Allocate $5 billion for space exploration on Mars”. These data are often produced by a parliament or legislature, on an annual or semi-annual basis.

Spending Data – Execution Details

Spending data is defined as data relating to the specific expenditure of funds by the government. This may take the form of a contract, a loan, a refundable tax credit, pension fund payments, or payments from other retirement assistance programs and government medical insurance programs. In the context of our previous examples, spending data might include a $5,000 grant to Johnson’s Wind Farm for providing renewable wind energy, or a $750,000 contract with Boeing to build Mars rover component parts. Spending data is often transactional in nature, specifying a recipient, an amount, and a funding agency or ministry. Sometimes, when the payments are to individuals or there are privacy concerns, the data are aggregated by geographic location or fiscal year.
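To make the idea of privacy-preserving aggregation concrete, here is a minimal Python sketch. It assumes a hypothetical list of payments to individuals with `recipient`, `region`, `fiscal_year` and `amount` fields. All of the names and figures are invented for illustration; real datasets will use their own field names.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical transaction records: every name and figure is invented.
transactions = [
    {"recipient": "Person A", "region": "North", "fiscal_year": 2012, "amount": Decimal("1200.00")},
    {"recipient": "Person B", "region": "North", "fiscal_year": 2012, "amount": Decimal("800.00")},
    {"recipient": "Person C", "region": "South", "fiscal_year": 2012, "amount": Decimal("500.00")},
    {"recipient": "Person A", "region": "North", "fiscal_year": 2013, "amount": Decimal("1300.00")},
]


def aggregate_by_region_and_year(rows):
    """Sum amounts per (region, fiscal_year); the recipient field is dropped."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at zero
    for row in rows:
        totals[(row["region"], row["fiscal_year"])] += row["amount"]
    return dict(totals)


published = aggregate_by_region_and_year(transactions)
for (region, year), total in sorted(published.items()):
    print(region, year, total)
```

The aggregated table no longer names any individual, which is roughly the form in which such payments tend to be published. `Decimal` is used rather than floating point so that currency amounts add up exactly.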

The fiscal data of some governments may blur the lines of these definitions, but the aim is to separate the political documents from the raw output of government activity. The ultimate goal is to link these two datasets, so that the public can see whether the funding priorities set by one part of the government are being carried out by another. This is often impractical in larger governments, however, since definitions of programs and goals can be “fuzzy” and vary from year to year.
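Where both datasets do share an identifier, linking them can be quite simple. The sketch below assumes a hypothetical program code that appears in both the budget and the spending records, which is exactly the link that is often missing in practice. The amounts reuse the examples from this section; the codes and the third transaction are invented.

```python
# Hypothetical budget lines keyed by a shared program code (codes are invented).
budget = {
    "ENERGY-GRANTS": 20_000_000,
    "SPACE-MARS": 5_000_000_000,
}

# Hypothetical spending transactions that carry the same program code.
spending = [
    {"program": "ENERGY-GRANTS", "recipient": "Johnson's Wind Farm", "amount": 5_000},
    {"program": "SPACE-MARS", "recipient": "Boeing", "amount": 750_000},
    {"program": "UNKNOWN-01", "recipient": "Example Ltd", "amount": 10_000},
]


def compare_budget_to_spending(budget, spending):
    """Return per-program totals and the transactions that match no budget line."""
    spent = {program: 0 for program in budget}
    unmatched = []
    for tx in spending:
        if tx["program"] in spent:
            spent[tx["program"]] += tx["amount"]
        else:
            unmatched.append(tx)
    report = {
        program: {
            "allocated": allocated,
            "spent": spent[program],
            "share_spent": spent[program] / allocated,
        }
        for program, allocated in budget.items()
    }
    return report, unmatched


report, unmatched = compare_budget_to_spending(budget, spending)
for program, row in report.items():
    print(program, row["allocated"], row["spent"], f"{row['share_spent']:.4%}")
print("Transactions with no matching budget line:", len(unmatched))
```

The list of unmatched transactions is often as informative as the report itself: it shows how much of the spending cannot be traced back to a stated priority using the identifiers available.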

Budget data

Using the definitions above, budget data often consists of two main parts: revenue (taxation) data and planned expenditures. Revenue and spending are two sides of the same coin and thus deserve to be considered jointly when a government releases budget data. Because revenue tends to be aggregated to protect the privacy of individual taxpayers, it makes more sense to view it alongside the budget data. Revenue often appears aggregated by income bracket (for personal taxes) or by industrial classification (for corporate taxes), but it does not appear at all in spending data. Budget data therefore ends up being the only source for determining trends and changes in revenue.

Somewhat counter-intuitively, revenue data can itself include expenditures. When a particular entity or economic behaviour would normally be taxed but an exception is written into the law, this is often referred to as a tax expenditure. Tax expenditures are frequently reported separately from the budget, in different documents or at a different time. This usually stems from the fact that they are released by separate bodies, such as executive agencies or ministries that are responsible for taxation, instead of the legislature (http://internationalbudget.org/wp-content/uploads/Looking-Beyond-the-Budget-2-Tax-Expenditures.pdf).

Budgets as datasets

A growing number of governments make their budget expenditure data available as machine-readable spreadsheets. This is the preferred method for many users, as it is accessible and requires few software skills to get started. Other countries release longer reports that discuss budget priorities as a narrative. Some countries do something in between where they release reports that contain tables, but that are published in PDF and other formats from which the data is difficult to extract.

On the revenue side, the picture is considerably bleaker, as many governments are still entrenched in the mindset of releasing revenue estimates as large reports that are mostly narrative with little easily extractable data. Tax expenditure reports often suffer from these same problems.

Still, some areas that relate to government revenue are beginning to be much better documented and databases are beginning to be established. This includes budget support through development aid, for which data is published under the IATI (http://www.aidtransparency.net/) and OECD DAC CRS (http://stats.oecd.org/Index.aspx?DatasetCode=CRSNEW) schemes. Data about revenues from extractive industries is starting to be covered under the EITI (http://eiti.org/) with the US and various other regions introducing new rules for mandatory and granular disclosure of extractives revenue. Data regarding loans and debt is fairly scattered, with the World Bank providing a positive example (https://finances.worldbank.org/), while other major lenders (such as the IMF) only report highly aggregated figures. An overview of related data sources can be found at the Public Debt Management Network (http://www.publicdebtnet.org/public/Statistics/).

Connecting revenues and spending

It is highly desirable to be able to determine the flow of money from revenues to spending. In most cases, taxes go into a general fund and expenditures come out of that same general fund, which makes such a comparison moot. In many countries, however, there are taxes on certain behaviours that are used to fund specific items.

For example, a car registration fee might be used to fund the construction of roads and highways. This would be an example of a user fee, where the main users of the government service are funding it directly. Or you might have a tax on cigarettes and alcohol that funds healthcare grants. In this case, the tax is being used to offset the added healthcare expense of individuals taking part in at-risk activities. Allowing citizens to view which activities are taxed in order to pay for other expenditures makes it possible to see when a particular activity is being cross-subsidized or heavily funded by non-beneficiaries. It can also allow them to see when funds are being diverted or misused. This may not always be practical at the national level, as federal governments tend to make much larger use of the general fund than local governments do. Typically, local governments are more comprehensive with regard to releasing budget data by fund. Having granular, fund-level data is what makes this kind of comparison and oversight possible.
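As a rough illustration of what fund-level data allows, the following Python sketch compares what each fund takes in with what it pays out. The fund names, sources and amounts are all hypothetical, and a real dataset would need its own mapping of revenues and expenditures to funds.

```python
from collections import defaultdict

# Hypothetical fund-level records: names and amounts are invented.
revenues = [
    {"fund": "Road Fund", "source": "Car registration fee", "amount": 900},
    {"fund": "Health Fund", "source": "Tobacco and alcohol tax", "amount": 400},
    {"fund": "General Fund", "source": "Income tax", "amount": 5000},
]
expenditures = [
    {"fund": "Road Fund", "purpose": "Highway construction", "amount": 700},
    {"fund": "Health Fund", "purpose": "Healthcare grants", "amount": 650},
    {"fund": "General Fund", "purpose": "Education", "amount": 4800},
]


def fund_balances(revenues, expenditures):
    """Total revenue and spending per fund, plus the difference between them."""
    totals = defaultdict(lambda: {"revenue": 0, "spending": 0})
    for item in revenues:
        totals[item["fund"]]["revenue"] += item["amount"]
    for item in expenditures:
        totals[item["fund"]]["spending"] += item["amount"]
    return {
        fund: {**t, "balance": t["revenue"] - t["spending"]}
        for fund, t in totals.items()
    }


balances = fund_balances(revenues, expenditures)
for fund, row in balances.items():
    print(fund, row["revenue"], row["spending"], row["balance"])
```

A negative balance suggests that a fund is being topped up from elsewhere, and a large positive one suggests that earmarked money is not being spent on its stated purpose. Either result is only a prompt for further questions, not evidence of misuse on its own.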

What questions can be answered using budget data?

Budget expenditure data has an array of different applications, but its prime role is to communicate broad trends and priorities in government spending to its users. While it can help to have a prose accompaniment, the data itself promotes a more clear-cut interpretation of proposed government spending than political rhetoric does. Additionally, it is much easier to communicate budget priorities by economic sector or category than it is at the spending data level. These data also help citizens and CSOs track government spending year over year, provided that the classification of the budget expenditure data stays relatively consistent.
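A year-over-year comparison of this kind might be sketched as follows. The categories and amounts are hypothetical, and the sketch assumes the classification is identical in both years; a category that is missing in either year is reported as `None` rather than guessed at.

```python
# Hypothetical budget allocations by category and year (figures are invented).
allocations = {
    "Education": {2011: 100.0, 2012: 110.0},
    "Health": {2011: 80.0, 2012: 76.0},
    "Defence": {2012: 50.0},  # no 2011 figure, e.g. after a reclassification
}


def year_over_year_change(allocations, previous, current):
    """Percentage change per category, or None when the years cannot be compared."""
    changes = {}
    for category, by_year in allocations.items():
        before = by_year.get(previous)
        after = by_year.get(current)
        if before is None or after is None or before == 0:
            changes[category] = None
        else:
            changes[category] = (after - before) * 100 / before
    return changes


changes = year_over_year_change(allocations, 2011, 2012)
for category, change in changes.items():
    label = "not comparable" if change is None else f"{change:+.1f}%"
    print(category, label)
```

In real data, the “not comparable” cases deserve the most attention, because a category that disappears is often the result of a change in classification rather than a change in priorities.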

Spending data

For most purposes, spending data can be interpreted as transactional or near-transactional data. Rather than communicating the broad spending priorities of the government, as budget data should, spending data is intended to convey specific recipients, geographic locations of spending, more detailed categorisation, or even spending by account number.

Spending data is often created at the executive level, as opposed to legislative, and should be more frequently reported than budget data. It can include many different types of expenditures, such as contracts, grants, loan payments, direct payments for income assistance and maintenance, pension payments, employee salaries and benefits, intergovernmental transfers, insurance payments, and more.

Some types of spending data – such as contracts and grants – can be connected to related procurement information (such as the tender documents and contracts) to add more context regarding the individual payments and to get a clearer picture of the goods and services covered under these transactions.

Opening the checkbook

In the past five years, there has been a spate of countries and local governments opening up their spending data, often referred to as “checkbook level” data. These countries include, but are not limited to, the US (including various state governments), the UK, Brazil, India (including some state governments) and many funds of the European Union.

What questions can be answered using spending data?

Spending data can be used in several different areas: oversight and accountability, strategic resource deployment by local governments and charities, and economic research. First and foremost, however, it is a right of citizens to view detailed information about how their tax dollars are spent. Tracking who gets the money and how it is used allows citizens to detect preferential treatment of certain recipients that may be illegal, or to see whether certain political districts are getting more than their fair share.
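Two simple measures can serve as a starting point for this kind of oversight: spending per head in each district, and each recipient’s share of total spending. The sketch below uses hypothetical payments and population figures. Whether any given share counts as “fair” is a judgment the code cannot make.

```python
from collections import defaultdict

# Hypothetical payments and district populations: all values are invented.
payments = [
    {"district": "A", "recipient": "Firm 1", "amount": 600},
    {"district": "A", "recipient": "Firm 2", "amount": 400},
    {"district": "B", "recipient": "Firm 1", "amount": 300},
    {"district": "C", "recipient": "Firm 3", "amount": 200},
]
population = {"A": 100, "B": 100, "C": 100}


def spending_per_capita(payments, population):
    """Total spending in each district divided by its population."""
    totals = defaultdict(int)
    for payment in payments:
        totals[payment["district"]] += payment["amount"]
    return {district: totals[district] / people for district, people in population.items()}


def recipient_shares(payments):
    """Each recipient's fraction of all spending in the dataset."""
    grand_total = sum(payment["amount"] for payment in payments)
    by_recipient = defaultdict(int)
    for payment in payments:
        by_recipient[payment["recipient"]] += payment["amount"]
    return {name: amount / grand_total for name, amount in by_recipient.items()}


per_capita = spending_per_capita(payments, population)
shares = recipient_shares(payments)
print(sorted(per_capita.items(), key=lambda item: item[1], reverse=True))
print(sorted(shares.items(), key=lambda item: item[1], reverse=True))
```

A district or recipient at the top of either list is not necessarily receiving preferential treatment, since need, geography and the nature of the contracts all matter. The ranking simply shows where to look first.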

It can also help local governments and charities respond to areas of social need without duplicating federal spending that is already occurring in a certain district or going to a particular organization. Lastly, businesses can see where the government is making infrastructure improvements and investments and use that information when selecting future business locations. These are only a few examples of the potential uses of spending data. It’s no coincidence that it has ended up in a variety of commercial and non-commercial software products – it has a real economic value as well as an intangible value as a societal good and anti-corruption measure.

Task: Examine whether budget and/or spending data are available for your country. Note: these may have different names (e.g. Enacted Budget)! “Budget” and “spending” are categories rather than necessarily the names of the documents. Policy folks may actually use the word “budget” (with a particular qualifier) for what we call spending here.

Save your file online somewhere. You may like to add it to the OpenSpending group on the Datahub – make sure to add all the necessary tags and details so that you (and others) can find it there.

Extra Credit: Read the definition of machine readable in the School of Data glossary. Determine whether your budget or spending data is machine readable. You will need machine-readable data in the next stage, so if yours is not machine readable, you have a few options:

If your data is available in a webpage, but there is no download link – scrape it! Take the introduction to scraping course. (Don’t be afraid – you don’t necessarily need to be able to code!)

If your data is available in a PDF – take the extracting data from PDFs course.

Note: You may find it easier to submit a Freedom of Information request for machine-readable data. See this example of a successful request for machine-readable data for the EU budget. Don’t forget to ask for any of the supporting documents which help you to understand things like jargon, or how figures were calculated!
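If you are comfortable with a little code, a crude first check of machine readability is to see whether the text of your file parses as structured data. The sketch below is only a heuristic of our own and not the glossary definition. It recognises JSON and simple comma-separated tables, and the function name `looks_machine_readable` is hypothetical.

```python
import csv
import io
import json


def looks_machine_readable(text):
    """Return "json", "csv" or None based on whether the text parses as a table.

    Heuristic only: CSV is accepted when there are at least two non-empty rows
    and every row has the same number of columns (two or more).
    """
    try:
        parsed = json.loads(text)
        if isinstance(parsed, (list, dict)):
            return "json"
    except ValueError:
        pass  # not JSON, so fall through to the CSV check

    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    widths = {len(row) for row in rows}
    if len(rows) >= 2 and len(widths) == 1 and min(widths) >= 2:
        return "csv"
    return None


table = "Ministry,Amount\nHealth,100\nEducation,120\n"
narrative = "The government plans to spend more next year.\nPriorities include health, education and roads.\n"
print(looks_machine_readable(table))
print(looks_machine_readable(narrative))
```

A file that passes this check may still need cleaning, and a file that fails it (a spreadsheet in a binary format, for instance) may be perfectly usable with the right tool, so treat the result as a hint rather than a verdict.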