Data Champions: Spring (Data) Clean Up!

Becoming a Data Champion (Part 4)

It’s already May (can you believe it?), and for many of us that means spring clean-up. Just as you haul your old junk out to the curb, I hope you are starting to throw out your bad data-use habits. As much as cleaning feels like a chore, I have to admit that the end product, a clean and organized house, is well worth the effort. The same goes for data, too! Unfortunately, quality data often falls short of its true potential because it is not clearly stated or is disorganized in its presentation. This final data tip of the year should help you de-clutter your data so that all you have left is a clean and tidy message for your stakeholders.

Tip #4 Organize the data in a user-friendly manner

Now that you have 1) laid the ground rules for the appropriate data to be collected, 2) collected said data, and 3) performed timely analysis on it, it is time to 4) make sure that your data is heard and is understandable.

Much like having your name announced incorrectly during a presentation or sporting event, having your data misunderstood by others can be very frustrating. As a data coordinator, few things frustrate me more than seeing good data go to waste because it was not communicated effectively. There is a multitude of data representations (K.1.G) available to help you get your point across, from something as simple as a bar graph to something as sophisticated as a Sankey diagram. The keys are knowing your audience and knowing which data representations convey your message most effectively. As a token of my appreciation for reading this article, I have included below a handy cheat sheet to refer to when deciding which data visualization to use.

Prior to the start of a project or program, I meet with the facilitator in charge to discuss a number of things (see Step 1: Choosing the Right Data). One of the topics we discuss is, “Who will be using the data I will be analyzing?” Essentially, “Who are we trying to communicate this data to and for what reason(s)?” and “What decisions will be made as a result of this data?”

Knowing the answers to these questions helps to solidify a number of things:

Types of Analysis (K.2.A) that will be conducted on the data (quantitative, qualitative, descriptive, inferential, etc.)

Types of data visualizations to be used (bar graph, line graph, scatterplot, data table, etc.)

Examples:

Parent-teacher conferences: You may not want to share the output from a t-test comparing a child’s test score with the class average. Instead, you may opt to share some basic descriptive statistics about the test with the parents, such as the minimum, maximum, and average scores, along with the percentile in which the student’s score fell relative to the rest of the class. This could easily be shown using a bar graph of the distribution of the class’s scores (de-identified, of course) with the individual student’s score highlighted.

Grant funder: It may be important to share more sophisticated data, such as the output of a significance test, using more advanced or comprehensive methods in a larger program report or executive summary.

Administrator: Odds are an administrator would not have time to read a lengthy 50-page evaluation report on a new reading program implemented in the school; a single-page infographic highlighting key data points may suffice.
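To make the parent-teacher conference example concrete, here is a minimal sketch of the descriptive statistics mentioned above. The scores are made up for illustration, the function name `summarize_score` is hypothetical, and "percentile" is assumed here to mean the share of class scores at or below the student's score (other definitions exist).

```python
from statistics import mean

def summarize_score(class_scores, student_score):
    """Return basic descriptive statistics plus the student's percentile rank."""
    # Assumption: percentile rank = share of class scores at or below the student's score.
    at_or_below = sum(1 for s in class_scores if s <= student_score)
    return {
        "minimum": min(class_scores),
        "maximum": max(class_scores),
        "average": round(mean(class_scores), 1),
        "percentile": round(100 * at_or_below / len(class_scores)),
    }

# Hypothetical, de-identified class scores (made up for illustration).
class_scores = [62, 70, 74, 78, 81, 85, 88, 90, 93, 97]
summary = summarize_score(class_scores, student_score=85)
print(summary)
```

A summary like this is usually enough for a parent audience; the significance tests can stay in the longer report for funders.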

Another key to organizing data in a user-friendly manner is understanding just how deep those receiving/using the data will want to go in their analysis.

Would a parent be interested in comparing the average ACT scores for each of the districts in the state? Probably not. However, a superintendent may be interested in the average ACT score for their district and where they stand compared to other districts in the state. This type of comparison could be one way of helping inform future curriculum decisions. The key is to start basic and then ease into more sophisticated analyses and comparisons as you go, as opposed to immediately jumping into the deep analyses. Part of this is having a solid grasp of which messages are the most important or salient for the audience of interest, and aligning these messages with the appropriate analyses and data visuals.

In regard to presenting data in a user-friendly fashion, I follow many guidelines proposed by Stephanie Evergreen – a renowned speaker in the data visualization and reporting community. I encourage you to access her website at www.stephanieevergreen.com for more information about her and the resources she has to offer. Among the data visualization tips she has proposed, here are a few that I believe are important to a data user’s success:

Proportions are accurate.

When making comparisons between years, groups, or multiple pieces of data, ensuring that proportions are accurate is crucial. Inaccurate proportions lead to misleading information, as in this infographic regarding the 2016 election below:

Disclaimer: This graphic is not intended to show support for any political candidate, but simply a portrayal of misleading data representation.

As you can see in the picture above, the difference in supporters between Donald Trump and Hillary Clinton was much smaller in the state of Virginia (4%) than in Oklahoma (24%). However, the disproportionate lengths and sizes of the bars make the difference in Virginia look much larger than it actually was. ALWAYS make sure that comparisons between different pieces of data are based on the same scale.
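One way to keep proportions honest is to compute every bar's length from a single shared scale. The sketch below uses only the two margins quoted above; the function name `bar_lengths` and the 40-character maximum width are assumptions made for illustration.

```python
def bar_lengths(values, max_width=40):
    """Scale every value against the same maximum so bar lengths stay proportional."""
    scale = max_width / max(values.values())
    return {label: round(value * scale) for label, value in values.items()}

# Margins quoted in the text: 4% for Virginia, 24% for Oklahoma.
margins = {"Virginia": 4, "Oklahoma": 24}
lengths = bar_lengths(margins)

# Draw simple text bars; both rows share one scale, so they can be compared by eye.
for state, length in lengths.items():
    print(f"{state:<10} {'#' * length} {margins[state]}%")
```

Because both bars are drawn from the same scale, the Oklahoma bar comes out roughly six times as long as the Virginia bar, which matches the underlying numbers.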

Data are intentionally ordered.

The human brain is built to identify patterns and sequences. Having an intentional order to the data you present allows for easier interpretation and analysis for the receiver of the data.

As you can see in the bar charts below, it would be much easier for the manager to identify which salespeople are outperforming the others (and by how much) by looking at the graph on the right rather than the graph on the left. Analyzing the graph on the left is much more taxing on the brain because you basically have to go bar by bar to compare their lengths. Not only does this increase the time and effort it takes to analyze the data, but it also increases the likelihood that people will make errors during analysis.
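Ordering the data before charting it takes very little work. Here is a minimal sketch using made-up names and sales figures (in thousands of dollars); nothing below comes from the charts referenced above.

```python
# Hypothetical first-quarter sales in thousands of dollars (made up for illustration).
sales = {"Avery": 42, "Blake": 87, "Casey": 65, "Devon": 51}

# Sort from highest to lowest so the ranking is visible at a glance.
ranked = sorted(sales.items(), key=lambda item: item[1], reverse=True)

# Print one text bar per salesperson, one '#' per five thousand dollars.
for name, amount in ranked:
    print(f"{name:<6} {'#' * (amount // 5)} {amount}")
```

Sorting by value suits a ranking question like this one; if the categories have a natural order (months, grade levels), keeping that order is usually the more intentional choice.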

Include a descriptive title and axis labels.

The title of the data visualizations you present should give the viewers an accurate understanding of what is being presented.

For example, in the graphs above it is very clear that the graphs depict first-quarter sales for different salespeople in a particular company during the year 2013. Also, the vertical and horizontal axes are labeled so that viewers know the bars are measured in thousands of dollars. To make interpretation even easier, Stephanie Evergreen suggests providing a subtitle or annotation that identifies a takeaway message or highlights an important comparison or relationship in the graph. If you are emphasizing a specific comparison in a visualization, it might be helpful to change the color scheme so that the less important comparisons are a neutral color (e.g. gray) and the comparison of interest is in color. An example of this is provided in the chart below.
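The gray-plus-one-color idea can be sketched as a small helper that picks a color for each bar. The names, the hex color values, and the function `bar_colors` are all hypothetical choices made for illustration.

```python
NEUTRAL = "#999999"    # gray for bars that only provide context
HIGHLIGHT = "#1f77b4"  # a single accent color for the comparison of interest

def bar_colors(labels, focus):
    """Return one color per label: the accent color for focus labels, gray otherwise."""
    return [HIGHLIGHT if label in focus else NEUTRAL for label in labels]

# Hypothetical salespeople; the names are made up for illustration.
labels = ["Avery", "Blake", "Casey", "Devon"]
colors = bar_colors(labels, focus={"Blake"})
print(list(zip(labels, colors)))
```

A list like this can typically be handed to a charting tool's per-bar color setting; for instance, matplotlib's `bar` function accepts a list of colors through its `color` argument.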

Present your data as if it were a story.

Although in some cases people simply skip to the conclusion or summary of an article or report for the most significant findings, not providing enough context for the data may lead to inaccuracies or misrepresentations of the data moving forward.

In my earlier post “Step 1: Choosing the Right Data,” I provided an example of a school that implemented a supplemental reading program for students scoring below proficient on the state assessment in reading. In the example, School X found that the percentage of students failing to meet proficiency on the state reading assessment significantly dropped by 30% after instituting the supplemental reading program. Presented the way it is written in the previous sentence, this finding could be misconstrued to mean that the program could be effective for all students (regardless of proficiency status), when in reality the program was only intended for students who were non-proficient in reading. In addition to providing readers with the data context, there should also be a logical organization or flow between visualizations and analyses so that the reader understands where the data is coming from, how it was analyzed, and how it relates to the overall message or other data findings. One way to reduce the risk of using or communicating data ineffectively is to be as Transparent (B.1.B) as possible regarding the sample, data collection methods, and data analysis calculations.

Conclusion

Over the course of the last few months I have provided you with a few tips to help you along in your data-use career. These included 1) choosing the right data, 2) establishing a common data language and consistency in data collection, 3) being timely in the data analysis, and 4) organizing the data in a user-friendly manner.

As you begin incorporating these tips into your practice I highly encourage you to keep in mind the following questions throughout the data use process:

What is the question that needs to be answered?

Who will be using this data and for what reasons?

What data will allow us to answer this question most effectively?

How can I be sure that everyone is on the same page regarding what data we need?

How can I ensure that the data is collected in a consistent manner?

When does this data need to be analyzed and communicated out?

What types of communication materials are needed to communicate the findings?

Is the data communicated in an easy-to-understand format?

Is the communication medium appropriate given the audience’s level of knowledge, interest, and time availability?

When it comes to data, always try to ensure that you are using the right data for the right purpose. With these tips in mind, you are well on your way to becoming a data use champion.

ABOUT

South East Education Cooperative

The South East Education Cooperative (SEEC) is one of eight Regional Education Associations (REAs) in North Dakota. Its membership includes 36 public school districts and four private schools in the southeast portion of N.D. Through these members the SEEC serves over 34,800 N.D. students. REAs strive to offer consistent high-quality programs and services in the areas of professional development, technology support, data systems support, school improvement support, and curriculum enrichment that reflect the needs of its region.