
This post was contributed by Kristin Briney and her team members listed below.

Libraries are increasingly seeking data, both qualitative and quantitative, to tell a story about their value to patrons, administrators, and financial stakeholders. With the widespread interest in Big Data throughout society, librarians are asking: 1) what data do we have on student behaviors in our information systems, and 2) how can this data explain their learning outcomes? To answer these questions, librarians are beginning to mine EZproxy logs, eResource interactions recorded in single sign-on data, collection histories, instruction and other librarian interactions, and physical building access through ID cards. Published analyses indicate potential correlations between particular types of library usage and retention, time-to-degree, GPA, and graduation rates.

As well-intentioned as these efforts are, and as important as it is to demonstrate that libraries aid the institutional mission, there are important issues to consider. Systematic data mining, especially of student behaviors and interactions with library resources, raises student privacy and intellectual freedom issues. There are also practical questions about whether libraries have data management plans that carefully consider data anonymization, deidentification, retention, and deletion. Researchers across seven institutions have come together to develop a research agenda that seeks to answer these questions and more.
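To make the deidentification question more concrete, here is a minimal sketch in Python of one common approach: replacing patron identifiers with a keyed hash, coarsening timestamps to the date, and dropping fields that the analysis does not need. The field names, the sample record, and the key are all hypothetical and are not taken from any real library system.

```python
import hashlib
import hmac
from datetime import datetime

# Hypothetical secret key; in practice it would be stored apart from the logs
# and destroyed when the retention period ends.
SECRET_KEY = b"replace-with-a-randomly-generated-key"


def pseudonymize(patron_id: str, key: bytes = SECRET_KEY) -> str:
    """Return a keyed hash of the patron ID (HMAC-SHA256, hex-encoded)."""
    return hmac.new(key, patron_id.encode("utf-8"), hashlib.sha256).hexdigest()


def coarsen(timestamp: str) -> str:
    """Reduce an ISO 8601 timestamp to its calendar date."""
    return datetime.fromisoformat(timestamp).date().isoformat()


def deidentify(record: dict) -> dict:
    """Keep only the fields needed for analysis, with the ID pseudonymized."""
    return {
        "patron": pseudonymize(record["patron_id"]),
        "date": coarsen(record["timestamp"]),
        "resource": record["resource"],
    }


# Hypothetical log record; the field names are illustrative only.
raw = {
    "patron_id": "s1234567",
    "timestamp": "2018-02-12T14:35:09",
    "resource": "example-database",
    "ip_address": "192.0.2.10",
}
clean = deidentify(raw)
print(clean["date"], clean["resource"], "ip_address" in clean)
```

Note that this is pseudonymization rather than full anonymization: anyone who holds the key can recompute the hashes, and patterns of use can still identify an individual. A data management plan would still need to address who holds the key, how long records are retained, and when they are deleted.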

Briney, K., Goben, A., & Zilinski, L. (2015). Do You Have an Institutional Data Policy? A Review of the Current Landscape of Library Data Services and Institutional Data Policies. Journal of Librarianship and Scholarly Communication.

This blog post was submitted by the Data Refuge Team, including Patricia Kim, Bethany Wiggin, and Paul Farber. In case you are not familiar with the Data Refuge project, check out their website!

We have now launched the first phase of the Data Refuge Stories project with a grant from the National Geographic Foundation. We are currently developing storytelling tools and a national Data Storybank that illustrates how federal climate and environmental data lives in the world and connects people, places, and non-human species. The site is still under construction and is being populated with new data stories on an almost daily basis.

While Phase One is based mostly in Philadelphia, we have already tested the first of our tools (called “Data MadLibs”) at the American Geophysical Union last December in New Orleans, in partnership with the Union of Concerned Scientists. You can read more about it here.

As a part of our first phase, we will continue to test our tools in conjunction with Endangered Data Week and Love Data Week, and the series of Love Data Month events hosted by Penn Libraries throughout February. These tools will include the “Data MadLibs” and a Data Valentine-making contest.

Phase Two launches in mid-2018 and will scale the Data Stories project nationally with several competitive fellowships that task journalists with telling stories about data in Climate Cities. Phase Three launches in 2019 at National Geographic headquarters, where a diverse range of stories will be shared at a two-day conference.

Take Action!
We are very eager to continue our collaborative work of storytelling, and since the theme for LDW is “Data Stories,” we think we’d make a great match. With that in mind, we would greatly appreciate it if you could spread the word about our ongoing work and storytelling tools and, if you’d like, contribute to our Omeka Storybank by adapting our tools, both throughout February and March and longer-term when we move into the national phase of the project.

We have a few slides that include a description of this exciting new aspect of Data Refuge. Please continue to follow us on social media (Twitter, Instagram, Facebook) as well, where we’ll be updating our followers.

Several people have asked for more information and guidance in writing blog posts. While we don’t have restrictions on length, style, or voice, we do want to make sure that the posts follow a code of conduct that embodies the values of the community. To that end, we are asking that you follow the code of conduct used by the OpenCon community.

We don’t want to stifle your creativity, so we have a few guidelines…

Know your purpose – why are you writing this piece? To inform, persuade, or entertain?

Know your audience and write for their needs, knowledge, and expertise.

by Wei Yin, Research Support & Data Services Librarian, Columbia University Libraries

Why is storytelling with data important?

A Forbes article argues that data storytelling is an essential skill for everyone in this big data era. A best-selling book on Amazon, “Storytelling with Data: A Data Visualization Guide for Business Professionals,” stresses the importance of choosing the most effective form of visualization (e.g., tables, graphs, or bar charts) to engage an audience both intellectually and emotionally. In other words, data visualization helps tell a meaningful story. Storytelling with data is useful not only in business but also in people’s daily lives. Strategic storytelling with government-released open data is becoming increasingly popular in city governance, where it helps build campaign platforms and promote administrative-level legislation and policies. For better storytelling with data, a research paper by two Stanford scholars suggests balancing author-driven and reader-driven visualization: author-driven visualization is good for delivering a message, while reader-driven visualization supports interactive thinking based on the data currently displayed.

Four data visualization tools for storytelling are popular in academia today. Three of them (D3, R, and Python) are open source; Tableau is a commercial product.

Tableau brands itself as a powerful Business Intelligence tool for visual analytics. Rather than displaying static graphs and tables to an audience, it provides interactive views that help both presenter and audience understand the data better. Tableau products include Tableau Desktop (for personal use), Tableau Server (for enterprise use), and Tableau Online (for cloud service). Though these services are not free, Tableau offers a free one-year Desktop license to students at K-12 and postsecondary levels around the world; a separate free edition, Tableau Public, is open to everyone.

D3 is short for “Data-Driven Documents.” It is a JavaScript library for manipulating documents based on data. D3 uses HTML, CSS, and SVG to build nearly any type of data visualization you can imagine. The library is free and open source.

R, which is widely used in academia, is well known for its data visualization capabilities. Jeff Chen and Star Yang from the Commerce Data Service wrote a detailed introduction to both 2D and 3D visualization using R. The 2D visual tools include the packages “ggplot2”, “datatables”, and “dygraphs”, and the 3D visual tools include the libraries “Threejs” and “leaflet” (for mapping).

Python is another good programming tool for data scientists because it has an extensive set of built-in functions and libraries. This article compares five essential data visualization libraries (Pandas, Seaborn, Bokeh, Pygal, and Plotly) to help you choose the right data visualization tool.

This year is all about learning from each other – by telling the stories of our data struggles, epiphanies, failures, and everything in between. To broaden the range of voices that we hear from, we are inviting guest bloggers to contribute their data stories. While we welcome all voices, we especially want to invite students and citizen scientists who are engaged in all types of research. Whether you are a local historian or entrepreneur, digital humanist or a teacher, we would love to hear from you.

Written submissions can be short (300-500 words) or longer essays. We would love to share stories in other formats as well – audio recordings, video – any form of storytelling that can be shared online. There are few rules and many ways to tell a good story! Check out some tips below, if you’re unsure how to get started.

Thanks to everyone for waiting patiently as we put together what I hope will be the best Love Data Week yet! Life sometimes throws you curve balls, or even three or four at a time, but we are tremendously excited about the topics for 2018. We welcome your thoughts and ideas for building on the content and activities that we will post next week, so please do reach out and connect via Twitter or email.

Without further ado, our four topics are:

Stories about data

Telling stories with data

Connected conversations

We are data

Go on over to the 2018 home page for more details, and check back next week when we’ll start pushing out activities and resources!