Each unit of data analyzed can be described as topical, asking “what.”

• What is the number of courses offered in each major and minor?
• What is expended in each subject area?
• What is the size of the physical collection in each subject area?
• What is student enrollment in each area?
• What is the circulation in specific areas for one year?

libraries, if they are to survive, must rethink their collecting and service strategies in radical and possibly scary ways, and must do so sooner rather than later. Anderson predicts that, in the next ten years, the “idea of collection” will be overhauled in favor of “dynamic access to a virtually unlimited flow of information products.” My note: in essence, the fight between Mark Vargas and the Acquisition/Cataloguing people

The library collection of today is changing, affected by many factors, such as demand-driven acquisitions, access, streaming media, interdisciplinary coursework, ordering enthusiasm, new areas of study, political pressures, vendor changes, and the individual faculty member following a focused line of research.

subject librarians may see opportunities in looking more closely at the relatively unexplored “intersection of circulation, interlibrary loan, and holdings.”

Using Visualizations to Address Library Problems

the difference between graphical representations of environments and knowledge visualization, which generates graphical representations of meaningful relationships among retrieved files or objects.

p. 771 By looking at the data (my note – by visualizing the data), more questions are revealed. The visualizations provide greater comprehension than the two-dimensional “flatland” of the spreadsheets, in which valuable questions and insights are lost in the columns and rows of data.

By looking at data visualized in different combinations, library collection development teams can clearly compare important considerations in collection management: expenditures and purchases, circulation, student enrollment, and course hours. Library staff and administrators can make funding decisions or begin dialog based on data free from political pressure or from the influence of the squeakiest wheel in a department.
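As an illustration of the kind of side-by-side comparison described above, a minimal Python sketch might join circulation, enrollment, and expenditure figures by subject. The subjects and numbers here are invented for illustration, not drawn from any library's actual data:

```python
# Hypothetical sketch: combining collection-management data by subject area.
# All subjects and figures below are illustrative assumptions.
circulation = {"History": 420, "Biology": 310, "Art": 95}       # checkouts per year
enrollment  = {"History": 250, "Biology": 400, "Art": 60}       # enrolled students
expenditure = {"History": 12000, "Biology": 18500, "Art": 4000} # dollars per year

# Join the three views on subject so they can be compared side by side,
# e.g. dollars spent per checkout as a rough value-for-money signal.
report = []
for subject in circulation:
    report.append({
        "subject": subject,
        "checkouts": circulation[subject],
        "students": enrollment[subject],
        "expenditure": expenditure[subject],
        "cost_per_checkout": expenditure[subject] / circulation[subject],
    })

# Sort so the most expensive-per-use subjects surface first.
report.sort(key=lambda row: row["cost_per_checkout"], reverse=True)
for row in report:
    print(row)
```

A tabulation like this is exactly the "flatland" a visualization would then lift into view; the point of the sketch is only that the underlying join is simple once the measures share a subject key.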

Visualization can increase the power of data, by showing the “patterns, trends and exceptions”

Librarians can benefit when they visually leverage data in support of library projects.

Nathan Yau suggests that exploratory learning is a significant benefit of data visualization initiatives (2013). We can learn about our libraries by tinkering with data. In addition, handling data can challenge librarians to improve their technical skills. Visualization projects allow librarians not only to learn about their libraries, but also to develop programming and data science skills.

The classic voice on data visualization theory is Edward Tufte. In Envisioning Information, Tufte unequivocally advocates for multi-dimensionality in visualizations. He praises some incredibly complex paper-based visualizations (1990). This discussion suggests that the principles of data visualization are strongly contested. Although Yau’s even-handed approach and Cairo’s willingness to find common ground are laudable, their positions are not authoritative or the only approach to data visualization.

a web application that visualizes the library’s holdings of books and e-books according to certain facets and keywords. Users can visualize whatever topics they want, by selecting keywords and facets that interest them.

SeeCollections retrieves data from the Primo X-Services API, which returns JSON, and is built on Flask, a very flexible Python web micro-framework. In addition to creating the visualization, SeeCollections also makes this data available on the web. JavaScript is the front-end technology that ultimately presents data to the SeeCollections user. JavaScript is a cornerstone of contemporary web development; a great deal of today’s interactive web content relies upon it. Many popular code libraries have been written for JavaScript. This project draws upon jQuery, Bootstrap, and d3.js.
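The notes name the Primo X-Services API, JSON, and Flask but include no code. As a hedged sketch of the kind of intermediate step such an app might perform (the record fields and titles below are invented, not SeeCollections' actual data model), the back end could aggregate holdings records into facet counts in a shape a front-end chart can consume directly:

```python
import json
from collections import Counter

# Hypothetical sample: records like those a holdings API might return as JSON.
# Field names ("facet_subject", "type") and titles are illustrative assumptions.
raw = json.dumps([
    {"title": "Data Points", "facet_subject": "Statistics", "type": "book"},
    {"title": "Envisioning Information", "facet_subject": "Design", "type": "book"},
    {"title": "Visualize This", "facet_subject": "Statistics", "type": "e-book"},
])

def facet_counts(payload: str, facet: str) -> dict:
    """Aggregate a JSON list of records into counts for one facet --
    the shape a front-end chart (e.g. one built with d3.js) could consume."""
    records = json.loads(payload)
    return dict(Counter(rec[facet] for rec in records))

print(facet_counts(raw, "facet_subject"))  # counts by subject facet
print(facet_counts(raw, "type"))           # counts by material type
```

In a Flask app, a route would simply return such a dictionary as JSON for the JavaScript layer to render.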

To give SeeCollections a unified visual theme, I have used Bootstrap. Bootstrap is most commonly used to make webpages responsive to different devices

D3.js facilitates the binding of data to the content of a web page, which allows manipulation of the web content based on the underlying data.

Overview

A picture is worth a thousand words, but developing a data picture worth a thousand words involves careful thought and planning. IT leaders often need to share their story and vision for the future with campus partners and campus leadership, and delivering that message in a compelling way takes significant thought and planning. This session will take participants through the process of constructing their story, how to (and how not to) incorporate data and anecdotes effectively, how to design clear data visualizations, and how to present their story with confidence.

Learning Objectives

During this course, participants will:

Develop a story that elicits a specific outcome

Identify and effectively use data elements to support a compelling story

Learn how to tell your story in a clear and effective way

NOTE: Participants will be asked to complete assignments between the course segments that support the learning objectives stated above, and will receive feedback and constructive critique from course facilitators on how to improve and shape their work.

Facilitator

Leah Lang leads EDUCAUSE Analytics Services, a suite of data services, products, and tools that can be used to inform decision-making about IT in higher education. The foundational service in this suite is the EDUCAUSE Core Data Services (CDS), higher education’s comprehensive IT benchmarking data service.

Storytelling with Data: An Introduction to Data Visualization

Data visualization is about presenting data visually so we can explore and identify patterns in the data, analyze and make sense of those patterns, and communicate our findings. In this course, you will explore those key aspects of data visualization, and then focus on the theories, concepts, and skills related to communicating data in effective, engaging, and accessible ways.

This will be a hands-on, project-based course in which you will apply key data visualization strategies to various data sets to tell specific data stories using Microsoft Excel or Google Sheets. Practice data sets will be provided, or you can utilize your own data sets.

Upon completion of this course, you will be able to create basic data visualizations that are effective, accessible, and engaging. In support of that primary objective, you will:

Describe the benefits of data visualization for your professional situation

Identify opportunities for using data visualization

Apply visual cues (pre-attentive attributes) appropriately

Select correct charts/graphs for your data story

Use appropriate accessibility strategies for data tables

Prerequisites

Basic knowledge of Microsoft Excel or Google Sheets is required to successfully complete this course. Resources will be included to help you with the basics should you need them, but time spent learning the tools is not included in the estimated time for completing this course.
What are the key takeaways from this course?

The ability to explain how data visualization is connected to data analytics

The ability to identify key data visualization theories

Creating effective and engaging data visualizations

Applying appropriate accessibility strategies to data visualizations

Who should take this course?

Instructional designers, faculty, and higher education administrators who need to present data in effective, engaging, and accessible ways will benefit from taking this course.

3 Reasons Why Data Storytelling Will Be A Top Marketing Trend of 2018

A study that compared reader engagement between articles containing charts and infographics and text-only articles found that those with graphical storytelling, or what I like to call data storytelling, had up to 34 percent more comments and shares and a 300 percent improvement in depth of scroll down the page.

Using storytelling techniques to present data not only makes it more visually appealing but also enables easy spotting of key trends, seamless results-tracking, and quick goal-monitoring.

Here are things that can help you build a bridge from your current methods to effective data storytelling:

Choose a topic by identifying your target audience, the goal of your visual, and what you would like to achieve.

Organize your data by thinking about what you want to convey and then get rid of anything that doesn’t help you tell that story.

Spend time making your visualization look sharp by keeping it simple, using color and interactivity.

A few bonus tips to make your data visualizations really pop:

Don’t use more than two graphs at a time so as not to confuse participants.

Stick with one color per graph; making things multicolored will cause data to look jumbled.

Give context to your concept. Introduce your idea slowly and tell the story of what you want your data to reveal instead of assuming everyone in the room is on the same page.

Try using interactive data storytelling techniques to support your data.

The future of collaboration: Large-scale visualization

More data doesn’t automatically lead to better decisions. A shortage of skilled data scientists has hindered progress in translating information into actionable business insights. In addition, traditionally dense spreadsheets and linear slideshows are ineffective for presenting discoveries when dealing with Big Data’s dynamic nature. We need to evolve how we capture, analyze, and communicate data.

Large-scale visualization platforms have several advantages over traditional presentation methods. They blur the line between the presenter and audience to increase the level of interactivity and collaboration. They also offer simultaneous views of both macro and micro perspectives, multi-user collaboration and real-time data interaction, and a limitless number of visualization possibilities – critical capabilities for rapidly understanding today’s large data sets.

Visualization walls enable presenters to target people’s preferred learning methods, thus creating a more effective communication tool. The human brain has an amazing ability to quickly glean insights from patterns – and great visualizations make for more efficient storytellers.

Grant: Visualizing Digital Scholarship in Libraries and Learning Spaces
Award amount: $40,000
Funder: Andrew W. Mellon Foundation
Lead institution: North Carolina State University Libraries
Due date: 13 August 2017
Notification date: 15 September 2017
Website: https://immersivescholar.org
Contact: immersivescholar@ncsu.edu
Project Description
NC State University, funded by the Andrew W. Mellon Foundation, invites proposals from institutions interested in participating in a new project for Visualizing Digital Scholarship in Libraries and Learning Spaces. The grant aims to 1) build a community of practice of scholars and librarians who work in large-scale multimedia to help visually immersive scholarly work enter the research lifecycle; and 2) overcome technical and resource barriers that limit the number of scholars and libraries who may produce digital scholarship for visualization environments and the impact of generated knowledge. Libraries and museums have made significant strides in pioneering the use of large-scale visualization technologies for research and learning. However, the utilization, scale, and impact of visualization environments and the scholarship created within them have not reached their fullest potential. A logical next step in the provision of technology-rich, visual academic spaces is to develop best practices and collaborative frameworks that can benefit individual institutions by building economies of scale among collaborators.

The project contains four major elements:

An initial meeting and priority setting workshop that brings together librarians, scholars, and technologists working in large-scale, library and museum-based visualization environments.

Scholars-in-residence at NC State over a multi-year period who pursue open source creative projects, working in collaboration with our librarians and faculty, with the potential to address the articulated limitations.

Funding for modest, competitive block grants to other institutions working on similar challenges for creating, disseminating, validating, and preserving digital scholarship created in and for large-scale visual environments.

A culminating symposium that brings together representatives from the scholars-in-residence and block grant recipient institutions to share and assess results, organize ways of preserving and disseminating digital products produced, and build on the methods, templates, and tools developed for future projects.

Work Summary
This call solicits proposals for block grants from library or museum systems that have visualization installations. Block grant recipients can utilize funds for ideas ranging from creating open source scholarly content for visualization environments to developing tools and templates to enhance sharing of visualization work. An advisory panel will select four institutions to receive awards of up to $40,000. Block grant recipients will also participate in the initial priority setting workshop and the culminating symposium. Participating in a block grant proposal does not disqualify an individual from later applying for one of the grant-supported scholar-in-residence appointments.
Applicants will provide a statement of work that describes the contributions that their organization will make toward the goals of the grant. Applicants will also provide a budget and budget justification.
Activities that can be funded through block grants include, but are not limited to:

Commissioning work by a visualization expert

Hosting a visiting scholar, artist, or technologist residency

Software development or adaptation

Development of templates and methodologies for sharing and scaling content utilizing open source software

Student or staff labor for content or software development or adaptation

Funding for operational expenditures, such as equipment, is not allowed for any grant participant.

Application
Send an application to immersivescholar@ncsu.edu by the end of the day on 13 August 2017 that includes the following:

A statement of work (no more than 1,000 words) describing the project idea your organization plans to develop, its relationship to the overall goals of the grant, and the challenges to be addressed.

A list of the names and contact information for each participant in the funded project, including a brief description of their current role, background, expertise, interests, and what they can contribute.

A project timeline.

A budget table with projected expenditures.

A budget narrative detailing the proposed expenditures.

Selection and Notification Process
An advisory panel made up of scholars, librarians, and technologists with experience and expertise in large-scale visualization and/or visual scholarship will review and rank proposals. The project leaders are especially keen to receive proposals that develop best practices and collaborative frameworks that can benefit individual institutions by building a community of practice and economies of scale among collaborators.

Awardees will be selected based on:

the ability of their proposal to successfully address one or both of the identified problems;

the creativity of the proposed activities;

relevant demonstrated experience partnering with scholars or students on visualization projects;

whether the proposal is extensible;

feasibility of the work within the proposed time-frame and budget;

whether the project work improves or expands access to large-scale visual environments for users; and

the participant’s ability to expand content development and sharing among the network of institutions with large-scale visual environments.

Awardees will be required to send a representative to an initial meeting of the project cohort in Fall 2017.

Researchers at Carnegie Mellon University‘s CyLab Security and Privacy Institute have developed a new tool for analyzing network traffic and identifying cyber attacks. The tool uses data visualization to make it easier for network analysts to see key changes and patterns generated by distributed denial of service attacks, malware distribution networks and other malicious network traffic.

Learn data mining languages: R, Python and SQL

W3Schools – Fantastic set of interactive tutorials for learning different languages. Their SQL tutorial is second to none. You’ll learn how to manipulate data in MySQL, SQL Server, Access, Oracle, Sybase, DB2 and other database systems.

Treasure Data – The best way to learn is to work towards a goal. That’s what this helpful blog series is all about. You’ll learn SQL from scratch by following along with a simple, but common, data analysis scenario.

10 Queries – This course is recommended for the intermediate SQL-er who wants to brush up on his/her skills. It’s a series of 10 challenges coupled with forums and external videos to help you improve your SQL knowledge and understanding of the underlying principles.

TryR – Created by Code School, this interactive online tutorial system is designed to step you through R for statistics and data modeling. As you work through their seven modules, you’ll earn badges to track your progress helping you to stay on track.

Leada – If you’re a complete R novice, try Leada’s introduction to R. In their 1 hour 30 min course, they’ll cover installation, basic usage, common functions, data structures, and data types. They’ll even set you up with your own development environment in RStudio.

Advanced R – Once you’ve mastered the basics of R, bookmark this page. It’s a fantastically comprehensive style guide to using R. We should all strive to write beautiful code, and this resource (based on Google’s R style guide) is your key to that ideal.

Swirl – Learn R in R – a radical idea certainly. But that’s exactly what Swirl does. They’ll interactively teach you how to program in R and do some basic data science at your own pace. Right in the R console.

Python for beginners – The Python website actually has a pretty comprehensive and easy-to-follow set of tutorials. You can learn everything from installation to complex analyses. It also gives you access to the Python community, who will be happy to answer your questions.

PythonSpot – A complete list of Python tutorials to take you from zero to Python hero. There are tutorials for beginners, intermediate and advanced learners.

Read all about it: data mining books

Data Jujitsu: The Art of Turning Data into Product – This free book by DJ Patil gives you a brief introduction to the complexity of data problems and how to approach them. He gives nice, understandable examples that cover the most important thought processes of data mining. It’s a great book for beginners but still interesting to the data mining expert. Plus, it’s free!

Data Mining: Concepts and Techniques – The third (and most recent) edition will give you an understanding of the theory and practice of discovering patterns in large data sets. Each chapter is a stand-alone guide to a particular topic, making it a good resource if you’re not into reading in sequence or you want to know about a particular topic.

Mining of Massive Datasets – Based on the Stanford Computer Science course, this book is often cited by data scientists as one of the most helpful resources around. It’s designed at the undergraduate level with no formal prerequisites. It’s the next best thing to actually going to Stanford!

Hadoop: The Definitive Guide – As a data scientist, you will undoubtedly be asked about Hadoop. So you’d better know how it works. This comprehensive guide will teach you how to build and maintain reliable, scalable, distributed systems with Apache Hadoop. Make sure you get the most recent edition to keep up with this fast-changing service.

Online learning: data mining webinars and courses

DataCamp – Learn data mining from the comfort of your home with DataCamp’s online courses. They have free courses on R, Statistics, Data Manipulation, Dynamic Reporting, Large Data Sets and much more.

Coursera – Coursera brings you all the best university courses straight to your computer. Their online classes will teach you the fundamentals of interpreting data, performing analyses, and communicating insights. They have topics for beginners and advanced learners in Data Analysis, Machine Learning, Probability and Statistics, and more.

Udemy – With a range of free and paid data mining courses, you’re sure to find something you like on Udemy no matter your level. There are 395 courses in the area of data mining alone! All their courses are uploaded by other Udemy users, meaning quality can fluctuate, so make sure you read the reviews.

CodeSchool – These courses are handily organized into “Paths” based on the technology you want to learn. You can do everything from build a foundation in Git to take control of a data layer in SQL. Their engaging online videos will take you step-by-step through each lesson and their challenges will let you practice what you’ve learned in a controlled environment.

Udacity – Master a new skill or programming language with Udacity’s unique series of online courses and projects. Each class is developed by a Silicon Valley tech giant, so you know that what you’re learning will be directly applicable to the real world.

Treehouse – Learn from experts in web design, coding, business and more. The video tutorials from Treehouse will teach you the basics and their quizzes and coding challenges will ensure the information sticks. And their UI is pretty easy on the eyes.

Learn from the best: top data miners to follow

John Foreman – Chief Data Scientist at MailChimp and author of Data Smart, John is worth a follow for his witty yet poignant tweets on data science.

Nate Silver – He’s Editor-in-Chief of FiveThirtyEight, a blog that uses data to analyze news stories in Politics, Sports, and Current Events.

Andrew Ng – As the Chief Data Scientist at Baidu, Andrew is responsible for some of the most groundbreaking developments in Machine Learning and Data Science.

Bernard Marr – He might know pretty much everything there is to know about Big Data.

Gregory Piatetsky – He’s the author of popular data science blog KDNuggets, the leading newsletter on data mining and knowledge discovery.

Christian Rudder – As the co-founder of OkCupid, Christian has access to one of the most unique datasets on the planet, and he uses it to give fascinating insight into human nature, love, and relationships.

Dean Abbott – He’s contributed to a number of data blogs and authored his own book on Applied Predictive Analytics. At the moment, Dean is Chief Data Scientist at SmarterHQ.

Facebook – As with many social media platforms, Facebook is a great place to meet and interact with people who have similar interests. There are a number of very active data mining groups you can join.

LinkedIn – If you’re looking for data mining experts in a particular field, look no further than LinkedIn. There are hundreds of data mining groups ranging from the generic to the hyper-specific. In short, there’s sure to be something for everyone.

Meetup – Want to meet your fellow data miners in person? Attend a meetup! Just search for data mining in your city and you’re sure to find an awesome group near you.

Webinar: Text and Data Mining: The Way Forward, June 30, 10am (EDT)

a critically important means of uncovering patterns of intellectual practice and usage that have the potential for illuminating facets and perspectives in research and scholarship that might otherwise not be noted. At the same time, challenges exist in terms of project management and support, licensing and other necessary protections.

Audrey McCulloch, Chief Executive, Association of Learned Professional and Society Publishers (ALPSP) and Director of the Publishers Licensing Society

Text and Data Mining: Library Opportunities and Challenges
Michael Levine-Clark, Dean and Director of Libraries, University of Denver

As scholars engage with text and data mining (TDM), libraries have struggled to provide support for projects that are unpredictable and tremendously varied. While TDM can be considered a fair use, in many cases contracts need to be renegotiated and special data sets created by the vendor. The unique nature of TDM projects makes it difficult to plan for them, and often the library and scholar have to figure them out as they go along. This session will explore strategies for libraries to effectively manage TDM, often in partnership with other units on campus and will offer suggestions to improve the process for all.

Michael Levine-Clark, the Dean and Director of the University of Denver Libraries, is the recipient of the 2015 HARRASSOWITZ Leadership in Library Acquisitions Award. He writes and speaks regularly on strategies for improving academic library collection development practices, including the use of e-books in academic libraries, the development of demand-driven acquisition models, and implications of discovery tool implementation.

This talk will address the challenges and successes that the MIT libraries have experienced in providing enabling services that deliver TDM access to MIT researchers, including:
· emphasizing TDM in negotiating contracts for scholarly resources

· defining requirements for licenses for TDM access

· working with information providers to negotiate licenses that work for our researchers

· addressing challenges and retooling to address barriers to success

· offering educational guides and workshops

· managing current needs vs. the long-term goal: TDM as a reader’s right

Ellen Finnie is Head of Scholarly Communications & Collections Strategy in the MIT Libraries. She leads the MIT Libraries’ scholarly communications and collections strategy in support of the Libraries’ and MIT’s objectives, including efforts to influence models of scholarly publishing and communication in ways that increase the impact and reach of MIT’s research and scholarship and promote open, sustainable publishing and access models. She leads outreach efforts to faculty in support of scholarly publication reform and open-access activities at MIT, and acts as the Libraries’ chief resource for copyright issues and for content-licensing policy and negotiations. In that role, she negotiates licenses to include text/data-mining rights and coordinates researcher access to TDM services for licensed scholarly resources. She has written and spoken widely on digital acquisitions, repositories, licensing, and open access.

Jeremy Frey, Professor of Physical Chemistry, Head of Computational Systems Chemistry, University of Southampton, UK

Text and Data Mining (TDM) facilitates the discovery, selection, structuring, and analysis of large numbers of documents/sets of data, enabling the visualization of results in new ways to support innovation and the development of new knowledge. In both academia and commercial contexts, TDM is increasingly recognized as a means to extract, re-use and leverage additional value from published information, by linking concepts, addressing specific questions, and creating efficiencies. But TDM in practice is not straightforward. TDM methodology and use are fast changing but are not yet matched by the development of enabling policies.
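As a toy illustration of the extraction-and-linking step described above (not any particular TDM product's method), a few lines of Python can count terms across a small document set and surface the concepts the documents share. The documents themselves are invented examples:

```python
# Toy TDM sketch: extract term frequencies from a small corpus and find
# terms shared across documents. The two documents are invented examples.
import re
from collections import Counter

docs = {
    "doc1": "Data mining reveals patterns in large document collections.",
    "doc2": "Text mining links concepts across documents to create new knowledge.",
}

def term_frequencies(text: str) -> Counter:
    """Lowercase, tokenize on alphabetic runs, and count terms."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

# Aggregate counts over the whole corpus.
corpus_counts = Counter()
for name, text in docs.items():
    corpus_counts += term_frequencies(text)

# Terms appearing in every document hint at concepts linking the set.
shared = set(term_frequencies(docs["doc1"])) & set(term_frequencies(docs["doc2"]))

print(corpus_counts.most_common(3))
print(sorted(shared))
```

Real TDM pipelines add stemming, stop-word removal, and entity linking on top of this basic counting step, which is part of why, as the passage notes, practice is not straightforward.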

This webinar provides a review of where we are today with TDM, as seen from the perspective of the researcher, library, and licensing-publisher communities.

How do you present the idea of your research and intertwine it with data in a cohesive, interesting way? Join us in a short session to learn effective communication through infographics using data visualization and design.