Do Good Data 2016

One of the things we love about Do Good Data is that it provides an opportunity to connect with other social sector leaders like us. It is so easy to feel alone in this work, and here are 1,000 people just like you! We are very excited about the community features and the opportunity to connect with one another. Take advantage of them and make the most of this community.


Traditionally, educational data has been used for unit-wide analysis (i.e., school, classroom, district, college) and diagnostics. This talk is about how we can transition to a world of personalized learning and just-in-time interventions by applying analytics and machine learning to data that is already collected. It will draw on a series of projects by the Impact Lab and other organizations and provide a real-world playbook for implementing these ideas.

Feeding America is a network of food banks across the U.S. that helps 1 in 9 Americans feed themselves and their families. As of today, the network provides 3 billion meals—but another 8 billion meals are needed to truly close the meal gap. This session will explore the innovative approach food bank managers are taking to use data to streamline operations and identify new sources of food donations to fill the gap.

In today’s world of Big Data, there’s never been more information available to work with. Unfortunately, all this data is hard to use, especially if it’s been entered by multiple people or came from different systems. The simple task of figuring out who is who in a spreadsheet or database can be a daunting, time-consuming task.

Derek Eder and Forest Gregg from DataMade will demonstrate Dedupe, a machine learning service that, based on training you give, learns the optimal way to quickly and automatically cluster similar records in any spreadsheet or database.
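Dedupe itself learns per-field distance weights and blocking rules from the training labels you provide; as a rough illustration of the end result, here is a minimal stand-in (not the Dedupe API, just standard-library string similarity and greedy clustering over invented example records):

```python
from difflib import SequenceMatcher

# Toy records with inconsistent data entry (hypothetical example data).
records = {
    1: "Jane Smith, 123 Main St",
    2: "jane smith, 123 main street",
    3: "Robert Jones, 9 Oak Ave",
    4: "Bob Jones, 9 Oak Avenue",
}

def similarity(a, b):
    """Crude string similarity in [0, 1]; Dedupe learns richer, per-field distances."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def cluster(records, threshold=0.75):
    """Greedily group records whose similarity to an existing cluster member passes the threshold."""
    clusters = []
    for rid, text in records.items():
        for c in clusters:
            if any(similarity(text, records[other]) >= threshold for other in c):
                c.append(rid)
                break
        else:
            clusters.append([rid])
    return clusters

print(cluster(records))  # the two Janes and the two Joneses land in the same clusters
```

In practice, hand-rolled thresholds like this break down quickly, which is exactly the problem Dedupe's trained models are designed to solve at scale.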

When nonprofits think of data, Excel is often the only tool at their disposal for analysis, visualization, and storage. With so many new tools constantly coming out, choosing where to invest your organization’s limited resources can feel bewildering. At the same time, enormous operational data demands can make innovation exhausting.

Attendees will hear the multi-year story of my tool evolution and use at an education nonprofit. We start in a world of having only Excel and spending days manually formatting reports, and end in a world of Excel plus databases, Tableau, and Python. There will be numerous demonstrations of past (and current) applications of these tools. For every tool, attendees will hear the motivations behind adoption, the challenges in implementation, and the long-term impact.

Attendees will walk away having learned:
- The strengths and limitations of Excel
- Exposure to new tools (VBA, SQL, Tableau, Python), with a reasoned understanding of their respective strengths and weaknesses
- Approaches for identifying process bottlenecks in a data workflow
- An understanding of data challenges as falling into collection, storage, access, visualization, and analysis
- Key questions to ask (and get answered) before investing in a new tool
- Strategies for successfully launching and implementing new tools

Some social sector organizations are born understanding how to use data effectively. For many others, however, the data team(s) must become advocates for change in not only the way their staff interacts with technology, but in the way they work and solve problems.

Many organizations across the social sector have taken the initial steps of collecting programmatic data and managing databases to store it. This multi-disciplinary panel will address what some have called the last mile of the data value chain: engineering ways for data to add value to the daily lives of staff members. If you’re called to effect change in the way your organization uses data to inform its work, this is a panel you won’t want to miss.

Panelists for this interactive discussion include professionals with significant experience helping foundations, nonprofits, and local governments make effective, day-to-day use of data. We will share time-tested strategies and examples of techniques, tools, and thought leadership pieces that will help you create a learning culture in your organization.

Imagine asking 20,000 Iranians about the future of their country in 12 days; or 64,000 people about open government simultaneously in 62 countries; or citizens in 51 countries about potentially highly divisive issues such as same-sex marriage. Welcome to the new era of Global Citizen Voice.

This session will be presented by Eric Meerkamper, President of RIWI, which collects citizen opinion via innovative online software called Random Domain Intercept Technology (RDIT) for clients including the World Bank, Freedom House, the MasterCard Foundation, the Munk Centre for Global Affairs, and the International Foundation for Electoral Systems.

Initially used at the University of Toronto for global pandemic surveillance, RDIT has since been applied to understanding and addressing global challenges such as women’s rights in ISIS-influenced countries, mental health stigma, creating an Arctic sanctuary, women’s voting initiatives in Indonesia, and youth optimism in sub-Saharan Africa.

So, what do you want to ask the world?

Bridging the gap between fundraising and programs — or, more specifically, connecting an organization’s fundraisers to the populations they serve — is a critical component of any cohesive revenue-generating fundraising strategy. But that relationship is typically explored as a one-way street; few organizations take time to ask whether there are spatial connections between the incidence rates of their chosen cause and giving patterns in a specific location. That’s where spatial analysis comes in.

Join Plenty Consulting’s Claire Thomison for a case study in leveraging spatial analysis and mapping to explore potential geographic connections between disease and donations. Claire will first give an overview of the basics of spatial analysis with geographic information systems (GIS) and its applications for nonprofits. Then she’ll walk you through the sources, tools, and techniques you’ll need to acquire, analyze, and overlay disease or cause incidence rate data onto your existing fundraising data, with a specific focus on QGIS, a free, open-source GIS application. Finally, using a real-life case study, one of Plenty’s largest nonprofit clients will dive into the insights and trends that surfaced from analyzing P2P program fundraising through the lens of area-level disease incidence rates, and how the program plans to leverage those findings to optimize its peer-to-peer fundraising strategy moving forward.

Attendees will leave this session with answers to left-brained questions, including:
• How do I approach spatial analysis for nonprofits and cause fundraising? (What are my software options and key considerations?)
• Where can I find publicly available data specific to my organization’s cause or disease?
• To what extent can incidence rates impact giving within a specified geographic area?
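For a sense of the kind of area-level comparison such a case study performs, here is a hedged Python sketch (county codes and figures are invented; the real work would happen in QGIS against actual shapefiles and fundraising exports): join incidence rates and donation totals on a shared geography, then correlate.

```python
# Hypothetical area-level data keyed by county FIPS code (all values invented).
incidence_per_100k = {"17031": 42.0, "17043": 31.5, "17089": 28.9, "17097": 35.2}
donations_usd      = {"17031": 180_000, "17043": 95_000, "17089": 72_000, "17097": 110_000}

def pearson(xs, ys):
    """Pearson correlation, written out so no external library is needed."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Join the two layers on the shared geography, then correlate.
common = sorted(incidence_per_100k.keys() & donations_usd.keys())
r = pearson([incidence_per_100k[k] for k in common],
            [donations_usd[k] for k in common])
print(f"incidence vs. giving, r = {r:.2f}")
```

A correlation like this is only a starting point; the mapping and overlay steps in a GIS tool are what make the spatial pattern visible and actionable.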

The Center for Employment Opportunities (CEO) is dedicated to providing immediate, effective, and comprehensive employment services to men and women with recent criminal convictions. Our highly structured and tightly supervised programs help participants regain the skills and confidence needed for successful transitions to stable, productive lives.

CEO is collecting client feedback through SMS text delivery to calculate Net Promoter Score (NPS), an index that measures customer satisfaction and loyalty. This is the kind of information that can be used for program improvement, and it comes straight from the most important source: the voice of the customer. While timely client feedback data is regularly collected and used in the private sector, the social sector has focused more on longer-term evaluations that take time to complete and do not lead to the kind of rapid changes that are critical for continuous improvement. Through the Constituent Voice project, CEO is committed to collecting real-time data that leads to real-time improvements.

CEO’s Constituent Voice project integrates SMS text delivery within Salesforce to gather client feedback. We send participants SMS text messages at various program touchpoints asking them questions about their experience, giving them structured opportunities to provide real-time feedback to CEO staff. That feedback is tracked and analyzed using Tableau’s powerful data visualization technology. CEO will demonstrate how to use SMS text delivery with Salesforce and Tableau integration to create visual displays that allow us to track feedback and NPS over time, assess how feedback changes as improvements are implemented, and explore how feedback is associated with various program outcomes.

People will learn how to use SMS text delivery methods that integrate with Salesforce and Tableau to obtain real-time client feedback that leads to real-time quality improvement in the social sector. They will also learn how to create data visualizations in Tableau to monitor quality improvement. We will demonstrate how to aggregate individual-level feedback into an organization-wide Net Promoter Score (NPS), track org-wide NPS over time using line graphs, assess program improvement through changes in NPS using annotated control charts, and analyze the influence of NPS on participant outcomes through Tableau visualizations.
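As a concrete sketch of the aggregation step: NPS classifies 9-10 ratings as promoters and 0-6 as detractors, then reports the percentage-point difference on a -100 to 100 scale. A minimal Python version (with made-up SMS responses) looks like this:

```python
def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6), on a -100..100 scale."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return round(100 * (promoters - detractors) / len(scores))

# Hypothetical batch of 0-10 responses to "How likely are you to recommend us?"
responses = [10, 9, 8, 7, 10, 6, 3, 9, 10, 5]
print(nps(responses))  # 5 promoters, 3 detractors out of 10 -> 20
```

Tracking this single number per program touchpoint over time is what makes the line graphs and control charts described above possible.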

While machine learning, and recommendation engines in particular, can be powerful tools for your users, by their very nature they can also unintentionally judge and segment your users in ways that go against your core values as an organization. Given their opaque nature, it can be difficult to find and fix these problems, and if the media or your members find them first, it can turn into a PR nightmare. We will learn how recommendation engines commonly work in order to understand their weaknesses, and hear specific examples of recommendation engines gone wrong and what might have been done differently to prevent it.

Attendees will learn:
- How recommendation engines work.
- Why machine learning and recommendation engines can be difficult, what their weaknesses are, and how they can go wrong.
- How to make sure that a machine learning model or recommendation engine doesn't undermine your organization's core values.

Too often organizations declare victory in providing access to data, without considering how different types of users want to consume and/or make use of the data. Researchers, web developers, policy makers, policy influencers, the media, and consumers all use data differently. Understanding how they think about data, find it, and use it will help data publishers and storytellers be more effective. This session will provide actionable advice on:
- Engaging your users to understand what they really want
- Making your data easier to find and understand
- Serving sophisticated data users without overwhelming casual users
- Working with internal data owners and decision-makers to help them think more about these audiences’ needs
- Maximizing the reach of your data to secondary and tertiary audiences

The session will draw upon extensive audience research and over a decade’s worth of work in the trenches delivering raw data, query tools and visualizations to people who need them. I'll share Forum One's framework for communicating data to different audience types and discuss real-world applications for nonprofit and government data.

What can anonymized telecommunications data tell us about human behavior? And how can such data be used by governments or the UN to improve operations? Come join us to discuss the latest scientific advancements in understanding human activity through telecom data and explore practical examples on how that data can be transformed into products that affect people’s lives.

The United Nations World Food Programme has deployed or tested new technology to monitor the impact of natural or man-made disasters. These technologies, including mobile surveys, crowdsourced data collection, and cell phone metadata analysis, enable the humanitarian community to obtain real-time information from areas that had been off-limits, supporting more agile responses. In particular, mVAM, WFP’s remote mobile data collection initiative, has been deployed to 20 countries, including Iraq, Yemen, and South Sudan, reflecting the value that such systems offer in volatile humanitarian settings. While new technology has enabled quicker and more effective data collection, its use also opens opportunities for new methodologies, data management practices, and partnerships.

Much like a snowfall is made up of many different snowflakes, the nonprofit sector is made up of many unique organizations. Yet we learn as much by studying our similarities as we do by dwelling on our differences. We can better understand our own organizations by understanding patterns and trends in the sector as a whole. Kerstin Frailey, GuideStar’s Director of Data Science, will help you discover your organization’s oftentimes surprising peers within the nonprofit sector. She will talk you through the insights gained from her analysis generated at the hub of the nonprofit universe. GuideStar holds a unique position as a data and analysis center, both gathering information from and serving organizations across the nonprofit community.

Attendees to this session will:
- Discover why it is essential to understand where your organization fits in the complex nonprofit universe.
- Gain cross-sector insights into naturally forming communities inside the nonprofit sector.
- Understand funding models and methods from those naturally forming communities.
- Appreciate the importance of contributing to the nonprofit data revolution.
- Gain insights into what your funding model tells you about the future of your organization.
- Learn about the new tools that GuideStar is creating to help nonprofits learn about themselves and their peers.
- Learn how to use GuideStar’s tools to benchmark your nonprofit against similar organizations.

The Wisconsin Dropout Early Warning System (DEWS) is an open source machine learning tool which provides bi-annual predictions about the likely graduation of nearly every student in grades 5-9 in the state of Wisconsin, roughly 240,000 students. DEWS has been nationally recognized and is a highly accurate system used in hundreds of Wisconsin schools to identify students in need of additional support to succeed in school.

This workshop will discuss the process of developing and deploying a machine learning tool within a large government organization. In the workshop I will start from the beginning -- identifying the business use of DEWS -- through the process of evaluating and persuading program staff of the utility of the machine learning approach.

Next, I will discuss the actual development of the machine learning algorithm, including the system architecture of DEWS as a standalone R program that sits between a data warehouse and an ETL routine that imports the predictions back into the warehouse. I will cover the tradeoffs that led to this configuration and some alternative solutions, as well as the business process of identifying and validating algorithms for use in the system and the R extensions used to automate this process.
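DEWS itself is an R program, and its real models are validated against competing algorithms; purely to illustrate the read-score-write-back shape of that architecture, here is a hedged Python/sqlite sketch with an invented schema and a toy scoring rule:

```python
import sqlite3

# Stand-in for the data warehouse (columns and values invented for illustration).
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE students (id INTEGER, attendance REAL, gpa REAL)")
con.executemany("INSERT INTO students VALUES (?, ?, ?)",
                [(1, 0.98, 3.6), (2, 0.71, 1.9), (3, 0.88, 2.7)])
con.execute("CREATE TABLE predictions (id INTEGER, risk_score REAL)")

def risk_score(attendance, gpa):
    """Toy risk rule; the real system trains and validates statistical models instead."""
    return round(max(0.0, 1.0 - 0.5 * attendance - 0.15 * gpa), 2)

# The ETL step: pull features out, score each student, load predictions back in.
rows = con.execute("SELECT id, attendance, gpa FROM students").fetchall()
con.executemany("INSERT INTO predictions VALUES (?, ?)",
                [(sid, risk_score(att, gpa)) for sid, att, gpa in rows])
con.commit()
print(con.execute("SELECT * FROM predictions ORDER BY risk_score DESC").fetchall())
```

Keeping the scoring program standalone, as DEWS does, means the warehouse and the model can evolve independently as long as the import/export contract holds.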

DEWS has been operational for three years now and I will discuss how it compares to other systems in place across the country, and the lifecycle of the system since it was first used. Finally, I will discuss the efforts at DPI to open source our solution and share it with others, give a demo of the open source code behind DEWS, and close with a discussion about the challenges and opportunities related to making a large-scale analytics project open source from within a government organization.

The goal of the talk is not to teach the audience about the finer points of machine learning, but instead to discuss the tradeoffs associated with using a machine learning application within a government agency. The talk will discuss how to translate from statistics and machine learning into the language of the substantive domain. Additionally, I will discuss the varying incentives operating within an organization and how to identify allies and learn from critics.

A great introduction to Tableau regardless of technical or analytical background. This session will touch on core concepts in data visualization and show how Tableau empowers everyone in an organization to see, understand and make decisions informed by their data. Learn how to connect to different data sources, explore different chart and dashboard types, and discover the powerful analytical tools built right into the software.

To get the most out of the live training, bring a laptop with Tableau Desktop already loaded on your machine. Also, please download the UN Asylum Seekers dataset ahead of time. This is the data set we’ll be using to start learning basic functionality within Tableau.

Embracing the latest tools and technologies to revolutionise market, social, and opinion research is an exciting prospect for all of us. It's a fantastic time to be a data scientist - everything seems possible!

As we move towards the era of smart data, we need to be even more vigilant, despite our excitement to adopt new tools and methods, so that we don't undermine the trust that has been built over decades.

Join us as we talk through regulators' top concerns, and the principles they expect us to implement within a research project.

From the privacy policy to the collection of consent, and from the design of your data management framework to the expected security measures, we'll talk through the steps and provide practical guidance to ensure your data sets remain your best asset rather than your worst liability.

We'll underline how the ICC/ESOMAR Code and our guidelines can be used in your day-to-day work to help you innovate and stay ahead of the regulatory curve. You'll come away knowing what's essential to bear in mind to ensure our industry's future remains possible, turning privacy into a competitive selling point rather than inadvertently creating a future where regulators increasingly see us as a threat to individuals' privacy.

Excel is a tool that some people use every day. This session will provide you with tips and tricks that will help you analyze, visualize, manage, and just generally do more with your civic data using Excel.

Attendees will learn:
- Tips for finding and loading data into Excel.
- Managing and cleaning data for your purposes.
- Using functions to break down or combine data.
- Mapping using Excel.
- Building queries using Power BI.

People are often eager to use data to derive meaningful insights and answer important questions. However, data comes to us in a variety of formats, from poorly formatted Excel files to sparsely populated CSVs, and it has to go through a thorough cleaning process before it can be used for interesting things. In this workshop, I will go through one of the least appreciated parts of data analytics: data cleaning. This tutorial will introduce some fun into the data cleaning process and encourage individuals to bring an investigative methodology to it. If data scientists start their analytic workflow with a strong investigative process during data cleaning, they can discover interesting information earlier, avoid presenting inaccurate and incomplete information, and reinvent their viewpoint on data cleaning.

After this session, attendees will walk away having learned the following:
(A) A set of questions to ask when approaching a new data set.
(B) How to effectively use Pandas' features to clean a data set.
(C) How to effectively use command line functionality to clean a data set prior to using Pandas or another tool.
(D) How to fuse the exploratory data analysis and data cleaning phases to form a more cohesive and robust process.
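As a small taste of point (B), here is a hedged Pandas sketch (the messy file and its column names are invented) showing the first investigative questions to ask of a new data set: stray whitespace, inconsistent casing, duplicates, and missingness.

```python
import io
import pandas as pd

# A messy extract of the sort the session describes (contents invented).
raw = io.StringIO(
    "name , age,city\n"
    " Alice ,34,Chicago\n"
    "Bob,,chicago\n"
    " Alice ,34,Chicago\n"
)
df = pd.read_csv(raw)

df.columns = df.columns.str.strip()              # 'name ' -> 'name'
df["name"] = df["name"].str.strip()              # trim stray whitespace in values
df["city"] = df["city"].str.title()              # normalize inconsistent casing
df = df.drop_duplicates().reset_index(drop=True) # the repeated Alice row goes away

print(df.isna().sum())   # Bob's missing age surfaces immediately
print(df)
```

The point is less the specific calls than the habit: every cleaning step above started as a question about the data, which is the investigative mindset the workshop encourages.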

The last ten years have seen an unprecedented level and pace of change affecting the nonprofit space. Consider the cause marketing revolution, the rise of the millennial generation, and the increased pressure on nonprofits for transparency and accountability. The pressure to evolve, be relevant, and make an impact is greater than it’s ever been before – and many organizations are struggling to keep hold of their culture, personnel, and constituents while they grapple with change and disruption.

It’s time to recognize that change isn’t an option, but a necessity. Join Jeff Shuck for an honest dialogue about change in the nonprofit space – what it looks like, what it feels like, and what it means for you – whether you’re handing change down to your team or trying to roll with change dictated to you from above. He’ll walk you through effective strategies for turning disruption into progress, and provide examples of successful change management from ten years as a consultant in the nonprofit space.

Participants will leave with answers to the following questions:
• What trends are impacting the evolution of the nonprofit space?
• How should I approach change, and how can I lead it from my post?
• How should I discuss changes with my team?
• How should my organization communicate change to our constituents?

The world is a diverse place. Yet one common mistake data analysts make is to treat the population as a homogeneous group. This workshop will talk about the problems caused by treating all populations as one, and some common-sense approaches to respecting diversity in analysis. It will include strategic advice on creating equitable and realistic research questions and defining “sensitive populations”; data collection advice, such as how to design instruments sensitive to population differences and how to collect data from populations that may be traditionally difficult to recruit; and analysis advice, such as when it is fair to combine populations and when they should be treated separately.

Anecdotes, case studies, and examples from the literature, plus the Museum of Science and Industry’s own experience, will be used to discuss these topics. Basic use of SPSS and Excel will be included to introduce some best practices for presenting data from diverse groups. All discussions will be held through the lens of mindfully supporting equity and fairness at every stage of the decision-making process when working with sensitive populations.

The most important takeaway will be knowing what questions to ask at each stage of the data collection and analysis process. The issue of diverse populations needs to be considered from beginning to end, not relegated to a one-time thought or discussion. We will cover the implications of those discussions, with examples: the best formats for asking ethnic/racial questions on surveys, when it is acceptable to merge demographic groups in analysis, respecting gender identities, conducting interviews with at-risk populations, less biased recruiting strategies, fair compensation/incentives, IRB implications, how to visualize and report sensitive data, and more.

HomeKeeper is a data management system that allows affordable home ownership organizations to track their complex day-to-day program management workflow in Salesforce. It is currently in use by 70 different home ownership organizations across the country. The system improves coordination between homeowners and program staff. It also allows organizations to more consistently manage compliance monitoring and homebuyer support activities.

The system is connected to a shared measurement system called the HomeKeeper National Data Hub. The Hub aggregates and crunches affordable housing data from member organizations to produce social impact reports. It uses Salesforce and Tableau to calculate a set of social impact metrics based on a core set of common fields and program procedures. The Hub provides a national view of how these programs are doing at meeting the needs of underserved buyers and preserving the affordability of homes as they resell. The shared measurement system also provides programs with benchmarking data comparing them to their peers.

In this session you will learn:
1. The history of the system, from its humble start with 3 organizations in the Northwest to a system in use by 70 organizations.
2. Insights from the social impact report generated from the shared measurement system. This will include a demonstration of the relevant parts of the system.
3. The factors that have driven its success, including the keys to replicating this in other sectors.

Currently, only the CDC and NIH offer large-scale datasets on mental health and crisis. Neither has access to real-time data that shows trends over time and space or captures how people think about crisis in their moment of greatest need. Crisis Text Line is starting to fill the need for real-time data at scale.

Since August 2013, Crisis Text Line has exchanged 7.5M messages with people in crisis. This is already the largest corpus of data on people in crisis in the country. By mid-2016, Crisis Text Line will be exchanging 100,000 messages per day.

Crisis Text Line has a volume, velocity, and variety of data that allows us to provide real-time insights. We use our data in two ways: (1) to improve the quality of our service internally, and (2) to improve the crisis space as a whole. In order to achieve #2, Crisis Text Line has built a series of real-time dashboards for the public and partners, all on a tight budget and with a very limited data staff.

Attendees will walk away having learned how to extract, transform, and load data in a way that's easy for Tableau to digest, and how to set up private dashboards using Tableau. Ultimately, Tableau shows the power of off-the-shelf solutions for creating visualizations that enable remote staff and partners to make smarter decisions.
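One common transform step, sketched below with invented numbers, is melting a wide export into the tidy, one-row-per-observation shape that Tableau digests most easily; for small files the standard library is enough.

```python
import csv
import io

# A wide export (one column per month, values invented): fine for humans,
# awkward for Tableau, which prefers one row per observation.
wide = io.StringIO("region,jan,feb,mar\nNorth,120,135,150\nSouth,90,88,101\n")

# "Transform": melt the month columns into (region, month, conversations) rows.
long_rows = []
for row in csv.DictReader(wide):
    for month in ("jan", "feb", "mar"):
        long_rows.append({"region": row["region"],
                          "month": month,
                          "conversations": int(row[month])})

print(long_rows[0])  # {'region': 'North', 'month': 'jan', 'conversations': 120}
```

Once the data is in this long shape, month and region become ordinary fields that Tableau can drag onto shelves, filter, and aggregate without any custom work.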

Knowing what happened in the past is helpful, but predicting what might happen in the future is game changing. This session will break down the process of creating predictive analytics for nonprofit fundraising. We’ll explore lessons learned over the years and the iterative nature of the process, and discuss the data sources along with the process and technology used. The learning objectives for this session include:

- How predictive modeling fits into an analytics maturity model
- What problems predictive models can solve for nonprofits and fundraisers
- How to avoid some of the common pitfalls in developing predictive analytics
- What the development process for predictive analytics looks like from start to finish

We’re entering new territory for peer-to-peer fundraising to better understand the ‘why.’ Why do some take action when others don’t? Armed with this knowledge, you’ll be able to make decisions for your program that increase revenue. And isn’t it everyone’s goal to increase revenue?

It turns out peer-to-peer fundraising is the perfect psychological storm. Data analysis tells us how participants act. Psychology tells us why they act that way. Having both at your disposal makes you a powerful fundraising professional. Understanding why people make decisions to use their time in particular ways will help you transform how you structure your program, and may cause you to question your own decision-making methods.

This session will help you:
· Gain insights into the mindset and tendencies of the volunteer fundraiser
· Learn how to avoid market relationships and foster social relationships
· Understand the counter-intuitive results you might get in your data
· Understand how to instill and grow affinity in your fundraisers
· Plan how to move even zero-dollar fundraisers into action using these principles

Dig into this kind of fun and more. Presenting will be Otis Fulton, an expert in psychology who was shanghaied into the study of peer-to-peer fundraising, and Katrina VanHuss, a 27-year industry veteran determined enough to press the 6’10” Fulton into psychological service. She looked like a dachshund bringing down a black bear, but she got it done.

Stories and data are changing the nonprofit and philanthropy landscape. How do you gain insight on new ways in which those stories and data can be collected to inform community decision-makers? How do you collect data on the people you're not listening to? We will share concrete examples about how affordable and convenient technology can be used to tell the stories and experiences of communities. You will hear about how the availability of stories and data is changing the nonprofit and philanthropy landscape at the community level.

With the increasing dominance of social networks, we are becoming accustomed to looking at data from a network perspective, where entities are nodes in a graph data structure, and relationships between them are links or edges.

The usefulness of this perspective, however, is broader: it can provide fresh insights in many application domains. For example, it can facilitate classification tasks by adding network-based dimensions to the feature set; it can be used to model resource flows between economic actors; and it can be used in recommender systems to help rank recommendations. The possibilities are endless.

To help participants make better use of this powerful perspective, this session will introduce the NetworkX Python library: building a network data structure, exploring the data using various graph metrics and graph analysis algorithms, and using the data structure to answer questions about the underlying subjects that would otherwise be difficult to answer.
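As a preview of the kinds of questions graph metrics can answer, here is a short NetworkX sketch (the collaboration edges are invented) that builds a graph and asks who is most connected, who brokers between groups, and how far apart two actors are:

```python
import networkx as nx

# A small collaboration network (edges invented for illustration).
G = nx.Graph()
G.add_edges_from([
    ("Ana", "Ben"), ("Ana", "Cruz"), ("Ben", "Cruz"),
    ("Cruz", "Dee"), ("Dee", "Eli"),
])

centrality = nx.degree_centrality(G)        # who has the most ties?
bc = nx.betweenness_centrality(G)           # who sits on the most shortest paths?
bridge = max(bc, key=bc.get)
path = nx.shortest_path(G, "Ana", "Eli")    # how far apart are two actors?

print(max(centrality, key=centrality.get))  # Cruz has the most ties
print(bridge)                               # Cruz also brokers between the two groups
print(path)                                 # ['Ana', 'Cruz', 'Dee', 'Eli']
```

Questions like "who bridges these two communities?" are painful to express against a flat spreadsheet but fall out of the graph structure in one line.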

Collecting “good” data in normal circumstances is difficult enough, but in volatile contexts, such as conflict and post-conflict zones, it is even harder. Culture, trust, political instability, getting beyond the “façade” a country puts on for its visitors, and dealing with everyday logistics all play major and unpredictable roles. Don’t forget governmental censorship of data, leading to the phenomenon of “official numbers” versus “real numbers” (hint: most NGOs and donor agencies unknowingly use official numbers, resulting in a misleading picture of the scope and roots of the problems).

Yet data plays a vital role, especially for impact organizations, whether they are international NGOs such as the Red Cross and Catholic Relief Services, donor agencies such as USAID or the United Nations, or small social enterprise startups. Data allows stakeholders, shareholders, and policymakers to make informed decisions about the strategic distribution of resources and funding streams so that they meet the needs of the people. More importantly, data gives people opportunities to empower themselves and hold their governments accountable.

In this interactive workshop, we will use real case studies and scenarios to explore the intersection of data and ethnography. Participants will break up into teams and actively come up with solutions to the following questions:
• In these circumstances, what is considered “good” data? What are the outside influences that could mislead or misdirect “good” data?
• What kind of data should be expected, and which is the most impartial and feasible: qualitative or quantitative?
• What are the best practices for collecting such data from people who do not trust strangers, especially Westerners, in places where macro-data simply does not exist?
• When we are crunching and analyzing collected data, what do we need to keep in mind?

Over the past few years we have worked with the Catholic Church to measure the spiritual growth of Catholics and identify which factors drive spiritual growth (beliefs do not make a difference, interestingly). We will demonstrate how we built a scalable automated platform that collects, analyzes, and visualizes survey data collected thus far from over 100,000 parishioners so that the 225 parishes in 12 dioceses that we worked with are able to drive rapid hypothesis testing, benchmarking, and decision making.

The biggest corporations in the world are using data science and machine learning to drive their business - why aren’t we using the same approach to solve the world’s biggest problems? We’ll learn how different non-profits and government agencies are using data to make their programs more efficient and effective.

We’ll cover case studies from education, healthcare, environment, and criminal justice. You will learn about the data and skills that can bring data science into your organization at different levels of scale, from Excel to robust Machine Learning. We will talk about what kinds of problems can (and cannot) be solved with data science, and the specific problems you can solve with the data and organization you have today.

We would like to share our learnings from building a web tool that gives teachers readier access to their student data, literally at their fingertips. In partnership with the Somerville Public Schools and the City of Somerville, a Code for America Fellowship team of one developer, one designer, and one data scientist embarked on an endeavor that was just as much a journey of human relationships as it was a technical lift, if not more so. We learned that teachers want easier access to their student data so that they can respond in a timely way to their students’ needs, both academic and non-academic. Teachers currently create innovative hacks to do this: they log into multiple systems, collect their own classroom data, and collate it into Excel sheets to find correlations between student metrics and visualize them. We paired with teachers to create a tool that builds on the ways they already look at their data, but without the time drains.

Throughout our work, we also realized how locked down Student Information Systems can be, and wondered how schools might benefit from an open source Student Information System ETL. Would schools be interested in building apps and tools on top of their SIS data if they had easier access to it?

Attendees will learn:
+ How can access to student data by teachers, parents, and students support learning?
+ What analyses and visualizations can help support classroom instruction?
+ How can developers, designers, and data scientists collaborate with educators to build open source tools using student data?

Analytical skill, not the ‘latest and greatest’ BI software, is what makes an analyst effective. Learn how to explore a new data set in the most relevant and insightful way. Learn how to apply Exploratory Data Analysis (EDA) methodologies to a new data set.

My presentation will provide a roadmap for initially exploring a data set via a prescribed series of graphical analyses that will allow you to find the most important aspects of a data set.

Attendees should have beginner Tableau skills and general knowledge of data analysis and statistics.

State and local governments are the original data science organizations: through collecting data, predicting outcomes, analyzing results, and measuring change, they are responsible for transforming legislative language into concrete outcomes---all at an individual level. Advances in data science techniques bring new speed and scale to the public sector’s work. With developments in data warehousing, it’s possible to join together databases that once lived on separate servers; with predictive modeling, we can develop a deep understanding of current and potential users to inform outreach; and with data visualization, we can easily and clearly share technical insights with non-technical analysts. The result is an ever-improving feedback loop between measuring outcomes and implementing policy. Civis Analytics has worked with a state agency to improve regional savings through individual-level data science: merging state data with other sources to develop a meaningful understanding of the constituency, crafting a tailored outreach strategy that takes into account individual variance, expanding access beyond populations traditionally engaged and utilizing state resources, and measuring results in a way that informs future action.

Starting only with a list of the state agency’s constituents, Civis Analytics used advanced person-matching technology to enrich this data and build a more comprehensive, multi-faceted understanding of the agency’s data. Further, Civis Analytics helped the state agency develop an outreach plan to efficiently expand its reach. Do Good Data attendees will learn about the technologies Civis Analytics uses to match records, ditching rule-based matching in favor of machine-learned match propensities. Attendees will also learn how Civis can use current participation information to model others’ likelihoods of participating. Most importantly, Todd will discuss the methods used to ensure that individual-level models take into account the systemic disadvantages that racial and socioeconomic groups face, and that we consider the ways these models can improve outcomes among these groups.
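The shift from rule-based matching to a learned match propensity can be illustrated with a deliberately simplified sketch: score candidate record pairs with field-similarity features and treat the score as a match propensity. Civis's actual system is far more sophisticated; the records, fields, and scoring function here are invented for illustration only.

```python
# Hypothetical sketch: field-similarity features as a stand-in for a trained
# match-propensity model (the real system would learn weights from labeled pairs).
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalized string similarity between two field values (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_propensity(rec_a: dict, rec_b: dict) -> float:
    """Average field similarity as a stand-in for a model's match score."""
    fields = ("name", "city")
    return sum(similarity(rec_a[f], rec_b[f]) for f in fields) / len(fields)

a = {"name": "Jon A. Smith", "city": "Springfield"}
b = {"name": "John Smith", "city": "Springfield"}
score = match_propensity(a, b)
print(score)
```

A rule-based system ("exact match on name and city") would miss this pair entirely; a propensity score lets you rank pairs and choose a threshold suited to the task.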

Understanding of predictive modeling methods is helpful, but no prior knowledge is required.

This workshop will cover when to use D3 over other options, the foundational concepts for working in D3, and some interactive code-along time demonstrating how to build some basic (and not so basic) visualizations. Bring your computer!

The session will chronicle the experience of utilizing operational, program, and publicly available data to inform program adoption, expansion, and operation. Emphasis will be placed on planning and internal processes, such as determining the quality and applicability of public datasets to organizational data and engaging program stakeholders in the process. Data sets referenced will include public data sources such as the U.S. Census Bureau and non-profit research organizations such as the Urban Institute and Pew Research Center.

This session is designed for the novice analyst or “accidental analyst” who is unfamiliar with accessing and using public data sets. Attendees will leave with a stronger understanding of what it is like to work with public data, along with resources on public data quality, processes, and practices.

Pay for success (PFS) projects are about measurably improving the lives of people in need through an innovative contracting model that drives government resources toward high-performing social programs in areas such as poverty, education, child welfare, recidivism, homelessness, and wellness. PFS contracts track the effectiveness of programs over time to ensure that funding is directed toward programs that succeed in measurably improving the lives of people most in need. A PFS contract involves an end-payer, funders, service providers, third-party evaluators, and an intermediary advisor. This session provides perspectives on how data enables each of these stakeholders to construct a PFS project and how they use data to evaluate potential PFS projects.

Malaria remains one of the top killers of children under five in Africa. At the same time, tremendous progress has been made in the last decade that has allowed the global health community to focus on fully eliminating the disease even in high-burden settings. PATH is partnering with Ministries of Health in Ethiopia, Senegal, and Zambia to demonstrate how progress toward malaria elimination can be achieved. Some of the most critical tools in our arsenal have been the data generated from our field studies and our work to improve health information systems in these countries. Data visualization has become an essential approach for communicating how data from these two sources can be used by stakeholders and decision makers to focus scarce resources on eliminating the disease at a community level.

In this workshop PATH will demonstrate how we are linking a country’s DHIS2 (an open source health information platform) and Tableau to build better visualizations that quickly incorporate data from new interventions being tested to eliminate malaria. We will demonstrate specific visualizations that we use with our own staff, ministry officials, and district health teams to understand progress toward elimination and to interrogate whether the right health interventions are being applied in a given health facility catchment area. Our work to eliminate malaria has begun to show success, and we are able to visualize important gains being made to detect, treat, and prevent reinfection of malaria. At the same time, visualizations are also helping drive better data quality and improvement behaviors among the thousands of health workers, district medical officers, and district information officers we work with across our focus countries.

Attendees will gain insights into how we are integrating our field study data in real time with national routine health information data in the countries where we work. We will demonstrate how a global health organization uses DHIS2, Excel, and Tableau in non-enterprise environments to generate enterprise-quality dashboards and inspire use of the data and information at different layers of our organization and, more importantly, among key stakeholders we work with in ministries of health.

Data is not just the domain of the data analyst in a nonprofit. Everyone in an organization, from front-line staff to development to the Executive Director, needs to be able to access and synthesize your organization's data.

This workshop will introduce you to Shiny Server, an open source project that brings the full power of R to the web browser. At the Family Independence Initiative we use R and Shiny to rapidly develop and deploy data dashboards for internal consumption and for external partners. This workshop will give you the tools to quickly develop and deploy your own data dashboards online, empowering your whole organization to learn from your metrics.

Attendees of this workshop will walk away understanding the basics of how to develop applications in Shiny Server, how to stand up a Shiny Server of their own or on RStudio's hosted solution, and how to deploy applications.

The world is awash in textual data, from webpages to Twitter, and even (Gasp!) 990s. Natural language processing provides a way to create structure out of this unstructured world, and in so doing, provides myriad opportunities to derive value from many different data sources. This tutorial provides a high-level introduction to natural language processing (NLP). Although the session has elements geared towards analysts, attendees will leave with knowledge about what sorts of information can be extracted from text, what underlies many current NLP techniques, and where limitations lie.

In the session, we will cover collecting and preprocessing textual data, methods for visualization, as well as machine learning techniques. Supervised-learning approaches will include sentiment analysis, which allows you to assess the emotional sentiment in text, as well as document classification, a technique for tagging or bucketing documents according to a known categorization system. We will also explore unsupervised approaches such as topic modeling, a method for extracting the themes across a set of textual documents, as well as document clustering, which allows you to assess similarity across pieces of text.

Along the way you will learn some tricks of the trade, limitations and words of caution about these techniques, and places where you can get lots of textual data to play with. You will also receive information about well-developed libraries in Python and R for conducting NLP analyses. We’ll walk through some use cases in which NLP has been used for social good, and will also provide some examples of how nonprofits might incorporate natural language processing into their analytics workflow to organize their text and potentially extract actionable information.

Although you don’t have to be a data analyst / scientist to learn something from this workshop, familiarity with data processing and some basics of machine learning would be helpful. People who attend will walk away both with a high-level understanding of what sorts of things can be done with textual data, as well as practical skills such as harvesting, processing and analyzing text. A list of useful resources will be provided, in addition to code snippets in both R and Python so you can go and explore these techniques yourself!
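As a taste of the supervised side described above, here is a minimal document-classification sketch using scikit-learn, one of the well-developed Python libraries the session points to. The four documents and two labels are tiny hypothetical examples, not a real corpus; a useful model would need far more training data.

```python
# Minimal document classification: vectorize text, train, tag a new document.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = [
    "grant application for youth tutoring program",
    "volunteer signup for weekend food drive",
    "grant renewal for after-school tutoring",
    "food drive donation pickup schedule",
]
labels = ["education", "food", "education", "food"]

# TF-IDF turns each document into a weighted word-count vector;
# Naive Bayes learns which words are associated with which label.
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(docs, labels)

# Tag new, unseen documents according to the learned categorization.
predictions = model.predict([
    "tutoring volunteers needed for literacy program",
    "food drive this weekend",
])
print(predictions)
```

The same pipeline pattern extends to the other techniques mentioned: swap `MultinomialNB` for a clustering or topic-modeling estimator to move from supervised tagging to unsupervised theme discovery.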

Using the principles of User Experience (UX) and Information Design, we'll discuss some ways to present and explain your data to all the layfolk you'll encounter (like me!) who are interested in your results but may not have the skills—or interest—to pick through all the gory details. We'll talk about practical ways to design and present information so that it is intriguing, tells the right story, and is understood by your audience.

Attendees will take away some of the principles of User Experience and Information Design, along with practical ways to apply those principles to their data.

We'll also talk about some deceptive examples of information design as points of comparison.

Harnessing the power of Big Data lends an air of authority and depth to your work that can have a positive impact on grant makers and stakeholders alike. While there are legitimate concerns associated with Big Data, the resources useful to researchers, organizers, and others working to promote social good usually require or divulge little or no personal information. What IS required to utilize Big Data in your work is the insight to determine the right questions to ask and the persistence to pursue and distill information – skills many people working for the social good possess in abundance.

Primary Audience: Researchers, organizers, direct service workers, and others who could benefit from incorporating “hard” numbers, anecdotal narratives, and ethnographic data into written and other presentations.

Relevance: This presentation will use hands-on, interactive exercises with a variety of data samples to demonstrate how Big Data can be incorporated into qualitative narratives in addition to being presented in charts and graphs.

What is data science? What is machine learning? How can it change the way social sector organizations operate? Join Impact Lab co-founder Matt Gee as he helps illuminate the world of data science. One of our highest rated sessions of 2015 is back!

The humanitarian response landscape is changing rapidly. Many active conflict zones are now in middle-income countries, and these conflicts are spilling over into neighboring regions. In order to target diminishing resources and facilitate humanitarian responder coordination, it is imperative that innovative approaches be used. The integration of technology and humanitarian aid is in its infancy, but is absolutely imperative. Jesse Berns, CEO of Dharma Platform, will discuss the need for a paradigm shift from the old, anecdotal response model to a data-driven one.

Much of the world of data analytics and the use of big data was developed by data scientists, and its potential application to the social sector is only recently being explored. Amongst social sector stakeholders, there is a budding perception that new sources and types of data – including big data – have the potential to support evidence-based decisions by supplementing traditional monitoring and evaluation (M&E) approaches. While conventional M&E can generate retrospective data and evidence about the effectiveness and impact of a project or program after the fact, data analytics and big data may bring a different, forward-looking lens to the table. If the proliferation of new forms of dynamic and tech-enabled data can generate early insights into social and transactional data, the nature of evidence-based decision making may shift from retrospective, addressing the ‘what happened’ and ‘why,’ to prospective, addressing the ‘what may happen.’

This session will discuss the implications of new data types, sources and techniques for all aspects of evidence based decision-making, including monitoring and evaluation, in order to predict potential impact. Panelists will discuss and debate the application of tech-enabled data collection, aggregation and analysis to shape more impactful programs and identify and discuss current tensions in this debate.

Government is increasingly relying upon data to provide better services for its constituents. In this session we will hear from Stephen Goldsmith, former Deputy Mayor of New York City under Bloomberg and former Mayor of Indianapolis.