Looking for a quick way to train field researchers? How about quick tips on data management or a reminder about what a p-value is? The new EvalFest website hosts brief training videos and related resources to support evaluators and practitioners. EvalFest is a community of practice, funded by the National Science Foundation, that was designed to explore what we could learn about science festivals by using shared measures. The videos on the website were created to fit the needs of our 25 science festival partners from across the United States. Even though they were created within the context of science festival evaluation, the videos and website have been framed generally to support anyone who is evaluating outreach events.

Here’s what you should know:

The resources are free!

The resources have been vetted by our partners, advisors, and/or other leaders in the STEM evaluation community.

You can download PDF and video content directly from the site.

Here’s what we have to offer:

Instruments — The site includes 10 instruments, some of which include validation evidence. The instruments gather data from event attendees, potential attendees who may or may not have attended your outreach event, event exhibitors and partners, and scientists who conduct outreach. Two observation protocols are also available, including a mystery shopper protocol and a timing and tracking protocol.

Data Collection Tools — EvalFest partners often need to train staff or field researchers to collect data during events, so this section includes eight videos that our partners have used to provide consistent training to their research teams. Field researchers typically watch the videos on their own and then attend a “just in time” hands-on training to learn the specifics about the event and to practice using the evaluation instruments before collecting data. Topics include approaching attendees to do surveys during an event, informed consent, and online survey platforms, such as QuickTapSurvey and SurveyMonkey.

Data Management Videos — Five short videos are available to help you clean and organize your data and begin to explore it in Excel. These videos use the kinds of data typically generated by outreach surveys, and they show step by step how to filter your data, recode variables, and create pivot tables.
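For readers who work in a scripting language rather than Excel, the same filter/recode/pivot steps can be sketched in plain Python. The survey fields and values below are hypothetical, invented only to illustrate the three operations:

```python
from statistics import mean

# Hypothetical rows of the kind an outreach survey might generate.
rows = [
    {"attendee_type": "adult", "satisfaction": 5, "first_visit": "yes"},
    {"attendee_type": "child", "satisfaction": 4, "first_visit": "no"},
    {"attendee_type": "adult", "satisfaction": 3, "first_visit": "yes"},
    {"attendee_type": "adult", "satisfaction": 5, "first_visit": "yes"},
    {"attendee_type": "child", "satisfaction": 2, "first_visit": "no"},
]

# Filter: keep only first-time visitors.
first_timers = [r for r in rows if r["first_visit"] == "yes"]

# Recode: collapse the 1-5 scale into a binary "satisfied" flag.
for r in rows:
    r["satisfied"] = "yes" if r["satisfaction"] >= 4 else "no"

# "Pivot table": mean satisfaction by attendee type.
by_type = {}
for r in rows:
    by_type.setdefault(r["attendee_type"], []).append(r["satisfaction"])
pivot = {group: mean(vals) for group, vals in by_type.items()}
print(pivot)
```

The same three steps map directly onto Excel's filter, find-and-replace/recode, and PivotTable features covered in the videos.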

Data Analysis Videos — Available in this section are 18 videos and 18 how-to guides that provide quick explanations of things like the p-value, exploratory data analysis, the chi-square test, independent-samples t-test, and analysis of variance. The conceptual videos describe how each statistical test works in nonstatistical terms. The how-to resources are then provided in both video and written format, and walk users through conducting each analysis in Excel, SPSS, and R.
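As a flavor of what those how-to guides cover, here is a minimal Python sketch of the independent-samples t-test, computing the t statistic and degrees of freedom by hand. The ratings are made up for illustration; in practice the p-value would come from a t table or a stats package such as SPSS, R, or scipy:

```python
from statistics import mean, variance
from math import sqrt

# Hypothetical Likert ratings (1-5) from two groups of event attendees.
group_a = [4, 5, 3, 5, 4]   # e.g., visited the hands-on exhibit
group_b = [2, 3, 3, 4, 2]   # e.g., did not

def independent_t(a, b):
    """Student's independent-samples t statistic with pooled variance.

    Returns (t, degrees_of_freedom); look up the p-value for |t| at the
    returned df in a t table or stats package.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2

t, df = independent_t(group_a, group_b)
print(round(t, 3), df)  # compare |t| to the critical value for this df
```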

Our website tagline is “A Celebration of Evaluation.” It is our hope that the resources on the site help support STEM practitioners and evaluators in conducting high-quality evaluation work for many years to come. We will continue to add resources throughout 2019. So please check out the website, let us know what you think, and feel free to suggest resources that you’d like us to create next!

Social Network Analysis (SNA) is a methodology that we have found useful for answering questions about relationships. For example, our independent evaluation work with National Science Foundation-funded Integrative Graduate Education and Research Traineeship (IGERT) programs typically includes a line of inquiry about the nature of interdisciplinary relationships among trainees and faculty, and how those relationships change over time.

Sociograms are data displays that stakeholders can use to understand network patterns and identify potential ways to effect desired changes in the network. There are currently few, if any, rules about how to draw sociograms so that they communicate effectively with stakeholders. While there is only one network—the particular set of nodes and the ties that connect them—there are many ways to draw it. We share two methods for visualizing networks and describe how each has been helpful when communicating evaluation findings to clients.

Approach 1: Optimized Force-Directed Maps

Figure 1 presents sociograms for one of the relationships defined as part of an IGERT evaluation as measured at two time points. Specifically, this relationship reflects whether participants reported that they designed or taught a course, seminar, or workshop together.

In this diagram, individuals (nodes) who share a tie tend to be close together, while individuals who do not share a tie tend to be farther apart. When drawn in this way, the sociogram reveals how people organize into clusters. Red lines represent interdisciplinary relationships, making it possible to see patterns in the connections that bridge disciplinary boundaries. These sociograms combine data from three years, so nodes do not move from one sociogram to the next. Nodes appear and disappear as individuals enter and leave the network, and the ties connecting people appear and disappear as reported relationships change. Thus, it is easy to see how connections—around individuals and across the network—evolve over time.

One shortcoming of force-directed maps is that node positions typically shift from one drawing to the next, which makes it difficult to identify the same person (node) in a set of sociograms spanning multiple time periods. With additional data processing, however, it is possible to create a set of aligned sociograms (in which node positions are fixed, as in Figure 1) that make visual analysis of changes over time easier.
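The sociograms in this post were drawn with NetDraw, but the force-directed idea itself can be sketched in a few lines of Python. The toy network and the simplified Fruchterman-Reingold-style update below are illustrative only, not the algorithm NetDraw uses:

```python
import math
import random

# Toy network: ties reported among a handful of people.
edges = [("A1", "A2"), ("A2", "A3"), ("B1", "B2"), ("A3", "B1")]
nodes = sorted({n for e in edges for n in e})

def force_directed_layout(nodes, edges, iterations=200, seed=42):
    """Minimal force-directed layout: connected nodes attract, all pairs
    repel, so clusters of tied nodes end up drawn close together."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # ideal edge length
    for step in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]; dy = pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[u][0] += dx / d * f; disp[u][1] += dy / d * f
                disp[v][0] -= dx / d * f; disp[v][1] -= dy / d * f
        # Attraction along reported ties.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]; dy = pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            disp[u][0] -= dx / d * f; disp[u][1] -= dy / d * f
            disp[v][0] += dx / d * f; disp[v][1] += dy / d * f
        # Move each node a limited distance (cooling schedule).
        temp = 0.1 * (1 - step / iterations)
        for n in nodes:
            dx, dy = disp[n]
            d = math.hypot(dx, dy) or 1e-9
            pos[n][0] += dx / d * min(d, temp)
            pos[n][1] += dy / d * min(d, temp)
    return pos

pos = force_directed_layout(nodes, edges)
```

Fixing the random seed and computing one layout from ties pooled across all years is also how "aligned" sociograms can be produced: the positions are computed once and reused for each time slice.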

Figure 1: Sociograms — Fixed node locations based on ties reported across all years

a) “Taught with” relationship year 4

b) “Taught with” relationship year 3

Approach 2: Circular/Elliptical Maps

Figure 2 introduces another way to present a sociogram: a circular layout that places all nodes on the perimeter of a circle, with ties drawn as chords passing through the middle (or along the periphery when connecting neighboring nodes). Using the same data as Figure 1, Figure 2 groups nodes along the elliptical boundary by department and, within each department, by role. With this arrangement, interdisciplinary ties pass through the central area of the ellipse, making it easy to see their density and to identify the people and departments that contribute to interdisciplinary connections.

One limitation of this map is that it is difficult to see the clustering and to distinguish people who are central in the group versus people who tend to occupy a position around the group’s periphery.
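To illustrate the layout logic, here is a minimal Python sketch of a circular map: nodes are grouped by department along the circle, and interdisciplinary ties fall out as exactly the chords that cross between departmental arcs. The roster and ties are hypothetical:

```python
import math

# Hypothetical roster (node -> department) and reported ties.
dept = {"A1": "Bio", "A2": "Bio", "B1": "Eng", "B2": "Eng", "C1": "Math"}
edges = [("A1", "A2"), ("A1", "B1"), ("B2", "C1")]

# Order nodes so each department occupies a contiguous arc, then place
# them evenly on the unit circle.
ordered = sorted(dept, key=lambda n: (dept[n], n))
pos = {n: (math.cos(2 * math.pi * i / len(ordered)),
           math.sin(2 * math.pi * i / len(ordered)))
       for i, n in enumerate(ordered)}

# Interdisciplinary ties are the chords that connect different arcs.
interdisciplinary = [(u, v) for u, v in edges if dept[u] != dept[v]]
print(interdisciplinary)
```

Sorting within each department by role (rather than by name, as here) would reproduce the within-arc grouping described for Figure 2.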

Figure 2: Sociograms — All nodes from all survey years placed in a circular layout and fixed

a) “Taught with” relationship year 4

b) “Taught with” relationship year 3

Because both network diagrams have strengths and limitations, consider using multiple layouts and choose maps that best address stakeholders’ questions. Two excellent—and free—software packages are available for people interested in getting started with network visualization: NetDraw (https://sites.google.com/site/netdrawsoftware/home), which was used to create the sociograms in this post, and Gephi (http://gephi.github.io), which is also capable of computing a variety of network measures.

Most of the evaluations I conduct include interview or focus group data. These data provide a sense of students’ experiences and outcomes as they progress through a program. After collecting the data, we transcribe, read, code, re-read, and recode to identify themes and to capture the complex interactions among the participants, the program, and their environment. In reporting, however, we are often restricted to describing themes and providing illustrative quotes to represent participant experiences. That is an important part of the report, but I have always felt that we could do more.

This led me to think of ways to quantify the transcribed interviews to obtain a broader impression of participant experiences and to compare across interviews. I also came across the idea of crowdsourcing: recruiting a large number of people to each perform a small, well-defined task, usually for payment. For example, a few years ago 30,000 people were asked to review satellite images to locate a crashed airplane. Crowdsourcing has been around for a long time (e.g., the Oxford English Dictionary was crowdsourced), but it has become considerably easier to access the “crowd.” Amazon’s Mechanical Turk (MTurk.com) gives researchers access to over 500,000 people around the world and allows you to post specific tasks and have them completed within hours. For example, if you want to test the reliability of a survey or survey items, you can post it on MTurk and have 200 people take the survey (depending on the survey’s length, you might pay them $.50 to $1.00 each).

So the idea of crowdsourcing got me thinking about the kind of information we could get if we had 100 or 200 or 300 people read through interview transcripts. For simplicity, I wanted MTurk participants (called “Workers” on MTurk) to read transcripts and rate (using a Likert scale) students’ experiences in specific programs, as well as select text that they deemed important and illustrative of those experiences. We conducted a series of studies using this procedure and found that the crowd’s average ratings of the students’ experiences were stable and consistent across five different samples. We also found that the text the crowd selected was the same across the five samples. This is important from a reporting standpoint: it helped identify the most relevant quotes for the reports, and the ratings provided a summary of student experiences that could be used to compare different interview transcripts.
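A minimal Python sketch of the aggregation step might look like the following; the ratings, sample sizes, and quote labels are all made up for illustration:

```python
from statistics import mean
from collections import Counter

# Hypothetical ratings of one transcript (1-5 Likert) from five
# independent samples of MTurk workers.
samples = [
    [4, 5, 4, 3, 4],
    [4, 4, 5, 4, 3],
    [5, 4, 4, 4, 4],
    [3, 4, 5, 4, 4],
    [4, 4, 4, 5, 3],
]
sample_means = [mean(s) for s in samples]

# Stability check: the per-sample means should cluster tightly.
spread = max(sample_means) - min(sample_means)

# Quote selection: tally which excerpts workers flagged as illustrative;
# the most frequently selected excerpt is a candidate for the report.
selected = ["quote_2", "quote_2", "quote_5", "quote_2", "quote_1"]
top_quote, count = Counter(selected).most_common(1)[0]
print(sample_means, spread, top_quote)
```

With real data, the per-transcript means can then be compared across interviews, and the most frequently selected excerpts become the illustrative quotes.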

If you are interested in trying this approach, here are a few suggestions:

1) Make sure that you remove any identifying information about the program from the transcripts before posting them on MTurk (to protect privacy and comply with HSIRB requirements).

2) Pay workers more for tasks that take more time. If a task takes 15 to 20 minutes, I suggest a minimum payment of $.50 per response. If the task takes more than 20 minutes, I suggest $.75 to $2.00, depending on the time required to complete it.

3) Be specific about what you want the crowd to do. There should be no ambiguity about the task (this can be accomplished by pilot testing the instructions and tasks and asking the MTurk participants to provide you feedback on the clarity of the instructions).

I hope you found this useful. Please let me know how you have used crowdsourcing in your practice.

Executive Director, The Evaluation Center at Western Michigan University

Data don’t speak for themselves. But the social and educational research traditions within which many evaluators have been trained offer little in the way of tools for translating data into meaningful, evaluative conclusions in transparent and justifiable ways (see Jane Davidson’s article). However, we can draw on what educators already do when they develop and use rubrics for grading student writing, presentations, and other assessment tasks.

Rubrics can be used in similar ways to aid in the interpretation of project evaluation results. They can be developed for individual indicators, such as the number of women in a degree program or the percentage of participants expressing satisfaction with a professional development workshop. Alternatively, a holistic rubric can be created to assess larger aspects of a project that are impractical to parse into distinct data points.

Rubrics increase transparency about how conclusions are generated from data. For example, if a project claimed that it would increase enrollment of students from underrepresented minority (URM) groups, an important variable would be the percentage increase in URM enrollment. The evaluator could engage project stakeholders in developing a rubric to interpret the data for this variable, in consultation with secondary sources such as the research literature and/or national data. When the results are in, the evaluator can refer to the rubric to determine the degree to which the project was successful on this dimension.

To learn more about how to connect the dots between data and conclusions, see the recording, handout, and slides from EvaluATE’s March webinar: evalu-ate.org/events/march_2013/.
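As an illustration, a rubric for the URM-enrollment example might be encoded as a small lookup table. The cut points below are hypothetical, not from the webinar; in practice they would be negotiated with stakeholders and grounded in the literature or national data:

```python
# Hypothetical rubric for interpreting the percentage-point increase in
# URM enrollment. Levels and cut points are illustrative only.
RUBRIC = [
    (10.0, "excellent"),
    (5.0, "good"),
    (1.0, "adequate"),
    (0.0, "poor"),
]

def rate(increase_pct):
    """Return the rubric level for an observed enrollment increase."""
    for cutoff, level in RUBRIC:
        if increase_pct >= cutoff:
            return level
    return "unsatisfactory"  # enrollment declined

print(rate(7.5))  # -> good
```

Writing the rubric down before the results arrive is what makes the eventual judgment transparent: the conclusion follows mechanically from the agreed-upon cut points rather than from post hoc interpretation.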

EvaluATE is supported by the National Science Foundation under grant numbers 0802245, 1204683, 1600992, and 1841783. Any opinions, findings, and conclusions or recommendations expressed on this site are those of the authors and do not necessarily reflect the views of the National Science Foundation.