Month: May 2020

Contributing author Albert Y. Kim is an assistant professor of statistical & data sciences. He is a co-author of the fivethirtyeight R package and ModernDive, an online textbook for introductory data science and statistics. His research interests include spatial epidemiology and model assessment and selection methods for forest ecology. Previously, Albert worked in the Search Ads Metrics Team at Google Inc. as well as at Reed, Middlebury and Amherst colleges. You can follow him on Twitter @rudeboybert.

Contributing author R. Jordan Crouser is an Assistant Professor of Computer Science at Smith College. He is published in the areas of visualization theory, human-computer interaction, educational technology, visual analytics systems and human computation. For more information, visit his faculty page.

Contributing author Benjamin S. Baumer is an assistant professor in the Statistical & Data Sciences program at Smith College. His research interests include sports analytics, data science, statistics and data science education, statistical computing, and network science. For more information, visit his faculty page.

You might have heard of Slack before. But what is it? Is it email? Is it a chat room? Slack describes their flagship product as a “collaboration hub that can replace email to help you and your team work together seamlessly.” In this blog post, we’ll describe how we’ve been using Slack for asynchronous course communication, as opposed to the synchronous course communications afforded by Zoom and other remote conferencing platforms.

Why do we stress (a)synchronous? The brick-and-mortar constraint of having everyone working at the same time is unworkable under the unfolding COVID-19 pandemic. Across the world, support staff, faculty, and students have suddenly been forced to convert to a remote learning model of education. In order for this model to be successful, flexibility is needed to ensure equitable learning experiences with respect to differences in time zones, suitability of student learning environments, internet access, and many other factors. In order to ensure this flexibility, many instructors are recognizing that some portion of their courses must be delivered in an asynchronous fashion, on top of the synchronous nature of regular lecture and meeting times.

Before we discuss how we’ve been using Slack, we must explain how Slack is organized.

How is Slack organized?

Slack is organized into workspaces, which loosely correspond to a “team” of individuals (such as a research or special interest group). In our case, this will be an individual course. When using Slack from the Desktop or Mobile app, a list of your workspaces appears in the left-hand vertical menu bar. For example, of the 8 workspaces highlighted in red, we are currently viewing the “220” course workspace:

Within each workspace are channels (identified with hashtags), highlighted here in blue. You can think of channels as forums corresponding to topics. In this example, we have #general (announcements), #questions, and several others. Different stakeholders can join each channel, and channels can be designated public or private as appropriate. Note how the #problem_sets channel has a lock icon, indicating that it is private (to just instructors and graders).

Additionally, within each workspace are direct messages (DMs), highlighted in green. You can think of DMs as group text messages. Unlike with channels, people cannot later “join” these conversations.

What are the benefits of Slack?

Slack’s primary benefit is centralization and organization of communications, which helps to minimize inefficient context switching:

For example, if we want to ignore messages related to the 220 course and focus our attention on the 293 course, we can do so easily. This inherent compartmentalization of communications relating to courses is especially helpful when managing asynchronous communication across multiple courses, the challenges of which have been amplified during the recent outbreak of COVID-19.

Second, Slack facilitates the posing and answering of student questions via channels that function as discussion boards. This is a welcome feature of Slack given the importance of (a)synchronous communications in light of COVID-19.

Note that Slack is certainly not the only platform that has such functionality; other platforms include Moodle, Piazza, and Discord.

Third, the benefits of Slack increase not only as the number of team members grows, but also as the number of distinct groups of team members grows. For example, this semester’s two sections of Smith College’s SDS/MTH 220 Introduction to Probability and Statistics have 79 students who form 31 term project groups, 2 instructors, 2 lab instructors, 2 graders, and 2 in-class teaching assistants. By carefully constructing both private and public channels and direct messages, we can localize communications in their appropriate destinations. This is critical at a time when we can’t meet in person, nor can we easily meet at the same time.

Fourth, the more casual nature of Slack interactions versus email reduces instructor/student barriers. For example, less time can be spent choosing appropriate email greetings and signoffs. Additionally, Slack’s use of newer modalities of communication like emojis and GIFs can further facilitate expression at a time when maintaining open communication is paramount.

Other benefits of Slack include (1) seamless transition between Desktop and Mobile interfaces; (2) a growing ecosystem of third-party applications to integrate with platforms such as Zoom, GitHub, PollEverywhere, Google Drive, and Dropbox; and (3) wide use in industry, unlike Moodle or Piazza. While we won’t argue that Slack is a skill, familiarity with it certainly won’t hurt students as they enter the workplace.

What are some pitfalls of Slack?

As with any communication platform, Slack has its share of potential pitfalls:

There are cognitive costs associated with switching to Slack-based course communication, and student buy-in can vary depending on (1) general comfort with technology and (2) the use of Slack within other courses at your institution or department.

Notification settings really matter: students who only use Slack via their browser often miss messages sent between lectures if their email notifications aren’t set. Students who use the Desktop or Mobile applications encounter this issue far less often, but this does require installing those applications.

Since Slack was designed for tech companies rather than for education, it is not FERPA compliant. Thus, certain sensitive communications should not take place on Slack.

While Slack offers a “freemium” version, it caps access to the most recent 10,000 messages and 5GB of file storage. To exceed these caps, monthly per-user fees must be paid.

When to make the switch

Should you switch to Slack right now (during the COVID-19 pandemic)? Our answer: if you have an existing method that gets the job done, probably not. Switching your communication tool amid the stress currently facing staff, faculty, and students may cause more harm than good. However, you may want to consider the following questions when deciding whether to use Slack in future courses:

Do you prefer having your communications centralized and compartmentalized?

Are there multiple groups to coordinate within your team: instructors, teaching assistants, graders, students, and various groupings thereof?

Are you looking for ways to make communication between students and faculty feel more accessible?

Does your course involve collaborating on code, either directly or via GitHub?

Do other instructors in your department or institution use Slack?

Do you hate email?

As your answers to these questions tend toward yes, the case for Slack gets stronger. At our institution, we have been vocal advocates of using Slack in the classroom. The increased importance of (a)synchronous communication brought on by the COVID-19 pandemic has further reinforced our belief in the benefits that Slack can provide for course communication.

Contributing author Jonathan Duggins is a Teaching Assistant Professor in the Department of Statistics at North Carolina State University.

Introduction

Most of us statistics (and data science!) educators understand that knowing how to use statistical software is integral to the success of our statistics and data science majors, both in their coursework and in their careers. However, in many degree programs, software usage is seen as a means to an end – getting an analysis – rather than an end goal in its own right. How did this come about, why does it matter, and what can we do to change our software-related instruction? These are the questions I discuss below, first by looking at some history of programming in these contexts, then by presenting two current philosophies on how to incorporate programming.

Background

Getting a bachelor’s degree in mathematics has long meant learning computer programming. As statistics degree offerings appeared, they adopted this convention, and the emerging field of data science, with its inherent computational needs, has followed suit. Whether Java, C, R (or S-PLUS back in my day!), Python, or SAS – students pursuing a degree in statistics or data science routinely program in at least one of these languages. Unfortunately, this is often enforced by adding a course to an existing degree program. In some cases, this course is merely borrowed from another department and does not meet the students’ discipline-specific needs! Even in relatively new degree programs, our field’s approach to programming seems anachronistic.

We commonly use software in the classroom to help teach a variety of topics – exploring data graphically, computing classical summary or inferential statistics, or conducting a simulation to study the properties of a resampling technique – and these advances in the inclusion of software are often touted when discussing how we have modernized our curricula. However, if students do not build the programming skills necessary to implement and understand these analyses, then software becomes a black box.

Why Does This Matter?

While there are several reasons to revisit how we teach programming, the one I’m focusing on here is that programming is different from most other skills we teach – if a program is inefficient, doesn’t follow good programming practices, or is otherwise sub-optimal, it can still produce correct results! Programming is not just something students should be doing to get an answer. We have an obligation to go beyond teaching students how to write functional code – we must train high-quality programmers. Statistics and data science careers that make extensive use of programming are exceedingly popular in their own right, and as data sets get larger and programming becomes a ubiquitous skill, there is immense value in students being able not only to write code that solves a problem, but also to use best practices when doing so.
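To make the point about correct-but-poor code concrete, here is a minimal sketch in Python. The data and both function names are hypothetical and chosen only for illustration. The two functions return the same group means, but the first hard-codes the group labels, duplicates its loop, and uses opaque variable names, while the second generalizes to any number of groups.

```python
from statistics import mean

# Hypothetical (group, score) records, invented for this illustration.
scores = [("A", 90), ("B", 70), ("A", 80), ("B", 60)]


def group_means_fragile(x):
    # Produces the right answer for this data, but hard-codes the two
    # groups, repeats the same loop twice, and uses uninformative names.
    a = 0
    b = 0
    c = 0
    d = 0
    for i in range(len(x)):
        if x[i][0] == "A":
            a = a + x[i][1]
            b = b + 1
    for i in range(len(x)):
        if x[i][0] == "B":
            c = c + x[i][1]
            d = d + 1
    return {"A": a / b, "B": c / d}


def group_means_clean(records):
    """Return the mean score for each group label found in records."""
    grouped = {}
    for group, score in records:
        grouped.setdefault(group, []).append(score)
    return {group: mean(values) for group, values in grouped.items()}


# Both versions agree on the result, so output alone cannot distinguish them.
print(group_means_fragile(scores) == group_means_clean(scores))
```

An assessment that checks only the output would score both versions identically, which is why programming practice has to be assessed separately from the answer.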

Two Philosophies

Degree programs have typically adopted one of two prevailing philosophies regarding programming instruction: integrated or standalone. Both approaches have advantages and disadvantages that are important for designing an optimal educational experience that prepares our students to write the end-to-end programs they will use in their careers. In this context, I’m defining end-to-end programming as the application of the following three components.

1. Data cleaning and preparation

2. Data summary, analysis, and modeling

3. Reporting/presentation of results

Of course, most programs make use of general computing concepts (e.g. file types, paths, etc.) and not every program needs to employ all three components. However, students trained to write this style of end-to-end program can easily adapt to writing programs that only require one or two of the components.
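As a rough illustration of the three components, here is a small end-to-end program sketched in Python using only the standard library. The data set, column names, and function names are all invented for this example; a real course would use whatever language and data the program has adopted.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data with the kinds of problems real files contain:
# one missing score and one non-numeric placeholder.
RAW_CSV = """student,section,score
ana,01,88
ben,01,
cy,02,93
dee,02,n/a
eli,01,75
"""


def clean(raw_text):
    """Component 1: read the raw text and drop rows with unusable scores."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        try:
            score = float(row["score"])
        except ValueError:
            continue  # skip blank or non-numeric scores
        rows.append({"section": row["section"], "score": score})
    return rows


def summarize(rows):
    """Component 2: count and mean score for each section."""
    by_section = {}
    for row in rows:
        by_section.setdefault(row["section"], []).append(row["score"])
    return {
        section: {"n": len(values), "mean": mean(values)}
        for section, values in sorted(by_section.items())
    }


def report(summary):
    """Component 3: format the summary as a plain-text table."""
    lines = ["section  n  mean"]
    for section, stats in summary.items():
        lines.append(f"{section:<7}  {stats['n']}  {stats['mean']:.1f}")
    return "\n".join(lines)


print(report(summarize(clean(RAW_CSV))))
```

Keeping each component in its own function means a program that needs only one or two of them can reuse those pieces unchanged.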

Integrated Instruction

This approach typically focuses only on data summary, analysis, and modeling – concepts used as a means to an end for discipline-specific course content – e.g. a regression course teaching SAS modeling tools such as PROC REG but excluding any programming concepts not explicitly needed to complete the course. The most obvious pedagogical benefit to this approach is that instructors can present a programming skill after students are familiar with the statistical concept. A second, but related, benefit of the integrated approach is logistical – students do not need to worry about when to take a programming course because programming is learned in concert with the discipline-specific content.

However, the drawbacks to solely using integrated instruction are substantial. The overarching issue is that students are less likely to gain an appreciation for, or even an understanding of, the general computing principles necessary to be a practicing statistician. One of the primary examples is that the classroom data sets to which students are exposed have already been sanitized, meaning a loss of opportunities to develop skills with reading, cleaning, and restructuring data. Students also miss seeing the results of poorly collected and/or maintained data sets, exposure that would help them understand the requirements for developing good data collection methods. Additionally, more instructional time is required to teach programming along with discipline-specific content.

Integrated instruction also requires all instructors to teach at least some computing concepts in addition to the course-specific content. Depending on department size, this can be an unrealistic expectation if not all faculty are well-versed in the same language, because students, as learners, benefit from repeated exposure to the same language. Exposing them to multiple languages is valuable, of course, but if done without proper structure, students cannot build on what they learned in an earlier course and instructors cannot assume prior knowledge.

Standalone Instruction

I’m defining standalone instruction to mean classes covering the language-specific concepts and any general programming concepts required to effectively use the language. For example, in a SAS course this would mean not only covering SAS concepts but also including path/directory structure, file types/attributes, image resolution, etc. There are two common “flavors” of the standalone course: applications-focused and whole-language. The applications-focused flavor – where material on data summary/analysis/modeling (Component 2) provides students with the skills necessary to carry out discipline-specific analyses needed in their other courses – is similar to the integrated approach above except these analysis tools are all in a single course and some time may be devoted to data cleaning/preparation and reporting/presentation of results (Components 1 and 3, respectively). The whole-language approach provides much less in the way of Component 2 skills and instead focuses on Components 1 and 3 by teaching the analysis software from the computer science perspective by covering syntax, compilation, and good programming practices while including a few basic Component 2 concepts so students get practice writing end-to-end programs.

The applications-focused course suffers from several significant logistical issues – when should students take the course and what should be included? If taken too early, students are unlikely to understand most of the analysis techniques but if taken too late they cannot apply any of the programming skills in their discipline-specific courses, severely limiting the programming course’s utility. To determine the course’s content, instructors need to agree on what skills will be useful throughout the degree program and deviation from that list in later courses also reduces the course’s utility. Additionally, concentrating the analysis in a single course still deprives students of a deeper understanding of the software’s capabilities and operation and is less likely to instill an understanding of good programming practices.

The whole-language approach should still include simple analysis techniques, which is both a benefit and a drawback. It removes the logistical barriers because students can take the course earlier in their degree program, but then it derives its usefulness from how programming is emphasized in the remainder of a student’s coursework. If future courses never or rarely require students to use the skills obtained in this early-career course, then its benefits are severely blunted. However, when used properly, the whole-language approach provides a solid foundation onto which students can add skills presented in later classes while lowering the instructor’s burden in those courses.

What Should We Do?

To best educate our students, we should apply both approaches in a way that minimizes drawbacks and maximizes benefits to make sure we are truly training programmers and not just teaching our students to write a program as a means to an end. Because students need to be prepared to write high-quality end-to-end programs, we need to explain to students early in their career what that process looks like. To meet these goals, I propose the following as a starting point for degree programs looking to modernize their approach to teaching programming skills.

1. Employ an early-career, standalone course using whole-language instruction. Use it to introduce the three components and establish good programming practices.

2. Use integrated instruction in the same language in multiple future courses, each time assessing the students’ programming skills.

3. Enforce a common set of good programming practices across all courses.

4. Apply a common rubric for assessing programming skills across all courses.

Of course, most of us are not in a position to rewrite our department’s curricula or convince our colleagues to teach their courses differently. However, we can all take steps, such as collaborating with whoever teaches our programming course (or proposing a new course!) and choosing to assess programming in our own classes, to help our students develop these crucial skills. By building a strong foundation, vertically integrating a programming language into our curriculum, and enforcing good programming practices, we can not only produce high-quality data scientists and statisticians, we can also move beyond just teaching programming and start training programmers for the careers that are waiting for them.

Duggins, J. and Blum, J. (2020). The Past, Present, and Future of Training SAS Professionals in a University Program. SAS Global Forum, March 29 – April 1, 2020.