Cheating or Collaboration?

The computer science department at the University of Illinois at Urbana-Champaign, seeking a balance between promoting student collaboration and fostering individual academic achievement, will continue to let students share their work online.

The department last week cracked down on students who were posting code -- and thereby sharing answers to homework assignments -- to the repository Web site GitHub. But after students criticized the department, which invoked copyright law to force the Web site to remove the code, faculty backtracked.

"Balance is indeed the key issue," Rob A. Rutenbar, department chair and Abel Bliss Professor of computer science, said in an e-mail. "We have great students, they come to us with some software skills, often learned in a collaborative environment, and expecting to use the Internet as a resource. But then it's important that we carefully assess each student’s individual understanding of material. Writing code for a class in this sort of a 'vacuum' is rather unnatural. This is an issue all CS programs are wrestling with."

The university used the Digital Millennium Copyright Act of 1998 to request that the code be removed from GitHub. The law, now a common weapon against online piracy, shields Web sites from liability for copyright violations their users might commit, as long as the sites respond to “takedown notices” from copyright holders and remove the offending content.

Speaking to Inside Higher Ed, Rutenbar said the department issued takedown notices against repositories containing copyrighted code used in three large 200-level courses: data structures, computer architecture and system programming. The repositories included code written by students and "scaffold" code written by instructors to help students understand concepts taught in the courses. The takedown notices targeted the "scaffold" code, Rutenbar said in an e-mail.

As opposed to upper-level courses in the computer science program, which feature group projects, the three courses all include homework to be completed individually -- often in the form of fill-in-the-blank questions with code. That often led students to look for an easy way to complete the problems instead of doing their own work, said Cole Gleason, a senior computer science major. “I’m on the academic integrity grievance committee, and we get these kinds of cases all the time where students have copied off of GitHub,” he said. “It’s kind of a hassle for professors and course staff because they have to track how students are cheating.”

GitHub is a platform for coders to share their work and collaborate with others. By making their work publicly available through open-source repositories, coders can let other users create “forks” in the development process, meaning the same source code can spawn any number of different products -- like a family tree with sprawling branches all connecting to a common ancestor. GitHub has about 8.5 million users and hosts nearly 20 million repositories, according to official figures.

That makes GitHub a useful resource for projects on which multiple people collaborate, a skill that Rutenbar said the program aims to instill in its students. "The goal is to build first the solid, individual software skills in the early years of the curriculum," he said. "Then we know that in their later courses -- and in their careers -- our students will be strong collaborators in team projects."

In one of the takedown notices sent to GitHub, the department cited both the DMCA and the university’s own academic integrity code as reasons why the Web site should comply. Adapted for the computer science department, the code means “copying text directly from someone else” is cheating. “This is true regardless of whether the source is a classmate, a former student, a Web site, a program listing found in the trash or whatever.”

The Web site for the data structures course re-emphasizes that part of the honor code. “You may not reference any code outside of that provided in lecture and the textbook,” a section on academic integrity reads. “Your turned-in work must be your own product, representing your own knowledge. Any form of cheating is unacceptable.”

Emily Tran, president of the university's chapter of Women in Computer Science, questioned what having the code available online might mean for future students taking the same courses.

"My current sentiment is that students who do look up past semesters' students' work for problems that get reused are just obfuscating code to essentially copy it," Tran said in an e-mail. "There's no good reason students should need to upload their finished assignments for programming classes onto public repositories because that's not something that should be shared."

Gleason, chair of the university’s chapter of the Association for Computing Machinery, said he doesn’t like the idea of using DMCA takedown notices “in a censorship context,” but added that the computer science department’s use could be “legally viable.” In past cases, he said, the department has handled incidents of students posting copyrighted code internally by asking them to take it down.

“I totally understand going after those people,” Gleason added. “I don’t think the department has anything against using GitHub.”

Rutenbar said that some of the GitHub users who had posted the code had done so anonymously, so the department couldn't contact them directly. "When our instructors are able to ask students personally to remove these sort of posts, the students routinely comply," he added.

The department is experimenting with different forms of evaluation, including randomized exams that give students access to compilers and debuggers but not to collaboration or the Internet, Rutenbar said. The experiments have so far produced "very encouraging" results.