Binge watching via video-on-demand services is now considered the new ‘normal’ way to consume television programs. In fact, recent surveys suggest upwards of 80 percent of consumers prefer and indulge in binge watching behavior. Despite this, there is no evidence regarding the impact of binge watching on the enjoyment of and memory for viewed content. In this, the first empirical and controlled study of its kind, we determined that, although binge watching leads to strong memory formation immediately following program viewing, these memories decay more rapidly than memories formed after daily- or weekly-episode viewing schedules. In addition, participants in the binge watching condition reported significantly less show enjoyment than participants in the daily- or weekly-viewing conditions, though important considerations regarding this finding are discussed. Although it is a preferred viewing style catered to by many internet-based on-demand distribution companies, binge watching does not appear to benefit sustained memory of viewed content and may affect show enjoyment.

The rapid evolution and proliferation of digital streaming technologies has had an enormous, some would argue ‘disruptive’, impact on nearly all forms of mass media (e.g. print — Gilbert, 2015; radio — Albarran, et al., 2007; film — Currah, 2006). Television, in particular, has undergone substantial transformation in the face of evolving technological innovation. In fact, the last decade has seen technologically generated shifts in television production practices (Webster and Ksiazek, 2012; Ha, 2011), marketing (Buschow, et al., 2014; Kozinets, 1999), and distribution (McDowell, 2016; Baccarne, et al., 2013).

One aspect of television, in particular, that has been influenced by emerging internet-based technologies is scheduled viewing practices. Much of the history of television falls under the category of ‘appointment’ viewing, whereby networks determined the broadcast schedule of each aired program thereby requiring the audience to devote specific periods of time during the week to television viewing (Castleman and Podrazik, 2003). Appointment broadcast schedules typically followed either a daily (reality television; soap operas) or a weekly (network dramas; network comedies) rotation (Pingree, et al., 2001). However, with current on-demand streaming services, users are no longer passively tied to externally generated broadcast schedules and can, instead, actively determine their own preferred program viewing schedules and timing (Gillian, 2011).

A major consequence of self-generated viewing schedules has been the rise of binge watching. Although there remains debate about the specifics, binge watching is commonly defined as the viewing of three or more hours of programming within a single sitting (Jenner, 2016). Once considered an aberrance, binge watching is now commonplace, with recent surveys suggesting upwards of 75 percent of American and 85 percent of Chinese viewers report binge watching television shows (ARRIS, 2015) and over 80 percent of people between the ages of 14 and 31 indulge in binge watching at an average rate of a five-episode binge at least once a week (Deloitte Development LLC, 2015). In fact, so popular has this type of viewing become that many distribution companies reliant on streaming capabilities (Netflix; Amazon Prime) actively cater to this practice by releasing entire seasons of new programs simultaneously (Matrix, 2014).

Despite the proliferation of binge watching practices, there have yet to be any controlled, empirical studies exploring the impact of this practice on the perceived comprehension and/or long-term retention of viewed content. Of the relatively small number of empirical studies exploring binge watching to date, the focus has been to characterize the underlying psychological features of this practice. These studies suggest that binge watching is best understood as a socially legitimate expenditure of luxury time that allows for autonomous action leading to an impact on self-identity (Jenner, 2017). Key drivers of this behavior appear to be fear of missing out (the feeling that missing an event, in this case a television program, could result in exclusion from cultural conversation; see Przybylski, et al., 2013; Conlin, et al., 2016), hedonistic drive (Pittman and Sheehan, 2015), and social connection (Pittman and Tefertiller, 2015), whereas key deterrents of this behavior appear to be anticipated regret and goal conflict (Walton-Pattison, et al., 2016).

Despite the emerging psychological picture, several important features of on-demand encouraged binge watching have yet to be determined. For instance, although there has been some evidence to suggest a moderately higher viewer satisfaction rating of broadcast platforms that cater to active, rather than passive, scheduling (such as Netflix: Merikivi, et al., 2016), the impact of viewing schedule on viewer reception of the watched content remains unknown. In addition, the impact of binge watching behavior on comprehension of and memory for watched content is equally unknown. As there is strong evidence to suggest that television program comprehension and memory influence future program engagement (such that stronger memories for prior content lead to a higher chance of future engagement with similar content; see Sherry, 2004; Zillmann and Bryant, 1985; Vidmar and Rokeach, 1974), it is important to understand how viewing schedule affects long-term retention of the viewed material, both perceived and actual.

This study aimed to explore several of these unknown impacts of television viewing schedules. In a controlled laboratory environment, participants watched a complete season of a popular television show at the rate of either one episode a week, one episode a day, or all episodes in a single sitting. Perceived comprehension questionnaires and retention tests were administered one day, seven days, and 140 days after show completion. Through this, we hoped to gain a better understanding of the impact of binge-watching on important measures of perceived comprehension and memory.

Methods

Participants

Fifty-one undergraduate and graduate students from the University of Melbourne participated in this study (29 female; Age M = 22.19, SD = 3.58). Each participant had no knowledge of or prior exposure to the viewed television program. All subjects gave informed consent before participating in this experiment, which was approved by the local ethics committee (University of Melbourne) and conformed to the standards set by the Australian National Health and Medical Research Council.

Instrument/procedure

The television program used for this experiment was the BBC America cold war drama The Game (2014; permission to use this program in this study was granted by BBC America). This program contained one season consisting of six approximately one-hour episodes and was well received by critics (95 percent ‘fresh’ on Rotten Tomatoes). The program was played from a digital file and displayed on a high-resolution 30-inch computer monitor in a private screening booth in a laboratory at the University of Melbourne.

The perceived comprehension questionnaire was adapted from methodology described by Dunlosky, et al. (2016) and contained five items rated on a 100-point scale (How well did you understand the content of the show? How challenging did you find the content of the show? How much mental effort did you expend whilst watching this show? How well do you think you will be able to answer questions about this show on the upcoming quiz? How much did you enjoy this show?). This instrument was used to determine each participant’s perceived comprehension of the program.

Two content quizzes (A and B) were developed for this study. Each quiz contained a total of 60 questions: 40 were open-ended, short-response questions used to measure Recall memory (e.g., In episode four, what was delivered to Arkady’s secret mailbox?) and 20 were multiple-choice questions used to measure Recognition memory (e.g., In episode five, who was framed for the Parliament bombing: The Russians, The British, The Irish, or The Americans?).

After enrolment and consent, participants in this study were randomly assigned to one of three possible groups: the weekly group, the daily group, or the binge group (17 people per group). Members in the weekly group were assigned a specific day/time combination (e.g., Mondays at 10am) and asked to come into the laboratory at that allotted time over six consecutive weeks. Each week, participants in this group watched a single episode. Members in the daily group were assigned a specific time (e.g., 10am) and asked to come into the laboratory at that allotted time over six consecutive days. Each day, participants in this group watched a single episode. Members in the binge group were assigned a specific day (e.g., Monday) and asked to come into the laboratory at 10am that day. Participants in this group watched all six episodes consecutively, with a three-minute break between episodes and a 30-minute lunch break between episodes three and four. Although a feature of binge watching is that viewers are free to choose the timing of their consumption of episodes, for the purpose of this study we controlled this viewing schedule to determine what effect consecutive episode viewing had on the outcomes of interest.
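The allocation described above (51 participants, three groups of 17) can be illustrated with a short sketch. This is only one plausible way to carry out such a randomization; the source does not describe the actual procedure, and the function name, participant IDs, and seed below are hypothetical.

```python
import random


def assign_groups(participant_ids, groups=("weekly", "daily", "binge"), seed=None):
    """Shuffle participants and split them into equally sized groups.

    Hypothetical helper: the paper states only that assignment was random
    and that each group contained 17 people.
    """
    ids = list(participant_ids)
    if len(ids) % len(groups) != 0:
        raise ValueError("participants must divide evenly across groups")
    rng = random.Random(seed)  # a seeded generator makes the allocation reproducible
    rng.shuffle(ids)
    size = len(ids) // len(groups)
    return {g: ids[i * size:(i + 1) * size] for i, g in enumerate(groups)}


# 51 placeholder participant IDs (1..51), split into three groups of 17.
assignment = assign_groups(range(1, 52), seed=0)
for group, members in assignment.items():
    print(group, len(members))
```

Shuffling once and slicing guarantees equal group sizes, which simple per-participant coin flips would not.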

All episodes were watched on a computer in a private screening booth to ensure a consistent and distraction-free environment. Participants were not allowed to bring their cell phones, food, or any other material into the screening booth. In order to ensure participants were actively attending to the program, each person was asked to press the space bar on a keyboard each time a character in the program either lit a cigarette or poured a beverage. These actions occurred an average of 6.3 times per episode, spaced an average of approximately nine minutes apart. All space bar presses were time-stamped to the episode and assessed after viewing to determine accuracy. If an individual missed two or more space bar presses during an episode, they were instructed to pay more attention to future episodes. If, after this discussion, they missed two or more space bar presses on any future episode, they were removed from the study. Over the course of this experiment, space bar accuracy was very high (M = 92.11 percent, SD = 2.63 percent), only one person required reminding, and no one was removed from the study.
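The attention check above amounts to two steps: counting target events that received no space bar press, and applying the reminder/removal rule across episodes. The sketch below illustrates that logic under stated assumptions. The 10-second response window and both function names are hypothetical, as the source does not say how a press was matched to an event.

```python
def count_missed_events(event_times, press_times, tolerance_s=10.0):
    """Count target events with no unused press within tolerance_s seconds after them.

    Times are seconds from the start of the episode. The tolerance window is an
    assumption; the paper reports only that presses were time-stamped and checked.
    """
    presses = sorted(press_times)
    used = set()
    missed = 0
    for event in sorted(event_times):
        match = None
        for i, press in enumerate(presses):
            if i not in used and event <= press <= event + tolerance_s:
                match = i
                break
        if match is None:
            missed += 1
        else:
            used.add(match)  # each press can account for only one event
    return missed


def attention_outcome(missed_per_episode):
    """Apply the rule from the text: the first episode with two or more misses
    triggers a reminder, and any later such episode triggers removal."""
    reminded = False
    for missed in missed_per_episode:
        if missed >= 2:
            if reminded:
                return "removed"
            reminded = True
    return "reminded" if reminded else "ok"


# Illustrative data only: three events, with the press for the second one missing.
missed = count_missed_events([60, 600, 1200], [62, 1203])
print(missed, attention_outcome([0, 2, 0, 0, 1, 0]))
```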

Immediately after watching the final episode of the series, individuals were asked to fill out the perceived comprehension questionnaire. Then, 24 hours later, they were asked to return to the lab to undertake the first retention quiz. At this time, individuals were randomly assigned either quiz A or B and completed the quiz on the same computer upon which they watched the program. No feedback was given with regards to performance on this quiz. Following this, seven days after watching the final episode of the series, participants were again asked to return to the lab, where they once again completed a perceived comprehension questionnaire followed by the remaining quiz. Again, feedback was not supplied. Finally, 140 days after watching the final episode of this series, participants were again asked to return to the lab, where they once again completed a perceived comprehension questionnaire followed by a final memory retention quiz. This final quiz was the same length as the prior two quizzes and was composed of randomly selected questions from quizzes A and B. Actual timing for this final quiz was M = 140.55 days, SD = 1.47 days. In addition, at this point, participants were asked what (if any) interaction they had with the program in the 140-day interim. All participants answered that they neither re-watched nor looked up the show on the Internet, though a couple did say they had recommended the show to friends shortly after viewing.
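As an illustration of how a final quiz of the same length might be drawn at random from quizzes A and B, consider the sketch below. It assumes the final quiz kept the 40 recall / 20 recognition split of the earlier quizzes; the source states only that the length matched and that questions were selected randomly. The question records and function names are placeholders.

```python
import random


def placeholder_quiz(label):
    """Build a stand-in quiz with 40 recall and 20 recognition items (IDs only)."""
    recall = [{"id": f"{label}-recall-{i}", "type": "recall"} for i in range(40)]
    recognition = [{"id": f"{label}-recog-{i}", "type": "recognition"} for i in range(20)]
    return recall + recognition


def build_final_quiz(quiz_a, quiz_b, n_recall=40, n_recognition=20, seed=None):
    """Sample the final quiz without replacement from the pooled A and B items.

    Preserving the recall/recognition split is an assumption, not something
    the paper confirms.
    """
    rng = random.Random(seed)
    pool = quiz_a + quiz_b
    recall_pool = [q for q in pool if q["type"] == "recall"]
    recognition_pool = [q for q in pool if q["type"] == "recognition"]
    return rng.sample(recall_pool, n_recall) + rng.sample(recognition_pool, n_recognition)


final_quiz = build_final_quiz(placeholder_quiz("A"), placeholder_quiz("B"), seed=1)
print(len(final_quiz))
```

Sampling each question type separately keeps the final quiz scorable on the same recall (out of 40) and recognition (out of 20) scales as the earlier sessions.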

Analysis

On quizzes, each correct answer was given one point, whilst each incorrect answer was given zero points. Scores were calculated for each of the three quizzes. A two-way ANOVA (group x time) was run for each sub-question type (recall and recognition) to compare performance patterns. In addition, two-way ANOVAs (group x time) were run for each item on the perceived comprehension questionnaire to determine response patterns. The Bonferroni-adjusted significance threshold was p = 0.0045.
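The scoring rule and the adjusted threshold can be sketched as follows. The response data are invented for illustration. The number of comparisons (11) is not stated in the source; it is inferred here only because 0.05 / 11 rounds to the reported 0.0045, and should be read as an assumption.

```python
def score_quiz(responses):
    """Sum one point per correct answer, separately for each question type.

    `responses` is a list of (question_type, is_correct) pairs, where
    question_type is "recall" or "recognition".
    """
    totals = {"recall": 0, "recognition": 0}
    for question_type, is_correct in responses:
        totals[question_type] += 1 if is_correct else 0
    return totals


def bonferroni_alpha(family_alpha, n_tests):
    """Bonferroni correction: divide the family-wise alpha by the number of tests."""
    return family_alpha / n_tests


# Invented example: 25 of 40 recall and 14 of 20 recognition items correct.
responses = (
    [("recall", True)] * 25 + [("recall", False)] * 15
    + [("recognition", True)] * 14 + [("recognition", False)] * 6
)
scores = score_quiz(responses)

# ASSUMPTION: 11 comparisons, inferred from the reported threshold of 0.0045.
alpha = bonferroni_alpha(0.05, 11)
print(scores, round(alpha, 4))
```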

Results

Perceived comprehension scores

With regards to perceived comprehension, analyses revealed the main effect of Time was significant for the Understand, Challenging, and Predicted Performance measures, such that average Understanding & Predicted Performance ratings decreased with time whilst the average Challenge rating increased with time (Table 1; Table 2). In addition, analyses revealed a significant main effect of Group for the Enjoyment measure. Post-hoc analyses revealed that, on average, people in the binge group enjoyed the program less than those in the daily or weekly groups (Table 2; Figure 1).

Table 1: Average perceived comprehension and enjoyment ratings for each group across each testing session. Note: Scores in parentheses represent 95% CI.

Figure 1: Television program reported enjoyment ratings (out of 100). The blue line represents the binge group, the orange line represents the daily group, and the grey line represents the weekly group (Error Bars=SEM). Important considerations are explored in the ‘discussion’ section of this piece.

Memory scores

Analyses revealed a significant main effect of Time for both the recall and recognition memory questions, such that performance in all groups tended to decrease over time (Table 3). Of interest, there was a significant interaction effect for both the Recall and Recognition memory questions, such that people in the binge group demonstrated better performance one day after watching the program, with a sharper decline over time, than people in either the daily or weekly groups (Table 4; Figure 2).

Table 3: Average memory scores for each group across each testing session. Note: Scores in parentheses represent 95% CI.

| Measure | Group | One day | Seven days | 140 days |
|---|---|---|---|---|
| Recall (out of 40) | Weekly | 24.29 (21.61, 26.98) | 25.53 (23.22, 27.84) | 21.77 (19.33, 24.21) |
| Recall (out of 40) | Daily | 26.71 (24.77, 28.64) | 26.58 (24.50, 28.68) | 19.98 (18.30, 21.66) |
| Recall (out of 40) | Binge | 28.59 (27.03, 30.14) | 27.09 (25.12, 29.06) | 20.04 (17.22, 22.85) |
| Recognition (out of 20) | Weekly | 14.88 (13.12, 16.65) | 13.23 (11.30, 15.17) | 14.19 (12.46, 15.92) |
| Recognition (out of 20) | Daily | 14.29 (12.74, 15.84) | 14.06 (12.69, 15.43) | 12.81 (11.86, 13.76) |
| Recognition (out of 20) | Binge | 15.25 (13.74, 16.76) | 11.53 (10.13, 12.94) | 12.19 (10.62, 13.76) |

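Table 4: Statistics for memory scores.

| Measure | Group (main effect, df = 2) | Time (main effect, df = 2) | Group x Time (interaction, df = 4) |
|---|---|---|---|
| Recall | F = 0.531, p > 0.250, ηp² = 0.022 | F = 62.606, p < 0.001, ηp² = 0.566 | F = 4.419, p = 0.003, ηp² = 0.155 |
| Recognition | F = 0.656, p > 0.250, ηp² = 0.027 | F = 15.375, p < 0.001, ηp² = 0.243 | F = 4.652, p = 0.002, ηp² = 0.162 |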
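The partial eta squared values in Table 4 can be related to the reported F statistics through the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the table's values. The error degrees of freedom are not reported in the source; the values used (48 for the Group effect, 96 for Time and the interaction) are assumptions that follow from a mixed design with 51 participants, three groups, three sessions, and no missing data.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# ASSUMED error degrees of freedom (not reported in the paper):
DF_ERROR_BETWEEN = 51 - 3             # 48, for the Group main effect
DF_ERROR_WITHIN = (3 - 1) * (51 - 3)  # 96, for Time and Group x Time

# (F, df_effect, df_error) taken from Table 4, paired with the assumed error df.
table_4 = {
    ("recall", "group"): (0.531, 2, DF_ERROR_BETWEEN),
    ("recall", "time"): (62.606, 2, DF_ERROR_WITHIN),
    ("recall", "group_x_time"): (4.419, 4, DF_ERROR_WITHIN),
    ("recognition", "group"): (0.656, 2, DF_ERROR_BETWEEN),
    ("recognition", "time"): (15.375, 2, DF_ERROR_WITHIN),
    ("recognition", "group_x_time"): (4.652, 4, DF_ERROR_WITHIN),
}

effect_sizes = {
    key: round(partial_eta_squared(*values), 3) for key, values in table_4.items()
}
for key, value in effect_sizes.items():
    print(key, value)
```

Under these assumed error terms, the computed values agree with the ηp² figures reported in Table 4 to three decimal places.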

Figure 2: Percentage correct from television program memory tests. The blue line represents the binge group, the orange line represents the daily group, and the grey line represents the weekly group (Error Bars=SEM). A) Recall percentage scores across 3 testing sessions. B) Recognition percentage scores across 3 testing sessions.

Discussion

This represents the first controlled and independent study to explore the impact of varied viewing schedules on the perceived comprehension of, and memory for, a popular television program. With regards to perceived comprehension, our results revealed no significant effect of viewing schedule on how well participants thought they understood the show, how challenging they found it, how much mental effort they expended watching it, or how well they predicted they would perform on future tests of memory. Though the purpose of this experiment was not for participants to meet a specific learning outcome, exploring perceived comprehension allows a glimpse into the psychological processes underpinning their cognitive engagement with the series. The fact that these measures did not differ between groups, even across the different time points, suggests that viewing schedule did not affect the manner in which participants approached, considered, and comprehended show material.

Interestingly, our results revealed that binge watchers reported enjoying the viewed program significantly less than people who watched the same show on a daily or weekly schedule. This pattern remained at seven and 140 days after viewing the show.

This is, perhaps, a counterintuitive finding given the increased popularity of binge watching. However, this apparent discrepancy may be reconciled by differentiating between the provider and the show. As noted above, binge watchers demonstrate increased satisfaction with the program provider (e.g., Netflix; see Merikivi, et al., 2016). However, this is different from satisfaction with the show being watched. Our results suggest that, although individuals may appreciate the opportunity to binge watch, this enjoyment may not be reflected in ratings for the show itself.

In addition, there is one important factor that may have influenced the reported enjoyment rating: process. Although all participants watched the show on the same computer screen in the same manner, it would be naïve to think that the act of doing so in a single afternoon (à la the binge watch group) would not be reflected in this measure. As such, this result should not be understood as a definitive measure of television program enjoyment, per se: rather, it should be interpreted as reported show enjoyment within the confines and structure of this unique environment. Additional experiments exploring this measure within more realistic and ecologically valid settings are important to explicate the interaction of viewing process on reported enjoyment for the show being watched.

Finally, with regards to this enjoyment measure, it has been argued that the mechanics of a serialized program (such as the type utilized here) are quite different from those utilized for episodic, un-scripted, and other forms of television programs. For example, serialized shows employ a number of techniques aimed at assisting the viewer in not only following the story of a particular episode, but also embedding this within the larger narrative of the show (Mittell, 2010). These ‘embedded redundancies’ (such as diegetic retelling, visual cueing, flashbacks, etc.) shift viewer expectations and can influence show enjoyment depending upon their relative spacing. A binge viewer may be reminded of a key event several times over the course of only a couple of hours, leading to a clear recognition of repetition within the program that generates frustration or ennui, whilst a daily or weekly viewer will come across the same reminders after long delays, making them more meaningful and appreciated. As other forms of television programs do not employ similar techniques, it would be dangerous to extrapolate these findings to other genres, styles, or shows. Similarly, as modern serialized programs typically utilize more complex narrative mechanics that eschew the practice of ‘simple mass appeal’ (such as centripetal complexity: the deep and continual exploration of the psychological characteristics of one or several characters over the course of potentially dozens of hour-long episodes; see Mittell, 2013), self-reported show enjoyment levels will likely reflect these complexities and change as the narrative style changes. It is wholly possible that, had we utilized an episodic comedy or documentary-type show, we would have generated different self-reported show enjoyment levels across groups.

Turning to memory, our results revealed significant effects of viewing schedule on both recall and recognition memory. More specifically, we found binge watchers demonstrated the strongest memory performance 24 hours after watching the program, but this performance showed the steepest decline over the ensuing 140 days. Conversely, weekly viewers demonstrated the weakest memory performance 24 hours after watching the program, but this performance showed the shallowest decline over the ensuing 140 days. Interestingly, daily viewers showed a memory pattern somewhere between these two extremes, with no significant decline in recall or recognition memory performance in the week following program watching, and a relatively steep decline over the ensuing months.
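The relative steepness of these declines can be read directly from the group means in Table 3. The sketch below converts each group's one-day and 140-day means into percentage correct and reports the drop in percentage points. It is a descriptive illustration only, not a substitute for the interaction tests in Table 4, and the function and variable names are hypothetical.

```python
# Maximum possible score for each measure (from Table 3).
MAX_SCORE = {"recall": 40, "recognition": 20}

# Group means from Table 3 at one day, seven days, and 140 days.
TABLE_3_MEANS = {
    "recall": {
        "weekly": (24.29, 25.53, 21.77),
        "daily": (26.71, 26.58, 19.98),
        "binge": (28.59, 27.09, 20.04),
    },
    "recognition": {
        "weekly": (14.88, 13.23, 14.19),
        "daily": (14.29, 14.06, 12.81),
        "binge": (15.25, 11.53, 12.19),
    },
}


def decline_in_percentage_points(measure):
    """Drop in percentage correct from the one-day to the 140-day session, per group."""
    max_score = MAX_SCORE[measure]
    return {
        group: (means[0] - means[-1]) / max_score * 100
        for group, means in TABLE_3_MEANS[measure].items()
    }


declines = {measure: decline_in_percentage_points(measure) for measure in TABLE_3_MEANS}
for measure, by_group in declines.items():
    steepest = max(by_group, key=by_group.get)
    shallowest = min(by_group, key=by_group.get)
    print(measure, "steepest:", steepest, "shallowest:", shallowest)
```

On these means, the binge group shows the largest drop and the weekly group the smallest for both recall and recognition.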

These memory patterns are in strong alignment with the literature exploring the impact of practice schedule on future performance (for review, see Cepeda, et al., 2008). There is a large body of evidence suggesting that ‘distributing’ practice over a longer time period often leads to worse short-term but stronger long-term performance than ‘massing’ or ‘cramming’ practice into a shorter time period. More specifically, as the time between practice sessions grows, immediate skill/fact acquisition decreases, but retention of these skills/facts increases. Though the exact mechanisms remain uncertain, this improvement is thought to occur due to a combination of increased consolidation (the storage of information within long-term memory, largely during periods of sleep) and differential encoding (the development of novel memory traces with subsequent presentations) (Bjork and Allen, 1970). Although this body of evidence derives almost exclusively from studies of practice (repeated exposure to and revision of the same stimulus set over successive sessions), it is interesting to see that the same memory patterns hold true for the novel, non-repeated material viewed in this study. This suggests that, perhaps, there is a distribution effect of instruction as well as practice: more specifically, as the time between sessions of exposure to evolving information within a singular story is increased, the number of facts immediately encoded decreases whilst the strength of memory consolidation increases. Of course, this is simply a theory at this stage. Accordingly, it would be worth considering research into this possible ‘distributed instruction’ effect to determine if it does, indeed, exist beyond these findings and if its boundary conditions match those of the well-established ‘distributed practice’ effect.

Taken together, our results demonstrate a significant impact of viewing schedule on viewer perceived comprehension and memory. These results may present an issue of consideration for modern on-demand television distribution models. As noted above, as content memory is a strong predictor for future program engagement, it is possible this model may drive viewers away from future programs. On the other hand, established networks and cable networks champion a fixed daily or weekly model of program release. Whereas this might promote future engagement, there is evidence that this ‘appointment’ schedule may negatively affect network perception. This leaves distributors with a possible quandary, such that there might be some tension between product and brand satisfaction.

Future controlled studies will be important to elucidate the boundaries of the memory effects found here. Will these patterns hold for all types of narrative shows (e.g., episodic, reality, or documentary), or only serial shows like the one utilized here? In addition, will these patterns hold when people view a program in their own home with access to additional technologies (e.g., laptop or cell phone) and can make decisions about the timing of their viewing? As binge watching becomes the premier way to watch TV, it would behoove us to further elucidate the impact of this practice on sustained learning and perceived comprehension so as to develop content and provide viewing recommendations that lead to the greatest enjoyment and potential for sustained memory (and, by extension, continued engagement) with unique programs.

Conclusion

This study demonstrates a significant impact of viewing schedule on viewer memory and at least one aspect of perceived comprehension. More specifically, it appears that, despite its position as the preferred viewing schedule amongst modern television consumers, binge watching may affect both sustained memory of viewed content and self-reported show enjoyment levels. Interestingly, the more traditional daily- and weekly-episode viewing schedules improved sustained memory in an incremental manner, but also differentially impacted self-reported show enjoyment levels. This raises interesting and important questions for digital on-demand program distributors concerning the balance between user preference and product satisfaction.

About the authors

Jared Cooney Horvath (Ph.D., Med) is a research fellow with the Science of Learning Research Center at the University of Melbourne. E-mail: jared [dot] horvath [at] unimelb [dot] edu [dot] au

Alex J. Horton is a research assistant with the Science of Learning Research Center at the University of Melbourne. E-mail: alex [dot] horton [at] unimelb [dot] edu [dot] au

Jason M. Lodge (Ph.D., MHEd, MLS&T) is a senior lecturer with the Melbourne Center for Study of Higher Education at the University of Melbourne. E-mail: jason [dot] lodge [at] unimelb [dot] edu [dot] au

John A.C. Hattie (Ph.D.) is professor with the Melbourne Graduate School of Education at the University of Melbourne. E-mail: jhattie [at] unimelb [dot] edu [dot] au

Acknowledgements

This work was supported by the ARC-SRI: Science of Learning Research Centre (project number SR120300015).