Research has shown that surveys which require respondents to recall events can be subject to fairly high measurement error. Recall error tends to be less problematic for highly salient events and events that have recently occurred. However, there is less information on whether some respondents are more prone than others to having problems answering questions that involve recalling an event. This research uses data from the American Driving Study, in which people are asked to report the length of driving trips that they made yesterday. We analyze over 16,000 reported driving trips collected from 7,913 respondents who reported having been the driver for at least one driving trip on the day before they were interviewed (yesterday). For this analysis, we are concerned with two types of recall problems: (1) the inability of the respondent to provide an estimate of either the length or duration of a driving trip and (2) providing an estimate of miles driven that is inconsistent given the duration and purpose of the trip. This paper finds differences in the characteristics of respondents who are more likely to have problems reporting their prior-day driving behavior. The main finding, however, is that longer trips were harder to report on and have a bigger impact on key survey estimates. We conclude with some discussion of important considerations that survey practitioners should keep in mind when designing surveys that include recall questions.

Introduction

The American Driving Study (ADS), also known as the National Light Vehicle Use Survey, is a nationally representative study that continuously gathers data on the driving exposure of different groups of drivers. The ADS is sponsored by the American Automobile Association Foundation for Traffic Safety (AAAFTS) and managed by the Urban Institute, with data collected by Social Science Research Solutions (SSRS).

This report includes ADS data collected between May 21, 2013 and December 31, 2015. For completeness, we briefly summarize our ADS approach and protocols. A detailed description of the ADS methods and strategy can be found in the AAAFTS report American Driving Survey 2014–2015.

The ADS is a telephone interview which uses a random sample of both landlines and cell phones. The survey instrument begins with a household roster which is administered to an adult respondent. If the respondent reports that one or more drivers live in the household, the program then randomly selects the driver(s) who are asked to complete the second part of the instrument, the Trip/Driver Interview. The Trip Interview is administered to one or more drivers in the household, determined using a probability procedure that ensures that teenage drivers, drivers over 75 years of age, and those who report driving every day receive a higher chance of being selected.
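The selection step described above can be sketched as a weighted random draw from the household roster. This is only an illustration: the weight values, the field names, and the single-driver draw are assumptions made for the example, since the text says only that some groups receive a higher chance of selection.

```python
import random

# Hypothetical relative weights: the source says some groups get a higher
# chance of selection but does not report the actual values used.
BASE_WEIGHT = 1.0
BOOSTED_WEIGHT = 2.0


def selection_weight(driver):
    """Relative chance of selection for one rostered driver."""
    oversampled = (
        driver["age"] <= 19            # teenage drivers
        or driver["age"] > 75          # drivers over 75 years of age
        or driver["drives_every_day"]  # drivers reported to drive every day
    )
    return BOOSTED_WEIGHT if oversampled else BASE_WEIGHT


def selection_probabilities(roster):
    """Probability that each driver is picked when one driver is drawn."""
    weights = [selection_weight(d) for d in roster]
    total = sum(weights)
    return [w / total for w in weights]


def select_driver(roster, rng):
    """Draw one driver with probability proportional to the weights."""
    weights = [selection_weight(d) for d in roster]
    return rng.choices(roster, weights=weights, k=1)[0]


# Made-up three-driver household used only to exercise the functions.
roster = [
    {"name": "A", "age": 17, "drives_every_day": False},
    {"name": "B", "age": 45, "drives_every_day": False},
    {"name": "C", "age": 52, "drives_every_day": True},
]
probs = selection_probabilities(roster)  # relative weights 2, 1, 2
chosen = select_driver(roster, random.Random(2015))
```

Keeping the selection probabilities explicit matters because an unequal-probability draw of this kind is normally undone at the analysis stage by weighting each selected driver by the inverse of that probability.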

Table 1 provides detailed information on the number of households that participated in the study, the number of persons 16 or older living in these households, and how many of them were drivers. The table also shows how many drivers were sampled to fill out the trip interview, how many of them finished the trip interview, and how many trips were reported. In addition, the table provides information on the 2014 through 2015 response rates (using the AAPOR RR3 formula) and the length of the survey. Because data are collected continually throughout the year, we break the data down by season.
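For readers unfamiliar with it, AAPOR Response Rate 3 has the standard form shown below. Here I is the number of complete interviews, P partial interviews, R refusals and break-offs, NC non-contacts, O other non-interviews, UH cases where it is unknown whether the household is occupied or eligible, UO other cases of unknown eligibility, and e the estimated proportion of unknown-eligibility cases that are actually eligible. The source does not report how e was estimated for the ADS.

```latex
\mathrm{RR3} = \frac{I}{(I + P) + (R + NC + O) + e\,(UH + UO)}
```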

Table 1 shows we collected a total of 16,130 driving trips, and for most of them we were able to estimate the miles driven. However, there were 601 driving trips for which the driver either was not able to provide an estimate of the miles or duration they drove or provided a response where the miles driven was inconsistent with the duration of the trip. The focus of this methods brief is to learn more about the drivers who had difficulty recalling or reporting the miles they drove yesterday and to think about questionnaire changes that could improve the quality of the recall data.

Frequency and Impact of Recall Problems

Table 2 shows both the number and percentage of driving interviews and driving trips that required imputation. Figure 1 displays graphically the breakdown of driving reports that required data editing or imputation compared with those that did not require any editing or imputation. The data are based on the 7,913 American driver interviews in which drivers interviewed for this study completed a 24-hour report of driving trips taken the day before the respondent was interviewed. Anyone age 16 or older in the United States who lived in a household with a landline or had a cell phone, and who was reported (by themselves or by a household member) to drive ‘almost every day,’ ‘sometimes’ or ‘rarely’, was eligible for the driver interview. Around 6 percent of all driving interviews included a driving trip that required imputation to estimate the miles driven. However, in only about 2 percent of driving interviews did all of the reported driving trips require imputing the miles driven.

Table 2 Number of trips and driving interviews that required imputation.

| | Count | Percent of total |
|---|---|---|
| Interviews: | | |
| Number of completed trip interviews | 7,913 | 100.0% |
| Number of trip interviews with no imputation | 7,622 | 93.7% |
| Number of trip interviews with some trips used in determining miles driven or duration needing imputation | 369 | 4.8% |
| Number of trip interviews with all trips used in determining miles driven or duration needing imputation | 115 | 2.2% |
| Trips: | | |
| Number of driving trips reported | 16,130 | 100.0% |
| Number of reported trips requiring imputation to determine miles driven or duration | 601 | almost 4% |

From the 7,913 driving interviews, a total of 16,130 unique driving trips were reported. Of these, 601, or almost 4 percent, were trips in which imputation was needed to determine the miles or duration of the trip. In most cases (92 percent), the imputation was needed because the person was unable to estimate the miles driven, or was able to report when the trip started or when it ended, but not both. Thus, much of the imputation involves estimating miles driven or length of trip for trips in which we know either how long the trip lasted or how many miles were driven.
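One simple way to fill in a missing value when the other is known is to convert between miles and minutes using an average speed taken from fully reported trips. The sketch below illustrates that idea only. It is not the ADS imputation procedure, which is not described here, and the function names, field names, and trip values are all hypothetical.

```python
def mean_speed_mph(trips):
    """Average speed across trips that report both miles and minutes."""
    complete = [
        t for t in trips
        if t["miles"] is not None and t["minutes"] is not None and t["minutes"] > 0
    ]
    total_miles = sum(t["miles"] for t in complete)
    total_hours = sum(t["minutes"] for t in complete) / 60.0
    return total_miles / total_hours


def impute_trip(trip, speed_mph):
    """Fill in whichever of miles or minutes is missing, and flag the trip."""
    filled = dict(trip)
    filled["imputed"] = False
    if trip["miles"] is None and trip["minutes"] is not None:
        filled["miles"] = speed_mph * trip["minutes"] / 60.0
        filled["imputed"] = True
    elif trip["minutes"] is None and trip["miles"] is not None:
        filled["minutes"] = trip["miles"] / speed_mph * 60.0
        filled["imputed"] = True
    return filled


# Made-up trips: two complete, one missing miles, one missing duration.
trips = [
    {"miles": 10.0, "minutes": 20.0},
    {"miles": 30.0, "minutes": 40.0},
    {"miles": None, "minutes": 30.0},
    {"miles": 15.0, "minutes": None},
]
speed = mean_speed_mph(trips)  # 40 miles over 60 minutes
completed = [impute_trip(t, speed) for t in trips]
```

A production approach would more plausibly condition the speed on trip purpose, time of day, or driver characteristics. The flag on each record keeps imputed values distinguishable in later analysis.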

Table 3 compares the estimates of duration and miles driven when respondents whose trip data required imputation are included or excluded. We show this comparison because the average miles driven is one of the most important estimates that comes out of the ADS analyses. The key finding is that although less than 4 percent of trips require imputation, including the imputed trips increases the estimate of annual miles driven by almost 1,000 miles. This increase means that the trips requiring imputation are not random occurrences and are more likely to be longer trips. This is somewhat intuitive, since you would expect it to be harder to estimate miles driven on longer trips, and longer trips are also more likely to be unusual trips that drivers do not routinely make.

Table 3 Impact of imputation on driving estimates.

| | Total miles driven, all driving trips: mean | Total miles driven, all driving trips: median | Total duration of all driving trips (minutes): mean | Total duration of all driving trips (minutes): median |
|---|---|---|---|---|
| Daily trip estimates | | | | |
| All drivers 16+ years old (including respondents with imputed data) | 29 | 10 | 46.0 | 23.0 |
| All drivers 16+ years old (excluding respondents with imputed data) | 26 | 9 | 44.0 | 20.0 |
| Annual trip estimates | | | | |
| All drivers 16+ years old (including respondents with imputed data) | 10,589 | | 16,790 | |
| All drivers 16+ years old (excluding respondents with imputed data) | 9,618 | | 16,060 | |
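The "almost 1,000 miles" figure quoted in the text corresponds to the gap between the two annual means of 10,589 and 9,618 in Table 3. The short calculation below works out that gap and expresses it relative to the estimate without imputation. The variable names are illustrative.

```python
# Annual means as reported in Table 3, with and without imputed data.
annual_miles_with_imputation = 10_589
annual_miles_without_imputation = 9_618

difference = annual_miles_with_imputation - annual_miles_without_imputation
relative_increase = difference / annual_miles_without_imputation

print(f"Difference: {difference} miles per year")
print(f"Relative increase: {relative_increase:.1%}")
```

A shift of this relative size from fewer than 4 percent of trips is what one would expect if the affected trips were much longer than typical trips.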

Recall Difference by Demographics and Types of Trip

Table 4 shows differences in the percentage of drivers with missing miles driven or missing trip duration by various subgroups. Figure 2 displays which groups required more imputation on average (above the overall average of 6.1 percent) followed by groups that required less imputation on average.

Table 4 Demographic differences in the percentage of drivers with missing miles driven or missing trip duration.

| Demographic group | Percentage of drivers with estimates without any edits/imputations | Percentage of drivers with missing miles driven | Percentage of drivers with missing trip duration |
|---|---|---|---|
| All drivers (n=7,913) | 93.9% | 3.7% | 2.5% |
| Gender: | | | |
| Males (n=3,806) | 95.5% | 1.9% | 2.6% |
| Females (n=4,107) | 92.4% | 5.3% | 2.4% |
| Race and ethnicity: | | | |
| White (n=5,608) | 94.5% | 3.2% | 2.3% |
| African American (n=1,021) | 91.2% | 6.0% | 3.0% |
| Hispanic (n=769) | 93.9% | 4.0% | 2.1% |
| Other (n=428) | 92.3% | 3.7% | 4.2% |
| Age: | | | |
| 16–19 (n=485) | 91.3% | 5.6% | 3.1% |
| 20–29 (n=1,051) | 93.2% | 3.4% | 3.3% |
| 30–49 (n=2,047) | 93.1% | 3.9% | 3.2% |
| 50–64 (n=2,189) | 94.5% | 3.0% | 2.6% |
| 65–74 (n=1,070) | 95.5% | 3.6% | 0.9% |
| 75+ (n=1,071) | 94.4% | 4.2% | 1.5% |
| Education: | | | |
| Grade school or some high school (n=471) | 93.0% | 4.5% | 2.5% |
| High school graduate (n=2,186) | 93.5% | 3.8% | 2.7% |
| Some college (n=1,906) | 93.1% | 4.1% | 2.8% |
| College graduate (n=1,850) | 94.8% | 3.1% | 2.3% |
| Graduate school (n=1,187) | 94.9% | 3.0% | 2.2% |

Yellow shaded box means estimate is significantly different from the overall estimate at the 0.05 level.

Figure 2 Percentage of drivers with estimates that have any edits/imputations.

The likelihood of having to impute miles driven was almost three times greater for female drivers than for male drivers, but there was almost no gender difference in the need to impute trip duration. We find higher rates of imputation for miles and duration among African American drivers and lower rates of imputation for White drivers. Teenage drivers and drivers age 75 or older had higher rates of imputation for both miles driven and duration of trip. Education was not a factor in imputation rates for trip duration but was a factor in imputation rates for miles driven, with less educated drivers having higher rates of imputation.
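The "almost three times" comparison can be checked directly from the Table 4 percentages for missing miles driven (5.3 percent of 4,107 female drivers, 1.9 percent of 3,806 male drivers). The sketch below also runs a pooled two-proportion z-test as a rough gauge of the gender gap. That test is an assumption made for illustration: it treats the data as a simple random sample, ignores any survey weights or design effects, and is not the comparison against the overall estimate that the table note describes.

```python
import math

# Table 4: percentage of drivers with missing miles driven, by gender.
p_female, n_female = 0.053, 4107
p_male, n_male = 0.019, 3806

ratio = p_female / p_male

# Pooled two-proportion z-test under simple random sampling
# (an assumption: survey weights and design effects are ignored).
pooled = (p_female * n_female + p_male * n_male) / (n_female + n_male)
std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_female + 1 / n_male))
z = (p_female - p_male) / std_error
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

print(f"ratio = {ratio:.2f}, z = {z:.2f}, p = {p_value:.2g}")
```

Because a complex sample design usually inflates standard errors, the z statistic from this simplified test should be read as an upper bound on the strength of the evidence.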

Table 5 shows differences in the percentage of drivers with missing miles driven or missing trip duration by various other factors, such as whether the trip took place on a weekend or a weekday, the time of year the trip took place, the length of the trip, how often the person drives, and the region of the country the driver lives in. Figure 3 displays which of these other factors required more imputation on average (above the overall average of 6.1 percent) followed by factors that required less imputation on average.

Table 5 Other key differences in the percentage of drivers with missing miles driven or missing trip duration.

| Group | Percentage of drivers with estimates without any edits/imputations | Percentage of drivers with missing miles driven | Percentage of drivers with missing trip duration |
|---|---|---|---|
| All drivers (n=7,913) | 93.9% | 3.7% | 2.5% |
| Weekend or weekday | | | |
| Weekday (n=5,570) | 93.9% | 3.8% | 2.4% |
| Weekend (n=2,343) | 93.8% | 3.4% | 2.9% |
| Season | | | |
| Winter Q1: January–March (n=1,480) | 93.0% | 4.1% | 3.1% |
| Spring Q2: April–June (n=1,718) | 94.6% | 2.7% | 2.6% |
| Summer Q3: July–September (n=2,333) | 94.0% | 3.7% | 2.4% |
| Fall Q4: October–December (n=2,382) | 93.8% | 4.0% | 2.2% |
| Length of driving trip | | | |
| Drove less than 10 miles (n=1,675) | 96.4% | 3.2% | 0.4% |
| 10 to 20 miles (n=1,894) | 93.6% | 5.2% | 1.3% |
| Drove more than 20 miles (n=4,343) | 88.0% | 6.2% | 6.0% |
| Frequency of driving | | | |
| Drive almost every day (n=6,335) | 93.2% | 3.9% | 2.9% |
| Drive sometimes (n=1,114) | 96.1% | 3.1% | 0.9% |
| Rarely drives (n=464) | 97.4% | 1.7% | 0.4% |
| Census region | | | |
| Northeast (n=1,487) | 94.7% | 3.6% | 1.7% |
| Midwest (n=1,862) | 94.3% | 3.3% | 2.5% |
| South (n=3,068) | 93.4% | 3.7% | 3.0% |
| West (n=1,496) | 93.5% | 4.2% | 2.3% |

Yellow shaded box means estimate is significantly different from the overall estimate at the 0.05 level.

Figure 3 Percentage of drivers with estimates that have any edits/imputations.

Driving reports for weekend travel did not differ from those for weekday travel in the percentage of estimates that needed edits or imputation. This was somewhat surprising, since you would expect weekday driving to be more routine and thus easier to report on. We did observe a higher percentage of driving reports during the winter that needed to be edited or imputed. Also, a lower percentage of driving reports from people who live in the Northeast region of the country needed to be edited or imputed. Trip length was by far the most important factor in determining whether a driver is more likely to have trouble recalling trip miles or duration: longer trips, defined as trips of more than 20 miles, were much more likely to require editing or imputation.

Discussion

A key factor in deciding whether to impute for missing data is whether the data are missing at random. In our driving study, few respondents had difficulty recalling the length or duration of driving trips, but the trips where that information was missing were not random trips. So it became important to impute values for the missing data, especially since the reported estimates were aggregate variables that summed information across all trips taken. Had we not imputed but just aggregated across fewer trips, the estimates would have been inaccurate, since the trips with missing information differed considerably from the typical driving trip.

Some of the differences in the missing trip information can be attributed to the characteristics of the respondent: women, African Americans, and both younger and older drivers were all more likely to be unable to estimate how many miles a trip covered or how long it lasted. However, the bigger story is about the type of trip that was being reported. That is, respondents were more likely to have difficulty reporting the miles driven or the duration of longer trips. Because these longer trips are invaluable in estimating overall miles driven or time spent driving, replacing the missing data with imputed values based on other trip information was essential for any analysis of driving behavior.

As in the ADS, we would expect that many, if not most, studies that collect recall data will find that some groups of people have a harder time providing the requested information. Therefore, it can be useful for recall studies to figure out who has difficulty providing responses and why. Knowing who has difficulty could lead to ways of improving the wording or tailoring the questions. It may even be worth developing questions that collect alternative information that would help with imputing or interpreting a respondent’s responses. For instance, for our travel study, we learned that longer trips posed more of a recall challenge, which could be eased by collecting more information on longer or unusual driving trips.

Another possibility for collecting difficult recall information is to ask respondents about their confidence in their responses and then collect more information from those respondents who report lower confidence in their answers. Finally, consider developing flexible probes that interviewers could tailor to help some respondents or use for unusual events. It would be important to test whether these probes lead to less missing or unusable data without producing any bias in the responses collected. For instance, for the second year of data collection, we started checking the estimated miles per hour a person was driving based on the miles and duration of his or her driving trip. For trips where people were driving less than 5 miles per hour or more than 65 miles per hour, we had them verify their responses. We found that this reduced the number of trips that required imputation without introducing any potential bias in the responses.
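The speed check described above is simple enough to run during the interview. The following is a minimal sketch using the 5 and 65 miles per hour thresholds from the text. The function names and the handling of non-positive durations are assumptions, not details of the ADS instrument.

```python
MIN_MPH = 5.0   # lower threshold from the text
MAX_MPH = 65.0  # upper threshold from the text


def implied_speed_mph(miles, minutes):
    """Average speed implied by the reported trip miles and duration."""
    if minutes <= 0:
        raise ValueError("trip duration must be positive")
    return miles / (minutes / 60.0)


def needs_verification(miles, minutes):
    """True when the implied speed is below 5 mph or above 65 mph."""
    speed = implied_speed_mph(miles, minutes)
    return speed < MIN_MPH or speed > MAX_MPH


# Made-up reports: a plausible trip, a very slow one, and a very fast one.
print(needs_verification(10, 20))  # implied speed of 30 mph
print(needs_verification(1, 30))   # implied speed of 2 mph
print(needs_verification(80, 60))  # implied speed of 80 mph
```

A flagged trip would prompt the interviewer to confirm the miles and the start and end times with the respondent rather than reject the answer, since slow or fast trips can be genuine.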

Disclaimer
The Survey Practice content may not be distributed, used, adapted, reproduced, translated or copied for any commercial purpose in any form without prior permission of the publisher. Any use of this e-journal in whole or in part, must include the customary bibliographic citation and its URL.