The Longitudinal Study of American Youth (LSAY) is a project that was funded by the National Science Foundation in 1985 and was designed to examine the development of: (1) student attitudes toward and achievement in science, (2) student attitudes toward and achievement in mathematics, and (3) student interest in and plans for a career in science, mathematics, or engineering, during middle school, high school, and the first four years post-high school. The relative influence parents, home, teachers, school, peers, media, and selected informal learning experiences had on these developmental patterns was considered as well.

The older LSAY cohort, Cohort One, consisted of a national sample of 2,829 tenth-grade students in public high schools throughout the United States. These students were followed for an initial period of seven years, ending four years after high school in 1994. The younger cohort, Cohort Two, consisted of a national sample of 3,116 seventh-grade students in public schools that served as feeder schools to the same high schools in which the older cohort was enrolled. These students were also followed for an initial period of seven years, concluding with a telephone interview approximately one year after the end of high school in 1994.

Beginning in the fall of 1987, the LSAY collected a wide array of information, including: (1) a science achievement test and a mathematics achievement test each fall, (2) an attitudinal and experience questionnaire at the beginning and end of each school year, (3) reports about education and experience from all science and math teachers in each school, (4) reports on classroom practice by each science and math teacher serving an LSAY student, (5) an annual 25-minute telephone interview with one parent of each student, and (6) extensive school-level information from the principal of each study school.

In 2006, the NSF funded a proposal to re-contact the original LSAY students (then in their mid-30s) and resume data collection to determine their educational and occupational outcomes. An extensive tracking activity involved: (1) online tracking, (2) newsletter mailing, (3) calls to parents and other relatives, (4) use of alternative online search methods, and (5) questionnaire mailing. Through this activity, more than 95 percent of the original sample of 5,945 LSAY students were located or accounted for. In addition to re-contacting the students, the proposal defined a new eligible sample of approximately 5,000 students, and these young adults were asked to complete a survey in 2007.

A second survey, conducted in the fall of 2008, gathered updated information about occupational and education outcomes and measured the civic scientific literacy of these young adults; to date, more than 3,200 participants have responded. A third survey, conducted in the fall of 2009, gathered updated information about occupational and education outcomes and measured the participants' use of selected informal science education resources; to date, more than 3,200 participants have responded. A fourth survey, conducted in the fall of 2010, gathered updated information about occupational and education outcomes and included questions about the participants' interactions with their children; to date, more than 3,200 participants have responded. Finally, a fifth survey, conducted in the fall of 2011, gathered updated information about education outcomes and included an expanded occupation battery and an expanded spousal information battery for all participants. The 2011 questionnaire also included items about the 2011 Fukushima incident in Japan, along with attitudinal items about nuclear power and global climate change. To date, approximately 3,200 participants have responded to the 2011 survey.

The public release data files include information collected from the students in the national probability sample, their parents, and the science and mathematics teachers in the students' schools. The data cover the initial seven years, beginning in the fall of 1987, as well as the 2007, 2008, 2009, 2010, and 2011 questionnaires.

Part 1: LSAY Merged Cohort (Base File) contains student and parent data from both cohorts of the LSAY from 1987-1994 and student follow-up data from 2007-2011. Additionally, Parts 2 - 5 contain information gathered from two teacher background questionnaires and two principal questionnaires from 1987-1994.

Access Notes

These data are available only to users at ICPSR member institutions.

Universe:
7th and 10th grade students in public schools in the United States in 1987, as well as those same students who could be recontacted in 2007, 2008, 2009, 2010, and 2011 with a follow-up questionnaire.

Data Types:
survey data

Data Collection Notes:

The original two-cohort, two-file data structure reflected the initial period of data collection, but it was awkward for users who wanted to compare the two cohorts or to combine them for various analyses. The merged data file includes a variable indicating the original cohort, allowing a user to repeat or extend any analysis conducted with the previous LSAY release file. However, the variable names in the merged file have been revised to correct dual or conflicting variable names and indicators. The new merged file structure will facilitate the annual release of new cycles of data collection through the addition of variables to the base system.

Methodology

Study Design:
The LSAY sample design consisted of a sample of high schools and a sample of middle or junior high schools that sent students to the participating high schools. The latter set of schools was selected by obtaining information from high school officials on feeder patterns to their schools. Many of the sampled high schools were served by only one feeder school, and in nine cases the participating high school itself included the middle school grade levels. A number of the high schools, however, received students from two or more feeder schools, and in these cases one feeder school had to be selected. The selection procedure involved calculating the proportion of students in the high school who came from each feeder school and then randomly selecting one feeder school, with probability of selection proportional to each feeder's contribution to the high school's enrollment. If a school or district declined to participate in the LSAY, a replacement school of similar size, with a zip code indicating proximity to the original selection, was chosen. Once a school's cooperation was secured, the LSAY obtained a complete student roster for the seventh and tenth grade cohorts. To provide a sufficient number of students in each school to compute school effects in subsequent analyses, a sample of 60 students was selected from each school. Students were selected randomly from the lists and asked to participate until the target response size was achieved. In schools with fewer than 60 students in their seventh or tenth grade classes, all students were selected for participation. When a student refused to participate, the school research coordinator was directed to draw a replacement from an additional list of students, starting at the beginning of the alternate list and proceeding sequentially until a participant was secured. The alternate list was selected randomly, using the same procedures used to construct the original sample.
The LSAY fielded over 40 instruments for Cohort Two and 26 for Cohort One from October 1987 through June 1994. LSAY tracking activities resumed in April 2006, and re-entry questionnaires were administered in 2007, 2008, 2009, 2010, and 2011. For more information on Study Design, please refer to the Original P.I. Documentation in the ICPSR User Guide.
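
The feeder-school selection described above, one school drawn with probability proportional to its contribution to the high school's enrollment, can be illustrated with a minimal Python sketch. The school names, enrollment counts, and the function name select_feeder are hypothetical and are not taken from the LSAY documentation; the original study's actual procedure may have differed in detail.

```python
import random


def select_feeder(feeder_counts, rng):
    """Pick one feeder school with probability proportional to the number
    of the high school's students that came from it."""
    names = list(feeder_counts)
    weights = [feeder_counts[name] for name in names]
    # random.Random.choices draws with replacement using relative weights.
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical feeder pattern for one high school (illustrative numbers only).
feeders = {"Feeder A": 120, "Feeder B": 60, "Feeder C": 20}

rng = random.Random(1987)  # fixed seed so the sketch is repeatable
picked = select_feeder(feeders, rng)
print(picked)
```

Under these made-up counts, Feeder A supplies 120 of 200 students, so it would be selected about 60 percent of the time over repeated draws.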

Sample:
The sampling scheme for the base year of the LSAY was a two-stage stratified probability sample. The United States was stratified by four geographic regions and by three levels of urban development (central city, suburban, and nonmetropolitan) to produce a total of 12 strata. Stage I was the selection of schools to participate in the study. Stage II was the random selection of 60 students within each school selected in Stage I. LSAY tracking activities resumed in April 2006, and re-entry questionnaires were administered in 2007, 2008, 2009, 2010, and 2011. For more information on Sampling, please refer to the Original P.I. Documentation in the ICPSR User Guide.
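
As a rough illustration of this design, the Python sketch below crosses four regions with three urban-development levels to form the 12 strata and draws a Stage II student sample from a school roster. The region labels are placeholders (this description does not name the regions), and the rosters and the function name sample_students are hypothetical. Stage I school selection within strata is not shown.

```python
import itertools
import random

# Placeholder region labels: the description says "four geographic regions"
# but does not name them.
REGIONS = ["region 1", "region 2", "region 3", "region 4"]
URBAN_LEVELS = ["central city", "suburban", "nonmetropolitan"]

# Crossing the two stratifiers yields the 12 strata.
STRATA = list(itertools.product(REGIONS, URBAN_LEVELS))


def sample_students(roster, rng, target=60):
    """Stage II: draw `target` students at random from a school roster,
    or take everyone when the roster is no larger than the target."""
    if len(roster) <= target:
        return list(roster)
    return rng.sample(roster, target)


rng = random.Random(1987)
large_school = [f"student_{i}" for i in range(240)]  # hypothetical roster
small_school = [f"student_{i}" for i in range(45)]   # hypothetical roster

print(len(STRATA))                               # 12
print(len(sample_students(large_school, rng)))   # 60
print(len(sample_students(small_school, rng)))   # 45
```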

Time Method:
Longitudinal: Cohort/ Event-based

Kind of Data:
quantitative

Weight:
The data files are distributed unweighted, but Part 1: LSAY Merged Cohort (Base File) contains many weight variables. These weights were calculated to adjust for unequal erosion from the original sample over the period of the longitudinal study. For example, if 10 percent of students from School A drop out of or are lost to the study and 20 percent of students from School B drop out of or are lost to the study, unweighted use of the dataset would overestimate the contribution of students from School A and underestimate the contribution of students from School B. Correct estimates of national distributions can only be obtained by using the appropriate weight variable for the analysis at hand. A new longitudinal weight, WGT12A, was created for the merged file containing both cohorts and should be used for all longitudinal analyses containing both cohorts for the high school years. In addition, please refer to the Original P.I. Documentation in the ICPSR User Guide for a description of all weights that are present in the data collection.
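
The School A and School B example can be worked through numerically. The Python sketch below assumes, purely for illustration, that each school started with 60 sampled students and that the weight is the simple ratio of original to retained students. This is not the method used to construct WGT12A or any other LSAY weight, which is described in the P.I. documentation.

```python
def attrition_weights(original, retained):
    """Simple attrition adjustment: each retained student stands in for
    original/retained students from the same school (illustrative only)."""
    return {school: original[school] / retained[school] for school in original}


# Hypothetical counts: both schools start with 60 students;
# School A loses 10 percent, School B loses 20 percent.
original = {"School A": 60, "School B": 60}
retained = {"School A": 54, "School B": 48}

weights = attrition_weights(original, retained)

# Unweighted, School A makes up more than half of the remaining sample.
unweighted_share_a = retained["School A"] / sum(retained.values())

# Weighted, each school is restored to its original share.
weighted_total = sum(retained[s] * weights[s] for s in retained)
weighted_share_a = retained["School A"] * weights["School A"] / weighted_total

print(round(unweighted_share_a, 3))
print(round(weighted_share_a, 3))
```

With these made-up counts, School A accounts for 54 of the 102 retained students (about 53 percent) when unweighted, and for half of the weighted total, matching its original share.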

Response Rates:
For more information on Response Rates, please refer to the Original P.I. Documentation in the ICPSR User Guide.

Presence of Common Scales:
For more information on Scales, please refer to the Original P.I. Documentation in the ICPSR User Guide.

Extent of Processing:
ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection:

Standardized missing values.

Checked for undocumented or out-of-range codes.

Version(s)

Original ICPSR Release: 2011-03-31

Version History:

2014-08-26 This is an update to LSAY data (ICPSR 30263). Part 1: LSAY Merged Cohort File (Base File) has been updated and includes new and revised variables. All other variables and parts in the collection remain the same. In addition, a new LSAY User's Manual has been provided.

2014-06-26 This is an update to LSAY data (ICPSR 30263). Part 1: LSAY Merged Cohort File (Base File) includes the following: (1) data collected in the 2010 and 2011 questionnaires (this data was not available in the previous release), (2) all data collected in the 2007, 2008, and 2009 questionnaires which include additional cases not available in the earlier release, as well as corrections and clarifications to some cases, and (3) all constructed student and parent variables from 1987-1994. The 2007, 2008, and 2009 data and constructed variables that were previously included in ICPSR (30263) were replaced with Part 1: LSAY Merged Cohort (Base File). Parts 2 - 5 include data collected from 1987-1994. Question text has been added to Part 1: LSAY Merged Cohort (Base File) for the variables that were present in the previous update. Newly added variables do not contain question text. The data files are identical to the previously released files.

2014-04-24 This is an update to LSAY data (ICPSR 30263). Part 1: LSAY Merged Cohort File (Base File) includes the following: (1) data collected in the 2008 and 2009 questionnaires (this data was not available in the previous release), (2) all data collected in the 2007 questionnaire which includes additional cases not available in the earlier release, as well as corrections and clarifications to some cases, and (3) all constructed student and parent variables from 1987-1994. The 2007 data and constructed variables that were previously included in ICPSR (30263) were replaced with Part 1: LSAY Merged Cohort (Base File). Parts 2 - 5 include data collected from 1987-1994. The data files are identical to the previously released files. In addition, R data files have been added for Parts 2 - 5.