Cigarette smoking is a behavior with an impact on health, where an individual's genetic make-up, family history, and vulnerability to addiction may play crucial roles. As such, smoking behavior offers a rich source of study that can draw high school students into an authentic research experience.

This is the goal of Exploring Databases, a high school inquiry-based research project developed through collaboration between the Department of Genome Sciences and the Institute for Science and Math Education at the University of Washington (UW). Funded by the National Science Foundation (NSF) through a program called Innovative Technology Experiences for Students and Teachers (ITEST), the curriculum unit gives students experience with genotyping and database research as well as the means to test hypotheses about genetic and environmental factors influencing smoking behavior.

Students conduct research by analyzing data from a real case-control study, a widely used retrospective epidemiological design. The study examines a specific outcome, whether or not a person becomes a regular smoker, by comparing two groups: the cases, who became regular smokers, and the controls, who tried smoking at some point in their lives but never became regular smokers.

"This is a topic that is relevant to high-school students," said Principal Investigator Maureen Munn. "A lot of them have experience with smoking among their family and friends, if not themselves. They do some research in the literature as well, to develop their ideas about factors that might affect smoking behavior."

In a previous project, Munn worked with another group of students to assemble data from about 300 research subjects, all adults aged 25 to 54. The research subjects completed a questionnaire about environmental influences on their smoking behavior and gave a blood sample that was used to genotype their DNA at three gene regions shown in other studies to be associated with smoking behavior. Questionnaire responses and genotyping data for each subject were entered into a database, which became the prime resource for the Exploring Databases unit.

In the first of seven one- to two-hour lessons, the unit exposes students to different aspects of human subject research and the study of the biology of nicotine addiction, as well as criteria used to distinguish causality from associations.

As part of the curriculum, students examine profiles of smokers (people who were interviewed about their smoking behavior), and develop an understanding of the variation in how people smoke--from initiation and maintenance, to cessation and relapse.

The heart of the project is the students' use of the Smoking Behavior database to test hypotheses, followed by the design of a new case-control study. Using an online portal, students first put forth an overarching hypothesis about how physiological or environmental factors might lead someone to become a smoker after trying smoking.

Students can test genetic or environmental factors to determine whether they are associated with smoking behavior in specific populations. For each hypothesis, the system generates an odds ratio, which shows the strength of the association, and a confidence interval, which indicates how precisely that ratio is estimated. An interval that excludes 1 suggests the association is statistically significant.
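The odds ratio and confidence interval can be illustrated with a short calculation. The sketch below is not the portal's actual code, and the article does not say which method the portal uses; it assumes the common log-based (Woolf) approximation for a 95% confidence interval. The function name and the counts in the example are hypothetical and are not drawn from the Smoking Behavior database.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    Assumes the log-based (Woolf) interval; z=1.96 gives roughly 95%.
    """
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        # The approximation breaks down when any cell is empty.
        raise ValueError("all four counts must be positive")

    # Odds of exposure among cases divided by odds among controls.
    ratio = (a * d) / (b * c)

    # Standard error of the natural log of the odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper


# Hypothetical counts: 60 of 100 cases and 45 of 100 controls
# reported the exposure being tested.
ratio, lower, upper = odds_ratio_ci(60, 40, 45, 55)
print(f"odds ratio = {ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

With these made-up counts the odds ratio is above 1 and the interval excludes 1, which would suggest a positive association between the exposure and becoming a regular smoker. An interval that included 1 would mean the data are consistent with no association.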

In testing hypotheses, some students have shown interest in how peers influence smoking behavior, and others in how family and media influence it. Still others look at physiological factors--which might be the three gene regions used in the study, or some of the questions in the questionnaire, such as, "During your experimental smoking period, did you get a buzz or a pleasurable feeling when you smoked?"

In the last lesson, students mine the data by analyzing many questions and exposure combinations to generate a new hypothesis to integrate into a hypothetical research study. Students create a final presentation that demonstrates their results and their claims based on the evidence from their analysis, as well as their proposed study.

"By engaging students in the practices of science to examine issues of immediate, personal relevance, this project exemplifies the leading edge of STEM education practices advocated by the ITEST program," said Program Director David Haury.

The Exploring Databases unit was recognized by Science Magazine as an Inquiry-based Instruction (IBI) series winner. The curriculum has been used in a wide variety of high school science courses, including introductory biology, an Upward Bound seminar, and advanced elective courses such as genetics and biotechnology. It has also been used in community college courses. It reached nearly 600 students in the 2011-2012 academic year.

Munn and her team have trained a number of teachers in the curriculum, and about 20 teachers are currently using it. Munn is looking forward to broader adoption of the project in the months ahead, given its alignment with the Next-Generation Science Standards.

A crucial reason for the project's success is the different perspectives reflected in its design.

"This project is a collaboration of genome scientists and learning scientists at UW," said Munn. "The strength of the project is bringing experts with different expertise together."

The National Science Foundation (NSF) is an independent federal agency that supports fundamental research and education across all fields of science and engineering. In fiscal year (FY) 2015, its budget is $7.3 billion. NSF funds reach all 50 states through grants to nearly 2,000 colleges, universities and other institutions. Each year, NSF receives about 48,000 competitive proposals for funding, and makes about 11,000 new funding awards. NSF also awards about $626 million in professional and service contracts yearly.