Wisconsin Science and STEM Education

Tuesday, March 21, 2017

Whether you're at the end of the unit or want to check for understanding earlier, performance tasks provide a way to gauge students' abilities to engage in scientific thinking and use their content knowledge. It's difficult to truly determine their depth of understanding of a concept or their ability to create scientific models and explanations through multiple-choice or brief-response questions. After all, people training to be astronauts don't just answer multiple-choice questions! Performance tasks have the potential to provide more meaningful information to guide instruction and to frame feedback for students. But how do you create high-quality, NGSS-aligned tasks? Here's one idea for a process to do so, and my next blog post will detail an example of going through this process.

Determine a phenomenon – considering the current unit, what relevant phenomenon would make students go "hmmmm"? One new resource I found that includes some fabulous phenomena comes from the California Academy of Sciences, called BioGraphic. Generally, phenomena don't need to be earth-shattering ideas. I like having an interesting question to guide a unit, then connecting that to large- and/or small-scale experiences and engaging stories. For example, that could be declining bat populations or dropping a bowling ball and a feather in a vacuum. Your selected phenomenon could then provide the context for a performance task at the beginning, middle, or end of the unit.

Work with practices – After determining a relevant phenomenon, I consider which science and engineering practice (SEP) would bring it to life and which SEP my students need more work with. It would be great if I was collaboratively working on a particular SEP with my department, making that a natural choice. Considering practices, I would not try to assess a practice as a whole, such as analyzing and interpreting data. It’s more useful to focus on a particular subskill in order to design the task, clearly determine students’ abilities, and provide specific feedback. Handy ideas for subskills can be found within Appendix F, the progression of SEPs, and the NGSS Evidence Statements, which break down each performance expectation by subskills of each practice.

Form a learning target – My primary learning target would be having students use a subskill of a science practice to work with a specific disciplinary core idea (DCI). To achieve three-dimensionality, a crosscutting concept (CCC) might be an implicit part of this learning target. Once I start framing learning targets that are three-dimensional, I start stuttering in the process of rubric creation (as noted in the last blog post). Instead, I often use two learning targets: one that connects practice and content and a second that connects content and big ideas (CCCs).

Flesh out the scenario – With the goals and context of the task in mind, I begin to craft the story and related questions. Which part of the story are students exploring in this task? How does it fit into the overall storyline of the unit? My task might begin a unit, such as engaging students in data that describe concentrations of various chemicals in a nearby lake over the past 50 years. Students would go on to explore ecosystems, water chemistry, and human impacts. Crosscutting concepts are a wonderful resource for creating questions for the task, as each can be transformed into an authentic scientific question. For example, "What is the scale of the agricultural runoff problem?" Or, "What are the important inputs and outputs to consider in the sturgeons' ecosystem?" This could be an opportunity for students to ask their own questions. Another great resource for framing questions based on the practices is the NGSS Task Formats from the Research + Practice Collaboratory. It provides a series of question templates that can be adapted to wide-ranging contexts. In the end, you'll want to consider whether the question or series of questions in the task will move students further toward expertise in relation to performance expectations (PEs)—not that you'd have a goal of checking off proficiency in relation to PEs, more that you'd consider building student progress toward them through multiple authentic tasks.

Create a vision of proficiency – I outline my main ideas on proficiency in my previous blog post on rubrics. For proficiency with explanations, I also like the "What, How, Why" rubric by Thompson et al. Notably, expectations for proficiency may start out a bit vague – having sample student work will help clarify what proficiency looks like, and your rubrics will improve over time. It's a process! Also, I believe that teachers should reflect on whether these individual pieces of proficiency will add up to an assessment of their overall vision for students' learning in science. Additionally, it's important to consider whether you want individual proficiency or whether you can glean important information to guide instruction from group work. Or, can students' self- and peer-assessment provide the critical learning at this point? Rubrics or other proficiency guidance should be accessible to students.

Reflect – Both students and teachers should take time to reflect on the task. Teachers would reflect on evidence of student learning and how the task performed. Did it provide the information you wanted in relation to the practice and content? Was it clear to students? Students should receive feedback sufficient to understand where they are in their learning in relation to the goals put forth. That reflection can be supported by personal, peer, or teacher feedback. A key question with all assessment is: How are you giving feedback to students, and how are they acting on that feedback? Honestly, I wouldn't do in-depth reflection with every task; that could quickly become overwhelming. I'd recommend it at least once per unit, with a range of practices throughout the year. Teachers will also need at least a few common tasks and rubrics to use and discuss collaboratively through the year.

As noted above, my next blog post will provide an example of going through this process to create a performance task.

Monday, August 22, 2016

I created the three-dimensional rubric below in an attempt to help get the ball rolling. I have honestly not yet seen a rubric where the creator claims it is three-dimensional. I'm not sure I'm there yet, so critique away! Most rubrics I've found only focus on the practices, which I agree is a good place to start (see the resource list at the end of this post). I would, however, like to see practices and crosscutting concepts linked to content within a rubric, so I attempted to do that here. Importantly, column three represents where a proficient student should be, while column four provides ideas for more advanced study.

Some background on
this unit of study and the related performance task:

High
school biology students are investigating ecosystems (LS2.C), human impacts on
those ecosystems (LS4.D), and related pollution chemistry (PS1.B).

Imagining
I’m still teaching… I engage the class in this unit by having them walk over to
a nearby lake to make observations, ask questions, and take multiple water
samples, highlighting the presence of large amounts of algae if students don’t
bring it up. We meet the regional limnologist there and she briefly shares some
information about pollution in the lake system and is on hand for questions (the class could alternatively Skype with a scientist or watch a short video detailing pollution challenges – such as this news story).

The next
day students discuss their observations and consider how and why the ecosystem in
their local lake may be changing. They model the ecosystem of the lake,
detailing relationships within and across biotic and abiotic elements, including
what might be causing ecosystem changes. The models provide a formative
assessment on students’ modeling ability and their background understanding of
ecosystems generally, but also within the lake context. After completion, class
sharing and discussion of those models serves to build common background
knowledge about topics such as farm runoff and other pollutants affecting the lake.

I want
to know where students are at in their ability to ask testable questions in an ecosystem
modeling framework (Practice - Asking Questions; Crosscutting Concept – Systems
and System Models). So, toward the end of that class I ask them to individually
develop questions for studying changes to the lake ecosystem, framing those
questions with the lens of the full system and available data on lake chemistry
(e.g. data like this).
I use the following rubric to score students’ individual responses before
having them revise their questions in groups the next day.

I developed goals for the unit first and then created the rubric in conjunction with creating the investigations within the unit. I want multiple opportunities to assess student learning in a more formal way through a unit, and this performance task and rubric flowed out of the progression being built. So, the goals for learning represented in the rubric were in mind throughout the process, not an afterthought.

Our
state vision for science learning in Wisconsin comes from page one of the
summary of the NRC Science Education Framework. I’d want my
assessment to provide information as to whether students are progressing toward
that vision as well as through the NGSS progression we’d laid out for the year.
The goals of this lesson, students being able to ask meaningful questions about
local water pollution and the chemical impact on ecosystems, do fit within
those broader goals.

Possibly
the most important resource for designing the rubric was Appendix F, the progression
document for the practices.
The progression detailed for grades K-2, 3-5, 6-8, and 9-12 for asking
questions provided ideas for where students should be and where they’re coming
from, supporting the development of the columns within the rubric. They provide
ideas for a developmental progression of learning without resorting to terms
like never, somewhat, and always. Specifically, based on the progressions of
the asking questions practice, I included having students connect questions to an
analysis of data and systems.

Another important
resource for designing the rubric was the NGSS Evidence Statements document.
The evidence statements provide a concrete way to break down a practice into
specific subskills, which is very useful in articulating the multiple rows of a
rubric. In my case, they were most useful in suggesting that the question needs
to be practicably testable (in the classroom) and relate to cause and effect.

Finally, I also used Appendix G, the progression document detailing the crosscutting concept of systems and system models. From this progression, I pulled the ideas of inputs and outputs within the system and of understanding the boundaries of the system in order to better formulate a question. So, the rubric pushes students to consider how timeframes and a narrowed focus on particular chemicals and lake inputs could lead to a better question.

I also wanted to focus on questioning because the NGSS performance expectations (PEs) have limited connections to the questioning practice (only two in middle school and two in high school). Because some teachers make the mistake of using the PEs alone to design their instruction, I worry students won't have as many opportunities as they should to ask questions.

I used
the idea of “with guidance” as part of the progression. It was a tough decision
to include that. I felt that if we’re talking about a true developmental
progression, the first step is often being able to do it with some help. Some
students need scaffolding to get going with a skill, and they’re not going to
be independent at first. So, I reflected that within this rubric.

Additionally, I'd want student responses to the performance task to serve as examples (anchors) of the varying levels within the rubric. I didn't feel I could meaningfully create those on my own, so I hope to get some teachers to try this rubric, or something similar, and share anonymized samples of student work.

For the best
outcomes, teachers should collaboratively create these rubrics or collaboratively
refine and revise an existing rubric to meet their needs/vision. To improve
instruction for all students, it’s also essential that they collaboratively review student work in light of the rubric.
It won’t be perfect the first time! Teachers will have to improve the rubric over
time along with other elements of their instruction based on formal and
informal assessment data.

The Design-Based Implementation Research team created a first draft of a rubric on the practice of scientific modeling. It provides super useful details on what constitutes effective modeling. A problem is that it's a bit long to be useful, though perhaps portions of it could be pulled out to assess subskills. I also don't think progressions of ability using language such as "does not," "some," and "all" are as straightforward as denoting what students at different levels can do.

The Instructional Leadership for Science Practices group provides a series of
rubrics based on each practice that can be used to evaluate student
performance. There's also another version of the rubrics that could be used by an observer to give teachers feedback on how the practices are being used in their classrooms. Both versions, though, tend to focus more on what students have the opportunity to do than on what they have the capacity to do.

Arapahoe Elementary in the Adams County Five Star School District provides standards-based grading rubrics linked to NGSS – It gives a generic rubric template you'd use to plug in specifics for each particular CCC, SEP, or DCI, but it might not provide sufficient information or nuance for individual SEPs and CCCs.

Edutopia provides a rubric for science projects, which has some good ideas
for progressions of abilities, but remains fairly traditional - built from “scientific
method” steps.

And, thanks to Cathy Boland, @MsBolandSci, for
sharing a rubric for explanations through Twitter - I hope others will share too!

Monday, July 25, 2016

Evaluating current efforts to move science education forward,
such as that framed by the Next Generation Science Standards, requires
“assessments that are significantly different from those in current use” (National Academies report, Developing Assessments for the Next
Generation Science Standards, 2014). Performance tasks in particular
offer significant insights into what students know and are able to do. “Through
the use of rubrics [for such tasks] … students can receive feedback” that
provides them a “much better idea of what they can do differently next time” (Conley
and Darling-Hammond, 2013). Learning is enhanced! Building from a vision of
the skills and knowledge of a science literate student, rubrics can allow
students (and other educators) to see a clearer path toward that literacy.

Of course, using rubrics with performance tasks is generally
a more time-intensive process than creating a multiple-choice or
fill-in-the-blank exam. For these types of tasks to be used as part of a standardized testing system or as reliable common assessments, their scoring requires more technical considerations:

Tasks must include a clear idea of what proficient and non-proficient work looks like;

Scoring must involve multiple scorers who all have a clear understanding of the criteria in the rubric.

Often, when teachers grade student projects or other performance tasks on a rubric, it looks something like this:

While this example comes from a 4th grade
classroom, secondary rubrics often have similar characteristics.

Considering alignment to the NGSS, and effective rubric
qualities in general, there are several changes I’d make:

What’s the science learning
involved? They appear to be drawing or making a model. About what? What
understanding would a proficient model of that phenomenon display?

What would mistakes look like in a
model? I’m a little worried that the “model” here is just memorizing and
recreating a diagram from another source. Students’ models should look
different. There might be a mistake in not including a key element of a model
to describe the phenomenon, or not noting a relationship between two of those
elements. Types of mistakes indicating students aren’t proficient should be
detailed.

Neatness and organization are
important, but I question the use of those terms as their own category. I would
connect that idea to the practice of scientific communication. Does the student
clearly and accurately communicate his or her ideas? Do they provide the
necessary personal or research evidence to support their ideas? The same is
true within the data category. I'm more concerned about whether the students can display the data accurately and explain what the data mean than about whether they use pens, markers, and rulers to make their graphs…

It’s not a bad idea to connect to English
Language Arts (ELA) standards—done here with the “project well-written”
category. At the elementary level in particular that makes sense; however, I'd want to ensure that I'm connecting to the actual ELA standards, such as the CCSS ELA anchor standards for writing, which emphasize ideas like using relevant and
sufficient evidence. At upper grades, I’d also want to emphasize disciplinary
literacy in science (e.g., how do scientists write?) over general literacy
skills.

This rubric actually does better
than many at focusing on student capacity rather than behaviors. I see many rubrics
that score responsibility and on-task time, rather than scientific skills and
understanding. Check out Rick Wormeli’s ideas.

What does “somewhat” really
suggest? Is the difference between two mistakes and three mistakes really a
critical learning boundary? I see a lot of rubrics that substitute always,
sometimes, and never for a true progression of what students should know and be
able to do. I also see many rubrics that differentiate rankings by saying
things like no more than two errors, three to five errors, or more than five
errors. What do we really know from that? What types of errors are made? Is it
the same error multiple times? What exactly can’t the student do in one
proficiency category vs. the next? I really can’t tell by just saying two vs.
four errors.

Use anchors for clarity – this
rubric notes that models are “self-explanatory” and that sentences have “good
structure.” Do students really know what that means? Have they seen and
discussed a self-explanatory model vs. one that is not self-explanatory? If there
is space, an example of a model or sentence meeting the standard could be
embedded right into the rubric in the appropriate column. If there isn't a
space, a rubric on a Google doc could link to those types of examples.

Looking around,
really looking, I have found very few rubrics that make an attempt to align to
the NGSS. I suspect some people are still nervous about sharing. In my next
blog post, I’ll share an NGSS-aligned, three-dimensional rubric I created and
detail the process involved. I’ll also share some resources from other groups
tackling this work.

Monday, July 18, 2016

A new colleague here at the Wisconsin Department of Public Instruction, Lauren Zellmer, read through my last blog on formative assessments. Her question was, "What specifically do teachers do with the information once they've conducted these formative assessments?" Great question! I decided to write a Part 2 of the formative assessment blog, where I'll share a few more details on possible instructional next steps based on hypothetical results.

First teacher – Rubric on modeling

In the first example, the teacher collected whole-class and individual information using a modeling practice rubric, as he walked around asking probing questions and jotting down student names across the rubric continuum. After some reflection in pairs, he had a few students share their models in order to highlight key aspects of the practice, which will help build capacity in all students. Depending on students' level of understanding, further support might include the following:

If he found that most students did not fully understand this
element of modeling (mostly 1’s and 2’s), he could provide scaffolded modeling instruction
in the next part of the lesson requiring modeling. He would prepare a modeling
handout which lists possible elements of the model and requires students to
note whether to include those aspects and why. Students already proficient
would complete models without that scaffold.

If he found that understanding is fairly varied (largely 2’s and 3’s,
with some 1’s and 4’s), the teacher could provide further time for group
reflection and sharing. That further reflection would best happen
immediately—after a few selected groups shared elements of their models, the
class could get back in pairs to improve their models based on those ideas.
And, next time modeling occurs in a lesson, the teacher could repeat a similar
in-depth process, like the first time, to continue to provide significant
support.

If he found that most students proficiently performed this aspect
of the practice (largely 3’s, with some 2’s and 4’s), he could make a note of
which students are still struggling. The next time modeling happens, instead of
moving around the class generally to assess where students are at, he could
narrowly focus his support and questioning on those students. He might provide
them some in-depth small group help, with scaffolds provided. He could also
pair them with proficient students where he knows they won’t just be given
answers, but meaningfully supported in their learning.

Second teacher – Testable questions

In the second example, the teacher had students write questions about biodiversity while on a nature walk, thus collecting whole-class and individual information on whether students could write testable questions. The next day, with the whole class, she shared useful "yellows," where students' questions needed more work, and "greens," where students' questions were testable. She also made some notes in a file as to where the class seemed to be overall with this skill. Depending on students' level of understanding, further support might include the following:

If she found that most students could not write a testable question (lots of "yellow"), she should do more than read through notable yellows and greens. After that review, students could receive more practice in a guided, whole-class discussion, evaluating a series of questions, noting whether or not they're testable, and fixing them to make them testable. Then, she could ask students in small groups to collaboratively revise their questions to make them testable. Where possible, these groups would include a student who proficiently demonstrated this skill.

If she found that about half could write testable questions and half could not, I would again suggest she have students rewrite their questions in small groups. For students still struggling after a round-two attempt, she
could support them in a small group with an activity like that noted above (evaluating examples together). The remainder of the students might begin some independent research or brainstorming on the design of the investigation to answer their questions.

If most could write testable questions, she might provide individual help to students still struggling with their questions after the discussion of the greens and yellows. Those students could use the reviewed questions as models to revise their own, with the teacher ensuring they can explain why their original ideas were not testable.
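To make the branching above concrete, here is a minimal sketch, assuming Python and entirely hypothetical cut-offs, of how a quick tally of "greens" and "yellows" could map onto the kinds of next steps described in these scenarios. The thresholds and wording are illustrative, not a prescription from any of the resources cited.

```python
# Illustrative only: turning a quick formative tally into a suggested next step.
# The 40% and 75% cut-offs below are hypothetical; adjust them to your own class.

def suggest_next_step(green: int, yellow: int) -> str:
    """Suggest an instructional move from counts of testable (green) and
    not-yet-testable (yellow) student questions."""
    total = green + yellow
    if total == 0:
        return "No responses collected yet."
    share_green = green / total
    if share_green < 0.4:
        # Most students are not yet writing testable questions.
        return ("Guided whole-class practice evaluating and fixing sample "
                "questions, then small-group revision.")
    if share_green < 0.75:
        # Understanding is mixed.
        return ("Small-group revision with mixed groups; pull a struggling "
                "group for an evaluate-examples activity.")
    # Most students are proficient.
    return ("Individual help for the few still struggling; others begin "
            "designing investigations for their questions.")

if __name__ == "__main__":
    print(suggest_next_step(green=8, yellow=17))   # mostly yellow
    print(suggest_next_step(green=14, yellow=12))  # mixed
    print(suggest_next_step(green=22, yellow=4))   # mostly green
```

The same skeleton could be reused with rubric-level counts (mostly 1's and 2's, mixed, mostly 3's) from the modeling example.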

Formative assessment is “assessment for learning”. Teachers need specific ideas as to what strategies they’re going to implement depending on the results of the formative assessment. They also need some way to record progress, not only relying on “gut feelings” and memory (my brain and gut, at least, aren’t that reliable). Having some class assessment notes on a page or a rubric can be a quick means to do so. Finally, coming together with peers to discuss the data and possible strategies is critical to moving forward as a science department (and community).

Wednesday, June 1, 2016

Within
this post, I am going to focus on formative assessment as an ongoing assessment
conducted as part of daily instruction to guide further instruction. These are
informal or formal checks of knowledge happening in conjunction with
instruction. I’m not considering more summative type assessments such as
end-of-unit tests or student projects. Though to be sure, all assessments
should be formative, in that the results will be used to guide decision-making
in relation to instruction.

Therefore,
formative assessments probably won’t be the primary tools that teachers and
administrators will look at when determining whether their school as a whole is
making progress toward their vision for science education. Instead, they are
tools that will influence daily decisions in the classroom, as well as student
and/or teacher collaborative conversations. Do I need to provide more time for peer
discussion around crafting a procedure for their investigation? Should I
include more scaffolding for students to create an effective data table? Notably,
they might inform elements reported on a standards-based report card, but more
often they will not.

In
a classroom using the NRC Framework and/or the Next Generation Science
Standards, educators aim to use three-dimensional instruction,
including within formative assessment. Two examples of what that could look
like in practice might help:

A
fifth grade teacher shows students a large syringe full of air. He asks them to
discuss with a neighbor what will happen if he plugs the end and pushes down.
Students then get to try it out at their table. After a couple minutes of students
investigating the phenomenon, the teacher asks them to model the phenomenon by
drawing a diagram of it, prompting them to consider how to show things they can
and cannot see. Why does it get harder to push down? Students create the model
on their own first, then discuss their model with a peer, making revisions to
their models as desired and considering evidence. The teacher walks around
asking probing questions such as, “What are those little circles in your
syringe?” “Are they really that big?” “How do they look different before and
after pushing down on the syringe?” As he walks, he’s jotting down student
names and occasional notes along the continuum of the rubric, which he has on a
tablet or clipboard. Students then discuss their models in relation to the portion
of a modeling rubric focused on clearly representing all important aspects of
the phenomenon (not yet relationships among components). The teacher walks
around, looking and listening for important components of models, evidence, and
comments to share with the class to illustrate this aspect of modeling. He has
a couple pairs of students share theirs, highlighting key criteria from the
rubric where students appeared to be struggling. He also keeps his notes in a
file with similar notes about students’ abilities with the science and
engineering practices to look for progress over time and keep track of areas
that need more work.

Modeling Subskill: Identifying important components of a scientific model

Level 1: Student represents the object or occurrence.

Level 2: Student represents the object (etc.) with details (evidence) related to the phenomenon.

Level 3: Student models the phenomenon (etc.) in such a way that it adequately represents important components of it (and not extraneous elements) and the evidence gathered.

Level 4: Student models the phenomenon, explains why those are the important components based on evidence, and can analyze why components noted in one model better represent the phenomenon than those in another model.

In this example:

Level 1: Student draws a syringe.

Level 2: Student draws a pushed-down syringe with packed little circles inside of it and a label of "gas."

Level 3: Student draws one syringe pulled back and another pushed down, each with a magnification "bubble" representing the scale of air particles, with the particles closer together in the pushed-down image.

Level 4: Student draws two syringes, as in level 3, explains the importance of the scale and of the particles being closer together, and notes why a model depicting the air particles still far apart even with the syringe pushed down is more accurate (e.g., evidence – we can't see the air).
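Since the teacher keeps his rubric notes in a file to look for progress over time, here is a minimal sketch of one way those notes could be recorded and summarized, assuming Python and a hypothetical CSV layout (date, student, subskill, level). The file name, column names, and sample data are my own inventions for illustration.

```python
import csv
from collections import defaultdict

# Hypothetical CSV layout (header row): date,student,subskill,level
# e.g. 2016-09-14,A.,identifying components,2

def load_notes(path):
    """Read rubric notes from a CSV file with the header row above."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def latest_levels(notes, subskill):
    """Return each student's most recent recorded level for one subskill."""
    latest = {}
    for row in sorted(notes, key=lambda r: r["date"]):
        if row["subskill"] == subskill:
            latest[row["student"]] = int(row["level"])
    return latest

def level_counts(levels):
    """Tally how many students sit at each rubric level (1-4)."""
    counts = defaultdict(int)
    for level in levels.values():
        counts[level] += 1
    return dict(sorted(counts.items()))

if __name__ == "__main__":
    # Inline sample notes stand in for load_notes("modeling_notes.csv").
    notes = [
        {"date": "2016-09-14", "student": "A.", "subskill": "identifying components", "level": "2"},
        {"date": "2016-09-14", "student": "B.", "subskill": "identifying components", "level": "3"},
        {"date": "2016-10-02", "student": "A.", "subskill": "identifying components", "level": "3"},
    ]
    print(level_counts(latest_levels(notes, "identifying components")))  # {3: 2}
```

A running summary like this is one way to "look for progress over time" without relying on memory alone.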

As
a second example, a high school biology teacher asks students how many types of
organisms there are in the world, leaving the question intentionally vague.
Students discuss the answer in groups of three (no devices used at this point).
The teacher pushes for justification and evidence for quantitative responses,
as well as proper vocabulary. After a few minutes, she asks some student groups
to share their answers and evidence. And, then she asks, “Why is this diversity
of life important?” After sharing ideas with a partner, the whole class
discusses it for a few minutes. The teacher then describes an investigation the
class will do of “biodiversity” within their school grounds (as part of a
larger unit on ecosystems, invasive species, and adaptations). The class goes
on a quiet, mindful walk outside where students observe and, on cards, come up with a testable question (or questions) about biodiversity on their school grounds and/or in the local area. At the end of the walk, the teacher collects the cards with students' names and ideas on them. She quickly sorts them into yellow (needs more work) or green (testable); she also jots down some
notes as to where students are at in general with this skill and adds it to her
assessment file. The next day she anonymously shares a few of her favorite
greens and yellows to illustrate key concepts in relation to testable questions,
eliciting student ideas first in that conversation.

These
formative assessments must be part of a larger cycle of guiding and reflecting
on student learning. I like the APEX^ST model described by Thompson et al. in the November 2009 issue of NSTA's The Science Teacher:

Educator teams
collaboratively define a vision of student learning.

They teach and
collect evidence of learning.

They
collaboratively analyze student work and other formative evidence of learning
(such as conversations) to uncover trends and gaps.

They reflect on how opportunities to learn relate to evidence of student learning.

They make
changes, ask new questions, conduct another investigation, etc.

And,
then they reflect again on evidence of student learning in light of their vision,
considering changes to their vision and objectives as necessary.

One
goal here, even in formative assessment, is to be thinking about how student learning
fits into the overall picture of three dimensional instruction. Within the gas particle
modeling task above, the science/engineering practice (SEP) is modeling, the
disciplinary core idea (DCI) is 5-PS1.A—“gases are made from particles too
small to see,” and the crosscutting concepts (CCCs) are cause and effect and
scale. Within the biodiversity question task, the SEP is asking questions, the
DCI is HS-LS2.C—“ecosystem dynamics,” and the CCC is systems and system models
(though others could apply).

While
those 3D connections are being made overall, the specific, in-the-moment, formative assessment goals here do not
capture all three dimensions. In other words, the full task and work throughout
the unit will involve students in all dimensions, but this formative snippet is
really about one element of one practice in each case. Can students represent
important aspects of a phenomenon within a model? Can students generate a
testable question? The formative data gathered and acted on could also focus on
their understanding of biodiversity or the particle nature of matter (DCIs).
Or, it could focus on their ability to reason about the scale of a phenomenon
(CCC). But, in this case the teacher kept things simple and more manageable
with a specific, narrow goal in mind, which clearly related back to a larger
vision for student learning and objectives for this unit. Importantly, while
it’s a narrow goal, it’s still a deeper, conceptual learning goal. It’s not
just an exit card asking students to regurgitate a fact or plug numbers into a
formula.


Thursday, March 3, 2016

Surveys of
students, teachers, and community members will provide critical information in
the process of determining whether changes made to your science education program
improve desired outcomes. Many important questions cannot be answered through typical
science assessments. While it’s clearly essential that students understand and
can do science, do they sincerely believe that someone like themselves could be
a scientist? Further, are you changing not just knowledge of, but beliefs
about, science? Do students see how science relates to their lives? Is it
meaningful for them? Or, do they see a need to question “scientific evidence”
within popular media?

A recent article in National
Geographic noted that solid,
research-based science often faces organized and angry opposition. We don’t
want students leaving school doubting the consensus of the scientific community
(unless they somehow have sufficient, valid evidence to doubt a claim). People can understand how vaccines work and still decide not to have their children vaccinated; understanding alone doesn't guarantee belief or action. It's unfortunate that, too often, our society says it believes in science but doesn't accept its findings.

Furthermore,
do students understand who scientists are and what they do? While it was created
as a tool for K-5, the “Draw-a-Scientist” test (DAST) could be done at
secondary levels as well. My 8th graders certainly held onto
stereotypes of scientists. We want students to see science as including a
wide range of tasks done by a wide range of people, particularly people who look
like them and have interests similar to their own.

These
surveys can provide teachers with data to evaluate their individual courses and
the science program more generally.

While
surveys of student outcomes are critical within a system of assessments, it’s
also important to understand the views of parents/community members and
teachers during the change process. Surveying parents and other community
members can help ensure they’re aware of and meaningfully connecting to the school
science vision and students’ science learning. Teacher surveys can ensure
they’re comfortable teaching their content and the practices of science in
accordance with the vision for science learning. Within results, you can look
at trends by demographics, such as race and ethnicity, or differences between
new and veteran teachers.

In a survey
of parents and community members, you probably don’t want to get into the
content being taught. Hearing about personal views of evolution and climate
change isn’t necessary for these purposes. Questions could have a Likert-scale
format, with selections from strongly agree to strongly disagree. Some examples
include:

Through the
science courses, I believe my student is becoming a better scientific thinker
(for the broader community that would be rephrased as “students are becoming”).

I am
familiar with the district vision for science education.

I believe my
student is receiving a quality foundation in his/her science classes to pursue
science careers in the future.

I believe my
student is being well-prepared for science classes at the college or university
level.

There should
also be an open-ended text box, asking survey takers to please share any
comments or questions about the science education program at their school. Of
course, even with community input, you’re not going to resort to poor
instructional practice that isn’t research based, like lecture. Educators are
the professionals in this setting. You may, however, decide to make more career
linkages in your courses or bring in more guest scientists.

It’s also
important to know where teachers are at in the change process. Are they getting
the support they need in teaching science? Tools like the Survey of Enacted Curriculum (SEC) can also let educators and
administrators know whether what they’re doing actually lines up with the
intentions of the instructional program. It’s not a “gotcha” system, but an
approach like Lesson Study that can lead to tremendous,
collaborative professional learning.

A brief
endnote… While it is true that for statistically-validated studies surveys need
to undergo extensive testing, everyday school surveys can provide a useful
piece of information for guiding instructional programs. Surveys linked above
have largely undergone testing and include multiple item constructs, so using
them or learning from them is a good step.

Some other tips
for creating quality surveys include:

Use multiple questions to measure each idea or topic. Looking at several questions together provides a more valid picture of what people really think (see the sketch after this list).

Have a
student, parent, etc. verbally talk through their thoughts on the survey with
you. They think aloud as they read and answer the questions. Are they
understanding the questions in the way that was intended? Is there some
confusing wording? Having people of different backgrounds do so helps ensure
the questions are similarly interpreted by people.

A focus
group, with a neutral facilitator (i.e., not your boss), can provide a
different perspective and bring out ideas that a survey cannot. It can also
inform survey development.
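As a minimal sketch of the first tip above (using multiple questions per idea), here is one way several Likert items could be combined into a single construct score, assuming Python; the item names, construct groupings, and 1-5 coding are hypothetical.

```python
from statistics import mean

# Hypothetical constructs, each measured with several Likert items
# coded 1 (strongly disagree) to 5 (strongly agree).
CONSTRUCTS = {
    "science identity": ["q3", "q7", "q12"],
    "relevance of science": ["q2", "q9"],
}

def construct_scores(response: dict) -> dict:
    """Average the items belonging to each construct for one respondent."""
    return {
        name: round(mean(response[item] for item in items), 2)
        for name, items in CONSTRUCTS.items()
    }

if __name__ == "__main__":
    # One hypothetical student response.
    student = {"q2": 4, "q3": 2, "q7": 3, "q9": 5, "q12": 2}
    print(construct_scores(student))
    # e.g. {'science identity': 2.33, 'relevance of science': 4.5}
```

Looking at a construct score such as "science identity," rather than any single item, gives a steadier picture of what respondents actually think.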

And, yes, it takes extra
time and effort to know whether you’re actually making a long-term difference
for your students and whether the large-scale changes improve classroom
practice, but it’s worth it.

Monday, February 1, 2016

Assuming
your district/school has established a vision for science education and
large-scale, specific goals aligned to that vision, you will next need to
determine a system of assessments for evaluating progress toward those goals.
As mentioned in my last post, many districts are working to adopt and implement
new science standards. Strategically assessing science-related outcomes at
multiple levels will provide ongoing evidence of effective change – after all,
why make changes if you don’t know whether they actually make any difference?

While it
might be obvious, an evaluation of a science program based on these goals will
take more than one assessment! In other words, the annual state standardized
test, often the only systematic science test used by a school, will not measure
the full range of outcomes related to a meaningful vision for science
education. That requires leaders to strategically implement a system of assessments. The Wisconsin DPI
has a chart that illustrates some components of such a system, including formative, interim,
and summative elements.

The majority
of assessment will happen formatively at the classroom level. This level is
where teachers see the day-to-day use of scientific practices by their students
as they investigate, communicate, and ask questions about science. It will be
critical for teachers to have the structures to discuss what they’re observing
from their students, collaboratively determining next steps. Processes of informal formative
assessment should
drive instructional practice. If schools are moving toward the NGSS or NRC
Science Framework, formative, as well as all levels of assessment, should be three-dimensional.

Large-scale
district summative tests (or state level tests) often afford the least amount
of data for specific instructional guidance. They might, however, suggest areas
for professional development or foci for revised student project rubrics. For
example, a set of district end-of-course exams might all show that students
across the district struggle with using data effectively. Often these types of
tests are multiple-choice, which provide limited information in relation to
authentic science practice, but they can be effectively paired with open-ended
opportunities for students to describe their reasoning.

In summary,
schools and districts reviewing and attempting to improve their science
programs will have unclear success in that process if they haven’t defined what
outcomes they want and how to measure them. A meaningful and strategic system of science assessment will be an essential part of
this process.

The next
series of blog posts will discuss formative, interim, and summative assessments
in more depth, as well as effective surveys of student attitudes. Each will
provide examples of these assessment types and suggestions for classroom or
school use.