Transcript

1.
Educational Design Research
Edited by:
Jan van den Akker, University of Twente, the Netherlands
Koeno Gravemeijer, University of Utrecht, the Netherlands
Susan McKenney, University of Twente, the Netherlands
Nienke Nieveen, University of Twente, the Netherlands

2.
Acknowledgements

The roots for this book stem from an educational design research seminar organized by the Netherlands Organization for Scientific Research, in particular, its Program Council for Educational Research (NWO/PROO). This book was conceptualized during that seminar and contains chapters based on presentations and discussions from those fruitful days in Amsterdam. We therefore express our gratitude to Paul Berendsen, Hubert Coonen, Joris Voskuilen and the staff at NWO/PROO for their interest and support in educational design research, and for bringing this group of scholars together.

Jan van den Akker
Koeno Gravemeijer
Susan McKenney
Nienke Nieveen

4.
List of visuals and captions

Boxes
Box 9.1: Learning environment examples
Box 9.2: Genres in design research deliverables

Figures
Figure 1.1: How research improves practice
Figure 4.1: Developmental research, a cumulative cyclic process
Figure 4.2: Two data sets in minitool 1
Figure 4.3: Reflexive relation between theory and experiments
Figure 4.4: Micro and macro design cycles
Figure 4.5: An interpretive framework for analyzing individual and collective activity at the classroom level
Figure 4.6: Box plot as a model for reasoning about distributions
Figure 4.7: Battery life span data, always ready and tough cell batteries
Figure 4.8: Speed data, before and after a speed campaign
Figure 4.9: T-cell data, four-equal-groups inscription, with data points hidden
Figure 4.10: Salary against years of education
Figure 5.1: Predictive and design research approaches in educational technology research
Figure 6.1: Curricular spider web (van den Akker, 2003)
Figure 6.2: System coherence
Figure 6.3: Three main outputs of design research
Figure 6.4: Design research on curriculum, flanked by validation and effectiveness studies
Figure 6.5: Design research taking place in context
Figure 6.6: Analysis, design and evaluation cycle shaped by tenets at the core
Figure 6.7: Generic quality criteria for evaluation of curricular designs
Figure 6.8: Conceptual model of design research in the curriculum domain
Figure 6.9: Conceptual framework shared by the three example studies
Figure 10.1: Graphical representation of typical study outputs (based on Schoenfeld, 2002)
Figure 10.2: Snakes and Ladders assessment task
Figure 11.1: Design research within the scientific cycle
Figure 11.2: Educational engineering research cycle

Tables
Table 3.1: Three modes of engaging in organization research (adapted from Banathy, 1996)
Table 6.1: Examples of design research in the curriculum domain
Table 10.1: Four levels of R&D

5.
Introduction 1

INTRODUCING EDUCATIONAL DESIGN RESEARCH
Jan van den Akker, Koeno Gravemeijer, Susan McKenney and Nienke Nieveen

ORIGINS OF THIS BOOK

Design research has been gaining momentum in recent years, particularly in the field of educational studies. This has been evidenced by prominent journal articles (e.g. Burkhardt & Schoenfeld, 2003), book chapters (e.g. Richey, Klein, & Nelson, 2004) as well as books (e.g. van den Akker, Branch, Gustafson, Nieveen, & Plomp, 1999) and special issues of journals dedicated specifically to the topic (Educational Researcher 32(1), 2003; Journal of the Learning Sciences 13(1), 2004) or to the more general need to revisit research approaches, including design research (Journal of Computing in Higher Education 16(2), 2005).

Definition of the approach is now beginning to solidify, but also to differentiate. As methodological guidelines and promising examples begin to surface in abundance, pruning becomes necessary (Kelly, 2004). Dede (2004) as well as Gorard, Roberts and Taylor (2004) call for the educational research community to seriously reflect on setting standards that improve the quality of this approach.

This book offers such a reflection. Most of its chapters are revised, updated and elaborated versions of presentations given at a seminar held in Amsterdam, organized by the Dutch Program Council for Educational Research from the Netherlands

6.
Organization for Scientific Research (NWO/PROO). As a funding agency, NWO/PROO is interested in clarification of what design research entails as well as articulation of quality standards and criteria to judge proposals and to evaluate outcomes of such research. The presentations and discussions during the seminar were very fruitful and stimulating. They provided the impetus to produce this book, which makes the findings available to a wider audience.

MOTIVES FOR DESIGN RESEARCH

The first and most compelling argument for initiating design research stems from the desire to increase the relevance of research for educational policy and practice. Educational research has long been criticized for its weak link with practice. Those who view educational research as a vehicle to inform improvement tend to take such criticism more seriously than those who argue that studies in the field of education should strive for knowledge in and of itself. Design research can contribute to more practical relevance. By carefully studying progressive approximations of ideal interventions in their target settings, researchers and practitioners construct increasingly workable and effective interventions, with improved articulation of principles that underpin their impact (Collins, Joseph & Bielaczyc, 2004; van den Akker, 1999). If successful in generating findings that are more widely perceived to be relevant and usable, the chances for improving policy are also increased.

7.
A second motive for design research relates to scientific ambitions. Alongside directly practical applications and policy implications, design research aims at developing empirically grounded theories through combined study of both the process of learning and the means that support that process (diSessa & Cobb, 2004; Gravemeijer, 1994, 1998). Much of the current debate on design research concerns the question of how to justify such theories on the basis of design experiments. As the thrust to better understand learning and instruction in context grows, research must move from simulated or highly favorable settings toward more naturally occurring test beds (Barab & Squire, 2004; Brown, 1992).

A third motive relates to the aspiration of increasing the robustness of design practice. Many educational designers energetically approach construction of innovative solutions to emerging educational problems, yet their understanding oftentimes remains implicit in the decisions made and the resulting design. From this perspective, there is a need to extract more explicit learning that can further subsequent design efforts (Richey et al., 2004; Richey & Nelson, 1996; Visscher-Voerman & Gustafson, 2004).

ABOUT DESIGN RESEARCH

In this book, we use “Design Research” as a common label for a ‘family’ of related research approaches with internal variations in aims and characteristics. It should be noted, however, that there are also many other labels to be found in literature, including (but not limited to) the following:

8.
• Design studies; Design experiments;
• Development/Developmental research;
• Formative research; Formative evaluation;
• Engineering research.

Clearly, we are dealing with an emerging trend, characterized by a proliferation of terminology and a lack of consensus on definitions (see van den Akker, 1999, for a more elaborate overview). While the terminology has yet to become established, it is possible to outline a number of characteristics that apply to most design studies. Building on previous works (Cobb, Confrey, diSessa, Lehrer, & Schauble, 2003; Kelly, 2003; Design-based Research Collective, 2003; Reeves, Herrington, & Oliver, 2005; van den Akker, 1999) design research may be characterized as:

• Interventionist: the research aims at designing an intervention in the real world.
• Iterative: the research incorporates a cyclic approach of design, evaluation and revision.
• Process-oriented: a black box model of input-output measurement is avoided; the focus is on understanding and improving interventions.
• Utility-oriented: the merit of a design is measured, in part, by its practicality for users in real contexts.
• Theory-oriented: the design is (at least partly) based upon theoretical propositions; and field testing of the design contributes to theory building.

The following broad definition of Barab and Squire (2004) seems to be a generic one that encompasses most variations of educational design research: “a series of

9.
approaches, with the intent of producing new theories, artifacts, and practices that account for and potentially impact learning and teaching in naturalistic settings.”

Further clarification of the nature of design research may be helped by a specification of what it is not. The most noteworthy aspect is probably that design researchers do not emphasize isolated variables. While design researchers do focus on specific objects and processes in specific contexts, they try to study those as integral and meaningful phenomena. The context-bound nature of much design research also explains why it usually does not strive toward context-free generalizations.

INSIDE THIS BOOK

This book was created to appeal to a rapidly growing international audience of educational researchers who situate their studies in practice. The publication contains four main parts, plus supplemental materials available on the publisher’s website. First, a mixture of substantive information is presented for those interested in learning about the essence of design research. This includes: its origins; applications for this approach; and discussion of benefits and risks associated with studies of this nature. The second part of the book features domain-specific perspectives on design research. Here, examples are given in terms of how this approach can serve the design of learning environments, educational technology and curriculum. The third part of the book speaks to the issue of quality assurance. Three researchers express their thoughts on how to guard academic rigor while conducting design studies. In the last part of the book,

13.
Toward productive design 9

TOWARD PRODUCTIVE DESIGN STUDIES
Decker Walker

WHY NOW?

My thinking about design research begins with the question: Why now? Why have some researchers and policy-makers become interested in design research at just this moment in history? I think that there are two major reasons. The most important is disappointment with the impact of conventional approaches to research in education. We have seen no intellectual breakthrough from research in education comparable to breakthroughs in medicine, engineering, and the sciences, nor have we seen any measurable improvement in teaching practices or student learning on a large scale. In clinical experiments, practices and programs supposedly ‘backed by research’ have generally proven to be only slightly better than conventional practice, at best. In short, over half a century of research into education since World War II has not improved education noticeably. In many countries the quality of education seems to have declined over the past several decades, just when educational research supposedly had begun to accumulate enough knowledge for its findings to make an impact. Many of us who advocate design research believe that it, in conjunction with standard forms of inquiry, has the potential to produce the kind of impact research has made in other areas of life, an argument I will develop later.

The second reason why some researchers and policy-makers find design

14.
research attractive now is the availability of promising new theories of learning and technologies through which these theories can be applied. Cognitive science, activity theory (or social constructionism), and brain research offer new perspectives on learning that may well be more powerful than the theories that have guided traditional research such as behaviorism, depth psychology (Freud, Jung, Adler, …), and conventional social psychology. Some of these new theories make predictions about intricate details of learning that are not accessible to teachers and students in ordinary classroom situations. Others consider a much wider range of social influences and interactions than occurs in classrooms. New forms of educational intervention may be needed to realize practical benefits from these new theories. Fortunately, information and communication technologies have developed to the point that new technologically supported interactions may now be designed to apply and test these new theories. Design research seems valuable, if not essential, in developing these new interventions.

HOW RESEARCH INFLUENCES PRACTICE

For most of its history, research in education has influenced practice only loosely and indirectly. Researchers taught theories and findings to educators (teachers, professional leaders, and researchers-in-training) and they in turn applied the theories in practice. In practice, however, theory and research findings often functioned as little more than slogans for reformers. Child-centered learning, discovery learning, and the project method, for instance, were said by their advocates to be ‘based on research,’ but the range of practices included under their banners was so broad that each became more

15.
of a philosophy than a well-defined design. Occasionally theorists and researchers themselves actually designed concrete materials for teachers and students to use. Maria Montessori’s pre-school tasks, the look-say method of reading instruction, the initial teaching alphabet, standardized tests, and programmed instruction are well-known examples of materials designed by theorists and researchers. In both cases, though, studies comparing research-based teaching methods or materials with conventional ones showed small effects or no statistically significant differences.

Design research envisions a tighter, more rigorous connection between learning principles and features of the educational innovation. In design research a theorist’s or researcher’s rigorous analysis of a learning problem leads to quite specific ideas for interventions. Designers then build systems that use information technology to build specific teaching and learning materials and methods designed to realize learning gains predicted by theory and research. If the theoretical analysis is right then these interventions ought to give markedly more effective results. The designing of these systems is an R&D endeavor, not a work of imagination nor a straightforward deduction from theory. In order to create the interventions designers need to study how students and teachers actually respond to specific features of the design suggested by the theory. In other words, in order to show that a design rigorously implements principles from research and theory, designers must do design research.

Having shown that their design functions the way that theory predicts it should, designers need to try their design and see if its results really do live up to predictions. Most likely the first tests of any new design will show weak results or none at all because designs need to be tuned and optimized to give best results. To be effective any complex system normally requires a precise configuration of its elements.
Early radios,

16.
for instance, worked (they would transmit and receive radio frequency signals), but they were weak and unreliable. Through design research engineers discovered more effective ways to amplify the signal, sharpen the tuning, reduce noise, and make the radio’s operation more reliable. It is only logical to suppose that the kind of research engineers do to improve the design of radios and other devices will also be needed to improve educational designs. (Of course, the kind of research needed for educational designs will be different from the kinds of research used in engineering. Teachers and students are central to the functioning of educational practices and so design research in education needs methods drawn from the human sciences, arts, and humanities.)

In order to study the effectiveness of preliminary designs, design researchers need sound, reliable indicators of learning. Traditional teacher-made tests and conventional standardized tests are too crude and imprecise to test for the kinds of learning that the new theories envision. Design researchers have already developed a range of techniques for developing good indicators of learning, including close ethnographic observation, standard learning tasks with scoring rubrics, and other techniques for assessment of learning. Assessment techniques are domain-specific, that is, specific to the content and goals being taught, and so new techniques must be developed for each specific domain of learning and teaching. Developing or adapting assessments is an important part of the design research process. Figure 2.1 shows these relationships in a diagram.

[ FIGURE 2.1 ABOUT HERE ]

17.
GUIDELINES FOR GOOD DESIGN STUDIES

I believe that good design research will lead to more and better learning, thus the phrase ‘Productive design research’ in the title of my chapter. What research methods and approaches are most likely to lead to productive design research? For the most part, these methods will be drawn from established disciplines in the human sciences, arts, and humanities. I will mention several criteria that I would use to choose the methods that are most appropriate for design research studies.

Riskier designs

Standards of methodological rigor traditionally applied to social science research are not, in my opinion, likely to lead to productive design research. Traditional standards are designed to test theories and for this purpose it is crucial to minimize the risk of accepting a false conclusion. Any mistake in research may lead researchers to accept mistaken conclusions that will hinder the growth of knowledge in the discipline. Any wrong turn in theory building can waste years of effort of the best scholars and researchers. In testing theories it pays to go to great lengths to get results that can withstand every criticism.

Design research is not done to test theories, even though its results can sometimes suggest weaknesses in theory. Rather, design research discovers ways to build systems based on theories and to determine the effectiveness of these systems in practice. Design research therefore needs to balance boldness and caution in a different way. A super-cautious insistence on design studies that guard against every potential

18.
source of error will yield large, lengthy, expensive studies that delay designs and multiply their cost many times. A series of smaller, less well-controlled studies may give results nearly as reliable much faster and cheaper. Designers must deal simultaneously with many ambiguities and unknowns. It is often better for them to get a very preliminary result on several of these than to study one or two thoroughly while necessarily relying on guesswork, speculation, and assumptions for all the others. Design research that takes greater risks of accepting erroneous conclusions may have higher payoff. Looser studies that do not fully disprove alternative hypotheses but look instead for a wide range of possible effects of complex designs may be sufficient to reveal ways to improve designs. This doesn’t mean that anything goes. An overly bold approach that’s full of unsubstantiated speculation provides little more than a random chance of hitting on the right design.

The key to productive design research is to strike a new balance between caution and risk-taking. Concentrate on the most important design problems, understand them thoroughly, identify the most promising features for the design in light of that understanding, build prototypes with these features, and try them out. This is a much bolder and riskier research strategy than conventional social science research methodologists recommend but it stands a much better chance of leading to innovative designs.

Cycles of studies

Traditional approaches to research methods focus on individual studies. The goal is to design the best possible study to answer a given question. But design projects always face many questions and varying degrees of uncertainty about them. No single

19.
study could help with all these questions, so the temptation is to focus on one question and do one study to answer that question. This leaves all the other questions completely open. A more sensible approach would be to identify the most important questions surrounding a particular design problem and plan a series of studies addressing each question. Begin in each case with brief, inexpensive studies that give a general idea of the most promising approaches to the question. Then invest in studies (perhaps somewhat more intensive and expensive and lengthy) of the questions that now seem most crucial. Confine the most rigorous (and therefore most expensive) studies to the last stage and the most crucial remaining questions.

Study the resource requirements of designs

All designs cost money, take time to implement, and require expertise and effort. A design may be successful in improving learning but at a prohibitive cost or only if taught by someone with a Ph.D. Resource requirements can and should be estimated and, in the case of programs already in operation, studied empirically. An aspect of every design study ought to be a consideration of the resources required to sustain the design.

Compare practices

The researcher’s temptation is to study in great depth the ‘best’ design, i.e., the design option favored by the designer. However, designs advance best when the most promising design options are compared to one another. Understanding one option deeply will still not tell the designer whether another option might not be even better. So it is usually good practice to compare the promise of all the reasonable design options

20.
and narrow the field to two or three of the most promising options, then compare these directly in a study. Often the gold standard in education (the best known way to teach something) will be something like a human tutor, which is too expensive to provide for everyone. Still, it can be helpful to compare an innovative design with this standard to see how closely the new design approaches the gold standard. It is also often useful to compare a new design to conventional or accepted practice. A new design may not immediately offer better results than accepted practice, but it may cost a great deal less or it may be more easily improved or both.

Consider sustainability and robustness

A design that works in the laboratory may not work in the classroom. One that works in an experimental classroom may not work in a typical classroom. One that works when everything goes right may degrade drastically when teachers or students miss classes because of illness or when a teacher resigns and a new, untrained teacher is appointed, or under any of the countless circumstances that occur frequently in real life. Every form of practice degrades under severe conditions. We need designs that degrade gracefully rather than catastrophically. We need designs that produce impressive results not only under ideal conditions but also under severe but realistic constraints, i.e., robust designs. And we want designs that thrive and improve year after year, not ones that slide downhill every year, i.e., sustainable designs. Design research can estimate robustness and sustainability and can study them empirically once designs have been put in practice.

Involve stakeholders in judging the quality of designs

21.
Teachers may be more interested than others in how much work and effort will be required of them by a new program. Parents may be more interested than teachers in conflicts between what students learn in the new design and traditional religious or cultural beliefs. Civic leaders may be more interested in community involvement. Employers may be more interested in preparation for work. All these are legitimate concerns and the only way to ensure that everyone’s concerns are considered in building a new design or studying it is to involve them in the process. This becomes especially important in judging the overall desirability of a design compared to accepted practices. The weighing of incommensurables involved in such a conclusion rules out an expert judgment and calls for the representation of the various viewpoints of those with most stake in the matter.

TODAY’S OPPORTUNITY

Researchers today have an opportunity to pioneer design research and establish it as an essential part of the creation of new designs for learning and teaching. The alternative is a future in which designs are dominated by fashion and marketing considerations. I know of one prominent company that produces learning software for children to be used in the home whose design process consists of doing market research which leads to a design of the package the customer will see on the shelf. Several competing package designs are shown to focus groups of parents and eventually a design for the box is finalized. At this point, software developers are given the box and told to produce software to fit it. This might not be a bad way to start a design process if

22.
only the software developers were empowered to conduct further studies with children to develop software that actually fostered learning more effectively. But in this case and in so many others, unfortunately, the rest of the design process was done by the seat of the pants. If we researchers and academics want more considered designs with established effectiveness, we will have to show the way. Productive design research is the way.

24.
Normal and design sciences 19

NORMAL AND DESIGN SCIENCES IN EDUCATION: WHY BOTH ARE NECESSARY
Finbarr Sloane

Mainstream research in education is based on science and the humanities. Science helps us to understand education, and interventions in education, from an outsider position, as empirical objects. The humanities contribute to understanding, and critically reflecting on, the human experience of actors inside educational practices. This chapter argues that, in view of the persistent relevance gap between theory and practice, research in education should be broadened to include design as one of its primary modes of engaging in social research. Design is characterized by its emphasis on solution finding, guided by broader purposes and ideal targets. Moreover, design develops, and draws on, design propositions that are tested in pragmatic experiments and grounded in educational science (e.g., research in education, cognition, sociology). In this chapter I first explore the main differences and synergies between science and design, and then I develop a framework for communication and collaboration between the science and design modes. In doing so I highlight why government funding agencies need to continue their support of design research in education.

INTRODUCTION

25.
Education research is currently based on the sciences¹ and humanities, which serve as its main role models. The goal of a scientific approach is to help us to understand educational settings and learning in those settings by uncovering the connections that determine their characteristics, functioning, and outcomes. Science itself is based on a representational view of knowledge, in which educational phenomena are approached as empirical objects with descriptive properties (Bunge, 1979; Mohr, 1982). The descriptive and analytic nature of science helps to explain any existing or emerging educational phenomena, but, generally speaking, cannot account for qualitative novelty. In this respect, the notion of causality underpinning science is the study of variance among variables, the linkage of a known empirical phenomenon into a wider network of data and concepts. From this perspective, then, science tends to focus on testing propositions derived from general theories.

Education research that draws on the humanities as its main role model assumes knowledge to be constructivist and narrative in nature (e.g., Denzin, 1983; Denzin & Lincoln, 1994). The central thesis is that knowledge arises from what actors think and say about the world (Denzin, 1983). Here the researcher focuses on trying to understand, interpret, and portray the human experience and discourse that occurs in educational settings. In this way, the goal of appreciating complexity is given precedence over the goal of achieving generality.

Drawing on Simon’s (1996) writings, this chapter argues for a design approach to organization studies. He notes that “design is the principal mark that distinguishes the professions from the sciences. Schools of engineering, as well as schools of architecture, business, education, law, medicine, are all centrally concerned with the

¹ My use of the term science here is quite broad and is inclusive of both qualitative and quantitative methodologies.

26.
process of design” (Simon, 1996, p. 111). The central idea of design involves inquiry into systems that do not yet exist (either complete new systems or new states of existing systems). The main question becomes, “Will it work?” rather than, “Is it valid or true?” Design is based on pragmatism as the underlying epistemological notion, and design research draws on “design causality” to produce knowledge that is both actionable and open to validation.

The basic argument I make in this chapter is that the study of education requires a design mode, as much as a scientific and humanities mode, to engage in research. Consequently, I argue that research based on the design mode of inquiry has a legitimate claim on federal funding (as long as the highest standards of research practice are maintained). In some respects, science and humanities use and study the creations of human design. As such, design research and the tested products of such research (settings, software, curricula, etc.) contribute to solving a perceived fundamental weakness of education research: the so-called relevance gap between theory and practice (DBRC, 2003; Cobb et al., 2003).

In Table 3.1, I provide a conceptual framework to clarify the main differences and complementarities of science, humanities, and design as three idealized modes of engaging in education research. As such, the framework provides the setting for the remainder of this chapter.

In this chapter I will focus on the differences and synergies between science and design; reference to the humanities perspective will be made merely in Table 3.1 for the sake of completeness. However, space does not allow for a full comparison and integration of all three modes of inquiry. As is evident from Table 3.1, the humanities serve as one of three key modes of engaging in education research (Burkhardt and

27.
Schoenfeld, 2004). Each of these three modes is essential to the pluralistic nature of the field of education research. The future development of education research largely depends on building improved interfaces for communication and collaboration between high quality research in and across these three modes.

This chapter focuses on the science-design interface because the relevance gap between theory and practice is most likely to be bridged by discussing differences and complementarities between the mainstream science mode and the (practitioner’s) design mode. Moreover, the debate between the science and (postmodern) humanities camps appears to have turned our attention away from the important issue of research objectives and our commitments as scholars. In this respect, the pragmatism of the design mode can also be described as the common ground, in an epistemological sense, on which science and humanities can meet (DBRC, 2003).

[ TABLE 3.1 ABOUT HERE ]

The framework in Table 3.1 also suggests differences in the use of terminology across the three modes of inquiry. When discussing the science mode, I will refer to educational systems as empirical objects with descriptive and well-defined properties, whereas artificial objects with both descriptive and imperative properties serve as objects of design research. That is, science and design may focus on the same kind of objects, but do so from different epistemological positions (Cobb et al., 2003).

The argument is organized as follows. First, I explore education as a research domain with a basis in science (NRC, 2002) from the representational perspective as well as from more recently developed understandings of the practice of science.

28.
Subsequently, I discuss and develop the notion of design more extensively; here I also explore how and why the design disciplines have largely moved away from academia to other sites in the economy. The first and third columns in Table 3.1 anticipate and summarize the argument about science and design to this point in the chapter. Finally, I explore the implications of an education research at the interface of science and design, and propose a framework for developing research at this interface.

In sum, Simon notes that “design is the principal mark that distinguishes the professions from the sciences. Schools of engineering, as well as schools of architecture, business, education, law and medicine, are all centrally concerned with the process of design” (Simon, 1996, p. 111). However, I also highlight the need for stronger professionally based linkages between science and design. Every issue of The Structural Engineer, the official journal of the British Institution of Structural Engineers, carries prominently displayed in a box on its contents page this definition of its subject: “Structural engineering is the science and art of designing and making, with economy and elegance, buildings, bridges, frameworks, and other similar structures so that they can safely resist the forces to which they may be subjected.” Since some engineers deny that engineering is either science or art, it is encouraging to see this somewhat official declaration that it is both. By the end of this chapter I hope that you, the reader, will be convinced that education research is also both. For indeed it is. The conception of new curricula, or of a piece of software to support learning, for example, can involve as much a leap of the imagination and as much synthesis of experience and knowledge as any artist is required to bring to a canvas or paper. However, once that design is articulated by the education researcher as artist, it must be analyzed by the education researcher as scientist in as rigorous a way as possible. It has to be held up to the scrutiny of the

29.
education research community as a scientific community (Kelly, 2004; NRC, 2002; Sloane & Gorard, 2003).

SCIENCE AS AN IDEALIZED MODE OF RESEARCH

Purpose

Science develops knowledge about what already is, by discovering and analyzing existing objects (Simon, 1996). It is based on several key values, particularly disinterestedness and consensual objectivity. Disinterestedness implies that scientists are constrained to protect the production of scientific knowledge from personal bias and other subjective influences (Merton, 1973). Because researchers can never be completely cleansed of individual and other interests, science strives to attain consensual objectivity, that is, a high degree of agreement between peers (Merton, 1973). This implies that scientific investigation, in its ideal-typical form, strives for consensual objectivity in researching and understanding general patterns and forces that explain the phenomena under study.

Science as the Role Model for Education Science

Mainstream education “science” is based on the idea that the methodology of the natural sciences should and can be the methodology of education as a science. This approach asserts that knowledge is representational in nature (Donaldson, 2003), and assumes that our knowledge represents the world as it is. The key research question is

30.
thus whether or not general knowledge claims are valid. As a result, the nature of thinking in education science tends to be both descriptive and analytical.

Nature of Objects

Knowledge claims in science refer to educational phenomena as empirical objects with descriptive properties. Education science assumes general order to be empirically manifested as a set of stable regularities that can be expressed in the form of hypothetical statements. These statements are usually conceived as revealing the nature of education itself, namely as a set of objective mechanisms underlying diverse educational realities (Donaldson, 2003). This approach (implicitly) assumes that these objective mechanisms exist and that they can be most effectively studied from an unbiased “outsider” position.

Focus of Theory Development

Education science tends to focus on the discovery of general causal relationships among variables. These causalities can be rather simple (“If x and y, then z”). Because variations in effects may be due to causes other than those expressed in a given proposition, causal inferences are usually expressed in probabilistic equations or expressions. This concept of causality helps to explain any observable educational phenomena, but in itself cannot account for qualitative novelty. Conclusions, and any recommendations, therefore, have to stay within the boundaries of the analysis. The following research methods are frequently used in education science: the controlled experiment, the field study, mathematical simulation modeling, and the case study. In controlled experiments, the research setting is safeguarded from the constraints and

31.
disturbances of the practice setting, and thus a limited number of conditions can be varied in order to discover how these variations affect dependent variables (NRC, 2002). In the field study, also known as the natural experiment, the researcher gathers observations regarding a number of practice settings, measuring in each case the values of relevant (quantitative or qualitative) variables. Subsequently the data are analyzed to test whether the values of certain variables are determined by the values of other variables (see Shadish, Cook & Campbell, 2002). Mathematical simulation modeling involves the study of complex cause-effect relationships over time; this requires the translation of narrative theory to a mathematical model, to enable the researcher to develop a deep understanding of complex interactions among many variables over time. Finally, the single or comparative case study helps researchers to grasp holistic patterns of educational phenomena in real settings (Yin, 1984).

Criticism of Science as Exclusive Mode of Research

Drawing on the humanities, some writers explicitly criticize the representational nature of science-based inquiry (Gergen, 1992; Tsoukas, 2000). Others express severe doubts about whether the representational and constructivist views are really incompatible (Czarniawska, 1998; Tsoukas, 2000). This debate on the nature of knowledge has primarily addressed epistemological issues and has turned attention away from the issue of research objectives, that is, from our commitments as education researchers.

Studies of how research is actually conducted in the natural sciences have been undermining science as the (exclusive) role model for education research. Anthropological studies of how research in some of the natural sciences actually comes

32.
about suggest that the actual operations of scientific inquiry are constructive rather than representational, and are embedded in a social process of negotiation rather than following the (individual) logic of hypothesis formulation and testing (Latour & Woolgar, 1979; Knorr-Cetina, 1981). Knorr-Cetina (1981) suggests the concept of “tinkering” to describe and understand what she observed in the natural sciences: Tinkerers are “aware of the material opportunities they encounter at a given place, and they exploit them to achieve their projects. At the same time, they recognize what is feasible, and adjust or develop their projects accordingly. While doing this, they are constantly engaged in producing and reproducing some kind of workable object which successfully meets the purpose they have temporarily settled on” (Knorr-Cetina, 1981, p. 34; see also Knorr, 1979).

DESIGN AS IDEAL-TYPICAL MODE OF RESEARCH

In this section I describe the nature of design research, in comparison with science, and also describe how and (perhaps) why the design disciplines have moved away from the academic community to other sites in the economy.

Purpose

In his classic work, The Sciences of the Artificial, Herbert Simon (1996) argues that science develops knowledge about what already is, whereas design involves human beings using knowledge to create what could be, that is, things that do not yet exist.

33.
Design, as the activity of changing existing situations into desired (or more desirable) ones, therefore appears to be the core competence of all professional activities.

Role Model

Historically and traditionally, the sciences research and teach about natural things, and the engineering disciplines deal with artificial things, including how to design for a specified purpose and how to create artifacts that have the desired properties (Simon, 1996). The social sciences have traditionally viewed the natural sciences as their main reference point. Further, Simon argues that engineers are not the only professional designers, because “everyone designs who devises courses of action aimed at changing existing situations into preferred ones. The intellectual activity that produces material artifacts is no different fundamentally from the one that prescribes remedies for a sick patient, or the one that devises a new sales plan for a company, or a social welfare policy for a state” (Simon, 1996, p. 111).

Simon (1996) also describes how the natural sciences almost drove the sciences of the artificial from the curricula of professional schools in the first 20 to 30 years after World War II. This was particularly true in engineering, business, and medicine. An important factor driving this process was that professional schools in business and other fields craved academic respectability, when design approaches were still largely “intuitive, informal and cookbooky” (Simon, 1996, p. 112). In addition, the enormous growth of the higher education industry after World War II created large populations of scientists and engineers who dispersed through the economy and took over jobs formerly held by technicians and others without academic degrees (Gibbons et al., 1994). This meant that the number of sites where competent work in the areas of design

34.
and engineering was being performed increased dramatically. This change served to undermine the exclusive position of universities as knowledge producers in these areas (Gibbons et al., 1994). A third force that contributed to design being (almost) removed from professional school curricula was the development of capital markets offering large, direct rewards to value-creating enterprises (Baldwin and Clark, 2000). In other words, design in the technical as well as managerial and social domains moved from professional schools to a growing number of sites in the economy where it was viewed as more respectable, and where it could expect larger direct economic rewards.

View of Knowledge

Design is based on pragmatism as the underlying epistemological notion. That is, design research develops knowledge in the service of action. The nature of design thinking is thus normative and synthetic. It is directed toward desired situations and systems and toward synthesis in the form of actual actions. The pragmatism of design research can be expressed in more detail by exploring the normative ideas and values characterizing good practice in many professions, for example, architecture, engineering, and medicine.

Three of these normative values are presented here (for others, see Nadler, 1981; Nadler and Hibino, 1990). They explicitly define the content dimension of design inquiry and include: (1) the uniqueness of each situation; (2) a focus on purposes and ideal solutions; and (3) the application of systems thinking.

Each Situation is Unique

35.
This assumption implies that no two situations are fully alike. Each problem situation is unique and is embedded in a unique context of related problems, requiring a unique approach (Cobb et al., 2003). The unique and embedded nature of each situation makes it ill defined, or wicked, which means that there is insufficient information available to enable the designer to arrive at solutions by simply transforming, optimizing, or superimposing the given information (DBRC, 2003).

Focus on Purposes and Ideal Solutions

The sole focus on ideal solutions helps “strip away” nonessential aspects of the problem situation. It opens the door to the creative emergence of larger purposes and expanded thinking. It also leads to an increase in considering possible solutions, and guides long-term development and evolution (Banathy, 1996; Nadler and Hibino, 1990; Tranfield et al., 2000). If an ideal target solution can be identified and agreed upon, this target solution puts a time frame on the system to be developed, guides near-term solutions, and infuses them with larger purposes. As Nadler and Hibino note, “even if the ideal long-term solution cannot be implemented immediately, certain elements are usable today” (Nadler and Hibino, 1990, p. 140).

Apply Systems Thinking

Design researchers argue that systems thinking helps designers to understand that every unique problem is embedded in a larger system of problems (Barab & Kirshner, 2001; Rowland & Adams, 1999). It helps them to see “not only relationships of elements and their interdependencies, but, most importantly, provides the best assurance of including all necessary elements” (Nadler and Hibino, 1990, p. 168).

36.
Four other central ideas make up design researchers’ values regarding the process of design: (1) limited information; (2) participation and involvement in decision making and implementation; (3) discourse as medium for intervention; and (4) pragmatic experimentation.

Limited Information

The available information about the current situation (or system) is by definition limited. In the context of a design project, this awareness should guard participants against excessive data gathering that may make them experts with regard to the existing artifacts. In sum, the expressed goal is to become expert in designing new artifacts. Too much focus on the existing situation may prevent people from recognizing new ideas and seeing new ways to solve the problem (Nadler and Hibino, 1990).

Participation and Involvement in Decision Making and Implementation

Those who carry out the solution should be involved in its development from the beginning. Involvement in making decisions about solutions and their implementation leads to acceptance and commitment (Vennix, 1996). Moreover, getting everybody involved is the best strategy if one wants long-term dignity, meaning, and community (McKenney, 1999 & 2001). In some cases, the benefits of participation in creating solutions can be more important than the solution itself (McKenney, 1995).

Discourse as Medium for Intervention

For design professionals, language is not a medium for representing the world, but for intervening in it. Thus, the design process should initiate and involve dialogue

37.
and discourse aimed at defining and assessing changes in educational settings and educational practices (Gettman, McNelly & Muraida, 1999; van den Akker, 1999).

Pragmatic Experimentation

Finally, pragmatic experimentation is essential for designing and developing new artifacts, and for preserving the vitality of artifacts developed and implemented earlier (Edelson, 2002; van den Akker, 1999). Pragmatic experimentation emphasizes the importance of experimenting with new ways of organizing and searching for alternative and more open forms of discourse. For example, this approach is necessary to “challenge conventional wisdom and ask questions about ‘what if?’ but it is tempered by the pragmatist’s own commitment to finding alternatives which are useful” (Wicks & Freeman, 1998, p. 130).

Some of these ideas are familiar to other approaches. For example, the notion of discourse is shared with postmodernism (Gergen, 1992), although the latter may not support the underlying notion of pragmatism (see Table 3.1). The notion of experimentation is also central to laboratory experiments in the natural sciences and (some parts of the) social sciences; however, experiments by designers in organizational settings are best understood as action experiments (Argyris, Putnam & McLain Smith, 1985), rather than as controlled experiments in a laboratory setting (Brown, 1992).

In response to the need for more relevant and actionable knowledge, education researchers (and in particular learning scientists) tend to adopt action research methods to justify a range of research methods and outputs. Action research has been, and still is, not well accepted on the grounds that it is not normal science (Tranfield and Starkey, 1998). Action researchers have been greatly concerned with methods to improve the

38.
rigor and validity of their research, in order to gain academic credibility. Action researchers in education have emphasized retrospective problem diagnosis more than finding and creating solutions. Some have confused design research with action research (Shavelson et al., 2003). At its core, design research is quite different. Design research incorporates several key ideas from action research, but is also fundamentally different in its future-oriented focus on solution finding.

Nature of Objects

Design focuses on learning issues and systems as artificial objects with descriptive as well as imperative properties, requiring non-routine action by agents in insider positions. The imperative properties also draw on broader purposes and ideal target systems. The pragmatic focus on changing and/or creating artificial objects rather than analysis and diagnosis of existing objects makes design very different from science. The novelty of the desired (situation of the) system as well as the non-routine nature of the actions to be taken imply that the object of design inquiry is rather ill defined.

Focus of Theory Development

The key question in design projects is whether a particular design “works” in a certain setting. Such a design can be based on implicit ideas (e.g., the way we plan most of our daily activities). However, dealing with ill-defined educational issues generally, and learning issues specifically, requires a systematic and disciplined approach. This approach involves the development and application of propositions, in the form of a coherent set of related design propositions. Design propositions are depicted, for

39.
example, as follows: “In situation S, to achieve consequence C, do A” (van den Akker, 1999). In case of an ill-defined current and desired situation, a design approach is required that cannot and should not stay within the boundaries of the initial definition of the situation. Archer (1984, p. 348) describes an ill-defined problem as “one in which the requirements, as given, do not contain sufficient information to enable the designer to arrive at a means of meeting those requirements simply by transforming, reducing, optimizing, or superimposing the given information alone.” Examples of ill-defined issues are a lack of communication and collaboration between team members, and nonparticipation as the typical response of students to assigned work. By contrast, examples of well-defined problems are analyzing the test data for a particular student; selecting the best candidate from a pool of applicants on the basis of an explicit list of requirements; and computing a regression analysis of a certain dependent variable on a set of independent and control variables (Newell and Simon, 1972).

When faced with ill-defined situations and challenges, designers employ a solution-focused approach. They begin generating solution concepts very early in the design process, because an ill-defined problem is never going to be completely understood without relating it to an ideal target solution that brings novel values and purposes into the design process (Banathy, 1996; Cross, 1984). According to Banathy (1996), focusing on the system in which the problem situation is embedded tends to lock designers into the current system, although design solutions lie outside of the existing system: “If solutions could be offered within the existing system, there would be no need to design. Thus designers have to transcend the existing system. Their task is to create a different system or devise a new one. That is why designers say they can

40.
truly define the problem only in light of the solution. The solution informs them as to what the real problem is” (Banathy, 1996, p. 20).

DEVELOPING THE DESIGN-SCIENCE INTERFACE

This section explores the implications of positioning educational research at the interface of science and design, and describes a framework for developing this interface by first outlining two conceptual forms of causality, and then describing opportunities for an intersection of both science and design.

Two Concepts of Causality

The concept of causality underpinning science is the study of variance among variables across time or space, that is, the linkage of a known empirical phenomenon into a wider network of data and concepts. Thus science tends to focus on testing propositions derived from general theories (Maxwell, 2004; Mohr, 1982). Design draws on what Argyris (1993, p. 266) calls design causality to produce knowledge that is both actionable and open to validation. The notion of design causality appears to be less transparent and straightforward than the concept of causality underpinning science. This is because of two characteristics of design causality. First, design causality explains how patterns of variance among variables arise in the first place, and in addition, why changes within the pattern are not likely to lead to any fundamental changes (Collins, 1992). Second, when awareness of a certain ideal-target system (e.g., the circular design) has been created, design causality implies ways to change the causal patterns.

41.
That is, ideal-target systems can inspire, motivate, and enable agents to develop new processes and systems. Argyris (1993) emphasizes, however, that the causality of the old and the new structure will co-exist, long after a new program or learning artifact has been introduced.

These two characteristics of design causality tend to complicate the development and testing of design propositions as hypotheses in science. A full integration of the design and science modes is not easy and perhaps not feasible. This reinforces the argument made earlier in this chapter that simple integration would be difficult, and may not be desirable. However, once an artifact is designed, its value to learning needs to be investigated scientifically.

Toward an Interface Between Design and Science

If design and science need to co-exist as important modes of engaging in educational research, any attempt to reduce the relevance gap between mainstream theories and the world of practice starts with developing an interface between design and science. A critical element of the interface proposed here involves the notion of design propositions.

Design propositions, as the core of design knowledge, are similar to knowledge claims in science-based research, irrespective of differences in epistemology and notions of causality. These design propositions can provide a shared focus for dialogue and collaboration between design and science. This suggests that research at the design-science interface should focus on design propositions developed through testing in practical contexts as well as grounding in the empirical findings of education science. This type of research would enable collaboration between the design and science mode,

42.
while it would also respect some of the methodological differences between the two modes.

At the interface between science and design, some research methods appear to be more effective than others. The nature of alpha and beta testing of design propositions by means of action experiments is highly similar to the replication logic recommended for comparative case studies (Sloane & Gorard, 2003; Yin, 1984). Another research method that may be effectively employed at the design-science interface is simulation modeling. In particular, simulation methods involving both conceptual models (mathematical simulation and learning laboratories) appear to be very promising. Simulation modeling allows people to build and test models describing the current and desired (states of the) system, which, in turn, helps them to move outside the mental boundaries of the current situation. In general, the collaboration between insiders and outsiders with regard to the learning systems under study appears to increase the effectiveness of research projects at the design-science interface.

Design research must be directed toward rigorous research to produce design propositions that can be grounded in empirical research as well as tested, learned, and applied by “reflective practitioners” in educational settings (Schön, 1987). The form of such propositions and rules, as the core of design knowledge, is very similar to knowledge claims in science. This similarity is an important condition for dialogue and collaboration between design and science, to the extent that these propositions can provide a shared focal point. A more rigorous approach to design inquiry will facilitate collaboration and dialogue with education science.

In general, the possible synergy between science and design can be summarized as follows. First, the body of knowledge and research methods of education science can

43.
serve to ground preliminary design propositions in empirical findings, suggest ill-defined areas to which the design mode can effectively contribute, and build a cumulative body of knowledge about educational theory and practice. In turn, the design mode serves to translate empirical findings into design propositions for further pragmatic development and testing. It can suggest research areas (e.g., with emerging design propositions that need empirical grounding in education science) to which science can effectively contribute. Finally, design research can reduce the relevance gap between science and the world of practice.

CONCLUDING REMARKS

After enjoying a certain degree of paradigmatic consistency and unity in the first half of the 20th century, educational research has become increasingly pluralistic in nature. In this respect, science and the humanities help to understand existing educational systems and settings for learning, rather than to actually create much-needed new learning artifacts. This suggests that education research should be reconfigured as an academic enterprise that is explicitly based on all three modes of inquiry: science, humanities, and design. With a few exceptions in the academic community, design inquiry in education is left to learning scientists, many of whom were trained as engineers or computer scientists. There are of course some exceptions (particularly in areas like mathematics and science education). One result of the small size and diversity of the design community in education is that the body of design knowledge appears to be fragmented and dispersed in contrast to more mainstream education research

44.
modeled after the scientific paradigm. Design research should therefore be redirected toward more rigorous research, to produce outcomes that are characterized by high external validity but that are also teachable, learnable, and actionable by practitioners. Collaboration and exchange between science and design can only be effective if a common framework is available that facilitates interaction and communication between the two. National funding agencies should sponsor such collaboration. This can be seen in the history of the National Science Foundation’s funding decisions with respect to design technologies, and more recently in the call for design research by the U.S. Department of Education’s Institute of Education Sciences (IES).

In closing, the argument in this chapter involved a modest attempt to define the main conditions, differences, and synergies of three modes of engaging in education research (see Table 3.1). Subsequently, the nature and contribution of the design mode was explored and illustrated in more detail. Finally, modest opportunities for linking science and design to better serve education were briefly acknowledged.

REFERENCES

Archer, L. B. (1984). Whatever became of design methodology? In N. Cross (Ed.), Developments in design methodology (pp. 347–350). New York: Wiley.

Argyris, C. (1993). Knowledge for action: A guide to overcoming barriers to organizational change. San Francisco: Jossey-Bass.

Argyris, C., Putnam, R., & McLain Smith, D. (1985). Action science: Concepts, methods, and skills for research and intervention. San Francisco: Jossey-Bass.

50.
Table 3.1: Three modes of engaging in organization research (adapted from: Banathy, 1996)

View of Knowledge
  Explanatory Science: Representational knowledge that represents the world as it is; nature of thinking is descriptive, explanatory and analytic.
  Interpretive Science: Constructivist and narrative knowledge that arises from what actors think and say about the world; nature of thinking is critical, interpretive and reflexive.
  Design Research: Pragmatic knowledge that serves to initiate change in man-made things and systems (artifacts); nature of thinking is normative and synthetic.

Nature of Research Object
  Explanatory Science: Organizational phenomena as empirical objects, with well-defined descriptive properties, that can be effectively studied from an outsider position.
  Interpretive Science: Discourse that actors and researchers engage in; appreciating the complexity of a particular discourse is given precedence over the goal of achieving general knowledge.
  Design Research: Organizational issues and systems as artificial objects with descriptive as well as imperative properties. Imperative properties also draw on broader purposes and ideal target systems.

Main Question
  Explanatory Science: Is this hypothesis valid? Conclusions stay within the boundaries of the analysis.
  Interpretive Science: To what extent is this (category of) human experience(s) in an organizational setting “good”, “fair”, and so forth?
  Design Research: Does this particular design work? Conclusions tend to move outside boundaries of initial definition of the situation.

Interplay Between Theory and Method
  Explanatory Science: Methodology is clearly defined and should be adhered to rigorously, regardless of the empirical object. Although the choice of method tends to affect theory, method as tool-for-testing prevails in justifying and publishing research findings.
  Interpretive Science: Interplay between theory and method is reciprocal and dialectic. Methodological straightjackets are avoided, to facilitate the development of interesting ideas and theory.
  Design Research: Methods are tools for creating and changing human artifacts. Methods are thus selected to fit the specifics of the problem and situation, and may consist of any one or a combination of explanatory, interpretive, experimental, computational, mathematical or exploratory methods.

51.
A learning design perspective 45

DESIGN RESEARCH FROM A LEARNING DESIGN PERSPECTIVE

Koeno Gravemeijer and Paul Cobb

In this contribution, we want to elaborate an approach to design research that has been used and refined in a series of design research projects in which the two authors collaborated over a ten-year period. To locate our contribution in this book, we may categorize our approach as falling within the broader category of design research that aims at creating innovative learning ecologies in order to develop local instruction theories on the one hand, and to study the forms of learning that those learning ecologies are intended to support on the other hand.[1] The research projects on which we will focus involve a research team taking responsibility for a group of students’ learning for a period of time, and all concern the domain of mathematics education (including statistics education).

The approach to design research, which we developed over the years, has its roots in the history of the two authors. One is that of socio-constructivist analysis of instruction. The other is that of the work on realistic mathematics education (RME) that is carried out in the Netherlands.

The underlying philosophy of design research is that you have to understand the innovative forms of education that you might want to bring about in order to be able to produce them. This fits with the adage that “if you want to change something, you have to understand it, and if you want to understand something, you have to change it.” The two sides of this adage mirror the authors’ two histories. The RME approach was inspired by a need for educational change, the socio-constructivist approach by a desire for understanding.

If we take the first of these perspectives, we may observe that the notion of design research has been around for a long time. Various forms of professional instructional design may be perceived as informal predecessors of design research. The recognition that instructional design often had an innovative character, while the

[1] This corresponds with types two and three in the discussion paper of Gravemeijer & van den Akker (2003).

52.
available scientific knowledge base was far too limited to ground the design work, sparked the idea for a type of instructional design that integrated design and research. This idea was strengthened by the experience that conscious and thorough instructional design work brought about a learning process in which the designers developed valuable and well-grounded knowledge in what retrospectively might be called design experiments.

Over time a number of proposals have been made to define design research in mathematics education, of which Brown’s (1992) article on design experiments is one of the most notable. In the Netherlands, Freudenthal (Freudenthal, Janssen, & Sweers, 1976) was perhaps the first to propose an approach of this type with his concept of “developmental research”, an idea that was further elaborated by Streefland (1990) and Gravemeijer (1994, 1998).2 Freudenthal’s ideas were put into practice in the Dutch Institute for the Development of Mathematics Education, IOWO (later OW&OC, now called the Freudenthal Institute). This work created fertile soil for the development of the so-called domain specific instruction theory3 of realistic mathematics education (RME) (Treffers, 1987). This domain specific theory could be reconstructed as a generalization over numerous local instruction theories (Gravemeijer, 1994).

The second part of the adage, “if you want to understand something, you have to change it”, points to the other predecessor of our collaborative work on design research, the constructivist “teaching experiment methodology” (Cobb & Steffe, 1983; Steffe, 1983). Within this methodology, one-to-one teaching experiments aimed primarily at understanding how students learn rather than at educational change. These one-to-one teaching experiments were later expanded into classroom teaching experiments.
The need for classroom teaching experiments arose when analysis of traditional instruction within the same (socio-constructivist) research program produced only negative advice for the teachers; advice of the type: “Don’t do this, don’t do that.” To create more productive classroom environments, the researchers had to take the responsibility for the design of the instruction of a classroom for an extended period of

2 Similar efforts have been made in science education (Lijnse, 1987). Coincidentally, van den Akker and colleagues developed a more general design-theory oriented form of design research in the Netherlands, which they also called ‘developmental research’ (van den Akker, 1999).

3 The prefix ‘domain specific’ is used to delineate RME from general instructional theories, and to express that this instruction theory is specific for the domain of mathematics education.

53.
time. In doing so, the one-on-one teaching experiment methodology was expanded to classroom teaching experiments.

The focus on understanding is a salient characteristic of design research. In this respect, the distinction Bruner (1994) makes between research that aims at (statistical) explanation and research that aims at understanding comes to mind. We may use this distinction to emphasize that the goal of design research is very different from that of research along the lines of an experimental or quasi-experimental research design. And different goals imply different methods and different forms of justification. In relation to this we may quote the NCTM Research Advisory Committee (1996), which observes “a shift in norms of justification” in mathematics education research. This is a shift, they argue, from research that proves that treatment A works better than treatment B, towards research that has as its goal to provide an empirically grounded theory on how the intervention works.

Note that the intended result of this type of research is theory. The purpose of design experiments is to develop theories about both the process of learning and the means that are designed to support that learning. One may work towards this goal in two ways, either by developing local instruction theories, or by developing theoretical frameworks that address more encompassing issues. In our approach to design research we try to combine the two.

In the following, we make the issue of what design research is for us concrete by discussing the three phases of conducting a design experiment, which are 1) preparing for the experiment, 2) experimenting in the classroom, and 3) conducting retrospective analyses. In doing so, we will address a range of methodological considerations. To ground the discussion in a concrete design experiment, we will use an experiment on statistics to illustrate the various phases.
Although some may not consider statistics a part of mathematics, we contend that this illustrative case of statistics education is compatible with the kind of mathematics education we seek to bring about.

PHASE ONE, PREPARING FOR THE EXPERIMENT

54.
From a design perspective, the goal of the preliminary phase of a design research experiment is to formulate a local instruction theory that can be elaborated and refined while conducting the intended design experiment. From a research perspective, a crucial issue is that of clarifying its theoretical intent. In elaborating these points, we will start by clarifying how one goes about establishing the learning goals, or instructional end points at which one is aiming, and the instructional starting points. Next we will discuss the conjectured local instruction theory that the research team has to develop. This local instruction theory encompasses both provisional instructional activities and a conjectured learning process that anticipates how students’ thinking and understanding might evolve when the instructional activities are employed in the classroom. We will close this section by elaborating on the theoretical intent of an experiment.

End Points

The preparation for a classroom design experiment typically begins with the clarification of the mathematical learning goals. Such a clarification is needed, as one cannot simply adopt the educational goals that are current in some domain. These goals will in practice largely be determined by history, tradition, and assessment practices. Design researchers therefore cannot just take these goals as a given when starting a design experiment. Instead, they will have to problematize the topic under consideration from a disciplinary perspective, and ask themselves: What are the core ideas in this domain?

We may illustrate this activity of problematizing with our work in the domain of early statistics.

The conventional content of statistics at the US Middle School level (12- to 14-year-old students) is rather meager. It is basically a collection of separate topics (such as mean, median, and mode) and standardized graphical representations.
Reviewing the literature, however, did not offer much help; there appeared to be no consensus on what the central ideas should be. By analyzing what doing statistics entails, we came to the conclusion that the notion of distribution plays a central role. We concluded that distribution could function as an overarching idea that could go through elementary

55.
school, middle school, and up to high school and college. From this perspective, notions like “center”, “skewness”, “spread”, and “relative frequency” are ways of characterizing how the data are distributed, rather than separate topics or concepts in themselves. In addition, different types of statistical representations come to the fore as different ways of structuring and organizing data sets in order to detect relevant patterns and trends.

This elaboration serves to emphasize that the goal of design research is not to take the currently instituted or institutionalized school curriculum as a given, and to try to find better ways to achieve the given goals. Instead, the research team has to scrutinize those goals from a disciplinary point of view, in order to establish what the most relevant or useful goals are. Consequently, the design research we describe here is interventionist in character. In our example, part of our agenda was to attempt to influence what statistics should be in school, at least at a middle school level in the US.

Starting Points

In order to be able to develop a conjectured local instruction theory, one also has to consider the instructional starting points. Note that the focus in doing so is to understand the consequences of earlier instruction, not merely to document the typical level of reasoning of 12- or 14-year-old students in a given domain. Here, the existing research literature can be useful. Psychological studies can usually be interpreted as documenting the effects of prior instructional history. To complement such a literature study, the researchers will also have to carry out their own assessments before starting a design experiment. In some cases, they may be able to use available items and instruments. In addition to written tests, there will also be a need for other forms of assessment, such as interviews or whole class performance assessments.
We have found performance assessments to be particularly useful in documenting instructional starting points. We may illustrate this with the example of the statistics design experiment.

In preparation for the design experiment in data analysis, we gave a number of tasks to two classes. Then, rather than attempting to support the

56.
students’ learning in the whole class discussion, the role of the teacher was to probe the students’ understanding and reasoning, and to find out why they used particular approaches. These performance assessments clearly revealed the consequences of the students’ prior instruction. For them, data analysis was trying to remember what you’re supposed to do with numbers. Data were not numbers plus context for them, to use a phrase from David Moore (1997). In his view, statisticians are always dealing with data plus context. In other words, data for these students were not measures of an attribute of a situation that was relevant with regard to the problem or issue under investigation. So, our initial challenge in the design experiment was to support a change in what statistics was about for these students, so that they were actually analyzing data.

Local Instruction Theory

Given the potential end points on the one hand, and the instructional starting points on the other hand, the research team has to formulate a local instruction theory. Such a conjectured local instruction theory consists of conjectures about a possible learning process, together with conjectures about possible means of supporting that learning process. The means of support encompass potentially productive instructional activities and (computer) tools as well as an envisioned classroom culture and the proactive role of the teacher. The research team tries to anticipate how students’ thinking and understanding might evolve when the planned but revisable instructional activities are used in the classroom. In this manner, the research team tries to reconcile the need to plan in advance and the need to be flexible when building on the students’ current understandings when the design experiment is underway.

In many domains, the available research literature provides only limited guidance.
In the case of statistics we had to work hard to find five relevant articles.4 The sort of articles that are relevant for construing local instruction theories are reports of the process of students’ learning in a particular domain together with descriptions of the instructional settings, the tasks, and the tools that enabled or supported that learning. To compensate for the lack of guidance that the literature offers, design

4 We initially worked with univariate data. When we moved to bivariate data we were not able to identify or find any research report that we could build on.

57.
researchers have to turn to other resources, such as curricula, texts on mathematics education, and the like. Actually, the design researcher may take ideas from whatever sources to construe an instructional sequence. Note, however, that adopting often means adapting. In this respect, the way of working of a design researcher resembles the manner of working of what the French call a “bricoleur.” A bricoleur is an experienced tinkerer or handy person, who uses as much as possible those materials that happen to be available. To do so, many materials will have to be adapted; the bricoleur may even have to invent new applications, which differ from what the materials were designed for. The design researcher follows a similar approach, labeled “theory-guided bricolage” (Gravemeijer, 1994), to indicate that the way in which selections and adaptations are made will be guided by a (possibly still emergent) domain specific instruction theory.

The Classroom Culture and the Proactive Role of the Teacher

Instructional designers typically focus on instructional tasks and tools as potential means of support. We would argue, however, that one also has to consider the characteristics of the envisioned classroom culture and the proactive role of the teacher. One cannot plan instructional activities without considering how these activities are going to be enacted in the classroom. Design researchers therefore also have to consider the nature of classroom norms and the nature of classroom discourse. We know from experience that the norms of argumentation can differ radically from one classroom to another, and that they can make a profound difference in the nature and the quality of the students’ mathematical learning (Cobb, Yackel, & Wood, 1989). Considerations on classroom norms and classroom discourse should therefore be included in the design.

Thus one of the tasks of the teacher will be to establish the desired classroom culture.
Further, the proactive role of the teacher will include introducing the instructional activities, or more specifically in the case of statistics, guiding the process of talking through the process of data creation. In addition, the teacher will have to select possible topics for discussion, and orchestrate whole-class discussions on these topics.

Theoretical Intent

In addition to elaborating a preliminary instructional design, the research group also has to formulate the theoretical intent of the design experiment. For the goal of a

58.
design experiment is not just to describe what happened in a particular classroom. Analyses will have to be cases of a more general phenomenon that can inform design or teaching in other situations. One of the primary aims of a design experiment is to support the constitution of an empirically grounded local instruction theory.

Another aim of a design experiment might be to place classroom events in a broader context by framing them as instances of more encompassing issues. For example, analyses might be conducted that focus on the proactive role of the teacher, the teacher’s and students’ negotiation of general classroom norms, or the teacher’s learning. Also the role of symbolizing and modeling, or more generally of semiotic processes, in supporting students’ learning can become an explicit focus of investigation. As a final example, we may mention that the statistics design experiment became a case of cultivating students’ mathematical interests, in that in the course of these experiments students became very interested in conducting data analysis to investigate issues. They came to view this as an activity worthy of their engagement. This relates to issues such as motivation and persistence. Ultimately, this might influence their decision whether to continue to study mathematics or not. For us, the cultivation of students’ domain specific interests is an important aspect of mathematical literacy in its own right.

In addition to these more encompassing issues, we may point to a third type of theory that may emerge during a series of design experiments. A series of design experiments can serve as the context for the development of theories or theoretical frameworks that entail new scientific categories that can do useful work in generating, selecting, and assessing design alternatives. The development of a conceptual framework to describe the phenomena under study is an essential part of a scientific endeavor.
New categories, however, do not come ready-made, and cannot simply be captured by writing down a definition. New categories have to be invented and embedded in a supporting theoretical framework. Defining scientific terms is more like finding and validating a new category of existence in the world, for which we may use the term “ontological innovation” (diSessa & Cobb, 2004).

Examples of such ontological innovations include the interpretative framework

59.
for interpreting classroom discourse and communication, which we will discuss later (Cobb & Yackel, 1996), the “discovery” of meta-representational competence (diSessa, 1992, 2002), the theory of quantitative reasoning (Thompson, 1994, 1996), the design heuristic of emergent modeling (Gravemeijer, 1999), and RME theory in general (Treffers, 1987; Gravemeijer, 1994). The new frameworks and categories may be sought, but often they emerge from design experiments in answer to the need to get a handle on surprising observations. The initial conceptualization, however, will typically be crude and in need of further elaboration and improvement. Ontological innovations therefore become a topic of a research program that spans a series of design experiments, within which the theoretical frameworks will be revised and refined to adjust them to a range of design contexts.

Note that ontological innovations can play a dual role. On the one hand they can serve as lenses for making sense of what is happening in the complex, more-or-less real-world instructional setting in which a design study is conducted. On the other hand, ontological innovations can function as guidelines or heuristics for instructional design. The social norms and the socio-mathematical norms that we will discuss in more detail later may serve as an example. On the one hand, the concepts of social norms and socio-mathematical norms offer an interpretative framework for analyzing classroom discourse and communication. On the other hand, the same framework reveals what norms to aim for to make the design experiment successful. RME theory may play a similar dual role; the theory not only guides the design, but also offers a framework for interpreting the learning process of the students. One point of attention, for instance, will be the variety of solution procedures that the students produce.
This can be seen as an indication of the extent to which these solution procedures are student inventions rather than unreflected copies of examples given by the teacher or other students. Moreover, according to the reinvention principle, one expects the variation in solution procedures to correspond with the conjectured reinvention route.

PHASE TWO, THE DESIGN EXPERIMENT

60.
The second phase consists of actually conducting the design experiment. When all the preparation work has been done, the overall end points are specified, the starting points are defined, and a conjectured local instruction theory is formulated, the design experiment can start. The research group will take the responsibility for the learning process of a group of students, whether for five weeks, for three months, or even for a whole school year. However, before describing this second phase, it is important to clarify the intent or purpose of actually experimenting in the classroom.

Although, for some, the term “experiment” may evoke associations with experimental, or quasi-experimental, research, the objective of the design experiment is not to try and demonstrate that the initial design or the initial local instruction theory works. The overall goal is not even to assess whether it works, although of course the researchers will necessarily do so. Instead the purpose of the design experiment is both to test and improve the conjectured local instruction theory that was developed in the preliminary phase, and to develop an understanding of how it works.

We will start our discussion of the design experiment with the iterative sequence of tightly integrated cycles of design and analysis, which is key to the process of testing, improving, and understanding. Next we will briefly touch upon the kind of data that are generated. Then we address the need for explicating the interpretative framework(s) one uses, on the one hand for interpreting classroom discourse and communication, and on the other hand for interpreting students’ mathematical reasoning and learning.

Micro Cycles of Design and Analysis

At the heart of the design experiment lies a cyclic process of (re)designing and testing instructional activities and other aspects of the design.
In each lesson cycle, the research team conducts an anticipatory thought experiment by envisioning how the proposed instructional activities might be realized in interaction in the classroom, and what students might learn as they participate in them. During the enactment of the instructional activities in the classroom, and in retrospect, the research team tries to analyze the actual process of the students’ participation and learning. And, on the basis of this analysis, the research team makes decisions about the validity of the conjectures that are embodied in the instructional activity, the establishment of particular norms and so forth, and about the revision of those specific aspects of the design. The design

61.
experiment therefore consists of cyclic processes of thought experiments and instruction experiments (Freudenthal, 1991), see Figure 4.1.

[ FIGURE 4.1 ABOUT HERE ]

We may associate these micro cycles of design and analysis with Simon’s (1995) “mathematical teaching cycle.” According to this idea of a mathematical teaching cycle, a mathematics teacher will first try to anticipate what the mental activities of the students will be when they participate in some envisioned instructional activities, will next try to find out to what extent the actual thinking processes of the students correspond with the hypothesized ones during the enactment of those activities, and will finally reconsider potential or revised follow-up activities. To characterize the teacher’s thinking, Simon coins the term “hypothetical learning trajectory,” which he describes as: “The consideration of the learning goal, the learning activities, and the thinking and learning in which the students might engage (...)” (Simon, 1995, p. 133). The mathematical teaching cycle, then, may be described as conjecturing, enacting, and revising hypothetical learning trajectories.

We may compare the micro cycles of design and analysis with the concept of an empirical cycle of hypotheses testing. A fundamental difference, however, is that the evaluation of the former concerns inferences about the mental activities of the students, not merely observable behavior of the students. For the design researcher, the goal is not just to find out whether the participation of the students in those particular activities results in certain anticipated behaviors, but to understand the relation between the students’ participation and the conjectured mental activities.

To give an example of such more encompassing conjectures we may return to our example of statistics.
Earlier we stated that one of our initial goals was that the students would actually be analyzing data, not just numbers without context. With that in mind, we instituted a process that we called “talking through the process of data creation.” On the basis of pragmatic considerations, and since our focus

62.
was on data analysis, we did not involve the students in activities of data gathering. We did not, however, want the data to drop out of thin air for the students. Moreover, following Tzou (2000), we would argue that data are not readily available; data are created. Data are the result of measuring, and often specific measures are construed to find an answer to a certain question. We conjectured that it would be essential for students to experience this process of creating data to answer a question if data were to be measures rather than mere numbers for them. We may illustrate this with an example.

In one of the initial instructional activities, we wanted the students to compare data on the life span of two brands of batteries. However, it was important that they do so for a reason that they considered legitimate. The teacher therefore began by asking the students if they used batteries, and what they used them for. They said that they used them in portable CD players, tape recorders, and so forth. So, for them the quality of batteries appeared to be a significant issue. Next the teacher asked about the things that they focus on when buying batteries. The students came up with life span and costs. So together teacher and students identified life span as a relevant dimension. Then the discussion turned to how to figure out which of two different brands of batteries would have the better life span. And the students were asked to come up with ideas about how to make measurements. They offered various proposals; often the idea came up of putting a number of batteries in “identical” appliances, everything from flashlights, to clocks, to whatever. It was only against that background of actually having talked through the data creation process that the data the students were to analyze were introduced.
In doing so, we conjectured that as a consequence of engaging in this process the data that were introduced would have a history for the students. Shown in Figure 4.2 are the data on the life span of two brands of batteries, which are presented by “magnitude-value bars” in the first computer minitool.

[ FIGURE 4.2 ABOUT HERE ]
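
To make the idea of magnitude-value bars concrete, the following is a minimal Python sketch of such a representation, not the minitool itself. It assumes that each battery is a (brand, hours) pair and that the bars can be ordered by size or grouped by brand. The life spans, the function names, and the text rendering are all hypothetical and chosen only for illustration.

```python
# Hypothetical life spans in hours for two battery brands. These numbers are
# invented for illustration; they are not the data shown in Figure 4.2.
batteries = [
    ("Always Ready", 112), ("Tough Cell", 98), ("Always Ready", 45),
    ("Tough Cell", 104), ("Always Ready", 120), ("Tough Cell", 101),
    ("Always Ready", 118), ("Tough Cell", 95),
]


def sort_by_size(data):
    """Order the bars from shortest to longest life span."""
    return sorted(data, key=lambda bar: bar[1])


def group_by_brand(data):
    """Group the bars by brand, with values sorted within each group."""
    groups = {}
    for brand, hours in data:
        groups.setdefault(brand, []).append(hours)
    return {brand: sorted(values) for brand, values in groups.items()}


def render_bars(data, hours_per_char=4):
    """Return one text line per battery; bar length is proportional to life span."""
    return [f"{brand:<12} {'#' * (hours // hours_per_char)} {hours}"
            for brand, hours in data]


if __name__ == "__main__":
    for line in render_bars(sort_by_size(batteries)):
        print(line)
```

With invented values like these, sorting by size puts one short-lived battery of the first brand at the bottom and several long-lived ones at the top, which is the kind of contrast a class could discuss when comparing the positions of the bars' end points.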

63.
Each bar signifies the life span of a single battery. This computer tool has a number of options; the students can, for example, sort the bars by size or by the colors that correspond with different subsets. When we introduced this type of visual representation, we purposely chose situations with linearity, such as time, that in our view would fit with this representation. We conjectured that this representation would be relatively transparent for the students thanks to their experience with scale lines and the like. We further conjectured that the students would focus on the position of the end points of the bars when comparing the data sets, and that the combination of a significant number of high values of the Always Ready batteries with a few short life spans would create opportunities for a productive discussion.

In this illustration we focused on various conjectures, such as the conjecture that engaging the students in the task of comparing two sets of data that differed markedly in the distribution of data values, while using the first minitool, would lead to a discussion about how the data values are distributed. We would be remiss if we did not clarify that the actual conjectures were in fact more complex, in that they also encompassed choices about the organization of the classroom activities and classroom norms, as well as the nature of instructional activities and tools. These are relatively detailed conjectures about the means of supporting shifts in students’ reasoning that we anticipated would be important.

As a clarifying note, it is helpful to distinguish between two complementary ways of identifying causal relations: the regularity conception of causality, which is connected to observed regularities, and a process-oriented conception of causal explanation, “that sees causality as fundamentally referring to the actual causal mechanisms and processes that are involved in particular events and situations” (Maxwell, 2004, p.
4). Within the latter, “causal explanation” refers to “the mechanisms through which and the conditions under which that causal relationship holds” (Shadish, Cook, & Campbell, 2002, cited in Maxwell, 2004, p. 4). In contrast to the regularity

64.
conception of causality that is connected to observed regularities, causal explanation can in principle be identified in a single case (Maxwell, 2004, p. 6). These mechanisms are exactly the kind of causal explanation that the design researchers seek to develop. In this sense, the micro cycles of thought and instruction experiments correspond to a process-oriented conception of causal explanation, while the empirical cycle corresponds with the regularity conception of causality. Note, however, that in the context of design research, it will not be sufficient to come to understand one student’s thinking. Instead, to be of value, the researchers must document that a significant proportion of students reason in a comparable manner. In addition, regularities in the variation in student thinking will be essential for productive classroom discussions.

In a design experiment, the micro cycles of thought and instruction experiments serve the development of the local instruction theory. In fact there is a reflexive relation between the thought and instruction experiments, and the local instruction theory that is being developed. On the one hand, the conjectured local instruction theory guides the thought and instruction experiments, and on the other hand, the micro cycles of design and analysis shape the local instruction theory (see Figure 4.3).

[ FIGURE 4.3 ABOUT HERE ]

These micro cycles require that the research team engages in an ongoing analysis of individual students’ activity and of classroom social processes to inform new anticipatory thought experiments, the design or revision of instructional activities, and sometimes the modification of learning goals.
In service of such an analysis, it is critical in our experience that the researchers are present in the classroom when the design experiment is in progress, and conduct short debriefing sessions with the collaborating teacher immediately after each classroom session in order to develop shared interpretations of what might be going on in the classroom.

We also find it vital to have longer periodic meetings. The focus of these meetings is primarily on the conjectured local instruction theory as a whole. A local instruction theory encompasses both the overall process of learning and the instructional activities that are designed to foster the mental activities that constitute the long-term

65.
A learning design perspective 59process. So we may also observe a process of conjecturing and revising on two levels,on the level of the individual classroom sessions, and on the level of the instructionalsequence as a whole. In addition to the adaptation of the overall learning process duringa design experiment, we may also discern macro design cycles, which span entireexperiments, in the sense that the retrospective analysis of a design experiment can feedforward to inform a subsequent experiment (see Figure 4.4).[ FIGURE 4.4 ABOUT HERE ] From this process emerges a more robust local instructional theory that, wewould add, still is potentially revisable. Data Generation Decisions about the types of data that need to be generated in the course of anexperiment depend on the theoretical intent of the design experiment. These are in asense pragmatic decisions in that the data have to make it possible for the researchers toaddress the issues that were identified as the theoretical intent at the start of the designexperiment. If the design experiment focuses on the development of a local instructiontheory, for instance, it makes sense to video record all classroom sessions, to conductpre- and post-interviews with the students, to make copies of all of the students’ work,and to assemble field notes. In addition, appropriate benchmark assessment items thathave been used by other researchers might be incorporated if they are available. We also find it crucial to audio-record the regular research group meetingsbecause these meetings offer one of the best opportunities to document the learningprocess of the research team. Data generation therefore involves keeping a log of theongoing interpretations, conjectures, decisions, and so forth. The specific foci of a design experiment may require additional types of data. 
To give an illustration, we return to the statistics experiment, which also became a case of cultivating students’ mathematical interests (Cobb & Hodge, 2003). We were therefore interested in how the students perceived their obligations in the classroom and in how they evaluated those obligations. As a consequence, a member of the research team conducted student interviews that focused on these issues while the experiment was in progress. It turned out to be more productive to conduct these interviews with pairs or groups of three students. So, this specific research interest necessitated another form of data collection.

Interpretative Framework(s)

A key element in the ongoing process of experimentation is the interpretation of both the students’ reasoning and learning and the means by which that learning is supported and organized. We contend that it is important to be explicit about how one is going about interpreting what is going on in the classroom.

In (quasi-)experimental research, the relation between the empirical reality and the scientific interpretation is made explicit by operationalizing the variables that are taken into account. Likewise, design researchers have to explicate how they translate observations of events in the classroom into scientific interpretations. The researchers will necessarily employ an interpretive framework to make sense of the complexity and messiness of classroom events both while a design experiment is in progress and when conducting a retrospective analysis of the data generated during an experiment. It is essential in our view that researchers explicate the basic constructs of their interpretive framework if inquiry is to be disciplined and systematic. Key elements of such a (potentially revisable) interpretative framework include (a) a framework for interpreting the evolving classroom learning environment, and (b) a framework for interpreting students’ mathematical reasoning and learning. In the following we will first discuss the framework we use to interpret classroom discourse and communication, and next turn to the domain-specific instruction theory for realistic mathematics education that is used as a conceptual framework for interpreting student learning. In doing so, we clarify that for us socio-constructivism functions as a background theory.
Emergent Perspective

The framework that we currently use for interpreting classroom discourse and communication is the “emergent perspective” (Cobb & Yackel, 1996; Yackel & Cobb, 1996; see Figure 4.5). We mentioned aspects of this framework earlier as examples of an ontological innovation.

[ FIGURE 4.5 ABOUT HERE ]

The framework can be viewed as a response to the issue of attempting to understand mathematical learning as it occurs in the social context of the classroom. With regard to the specifics of the framework, the column headings “Social Perspective” and “Psychological Perspective” involve a focus on the classroom community and on individual students’ reasoning respectively. In the following paragraphs, we first discuss social norms, then socio-mathematical norms, and finally classroom mathematical practices.

Social norms refer to expected ways of acting and explaining that become instantiated through a process of mutual negotiation between the teacher and students. The social norms will differ significantly between classrooms that pursue traditional school mathematics and those that pursue reform mathematics. In traditional mathematics classrooms, the role of the teacher is to explain and evaluate, while the social norms include the obligation of the students to try to figure out what the teacher has in mind, and to act accordingly. Examples of norms for whole-class discussions in reform math classrooms include obligations for the students to explain and justify solutions, to attempt to make sense of explanations given by others, to indicate agreement and disagreement, and to question alternatives in situations where a conflict in interpretations or solutions has become apparent.

The psychological correlate to social norms concerns the teacher’s and students’ individual beliefs about their own and others’ roles. The reflexivity between social norms and individual beliefs is better understood when analyzing the negotiation process of classroom communities. On the one hand, individuals’ beliefs about ways to act contribute to the negotiation of social norms. On the other hand, an individual’s beliefs are enabled and constrained as he or she participates in this negotiation process.
The socio-mathematical norms can be distinguished from social norms as ways of explicating and acting in whole-class discussions that are specific to mathematics. Examples of such socio-mathematical norms include what counts as a different mathematical solution, a sophisticated mathematical solution, an efficient mathematical solution, and an acceptable mathematical explanation and justification. The students’ personal beliefs about what makes a contribution acceptable, different, sophisticated or efficient constitute the psychological correlate of the socio-mathematical norms. Students develop personal ways of judging whether a solution is efficient or different, and these beliefs are mutually negotiated as the classroom microculture is continually being structured. That is, the teacher cannot merely state specific guidelines for what types of solutions are acceptable and expect the guidelines to be understood and enacted by students. Instead, socio-mathematical norms are continually negotiated and redefined as the teacher and students participate in discussions.

The analysis of socio-mathematical norms has proven to be pragmatically significant when conducting design experiments in that it clarifies the process by which teachers may foster the development of intellectual autonomy in their classrooms. To create the opportunity for the students to take over the teacher’s responsibility as validators, socio-mathematical norms have to be in place that enable students to make independent judgments that contribute to the teacher’s instructional agenda.

The last social aspect of the theoretical framework concerns the mathematical practices that are established in the classroom (see also Cobb, Stephan, McClain, & Gravemeijer, 2001). A mathematical practice can be described as the normative ways of acting, communicating and symbolizing mathematically at a given moment in time. In contrast to the socio-mathematical norms that are specific to mathematics, the mathematical practices are specific to particular mathematical ideas or concepts. In addition, mathematical practices necessarily evolve in the course of an experiment whereas socio-mathematical norms tend to be more stable.
An indication that a certain mathematical practice has been established is that explanations pertaining to the particular practice no longer require justification. Individual students’ mathematical interpretations and actions constitute the psychological correlates of the classroom mathematical practices. Their interpretations and the mathematical practices are reflexively related in that students’ mathematical development occurs as they contribute to the constitution of the mathematical practices. Conversely, the evolution of mathematical practices does not occur apart from students’ reorganization of their individual activity.

We may conclude by noting that in the context of a design experiment, a detailed analysis of evolving classroom practices offers a way of describing the actual learning process of the classroom community as a whole. This offers a viable alternative to either implying that all students are learning in unison or attempting to describe the learning processes of each individual student.

RME Theory

When discussing the theoretical intent of design experiments, we noted that ontological innovations, such as interpretative frameworks, serve a dual role, both as lenses for making sense of what is happening in a real-world instructional setting, and as guidelines or heuristics for instructional design. On the one hand, we may observe that although the emergent framework was initially developed to interpret classroom discourse and communication, it also offers guidelines on the classroom culture characteristics that fit the intended learning ecology. On the other hand, it may be observed that the RME theory not only offers design heuristics, but also may function as an interpretative framework for interpreting student activity in terms of learning mathematics.

In the following we elaborate this dual role of RME theory. Given its origin, we focus first on the instructional design perspective.

RME emerged at least in part in resistance to instructional and design approaches that treated mathematics as a ready-made product. Freudenthal (1971, 1973) argued that mathematics should primarily have the character of an activity for the students. A process of guided reinvention then would have to ensure that this mathematical activity would foster the construal of mathematics as a body of knowledge by the students.
This requires the instructional starting points to be experientially real for the students, which means that one has to present the students with problem situations in which they can reason and act in a personally meaningful manner. The objective of guided reinvention is that the mathematics that the students develop will also be experientially real for them. Learning mathematics should ideally be experienced as expanding one’s mathematical reality.

We may further elaborate this point by clarifying the way in which Freudenthal conceives reality: “I prefer to apply the term reality to what common sense experiences as real at a certain stage” (Freudenthal, 1991, p. 17). He goes on to say that reality is to be understood as a mixture of interpretation and sensual experience, which implies that mathematics, too, can become part of a person’s reality. Reality and what a person perceives as common sense are not static but grow, and are affected by the individual’s learning process. The goal of realistic mathematics education then is to support students in creating some new mathematical reality. This is to be realized by guided reinvention or, if we take a student perspective, by “progressive mathematization”. Progressive mathematization refers to a mixture of two forms of mathematizing, horizontal and vertical, which refer respectively to students mathematizing subject matter from reality and mathematizing their own mathematical activity (Treffers, 1987). This latter activity is essential in the constitution of some new mathematical reality, as “The activity on one level is subjected to analysis on the next, the operational matter on one level becomes subject matter on the next level” (Freudenthal, 1971, p. 417). This shift from “activity” to “subject matter” relates to the shift from procedures to objects, which Sfard (1991) observed in the history of mathematics.

If we look at the history of mathematics, we may observe that mathematics emerged from solving problems, or as Freudenthal puts it, from organizing subject matter. According to Freudenthal (1983), mathematical “thought-things”, such as concepts, tools and procedures, are invented to organize certain phenomena. The reinvention heuristic then suggests that the instructional designer should try to find situations that create the need for the students to invent the mathematical thought-things the students are supposed to construct.
To find such situations, the instructional designer should analyze the relation between those mathematical “thought-things” and the phenomena they organize. This phenomenological analysis lays the basis for a didactical phenomenology (ibid.), which also incorporates a discussion of what phenomenological analysis means from an educational perspective. For example, to construct distribution as a mathematical object, students should be confronted with situations where it is reasonable and sensible for them to achieve a goal by organizing phenomena in terms of distributions.

Freudenthal’s level-theory also shaped the RME view on educational models. Instead of ready-made models, RME looks for models that may emerge first as models of situated activity, and then gradually evolve into entities of their own to function as models for more sophisticated mathematical reasoning (Gravemeijer, 1999). According to this “emergent-modeling” heuristic, the model and the new mathematical reality co-evolve; the emergence of the model is reflexively related to the development of some new mathematical reality. The teacher may support this process by supporting a shift in the students’ attention from the context situation that the model refers to, towards the mathematical relations involved. In this manner, the students may develop a network of mathematical relations. Then the model can begin to function as a model for more sophisticated mathematical reasoning, in that the model derives its meaning from this network of mathematical relations. At the same time, relations in this network may themselves become mathematical objects that constitute a new mathematical reality. As a further elucidation, we may note that the term model should not be taken too literally in that it can also concern a model situation, or a model procedure. Moreover, what is taken as “the model” from a more overarching design perspective will be constituted as a series of sub-models in the instructional activities.

As we argued before, the RME domain-specific instruction theory also offers a framework for interpreting student activity in terms of learning mathematics (see also Gravemeijer, 1994). It orients the researcher to focus, for instance, on the various learning processes that might take place, with special attention to the question of whether the students are inventing their own solution procedures or are merely imitating the teacher or some leading students. In such a case, one might look at the variety of students’ solution procedures.
On the basis of the reinvention principle, one would further expect to recognize the reinvention route in the students’ solutions. In addition, one would expect that the students would spontaneously drop back in their collective learning history when they are faced with new problems that represent difficulties for them. If they instead choose informal procedures that do not correspond with the reinvention route that has been followed, this would be an indication that that route is not experienced as a natural reinvention process.

In a similar manner, the researcher may investigate whether the models that are used fit with the informal solution procedures demonstrated by the students: Do the students use similar procedures with the model as they did (or would do) without the model? In other words, the model must not dictate to the students how to proceed, but must be a resource that fits with their thought processes (Gravemeijer, 1993). Along these lines, the RME framework might generate additional points of focus, such as the following:

Do the students rely on their own domain-specific knowledge?
Do the instructional activities provide the expected traction for the students’ informal solution procedures?
Do the solutions that the students develop offer possibilities for vertical mathematization?
Do the students mathematize their own informal mathematical activities?

And so forth. We will not, however, try to be exhaustive here.

We want to close this section on the second phase of the design experiment methodology by presenting a short sketch of the instructional sequence that was developed in the statistics design experiment.

We clarify the setup of the statistics sequence by first describing how the didactical phenomenological analysis plays out in this case. The first step in this analysis was to analyze the notion of distribution as a mathematical (or statistical) thought-thing. This led to the conclusion that distribution can be thought of as a density function, indicating that density can be conceived of as that which is organized by distribution as a thought-thing. Density, as a thought-thing in and of itself, in turn organizes collections of data points in a space of possible data values. This insight can be made concrete as a dot plot, showing data points on an axis (i.e., these data points on an axis can be viewed as thought-things that organize data values). The measures can in turn be thought of as a means for getting a handle on some real-world phenomena; the notion of data creation can also be construed as a form of organizing.
This phenomenological analysis reveals a possible reinvention route in which a cumulative process of organizing would lead the students through the above steps in reverse order. This lays the basis for the following instructional sequence.

The point of departure is a bottom-up approach in which the computer minitools are experienced by the students as sensible tools to use given their current conceptions of analyzing data. So for the students, the primary function of the minitools is to help them structure and describe data sets in order to make a decision or judgment. In this process, notions such as mean, mode, median, skewness, spreadoutness, and relative frequency may emerge as ways of describing how specific data sets are distributed within this space of values. Further, in this approach, various statistical representations or inscriptions may emerge as different ways of structuring distributions. In fact, the minitools are designed so that they can support a process of progressive mathematization by which these conventional statistical tools are reinvented.

At the same time, the activity of structuring data sets by using the minitools fosters a process by which the students come to view data sets as entities that are distributed within a space of possible values. The intent is to support a process in which the means of symbolizing and the meaning of what these symbolizations signify for the students co-evolve, similar to that which Meira (1995) describes when he speaks of a “dialectical relation between notations-in-use and mathematical sense making” (p. 270).

The backbone of the sequence consists of a series of inscriptions that are embedded in the computer tools. The idea is that the activities with the computer tools succeed each other in such a manner that the activity with the newer tool is experienced as a natural extension of the activity with the earlier tool. The starting point is in the measures, or magnitudes, that constitute a data set. With the first minitool, magnitude-value bars (Figure 4.2) are introduced where each value bar signifies a single measure. (Initially, the measures under investigation are of a linear type, like “length” and “time”.
Later, this is generalized to other types of measures.) We conjectured that as a consequence of participating in discussions about various data sets represented by value bars, the students would begin to focus on the end points of the value bars. As a consequence, these end points come to signify the corresponding value bars. This allows for the introduction of a line plot as a more condensed inscription that omits the value bars and preserves only the end points (Figure 4.6). The second minitool offers students a range of options for structuring data sets represented as line plots that include creating equal intervals, creating two equal groups, and creating four equal groups of data points. We conjectured that as a result of analyzing data sets by using these options, the students would begin to reason about data in terms of density, and come to see the shape of the line plot as signifying the distribution of data values in terms of density.

In retrospect, we may recognize the emergent-models design heuristic with “a graphical representation of the shape of a distribution” as the overarching model. This overarching model is instantiated by various sub-models that change over time. The graph was initially introduced in an informal manner, as a way of inscribing a set of measures by representing each measure by a bar (Figure 4.2). We can see this as a pre-stage of the model, where the set of measures is still very much tied to the situation. Nonetheless, from a statistical perspective, the shape of the distribution is visible in the way the end points are distributed with regard to the axis. In this phase, we can speak of the graphical representation as a model of a set of measures. Next we introduced activities that were designed to draw the students’ attention to the distribution of the end points of the bars. This supported the introduction of the line plot, where the second minitool was used to structure data sets in various ways to answer the questions at hand. Analyses that involved structuring the data into four equal groups with the corresponding tool option (which anticipates the box plot) were particularly important in drawing the students’ attention to the distribution of density. This then supported a gradual shift from seeing the graph as signifying a set of measures to seeing it as signifying a distribution. Once this latter shift had occurred, the graph could be used to reason about distributions.
Students could, for instance, discern various types of distributions (with the normal distribution as one of them), and could reason about characteristics of (univariate) distributions, like skewness (Figure 4.6). The model had then become a model for reasoning about distributions.

[ FIGURE 4.6 ABOUT HERE ]

PHASE THREE, THE RETROSPECTIVE ANALYSIS

Thus far, we have discussed the planning of a design experiment and the ongoing experimentation in the classroom that is central to the methodology. A further aspect of the methodology concerns the retrospective analyses that are conducted of the entire data set collected during the experiment. The goal of the retrospective analyses will of course depend on the theoretical intent of the design experiment. However, one of the primary aims is typically to contribute to the development of a local instruction theory. Other goals may concern more encompassing issues, or ontological innovations. Although differences in theoretical objectives are reflected in differences in the retrospective analyses, the form of the analysis will necessarily involve an iterative process of analyzing the entire data set. We will, therefore, first describe the retrospective analyses in general, then discuss analyses to develop a local instruction theory, and next analyses conducted to address more general research topics.

The data sets typically include (but are not limited to) video-recordings of all classroom lessons, video-recorded individual interviews conducted with all students before and after the experiment to assess their mathematical learning, copies of all the students’ written work, field notes, and audio-recordings of both the daily debriefing sessions and weekly project meetings. The challenge then is to analyze this comprehensive data set systematically while simultaneously documenting the grounds for particular inferences. Claims will be based on a retrospective, systematic and thorough analysis of the entire data set collected during the experiment. To ascertain the credibility of the analysis, all phases of the analysis process have to be documented, including the refining and refuting of conjectures.
Final claims and assertions can then be justified by backtracking through the various levels of the analysis, if necessary to the original video-recordings and transcripts. It is this documentation of the research team’s learning process that provides an empirical grounding for the analysis. Further, it provides a means of differentiating systematic analyses in which sample episodes are used to illustrate general assertions from questionable analyses in which a few possibly atypical episodes are used to support unsubstantiated claims. Additional criteria that enhance the trustworthiness of an analysis include both the extent to which it has been critiqued by other researchers who do not have a stake in the success of the experiment and the extent to which it derives from a prolonged engagement with students and teachers (Taylor & Bogdan, 1984). This latter criterion is typically satisfied in the case of classroom design experiments and constitutes a strength of the methodology.

The specific approach we use is a variant of Glaser and Strauss’s (1967) constant comparative method (see also Cobb & Whitenack, 1996). We first work through the data chronologically, episode by episode, and at each point we test our current conjectures against the next episode. For example, one of the key criteria when we claim that a particular norm of argumentation has been established is that a student who appears to violate that norm will be challenged. If we find instances where such challenges do not occur, we either have to revise our conjecture about the norms of argumentation that have been established, or we have to substantiate the argument that the norms have evolved.

As a result of this first round of data analysis, we end up with a sequence of conjectures and refutations that are tied to specific episodes. In the second phase of a retrospective analysis, this sequence of conjectures and refutations in effect becomes the data. It is while “meta-analyzing” these episode-specific conjectures, confirmations and refutations that particular episodes come to be seen as pivotal. And they are pivotal in the context of the analysis, because they allow us to decide between two or more competing conjectures. These are the episodes that are typically included in research reports.

As an illustration, we present some typical episodes from the statistics design experiment. We already described the battery lifespan problem in which the data were represented as magnitude bars in the first computer tool. The students first worked on this problem in groups, and then the teacher initiated a whole-class discussion of the students’ analyses.
The computer tool was projected on an overhead screen, the data were sorted by size, and the so-called “range tool” option was used to highlight the ten highest data values (see Figure 4.7).

[ FIGURE 4.7 ABOUT HERE ]
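As an illustrative aside, the sorting and highlighting just described can be sketched in a few lines of Python. The lifespans below are invented for illustration and are not the data used in the experiment; the function name `top_n_by_brand` is likewise hypothetical and does not correspond to anything in the minitool itself.

```python
from collections import Counter

# Hypothetical lifespans in hours, invented for illustration only.
always_ready = [45, 58, 70, 106, 108, 110, 112, 114, 118, 121]
tough_cell = [82, 85, 88, 91, 94, 97, 100, 107, 111, 116]


def top_n_by_brand(data, n=10):
    """Sort (brand, hours) pairs by hours and count brands among the n highest."""
    ranked = sorted(data, key=lambda pair: pair[1], reverse=True)
    return Counter(brand for brand, _ in ranked[:n])


data = [("Always Ready", hours) for hours in always_ready]
data += [("Tough Cell", hours) for hours in tough_cell]

# Count how many batteries of each brand fall within the highlighted range.
counts = top_n_by_brand(data)
print(counts)
```

A count of this kind summarizes only the highlighted upper range; it says nothing about how the remaining values of each brand are spread out.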

One of the students, Casey, argued that the green batteries were better because seven of the top ten were green (Always Ready), and her argument was supported by another student.

Janice: She’s saying that out of ten of the batteries that lasted the longest, seven of them are green, and that’s the most number, so the Always Ready batteries are better because more of those batteries lasted longer.

However, this argument was challenged by another student, James, who argued that four of the pink bars (Tough Cell) were “almost in that area and then if you put all those in you would have seven (rather than three pinks).” Later in the discussion, Brad asked for the value tool (the single vertical line) to be placed at 80, in order to substantiate his claim that the Tough Cell brand is better.

Brad: See, there’s still green ones (Always Ready) behind 80, but all of the Tough Cell is above 80. I would rather have a consistent battery that I know will get me over 80 hours than one that you just try to guess.

One of the issues of interest in this episode is the use of the word “consistent”, which the students introduced as an informal way of describing the extent to which data sets were bunched up or spread out. This episode also proved to be pivotal in documenting that a norm of argumentation was being established, namely that students were obliged to explain why the way in which they had partitioned or organized or structured the data gave insight into the problem or issue under investigation. We were able to demonstrate that this norm remained stable throughout the experiment.

A second illustrative episode concerns a comparison of two sets of data that showed the speeds of cars before and after a campaign against speeding (Figure 4.8).

[ FIGURE 4.8 ABOUT HERE ]

In this case, one of the students had focused on the shape of the data sets to compare how they were distributed.

Janice: If you look at the graphs and look at them like hills, then for the before group the speeds are spread out and more than 55, and if you look at the after graph, then more people are bunched up close to the speed limit which means that the majority of the people slowed down close to the speed limit.

What is of interest here is that this student did not use the word “hill” to refer to the figural image, but instead used it as a metaphor to describe the distribution of the density of the data (“bunched up, close”) as giving her insight into the effectiveness of the campaign against speeding. The students continued to use this metaphor throughout the design experiment to indicate that the “majority” of the data points were “bunched-up.” In a follow-up experiment, we found that the students could even identify where the hill was in the value-bar representation of the first computer minitool (Bakker, 2004), which underscores the metaphorical character of this term.

As a third example we may describe an episode in which the students had to compare data on T-cell counts for two different treatments for AIDS patients, an experimental treatment with 46 patients, and a standard treatment with 186, where the goal is to raise the patients’ T-cell counts. Various groups of students analyzed these data in a range of different ways. One group of students identified the intervals where the “hill” was located in each data set, where the data were bunched up. On this basis they argued that the new, experimental treatment was effective, because the “hill” was in a higher interval than the hill in the standard treatment data. Another group of students had used the four-equal-groups option (Figure 4.9).

[ FIGURE 4.9 ABOUT HERE ]
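The four-equal-groups option can be sketched as follows: the sorted data are cut into four groups containing equal numbers of data points, and each group is summarized by the interval it spans. This is only a minimal sketch under stated assumptions. The T-cell counts below are invented and are not the counts from the experiment, `four_equal_groups` is a hypothetical name, and how the minitool handles data sets whose size is not a multiple of four is not described in the text.

```python
def four_equal_groups(values):
    """Cut the sorted values into four groups of (nearly) equal size."""
    ordered = sorted(values)
    n = len(ordered)
    # Integer cut positions at 0, n/4, n/2, 3n/4 and n.
    cuts = [i * n // 4 for i in range(5)]
    return [ordered[cuts[i]:cuts[i + 1]] for i in range(4)]


# Hypothetical T-cell counts, invented for illustration only.
t_cell_counts = [310, 420, 480, 530, 560, 590, 610, 640, 670, 700, 760, 820]

groups = four_equal_groups(t_cell_counts)
# Each group is summarized by the interval it spans.
intervals = [(group[0], group[-1]) for group in groups]
for low, high in intervals:
    print(f"{low} to {high}: width {high - low}")
```

Because every group holds the same number of data points, a narrow interval indicates where the data are bunched up, which is how this option can draw attention to density.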

79.
A learning design perspective 73 This is a precursor of the box plot in that each interval contains 25% of the data. They had used another available option to hide the dots. Their argument was: the new treatment is better because 75% of the data is above 550, whereas in the traditional treatment 75% is below. Note that we could picture the shape of the hill in this representation, if we knew this was a uni- modal distribution. We may briefly show why this notion of shape of a univariate distribution became important for analyzing bi-variate data in a subsequent design experiment conducted the following school year with some of the same students. In this follow-up experiment, we asked the students to compare, for instance, data on years of education, against salary levels for men, and women. The students analyzed data of this type by using a third computer minitool (Figure 4.10, left). One of the tool options was similar to the four- equal-groups option, rotated 90 degrees (Figure 4.10, right).[ FIGURE 4.10 ABOUT HERE ] Here in doing so, the students typically talked about where the “hill” was located or where the “clutter” was in the data. As the students discovered, the ranges were similar for the men’s and women’s salary levels. The big difference was that the data for females was skewed much more heavily towards the bottom end of the distribution for each level of education. As this example clarifies, analyzing bi-variate data is not so much about drawing a line through a cloud of dots, but about investigating how the distribution of the dependent variable changes as the independent variable changes. Reconstructing the Local Instruction Theory One of the primary aims of a retrospective analysis is to support the constitutionof a revised local instruction theory. However, it is important to emphasize that the

80.
results of design experiments cannot be linked to pre- and posttest results in the same direct manner as is common in standard formative evaluation because the proposed local instruction theory and prototypical instructional sequence will differ from those that are tried out in the classroom. Because of the testing and revising of conjectures while the experiment is in progress, a revised, potentially optimal instructional sequence has to be discerned by conducting a retrospective analysis. It does not make sense, for example, to include instructional activities that did not fulfill their expectations, but the fact that these activities were enacted in the experiment will nonetheless have affected the students’ learning. Adaptations will therefore have to be made when the non-functional or less functional activities are left out. Consequently, the instructional sequence will be put together by focusing on and reconstructing the instructional activities that proved to constitute the effective elements of the sequence. This reconstruction of an optimal sequence will be based on the observations and inferences made during the design experiment, complemented by the insights gained by conducting retrospective analyses. In this manner, it can be claimed that the results of a design experiment are empirically grounded.

As a point of clarification, although the constitution of a revised local instruction theory is primarily a reconstruction activity, the retrospective analysis may spark design ideas that go beyond those that were tried out in the classroom. These insights might in turn create the need for a new experiment, starting with a new conjectured local instruction theory. Here, the cyclic nature of the methodology that we noted at the level of instructional design micro-cycles reappears at a broader level. An entire design experiment and the subsequent retrospective analysis together constitute a larger, macro-cycle of design and analysis (Figure 4.4).
In this cycle, the conjectures and assumptions formulated at the outset when planning a design experiment are scrutinized in the retrospective analysis. An example of such an analysis can be found in Cobb, Gravemeijer, Yackel, McClain, and Whitenack (1997). Here, the retrospective analysis indicated that several key assumptions that underpinned an instructional sequence were ill founded. As a consequence, the instructional sequence was radically revised and a further design experiment was conducted. An extensive report of this largely successful follow-up experiment can be found in Stephan, Bowers, Cobb, and Gravemeijer (2003).

81.
Encompassing Issues and Ontological Innovations

In addition to retrospective analyses that directly aim at the reconstruction and revision of a local instructional theory, a retrospective analysis might be conducted to place classroom events in a broader context by framing them as instances of more encompassing issues. Earlier, we mentioned as examples analyses that focus on the role of the teacher, the teacher’s learning, the role of semiotic processes, or on the process of cultivating the students’ mathematical interests. In addition we mentioned ontological innovations, which might include issues such as the interpretative framework for interpreting classroom discourse and communication, meta-representational competence, quantitative reasoning or emergent modeling.

In such cases, the aim of the analysis is to frame events that occurred in the design experiment classroom as instances, or paradigm cases, of a broader class of phenomena. The goal is to come to understand (the role of) the specific characteristics of the investigated learning ecology in order to develop theoretical tools that make it possible to come to grips with the same phenomenon in other learning ecologies. Data analysis that aims at understanding a paradigm case differs significantly from data analyses that aim at establishing causal relations within a regularity conception of causality. Claims are not based on statistical analysis, but on a systematic and thorough analysis of the data set.

Virtual Replicability

Metaphorically speaking, the course of a design experiment can be characterized in terms of the learning process of the research team. We would argue that this learning process has to justify the products of the research project.
This characterization is especially fitting for the construal of the local instruction theory, which encompasses two processes: (a) the learning process that is inherent to the cyclic process of (re)designing and testing instructional activities and other aspects of the initial design, and (b) the retrospective analysis that scrutinizes, and builds on, this primary process, and looks for patterns that may explain the progress of the students. In relation to this learning process, we can refer to the methodological norm of “trackability” that is used as a criterion in ethnographic research. Smaling (1990, 1992) connects trackability with

82.
the well-known criterion of “reliability.” He notes that reliability refers to the absence of accidental errors and is often defined as reproducibility. He goes on to say that for qualitative research this means virtual replicability. Here the emphasis is on virtual. It is important that the research is reported in such a manner that it can be retraced, or virtually replicated, by other researchers. This ethnographic norm of trackability fits with Freudenthal’s conception of developmental or design research:

Developmental research means: “experiencing the cyclic process of development and research so consciously, and reporting on it so candidly that it justifies itself, and this experience can be transmitted to others to become like their own experience.” (Freudenthal, 1991, p. 161)

Likewise, Smaling (1990, p. 6) states that trackability can be established by reporting on “failures and successes, on the procedures followed, on the conceptual framework and on the reasons for the choices made.” Note that this norm of trackability does not necessarily require that everyone has to subscribe to the conclusions of the researchers. Eventually, outsiders who have virtually replicated the learning process of the researchers may interpret their experiences differently or come to different conclusions on the same experiential basis. The power of this approach is that it creates an experiential basis for discussion.

Ecological Validity

A central assumption that underpins our work is that instructional innovations developed in the course of a design research experiment can be used productively to support students’ learning in other classrooms. However, as we know only too well, the history of research in education in general, and in mathematics education in particular, is replete with more than its share of disparate and often irreconcilable findings.
A primary source of difficulty is that the independent variables of traditional experimental research are often relatively superficial and have little to do with either context or meaning. As a consequence, it has frequently been impossible to account for the differences in findings when different groups of students receive supposedly the same instructional treatment.

83.
In contrast to traditional experimental research, the challenge when conducting design experiments is not that of replicating instructional innovations by ensuring that they are realized in precisely the same way in different classrooms. The conception of teachers as professionals who continually adjust their plans on the basis of ongoing assessments of their students’ mathematical understanding in fact suggests that complete replicability is neither desirable nor, perhaps, possible (cf. Ball, 1993; Simon, 1995). Design research aims for ecological validity, that is to say, (the description of) the results should provide a basis for adaptation to other situations. The premise is that an empirically grounded theory of how the intervention works accommodates this requirement. Therefore, one of the primary aims of this type of research is not to develop the instructional sequence as such, but to support the constitution of an empirically grounded local instruction theory that underpins that instructional sequence. The intent is to develop a local instruction theory that can function as a frame of reference for teachers who want to adapt the corresponding instructional sequence to their own classrooms and their personal objectives. One element that can be helpful in this respect is offering what is called a “thick description” of what happened in the design experiment. By describing details of the participating students, of the teaching-learning process, and so forth, together with an analysis of how these elements may have influenced the whole process, outsiders will have a basis for deliberating adjustments to other situations. Conversely, feedback from teachers on how the instructional sequence was adjusted to accommodate various classrooms can strengthen the ecological validity significantly. We therefore find it critical to have repeated trials in a variety of settings.
In the case of the statistics sequence, for example, we worked with middle school students, with “at risk” high school students, prospective elementary teachers, and practicing teachers, and there have also been follow-up groups, including a series of design experiments by Arthur Bakker (2004) in the Netherlands. We have been surprised by the extent to which we have been able to document regularities in the development of the participants’ thinking across these various settings. That is to say, there is diversity in how a group of participants reasoned at any point in time. But we were able to predict with some confidence the primary types of analyses or forms of

84.
reasoning within a group at any point in the experiment. We think that is useful knowledge from a teacher’s point of view in that it enables teachers to anticipate the types of reasoning that they can build on or work with.

Developing Domain Specific Instruction Theories

Design research provides a means of developing local instruction theories that can serve as support for teachers who adapt instructional sequences as part of their teaching practice. In addition, design research also contributes to the development of a domain-specific instruction theory, in our case the RME theory. This theory emerges in an iterative, cumulative process that embraces a series of design research projects. In this regard, we can speak of theory development at various levels:

At the level of the instructional activities (micro theories)
At the level of the instructional sequence (local instruction theories)
At the level of the domain-specific instruction theory.

The relations between these levels can be clarified by drawing on the distinction that Kessels and Korthagen (1996) make between “episteme” and “phronesis.” Following Aristotle, they use the Greek word episteme to refer to scientific knowledge, and the word phronesis to refer to “practical wisdom.” They argue that the incompatibility of the products of scientific research with the needs of teachers can be traced to the contrast between these two realms. Teachers rely on practical wisdom, which they share with one another in the form of narratives. They experience scientific knowledge that is produced by research as too abstract and too general to directly inform their practice (see also Hiebert & Stigler, 1999). In this respect, we would argue that design research has the potential to bridge the gap between theory and practice, as domain-specific instruction theory can be categorized as episteme and micro-didactical theories as phronesis.
In design research, scientific knowledge is grounded in practical wisdom while simultaneously providing heuristics that can strengthen the practical wisdom.

Developing Ways of Analyzing Innovations

A related challenge is that of developing ways of analyzing innovations that make their realization in different classrooms commensurable. An analysis of classroom

85.
events structured in terms of constructs such as social norms, socio-mathematical norms, and classroom mathematical practices serves to relate the students’ mathematical learning in a particular classroom to their participation in sequences of instructional activities as they were realized in that classroom. As we noted earlier, classroom social norms and socio-math norms can make a profound difference in the nature and the quality of the students’ mathematical reasoning.

This part of the retrospective analysis raises its own methodological issues. A theoretical analysis is the result of a complex, purposeful problem-solving process. One would therefore not expect that different researchers would necessarily develop identical theoretical constructs when analyzing the same set of design experiment data. This implies that the notion of replicability is not relevant in this context. Following Atkinson, Delamont, and Hammersley (1988), we suggest that the relevant criteria are instead those of the generalizability and the trustworthiness of the constructs developed.

We touched on the issue of generalizability when discussing the importance of viewing classroom events as paradigm cases of more encompassing issues. It is this framing of classroom activities and events as exemplars or prototypes that gives rise to generalizability. This, of course, is not generalization in the sense that the characteristics of particular cases are ignored and they are treated as interchangeable instances of the set to which assertions are claimed to apply. Instead, the theoretical analysis developed when coming to understand one case is deemed to be relevant when interpreting other cases. Thus, what is generalized is a way of interpreting and understanding specific cases that preserves their individual characteristics.
For example, we conjectured that much of what we learned when investigating symbolizing and modeling in a first-grade design experiment that focused on arithmetical reasoning would inform analyses of other students’ mathematical learning in a wide range of classroom situations including those that involve the intensive use of technology. This in fact proved to be the case in recently completed sequences of design experiments that focused on the students’ development of statistical reasoning (Cobb, 1999; Cobb, McClain, & Gravemeijer, 2003). It is this quest for generalizability that distinguishes analyses whose primary goal is to assess a particular instructional innovation from those whose goal is the development of theory that can feed forward to guide future research and instructional design.

86.
Whereas generalizability is closely associated with the notion of a paradigm case, trustworthiness is concerned with the reasonableness and justifiability of inferences and assertions. This notion of trustworthiness acknowledges that a range of plausible analyses might be made of a given data set for a variety of different purposes. The issue at hand is that of the credibility of an analysis. As we have indicated, the most important consideration in this regard is the extent to which the analysis of the longitudinal data set is both systematic and thorough.

DESIGN AND RESEARCH

Although our emphasis in the above paragraphs has been on ways of justifying the results of design experiments, we do not want to lose sight of the fact that design research is about researching and designing. We have discussed issues such as validity and trustworthiness at some length because much of the current debate about design research has focused on justification.5 However, the design aspect of the methodology is equally important. Design research presupposes that there is an adequately grounded basis for designing the innovative learning ecology/instructional sequence. The description “learning ecology” introduced by Cobb, Confrey, diSessa, Lehrer, and Schauble (2003) might be more adequate as it accentuates that we are dealing with a complex, interacting system involving multiple elements of different types and levels, a system that is shaped by designing these elements and by anticipating how these elements function together to support learning. Taking into account the complexity of a learning ecology, this implies the need for a very broad framework. The research of Doerr and Zangor (2000) serves to illustrate the complexity of a learning ecology.
The authors found that productive use of graphic calculators requires coherence between the following elements of a learning ecology:

the beliefs of the teacher,
the ability of the teacher to work with the graphic calculator,
the classroom culture (social norms and socio-math norms), and social practices,
the design of the instructional sequence,
the characteristics of the instructional tasks,

5 As seemed to be the case at the symposium Design-Based Research: Grounding a New Methodology, at the AERA 2004.

87.
the manner in which the graphic calculator is construed as a tool, and last but not least,
the pedagogical-didactical skills of the teacher in making this whole system work.

In light of this list, it can be argued that the theoretical base for the design should incorporate general background theories such as socio-constructivism or socio-cultural theory, domain-specific theory, and theories on specific elements of the learning ecology, such as theories on tool use. In addition to this, the research team should be well informed about the state-of-the-art professional knowledge of the domain under consideration.

ACKNOWLEDGEMENT

The analysis reported in this paper was supported by the National Science Foundation under grant No. REC 0231037. The opinions expressed do not necessarily reflect the view of the Foundation.

REFERENCES

Akker, J. van den. (1999). Principles and methods of development research. In J. van den Akker, R. M. Branch, K. Gustafson, N. Nieveen, & T. Plomp (Eds.), Design approaches and tools in education and training (pp. 1-14). Boston: Kluwer Academic Publishers.
Atkinson, P., Delamont, S., & Hammersley, M. (1988). Qualitative research traditions: A British response to Jacob. Review of Educational Research, 58, 231-250.
Ball, D. (1993). With an eye on the mathematical horizon: Dilemmas of teaching elementary school mathematics. Elementary School Journal, 93, 373-397.
Bakker, A. (2004). Design research in statistics education: On symbolizing and computer tools. Utrecht: CDβ-Press.
Brown, A. L. (1992). Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. The Journal of the Learning

102.
DESIGN RESEARCH FROM A TECHNOLOGY PERSPECTIVE

Thomas Reeves

The effectiveness of the field known as educational technology to fundamentally enhance teaching and learning has increasingly been called into question, as has the efficacy of educational research in general. Doubts about educational technology research stem primarily from decades of an arguably flawed research agenda that has been both pseudoscientific and socially irresponsible. It is proposed that progress in improving teaching and learning through technology may be accomplished using design research as an alternative model for inquiry. Design research protocols require intensive and long-term collaboration involving researchers and practitioners. Design research integrates the development of solutions to practical problems in learning environments with the identification of reusable design principles. Examples of design research endeavors in educational technology are described in this chapter. The chapter ends with a call for action in the educational technology research community to adopt design research methods more widely.

INTRODUCTION

Educational technology as a field of study and area of practice emerged in the wake of World War II, although its roots are sometimes traced as far back as the 1920s (Saettler, 1990). (For purposes of this chapter, educational technology and instructional technology are considered synonymous, although subtle distinctions exist in the literature [Reiser & Ely, 1997].) Originally built on the foundations of the behaviorist

103.
theories of Watson, Skinner, Hull, and others, the earliest innovations created by educational technologists and their collaborators, such as educational films, programmed instruction, and instructional television, were viewed by their creators and early adopters as having enormous potential to improve education. As the foundations of the underlying learning theories changed from behaviorism to cognitive learning theory and eventually social constructivism, and new technologies emerged such as computer-assisted instruction and web-based learning environments, ever more optimistic promises were made about the capacity of educational technology to improve education across all levels in diverse contexts. However, thousands of individual research studies and large-scale meta-analyses of these studies (e.g., Bernard et al., 2004; Dillon & Gabbard, 1998; Fabos & Young, 1999) have clearly demonstrated that educational technology has not even begun to reach its widely promoted potential, and in recent years, skepticism about the effectiveness of this field has steadily increased (e.g., Cuban, 2001; Noble, 2000; Oppenheimer, 2003).

The perceived failure of educational technology cannot be isolated from the perception in some quarters that educational research and development as a whole has been a failed enterprise, at least in the United States of America. The U.S. Department of Education under the current federal administration of President George W. Bush has shown its disdain for past educational research and development by agreeing with the politically conservative Coalition for Evidence-Based Policy (2002, p. 1) that “over the past 30 years the United States has made almost no progress in raising the achievement of elementary and secondary school students…despite a 90 percent increase in real public spending per student.” But such attacks do not arise solely from those who are politically motivated. The cover story of the August 6, 1999 edition of The Chronicle of

104.
Higher Education, the weekly newspaper of record for North American academics, decried “The Failure of Educational Research” (Miller, 1999). The article claimed that educational researchers waste the vast resources spent on educational research, employ weak research methods, report findings in inaccessible language, and issue findings that are more often contradictory than not.

Indeed, the educational research community has often been its own worst enemy as a result of focusing more on establishing the legitimacy of one educational research tradition over another (such as the long-term struggle among the adherents of quantitative, qualitative, and critical methodological paradigms) rather than on improving education per se. As just one example of this infighting, Lagemann (2000) argued that, in a misguided effort to be recognized as being truly “scientific,” educational researchers have turned away from the pragmatic vision of John Dewey, whereas Egan (2002) contended that the progressive ideas of Dewey and others are largely responsible for the general ineffectiveness of schools in North America.

Contentiousness within the educational research and development community is most obvious today with respect to arguments about the value and feasibility of randomized controlled trials (RCTs), as used in medical research, as an approach capable of guiding progress in education. Slavin (2002), among others, asserted that the remarkable progress evident in the last hundred years of medical practice could be achieved in education if only educational researchers would adopt that same randomized experimental trials approach to revealing “what works.” Slavin (2002) optimistically proclaimed that “Once we have dozens or hundreds of randomized or carefully matched experiments going on each year on all aspects of educational practice, we will begin to make steady, irreversible progress” (p. 19). Slavin failed to

105.
acknowledge sufficiently the frequent failures of medical research. For example, Ioannidis (2005) found that one-third of the most frequently cited clinical research studies published in three prestigious medical journals (JAMA, New England Journal of Medicine, and Lancet) between 1990 and 2003 reported positive findings that were contradicted by later research or found to have exaggerated effects.

In response to Slavin (2002) and other proponents of RCTs in education, Olson (2004) argued that double-blind experiments, although feasible in medicine, are impossible in education. He further questioned the viability of RCTs in education on the basis that implementation variance in educational contexts reduces treatment differences, causal agents are underspecified in education, and the goals, beliefs, and intentions of students and teachers affect treatments to an extent much greater than the beliefs of patients affect pharmaceuticals and other medical treatments. Chatterji (2004) maintained that the emphasis on RCTs and the establishment of a “What Works Clearinghouse” “ignore the critical realities about social, organizational, and policy environments in which educational programs and interventions reside.” She advocated “decision-oriented” evaluation research over “conclusion-oriented” academic research, and recommended extended-term mixed-method (ETMM) designs as a viable alternative.

Despite these and other criticisms, the U.S. Department of Education and related federal agencies such as the National Science Foundation have apparently assumed the primacy of RCTs, as evidenced by the fact that their most recent funding requirements mandate the use of the “scientific” methods advocated by Slavin (2002) and others. The American Evaluation Association (2003) is just one of several professional organizations that have taken issue with this direction at the U.S. federal level,

106.
concluding that the priority given to randomized trials:

manifests fundamental misunderstandings about (1) the types of studies capable of determining causality, (2) the methods capable of achieving scientific rigor, and (3) the types of studies that support policy and program decisions. We would like to help avoid the political, ethical, and financial disaster that could well attend implementation of the proposed priority.

Perhaps Slavin (2002), Feuer, Towne, and Shavelson (2002), and others who are promoting the expenditure of millions of dollars on RCTs in education should consider the conclusions drawn about educational research by the renowned philosopher of science, Thomas Kuhn (as quoted in Glass & Moore, 1989, p. 1) who said:

“I’m not sure that there can now be such a thing as really productive educational research. It is not clear that one yet has the conceptual research categories, research tools, and properly selected problems that will lead to increased understanding of the educational process. There is a general assumption that if you’ve got a big problem, the way to solve it is by the application of science. All you have to do is call on the right people and put enough money in and in a matter of a few years, you will have it. But it doesn’t work that way, and it never will.”

If the proponents of RCTs discount Kuhn, perhaps they will find the later work of Lee Cronbach (1980, 1982), one of the most eminent educational researchers of the last half of the 20th century, more credible. After decades of experimental research, Cronbach came to the conclusion that we could not pile up generalizations fast enough from numerous small-scale studies to have any meaningful leverage to apply the results

107.
in specific classrooms at a specific time. Simply put, Cronbach (1975) cautioned that “when we give proper weight to local conditions, any generalization is a working hypothesis, not a conclusion” (p. 125).

In light of the conclusions of experts such as Kuhn and Cronbach, it is frustrating to see renewed enthusiasm for RCTs and other forms of experimental research in education. Educational researchers appear to be unable to learn from their past history of inconsequential impact on practice. As Labaree (1998) lamented:

“One last problem that the form of educational knowledge poses for those who seek to produce it is that it often leaves them feeling as though they are perpetually struggling to move ahead, but getting nowhere. If Sisyphus were a scholar, his field would be education. At the end of long and distinguished careers, senior educational researchers are likely to find that they are still working on the same questions that confronted them at the beginning. And the new generation of researchers they have trained will be taking up these questions as well, reconstructing the very foundation of the field over which their mentors labored during their entire careers.” (p. 9).

THE STATE OF EDUCATIONAL TECHNOLOGY RESEARCH

Given the difficulty of conducting any sort of educational research (Berliner, 2002), it should not surprise anyone that educational technology research has yielded as dismal a record as other areas of educational inquiry. The reality of educational

108.
technology research is that isolated researchers primarily conduct “one-off” quasi-experimental studies rarely linked to a robust research agenda, much less concerned with any relationship to practice. These studies, often focused on alternative media treatments (e.g., online versus face-to-face instruction) or difficult-to-measure individual differences (e.g., self-regulated learning), are initially presented as papers at conferences attended by educational technology researchers, and eventually published in academic journals that few people read. As in many other contexts for educational inquiry, educational technology research has been plagued by a history of “no significant differences,” and even the most thorough meta-analyses of the quasi-experimental research studies conducted by educational technologists yield effect sizes that are extremely modest at best (e.g., Bernard et al., 2004; Dillon & Gabbard, 1998; Fabos & Young, 1999).

Reeves (1995) reviewed five years of the research papers published in two of the premier educational research journals of that time (Educational Technology Research and Development and the Journal of Computer-Based Instruction), and found that the majority of the published studies had predictive goals of testing hypotheses derived from theory or comparing one medium for instructional delivery with another. Despite the fact that these journals were refereed, Reeves found that most of these studies used flawed quasi-experimental designs and/or weak quantitative measures of the primary variables related to achievement, attitudes, or other outcomes. One result of the generally pseudoscientific nature of much of the published literature in educational technology is that when other researchers conduct meta-analyses of these studies, they often find that they must reject upwards of 75 percent of the published literature for a variety of failings (Bernard et al., 2004; Dillon & Gabbard, 1998; Fabos & Young,

109.
1999). Among the weaknesses in the ubiquitous media comparison studies are specification error, lack of linkage to theoretical foundations, inadequate literature reviews, poor treatment implementation, major measurement flaws, inconsequential learning outcomes for research participants, inadequate sample sizes, inaccurate statistical analyses, and meaningless discussions of results (Reeves, 1993).

Bernard et al. (2004) conducted a comprehensive meta-analysis of empirical comparisons of distance education courses with face-to-face instruction courses between 1985 and 2002. Although they found over 1,000 comparison studies in the research literature, the majority of the studies did not meet their criteria for inclusion in the meta-analysis. Using the reduced set of papers focused on measures of student achievement, Bernard et al. detected a very small, but statistically significant, positive mean effect size for interactive distance education in comparison to traditional classroom instruction. Further analysis indicated that synchronous communication and two-way audio and video were among the conditions that contributed to effective interactive distance education. While this meta-analysis is one of the best of its kind, its findings, as well as those derived from other related meta-analyses (Cavanaugh, 2001; Machtmes & Asher, 2000), fall far short with respect to specifying design guidelines for practitioners.

The kind of media comparison research synthesized in most meta-analyses has a long and dubious history in educational technology (Clark, 1983, 2001). Saettler (1990) found evidence of experimental comparisons of educational films with classroom instruction as far back as the 1920s, and comparative research designs have been applied to every new educational technology since then. As evidenced by the 1,000-plus quasi-experimental studies of distance education versus traditional methods examined

110.
by Bernard et al. (2004), this ultimately futile approach to educational research is well entrenched in the minds and work habits of educational technology researchers. Indeed, despite frequent admonitions against it, media comparison studies continue to be published in one guise or another (e.g., Koory, 2003; MacDonald & Bartlett, 2000; Scheetz & Gunter, 2004; Summers, Waigandt, & Whittaker, 2005). No significant differences in learning have been the most consistent result. Educational technology researchers would do well to heed Sir John Daniel (2002, p. x), who wrote:

“…the futile tradition of comparing test performances of students using new learning technologies with those who study in more conventional ways…is a pointless endeavor because any teaching and learning system, old or new, is a complex reality. Comparing the impact of changes to small parts of the system is unlikely to reveal much effect and indeed, ‘no significant difference’ is the usual result of such research.”

In the face of this legacy of ill-conceived and poorly conducted research that results in no significant differences or, at best, modest effect sizes, even journalists can build a strong case against the endeavors of educational technologists and others to promote the use of technology in education. For one, Todd Oppenheimer (2003) wrote:

“Our [American] desperation for objective information [is] illustrated nowhere more gorgeously than in the field of education. I am speaking of our tendency to promote any new concept by invoking volumes of quantitative ‘research’ that ostensibly proves its value…technology advocates have played it expertly when it comes to claims about what computers will do for student achievement. As it turns out, the vast bulk of their research is surprisingly questionable.” (p. xix).

111.
NEW RESEARCH DIRECTIONS FOR EDUCATIONAL TECHNOLOGISTS

Clearly, there is an urgent need for a better approach to educational technology research. Instead of more media comparison studies, educational technologists should undertake the type of research that others have labeled “design-based research” (Kelly, 2003), “development research” (van den Akker, 1999), “design experiments” (Brown, 1992; Collins, 1992), or “formative research” (Newman, 1990). The critical characteristics of “design experiments,” as described by Brown (1992) and Collins (1992), are: addressing complex problems in real contexts in collaboration with practitioners; integrating known and hypothetical design principles with technological affordances to render plausible solutions to these complex problems; and conducting rigorous and reflective inquiry to test and refine innovative learning environments as well as to define new design principles.

There are major differences between the philosophical framework and goals of traditional educational technology research methods, both predictive and interpretive, and design-based research approaches. Van den Akker (1999) clarified the differences as follows:

“More than most other research approaches, development research aims at making both practical and scientific contributions. In the search for innovative ‘solutions’ for educational problems, interaction with practitioners…is essential. The ultimate aim is not to test whether theory,

112.
when applied to practice, is a good predictor of events. The interrelation between theory and practice is more complex and dynamic: is it possible to create a practical and effective intervention for an existing problem or intended change in the real world? The innovative challenge is usually quite substantial, otherwise the research would not be initiated at all. Interaction with practitioners is needed to gradually clarify both the problem at stake and the characteristics of its potential solution. An iterative process of ‘successive approximation’ or ‘evolutionary prototyping’ of the ‘ideal’ intervention is desirable. Direct application of theory is not sufficient to solve those complicated problems.” (pp. 8-9).

Reeves (2000) described van den Akker’s (1999) conception of design/development research as a viable strategy for socially responsible research in educational technology. As illustrated in Figure 5.1, even if the results of business-as-usual predictive research in this field provided unassailable results demonstrating the efficacy of educational technology, translating those findings into instructional reform would not be a given. Educational research is usually published in refereed journals that are unread by the vast majority of practitioners. Reading research papers and translating the findings into practical solutions is a formidable task for educational practitioners. Nor can educational technologists simply install purportedly innovative technologies into the classroom and expect them to work.

[ FIGURE 5.1 NEAR HERE ]

One of the primary advantages of design research is that it requires practitioners

113.
and researchers to collaborate in the identification of real teaching and learning problems, the creation of prototype solutions based on existing design principles, and the testing and refinement of both the prototype solutions and the design principles until satisfactory outcomes have been reached by all concerned. Design research is not an activity that an individual researcher can conduct in isolation from practice; its very nature ensures that progress will be made with respect to, at the very least, clarification of the problems facing teachers and learners, and ideally, the creation and adoption of solutions in tandem with the clarification of robust design models and principles.

DESIGN RESEARCH EXEMPLARS

Fortunately, a few good examples of design-based research in educational technology are emerging. The January/February 2005 issue of Educational Technology highlights six of the best-known design-based research initiatives in North America. Squire (2005) presented design-based investigations of game-based learning environments in which he and his colleagues employed a blend of quantitative and qualitative methods to explore the messiness of innovation in authentic contexts. Barab, Arici, and Jackson (2005) described how they have applied design-based research methods within the context of the Quest Atlantis project, a noteworthy initiative that not only has produced a rich online learning environment that supports important learning, but has also yielded a promising theoretical framework called Learning Engagement Theory. Nelson, Ketelhut, Clarke, Bowman, and Dede (2005) clarified the benefits of design-based research with respect to working closely with practitioners in their River

114.
City science education project, a curricular-level innovation in the form of Multi-User Virtual Environment Experiential Simulators (MUVEES). In describing their Virtual Solar System project, Hay, Kim, and Roy (2005) delineated the challenges of design research conducted in a higher education academic environment, including the need to integrate technological innovations with traditional educational artifacts such as textbooks. Hoadley (2005), one of the founders of the Design-Based Research Collective (2003), illustrated how design-based research takes considerable time and patience as his theory of “Socially Relevant Representations” related to learning through discussion has been gradually refined over a decade of inquiry.

A compelling design-based research demonstration of the benefits of collaborating with practitioners to develop innovative educational technologies in the same context in which they will be used is summarized in Yasmin Kafai’s (2005) article titled “The Classroom as Living Laboratory.” For decades, educational technologists have developed instructional innovations in laboratories and later inserted them into classrooms with an appalling lack of impact (Cuban, 2001). Design-based researchers, by contrast, make a fundamental commitment via close collaboration with teachers and students to developing interactive learning environments in the contexts in which they will be implemented.

The development research of Jan Herrington conducted at Edith Cowan University in Australia (Herrington, 1997; Herrington & Oliver, 1999) is a rare exemplar of design-based research done by a doctoral student. Herrington employed a range of innovative investigative strategies, including video analysis of the dialogue between pairs of students engaged in multimedia learning. First, she worked with teacher educators to develop a model of the critical factors of situated learning, and

115.
second, she instantiated these factors in an innovative multimedia learning environment. Subsequently, she and her collaborators tested the model and the technological products in multiple contexts, including pre-service teacher education courses and K-12 schools. This line of research had value within the immediate context of its implementation, and it also has yielded generalizable design principles that are being applied in many other contexts. (Barab and Squire [2004] illustrated how design research of this kind requires both “demonstrable changes at the local level” [p. 6] as well as contributions to theory.) Herrington’s research agenda still thrives, recently focusing on the design of authentic activities in Web-based learning environments (Herrington, Reeves, Oliver, & Woo, 2004).

Other notable design-based research dissertations have been undertaken at the University of Twente in the Netherlands. De Vries (2004) conducted four design experiments in primary schools to address the research question: “How can reflection be embedded in the learning process to improve the development of personal understanding of a domain and learning task?” Several years earlier, McKenney (2001) employed development research methods in a large-scale dissertation that addressed the research question: “What are the characteristics of a valid and practical support tool that has the potential to impact the performance of (resource) teachers in the creation of exemplary lesson materials for secondary level science and mathematics education in southern Africa?” Even earlier, Nieveen (1997) pursued development research with the aim of developing a computer support system to assist curriculum developers in optimizing the effectiveness of formative curriculum evaluation efforts. Van den Akker (1999) played important roles in all three of these dissertation studies. These exemplars are especially useful because they clearly demonstrate that with proper support doctoral

116.
students can engage in fruitful design research within the field of educational technology.

CONDITIONS FOR REFORM OF EDUCATIONAL TECHNOLOGY RESEARCH

For design research to be taken seriously by the educational technology research community, fundamental changes in our methods of research and development are recommended. The kinds of design/development research described by van den Akker (1999), Bannan-Ritland (2003), Barab and Squire (2004), and others hold great promise. But other changes are needed. For example, the conceptualization of learning theory as something that stands apart from and above instructional practice should be replaced by one that recognizes that learning theory can be collaboratively shaped by researchers and practitioners in context. This shift in our way of thinking about research and theory as processes that can be use-inspired is taking place in other fields of inquiry as well (Stokes, 1997).

In addition, educational technologists may need to rethink their conceptualization of the field as a science. Educational technology is first and foremost a design field, and thus design knowledge is the primary type of knowledge sought in our field. Design knowledge is not something that educational researchers derive from experiments for subsequent application by teachers. Design knowledge is contextual, social, and active (Perkins, 1986). Educational technology is a design field, and thus, our paramount research goal should be solving teaching, learning, and performance problems, and deriving design principles that can inform future development and

117.
implementation decisions. Our goal should not be to develop esoteric theoretical knowledge that we think practitioners should apply whenever we get around to describing it in practitioner-oriented publications for which researchers usually receive little credit, at least within the traditional academic tenure and promotion review systems. This has not worked for more than 50 years, and it will not work in the future.

Accordingly, the reward structure for scholarship must change in higher education. Educational researchers should be rewarded for participation in long-term development research projects and their impact on practice rather than for the number of refereed journal articles they publish. In a design field such as educational technology, it is time that we put the “public” back in publication. Academic researchers and the teachers with whom they collaborate must be provided time for participation in design research, reflection, and continuous professional development. Hersh and Merrow (2005) illustrated how the overemphasis on personal research agendas in all fields of academe has led to the decline of teaching and service in higher education in the USA over the past 30 years.

Of course, additional financial support is needed for the types of long-term design research initiatives called for in this paper. In the USA, the National Science Foundation (http://www.nsf.gov) and the Institute of Education Sciences at the Department of Education (http://www.ed.gov/about/offices/list/ies/index.html) currently maintain that randomized controlled trials are the preferred method for educational research. Hopefully, the kinds of examples presented in this book will encourage authorities at those agencies and other funding sources to consider design research as a viable alternative approach to enhancing the integration of technology into teaching and learning and the ultimate improvement of education for all.

118.
A CALL FOR ACTION

Inspired by the design-based research initiatives outlined above and guided by methodologists such as van den Akker (1999), it is time for educational technologists to adopt a more socially responsible approach to inquiry. The design knowledge required in our field is not something that can be derived from the kinds of simplistic, often “one-off,” quasi-experiments that have characterized our shameful legacy of pseudoscience. Without better research, teachers, administrators, instructional designers, policy makers, and others will continue to struggle to use educational technology to reform teaching and learning at all levels.

Who can doubt that the traditional research practices of educational technologists will never provide a sufficient basis for guiding practice? The current situation approaches the absurd. Writing in Presentations magazine, a trade publication aimed at the education and training community, Simons (2004) summarized the multimedia research of Professor Richard Mayer, one of the top researchers working in this area. With insufficient irony, Simons pointed out that Mayer’s research (Clark & Mayer, 2003; Mayer, 2001) supports the use of multimedia in education and training, but also fails to support it! “It depends” seems to be the best that educational technologists pursuing traditional research agendas can provide practitioners. This is not sufficient. There is an important difference between “it depends” and the type of warranted assertions provided by design research. The time to change the direction of educational technology research is now.

126.
Figure 5.1: Predictive and design research approaches in educational technology research

127.
A curriculum perspective 110

DESIGN RESEARCH FROM A CURRICULUM PERSPECTIVE
Susan McKenney, Nienke Nieveen and Jan van den Akker

Departing from a slightly broader perspective than most other discussions of design research in this volume, this chapter contributes to understanding of design research in the curriculum domain. It begins by clarifying what is meant by ‘curriculum’ before characterizing design research from this perspective. The discussion of design research in the curriculum domain builds toward a conceptual model of the process. To illustrate various aspects in the model, three design research cases from the curriculum domain are briefly presented. The chapter concludes with discussion of design research dilemmas, and finally, guidelines for mitigating potential threats to design study rigor.

THE CURRICULUM DOMAIN

As a field of study, "it is tantalizingly difficult" to know what curriculum is (Goodlad, 1994, p. 1266). Although Taba’s (1962) definition of a plan for learning is generally accepted, dispute abounds with regard to further elaboration of the term (Marsh & Willis, 1995). In this chapter, the notion of curriculum is treated from an inclusive perspective. That is, the broad definition of a plan for learning has been used as a starting point, while related views have been sought to enhance understanding of

128.
curriculum. The remainder of this section presents the curricular perspectives that most robustly underpin our vision of design research in the curriculum domain.

Perspectives

How does one go about planning for learning? Curricular decision-making is generally an iterative and lengthy process, carried out by a broad range of participants and influenced by an even wider variety of stakeholders. Curricular decisions may be analyzed from various angles; Goodlad (1994) defines three: socio-political, technical-professional and substantive. The socio-political perspective refers to the influence exercised by various (individual and organizational) stakeholders. The technical-professional perspective is concerned with methods of the curriculum development process; whereas the substantive perspective refers to the classic curriculum question of: What should be learned?

The question of what schools should address has confronted society since the dawn of schooling. Curriculum scholars offer numerous perspectives on how the substance of curriculum is determined. Based on the work of Tyler (1949) and subsequent elaborations by others (Eisner & Vallance, 1974; Goodlad, 1984, 1994; Kliebard, 1986; Van den Akker, 2003; Walker & Soltis, 1986), Klep, Letschert and Thijs (2004) describe three major orientations for selection and priority setting:

Learner: which elements seem of vital importance for learning from the personal and educational needs and interests of the learners themselves?
Society: which problems and issues seem relevant for inclusion from the perspective of societal trends and needs?

129.
Knowledge: what is the academic and cultural heritage that seems essential for learning and future development?

Components

Traditionally, curriculum deliberations focus on the aims and content of learning. Building on broader definitions (Walker, 1990) and typologies (Eash, 1991; Klein, 1991), van den Akker (2003) presents a visual model that illustrates both the inter-connectedness of curriculum components and the vulnerability of the structure that connects them (see Figure 6.1). At the hub of the model is the rationale, through which all other components are connected: aims and objectives; content; learning activities; teacher role; materials and resources; grouping; location; time; assessment. The spider web metaphor emphasizes that, within one curriculum, component accents may vary over time, but that any dramatic shift in balance will pull the entirety out of alignment. Though it may stretch for a while, prolonged imbalance will cause the system to break. Efforts to reform, (re)design, develop or implement curricula must therefore devote attention to balance and linkages between these ten components.

[ FIGURE 6.1 ABOUT HERE ]

Consistency, Harmony and Coherence

Curriculum concerns may be addressed at various levels: macro (system/society/nation/state); meso (school/institution); and micro (classroom/learner). While the spider web metaphor emphasizes the need for internal consistency between components, consistency across levels in a system is also a chief concern. This implies that, for

130.
example, change efforts striving toward a particular approach to classroom teaching and learning must be designed while taking into account the overarching school and education system, or else they risk inconsistent design and, along with that, hindrances to implementation.

Even at differing levels, the ideas bound together in a curriculum may be manifested through various representations. Goodlad, Klein and Tye (1979) distinguished a range of representations, adapted by van den Akker (1988, 1998, 2003), who offers three broad distinctions: the intended curriculum; the implemented curriculum; and the attained curriculum. The intended curriculum contains both the ideal curriculum (the vision or basic philosophy underlying a curriculum) and the formal/written curriculum (intentions as specified in curriculum documents and/or materials). The implemented curriculum contains both the perceived curriculum (interpretations by users, particularly teachers) and the operational curriculum (as enacted in the classroom). The attained curriculum is comprised of the experiential curriculum (learning experiences from pupil perspective) and the learned curriculum (resulting learner outcomes).

High quality curriculum development strives for internal consistency with regard to curricular components (spider web elements) and levels (micro through macro), as well as harmony among curricular representations (intended through attained). High quality curriculum development also reflects system coherence. That is, within the context for which a curriculum is intended, curricular decision-making is influenced by a thorough understanding of the system elements that most influence curriculum enactment: teacher

131.
capacity and (large scale) pupil assessment; see Figure 6.2.

[ FIGURE 6.2 ABOUT HERE ]

Much research has been conducted to explore how the curriculum is shaped by the teacher (Clandinin & Connelly, 1992; Eisenhart & Borko, 1991; Zumwalt, 1988); and while conclusions vary, there is little debate that teacher capacity influences the enactment of curriculum. In exploring the fit between a curricular innovation and the system in which it functions, alignment with pre- and inservice teacher education is critical to successful implementation. Additionally, experience and research alike (Black & Atkin, 1996) attest to the notion that external examinations wield some of the strongest influences on curriculum implementation. This aspect of system coherence is, with surprising frequency, downplayed or, in more extreme cases, ignored, as further discussed below.

The issue of system coherence often presents a struggle for developers of innovative curricula. For example, an in-depth discovery-learning curriculum designed for use where state examinations tend to favor relatively shallow but extremely broad content knowledge would not be coherent at the system level. But strong beliefs in the benefits of discovery learning might prompt curriculum developers to take on the implementation challenges associated with such inconsistency. In these situations, teacher development tends to be seen as the primary vehicle for compensating for curricular vulnerability. While this may be viable, it does not represent robust curriculum design. From an implementation perspective,

132.
robust curriculum design is evidenced by attention to the three criteria discussed in this section: consistency among curricular components (spider web) and across levels (macro, meso, micro); harmony between representations (intentions, implementations and attainments); and coherence within the system context (factoring in the influences of teacher development, school development, and large-scale assessment).

Curriculum Implementation

Around the globe and particularly in the areas of science and mathematics education, curricula have undergone several waves of renewal in the past few decades. A major reform period began following the launch of Sputnik and continued on until the 1970s (in some cases, even longer); this period was marked by innovations taking place on a large scale. Despite concerted efforts, many improvement projects were considered failures (for a synthesis of curriculum implementation research from this era, see Fullan and Pomfret [1977]). Perhaps in part due to disappointing results in the past, the 1980s saw a shift to debates on accountability and the most appropriate forms of curricular reform. These debates were fueled by changing societal concerns as well as new scientific insights. The 1990s bore witness to a rebirth of large-scale reform, tempered by cautions resulting from failed efforts in the past. In this most recent wave of reform, systemic, sustainable change receives the main focus. A core element of this focus is the careful consideration of new (and/or improved) implementation strategies, which have been addressed in literature (Confrey, Castro-Filho, & Wilhelm, 2000; Fullan, 2000; van den Akker, 1998).

133.
Throughout earlier waves of curriculum reform, most study of curriculum implementation took place from a fidelity perspective. That is, researchers focused on measuring the degree to which a particular innovation was implemented as planned, and on identifying the factors that facilitated or hindered implementation as planned (Snyder, Bolin, & Zumwalt, 1992). Implementation research has led to an appreciation for the perspectives of mutual adaptation and, in more recent years, enactment. Mutual adaptation (Berman & McLaughlin, 1977, 1978) suggests that curriculum implementation is a process whereby adjustments in a curriculum are made by curriculum developers and those who actually use it in a school or classroom context. Curriculum enactment, on the other hand, views curriculum as the educational experience jointly created by students and teachers. From this perspective, the role of the teacher is that of a curriculum developer who, with time, according to Snyder et al. (1992, p. 418), "grows ever more competent in constructing positive educational experiences." A teacher’s ability to construct such experiences is helped or hindered by the quality of the externally created curriculum (which may be highly specified or may contain broad guidelines). Yet, one’s own professional capacities yield even stronger influences on this process. The enactment perspective is becoming increasingly widespread (Ball & Cohen, 1996; Barab & Luehmann, 2003; Ben-Peretz, 1990, 1994; Clandinin & Connelly, 1992; Fullan, 2000) and forms the cornerstone of our vision on design research in the curriculum domain.

DESIGN RESEARCH IN THE CURRICULUM DOMAIN

134.
Having described core elements in our understanding of curriculum, design research in this domain is now addressed. This section describes why design research is usually chosen; what outputs result from its undertaking; when this kind of approach is useful; who is generally involved; where it takes place; and how it is conducted. Thereafter, a conceptual model for design research in the curriculum domain is presented.

Why Choose a Design Research Approach?

Numerous motives for design research have been cited in this volume and elsewhere. Many of them speak to the long-standing criticism that educational research has a weak link with practice. Increasingly, experts are calling for research to be judged not only on the merits of disciplined quality, but also on the adoption and impact in practice (Design-Based Research Collective, 2003; Kelly, 2004; Reeves, Herrington, & Oliver, 2005). In combination with other approaches, design research has the potential to help develop more effective educational interventions, and to offer opportunities for learning during the research process. In the curriculum domain, design research is often selected to help improve understanding of how to design for implementation. By carefully studying successive approximations of ideal interventions in their target settings, insights are sought on how to build and implement consistent, harmonious, coherent components of a robust curriculum (as described in the previous section).

What Are the Outputs of Design Research?

In considering what design research in the curriculum domain ought to produce, it may be useful to return to the aforementioned orientations of knowledge, society and

135.
learner. The primary output of design research is the knowledge that is generated as a result; this usually takes the form of design principles. The secondary output of design research in this domain is its societal contribution: curricular products or programs that are of value in schools or to a broader education community. The tertiary output, related to the learner orientation, is the contribution made by design research activities themselves to the professional development of participants. Design research thus contributes to three types of outputs: design principles, curricular products (or programs) and professional development of the participants involved, as illustrated in Figure 6.3.

[ FIGURE 6.3 ABOUT HERE ]

Design Principles

The knowledge claim of design research in the curriculum domain takes the form of design principles (Linn, Davis, & Bell, 2004; van den Akker, 1999), also known as domain theories (Edelson, this volume), heuristics (Design-Based Research Collective, 2003) or lessons learned (Vanderbilt, 1997). While the format may vary, design principles generally offer the kinds of heuristic guidelines described by van den Akker (1999):

If you want to design intervention X [for purpose/function Y in context Z]; then you are best advised to give that intervention the characteristics C1, C2, …, Cm [substantive emphasis]; and do that via procedures P1, P2, …, Pn [procedural emphasis]; because of theoretical arguments T1, T2, …, Tp; and empirical arguments E1, E2, …, Eq.
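
The heuristic format above can also be read as a simple record with seven slots. The following is a minimal sketch in Python, offered only as an illustration: the class name `DesignPrinciple`, its field names and the `render` method are hypothetical and do not come from van den Akker (1999), and the example values are the placeholders X, Y, Z, C, P, T and E from the format itself rather than findings from any study.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DesignPrinciple:
    """Hypothetical record for one heuristic statement in the format above."""
    intervention: str                                                # X
    purpose: str                                                     # Y
    context: str                                                     # Z
    characteristics: List[str] = field(default_factory=list)        # C1..Cm
    procedures: List[str] = field(default_factory=list)             # P1..Pn
    theoretical_arguments: List[str] = field(default_factory=list)  # T1..Tp
    empirical_arguments: List[str] = field(default_factory=list)    # E1..Eq

    def render(self) -> str:
        """Fill the heuristic sentence with the parts of this principle."""
        return (
            f"If you want to design {self.intervention} "
            f"[for {self.purpose} in {self.context}]; "
            "then you are best advised to give that intervention the characteristics "
            f"{', '.join(self.characteristics)} [substantive emphasis]; "
            f"and do that via procedures {', '.join(self.procedures)} [procedural emphasis]; "
            f"because of theoretical arguments {', '.join(self.theoretical_arguments)}; "
            f"and empirical arguments {', '.join(self.empirical_arguments)}."
        )


# Placeholder values only; a real principle would name concrete features and evidence.
example = DesignPrinciple(
    intervention="intervention X",
    purpose="purpose Y",
    context="context Z",
    characteristics=["C1", "C2"],
    procedures=["P1"],
    theoretical_arguments=["T1"],
    empirical_arguments=["E1", "E2"],
)
print(example.render())
```

Keeping the substantive slots (characteristics) apart from the procedural ones (procedures), and both apart from the supporting arguments, mirrors the distinction the format draws between what an intervention should look like, how to arrive at it, and why.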

136.
Design principles are not intended as recipes for success, but to help others select and apply the most appropriate substantive and procedural knowledge for specific design and development tasks in their own settings.

Curricular Products

To some extent, (substantive) knowledge about essential characteristics of an intervention can be distilled from the secondary output of design research in this domain: curricular products. Design research contributes to designing, developing and improving the quality of curricular products (or programs). For specific recommendations on improving quality, see the ‘how to conduct design research’ section later in this chapter. The range of curricular products is commensurate with the broad scope of the domain. Some examples include: manifestations of the written curriculum (e.g. national syllabus or teacher guide); materials used in the classroom (e.g. instructional booklets or software for pupils); or professional development aids (e.g. online environments for teacher communities).

Professional Development

When research methods are creatively and carefully designed, they can contribute to the tertiary output of design research: professional development of participants. For example, data collection methods such as interviews, walkthroughs, discussions, observations and logbooks can be structured to stimulate dialogue, reflection or engagement among participants. This perspective also stems from the conviction that there is a natural synergy between curriculum development and teacher development, and that sensitivity to this can provide more fruitful research and

137.
development opportunities. Because of the implications for research design, this issue is discussed further in the ‘how’ section.
When is this Approach Useful?
As illustrated in Figure 6.4, design research in the curriculum domain utilizes the outputs of validation studies, which provide the necessary starting points (e.g. theories, perspectives) for beginning to engage in design based on scientific insights. Design research is especially warranted when existing knowledge falls short, as is often the case with highly innovative curriculum improvement initiatives. The main aim of design research in these situations is to develop new knowledge that can help construct pioneering curricular solutions which will prove viable in practice. Because of the focus on understanding curriculum enactment and the implementation process, this kind of study rarely includes large-scale effectiveness research. However, effectiveness studies are necessary to understand long-term impact and provide valuable information pertaining to the quality of the outputs (design principles, curricular products and professional development) of curriculum design research. For further discussion of validation and effectiveness studies, please refer to Nieveen, McKenney and van den Akker (this volume).
[ FIGURE 6.4 ABOUT HERE ]
Who is Involved in Design Research?
Though hardly universal, many studies in curriculum design and implementation create space for (varying degrees of) participatory design. In

138.
participatory design, the end users of curriculum contribute to the design process. This means that researcher insights are augmented by those offered by participants such as pupils, teachers, school leaders and external experts. Until recently, the bulk of these contributions tended to come during the generation of initial design specifications (analysis) and refinement of draft products (formative evaluation). While the movement to directly involve teachers in the design process has been gaining momentum since the 1970s (Stenhouse, 1980), it would seem that only in the last decade have researchers begun to seriously consider how to marry the interests of teachers in a participatory design process and reap the relevance benefits without putting the methodological quality of the research in jeopardy (McKenney, 2001; McKenney & van den Akker, 2005; van den Akker, 2005).
Where Does Design Research Take Place?
Researchers have found that practices or programs developed in one setting can often be used successfully in other places (Loucks-Horsley & Roody, 1990). But despite the potential insights to be gained through exploration of how ideas may be translated for use in other settings, it should be noted that such a task is far from easy. “There is no (and never will be any) silver bullet” for educational change in varying contexts (Fullan, 1998). Neglecting to seriously understand and consider the “fit” of an innovation is a common cause of failure (Guthrie, 1986).
Contextual understanding is essential for robust curriculum design. Because the real-world settings of schools and classrooms tend to be so complex and unpredictable, the only way to sincerely explore curricular products is by doing so together with

139.
stakeholders (see previous section) in the target context (cf. Design-Based Research Collective, 2003; McKenney, 2001; Richey & Klein, 2005; van den Akker, 1999). Based on careful analysis, design principles offer situated guidelines, which rely on accurate, thorough portrayal of pertinent contextual variables. Design research must therefore take place in naturally-occurring test beds to address usability issues and to intimately portray the setting for which the design was created. Contextual variables that should be portrayed include local factors (e.g. school climate, pupil population and resources available) and system factors (including large-scale examinations and teacher development). Figure 6.5 shows design research taking place within the system for which it is intended.
[ FIGURE 6.5 ABOUT HERE ]
How is Design Research Conducted?
Tackling design research is no easy task. But, as in responsible curriculum development, it can be useful to begin by articulating the underlying philosophy before addressing the process itself. This section therefore begins by describing tenets that form the foundation of the design research approach described here, and then addresses the iterative nature of the design research process.
Tenets
As previously mentioned, design research efforts contribute to three main types of outputs: design principles, curricular products and the professional development of

140.
participants. Related to each output, we define a set of tenets to shape design research in the curriculum domain (respectively): rigor, relevance, and collaboration.
Rigor
As noted elsewhere in this volume (Nieveen, McKenney & van den Akker), and in the ‘what’ section of this chapter, design research yields knowledge in the form of design principles. For these to be valid and reliable, the research from which they are distilled must adhere to rigorous standards. The wealth of literature on naturalistic research (Miles & Huberman, 1994; Patton, 1990; Yin, 1994) offers much support for addressing issues of internal validity (extent to which causal relationships can be based on the findings); external validity (extent to which findings are transferable to some broader domain); reliability (extent to which the operations of the study can be repeated with the same results); and utilization (extent to which action could be taken based on the findings).
Relevance
The societal contribution of design research refers to the curricular product or program that benefits educational practice. As mentioned earlier in this chapter, curricular products must be carefully examined and, if necessary, (re)tailored for the context and culture in which they will be implemented. When it comes to design research (McKenney, 2001; McKenney & van den Akker, 2005), such efforts must be based on a working knowledge of the target setting and be informed by research and development activities taking place in naturally-occurring test beds.

141.
Collaboration
If design research activities are to contribute to the professional development of participants, then design and development must be conducted in collaboration with and not for those involved. Additionally, data collection procedures should be mutually beneficial – addressing research needs while simultaneously offering meaningful experiences for the participants. As mentioned previously, data collection methods such as interviews, walkthroughs, discussions, observations and logbooks can be structured to stimulate dialogue, reflection or engagement among participants.
Iterations
There is little debate that, in any domain, the design research process tends to be iterative (Cobb, Confrey, diSessa, Lehrer, & Schauble, 2003; Design-Based Research Collective, 2003; Reeves et al., 2005; van den Akker, 1999). In the curriculum domain, each iteration helps to sharpen aims, deepen contextual insights and contribute to the three main outputs (e.g. design principles drafted, curricular products improved, opportunities for professional development created). Within each iteration, the classic cycle of analysis, design and evaluation takes place, as illustrated in Figure 6.6.
[ FIGURE 6.6 ABOUT HERE ]
Analysis => Gap Closure
Analysis in the curriculum domain is conducted to understand how to target a design. It primarily features assessment of harmony (or discord) between the aforementioned intended, implemented and attained curricula. Further study of internal

142.
consistency (macro, meso and micro level) and system coherence (alignment with teacher development and pupil assessment) often sheds light on how and where gaps between representations are created. Additionally, good analysis makes use of inputs such as creativity, inspiring examples and a systematic approach. In design research, good analysis includes these features, but is also more forcefully driven by theoretical and empirical insights (often, but not exclusively, from validation studies as mentioned previously). Once completed, the analysis findings usually offer guidelines for design that target the closure of one or more gaps between the intended, implemented and attained curricula. These guidelines take the form of design specifications that will shape curricular products such as standards descriptions, textbooks and learner toolkits. In some cases, the guidelines also shape the development process.
Design => Prototype Stages
The iterative nature of design research is due, in part, to the dominance of a prototyping approach. Prototyping has traditionally been associated with engineering and is a well-proven method for solving real problems. Over time, this approach has spread to other arenas that apply a systematic approach to problem-solving and design, including education. Ranging from ‘formative experiments’ (Newman, 1990; Reinking & Watkins, 1996) to iterative revision (McKenney, 2001; Nieveen, 1997; Nieveen & van den Akker, 1999), prototyping refers to a systematic process through which design products are evaluated and revised.
Various forms of prototypes have been identified in the literature, based on differing stages of development (Connel & Shafer, 1989; Nieveen, 1999). In earlier iterations, the

143.
product of the design stage may be an initial prototype that globally demonstrates the nature of the design. As development continues, prototypes may be partially or even completely elaborated. At the conclusion of a design cycle, a product’s stage of development (global, partial or complete prototype) influences the kind of formative evaluation activities that may take place.
Evaluation => Trade-Offs
Formative evaluation is performed to improve (instead of prove) the quality of prototypes. Depending on the stage of development (global, partial or complete), evaluation approaches may include: developer screening; expert review; micro evaluation; and/or classroom try-outs. Participants in evaluation activities can include the educational designers themselves, other members of the development team (graphic designers, software engineers, etc.), experts, teachers, pupils, parents and any other relevant stakeholder group.
While specific criteria for exploring (ways to improve) the quality of a prototype vary along with the aims of the curricular product, the following three general criteria, rooted in earlier work (Blumenfeld, Fishman, Krajcik, Marx, & Soloway, 2000; Fishman & Krajcik, 2003; Fishman, Marx, Blumenfeld, Krajcik, & Soloway, 2004; McKenney, 2001; McKenney & van den Akker, 2005; Nieveen, 1997; Nieveen & van den Akker, 1999), can be useful: viability, legitimacy and efficacy. These are summarized in Figure 6.7 below.
[ FIGURE 6.7 ABOUT HERE ]

144.
Viability relates predominantly to the implementability of a design. While viability can best be tested through the (formative) evaluation of draft products, consideration of viability aspects is a recommended starting point for drafting and refining design intentions. Three aspects of viability are distinguished: practicality, relevance and sustainability. Viability questions include: Will this design be realistically usable in everyday practice? Is this design relevant to the needs and wishes of those in the target setting? May one expect this design (or its use) to be sustainable? The criterion of legitimacy relates to the underlying intentions; legitimacy aspects are investigated through questions like: Is (the rationale behind) this design based on contemporary scientific insights? Are the design components internally consistent? Does the design support coherence among system factors? Efficacy relates to how well a design yields the desired result(s). Because efficacy is defined in terms of the aims of a design, the criteria for efficacy vary per case. For example, if a curricular product is intended to foster the development of teacher classroom management skills, then impact on teacher classroom management skills would be one of the criteria to look for in evaluation. Some generic efficacy questions include: Is the time, effort and financial cost worth the investment? How (in)efficient is (the use of) this design?
Insights gleaned from evaluation activities focused on viability, legitimacy and efficacy provide inputs for subsequent re-design of curricular prototypes. More often than not, a renewed analysis must take place as trade-off decisions are made. For example, a design may prove to be highly legitimate and effective, but not viable (e.g. too expensive or too time-consuming) in practice. If viability can be achieved, but only

145.
with costs to legitimacy or efficacy, then trade-offs must be weighed. Without a balance among these three, implementation will be challenged or possibly fail.
Conceptual Model
The model for design research in the curriculum domain (Figure 6.8) is based on the previous discussion. At the heart of the process are the tenets of research rigor, local relevance and collaboration with participants. These foundational ideas shape the analysis, design and evaluation cycle. Triangular flows indicate principal outputs from one phase that influence the subsequent phase, specifically: contextual analysis leads to design targets for closing gaps between the intended, implemented and attained curricula; prototype stage of development has implications for framing an evaluation; and evaluation insights help to deliberate over trade-off decisions between the viability, legitimacy and efficacy quality aspects. Design research cycle iterations contribute to the aforementioned outputs of professional development, curricular products and design principles. While this all takes place within and is influenced by the target context, design principles that are well-articulated (and carefully portray the context) may be useful in other settings. Finally, this kind of research relies on the results of validation studies to provide relevant scientific insights to help shape the design (legitimacy); this type of research further provides starting points for larger scale effectiveness studies to explore long range impact.
[ FIGURE 6.8 ABOUT HERE ]
Examples

146.
This model (Figure 6.8) has its roots in the topics discussed throughout the first two sections of this chapter. Numerous studies have also been based on these and related ideas. Although articulated and visualized in this format for the first time here, the conceptual model shown in Figure 6.8 aptly depicts the research approach used in several curriculum studies; three of these are briefly described below.
For comparison purposes, the examples selected share main elements of an underlying rationale. As shown in Figure 6.9, the three studies all explore support for curriculum development and teacher development at a naturally-occurring crossroads: the use and production of lesson materials.
[ FIGURE 6.9 ABOUT HERE ]
Each of the studies yielded design principles and curricular products, and contributed to the professional development of participants. They were all conducted in the target context, using a synthesis of literature and previous research as major points of departure. Cycles of analysis, design and evaluation took place in all three cases, and these were shaped by the need for research rigor, contextual relevance and collaboration with the participants. While they all aimed to contribute to (improved) development of curriculum materials and teacher professional development, their iterative cycle activities were conducted differently and they each resulted in different types of curricular products; these are briefly described in Table 6.1.
[ TABLE 6.1 ABOUT HERE ]

147.
Please visit www.routledge.com to access this book’s supplemental website, containing information on the research design and approach used in each of these three studies.
DESIGN RESEARCH DILEMMAS
In the curriculum domain, a design research approach is often chosen because of the opportunities it offers to help improve educational realities directly (through the curricular products designed and the professional development opportunities created by the study itself) and indirectly (through design principles to inform future endeavors). While the overall benefits of this approach are perceived to be worth the investments, design studies are subject to challenges not typically encountered through more conventional approaches. This section addresses three: designer as implementer and evaluator; real-world research brings real-world complications; and the warrants and risks associated with adaptable study design.
Designer as (also) Implementer and Evaluator
Due to the nature of the approach (prototyping in context), design researchers often find themselves in the conflicting roles of advocate and critic (Design-Based Research Collective, 2003). The multiple roles can be extremely useful, for example during formative evaluation. When designers participate in the formative evaluation activities, they are afforded the opportunity to gain deeper and often sharper insights into the strengths and weaknesses of a design. This has the potential to shorten both the lines of communication

148.
within a development team and the time needed for revision decisions. However, the methodological concerns would seem obvious. Despite efforts to stimulate criticism, the fact that the designer and evaluator may be the same person increases the chance for an evaluator effect (Patton, 1990). Participants may react differently due to the designer’s presence during formative evaluation; and the designers may be (unintentionally) less receptive to critique. Regardless of efforts made to obtain neutrality, value-free research will not be possible when the designer performs evaluation activities. “Rather than pretending to be objective observers, we must be careful to consider our role in influencing and shaping the phenomena we study. This issue is obvious when individuals take on multiple roles of researchers; teachers; teachers of teachers…” (Putnam & Borko, 2000, p. 13). Acknowledging the inevitable impact of researchers (especially those donning multiple, even conflicting roles) on the context in which understanding is sought is a first step toward combating associated obstacles, such as (cf. Krathwohl, 1993): the Hawthorne effect (involvement in the study makes participants feel special and thereby influences their behavior); hypothesis guessing (participants try to guess what the researcher seeks and react accordingly); and diffusion (knowledge of the treatment influences other participants). Thereafter, these threats may be further reduced by striving for unobtrusiveness through making the research setting as natural and as genuine as possible.
Real-World Research Settings Bring Real-World Complications
As stated previously, design research makes use of naturally-occurring test beds. The benefits of conducting research in authentic settings would seem obvious: the more realistic the research setting, the more the data will reflect reality. But deeper understandings come at the (potential) cost of losing control over data collection rigor.

149.
For example, if pilot teachers share their enthusiasm with colleagues, who in turn request to join the participant group, researchers are faced with a dilemma: compromise the sample or encourage teamwork? When researcher interests are no longer the only ones at stake, compromise is imminent.
Particularly when a cultural stranger (cf. Choksi & Dyer, 1997 in Thijs, 1999) attempts to carry out research in a foreign setting, the degree to which an outsider can conduct meaningful research must be addressed. In many situations, participants are hesitant to be completely open with researchers from different cultural contexts. Toward earning participant trust and building an understanding of the context, the importance of collaboration and mutually beneficial activities cannot be overemphasized, as these are the two main avenues available to a researcher who prioritizes the insider perspective. This is not to say that being an outsider is completely without advantages. In some situations, it actually allows for a degree of objectivity and, along with that, a freedom (or forgiveness) for honesty that is not permitted to those within a particular group.
Adaptability
Design research is often mapped by evolutionary planning. Given the aforementioned real-world research settings, adaptability is essential. Further, it is important that research cycles be responsive to findings from previous ones. Yet, a research design that keeps changing is weak. The notion of evolutionary planning refers to a sound planning framework that is responsive to field data and experiences, at acceptable moments during the course of a study. The need for adaptability in design

150.
research pertains not only to planning, but also to the role of the researcher during the study. According to van den Akker (2005), the synergy between research and practice can be maximized when researchers demonstrate adaptability by: (a) being prepared, where desirable, to take on the additional role of designer, advisor and facilitator, without losing sight of the primary role as researcher; (b) being tolerant with regard to the often unavoidably blurred role distinctions and remaining open to adjustments in the research design if project progress so dictates; and (c) allowing the study to be influenced, in part, by the needs and wishes of the partners, during what is usually a long-term collaborative relationship. Such adaptability requires strong organizational and communicative capacities on the part of the researcher. Adaptability also requires sound understanding of research rigor so that prudent changes and choices are made that maximize value and minimize threats to the quality of the study.
DESIGN STUDY GUIDELINES
Earlier in this chapter, three tenets that lay the foundation for design research initiatives were discussed: rigor, relevance and collaboration. Related to the previous section on design research dilemmas, this chapter concludes by addressing some guidelines for guarding academic rigor while still conducting relevant, collaborative inquiry. These guidelines may help in generating credible, trustworthy and plausible design principles.
Explicit Conceptual Framework

151.
As with all sound research, design research activities should be rooted in an underlying rationale. The underlying rationale should evolve through formal activities (e.g. literature review and interviewing experts) as well as informal activities (discussions with critical friends and during conferences), which should lead to an explicit conceptual framework. Providing the conceptual framework gives others the opportunity to make analytic generalizations (external validity).
Congruent Study Design
In a congruent study design, the chain of reasoning (Krathwohl, 1993) is both transparent and tight. This means that the structure of the study as a whole demonstrates clear and strong links between: previous research, theory, research questions, research design, data analysis and conclusions. In studies with a congruent design, the ideas supporting each of these components are compatible.
Triangulation
During each iteration of a design study, various individuals may participate and various methods of data collection must be carefully chosen and applied; these are applications of triangulation. According to several authors (cf. Merriam, 1988; Miles & Huberman, 1994; Patton, 1990) triangulation assists in enhancing the reliability and internal validity of the findings. This effect rests on the premise that the weaknesses in each single data source, method, evaluator, theory or data type will be compensated by the counterbalancing strength of another. This is one strategy that can help speak specifically to concerns associated with the multiple, sometimes blurred roles taken on by design researchers. Triangulation of data sources, data collection settings, instruments, or even

152.
researchers can be quite robust, but should not be driven by the misconception that more is better. This notion is aptly conveyed by Dede (2004, p. 107) who notes, “everything that moved within a 15-foot radius of the phenomenon was repeatedly interviewed, videotaped, surveyed and so-forth – and this elephantine effort resulted in the birth of mouse-like insights in their contribution to educational knowledge.”
Inductive and Deductive Data Analysis
As with other forms of research, it can be useful to tackle data analysis from differing perspectives. Often, both deductive and inductive analyses are useful. Deductive analyses classify data according to existing schemes, usually based on the conceptual framework; and inductive analyses explore emergent patterns in the data. Interim analysis (following a phase or cycle) is essential for informing subsequent efforts. Repeated interim analysis is referred to as ‘sequential analysis’ by Miles and Huberman (1994, p. 84) who comment as follows on the strengths and weaknesses of this approach: “Their [interim analyses] strength is their exploratory, summarizing, sense-making character. Their potential weaknesses are superficiality, premature closure, and faulty data. These weaknesses may be avoided through intelligent critique from skeptical colleagues, feeding back into subsequent waves of data collection.” Data analysis may further be bolstered when conducted or revisited by a team of researchers or critical friends. Finally, insights stemming from only one participant may be extremely useful, even if not echoed by others. While frequency of findings is a good indicator of importance, data analysis procedures must be sensitive to the salience and depth of findings.
Full Description

153.
Another tactic mentioned by several authors (Merriam, 1988; Miles & Huberman, 1994) is providing a context-rich description of the situation, design decisions and research results. While the generalizability of design research findings is limited, full descriptions will help the readers of such portraits to gain insight into what happened during research stages and make inferences based on (or transfer) the findings to other situations (external validity). In addition, a full description may also make replications possible. If the replication of a study led to similar results, this would demonstrate that the study had been reliable.
Member Check
Merriam (1988) states that taking data and interpretations back to the source may increase the internal validity of the findings. For instance, participants of a try-out can be invited to provide feedback on an outline with the major results of the try-out. Likewise, it can be useful when interviewees review a researcher’s synopsis of an interview, and provide corrections if necessary.
CONCLUDING COMMENTS
This chapter set out to contribute to understanding of design research in the curriculum domain. Essential to understanding the ideas presented here on design research are the perspectives discussed at the start of the chapter pertaining to the nature of curriculum, how it is built and what factors affect its implementation. This lens demonstrates that curriculum design and design research in this domain cannot be decoupled

154.
from the system in which these activities take place. This calls for both design and research work to be conducted in situ, together with the participants (teachers, pupils, etc.). The yields - design principles, curricular products and professional development of participants - can make the effort worthwhile, but care must be taken to guard the rigor of a research process that is subjected to the complexities of the real world. Some tactics for coping with these challenges were offered in the last section of this chapter, but these represent only an initial set of guidelines based on existing approaches to research. Further work is needed to better understand this emerging field and to provide additional guidelines as well as examples. In the meantime, it is hoped that this chapter and the related website examples will further dialogue on design research from the curriculum perspective.
REFERENCES
Ball, D., & Cohen, D. (1996). Reform by the book: What is - or might be - the role of curriculum materials in teacher learning and instructional reform? Educational Researcher, 25(9), 6-8, 14.
Barab, S., & Leuhmann, A. (2003). Building sustainable science curriculum: Acknowledging and accommodating local adaptation. Science Education, 87(4), 454-467.
Ben-Peretz, M. (1990). The teacher-curriculum encounter. Albany: State University of New York Press.
Ben-Peretz, M. (1994). Teachers as curriculum makers. In T. Husén, & T. Postlethwaite

170.
Table 6.1: Examples of design research in the curriculum domain
Study | Iterations | Curricular product
Cascade-Sea | Two analysis cycles, four design cycles and two evaluation cycles | Software tool to help (facilitator) teachers create teacher guides
Tomc@t | Three full cycles of analysis, design and evaluation | ICT-based learner materials to foster early literacy
Mac | Two smaller and three larger cycles of (re)design and evaluation | Teacher guides for secondary level science

171.
Assessing the quality 144
ASSESSING THE QUALITY OF DESIGN RESEARCH PROPOSALS: SOME PHILOSOPHICAL PERSPECTIVES
D.C. Phillips
Design research or design experiments (DR) have become the topic of much discussion among educational researchers since the essays by Collins (1992) and Brown (1992). The claim that design of effective and innovative programs or treatments can proceed not only at the same time as, but fully integrated with, the pursuit of traditional research objectives, has proven to be somewhat controversial (see, for example, the symposia in Educational Researcher, 2003, and Educational Psychologist, 2004); this is because the variation of many factors at once that seems to occur frequently in many of the former endeavors runs counter to the fundamental principle of research which is to “control the variables.” Nevertheless, it clearly seems fruitful both for the production of effective programs and for the opening up of interesting (and useful) lines of research, to have researchers closely involved in the design process. I am pursuing some of the difficult issues associated with this topic elsewhere (Phillips and Dolle, in preparation), so here I wish to make some preliminary remarks about a related but somewhat neglected matter.
Funding agencies around the world are becoming recipients of proposals to carry out design research, and these of course need to be evaluated. Assessing the worth of a finalized program, or of a completed research report, is difficult enough, but the

172.
problems pale when compared with the difficulties inherent in assessing the quality of a proposal to carry out DR. What follows is a brief discussion of some of these, together with some small suggestions that might be helpful.
1. Education research, and more specifically design research or design experiments, do not constitute what philosophers call a “natural kind”; they do not form a species with one or a small number of species-defining characteristics. It is a category made by humans for human intellectual and “political” and self-identification purposes, and like a hard disk or a file labeled “miscellaneous” it has a lot of different things crammed into it. There are many different types of education research, all of them having their own excellences, and their own limitations; and the same can be said of design experiments which vary enormously in how they manage the relationship between research and development (if this distinction is recognized at all). I do not regard it as productive to spend much time trying to come up with a simple account that ends all controversy about “what design experiments really are” – there is no right answer. There is no one thing that they are like.
2. To help make this clear to potential funders of DR, it may be helpful to refer to the philosopher Ludwig Wittgenstein’s notion of “family resemblances,” one that he illustrated with the concept of “games” (Wittgenstein, 1963). (In summary: In defining “games,” there is a large set of “family characteristics”; each game instantiates some of the set, but two games might still be games yet have none of these family characteristics in common.) My suggestion is that there will also be a set or family of criteria by which DR should be judged – elements from the set that are appropriate for evaluating one

173.
study will not be relevant to another. This does not make the task of the evaluator of research or DR proposals easier, but it is better to be realistic rather than simplistic.

3. There is a well-known issue in assessing research in education relating to the fact that there are many different frameworks or methodological approaches (or, to use the Kuhnian term in its now established very loose sense, paradigms) within which specific examples of research are located. Thus the criteria for assessing specific examples often (if not always) need to be framework or paradigm specific, for – to cite an obvious example – the factors that make a piece of ethnography rigorous are not the same factors that mark a good experimental or quasi-experimental study. To make matters more complicated, adherents of one framework or methodological approach often have very little tolerance for rival approaches, so they make unfair judges of such work. Those individuals who fund research, including DR, must somehow overcome any instinctive prejudice they feel about frameworks other than their own favored ones, and at the very least they must make a serious effort to appreciate the rationales that are given to justify different styles of work. (There is a limit to toleration, of course; personally I have little toleration for postmodern approaches that often – perhaps not always – reject the entire scientifically-oriented research approach and especially the “modernist” notion of reason that underpins it. But the situation is difficult, and the assessor of DR needs to have the wisdom of Solomon!)

4. There is a serious oversimplification of scientifically-oriented research (that probably amounts to an egregious misunderstanding of the nature of scientific inquiry – see Phillips, 2005) that can cloud the assessment of research proposals, including proposals

174.
to do DR. (This oversimplification has reached epidemic proportions in the USA, where there are very determined efforts to make the use of so-called “gold standard methodology” – namely the use of randomized controlled experiments or field trials, and as a back-up quasi-experiments and regression-discontinuity designs – into a necessary condition for funding of research and evaluation by governmental agencies. For examples of favored studies see the US federally funded “What Works Clearinghouse” at www.W-W-C.org.)

The mistake or oversimplification lies in placing almost all the emphasis on the final stage of a research cycle (the stage of testing or validating the claim that an intervention or treatment or program is causally responsible for some outcome); this is to neglect or even to denigrate what the philosopher Hans Reichenbach many years ago called the “context of discovery” (Reichenbach, 1938), which is the context in which researchers display creativity and do much preliminary investigation (often guided of course by deep factual and theoretical background knowledge) in the vital effort to come up with an intervention or treatment that is worth testing.

To put this issue in a slightly different way: The great American pragmatist philosopher, John Dewey, and the great critical rationalist philosopher Karl Popper, gave independent accounts of the logical processes involved in science that are remarkably similar (and Reichenbach, a logical positivist, would have endorsed their accounts). They saw science as proceeding through cycles, each of which involves identification and clarification and analysis of a problem, analysis of possible solutions, followed by testing of the solution or hypothesis that seems most likely to be efficacious (for further

175.
discussion and references, see Phillips, 1992, ch. 6). To focus almost entirely upon testing (as in using randomized field trials as the “gold standard” to judge the scientific merit of education research) is to ignore or to treat as scientifically insignificant the absolutely crucial earlier stages.

One of the very great virtues of the DR community is that its members take seriously the whole of the scientific research cycle. It will be fatal to the future of DR, and indeed to the future of the whole of science, if funding decisions are to be based entirely, or even just very largely, on the use of a particular family of methods for testing hypotheses.

5. This raises another difficult issue. It is easy to be seduced by a proposal for funding support that focuses upon testing some hypothesis or treatment, for such a proposal will be concrete – it will be quite specific about what will be done. Any competent researcher who focuses on the testing of hypotheses (and not upon their derivation and analysis) will find it easy to specify how the sample will be drawn and from which population it will be drawn, how attrition will be monitored, how the treatment under test will be delivered, and how randomization will be carried out. Such matters can be assessed quite straightforwardly by a potential funding agency. It is much more difficult to assess a proposal that takes the whole scientific cycle seriously, starting with the identification and clarification of a significant problem and moving, via a series of studies and analyses, to a refined hypothesis or treatment that is worthy of being taken seriously enough to be tested.

176.
There are several important subsidiary issues here:

(a) Why is this difficult? Simply because it is literally impossible for a person proposing to do an original piece of research to specify in precise terms, beforehand, what will be done during the early stages of the investigative cycle. Scientific investigation writ large (as opposed to the narrower domain of hypothesis testing) is not a matter of following a precise algorithm, and the investigator cannot predict at the outset what issues will arise and what opportunities will present themselves. Investigation is a creative, but also necessarily a disciplined, process. Einstein is reported to have remarked that a good scientist is an “unscrupulous opportunist,” and Nobel Laureate Percy Bridgman said that “the scientist has no other method than doing his damnedest,” but it is difficult for all but the most trusting funding agency to rate highly a proposal that states “I will be an opportunist and do my damnedest” – however, in reality, this is what a good investigator will do in the early phases of his or her work. And this is surely what good practitioners of DR do. But to repeat: the damnedest, the opportunism, is disciplined.

(b) If the history of science is any guide, it will turn out that the creative processes during what Popper and Dewey identified as the early stages of the scientific inquiry process, and what Reichenbach called the context of discovery – the analyses, the small scale studies and observations, the small scale testing and probing that occurs, the crafting of arguments, and so forth – will often result in the development of such a convincing case (such a convincing “warrant” will be built up) that large scale hypothesis testing may not be needed at all.

177.
Almost any example from the work of great scientists of the past can be drawn upon as an illustration of the importance of the early phases of a piece of scientific inquiry. Consider the work of William Harvey, who built up – over time – the case that the heart pumps blood, which is carried around the body by arteries and veins. He could not have specified what he was going to do, before he did it; at most he could have indicated that he was interested in investigating why it was that animals have hearts. As problems and criticisms arose, he devised inventive ways to counter them, using a variety of types of study (today we would say that he used mixed methods). Eventually the warrant for his claim about the circulation of the blood was undeniable – and it is noteworthy that he did not place reliance upon a sole “gold standard” methodology. It also is sobering to reflect on the fact that he would not have received support from most funding agencies in existence today, because – at the outset of his work – he would not have been able to be specific about what he was going to do. I believe that most if not all design researchers are in the same situation as William Harvey! (The same points could be made in the context of Darwin’s or Pasteur’s work. See Phillips, 2005, for further discussion and references.)

6. This leads to the last difficulty I shall raise in the present remarks: In the light of the five previous points, what advice can I give to a funding agency about the criteria they should look for in assessing DR proposals? I seem to have made a strong case (at least I think I have made a strong case) that design researchers, being good scientists whose focus is healthily much wider than mere hypothesis testing, cannot be precise about what they are going to do at the start of their work. So how can an intelligent decision

178.
be made about whether to fund a piece of DR? I have three tentative suggestions to offer here.

First: Although it is, in my view, unreasonable to expect a design researcher to be precise about what it is that he or she will actually do in the course of the work, it is not unreasonable to expect some indication of where it is hoped that the main contribution will lie. Let me start back with William Harvey: Harvey could say at the outset of his work that he was generally interested in the heart; but note that he did not have several quite different interests. Design researchers are sometimes not as clear as Harvey; for there is an ambiguity here that runs through much of the DR literature, and that needs to be clarified in any DR proposal (this ambiguity bedevils, for example, a number of the contributions to the symposium on DR in the Educational Researcher, 2003). This is the ambiguity over whether the main purpose of a piece of DR is to contribute to an understanding of the design process itself (and what this might mean and why it is potentially important ought to be made clear), or whether it is hoped to throw light on some educationally-relevant phenomenon associated with the program or intervention that is being designed (for example, the learning of certain topics in mathematics by young girls), or whether it is hoped to actually design a technically impressive program or intervention or artifact, or whether it is hoped to do two or all three of these things. This will be helpful information for a funding agency, which will need to decide if such a purpose (or purposes) falls within its domain of interest. (And, of course, to indicate a general interest at the outset of the work is not to suppose that the focus cannot change as the work proceeds and as interesting new phenomena arise.)

179.
Second: The proposal should make clear that the authors understand that the warrants supporting the claim that one or other of these three aims has been achieved are different. The warrant supporting the claim to have improved the design process is different from the warrant that some aspect of learning has been illuminated, and both of these are different from (but sometimes related to) the warrant for the claim that a technically impressive artifact has been produced. In practice these different purposes often are so intertwined that their different logical/epistemic warranting requirements become seriously confounded. Furthermore, it should be shown that the DR team (for most often it will be more than one individual working on the project) has represented within it the bodies of research and development skills that reasonably can be expected to be needed for such an enterprise, whatever it is. (A study of learning requires different skills from mathematics curriculum development, and the production of rigorous warrants for knowledge claims might require even other skills.)

Third: Funding agencies might adopt as their model the way in which projects are chosen for funding in the creative arts. A composer being considered for a commission to produce a new symphony cannot be expected to show what the symphony will be like before she has actually composed it! But the composer’s portfolio of work can be examined, and the training the composer has received can be investigated. It will be known that this person is good with symphonies for small orchestras but not so proficient for works for the solo violin. It will be known what style of music the particular individual favors (atonal, jazz oriented, romantic, triumphal, or so forth) and how this fits with the type of work that the commissioning body has in mind. In the end, those awarding the commission place a bet: Given the past record (or training or

180.
apprenticeship), it is likely that this composer can produce the desired goods. (The MacArthur Foundation does something along these lines with its so-called “genius awards”.) This process is fuzzy, but a surprising degree of inter-rater reliability can be achieved between intelligent judges. And it is perhaps the only way to deal with the complexities of the creative scientific process that is central in the design experiment.

7. I conclude with an “upbeat” example that illustrates many of the preceding points; it is drawn from a documentary I saw when pursuing my addiction to late night television. In the period immediately following the end of World War II the US military, in conjunction with the agency that was the forerunner of NASA, set out on a project to design a plane that could regularly (and safely) fly faster than sound. Not only was it desired to produce a workable product (the X-1 plane), it was also desired to understand the physics – the aerodynamic principles – of flying at speeds greater than Mach 1. In essence, then, the participants were involved in an early piece of design research. It is noteworthy that they obtained funding because the project was a national defense priority and because the participants were experienced in aircraft design and manufacture; they could not (and they did not) specify in detail what they were going to do before they did it – the changes/improvements they made were decided on almost a daily basis. During each test flight Chuck Yeager, the brash pilot, pushed the prototype to its limits, and afterwards he and his mechanics tinkered to try to increase the speed. (He represented the interests of the military, which was to produce a fast plane as quickly as possible.) The researchers, however, resisted, for after each modification they wanted to carry out a detailed program of testing before further variables were tinkered with (thus illustrating the classic clash between development and research). The

181.
situation was alleviated by the use of two planes, one for pushing as hard as possible, the other for slower testing. Maybe there is a moral here for design researchers, and their funders.

REFERENCES

Brown, A. L. (1992). Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. Journal of the Learning Sciences, 2, pp. 141-178.

Collins, A. (1992). Toward a design science of education. In E. Scanlon & T. O’Shea (Eds.), New directions in educational technology. New York: Springer-Verlag.

Educational Psychologist, 39 (4), (2004). Theme issue on design-based research methods for studying learning in context. (Eds.) W. Sandoval and P. Bell.

Educational Researcher, 32 (1), (2003). Theme issue: The role of design in educational research. (Ed.) Anthony Kelly.

Phillips, D.C. (1992). The social scientist’s bestiary. Oxford: Pergamon.

183.
What we learn 156

WHAT WE LEARN WHEN WE ENGAGE IN DESIGN: IMPLICATIONS FOR ASSESSING DESIGN RESEARCH

Daniel Edelson

In this essay, I consider the question of how to assess proposed design research. In contrast to others in this volume, I will approach the question more from the engineer’s perspective than from the scientist’s perspective. The engineer’s approach assumes that one is doing design and development to begin with and asks the question, what generalizable lessons can one learn from those processes? The scientist’s approach assumes that one is engaged in scientific inquiry about learning and asks the question, what can one learn by integrating design into the research process that one could not learn otherwise?

Scientist’s approach to design research in education: How can research be enhanced by integrating iterative design cycles into it?

Engineer’s approach: How can iterative design cycles be used to generate useful research results?

To address the question of how we should assess proposed design research, we must first consider what we can learn from design research and how.

WHAT CAN WE LEARN FROM DESIGN?

184.
A few years ago I published a paper in the Journal of the Learning Sciences on design research with the title, What do we learn when we engage in design? In this paper, I took the engineer’s perspective on design research. I approached the question with the assumption that design is a learning process. When a design team creates a design to achieve a goal under a set of constraints, they must develop an understanding of the goals and constraints they are designing for, the resources they have available for constructing a design, and the implications of alternative design decisions. In a typical design process, the understanding that the design team develops remains implicit in the decisions that they make and the resulting design. I characterize these design decisions as falling into three categories:

Decisions about the design process. These are decisions about what steps to follow in constructing a design, who to involve, and what roles they should play.

Assessments of the nature of the design context. These are decisions about the goals, needs, and opportunities that must or should be addressed by a design. This category also includes decisions about the design context that must be addressed, such as the challenges, constraints, and opportunities in the context.

Decisions about the design itself. This category includes decisions about how to combine design elements and balance tradeoffs in order to meet goals, needs, and opportunities.

While these do not exist in any explicit form in many designs, I find it useful to characterize these sets of decisions as belonging to three implicit entities. The decisions about the design process can be characterized as comprising a design procedure; assessments of the nature of the design context can be characterized as a problem analysis; and decisions about the design itself can be characterized as a design solution.

185.
The engineering approach to design research assumes that the designer is engaged in these local learning processes and asks how the local lessons that a designer learns in order to make decisions about the design procedure, problem analysis, and design solution can be made explicit and public to serve the needs of a larger community. It is important to recognize that the role that design research can play in the larger research endeavor is not hypothesis testing. The appropriate product for design research is warranted theory. As such, the challenge for design research is to capture and make explicit the implicit decisions associated with a design process, and to transform them into generalizable theories. In Edelson (2002), I describe three kinds of theories that can be developed through design research. Each of these corresponds to one set of decisions that designers make:

Domain Theories. A domain theory is the generalization of a portion of a problem analysis. There are two types of domain theories:

A context theory is a theory about a design setting, such as a description of the needs of a certain population of students, the nature of certain subject matter, or of the organization of an educational institution.

An outcomes theory describes the effects of interactions among elements of a design setting and possible design elements. An outcomes theory explains why a designer might choose certain elements for a design in one context and other elements in another.

Design Frameworks. A design framework is a generalized design solution. A design framework provides guidelines for achieving a particular set of goals in a particular context. A design framework rests on domain theories regarding contexts and

186.
outcomes. Van den Akker (1999) also describes design frameworks, which he calls substantive design principles, as being a product of design research.

Design Methodology. A design methodology is a general design procedure that matches descriptions of design goals and settings to an appropriate set of procedures. Van den Akker (1999) calls a design methodology a set of procedural design principles.

In addition to these three forms of theories, another important product of design research is the design as a case. While it is speculative to generalize from any individual design, an accumulation of related cases can form the basis for making supported generalizations. Therefore, it is important to recognize the value of an individual case of innovative educational design to a larger design research and theory development agenda.

HOW CAN WE USE DESIGN AS AN OPPORTUNITY TO CONDUCT DESIGN RESEARCH?

I’ve identified four steps to make an educational design process a useful part of theory development. The first is that it should be research driven. That is, the decisions made in the design process should be informed by a combination of prior research and the design researcher’s emerging theories. The fact that decisions are informed by prior research does not mean that they must be consistent with the findings of prior research. In fact, a design process may be driven by the desire to question prior research, but it must do so for clear and coherent reasons. Neither does it mean that research exists to inform all or even most design decisions. However, a design

187.
researcher should be informed as to where research does and doesn’t exist to shape design decisions. Being aware of research increases the likelihood that a design research program will have impact. The second step toward design research is systematic documentation. To support the retrospective analysis that is an essential element of design research, the design process must be thoroughly and systematically documented. The third step is formative evaluation. Formative evaluation is essential in design research because it can identify weaknesses in the problem analysis, design solution, or design procedure. Ideally, educational design research is conducted in the form of iterative cycles of design, implementation, and formative evaluation. The fourth step toward design research is generalization. To generalize, the design researcher must retrospectively analyze the design-specific lessons to identify appropriate generalizations in the form of domain theories, design frameworks, and design methodologies. Being research-driven, maintaining systematic documentation, and conducting formative evaluation all support the process of generalization.

HOW SHOULD WE ASSESS PROPOSALS FOR DESIGN RESEARCH?

I turn now to criteria one might use in assessing proposals for design research. The first criterion is one that applies to any research proposal: If successful, the proposed work must promise to yield insight into an important need or problem. In the case of design research, this insight would take the form of new or elaborated theory, in contrast to more traditional research, which would be more likely to take the form of evidence for or against an existing theory. Other proposal evaluation criteria follow

188.
from the processes described in the preceding section for using design as an opportunity to conduct research: The proposed design must be grounded in prior research or sound theory; it must have a plan for systematic documentation; it must incorporate formative feedback into a plan for iterative design and development; and it must allow for a process of generalization based on the other elements of the plan.

These criteria, however, address the research side of design research. Taking the engineering perspective seriously requires that we also consider the design side of a design research proposal. We have to appreciate that design research is inherently exploratory and speculative. It begins with the basic assumption that existing practices are inadequate or can, at least, be improved upon, so that new practices are necessary. The underlying questions behind design research are the same as those that drive innovative design:

What alternatives are there to current educational practices?

How can these alternatives be established and sustained?

These questions demand a form of research other than traditional empiricism. We cannot apply traditional research methods to these alternative practices because they do not yet exist to be studied.

Since design research is exploratory, it is inherently risky. In fact, design research may lead to designs that are worse than existing practices because they either lead to unsatisfactory outcomes or they are not feasible to implement. This poses a challenge for the evaluation of proposed research because on the one hand, we want to encourage risk-taking, and on the other we want to limit risk. In order to improve on current practice, we need to take chances on alternatives. In particular, we want to be open to the possibility of dramatically improved outcomes. However, in a resource-limited

189.
world, we want to be cautious with resources and use them wisely.

So, how does one apply these conflicting needs regarding risk to the evaluation of design research proposals? First, it is important to distinguish between innovation and risk. The opportunity offered by design research is the opportunity to foster innovation. While the novelty of innovation carries risk, the criterion that design research should be research-driven helps to mitigate that risk. If the proposed design is grounded in existing research or sound theory, then it can be innovative without being overly risky. If it is not well-grounded, then it may, in fact, be too speculative and carry too much risk. On the other hand, if the design concept at the heart of a design research proposal is not sufficiently innovative, then it may not be worth the investment. In short, a design research proposal must involve design that is innovative enough to explore territory that cannot be explored with traditional means, but it must be guided by sound theory, to ensure that it is not overly speculative.

A second way to manage risk in design research is through the ongoing evaluation that should be part of design research. Thus, the nature of the evaluation plan must be an important consideration in evaluating design research. Specifically, is the formative evaluation plan sufficient for determining whether or not the design is moving toward better results and more insight at every stage in the design cycle? In particular, will the evaluation yield insight into why the design is or is not yielding desired results? It is this “why” question that is essential to good formative evaluation. In learning from design, it is critical to have a sense of why the design is leading to the outcomes that it is. Coupled with this formative evaluation, it is also helpful to have summary evaluation to help limit risk in design research.
Summary evaluation can answer the question, “Is this design approach showing enough promise to deserve to continue the design

190.
process?” Due to the innovative and exploratory nature of design research, we must be careful not to ask this question too early or too often, for fear of abandoning an innovative approach before it has been given the time to reach its potential. Nevertheless, we must ask this question periodically and be willing to abandon an unsuccessful approach, having learned the lessons that could be learned by attempting it. This point about learning lessons from failed designs is important. In considering risk we must recognize that the failure of a design is not the failure of design research. When design research is conducted properly and systematically, just as many lessons can be learned from the design that fails to achieve its goals as from one that does achieve its goals.

Finally, in considering the design side of a design research proposal, it is important to apply the criteria one would use for any design or development proposal. These include: How appropriate is the proposed design approach for the problem or need? What expertise in design do the proposers bring? What is their previous experience with design?

FINAL THOUGHTS: RESOURCES FOR DESIGN RESEARCH

Design research offers the opportunity to create successful innovations and learn lessons that cannot be achieved through design and empirical research independently. In my opinion, this potential for gain outweighs the risk inherent in an exploratory research agenda. In this essay, I have focused on the considerations in assessing proposals for design research. However, there are other important questions regarding design research that remain to be addressed. For example, while I am able to say that good design

191.
research requires plans for research-driven design, systematic documentation, formative evaluation, and generalization, I must acknowledge that we lack accepted methods for use in developing and executing these plans. Each design research effort must essentially invent these methods for itself. Similarly, I argue that design research teams must combine expertise in both design and research. However, it is not clear that we have the capacity to compose teams that combine these forms of expertise in existing educational institutions. Our expertise in educational design tends to be segregated from expertise in research. The former is locked up in institutions that have development responsibility and the latter in research institutions. In addition, we need to know how to integrate these forms of expertise and integrate the practices of these two different communities of practice if design research teams are to be successful. Finally, design research relies on a cooperative relationship between organizations that are responsible for the design and research and organizations that are responsible for implementation. The iterative design, development, implementation, and evaluation cycles that are critical for design research require long-term cooperative effort. These arrangements can be hard to maintain in an environment where those responsible for implementation are subject to demands for short-term results. An innovative design may require several iterations to reach its promise. In an atmosphere of short-term accountability, it may be hard to find implementation sites that can participate in this form of research.
For all these reasons, the challenges of supporting design research go beyond the challenges of evaluating which research to pursue, to include significant challenges to existing human resources and institutional arrangements. Nevertheless, the possibility that substantial improvement in educational outcomes will result from innovative design research merits tackling those challenges.

192.
ACKNOWLEDGMENTS

This paper is based in part on research supported by the National Science Foundation under grants no. REC-9720663 and ESI-0227557. All opinions expressed here are those of the author and do not necessarily reflect the views of the Foundation.

REFERENCES

Edelson, D. C. (2002). Design research: What we learn when we engage in design. Journal of the Learning Sciences, 11(1), 105-121.

van den Akker, J. (1999). Principles and methods of development research. In J. van den Akker, R. M. Branch, K. Gustafson, N. Nieveen, & T. Plomp (Eds.), Design approaches and tools in education and training (pp. 1-14). Boston: Kluwer Academic Publishers.

193.
Quality criteria for design 166
QUALITY CRITERIA FOR DESIGN RESEARCH: EVIDENCE AND COMMITMENTS
Anthony Kelly
Society devotes enormous effort to the education of children and adults. How to improve that effort via research is an ongoing challenge. Currently, researchers from fields as diverse as mathematics education, science education, computer science, and engineering are combining insights to forge a methodology that sits between the traditional randomized trials and qualitative approaches. One set of these approaches is captured by the general label, design research. Design research is interventionist, iterative, process focused, collaborative, multilevel, utility-oriented, and theory driven (see Kelly, 2003). It works with procedural ambiguity, ill-defined problems, and open systems that are socially multi-level and multi-timescale (Lemke, 2001). Design research is often situated within a domain (e.g., mathematics or science) and in many cases uses the structure of the domain as a theoretical guide (e.g., Cobb, Confrey, diSessa, Lehrer, & Schauble, 2003). Together with domain knowledge, design research uses a theory of social learning to pose and address “revisable conjectures” relevant to some “background theory” (see Barab & Squire, 2004). Design research attempts to work within and describe, holistically, learning and teaching in spite of its complexity and local idiosyncrasy. Design research samples

194.
broadly and repeatedly from phenomena in addition to sampling theories or propositions in the literature. Design research does not treat the educational process as some “black box” as randomized trials are wont to do, but embraces the complexity of learning and teaching and adopts towards it an interventionist and iterative posture. It uses ongoing in situ monitoring of the failure or success of (alternative versions of) some designed artifact (software, curricular intervention, tutoring session, etc.) to provide immediate (and accumulating) feedback on the viability of its “learning theory” or “hypothetical learning trajectory” (Cobb et al., 2003). Over time, the accumulated trail of evidence contributes both to artifact re-design and to theory revision. Data collection often involves videotaping of actual learning occurrences, the collection of “learning artifacts” (e.g., students’ work), computer-tracked learning trails, and in some cases may involve other techniques such as clinical interviews or tutoring sessions (Kelly & Lesh, 2000). Design research seeks out and is responsive to emerging factors in complex, naturalistic settings and acknowledges the constraints and influence of the many nested layers impacting school practice, from poverty to policy (Fishman, Marx, Blumenfeld and Krajcik, 2004). In a preparatory document to the conference that shaped this book, Gravemeijer and van den Akker (2003) noted that there are at least three different uses for design research in education:
Design research that aims at shaping an innovative intervention and developing a theory that underpins that intervention.

195.
Design research that aims at creating learning ecologies to investigate the possibilities for educational improvement by bringing about new forms of learning in order to study them.
Design research as a scientific approach to the design of educational interventions, aiming at contributions to design methodology.
Of these three, much attention in the US has been focused on the use of design principles to intervene in educational settings in what have traditionally been called “teaching experiments,” but more recently, “design experiments.” The design experiment does not indicate a single methodology, since among these variants (for examples, see Cobb et al., 2003; Kelly & Lesh, 2000) there exist differences in goals, methods, and measures:
One-to-one teaching experiments;
Classroom teaching experiments (including multi-tiered and transformative versions);
Preservice teacher development studies;
Inservice teacher development studies;
School/district restructuring experiments.
As Gravemeijer and van den Akker noted, a second major emphasis is on the design and trialing of “learning environments” (see Box 9.1 for examples). Within this subcategory, there also exist different goals, methods, and measures of learning or teaching or both.
[ BOX 9.1 ABOUT HERE ]

196.
Also, as noted, some recent effort has been spent reflecting on the need to reconceptualize existing research methodologies or create new ones. Some notable efforts include the various articles in Kelly (2003), Barab and Kirshner (2001) and Barab and Squire (2004). Design research deliverables vary across many genres (see Kelly, 2003), including the examples in Box 9.2.
[ BOX 9.2 ABOUT HERE ]
The short review in Box 9.2 makes plain that a simple or single codification of criteria for judging the quality of “design research” studies or proposals is not plausible or even desirable. What I propose to do, instead, is to bracket the problem by returning to Brown’s seminal article on design experiments (Brown, 1992). Brown did not simply propose a new method; rather:
She advocated mixed methods approaches (qualitative and quantitative) in the same study: measuring magnitude of effects, yet also developing richer pictures of knowledge acquisition.
She did not denounce quantitative measures; even for idiographic studies, she remarked, “it is perfectly possible to subject case studies to statistical analysis if one chose to do so” (p. 156).
She saw design research as involving classroom and laboratory bi-directionality (see McCandliss, et al., 2003).
She recognized that overwhelming amounts of data would be collected, most of it unanalyzable by researchers and peer reviewers due to time constraints.

197.
She warned against the “Bartlett Effect” of selecting episodes that supported one’s favorite hypothesis. “This selection issue is nontrivial. The problem is how to avoid misrepresenting the data, however unintentionally” (p. 162).
She saw the need for dissemination (diffusion) studies: “[I]t is extremely important for the design experimenter to consider dissemination issues. It is not sufficient to argue that a reasonable end point is an existence proof, although this is indeed an important first step” (emphasis added, p. 170).
She advocated scaling studies: “The alpha, or developmental, phase is under the control of the advocate, and by definition it must work for there to be any later phases. It works, though, under ideal supportive conditions. Next comes the beta phase, tryouts at carefully chosen sites with less, but still considerable, support. Critical is the gamma stage, widespread adoption with minimal support. If this stage is not attempted, the shelf life of any intervention must be called into question.” (emphasis added, p. 172).
Brown did not reject the goals of isolating variables and attributing causal impact. She recognized that to go to scale demands unconfounding the variables that were treated as confounded, initially: “I need to unconfound variables, not only for theoretical clarity, but also so that necessary and sufficient aspects of the intervention can be disseminated. The question becomes, what are the absolutely essential features that must be in place [for] ... normal school settings” (p. 173). She did not provide a methodological solution for this problem, but suggested school system analyses and a study of the sociology of dissemination as sources.
We can now draw directly from Brown (1992) for a set of concerns for the judging of the design and claims of design research in education:

198.
Inadequate attention to sampling bias;
Inadequate attention to response bias;
Inadequate attention to researcher bias;
Overwhelming amounts of data and unsatisfactory methods of turning data into evidence;
Confounded variables;
Inadequate attention to scaling up or scaling out studies that test parameters outside the initial sample;
Inadequate attention to dissemination and diffusion studies as tests of the efficacy of the emerging design “products”.
Brown’s original concerns have not evaporated; recent sources documenting similar concerns include Fishman, et al. (2004), Barab and Kirshner (2001), and Shavelson, Phillips, Towne and Feuer (2003). It should be pointed out that the above problems afflict not only design research, but any research methods that attempt to model a phenomenon as complex as education. I do not see the above concerns about design research as invalidating the genre simply because it is new. Similar concerns became apparent in the development of correlational and experimental methods in psychology and education. A review of any current text on traditional experimental design will show a history of fixes and repairs to the original work of Fisher in the early part of the 20th Century (Shadish, Cook & Campbell, 2002). As with these other methods, we can expect that, as a design research methodology matures, guidelines for researchers and journal editors on these concerns will emerge and be codified (Kelly & Lesh, in progress).

199.
COMMISSIVE SPACES IN SOCIAL SCIENCE RESEARCH
At the Dutch seminar that led to this book, I introduced the notion of a commissive space, drawing on Searle’s (1969) speech act theory, particularly the illocutionary/perlocutionary act of the commissive (i.e., a commitment to act in accordance with certain background assumptions) (see also Austin, 1962). From this perspective, science as it is enacted socially is about commitments to certain assumptions that support specialized conversations within a peer group. Absent these conversations (which encompass the education and apprenticeship of new members), the various scientific propositions in books and articles are merely ink on paper. Communities of practitioners develop shared commitments. These commitments – to background assumptions, acceptable verbal moves, and adherence to standards of evidence, warrant, data and technique – constitute the space in which conversations can occur. Conversation here is viewed broadly as communications across many media and across time via researcher training, professional presentations, articles, etc. Violations of the implicit and explicit commitments in the space define the speaker as being outside of one commissive space and (presumably) a member of some other commissive space. Note that within a commissive space, it is often permissible to debate a finding (e.g., disagreements about content, “is the p value significant?”), but not to question the foundational commitments. To question a fundamental commitment (and sometimes even to make the background assumptions explicit) may cause the members of a commissive space to view the questioner as suspect. The questioner is assumed to be a member of some other

200.
commissive space. The other commissive space is, by the nature of the social commitments of the exiling space, in some way foreign or inferior. Note that this judgment is not made according to a set of background assumptions that encompass both commissive spaces; rather, it is made because the questioner enunciated a position that violated the rejecting space’s explicit or implicit rules and assumptions. In educational research one commissive space is evidenced by a commitment to randomized field trials and assumes the analysis of a posteriori data in the light of a priori commitments to notions of sampling, sources of bias, the logic of experimental design, and inferential rules related to probability (Wainer & Robinson, 2003; NRC, 2002; Shavelson et al., 2003). In this space, it is preferable to conduct poor randomized trials, rather than to question the putative model of causation in such trials. Design researchers, in practice, violate many of the assumptions of the randomized field trials commissive space. Reiterating Brown (1992), Collins (1999) noted that a design researcher:
Conducts research in messy (not lab) settings;
Involves many dependent variables;
Characterizes, but does not control, variables;
Flexibly refines design rather than following a set of fixed procedures;
Values social interaction over isolated learning;
Generates profiles; does not test hypotheses;
Values participant input to researcher judgments.
The point I want to make is that these violations of assumptions of the randomized field trials commissive space do not, necessarily, invalidate the scientific claims from the design research commissive space. To so argue is to reduce the human

201.
endeavor of science to its rare confirmatory studies. Rather, I wish to suggest that the commitments for conversations in the two spaces are not fundamentally antagonistic, but can be seen as complementary in a more comprehensive view of the social sciences. To illustrate an antagonistic stance, consider that Shavelson et al. (2003), purportedly criticizing design research, focus their criticism almost exclusively on the narrative work of Bruner. As far as I know, Bruner does not see himself as a design researcher, nor do the design researchers I read base their work (at least centrally) on Bruner’s ideas. Thus, the criticism is directed not against the particulars of design research in practice (whose complexity we see from the introduction). Rather, it is directed against design research’s challenge to Shavelson et al.’s idea of what constitutes warrant for a research claim. In Shavelson et al.’s commissive space, assurance is best established by randomized field trials, which, presumably, are up to the task of ruling out rival hypotheses in complex educational settings (a claim I contest; National Research Council, 2004). Design research is, thus, suspect because it purportedly relies on Brunerian “narratives,” which, in their inability to rule out rival hypotheses (it is asserted), place the design research claims in doubt and exile the methodology outside of the critics’ commissive space. Thus, I read the criticisms of design research by Shavelson et al. (2003) as understandable, but only within one set of commitments. Moreover, the assertion that randomized field trials stand as some kind of “gold standard” in educational research narrows significantly the enterprise of science, devalues processes and procedures known to support the generation of scientific knowledge (e.g., Holton, 1998; Holton & Brush, 2001), and is rejected by the more thoughtful users of randomized trials (Shadish et al., 2002).

202.
THE COMMISSIVE SPACE OF DESIGN RESEARCH
By comparison to the randomized field trials space, which is confirmatory and conservative, design research is exploratory and ambitious. Design research, by contrast, values novelty and unconventional and creative approaches (see Newell et al., 1962). Since design research perturbs and intervenes in learning or teaching situations (and thus primes unexpected behavior), it does not rely (solely) on existing frameworks of measures, but must provide solutions to modeling, sampling, and assessment problems as they emerge (e.g., Barab & Kirshner, 2001). Design research does not assume a mechanical (input/output) model of instruction and learning, but is more organic in its approach. It does not accept simple cause and effect models in complex social settings, so it does not centrally value the establishment of internal validity. Design research does not strive for “context free” claims; rather, it sees context as central to its conceptual terrain. Its goal is to understand and foster meaning making, and it sees this process as necessarily historical, cultural, and social. It thus does not seek to “randomize away” these influences (classifying them as “nuisance variables”), but to engage, understand, and influence them in an act of co-design with teachers and students around the learning of significant subject matter. Thus, design research is not concerned with isolating variables or with making generalizable claims that arise from the satisfaction of techniques establishing external validity. It thus has an affinity with the methodological approaches in personality theory of McAdams (1996, 1999, 2001),

203.
where the move is away from the “averaged” description of subjects – knowledge at the “level of the stranger” – to a more intimate definition of learning. Design research is experimental, but not an experiment. It is hypothesis generating and cultivating, rather than testing; it is motivated by emerging conjectures. It involves blue-printing, creation, intervention, trouble-shooting, patching, repair; reflection, retrospection, re-formulation, and re-intervention. Design research promotes a dialectic between empirical direct observation, video-taped records, co-researchers’ commentary and the design researcher’s own fundamental understanding (models) of the subject matter, students’ and teachers’ emerging models of the subject matter (and in some cases, models of the social classroom milieu). Thus, the multi-factor expertise of the researcher(s) and the commitment and engagement of the “subjects” are paramount. Design research may be seen as a stage-appropriate response in a multi-stage program of research that moves from speculation, to observation, to identification of variables and processes via prototyping, to models, to more definitive testing of those models, to implementation studies, scaling studies, and ongoing diffusion of innovations (Bannan-Ritland, 2003). Thus, design research may lead to, support, and enrich randomized field trials. Indeed, the findings of a randomized trials study may be incorporated into the “theory” being tested in design research. Design researchers choose to work in the “context of discovery” rather than in the “context of verification” (Schickore & Steinle, 2002). Thus, in areas in which little is known (e.g., how to teach and how students learn statistics), exploratory or descriptive work naturally precedes (and informs) randomized field trials, which, incidentally, are meaningless without this foundational work. What variables should be controlled for or measured if the phenomenon is not well understood? 
Design research

204.
may be seen as contributing to model formulation, “meaningfulness” (not yet estimation or validation – Sloane & Gorard, 2003), not (yet) “demarcation” (Kelly, 2004). Ultimately, the use of design research methods is a “point of entry” choice on the researcher’s part – where in the cycle of observation/correlation/experimentation to engage. On the other hand, while it is a category error arbitrarily to impose the adjudicative criteria of one commissive community upon the other, if claims pertinent to a different space are made, then the criteria of the other space are relevant. Thus, if researchers make strong causal claims using only the methods of design research, these claims can and should be met with the force of the argumentative grammar of the randomized field trials’ commissive space (Kelly, 2004). Conversely, unless significant multiple or mixed methods are adopted in randomized field trials (Shadish et al., 2002), statements beyond black-box process models equally trespass on the conceptual terrain of design research and other commissive spaces in educational research.
MOVING FORWARD
Design research should pay greater attention to advances in mixed methods (e.g., Mosteller & Boruch, 2002; Shadish et al., 2002) and more expansive views of randomized field trials (e.g., Dietrich & Markman, 2000; O’Donnell, 2000; Tashakkori & Teddlie, 2000). In particular, since design research focuses on learning, it must consider the foundation for choosing one and abandoning other potential design directions in the light of revised models of transfer (Lobato, 2003). Perspectives on

205.
transfer are also key for tests of generalizability of any models that arise from design research data. It should advance over time from model formulation to estimation and validation (Sloane & Gorard, 2003). Proponents should continue to lead in developing new approaches for video use and analysis: e.g., the CILT work (e.g., http://www.cilt.org/seedgrants/community/Pea_Final_Report_2002.pdf) and Diver (http://www.stanford.edu/~roypea/HTML1%20Folder/DIVER.html). Incidentally, there are many innovative approaches to data collection and analyses at the National Science Digital Library project (http://nsdl.org/). Design researchers need to continue to temper strong causal claims and be clear about the character of their claims and their appropriate evidence and warrant (Kelly, 2004; NRC, 2002). In some cases, it may be feasible and productive to conduct mini randomized trials at choice points and consider actor-oriented perspectives on learning at key junctures. Design research should continue to explore models for diffusion of innovations (Rogers, 2003) so that the deliverables of design research are used within the research, policy and practice fields. Equally, it should explore models for scaling successful innovations (http://www.gse.harvard.edu/scalingup/sessions/websum.htm; http://drdc.uchicago.edu/csu/index.shtml). Since design research is emerging as a new approach to research in applied settings, it is important to recognize its growing appeal. I wish to recognize the support received from the National Science Foundation for a grant to me and Richard Lesh (University of Indiana, Bloomington) on explicating this emerging method. This grant has supported and documented a significant international co-emerging interest in the

206.
role of design in educational research. These meetings led to the special issue of the Educational Researcher (Kelly, 2003), and an upcoming book on design research (Kelly & Lesh, in preparation). Additionally, I have learned of spontaneous examples of design-based research methods during visits to the UK (particularly speaking to researchers associated with the ESRC’s Teaching and Learning Research Programme – TLRP.org – and at Oxford University), Sweden (Gothenburg University, particularly the Learning Study method of the Marton group), the Learning Lab, Denmark, the Center for Research in Pedagogy and Practice in Singapore, and, of course, the NWO and PROO in the Netherlands. My hope is that the creative efforts of design researchers continue to be supported and that these efforts are not prematurely derailed by perceived violations of certain commissive spaces (see Asch, 1966). The methodological work of the next decade should include the articulation and strengthening of the various research commissive spaces that span the program of research from innovation, to diffusion, to societal consequences (Bannan-Ritland, 2003).
REFERENCES
Asch, S. E. (1966). Opinions and social pressure. In A. P. Hare, E. F. Borgatta & R. F. Bales (Eds.), Small groups: Studies in social interaction (pp. 318-324). New York: Alfred A. Knopf.
Austin, J. L. (1962). How to do things with words. UK: Oxford.
Bannan-Ritland, B. (2003). The role of design in research: The integrative learning

214.
From design research 185
FROM DESIGN RESEARCH TO LARGE-SCALE IMPACT: ENGINEERING RESEARCH IN EDUCATION
Hugh Burkhardt
INTRODUCTION
Previous chapters have described the current state of design research and its future prospects, setting out its contribution in bringing research in education closer to what actually happens in classrooms. Indeed, the last thirty years have been remarkable for the shift from the traditional combination of critical commentary on the one hand and laboratory experiments on the other, towards the empirical study of teachers and children in real classrooms. The growth of cognitive science and its application to research in more realistic learning environments has contributed much to this. The establishment of design research over the last decade represents the next step in this sequence. However, there is more to be done before teaching and learning in the majority of classrooms can possibly move to be research-based. How we may get there, and progress so far, is the theme of this chapter. First, we need an established research-based methodology for taking the design research approach forward to produce processes and tools that work well in practice with teachers and students who are typical of the target groups. I shall argue that this methodology is already in place, and describe how it works. This is the research approach that is characteristic of engineering disciplines, with new or better products and processes as the primary outcomes, so we call it the engineering research

215.
approach.¹ Direct impact on practice is the main criterion of quality, though engineering research also delivers new insights – and journal articles. Secondly, we need reliable models of the process of educational change. These we do not yet have. I will outline progress that has been made, giving reasonable hope that a research-based developmental approach can succeed here too. The chapter is structured as follows. The first section begins with a comparison of the traditional craft-based approach to the improvement of professional practice with the research-based approach, going on to outline the key elements in the latter and how various groups contribute to it. The second section discusses the research infrastructure in education, the contributions of the different research traditions to it, and the increased proportion of engineering research needed for the development of new and improved products and processes – and, through this, for impact on policy and practice. The third section describes the process of design, development and evaluation that is characteristic of this engineering research approach. Section four discusses how we may build the skill base needed for such a program, while the fifth section analyses the implications for governments, industry and academia. The sixth section sketches an answer to the question "How much would good engineering cost?", while the seventh section summarises the implications of all this for policy and for the design and development community. This wide-ranging agenda inevitably limits the detail in which the analysis can be justified, or exemplified; the still-too-limited evidence is thus mainly in the references.
¹ Some of the work reported in earlier chapters is engineering research – but most of it is not. The term design research covers a very wide range and I hope to show that the distinction is important.

216.
R↔P: RESEARCH-BASED IMPROVEMENT OF PROFESSIONAL PRACTICE
Educational research still has much less impact on policy and practice than we would wish. If politicians have a problem in their education system, is their first move to call a research expert? Not often. Indeed, in most countries, there is no obvious link between changes in practice and any of the research of the tens of thousands of university researchers in education around the world. It is not like that in more research-based fields like medicine or engineering. Let us try to identify why.
Craft-Based versus Research-Based Approaches
I shall use two contrasting terms for improvement methodologies, craft-based and research-based. Like most dichotomies, it is an over-simplification but, I believe, useful. In all professional fields, there is recognised good practice, embodying the established craft skills of the field. These are based on the collective experience of practitioners – they must always have a response to every situation that presents itself, whether they are teachers in a classroom, doctors in a surgery, or administrators running a system. Good practice, and the skills it involves, is passed on by experienced professionals to new entrants in their training. This is what I mean by the craft-based approach. Historically, it was the approach of the craft guilds including, among many others, doctors and teachers. In this approach, innovation comes from a few people pushing the boundaries of good practice, trying something new and seeing if it works – for them. This sometimes involves the invention of new tools – instruments, teaching

217.
materials, etc. Others learn about it, and some try it; on the basis of this experience, they decide whether to adopt the innovation. If many take it up, it gradually becomes part of good practice – even then, it may be adopted by only a small minority of the profession. All fields start with this approach. It has strengths and limitations. It is inexpensive and anyone can take part. However, in judging an innovation it lacks systematic evaluation of effectiveness in well-defined circumstances. For whom, for what, and when does it work, and with what range of outcomes? Further, it inevitably depends on the extrapolation of current experience in a clinical context; since any extrapolation is inherently unreliable, such exploration tends to be limited in scope. Thus the craft-based approach to innovation is limited in the range of innovative possibilities it explores, and in the reliability of its conclusions. This has led to the search in many fields for more powerful and more systematic, research-based approaches. Millennia ago, engineering took the lead in this. Starting in the late 19th century, medicine began to follow. Education set out on this path in the 20th century. Other fields are also instructive. The clothing industry remains firmly craft-based, with changes driven by changes of fashion – hemlines go up and down, just as educational fashion swings back and forth between basic skills and problem solving; however, performance clothing for campers, climbers or astronauts is substantially research-based, and even has useful influence on some areas of fashion. For education, medicine seems a better model than the fashion industry. Research-based approaches to improvement in large-scale educational practice are the theme of this chapter. Here it will suffice to say that their methods aim to:
build on results from past research, as well as best practice;

218.
use research methods in a systematic process of exploring possibilities, then
develop tools, and processes for their use, through creative design and successive refinement based on using research methods to get rich and detailed feedback in well-specified circumstances.
This is inevitably a slower and more costly approach than craft-based innovation. Over many years, research-based innovations gradually make an increasing contribution to the quality of practice in a field. Engineering is now largely research-based – bridges, aeroplanes and other products are designed from well-established theories and the known properties of materials. Over the last century, medicine has moved from being entirely craft-based to being substantially research-based. Fundamental discoveries, particularly in molecular biology, and a huge research effort have accelerated that process. However, many treatments, in areas from the common cold to low-back pain, are still largely craft-based. Alternative/complementary medicine is almost entirely craft-based. However, the areas that have the firmer foundations of the research-based approach gradually expand. Education has only begun to move in this direction; nonetheless, progress has been made and is being made in both methodology and outcomes. What follows outlines some of it.
Key Elements for Research-Based Improvement
In a recent paper on Improving educational research (Burkhardt and Schoenfeld 2003, referred to here as IER) we described the elements of R↔P mechanisms that are common to successful research-based fields of professional practice such as medicine and the design and engineering of consumer electronics. They all have robust mechanisms for taking ideas from laboratory scale to widely used practice. Such

219.
mechanisms typically involve multiple inputs from established research, the imaginative design of prototypes, refinement on the basis of feedback from systematic development, and marketing mechanisms that rely in part on respected third-party in-depth evaluations. These lab-to-engineering-to-marketing linkages typically involve the academic community and a strong research-active industry (for example, the drug companies, Bell Labs, Xerox PARC, IBM, Google,...). The following elements are all important in achieving effective and robust products:
a body of reliable research, with a reasonably stable theoretical base, a minimum of faddishness and a clear view of the reliable range of each aspect of the theory. This in turn requires norms for research methods and reporting that are rigorous and consistent, resulting in a set of insights and/or prototype tools on which designers can rely. The goal, achieved in other fields, is cumulativity: a growing core of results, developed through studies that build on previous work, which are accepted by both the research community and the public as reliable and non-controversial within a well-defined range of circumstances. Such a theory base allows for a clear focus on important issues and provides sound (though still limited) guidance for the design of improved solutions to important problems.
stable design teams of adequate size to grapple with large tasks over the relatively long time scales required for sound work of major importance in both research and development. Informed by the research base and good practice, they add other crucial ingredients – design skill, even brilliance and....
systematic iterative development that takes the tools through successive rounds of trialling and revision in increasingly realistic circumstances of use and users.

220.
independent comparative evaluation-in-depth provides validation (or not): Do they work as claimed, in the range of circumstances claimed? This provides the basis for ....

individual and group accountability for ideas and products, so that reputations are built on a track record of evidence.

It should be clear that this approach requires ongoing funded development programs on realistic time-scales, funded by clients who understand the process. To do it well needs substantial teams; it cannot be done by individual researchers on few-year time-scales – the normal circumstances of university research in education.

Around the world there are some well-established high-quality engineering research groups – EDC, TERC and COMAP are notable US examples; however, these and most other groups work from project to project, with no continuity of funding or, consequently, of work. Only the Freudenthal Institute has had a substantial team (now about 70 people) supported by continuing funding from the Netherlands Government over many decades [2]; the quality of their work on basic research, design and engineering is universally recognised – and reflected in the performance of the Netherlands in international comparison tests in mathematics. It represents the best current exemplar for governments elsewhere to study.

Key Contributors – Roles and Barriers

In this approach, which are the key groups of players, what are their roles – and current barriers to their fulfilment?

[2] The Shell Centre, also founded about 40 years ago with similar goals, has a team of about 5 people and lives from project to project on short-term funds. Sustaining long-term strategies for improvement in these circumstances requires stubborn selectivity – and luck.

221.
Client-funders, as well as providing the money that supports the work, should be partners in goal and product definition through a continuing process of negotiation that reconciles their goals and design in the best solution that can be devised; the main barriers to this are often an unquestioned acceptance of the traditional craft-based approach, left undisturbed by the lack of comparative evaluation-in-depth of the effectiveness of products, or an insistence on simple solutions that will not, in practice, meet their goals.

Project leaders lead the strategic planning and ongoing direction of their team’s work. This involves negotiation with funders, and process management at all levels from design to marketing; the main barriers to their development are the lack of continuity produced by one-off project funding, and the consequent absence of a career path for engineering researchers – i.e. systems for training and apprenticeship, appointment and promotion, and recognition.

Designer-developers provide excellence in design, and in refinement through feedback from trials; the main barriers are, again, the lack of any career path for designers or the evaluation in depth that would enable the recognition of excellence in design.

Insight-focused researchers build the reliable research base, and carry through the comparative evaluation-in-depth of products, both formative and summative, which is so important; the main barrier is the academic value system, which undervalues such project-focused work, and also contributes to the other barriers above.

Each group needs to play these vital roles. In Section 5, we look at the changes that are needed in their current working environments to make this possible.

222.
THE RESEARCH INFRASTRUCTURE – INSIGHT VERSUS IMPACT AS RESEARCH GOALS

At a fundamental level, the relative impotence of research in education arises from the interaction of different research traditions and styles, characteristic respectively of the humanities, sciences, engineering, and the arts. For this analysis, we need to go beyond the familiar controversies and paradigm wars in education; well-organised fields recognise that strength in research requires a wide range of approaches, tailored to the problems in hand. Let us take a broader view, looking across fields at the four characteristic research styles and asking how each contributes in education. For this, it is useful to have a definition of research (HEFC 1999), designed to cover all fields [3]:

Research is to be understood as original investigation undertaken in order to gain knowledge and understanding. It includes work of direct relevance to the needs of commerce and industry, as well as to the public and voluntary sectors; scholarship; the invention and generation of ideas, images, performances and artifacts including design, where these lead to new or substantially improved insights; and the use of existing knowledge in experimental development to produce new or substantially improved materials, devices, products and processes, including design and construction.

[3] UK university departments in all subjects undergo a Research Assessment Exercise every five years. This was the definition of research for the 2001 RAE.

If you then look for a fundamental measure of quality in research across all fields, it is difficult to go beyond impressing key people in your field – but the balance

223.
of qualities that achieves this varies. What balance would be most beneficial for education, and how well is it reflected in current criteria for excellence in research? Let us look at each style in turn, the nature of the activities, the forms of output and, in the context of education, the potential impact on students’ learning in typical classrooms.

The Humanities Approach

This is the oldest research tradition, which was summarised for the aforementioned RAE (2001) exercise as “original investigation undertaken in order to gain knowledge and understanding; scholarship; the invention and generation of ideas …. where these lead to new or substantially improved insights.” Empirical testing of the assertions made is not involved. The key product is critical commentary, usually published in single-author books, journal papers or, indeed, journalism.

There is a lot of this in education, partly because anyone can play in making assertions, “expert” or not; indeed, there is no popular acceptance of expertise. The ideas and analysis in the best work of this kind, based on the authors’ observation and reflections on their experience, are valuable. Without the requirement of further empirical testing, a great deal of ground can be covered. However, since so many plausible ideas in education have not worked in practice, the lack of empirical support is a major weakness. How can you distinguish reliable comment from plausible speculation? This has led to a search for “evidence-based education.”

The Science Approach

This approach to research is also focussed on the development of better insights, of improved understanding of how the world works, through the analysis of

224.
phenomena, the building of models which explain them, but now with empirical testing of those models. This last is the essential difference from the humanities approach – the assertions made, now called hypotheses or models, depend for credibility on rigorous empirical testing. The key products are: assertions with evidence-based arguments in support, including evidence-based responses to key questions. The evidence must be empirical, and presented in a form that could be replicated. The products are conference talks and journal papers.

Alan Schoenfeld (2002) has suggested three dimensions for classifying research outputs:

Generalizability: To how wide a set of circumstances is the statement claimed to apply?

Trustworthiness: How well substantiated are the claims?

Importance: How much should we care?

Typically, any given paper contains assertions in different parts of this 3-dimensional space. Importance, a key variable, could usefully distinguish ‘insight’ from ‘impact’. Figure 10.1 focuses on the other two variables, G and T, say. A typical research study looks carefully at a particular situation, perhaps a specific treatment and student responses to it. The results are high on T, low on G – the zone A in the figure. Typically, the conclusions section of the paper goes on to discuss the ‘implications’ of the study, often much more wide-ranging but with little evidence to support the generalisations involved, which are essentially speculative (in the humanities tradition) – shown as X, Y and Z.

[ FIGURE 10.1 ABOUT HERE ]

225.
Much research really provides evidence on treatments, not on the principles the authors claim to study; to probe the latter, one needs evidence on generalizability – one must check stability across a range of variables (student, teacher, designer and topic). Only substantial studies, or sometimes meta-analysis, can meet this need. The ‘Diagnostic Teaching’ work of Alan Bell, Malcolm Swan and others illustrates this. (The approach is based on leading students whose conceptual understanding is not yet robust into making errors, then helping them to understand and debug them through discussion – see, for example, Bell [1993].) Their first study compared a diagnostic teaching treatment to a standard ‘positive only’ teaching approach. It showed similar learning gains through the teaching period (pre- to post-test) but without the fall-away over the following six months that the comparison group showed. This study was for one mathematics topic, with the detailed treatment designed by one designer, taught by one teacher to one class. Only five years later, when the effect was shown to be stable across many topics, designers, teachers and classes, could one begin to make reasonably trustworthy statements about diagnostic teaching as an approach. Few studies persist in this way. Even then, there will remain further questions – in this case, as so often, about how well typical teachers in realistic circumstances of support will handle diagnostic teaching. Work on this continues.

My general point is that insight-focused research with adequate evidence on its range of validity needs time and teams beyond the scale of an individual PhD or research grant. Other academic subjects, from molecular biology to high-energy physics, arrange this; if it were more common in education, the research could have high G and T and, if the importance were enough, be worth the serious attention of

226.
designers and policy makers.

There is a further issue of grain size. Studies of student learning in tightly controlled laboratory conditions are too artificial to use directly in guiding design for the classroom. At the other extreme, studies of student performance on complete programs tell you little about the classroom causes. Design research addresses the key problems directly.

The science approach is now predominant in research in science and mathematics education; it is not yet influential in policy formation. Such research provides insights, identifies problems, and suggests possible ways forward. Design research takes this forward into more realistic learning environments; it does not itself generate practical solutions for situations that are typical of the system – that also needs good engineering.

The Engineering Approach

This aims to go beyond improved insights to direct practical impact – helping the world "to work better" by not only understanding how it works but also developing robust solutions to recognised practical problems. It builds on science insights, insofar as they are available, but goes beyond them. Within the broad RAE (2001) definition mentioned previously it is “the invention and generation of ideas … and the use of existing knowledge in experimental development to produce new or substantially improved materials, devices, products and processes, including design and construction.” Again there is an essential requirement for empirical testing of the products and process, both formatively in the development process and in evaluation. The key products are: tools and/or processes that work well for their intended uses and

227.
users; evidence-based evaluation and justification; responses to evaluation questions. When, and only when, it includes these elements, development is engineering research.

This is still uncommon in education – though there are many good examples, it is not the way most challenges are tackled [4]. In the academic education community such work is often undervalued – in many places only insight-focussed research in the science tradition is regarded as true research currency. In this environment, it is not surprising that most design research stresses the new insights it provides rather than the products and processes it has developed, valuable though these could be if developed further.

The effects of the current low academic status of ‘educational engineering’ include:

lower standards of materials and processes, since the imaginative design and rigorous development that good engineering demands are not widely demanded;

lower practical impact of important results from insight-focused research, since designers feel less need to know or use this research;

pressure on academics in universities to produce insight research papers, rather than use engineering research methods. (These could also be used to improve their own practice, both in effectiveness and in transferability to others.)

All this leaves an unfortunate hiatus between insight-focused research and improved classroom practice. Society’s priorities for education are mainly practical – that young people should learn to function in life as effectively as possible, including the personal satisfaction and growth that good education provides.

[4] There are some areas where the need for careful development is recognised, notably in the development of tests; here the results have not been encouraging, largely because the design has often been approached on a narrow basis, dominated by traditional psychometrics, with little attention to whether what is actually assessed truly reflects the learning goals of the subject in a balanced way. Cognitive science research and better design are both needed here.

The failure of educational research to deliver in practical terms is reflected in the low levels of

228.
support for it.

The ‘Arts’ Approach

A shift in the criteria of excellence in the fine arts is noteworthy. Fifty years ago, only critics and historians of art or music would be appointed to senior academic positions; now active artists, painters and composers, are appointed – as are designers and innovative practitioners in engineering and medicine. This may be seen as related to the ‘humanities’ approach rather as ‘engineering’ is to ‘science’. It reminds us that design is more than the routine application of a set of scientific principles. (Indeed, the finest engineering, from the Porsche to the iPod, has a strong aesthetic aspect.) It enriches education and could do so more.

Integrating the Traditions

In summary, let me stress that this is far from a plea for the abandonment of insight-focussed science research in education, or even the critical commentary of the humanities tradition; these are essential, but not nearly enough. Rather, it is an argument about balance – that there should be much more impact-focussed engineering research and that it should receive comparable recognition and reward. The different styles can and should be complementary and mutually supportive. But more engineering research is essential if impact is to be a research priority of the field. As we deliver impact, our work will become more useful to practice, more influential on policy – and, as other fields have shown, much better funded.

The Status and Roles of “Theory”

229.
Finally, some related comments on “theory,” seen as the key mark of quality in educational research as in most fields. I am strongly in favour of theory. (Indeed, in my other life, I am a theoretical physicist.) However, in assessing its role, it is crucial to be clear as to how strong the theory is. From a practical point of view, the key question is: How far is current theory an adequate basis for design?

A strong theory provides an explanation of what is behind an array of observations. It is complete enough to model the behaviour it explains, to predict outputs from conditions and inputs. In fields like aeronautical engineering the theory is strong; the model is complete enough to handle nearly all the relevant variables so that those who know the theory can design an aeroplane at a computer, build it, and it will fly, and fly efficiently. (They still flight test it extensively and exhaustively.) In medicine, theory is moderately weak, so that trial design and testing is more central. Despite all that is known about physiology and pharmacology, much development is not theory-driven. The development of new drugs, for example, is still mainly done by testing the effects of very large numbers of naturally occurring substances; they are chosen intelligently, based on analogy with known drugs, but the effects are not predictable and the search is wide. However, as fundamental work on DNA has advanced, and with it the theoretical understanding of biological processes, designer drugs with much more theoretical input have begun to be developed. This process will continue. Looking across fields, it seems that the power of theory and the engineering research approach develop in parallel.

Education is a long way behind medicine, let alone engineering, in the range and reliability of its theories. By overestimating their strength, damage has been done to

230.
children – for example by designing curricula based largely on behaviourist theories. It is not that behaviourism, or constructivism, is wrong; indeed, they are both right in their core ideas but they are incomplete and, on their own, an inadequate basis for design. Physicists would call them ‘effects’. The harm comes from overestimating their power.

Let me illustrate this with an example from meteorology. “Air flows from regions of high pressure to regions of low pressure” sounds and is good physics. It implies that air will come out of a popped balloon or a pump. It also implies that winds should blow perpendicular to the isobars, the contour lines of equal pressure on a weather map, just as water flows downhill, perpendicular to the contour lines of a slope. However, a look at a good weather map in England shows that the winds are closer to parallel to the isobars. That is because there is another effect, the Coriolis Effect. It is due to the rotation of the earth which ‘twists’ the winds in a subtle way, anticlockwise around low pressure regions (in the northern hemisphere). In education there are many such effects operating but, as in economics, it is impossible to predict just how they will balance out in a given situation. Thus design skill and empirical development are essential, with theoretical input providing useful heuristic guidance. The essential point is that the design details matter – they have important effects on outcomes and are guided, not determined, by theory [5].

This paper is about how to achieve educational goals. I will not explicitly discuss the goals themselves. This is only partly because the subject is huge and disputatious.

[5] The difference between Mozart, Salieri, and the hundreds of little-known composers from that time was not in their theoretical principles; it was what each did with them. The principles and rules of melody, harmony and counterpoint were well-known to, and used by, all of them.

I believe that, in mathematics and science at least, many of the apparently goal-focussed disputes are in fact based on strongly-held beliefs about how to achieve

231.
them, some flying in the face of the research evidence. For example, those who think mathematics should focus on procedural "basic skills" in arithmetic and algebra usually want students to be able to solve real world problems with mathematics, but hold that "they must have a firm foundation of skills first." It is true that some with a strong faith-based world view believe that schools should not encourage students to question authority – an essential aspect of problem solving and investigation; however, even here, the greater challenge is to equip teachers to handle investigative work in their classroom. There is, therefore, an implicit assumption here that the educational goals we address are those that research on learning and teaching suggests are essential – a challenging enough set.

SYSTEMATIC DESIGN, DEVELOPMENT AND EVALUATION

From this analysis of framework and infrastructure, it is time to move on to look in more detail at engineering research itself – the methodology that enables it to produce high-quality tools, with processes for their effective use. The approach is based on a fusion of the elements already discussed:

research input from earlier research and development worldwide;

design skill, led by designers who have produced exceptional materials;

co-development with members of the target communities;

rich, detailed feedback from successive rounds of developmental trials to guide revision of the materials, so that intentions and outcomes converge;

a well-defined locus of ‘design control’, so that wide consultation can be combined with design coherence.

232.
Typically, there are three stages: design, systematic development, and evaluation. I discuss each in turn, with brief exemplification from one project. Research of various kinds plays several roles: input to design from earlier research; research methods for the development process; research in depth for evaluation, to inform users on product selection and the design community on future development.

The example I shall use is from the (still ongoing) development of support for mathematical literacy, now called functional mathematics in England. Our work began in the 1980s with a project called Numeracy through Problem Solving (NTPS, Shell Centre 1987-89). UK Government interest has recently revived (Tomlinson Report 2004), partly through the emergence of PISA (OECD 2003) and (yet again) of employers’ concern at the non-functionality of their employees’ mathematics.

Design

The importance of sound design principles, based on the best insight research, has long been clear. They are necessary but not sufficient for the production of excellent tools for practitioners, essentially because of our current far-from-complete understanding of learning and teaching. The other key factor is excellence in design; it makes the difference between an acceptable-but-mediocre product and one that is outstanding, empowering the users and lifting their spirits.

Excellent design is balanced across the educational goals, covering both functional effectiveness and aesthetic attractiveness. (Porsches are the wonderful cars they are because they’re classy and superbly engineered. There are sleek-looking cars one wouldn’t want to own, for very long at least.) As in every field, design that is aimed at superficial aspects of the product (textbooks in 4-color editions, with lots of sidebars

233.
with historical and other trivia, pictures that have no connection with the topic, etc...) at the expense of effectiveness (in promoting learning) is poor educational design. (Focused on increasing sales rather than increasing student understanding, it may be good business design.)

In the humanities and the arts, teachers can build their lessons around outstanding works – of literature or music, for example, in their infinite variety. Because the learning focus in mathematics and science is on concepts and theories rather than their diverse realisations, excellence has to be designed into the teaching materials. How to achieve such excellence is less well understood, or researched – partly because its importance is not widely recognised in education.

Design skill can be developed but it is partly innate, born not made. It grows with experience over many years. Outstanding designers seem to work in different ways, some being mainly driven by theoretical ideas, others by previous exemplars, or by inspiration from the world around them. All have the ability to integrate multiple inputs to their imagination. All have deep understanding of the craft skills of the environment they design for – mathematics or science teaching, for example. Quality seems to lie in combining specific learning foci with a rich complexity of connections to other ideas, integrated in a natural-seeming way that feels easy in use. Not so different from literature and music.

Design excellence is recognisable; people tend to agree on which products show design flair. At the present crude level of understanding of design, the best advice to project leaders and institutions is heuristic – look for outstanding designers, and give them an environment in which they flourish and develop. Above all, if you want outstanding products, don’t over-direct designers with detailed design objectives and

234.
constraints; balancing and fine tuning can be done later by other, analytic minds. Keep each design team small so that communication is through day-by-day conversation, rather than management structures. (The extraordinary research creativity of Xerox PARC was achieved with a maximum of 10 people per research team.)

The stages of the design phase are typically:

outline agreement with the client on the broad goals and structure of the product;

generating design ideas within the design group, in consultation with experts and outstanding practitioners;

drafting materials, which are tried out in the target arena (e.g. classrooms) by the lead designer and others in the design group, then revised, producing the all-important ‘alpha version’.

In practice, as always with creative processes, there is cycling among these stages.

Design control is a concept we at the Shell Centre have found to be important to the progress of any project. The principle is that one person, after appropriate consultation with the team, takes all the design decisions on the aspect they control. It has two major advantages. It retains design coherence, which improves quality, and it avoids extended debates in the search for consensus, which saves time and energy. If a consensus is clear in discussion, the designer is expected to follow it – or have a very good explanation for doing something else. (Everyone is expected to take very seriously the empirical feedback from the trials.)

Numeracy through Problem Solving moved through the design phase in the following way:

In the mid-1980s, there was general discontent with the national system of

235.
assessment at age 16, stimulated in mathematics by the Cockcroft Report (1979). There was an opportunity to try new examinations, linked to a recognition that the academic remoteness of mathematics was not ideal or essential – that functional mathematics should be tried. I proposed to the Joint Matriculation Board that the Shell Centre team should develop a new assessment with them and, because it broke new ground, teaching materials to enable teachers to prepare for it. We agreed on five to ten 3-week modules, each with its own assessment, both during the module and afterwards.

We assembled a group of six innovative teachers and, with a few outside experts on mathematics education, held a series of brainstorming sessions on topics from the outside world that might make good modules. About 30 topics were considered; individuals or pairs drafted rough few-page outlines for each topic. After much discussion, we settled on ten for further exploratory development, with a fairly clear view of their order. (In the end, there were five.)

Design control was clarified. Malcolm Swan would lead the design, particularly of the student materials and assessment tasks. John Gillespie organised the trialling and relations with schools, and led the design of one module. Barbara Binns wrote notes for teachers, and managed the development process, including links with the examination board. I led on strategic issues (for example, an initial challenge was to ensure that the design remained focused on the unfamiliar goal, functional mathematics in the service of real problem solving, rather than reverting to ‘just mathematics’), on the overall structure of the product, and how we would get there. Everyone contributed ideas and suggestions on all aspects of the work.

In understanding the challenge of this kind of problem, we decided to break each module into four stages, characteristic of good problem solving: understanding the

236.
problem situation; brainstorming; detailed design and planning; implementation and evaluation. A key challenge in all investigative work is to sustain students’ autonomy as problem solvers, without their losing their way or being discouraged. We decided that students should work in groups of 3-4, guided by a student booklet. Individual assessment at the end of each stage would monitor the understanding of each student. Among many design details, the booklet gave strategic guidance on what to do in each stage, with delayed checklists to ensure that nothing essential had been overlooked.

‘Design a Board Game’ was chosen as the first module to develop. Understanding of what this involves was achieved by creating a series of amusingly bad board games for the students to play, critique, and improve. (The students were delighted that these "wrong answers" came from the examination board – an unexpected bonus.) The Snakes and Ladders assessment task in Figure 10.2 exemplifies this.

[ FIGURE 10.2 ABOUT HERE ]

They now understood that a game needs a board and a set of rules, it should be fair – and should end in a reasonable time! Each group then enjoyed exploring a range of ideas. The design and construction of their board, and testing the game followed. (In one enterprising trial school, this became a joint project with the art department.) Evaluation was accomplished by each group playing the other groups’ games, commenting on them, and voting for a favourite.

Notes for teachers had been built up by the team through this try-out process. They and the student books were revised and assembled into first draft form, ready for trials.

237.
The final examinations were designed rather later. There were two papers for each module, called Standard and Extension Levels, which assessed the students’ ability to transfer what they had learnt, to less- and more-remote problem situations respectively. These were externally scored by the board. Basic level was awarded on the basis of the assessment tasks embedded in the teaching materials. Assessing the groups’ products was seen as a step too far.

To summarise, it is the integration of research-based design principles and excellence in design with appropriate educational goals that produces really exceptional educational products.

Systematic Iterative Development

The design process produces draft materials. The team has some evidence on the response of students, albeit with atypical teachers (the authors), but none on how well the materials transfer, helping other teachers to create comparable learning experiences in their classrooms. It is systematic development that turns fine drafts into robust and effective products. It involves successive rounds of trials, with rich and detailed feedback, in increasingly realistic circumstances.

The feedback at each stage guides the revision of the materials by the design team. Feedback can take many forms; the criterion for choosing what information to collect is its usefulness for that purpose. This also depends on presenting it in a form that the designers can readily absorb – too much indigestible information is as useless as too little; equally, it depends on the designers’ willingness to learn from feedback, and having the skills to infer appropriate design changes from it. Cost-effectiveness then implies different balances of feedback at each stage. In the development of teaching

238.
materials, these typically are:
Alpha trials in a handful of classrooms (normally 5-10), some with robust teachers who can handle anything and others more typical of the target group. This small number is enough to allow observers to distinguish those things that are generic, found in most of the classrooms, from those that are idiosyncratic. The priority at this stage is the quality of feedback from each classroom, including:
• structured observation reports by a team of observers, covering in detail every lesson of each teacher;
• samples of student work, for analysis by the team;
• informal-but-structured interviews with teachers and students on their overall response to the lesson, and on the details of the lesson materials, line by line.
The process of communicating what has been found to the designers is important, and difficult to optimise. We like to have meetings, in which the observers share their information with the lead designer in two stages, presenting:
• first, an analytic picture of each teacher in the trials, working without and with the new materials;
• then a line-by-line discussion of the materials, bringing out what happened in each of the classrooms, noting where the materials did not communicate effectively to teacher or students, and how the intended activities worked out.
The discussion in these sessions is primarily about clarifying the meaning of the data, but suggestions for revision also flow. The role of the lead designer in this process is that of a listener and questioner, absorbing the information and suggestions, and

239.
integrating them into decisions on revision.
Revision by the lead designer follows, producing the beta version.
Beta trials follow. The priorities are different now, focussed on the realisation of the lessons in typical classrooms. A larger sample (20-50) is needed. It should be roughly representative of the target groups. (We have usually obtained stratified, reasonably random samples by invitation – "You have been chosen...." has good acceptance rates, particularly when the materials can be related to high-stakes assessment.) Within given team resources, a larger sample means more limited feedback from each classroom, largely confined to written material from samples of students. Observation of the beta version in use in another small group of classrooms is an important complement to this.
Revision by the lead designer again follows, producing the final version for publication.
The development of NTPS worked very much in this way. Numerous improvements were made as the result of the feedback from the alpha trials, some removing bugs in the activities themselves, or in the teachers' misunderstanding of the guidance, others incorporating good ideas that emerged from individual classrooms. The beta trials were mostly checking and validating with the larger sample what we had learnt – they produced many small changes. The evaluative feedback we received provided a substantial basis for a summative view of the outcomes, positive in terms of student achievement and (vividly) of attitude to mathematics. A notable feature was the gap-narrowing between previously high- and low-performing students – an important equity goal that is notoriously hard to achieve. It seemed to arise largely from discomfort with non-routine tasks for some, and improved motivation for others. Because of pressures

240.
and priorities, none of this data was collected in a sufficiently structured way for a research journal – a defect, common in such work, that one would like to have the time and resources to overcome.
This is not the end of the process. Feedback from the field will guide future developments. Both informal comments from users and more structured research will produce insights on which to build. Changing circumstances may lead to further development of the product.
NTPS, with its own examination, was sidelined by the introduction of the GCSE as a universal examination at age 16. To continue to serve the schools that had become enthusiastic about this functional approach to mathematics, we and the examination board developed a GCSE built around it. As so often, fitting into this new framework led to compromises and some distortion of the approach, with more emphasis on imitative exercises rather than mathematics in use.
Such an engineering research methodology is common in many fields for the development of tools and processes so as to ensure that they work well for their intended users and purposes. It is still often neglected in education for the craft-based approach, which may be summarised as: write draft materials from your own experience; circulate to an expert group; discuss at meetings; revise; publish. This is quicker and cheaper, but does not allow substantial innovations that work effectively for the whole target community of users.
Weak design and development can produce costly flaws. (For example, the unintended consequences of pressure for simple tests in mathematics include a destructive fragmentation of learning as teachers teach to the test.) Indeed it is well known in engineering that the later a flaw is detected, the more it costs to fix – more by

241.
orders of magnitude!
Comparative Evaluation-In-Depth
This third key element in the engineering approach is also the least developed⁶. It is nonetheless critically important, for:
• policy makers and practitioners, guiding choices of materials and approaches;
• all the design teams, informing product improvement and future developments.
For the first of these, evaluation needs to be seen to be independent; for both it needs to look in depth at:
• widely available treatments, competing in the same area;
• all important variables: types of user, styles of use, and levels of support (professional development etc);
• outcome measures that cover the full range of design intentions, including classroom activities as well as student performance;
• alternative products, their approaches and detailed engineering.
The scale implied in this specification explains why, as far as I know, there has been no such study, anywhere in the world, although the research skills it needs are in the mainstream of current educational research.
In a back-of-the-envelope look at what such a study in the US might entail, I estimate:
• Time scale: Year 1 preparation and recruiting; year 2 piloting, schools start curriculum; year 3-6 data capture; year 4-7 analysis and publication; year 5-7
⁶ The US What Works Clearinghouse in its study of mathematics teaching materials, the most active area of materials development, found no study of this kind to review – this in a country with tens of thousands of educational researchers, many of them evaluators (see Schoenfeld, 2006).

242.
curriculum revision; then loop to year 3 with some new materials.
• Grade range: 3 middle grades, ages 11-14 (others a year or two behind)
• Curricula: 9 diverse published curricula; 2 or 3 focal units per year in each curriculum.
• School systems: 10 nationwide, diverse, subject to agreement to go with the methodology
• Schools and classrooms: 2 classrooms per grade in 10 schools per system, assigned in pairs to 5 of the curricula "by invitation"⁷, with levels of professional development and other support that the school system agrees to reproduce on large-scale implementation
• Data: Pre-, post- and delayed-test scores on a broad range of assessment instruments; ongoing samples of student work; classroom observation of 10 lessons per year in each class, with post-interviews; questionnaires on beliefs, style and attitudes⁸, pre-, post- and delayed-.
A rough estimate of the cost of such an exercise focuses on lesson observation, the most expensive component. Assuming one observer-researcher for 8 classrooms (i.e. 80 observations per year, plus all the other work), this implies for each grade range: 10 school systems * 20 classrooms * 3 grades / 8 = 75 researchers @ $100K per year ~ $10M a year for the 3 main years, of 3 to 6, of the study ~ $30 million including leadership, support and overhead. With five universities involved, each would need a team of about 15 people covering the necessary range of
⁷ The issues of sampling, random or matched assignment, etc need ongoing study and experiment.
⁸ These can also be designed to probe teachers' pedagogical content knowledge of mathematics or science.

243.
research, development and system skills. This is, indeed, big education – but likely to be cost-effective (see later section on the costs of good engineering).
Functional Mathematics will need this kind of evaluation in due time, bearing in mind that a typical time from agreeing goals to stable curriculum implementation is ~10 years. Meanwhile it represents a research and development challenge.
Systematic Development of Models for System Change
In the introduction, I noted the need for reliable, research-based models of the overall process of educational change – approaches that are validated, not merely by post-hoc analyses but by evidential warrants for robustness that policy makers can rely on. This is clearly a challenging design and development problem. Nowhere has it been solved for the kind of educational changes that research has shown to be essential for high-quality learning, at least in mathematics and science. Some progress has been made, giving reasonable hope that a similar developmental approach can succeed.
Here the system of study is much more complex (even) than the classroom or the professional development of teachers, involving a much broader range of key players – students, teachers, principals, professional leadership, system administrators, politicians and the public. As with any planned change, all of the key groups must move in the way intended, if the outcomes are to resemble the intentions.
There are now well-engineered exemplars of many of the key elements in such a change. Some are familiar tools – classroom teaching materials, and assessment that will encourage and reward aspects of performance that reflect the new goals, have long been recognised as essential support for large-scale change. In recent years, tools have

244.
been developed in some other areas that have previously been seen as inevitably craft-based. For example, materials to support specific kinds of live professional development have been developed, and shown to enable less-experienced leaders to replace experts without substantial loss to the participants. Given that there are a thousand mathematics teachers for every such expert, this is an important step forward in seeking large-scale improvement.
The focus has now moved beyond such specific areas to the change process itself. Until now, support for systems has been in the form of general advice and occasional technical assistance by experts. We have begun to develop a Toolkit for Change Agents (see www.toolkitforchange.org), which aims to suggest to users successful strategies for responding to the common challenges that inevitably arise in every improvement program, and the tools that each employs. The entries in the toolkit are based on the successful experience of other change agents who faced similar challenges. This work is still at an early stage but shows promise of helping with this core problem.
However, my purpose here is mainly to draw attention to the educational change process as an area that needs more than the critical commentary that has guided it so far. Because the system of study is larger than a student or a classroom, with more obviously important variables, the challenge to systematic research and development is greater. However, it is surely possible to provide those seeking to promote improvement with well-engineered tools that will increase their effectiveness, based on research-based insights into the processes of change. This, too, will need large-scale projects.
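For a sense of what "large-scale" means in practice, the back-of-the-envelope costing given earlier for the comparative evaluation study can be reproduced in a few lines. This is only an illustrative sketch: the variable names are ours, the inputs are the figures quoted in that estimate, and the calculation covers direct researcher costs alone. The rounder figures in the text (~$10M a year, ~$30 million in total) also allow for leadership, support and overhead.

```python
# Illustrative reconstruction of the study-cost estimate.
# All names are hypothetical; the numbers are those quoted in the text.
SCHOOL_SYSTEMS = 10
SCHOOLS_PER_SYSTEM = 10
CLASSROOMS_PER_GRADE_PER_SCHOOL = 2
GRADES = 3
LESSONS_OBSERVED_PER_CLASS = 10      # per year
CLASSROOMS_PER_OBSERVER = 8
COST_PER_RESEARCHER_YEAR = 100_000   # US dollars
MAIN_YEARS = 3                       # the main data-capture years

# 2 classrooms per grade in each of 10 schools = 20 classrooms per grade per system
classrooms_per_grade = SCHOOLS_PER_SYSTEM * CLASSROOMS_PER_GRADE_PER_SCHOOL
total_classrooms = SCHOOL_SYSTEMS * classrooms_per_grade * GRADES

# One observer-researcher covers 8 classrooms
researchers = total_classrooms // CLASSROOMS_PER_OBSERVER
observations_per_researcher = CLASSROOMS_PER_OBSERVER * LESSONS_OBSERVED_PER_CLASS

# Direct researcher cost only, before leadership, support and overhead
direct_cost_per_year = researchers * COST_PER_RESEARCHER_YEAR
direct_cost_main_years = direct_cost_per_year * MAIN_YEARS

print(f"classrooms observed: {total_classrooms}")
print(f"observer-researchers: {researchers}")
print(f"observations per researcher per year: {observations_per_researcher}")
print(f"direct cost per year: ${direct_cost_per_year:,}")
print(f"direct cost over main years: ${direct_cost_main_years:,}")
```

Changing any one input (for example, the number of classrooms per observer) shows quickly how sensitive the total is to the intensity of classroom observation, which the estimate identifies as the most expensive component.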

245.
BUILDING THE SKILL-BASE FOR ENGINEERING RESEARCH
The number of groups capable of high-quality engineering is now small and, as we have noted, they are far from secure. If the research-based approach is to become a substantial and effective part of large-scale improvement, the number of those who can do such work will need to grow. This will involve both finding and training people with specific skills and creating institutional structures that can handle such work, and the teams of people with complementary skills that it requires. This takes time, typically a decade or so, fortunately matching the time any new approach will need to build public confidence in its value. We also need to bear in mind the changing balance of work.
Four Levels of R&D – Improving Balance
I noted that a focus on improving practice will need a different balance of effort among the research styles presented earlier (insight versus impact), with more engineering research. A complementary perspective on balance is provided by looking at different levels of research (R) and development (D), summarised in Table 10.1.
Note the crucial difference between ET, which is about teaching possibilities, usually explored by a member of the research team, and RT, which is about what can be achieved in practice by typical teachers with realistic levels of support. Note how the research foci, R in the third column, change across the levels. Currently, nearly all research is at L and ET levels. A better balance across the levels is needed, if research and practice are to benefit from each other as they could. The main contribution of design research has been to link the R and D elements in the third column – but, in most cases, only for the first two levels, L and ET. Both RT and SC research need larger

246.
research teams and longer time-scales, difficult to accommodate within typical academic structures.
[ TABLE 10.1 ABOUT HERE ]
What Skills, and Where Will They Come From?
In the first section of this chapter I noted the key groups of contributors. Let us look at where they are needed, and where they will come from.
• Insight-focused researchers with the necessary range of skills for their roles in the engineering approach already exist in large numbers in universities; the challenge, to create an academic climate that will encourage them to do such work, is discussed in the next section.
• Designer-developers of high quality are rare⁹, partly because of the lack of any career path, from apprentice to expert professional, that encourages this activity; the development of this area, and the understanding of design skill in education, is still at an early stage. Progress in this area will be an important factor in the whole enterprise (see Gardner and Shulman 2005). ISDDE, an International Society for Design and Development in Education, has recently been founded with the goals (see www.ISDDE.org) of:
− improving the design and development process
− building a design and development community
− increasing the impact of this on educational practice
• Project leaders are a similarly rare species, for much the same reasons – the
⁹ After many years of searching for outstanding designers, I know of only a few tens in mathematics education worldwide who I would enthusiastically like to work with.

247.
multi-dimensional skills needed for this work are fairly well understood but, even with a supportive environment, it will take time to develop within the design and development community project leaders with experience in educational engineering.
• Client-funders with understanding of good engineering will appear as its potential is recognised – indeed the scale of the funding of long-term coherent programs may be the best measure of progress; the ability of the engineering community to demonstrate this potential through projects that are funded will be crucial to justifying expansion.
All of these groups play vital roles. In the following section, we look at the changes that are needed in their current working environments to make progress possible.
CHANGING BEHAVIOUR IN ACADEMIA, INDUSTRY, GOVERNMENTS
If these things were happening in major education systems, there would be no need for this paper. I and my colleagues could just concentrate on good engineering. Currently, moving from the present to the kind of approach outlined above will require change by all the key players in educational innovation. It is simplest to discuss them in reverse order.
Governments
Experience in other fields suggests that substantial government funding will flow into research in education, if and when policy makers and the public become

248.
convinced that the research community can deliver clear and important practical benefits to the system. (IER discusses this in some detail.) Medical research only received significant support from the early 20th century on, as research-based benefits such as X-rays began to appear. Massive support followed the impact of penicillin and other antibiotics after 1945, perhaps helped by the drama of its discovery by Alexander Fleming. Physics too, particularly nuclear physics, was a fairly abstruse field until that time. (The annual budget in the 1930s of the world-leading Cavendish Laboratory under Rutherford was about £3,000 – peanuts by today's standards, even in real terms.) The role of physicists in the Second World War in the development of radar, operations research, nuclear weapons and many other things increased funding for pure, as well as applied, physics research by many orders of magnitude – a situation that continues to this day. (It has also continued to spin off practical benefits, including the founders of molecular biology, the internet, and the world wide web.)
However, in most of these cases, government played a crucial pump-priming role, providing funding for proof of concept studies while the practical benefits were still unproven. That will be necessary in education; however, it will only make sense to policy makers if credible structures are in place that give real promise of clear and direct practical benefits in the medium term. This will need a growing body of exemplar products of well-recognised effectiveness.
Industry
Here there is a different problem. In medicine and engineering, for example, there are industries that turn the prototypes of academic research into fully developed practical tools and processes. Pharmaceuticals and electronics are two obvious examples

249.
but the same is true across manufacturing industry. Firms have established links with academic researchers in their fields; they support pure research and the research-based development of prototypes. The firms then take these through the long and costly process of development into robust products.
No comparable industry exists in education. The publishing industry (the obvious candidate) turns prototypes (manuscripts) into products (books), but with minimal development – typically comments by a few experts and a small-scale trial with teachers who, again, are simply asked to comment. Neither you nor the regulator would allow your children to be treated with a drug, or fly in a plane, that had been developed like this. It produces products that work, in some sense, but is no way to break new ground with products that are both well-designed and robust in use¹⁰.
Why is this so? The main reason is the continuing dominance of the craft-based approach, surviving largely because of the inadequate evaluation process. There is no systematic independent testing and reporting on the effectiveness of products. New teaching materials are regularly reviewed – but by an expert who must deliver the review in a week or two, purely on the basis of inspection. The improved effectiveness produced by development often depends on quite subtle refinements; it only shows when representative samples of users (e.g. typical teachers and students, working with the materials) are studied in depth. As we have noted, such studies take time and cost money.
The situation is exacerbated because the buyer and the user of many products are different – for example, the school system buys the teaching materials that the teachers use. Marketing is, of course, aimed at the buyer. Given this situation, there is a
¹⁰ It is true that some educational software is developed somewhat more systematically – the inevitably high cost of design and programming makes room for this. However, even this is held back by the same lack of reliable data for users on how well it works.

250.
greater need, and responsibility, for the academic community to do this kind of work – and for governments to fund it¹¹.
It goes almost without saying that there are no regulatory agencies on the lines of those every country has for drugs and for airplanes.
Thus currently there is no incentive for industry to invest in systematic development. Systematic evaluation, preferably before marketing, would change that; it would increase the cost of materials but not to a strategically significant extent (see the following section).
Academia
If the situation is to improve, major changes will need to come in academia. Currently the academic value system in education, which controls academic appointments and promotions, is actively hostile to engineering research. As discussed in IER, it favours:
• new ideas over reliable research
• new results over replication and extension
• trustworthiness over generalizability
• small studies over major programs
• personal research over team research
• first author over team member
• disputation over consensus building
• journal papers over products and processes
¹¹ The "What Works Clearinghouse" in the US has such a purpose; the methodology is profoundly flawed (see Schoenfeld [2006]) – but perhaps it is a start.

251.
Schoenfeld (2002) describes most such studies as of "limited generality but .... (if properly done) ... here is something worth paying attention to’…." As I have noted, that is a totally inadequate basis for design. In all respects these values undermine research that would have clear impact on the improvement of teaching and learning.
A status pattern, where the pure is valued far more than the applied, is common but it is not general at any level of research. Many Nobel Prizes are for the design and development of devices – for example, only two people have won two Nobel Prizes in the same field:
• John Bardeen, the physicist, for the transistor, and for the theory of superconductivity
• Fred Sanger, the biologist, for the 3D structure of haemoglobin (a first in this application of X-ray crystallography) and for the procedure for sequencing DNA.
At least two of these are engineering in approach. With examples like these, education need not fear for its respectability in giving equal status to engineering research. These lie in "Pasteur's Quadrant" (Stokes 1997) of work that contributes both practical benefits and new insights. However, one should not undervalue work in Edison's Quadrant, with its purely practical focus – contributions like the luminous filament light bulb are of inestimable social value. Note also that, in making this discovery, Edison investigated and catalogued the properties of hundreds of other candidate materials, adding to the body of phenomenological knowledge that is part of the theoretical underpinning of all engineering. In contrast, so much research in education lies in the quadrant that has no name – advancing neither theory nor practice.
Changing the culture in any established profession is notoriously difficult. What actions may help to bring this about in educational research? Leaders in the academic

252.
research community can make a major contribution by including direct impact on practice as a key criterion for judging research, complementing valid current criteria. One may (hopefully) envisage a future search committee at an academic institution that wants to hire a senior person in education, and is mindful of public pressure to “make a difference.” The institution has decided that candidates must either be outstanding on one of the following criteria, or be very strong on two or three.
• Impact on practice – evidence should cover: the number of teachers and students directly affected; the nature of the improvement sought, and achieved; specific expressions of interest in future development.
• Contribution to theory and/or knowledge – evidence should cover: how new or synthetic the work is; warrants for trustworthiness, generality, and importance; citations; reviews; how frequently researchers elsewhere have used the ideas.
• Improvement in either research or design methodology – evidence should cover: how far the new approaches are an improvement on previous approaches; in what ways the work is robust, and applies to new situations; to what degree others employ these methods.
Given the self-interest of those who are successful under current criteria, progress in this area will not be easy; however, real leaders often have the necessary confidence to promote principled improvements. Funding agencies can play their part, as they currently do, by funding projects that require good engineering. Further, they can encourage universities to give successful teams, including their designers, long-term appointments.
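The section that follows asks what good engineering costs, and its headline figures rest on a short chain of multiplications. As a reading aid, here is a minimal sketch of that arithmetic. The function and variable names are ours, not the author's; the inputs are the figures quoted in that section (14 years of schooling, 200 days a year, 5 hours a day, 3 parallel developments rounded to 40,000 hours, $30,000 per classroom hour, and at least $300 billion of annual US school expenditure). The number of years over which the cost is spread is left as a parameter, since the quoted $120 million a year and ~0.04% correspond to a ten-year spread of the ~$1.2 billion total, and a five-year spread would give twice those figures.

```python
# Illustrative sketch of the curriculum redevelopment costing.
# All names are hypothetical; the numbers are those quoted in the text.

def classroom_hours(years=14, days_per_year=200, hours_per_day=5):
    """Classroom hours in one pass through the whole school curriculum."""
    return years * days_per_year * hours_per_day


def total_cost(development_hours, cost_per_hour=30_000):
    """Total development cost in dollars for the given classroom hours."""
    return development_hours * cost_per_hour


def annual_share(total, spread_years, annual_school_spend=300_000_000_000):
    """Return (annual cost, fraction of annual school spending)."""
    per_year = total // spread_years
    return per_year, per_year / annual_school_spend


single_track = classroom_hours()          # one curriculum track
three_tracks = 3 * single_track           # before rounding to 40,000
rounded_hours = 40_000                    # the rounded figure used in the text
cost = total_cost(rounded_hours)

for spread in (5, 10):
    per_year, share = annual_share(cost, spread)
    print(f"{spread} years: ${per_year:,} per year = {share:.2%} of school spending")

# NSF projects: about $1,000,000 per year's materials for ~200 hours of teaching
nsf_cost_per_hour = 1_000_000 // 200
print(f"NSF-funded projects: ${nsf_cost_per_hour:,} per classroom hour")
```

Either way the share stays well below one tenth of one percent of running costs, which is the point the argument turns on.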

253.
WHAT DOES GOOD ENGINEERING COST?
It is clear that the process of design and development outlined earlier (systematic design, development and evaluation) is more expensive than the simple author => publisher chain of the craft-based approach. How much does it cost? How does this investment in R&D compare with that in other fields where improvement is needed?
The NSF-funded projects for developing mathematics curriculum materials in the 1990s were each funded at a level of about $1,000,000 for each year's materials, supporting about 200 hours of teaching – i.e. about $5,000 per classroom hour. Each team worked under enormous pressure to deliver at the required rate of one year's material per project year. At the Shell Centre, we have tackled less forbidding challenges. We have developed smaller units, each supporting about 15 hours, at typical cost of £7,000 - £15,000 ($15,000-$30,000) per classroom hour. The difference, and the range, reflects the amount of feedback, particularly classroom observation, that the funding and time has enabled. The cost of the full process (as outlined in the design, development and evaluation section) is at the top of this range.
What would the redevelopment of the whole school curriculum cost at $30,000 per classroom hour? (No-one is suggesting that everything needs to change but this gives an upper limit to the total cost.) Let us assume:
• 14 years of schooling * 200 days per year * 5 hours per day = 14,000 hours
• 3 parallel developments to meet different student needs > 40,000 hours
• $30,000 per classroom hour for high-quality development
Which gives a total of ~ $1.2 billion. Spread over, say, 5 years – the minimum time such a development effort would need – that yields an R&D cost of $120 million per year; since the annual expenditure on schools in the US is at least $300 billion, this

254.
amounts to investing ~0.04% of total running cost in R&D.
Any measurable gain in the effectiveness or efficiency of schooling would justify this expenditure. (It could be saved by increasing the average number of students in a school of 2500 students by just one student!)
For smaller countries, the proportion would be higher but still modest. For comparison, other fields that are developing rapidly typically spend 5-15% of turnover on R&D, with 80% on research-based development, 20% on basic research. I believe that a level of 1% for such investment in education is an appropriate target for many advanced countries. This would cover not only the R&D but the (larger) extra implementation costs, involving as it must networks of live support. This would transform the quality of children's education, with consequent benefits in personal satisfaction and economic progress.
All this takes time. However, governments are used to planning and funding long term projects in other fields – 4 years to plan and 5 years to build a bridge or an aeroplane.
IMPLICATIONS FOR POLICY – AND THE DESIGN COMMUNITY
In this chapter, we have seen how an engineering research approach may enable
research insights => better tools and processes => improved practice
through creative design and systematic refinement using research methods. Achieving this will need changes at policy level in the strategies for educational improvement; each of these changes will depend on active effort by the research and

255.
development communities. To summarise, the strategic changes that seem to be needed are:
• Recognition that good engineering is valuable and weak engineering costly. Good engineering produces more effective and reliable outcomes, which justify the higher cost and longer timescales than the craft-based approach; persuasive evidence on this can only come from independent comparative evaluation in depth of widely available products in use in realistic circumstances – a smarter buyer will then support better design. This needs a substantial effort by research communities, and appropriate funding;
• Coherent planning and funding of improvement by school systems, combining the long time-scales of substantial educational improvement with demonstrable year-by-year gains that will satisfy political needs; the design and development community can help by linking its responses to short-term funding opportunities to a realistic long term vision, negotiated with funders and based on basic research and past successes;
• Substantial multi-skilled teams of designers, developers, evaluators and other insight researchers capable of carrying through such major projects with the long time scales they imply; while specialist centres will continue to play an important role, there is a need for universities to play the central role they do in other big science fields such as physics and medicine (recognition, above, will be important in persuading governments to make the investment needed);
• Broadening of the academic value system in universities, giving equal research credit to in-depth insights and impact on practice; this will need leadership from the research community and pressure from funders;

256.
• Building credible theories of learning and teaching to guide research-based design and development that links to that of insight-focussed research and, in turn, drives the latter to build a consensus-based core of results that are well-specified and reliable enough to be a useful basis for design;
• Collaboration – all the above will be advanced if funders, project leaders, designers and researchers learn to work more closely together over time; while the community of researchers is long established, the design and development community in education has still to acquire similar coherence.
Clearly, there is much that is challenging to be accomplished here. But, if governments and other funders become convinced that we can deliver what they need then, together, we can make educational research a more useful, more influential, and much better funded enterprise.
The analysis in this paper owes much to discussions with Phil Daro, Alan Schoenfeld, Kaye Stacey and my colleagues at the Shell Centre – Alan Bell, Malcolm Swan and Daniel Pead.
REFERENCES
Bell, A. (1993). Some experiments in diagnostic teaching. Educational Studies in Mathematics 24, pp. 115-137. See also www.toolkitforchange.org
Burkhardt, H., & Schoenfeld, A. H. (2003). Improving educational research: Towards a more useful, more influential and better funded enterprise. Educational Researcher 32, pp. 3-14.
Cockcroft Report (1979). Mathematics Counts. HMSO, London.

261.
EDUCATIONAL DESIGN RESEARCH: THE VALUE OF VARIETY
Nienke Nieveen, Susan McKenney and Jan van den Akker
This book has offered a platform for discussing educational design research, and several views on how to assess and assure its quality. In this closing chapter, we explore the role that design research plays in the broader scientific cycle, including implications for assessing the quality of design research proposals, activities and reports.
THE SCIENTIFIC CYCLE
The discussion in this chapter departs from a contribution made by Phillips, earlier in this volume, in which he expresses concern about “serious oversimplification” of scientifically-oriented research. He points to the contemporary trend to primarily emphasize the final stage of the research cycle - testing claims of causality (e.g. randomized field trials). And he reminds readers about the importance of the preliminary investigation stage that is guided by deep factual and theoretical understanding. In so doing, he indicates that a view of science as proceeding through several stages has been well-established for quite some time, citing writings of Reichenbach, Popper and Dewey as examples. With regard to educational design research, Kelly (this volume) and Bannan-Ritland (2003) point to the need to see design research as an integral approach within a larger scientific cycle. From a slightly

262.
different perspective, Sloane (this volume) describes two harmonizing modes of educational research (‘science mode’ and ‘design mode’), which ought to communicate and collaborate with each other.
The illustration in Figure 11.1 is based on the notion that scientific inquiry in general, and educational research in particular, flows through multiple cycles. It shows that earlier stages share an exploratory emphasis including speculation, observation, identification of variables/processes, modeling, prototyping and initial implementation. Design research is conducted within or across these stages. Later stages share a confirmatory emphasis, in which causality is tested. This may range from smaller scale learning experiments through large-scale diffusion and comparative testing of impact. Here, effect studies, such as randomized field trials, are conducted. The exploratory emphasis is necessary to arrive at well-designed innovations, worthy of going to scale; and the confirmatory emphasis is necessary not only to test the impact of an innovation, but also to provide sound inputs for future exploratory work.
[ FIGURE 11.1 ABOUT HERE ]
DESIGN RESEARCH DIFFERENTIATION
The contributions within this book discuss varying perspectives on design research. While all of the chapters touch on the exploratory nature of design research, variation does exist. In exploring that variation, particularly, when revisiting the variation in

263.
design aims, we find it useful to distinguish between: (a) studies that aim to (dis)prove learning theories - validation studies; and (b) studies that aim to solve an educational problem by using relevant theoretical knowledge - development studies. Whereas both types involve the design, development and evaluation of learning innovations in context, their scientific output differs. As elaborated in the following section, validation studies ultimately contribute most to advancing (domain-specific) instructional theories; while development studies yield design principles for use in solving education problems.
Validation Studies
Validation studies feature the design of learning trajectories in order to develop, elaborate and validate theories about both the process of learning and resulting implications for the design of learning environments. In this category, we draw parallels with well-known works such as those of Brown (1992) and Collins (1992); as well as newer ones from this volume (Gravemeijer & Cobb; Walker). With the aim of advancing learning theory, validation studies contribute to several levels of theory development (cf. Gravemeijer & Cobb, this volume):
• micro theories: at the level of the instructional activities;
• local instruction theories: at the level of the instructional sequence;
• domain-specific instruction theory: at the level of a pedagogical content knowledge.

264.
In order to reach these aspirations, researchers deliberately choose naturally occurring test beds (though they tend to work with above-average teaching staff) instead of laboratory or simulated settings. In doing so, their work evolves through several stages (cf. Gravemeijer & Cobb, this volume) such as:
• environment preparation: elaborating a preliminary instructional design based on an interpretative framework;
• classroom experiment: testing and improving the instructional design/local instructional theory and developing understanding of how it works;
• retrospective analysis: studying the entire data set to contribute to the development of a local instructional theory and (improvement of) interpretative framework.
DiSessa and Cobb (2004, p.83) warn that “design research will not be particularly progressive in the long run if the motivation for conducting experiments is restricted to that of producing domain specific instructional theories.” A practical contribution of validation studies lies in the development and implementation of specific learning trajectories that were implemented to test the theoretical basis of the design.
Development Studies
Whereas the practical contribution is a (secondary) benefit of most validation studies, derivation of design principles for use in practice is a fundamental aim of most development studies. Here, research is problem-driven, situated in the educational field and involves close interaction between practitioners, researchers, experts and other stakeholders. Design studies of this nature have been described previously (McKenney & van den Akker, 2005; Nieveen & van den Akker, 1999; Richey, Klein, & Nelson,

265.
2004; Richey & Nelson, 1996; van den Akker, 1999) as well as in this volume (Edelson; Reeves; Sloane; McKenney, Nieveen, & van den Akker). Development studies integrate ‘state of the art’ knowledge from prior research in the design process and fine-tune educational innovations based on piloting in the field. Throughout the process, implicit and explicit design decisions are captured. By unpacking the design process, design principles are derived that can inform future development and implementation decisions. Two main types of principles are addressed (cf. Edelson, this volume; McKenney, Nieveen, & van den Akker, this volume; van den Akker, 1999):
• procedural design principles: characteristics of the design approach; and
• substantive design principles: characteristics of the design itself.
Since design principles are not intended as recipes for success, but as heuristic guidelines to help others select and apply the most appropriate knowledge for a specific design task in another setting, comprehensive and accurate portrayal of the context is an essential companion to both types of design principles. It should be noted that development studies do more than ‘merely’ demonstrate local utility. As Barab and Squire (2004, p. 8) put it, “…design scientists must draw connections to theoretical assertions and claims that transcend the local context.”
In order to solve educational problems and arrive at design principles, development studies usually progress through several stages, as described in several chapters of this volume (cf. Edelson; McKenney, Nieveen, & van den Akker; Walker):
• preliminary research: thorough context and problem analysis along with development of conceptual framework based on literature review;
• prototyping stage: setting out design guidelines, optimizing prototypes through

266.
cycles of design, formative evaluation and revision;
• summative evaluation: often explores transferability and scaling, along with (usually small-scale evaluation of) effectiveness;
• systematic reflection and documentation: portrays the entire study to support retrospective analysis, followed by specification of design principles and articulation of their links to conceptual framework.
Development studies rely on local ownership for being able to observe phenomena in their natural settings. This requires a long term link with practice as time is necessary to fully explore and optimize an intervention, before implementing in ‘normal’ settings. Collaborative design activities offer outstanding joint professional learning opportunities for researchers and practitioners (cf. McKenney, Nieveen, & van den Akker, this volume).
Why We Need Variety
We may be able to learn from ‘sister fields’ such as engineering product design and research on diffusion of innovations (Zaritsky, Kelly, Flowers, Rogers, & O’Neill, 2003). ‘Educational engineering,’ as with all design disciplines, requires varied types of investigations at differing stages along an evolving process. Related to this notion, Brown (1992), Burkhardt and Schoenfeld (2003), and Burkhardt (this volume) advocate scaling mechanisms, such as:
• alpha trials, held under control of the design research team and under ideal circumstances;
• beta trials, performed at carefully chosen sites with some support;

267.
• gamma trials, focusing on widespread adoption with minimal support.
Taking these ideas seriously implies commitment to long-term endeavors that require substantial support from a research program that embraces an overarching vision. Building on earlier work (van den Akker & McKenney, 2004), Figure 11.2 illustrates such a vision in which educational engineering departs from a sound theoretical base built by validation studies, develops through practical understandings from development studies, and is tested by effectiveness research. Due to the scope of each study type, most research programs specialize in one area (validation studies, development studies or effectiveness research).
[ FIGURE 11.2 ABOUT HERE ]
UNDERSTANDING AND ASSESSING DESIGN RESEARCH QUALITY
While design researchers may find it easy to see their place in a larger framework, such as the one sketched in the previous section, Kelly (2003, this volume) and Phillips (this volume) emphasize that researchers from other well-established traditions are likely to have little tolerance for rival approaches. This poses great challenges to garnering support for a comparatively nascent research approach. The merits of design research can only become evident to ‘outsiders’ when good examples begin to crop up, and these require substantial long-term commitments. Since the majority of gatekeepers to funding and publication opportunities speak alternate ‘research dialects’ (Kelly, 2003), the barriers to launching serious, longitudinal design research seem daunting.

268.
How then, shall we stimulate thoughtful consideration of design research proposals, activities and manuscripts? For starters, the design research community could become more explicit about the quality standards that it adheres to and wants to be held accountable for (cf. Dede, 2004). As mentioned in the first chapter of this book, it was the need for an internal debate on quality assurance that stimulated the Research Council on Educational Research of the Netherlands Foundation for Scientific Research (NWO/PROO) to invite well-reputed design researchers to participate in a seminar dedicated to the topic. It was the intention of this book to take the dialogue one step further. Now nearing a close, we draw upon international literature, chapter contributions and discussions from the NWO/PROO seminar to offer some considerations for understanding and assessing design research quality.
Portray Design Research in Perspective
We view the placement of design research at the exploratory side of a larger scientific cycle to be crucial to understanding both its worth and merit. The distinction between validation studies and development studies can be useful in furthering more nuanced discussions of design research. Further, demonstrating that exploratory studies provide the necessary inputs for other research approaches (e.g. effectiveness research) may begin to facilitate more productive dialogue with those funders and manuscript reviewers who are open to, but not familiar with, design research.
Demonstrate Coherent Research Design

269.
As with other research approaches, the inclusion of a carefully-considered, well-informed conceptual or interpretive framework is essential. Due to the scope of design studies, evidence must be provided for how to deal responsibly with large amounts of data (Reeves, Herrington, & Oliver, 2005; Richey & Klein, 2005) without wasting time, effort and resources through “massive overkill” (Dede, 2004, p.107) in terms of data gathering and analysis. Whether through an interpretive framework or other tools, the research design should clarify how data will be analyzed and interpreted. Additionally, research planning should demonstrate intentions of obtaining regular, critical formative feedback. Although less relevant for most validation studies, development study design should evidence a long-term perspective, include scaling options and address conditions for sustainable implementation.
Support Claims for the Expected Scientific Output
The main contributions of the research should be clearly indicated, and warrants must be provided to match each output. Returning to the motives for design research discussed earlier in this book (van den Akker, Gravemeijer, McKenney, & Nieveen, this volume) three general types of contributions are likely: (a) formulation of education-related theories or principles; (b) educational improvement with local ownership; (c) contribution to an understanding of the design process itself.
Exhibit Scientific Quality of Applicants
Design researchers must offer assurance that they are up to the task. A team that includes distributed expertise (e.g. strong past performance in the design area and in domain expertise) should be highly regarded. It may even be possible to use portfolios

270.
(cf. Phillips, this volume) for ascertaining design competence. Finally, evidence of a sustainable relationship among an interdisciplinary team (including researchers, practitioners, experts and other stakeholders) must be provided.
CLOSING COMMENTS
Ideally, design research proposals, activities and publications should be reviewed by participants from the same commissive space (cf. Kelly, in this volume). And, according to Edelson (2002), since design research objectives differ from those of a traditional empirical approach, they should not be judged by the same standards. But until the commissive space of design research grows sufficiently and quality standards within the community become widely known and accepted, design researchers must find ways to help other reviewers look beyond their own methodological preferences. Articulation and discussion of standards by which design research ought to be judged is a first step in this direction. Toward that goal, we hope that this book has offered some useful ideas to help advance the field of design research in education.
REFERENCES
Bannan-Ritland, B. (2003). The role of design in research: The integrative learning design framework. Educational Researcher, 32(1), 21-24.
Barab, S., & Squire, K. (2004). Design-based research: Putting a stake in the ground.

275.
AUTHOR BIOGRAPHIES
* = also editor
Jan van den Akker *
Jan van den Akker is professor and head of the Department of Curriculum (Faculty of Behavioral Sciences) at the University of Twente. Since the summer of 2005 he has been serving as executive director Curriculum of the Netherlands Institute for Curriculum Development [SLO]. In his wide teaching, research, supervision, and consultancy experiences (both in the Netherlands and abroad) he tends to approach curriculum design challenges from a broader educational innovation perspective. Over the years his preference for design research has grown because of its strong combination of practical relevance and scholarly progress.
Hugh Burkhardt
Hugh Burkhardt has led a series of international projects from Michigan State and Nottingham University, where the Shell Centre team takes an engineering research approach at the systemic end of design research. In this, imaginative design and systematic development, with theory as a guide and empirical evidence the ultimate arbiter, produce tools that make the system work better. Many such tools for change (assessment instruments, teaching materials, etc.) are needed to turn goals of policy into outcomes in practice. Hugh’s other interests include making school mathematics more functional for everyone. He remains occasionally active in particle physics.
Paul Cobb

276.
Paul Cobb is a professor in the Department of Teaching and Learning at Vanderbilt University where he teaches courses in elementary mathematics methods, design research methodology, and the institutional setting of mathematics teaching. His research interests focus on instructional design, students’ statistical reasoning, the classroom microculture, and the broader institutional setting of mathematics teaching. He has conducted a series of classroom design experiments in collaboration with Koeno Gravemeijer and has used the methodology more recently to investigate the process of supporting a group of mathematics teachers’ learning as it is situated in the institutional settings in which they work.
Daniel Edelson
Daniel Edelson is an associate professor of Learning Sciences and Computer Science at Northwestern University in the USA. He leads the Geographic Data in Education (GEODE) Initiative at Northwestern, which is dedicated to the improvement of earth and environmental science education through the integration of authentic inquiry into the science curriculum. Throughout his career, he has employed design research toward the improvement of curriculum and software design, professional development, and classroom implementation. His research interests include interest motivation, the application of research on learning in instructional design, and software interaction design for students and teachers.
Koeno Gravemeijer *
Koeno Gravemeijer is research coordinator at the Freudenthal Institute (Department of Mathematics of the Faculty of Science) and holds a private chair at the Department of Educational Sciences (Faculty of Social and Behavioral Sciences) at Utrecht

277.
University. His research interests concern the domain-specific instruction theory for realistic mathematics education, RME; design research as a research method; and the role of symbols, (computer) tools, and models in mathematics education. The point of departure, which runs as a continuous thread through his work, is that educational research and the actual improvement of mathematics education in schools are reflexively related.
Anthony Kelly
Anthony E. Kelly is a professor and head of the instructional technology program at George Mason University in Virginia, USA. His research interests extend to research methodology design, and research at the intersection of cognitive neuroscience and education. He has a National Science Foundation grant on design research methods with Richard Lesh. A volume edited by both principal investigators on the topic of design research is forthcoming. Kelly edited the special issue on design research, Educational Researcher, vol 32, 2003. He served as a program manager at the National Science Foundation from 1997-2000.
Susan McKenney *
Susan McKenney is an assistant professor in the Department of Curriculum (Faculty of Behavioral Sciences) at the University of Twente in the Netherlands. Dr. McKenney’s current research and teaching focus on curriculum development, teacher professional development and, often, the supportive role of computers in those processes. Careful design, evaluation and revision of educational improvement initiatives is a recurring theme in her consultancy and research endeavors, most of which are situated in either the Netherlands, India or southern Africa.

278.
Nienke Nieveen *
Nienke Nieveen is an assistant professor in the Department of Curriculum (Faculty of Behavioral Sciences) at the University of Twente in the Netherlands. Her current research explores successful scenarios for school-wide and school-based curriculum improvement that incorporate productive relations between curriculum, teacher and school development. She actively incorporates the design research approach in research projects and in courses on educational design and evaluation.
D.C. Phillips
D.C. Phillips is Professor of Education, and by courtesy Professor of Philosophy, at Stanford University. He is a member of the US National Academy of Education and a Fellow of the International Academy of Education. A philosopher of education and social science with a particular interest in research methodology, and history of 19th and 20th century social science and educational theory, he is author, co-author or editor of eleven books and numerous journal articles. His most recent book is Postpositivism and Educational Research (with N. Burbules).
Thomas Reeves
Professor Reeves teaches, conducts research, and provides service related to program evaluation, multimedia design, educational research, and other professional activities in the Department of Educational Psychology and Instructional Technology of the College of Education at The University of Georgia. For more than 20 years, he has been a vocal critic of traditional approaches to educational technology research as well as a proponent of alternative, more socially-responsive, approaches, including

279.
design research. He believes that educational researchers, especially those who work in publicly-funded institutions, should pursue research goals that have a morally defensible rationale for improving human well-being.
Finbarr Sloane
Finbarr C. Sloane received his PhD from the University of Chicago in measurement, evaluation and statistical analysis. He is a senior research scientist at the Center for Research on Education in Science, Mathematics, Engineering and Technology (CRESMET), and associate professor of mathematics education at Arizona State University. His research interests include student learning of mathematics, multilevel theory and statistical modeling to support the scaling of educational interventions in mathematics, and the conceptualization of design methodologies in education.
Decker Walker
Decker Walker is a professor in the School of Education at Stanford University. His scholarly interests are in the study of curriculum and finding ways to improve the pre-college curriculum through the use of information technology (computers, video, telecommunications). He has published widely on curriculum development and evaluation, including Fundamentals of Curriculum (2000) and Curriculum and Aims (2004). His recent work concentrates on the role and meaning of information technology in education. He was a founding faculty member of the Learning Design and Technology program at Stanford and has served as Director of the program since 1997. His long term interest in design research has heightened in recent years as design research has assumed a central role in the work of students and faculty in that program.