Goals

The goals of this course are to learn data collection, design, and analysis methodologies that are particularly useful for scientific research in education. The course will be organized in modules addressing particular topics including cognitive task analysis, qualitative methods, protocol and discourse analysis, survey design, psychometrics, educational data mining, and experimental design. We hope students will learn how to apply these methods to their own research programs, how to evaluate the quality of application of these methods, and how to effectively communicate about using these methods.

Course Prerequisites

To enroll, you must have taken 85-738, "Educational Goals, Instruction, and Assessment," or have the permission of the instructor.

Textbook and Readings

"The Research Methods Knowledge Base: 3rd edition" by William M.K. Trochim and James P. Donnelly.

Flipped Homework: Reading Reports and Pre-Class Assignments

We are often going to implement "flipped homework", a variation on the flipped classroom idea you might have heard of. Flipped homework is an assignment before a relevant class meeting rather than after it. It helps students (you!) to "problematize" the topic -- to get a better sense of what you don't know and what questions you have. It helps instructors focus the class discussion to better avoid belaboring what students already know and to better pursue student needs and interests.

Students will be asked to write "reading reports" before most class sessions. We will use the discussion board on Blackboard (www.cmu.edu/blackboard) for this purpose.

Unless otherwise directed by instructors, students should make two posts on the readings before 3:30pm on the day of class that those readings are due. If slides for the class are available, please review these as well.

These posts serve multiple purposes: 1) to improve your understanding and learning from the readings, 2) to provide instructors with insight into what aspects of the readings merit further discussion, either because of student need or interest, and 3) as an incentive to do the readings before class!

In general, please come to class prepared to ask questions and give answers.

Your two posts may be original or in response to another post (one of each is nice).

Original posts should contain one or more of the following:

something you learned from the reading or slides

a question you have about the reading or slides or about the topic in general

a connection with something you learned or did previously in this or another course, or in other professional work or research

Replies should be an on-topic, relevant response, clarification, or further comment on another student’s post.

You may be asked to do other activities before class, such as answering questions on-line using the Assistment system, completing parts of an OLI course, or beginning work on an assignment. That way you can come to class with a better appreciation for what you do not understand and need to learn.

Grading

There will be assignments associated with each section of the course. Grades will be determined by your performance on these assignments, by before-class preparation activities including reading reports, by your participation in class, and by a final paper.

Project & final paper - Initial ideas due Feb 15, research question and likely data source due March 30 [satisfied by posting on Blackboard], Final paper due May 10.

30% Design a new study based on one or more of these methods that pushes your own research in a new direction.

Apply a method from the class to your research. You should not choose a method that you already know well. Because some methods will be introduced after the project proposal date, we are open to a modification in your project to apply the newly introduced method. But, please check with us to get feedback and approval on a proposed change.

No more than 15 double-spaced pages. Be efficient. Space is always limited in academic publications and you will find it useful to learn to include only what is important. You can frame your write-up as though the audience were reviewers of a grant proposal or an internal project proposal. As you would in a grant proposal, please include some literature review and discussion of significance of the area you want to investigate. You should also briefly detail plans for participants, explain specifically how you will apply the method, and describe how you will analyze the data.

Cognitive Task Analysis (CTA) (Koedinger)

One point of reflection for you on the Clark et al reading is to compare and contrast with recommendations for collection and analysis from van Someren et al and from Ericsson et al. (If you saw Bror Saxberg's PIER talk last year, you may have heard that Kaplan is using CTA, with Clark's advice, to revise and improve their courses.)

Besides being an interesting read, a key point of this reading is the nature of expert knowledge (declarative and procedural) and how it is highly "conditionalized". Their discussion of adaptive expertise is also important and interesting.

Pick one of these readings to focus on and skim the other two. Target your first post on that reading (and make clear which one it was). Your second post can be on any of the three. These readings illustrate the use of Cognitive Task Analysis (CTA) for higher level thinking and learning skills. The Klahr & Carver reading shows how CTA can facilitate the design of instruction that achieves a substantial level of transfer. The Azevedo et al and Aleven et al readings provide examples of CTA at the level of metacognitive skills or learning skills. When you skim all three, pay particular attention to 1) what tasks the authors are analyzing, 2) what their goal is, 3) what method or methods of analysis they use, and 4) what modeling approaches the authors use to represent the output of their analysis: do they use production rules, goal trees, semantic nets, hierarchical task models, or something else?

In addition to think aloud, another empirical approach to Cognitive Task Analysis is to compare student performance on a space of similar tasks designed to test specific hypotheses about the knowledge demands of those tasks. We have called this approach "Difficulty Factors Assessment" and the Koedinger & Nathan paper is an early example. The former assignment below, which is focused on rational CTA, provides an example of the similarity in the logic of contrast used in Difficulty Factors Assessment and the contrast between the two tasks or solutions one can do in a rational CTA. Skim Koedinger & MacLaren to see another example of a production rule model and of a method of quantitative evaluation of that model by fitting it to coding categories from a solution protocol analysis.

r-tutorial-1.R - examples of statistical things that you will do in R, for this assignment

thermo11_data_integrated.csv - a data set for the examples.

2-28

1. From Trochim:

A. Chapter 3 - the vocabulary of measurement
B. Chapter 5 - on constructing scales (it's ok to focus
on the material up through sect 5.2a; the rest is
more of a skim [but I'd be happy to talk about that
in class also])

2. On item response theory (IRT), a set of statistical models that are used
to construct scales and to derive scores from them, especially in education
and psychological research:

A. Harris Article (PDF)
Please take and self-score the test at the end of
this article. Count each part of question one as
one point, and each of the remaining three questions
as one point (no partial credit!). Bring your 8
scores to class. E.g. if you missed 1(c) and (d), and
you also missed question 4, then you would bring to
class the following scores:
1 1 0 0 1 1 1 0
If you missed 1(a) and (b) and question 2, bring the
following scores:
0 0 1 1 1 0 1 1
(note that the total score is 5 in both cases, but
the pattern of rights and wrongs differs; it is the
pattern that we are interested in).
B. Please browse *online* through pp 1-23 of the pdf at
[2].
The math is a bit heavy going but there are links
to apps that illustrate various points in the
Harris article.
So skim the math and play with the apps.
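
The scoring convention for the Harris test can be sketched in code. This is a minimal Python sketch, not part of the Harris article or the course materials: the item labels (`1a` through `1e`, then `2`, `3`, `4`) are an assumption inferred from the two example patterns given for that test, and `score_pattern` is a hypothetical helper name.

```python
# Item order inferred from the two examples in the assignment: five parts of
# question 1, then questions 2-4. The labels are illustrative assumptions.
ITEMS = ["1a", "1b", "1c", "1d", "1e", "2", "3", "4"]


def score_pattern(missed):
    """Return one 0/1 score per item (no partial credit), given the items missed."""
    unknown = set(missed) - set(ITEMS)
    if unknown:
        raise ValueError(f"unknown item labels: {sorted(unknown)}")
    return [0 if item in missed else 1 for item in ITEMS]


# The two cases described in the assignment text.
first = score_pattern({"1c", "1d", "4"})
second = score_pattern({"1a", "1b", "2"})

print(*first)
print(*second)
# Same total score, different response patterns.
print(sum(first), sum(second))
```

Both patterns sum to the same total, which is why the pattern of rights and wrongs, not the total, is what to bring to class.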

3-2

The assignment for this lecture has two parts.

(A) An R assignment TBA. This you can actually email to me by Fri Mar 7.

Read through p 18. This is a more modern look at some of
the same issues that are addressed in Trochim's chapters.
The remainder of this paper surveys various probabilistic models
for the "measurement model" portion of Mislevy's framework (Figure
1). It is quite interesting but we will not pursue it.

Please read up through p 266 only.
The math is a bit heavy going so please try to read around it to
see what the point of the article is.
We will try to look at some of the data in the article as examples
in lecture 2.

3-7 Continued discussion of Psychometrics

NO CLASS – Spring break 3-14 and 3-16

Surveys, Questionnaires, Interviews (Ogan)

3-21

Reading: Trochim Ch 4 and 5

You already read Ch 5 for the Psychometrics section, so just review it. For both chapters, answer Trochim's on-line questions before and/or after reading (answering the questions before gives you goals for reading). For discussion board posts, do one post on how you have used or might use a survey (e.g., of student attitudes) in your own research. Make another post about Chapter 4, such as something you learned, a question you have, or an answer to someone else's question.

3-23

Do the following homework assignment: Arm-modQuestEduc.doc. Keep the text that's there and fill in answers, working through it step by step. I'm just as interested in your revisions as in the final version. Estimated time: 45 minutes.

Assignment: The assignment (Learning-curve-assignment-2014.doc) is a tutorial on using DataShop to begin analyzing learning curves. Upload it to Blackboard (or email it to me) comfortably before class on Thursday -- by 3pm. In addition to the problem content file indicated in the assignment handout, see the other files in the same location for a more complete description and list of the files: Geometry Area Problems PDF Explanation.docx and solutions.zip.

In-class activity: Start on one of the two exercises (A or B) below. Provide a brief writeup in response to each of the numbered steps and include a summary of the result you achieved (e.g., did you get a more predictive model as measured by AIC, BIC, or cross validation?). Turn in this writeup and the supporting file (KC model table or R file) on Blackboard. Make significant progress before class next Tuesday (at least get to a point where you are stuck or can see your way to the end). Due by end of day on Wednesday, 4-5.

Do A or B:
A. Modify a KC model in a DataShop dataset
1. What is the DataShop dataset you modified? (Look for datasets with the lego block icon on them -- these have associated problem descriptions)
2. Describe how you used the HMST procedure (from Stamper paper)
to identify a KC to try to improve
3. Show how you recoded that KC with new KCs (turn in your modified
KC file) & describe why you made the change you did
4. After importing your new KC model to DataShop, did it improve the
predictions on any of the metrics, AIC, BIC, or cross validation?
(Caution: Make sure your new KC model labels the same number of
observations as the KC model you are modifying.)

How do the parameter
estimates and metrics (AIC and BIC) compare with results in DataShop?
2. Modify the regression equation to try to improve the prediction.
Some options include: a) adding a student by KC interaction (there
are just main effects of student and KC in AFM), b) adding student
slopes (there is just a KC slope in AFM), c) counting success and
failure opportunities separately (both kinds of opportunities are
lumped together in AFM), d) using log of Opportunity, e) including
step (as a random effect) ...
3. Turn in your R file including metrics (log-likelihood, parameters,
AIC, BIC) on the statistical models you compared
4. Summarize whether or not your modification changes model fit
(log-likelihood), changes the number of parameters (from what to what),
and, most importantly, improves prediction (e.g., as measured by AIC or BIC or cross validation)
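
For reference when comparing models in either exercise, the sketch below computes AIC and BIC from a log-likelihood using the standard definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, where k is the number of parameters and n is the number of observations. It is in Python for illustration only (the assignment itself uses R and DataShop). The function names and all numbers are made up and are not results from any DataShop dataset. Lower values indicate better expected prediction.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical values for illustration only (not from a real dataset).
n_obs = 5000
baseline = {"log_likelihood": -2500.0, "n_params": 60}
modified = {"log_likelihood": -2480.0, "n_params": 75}

for name, model in [("baseline", baseline), ("modified", modified)]:
    ll, k = model["log_likelihood"], model["n_params"]
    print(name, "AIC:", round(aic(ll, k), 1), "BIC:", round(bic(ll, k, n_obs), 1))
```

In this made-up case the modified model fits better and has a lower AIC, but a higher BIC, because BIC charges ln(n) per added parameter rather than 2. Reporting the change in log-likelihood, the change in parameter count, and both metrics makes that trade-off visible.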

Flex day (Koedinger)

4-6 To be used in case of rescheduling, for a student-driven topic, and/or for Review of Projects or Past Topics

We will wrap up on EDM for learning curves (option1) and, time permitting, give work time for your project.

Optional reading: Chapter on Design Research in Handbook of Learning Sciences

Educational Data Mining -- Causal Inference from Data (Scheines)

4-11

Before class on 4-11, do Unit 2 in the OLI course Empirical Research Methods

Go to: http://oli.cmu.edu/learn-with-oli/see-our-free-open-courses/
Scroll down and click on the rightmost tab, "Prior work (5)"
Click on "Empirical Research Methods" and then on "[Enter Course]"
Click on "CMU users sign in here" to login with your CMU account
or "Enter Without an Account"
Complete "UNIT 2: Regression, Prediction and Causation"