Affective states and state tests: Investigating how affect throughout the school year predicts end of year learning outcomes.

What is the correspondence between students’ affect during the school year and their end of year state test scores? Affect – every second! – engagement, boredom, confusion, frustration; behaviour – off-task, gaming.

Predictions from the room: Might assume that a more engaged student would have a higher test score. Boredom – maybe, arguments both ways. Confusion and frustration similar. Off-task and gaming probably less good. Smilies/emoticons to illustrate!

O_O

😮

?_?

>:o

The link between performance on the tutor and test outcomes has been investigated (Fend et al 2009), which compared no feedback (just testing) with feedback – can predict just as well if you let the students learn, so maybe the state test isn’t so necessary. What part of the outcome story does affect tell?

Why is this important? Affect is an important state that could tell the story of what interventions are right for a given state. E.g. if bored need condition A, if confused need condition B. Could give you a bigger effect size. If the VLE knew your affective state – e.g. engagement, arousal – could say e.g. you’re fading, take a break for 20 minutes and come back. Could be a valuable signal for teachers, get info they can’t know all the time. Teachers are expert affect detectors, but they can’t look everywhere all the time. Why correlate affect with state tests? High-level: for a lot of K-12 teachers, there’s a lot of teaching to the test. State standards, Common Core. A lot of concern there’s not enough focus at the policy level to help students in non-maths-skilled ways. Paulo Blikstein: “We teach what we can measure”. If we can measure affect, and it’s a strong correlate of state tests, it’s OK to teach to that.

Methodology: Measure during the whole year. Not observers in classroom the whole year. Code a sample of students with labels while using a tutor. Classroom field observations, human coded. Take those labels, find corresponding tutor log data, create features, then learn a function mapping from those features to the labels, then apply that to the entire dataset. Machine learn a function, apply to the out-of-sample students. Collect end of year state test scores, observe the correlation between raw affect and end of year scores. Two years. Only measure during time using the tutoring system.

ASSISTments system. Primarily algebra and geometry. Two types of questions. Original questions – publicly released state test items. Or scaffolded version – gives scaffolding that breaks the question up into steps (if asked for or student answers incorrectly). Content problem set types: while the student is in the lab they get random items from the problem set. Also mastery and skill builder sets.

Coding the students. Two expert field observers chose classes, two schools, rural and urban. Code students one at a time for 20s intervals, and label the affective states. Inter-rater reliability tested – kappa of 0.72 for affect, 0.86 for behavior. 3075 observations of 229 students.
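As a rough illustration of the agreement figures quoted above: the notes only say "kappa", so this sketch assumes Cohen's kappa for two raters. The labels are made up and the function name is hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty label sequences of equal length")
    n = len(rater_a)
    # Proportion of observations on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Made-up labels for six 20-second observations (not data from the study).
obs_a = ["engaged", "bored", "engaged", "confused", "engaged", "bored"]
obs_b = ["engaged", "bored", "engaged", "engaged", "engaged", "bored"]
kappa = cohens_kappa(obs_a, obs_b)
```

Here the raters agree on five of six clips, but kappa comes out lower than 5/6 because some of that agreement would be expected by chance.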

Baker et al EDM 2012 methodology. StudentID/Time/Label, use machine learning classifiers to map from tutor log features to the label. Stepwise regression, add features until it no longer improves fit, then take those features and include in the more complex classifiers. Example features – number of correct answers during clip; proportion of actions taking >80s to respond; whether student followed scaffolding with a hint request; how many of students’ previous five actions included same. Many others.
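The "add features until it no longer improves fit" step can be sketched as greedy forward selection. This is a generic sketch, not the authors' procedure: the fit measure is passed in as a function, and the feature names and gains below are invented stand-ins loosely based on the examples above.

```python
def forward_select(candidates, score, min_gain=1e-9):
    """Greedy forward selection: keep adding the feature that most improves
    score(features) and stop once no candidate improves it by more than min_gain."""
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        trial_scores = {f: score(selected + [f]) for f in remaining}
        top = max(trial_scores, key=trial_scores.get)
        if trial_scores[top] - best <= min_gain:
            break  # no remaining feature improves the fit
        selected.append(top)
        remaining.remove(top)
        best = trial_scores[top]
    return selected, best

def toy_fit(features):
    # Stand-in for a real goodness-of-fit measure; the gains are made up.
    gains = {"correct_in_clip": 0.30, "slow_actions": 0.15, "hint_after_scaffold": 0.05}
    return sum(gains.get(f, 0.0) for f in features)

candidates = ["correct_in_clip", "slow_actions", "hint_after_scaffold", "noise"]
chosen, fit = forward_select(candidates, toy_fit)
```

In practice `score` would fit a regression on the labelled clips and return a fit statistic; the toy version only shows the stopping rule.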

Gaming is trying to get the right answer without thinking – e.g. answering question wrong, getting sub-things wrong, until it tells you the answer. The teacher is alerted that they had the hint with the answer in.

Applied eight classifiers, with the idea that different classifiers might work better for each state. The average A’ was about 0.8, average kappa from 0.2 to 0.5 – chose on kappa.
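For readers unfamiliar with A’: it is commonly described as the probability that a detector scores a randomly chosen positive clip higher than a randomly chosen negative one (equivalent to the area under the ROC curve). A minimal sketch under that assumption, with made-up detector scores:

```python
def a_prime(scores, labels):
    """Fraction of (positive, negative) pairs where the positive example gets
    the higher score; ties count as half a win."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Made-up detector confidences and human labels (1 = state present in clip).
detector_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
human_labels = [1, 0, 1, 0, 0]
result = a_prime(detector_scores, human_labels)
```

A value of 0.5 is chance and 1.0 is perfect ranking, so an average of about 0.8 means the detectors rank clips well even where kappa is modest.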

Applied these tools to two whole years of ASSISTments log data – 639 students 2004/5, 764 2005/6 – a while ago because had state test scores available.

Goal is to correlate affect with state test score, but affect first needs to be summarised for each student. Average of each affective state calculated for originals and scaffolds, weighting affect in each skill equally. It’s not a simple average, it’s treating each skill (subtraction, addition, etc) equally. The state test samples skills equally, so more likely to see a correlation.
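The difference between a simple average and the skill-weighted one can be shown with a small sketch. The data layout (one `(skill, value)` pair per clip) and the numbers are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean

def skill_weighted_mean(clips):
    """Average an affect estimate within each skill first, then average those
    per-skill means, so each skill counts equally however many clips it has."""
    by_skill = defaultdict(list)
    for skill, value in clips:
        by_skill[skill].append(value)
    return mean(mean(values) for values in by_skill.values())

# Made-up boredom estimates for one student: three addition clips, one subtraction clip.
clips = [("addition", 0.2), ("addition", 0.2), ("addition", 0.2), ("subtraction", 0.8)]
simple = mean(value for _, value in clips)   # dominated by the skill with most clips
weighted = skill_weighted_mean(clips)        # each skill contributes equally
```

With these numbers the simple average is about 0.35 while the skill-weighted one is about 0.5, because the single subtraction clip counts as much as the three addition clips together.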

Results: Pearson correlations for affect vs original / scaffold problem types. None much higher than 0.4, many small. Engagement and gaming had the strongest signals – engaged students had +ve correlation (0.44) – gaming had -ve correlation (about -0.44 for each). Boredom is negative if bored during original question, but positive if bored during scaffolding (0.3) – maybe you knew it and are bored of the help. Confusion – similar – if confused on the originals, that’s not good for state test correlation. But confused on the help, that’s not bad; in the literature, confusion is good if it’s on items that resolve your confusion – which scaffolding is meant to be.
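For reference, the Pearson correlation used here can be computed as below. This is a minimal sketch with invented per-student numbers, not values from the study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)

# Made-up per-student summaries: engagement proportion and state test score.
engagement = [0.5, 0.6, 0.7, 0.8]
test_score = [220, 230, 250, 240]
r = pearson_r(engagement, test_score)
```

The sign of `r` gives the direction of the relationship (positive for engagement, negative for gaming in the results above) and its magnitude gives the strength.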

Plot summary of affect scores for different proficiency categories. Advanced to failing. Frustration scores are higher the more advanced you are. Confusion scores go down the more advanced you are – same with gaming.

Affect has strong correspondence to state test outcomes. Looked at difference between affect during the original and the scaffolding – e.g. confusion, boredom. Affect is important in state test achievement.

Questions

Q: If a student was frustrated and then solved the problem and moved into happy, how was it coded if there was a state shift in the clip?

The first one was used.

Q Role of stepwise regression in the process? Stepwise is usually highly sensitive to outliers. Do any transformation or diagnosis?

Don’t think we did any treatment of the data for stepwise regression. In the methodology, it was used for feature selection before the fivefold validation to choose the best classifiers. Could be improved by using a multi-state classifier, rather than a separate one for each. Might not be perfect – could still have different errors for each. Work to hone these models.

Q: Analysing the in-tutor time, take precautions to say don’t care about off-tutor? What was effect of out of tutor on final score?

We didn’t measure that, outside the tutor learning. Presumably there was lots. The tutor was once per week. It’s an open question how much affect in the modality of a VLE relates to in the classroom. It was a sampling of their whole time.

Phil: What were the operational definitions of the affects? How do you know if I am frustrated?

Field observer: We used the BROMP protocol for affect coding. Holistic, not single signals, try to make a judgement based on posture, facial expressions, etc. You get double inter-rater reliability. People are good at spotting that someone’s bored, bad at saying why that is.

Chris: What about the classifier?

I don’t know!

SNA and visualisation

Considering Formal Assessment in Learning Analytics within a PLE: The HOU2LEARN Case.

Used students – N=76 in MSc course PLH42 at HOU. Platform open to all. 6 assignments, 1 final exam. Aim to promote openness, to create and share content in the network. To communicate informally in a less stressed way, exchange experiences, content etc. To promote socialisation among members, endorsing ideas.

Front page of platform – has an activity stream prominently. Based on Elgg. From Sep 2010, research and educational purposes. It doesn’t support roles.

Designed activity metrics (SQL queries) according to the course needs and set up – this is work in progress. E.g. topics each user has uploaded, new bookmarks, etc.

SNA visualisation from early on – many students entirely unconnected, and the most connected/largest node is the instructor. Then next shot after two months, the connections have increased, with still many students who ‘are introverts’ (!). But the instructor is no longer dominant in the network, there are several other nodes that are large too. Then in May, connections now around 400, can identify clusters among the users.

Then took into consideration the final grades, with node size proportional to the final grade. The node with the highest grade was not well connected (but not very badly). Then indegree centrality, popularity of the node – most popular node does not have the highest grades. Higher grade does not mean higher indegree centrality. Next – outdegree centrality – most outward pointing nodes, how many users you follow – the highest outdegree centrality had a pretty high grade. Finally betweenness centrality – brokering – top node was same as highest outdegree centrality, and higher grades.

Highest indegree centrality node had lower grade than average. Highest outdegree centrality had one of the highest grades and also had highest betweenness centrality. Students with high grades have to increase betweenness centrality.
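The three centrality measures can be sketched on a tiny directed "follows" graph. This is an illustration only: the edge list is invented, edges are assumed unique and directed from follower to followed, and betweenness is the unnormalised shortest-path count computed with Brandes' algorithm. The talk does not say which software or normalisation was used.

```python
from collections import deque

def degree_counts(edges):
    """Return (indegree, outdegree) dicts for a list of (follower, followed) edges."""
    indeg, outdeg = {}, {}
    for src, dst in edges:
        for node in (src, dst):
            indeg.setdefault(node, 0)
            outdeg.setdefault(node, 0)
        outdeg[src] += 1   # how many users src follows
        indeg[dst] += 1    # how many users follow dst (popularity)
    return indeg, outdeg

def betweenness(edges):
    """Unnormalised betweenness on a directed, unweighted graph (Brandes' algorithm)."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, []).append(dst)
        adj.setdefault(dst, [])
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = dict.fromkeys(adj, -1)
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path dependencies.
        delta = dict.fromkeys(adj, 0.0)
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score

# Made-up network: a and d both follow b, and b follows c.
follows = [("a", "b"), ("b", "c"), ("d", "b")]
indeg, outdeg = degree_counts(follows)
broker = betweenness(follows)
```

In this toy network `b` is both the most followed node and the only broker, since every path from `a` or `d` to `c` passes through it.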

Future work – run again, more development and combination, integrate experiments with groups.

Questions

Sheila: Did you share these slides back with the students, did they understand what action they could take?

Not yet. Want to consider anonymisation and ethics first. We work with their IDs, so have to find methods to deal with that, otherwise they will feel they are monitored.

Doug: Did you do the correlations between the centrality measures and the final grades?

No, this is preliminary research.

Visualizing Social Learning Ties by Type and Topic: Rationale and Concept Demonstrator

Bieke talking, from OUNL. Work is close to that of the previous presenters. This place is special for me, I graduated 10 years ago in this auditorium, the ceremony was here. What is a political and social scientist doing here? It’s the result of interdisciplinary collaboration.

Worked in EU projects on innovation in education, interested in CPD of teachers, 2y ago started PhD, moved to OUNL, supervisor Maarten de Laat. Question: How do teachers develop learning ties to develop in the workplace? Face to face.

Teachers are fed up with questionnaires. Online tool for giving a reflective account of ties in the workplace. Tool helps us gather data, but also gives direct reflection for them. Practice-based research. Linked up with IT person on visualisations, created the Network Awareness Tool presented last year. Want to visualise online interactions, link to Simon Buckingham Shum and Rebecca Ferguson, informal learning platform for learners and teachers – SocialLearn, plug in this tool and see what happens.

Social learning analytics are designed to support learning through social networks. Based on Networked Learning Theory. Interested in the structure of networks, position of people in a network and the antecedents and consequences. Also Social Capital Theory – looking at the content as well as the network structure.

Developed the NAT plug-in for SocialLearn. Multiple levels at once. Have theme cloud, the overall network structure. Then the network structure itself. Then the tie and its multiplexity – friends, followers, responding – for seeking advice, people tend to go to people they like rather than real experts. And ego networks – individual network relations per person.

Live demo!

Big network of entire platform. Nodes are people, ties are friends, followers and responding shown as yellow, pink, blue lines. Content represented as tag cloud. Can see some individuals unconnected, there’s a really popular person with lots of ties.

Smaller network – select from tag cloud, see that subnetwork. Can replot. Mouseover a node gives you list of topics, mouseover the tie, gives you list of topics shared. Can click through to an individual’s ego network.
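The two views described here (a topic-filtered subnetwork and a person's ego network) amount to simple filters over a list of ties. A minimal sketch, assuming each tie is stored as `(person_a, person_b, tie_type, topics)`; that layout and the sample data are hypothetical, not the NAT plug-in's actual data model.

```python
def topic_subnetwork(ties, topic):
    """Keep only the ties whose shared topics include the selected tag."""
    return [tie for tie in ties if topic in tie[3]]

def ego_network(ties, person):
    """Keep only the ties that involve the given person directly."""
    return [tie for tie in ties if person in (tie[0], tie[1])]

# Made-up ties: (person_a, person_b, tie_type, shared_topics)
ties = [
    ("ann", "bob", "friend", {"assessment", "maths"}),
    ("bob", "cal", "follower", {"maths"}),
    ("cal", "dee", "responding", {"feedback"}),
]
maths_ties = topic_subnetwork(ties, "maths")
bob_ego = ego_network(ties, "bob")
```

Selecting a tag from the cloud corresponds to `topic_subnetwork`, and clicking through to an individual corresponds to `ego_network`.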

Reflected with teachers on the tool, but haven’t reflected with learners. From experience with teachers, expect learners can see their own learning network. Many important roles in the network as well as being popular. Could be a block if you are very central. Have to have a tutorial or guide to understand these visualisations.

Many directions for future research. Learners’ perceptions, does the content of ties influence the structure, semantic analysis on the tag cloud, dynamic analysis. Do students find more peers to learn from using the NAT plug-in? Potential tool for conferences to find co-researchers to do interdisciplinary research?

Question

Q From network analysis, many things to pursue. Suggestions – looking for subcommunities, that would be very interesting here. Also if want to find different patterns, behavioural patterns, ? modelling.

Yes, definitely. This is for the teachers, want to offer for the users. But for us, more modelling is useful. Interesting from a research perspective. Find tacit, explicit knowledge, is it different structure.

Q Trying to get a community of PhD students, all sitting on an island. Would be able to use this tool to create communities?

Oh yeah, we’re using it in my institution also. It’s not related to an online community, but as a PhD student I say these are my interests, and they are displayed so that my colleagues can see and connect. It’s free to use.

–

This work by Doug Clow is copyright but licenced under a Creative Commons BY Licence.
No further permission needed to reuse or remix (with attribution), but it’s nice to be notified if you do use it.