Friday, May 29, 2009

So this paper proposes one of the key concepts we are using in our meta-analytical review of intentional vocabulary learning: task-induced involvement (see figure for a breakdown of different tasks in terms of 'need', 'search', and 'evaluation'). The concept is based on Craik & Lockhart's (1972) psychological 'depth of processing' hypothesis, and even tries to address the main criticism of that hypothesis: how one specifies whether a particular level of processing is deeper than another.

It is not clear to what extent their approach can be applied to intentional learning. They contrast incidental/intentional learning with implicit/explicit learning, pointing out that explicit learning can occur both incidentally and intentionally. The key difference appears to be that the implicit/explicit distinction is based on whether the learner can introspect on the details of something they have previously learnt, whereas the incidental/intentional distinction concerns whether the learner was paying explicit attention to the learning process as it happened.

They focus on incidental learning, saying:

A researcher or a teacher may, for example, suggest the use of the keyword method, yet the learner will choose another memorization strategy with which s/he may feel more comfortable. Incidental learning, on the other hand, can be manipulated and therefore empirically investigated.

This seems to imply that intentional learning cannot be empirically investigated. However, I would have thought the same criticism applies to incidental learning, in that the experimenters still cannot control which aspects of a task a learner actually pays attention to. It seems like the studies they are referring to, which were not conceived in terms of task-induced involvement, are not necessarily all incidental learning studies, although I am not 100% sure about that.

As an example of the difficulty of trying to determine the depth of processing of an instructional task, they present the following simple tasks learners could perform related to the word 'skinny':

1) looking up its meaning in a dictionary and writing a sentence with the word
2) looking up its meaning and explaining the difference between 'skinny', 'thin' and 'slim'
3) receiving a sentence with the word and trying to infer its meaning from four alternatives presented by the teacher

It seems to me that the operations required to complete the first two of these tasks are as follows:

1) Alphabetical sort to perform the dictionary lookup; reading the dictionary definition to attempt a form-meaning mapping, which may require lookup of other words; then, given comprehension of the meaning, generation of a possible sentence (in L1?) followed by filling in the sentence based on L2 grammar and lexis.

2) Alphabetical sort to perform the dictionary lookup; reading the dictionary definition to attempt a form-meaning mapping, which may require lookup of other words; further alphabetical sorts to look up unknown synonyms; further reading of definitions; a process of synonym meaning comparison (involving mapping back to L1?).

Of course there are various unknowns here, related to the extent to which the word in question is already known to the learner, and the level of knowledge they have of the grammar and lexis used in the example sentences and dictionary definitions. There is further ambiguity in the extent to which the learner is operating cognitively in L2 versus L1. Thus each of these tasks will require highly variable amounts of processing time. The authors present their own assessment heuristic for the extent to which a task aids long-term retention, in terms of the degree of 'need', 'search' and 'evaluation' that each task requires. However, it seems to me that the critical question is what we mean by long-term retention. Each of the tasks above prepares the learner for success in similar tasks in the future, so measured retention will likely be higher for whichever future retention test most closely matches the task originally performed.

I get the feeling from the rest of the paper that retention is to be measured by the learner being able to describe the meaning of the word, but this is not explored in detail. It seems like the important things to determine are the degree of transfer that different tasks provide in terms of supporting other tasks, and the identification of which tasks the learner wants to succeed at in future - i.e. what their language learning objectives are. Personally, I would like to be able to choose an appropriate L2 word as I am constructing a sentence in my head, as well as be able to infer the meaning of an L2 word as I am parsing a sentence. The key question is not the extent to which I retain the definitions of individual words, but how successful I can be at these tasks in future. Which takes me back to the idea of trying to get the practice task as close as possible to the objective task. It makes me think that all language learning tasks should be built around things that learners want to express, not words they should be retaining. The authors mention the Newton (1995) study, where those observing negotiation also got retention benefits, and I wonder if the same transfer would be found for learners observing another learner working on a 'personal expressive task'.

The paper also includes an interesting section on motivation, with some references I was not yet aware of. Motivation is part of their construct in terms of the 'need' variable, which is 'none' if there is no need for a learner to understand a word to complete a task, 'moderate' if a teacher is asking the learner to perform a task specific to a certain word, and 'strong' if the learner is trying to comprehend a word of their own choice to help them complete a larger task. Interestingly, even in this 'strong' case the overall task may have been set by a teacher, e.g. perform this comprehension task. I would advocate for an even stronger category of need, where the top-level task itself is generated from the learner's interests and/or personal needs, e.g. I need to negotiate with my landlord over my rent - how can I say X, Y and Z persuasively?

The 'search' component of task-induced involvement is binary, and appears to be designed to distinguish between cases where learners are provided with glosses (definitions) and cases where they are not. The 'evaluation' component is 'moderate' if there is some comparison between word alternatives, e.g. a multiple choice cloze test, and 'strong' when the learner is actively constructing a novel sentence or text. Interestingly, there is no particular mention of receptive versus productive tasks, or recall versus recognition, although evaluation seems related to both of these.
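Since the construct is just three graded components summed, it is easy to sketch in code. The level names follow the paper, but the 0-2 numeric mapping and the code itself are my own toy encoding, not the authors' formalization:

```java
public class InvolvementLoad {
    // 0-2 scoring per component level; this numeric mapping is my
    // assumption, not something the paper spells out as arithmetic.
    static int score(String level) {
        if (level.equals("none") || level.equals("absent")) return 0;
        if (level.equals("moderate") || level.equals("present")) return 1;
        if (level.equals("strong")) return 2;
        throw new IllegalArgumentException("unknown level: " + level);
    }

    // Total task-induced involvement load = need + search + evaluation.
    static int load(String need, String search, String evaluation) {
        return score(need) + score(search) + score(evaluation);
    }

    public static void main(String[] args) {
        // Task 2 from the 'skinny' list above: teacher-imposed need,
        // dictionary search present, strong evaluation of synonyms.
        System.out.println(load("moderate", "present", "strong")); // 4
    }
}
```

So task 2 above would score 1 + 1 + 2 = 4, whereas a glossed multiple-choice task with no search would score lower.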

I think I need to read a number of other papers before I can work out whether these concepts make sense for intentional learning - they certainly appear to. I guess the authors would just argue that intentional learning studies contain a greater degree of ambiguity, but there is nothing a priori preventing us from classifying intentional studies in terms of task-induced involvement load.

Tuesday, May 26, 2009

So I was just in Borders looking at the programming books, and I was reminded of my frustration at programming in Java again for Android. It seemed needlessly verbose compared to scripting languages like Ruby. Handling XML seems like a big point of difference. For handling XML on Android I was looking at all kinds of libraries:

1) JAXB (requires schema in advance)
2) JDOM (couldn't get it to work on Android)
3) SAX (the default - others built on this?)

compared with ruby's crack library, which parses an XML document straight into a hash. Combining this with the Mash class I can access that hash like an object, like so:

xml.posts.post.title  # => "Foobar"

Anyhow, in Java I just ended up building the closest equivalent I could on top of SAX. This involved dumping the XML structure into a Node class with a Vector of children, as well as a LinkedHashMap index mapping element names to Vectors of the children with that name, and a LinkedHashMap of attributes, thus:

node.get("item") returns a Vector of all children of type <item></item>
node.atts.get("att_name") returns an attribute value, e.g. "frank" from <item att_name="frank"></item>
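Concretely, here's a minimal sketch of the kind of Node class I mean. The names match the usage above, but the details are illustrative rather than my actual Android code, and the SAX handler that populates the tree is omitted:

```java
import java.util.LinkedHashMap;
import java.util.Vector;

// Illustrative sketch of the Node structure described above.
public class Node {
    public String name;
    public String text = "";
    public Vector<Node> children = new Vector<Node>();
    // element name -> all child nodes with that name, in document order
    public LinkedHashMap<String, Vector<Node>> index =
            new LinkedHashMap<String, Vector<Node>>();
    public LinkedHashMap<String, String> atts =
            new LinkedHashMap<String, String>();

    public Node(String name) { this.name = name; }

    public void add(Node child) {
        children.add(child);
        Vector<Node> siblings = index.get(child.name);
        if (siblings == null) {
            siblings = new Vector<Node>();
            index.put(child.name, siblings);
        }
        siblings.add(child);
    }

    // node.get("item") -> Vector of all <item> children (empty if none)
    public Vector<Node> get(String name) {
        Vector<Node> result = index.get(name);
        return result == null ? new Vector<Node>() : result;
    }
}
```

So given `<channel><item att_name="frank"/></channel>`, `channel.get("item").get(0).atts.get("att_name")` would return "frank" - a long way from `xml.posts.post.title`, but workable.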

It all makes me wonder if there is a book somewhere on why scripting rocks. I have also been having ideas about how the appeal of scripting is all about how simple it is to manipulate data structures. The key ones seem to be collections of objects and hashes. So much stuff seems to come down to manipulating one of those two in some way, and making the syntax for handling those two as magically simple as possible seems key. Even using the advanced for loop in Java, trying to dump information about an http request took a lot of effort:

HttpResponse response1 = client.execute(post);
Log.d("DEBUG", response1.getStatusLine().toString()); // the http status line
Header[] array = response1.getAllHeaders();
for (Header h : array) {
    Log.d("DEBUG", h.toString()); // each of the http response headers
}
long length = response1.getEntity().getContentLength();
byte[] response_bytes = new byte[(int) length];
response1.getEntity().getContent().read(response_bytes); // the http response body
Log.d("DEBUG", new String(response_bytes));

Of course I also love Ruby mixins, and the lack of multiple inheritance support in Java seems particularly disturbing. In this case I was particularly frustrated by wanting to print out the headers: calling array.toString() gives me something like [Ljava.lang.String;@89ae9e. Now I can see the argument that this is sensible default behaviour, and that it is then up to the programmer to implement an alternative pretty print-out by overriding toString() in a subclass, but I am not sure that is even possible for arrays in Java. In Ruby I could simply insert the necessary functionality using a mixin or similar. There is some discussion of how to achieve this in Java in mailing lists, e.g. here, but all the solutions are awkward and complex.
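For what it's worth, the one clean workaround I know of for the array case is the static helper java.util.Arrays.toString. It sidesteps subclassing entirely, though it doesn't let you customise the per-element formatting, which is the deeper complaint:

```java
import java.util.Arrays;

public class PrintArray {
    public static void main(String[] args) {
        String[] headers = { "Content-Type: text/html", "Server: Apache" };
        // headers.toString() prints something like [Ljava.lang.String;@89ae9e
        // Arrays.toString gives a readable version instead:
        System.out.println(Arrays.toString(headers));
        // => [Content-Type: text/html, Server: Apache]
    }
}
```

It only works because someone at Sun wrote that helper; there is still no way to reopen the array type itself, which is exactly what a Ruby mixin would let me do.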

I mean Java ends up being so much more verbose. Maybe it is much much more efficient, and this complexity supports all sorts of things that I'm not thinking about, but this is pretty much the point - most of the time I'm not thinking about them. Most of the time I want to explore something without having to worry about grabbing the length of the http response body, reassigning its type and then creating a byte array to store the results in. I just want to see the body. I'll come back later and optimise if I need to.

Perhaps I am missing some clever java scripting tricks, in which case please let me know, but I do find it quite painful dropping back to Java after scripting with Ruby.

Monday, May 25, 2009

Prior to releasing my own twitter app, I thought I'd just check and see what other existing twitter statistic apps there are. I found TweetStat, which had very interesting stats on my own twitter usage and also pointed me at another site to make a Wordle - pictured here.

I also found Mr. Tweet, which seems to require me to follow them before I can find out more about the stats of my friends and followers. I'm less enthusiastic about that, but their model seems to be to push stuff out over the twitter wire ... will blog when I get something interesting from them. Seems like this might be a slow service. TweetStat was remarkably fast given that I didn't have to sign in.

Friday, May 22, 2009

This is another paper I am reading as part of a meta-analysis of different approaches to teaching vocabulary. It was a very interesting study that showed big improvements not only in vocabulary retention on straightforward recall tests, but also in listening comprehension and knowledge transfer. There were four experimental conditions: teacher-led class study (P&P), computer study lists (CW), computer study lists with pictures (CP), and the clear leader, computer study based on learning vocabulary in the context of a narrative (CC). I would really have liked to see some example images from the interface. The results seem to show that learning simple vocabulary in the context of narratives is highly effective.

I think it is fascinating that students in the context condition (CC in the above diagram) outperformed the others on pure recall tests. I had been starting to form the impression that preparing for a particular form of test is the best way to improve performance on that test, but in this case it seems like preparing for knowledge transfer also led to better scores on straight recall tests; although I'd need to see interface images to check whether this really does contradict the kind of cross-over we see in Groot (2000), where concordancing practice boosts performance on concordance tests but not on straight paired-associate recall, and vice versa.

Kang cites lots of relevant theory, such as "inert knowledge" (Brown et al., 1989) and "cognitive embedding" (Ausubel, 1968), which links up with points I was making in a recent journal paper I co-authored with Maria Uther of Brunel University in the UK: Joseph & Uther (2009). In that paper I referred to some research (Chi & Koeske, 1983) indicating that information that is more deeply embedded in an individual's knowledge network is likely to be remembered longer. Although this is a common theme in the literature on memory and second language acquisition, the references that Kang cites are different from those I have been aware of so far. Of course that is not so surprising, but it gives me further pointers for linking up this concept as far as computer-assisted language learning goes.

One query I have is about the reliability coefficient that Kang reports for each type of test. My, possibly flawed, understanding of reliability is that one is attempting to work out how effective some measure is at assessing an individual construct, as in a social science questionnaire where multiple questions attempt to probe the same underlying construct, like racism or sexism. However, in an experiment like Kang's each test is attempting to measure the learner's knowledge of a particular word. There are no repeat measurements using different instruments, except to the extent that Kang employs three types of vocabulary test. One could assess reliability across those tests, but Kang reports reliability for each individually.

The only way I can make sense of a reliability coefficient for a single type of test, on multiple words, over multiple learners, is that we are thinking of knowledge of multiple words as a single construct and assessing the reliability of the test instrument in those terms. However, since the learner may have had individual difficulties with each word separately - i.e. they are likely to learn different words at different rates - that doesn't quite make sense. Unless all the words are very similar in terms of abstractness, visualizability, frequency etc., i.e. we have determined that each test is probing the learner's learning of the same sort of word, e.g. the reliability of a test type such as productive recall for assessing learning of concrete nouns. Kang describes the vocabulary used in the study as common everyday words, such as household items and routine activities. Anyhow, I don't think this is a serious criticism of a very interesting study, and it's highly likely that I am misunderstanding the meaning of reliability measures in this context, but it does seem a little like a situation where the statistical software generates reliability measures and they are reported verbatim, without assessment of their suitability for the experiment in question.
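To make my confusion concrete: the reliability coefficient usually reported in this situation is Cronbach's alpha, computed over a learners-by-items score matrix, where the items here would be the individual words. I'm assuming that is what Kang's software computed - the paper doesn't say - but a sketch makes clear why it treats 'knowledge of these words' as one construct:

```java
public class Cronbach {
    // Cronbach's alpha for a scores[learner][item] matrix:
    // alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))
    static double alpha(double[][] scores) {
        int n = scores.length;      // learners
        int k = scores[0].length;   // items (words)
        double sumItemVar = 0.0;
        for (int j = 0; j < k; j++) {
            double[] item = new double[n];
            for (int i = 0; i < n; i++) item[i] = scores[i][j];
            sumItemVar += variance(item);
        }
        double[] totals = new double[n]; // each learner's total score
        for (int i = 0; i < n; i++)
            for (int j = 0; j < k; j++) totals[i] += scores[i][j];
        return (k / (k - 1.0)) * (1.0 - sumItemVar / variance(totals));
    }

    static double variance(double[] xs) { // sample variance
        double mean = 0.0;
        for (double x : xs) mean += x;
        mean /= xs.length;
        double ss = 0.0;
        for (double x : xs) ss += (x - mean) * (x - mean);
        return ss / (xs.length - 1);
    }
}
```

The formula only hangs together if the per-word scores are all measuring the same underlying thing, which is exactly the assumption I am questioning above.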

So I am this close to releasing my Twitter app. I've got hooked up with twitter-auth and a publicly accessible web server (thanks Ken!) and I'm now able to start processing data from other twitter users. Here's an example of my friend Ken Mayer's twitter friends activity cloud:

I've made some small changes based on great input from my friend Robert Brewer (and George Lee and Ken Mayer), such as making each icon have a minimum size so you have a fighting chance of seeing who of your twitter friends is not being very active. A little more cleaning up and I hope to release this to anyone who wants to see their own twitter friends activity cloud.

Last stumbling block is that getting all the data requires hitting twitter as many times as you have friends, and if this exceeds the 100 requests per hour that twitter allows then the interface can fail a little ungracefully. Hope to fix that and do at least a partial release next week. Hoping to get my twitter friends to test it first, and then all comers, gulp!
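The failure mode seems easy enough to guard against in principle: fetch only as many friends as the remaining request budget allows, and render a partial cloud rather than dying. A sketch of the shape of the fix - hypothetical names, and in Java rather than the Rails the app is actually built in, just to match the earlier post:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: spend at most 'budget' API calls on friends'
// stats, so the interface degrades to a partial cloud instead of failing.
public class RateLimitedFetch {
    interface Fetcher { String fetch(String friend); } // one API call each

    static List<String> fetchWithinBudget(List<String> friends, int budget,
                                          Fetcher fetcher) {
        List<String> results = new ArrayList<String>();
        for (String friend : friends) {
            if (budget-- <= 0) break; // stop before twitter starts refusing us
            results.add(fetcher.fetch(friend));
        }
        return results;
    }
}
```

With 150 friends and a budget of 100, you'd get the 100 most recently fetched friends and a note that the cloud is partial, rather than an ungraceful error.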