Plan the Test

Well, first you plan the test. What are the domains that you want to address?

In our case, we looked at semantics, morphosyntax, phonology, and pragmatics.

Write Items for Each Domain

Then we wrote items for each of these domains. We wrote probably three or four times as many items as we would eventually want.

A clinical test is one that is pretty efficient. You want to be able to give a test and have each subtest take maybe 20 minutes, right? You don’t want to do a three- or four-hour test. But to start, you have to begin with a three- or four-hour test to figure out which items are your best items for this population. That’s basically what we did, and each person is going to talk about how we did that for each domain in particular.

Administer to a Small Sample

Then you administer all the items to a small sample. Usually 50 kids. If you’re developing something like the GRE, you would administer that to several hundred. We’re not doing the GRE, so 50 or so kids was a good sample size to start with.

Conduct Item Analysis

Then you conduct item analysis. And through that item analysis, you start throwing out the items that don’t work.

Administer Revised Test and Cross-Validate

Then we take this shorter, revised test and administer it to another, bigger sample.

Then we continue to do this, doing item analysis and cross-validation as we start to step through all these steps and reduce the number of items that we have in each of the different domains.

Item Difficulty

I’m going to tell you a little bit about item difficulty. Item difficulty is the proportion of a population that got an item correct. These are called p values, and they range from 0 to 1. If a p value is close to 0, that means almost nobody got that item right. It’s way too hard, it doesn’t tell you anything, so you get rid of it.

If everybody got that item right, then it’s way too easy. So you might include one or two so you don’t totally bum out the kids, but you don’t want too many really easy items, because that’s not going to tell you anything either.

Then the items that you select depend, of course, on the purpose of your test. We used these same principles of item difficulty and applied them to kids at different levels of exposure to two languages, as well as to kids with and without the impairments we were trying to identify.
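The p-value computation described above can be sketched in a few lines. The item names, response data, and cutoff values here are invented for illustration; the presentation does not specify exact thresholds for dropping items.

```python
# Item difficulty: the proportion of the sample that answered each item
# correctly (a p value from 0 to 1). All data and cutoffs are hypothetical.

# responses[item] is a list of 0/1 scores, one per child in the sample
responses = {
    "item_01": [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],  # everyone correct
    "item_02": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],  # almost nobody correct
    "item_03": [1, 0, 1, 1, 0, 1, 0, 1, 1, 0],  # mid-range difficulty
}

def item_difficulty(scores):
    """p value: proportion correct, from 0 (nobody) to 1 (everybody)."""
    return sum(scores) / len(scores)

# Flag items at the extremes; these exact cutoffs are an assumption.
TOO_HARD, TOO_EASY = 0.15, 0.95
for item, scores in responses.items():
    p = item_difficulty(scores)
    if p < TOO_HARD:
        verdict = "drop (too hard)"
    elif p > TOO_EASY:
        verdict = "drop (too easy)"
    else:
        verdict = "keep"
    print(f"{item}: p = {p:.2f} -> {verdict}")
```

Run on this toy sample, the loop drops item_01 as too easy and item_02 as too hard, keeping only the mid-range item, which mirrors the item-analysis step described above.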

Item Discrimination

The next thing you do is look at item discrimination. Item discrimination compares the percentage of typical kids who got the item right with the percentage of children with impairments who got that item right. We subtract and look at the difference between those two. That difference again ranges from 0 to 1, and the greater the difference, the more sensitive that item is to the impairment you’re trying to identify.

If you can build your test based on those items, items that are sensitive to that impairment, but don’t have big discrimination values for level of exposure, for example, or age of first language exposure, then you have a better chance of having a measure that is going to help you make distinctions between your clinical and non-clinical group.
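The selection idea in the paragraph above can be sketched as follows: keep items whose discrimination is large between typical and impaired groups but small between exposure groups. All group data and cutoff values are made up for illustration.

```python
# Item discrimination: difference in proportion correct between two groups.
# scores[item][group] holds hypothetical 0/1 responses per child.

def p_correct(scores):
    """Proportion of the group that got the item right."""
    return sum(scores) / len(scores)

def discrimination(group_a, group_b):
    """Absolute difference in proportion correct between two groups (0 to 1)."""
    return abs(p_correct(group_a) - p_correct(group_b))

scores = {
    "item_A": {"typical": [1, 1, 1, 1, 0, 1, 1, 1],
               "impaired": [0, 1, 0, 0, 0, 0, 0, 0],
               "high_exposure": [1, 1, 0, 1, 1, 1],
               "low_exposure": [1, 0, 1, 1, 1, 0]},
    "item_B": {"typical": [1, 1, 0, 1, 1, 1, 0, 1],
               "impaired": [1, 0, 1, 1, 0, 1, 0, 0],
               "high_exposure": [1, 1, 1, 1, 1, 0],
               "low_exposure": [0, 0, 1, 0, 0, 1]},
}

# Illustrative cutoffs: sensitive to impairment, insensitive to exposure.
IMPAIRMENT_MIN, EXPOSURE_MAX = 0.4, 0.2
for item, g in scores.items():
    d_imp = discrimination(g["typical"], g["impaired"])
    d_exp = discrimination(g["high_exposure"], g["low_exposure"])
    keep = d_imp >= IMPAIRMENT_MIN and d_exp <= EXPOSURE_MAX
    print(f"{item}: impairment d = {d_imp:.2f}, exposure d = {d_exp:.2f}, keep = {keep}")
```

Here item_A is kept (large typical-versus-impaired difference, small exposure difference), while item_B is dropped because it separates exposure groups more than clinical groups, which is exactly the pattern you want to avoid.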


Originally presented at the ASHA Convention (November 2013) as part of the session Development of a Bilingual Test for Spanish-English Children: A Long and Winding Road. Co-Presenters: Elizabeth D. Peña, University of Texas at Austin; Aquiles Iglesias, Temple University; Vera F. Gutierrez-Clellen, San Diego State University; Brian A. Goldstein, La Salle University; and Lisa M. Bedore, University of Texas at Austin.
Disclosure: All of the above-listed authors/co-presenters benefit financially from royalty payments from the Bilingual English-Spanish Assessment (BESA).
Copyrighted Material. Reproduced by the American Speech-Language-Hearing Association in the Clinical Research Education Library with permission from the author or presenter.


The Clinical Research Education Library is supported in part by the National Institute on Deafness and Other Communication Disorders (NIDCD) of the National Institutes of Health under award number U24-DC012078. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.