These Guidelines were presented as a draft at the 2005 ILTA meeting in Ottawa and then circulated among members for further consideration. They were finally adopted at the 2007 ILTA meeting in Barcelona with the proviso that they be reviewed and, if appropriate, revised in 2010.

(Grateful acknowledgement is made to the Japanese Language Testing Association for their pioneer work on these Guidelines. Much of Part 1 is based on their work. Acknowledgement is also made to the American Psychological Association whose Standards for Educational and Psychological Testing informs Part 2.)

International Language Testing Association

Guidelines for Practice

(Adopted at the annual meeting of ILTA, held in Barcelona, June 2007)

Part 1

A. Basic Considerations for good testing practice in all situations

1. The test developer’s understanding of just what the test, and each sub-part of it, is supposed to measure (its construct) must be clearly stated.

2. All tests, regardless of their purpose or use, must provide information which allows valid inferences to be made. Validity refers to the accuracy of the inferences and uses that are made on the basis of the test’s scores. If, for example, the test purports to be measuring the ability to use English in business communication, the inferences based on the test score are valid to the degree that the test does in fact measure that ability. However, since the ability to use English in business communication is a construct, the test developer must spell out just what that construct is or what it consists of. The test score inference or interpretation can be valid only if the test construct offers as accurate as possible a picture of the skill or ability it is supposed to measure.

3. All tests, regardless of their purpose or use, must be reliable. Reliability refers to the consistency of the test results, that is, the extent to which they are generalizable and therefore comparable across time and across settings.
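As an illustration of point 3, one widely used estimate of internal consistency is Cronbach's alpha. The Guidelines do not prescribe any particular coefficient, so the sketch below is only one possible way of quantifying reliability; the function name and the sample data are hypothetical. Alpha addresses consistency across items within a single administration. Consistency across time or settings would call for other designs, such as test-retest or parallel forms.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Estimate Cronbach's alpha.

    scores: list of rows, one row per test taker, each row holding
    that person's score on every item (hypothetical data layout).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across test takers, summed over items.
    item_variance_sum = sum(pvariance(column) for column in zip(*scores))
    # Variance of the total scores across test takers.
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Illustrative data only: four test takers, three dichotomous items.
sample_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
alpha = cronbach_alpha(sample_scores)
print(round(alpha, 2))
```

A sample this small is used only to keep the arithmetic easy to follow; estimates intended for publication would need far more test takers and items.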

B. Responsibilities of test designers and test writers

1. Test design should include a determination and explicit statement of the test’s intended purpose(s).

2. A test designer must decide on the construct to be measured and state explicitly how that construct is to be operationalized.

3. The specifications of the test and the test tasks should be spelled out in detail.

4. The work of the task and item writers needs to be edited before pretesting. If pretesting is not possible, the tasks and items should be analysed after the test has been administered but before the results are reported. Malfunctioning or misfitting tasks and items should not be included in the calculation of individual test takers’ reported scores.

5. Information guides on scoring (also known as grading or marking schemes) must be prepared for test tasks requiring hand scoring. These guides must be tried out to demonstrate that they permit reliable evaluation of the test takers’ performance.

6. Those doing the scoring should be trained for the task, and both inter-rater and intra-rater reliability should be calculated and published.

7. Test materials should be kept in a safe place and handled in such a way that no test taker is allowed to gain an unfair advantage over other test takers.

8. Care must be taken to ensure that all test takers are treated in the same way in the administration of the test.

9. Scoring procedures must be carefully followed and score processing routines checked to make certain that no mistakes have been made.

10. Reports of the test results should be presented in such a way that they can be easily understood by test takers and other stakeholders.
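The item analysis mentioned in point 4 above can be sketched as follows, assuming dichotomously scored items. The sketch computes item facility (proportion correct) and a corrected item-total correlation as a discrimination index, then flags items that fall outside chosen limits. The function names and the threshold values are assumptions made for illustration; the Guidelines do not specify any cut-offs, and flagged items should be reviewed by a person rather than dropped automatically.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    covariance = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return covariance / (sx * sy)


def item_analysis(responses, min_facility=0.2, max_facility=0.9,
                  min_discrimination=0.2):
    """Flag possibly malfunctioning items (thresholds are illustrative).

    responses: one row of 0/1 item scores per test taker.
    """
    totals = [sum(row) for row in responses]
    report = []
    for index in range(len(responses[0])):
        item = [row[index] for row in responses]
        facility = mean(item)
        # Correlate the item with the total of the *other* items so the
        # item does not inflate its own discrimination index.
        rest = [total - score for total, score in zip(totals, item)]
        discrimination = pearson(item, rest)
        flagged = (facility < min_facility or facility > max_facility
                   or discrimination < min_discrimination)
        report.append({"item": index + 1, "facility": facility,
                       "discrimination": discrimination, "flagged": flagged})
    return report


# Illustrative data only: six test takers, four items.
sample_responses = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
report = item_analysis(sample_responses)
for row in report:
    print(row)
```

In this invented data set the fourth item is answered correctly mainly by the weakest test takers, so its discrimination index is negative and it is the only item flagged.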

C. Obligations of institutions preparing or administering high stakes examinations

Institutions (colleges, schools, certification bodies, etc.) developing and administering entrance, certification or other high stakes examinations must utilize test designers and item writers who are well versed in current language testing theory and practice. Items written by non-native speakers of the language being tested must be checked by someone with a high level of competence in the language.

Responsibilities to test takers and related stakeholders

(Before the test is administered)

The institution should provide all potential test takers with adequate information about the purposes of the test, the construct (or constructs) the test is attempting to measure and the extent to which that has been achieved. Information should also be provided as to how the scores/grades will be allocated and how the results will be reported.

(At the time of administration)

The institution shall provide facilities for the administration of the test that do not disadvantage any test taker. Test administration materials should be carefully prepared and proctors trained and supervised so that each administration of the test can be uniform, ensuring that all test takers receive the same instructions, time to do the test, and access to any permitted aids. If something occurs that calls into question the uniformity of the administration of the test, the problem should be identified and any remedial action to be taken to offset the negative impact on the affected test takers should be promptly announced.

In the case of speaking tests, the facilities shall be capable of proper invigilation and oversight, providing a safe and secure environment in professional surroundings for raters/interlocutors and for test takers.

(At the time of scoring)

The institution shall take the steps necessary to see that each test taker’s test paper is scored/graded accurately and the result correctly placed in the database used in the assessment. There should be ongoing quality control checks to ensure that the scoring process is working as intended.

(Other considerations)

If a decision is to be made on candidates who did not all take the same test or the same form of a test, care must be taken to ensure that the different measures used are in fact comparable.

If more than one form of the test is used, inter-form reliability estimates should be published as soon as they are available.

D. Obligations of those preparing and administering publicly available tests

They should:

1. Make a clear statement as to what groups the test is appropriate for and for which groups it is not appropriate.

2. Make a clear statement of the construct the test is designed to measure in terms a layperson can understand.

3. Publish validity and reliability estimates and bias reports for the test along with sufficient explanation to allow potential test takers and test users to decide if the test is suitable in their situation.

4. Report the results in a form that will allow test users to draw the correct inferences from them.

5. Refrain from making any false or misleading claims about the test.

6. Publish a handbook for test takers which:

6.1. Explains the relevant measurement concepts so that they can be understood by non-specialists.

6.2. Reports evidence of the reliability and validity of the test for the purpose for which it was designed.

6.3. Describes the scoring procedure and, if multiple forms exist, the steps taken to ensure consistency of results across forms.

6.4. Explains the proper interpretation of test results and any limitation on their accuracy.

E. Responsibilities of users of test results

Persons who utilize test results for decision making must:

1. Use results from a test that is sufficiently reliable and valid to allow fair decisions to be made.

2. Make certain that the test construct is relevant to the decision to be made.

3. Clearly understand the limitations of the test results on which they will base their decision.

4. Take into consideration the standard error of measurement (SEM) of the device that provides the data for their decision.

5. Be prepared to explain and provide evidence of the fairness and accuracy of their decision making process.
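To illustrate point 4 above, the standard error of measurement is commonly estimated in classical test theory as the standard deviation of the scores multiplied by the square root of one minus the reliability. The sketch below applies that formula and checks whether a cut score falls inside the band around an observed score. The function names, the sample figures and the choice of band width are assumptions made for illustration only.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """Classical test theory estimate: SEM = SD * sqrt(1 - reliability)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1 - reliability)


def score_band(observed, sd, reliability, z=1.0):
    """Band of +/- z SEMs around an observed score (z is a chosen width)."""
    sem = standard_error_of_measurement(sd, reliability)
    return observed - z * sem, observed + z * sem


def decision_is_uncertain(observed, cut_score, sd, reliability, z=1.0):
    """True when the cut score lies inside the band around the observed score."""
    low, high = score_band(observed, sd, reliability, z)
    return low <= cut_score <= high


# Illustrative figures only: SD of 10 and reliability of 0.91 give an SEM of about 3.
sem = standard_error_of_measurement(10, 0.91)
print(round(sem, 2))
print(decision_is_uncertain(observed=60, cut_score=62, sd=10, reliability=0.91))
print(decision_is_uncertain(observed=50, cut_score=62, sd=10, reliability=0.91))
```

With these invented figures, a test taker who scores 60 against a cut score of 62 is within one SEM of passing, which suggests that a decision based on this score alone deserves caution; a score of 50 is well outside the band.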

F. Special considerations

In norm-referenced testing:

The characteristics of the population on which the test was normed must be reported so that test users can determine if this group is appropriate as a standard to which their test takers can be compared.

In criterion-referenced testing:

The appropriateness of the criterion must be confirmed by experts in the area being tested.

Since correlation is not a suitable way of determining the reliability and validity of criterion-referenced tests, methods appropriate for such test data must be used.
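The Guidelines do not name the methods they have in mind. One approach often described for criterion-referenced tests is to examine the consistency of pass/fail (master/non-master) decisions across two administrations or forms, using the proportion of consistent decisions and Cohen's kappa, which corrects that proportion for chance agreement. The sketch below is offered on that assumption; the function name, the scores and the cut score are hypothetical.

```python
def classification_consistency(scores_a, scores_b, cut_score):
    """Agreement of pass/fail decisions across two administrations.

    Returns (p0, kappa): the proportion of consistent decisions and
    Cohen's kappa, which adjusts p0 for agreement expected by chance.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("need two equally long, non-empty score lists")
    n = len(scores_a)
    pass_a = [score >= cut_score for score in scores_a]
    pass_b = [score >= cut_score for score in scores_b]
    # Observed proportion of test takers classified the same way twice.
    p0 = sum(a == b for a, b in zip(pass_a, pass_b)) / n
    # Chance agreement from the marginal pass rates of each administration.
    rate_a, rate_b = sum(pass_a) / n, sum(pass_b) / n
    p_chance = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if p_chance == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (p0 - p_chance) / (1 - p_chance)
    return p0, kappa


# Illustrative data only: eight test takers, two forms, cut score of 60.
form_a = [55, 62, 70, 48, 66, 59, 73, 51]
form_b = [58, 65, 68, 52, 57, 61, 75, 49]
p0, kappa = classification_consistency(form_a, form_b, cut_score=60)
print(p0, kappa)
```

Here six of the eight test takers receive the same decision on both forms. Because half of the group passes each form, half of that agreement would be expected by chance, which is why kappa is lower than the raw proportion.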

In computer adaptive testing:

The sample sizes must be large enough to ensure the stability of the Item Response Theory (IRT) estimates.

Test takers and other stakeholders must be informed of the rationale of computer adaptive testing and of the difference between paper-and-pencil tests and computer adaptive tests.

3. Be tested with measures that meet professional standards and that are appropriate, given the manner in which the test results will be used.

4. Receive a brief oral or written explanation prior to testing about the purpose(s) for testing, the kind(s) of tests to be used, whether the results will be reported to you or to others, and the planned use(s) of the results. If you have a disability, you have the right to inquire and receive information about testing accommodations (special arrangements). If you have difficulty in comprehending the language of the test, you have a right to know in advance of testing whether any such special arrangements may be available to you.

5. Know in advance of testing when the test will be administered, if and when test results will be available to you and if there is a fee for testing services that you are expected to pay.

6. Have your test administered and your test results interpreted by appropriately trained individuals who follow professional codes of ethics.

7. Know if a test is optional and learn of the consequences of taking or not taking the test, fully completing the test, or cancelling the scores. You may need to ask questions to learn about these consequences.

8. Receive a written or oral explanation of your test results within a reasonable amount of time after testing and in commonly understood terms.

9. Have your test results kept confidential to the extent allowed by law.

10. Present concerns about the testing process or your results and receive information about procedures that will be used to address such concerns.

As a test taker, you have the responsibility to:

1. Read and/or listen to your rights and responsibilities as a test taker.

2. Treat others with courtesy and respect during the testing process.

3. Ask questions prior to testing if you are uncertain about why the test is being given, how it will be given, what you will be asked to do, and what will be done with the results.

4. Read or listen to descriptive information in advance of testing and listen carefully to all test instructions. You should inform an examiner in advance of testing if you wish to receive a testing accommodation or if you have a physical condition or illness that may interfere with your performance on the test. If you have difficulty comprehending the language of the test, it is your responsibility to inform an examiner.

5. Know when and where the test will be given, pay for the test if required, appear on time with any required materials and be ready to be tested.

6. Follow the test instructions you are given and represent yourself honestly during the testing.

7. Be familiar with and accept the consequences of not taking the test, should you choose not to take the test.

8. Inform appropriate person(s), as specified to you by the organization responsible for testing, if you believe that testing conditions affected your results.

9. Ask about the confidentiality of your test results, if this aspect concerns you.

10. Present any concerns you may have about the testing process or results in a timely, respectful way.