Multiple choice is a form of objective assessment in which respondents are asked to select the only correct answer from a list of choices.[1] The multiple choice format is most frequently used in educational testing, in market research, and in elections, when a person chooses between multiple candidates, parties, or policies.

Although E. L. Thorndike developed an early scientific approach to testing students, it was his assistant Benjamin D. Wood who developed the multiple choice test.[2] Multiple choice testing increased in popularity in the mid-20th century when scanners and data-processing machines were developed to check the results.[3]

Multiple choice items consist of a stem and a set of options: the correct answer (the keyed alternative) and the distractors. The stem is the beginning part of the item that presents the item as a problem to be solved, a question asked of the respondent, or an incomplete statement to be completed, as well as any other relevant information. The options are the possible answers that the examinee can choose from, with the correct answer called the key and the incorrect answers called distractors.[4] Only one answer can be keyed as correct. This contrasts with multiple response items, in which more than one answer may be keyed as correct.

Usually, a correct answer earns a set number of points toward the total mark, and an incorrect answer earns nothing. However, tests may also award partial credit for unanswered questions or penalize students for incorrect answers, to discourage guessing. For example, the SAT Subject tests remove a quarter point from the test taker's score for an incorrect answer.
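The scoring rule described above can be sketched in a few lines of Python. This is only an illustration, not any testing body's actual procedure: the function name `score_responses`, its parameters, and the sample answer sheet are all hypothetical, and the quarter-point penalty is taken from the SAT Subject Tests example in the text.

```python
from typing import Optional, Sequence


def score_responses(key: Sequence[str],
                    responses: Sequence[Optional[str]],
                    points: float = 1.0,
                    penalty: float = 0.25,
                    blank: float = 0.0) -> float:
    """Return the raw score for one answer sheet.

    A correct answer earns `points`, an incorrect answer loses `penalty`,
    and an unanswered item (None) earns `blank` (zero by default).
    """
    if len(key) != len(responses):
        raise ValueError("key and responses must have the same length")
    total = 0.0
    for correct, given in zip(key, responses):
        if given is None:
            total += blank          # unanswered item
        elif given == correct:
            total += points         # keyed answer selected
        else:
            total -= penalty        # distractor selected
    return total


# Hypothetical four-item sheet: two correct, one wrong, one left blank.
key = ["B", "D", "A", "C"]
responses = ["B", "A", None, "C"]
print(score_responses(key, responses))  # 1 - 0.25 + 0 + 1 = 1.75
```

Setting `penalty=0` gives the plain scheme in which an incorrect answer earns nothing, and a positive `blank` value models tests that award partial credit for unanswered questions.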

For advanced items, such as an applied knowledge item, the stem can consist of multiple parts. The stem can include extended or ancillary material such as a vignette, a case study, a graph, a table, or a detailed description which has multiple elements to it. Anything may be included as long as it is necessary to ensure the utmost validity and authenticity of the item. The stem ends with a lead-in question explaining how the respondent must answer. In medical multiple choice items, a lead-in question may ask "What is the most likely diagnosis?" or "What pathogen is the most likely cause?" in reference to a case study that was previously presented.

The items of a multiple choice test are often colloquially referred to as "questions," but this is a misnomer because many items are not phrased as questions. For example, they can be presented as incomplete statements, analogies, or mathematical equations. Thus, the more general term "item" is a more appropriate label. Items are stored in an item bank.

Ideally, a multiple choice question should be asked as a "stem" with plausible options, for example:

The IT capital of India is

Bangalore

Mumbai

Mexico

Hyderabad

A well-written multiple choice question avoids obviously wrong or silly distractors (such as Mexico in the example above), so that the question makes sense when read with each of the distractors as well as with the correct answer.

A more difficult and well-written multiple choice question is as follows:

Consider the following:

An eight-by-eight chessboard.

An eight-by-eight chessboard with two opposite corners removed.

An eight-by-eight chessboard with all four corners removed.

Which of these can be tiled by two-by-one dominoes (with no overlaps or gaps, and every domino contained within the board)?

There are several advantages to multiple choice tests. If item writers are well trained and items are quality assured, multiple choice testing can be a very effective assessment technique.[5] If students are instructed on the way in which the item format works and myths surrounding the tests are corrected, they will perform better on the test.[6] On many assessments, reliability has been shown to improve with larger numbers of items on a test, and with good sampling and care over case specificity, overall test reliability can be further increased.[7]

Multiple choice tests often require less time to administer for a given amount of material than would tests requiring written responses. This results in a more comprehensive evaluation of the candidate's extent of knowledge. Even greater efficiency can be created by the use of online examination delivery software. This increase in efficiency can offset the advantages offered by free-response items. That is, if free-response items provide twice as much information but take four times as long to complete, multiple-choice items present a better measurement tool.[citation needed]
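The trade-off in the last sentence can be made explicit with a short calculation. The symbols I (the information yielded by one multiple-choice item) and t (the time needed to answer it) are introduced here for illustration only, and the sketch assumes that information simply adds up across items. Comparing information gained per unit of time:

```latex
\[
\text{multiple choice: } \frac{I}{t}
\qquad
\text{free response: } \frac{2I}{4t} = \frac{I}{2t}
\]
```

Under these assumptions, in the time needed for one free-response item a candidate could answer four multiple-choice items, yielding 4I rather than 2I.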

Multiple choice questions lend themselves to the development of objective assessment items, but without author training, questions can be subjective in nature. Because this style of test does not require a teacher to interpret answers, test-takers are graded purely on their selections, creating a lower likelihood of teacher bias in the results.[8] Factors irrelevant to the assessed material (such as handwriting and clarity of presentation) do not come into play in a multiple-choice assessment, and so the candidate is graded purely on their knowledge of the topic. Finally, if test-takers are aware of how to use answer sheets or online examination tick boxes, their responses can be relied upon with clarity. Overall, multiple choice tests are the strongest predictors of overall student performance compared with other forms of evaluations, such as in-class participation, case exams, written assignments, and simulation games.[9]

The most serious disadvantage is the limited types of knowledge that can be assessed by multiple choice tests. Multiple choice tests are best adapted for testing well-defined or lower-order skills. Problem-solving and higher-order reasoning skills are better assessed through short-answer and essay tests.[citation needed] However, multiple choice tests are often chosen, not because of the type of knowledge being assessed, but because they are more affordable for testing a large number of students. This is especially true in the United States where multiple choice tests are the preferred form of high-stakes testing.

Another disadvantage of multiple choice tests is possible ambiguity in the examinee's interpretation of the item. Failing to interpret information as the test maker intended can result in an "incorrect" response, even if the taker's response is potentially valid. The term "multiple guess" has been used to describe this scenario because test-takers may attempt to guess rather than determine the correct answer. A free response test allows the test taker to make an argument for their viewpoint and potentially receive credit.

In addition, even if students have some knowledge of a question, they receive no credit for knowing that information if they select the wrong answer and the item is scored dichotomously. Free response questions, by contrast, may allow an examinee to demonstrate partial understanding of the subject and receive partial credit. However, if more questions on a particular subject area or topic are asked to create a larger sample, then statistically the examinee's level of knowledge for that topic will be reflected more accurately in the number of correct answers and the final results.

Another disadvantage of multiple choice examinations is that a student who is incapable of answering a particular question can simply select a random answer and still have a chance of receiving a mark for it. A student who guesses at random has a 25 percent chance of answering correctly on a question with four answer choices. It is common practice for students with no time left to give all remaining questions random answers in the hope that they will get at least some of them right. Many exams, such as the Australian Mathematics Competition and the SAT, have scoring systems in place to negate this by making it no more beneficial to choose a random answer than to give none.
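The per-question figure above can be extended to a whole test. The following is a minimal sketch, assuming every answer is an independent blind guess (a binomial model); the function name `prob_at_least` and the 20-item test are hypothetical and chosen only for illustration.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability of at least k correct answers out of n independent
    guesses, each correct with probability p (binomial model)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


# Hypothetical test: 20 items, four options each, every answer a blind guess.
n_items, p_guess = 20, 1 / 4
print(f"expected number correct: {n_items * p_guess}")
print(f"P(at least half correct): {prob_at_least(10, n_items, p_guess):.4f}")
```

Under this model the guesser expects to get a quarter of the items right, while the chance of reaching half marks by guessing alone comes out well below one in twenty.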

Another system of negating the effects of random selection is formula scoring, in which a score is proportionally reduced based on the number of incorrect responses and the number of possible choices. In this method, the score is reduced by the number of wrong answers divided by one less than the average number of possible answers for all questions in the test, W/(c – 1), where W is the number of wrong responses on the test and c is the average number of possible choices for all questions on the test.[10] All exams scored with the three-parameter model of item response theory also account for guessing. Moreover, guessing is usually not a great issue, since the odds of a student receiving significant marks by guessing are very low when four or more selections are available.
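Formula scoring can be illustrated with a short sketch. It assumes that every item has the same whole number of choices (the text allows c to be an average), and it applies the deduction W/(c – 1) to the number of right answers R, which is the usual form of the correction; the function name `formula_score` and the sample figures are hypothetical.

```python
from fractions import Fraction


def formula_score(right: int, wrong: int, choices: int) -> Fraction:
    """Formula score R - W/(c - 1); omitted items are neither rewarded
    nor penalized. Assumes every item has the same number of choices."""
    if choices < 2:
        raise ValueError("an item needs at least two choices")
    return right - Fraction(wrong, choices - 1)


# Hypothetical sheet: 40 right, 10 wrong, five choices per item.
print(formula_score(40, 10, 5))   # 40 - 10/4 = 75/2

# A blind guesser on 100 four-choice items expects 25 right and 75 wrong,
# so the expected formula score is 25 - 75/3 = 0.
print(formula_score(25, 75, 4))   # 0
```

The second call shows why the correction removes the incentive to guess blindly: on average the marks gained from lucky guesses are cancelled by the deductions for wrong ones. With five choices the deduction is a quarter point per wrong answer, matching the quarter-point penalty of the SAT Subject Tests.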

Additionally, questions phrased ambiguously may confuse test-takers. It is generally accepted that multiple choice questions allow for only one answer, where the one answer may encapsulate a collection of previous options. However, some test creators are unaware of this and might expect the student to select multiple answers without giving explicit permission or providing the trailing encapsulation options. Of course, untrained test developers are a threat to validity regardless of the item format.

Critics such as the philosopher and education proponent Jacques Derrida have said that while the demand for dispensing and checking basic knowledge is valid, there are other means to respond to this need than resorting to crib sheets.[11]

Despite being sometimes contested, the format remains popular due to its utility, reliability, and cost effectiveness.[citation needed]

The theory that students should trust their first instinct and stay with their initial answer on a multiple choice test is a myth worth dispelling. Researchers have found that although some people believe that changing answers is bad, it generally results in a higher test score. The data across twenty separate studies indicate that the percentage of "right to wrong" changes is 20.2%, whereas the percentage of "wrong to right" changes is 57.8%, nearly triple.[12] Changing from "right to wrong" may be more painful and memorable (Von Restorff effect), but it is probably a good idea to change an answer after additional reflection indicates that a better choice could be made. In fact, a person's initial attraction to a particular answer choice could well derive from the surface plausibility that the test writer has intentionally built into a distractor (or incorrect answer choice). Test item writers are instructed to make their distractors plausible yet clearly incorrect. A test taker's first-instinct attraction to a distractor is thus often a reaction that probably should be revised in light of a careful consideration of each of the answer choices. Some test takers for some examination subjects might have accurate first instincts about a particular test item, but that does not mean that all test takers should trust their first instinct.