Friday, November 28, 2008

Validating Multiple Choice Assessments on the Cheap

Antagonistic questions give away the answer. A question whose stem gives away the answer is called an “antagonistic question.” No idea why. But without some form of internal validation, antagonistic questions, leading questions, and questions with more than one correct answer can be hard to spot. Subject matter experts love their wording, even when it invalidates the question. Numbers never lie.
This post originally appeared on the Central Texas Instructional Design blog on this date.

As I mentioned last time, internal validation is a method of estimating the fairness and effectiveness of questions on a multiple choice assessment with data from the assessment itself. Note that it cannot determine the fairness and effectiveness of the assessment. That requires some form of external validation (Mehrens & Lehmann, 1973). You should not make retention or promotion decisions based solely on an assessment that has only been internally validated, but that assessment may still be of value in stack ranking learners or identifying areas where they can improve.

In this post, I give you a quick and dirty procedure for internal validation that you can perform with nothing more complicated than a PC database and spreadsheet. I used Access and Excel, but any database and spreadsheet would do.

Here’s how I structure the record:

  • Student ID*—a value that differentiates individuals but does not necessarily tie to any personally identifiable information
  • Assessment ID*—a value that distinguishes between the various assessments used in a curriculum and between versions of the same assessment
  • Question ID*—a value that distinguishes between versions of the same question but may allow the question to be used on multiple assessments
    • Note: The Question ID should link to a separate table of questions that includes the text of the stem, correct answer, and distractors.
  • Correct Answer—the value of the correct option (may reside in an external table)
  • Learner Selection—a value that identifies the option the learner chose

Fields with an asterisk are part of the key. This table links to another table that contains details about the question, including the text of the stem and options.
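
If you would rather prototype this structure outside Access, here is a minimal sketch using SQLite through Python’s built-in sqlite3 module. The table and field names are just one illustration of the record described above, not a prescribed schema:

    import sqlite3

    conn = sqlite3.connect("assessments.db")
    conn.executescript("""
    CREATE TABLE IF NOT EXISTS question (
        question_id    TEXT PRIMARY KEY,
        stem           TEXT,
        correct_answer TEXT,   -- value of the correct option
        distractors    TEXT    -- e.g., a delimited list of the incorrect options
    );

    CREATE TABLE IF NOT EXISTS response (
        student_id        TEXT,  -- no personally identifiable information required
        assessment_id     TEXT,
        question_id       TEXT REFERENCES question (question_id),
        learner_selection TEXT,  -- the option the learner chose
        PRIMARY KEY (student_id, assessment_id, question_id)
    );
    """)
    conn.commit()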

With this data, you can determine the statistical measurements that the Measurement and Evaluation Center of the University of Texas at Austin (2006) identifies as relevant for internal evaluation:

  • Item difficulty—the percentage of learners who got the question correct
  • Item discrimination—the relationship between how learners performed on the question and their overall score on the assessment
  • Reliability coefficient—an estimate of how consistent the overall scores are, which sets the margin of error in the overall score
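
To show roughly where these numbers come from, here is a sketch in Python. It uses the textbook forms of the statistics (proportion correct for difficulty, a point-biserial correlation for discrimination, and KR-20 for the reliability coefficient), so treat it as one reasonable reading of the measurements above rather than the Measurement and Evaluation Center’s exact procedure:

    from statistics import mean, pstdev

    def item_statistics(scores):
        """scores[student_id][question_id] = 1 if the learner answered correctly, else 0."""
        students = list(scores)
        questions = sorted({q for s in students for q in scores[s]})
        totals = [sum(scores[s].values()) for s in students]
        sd_total = pstdev(totals)

        difficulty = {}      # proportion of learners who got the question correct
        discrimination = {}  # point-biserial correlation with the overall score
        for q in questions:
            item = [scores[s][q] for s in students]
            p = mean(item)
            difficulty[q] = p
            if sd_total == 0 or p in (0, 1):
                discrimination[q] = 0.0
            else:
                mean_correct = mean(t for t, i in zip(totals, item) if i == 1)
                discrimination[q] = (mean_correct - mean(totals)) / sd_total * (p / (1 - p)) ** 0.5

        # KR-20 reliability coefficient for the assessment as a whole
        k = len(questions)
        variance = sd_total ** 2
        sum_pq = sum(p * (1 - p) for p in difficulty.values())
        kr20 = (k / (k - 1)) * (1 - sum_pq / variance) if k > 1 and variance > 0 else 0.0
        return difficulty, discrimination, kr20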

You also have the information you need to evaluate the distractors, which may be the most useful result of this method. If you can determine why learners answer incorrectly, you can take steps—either in the learning environment or in the workplace—to correct this behavior.
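
A quick way to see which distractors are doing the work is to tally how often each option was chosen. Sketched here in Python over (Question ID, Learner Selection) pairs pulled from the response records:

    from collections import Counter, defaultdict

    def distractor_counts(responses):
        """responses: iterable of (question_id, learner_selection) pairs."""
        tallies = defaultdict(Counter)
        for question_id, selection in responses:
            tallies[question_id][selection] += 1
        return tallies

    # A distractor that attracts most of the wrong answers is worth a closer look.
    counts = distractor_counts([("Q1", "B"), ("Q1", "C"), ("Q1", "C"), ("Q1", "A")])
    print(counts["Q1"].most_common())  # [('C', 2), ('B', 1), ('A', 1)]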

In future posts, I’ll discuss how to calculate and interpret each of these measurements.

References

  • Measurement and Evaluation Center. (2006). Analyzing multiple-choice item responses. Austin, TX: The University of Texas at Austin. Retrieved November 16, 2008.
  • Mehrens, W. A., & Lehmann, I. J. (1973). Measurement and evaluation in education and psychology. New York: Holt, Rinehart, and Winston.

 
